Three-tier model selection

Different AI tasks have different cost + capability needs. Writing a one-paragraph narrative for a single finding is wildly different from reading a 100,000-token codebase. Pentestas picks the model tier that fits each task instead of routing everything through one giant model.

The tiers

Tier	Default	Used for
small	`claude-haiku-4-5-20251001`	Finding narratives, summaries, log classification, false-positive filter, report cleanup
medium	`claude-sonnet-4-6`	Per-category vulnerability analysis (SQLi / XSS / SSRF / Auth / Authz), attack-chain synthesis, payload generation, exploitation tool-use
large	`claude-opus-4-6`	Source-code architecture analysis, cross-finding deep reasoning, multi-agent planning

Total cost drops ~60% vs. the old "Sonnet everywhere" posture without sacrificing quality on the tasks that matter.

Overrides

Env vars let operators swap individual tiers for their own model preferences:

bash

export ANTHROPIC_SMALL_MODEL="claude-haiku-4-5-20251001"
export ANTHROPIC_MEDIUM_MODEL="claude-sonnet-4-6"
export ANTHROPIC_LARGE_MODEL="claude-opus-4-6"

Legacy single-model override (applies to every tier — back-compat for old deployments):

bash

export ANTHROPIC_DEFAULT_MODEL="claude-sonnet-4-6"

Task → tier mapping

Every LLM call site in Pentestas goes through a task label → tier resolver. The mapping is defined in backend/core/model_tiers.py:

python

finding_narrative:        small
summary:                  small
log_classify:             small
false_positive_filter:    small
report_cleanup:           small

vuln_analysis_injection:  medium
vuln_analysis_xss:        medium
vuln_analysis_ssrf:       medium
vuln_analysis_auth:       medium
vuln_analysis_authz:      medium
exploit_planning:         medium
attack_chain_synthesis:   medium
payload_generation:       medium
exploitation_tool_use:    medium

source_code_analysis:     large
architecture_recon:       large
deep_chain_reasoning:     large

Unknown tasks fall back to medium — the safe default.

Bring-your-own-key

Pro+ plans can supply an Anthropic API key under Settings → AI. Usage counts against your own billing. All three tiers use the same key; tier-specific keys aren't supported (yet).

When you'd want to override

Cost-conscious scans — set both ANTHROPIC_MEDIUM_MODEL and ANTHROPIC_LARGE_MODEL to Haiku. You lose analysis depth but cut cost ~80%.
Deep-dive scans — set ANTHROPIC_MEDIUM_MODEL to Opus for maximum category-analysis quality. Costs ~3× default.
Bedrock / Vertex deployments — set each tier to your region's model ID. See Authentication.