Model Selection#

design_research_agents.model_selection is the stable public facade for model catalogs, model flights, hardware snapshots, and selector decisions.

Public Facade#

class design_research_agents.model_selection.HardwareProfile(*, total_ram_gb, available_ram_gb, cpu_count, load_average, gpu_present, gpu_vram_gb, gpu_name=None, platform_name=None)[source]#

Snapshot of system hardware capacity for model selection.

total_ram_gb#

Total system RAM in GiB.

Type:: float | None

available_ram_gb#

Available system RAM in GiB.

Type:: float | None

cpu_count#

Logical CPU count.

Type:: int | None

load_average#

Load average tuple when supported.

Type:: tuple[float, float, float] | None

gpu_present#

Whether a GPU is detected.

Type:: bool | None

gpu_vram_gb#

Detected GPU VRAM in GiB.

Type:: float | None

gpu_name#

Optional GPU name.

Type:: str | None

platform_name#

Platform identifier string.

Type:: str | None
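
As a rough illustration of what such a snapshot holds, the sketch below fills a stand-in dataclass with the same field names using only the standard library. `HardwareSnapshot` and `collect_snapshot` are hypothetical names and are not part of design_research_agents. RAM and GPU fields stay `None` here because the standard library offers no portable way to read them.

```python
import os
import platform
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class HardwareSnapshot:
    """Hypothetical stand-in mirroring the documented HardwareProfile fields."""

    total_ram_gb: Optional[float]
    available_ram_gb: Optional[float]
    cpu_count: Optional[int]
    load_average: Optional[Tuple[float, float, float]]
    gpu_present: Optional[bool]
    gpu_vram_gb: Optional[float]
    gpu_name: Optional[str] = None
    platform_name: Optional[str] = None


def collect_snapshot() -> HardwareSnapshot:
    """Fill in what the standard library can report; leave the rest as None."""
    try:
        load = os.getloadavg()  # not available on every platform (e.g. Windows)
    except (AttributeError, OSError):
        load = None
    return HardwareSnapshot(
        total_ram_gb=None,
        available_ram_gb=None,
        cpu_count=os.cpu_count(),
        load_average=load,
        gpu_present=None,
        gpu_vram_gb=None,
        platform_name=platform.system(),
    )


snapshot = collect_snapshot()
```

Every field accepts `None`, so a partial snapshot like this one is still a valid value for the documented types.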

class design_research_agents.model_selection.ModelCatalog(*, models)[source]#

Catalog of known models and their hardware hints.

models#

Tuple of model specifications.

Type:: tuple[design_research_agents._model_selection._types.ModelSpec, …]

class design_research_agents.model_selection.ModelCostHint(*, tier, usd_per_1k_tokens=None)[source]#

Cost hints for model selection.

class design_research_agents.model_selection.ModelFlight(*, flight_id, description, models, tags=())[source]#

Named, reproducible set of model candidates for experiments or selection.

flight_id#

Stable identifier for the candidate set.

Type:: str

description#

Human-readable summary of the flight’s purpose.

Type:: str

models#

Tuple of model specifications included in the flight.

Type:: tuple[design_research_agents._model_selection._types.ModelSpec, …]

tags#

Optional labels for discovery and grouping.

Type:: tuple[str, …]

class design_research_agents.model_selection.ModelFlightRegistry(*, flights)[source]#

Registry of named model flights.

flights#

Tuple of known model flights.

Type:: tuple[design_research_agents._model_selection._catalog.ModelFlight, …]
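
The relationship between a flight and its registry can be pictured with two small frozen dataclasses. This is a sketch under stated assumptions: `Flight`, `FlightRegistry`, and the `get` and `with_tag` helpers are hypothetical and are not confirmed methods of ModelFlightRegistry, and models are reduced to plain id strings instead of ModelSpec objects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Flight:
    """Hypothetical stand-in mirroring the documented ModelFlight fields."""

    flight_id: str
    description: str
    models: Tuple[str, ...]  # simplified: model ids instead of ModelSpec objects
    tags: Tuple[str, ...] = ()


@dataclass(frozen=True)
class FlightRegistry:
    """Hypothetical stand-in mirroring ModelFlightRegistry(flights=...)."""

    flights: Tuple[Flight, ...]

    def get(self, flight_id: str) -> Optional[Flight]:
        # Assumed helper: linear scan by the stable identifier.
        return next((f for f in self.flights if f.flight_id == flight_id), None)

    def with_tag(self, tag: str) -> Tuple[Flight, ...]:
        # Assumed helper: filter flights by a discovery label.
        return tuple(f for f in self.flights if tag in f.tags)


registry = FlightRegistry(
    flights=(
        Flight("small-local", "Small local candidates", ("model-a", "model-b"), ("local",)),
        Flight("remote-baseline", "Hosted baseline", ("model-c",)),
    )
)
```

Keeping both containers immutable matches the tuple-typed attributes above and makes a flight safe to reuse across experiments.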

class design_research_agents.model_selection.ModelLatencyHint(*, tier, note=None)[source]#

Latency hints for model selection.

class design_research_agents.model_selection.ModelMemoryHint(*, min_ram_gb, min_vram_gb, note=None)[source]#

Memory requirement hints for model selection.

min_ram_gb#

Suggested minimum system RAM in GiB.

Type:: float | None

min_vram_gb#

Suggested minimum GPU VRAM in GiB.

Type:: float | None

note#

Optional annotation for the hint.

Type:: str | None
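
One plausible way to read a memory hint is to compare it with the capacity left after a reserve is held back. The function below is a minimal sketch of that idea. `fits_in_memory` is a hypothetical helper, and both the comparison rule and the handling of `None` values are assumptions, not the library's documented behavior.

```python
from typing import Optional


def fits_in_memory(
    required_gb: Optional[float],
    available_gb: Optional[float],
    reserve_gb: float,
) -> bool:
    """Assumed rule: a hint is met when it fits in what remains after the reserve."""
    if required_gb is None:
        return True  # no hint given, so there is nothing to check
    if available_gb is None:
        return False  # requirement stated but capacity unknown: be conservative
    return available_gb - reserve_gb >= required_gb
```

The same check would apply to `min_ram_gb` against available system RAM and to `min_vram_gb` against GPU VRAM, each with its own reserve.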

class design_research_agents.model_selection.ModelSafetyConstraints(*, max_cost_usd, max_latency_ms)[source]#

Safety bounds attached to a model selection decision.

max_cost_usd#

Cost bound propagated into the decision.

Type:: float | None

max_latency_ms#

Latency bound propagated into the decision.

Type:: int | None

class design_research_agents.model_selection.ModelSelectionConstraints(*, require_local=False, preferred_provider=None, max_cost_usd=None, max_latency_ms=None)[source]#

Constraints that bound model selection choices.

require_local#

Whether to force local-only selection.

Type:: bool

preferred_provider#

Optional provider override.

Type:: str | None

max_cost_usd#

Optional maximum cost in USD per 1K tokens.

Type:: float | None

max_latency_ms#

Optional latency cap in milliseconds.

Type:: int | None
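
To make the effect of these constraints concrete, the sketch below applies them to a small candidate list. `Candidate` and `apply_constraints` are hypothetical and heavily simplified. Treating an unknown cost as acceptable, and treating `preferred_provider` as an ordering preference instead of a hard filter, are assumptions made only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical, simplified candidate; not the library's ModelSpec."""

    model_id: str
    provider: str
    is_local: bool
    usd_per_1k_tokens: Optional[float] = None


def apply_constraints(
    candidates: List[Candidate],
    *,
    require_local: bool = False,
    preferred_provider: Optional[str] = None,
    max_cost_usd: Optional[float] = None,
) -> List[Candidate]:
    kept = []
    for c in candidates:
        if require_local and not c.is_local:
            continue  # local-only selection drops remote candidates
        if (
            max_cost_usd is not None
            and c.usd_per_1k_tokens is not None
            and c.usd_per_1k_tokens > max_cost_usd
        ):
            continue  # assumed: a candidate with unknown cost is not rejected
        kept.append(c)
    # Assumed: the preferred provider moves to the front (stable sort keeps other order).
    kept.sort(key=lambda c: c.provider != preferred_provider)
    return kept
```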

class design_research_agents.model_selection.ModelSelectionDecision(*, model_id, provider, rationale, safety_constraints, policy_id, catalog_signature)[source]#

Selection output describing the chosen model and rationale.

model_id#

Selected model identifier.

Type:: str

provider#

Selected provider name.

Type:: str

rationale#

Human-readable rationale for the choice.

Type:: str

safety_constraints#

Safety bounds applied to the selection.

Type:: design_research_agents._model_selection._types.ModelSafetyConstraints

policy_id#

Policy identifier for reproducibility.

Type:: str

catalog_signature#

Catalog signature used for the decision.

Type:: str
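
A catalog signature ties a decision to the exact set of models that was considered. How the library computes it is not stated here. The sketch below shows one possible scheme, purely as an assumption: a SHA-256 digest over the sorted model ids. The `catalog_signature` function is hypothetical.

```python
import hashlib
import json
from typing import Iterable


def catalog_signature(model_ids: Iterable[str]) -> str:
    """Assumed scheme: SHA-256 over the sorted, JSON-encoded model ids."""
    payload = json.dumps(sorted(model_ids), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting before hashing makes the digest independent of catalog order, so two catalogs with the same entries produce the same signature.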

class design_research_agents.model_selection.ModelSelectionIntent(*, task, priority='balanced')[source]#

Intent descriptor used by the model selection policy.

class design_research_agents.model_selection.ModelSelectionPolicyConfig(*, policy_id='default', prefer_local=True, ram_reserve_gb=2.0, vram_reserve_gb=0.5, max_load_ratio=0.85, remote_cost_floor_usd=0.02, default_max_latency_ms=None)[source]#

Configuration controlling model selection behavior.

policy_id#

Identifier used for traceability.

Type:: str

prefer_local#

Whether to prefer local models by default.

Type:: bool

ram_reserve_gb#

Reserved system RAM in GiB.

Type:: float

vram_reserve_gb#

Reserved GPU VRAM in GiB.

Type:: float

max_load_ratio#

Load ratio threshold to prefer remote.

Type:: float

remote_cost_floor_usd#

Cost below which remote is avoided.

Type:: float

default_max_latency_ms#

Default latency cap when none is provided.

Type:: int | None
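
The `max_load_ratio` threshold can be pictured as a comparison against normalized system load. The sketch below assumes the ratio is the 1-minute load average divided by the logical CPU count. That definition, and the helper names `load_ratio` and `local_is_overloaded`, are assumptions for illustration and not the policy's confirmed formula.

```python
from typing import Optional, Tuple


def load_ratio(
    load_average: Optional[Tuple[float, float, float]],
    cpu_count: Optional[int],
) -> Optional[float]:
    """Assumed definition: 1-minute load average divided by logical CPU count."""
    if load_average is None or not cpu_count:
        return None  # load or CPU count unknown, so no ratio can be formed
    return load_average[0] / cpu_count


def local_is_overloaded(ratio: Optional[float], max_load_ratio: float = 0.85) -> bool:
    # An unknown load is treated as not overloaded in this sketch.
    return ratio is not None and ratio > max_load_ratio
```

With the default threshold of 0.85, an 8-core machine with a 1-minute load of 7.2 would count as overloaded under this reading, while a load of 4.0 would not.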

class design_research_agents.model_selection.ModelSelector(*, catalog=None, prefer_local=True, ram_reserve_gb=2.0, vram_reserve_gb=0.5, max_load_ratio=0.85, remote_cost_floor_usd=0.02, default_max_latency_ms=None, local_client_resolver=None)[source]#

Flat model selection interface with client/config resolution helpers.

Initialize model selector policy controls and optional resolver hook.

Parameters:

catalog – Optional model catalog to use for selection.
prefer_local – Whether to prefer local models over remote ones when all else is equal.
ram_reserve_gb – Amount of system RAM (in GiB) to reserve when evaluating local candidates.
vram_reserve_gb – Amount of GPU VRAM (in GiB) to reserve when evaluating local candidates.
max_load_ratio – Maximum system load ratio to consider a local candidate viable (0.0 to 1.0).
remote_cost_floor_usd – Minimum cost threshold (in USD) for remote models to be considered viable.
default_max_latency_ms – Default maximum latency (in milliseconds) applied when the selection constraints do not specify one.
local_client_resolver – Optional callable that takes a ModelSelectionDecision and returns a dict with ‘client_class’ and ‘kwargs’ for constructing a local client when the built-in resolver does not recognize the provider. This lets custom local providers be integrated without modifying ModelSelector.

select(*, task, priority='balanced', require_local=False, preferred_provider=None, max_cost_usd=None, max_latency_ms=None, hardware_profile=None, output='client')[source]#

Select a model and return a decision, config mapping, or live client.

Parameters:

task – Description of the task or use case for which a model is being selected.
priority – Selection priority, which may influence the trade-off between quality, latency, and cost in the decision process.
require_local – If True, only consider local models as viable candidates.
preferred_provider – Optional provider name to prioritize in the selection process.
max_cost_usd – Optional maximum cost threshold (in USD) for candidate models.
max_latency_ms – Optional maximum latency threshold (in milliseconds) for candidate models.
hardware_profile – Optional mapping or HardwareProfile instance describing the current hardware state, which may be used to evaluate local candidates.
output – Format of the selection result. “client” returns an instantiated LLMClient ready for use; “decision” returns the raw ModelSelectionDecision with the selection rationale; “client_config” returns a dict with the information needed to construct an LLMClient (including ‘client_class’ and ‘kwargs’) without instantiating it.

Returns:

Depending on the ‘output’ parameter –

If output is “client”: An instantiated LLMClient configured according to the selection decision, ready for use in making requests.
If output is “decision”: A ModelSelectionDecision object containing details about the selected model, provider, rationale, and policy information.
If output is “client_config”: A dict containing the resolved client configuration, including ‘client_class’, ‘kwargs’, and metadata from the selection decision, which can be used to instantiate an LLMClient at a later time or in a different context.

Raises:

ValueError – If output is unsupported or selection/config coercion fails.
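
The three `output` modes described above can be sketched as a small dispatch function. Everything here is a stand-in: `Decision`, `EchoClient`, and `render_output` are hypothetical names used so the example runs without the library. Whether ‘client_class’ holds a class object or a name in the real config is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Decision:
    """Hypothetical, reduced stand-in for ModelSelectionDecision."""

    model_id: str
    provider: str
    rationale: str


class EchoClient:
    """Hypothetical client class used only to make the sketch runnable."""

    def __init__(self, *, model_id: str) -> None:
        self.model_id = model_id


def render_output(decision: Decision, output: str = "client") -> Any:
    """Return the decision, a config mapping, or a constructed client."""
    if output == "decision":
        return decision
    # Assumed config shape: a class object plus keyword arguments.
    config: Dict[str, Any] = {
        "client_class": EchoClient,
        "kwargs": {"model_id": decision.model_id},
    }
    if output == "client_config":
        return config
    if output == "client":
        return config["client_class"](**config["kwargs"])
    raise ValueError(f"unsupported output: {output!r}")
```

Returning the config separately from the client is useful when construction should be deferred, for example when the decision is logged in one process and the client is built in another.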

class design_research_agents.model_selection.ModelSpec(*, model_id, provider, family, size_b, format, quantization, memory_hint, latency_hint, cost_hint, quality_tier, speed_tier, source='curated', repo_id=None, revision=None, artifact=None, license=None, context_window=None, capabilities=(), tags=(), source_url=None, metadata=None)[source]#

Catalog entry describing one model option.

model_id#

Unique model identifier used by backends.

Type:: str

provider#

Backend or provider name.

Type:: str

family#

Model family grouping label.

Type:: str

size_b#

Approximate parameter count in billions.

Type:: float | None

format#

Storage or API format identifier.

Type:: str | None

quantization#

Quantization name when applicable.

Type:: str | None

memory_hint#

Optional memory requirement hints.

Type:: design_research_agents._model_selection._types.ModelMemoryHint | None

latency_hint#

Optional latency hints.

Type:: design_research_agents._model_selection._types.ModelLatencyHint | None

cost_hint#

Optional cost hints.

Type:: design_research_agents._model_selection._types.ModelCostHint | None

quality_tier#

Relative quality score (higher is better).

Type:: int | None

speed_tier#

Relative speed score (higher is faster).

Type:: int | None

source#

Provenance label for the catalog entry.

Type:: str

repo_id#

Optional upstream repository id.

Type:: str | None

revision#

Optional upstream revision or commit SHA.

Type:: str | None

artifact#

Optional preferred model artifact or filename.

Type:: str | None

license#

Optional upstream license identifier.

Type:: str | None

context_window#

Optional context-window size in tokens.

Type:: int | None

capabilities#

Optional normalized capability labels.

Type:: tuple[str, …]

tags#

Optional normalized discovery labels.

Type:: tuple[str, …]

source_url#

Optional upstream URL.

Type:: str | None

metadata#

Optional supplemental metadata.

Type:: collections.abc.Mapping[str, object] | None

Internal Modules#

The underscored modules below are documented for contributor visibility. Public usage should prefer design_research_agents.model_selection and the top-level exports in design_research_agents.

Model catalog utilities and default catalog entries.

Hardware profiling helpers for model selection.

Model selection policy implementation.

Public model selection facade with flattened constructor-first ergonomics.

Shared model selection data types.