API
This page documents the supported top-level public API from
design_research_agents.__all__.
Guaranteed compatibility applies to this top-level API surface and to the
public facade modules documented in docs/reference under “Guaranteed
Public Modules”.
Underscored module paths (for example design_research_agents._contracts)
are internal and unstable. They are documented in the module reference for
contributors, but they are not compatibility-guaranteed.
Top-level groups:
- Metadata: __version__
- Entry points: agents, LLM clients, ModelSelector
- Core contracts: ExecutionResult, LLMRequest, LLMMessage, LLMResponse, ToolResult, with normalized read helpers for structured payload access
- Orchestration: workflow step classes, Workflow, and pattern classes (module homes: design_research_agents.workflow and design_research_agents.patterns)
- Tools: Toolbox, CallableToolConfig, ScriptToolConfig, MCPServerConfig
- Tracing: Tracer
__version__
- design_research_agents.__version__ = '0.2.0'
Entry Points
Agents
- class design_research_agents.DirectLLMCall(*, llm_client, system_prompt=None, temperature=None, max_tokens=None, provider_options=None, tracer=None)[source]
One-shot direct model call with no tool runtime.
Design choices:
Uses a small Workflow with three LogicSteps (prepare, call, finalize) so the trace mirrors multi-step agents.
Keeps defaults (system prompt, temperature, max_tokens, provider_options) on the agent, but allows per-run overrides via
normalized_input.
Initialize a direct-LLM agent with optional default generation args.
- Parameters:
llm_client – LLM client used for prompt execution.
system_prompt – Optional default system prompt.
temperature – Optional default sampling temperature.
max_tokens – Optional default output-token cap.
provider_options – Optional default backend-specific options.
tracer – Optional explicit tracer dependency.
- Raises:
ValueError – If max token configuration is invalid.
- class design_research_agents.MultiStepAgent(*, mode, llm_client, tool_runtime=None, max_steps=5, stop_on_step_failure=True, controller_system_prompt=None, controller_user_prompt_template=None, continuation_system_prompt=None, continuation_user_prompt_template=None, step_user_prompt_template=None, tool_calling_system_prompt=None, tool_calling_user_prompt_template=None, alternatives_prompt_target='user', continuation_memory_tail_items=6, step_memory_tail_items=8, memory_store=None, memory_namespace='default', memory_read_top_k=4, memory_write_observations=True, max_tool_calls_per_step=5, execution_timeout_seconds=5, validate_tool_input_schema=False, normalize_generated_code_per_step=False, default_tools_per_step=None, allowed_tools=None, tracer=None)[source]
Single multi-step runtime entrypoint for direct/json/code strategies.
Initialize one mode-specific multi-step strategy.
- Parameters:
mode – Required strategy mode (direct, json, or code).
llm_client – LLM client shared by all strategy modes.
tool_runtime – Tool runtime required for json and code modes.
max_steps – Maximum number of multi-step iterations.
stop_on_step_failure – Whether to stop loop execution on failed steps.
controller_system_prompt – Direct-mode controller system prompt override.
controller_user_prompt_template – Direct-mode controller user prompt override.
continuation_system_prompt – Continuation system prompt override.
continuation_user_prompt_template – Continuation user prompt override.
step_user_prompt_template – Step action user prompt override.
tool_calling_system_prompt – Json mode tool-calling system prompt override.
tool_calling_user_prompt_template – Json mode tool-calling user prompt override.
alternatives_prompt_target – Prompt insertion target for alternatives blocks.
continuation_memory_tail_items – Continuation memory tail item count.
step_memory_tail_items – Step memory tail item count.
memory_store – Optional persistent memory dependency.
memory_namespace – Memory namespace for read/write operations.
memory_read_top_k – Memory retrieval top-k.
memory_write_observations – Whether to persist per-step observations.
max_tool_calls_per_step – Code-mode per-step tool call cap.
execution_timeout_seconds – Code-mode sandbox timeout.
validate_tool_input_schema – Code-mode tool input schema validation toggle.
normalize_generated_code_per_step – Code-mode code normalization toggle.
default_tools_per_step – Code-mode default tool allowlist.
allowed_tools – Optional json-mode tool allowlist.
tracer – Optional tracer dependency.
- Raises:
ValueError – Raised when mode/tool configuration is invalid.
- compile(prompt, *, request_id=None, dependencies=None)[source]
Compile one run through the selected strategy mode.
- run(prompt, *, request_id=None, dependencies=None)[source]
Execute one run through the selected strategy mode.
- property workflow
Expose the most recently compiled workflow from the selected strategy.
LLM Clients and Selection
All public LLM clients implement the same introspection helpers in addition to
generation methods: default_model(), capabilities(),
config_snapshot(), server_snapshot(), and describe(). They also
implement close() plus with-statement lifecycle support; the
context-manager form is the preferred public usage pattern.
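The close-plus-context-manager lifecycle shared by these clients can be illustrated with a hypothetical stand-in (FakeClient below is not a library class; it only mirrors the documented pattern):

```python
# Hypothetical stand-in client showing the documented lifecycle pattern:
# close() releases resources, and with-statement use guarantees it runs.
class FakeClient:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never suppress exceptions

# Preferred usage: the context-manager form closes the client automatically,
# even if the body raises.
with FakeClient() as client:
    assert not client.closed
assert client.closed
```

The same `with` form applies to every public client listed below, which is why it is the preferred usage pattern over calling close() manually.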
- class design_research_agents.LlamaCppServerLLMClient(*, name='llama-local', model='Qwen2.5-1.5B-Instruct-Q4_K_M.gguf', hf_model_repo_id='bartowski/Qwen2.5-1.5B-Instruct-GGUF', api_model='qwen2-1.5b-q4', host='127.0.0.1', port=8001, context_window=4096, startup_timeout_seconds=60.0, request_timeout_seconds=60.0, poll_interval_seconds=0.25, python_executable='/opt/hostedtoolcache/Python/3.12.13/x64/bin/python3', extra_server_args=(), max_retries=2, model_patterns=None)[source]
Client for a managed local llama_cpp.server backend.
Initialize a local llama-cpp client with sensible defaults.
- Parameters:
name – Logical name for this client instance, used in logging and provenance.
model – Local model identifier or path for llama_cpp.server to load.
hf_model_repo_id – Optional Hugging Face repo ID to auto-download the model from if not found locally.
api_model – The model name to report in API responses, which can differ from the local model name.
host – Host interface for the local server to bind to.
port – Port for the local server to listen on.
context_window – Context window size (n_ctx) to configure the llama_cpp.server with.
startup_timeout_seconds – Max time to wait for the server process to start and become healthy.
request_timeout_seconds – HTTP timeout for generate and stream requests.
poll_interval_seconds – Time interval between health check polls during startup.
python_executable – Python executable to use for running the server process.
extra_server_args – Additional command-line arguments to pass when starting the server process.
max_retries – Number of times to retry a request in case of failure before giving up.
model_patterns – Optional tuple of model name patterns supported by this client, used for routing decisions. If None, defaults to (api_model,).
- class design_research_agents.AnthropicServiceLLMClient(*, name='anthropic', default_model='claude-3-5-haiku-latest', api_key_env='ANTHROPIC_API_KEY', api_key=None, base_url=None, max_retries=2, model_patterns=None)[source]
Client for the official Anthropic API backend.
Initialize an Anthropic service client with sensible defaults.
- class design_research_agents.GeminiServiceLLMClient(*, name='gemini', default_model='gemini-2.5-flash', api_key_env='GOOGLE_API_KEY', api_key=None, max_retries=2, model_patterns=None)[source]
Client for the official Gemini API backend.
Initialize a Gemini service client with sensible defaults.
- class design_research_agents.GroqServiceLLMClient(*, name='groq', default_model='llama-3.1-8b-instant', api_key_env='GROQ_API_KEY', api_key=None, base_url=None, max_retries=2, model_patterns=None)[source]
Client for the official Groq API backend.
Initialize a Groq service client with sensible defaults.
- class design_research_agents.OpenAIServiceLLMClient(*, name='openai', default_model='gpt-4o-mini', api_key_env='OPENAI_API_KEY', api_key=None, base_url=None, max_retries=2, model_patterns=None)[source]
Client for the official OpenAI API backend.
Initialize an OpenAI service client with sensible defaults.
- class design_research_agents.AzureOpenAIServiceLLMClient(*, name='azure-openai', default_model='gpt-4o-mini', api_key_env='AZURE_OPENAI_API_KEY', api_key=None, azure_endpoint_env='AZURE_OPENAI_ENDPOINT', azure_endpoint=None, api_version_env='AZURE_OPENAI_API_VERSION', api_version=None, max_retries=2, model_patterns=None)[source]
Client for the Azure OpenAI API via the official OpenAI SDK.
Initialize an Azure OpenAI service client with sensible defaults.
- class design_research_agents.OpenAICompatibleHTTPLLMClient(*, name='openai-compatible', base_url='http://127.0.0.1:8001/v1', default_model='qwen2-1.5b-q4', api_key_env='OPENAI_API_KEY', api_key=None, max_retries=2, model_patterns=None)[source]
Client for OpenAI-compatible HTTP endpoints.
Initialize an OpenAI-compatible HTTP client with sensible defaults.
- class design_research_agents.TransformersLocalLLMClient(*, name='transformers-local', model_id='distilgpt2', default_model='distilgpt2', device='auto', dtype='auto', quantization='none', trust_remote_code=False, revision=None, max_retries=2, model_patterns=None)[source]
Client for in-process Transformers local inference.
Initialize a local Transformers client with sensible defaults.
- Parameters:
name – Logical name for this client instance, used in logging and provenance.
model_id – Identifier for the model to load (e.g. “distilgpt2” or a Hugging Face repo ID like “gpt2”).
default_model – Default model name for prompts that don’t specify one.
device – Device to load the model on (e.g. “cpu”, “cuda”, “mps”, or “auto” to automatically select based on availability).
dtype – Data type to use for model weights (e.g. “float16”, “bfloat16”, “int8”, or “auto” to automatically select based on device).
quantization – Quantization level to use when loading the model (e.g. “4-bit”, “8-bit”, “fp16”, or “none” for no quantization).
trust_remote_code – Whether to allow execution of custom code from remote repositories when loading models, which may be required for some models but can be a security risk.
revision – Optional model revision to load (e.g. a git branch, tag, or commit hash), if the model is being loaded from a Hugging Face repository that has multiple revisions.
max_retries – Number of times to retry a request in case of failure before giving up.
model_patterns – Optional tuple of model name patterns supported by this client, used for routing decisions. If None, defaults to (default_model,).
- class design_research_agents.MLXLocalLLMClient(*, name='mlx-local', model_id='mlx-community/Qwen2.5-1.5B-Instruct-4bit', default_model='mlx-community/Qwen2.5-1.5B-Instruct-4bit', quantization='none', max_retries=2, model_patterns=None)[source]
Client for Apple MLX local inference.
Initialize an MLX local client with sensible defaults.
- Parameters:
name – Logical name for this client instance, used in logging and provenance.
model_id – Identifier for the MLX model to load (e.g. “mlx-community/Qwen2.5-1.5B-Instruct-4bit”).
default_model – Default model name for prompts that don’t specify one.
quantization – Quantization level to use when loading the model (e.g. “4-bit”, “8-bit”, “fp16”).
max_retries – Number of times to retry a request in case of failure before giving up.
model_patterns – Optional tuple of model name patterns supported by this client, used for routing decisions. If None, defaults to (default_model,).
- class design_research_agents.VLLMServerLLMClient(*, name='vllm-local', model='Qwen/Qwen2.5-1.5B-Instruct', api_model='qwen2.5-1.5b-instruct', host='127.0.0.1', port=8002, manage_server=True, startup_timeout_seconds=90.0, poll_interval_seconds=0.5, python_executable='/opt/hostedtoolcache/Python/3.12.13/x64/bin/python3', extra_server_args=(), base_url=None, request_timeout_seconds=60.0, max_retries=2, model_patterns=None)[source]
Client for local or self-hosted vLLM OpenAI-compatible inference.
Initialize a vLLM client in managed-server or connect mode.
- Parameters:
name – Logical name for this client instance.
model – Model identifier passed to managed vLLM server startup.
api_model – Model alias exposed by vLLM OpenAI-compatible API.
host – Host interface used in managed mode.
port – TCP port used in managed mode.
manage_server – Whether this client manages the vLLM server lifecycle.
startup_timeout_seconds – Maximum startup wait time in managed mode.
poll_interval_seconds – Delay between readiness probes in managed mode.
python_executable – Python executable used to launch managed vLLM process.
extra_server_args – Additional CLI flags forwarded to vLLM server.
base_url – Optional connect-mode endpoint URL. Required only for remote/self-managed deployments; defaults to http://{host}:{port}/v1.
request_timeout_seconds – HTTP timeout for generate and stream requests.
max_retries – Number of retries for retryable provider/transport errors.
model_patterns – Optional tuple of model patterns for routing decisions.
- Raises:
ValueError – If manage_server and base_url are both configured.
- class design_research_agents.OllamaLLMClient(*, name='ollama-local', default_model='qwen2.5:1.5b-instruct', host='127.0.0.1', port=11434, manage_server=True, ollama_executable='ollama', auto_pull_model=False, startup_timeout_seconds=60.0, poll_interval_seconds=0.25, request_timeout_seconds=60.0, max_retries=2, model_patterns=None)[source]
Client for local or self-hosted Ollama chat inference.
Initialize an Ollama client in managed-server or connect mode.
- Parameters:
name – Logical name for this client instance.
default_model – Default model id used when requests omit model.
host – Host interface used in managed mode or connect mode.
port – TCP port used in managed mode or connect mode.
manage_server – Whether this client manages the ollama serve lifecycle.
ollama_executable – Executable used to invoke ollama commands.
auto_pull_model – Whether to pull default_model after startup.
startup_timeout_seconds – Maximum startup wait time in managed mode.
poll_interval_seconds – Delay between readiness probes in managed mode.
request_timeout_seconds – HTTP timeout for generate and stream requests.
max_retries – Number of retries for retryable provider/transport errors.
model_patterns – Optional tuple of model patterns for routing decisions.
- class design_research_agents.SGLangServerLLMClient(*, name='sglang-local', model='Qwen/Qwen2.5-1.5B-Instruct', host='127.0.0.1', port=30000, manage_server=True, startup_timeout_seconds=90.0, poll_interval_seconds=0.5, python_executable='/opt/hostedtoolcache/Python/3.12.13/x64/bin/python3', extra_server_args=(), base_url=None, request_timeout_seconds=60.0, max_retries=2, model_patterns=None)[source]
Client for local or self-hosted SGLang OpenAI-compatible inference.
Initialize an SGLang client in managed-server or connect mode.
- Parameters:
name – Logical name for this client instance.
model – Model identifier passed to managed SGLang server startup.
host – Host interface used in managed mode.
port – TCP port used in managed mode.
manage_server – Whether this client manages the SGLang server lifecycle.
startup_timeout_seconds – Maximum startup wait time in managed mode.
poll_interval_seconds – Delay between readiness probes in managed mode.
python_executable – Python executable used to launch managed SGLang process.
extra_server_args – Additional CLI flags forwarded to SGLang server.
base_url – Optional connect-mode endpoint URL. Required only for remote/self-managed deployments; defaults to http://{host}:{port}/v1.
request_timeout_seconds – HTTP timeout for generate and stream requests.
max_retries – Number of retries for retryable provider/transport errors.
model_patterns – Optional tuple of model patterns for routing decisions.
- Raises:
ValueError – If manage_server and base_url are both configured.
- class design_research_agents.ModelSelector(*, catalog=None, prefer_local=True, ram_reserve_gb=2.0, vram_reserve_gb=0.5, max_load_ratio=0.85, remote_cost_floor_usd=0.02, default_max_latency_ms=None, local_client_resolver=None)[source]
Flat model selection interface with client/config resolution helpers.
Initialize model selector policy controls and optional resolver hook.
- Parameters:
catalog – Optional model catalog to use for selection.
prefer_local – Whether to prefer local models over remote ones when all else is equal.
ram_reserve_gb – Amount of RAM (in GB) to reserve when evaluating local candidates.
vram_reserve_gb – Amount of GPU VRAM (in GB) to reserve when evaluating local candidates.
max_load_ratio – Maximum system load ratio to consider a local candidate viable (0.0 to 1.0).
remote_cost_floor_usd – Minimum cost threshold (in USD) for remote models to be considered viable.
default_max_latency_ms – Default maximum latency (in milliseconds) to consider when evaluating candidates, if not specified in selection constraints.
local_client_resolver – Optional callable that takes a ModelSelectionDecision and returns a dict with ‘client_class’ and ‘kwargs’ for constructing a local client when the provider is not recognized by the built-in resolver. This allows for custom local providers to be integrated without modifying the ModelSelector code.
- select(*, task, priority='balanced', require_local=False, preferred_provider=None, max_cost_usd=None, max_latency_ms=None, hardware_profile=None, output='client')[source]
Select a model and return a decision, config mapping, or live client.
- Parameters:
task – Description of the task or use case for which a model is being selected.
priority – Selection priority, which may influence the trade-off between quality, latency, and cost in the decision process.
require_local – If True, only consider local models as viable candidates.
preferred_provider – Optional provider name to prioritize in the selection process.
max_cost_usd – Optional maximum cost threshold (in USD) for candidate models.
max_latency_ms – Optional maximum latency threshold (in milliseconds) for candidate models.
hardware_profile – Optional mapping or HardwareProfile instance describing the current hardware state, which may be used to evaluate local candidates.
output – Determines the format of the selection result. “client” returns an instantiated LLMClient ready for use, “decision” returns the raw ModelSelectionDecision object with details of the selection rationale, and “client_config” returns a dict containing the information needed to construct an LLMClient (including ‘client_class’ and ‘kwargs’) without actually instantiating it.
- Returns:
Depending on the ‘output’ parameter –
If output is “client”: An instantiated LLMClient configured according to the selection decision, ready for use in making requests.
If output is “decision”: A ModelSelectionDecision object containing details about the selected model, provider, rationale, and policy information.
If output is “client_config”: A dict containing the resolved client configuration, including ‘client_class’, ‘kwargs’, and metadata from the selection decision, which can be used to instantiate an LLMClient at a later time or in a different context.
- Raises:
ValueError – If output is unsupported or selection/config coercion fails.
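The “client_config” output mode exists so instantiation can be deferred: per the return description above, the mapping carries ‘client_class’ and ‘kwargs’. A hypothetical sketch of that deferred-construction pattern (FakeClient stands in for whichever real client class the selector resolves):

```python
# Hypothetical sketch of deferred construction from a "client_config" result.
# FakeClient is a stand-in, not a library class.
class FakeClient:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

# Shape documented for output="client_config": class plus constructor kwargs.
config = {"client_class": FakeClient, "kwargs": {"name": "demo", "max_retries": 2}}

# Instantiate later, possibly in a different process or context.
client = config["client_class"](**config["kwargs"])
assert client.kwargs["name"] == "demo"
```

This is why “client_config” is useful when the selection decision is made in one place (e.g. a planner) and the client is constructed in another.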
Core Contracts
ExecutionResult and per-step WorkflowStepResult objects expose matching
output access helpers for safe reads from loosely structured payloads. The
public ToolResult contract also includes normalized getters such as
result_dict(), result_list(), error_message, and artifact_paths.
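The normalization contract shared by these helpers (documented under output_dict, output_list, result_dict, and result_list below) can be sketched in isolation. The functions here are illustrations of the documented behavior, not the library implementation:

```python
# Illustrative re-statement of the documented normalization contract:
# mapping-like values normalize to a dict, list/tuple values to a list,
# and anything else to the empty container. (The library's notion of
# "mapping-like" may be broader than plain dict; this sketch uses dict.)
def normalize_dict(value):
    return dict(value) if isinstance(value, dict) else {}

def normalize_list(value):
    return list(value) if isinstance(value, (list, tuple)) else []

output = {"report": {"title": "demo"}, "steps": ("plan", "act"), "note": "text"}
assert normalize_dict(output.get("report")) == {"title": "demo"}
assert normalize_list(output.get("steps")) == ["plan", "act"]
assert normalize_dict(output.get("note")) == {}    # non-mapping -> {}
assert normalize_list(output.get("missing")) == []  # absent key -> []
```

The point of the contract is that callers never need isinstance checks or try/except around loosely structured payloads: the helpers always return a container of the expected type.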
- class design_research_agents.ExecutionResult(*, success, output=<factory>, tool_results=<factory>, model_response=None, step_results=<factory>, execution_order=<factory>, metadata=<factory>)[source]
Structured output produced by one execution entrypoint.
This shape intentionally covers both agent-like executions and workflow-like executions so callers can consume one result contract everywhere.
- property error
Return terminal error payload when present.
- Returns:
Error payload from output mapping, or None.
- execution_order
Step ids in the order they were executed for workflow-style runs.
- property final_output
Return workflow/agent final_output payload when present.
- Returns:
Final output value from output payload, or None.
- metadata
Additional diagnostics, runtime counters, and trace metadata.
- model_response
Final model response associated with the run, when available.
- output
Primary payload produced by the entrypoint.
- output_dict(key)[source]
Return one output value normalized to a dictionary.
- Parameters:
key – Output key to read.
- Returns:
Dictionary value when the output value is mapping-like, else {}.
- output_list(key)[source]
Return one output value normalized to a list.
- Parameters:
key – Output key to read.
- Returns:
List value when the output value is a list/tuple, else [].
- output_value(key, default=None)[source]
Return one output value by key with optional default.
- Parameters:
key – Output key to read.
default – Value returned when key is absent.
- Returns:
Output value for key when present, else default.
- step_results
Per-step results keyed by step id for workflow-style runs.
- success
True when the overall run completed without terminal failure.
- summary()[source]
Return one compact summary payload for user-facing output.
- Returns:
Compact summary payload with canonical execution fields.
- property terminated_reason
Return normalized termination reason when present.
- Returns:
Termination reason string, or
None.
- to_dict()[source]
Return a JSON-serializable dictionary representation of the result.
- Returns:
Dictionary representation of the result payload.
- to_json(*, ensure_ascii=True, indent=2, sort_keys=True)[source]
Return JSON string for deterministic pretty-printing.
- Parameters:
ensure_ascii – Forwarded to json.dumps.
indent – Forwarded to json.dumps.
sort_keys – Forwarded to json.dumps.
- Returns:
JSON representation of this result.
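Because these flags are forwarded to json.dumps, the determinism comes straight from the standard library. A self-contained sketch of the equivalent call (the payload shape is illustrative, not a real result):

```python
import json

# Equivalent standard-library call: sort_keys=True fixes key order, so the
# same payload always serializes to the same string.
payload = {"success": True, "output": {"final_output": "done"}}
text = json.dumps(payload, ensure_ascii=True, indent=2, sort_keys=True)

# Deterministic: repeated serialization yields an identical string.
assert text == json.dumps(payload, ensure_ascii=True, indent=2, sort_keys=True)
# Keys are emitted in sorted order ("output" before "success").
assert text.splitlines()[1].lstrip().startswith('"output"')
```

Sorted keys make the output stable across runs and Python versions, which is what makes to_json suitable for snapshot tests and diff-friendly logs.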
- tool_results
Tool invocation results captured during execution, in call order.
- class design_research_agents.LLMRequest(*, messages, model=None, temperature=None, max_tokens=None, tools=(), response_schema=None, response_format=None, metadata=<factory>, provider_options=<factory>, task_profile=None)[source]
Provider-neutral request payload for LLM generation.
- max_tokens
Maximum output token limit.
- messages
Ordered conversation/messages sent to the model.
- metadata
Caller metadata forwarded for tracing and diagnostics.
- model
Explicit model identifier override for this request.
- provider_options
Backend/provider-specific low-level options.
- response_format
Provider-specific response-format hints.
- response_schema
Optional schema for structured output validation.
- task_profile
Optional routing profile used by selector-aware clients.
- temperature
Sampling temperature override.
- tools
Tool specifications exposed for model tool-calling.
- class design_research_agents.LLMMessage(*, role, content, name=None, tool_call_id=None, tool_name=None)[source]
One chat message in the provider-neutral completion format.
- content
Plain-text message content.
- name
Optional participant name, when supported by the provider.
- role
Message role used by chat-compatible backends.
- tool_call_id
Tool call identifier for tool-response messages.
- tool_name
Tool name associated with a tool-response message.
- class design_research_agents.LLMResponse(*, text, model=None, provider=None, finish_reason=None, usage=None, latency_ms=None, raw_output=None, tool_calls=(), raw=None, provenance=None)[source]
Normalized non-streaming response payload returned by a backend.
- finish_reason
Provider-specific completion reason.
- latency_ms
End-to-end latency in milliseconds.
- model
Model identifier reported by the backend.
- provenance
Execution provenance metadata for auditability.
- provider
Provider/backend name that produced this response.
- raw
Canonical raw backend payload snapshot.
- raw_output
Legacy/raw backend payload for debugging.
- text
Primary response text emitted by the model.
- tool_calls
Tool calls requested by the model in this response.
- usage
Token usage counters when available.
- class design_research_agents.ToolResult(*, tool_name, ok, result=None, artifacts=(), warnings=(), error=None, metadata=None)[source]
Result payload emitted from a tool runtime invocation.
Initialize canonical tool result payload.
- Parameters:
tool_name – Name of the invoked tool.
ok – Invocation success flag.
result – Primary result payload (defaults to empty mapping).
artifacts – Raw or typed artifact entries to normalize.
warnings – Warning messages to attach to the result.
error – Error payload to normalize into ToolError.
metadata – Optional diagnostic metadata mapping.
- property artifact_paths
Return artifact paths in emitted order.
- Returns:
Tuple of artifact path strings.
- artifacts
Artifact list emitted by the invocation.
- error
Structured error details when ok is false.
- property error_message
Return the normalized tool error message when present.
- Returns:
Error message string, or
None.
- metadata
Supplemental runtime metadata for diagnostics and tracing.
- ok
True when invocation succeeded.
- result
Primary tool return payload.
- result_dict()[source]
Return the primary result payload normalized to a dictionary.
- Returns:
Dictionary value when result is mapping-like, else {}.
- result_list()[source]
Return the primary result payload normalized to a list.
- Returns:
List value when result is a list/tuple, else [].
- tool_name
Name of the invoked tool.
- warnings
Non-fatal warnings produced during invocation.
Orchestration
Workflow Steps and Facade
CompiledExecution is the workflow-backed object returned by delegate
compile(...) methods. Calling compiled.run() executes the bound
workflow and applies delegate-specific finalization. Accessing
compiled.workflow gives the raw workflow graph for inspection and testing.
Calling compiled.workflow.run(...) directly bypasses that finalization
layer and returns the raw workflow result.
Workflow step executions surface WorkflowStepResult payloads through
ExecutionResult.step_results. These step results mirror the top-level
ExecutionResult output accessor helpers for consistent reads.
- class design_research_agents.LogicStep(*, step_id, handler, dependencies=(), route_map=None, artifacts_builder=None)[source]
Workflow step that executes deterministic local logic.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- dependencies
Step ids that must complete before this step can run.
- handler
Deterministic local function that computes this step output.
- route_map
Optional route key to downstream-target mapping for conditional activation.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- class design_research_agents.ToolStep(*, step_id, tool_name, dependencies=(), input_data=None, input_builder=None, artifacts_builder=None)[source]
Workflow step that invokes one runtime tool.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- dependencies
Step ids that must complete before this step can run.
- input_builder
Optional callback that derives input payload from runtime step context.
- input_data
Static input payload used when input_builder is not provided.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- tool_name
Registered tool name to invoke through the tool runtime.
- class design_research_agents.DelegateStep(*, step_id, delegate, dependencies=(), prompt=None, prompt_builder=None, artifacts_builder=None)[source]
Workflow step that invokes one direct delegate.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- delegate
Direct delegate object (agent, pattern, or workflow-like runner).
- dependencies
Step ids that must complete before this step can run.
- prompt
Static prompt passed to the delegate when prompt_builder is absent.
- prompt_builder
Optional callback that derives a prompt string from runtime step context.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- class design_research_agents.ModelStep(*, step_id, llm_client, request_builder, dependencies=(), response_parser=None, artifacts_builder=None)[source]
Workflow step that executes one model request through an LLM client.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- dependencies
Step ids that must complete before this step can run.
- llm_client
LLM client used to execute the request built for this step.
- request_builder
Callback that builds the LLMRequest payload from runtime context.
- response_parser
Optional callback that parses model response into structured output.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- class design_research_agents.DelegateBatchStep(*, step_id, calls_builder, dependencies=(), fail_fast=True, artifacts_builder=None)[source]
Workflow step that executes multiple delegate invocations in sequence.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- calls_builder
Callback that builds batch delegate call specs from runtime context.
- dependencies
Step ids that must complete before this step can run.
- fail_fast
Whether to stop executing additional calls after first failure.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- class design_research_agents.LoopStep(*, step_id, steps, dependencies=(), max_iterations=1, initial_state=None, continue_predicate=None, state_reducer=None, execution_mode='sequential', failure_policy='skip_dependents', artifacts_builder=None)[source]
Workflow step that executes an iterative nested workflow body.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- continue_predicate
Predicate deciding whether to execute the next iteration.
- dependencies
Step ids that must complete before loop iteration begins.
- execution_mode
Execution mode used for nested loop-body workflow runs.
- failure_policy
Failure handling policy applied within each loop iteration run.
- initial_state
Initial loop state mapping provided to iteration context.
- max_iterations
Hard cap on the number of loop iterations.
- state_reducer
Reducer that computes next loop state from prior state and iteration result.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- steps
Static loop body steps executed for each iteration.
- class design_research_agents.MemoryReadStep(*, step_id, query_builder, dependencies=(), namespace='default', top_k=5, min_score=None, artifacts_builder=None)[source]
Workflow step that reads relevant records from the memory store.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- dependencies
Step ids that must complete before this step can run.
- min_score
Optional minimum score threshold for returned records.
- namespace
Namespace partition to read from.
- query_builder
Callback that builds query text or query payload from step context.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- top_k
Maximum number of records to return.
- class design_research_agents.MemoryWriteStep(*, step_id, records_builder, dependencies=(), namespace='default', artifacts_builder=None)[source]
Workflow step that writes records into the memory store.
- artifacts_builder
Optional callback that extracts user-facing artifact manifests from step context.
- dependencies
Step ids that must complete before this step can run.
- namespace
Namespace partition to write into.
- records_builder
Callback that builds record payloads from step context.
- step_id
Unique step identifier used for dependency wiring and result lookup.
- class design_research_agents.Workflow(*, tool_runtime=None, memory_store=None, steps, input_schema=None, output_schema=None, prompt_context_key='prompt', base_context=None, default_execution_mode='sequential', default_failure_policy='skip_dependents', default_request_id_prefix=None, default_dependencies=None, tracer=None)[source]
Configured workflow for user-defined step graphs and run defaults.
Store runtime dependencies, step graph, and input handling mode.
- Parameters:
tool_runtime – Tool runtime used by ToolStep executions.
memory_store – Optional memory store used by memory step executions.
steps – Static workflow step graph to execute for each run.
input_schema – Optional schema used to infer input mode and validate mapped input. When omitted, workflow expects prompt-string input.
output_schema – Optional schema enforced against output.final_output when the run succeeds.
prompt_context_key – Context key used to store normalized prompt input.
base_context – Base context merged into every run context.
default_execution_mode – Default runtime step scheduling mode.
default_failure_policy – Default dependency failure handling policy.
default_request_id_prefix – Optional prefix used to derive request ids.
default_dependencies – Default dependency objects injected into each run.
tracer – Optional tracer used for workflow runtime events.
- Raises:
ValueError – If constructor inputs are inconsistent.
- run(input=None, *, execution_mode=None, failure_policy=None, request_id=None, dependencies=None)[source]
Execute one workflow run with input mode inferred from input_schema.
- Parameters:
input – Prompt string when input_schema is omitted; otherwise a schema mapping.
execution_mode – Optional per-run execution mode override.
failure_policy – Optional per-run failure policy override.
request_id – Optional explicit request id for tracing/correlation.
dependencies – Optional per-run dependency overrides.
- Returns:
Aggregated workflow execution result.
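The documented input modes (prompt string when input_schema is omitted, schema mapping otherwise) can be sketched as a normalization function. This is an assumption-level illustration: normalize_input is hypothetical, and the schema is treated as a JSON-Schema-like mapping with a "required" key:

```python
def normalize_input(input, *, input_schema=None, prompt_context_key="prompt"):
    """Mirror the documented input modes: prompt-string mode when no
    schema is configured, mapping mode otherwise."""
    if input_schema is None:
        if not isinstance(input, str):
            raise ValueError("prompt-string input expected when input_schema is omitted")
        return {prompt_context_key: input}
    if not isinstance(input, dict):
        raise ValueError("mapping input expected when input_schema is configured")
    missing = [k for k in input_schema.get("required", []) if k not in input]
    if missing:
        raise ValueError(f"missing required input keys: {missing}")
    return dict(input)
```

In prompt mode the string lands in the run context under prompt_context_key; in mapping mode the validated mapping is passed through.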
- class design_research_agents.CompiledExecution(*, workflow, input, request_id, dependencies, delegate_name, finalize=<function _identity_result>, execution_mode='sequential', failure_policy='skip_dependents', tracer=None, trace_input=<factory>, workflow_request_id=None)[source]
Bound compiled delegate execution that can be run repeatedly.
- delegate_name
Delegate name used for top-level trace metadata.
- dependencies
Bound dependency payload mapping.
- execution_mode
Workflow execution mode used by run().
- failure_policy
Workflow failure policy used by run().
- finalize
Finalizer that maps the raw workflow result into the delegate result.
- input
Bound workflow input payload.
- request_id
Top-level request identifier for delegate tracing.
- trace_input
Input payload attached to the top-level trace scope.
- tracer
Optional tracer used for top-level compile-run traces.
- workflow
Workflow graph compiled for this execution.
- workflow_request_id
Optional nested workflow request id override.
Patterns
Pattern compile(...) methods are the lower-level construction hook for
advanced callers. They return a bound CompiledExecution instead of running
immediately; execute the bound workflow by calling compiled.run().
- class design_research_agents.TwoSpeakerConversationPattern(*, llm_client_a, llm_client_b=None, speaker_a_delegate=None, speaker_b_delegate=None, max_turns=3, speaker_a_name='speaker_a', speaker_b_name='speaker_b', speaker_a_system_prompt=None, speaker_a_user_prompt_template=None, speaker_b_system_prompt=None, speaker_b_user_prompt_template=None, default_request_id_prefix=None, default_dependencies=None, tracer=None)[source]
Two-speaker LLM conversation pattern with per-speaker prompts and clients.
Store dependencies and prompt defaults for conversation orchestration.
- Parameters:
llm_client_a – LLM client used by speaker A.
llm_client_b – Optional LLM client used by speaker B. Defaults to llm_client_a when omitted.
speaker_a_delegate – Optional explicit delegate for speaker A.
speaker_b_delegate – Optional explicit delegate for speaker B.
max_turns – Maximum conversation turns where each turn is A->B.
speaker_a_name – Display name for speaker A in transcript and prompts.
speaker_b_name – Display name for speaker B in transcript and prompts.
speaker_a_system_prompt – Optional override for speaker A system prompt.
speaker_a_user_prompt_template – Optional speaker A user template override.
speaker_b_system_prompt – Optional override for speaker B system prompt.
speaker_b_user_prompt_template – Optional speaker B user template override.
default_request_id_prefix – Optional request-id prefix used for auto-generated ids.
default_dependencies – Default dependency mapping merged into each run.
tracer – Optional tracer used for pattern and nested agent traces.
- Raises:
ValueError – Raised when constructor configuration is invalid.
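The turn structure (max_turns, each turn one A->B exchange) can be illustrated with plain callables standing in for the two speaker delegates. This is a sketch of the documented conversation shape, not the pattern's internals:

```python
def run_conversation(speak_a, speak_b, opening, *, max_turns=3,
                     speaker_a_name="speaker_a", speaker_b_name="speaker_b"):
    """Each turn is one A->B exchange appended to a shared transcript."""
    transcript = []
    last = opening
    for _ in range(max_turns):
        a_msg = speak_a(last, transcript)
        transcript.append((speaker_a_name, a_msg))
        b_msg = speak_b(a_msg, transcript)
        transcript.append((speaker_b_name, b_msg))
        last = b_msg
    return transcript

# Stub speakers that just echo what they received.
log = run_conversation(
    lambda prompt, transcript: f"A hears: {prompt}",
    lambda prompt, transcript: f"B hears: {prompt}",
    "hello",
    max_turns=2,
)
```

Two turns yield four transcript entries, alternating under the configured speaker names.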
- class design_research_agents.DebatePattern(*, llm_client, tool_runtime, affirmative_delegate=None, negative_delegate=None, judge_delegate=None, max_rounds=3, affirmative_system_prompt=None, affirmative_user_prompt_template=None, negative_system_prompt=None, negative_user_prompt_template=None, judge_system_prompt=None, judge_user_prompt_template=None, default_request_id_prefix='debate', default_dependencies=None, tracer=None)[source]
Configured reusable debate pattern with affirmative, negative, and judge phases.
Store dependencies and initialize prompt defaults.
- class design_research_agents.PlanExecutePattern(*, llm_client, tool_runtime, planner_delegate=None, executor_delegate=None, max_iterations=3, max_tool_calls_per_step=5, planner_system_prompt=None, planner_user_prompt_template=None, executor_step_prompt_template=None, default_request_id_prefix=None, default_dependencies=None, tracer=None)[source]
Planner/executor orchestration pattern built on workflow primitives.
Store dependencies and initialize workflow-native orchestration settings.
- Parameters:
llm_client – LLM client used for planner and executor model calls.
tool_runtime – Tool runtime used by executor agent steps.
planner_delegate – Optional planner delegate override.
executor_delegate – Optional executor delegate override.
max_iterations – Maximum number of plan steps executed in one run.
max_tool_calls_per_step – Maximum tool calls allowed per executor step.
planner_system_prompt – Optional override for planner system prompt.
planner_user_prompt_template – Optional override for planner user prompt.
executor_step_prompt_template – Optional override for executor step prompt.
default_request_id_prefix – Optional prefix used to derive request ids.
default_dependencies – Dependency defaults merged into each run.
tracer – Optional tracer used for run-level instrumentation.
- Raises:
ValueError – If max_iterations or max_tool_calls_per_step is invalid.
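The two budgets can be sketched together: max_iterations caps how many plan steps run, and max_tool_calls_per_step caps the tool budget handed to each step. The helpers below (plan_and_execute, the budgeted wrapper) are hypothetical illustrations of those documented limits, not the pattern's implementation:

```python
def plan_and_execute(plan, execute_step, call_tool, *,
                     max_iterations=3, max_tool_calls_per_step=5):
    """Run at most max_iterations plan steps; each step may spend at
    most max_tool_calls_per_step tool calls via the budgeted wrapper."""
    results = []
    for step in plan[:max_iterations]:
        calls = 0

        def budgeted(name, payload):
            nonlocal calls
            if calls >= max_tool_calls_per_step:
                raise RuntimeError("tool-call budget exhausted for this step")
            calls += 1
            return call_tool(name, payload)

        results.append(execute_step(step, budgeted))
    return results

# Only the first max_iterations plan steps run.
out = plan_and_execute(
    ["outline", "draft", "review", "publish"],
    lambda step, tool: tool("echo", {"step": step}),
    lambda name, payload: payload["step"].upper(),
    max_iterations=3,
)
```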
- class design_research_agents.ProposeCriticPattern(*, llm_client, tool_runtime, proposer_delegate=None, critic_delegate=None, max_iterations=3, proposer_system_prompt=None, proposer_user_prompt_template=None, critic_system_prompt=None, critic_user_prompt_template=None, default_request_id_prefix=None, default_dependencies=None, tracer=None)[source]
Propose/critique revision pattern built on workflow primitives.
Store dependencies and initialize workflow-native orchestration settings.
- Parameters:
llm_client – LLM client used by proposer and critic calls.
tool_runtime – Tool runtime used by loop execution runtime.
proposer_delegate – Optional proposer delegate override.
critic_delegate – Optional critic delegate override.
max_iterations – Maximum propose/critic iterations per run.
proposer_system_prompt – Optional override for proposer system prompt.
proposer_user_prompt_template – Optional proposer user prompt template.
critic_system_prompt – Optional override for critic system prompt.
critic_user_prompt_template – Optional critic user prompt template.
default_request_id_prefix – Optional prefix used to derive request ids.
default_dependencies – Dependency defaults merged into each run.
tracer – Optional tracer used for run-level instrumentation.
- Raises:
ValueError – If max_iterations is invalid.
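The propose/critique revision cycle can be sketched with two plain callables in place of the proposer and critic delegates (an illustration of the documented iteration shape; propose_and_revise is hypothetical):

```python
def propose_and_revise(propose, critique, task, *, max_iterations=3):
    """Alternate proposal and critique until the critic accepts or the
    iteration budget is spent; the last draft is returned either way."""
    draft, feedback = None, None
    for _ in range(max_iterations):
        draft = propose(task, draft, feedback)
        accepted, feedback = critique(task, draft)
        if accepted:
            break
    return draft

# Toy proposer grows the draft; toy critic accepts once it is long enough.
result = propose_and_revise(
    lambda task, prev, fb: (prev or "") + "x",
    lambda task, draft: (len(draft) >= 2, "longer please"),
    "write",
    max_iterations=5,
)
```

The critic's feedback feeds the next proposal, which is why both the prior draft and the feedback are passed back in.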
- class design_research_agents.RouterDelegatePattern(*, llm_client, tool_runtime, alternatives, alternative_descriptions=None, router_system_prompt=None, router_user_prompt_template=None, default_request_id_prefix=None, default_dependencies=None, tracer=None)[source]
Routing/delegation pattern built on workflow primitives.
Store dependencies and initialize workflow-native routing settings.
- Parameters:
llm_client – LLM client used by the router agent.
tool_runtime – Tool runtime used for cost and metadata accounting of delegated calls.
alternatives – Mapping of route keys to delegate objects.
alternative_descriptions – Optional descriptions used to guide routing.
router_system_prompt – Optional override for router system prompt.
router_user_prompt_template – Optional override for router user prompt.
default_request_id_prefix – Optional prefix used to derive request ids.
default_dependencies – Dependency defaults merged into each run.
tracer – Optional tracer used for run-level instrumentation.
- Raises:
ValueError – If no valid route alternatives are supplied.
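Routing reduces to choosing one key from alternatives and dispatching to the delegate registered under it. In the real pattern the router LLM picks the key; the route_and_run helper below is a hypothetical sketch of the dispatch step only:

```python
def route_and_run(route_key, alternatives, payload):
    """Dispatch to the delegate registered under route_key."""
    if route_key not in alternatives:
        raise ValueError(f"unknown route: {route_key!r}")
    return alternatives[route_key](payload)

alternatives = {
    "summarize": lambda p: f"summary of {p}",
    "translate": lambda p: f"translation of {p}",
}
routed = route_and_run("summarize", alternatives, "doc")
```

The alternative_descriptions mapping exists to make that key choice easier for the router prompt; it plays no part in dispatch itself.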
- class design_research_agents.RoundBasedCoordinationPattern(*, peers, max_rounds=4, initial_state=None, peer_prompt_builder=None, tracer=None)[source]
Round-based peer coordination pattern with deterministic peer ordering.
Initialize peer-only networked orchestration.
- Parameters:
peers – Mapping of peer ids to delegate objects.
max_rounds – Maximum number of coordination rounds.
initial_state – Optional initial shared state payload.
peer_prompt_builder – Optional prompt builder per peer and round.
tracer – Optional tracer dependency.
- Raises:
ValueError – Raised when peers is empty or max_rounds is invalid.
- class design_research_agents.BlackboardPattern(*, peers, max_rounds=6, stability_rounds=2, initial_state=None, peer_prompt_builder=None, tracer=None)[source]
Networked pattern with explicit blackboard reducer semantics.
Initialize blackboard specialization with convergence controls.
- Parameters:
peers – Peer delegates participating in rounds.
max_rounds – Maximum rounds before termination.
stability_rounds – Number of unchanged state hashes required to declare convergence.
initial_state – Optional initial blackboard override mapping.
peer_prompt_builder – Optional peer prompt builder callback.
tracer – Optional tracer dependency.
- Raises:
ValueError – Raised when stability_rounds is less than one.
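Convergence via stability_rounds means the shared state hash must be unchanged for that many consecutive rounds. A self-contained sketch of that termination rule (illustrative only; the library's state and hashing details may differ):

```python
import hashlib
import json

def state_hash(state):
    """Stable hash of a JSON-serializable state mapping."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def run_blackboard(apply_round, initial_state, *, max_rounds=6, stability_rounds=2):
    """Stop at max_rounds, or earlier once the state hash is unchanged
    for stability_rounds consecutive rounds (convergence)."""
    state = dict(initial_state)
    unchanged = 0
    for round_no in range(max_rounds):
        new_state = apply_round(round_no, state)
        unchanged = unchanged + 1 if state_hash(new_state) == state_hash(state) else 0
        state = new_state
        if unchanged >= stability_rounds:
            return state, round_no + 1, True
    return state, max_rounds, False

# Toy round: append round numbers until two items exist, then stabilize.
def toy_round(round_no, state):
    if len(state["items"]) < 2:
        return {"items": state["items"] + [round_no]}
    return state

final_state, rounds, converged = run_blackboard(toy_round, {"items": []})
```

With stability_rounds=2, the loop needs two quiet rounds after the last change before declaring convergence, which guards against states that merely oscillate slowly.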
- class design_research_agents.BeamSearchPattern(*, generator_delegate, evaluator_delegate, max_depth=3, branch_factor=3, beam_width=2, tracer=None)[source]
Beam-style tree search over generated candidate states.
Initialize tree-search reasoning pattern.
- Parameters:
generator_delegate – Delegate that expands one candidate into children.
evaluator_delegate – Delegate that assigns a score to one candidate.
max_depth – Maximum expansion depth.
branch_factor – Max children retained per expanded node.
beam_width – Max frontier width kept after each depth.
tracer – Optional tracer dependency.
- Raises:
ValueError – Raised when depth/branch/beam settings are invalid.
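The interplay of max_depth, branch_factor, and beam_width is ordinary beam search. A compact sketch with plain callables in place of the generator and evaluator delegates (beam_search here is illustrative, not the pattern's implementation):

```python
def beam_search(expand, score, root, *, max_depth=3, branch_factor=3, beam_width=2):
    """Expand each frontier candidate into at most branch_factor
    children, then keep the beam_width best-scoring candidates per
    depth; return the best candidate on the final frontier."""
    frontier = [root]
    for _ in range(max_depth):
        children = []
        for candidate in frontier:
            children.extend(expand(candidate)[:branch_factor])
        if not children:
            break
        children.sort(key=score, reverse=True)
        frontier = children[:beam_width]
    return max(frontier, key=score)

# Toy search: states are integers, children double them, the score
# prefers larger values.
best = beam_search(lambda n: [n * 2, n * 2 + 1], lambda n: n, 1,
                   max_depth=3, branch_factor=2, beam_width=2)
```

Branch_factor bounds how wide each expansion is, while beam_width bounds how much of that width survives to the next depth.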
- class design_research_agents.RAGPattern(*, reasoning_delegate, memory_store, memory_namespace='default', memory_top_k=5, memory_min_score=None, write_back=True, tracer=None)[source]
Reasoning pattern orchestrated as memory read -> reason -> memory write.
Initialize RAG reasoning pattern.
- Parameters:
reasoning_delegate – Delegate object that performs reasoning with retrieved context.
memory_store – Memory store used for retrieval and optional write-back.
memory_namespace – Namespace partition for reads/writes.
memory_top_k – Number of retrieved matches for reasoning context.
memory_min_score – Optional minimum retrieval score threshold.
write_back – Whether to persist one summary record after reasoning.
tracer – Optional tracer dependency.
- Raises:
ValueError – Raised when memory_top_k is less than one.
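The read -> reason -> write orchestration can be sketched against an in-memory store. The store shape, substring matching, and run_rag helper below are hypothetical simplifications; the real MemoryStore API may differ:

```python
def run_rag(reason, store, query, *, namespace="default", top_k=5, write_back=True):
    """Retrieve matching records, reason over them, then optionally
    persist one summary record of the answer (write-back)."""
    records = store.setdefault(namespace, [])
    hits = [r for r in records if query in r["text"]][:top_k]
    answer = reason(query, hits)
    if write_back:
        records.append({"text": answer, "kind": "summary"})
    return answer

store = {"default": [{"text": "paris is the capital of france"}]}
answer = run_rag(
    lambda q, hits: f"{len(hits)} match(es) for {q!r}",
    store, "paris",
)
```

With write_back enabled, each run leaves exactly one summary record behind in the namespace it read from.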
Tools
- class design_research_agents.Toolbox(*, workspace_root='.', enable_core_tools=True, script_tools=None, callable_tools=None, mcp_servers=None)[source]
Tool runtime that routes calls across enabled tool sources.
Initialize toolbox sources from ergonomic constructor arguments.
- Parameters:
workspace_root – Root directory for tools that interact with the filesystem.
enable_core_tools – Whether to enable the built-in core tools.
script_tools – Optional tuple of ScriptToolConfig definitions to expose through a script tool source.
callable_tools – Optional tuple of CallableToolConfig definitions to register as in-process tools.
mcp_servers – Optional tuple of MCP server definitions to connect to and expose tools from.
- property config
Return active runtime configuration.
- Returns:
Fully resolved runtime configuration for this toolbox.
- invoke(tool_name, input, *, request_id, dependencies)[source]
Invoke one tool through the registry routing layer.
- Parameters:
tool_name – Name of the tool to invoke. This will be normalized by stripping leading and trailing whitespace before lookup.
input – Mapping of input values to provide for this tool invocation. This will be validated against the tool’s input schema before invocation.
request_id – Request ID associated with this invocation; passed through to the underlying tool handler for logging and tracing.
dependencies – Dependency mapping for this invocation; passed through to the underlying tool handler to supply any context or resources the tool needs.
- Returns:
The result of the tool invocation, as returned by the underlying tool handler. This will be validated against the tool’s output schema before being returned to the caller.
- invoke_dict(tool_name, input, *, request_id, dependencies)[source]
Invoke one tool and require a successful dictionary payload.
- Parameters:
tool_name – Name of the tool to invoke.
input – Tool input payload mapping.
request_id – Request identifier associated with this invocation.
dependencies – Dependency payload mapping for this invocation.
- Returns:
Tool result mapping.
- Raises:
RuntimeError – If invocation fails or result payload is not a mapping.
- list_tools()[source]
List all tools currently exposed by enabled runtime sources.
- Returns:
Sequence of ToolSpec objects representing all tools currently exposed by enabled runtime sources, in no particular order.
- register_callable_tool(callable_tool)[source]
Register one callable tool wrapper.
- Parameters:
callable_tool – CallableToolConfig definition to register. The name field will be normalized by stripping leading and trailing whitespace, and must be non-empty after normalization.
- Returns:
None
- Raises:
Exception – Raised when this operation cannot complete.
- register_tool(*, spec, handler)[source]
Register a custom in-process tool.
- Parameters:
spec – ToolSpec defining the tool to register. The name field will be normalized by stripping leading and trailing whitespace, and must be non-empty after normalization.
handler – ToolHandler function to execute when this tool is invoked. The handler will be wrapped to match the expected signature for in-process tools, which includes additional parameters for request ID and dependencies that will be ignored by the provided handler.
- Returns:
None
- property registry
Return the source-merging registry.
- Returns:
Registry that owns source routing and invocation dispatch.
- class design_research_agents.CallableToolConfig(*, name, description, handler, input_schema=<factory>, output_schema=<factory>, permissions=(), risky=None)[source]
Simple in-process callable tool wrapper descriptor.
- description
Short description of the tool’s behavior.
- handler
Python callable that implements the tool’s behavior. It should accept a single argument of type Mapping[str, object] and return an arbitrary JSON-serializable object.
- input_schema
JSON Schema describing the expected input structure for the tool. This is used for validation and documentation purposes.
- name
Unique name of the tool.
- output_schema
JSON Schema describing the structure of the tool’s output. This is used for validation and documentation purposes.
- permissions
Optional tuple of permission strings that the tool requires. This can be used to enforce security constraints or to inform users about the tool’s capabilities.
- risky
Whether the tool performs potentially risky operations.
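A handler satisfying the documented contract takes a single Mapping[str, object] and returns a JSON-serializable value. A minimal example definition (the word_count tool name and schemas are hypothetical):

```python
from typing import Mapping

def word_count_handler(input: Mapping[str, object]) -> dict:
    """Count whitespace-separated words in the provided text."""
    text = str(input.get("text", ""))
    return {"words": len(text.split())}

# Hypothetical registration matching the documented constructor:
# tool = CallableToolConfig(
#     name="word_count",
#     description="Count words in a text.",
#     handler=word_count_handler,
#     input_schema={"type": "object",
#                   "properties": {"text": {"type": "string"}},
#                   "required": ["text"]},
#     output_schema={"type": "object",
#                    "properties": {"words": {"type": "integer"}}},
# )
```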
- class design_research_agents.ScriptToolConfig(*, name, path, description, input_schema=<factory>, output_schema=<factory>, filesystem_read=False, filesystem_write=False, network=False, commands=(), timeout_s=30, permissions=(), risky=None)[source]
One explicit script-backed tool definition.
- commands
Optional tuple of shell commands the tool is permitted to execute. When non-empty, any attempt to run a command outside this list is blocked. This is used to enforce security constraints and limit the tool’s capabilities.
- description
Short description of the tool’s behavior. This should be a concise summary of what the tool does, suitable for inclusion in prompts and documentation.
- filesystem_read
Whether the tool needs read access to the filesystem. When True, the tool is granted read access to the workspace root and artifacts directory; when False, no filesystem read access is granted.
- filesystem_write
Whether the tool needs write access to the filesystem. When True, the tool is granted write access to the workspace root and artifacts directory; when False, no filesystem write access is granted.
- input_schema
JSON Schema describing the expected input structure for the tool. This is used for validation and documentation purposes. The tool will receive its input as a JSON-encoded string on its standard input, and it should produce its output as a JSON-encoded string on its standard output. The input schema should describe the structure of the JSON object that the tool expects to receive, including any required properties and their types.
- name
Unique name of the tool. This is used to reference the tool in prompts and logs.
- network
Whether the tool needs network access. When True, the tool is granted network access; when False, all network access is blocked.
- output_schema
JSON Schema describing the structure of the tool’s output. This is used for validation and documentation purposes. The tool’s output should be a JSON-encoded string written to its standard output, and the output schema should describe the structure of the JSON object that the tool produces, including any properties and their types.
- path
Filesystem path to the script that implements the tool’s behavior. This should be an absolute path or a path relative to the configured workspace root. The script will be executed as a subprocess when the tool is invoked, and communicated with via its standard input and output streams.
- permissions
Optional tuple of permission strings that the tool requires. This can be used to enforce security constraints or to inform users about the tool’s capabilities. The specific permission strings and their meanings are not defined by this configuration and should be interpreted by the tool runtime or the user interface accordingly.
- risky
Optional boolean flag indicating whether the tool performs potentially risky operations, such as executing shell commands, accessing the filesystem, or making network requests. This can be used to inform users about the tool’s capabilities and potential risks.
- timeout_s
Timeout in seconds for the tool’s execution. If the tool does not produce output within this time frame, it will be considered unresponsive, and appropriate error handling will be triggered.
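Per the documented contract, a script tool reads one JSON object from standard input and writes one JSON object to standard output. A minimal script body satisfying that protocol (an illustration, not a shipped tool; call main() from the script's entry point):

```python
import json
import sys

def handle(payload: dict) -> dict:
    """Tool logic: count whitespace-separated words in the input text."""
    text = str(payload.get("text", ""))
    return {"words": len(text.split())}

def main() -> None:
    # Protocol: one JSON object in on stdin, one JSON object out on stdout.
    json.dump(handle(json.load(sys.stdin)), sys.stdout)
```

The input_schema would describe handle's expected payload (here an object with a "text" string) and the output_schema its result (an object with an integer "words").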
- class design_research_agents.MCPServerConfig(*, id, type='stdio', command=(), timeout_s=20, env_allowlist=('PATH', 'HOME', 'USER', 'LANG', 'LC_ALL', 'PYTHONPATH', 'VIRTUAL_ENV'), env=<factory>)[source]
External MCP server definition.
- command
Command to launch the server, specified as a tuple of strings. The first element should be the executable, and the subsequent elements are its arguments.
- env
Explicit environment variables to set for the server process. This is a mapping of variable names to their desired values. These variables will be included in the server’s environment in addition to any variables from the allowlist that are present in the parent process.
- env_allowlist
Allowlist of environment variable names that will be passed to the server process. Only variables in this list will be included in the server’s environment, which helps to limit exposure of sensitive information and reduce the attack surface.
- id
Unique identifier for the server. This is used to reference the server in tool definitions and logs.
- timeout_s
Timeout in seconds for server responses before treating it as unresponsive.
- type
Communication protocol to use with the server. Currently, only ‘stdio’ is supported, which means the server will be launched as a subprocess and communicated with via its standard input and output streams.
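As documented, the server environment is built from two layers: allowlisted parent variables pass through, then explicit env entries are added on top. A sketch of that merge (build_server_env is a hypothetical helper):

```python
def build_server_env(parent_env, allowlist, explicit_env):
    """Allowlisted parent variables first; explicit entries override."""
    env = {k: v for k, v in parent_env.items() if k in allowlist}
    env.update(explicit_env)
    return env

parent = {"PATH": "/usr/bin", "SECRET_TOKEN": "x", "HOME": "/home/me"}
env = build_server_env(parent, ("PATH", "HOME"), {"LANG": "C.UTF-8"})
```

Note that SECRET_TOKEN never reaches the server process: anything outside the allowlist is dropped unless explicitly re-added via env.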
Tracing
- class design_research_agents.Tracer(*, enabled=True, trace_dir=PosixPath('traces'), enable_jsonl=True, enable_console=True, console_stream=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>)[source]
Explicitly configured tracer dependency injected into runtimes.
- build_sinks(*, trace_path)[source]
Build concrete sinks for this tracer configuration.
- Parameters:
trace_path – Optional JSONL path returned by build_trace_path.
- Returns:
Concrete sink instances enabled by this tracer configuration.
- build_trace_path(*, run_id)[source]
Build a trace JSONL path for one run when JSONL sink is enabled.
- Parameters:
run_id – Request or run identifier used in the trace filename.
- Returns:
JSONL output path for the run, or None when JSONL output is disabled.
- console_stream
Stream used for console trace output.
- enable_console
Whether console trace output should be emitted.
- enable_jsonl
Whether JSONL trace files should be emitted.
- enabled
Whether tracing is enabled for this tracer instance.
- resolve_latest_trace_path(request_id)[source]
Resolve latest emitted JSONL trace path for one request id.
- Parameters:
request_id – Request identifier used in trace filenames.
- Returns:
Latest matching trace file path, or None.
- run_callable(*, agent_name, request_id, input_payload, function, dependencies=None)[source]
Run one callable inside an explicit trace session lifecycle.
- Parameters:
agent_name – Delegate name used in trace metadata.
request_id – Request id used for trace run and file naming.
input_payload – Input payload metadata for trace run start.
function – Zero-argument callable to execute.
dependencies – Optional dependency mapping for trace metadata.
- Returns:
Function return value.
- trace_dir
Directory where JSONL trace files are written.
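The run_callable lifecycle (start event, wrapped execution, error or end event) can be sketched with a plain emitter. This illustrates the session shape only; the Tracer's actual event schema and sink handling are not shown here:

```python
def run_callable_traced(emit, *, agent_name, request_id, input_payload, function):
    """Wrap one zero-argument callable in start/end trace events,
    recording and re-raising any error."""
    emit({"event": "run_start", "agent": agent_name,
          "request_id": request_id, "input": input_payload})
    try:
        result = function()
    except Exception as exc:
        emit({"event": "run_error", "request_id": request_id, "error": repr(exc)})
        raise
    emit({"event": "run_end", "request_id": request_id})
    return result

events = []
value = run_callable_traced(
    events.append,
    agent_name="demo", request_id="req-1",
    input_payload={"prompt": "hi"},
    function=lambda: 42,
)
```

The original exception propagates to the caller after the error event is emitted, so tracing never swallows failures.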