Multi-Step JSON Tool Calling 1D Optimization#

Source: examples/optimization/multi_step_json_tool_calling_1d_optimization.py

Introduction#

Practical Bayesian optimization motivates iterative search over expensive objective evaluations, while Toolformer and Plan-and-Solve motivate explicit action/reason loops for model-guided exploration. This example operationalizes that idea as a JSON tool-calling optimization workflow with traceable proposals and evaluations.
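The propose-and-evaluate loop the agent performs can be approximated offline with a plain greedy search. The sketch below is purely illustrative: it reuses the example's `f(x) = x^2` objective and `x = 3` starting point, but the step-halving proposal rule is an assumption, not the agent's actual tool-driven policy.

```python
# Illustrative greedy 1-D minimization of f(x) = x^2 starting at x = 3.
# The halving proposal rule stands in for the LLM's tool-call decisions.
def objective(x: float) -> float:
    return x * x


def greedy_minimize(x0: float, steps: int = 6) -> dict[str, float]:
    """Greedy halving search: propose x/2 each step, keep it if the objective improves."""
    history = [{"x": x0, "f_x": objective(x0)}]
    x = x0
    for _ in range(steps):
        candidate = x / 2.0
        history.append({"x": candidate, "f_x": objective(candidate)})
        if history[-1]["f_x"] < objective(x):
            x = candidate
    best = min(history, key=lambda record: record["f_x"])
    return {"best_x": best["x"], "best_objective": best["f_x"], "evaluations": len(history)}
```

The returned keys deliberately mirror the `best_x`/`best_objective`/`evaluations` triple the example asks the agent to report in its final answer.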

Technical Implementation#

  1. Configure Tracer with JSONL + console output so each run emits machine-readable traces and lifecycle logs.

  2. Build the runtime surface (public APIs only) and execute MultiStepAgent.run(...) with a fixed request_id.

  3. Configure and invoke Toolbox integrations (core/script/MCP/callable) before assembling the final payload.

  4. Print a compact JSON payload including trace_info for deterministic tests and docs examples.

    flowchart LR
        A["Input prompt or scenario"] --> B["main(): runtime wiring"]
        B --> C["MultiStepAgent.run(...)"]
        C --> D["optimization loop combines callable tools with explicit final answers"]
        C --> E["Tracer JSONL + console events"]
        D --> F["ExecutionResult/payload"]
        E --> F
        F --> G["Printed JSON output"]
    
    from __future__ import annotations

    import json
    from collections.abc import Mapping
    from pathlib import Path

    import design_research_agents as drag

    _EXAMPLE_LLAMA_CLIENT_KWARGS = {
        "model": "Qwen_Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
        "hf_model_repo_id": "bartowski/Qwen_Qwen3-4B-Instruct-2507-GGUF",
        "api_model": "qwen3-4b-instruct-2507-q4km",
        "context_window": 8192,
        "startup_timeout_seconds": 240.0,
        "request_timeout_seconds": 240.0,
    }


    def _objective(x: float) -> float:
        return x * x


    def main() -> None:
        """Optimize ``x^2`` from ``x=3`` by letting the LLM choose each tool step."""
        # Fixed request id keeps traces and docs output deterministic across runs.
        request_id = "example-optimization-json-tool-calling-design-001"
        tracer = drag.Tracer(
            enabled=True,
            trace_dir=Path("artifacts/examples/traces"),
            enable_jsonl=True,
            enable_console=True,
        )
        initial_x = 3.0
        evaluation_history: list[dict[str, float]] = []

        def _evaluate(payload: Mapping[str, object]) -> dict[str, object]:
            raw_x = payload.get("x", initial_x)
            x_value = float(raw_x) if isinstance(raw_x, (int, float)) else initial_x
            f_x = _objective(x_value)
            evaluation_record = {"x": x_value, "f_x": f_x}
            evaluation_history.append(evaluation_record)
            best_record = min(evaluation_history, key=lambda record: record["f_x"])
            previous_record = evaluation_history[-2] if len(evaluation_history) > 1 else None
            return {
                "x": x_value,
                "f_x": f_x,
                "evaluations": len(evaluation_history),
                "previous_x": None if previous_record is None else previous_record["x"],
                "previous_f_x": None if previous_record is None else previous_record["f_x"],
                "best_x": best_record["x"],
                "best_objective": best_record["f_x"],
                "improved_best": best_record is evaluation_record,
                "history": list(evaluation_history),
            }

        # Run the optimization example through public runtime surfaces. The with
        # statement automatically shuts down the managed client and tool runtime
        # when the example finishes.
        with (
            drag.Toolbox(
                enable_core_tools=False,
                callable_tools=(
                    drag.CallableToolConfig(
                        name="optimizer.evaluate",
                        description="Evaluate f(x) = x^2 at a proposed x and return the best observation so far.",
                        handler=_evaluate,
                        input_schema={
                            "type": "object",
                            "additionalProperties": False,
                            "properties": {"x": {"type": "number"}},
                            "required": ["x"],
                        },
                    ),
                ),
            ) as tools,
            drag.LlamaCppServerLLMClient(**_EXAMPLE_LLAMA_CLIENT_KWARGS) as llm_client,
        ):
            optimization_agent = drag.MultiStepAgent(
                mode="json",
                llm_client=llm_client,
                tool_runtime=tools,
                max_steps=6,
                # This example uses prompt guidance rather than tool-enforced step directions.
                tool_calling_system_prompt=(
                    "You are solving a simple one-dimensional black-box minimization problem. "
                    "Use optimizer.evaluate to test concrete x values, and rely on observed tool results instead "
                    "of guessing numeric outcomes. Prefer a short, informative search that moves toward lower "
                    "observed objective values, then emit final_answer once the best observed x is well-supported."
                ),
                tracer=tracer,
            )
            result = optimization_agent.run(
                prompt=(
                    "Minimize the black-box function f(x). Begin by evaluating x=3. "
                    "Use the observed results to choose a few better candidate x values, keeping the search efficient. "
                    "When you have enough evidence, emit final_answer with exactly the keys best_x, "
                    "best_objective, and evaluations, and use only values that came from tool observations."
                ),
                request_id=request_id,
            )

        # Print the result summary as deterministic, sorted JSON.
        summary = result.summary()
        print(json.dumps(summary, ensure_ascii=True, indent=2, sort_keys=True))


    if __name__ == "__main__":
        main()
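The `_evaluate` handler above does more than compute `x^2`: it records every observation and reports the running best, so each tool result hands the model comparative context it would otherwise have to remember. The standalone sketch below condenses that bookkeeping (dropping the `previous_*` and `history` fields for brevity) to show how `best_x` and `improved_best` evolve across calls.

```python
# Condensed sketch of the example's _evaluate bookkeeping: each call records the
# observation and reports the best point seen so far.
evaluation_history: list[dict[str, float]] = []


def evaluate(x: float) -> dict[str, object]:
    record = {"x": x, "f_x": x * x}
    evaluation_history.append(record)
    best = min(evaluation_history, key=lambda r: r["f_x"])
    return {
        "x": x,
        "f_x": record["f_x"],
        "evaluations": len(evaluation_history),
        "best_x": best["x"],
        "best_objective": best["f_x"],
        # True only when this very record is the new best observation.
        "improved_best": best is record,
    }


first = evaluate(3.0)   # improved_best is True: first observation, f_x = 9.0
second = evaluate(1.0)  # improved_best is True again: 1.0 < 9.0
third = evaluate(2.0)   # improved_best is False: best_x stays 1.0
```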

Expected Results#

Run Command

PYTHONPATH=src python3 examples/optimization/multi_step_json_tool_calling_1d_optimization.py
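After a run, the emitted trace file can be inspected line by line. The helper below is a generic sketch that assumes only that the trace is JSONL, i.e. one standalone JSON object per non-blank line; the exact event schema is not documented here.

```python
import json
from pathlib import Path


def read_trace_events(trace_path: Path) -> list[dict]:
    """Parse a JSONL trace file: one JSON object per line, blank lines skipped."""
    events = []
    for line in trace_path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            events.append(json.loads(line))
    return events
```

For example, `read_trace_events(Path("artifacts/examples/traces/<your-trace>.jsonl"))` returns the run's events in order, ready for filtering or counting.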

Example output shape (values vary by run):

{
  "success": true,
  "final_output": "<example-specific payload>",
  "terminated_reason": "<string-or-null>",
  "error": null,
  "trace": {
    "request_id": "<request-id>",
    "trace_dir": "artifacts/examples/traces",
    "trace_path": "artifacts/examples/traces/run_<timestamp>_<request_id>.jsonl"
  }
}
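Because the values vary by run, a docs or regression test should check the payload's shape rather than its contents. The checker below is a hypothetical helper (not part of the library) that encodes only the keys shown above.

```python
def check_summary_shape(summary: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the payload matches the documented keys."""
    problems = []
    for key in ("success", "final_output", "terminated_reason", "error", "trace"):
        if key not in summary:
            problems.append(f"missing key: {key}")
    trace = summary.get("trace")
    if isinstance(trace, dict):
        for key in ("request_id", "trace_dir", "trace_path"):
            if key not in trace:
                problems.append(f"missing trace key: {key}")
    else:
        problems.append("trace is not an object")
    return problems
```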

References#