# SGLangServerLLMClient

`SGLangServerLLMClient` targets local or self-hosted SGLang OpenAI-compatible
inference endpoints.
## Default behavior

- Default managed mode: `manage_server=True`
- Default startup model: `Qwen/Qwen2.5-1.5B-Instruct`
- Default managed endpoint: `http://127.0.0.1:30000/v1`
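Taken together, the defaults above compose the managed base URL. A trivial sketch, assuming the host and port implied by the documented endpoint:

```python
# Sketch: how the default managed endpoint decomposes. The host/port
# constants are assumptions read off the documented URL, not library API.
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 30000

base_url = f"http://{DEFAULT_HOST}:{DEFAULT_PORT}/v1"
print(base_url)  # http://127.0.0.1:30000/v1
```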
## Constructor-first usage

```python
from design_research_agents import SGLangServerLLMClient
from design_research_agents.llm import LLMMessage, LLMRequest

with SGLangServerLLMClient() as client:
    response = client.generate(
        LLMRequest(
            messages=(LLMMessage(role="user", content="Give one architecture tradeoff."),),
            model=client.default_model(),
        )
    )
```
Prefer the context-manager form so managed servers shut down deterministically,
even when `generate()` raises. `close()` remains available for explicit
lifecycle control.
## Dependencies and environment

Install SGLang extras for managed mode:

```shell
pip install -e ".[sglang]"
```

For connect mode, point at an existing SGLang-compatible endpoint with
`manage_server=False` and `base_url=...`.
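In connect mode, the client ultimately speaks the standard OpenAI chat-completions wire format against that endpoint. A minimal sketch of the request it would issue, built with the standard library only; the URL and model name reuse the defaults above, and nothing is actually sent here:

```python
import json
import urllib.request

# Assumption: an SGLang server is already running at the default base URL.
base_url = "http://127.0.0.1:30000/v1"

payload = {
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    "messages": [{"role": "user", "content": "Give one architecture tradeoff."}],
}

# Build (but do not send) the OpenAI-style chat-completions request.
req = urllib.request.Request(
    f"{base_url}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```

Sending it with `urllib.request.urlopen(req)` returns a JSON body whose `choices[0]["message"]["content"]` holds the reply, per the OpenAI chat-completions schema.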
## Examples

- `examples/clients/sglang_server_client.py`
## Attribution

- Docs: SGLang docs
- Homepage: SGLang GitHub