SGLangServerLLMClient
=====================
``SGLangServerLLMClient`` targets local or self-hosted SGLang
OpenAI-compatible inference endpoints.

Default behavior
----------------

- Default managed mode: ``manage_server=True``
- Default startup model: ``Qwen/Qwen2.5-1.5B-Instruct``
- Default managed endpoint: ``http://127.0.0.1:30000/v1``

Constructor-first usage
-----------------------

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   with SGLangServerLLMClient() as client:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Give one architecture tradeoff."),),
               model=client.default_model(),
           )
       )

Prefer the context-manager form so managed servers shut down deterministically.
``close()`` remains available for explicit lifecycle control.
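
When the client must outlive a single ``with`` block, the same teardown can be
done explicitly. A minimal sketch, assuming the constructor, ``generate()``,
``default_model()``, and ``close()`` shapes shown above:

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   # Explicit lifecycle: try/finally ensures the managed server process is
   # torn down even if generate() raises.
   client = SGLangServerLLMClient()
   try:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Give one architecture tradeoff."),),
               model=client.default_model(),
           )
       )
   finally:
       client.close()
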

Dependencies and environment
----------------------------

- Install SGLang extras for managed mode: ``pip install -e ".[sglang]"``
- For connect mode, point at an existing SGLang-compatible endpoint with
``manage_server=False`` and ``base_url=...``.
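
Connect mode, sketched end to end. The ``base_url`` below mirrors the managed
default and is an assumption; point it at your own endpoint (for example, a
server started with ``python -m sglang.launch_server --model-path <model>``):

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   # Connect mode: attach to an already-running SGLang endpoint instead of
   # spawning one. No SGLang extras are needed in this process.
   with SGLangServerLLMClient(
       manage_server=False,
       base_url="http://127.0.0.1:30000/v1",
   ) as client:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Ping."),),
               model=client.default_model(),
           )
       )
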

Examples
--------

- ``examples/clients/sglang_server_client.py``

Attribution
-----------

- Docs: `SGLang documentation <https://docs.sglang.ai/>`_
- Homepage: `SGLang on GitHub <https://github.com/sgl-project/sglang>`_