SGLangServerLLMClient
=====================
``SGLangServerLLMClient`` targets local or self-hosted SGLang
OpenAI-compatible inference endpoints.

Default behavior
----------------

- Default managed mode: ``manage_server=True``
- Default startup model: ``Qwen/Qwen2.5-1.5B-Instruct``
- Default managed endpoint: ``http://127.0.0.1:30000/v1``

Constructor-first usage
-----------------------

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   with SGLangServerLLMClient() as client:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Give one architecture tradeoff."),),
               model=client.default_model(),
           )
       )

Prefer the context-manager form so managed servers shut down deterministically.
``close()`` remains available for explicit lifecycle control.
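
When the client must outlive a single ``with`` block, the same teardown can be
done explicitly. A minimal sketch, assuming the constructor, ``generate()``,
``default_model()``, and ``close()`` shapes shown above:

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   # Explicit lifecycle: try/finally ensures the managed server process is
   # torn down even if generate() raises.
   client = SGLangServerLLMClient()
   try:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Give one architecture tradeoff."),),
               model=client.default_model(),
           )
       )
   finally:
       client.close()
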

Dependencies and environment
----------------------------

- Install SGLang extras for managed mode: ``pip install -e ".[sglang]"``
- For connect mode, point at an existing SGLang-compatible endpoint with
``manage_server=False`` and ``base_url=...``.
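
Connect mode, sketched end to end. The ``base_url`` below mirrors the managed
default and is an assumption; point it at your own endpoint (for example, a
server started with ``python -m sglang.launch_server --model-path <model>``):

.. code-block:: python

   from design_research_agents import SGLangServerLLMClient
   from design_research_agents.llm import LLMMessage, LLMRequest

   # Connect mode: attach to an already-running SGLang endpoint instead of
   # spawning one. No SGLang extras are needed in this process.
   with SGLangServerLLMClient(
       manage_server=False,
       base_url="http://127.0.0.1:30000/v1",
   ) as client:
       response = client.generate(
           LLMRequest(
               messages=(LLMMessage(role="user", content="Ping."),),
               model=client.default_model(),
           )
       )
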

Examples
--------

- ``examples/clients/sglang_server_client.py``

Attribution
-----------

- Docs: `SGLang documentation <https://docs.sglang.ai/>`_
- Homepage: `SGLang on GitHub <https://github.com/sgl-project/sglang>`_