
If loading is slow, enable caching. Caching is enabled by default in most browsers.

CHATBOX


A standalone OpenAI-compatible chat workspace inspired by the llama.cpp server UI. Start from an endpoint, fetch models, tune reasoning and sampling, then chat in a dedicated page instead of inside the tools catalog.

Workspace · Endpoint-first chat

Model discovery, think controls, advanced sampling and optional speed metrics all live in this one tool page.


Standalone chat workspace

Endpoint-first setup, model discovery and advanced controls inspired by the llama.cpp server experience.

Pick an endpoint, then start

  1. Set the API root of an OpenAI-compatible endpoint.
  2. Fetch the model list from /models, or type the model id manually.
  3. Open Controls if you need think or sampling settings, then send your first message.
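A minimal sketch of step 2, model discovery: OpenAI-compatible servers expose `GET {apiRoot}/models` and wrap entries in a `data` array of `{ id }` objects. The function names below are illustrative, not the tool's actual code, and the network call itself is left as a comment.

```typescript
// Build the /models URL from an API root like "http://localhost:8080/v1",
// tolerating a trailing slash on the root.
function modelsUrl(apiRoot: string): string {
  return apiRoot.replace(/\/+$/, "") + "/models";
}

// OpenAI-compatible /models responses look like { data: [{ id: "..." }, ...] }.
function parseModelIds(body: { data?: Array<{ id: string }> }): string[] {
  return (body.data ?? []).map((m) => m.id);
}

// Usage (network call omitted):
//   const res = await fetch(modelsUrl(root));
//   const ids = parseModelIds(await res.json());
```

If the server does not implement /models, the parsed list is simply empty, which is why typing the model id manually remains available as a fallback.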
Optional speed metrics: prompt tokens, output tokens, prefill rate, decode rate, TTFT, and total time.
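The speed metrics above can be derived from three timestamps on a streamed response. This is a sketch under assumed field names, not the tool's actual code; it approximates prefill as everything before the first token arrives.

```typescript
interface StreamTimings {
  startMs: number;       // request sent
  firstTokenMs: number;  // first streamed token arrived
  doneMs: number;        // stream finished
  promptTokens: number;  // tokens in the prompt (prefill)
  outputTokens: number;  // tokens generated (decode)
}

interface SpeedMetrics {
  ttftMs: number;        // time to first token
  totalMs: number;       // end-to-end time
  prefillTokPerSec: number;
  decodeTokPerSec: number;
}

function computeMetrics(t: StreamTimings): SpeedMetrics {
  const ttftMs = t.firstTokenMs - t.startMs;
  const totalMs = t.doneMs - t.startMs;
  const decodeMs = t.doneMs - t.firstTokenMs;
  return {
    ttftMs,
    totalMs,
    // Prefill rate treats the whole TTFT window as prompt processing.
    prefillTokPerSec: ttftMs > 0 ? (t.promptTokens / ttftMs) * 1000 : 0,
    decodeTokPerSec: decodeMs > 0 ? (t.outputTokens / decodeMs) * 1000 : 0,
  };
}
```

Some servers (llama.cpp among them) also return their own timing data, which is more accurate than client-side timestamps when available.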

Controls

Reasoning, Sampling & Advanced

Reasoning

Think & Display

Reasoning visibility, think mode and streaming behavior.
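On the streaming side, OpenAI-compatible servers emit SSE lines of the form `data: {...}` terminated by `data: [DONE]`. A sketch of separating visible text from hidden "think" text per chunk is below; the `reasoning_content` delta field is a llama.cpp/DeepSeek-style assumption, and not every server uses it.

```typescript
interface DeltaChunk {
  choices?: Array<{ delta?: { content?: string; reasoning_content?: string } }>;
}

// Parse one SSE line into visible text and reasoning ("think") text.
// Returns null for non-data lines (comments, keep-alives) and for [DONE].
function splitDelta(line: string): { content: string; reasoning: string } | null {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length);
  if (payload === "[DONE]") return null;
  const chunk: DeltaChunk = JSON.parse(payload);
  const delta = chunk.choices?.[0]?.delta ?? {};
  return { content: delta.content ?? "", reasoning: delta.reasoning_content ?? "" };
}
```

Keeping the two channels separate is what lets a UI collapse, hide, or stream reasoning independently of the answer text.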

Sampling

Generation Controls

Temperature, limits and sampling knobs for the current model.
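A sketch of how sampling knobs might map onto an OpenAI-style request body. The field names (`temperature`, `top_p`, `max_tokens`, `stop`) are the standard OpenAI-compatible ones, but which of them a given server honors varies; the function name is illustrative.

```typescript
interface SamplingControls {
  temperature?: number;
  topP?: number;
  maxTokens?: number;
  stop?: string[];
}

// Map UI controls onto OpenAI-style request fields. Only knobs the user
// actually set are included, so server-side defaults still apply.
function samplingFields(c: SamplingControls): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  if (c.temperature !== undefined) body.temperature = c.temperature;
  if (c.topP !== undefined) body.top_p = c.topP;
  if (c.maxTokens !== undefined) body.max_tokens = c.maxTokens;
  if (c.stop !== undefined) body.stop = c.stop;
  return body;
}
```

Omitting unset fields, rather than sending zeros or nulls, matters because many servers treat an explicit value as an override of their tuned defaults.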

Advanced

Extra Request Body

Pass vendor-specific fields without changing the core request builder.

This JSON is merged into the final request body after the built-in controls, so you can pass extra vendor-specific fields without editing the application code.
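The "merged after the built-in controls" behavior described above amounts to a shallow merge where the extra JSON wins on key conflicts. A minimal sketch, with illustrative names:

```typescript
// Shallow-merge the extra request body over the built-in fields.
// Spreading `extra` last means its keys override the built-in ones.
function mergeRequestBody(
  builtIn: Record<string, unknown>,
  extra: Record<string, unknown>,
): Record<string, unknown> {
  return { ...builtIn, ...extra };
}

// e.g. adding a vendor-specific knob while overriding temperature:
//   mergeRequestBody(
//     { model: "x", temperature: 0.7 },
//     { temperature: 0.2, repeat_penalty: 1.1 },
//   )
```

Because the merge is shallow, overriding a nested object (such as a vendor's options sub-object) replaces it wholesale rather than merging field by field.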