Streaming Responses
When interacting with Large Language Models (LLMs), generating a comprehensive, well-researched answer can take several seconds. Without streaming, the user interface shows no output at all during this time, leading users to assume the system has crashed or hung.
Lyntaris solves this with streaming.
What is Streaming?
Instead of waiting for the LLM to generate the entire multi-paragraph response before sending it back to the user, a streaming architecture transmits the response in real-time, token by token, exactly as the AI formulates it.
Visually, this creates a "typewriter effect" in the chat interface where words appear sequentially.
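The typewriter effect can be sketched in a few lines. This is an illustrative simulation, not Lyntaris's actual delivery code: `stream_tokens` stands in for the LLM emitting chunks, and `render_typewriter` stands in for the chat UI appending them as they arrive.

```python
import time

def stream_tokens(text, chunk_size=4):
    """Yield the response a few characters at a time, simulating
    how an LLM emits tokens as it generates them."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

def render_typewriter(tokens, delay=0.0):
    """Print each token the moment it arrives, producing the
    sequential 'typewriter' appearance in the interface."""
    rendered = []
    for token in tokens:
        print(token, end="", flush=True)  # append to the UI incrementally
        rendered.append(token)
        time.sleep(delay)
    print()
    return "".join(rendered)

full = render_typewriter(stream_tokens("Streaming makes long answers feel fast."))
```

The key point is that rendering starts on the first chunk rather than after the final one; the concatenated chunks equal the complete response.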
The Value for Corporate Users
For end-users utilizing Lyntaris via the embedded Chat Widget or internal dashboards, streaming is a critical component of perceived performance:
- Immediate Feedback (Lower TTFB): "Time to First Byte" is drastically reduced. While the total generation might take 8 seconds, the user sees the first word appear in under 1 second. This immediate visual feedback confirms the AI is actively working on their request.
- Improved Reading Experience: Users can begin reading and comprehending the first paragraph of an answer while the AI is simultaneously drafting the conclusion. This parallelizes the user's reading time with the machine's generation time.
- Early Cancellation: If the user recognizes from the first few generated sentences that the AI misunderstood their intent, they can stop the response or issue a new prompt right away, rather than waiting 10 seconds for a useless block of text.
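The early-cancellation benefit above can be sketched with a generator-backed stream. This is a hypothetical illustration (the function names are not part of any Lyntaris API): closing the stream as soon as the partial answer reveals a misunderstanding means the remaining tokens are never produced or read.

```python
def stream_tokens(text, chunk_size=4):
    """Stand-in for an LLM token stream."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

def consume_until(stream, stop_phrase):
    """Read tokens until the partial answer contains stop_phrase,
    then abandon the stream -- no further tokens are generated."""
    partial = ""
    for token in stream:
        partial += token
        if stop_phrase in partial:
            stream.close()  # cancel generation early
            break
    return partial

partial = consume_until(stream_tokens("abcdefghijklmnop"), "efg")
```

Here the consumer stops after two chunks, well before the full sixteen-character "response" is delivered.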
How Lyntaris Implements Streaming
Streaming is enabled by default across the Lyntaris platform for supported LLMs (such as OpenAI, Anthropic, and locally hosted models).
- Agentflow V2: When you connect an LLM node or Agent node, Lyntaris automatically negotiates a streaming connection with the provider.
- API Level (For Developers): Lyntaris uses Server-Sent Events (SSE). This lightweight protocol keeps a one-way HTTP connection open, pushing each token as a discrete "data:" event straight to the client browser.
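A minimal sketch of how a client might decode such an event stream: per the SSE format, events are separated by a blank line and payloads follow a "data:" prefix. The payload text shown is illustrative; Lyntaris's actual event schema is not specified here.

```python
def parse_sse(raw):
    """Split a raw SSE stream into a list of event payloads.
    Events are separated by a blank line; payload lines start
    with 'data:' (an optional leading space is stripped)."""
    events = []
    for block in raw.split("\n\n"):
        data_lines = [line[len("data:"):].lstrip(" ")
                      for line in block.split("\n")
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

raw = "data: Hello\n\ndata: world\n\ndata: [DONE]\n\n"
tokens = parse_se = parse_sse(raw)
```

In a browser, the built-in EventSource interface performs this parsing automatically; the sketch just makes the wire format concrete.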
As a Lyntaris administrator, you do not need to configure WebSockets or chunking protocols yourself; simply drag an LLM node onto the canvas, and Lyntaris handles real-time token delivery automatically.