Last Updated on 2024-11-02 by Clay
I have recently set up a number of backend API servers for chatbots. Initially, I took the user's message and returned the entire LLM-generated reply to the frontend in one go, but this made for a poor user experience. I then switched to HTTP streaming, sending each generated token to the frontend as it was produced. Later, I found that on some users' devices the streamed chunks arrived glued together (the classic "sticky packet" problem), so I eventually switched to WebSocket.
Recently, my frontend colleagues and I discussed whether Server-Sent Events (SSE) might be a better option, so I started exploring how to build an SSE API with FastAPI.
A quick look shows that SSE is also built on plain HTTP and can be considered a lightweight alternative to WebSocket: the connection is one-way (server to client), which is exactly the shape of a chatbot reply stream.
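To see that no special transport is involved, here is a minimal sketch of my own (not from the original setup) that emits SSE frames using nothing but FastAPI's built-in StreamingResponse. Each event is plain text: an optional event: line, a data: line, and a blank line as the frame delimiter.

import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def raw_sse():
    # Emit three hand-framed SSE events, one per second
    for i in range(3):
        yield f"event: message\ndata: tick {i}\n\n"
        await asyncio.sleep(1)

@app.get("/api/raw-sse")
async def raw_sse_endpoint():
    # The only SSE-specific part is the media type
    return StreamingResponse(raw_sse(), media_type="text/event-stream")

Packages such as sse-starlette simply wrap this framing for us and take care of details like keep-alive pings.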
First, install the following package:
pip install sse-starlette
Next, you can perform a simple test:
import asyncio

from fastapi import FastAPI
from sse_starlette.sse import EventSourceResponse


app = FastAPI()


async def event_generator(sent: str):
    # Yield one character at a time, pausing briefly to simulate token generation
    for char in sent:
        yield {"event": "message", "data": char}
        await asyncio.sleep(0.2)


@app.get("/api/chatbot/stream")
async def sse_endpoint(sent: str):
    return EventSourceResponse(event_generator(sent))
This is a classic echo bot that simply repeats the input sentence character by character; it stands in for an LLM here so we can test the streaming behavior. You can start the server with the following command:
uvicorn app:app --port 8080 --reload
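In a real chatbot backend, the character-by-character loop would be replaced by the model's own token stream. A minimal sketch, reusing app and EventSourceResponse from the test server above and assuming a hypothetical generate_tokens() async iterator that wraps your LLM client's streaming API:

async def llm_event_generator(prompt: str):
    # generate_tokens() is hypothetical: a stand-in for a real LLM client's
    # streaming call that yields tokens as they are produced
    async for token in generate_tokens(prompt):
        yield {"event": "message", "data": token}

@app.get("/api/chatbot/llm-stream")
async def llm_sse_endpoint(prompt: str):
    return EventSourceResponse(llm_event_generator(prompt))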
Back to our echo server: once we make the following request:
import httpx


url = "http://127.0.0.1:8080/api/chatbot/stream"
params = {"sent": "你好,今天天氣不錯~"}

# Stream the response so each SSE frame can be printed as it arrives
with httpx.stream("GET", url, params=params) as response:
    for line in response.iter_text():
        if line:
            print(line)
Output:
event: message
data: 你
event: message
data: 好
event: message
data: ,
event: message
data: 今
event: message
data: 天
event: message
data: 天
event: message
data: 氣
event: message
data: 不
event: message
data: 錯
event: message
data: ~
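Each event arrives as an event: line followed by a data: line. To rebuild the full reply on the client side, one straightforward approach (my own sketch, not part of the original test) is to collect only the data: payloads:

import httpx

url = "http://127.0.0.1:8080/api/chatbot/stream"
params = {"sent": "你好,今天天氣不錯~"}

chunks = []
with httpx.stream("GET", url, params=params) as response:
    # iter_lines() yields the body line by line, so we can keep the
    # "data:" fields and skip the "event:" lines and blank separators
    for line in response.iter_lines():
        if line.startswith("data:"):
            chunks.append(line[len("data:"):].strip())

print("".join(chunks))  # the original sentence, reassembled

In a browser, the built-in EventSource API performs this parsing automatically, which is part of what makes SSE attractive on the frontend.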
(Notes are not yet fully organized)