This feature works for both API Access and Dedicated Models.

Streaming in Confidential AI API

Confidential AI API supports streaming: instead of waiting for the full completion, you receive the response incrementally, chunk by chunk, as the model generates it. This is particularly useful for applications that display output in real time, such as chat interfaces, or that pass partial output on to other systems.

Example of Streaming

Replace <API_KEY> with your actual API key in the examples below.

from openai import OpenAI

client = OpenAI(
    api_key="<API_KEY>",
    base_url="https://inference.phala.com/v1",
)

stream = client.chat.completions.create(
    model="phala/qwen3.5-27b",
    messages=[
        {
            "role": "user",
            "content": "say `Hello` 2 times fast, no other output",
        },
    ],
    stream=True,
)
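
# Print each piece of the reply as soon as it arrives.
for chunk in stream:
    # Guard against chunks that carry no choices.
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)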

Output:

HelloHello
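
Applications often need the complete reply as well as the incremental output, for example to log it or to append it to the conversation history. The following is a minimal sketch of one way to do that, assuming each chunk has the same shape as in the example above (`choices[0].delta.content`). The helper name `collect_stream` and its `on_delta` callback are hypothetical and are not part of the API or the OpenAI client.

```python
from typing import Any, Callable, Iterable, Optional


def collect_stream(
    chunks: Iterable[Any],
    on_delta: Optional[Callable[[str], None]] = None,
) -> str:
    """Join streamed deltas into the full reply (hypothetical helper).

    Assumes each chunk exposes `choices[0].delta.content`, as in the
    streaming example. If `on_delta` is given, it is called with each
    text piece as it arrives (for example, to print it).
    """
    parts = []
    for chunk in chunks:
        # Skip chunks with no choices; this guard is a defensive assumption.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:  # content may be None or empty on some chunks
            parts.append(content)
            if on_delta is not None:
                on_delta(content)
    return "".join(parts)
```

With the `stream` object from the example, this could be called as `full_text = collect_stream(stream, on_delta=lambda s: print(s, end="", flush=True))`, which prints each piece as it arrives and returns the whole reply at the end.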

Supported Models

All models support streaming.