Phala Confidential AI is OpenAI-compatible. Most integrations require changing only the base URL and API key:
https://inference.phala.com/v1
After migration, responses can be verified with x-receipt-id, Attestation Report, and Get Receipt.

Migrate from OpenAI

from openai import OpenAI

# Before
openai_client = OpenAI(api_key="sk-...")

# After
phala_client = OpenAI(
    api_key="YOUR_PHALA_API_KEY",
    base_url="https://inference.phala.com/v1",
)
Everything else stays familiar: chat completions, streaming, tool calling, vision, structured output, and embeddings use OpenAI-compatible request shapes.

Migrate from OpenRouter

from openai import OpenAI

client = OpenAI(
    base_url="https://inference.phala.com/v1",
    api_key="YOUR_PHALA_API_KEY",
)
Model ids follow the provider/model format returned by GET /v1/models. Filter for entries with is_tee: true to find models that can be served confidentially, then verify the receipt of the actual response.
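A minimal sketch of that filtering step. It assumes /v1/models returns an OpenAI-style list object ({"data": [...]}) whose entries carry a boolean is_tee field. The payload shape, the sample entries, and the helper name confidential_model_ids are illustrative assumptions, not confirmed API details.

```python
from typing import Any


def confidential_model_ids(models_payload: dict[str, Any]) -> list[str]:
    """Return the ids of entries flagged is_tee: true.

    Assumes an OpenAI-style list payload:
    {"data": [{"id": ..., "is_tee": ...}, ...]}.
    Entries without an is_tee field are treated as not confidential.
    """
    return [
        entry["id"]
        for entry in models_payload.get("data", [])
        if entry.get("is_tee") is True
    ]


# Hypothetical payload for illustration; real data comes from GET /v1/models.
sample_payload = {
    "data": [
        {"id": "example/tee-model", "is_tee": True},
        {"id": "example/non-tee-model", "is_tee": False},
        {"id": "example/unflagged-model"},
    ]
}

print(confidential_model_ids(sample_payload))
```

Treating a missing flag as "not confidential" is a deliberately conservative choice: a model is only selected when the listing says so explicitly, and the response receipt remains the actual proof.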

OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PHALA_API_KEY",
    base_url="https://inference.phala.com/v1",
)

response = client.chat.completions.create(
    model="phala/qwen3.5-27b",
    messages=[{"role": "user", "content": "Explain TEE privacy."}],
)

print(response.choices[0].message.content)

LangChain

from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    model="phala/qwen3.5-27b",
    api_key="YOUR_PHALA_API_KEY",
    base_url="https://inference.phala.com/v1",
)

response = chat.invoke("Explain confidential inference in one paragraph.")
print(response.content)

Vercel AI SDK

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

const phala = createOpenAI({
  apiKey: process.env.API_KEY,
  baseURL: "https://inference.phala.com/v1",
});

const { text } = await generateText({
  model: phala("phala/qwen3.5-27b"),
  prompt: "Explain attestation.",
});

Langfuse

import os
from langfuse.openai import OpenAI

os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"

client = OpenAI(
    api_key="YOUR_PHALA_API_KEY",
    base_url="https://inference.phala.com/v1",
)
Observability tools may store prompts, responses, metadata, or traces. For sensitive data, configure redaction or self-hosting before sending production traffic.
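One way to approach redaction is to mask sensitive substrings before the messages reach the traced client. The sketch below is a hypothetical helper, not part of Langfuse or Phala: it only masks email-like strings, and because it runs before the call, the model also sees the redacted text. Tool-side masking options are not shown here.

```python
import re

# Simple pattern for email-like strings. Real deployments usually need
# broader rules (names, account numbers, secrets).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_text(text: str) -> str:
    """Replace every email-like substring with a fixed placeholder."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def redact_messages(messages: list[dict]) -> list[dict]:
    """Return a copy of chat messages with string content redacted.

    Non-string content (for example multimodal parts) is passed through
    unchanged, so it would need its own handling.
    """
    redacted = []
    for message in messages:
        copy = dict(message)
        if isinstance(copy.get("content"), str):
            copy["content"] = redact_text(copy["content"])
        redacted.append(copy)
    return redacted


messages = [{"role": "user", "content": "Email alice@example.com the report."}]
print(redact_messages(messages))
```

The original list is left untouched, so the unredacted messages stay available to code paths that are not traced.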

Coding Assistants and Desktop Clients

For tools that support an OpenAI-compatible provider, use:
Setting        Value
Provider type  OpenAI compatible
Base URL       https://inference.phala.com/v1
API key        Your Phala API key
Model          Any id from /v1/models
This works for tools such as Continue, Cline-compatible assistants, Cherry Studio, and other clients that let you set an OpenAI-compatible base URL.

Migration Checklist

  • Create a Phala API key.
  • Change the base URL to https://inference.phala.com/v1.
  • List models with GET /v1/models.
  • Use is_tee: true for confidential-capable models.
  • Capture x-receipt-id in raw HTTP responses when you need verification.
  • Verify one representative response before production rollout.
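The receipt-capture step in the checklist can be sketched with the standard library alone. This assumes the endpoint accepts the usual OpenAI-style Bearer authorization and the /chat/completions path (which is what the OpenAI SDK sends) and that the receipt arrives as an x-receipt-id response header. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Mapping, Optional

BASE_URL = "https://inference.phala.com/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /chat/completions request."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def receipt_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Look up x-receipt-id case-insensitively; return None if absent."""
    for name, value in headers.items():
        if name.lower() == "x-receipt-id":
            return value
    return None


def chat_with_receipt(api_key: str, model: str, prompt: str) -> tuple[dict, Optional[str]]:
    """Send the request and return (parsed JSON body, receipt id or None)."""
    request = build_chat_request(api_key, model, prompt)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
        return payload, receipt_id_from_headers(response.headers)
```

Calling chat_with_receipt("YOUR_PHALA_API_KEY", "phala/qwen3.5-27b", "Explain TEE privacy.") would return the completion together with the receipt id to store for later verification. If you prefer to stay on the OpenAI Python SDK, client.chat.completions.with_raw_response.create(...) also exposes the response headers alongside the parsed completion.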
