Phala Cloud Documentation — Confidential AI on TEE

Endpoints

GET https://inference.phala.com/v1/models

Use the live model catalog before hardcoding model IDs. The catalog returns model IDs, context windows, pricing, serving metadata, modalities, supported parameters, and whether a model can be served confidentially.

Examples

curl https://inference.phala.com/v1/models \
  -H "Authorization: Bearer <API_KEY>"

Response

{
  "data": [
    {
      "id": "phala/qwen3.5-27b",
      "name": "Qwen3.5 27B",
      "created": 1677652288,
      "is_tee": true,
      "description": "Qwen model running through Phala GPU TEE infrastructure",
      "context_length": 262144,
      "max_output_length": 262144,
      "pricing": {
        "prompt": "0.00000030",
        "completion": "0.00000240"
      },
      "providers": ["phala"],
      "input_modalities": ["text"],
      "output_modalities": ["text"],
      "supported_parameters": ["max_tokens", "temperature", "tools", "tool_choice", "response_format"],
      "metadata": {}
    }
  ]
}
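The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names (`build_catalog_request`, `parse_catalog`, `fetch_catalog`) and the 10-second timeout are illustrative assumptions, not part of the Phala API. It assumes the response carries the `data` array shown above and indexes the model objects by `id`.

```python
import json
import urllib.request

CATALOG_URL = "https://inference.phala.com/v1/models"


def build_catalog_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with the bearer token header."""
    return urllib.request.Request(
        CATALOG_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def parse_catalog(body: bytes) -> dict:
    """Index model objects by their `id` field.

    Assumes the body is JSON with a top-level `data` list, as in the
    response example; a missing `data` key yields an empty index.
    """
    payload = json.loads(body)
    return {model["id"]: model for model in payload.get("data", [])}


def fetch_catalog(api_key: str, timeout: float = 10.0) -> dict:
    """Fetch the live catalog and return it indexed by model id."""
    request = build_catalog_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_catalog(response.read())
```

Calling `fetch_catalog("<API_KEY>")` at startup and looking up the model ID you intend to use is one way to avoid shipping a stale hardcoded ID.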

Model Object Fields

Field	Description
`id`	Model identifier for API calls
`name`	Human-readable model name
`is_tee`	`true` if the model can be served confidentially by a verified TEE provider
`description`	Model or provider description
`context_length`	Maximum context window
`max_output_length`	Maximum output length
`pricing.prompt`	Input token price per token; multiply by 1,000,000 for per-million-token pricing
`pricing.completion`	Output token price per token; multiply by 1,000,000 for per-million-token pricing
`providers`	Serving routes available for the model
`input_modalities`	Supported input types, such as `text` or `image`
`output_modalities`	Supported output types, such as `text` or `embeddings`
`supported_parameters`	Request parameters accepted by the model
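The pricing fields are per-token decimal strings, so the per-million conversion is a single multiplication. The sketch below uses `Decimal` to avoid floating-point rounding on such small values. The function names are hypothetical, and `estimate_cost` assumes a request is billed as tokens times the listed per-token price with no other fees, which the catalog itself does not state.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def per_million(price_per_token: str) -> Decimal:
    """Convert a per-token price string to a per-million-token price."""
    return Decimal(price_per_token) * TOKENS_PER_MILLION


def estimate_cost(pricing: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Rough cost estimate: tokens multiplied by the listed per-token prices.

    Assumption: billing is linear in token counts with no extra charges.
    """
    prompt_cost = Decimal(pricing["prompt"]) * prompt_tokens
    completion_cost = Decimal(pricing["completion"]) * completion_tokens
    return prompt_cost + completion_cost
```

For the sample model above, `per_million("0.00000030")` is numerically 0.30 and `per_million("0.00000240")` is numerically 2.40.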

Find Verifiable TEE Models

Filter for models that can be served confidentially:

curl -s https://inference.phala.com/v1/models \
  -H "Authorization: Bearer <API_KEY>" | \
  jq -r '.data[] | select(.is_tee == true) | .id'

`is_tee: true` means only that the model can be served confidentially; it is not itself proof for any particular response. The receipt remains the per-response proof: read the `x-receipt-id` header, then verify it with Attestation Report and Get Receipt.
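The jq filter above can be mirrored in Python once the catalog has been parsed. This sketch assumes the catalog is a dict with the `data` array from the response example. `confidential_model_ids`, `supports`, and `receipt_id` are hypothetical helper names. `receipt_id` only extracts the `x-receipt-id` value from a mapping of response headers; verifying it with Attestation Report and Get Receipt is not shown.

```python
from typing import List, Mapping, Optional


def confidential_model_ids(catalog: dict) -> List[str]:
    """Return ids of models whose `is_tee` flag is exactly true."""
    return [
        model["id"]
        for model in catalog.get("data", [])
        if model.get("is_tee") is True
    ]


def supports(model: dict, *params: str) -> bool:
    """Check that every named request parameter is listed for the model."""
    return set(params) <= set(model.get("supported_parameters", []))


def receipt_id(headers: Mapping[str, str]) -> Optional[str]:
    """Look up the x-receipt-id header, ignoring header-name case."""
    for name, value in headers.items():
        if name.lower() == "x-receipt-id":
            return value
    return None
```

A caller might combine the first two helpers to pick a confidential model that also lists `tools` in `supported_parameters`, then record `receipt_id(...)` for each response it needs to verify later.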
