> Get started quickly with pre-deployed models running in GPU TEE with OpenAI-compatible interface. Pay per request.

# On-demand API

## Overview

The On-demand Confidential AI API provides a secure, OpenAI-compatible interface for running AI models in a TEE on GPU hardware. You pay per request and manage no infrastructure. Developers can integrate AI into their applications with hardware-level privacy protection, so user data remains confidential during inference. Browse the [available confidential AI models](https://phala.com/confidential-ai-models) to find one for your application.

For dedicated GPU resources with hourly pricing, see [Dedicated Models](/phala-cloud/confidential-ai/confidential-gpu/model-template). Both options use the same API with identical features; they differ only in billing and resource allocation.

## Prerequisites

Before you begin, make sure your account holds at least \$5, the minimum required to create an API key. To add funds, go to **Dashboard** and click **Deposit**.

Navigate to **Dashboard** → **Confidential AI API** and click **Enable**. Then create your first API key and click the key to copy it.

<Frame>
  <img src="https://mintcdn.com/phalanetwork-1606097b/416gZMDMREnPDd33/images/confidential-ai/confidential-model/api-keys.png?fit=max&auto=format&n=416gZMDMREnPDd33&q=85&s=74fe03c7de5d4e99800da48f5c3e5cb9" alt="GPU TEE API Generate Key" width="2438" height="918" data-path="images/confidential-ai/confidential-model/api-keys.png" />
</Frame>

Once you have the API key, you can start making requests to the Confidential AI API.
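
Hard-coding the key in source files makes it easy to leak. One common pattern is to read it from an environment variable instead. The sketch below assumes a variable named `REDPILL_API_KEY`, which is a hypothetical name chosen for illustration and not one the platform defines; `load_api_key` and `auth_headers` are likewise illustrative helpers.

```python
import os


def load_api_key(env_var: str = "REDPILL_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is unset.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the bearer-token headers for a JSON request to the API."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

With the variable exported in your shell, `auth_headers(load_api_key())` yields the headers for an authenticated request without the key ever appearing in your code.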

## Make Your Secure Request

Replace `<API_KEY>` with your actual API key in the examples below. The examples use the DeepSeek V3 0324 model, but you can choose any other available model.

<CodeGroup>
  ```python Python theme={"system"}
  # Install the OpenAI SDK first: `pip3 install openai`

  from openai import OpenAI

  client = OpenAI(api_key="<API_KEY>", base_url="https://api.redpill.ai/v1")

  # Non-streaming request: the full reply arrives in a single response object.
  response = client.chat.completions.create(
      model="phala/deepseek-chat-v3-0324",
      messages=[
          {"role": "system", "content": "You are a helpful assistant"},
          {"role": "user", "content": "What is your model name?"},
      ],
  )
  print(response.choices[0].message.content)
  ```

  ```typescript TypeScript theme={"system"}
  import OpenAI from 'openai';

  const client = new OpenAI({
    baseURL: 'https://api.redpill.ai/v1',
    apiKey: '<API_KEY>',
  });

  async function main() {
    const completion = await client.chat.completions.create({
      model: 'phala/deepseek-chat-v3-0324',
      messages: [
        {
          role: 'user',
          content: 'What is the meaning of life?',
        },
      ],
    });
    console.log(completion.choices[0].message);
  }

  main();
  ```

  ```bash CLI theme={"system"}
  curl -X 'POST' \
    'https://api.redpill.ai/v1/chat/completions' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer <API_KEY>' \
    -d '{
    "messages": [
      {
        "content": "You are a helpful assistant.",
        "role": "system"
      },
      {
        "content": "What is your model name?",
        "role": "user"
      }
    ],
    "stream": true,
    "model": "phala/deepseek-chat-v3-0324"
  }'
  ```
</CodeGroup>
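
The CLI example sets `"stream": true`. With the Python SDK, a streamed response arrives as an iterator of chunks rather than a single message, so the reply text has to be assembled from each chunk's `delta`. The following is a minimal sketch, assuming the endpoint follows the OpenAI streaming chunk format; `collect_stream` and `stream_reply` are illustrative helper names, not part of the SDK.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        # Chunks without text (None or empty string) contribute nothing.
        if delta:
            parts.append(delta)
    return "".join(parts)


def stream_reply(api_key: str, prompt: str) -> str:
    """Send one streamed request and return the assembled reply.

    This function makes a network call and is billed like any other request.
    """
    # Imported here so collect_stream can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url="https://api.redpill.ai/v1")
    stream = client.chat.completions.create(
        model="phala/deepseek-chat-v3-0324",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

Calling `print(stream_reply("<API_KEY>", "What is your model name?"))` prints the full reply once the stream ends. To show text as it arrives, print each `delta` inside the loop instead of collecting it.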

### Available Models

We support [14+ models](https://redpill.ai/models) running in GPU TEE from multiple providers. On that page, select the **GPU TEE** checkbox to see all options.

#### Phala Provider

| Model                                  | Model ID                                 | Context | Pricing (per 1M tokens) |
| -------------------------------------- | ---------------------------------------- | ------- | ----------------------- |
| DeepSeek V3 0324                       | `deepseek/deepseek-chat-v3-0324`         | 163K    | $0.28 / $1.14           |
| Qwen2.5 VL 72B Instruct                | `qwen/qwen2.5-vl-72b-instruct`           | 65K     | $0.59 / $0.59           |
| Google Gemma 3 27B                     | `google/gemma-3-27b-it`                  | 53K     | $0.11 / $0.40           |
| OpenAI GPT OSS 120B                    | `openai/gpt-oss-120b`                    | 131K    | $0.10 / $0.49           |
| OpenAI GPT OSS 20B                     | `openai/gpt-oss-20b`                     | 131K    | $0.04 / $0.15           |
| Qwen2.5 7B Instruct                    | `qwen/qwen-2.5-7b-instruct`              | 32K     | $0.04 / $0.10           |
| Sentence Transformers all-MiniLM-L6-v2 | `sentence-transformers/all-minilm-l6-v2` | 512     | \$0.000005              |

#### NearAI Provider

| Model                  | Model ID                           | Context | Pricing (per 1M tokens) |
| ---------------------- | ---------------------------------- | ------- | ----------------------- |
| DeepSeek V3.1          | `deepseek/deepseek-chat-v3.1`      | 163K    | $1.00 / $2.50           |
| Qwen3 30B A3B Instruct | `qwen/qwen3-30b-a3b-instruct-2507` | 262K    | $0.15 / $0.45           |
| Z.AI GLM 4.6           | `z-ai/glm-4.6`                     | 202K    | $0.75 / $2.00           |

#### Tinfoil Provider

| Model                       | Model ID                              | Context | Pricing (per 1M tokens) |
| --------------------------- | ------------------------------------- | ------- | ----------------------- |
| DeepSeek R1 0528            | `deepseek/deepseek-r1-0528`           | 163K    | $2.00 / $2.00           |
| Qwen3 Coder 480B A35B       | `qwen/qwen3-coder-480b-a35b-instruct` | 262K    | $2.00 / $2.00           |
| Qwen3 VL 30B A3B            | `qwen/qwen3-vl-30b-a3b-instruct`      | 262K    | $2.00 / $2.00           |
| Meta Llama 3.3 70B Instruct | `meta-llama/llama-3.3-70b-instruct`   | 131K    | $2.00 / $2.00           |

<Note>
  All models run in GPU TEEs with hardware attestation. Pricing shows input/output token costs. Browse the full list at [redpill.ai/models](https://redpill.ai/models).
</Note>
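
Because prices are quoted per 1M tokens, the cost of a single request is its token counts scaled by the input and output rates. Below is a minimal sketch of that arithmetic, using the DeepSeek V3 0324 rates from the Phala table. `estimate_cost` is an illustrative helper, and actual billing may round or meter usage differently.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices in the tables are per 1M tokens


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: str, output_price: str) -> Decimal:
    """Return the estimated USD cost of one request.

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return input_cost + output_cost


# DeepSeek V3 0324: $0.28 input / $1.14 output per 1M tokens.
# A request with 2,000 input tokens and 500 output tokens:
cost = estimate_cost(2_000, 500, "0.28", "1.14")
print(f"${cost:.5f}")  # → $0.00113
```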

## Verify Your AI is Running Securely

Every response comes with a cryptographic proof that it ran inside a secure TEE. The TEE itself generates this proof, which lets you confirm that the response is secure and trustworthy. See [Verify](/phala-cloud/confidential-ai/verify/overview) to learn how to verify that your AI is running securely.

## Next Steps

The Confidential AI API also supports several advanced features:

* [Tool Calling](/phala-cloud/confidential-ai/confidential-model/tool-calling) helps you call tools from your AI models.
* [Images and Vision](/phala-cloud/confidential-ai/confidential-model/images-and-vision) helps you use images and vision models in Confidential AI.
* [Structured Output](/phala-cloud/confidential-ai/confidential-model/structured-output) helps you get structured output from your AI models.
* [Streaming](/phala-cloud/confidential-ai/confidential-model/streaming) helps you get streaming responses from your AI models.
* [Playground](/phala-cloud/confidential-ai/confidential-model/playground) lets you try Confidential AI models in a private environment.
