
Overview

The On-demand Confidential AI API provides a secure, OpenAI-compatible interface for running AI models inside a Trusted Execution Environment (TEE) on GPU hardware. You pay per request and manage no infrastructure. This lets developers build AI applications with hardware-level privacy protection, so user data remains confidential during inference. Browse the available confidential AI models for your application.

For dedicated GPU resources with hourly pricing, see Dedicated Models. Both options use the same API with identical features; the only differences are billing and resource allocation.

Prerequisites

Before you begin, make sure your account holds enough funds to get an API key: you need at least $5. Go to Dashboard and click Deposit to add funds. Then navigate to Dashboard > Confidential AI API and click Enable. Finally, create your first API key and click the key to copy it.
Once you have the API key, you can start making requests to the Confidential AI API.

Make Your Secure Request

Replace <API_KEY> with your actual API key in the examples below. We use the DeepSeek V3 0324 model as an example, but you can choose any other available model.
# Install the OpenAI SDK first: pip3 install openai

from openai import OpenAI

# Point the OpenAI client at the Confidential AI API endpoint
client = OpenAI(api_key="<API_KEY>", base_url="https://api.redpill.ai/v1")

# Non-streaming request: the full completion comes back in one response object
response = client.chat.completions.create(
    model="phala/deepseek-chat-v3-0324",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is your model name?"},
    ],
)
print(response.choices[0].message.content)
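
To keep the key out of source files, one option is to read it from an environment variable. The sketch below assumes a variable named REDPILL_API_KEY, a hypothetical name chosen for illustration and not something the API requires; `client_kwargs` is likewise a hypothetical helper. The base URL is the one used in the example above.

```python
import os

BASE_URL = "https://api.redpill.ai/v1"  # endpoint from the example above


def client_kwargs(env=os.environ):
    """Build keyword arguments for OpenAI(...) from the environment.

    REDPILL_API_KEY is a hypothetical variable name used for illustration.
    """
    api_key = env.get("REDPILL_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set REDPILL_API_KEY to your Confidential AI API key")
    return {"api_key": api_key, "base_url": BASE_URL}
```

With the SDK installed, the client can then be created as `OpenAI(**client_kwargs())`.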

Available Models

We support 14+ models running in GPU TEEs from multiple providers. Click the GPU TEE checkbox to see all options.

Phala Provider

| Model | Model ID | Context | Pricing (per 1M tokens) |
| --- | --- | --- | --- |
| DeepSeek V3 0324 | deepseek/deepseek-chat-v3-0324 | 163K | $0.28 / $1.14 |
| Qwen2.5 VL 72B Instruct | qwen/qwen2.5-vl-72b-instruct | 65K | $0.59 / $0.59 |
| Google Gemma 3 27B | google/gemma-3-27b-it | 53K | $0.11 / $0.40 |
| OpenAI GPT OSS 120B | openai/gpt-oss-120b | 131K | $0.10 / $0.49 |
| OpenAI GPT OSS 20B | openai/gpt-oss-20b | 131K | $0.04 / $0.15 |
| Qwen2.5 7B Instruct | qwen/qwen-2.5-7b-instruct | 32K | $0.04 / $0.10 |
| Sentence Transformers all-MiniLM-L6-v2 | sentence-transformers/all-minilm-l6-v2 | 512 | $0.000005 |

NearAI Provider

| Model | Model ID | Context | Pricing (per 1M tokens) |
| --- | --- | --- | --- |
| DeepSeek V3.1 | deepseek/deepseek-chat-v3.1 | 163K | $1.00 / $2.50 |
| Qwen3 30B A3B Instruct | qwen/qwen3-30b-a3b-instruct-2507 | 262K | $0.15 / $0.45 |
| Z.AI GLM 4.6 | z-ai/glm-4.6 | 202K | $0.75 / $2.00 |

Tinfoil Provider

| Model | Model ID | Context | Pricing (per 1M tokens) |
| --- | --- | --- | --- |
| DeepSeek R1 0528 | deepseek/deepseek-r1-0528 | 163K | $2.00 / $2.00 |
| Qwen3 Coder 480B A35B | qwen/qwen3-coder-480b-a35b-instruct | 262K | $2.00 / $2.00 |
| Qwen3 VL 30B A3B | qwen/qwen3-vl-30b-a3b-instruct | 262K | $2.00 / $2.00 |
| Meta Llama 3.3 70B Instruct | meta-llama/llama-3.3-70b-instruct | 131K | $2.00 / $2.00 |

All models run in GPU TEEs with hardware attestation. Pricing shows input/output token costs. Browse the full list at redpill.ai/models.
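
Because pricing is quoted per 1M tokens with separate input and output rates, the cost of a single request can be estimated with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, the token counts are made up, and the rates assume the DeepSeek V3 0324 row reads $0.28 for input and $1.14 for output. Actual billing may differ.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate request cost in USD from token counts and per-1M-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Assumed rates for DeepSeek V3 0324: $0.28 input / $1.14 output per 1M tokens
cost = estimate_cost_usd(
    input_tokens=2_000,
    output_tokens=500,
    input_price_per_m=0.28,
    output_price_per_m=1.14,
)
print(f"Estimated cost: ${cost:.6f}")
```

Here 2,000 input tokens contribute $0.00056 and 500 output tokens contribute $0.00057, for roughly $0.00113 in total.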

Verify Your AI is Running Securely

Every response to a secure request comes with cryptographic proof that it ran in a secure TEE. This proof is generated by the TEE itself and ensures the response is secure and trustworthy. Click Verify to learn how to verify that your AI is running securely.

Next Steps

The Confidential AI API also offers several advanced features:
  • Tool Calling helps you call tools from your AI models.
  • Images and Vision helps you use images and vision models in Confidential AI.
  • Structured Output helps you get structured output from your AI models.
  • Streaming helps you get streaming responses from your AI models.
  • Playground helps you experiment with Confidential AI models in a private environment.
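
As a rough illustration of the Streaming feature, the sketch below assumes the API follows the OpenAI SDK's standard streaming behavior, where `stream=True` returns an iterator of chunks and text arrives in `choices[0].delta.content`. The OpenAI-compatible interface suggests this, but this page does not spell it out. `stream_text` is a hypothetical helper name, and `client` is expected to be an `OpenAI` client configured as in the request example.

```python
def stream_text(client, model, prompt):
    """Yield text fragments from a streamed chat completion.

    Assumes OpenAI-SDK-style streaming chunks (an assumption, see lead-in).
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


# Example usage (requires a configured client and network access):
# for piece in stream_text(client, "phala/deepseek-chat-v3-0324", "Hello"):
#     print(piece, end="", flush=True)
```

Wrapping the loop in a generator keeps the chunk handling in one place, so callers can print fragments as they arrive or join them into a full string.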