> ## Documentation Index
> Fetch the complete documentation index at: https://docs.phala.com/llms.txt
> Use this file to discover all available pages before exploring further.

> This overview introduces Confidential AI, running AI models with hardware-level privacy in GPU TEEs. Explain how to use Confidential AI to inference LLMs, deploy custom models, and run AI workloads with verifiable attestation and near-native performance.

# Overview

<img src="https://mintcdn.com/phalanetwork-1606097b/416gZMDMREnPDd33/images/confidential-ai/confidential-ai-overview.jpeg?fit=max&auto=format&n=416gZMDMREnPDd33&q=85&s=7078e43ac11a53f8be6d0d9626fa9941" alt="Confidential AI overview - run AI models with hardware-level privacy in GPU TEEs" width="2880" height="1616" data-path="images/confidential-ai/confidential-ai-overview.jpeg" />

## Why Confidential AI?

Traditional cloud AI deployments expose your models and data to the cloud provider. Confidential AI runs inference, training, and fine-tuning inside GPU TEEs — the cloud provider cannot access your models or data. You get cryptographic proof that execution happened in a trusted environment.

Phala Cloud offers two ways to run confidential AI workloads: pre-deployed **Models** (API or Dedicated) for quick integration, or **GPU TEE** for custom infrastructure. See [available models](https://phala.com/confidential-ai-models) for supported models, use cases, and deployment options.

<Columns cols={2}>
  <Card icon="webhook" href="/phala-cloud/confidential-ai/confidential-model/confidential-ai-api" title="API Access" arrow="true">
    **Pre-deployed models, pay per request**

    Best for quick integration. 5 minute setup with OpenAI-compatible API and no infrastructure management.
  </Card>

  <Card icon="workflow" href="/phala-cloud/confidential-ai/confidential-gpu/model-template" title="Dedicated Models" arrow="true">
    **Same models, dedicated performance**

    Best for high-volume workloads. Same API as API Access, but with hourly billing and dedicated GPU resources.
  </Card>

  <Card icon="microchip" title="GPU TEE" href="/phala-cloud/confidential-ai/confidential-gpu/deploy-and-verify" arrow="true">
    **Custom infrastructure, full control**

    Best for custom models. Rent dedicated GPU TEE servers for training, fine-tuning, or any custom workload.
  </Card>
</Columns>

## Quick Tour of Confidential AI

### Models: API and Dedicated

**API Access** provides pre-deployed LLMs with OpenAI-compatible APIs for quick integration. Pay per request with no infrastructure to manage. Start with [API Access](/phala-cloud/confidential-ai/confidential-model/confidential-ai-api).

For advanced API features, explore [Tool Calling](/phala-cloud/confidential-ai/confidential-model/tool-calling) to enable LLMs to interact with external tools and APIs securely within TEE.

**Dedicated Models** give you the same pre-deployed models but with dedicated GPU resources and hourly pricing. Choose this for predictable performance or high-volume workloads. See [Dedicated Models](/phala-cloud/confidential-ai/confidential-gpu/model-template).

### GPU TEE: Custom Infrastructure

For complete infrastructure control beyond pre-deployed models, use [GPU TEE](/phala-cloud/confidential-ai/confidential-gpu/deploy-and-verify) to rent dedicated GPU servers. Run any workload including custom models for inference, training, or fine-tuning. Configure GPU, CPU, RAM, and storage to match your exact needs.

### Verify Attestation and Signature

To ensure your workloads run securely in TEE, you can [Verify Attestation](./verify-attestation) to check the TEE hardware, operating system, source code, and distributed root-of-trust attestations.

Then you can [Verify Signature](./verify-signature) to confirm the integrity of your Confidential AI API requests and responses.

### Benchmark

Our [benchmark](/phala-cloud/confidential-ai/benchmark) shows GPU TEE mode achieves 99% of native performance on H100/H200 GPUs.

### FAQs

Check [FAQs](/phala-cloud/confidential-ai/faqs) for frequently asked questions about Confidential AI.

## What makes Phala Cloud Confidential AI Different?

* **Drop-in compatibility**: OpenAI-compatible API with popular models (DeepSeek, Llama, GPT-OSS, Qwen) ready for immediate use
* **Verifiable security**: Hardware-enforced privacy with cryptographic attestation proving execution in genuine TEE environments
* **Flexible deployment**: Choose from API access (pay per request), dedicated models (hourly dedicated GPU), or GPU TEE (full infrastructure control)

## Open Source Foundation

Our underlying technology is open source. Check out the [dstack](https://github.com/Dstack-TEE/dstack) repository to see how LLMs run securely in GPU TEEs.
