Confidential AI overview - run AI models with hardware-level privacy in GPU TEEs

Why Confidential AI?

Traditional cloud AI deployments expose your models and data to the cloud provider. Confidential AI addresses this by running everything inside a hardware-protected Trusted Execution Environment (TEE). Your models stay private, your data stays secure, and you get cryptographic proof that execution happened in a trusted environment. Phala Cloud offers two ways to run confidential AI workloads: pre-deployed Models (API or Dedicated) for quick integration, or GPU TEE for custom infrastructure. Explore our Confidential AI Models to see available pre-trained models, use cases, and deployment options.

Quick Tour of Confidential AI

Models: API and Dedicated

API Access provides pre-deployed LLMs with OpenAI-compatible APIs for quick integration. Pay per request with no infrastructure to manage. Start with API Access. For advanced API features, explore Tool Calling to enable LLMs to interact with external tools and APIs securely within TEE. Dedicated Models give you the same pre-deployed models but with dedicated GPU resources and hourly pricing. Choose this for predictable performance or high-volume workloads. See Dedicated Models.
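Because the API is OpenAI-compatible, calling a pre-deployed model follows the familiar chat-completions shape. The sketch below builds such a request with the Python standard library; the base URL, model name, and API key are placeholder assumptions, not documented values — substitute the real ones from your Phala Cloud dashboard.

```python
import json
import urllib.request

# Placeholder endpoint and model name -- illustrative assumptions only;
# use the values from your Phala Cloud dashboard in practice.
BASE_URL = "https://example-confidential-ai.example.com/v1"
MODEL = "example-llm"

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for the TEE-hosted model."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("sk-demo", "Hello from a TEE!")
# Sending it is one call: urllib.request.urlopen(req)
print(req.full_url)
print(json.loads(req.data)["model"])
```

Since the request format matches OpenAI's, existing OpenAI client libraries can typically be pointed at the Phala endpoint by overriding their base URL.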

GPU TEE: Custom Infrastructure

For complete infrastructure control beyond pre-deployed models, use GPU TEE to rent dedicated GPU servers. Run any workload including custom models for inference, training, or fine-tuning. Configure GPU, CPU, RAM, and storage to match your exact needs.

Verify Attestation and Signature

To ensure your workloads run securely in TEE, you can Verify Attestation to check the TEE hardware, operating system, source code, and distributed root-of-trust attestations. Then you can Verify Signature to confirm the integrity of your Confidential AI API requests and responses.
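The exact signing scheme is defined in the Verify Signature guide; as an illustrative sketch only, the core pattern is to recompute a digest over the raw response body and compare it against the digest the TEE signed. The field names and hash choice below are assumptions for illustration, not the documented protocol.

```python
import hashlib
import hmac

def body_digest(body: bytes) -> str:
    """Recompute the SHA-256 digest of a raw response body."""
    return hashlib.sha256(body).hexdigest()

def digests_match(computed: str, signed: str) -> bool:
    """Constant-time comparison between the locally computed digest and
    the digest carried in the TEE's signed material (hypothetical field)."""
    return hmac.compare_digest(computed, signed)

# Illustrative response body; in a real flow the signed digest would come
# from the service's signature/attestation data, not be computed locally.
response_body = b'{"choices":[{"message":{"content":"hello"}}]}'
signed_digest = hashlib.sha256(response_body).hexdigest()  # stand-in value

assert digests_match(body_digest(response_body), signed_digest)
assert not digests_match(body_digest(b"tampered"), signed_digest)
```

In practice the signature is produced with a key rooted in the TEE's attestation chain, so verifying it also means checking the attestation report itself; consult the Verify Attestation and Verify Signature guides for the actual scheme.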

Benchmark

Our performance benchmark shows that TEE mode on H100/H200 GPUs achieves up to 99% of native throughput. This means you get confidential computing with minimal performance penalty.

FAQs

See the FAQs for answers to common questions about Confidential AI.

What Makes Phala Cloud Confidential AI Different?

  • Seamless integration: Drop-in OpenAI API compatibility with popular models (DeepSeek, Llama, GPT-OSS, Qwen) ready for immediate use
  • Verifiable security: Hardware-enforced privacy with cryptographic attestation proving execution in genuine TEE environments
  • Flexible deployment: Choose from API access (pay per request), dedicated models (hourly dedicated GPU), or GPU TEE (full infrastructure control)

Open Source Foundation

Our underlying technology is open source. Check out the dstack repository to see how LLMs run securely in GPU TEEs.