Phala Cloud Documentation — Confidential AI on TEE

Replicas let you run several copies of the same application side by side on Phala Cloud. This page covers the concepts and the decision criteria. For concrete dashboard and CLI steps, see Replicating CVMs.

What a Replica Is

A replica is an independent CVM instance provisioned from an existing source CVM. Replicas of the same app share an app_id and the same compose hash, but each replica has its own:

vm_uuid, endpoint URL, and internal IP
TEE attestation report
Process state, memory, and local disk
Billing meter

Replicas are created one at a time from an existing source, not as an atomic group. If you ask for three replicas, Phala Cloud creates three independent CVMs whose only formal relationship is the shared app_id.
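
Because the shared app_id is the only formal link, any tooling that wants to treat replicas as a set has to group them itself. The sketch below shows one way to do that in Python. The Cvm record shape, the example identifiers, and the endpoint URLs are assumptions made for illustration, not the shape returned by any Phala Cloud API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cvm:
    # Hypothetical record: only app_id and vm_uuid are named in the text above.
    app_id: str
    vm_uuid: str
    endpoint: str


def group_by_app(cvms):
    """Group CVM records by app_id; each group is one app's replica set."""
    groups = defaultdict(list)
    for cvm in cvms:
        groups[cvm.app_id].append(cvm)
    return dict(groups)


# Illustrative data: two replicas of one app plus an unrelated CVM.
cvms = [
    Cvm("app-a", "uuid-1", "https://replica-1.example.com"),
    Cvm("app-a", "uuid-2", "https://replica-2.example.com"),
    Cvm("app-b", "uuid-3", "https://other.example.com"),
]
replica_sets = group_by_app(cvms)
```

Here replica_sets["app-a"] holds two records with distinct vm_uuid values, which mirrors the model described above: same app, separate CVMs.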

What Replicas Give You

Horizontal capacity. More replicas serve more concurrent traffic, assuming your workload is stateless or has a shared data layer behind it.
Failure isolation. If one replica’s node goes down for maintenance, the others keep serving. A single CVM has no such fallback.
Regional placement. You can pin different replicas to nodes in different regions by passing a node_id when you replicate, which reduces latency for geographically distributed users.
Independent attestation. Each replica produces its own TDX quote. A relying party can verify any individual replica without trusting the others, which matters when the attestation is part of your security story.

What Replicas Do Not Give You

Load balancing. Phala Cloud does not distribute traffic across replicas. Each replica gets its own public endpoint. You need an external load balancer, DNS round-robin, or service mesh to split traffic.
Shared state. Replicas do not share filesystems, memory, or any in-process state. Two replicas writing to their local disks write to two separate disks. If your app keeps state locally, replicating it will cause divergence.
Automatic propagation of compose updates. Updating the compose of one replica does not update the others. Each replica is its own CVM and must be updated individually. See Upgrade Application for the update flow.
Linear performance scaling. Two replicas do not automatically serve twice the throughput. The upstream bottleneck (database, queue, external API) often dominates. Measure before assuming.
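
Since each replica has its own endpoint and nothing routes between them, the simplest client-side option is to rotate through the endpoints and skip any that are unhealthy. The following is a minimal Python sketch of that idea, not a replacement for a real load balancer. The endpoint URLs are placeholders, and the is_healthy callable is an assumption: in practice it would wrap whatever health check your application exposes.

```python
import itertools


class RoundRobinRouter:
    """Rotate across replica endpoints, skipping ones that fail a health check."""

    def __init__(self, endpoints, is_healthy):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._is_healthy = is_healthy  # caller-supplied callable: endpoint -> bool

    def next_endpoint(self):
        # Try each endpoint at most once per call so a full outage fails fast.
        for _ in range(len(self._endpoints)):
            candidate = next(self._cycle)
            if self._is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy replica available")


# Placeholder endpoints; replica 2 is simulated as down.
endpoints = [
    "https://replica-1.example.com",
    "https://replica-2.example.com",
    "https://replica-3.example.com",
]
down = {"https://replica-2.example.com"}
router = RoundRobinRouter(endpoints, is_healthy=lambda url: url not in down)
picks = [router.next_endpoint() for _ in range(4)]
```

With replica 2 marked down, the four picks alternate between replica 1 and replica 3. Rotation like this spreads requests but does nothing about shared state, so it only suits workloads that meet the stateless or shared-backing-store condition described above.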

When to Use Replicas

Use replicas when:

The workload is stateless, or reads and writes go to a shared backing store (managed Postgres, object storage, an external queue).
You need high availability and one CVM outage would be unacceptable.
You want to distribute traffic geographically and you have a way to route users to the nearest replica.
The attestation model assumes independent instances — each replica must be verifiable on its own.

Avoid replicas when:

The workload keeps state on local disk and you have no replication layer. Two divergent copies of the same database do not make a cluster.
The workload assumes it is the only writer to an external resource (a singleton job scheduler, a leader-elected worker).
You are trying to solve a vertical scaling problem. If one replica is slow because it is CPU- or memory-bound, bigger instance types often help more than more copies. See Resize Resources.

Cost Model

Each replica is billed as an independent CVM at the full rate of its instance type. Three tdx.medium replicas cost three times one tdx.medium. There is no bulk discount for adding replicas. Factor this into your scaling budget before raising the replica count. Stopping a replica pauses compute billing but keeps its disk allocation, which is billed separately. Deleting a replica releases both.
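
The billing rules above reduce to simple arithmetic: compute is charged only while a replica runs, disk is charged for as long as the replica exists, and both scale linearly with the replica count. The Python sketch below works through that calculation. The hourly rates are invented purely for illustration and are not Phala Cloud prices; substitute the real rates for your instance type and disk size.

```python
from decimal import Decimal


def replica_cost(replica_count, compute_rate, disk_rate, running_hours, total_hours):
    """Estimate cost for identical replicas over a billing period.

    compute_rate and disk_rate are per-hour, per-replica rates (hypothetical values).
    running_hours is the time each replica is running; total_hours is the time it exists.
    """
    if running_hours > total_hours:
        raise ValueError("running_hours cannot exceed total_hours")
    per_replica = compute_rate * running_hours + disk_rate * total_hours
    return replica_count * per_replica


# Made-up rates for illustration only.
COMPUTE = Decimal("0.10")  # per hour while running
DISK = Decimal("0.01")     # per hour while the replica exists

one = replica_cost(1, COMPUTE, DISK, running_hours=720, total_hours=720)
three = replica_cost(3, COMPUTE, DISK, running_hours=720, total_hours=720)
half_stopped = replica_cost(1, COMPUTE, DISK, running_hours=360, total_hours=720)
```

Under these assumed rates, three always-on replicas cost exactly three times one, and a replica stopped for half the period still pays its full disk charge.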

How to Create Replicas

The concrete steps live in Replicating CVMs, which covers:

The dashboard Scale dialog on the app detail page
The phala cvms replicate CLI command
Onchain KMS workflows for contract-governed apps
The replicate API for custom integrations

Related Documentation

Replicating CVMs: concrete dashboard, CLI, and API steps.
Upgrade Application: how to update compose or environment, including across replicas.
Resize Resources: vertical scaling when more replicas is not the answer.
Deploy Your First CVM: single-CVM getting started.
