Replicas let you run several copies of the same application side by side on Phala Cloud. This page covers the concepts and the decision criteria. For concrete dashboard and CLI steps, see Replicating CVMs.

What a Replica Is

A replica is an independent CVM instance provisioned from an existing source CVM. Replicas of the same app share an app_id and the same compose hash, but each replica has its own:
  • vm_uuid, endpoint URL, and internal IP
  • TEE attestation report
  • Process state, memory, and local disk
  • Billing meter
Replicas are created one at a time from an existing source, not as an atomic group. If you ask for three replicas, Phala Cloud creates three independent CVMs whose only formal relationship is the shared app_id.
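The relationship described above can be sketched as a small data model. This is only an illustration, not a Phala Cloud API: the `Replica` class, the `group_by_app` helper, and the field names `compose_hash` and `endpoint` are hypothetical, while `app_id` and `vm_uuid` follow the terms used on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical record for one CVM; not an official Phala Cloud type."""
    app_id: str        # shared by every replica of the same app
    compose_hash: str  # shared by every replica of the same app
    vm_uuid: str       # unique to this replica
    endpoint: str      # unique to this replica


def group_by_app(replicas):
    """Group CVM records by app_id, the only formal link between replicas."""
    groups = defaultdict(list)
    for replica in replicas:
        groups[replica.app_id].append(replica)
    return dict(groups)
```

Grouping by `app_id` is the only way this sketch treats CVMs as "the same app"; everything else about each record stays per-replica.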

What Replicas Give You

  • Horizontal capacity. More replicas serve more concurrent traffic, assuming your workload is stateless or has a shared data layer behind it.
  • Failure isolation. If one replica’s node goes down for maintenance, the others keep serving. A single CVM has no such fallback.
  • Regional placement. You can pin different replicas to nodes in different regions by passing a node_id when you replicate, which reduces latency for geographically distributed users.
  • Independent attestation. Each replica produces its own TDX quote. A relying party can verify any individual replica without trusting the others, which matters when the attestation is part of your security story.

What Replicas Do Not Give You

  • Load balancing. Phala Cloud does not distribute traffic across replicas. Each replica gets its own public endpoint. You need an external load balancer, DNS round-robin, or service mesh to split traffic.
  • Shared state. Replicas do not share filesystems, memory, or any in-process state. Two replicas writing to their local disks write to two separate disks. If your app keeps state locally, replicating it will cause divergence.
  • Automatic propagation of compose updates. Updating the compose of one replica does not update the others. Each replica is its own CVM and must be updated individually. See Upgrade Application for the update flow.
  • Linear performance scaling. Two replicas do not automatically serve twice the throughput. The upstream bottleneck (database, queue, external API) often dominates. Measure before assuming.
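Because each replica has its own public endpoint, something outside Phala Cloud has to choose which one receives a request. A minimal client-side sketch of round-robin selection with a pluggable health check might look like the following. The `RoundRobinRouter` class and the example URLs are hypothetical, and a production setup would more likely use an external load balancer or DNS.

```python
import itertools


class RoundRobinRouter:
    """Cycle through replica endpoints, skipping any that fail a health check."""

    def __init__(self, endpoints, is_healthy=lambda url: True):
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(self._endpoints)
        # is_healthy is caller-supplied; by default every endpoint counts as up.
        self._is_healthy = is_healthy

    def next_endpoint(self):
        # Try each endpoint at most once per call, then give up.
        for _ in range(len(self._endpoints)):
            url = next(self._cycle)
            if self._is_healthy(url):
                return url
        raise RuntimeError("no healthy replica available")


# Hypothetical endpoints, one per replica.
endpoints = [
    "https://replica-a.example.com",
    "https://replica-b.example.com",
    "https://replica-c.example.com",
]
router = RoundRobinRouter(endpoints)
target = router.next_endpoint()
```

The health check is passed in as a plain callable so the sketch stays independent of any particular HTTP client or probing strategy.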

When to Use Replicas

Use replicas when:
  • The workload is stateless, or reads and writes go to a shared backing store (managed Postgres, object storage, an external queue).
  • You need high availability and one CVM outage would be unacceptable.
  • You want to distribute traffic geographically and you have a way to route users to the nearest replica.
  • The attestation model assumes independent instances: each replica must be verifiable on its own.
Avoid replicas when:
  • The workload keeps state on local disk and you have no replication layer. Two divergent copies of the same database do not make a cluster.
  • The workload assumes it is the only writer to an external resource (a singleton job scheduler, a leader-elected worker).
  • You are trying to solve a vertical scaling problem. If a single CVM is slow because it is CPU- or memory-bound, a larger instance type often helps more than additional copies. See Resize Resources.

Cost Model

Each replica is billed as an independent CVM at the full rate of its instance type. Three tdx.medium replicas cost three times one tdx.medium. There is no bulk discount for adding replicas. Factor this into your scaling budget before raising the replica count. Stopping a replica pauses compute billing but keeps its disk allocation, which is billed separately. Deleting a replica releases both.
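The billing rules above reduce to simple arithmetic, sketched below. The hourly rates are made-up placeholders rather than real Phala Cloud prices, and the function names, as well as the assumption that both compute and disk are metered per hour, are illustrative only.

```python
from decimal import Decimal


def replica_cost(compute_rate, disk_rate, running_hours, allocated_hours):
    """Cost of one replica: compute accrues only while running,
    disk accrues for as long as the replica exists (running or stopped)."""
    return compute_rate * running_hours + disk_rate * allocated_hours


def fleet_cost(count, compute_rate, disk_rate, running_hours, allocated_hours):
    """No bulk discount: N identical replicas cost N times one replica."""
    return count * replica_cost(
        compute_rate, disk_rate, running_hours, allocated_hours
    )


# Placeholder rates per hour, NOT actual prices.
compute = Decimal("0.10")
disk = Decimal("0.01")

# One replica that ran for 100 hours and was then stopped (not deleted)
# for another 100 hours: compute bills 100 h, disk bills 200 h.
one = replica_cost(compute, disk, running_hours=100, allocated_hours=200)
three = fleet_cost(3, compute, disk, running_hours=100, allocated_hours=200)
```

With these placeholder numbers, `three` is exactly three times `one`, which is the point of the cost model: the replica count is a straight multiplier on the per-CVM bill.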

How to Create Replicas

The concrete steps live in Replicating CVMs, which covers:
  • The dashboard Scale dialog on the app detail page
  • The phala cvms replicate CLI command
  • Onchain KMS workflows for contract-governed apps
  • The replicate API for custom integrations