PET implementations have a unique failure mode: the privacy guarantee is real, but it applies to the protocol in isolation. The surrounding system — the data pipeline that feeds the protocol, the output processing that follows it, the auxiliary information available to an adversary, the implementation choices that affect the security assumptions — can invalidate the guarantee without touching the cryptography. A correctly implemented FHE scheme in a system that logs the plaintexts before encryption provides no privacy. A correctly implemented federated learning protocol whose gradient updates are vulnerable to inversion provides no privacy against the aggregator. The cryptographic core can be correct while the system fails.
01
Federated learning gradient updates are not protected against inversion attacks
Federated learning’s promise is that raw data does not leave participants. What does leave is gradient updates — the parameter changes computed from the local training data. Gradient inversion attacks, first demonstrated in 2019 and improved substantially since, can reconstruct training examples from gradient updates with high fidelity for image classifiers, text classifiers, and tabular data models. An aggregator that collects individual gradient updates — rather than using secure aggregation — can reconstruct the training data of individual participants. Deployments that implement federated learning without secure aggregation, or without differential privacy applied to the gradient updates, provide significantly weaker privacy than their architects assumed.
What this looks like in a healthcare deployment
A hospital network deploys federated learning for a diagnostic model, believing that patient data never leaves the hospitals. The central aggregator receives individual per-hospital gradient updates. A research paper published 18 months after deployment demonstrates that the specific model architecture used is vulnerable to gradient inversion, and that individual patient chest X-ray images can be reconstructed from the gradient updates with sufficient quality for clinical recognition. The hospitals’ belief that patient data had not left their systems was correct for the raw data. It was incorrect for the gradient updates that represented it.
Architecture approach that prevents this
Secure aggregation: using a cryptographic protocol (typically based on secret sharing or homomorphic encryption) that allows the aggregator to compute the sum of gradient updates without seeing any individual participant’s update. The aggregator sees only the aggregate, not the contributors. Combined with DP-SGD at each participant — clipping per-example gradients and adding calibrated noise before the update leaves the device — this provides both practical and formal privacy against gradient inversion at the aggregation layer.
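The cancellation at the heart of secure aggregation can be sketched in a few lines. The following is a simplified illustration of pairwise additive masking (in the style of the Bonawitz et al. protocol): each pair of clients derives a shared mask from a common seed, one adds it and the other subtracts it, so the masks cancel in the server’s sum while each individual update stays hidden. The seeds, field size, and integer encoding here are illustrative assumptions; a real protocol establishes seeds via key agreement and handles client dropouts.

```python
import random

PRIME = 2**61 - 1  # field modulus for masking (illustrative choice)

def masked_update(client_id, update, pairwise_seeds):
    """Mask a client's integer-encoded update so the server sees only noise.

    pairwise_seeds[(i, j)] is a seed shared between clients i and j
    (established via key agreement in a real protocol; assumed here).
    """
    masked = update
    for (i, j), seed in pairwise_seeds.items():
        if client_id not in (i, j):
            continue
        mask = random.Random(seed).randrange(PRIME)
        # The client with the lower id adds the mask, the other subtracts it,
        # so every pairwise mask cancels when the server sums all updates.
        masked = (masked + mask) % PRIME if client_id == i else (masked - mask) % PRIME
    return masked

# Three clients with integer-encoded updates (e.g. fixed-point gradients)
updates = {0: 1200, 1: 450, 2: 987}
seeds = {(0, 1): 111, (0, 2): 222, (1, 2): 333}

masked = [masked_update(cid, u, seeds) for cid, u in updates.items()]
aggregate = sum(masked) % PRIME
assert aggregate == sum(updates.values()) % PRIME  # server learns only the sum
```

Each masked value in isolation is close to uniformly random over the field; only the aggregate is meaningful, which is exactly the property that defeats per-participant gradient inversion.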
02
Differential privacy budget is not tracked across queries or training runs
Differential privacy is a composable guarantee: each query or training step consumes a privacy budget, and the total privacy loss is the sum (or a tighter composition bound) across all queries. A system that provides (ε=1)-DP per query but executes 1,000 queries without budget tracking has consumed 1,000 units of privacy budget, not 1. The composition theorem means that answering unlimited queries with DP noise eventually provides no privacy guarantee — the privacy loss accumulates without bound. Many differential privacy deployments set an ε for individual queries and never track the cumulative budget, making the per-query guarantee meaningless as a system-level property.
What this looks like in practice
A government statistical agency deploys differentially private query answering with ε=0.5 per query, believing it is providing strong privacy. Researchers make 2,000 queries over 18 months, each at ε=0.5. The cumulative privacy loss under basic composition is ε=1,000. Advanced composition gives a tighter bound at the cost of a small δ, but the cumulative loss is still orders of magnitude beyond any meaningful guarantee. The privacy guarantee that justified the data sharing arrangement has long since been exhausted. The agency has been providing effectively no differential privacy protection for the majority of the queries made against the dataset.
Architecture approach that prevents this
Privacy accountant implementation as a core infrastructure component: every differentially private operation consumes from a global privacy budget that is tracked and reported. The privacy accountant uses Rényi DP accounting or Gaussian DP accounting for tighter bounds than basic composition. The system enforces a hard budget limit — additional queries are refused when the budget is exhausted — or the budget is reset on a defined cycle with documented justification. The privacy budget and its current consumption are reported as operational metrics alongside the system’s functional metrics.
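The enforcement half of a privacy accountant can be sketched in a few lines. This is a minimal illustration using basic composition (summing epsilons) rather than the tighter Rényi or Gaussian DP accounting a production accountant would use; the budget value is illustrative. Applied to the scenario above, a global budget of ε=10 admits only the first 20 queries at ε=0.5 and refuses the rest:

```python
class PrivacyBudgetExceeded(Exception):
    pass

class PrivacyAccountant:
    """Minimal global privacy accountant using basic composition.

    Production systems would use Renyi DP or Gaussian DP accounting for
    tighter bounds; summing epsilons is shown here for clarity.
    """

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon):
        # Hard enforcement: refuse the operation before it runs, not after.
        if self.spent + epsilon > self.total_epsilon:
            raise PrivacyBudgetExceeded(
                f"budget exhausted: spent {self.spent:.2f} of {self.total_epsilon}")
        self.spent += epsilon

accountant = PrivacyAccountant(total_epsilon=10.0)
answered = 0
try:
    for _ in range(2000):        # the scenario above: 2,000 queries at eps=0.5
        accountant.charge(0.5)
        answered += 1            # ...execute the DP query here...
except PrivacyBudgetExceeded:
    pass
# Only 20 queries fit within the global budget; the remaining 1,980 are refused.
```

Reporting `accountant.spent` alongside functional metrics makes the budget an operational property rather than a design-time assumption.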
03
Homomorphic encryption is applied correctly but the surrounding system leaks plaintext
The formal guarantee of homomorphic encryption applies to the computation: the server performing the computation has access to ciphertext, not plaintext. The guarantee does not extend to the data pipeline surrounding the computation. If the client application logs the plaintext before encryption, the plaintext is in the logs. If the client sends the plaintext to an error reporting service when the encryption fails, the plaintext leaves the protected environment. If the decrypted result is cached in an unencrypted store for performance, the results are in the cache. If the key management system that holds the decryption keys is accessible to the same parties as the ciphertext, the encryption provides no protection. The HE core can be cryptographically correct while the surrounding system fails to protect the plaintext.
The most common plaintext leakage point
A financial services company deploys FHE for cloud ML inference, ensuring that the ML model provider never sees plaintext transaction data. The client application sends transaction features as ciphertext to the cloud. The application’s error logging, however, logs all input features in plaintext for debugging. The logs are stored in the same cloud environment as the inference service. The ML model provider, whose compute environment the client is using, has access to the log storage through the shared cloud tenancy. The FHE prevents the ML provider from seeing plaintext during inference. The logging exposes it through a side channel that the FHE design did not consider.
Architecture approach that prevents this
Full data flow analysis before deployment: every path through which plaintext can exit the protected environment is identified and either eliminated or explicitly accepted as within the threat model. The threat model for the FHE deployment specifies which parties are trusted and what trust level each has. Every component in the data pipeline is assessed against the threat model — not just the cryptographic components. Log sanitisation and encrypted log storage are infrastructure requirements, not optional optimisations.
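An allow-list logging filter is one concrete way to make log sanitisation an infrastructure property rather than a per-developer discipline. This is a sketch with hypothetical field names: anything not explicitly approved for logging is dropped, so plaintext features can never reach the logs by default.

```python
import logging

# Hypothetical allow-list: only fields explicitly approved for logging survive.
ALLOWED_FIELDS = {"request_id", "latency_ms", "status"}

def sanitise(record_fields):
    """Drop every field not on the allow-list; deny-by-default logging."""
    return {k: v for k, v in record_fields.items() if k in ALLOWED_FIELDS}

fields = {
    "request_id": "r-1042",
    "latency_ms": 38,
    "status": "encrypt_failed",
    "features": [5200.0, 1, 0.93],  # plaintext transaction features: must not leak
}
logging.getLogger("inference").info("request failed: %s", sanitise(fields))
assert "features" not in sanitise(fields)
```

The deny-by-default direction matters: a block-list of known-sensitive fields fails silently when a new integration adds a field nobody thought to block.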
04
MPC protocol is secure against the specified adversary model but deployed in a context where a stronger adversary applies
MPC protocols are proved secure against a specific adversary model: semi-honest (parties follow the protocol but try to learn information from their view) or malicious (parties may deviate from the protocol arbitrarily). Semi-honest protocols are significantly more efficient than maliciously secure protocols and are often deployed in contexts that actually involve malicious adversaries. An organisation that deploys a semi-honest MPC protocol between two parties who have an incentive to cheat — competitors computing a joint function — has deployed a protocol whose security proof does not apply to its deployment context. A party willing to deviate from the protocol can learn information from a semi-honest protocol that a maliciously secure protocol would have prevented it from learning.
When this matters
Two competing financial institutions deploy a semi-honest MPC protocol to compute portfolio overlap for systemic risk assessment. The protocol is secure only if both parties follow it. Institution A, which learns the result one protocol round before Institution B, deviates from the protocol — aborting after seeing the output, or substituting crafted protocol messages to probe B’s inputs. Neither behaviour is covered by the semi-honest security proof; a maliciously secure protocol would have detected or prevented it. The semi-honest protocol is functioning correctly within its security proof; the deployment context contained a stronger adversary than the proof assumed.
Architecture approach that prevents this
Threat model specification before protocol selection: the adversary model for the MPC deployment must be established from the actual relationship between the parties, not assumed to be semi-honest because semi-honest protocols are easier to deploy. If the parties have any incentive to deviate from the protocol, the malicious adversary model applies and a malicious-secure protocol is required. The performance cost of the malicious protocol is the cost of operating in the actual threat environment. If malicious-secure performance is unacceptable, the deployment should be redesigned to achieve mutual trust before using a semi-honest protocol.
05
Federated learning convergence fails on non-IID data without diagnosis or remediation
Standard federated averaging (FedAvg) assumes that the local datasets at each participant are independently and identically distributed samples from the global data distribution. In almost every real deployment, they are not. Hospital A specialises in cardiology and its data skews heavily towards cardiac diagnoses. Hospital B is a general hospital. Hospital C serves a specific demographic with disease prevalence patterns different from the general population. Non-IID data causes FedAvg to converge slowly, to a worse global optimum, and sometimes not to converge meaningfully at all. The federated model can be significantly worse than a model trained at any single participant, providing neither the privacy benefit of federation nor the capability benefit. Non-IID data is not an edge case — it is the norm in every real-world federated deployment.
What non-IID failure looks like operationally
A diagnostic AI programme deploys federated learning across 8 hospital sites. After 200 rounds of federated training, the global model achieves 71% accuracy on the test set. The local models trained at individual sites achieve 84–91% accuracy on their local test data. The federated model is materially worse than any individual site model. The programme has implemented the privacy protection of federation at the cost of making the model worse for every participant than if they had not participated. The convergence failure was not diagnosed during the design phase and was not discovered until the production evaluation against local models.
Architecture approach that prevents this
Data heterogeneity analysis before deploying federated learning: measuring the divergence between local data distributions across participants and quantifying the expected convergence impact. For deployments with high heterogeneity, FedProx, SCAFFOLD, or personalised federated learning approaches that tolerate non-IID data are specified instead of plain FedAvg. Convergence benchmarking against a centralised training baseline and against locally trained models is conducted before production deployment, to verify the federated model meets the minimum acceptable performance threshold for every participant.
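One simple heterogeneity measurement is the divergence between per-site label distributions. The sketch below (site distributions are invented for illustration) computes the Jensen–Shannon divergence of each site’s label distribution against the pooled distribution; sites with high divergence signal the non-IID conditions under which plain FedAvg degrades. Label-distribution skew is only one axis of heterogeneity — feature skew and quantity skew need their own checks.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2).

    Symmetric, bounded in [0, 1]; 0 means identical distributions.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative label distributions over three diagnosis classes
pooled          = [0.50, 0.30, 0.20]
cardiology_site = [0.85, 0.10, 0.05]  # heavily skewed: high divergence
general_site    = [0.52, 0.29, 0.19]  # near-IID: low divergence

assert js_divergence(pooled, cardiology_site) > js_divergence(pooled, general_site)
```

A deployment rule can then be stated operationally: sites above a chosen divergence threshold trigger the FedProx/SCAFFOLD/personalisation path rather than plain FedAvg.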
06
The privacy guarantee is formally correct but the output enables re-identification through auxiliary information
Many privacy mechanisms are designed and evaluated against an adversary with no auxiliary information. Real adversaries have auxiliary information. An anonymised aggregate that identifies no one in isolation can, when combined with publicly available data about the individuals in the dataset, enable re-identification. A federated model that does not reveal training data can, when queried with known examples, confirm their membership. An MPC result that reveals only the intended output can, when combined with the querying party’s prior knowledge of the other parties’ data, reveal more than intended. The formal privacy guarantee is a bound on the information revealed by the protocol in isolation. It is not a bound on the information inferable from the protocol output combined with auxiliary information.
The Netflix re-identification precedent
Netflix released an anonymised movie rating dataset with direct identifiers removed and some records perturbed — a release considered reasonable by the standards of the time. Narayanan and Shmatikov demonstrated in 2008 that combining the released dataset with public IMDb ratings allowed re-identification of subscribers whose IMDb ratings were publicly visible, often from only a handful of ratings. The anonymisation was designed against an adversary with no auxiliary information. The actual adversary had auxiliary information, invalidating the privacy guarantee. Every PET deployment must model the adversary’s auxiliary information, not just the adversary’s access to the protected computation.
Architecture approach that prevents this
Auxiliary information threat modelling as a mandatory component of every PET design: identification of what auxiliary information a realistic adversary would have, and assessment of what can be inferred from the protocol output combined with that auxiliary information. For public data releases, re-identification risk assessment against the specific auxiliary datasets available to the expected adversary. Privacy guarantees are stated with explicit scope: “assuming an adversary with no auxiliary information beyond X, this protocol provides (ε, δ)-DP”.
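The linkage mechanism behind such re-identification is mechanically simple, which is why it is so often underestimated. The sketch below joins a “de-identified” release against an auxiliary dataset on quasi-identifiers and flags unique matches. All records and field names are invented for illustration, and the Netflix attack itself linked on rating vectors rather than demographic quasi-identifiers; the mechanism is the same.

```python
def link(released, auxiliary, quasi_ids):
    """Join a de-identified release against auxiliary data on quasi-identifiers."""
    reidentified = []
    for rec in released:
        candidates = [aux for aux in auxiliary
                      if all(aux[q] == rec[q] for q in quasi_ids)]
        if len(candidates) == 1:  # unique match: the record is re-identified
            reidentified.append((candidates[0]["name"], rec["diagnosis"]))
    return reidentified

# Direct identifiers removed -- but quasi-identifiers remain in the release
released = [
    {"zip": "02139", "birth_year": 1961, "sex": "F", "diagnosis": "arrhythmia"},
    {"zip": "02139", "birth_year": 1984, "sex": "M", "diagnosis": "asthma"},
]
# Public auxiliary data (e.g. a voter roll) the release did not account for
auxiliary = [
    {"name": "A. Example", "zip": "02139", "birth_year": 1961, "sex": "F"},
]

matches = link(released, auxiliary, quasi_ids=("zip", "birth_year", "sex"))
# -> [('A. Example', 'arrhythmia')]
```

A re-identification risk assessment runs exactly this kind of join against the specific auxiliary datasets the expected adversary plausibly holds, before release rather than after.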
07
Trusted execution environments are treated as unconditionally secure
Trusted Execution Environments (TEEs) — Intel SGX, AMD SEV, ARM TrustZone — provide hardware-based isolation for computation, preventing the host OS, hypervisor, and cloud provider from accessing the data being processed inside the enclave. They are a powerful component of confidential computing. They are not unconditionally secure. SGX has been compromised by multiple microarchitectural side-channel attacks (Foreshadow, SGAxe, CrossTalk, and others) that extract secrets from enclaves without breaking the isolation mechanism head-on. TEE security depends on maintaining up-to-date microcode, correctly implementing the attestation protocol, not relying on the TEE for more than it provides, and carefully managing the enclave’s attack surface. Deployments that treat TEE isolation as equivalent to perfect cryptographic isolation have a security model that does not hold under the demonstrated attack surface.
The SGX side-channel attack surface
A confidential computing deployment uses Intel SGX to process sensitive healthcare records on untrusted cloud infrastructure. The deployment assumes that data inside the SGX enclave is inaccessible to the cloud provider. SGAxe (published 2020) demonstrated that SGX’s attestation keys can be extracted using microarchitectural attacks, allowing an adversary with physical access to the hardware or with root access to the host to forge attestation and extract enclave secrets. The security claim was correct for an adversary without microarchitectural access. It was incorrect for a cloud provider with physical access to the hardware.
Architecture approach that prevents this
TEE security is treated as one layer in a defence-in-depth architecture, not as a complete solution. The threat model explicitly addresses side-channel attacks and specifies which attack classes the TEE is assumed to resist and which it is not. Microcode and firmware maintenance is treated as a security-critical operational requirement with the same urgency as OS patching. For deployments where TEE side-channel attacks are within the threat model, the TEE is combined with cryptographic PETs rather than substituted for them.
08
Privacy-enhancing technologies are deployed without verifiable auditability or continuous assurance
A system may be architected with correct privacy-enhancing technologies at deployment time, yet lack any mechanism to prove that those guarantees remain intact over time. Cryptographic protocols, differential privacy mechanisms, federated learning pipelines, and secure enclaves all rely on correct implementation, correct configuration, and correct operation. Drift occurs: configurations change, dependencies update, logging expands, new integrations are added, and operational shortcuts are introduced under performance or business pressure. Without continuous verification, the system can silently degrade from a formally private system into one that leaks sensitive information. Privacy guarantees are not static properties; they are operational properties that must be continuously enforced and verified.
What silent degradation looks like in production
A large-scale analytics platform deploys differential privacy with a properly configured privacy accountant and strict query limits. Six months later, a new engineering team introduces a parallel analytics endpoint for internal use, bypassing the privacy accountant to improve query latency. Over time, this endpoint becomes widely used across the organisation. No audit mechanism detects that queries are being executed outside the DP framework. The system continues to report compliance with differential privacy guarantees, while in reality a significant portion of queries are executed without any privacy protection. The failure is not in the original design, but in the absence of continuous assurance and enforceable auditability.
Architecture approach that prevents this
Cryptographic and system-level auditability as a first-class requirement: every privacy-sensitive operation is logged in a tamper-evident audit trail (e.g. append-only logs with cryptographic integrity guarantees). Continuous verification systems validate that all data flows, queries, and computations pass through the enforced privacy controls. Remote attestation, reproducible builds, and policy enforcement layers ensure that only approved code paths can process sensitive data. Privacy guarantees are continuously monitored as operational metrics, with automated alerts and hard enforcement when deviations occur. Independent audit capability — internal or external — is built into the system design, ensuring that privacy claims can be verified at any point in time, not just asserted at deployment.
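The tamper-evident audit trail described above can be sketched with a hash chain: each entry’s digest commits to both its payload and the previous entry’s digest, so any retroactive edit breaks verification from that point onward. This is a minimal illustration; a production trail would also need signed checkpoints and external anchoring to resist wholesale truncation or re-writing of the entire chain.

```python
import hashlib
import json

class AuditLog:
    """Append-only, tamper-evident log: each entry commits to its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, event):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})

    def verify(self):
        """Recompute the chain; any edited or reordered entry fails the check."""
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"op": "dp_query", "epsilon": 0.5, "endpoint": "stats"})
log.append({"op": "dp_query", "epsilon": 0.5, "endpoint": "stats"})
assert log.verify()
log.entries[0]["event"]["epsilon"] = 0.0   # tampering with history...
assert not log.verify()                    # ...is detected by verification
```

Routing every privacy-sensitive operation through such a log — including the bypass endpoint in the scenario above, which would then be visible by its absence — is what turns “we were private at deployment” into a continuously verifiable claim.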