Introducing BlindLlama, Zero-Trust AI APIs With Privacy Guarantees & Traceability

Introducing BlindLlama: an open-source, zero-trust AI API. Learn how BlindLlama ensures confidentiality and transparency in AI deployment.

Daniel Huynh

We are delighted to announce the launch of BlindLlama, an open-source project that aims to make AI confidential and transparent!

Key takeaways:

    • We are launching BlindLlama, a zero-trust AI API for open-source LLMs.
    • BlindLlama addresses concerns around data privacy by deploying models in hardened environments that remain inaccessible to the AI provider.
    • BlindLlama addresses concerns around code integrity by providing cryptographic proof of the server's backend code.
    • The solution consists of a Python SDK client and a hardened, verifiable server.

Motivation

While the usage of AI APIs such as ChatGPT has skyrocketed in the past year, serious concerns have been raised about their lack of transparency and security.

A key concern is the inadvertent leakage of confidential data. Data sent to AI APIs by users may be used by the AI provider to further train their models. The API may then reproduce segments of this input data in its output to other users, thus leaking the confidential data. Accidental leaks of proprietary data have already led several industry-leading companies to ban staff from using AI APIs such as ChatGPT. Concerns over the safety of LLMs have equally been highlighted in recent research efforts.

A related concern is code integrity: how can users be sure what code is used in the backend? Even where APIs and models are open-source, users have no way to verify that they are communicating with a server that hosts the expected open-source code and model. They cannot be sure, for example, that the AI API they are communicating with does not contain a few extra lines of code to save their data to disk or use it for other purposes, such as fine-tuning.

By creating secure zero-trust AI APIs and using Trusted Platform Modules (TPMs) to measure and verify the deployed code, we can address these issues and offer end users strong privacy and security guarantees.

Quick tour

You can get started with BlindLlama via our quick tour, which walks you through your first query to the Llama2-70b model using our Python SDK!

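To give a flavor of what this looks like, here is a minimal sketch of a query. Note that the package and function names (blind_llama, completion) are illustrative assumptions; refer to the quick tour for the exact, up-to-date interface.

```python
# Minimal sketch of a BlindLlama query. The package and function names
# (blind_llama, completion) are illustrative assumptions; see the quick
# tour for the exact interface.
import blind_llama

# Before any data is sent, the SDK verifies the server's attestation
# report: hardened environment + expected open-source code (see below).
response = blind_llama.completion(
    prompt="Write a poem about large language models."
)
print(response)
```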

How we achieve zero-trust

There are three key elements behind our zero-trust APIs:

1. Hardened environments

Before deploying our APIs, we wrap them in an isolated, hardened environment. Hardening ensures that the AI provider (Mithril Security) cannot get inside the environment where the API is deployed, and that data cannot leave it! This is achieved by removing all key access points to the environment, such as SSH and logs. Our code is open-source and auditable, enabling transparency and scrutiny of these privacy measures.

2. Attestation

Attestation is how we prove that our AI APIs really are deployed in these hardened environments. We implement attestation using TPMs, which can measure the whole software stack of a machine, including the code it serves. We can then request a signed copy of these measurements, which is passed on to users. Our cloud provider endorses the TPM's signing key, so we can be sure these measurements come from a genuine TPM.

Before a user’s query is sent to the API server, our open-source client first verifies these measurements. For example, it checks that the API code loaded on the server matches the latest open-source code we provide on our GitHub. The client only sends data for AI inference once these verifications succeed.
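To make the flow concrete, here is a simplified, illustrative sketch of what such a client-side check could look like. This is not the actual BlindLlama client code: real TPM quotes carry PCR selections and nonces, and verification walks a certificate chain up to the cloud provider's endorsement.

```python
# Simplified sketch of the client-side attestation check; the real TPM
# quote format (PCR selections, nonces, etc.) is richer than shown here.
# `ak_public_key` is the attestation key endorsed by the cloud provider.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.asymmetric.rsa import RSAPublicKey


def verify_attestation(quote: bytes, signature: bytes,
                       ak_public_key: RSAPublicKey,
                       reported_code_hash: bytes,
                       expected_code: bytes) -> bool:
    # 1. The quote must be signed by a genuine, provider-endorsed TPM.
    try:
        ak_public_key.verify(signature, quote,
                             padding.PKCS1v15(), hashes.SHA256())
    except InvalidSignature:
        return False
    # 2. The measured code must match the published open-source release.
    return reported_code_hash == hashlib.sha256(expected_code).digest()
```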

3. Attested TLS

Attested TLS is how we know we are really communicating with the attested, hardened API.

Without this final step, our API would be vulnerable to a man-in-the-middle attack, where a malicious server intercepts and replays the signed proofs of a genuine hardened BlindLlama server to trick the client into sending it users’ data. To eliminate this risk, we verify the identity of the server. We do this by having the server send a TPM-measured hash of its TLS certificate, which acts as a form of ID. The client checks that this hash matches the TLS certificate of the current connection and only establishes TLS communications with the server if this check succeeds.
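Concretely, the check boils down to comparing the hash of the live connection's certificate against the attested one. Here is a minimal sketch, assuming SHA-256 measurements; the real client integrates this with the attestation verification above.

```python
# Illustrative sketch of the attested-TLS check (not the actual client
# code): hash the certificate presented on this connection and compare it
# to the TPM-measured certificate hash taken from the attestation report.
import hashlib
import socket
import ssl


def connect_attested(host: str, attested_cert_hash: bytes,
                     port: int = 443) -> ssl.SSLSocket:
    context = ssl.create_default_context()
    sock = context.wrap_socket(socket.create_connection((host, port)),
                               server_hostname=host)
    # DER-encoded certificate the server presented on *this* session.
    presented = sock.getpeercert(binary_form=True)
    if hashlib.sha256(presented).digest() != attested_cert_hash:
        sock.close()
        raise ssl.SSLError("certificate does not match the attested hash")
    return sock  # safe: we are talking to the attested hardened server
```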

Architecture

BlindLlama is made up of a client-side Python SDK that verifies the remotely hosted zero-trust API, and our “server,” which combines three server-side elements:

  • A reverse proxy that forwards communications between the client and the attesting launcher or hardened AI container using attested TLS
  • An attesting launcher that loads the hardened AI container and creates the cryptographic proof file used for attestation
  • A hardened AI container responsible for serving the AI model with strict isolation and security measures

Conclusion

BlindLlama makes it possible for users with confidential data to use AI APIs. It provides our API users with end-to-end data protection and proves that this protection is actually implemented.

We are still at the beginning of our mission, in what we call our “alpha” phase. This means we are still implementing some of the key security features required for a fully production-ready and auditable product. You can consult our roadmap to monitor our progress! You can also read our whitepaper for the technical details of our solution.

Your feedback helps us improve our work! You can reach out to us on Discord or file bug reports on GitHub (and star our repo if you like our work!).