
BlindChat

Website Blog

Open-source and privacy-by-design alternative to ChatGPT

Table of Contents
  1. About the project
  2. Roadmap
  3. Design
  4. Comparisons
  5. Contact

📜 About the project

What is BlindChat?

๐Ÿฑ BlindChat is an open-source project to develop the first fully in-browser and private Conversational AI.

Most conversational AI solutions today require users to send their data to AI providers who serve AI models as a Service. This poses privacy issues for users who lose control over their data.

⚠️ Because data is a key asset for improving LLMs, many solutions more or less implicitly fine-tune their models on users’ data.

This creates privacy risks for users as LLMs might learn their data by heart. Carlini et al. [1] showed that LLMs such as GPT-J could learn at least 1% of their training set by heart.

🔐 BlindChat solves this issue: users have guarantees that their data remains private at all times and that they keep full control over it, either through local inference or through secure isolated environments called secure enclaves.

Local conversations

Demo

👩‍💻 You can try out BlindChat here! We enable users to interact with a Flan-T5 model locally through their browser: the model is pulled and used for local inference using transformers.js.
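As a rough illustration of the local inference described above, the sketch below runs a prompt through a generator that lives entirely in the browser. The `Text2TextGenerator` type assumes the output shape of a transformers.js `text2text-generation` pipeline (an array of `{ generated_text }` objects). `answerLocally` and `echoGenerator` are hypothetical names used for illustration only and are not taken from the BlindChat code base.

```typescript
// Assumed output shape: an array of { generated_text } objects, as returned by a
// transformers.js "text2text-generation" pipeline.
type Text2TextGenerator = (
  prompt: string,
  options?: { max_new_tokens?: number },
) => Promise<Array<{ generated_text: string }>>;

// Hypothetical helper: sends one prompt to a generator running on the device,
// so the prompt is never transmitted to a server.
async function answerLocally(
  generator: Text2TextGenerator,
  question: string,
  maxNewTokens = 64,
): Promise<string> {
  const outputs = await generator(question, { max_new_tokens: maxNewTokens });
  return outputs.length > 0 ? outputs[0].generated_text : "";
}

// Stand-in generator so the sketch runs without downloading a model.
const echoGenerator: Text2TextGenerator = async (prompt) => [
  { generated_text: `echo: ${prompt}` },
];

answerLocally(echoGenerator, "What is a secure enclave?").then((answer) => {
  console.log(answer);
});
```

In a browser, the stand-in would be replaced by a real generator, for example one created with the transformers.js `pipeline("text2text-generation", modelId)` call, where `modelId` points to a Flan-T5 checkpoint converted for transformers.js. Depending on the library's typings, a cast may be needed.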

Who is BlindChat for?

BlindChat aims to serve two types of users:

  • End users: We want to provide privacy-by-design alternatives to change the current status quo. Most users today are forced to give up their data to leverage AI services, and opaque or nonexistent privacy controls are the norm.

  • Developers: We want to help developers easily serve privacy-by-design Conversational AI, which is why we are focused on making BlindChat easy to customize and deploy.

(back to top)

Roadmap

You can check out our progress in more detail on our official roadmap. In our help wanted section, we highlight features on which we would love help from contributors.

Roadmap quick summary:

  • Revamping of Hugging Face Chat UI to make it entirely client-side (removal of telemetry, data sharing, server-side history of conversations, server-side inference, etc.)
  • Integration of privacy-by-design inference with local model
  • Local caching of conversations
  • Integration of more advanced local models (e.g. phi-1.5) and more advanced inference (e.g. Web LLM)
  • Integration of privacy-by-design inference with remote enclaves using BlindLlama for powerful models such as Llama 2 70b & Falcon 180b ⌛
  • Integration with LlamaIndex TS for local Retrieval Augmented Generation (RAG) ⌛
  • Internet search ⌛
  • Connectors to pull data from different sources ⌛

(back to top)

🔧 Setup

Before going any further, please make sure you have Node.js 18.0 installed on your system.

To run the chat user interface in dev/debug mode for testing purposes, execute the following commands in the root folder of your BlindChat code repo.

```bash
npm install
npm run dev
```

This will install the dependencies of the project and launch the dev environment.

The chat can be deployed in production mode with the following commands:

```bash
npm run build
node build
```

The chat-ui uses server-side rendering, so building the pages before deploying them is mandatory.

⚠️ Note that the command node build will run the server in HTTP mode. If you wish to add TLS, please use a proxy server, such as NGINX.

(back to top)

🧑‍🎨 Design

Principles

🤗 BlindChat is a fork of the Hugging Face Chat UI project.

We modified the code so that various tasks usually handled by the server are done by the browser. Our solution places the AI provider outside of our trust model, so to ensure privacy we avoid sending user data to the server/AI provider.

Philosophy

To make AI transparent and confidential, (almost) all of the logic is moved from the server side to the client-side browser.

This ensures end users’ privacy and gives them control over what happens to their data. For instance, inference can be done locally using transformers.js, and conversations can be stored in the user's browser. This means the operators of the AI service are blind to the user's data, hence the name BlindChat!
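To make the idea of browser-side conversation storage concrete, here is a minimal sketch that keeps the history in a key-value store shaped like the Web Storage API. It is an assumption for illustration, not BlindChat's actual storage layer: `KeyValueStore`, `MemoryStore`, `HISTORY_KEY`, `loadConversation` and `appendMessage` are hypothetical names. In a browser, `window.localStorage` offers the same `getItem`/`setItem` methods.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface Message {
  role: "user" | "assistant";
  content: string;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const HISTORY_KEY = "blindchat:conversation"; // hypothetical key name

// Reads the stored history, or an empty list if nothing was saved yet.
function loadConversation(store: KeyValueStore): Message[] {
  const raw = store.getItem(HISTORY_KEY);
  return raw ? (JSON.parse(raw) as Message[]) : [];
}

// Appends one message and writes the whole history back to the store.
function appendMessage(store: KeyValueStore, message: Message): Message[] {
  const history = [...loadConversation(store), message];
  store.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}

const store = new MemoryStore(); // in a browser: window.localStorage
appendMessage(store, { role: "user", content: "Hello" });
console.log(loadConversation(store));
```

Because the history is only ever written to a store on the user's device, no server-side component needs to see it.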

Data is only sent server-side when our remote enclave mode is selected. In this mode, the server is deployed within a hardened and verifiable environment called an enclave, which provides end-to-end protection and prevents external access. Not even the AI provider admins operating the enclave can read users’ data.

Note that while our hardened environments don’t fit all definitions of an “enclave”, we use the term here for convenience to describe an environment that allows a server to process data without exposing its contents to service providers.

Private inference

We offer two modes to ensure users’ data remains private:

On-device inference

With the on-device mode, the model is downloaded to the user’s browser, and inference is performed on-device.

This mode is generally suitable for smaller models as large models may require too much bandwidth and computational resources.

Confidential and transparent AI APIs with enclaves

With the Zero-trust AI APIs mode, data is sent to a secure environment called an enclave containing the model for remote inference.

These environments provide end-to-end protection through robust isolation and verification. User data is never accessible in cleartext to the AI provider admins.

You can find out more about Confidential and transparent AI APIs with enclaves in the guide we provide with our BlindLlama project, which is the underlying technology for this mode of BlindChat.

(back to top)

Architecture

The project currently has three major components:

  • UI: This is the Chat interface that the end user interacts with. It contains the Chat box, and will contain plugins and other widgets for more complex interaction, such as loading documents or enabling voice commands.
  • Private LLM: Developers can customize which LLM they choose to answer users’ queries. Current options are either local models or remote enclaves to ensure transparent and private inference.
  • Storage: Developers can customize what kind of storage is used to save information such as conversation history and, in the future, embeddings for RAG.

Coming soon:

  • Connectors: Connectors will allow users to pull documents from various sources, e.g. PDF upload, and share outputs.
  • Integration with LlamaIndex TS: This will allow users to index documents with local models, store them in local storage and use them for RAG (query the LLMs based on the information contained in their documents).
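One way to picture how the UI, Private LLM and Storage components fit together is the sketch below. It is a simplified assumption rather than the project's actual interfaces: `PrivateLLM`, `ConversationStorage`, `ChatSession`, `ArrayStorage` and `stubLLM` are hypothetical names. The point is that the UI talks to one LLM interface and one storage interface, so either can be swapped without touching the rest.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// "Private LLM" component: local inference or a remote enclave behind one interface.
interface PrivateLLM {
  readonly mode: "on-device" | "enclave";
  complete(history: ChatMessage[]): Promise<string>;
}

// "Storage" component: where the conversation history is kept.
interface ConversationStorage {
  load(): ChatMessage[];
  save(history: ChatMessage[]): void;
}

// What the "UI" component would call when the user submits a message.
class ChatSession {
  private readonly llm: PrivateLLM;
  private readonly storage: ConversationStorage;

  constructor(llm: PrivateLLM, storage: ConversationStorage) {
    this.llm = llm;
    this.storage = storage;
  }

  async send(userText: string): Promise<string> {
    const history: ChatMessage[] = [
      ...this.storage.load(),
      { role: "user", content: userText },
    ];
    const reply = await this.llm.complete(history);
    this.storage.save([...history, { role: "assistant", content: reply }]);
    return reply;
  }
}

// Stand-in implementations so the sketch runs without a model or a browser.
class ArrayStorage implements ConversationStorage {
  private history: ChatMessage[] = [];

  load(): ChatMessage[] {
    return [...this.history];
  }

  save(history: ChatMessage[]): void {
    this.history = [...history];
  }
}

const stubLLM: PrivateLLM = {
  mode: "on-device",
  complete: async (history) => `You said: ${history[history.length - 1].content}`,
};

new ChatSession(stubLLM, new ArrayStorage()).send("Hello").then(console.log);
```

Switching from local inference to a remote enclave would then mean providing another `PrivateLLM` implementation, with `ChatSession` left unchanged.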

📊 Comparisons

| Mode | Client-side bandwidth requirements | Client-side computing requirements | Model capabilities | Privacy |
|---|---|---|---|---|
| On-device prediction | High | High | Low | High |
| Regular AI APIs | Low | Low | High | Low |
| Zero-trust AI APIs | Low | Low | High | High |

On-device prediction and Confidential AI APIs (the Zero-trust AI APIs row in the comparison) both provide privacy, unlike most existing Conversational AI solutions, which expose data to privacy risks.

On-device prediction provides the highest level of privacy, as data never leaves the device. However, it requires downloading models that range from several hundred MB to several GB and that demand substantial memory and computing resources. For many users, these device requirements rule out larger, higher-performing models.

Confidential AI APIs are deployed remotely, meaning the size of models is not restricted by the specifications of user devices. Users are able to query large models while still having robust privacy guarantees.
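The trade-off described in this section can be sketched as a small decision helper. Everything here is a hypothetical illustration: the `chooseMode` function, the 120-second download budget and the use of download size as a crude proxy for memory needs are assumptions, not BlindChat behavior.

```typescript
type InferenceMode = "on-device" | "zero-trust-api";

interface DeviceProfile {
  freeMemoryMB: number;  // memory the browser can realistically use
  downlinkMbps: number;  // download bandwidth in megabits per second
}

interface ModelProfile {
  downloadSizeMB: number; // size of the model files in megabytes
}

// Time to fetch the model: megabytes * 8 gives megabits, divided by Mbps.
function estimateDownloadSeconds(model: ModelProfile, device: DeviceProfile): number {
  return (model.downloadSizeMB * 8) / device.downlinkMbps;
}

// Prefers on-device inference (data never leaves the device) and falls back
// to the enclave-backed API when the model is too heavy for the device.
function chooseMode(
  model: ModelProfile,
  device: DeviceProfile,
  maxWaitSeconds = 120, // hypothetical tolerance for the initial download
): InferenceMode {
  // Assumption: the model needs at least its download size in memory.
  const fitsInMemory = model.downloadSizeMB <= device.freeMemoryMB;
  const downloadsFastEnough = estimateDownloadSeconds(model, device) <= maxWaitSeconds;
  return fitsInMemory && downloadsFastEnough ? "on-device" : "zero-trust-api";
}

const exampleDevice: DeviceProfile = { freeMemoryMB: 4096, downlinkMbps: 50 };
console.log(chooseMode({ downloadSizeMB: 300 }, exampleDevice));
console.log(chooseMode({ downloadSizeMB: 140000 }, exampleDevice));
```

With these example numbers, a model of a few hundred MB stays on the device, while a model in the hundred-GB range is routed to the remote enclave. Both options keep the privacy guarantees discussed above.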

(back to top)

📇 Get in touch

We would love to hear your feedback or suggestions. Here are the ways you can reach us:

Want to hear more about our work on privacy in the field of AI?

  • Check out our blog
  • Subscribe to our newsletter here

Thank you for your support!

(back to top)

References

[1] Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramèr, F., & Zhang, C. (2022). Quantifying Memorization Across Neural Language Models. arXiv:2202.07646.