Privacy and AI

How to mask sensitive information

Question: When sending data to the AI for analysis, can I mask my clients' sensitive information and create user IDs so that only that de-identified data is shared for analysis?

Answer:

There are two approaches to ensure data privacy when using AI for analysis.

  1. Local Analysis:
    Conduit offers two modes for working with AI. One of them, called Analyst, does not send data to the AI. Instead, it asks the AI to generate a program that performs the analysis locally on the server where the data resides, so your data is never sent to the cloud. More details are in the "Data Sharing with the AI / Local Analysis" section below.

  2. Data Cleanup Before Sending:
    Another approach is to sanitize the data before sending it to the cloud AI. You can create a workflow that removes sensitive information from the data: dropping entire columns, such as account names or IDs, or obfuscating parts of columns, like the first digits of account numbers and names. Connect these cleaning workflows to the AI instead of the raw data. A minimal sketch of such a workflow follows this list.
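
As an illustration of the second approach, here is a minimal cleanup step in Python with pandas. The column names (account_name, client_id, account_number, email) and the masking rules are hypothetical; adapt them to your own schema.

    import pandas as pd

    def sanitize(df: pd.DataFrame) -> pd.DataFrame:
        """Remove or obfuscate sensitive fields before data leaves your server."""
        clean = df.copy()

        # Drop columns that identify clients directly (hypothetical names).
        clean = clean.drop(columns=["account_name", "client_id"], errors="ignore")

        # Obfuscate part of a column: keep only the last 4 digits of the account number.
        if "account_number" in clean.columns:
            clean["account_number"] = "****" + clean["account_number"].astype(str).str[-4:]

        # Replace real identifiers with stable pseudonymous user IDs.
        if "email" in clean.columns:
            clean["user_id"] = pd.factorize(clean["email"])[0]
            clean = clean.drop(columns=["email"])

        return clean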

Data Sharing with the AI / Local Analysis

Conduit does not send data directly to the OpenAI API. Instead, it sends OpenAI only metadata (such as table column descriptions) and asks the AI to write a Python program that, given access to the data, can answer the question. Conduit then runs that program in a sandbox on a local server, where the program has access to the data.
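
To make this concrete, here is a simplified, hypothetical sketch of what such a metadata-only request might look like. The actual prompt format Conduit uses is internal, and the table and column names below are invented for illustration.

    # Hypothetical sketch: only schema-level metadata is sent, never the rows.
    metadata = {
        "table": "transactions",
        "columns": [
            {"name": "user_id", "type": "string", "description": "Pseudonymous client ID"},
            {"name": "amount", "type": "float", "description": "Transaction amount in USD"},
            {"name": "ts", "type": "datetime", "description": "Transaction timestamp"},
        ],
    }

    prompt = (
        "Write a Python program that answers the question below. "
        "The program will run locally with access to a DataFrame named df "
        f"with this schema: {metadata}\n\n"
        "Question: What was the total transaction amount per client last month?"
    )
    # The returned program is then executed in a sandbox next to the data.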

There are several reasons for this approach:

  • Context window limits: The model's context window is limited to 100k tokens, which restricts the data volume to roughly 1,000 rows, far too little for real business tasks.
  • Elimination of hallucinations: Having the AI write Python programs for arithmetic avoids the model's own calculation errors; the example below shows what such a generated program looks like.
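
To illustrate the second point, here is the kind of program the AI might return for a question like "total amount per client". The DataFrame and column names are hypothetical; in Conduit's sandbox, df would be supplied with the real data.

    import pandas as pd

    # In the sandbox, df would hold the real data; a tiny stand-in
    # makes this sketch runnable on its own.
    df = pd.DataFrame({
        "user_id": ["u1", "u1", "u2"],
        "amount": [100.0, 250.0, 75.5],
    })

    # The arithmetic is done exactly, by the interpreter,
    # instead of asking the language model to compute numbers.
    result = df.groupby("user_id")["amount"].sum()
    print(result)  # u1: 350.0, u2: 75.5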

However, for certain operations, such as automatically generating column descriptions, we may pass data samples. This happens only on an explicit command from the user and can be disabled, as it is not a core function of the product.

Other privacy topics

Data Processors

Under the GDPR, AWS acts as our data processor. We utilize AWS services like EC2, S3, and RDS. OpenAI also serves as a data processor, handling metadata and data samples.

We do not use any other data processors beyond AWS and OpenAI.

To improve our product, we use Mixpanel, SessionStack, and Google Analytics. These platforms store the email addresses of our logged-in users, but none of your data is sent to them.

Controlled Access to Internal Data

Access to internal data is strictly controlled within Conduit. Here’s how it ensures data security:

  • Authentication and Authorization: Conduit uses robust authentication and authorization mechanisms to safeguard your internal data. It integrates with your data systems through SQL and APIs, ensuring that different users have different access levels.
  • User-Specific Data Retrieval: When a user asks a question, the LLM generates a program to answer it. Conduit then provides the user's identity to your internal data API, which returns only the data that the user is authorized to see (see the sketch after this list).
  • Executing the Program: Conduit runs the generated program on the subset of data that the user is permitted to access, ensuring that the response is based on authorized data only.
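
A minimal sketch of the user-specific retrieval step, assuming a hypothetical internal data API that filters rows by the caller's identity (the endpoint, parameter, and function names are illustrative, not Conduit's actual interface):

    import pandas as pd
    import requests

    def fetch_authorized_data(user_token: str) -> pd.DataFrame:
        """Fetch only the rows this user is allowed to see.

        Authorization is enforced by the internal API itself; Conduit just
        forwards the user's identity. Endpoint and names are hypothetical.
        """
        resp = requests.get(
            "https://internal.example.com/api/transactions",
            headers={"Authorization": f"Bearer {user_token}"},
            timeout=30,
        )
        resp.raise_for_status()
        return pd.DataFrame(resp.json())

    # The generated program then runs only on this authorized subset:
    # df = fetch_authorized_data(current_user_token)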

Mitigating Risks of Sharing Sensitive Information

The system's design minimizes the risk of exposing clients' or patients' sensitive information. While the details depend on the specific use case, the combination of the measures above significantly reduces the potential for unauthorized access or exposure: by sharing only metadata with the AI and tightly controlling access to the actual data, Conduit provides a secure and compliant environment for data interaction.