# Snowflake Architecture

Espresso's architecture is designed to maximize availability and to protect the security of your data.

Our system is trained on metadata, and we use only metadata in production. All metadata is encrypted in transit and at rest.

Your data passes through our Snowflake proxy for proxy-enabled features (the Scheduling Agent and the Query Agent). Data is encrypted in transit and is never accessed, logged, or stored by our system.

Enterprise customers can self-host the proxy to prevent data from being transmitted outside of their VPC.

## Network Connectivity

We support TLS encryption, PrivateLink on AWS and Azure, and Private Service Connect on GCP. (This applies to any connection in <mark style="color:blue;">blue</mark> on our architecture diagram.)

If you use a Snowflake allowlist, please allow the following IPs:

```
18.233.13.51
34.195.242.31
34.231.116.52
34.231.212.71
34.234.123.175
35.169.148.94
52.87.110.223
54.161.160.239
```
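As a rough sketch of how this list might be applied, the snippet below builds a Snowflake `CREATE NETWORK POLICY` statement from the IPs above. The policy name `espresso_allowlist` and the helper `build_network_policy_sql` are hypothetical names chosen for illustration. If you already have a network policy in place, you would more likely add these addresses to its existing allowed list than create a new policy.

```python
import ipaddress

# IPs copied from the allowlist above.
ESPRESSO_IPS = [
    "18.233.13.51",
    "34.195.242.31",
    "34.231.116.52",
    "34.231.212.71",
    "34.234.123.175",
    "35.169.148.94",
    "52.87.110.223",
    "54.161.160.239",
]


def build_network_policy_sql(policy_name: str, ips: list[str]) -> str:
    """Return a CREATE NETWORK POLICY statement allowing the given IPv4 addresses.

    `policy_name` is a hypothetical identifier; pick one that fits your account.
    """
    # Validate each entry so a typo fails here rather than in Snowflake.
    validated = [str(ipaddress.IPv4Address(ip)) for ip in ips]
    ip_list = ", ".join(f"'{ip}'" for ip in validated)
    return f"CREATE NETWORK POLICY {policy_name} ALLOWED_IP_LIST = ({ip_list});"


if __name__ == "__main__":
    print(build_network_policy_sql("espresso_allowlist", ESPRESSO_IPS))
```

Review the generated statement before running it: activating a policy that lists only these addresses would block every other client.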

## Warehouse Agent

Espresso's warehouse agent connects directly to your Snowflake account using a Snowflake service user.

<figure><img src="/files/4vZCjlyLXfPRfsydJNFG" alt=""><figcaption></figcaption></figure>

## Snowflake Proxy: Standard Deployment

Our Scheduling Agent and Query Agent run over a proxy. In our standard deployment, users connect directly to the proxy, which forwards requests to Snowflake and returns results to the user.

Customer data passes in transit through the proxy but is never inspected or stored.

<figure><img src="/files/6WGTm6Km9Dho5ZedOhsi" alt=""><figcaption></figcaption></figure>

## Snowflake Proxy: Self-Hosted Proxy

Customers who do not want their data to leave their environment, even in transit, can self-host the proxy.

<figure><img src="/files/dAYrbhcLG54quUUHGGH4" alt=""><figcaption></figcaption></figure>

## Self-Hosted Proxy: Query-Text-Less Operation

For deployments that require stronger privacy guarantees, the self-hosted proxy supports a query-text-less mode, enabled with `EXCLUDE_QUERY_TEXT=true`. In this mode, the proxy sends only routing and control-plane metadata to our backend. Non-`USE` queries are replaced with a query hash, which reduces the quality of optimization and routing signals. `USE` statements are still sent as raw SQL when needed for tracking accuracy.
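The replacement rule can be illustrated with a minimal sketch. This is not the proxy's actual implementation: the SHA-256 hash, the `HASH_VERSION` value, the output field names, and the regex-based `USE` detection are all assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical version tag; the real hash scheme and versioning are not documented here.
HASH_VERSION = 1

# Matches statements whose first keyword is USE (e.g. USE WAREHOUSE, USE DATABASE).
# Simplification: a statement preceded by a SQL comment would not match.
_USE_RE = re.compile(r"^\s*USE\b", re.IGNORECASE)


def is_use_statement(sql: str) -> bool:
    """Return True if the statement starts with the USE keyword."""
    return bool(_USE_RE.match(sql))


def redact_query(sql: str) -> dict:
    """Return what would leave the proxy for a query in query-text-less mode."""
    if is_use_statement(sql):
        # USE statements are forwarded as raw SQL.
        return {"kind": "routing_sql", "sql": sql.strip()}
    # Every other statement is reduced to a hash; the text itself is not included.
    digest = hashlib.sha256(sql.encode("utf-8")).hexdigest()
    return {"kind": "query_hash", "query_hash": digest, "hash_version": HASH_VERSION}


if __name__ == "__main__":
    print(redact_query("USE WAREHOUSE analytics_wh"))
    print(redact_query("SELECT * FROM orders WHERE customer_id = 42"))
```

In this sketch, the second call returns only a digest and a version number, so the backend can tell that two queries are identical without seeing their text.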

In this mode, we currently send:

* Login metadata for account/session initialization:
  * Warehouse
  * Username
  * Database and schema names
  * Hashed session tokens for session tracking
* Query context metadata for query routing:
  * Query hash and hash version
  * Original warehouse (when available)
  * Database and schema from request context
* Routing SQL statements (e.g., `USE ...`) needed for routing and execution continuity

We may extend this list to collect additional metadata, but we guarantee that general query text is not collected in this mode.
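To make the shape of this metadata concrete, the sketch below models the two groups of fields listed above as Python dataclasses. The class names, field names, and the use of SHA-256 for session tokens are assumptions made for illustration; they do not describe the proxy's actual wire format.

```python
import hashlib
from dataclasses import asdict, dataclass
from typing import Optional


def hash_session_token(token: str) -> str:
    """Hash a session token so the raw value is not part of the payload.

    SHA-256 is an assumption; the actual hashing scheme is not specified.
    """
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class LoginMetadata:
    """Hypothetical model of the login metadata fields."""
    warehouse: str
    username: str
    database: str
    schema: str
    session_token_hash: str


@dataclass(frozen=True)
class QueryContext:
    """Hypothetical model of the query context fields."""
    query_hash: str
    hash_version: int
    original_warehouse: Optional[str]  # only when available
    database: str
    schema: str


if __name__ == "__main__":
    login = LoginMetadata(
        warehouse="ANALYTICS_WH",
        username="alice",
        database="SALES",
        schema="PUBLIC",
        session_token_hash=hash_session_token("example-session-token"),
    )
    print(asdict(login))
```

Neither structure has a field for query text, which mirrors the guarantee stated above.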

## Snowflake Proxy: Self-Hosted Deployment

Enterprise customers can self-host Espresso's entire architecture.

<figure><img src="/files/VWbdzh1vBpNsv6x4Svds" alt=""><figcaption></figcaption></figure>


