# Self-hosted Helm chart deployment

Use this Helm chart to deploy Espresso AI's Proxy Service into an existing Kubernetes cluster. The chart supports three ingress providers, selected with `ingress.provider`:

* **`aws`** — EKS (or self-managed Kubernetes on AWS) fronted by the AWS Load Balancer Controller (ALB).
* **`azure`** — AKS (or self-managed Kubernetes on Azure) fronted by the Application Gateway Ingress Controller (AGIC).
* **`generic`** — any Kubernetes cluster with an ingress controller you manage yourself (NGINX, Traefik, HAProxy, Contour, etc.), or no ingress at all.

The non-ingress parts of the chart (Deployment, Service, HPA, ServiceAccount) are identical across all three providers.

## Prerequisites

Common to all deployments:

* Kubernetes cluster access (`kubectl` context points to the target cluster).
* An existing Kubernetes Secret containing `ESPRESSO_AI_API_KEY`.
* A container image for the proxy that is reachable from the cluster's nodes.
* In the [Espresso AI dashboard](https://dashboard.espressocomputing.com/), go to `Proxy Onboarding` and:
  * Provide the information needed for image access (see per-provider notes below).
  * Copy your customer name.
  * If running on AWS, copy Espresso AI's AWS Account ID. You need it to build the ECR image URL.
  * If running on Azure, copy Espresso AI's Azure Account ID. You need it to build the ACR image URL.
  * Generate an API key for Espresso API authentication.

Per-provider additions:

* **AWS** — In the dashboard, enter your AWS account ID so Espresso AI can grant ECR access for the Proxy image. If you plan to expose the proxy via ingress, install the AWS Load Balancer Controller and have an ACM certificate ARN ready.
* **Azure** — In the dashboard, enter your Azure Subscription ID so Espresso AI can grant ACR access for the Proxy image. Espresso AI then generates a username and password that you use to pull the image from its ACR. The chart's `azure` ingress provider targets the Application Gateway Ingress Controller (AGIC) specifically: install AGIC on your AKS cluster if you plan to use it, and have your Application Gateway SSL certificate name ready if you terminate TLS at the gateway. If you front AKS with Azure Front Door (AFD) over a different in-cluster ingress controller (e.g., NGINX), use the `generic` provider instead and point your AFD origin at that ingress. The chart's `ingress.healthcheck.enabled` option emits a hostless `/healthcheck` rule that AFD probes can hit, regardless of provider.
* **Generic** — Contact Espresso AI for image distribution details. Have the ingress controller of your choice installed and the corresponding `ingressClassName` available.

## Required values

These are required regardless of provider:

* `image.repository`
* `image.tag`
* `customer` (non-empty)
* `env.PROXY_HOST` (non-empty) — your base domain (e.g. `example.com`), not a full hostname or URL
* `apiKeySecret.name` (must reference an existing Kubernetes Secret with key `ESPRESSO_AI_API_KEY`)

If `ingress.enabled: true`, also set:

* `ingress.provider` (`generic`, `aws`, or `azure`)
* `ingress.host` (the hostname clients will use)
* For `aws`: `ingress.aws.certificateArn` (recommended)
* For `azure`: `ingress.azure.appgwSslCertificate` (when terminating TLS at the gateway) or `ingress.tls` (when terminating TLS via a Kubernetes Secret)
* For `generic`: `ingress.className` and, if using TLS, `ingress.tls`

The API key secret key is fixed to `ESPRESSO_AI_API_KEY` and is not configurable.

## How to deploy

Add/update the chart repository:

```bash
helm repo add espresso-ai-proxy-chart https://espressocomputing.github.io/espresso-ai-proxy-chart
helm repo update
```

Install/upgrade with your `values.yaml`:

```bash
helm upgrade --install proxy espresso-ai-proxy-chart/proxy \
  --namespace proxy \
  --create-namespace \
  --version 0.4.0 \
  -f values.yaml
```

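Create the API key secret if it does not exist yet (example). The Secret name must match `apiKeySecret.name` in your values; the examples on this page use `espresso-api`:

```bash
kubectl -n proxy create secret generic espresso-api \
  --from-literal=ESPRESSO_AI_API_KEY='<api-key>'
```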

## Example values per provider

### AWS (EKS + ALB)

```yaml
customer: "value from Dashboard"

image:
  repository: <Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy
  tag: "0.1-dev-c6cb3f5e933cc1d6871195b9d4ffcfea149d4321f1bdf96a8352c112740f32f3"

env:
  PROXY_HOST: customer.example.com

apiKeySecret:
  name: espresso-api

service:
  type: ClusterIP
  port: 5050

ingress:
  enabled: true
  provider: aws
  host: proxy.customer.example.com
  # className defaults to "alb" when provider is aws and className is unset.
  aws:
    certificateArn: arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555
    scheme: internet-facing
    targetType: ip
    listenPorts: '[{"HTTPS":443}]'
    sslRedirect: "443"
    healthcheckPath: /healthcheck
    annotations: {}

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```

### Azure (AKS + Application Gateway Ingress Controller)

```yaml
customer: "value from Dashboard"

image:
  repository: <Espresso AI's Azure Account ID>.azurecr.io/espresso/proxy
  tag: "0.1-dev-c6cb3f5e933cc1d6871195b9d4ffcfea149d4321f1bdf96a8352c112740f32f3"

env:
  PROXY_HOST: customer.example.com

apiKeySecret:
  name: espresso-api

service:
  type: ClusterIP
  port: 5050

ingress:
  enabled: true
  provider: azure
  host: proxy.customer.example.com
  # className defaults to "azure/application-gateway" when provider is azure and className is unset.
  azure:
    sslRedirect: true
    healthProbePath: /healthcheck
    # Name of an SSL certificate already uploaded to your Application Gateway.
    # Use either appgwSslCertificate (TLS at the gateway) or ingress.tls (TLS via a K8s Secret), not both.
    appgwSslCertificate: proxy-cert
    annotations: {}
  # Optional: extra hostless /healthcheck rule for probes that don't send the Host header
  # (e.g., Azure Front Door health probes).
  healthcheck:
    enabled: true
    path: /healthcheck
    pathType: Prefix

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```

If you prefer to terminate TLS in the cluster instead of at the Application Gateway, omit `ingress.azure.appgwSslCertificate` and use the `ingress.tls` shorthand (see below).
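
As a sketch of that in-cluster TLS variant, the `ingress` block might look like the following. The Secret name `proxy-tls` is a placeholder for a `kubernetes.io/tls` Secret that you create yourself; the remaining fields are taken from the example above.

```yaml
ingress:
  enabled: true
  provider: azure
  host: proxy.customer.example.com
  azure:
    sslRedirect: true
    healthProbePath: /healthcheck
    # appgwSslCertificate is intentionally omitted: TLS comes from the Secret below.
  tls:
    secretName: proxy-tls   # placeholder; hosts defaults to ingress.host
```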

**Azure Front Door (AFD).** The example above assumes AGIC is the edge. If your edge is AFD over an in-cluster ingress controller (e.g., NGINX on AKS), use `provider: generic` with your controller's `className` instead — the AGIC-specific annotations don't apply. The `ingress.healthcheck` block shown above is still useful: AFD origin health probes do not forward the application Host header, so the hostless `/healthcheck` rule lets them succeed against any provider.
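
A minimal sketch of that AFD setup, assuming an NGINX ingress controller whose class is named `nginx` (the class name and hostname are placeholders for your own values):

```yaml
ingress:
  enabled: true
  provider: generic        # no AGIC annotations are emitted
  className: nginx         # assumption: your controller's ingress class
  host: proxy.customer.example.com
  # Hostless rule so AFD origin health probes succeed without the application Host header.
  healthcheck:
    enabled: true
    path: /healthcheck
    pathType: Prefix
```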

### Generic Kubernetes (any ingress controller)

This example uses an NGINX ingress controller and TLS via a Kubernetes Secret.

```yaml
customer: "value from Dashboard"

image:
  repository: <Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy
  tag: "0.1-dev-c6cb3f5e933cc1d6871195b9d4ffcfea149d4321f1bdf96a8352c112740f32f3"

env:
  PROXY_HOST: customer.example.com

apiKeySecret:
  name: espresso-api

service:
  type: ClusterIP
  port: 5050

ingress:
  enabled: true
  provider: generic
  className: nginx
  host: proxy.customer.example.com
  path: /
  pathType: Prefix
  # Shorthand: the chart fills in the TLS hosts from ingress.host when hosts is omitted.
  tls:
    secretName: proxy-tls
  # Pass through any annotations your controller needs:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```

If you don't need an ingress at all (for example, exposing the Service via a `LoadBalancer` type or accessing it from inside the cluster), set `ingress.enabled: false` and `service.type` to whatever fits your environment.
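
For illustration, a values file without an ingress could look roughly like this. The image placeholders stand for whatever Espresso AI provides, and the Service annotation is only an example of a cloud load balancer hint (this one requests an internal load balancer on Azure); substitute whatever your platform expects or drop it.

```yaml
customer: "value from Dashboard"

image:
  repository: <image repository provided by Espresso AI>
  tag: "<image tag>"

env:
  PROXY_HOST: example.com

apiKeySecret:
  name: espresso-api

service:
  type: LoadBalancer
  port: 5050
  annotations:
    # Assumption: illustrative hint only; adjust or remove for your environment.
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"

ingress:
  enabled: false
```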

## Core configuration

### Image

| Field              | Description                               | Required | Default        |
| ------------------ | ----------------------------------------- | -------- | -------------- |
| `image.repository` | Container image repository for the proxy. | Yes      | None           |
| `image.tag`        | Container image tag.                      | Yes      | None           |
| `image.pullPolicy` | Kubernetes image pull policy.             | No       | `IfNotPresent` |

### Environment

| Field                             | Description                                                                                                                                                                                                                                       | Required | Default                                   |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | ----------------------------------------- |
| `customer`                        | Customer identifier used by the proxy.                                                                                                                                                                                                            | Yes      | None                                      |
| `env.PROXY_HOST`                  | Your base domain (e.g. `example.com`), injected as `PROXY_HOST`. Use the registrable base domain only — not a full hostname (`proxy.example.com`), scheme, or port.                                                                               | Yes      | None                                      |
| `apiUrl`                          | Base API URL used to derive runtime `API_URL` (`<apiUrl>/<customer>` unless overridden).                                                                                                                                                          | No       | `https://api.espressocomputing.com:25831` |
| `env.API_URL`                     | Optional full override for `API_URL`.                                                                                                                                                                                                             | No       | `<apiUrl>/<customer>`                     |
| `env.OTEL_EXPORTER_OTLP_ENDPOINT` | Optional telemetry OTLP endpoint override. Defaults to `http://localhost:4318` when `otelCollector.enabled: true` (so the proxy hits the in-pod sidecar) and to `https://metrics.espressocomputing.com:443` otherwise.                            | No       | See description                           |
| `env.EXCLUDE_QUERY_TEXT`          | Whether to exclude query text on requests to Espresso AI's API. *Note: enabling this will limit supported functionality.*                                                                                                                         | No       | `false`                                   |
| `extraEnv`                        | Raw list of Kubernetes env entries injected into the proxy container, rendered after the chart-managed env vars. Use it for values the `env` map can't express — anything that needs `valueFrom` (`fieldRef`, `secretKeyRef`, `configMapKeyRef`). | No       | `[]`                                      |

Any other key/value pairs you put under `env` are passed through to the container as environment variables, except for the chart-managed names listed above and `ESPRESSO_AI_API_KEY`.
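
A short sketch of an `env` map that mixes chart-managed names with a pass-through variable. `EXAMPLE_FLAG` is a hypothetical name used only to show the mechanism, and quoting the values as strings is an assumption about how the chart renders them.

```yaml
env:
  PROXY_HOST: example.com        # chart-managed, required
  EXCLUDE_QUERY_TEXT: "true"     # chart-managed, optional
  # Hypothetical pass-through variable; the proxy is not documented to read it.
  EXAMPLE_FLAG: "enabled"
```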

#### `env` vs `extraEnv`

`env` is a simple `name: value` **map** for literal values, and is the right place for almost everything. `extraEnv` is the escape hatch for env vars whose value comes from a non-literal source via `valueFrom`:

```yaml
extraEnv:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
```

`extraEnv` entries are rendered **after** the chart's managed env vars. Kubernetes `$(VAR)` substitution only resolves variables defined earlier in the same container, so a chart-managed env var cannot reference an `extraEnv` var via `$(VAR)`. If one `extraEnv` var must reference another via `$(VAR)`, define both in `extraEnv` with the source listed before the consumer.
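
The ordering rule can be sketched as follows. `COLLECTOR_ADDR` and port `8125` are hypothetical and serve only to show one `extraEnv` entry consuming another.

```yaml
extraEnv:
  # Source variable: must come first so the kubelet has it when expanding later entries.
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  # Consumer variable (hypothetical): $(NODE_IP) resolves because NODE_IP is defined above.
  - name: COLLECTOR_ADDR
    value: "$(NODE_IP):8125"
```

Reversing the two entries would leave the literal string `$(NODE_IP):8125` in `COLLECTOR_ADDR`, because Kubernetes keeps unresolved references unchanged.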

Each entry needs its `name` and `value` (or `valueFrom`) on the **same list item**. Splitting them across two items — `- name: X` then `- value: Y` — produces an entry with no name and fails admission with `spec.template.spec.containers[N].env[M].name: Required value`.
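
A side-by-side sketch of the two shapes, using a hypothetical variable name. The broken form is shown as comments so the fragment stays valid if copied.

```yaml
extraEnv:
  # Correct: name and value belong to the same list item (one leading dash).
  - name: EXAMPLE_SETTING
    value: "on"

  # Incorrect: two dashes create two items, the second of which has no name.
  # - name: EXAMPLE_SETTING
  # - value: "on"
```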

### API key secret

| Field                 | Description                                                    | Required             | Default |
| --------------------- | -------------------------------------------------------------- | -------------------- | ------- |
| `apiKeySecret.name`   | Existing Kubernetes Secret name that stores the proxy API key. | Yes                  | None    |
| `ESPRESSO_AI_API_KEY` | Fixed key the chart reads from the Kubernetes Secret.          | Yes (in Secret data) | Fixed   |

### Service

| Field                 | Description                                                                        | Required | Default     |
| --------------------- | ---------------------------------------------------------------------------------- | -------- | ----------- |
| `service.type`        | Service type (`ClusterIP`, `NodePort`, `LoadBalancer`).                            | No       | `ClusterIP` |
| `service.port`        | Service port.                                                                      | No       | `5050`      |
| `service.annotations` | Extra annotations on the Service (e.g., cloud LB hints when using `LoadBalancer`). | No       | `{}`        |

### Ingress (common fields)

These fields apply to every provider when `ingress.enabled: true`.

| Field                          | Description                                                                                                                                       | Required         | Default                  |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | ------------------------ |
| `ingress.enabled`              | Whether to render an Ingress resource.                                                                                                            | No               | `false`                  |
| `ingress.provider`             | Ingress flavor: `generic`, `aws`, or `azure`.                                                                                                     | Yes (if enabled) | `generic`                |
| `ingress.className`            | `ingressClassName` on the Ingress. Defaults: `alb` for `aws`, `azure/application-gateway` for `azure`, empty for `generic`.                       | Conditional      | None / provider-specific |
| `ingress.host`                 | Hostname rule for the ingress.                                                                                                                    | No (recommended) | None                     |
| `ingress.path`                 | Path for the primary rule.                                                                                                                        | No               | `/`                      |
| `ingress.pathType`             | `pathType` for the primary rule.                                                                                                                  | No               | `Prefix`                 |
| `ingress.annotations`          | Extra annotations merged onto the Ingress (after provider-specific annotations).                                                                  | No               | `{}`                     |
| `ingress.tls`                  | Either a standard ingress TLS list, or a shorthand `{secretName, hosts}`. When `hosts` is omitted from the shorthand, `ingress.host` is used.     | No               | `null`                   |
| `ingress.healthcheck.enabled`  | Render an extra hostless rule (no `host`) routing `/healthcheck` to the Service. Useful for probes that don't send the application's Host header. | No               | `false`                  |
| `ingress.healthcheck.path`     | Path used by the hostless healthcheck rule.                                                                                                       | No               | `/healthcheck`           |
| `ingress.healthcheck.pathType` | `pathType` used by the hostless healthcheck rule.                                                                                                 | No               | `Prefix`                 |

If `ingress.provider` is set to anything other than `generic`, `aws`, or `azure`, the chart fails the install with an explanatory error.

### Ingress — AWS (ALB)

When `ingress.provider: aws`, the chart emits AWS Load Balancer Controller annotations from `ingress.aws.*`. The previous `ingress.alb.*` block is still read as a backward-compatible alias. If both blocks are present, `ingress.aws.*` wins on conflicts.

| Field                         | Description                                   | Required    | Default           |
| ----------------------------- | --------------------------------------------- | ----------- | ----------------- |
| `ingress.aws.certificateArn`  | ACM certificate ARN for HTTPS listener.       | Recommended | None              |
| `ingress.aws.scheme`          | ALB scheme (`internet-facing` or `internal`). | No          | `internet-facing` |
| `ingress.aws.targetType`      | ALB target type.                              | No          | `ip`              |
| `ingress.aws.listenPorts`     | ALB listen ports JSON.                        | No          | `[{"HTTPS":443}]` |
| `ingress.aws.sslRedirect`     | ALB SSL redirect port.                        | No          | `"443"`           |
| `ingress.aws.healthcheckPath` | ALB target group health check path.           | No          | `/healthcheck`    |
| `ingress.aws.annotations`     | Extra ALB-specific annotations.               | No          | `{}`              |

### Ingress — Azure (Application Gateway)

When `ingress.provider: azure`, the chart emits AGIC annotations from `ingress.azure.*`.

| Field                               | Description                                                                                                                                                                    | Required | Default        |
| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -------------- |
| `ingress.azure.healthProbePath`     | Custom health probe path used by Application Gateway. Renders as `appgw.ingress.kubernetes.io/health-probe-path`.                                                              | No       | `/healthcheck` |
| `ingress.azure.sslRedirect`         | When `true` (default), renders `appgw.ingress.kubernetes.io/ssl-redirect: "true"`. Set to `false` to disable.                                                                  | No       | `true`         |
| `ingress.azure.appgwSslCertificate` | Name of an SSL certificate already uploaded to the Application Gateway. Renders as `appgw.ingress.kubernetes.io/appgw-ssl-certificate`. Use this *or* `ingress.tls`, not both. | No       | None           |
| `ingress.azure.annotations`         | Extra AGIC annotations.                                                                                                                                                        | No       | `{}`           |

### Ingress — Generic

When `ingress.provider: generic`, no cloud-specific annotations are added. Set `ingress.className` to your controller's class (e.g., `nginx`, `traefik`) and pass any controller-specific configuration through `ingress.annotations`. TLS works the same way as a stock Kubernetes Ingress, including the shorthand:

```yaml
ingress:
  tls:
    secretName: proxy-tls
    # hosts: [proxy.customer.example.com]   # optional; defaults to ingress.host
```
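
For comparison, a sketch of the standard list form, which follows the upstream Kubernetes Ingress `spec.tls` schema. Here `hosts` is written out explicitly rather than defaulted from `ingress.host`.

```yaml
ingress:
  tls:
    - secretName: proxy-tls
      hosts:
        - proxy.customer.example.com
```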

### Autoscaling

| Field                                        | Description                                          | Required | Default |
| -------------------------------------------- | ---------------------------------------------------- | -------- | ------- |
| `autoscaling.enabled`                        | Whether to render the HPA.                           | No       | `true`  |
| `replicaCount`                               | Initial deployment replica count before HPA adjusts. | No       | `2`     |
| `autoscaling.minReplicas`                    | Minimum replicas for HPA.                            | No       | `2`     |
| `autoscaling.maxReplicas`                    | Maximum replicas for HPA.                            | No       | `10`    |
| `autoscaling.targetCPUUtilizationPercentage` | CPU utilization target for HPA scaling decisions.    | No       | `70`    |
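
If you would rather run a fixed number of replicas, a sketch like the following should apply. It assumes the chart uses `replicaCount` as the Deployment's replica count when the HPA is not rendered, which is the usual Helm convention but is not stated in the table above.

```yaml
autoscaling:
  enabled: false   # do not render the HPA

replicaCount: 3    # assumption: becomes the fixed Deployment replica count
```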

### Probes

Both readiness and liveness probes are enabled by default and hit `/healthcheck` on the container port. They can be tuned or disabled under `probes.readiness` and `probes.liveness`.

### Telemetry collector

When `otelCollector.enabled: true`, the chart adds an OpenTelemetry Collector container to the proxy pod and renders a ConfigMap with a pipeline configuration. The proxy emits OTLP/HTTP to `localhost:4318`, and the sidecar fans the traffic out to two exporters:

* The **Espresso exporter** always sends to `otelCollector.espresso.endpoint` (default `https://metrics.espressocomputing.com:443`) for every pipeline. This is how Espresso AI receives your proxy's telemetry, and it is independent of the customer endpoint — leaving the customer exporter unset does not affect it.
* The optional **customer exporter** sends to `otelCollector.customer.endpoint` for the signals listed in `otelCollector.customer.signals`. Leave the endpoint empty to disable it entirely (the Espresso pipeline still runs).

| Field                                    | Description                                                                                                                                                                                   | Required | Default                                           |
| ---------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | ------------------------------------------------- |
| `otelCollector.enabled`                  | Whether to deploy the OTEL Collector sidecar and ConfigMap.                                                                                                                                   | No       | `true`                                            |
| `otelCollector.image.repository`         | Collector image repository.                                                                                                                                                                   | No       | `otel/opentelemetry-collector-contrib`            |
| `otelCollector.image.tag`                | Collector image tag.                                                                                                                                                                          | No       | `0.152.0`                                         |
| `otelCollector.image.pullPolicy`         | Kubernetes image pull policy for the collector.                                                                                                                                               | No       | `IfNotPresent`                                    |
| `otelCollector.resources`                | Standard `requests` / `limits` block for the collector container.                                                                                                                             | No       | `50m` / `128Mi` requests, `200m` / `256Mi` limits |
| `otelCollector.env`                      | Raw list of Kubernetes env entries injected into the collector sidecar container. Reference them inside the collector config with the collector's own `${env:NAME}` substitution (see below). | No       | `[]`                                              |
| `otelCollector.espresso.endpoint`        | OTLP endpoint for Espresso AI's backend. The Espresso exporter always sends here, for every pipeline.                                                                                         | No       | `https://metrics.espressocomputing.com:443`       |
| `otelCollector.customer.endpoint`        | OTLP endpoint for the customer's own observability backend. Leave empty to disable the customer exporter entirely (the Espresso pipeline still runs).                                         | No       | `""`                                              |
| `otelCollector.customer.protocol`        | Wire protocol for the customer exporter. `grpc` renders `otlp/customer`; `http` renders `otlphttp/customer`.                                                                                  | No       | `grpc`                                            |
| `otelCollector.customer.signals`         | Signals to mirror to the customer exporter. Any subset of `metrics`, `logs`. Signals not listed here go only to Espresso.                                                                     | No       | `[metrics, logs]`                                 |
| `otelCollector.customer.authSecret.name` | Kubernetes Secret holding the value for the customer endpoint's `Authorization` header. Leave empty for unauthenticated endpoints.                                                            | No       | `""`                                              |
| `otelCollector.customer.authSecret.key`  | Key within `customer.authSecret.name` whose value is mounted as `CUSTOMER_OTLP_AUTH`.                                                                                                         | No       | `authorization`                                   |
| `otelCollector.customer.tls.insecure`    | Disable TLS verification on the customer exporter.                                                                                                                                            | No       | `false`                                           |

Example — also mirror metrics (not logs) to your own OTLP backend with bearer-token auth:

```yaml
otelCollector:
  customer:
    endpoint: https://otlp.observability.customer.example.com:4317
    protocol: grpc
    signals:
      - metrics
    authSecret:
      name: customer-otlp-auth
      key: authorization
```

Where `customer-otlp-auth` is a Kubernetes Secret in the proxy namespace whose `authorization` key contains the full header value (e.g. `Bearer eyJ...`).
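
One way to create that Secret is with a manifest along these lines. The namespace `proxy` matches the install command used on this page, and `<token>` is a placeholder for your backend's credential.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: customer-otlp-auth
  namespace: proxy          # must be the namespace the proxy is installed in
type: Opaque
stringData:
  # Full header value, including the scheme.
  authorization: "Bearer <token>"
```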

#### Per-node customer collector

If you run a customer collector on each node (for example, as a DaemonSet) and want each sidecar to ship to the collector on its own node, the customer endpoint has to resolve to a per-node address. Use `otelCollector.env` to surface the node's IP, then reference it from `otelCollector.customer.endpoint`:

```yaml
otelCollector:
  enabled: true
  env:
    - name: NODE_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  customer:
    endpoint: "http://${env:NODE_IP}:4318"
    protocol: http        # use grpc + port 4317 for an OTLP/gRPC collector
    signals: [metrics, logs]
    tls:
      insecure: true      # plaintext OTLP/HTTP on the node
```

Espresso keeps receiving telemetry directly through the Espresso exporter; the customer also gets a copy on each node.

**Two substitution layers.** Two distinct substitution layers are in play, and it matters which one resolves `NODE_IP` here:

* `$(VAR)` is expanded by the **kubelet** in a container's `env` and `args`.
* `${env:NAME}` is expanded by the **OpenTelemetry Collector** when it reads its config.

The customer endpoint lives in the collector's ConfigMap, which the kubelet never touches, so a per-node value there must use `${env:NAME}` — the collector reads `NODE_IP` from its own process environment. Unlike the proxy `extraEnv` `$(VAR)` case above, ordering within `otelCollector.env` does not matter, because the collector just reads the environment it was given.
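
To make the distinction concrete, this is roughly how an exporter entry with that endpoint looks in OpenTelemetry Collector configuration syntax. It is an illustrative fragment, not the chart's actual rendered ConfigMap, which may differ in structure and naming.

```yaml
exporters:
  otlphttp/customer:
    # Resolved by the collector from its own process environment when it loads the config.
    # The kubelet never sees this string, so $(NODE_IP) would not be expanded here.
    endpoint: "http://${env:NODE_IP}:4318"
    tls:
      insecure: true
```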

For the full list of metrics, spans, and resource attributes the proxy emits — useful for building dashboards and alerts against the customer exporter — see [Proxy telemetry reference](/snowflake-optimizer/proxy-onboarding/proxy-telemetry-reference.md).

### Resources, scheduling, service account

`resources`, `nodeSelector`, `tolerations`, `affinity`, and `serviceAccount` follow standard Helm-chart conventions; see `values.yaml` for the defaults.
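
As a rough sketch of those conventions, the overrides below show the usual shapes. The numbers are placeholders rather than sizing guidance, the taint key is hypothetical, and the `serviceAccount` sub-keys assume the layout of the default `helm create` scaffold, so check them against the chart's `values.yaml`.

```yaml
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi

nodeSelector:
  kubernetes.io/os: linux

tolerations:
  - key: dedicated          # hypothetical taint key
    operator: Equal
    value: proxy
    effect: NoSchedule

affinity: {}

serviceAccount:
  create: true              # assumption: scaffold-style key
  annotations: {}           # assumption: scaffold-style key
```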

## Managed secret note

This chart does not create AWS Secrets Manager, Azure Key Vault, or External Secrets resources. To sync the API key from a cloud secrets manager, provision the sync resource separately (e.g., External Secrets Operator with an AWS Secrets Manager or Azure Key Vault `SecretStore`) and set:

* `apiKeySecret.name` to the Kubernetes Secret generated by the sync (key must be `ESPRESSO_AI_API_KEY`).
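
If you use External Secrets Operator, the sync resource might look something like this sketch. The `SecretStore` name and the remote key path are hypothetical, and the `apiVersion` depends on your operator release (newer releases may serve `external-secrets.io/v1`).

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: espresso-api
  namespace: proxy
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: cloud-secrets              # hypothetical SecretStore you provision separately
  target:
    name: espresso-api               # Kubernetes Secret to create; use this as apiKeySecret.name
  data:
    - secretKey: ESPRESSO_AI_API_KEY # fixed key the chart reads
      remoteRef:
        key: espresso/proxy/api-key  # hypothetical path in your secrets manager
```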

## Validation checklist

* Pods are running: `kubectl -n proxy get pods`
* Service exists: `kubectl -n proxy get svc`
* HPA exists (when `autoscaling.enabled`): `kubectl -n proxy get hpa`
* Ingress exists (if enabled): `kubectl -n proxy get ingress`
* App health endpoint responds on `/healthcheck`