
# Self-hosted Terraform deployment (AWS)

This guide explains how to deploy Espresso AI's Proxy Service on your AWS infrastructure with Terraform.

You can deploy either:

1. In a dedicated VPC that Terraform creates.
2. In an existing VPC that you provide.

## Prerequisites

* Access to an AWS account with IAM permissions for VPC, EKS, IAM roles, EC2/load balancers, Route53 (if used), and Secrets Manager (if used).
* In the [Espresso AI dashboard](https://dashboard.espressocomputing.com/), go to `Proxy Onboarding` and:
  * Enter your AWS account ID so we can grant ECR access for the Proxy image.
  * Copy your customer name.
  * Copy Espresso AI's AWS account ID. You need it to build the ECR repository URL.
  * Generate an API key for Espresso API authentication.

## What this module creates

* A dedicated VPC (optional). Otherwise the module uses your existing VPC and subnets.
* EKS cluster and node group.
* Karpenter for node autoscaling.
* AWS Load Balancer Controller.
* Proxy deployment, service, and HPA in Kubernetes.
* Optional Route53 record.
* Optional managed API key flow via AWS Secrets Manager + External Secrets.

## Example usage

### Dedicated VPC + managed secret + DNS

```hcl
variable "proxy_api_key_value" {
  description = "Managed proxy API key value for Secrets Manager sync."
  type        = string
  sensitive   = true
}

module "proxy_on_prem" {
  source = "github.com/espressocomputing/espresso-ai-proxy-tf//aws?ref=v0.4.3"

  region   = "us-east-1"
  customer = "<Value from Espresso AI dashboard>"

  create_dedicated_vpc = true
  vpc_config = {
    cidr                 = "10.80.0.0/16"
    public_subnet_cidrs  = ["10.80.0.0/20", "10.80.16.0/20"]
    private_subnet_cidrs = ["10.80.32.0/20", "10.80.48.0/20"]
    availability_zones   = ["us-east-1a", "us-east-1b"]
  }

  eks_config = {
    cluster_endpoint_public_access       = true
    cluster_endpoint_public_access_cidrs = ["203.0.113.10/32"]
  }

  proxy_config = {
    repository          = "<Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy"
    image               = "0.1-dev-c6cb3f5e933cc1d6871195b9d4ffcfea149d4321f1bdf96a8352c112740f32f3"
    proxy_host          = "customer.example.com"
    api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"
  }

  proxy_api_key_value = var.proxy_api_key_value

  alb_config = {
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
    ingress_host    = "proxy.customer.example.com"
  }

  dns_config = {
    create_record = true
    zone_id       = "Z123EXAMPLE456"
    record_name   = "proxy.customer.example.com"
  }
}
```

### Existing VPC + bring-your-own Kubernetes secret

```hcl
module "proxy_on_prem" {
  source = "github.com/espressocomputing/espresso-ai-proxy-tf//aws?ref=v0.4.3"

  region   = "us-east-1"
  customer = "<Value from Espresso AI dashboard>"

  create_dedicated_vpc = false
  existing_vpc_config = {
    vpc_id             = "vpc-0123456789abcdef0"
    private_subnet_ids = ["subnet-01aaaa", "subnet-02bbbb"]
    public_subnet_ids  = ["subnet-03cccc", "subnet-04dddd"]
  }

  eks_config = {
    cluster_endpoint_public_access       = true
    cluster_endpoint_public_access_cidrs = ["203.0.113.10/32"]
  }

  proxy_config = {
    repository          = "<Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy"
    image               = "0.1-dev-c6cb3f5e933cc1d6871195b9d4ffcfea149d4321f1bdf96a8352c112740f32f3"
    proxy_host          = "customer.example.com"
    api_key_secret_mode = "BYO_K8S_SECRET"
    api_key_secret_name = "espresso-ai"
  }

  alb_config = {
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
    ingress_host    = "proxy.customer.example.com"
  }
}
```

## Argument reference

### Top-level arguments

* `region`: Required. AWS region for deployment.
* `customer`: Required. Customer identifier used in naming and `API_URL` suffixing.
* `create_dedicated_vpc`: Optional. Creates dedicated VPC (`true`) or uses existing VPC (`false`). Default: `true`.
* `vpc_config`: Optional/conditional. Required when `create_dedicated_vpc = true`.
* `existing_vpc_config`: Optional/conditional. Required when `create_dedicated_vpc = false`.
* `eks_config`: Optional. EKS cluster and node group settings.
* `karpenter_config`: Optional. Karpenter NodePool tuning.
* `proxy_config`: Required. Proxy runtime configuration.
* `proxy_api_key_value`: Optional/conditional, sensitive. Required when `proxy_config.api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"`.
* `alb_config`: Optional. ALB ingress configuration.
* `dns_config`: Optional. Route53 alias record configuration.
* `autoscaling_config`: Optional. Proxy HPA configuration.
* `tags`: Optional. Additional AWS tags. Default: `{}`.

### `vpc_config`

* `vpc_name`: Optional. Default: `espresso-ai-proxy-vpc`.
* `cidr`: Required in dedicated VPC mode.
* `public_subnet_cidrs`: Required in dedicated VPC mode.
* `private_subnet_cidrs`: Required in dedicated VPC mode.
* `availability_zones`: Required in dedicated VPC mode and must align with subnet counts.
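
One way to keep the subnet lists aligned with the availability zones is to derive the CIDRs from a single AZ list. The following is a minimal sketch rather than part of the module: the `locals` names are hypothetical, and it assumes one public and one private subnet per availability zone. With the values shown, it produces the same CIDRs as the dedicated VPC example above.

```hcl
locals {
  # Hypothetical helper values; adjust to your own address plan.
  vpc_cidr = "10.80.0.0/16"
  azs      = ["us-east-1a", "us-east-1b"]
}

module "proxy_on_prem" {
  # ... other arguments as in the examples above ...

  vpc_config = {
    cidr               = local.vpc_cidr
    availability_zones = local.azs

    # cidrsubnet(prefix, 4, n) returns the n-th /20 inside the /16.
    # Public subnets take indexes 0..N-1, private subnets take N..2N-1.
    public_subnet_cidrs  = [for i, az in local.azs : cidrsubnet(local.vpc_cidr, 4, i)]
    private_subnet_cidrs = [for i, az in local.azs : cidrsubnet(local.vpc_cidr, 4, i + length(local.azs))]
  }
}
```

Adding a third zone to `local.azs` grows both subnet lists by one entry, so the counts stay matched.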

### `existing_vpc_config`

* `vpc_id`: Required in existing VPC mode.
* `private_subnet_ids`: Required in existing VPC mode.
* `public_subnet_ids`: Optional. Default: `[]`.
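
If you prefer not to hard-code subnet IDs, they can be looked up with the AWS provider's `aws_subnets` data source. This is a sketch under one assumption: that your private subnets carry a `Tier = "private"` tag. That tag is illustrative and is not something the module requires.

```hcl
# Assumption: private subnets are tagged Tier = "private".
# Change the tag filter to match how your subnets are labeled.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = ["vpc-0123456789abcdef0"]
  }

  tags = {
    Tier = "private"
  }
}

module "proxy_on_prem" {
  # ... other arguments as in the examples above ...

  create_dedicated_vpc = false
  existing_vpc_config = {
    vpc_id             = "vpc-0123456789abcdef0"
    private_subnet_ids = data.aws_subnets.private.ids
  }
}
```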

### `eks_config`

* `cluster_name`: Optional. Default: `espresso-ai-proxy`.
* `cluster_version`: Optional. Default: `1.35`.
* `bootstrap_self_managed_addons`: Optional. Default: `false`.
* `cluster_endpoint_public_access`: Optional. Default: `true`.
* `cluster_endpoint_private_access`: Optional. Default: `true`.
* `cluster_endpoint_public_access_cidrs`: Required when public endpoint access is enabled.
* `create_cloudwatch_log_group`: Optional. Default: `false`.
* `cloudwatch_log_group_retention_in_days`: Optional. Default: `90`.
* `instance_types`: Optional. Default: `["c8i.2xlarge", "c8i.4xlarge"]`.
* `node_group_min_size`: Optional. Default: `2`.
* `node_group_desired_size`: Optional. Default: `2`.
* `node_group_max_size`: Optional. Default: `10`.
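
As an illustration of how these settings combine, the fragment below overrides a few defaults. The cluster name and all numeric values are placeholders chosen for the example, not recommendations.

```hcl
eks_config = {
  cluster_name                         = "espresso-ai-proxy-prod" # hypothetical name
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["203.0.113.10/32"]

  # Create the CloudWatch log group and retain logs for 30 days
  # instead of the default 90.
  create_cloudwatch_log_group            = true
  cloudwatch_log_group_retention_in_days = 30

  # Start with three nodes and allow the node group to grow to six.
  node_group_min_size     = 3
  node_group_desired_size = 3
  node_group_max_size     = 6
}
```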

### `karpenter_config`

* `instance_types`: Optional. Default: `["c8i.2xlarge", "c8i.4xlarge"]`.
* `capacity_types`: Optional. Default: `["on-demand"]`.
* `cpu_limit`: Optional. Default: `64`.
* `memory_limit`: Optional. Default: `256Gi`.
* `node_cap`: Optional. Default: `10`.
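
A sketch of a larger Karpenter budget follows. It assumes the module passes `capacity_types` through to Karpenter unchanged; Karpenter itself recognizes `on-demand` and `spot`, but whether this module accepts `spot` is not confirmed here. The limits are illustrative.

```hcl
karpenter_config = {
  instance_types = ["c8i.2xlarge", "c8i.4xlarge"]

  # Assumption: values are forwarded to Karpenter as-is.
  capacity_types = ["on-demand", "spot"]

  # Raise the provisioning ceiling above the defaults (64 CPUs, 256Gi, 10 nodes).
  cpu_limit    = 128
  memory_limit = "512Gi"
  node_cap     = 20
}
```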

### `proxy_config`

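* `image`: Required. Proxy container image in Espresso AI's ECR. In the examples above, this is the image tag, and the ECR repository URL is passed separately as `repository`.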
* `replicas`: Optional. Default: `2`.
* `proxy_host`: Required. Your base domain (e.g. `example.com`), injected as `PROXY_HOST`. Use the registrable base domain only — not a full hostname (`proxy.example.com`), scheme, or port.
* `otel_collector`: Optional. OTEL Collector sidecar configuration. See [`otel_collector`](#otel_collector) below.
* `api_key_secret_name`: Optional. Kubernetes secret name for API key injection. Default: `espresso-ai`. The key inside the secret is fixed to `ESPRESSO_AI_API_KEY` and is not configurable.
* `api_key_secret_mode`: Optional. `BYO_K8S_SECRET` or `MANAGED_AWS_SECRETS_MANAGER`. Default: `BYO_K8S_SECRET`.
* `api_key_aws_secret_name`: Optional. AWS Secrets Manager secret name used in managed mode. Default: `/espresso-ai/proxy/api-key`.
* `api_url`: Optional. Base URL. Default: `https://api.espressocomputing.com:25831`.
* `env_vars`: Optional. Map of environment variable key/value pairs. Currently supported keys:

  | key                  | type   | definition                                                                                                                                  |
  | -------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
  | `EXCLUDE_QUERY_TEXT` | `bool` | Default: `false`. Whether to exclude query text on requests to Espresso AI's API. *Note: Enabling this will limit supported functionality.* |
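
For example, query text could be excluded as sketched below. The value is written as a string on the assumption that environment variable values are passed through as strings; Terraform converts between `"true"` and `true` automatically for primitive types, so either form should type-check.

```hcl
proxy_config = {
  repository = "<Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy"
  image      = "0.1-dev-..."
  proxy_host = "customer.example.com"

  env_vars = {
    # Enabling this limits supported functionality (see the table above).
    EXCLUDE_QUERY_TEXT = "true"
  }
}
```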

### `otel_collector`

Nested object under `proxy_config`. When `enabled = true`, the module deploys an OpenTelemetry Collector sidecar in the proxy pod and renders a ConfigMap with its pipeline. The proxy is automatically pointed at `http://localhost:4318`; the sidecar forwards to your own OTLP backend.

* `enabled`: Optional. Default: `true`. Set to `false` to disable the sidecar; the proxy will then emit OTLP directly to `otel_exporter_otlp_endpoint`.
* `image`: Optional. Full image reference for the collector container. Default: `otel/opentelemetry-collector-contrib:0.152.0`.
* `customer_endpoint`: Optional. OTLP endpoint for the customer's own observability backend. Leave empty (default) to disable the customer exporter; the Espresso pipeline still runs.
* `customer_protocol`: Optional. `grpc` (renders the `otlp/customer` exporter) or `http` (renders `otlphttp/customer`). Default: `grpc`.
* `customer_signals`: Optional. Signals to mirror to the customer exporter. Any subset of `metrics`, `logs`, `traces`. Default: all three.
* `customer_auth_secret_name`: Optional. Existing Kubernetes Secret in the proxy namespace whose value is mounted as `CUSTOMER_OTLP_AUTH` and sent as the customer exporter's `Authorization` header. Leave empty for unauthenticated endpoints.
* `customer_auth_secret_key`: Optional. Key within `customer_auth_secret_name`. Default: `authorization`.
* `customer_tls_insecure`: Optional. Disable TLS verification on the customer exporter. Default: `false`.

Example: also mirror metrics and logs (but not traces) to the customer's own OTLP backend with bearer-token auth:

```hcl
proxy_config = {
  repository = "<Espresso AI's AWS Account ID>.dkr.ecr.us-east-1.amazonaws.com/proxy"
  image      = "0.1-dev-..."
  proxy_host = "customer.example.com"

  otel_collector = {
    customer_endpoint         = "https://otlp.observability.customer.example.com:4317"
    customer_protocol         = "grpc"
    customer_signals          = ["metrics", "logs"]
    customer_auth_secret_name = "customer-otlp-auth"
  }
}
```

The `customer-otlp-auth` Secret must exist in the `proxy` namespace and contain an `authorization` key whose value is the full header (e.g. `Bearer eyJ...`).
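
If you manage Kubernetes objects with Terraform, that Secret could be created with the `kubernetes_secret_v1` resource from the `hashicorp/kubernetes` provider. This is a sketch, not something the module does for you: it assumes a Kubernetes provider already configured against the proxy cluster, and `customer_otlp_token` is a hypothetical variable.

```hcl
variable "customer_otlp_token" {
  description = "Bearer token for your OTLP backend (hypothetical variable)."
  type        = string
  sensitive   = true
}

resource "kubernetes_secret_v1" "customer_otlp_auth" {
  metadata {
    name      = "customer-otlp-auth"
    namespace = "proxy"
  }

  # The provider base64-encodes these values when it writes the Secret.
  data = {
    authorization = "Bearer ${var.customer_otlp_token}"
  }
}
```

Note that the token ends up in Terraform state, so the state backend should be treated as sensitive.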

For the full list of metrics, spans, and resource attributes the proxy emits — useful for building dashboards and alerts against the customer exporter — see [Proxy telemetry reference](/snowflake-optimizer/proxy-onboarding/proxy-telemetry-reference.md).

### `alb_config`

* `enable_ingress`: Optional. Enables ALB ingress. Default: `true`.
* `certificate_arn`: Required when ingress is enabled.
* `ingress_host`: Optional. Host rule.
* `scheme`: Optional. `internet-facing` or `internal`. Default: `internet-facing`.
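
For instance, a proxy that should only be reachable from inside your network might use an internal ALB. A minimal sketch with the same placeholder certificate ARN used in the examples above:

```hcl
alb_config = {
  certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
  ingress_host    = "proxy.customer.example.com"

  # Override the default of "internet-facing".
  scheme = "internal"
}
```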

### `dns_config`

* `create_record`: Optional. Creates Route53 alias. Default: `false`.
* `zone_id`: Required when `create_record = true`.
* `record_name`: Optional. Falls back to ingress host if omitted.
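
Rather than pasting the hosted zone ID, it can be resolved by name with the AWS provider's `aws_route53_zone` data source. The sketch below assumes a public hosted zone named `customer.example.com` already exists in the same account.

```hcl
# Assumption: a public hosted zone for this domain already exists.
data "aws_route53_zone" "proxy" {
  name         = "customer.example.com."
  private_zone = false
}

module "proxy_on_prem" {
  # ... other arguments as in the examples above ...

  dns_config = {
    create_record = true
    zone_id       = data.aws_route53_zone.proxy.zone_id
    record_name   = "proxy.customer.example.com"
  }
}
```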

### `autoscaling_config`

* `min_replicas`: Optional. Default: `2`.
* `max_replicas`: Optional. Default: `10`.
* `target_cpu_utilization`: Optional. Default: `70`.
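
As a sketch, the fragment below widens the replica range and scales out earlier than the default. The numbers are illustrative only.

```hcl
autoscaling_config = {
  min_replicas = 3
  max_replicas = 20

  # Scale out when average CPU utilization passes 60% instead of 70%.
  target_cpu_utilization = 60
}
```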

## Secret modes

* `BYO_K8S_SECRET` (default): Proxy reads from an existing Kubernetes secret (`api_key_secret_name`) using fixed key `ESPRESSO_AI_API_KEY`.
* `MANAGED_AWS_SECRETS_MANAGER`: The module provisions an AWS Secrets Manager secret, IRSA, and External Secrets Operator, then syncs the value to a Kubernetes secret.
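
In `BYO_K8S_SECRET` mode you create the secret yourself. One possible approach, sketched here with the `kubernetes_secret_v1` resource from the `hashicorp/kubernetes` provider, is shown below. The `espresso_api_key` variable is hypothetical, and the `proxy` namespace is an assumption; check it against the `proxy_namespace` output of your deployment.

```hcl
variable "espresso_api_key" {
  description = "API key generated in the Espresso AI dashboard (hypothetical variable)."
  type        = string
  sensitive   = true
}

resource "kubernetes_secret_v1" "espresso_ai" {
  metadata {
    name      = "espresso-ai" # matches the api_key_secret_name default
    namespace = "proxy"       # assumption: confirm with the proxy_namespace output
  }

  # The key name is fixed by the module and cannot be changed.
  data = {
    ESPRESSO_AI_API_KEY = var.espresso_api_key
  }
}
```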

## Outputs

The module exports:

* `vpc_id`
* `public_subnet_ids`
* `private_subnet_ids`
* `eks_cluster_name`
* `eks_cluster_endpoint`
* `eks_cluster_security_group_id`
* `proxy_namespace`
* `proxy_service_name`
* `proxy_service_load_balancer_hostname`
* `proxy_ingress_load_balancer_hostname`
* `proxy_hpa_name`
* `proxy_dns_fqdn`
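
These can be re-exported from your root module so that they appear in `terraform output`. A short sketch, where the output names on the left are arbitrary:

```hcl
output "proxy_endpoint" {
  description = "Hostname of the load balancer created for the proxy ingress."
  value       = module.proxy_on_prem.proxy_ingress_load_balancer_hostname
}

output "proxy_fqdn" {
  description = "Route53 record name, if dns_config.create_record is enabled."
  value       = module.proxy_on_prem.proxy_dns_fqdn
}
```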

## How to deploy

Deployment typically takes around 20-30 minutes.

```bash
terraform init
terraform plan
terraform apply
```

## Best practices

* Manage sensitive variables such as `proxy_api_key_value` via `TF_VAR_`-prefixed environment variables or a `.tfvars` file that is excluded from version control.
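
As a sketch of the `.tfvars` option, the file below supplies the managed API key. The filename is hypothetical; Terraform automatically loads any file whose name ends in `.auto.tfvars`, and the file should be listed in `.gitignore`.

```hcl
# secrets.auto.tfvars (hypothetical filename; keep it out of version control)
proxy_api_key_value = "<your API key>"
```

Alternatively, setting the environment variable `TF_VAR_proxy_api_key_value` before running `terraform plan` has the same effect without writing the value to disk.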

## Version migrations

The v0.1.0 → v0.2.0 change is a path-only refactor: the AWS configuration moved from the repo root into an `aws/` subdirectory, so the source URL needs `//aws`. No resource addresses changed, so existing state continues to apply cleanly.

```hcl
module "proxy_on_prem" {
  source = "github.com/espressocomputing/espresso-ai-proxy-tf//aws?ref=v0.2.0"
  #                                                       ^^^^^ new
  ...
}
```

```bash
terraform init -upgrade
terraform plan
terraform apply
```

`plan` should report zero changes. If it shows any resource being destroyed, recreated, or replaced, stop and investigate before running `apply`.