# Proxy on-prem Terraform deployment

This guide explains how to deploy Espresso AI's Proxy Service on your AWS infrastructure with Terraform.

You can deploy either:

1. In a dedicated VPC that Terraform creates.
2. In an existing VPC that you provide.

## Prerequisites

* Access to an AWS account with IAM permissions for VPC, EKS, IAM roles, EC2/load balancers, Route53 (if used), and Secrets Manager (if used).
* In the [Espresso AI dashboard](https://dashboard.espressocomputing.com/), go to `Proxy Onboarding` and:
  * Enter your AWS account ID so we can grant ECR access for the Proxy image.
  * Copy your customer name.
  * Generate an API key for Espresso API authentication.

## What this module creates

* A dedicated VPC (optional). Otherwise, the module uses your existing VPC and subnets.
* EKS cluster and node group.
* AWS Load Balancer Controller.
* Proxy deployment, service, and HPA in Kubernetes.
* Optional Route53 record.
* Optional managed API key flow via AWS Secrets Manager + External Secrets.

## Example usage

### Dedicated VPC + managed secret + DNS

```hcl
variable "proxy_api_key_value" {
  description = "Managed proxy API key value for Secrets Manager sync."
  type        = string
  sensitive   = true
}

module "proxy_on_prem" {
  source = "github.com/espressocomputing/espresso-ai-proxy-tf?ref=v0.0.1"

  region   = "us-east-1"
  customer = "<Value from Espresso AI dashboard>"

  create_dedicated_vpc = true
  vpc_config = {
    cidr                 = "10.80.0.0/16"
    public_subnet_cidrs  = ["10.80.0.0/20", "10.80.16.0/20"]
    private_subnet_cidrs = ["10.80.32.0/20", "10.80.48.0/20"]
    availability_zones   = ["us-east-1a", "us-east-1b"]
  }

  eks_config = {
    cluster_endpoint_public_access       = true
    cluster_endpoint_public_access_cidrs = ["203.0.113.10/32"]
  }

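  proxy_config = {
    image               = "123456789012.dkr.ecr.us-east-1.amazonaws.com/proxy:0.1-dev-bc733866794f8bc1d40395463ab8151bee52b8bbdc5d41768d02e4e0094b9da8"
    proxy_host          = "proxy.customer.example.com"
    api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"
    # Required in managed mode per the argument reference; this name is only an example.
    api_key_aws_secret_name = "espresso-ai-proxy-api-key"
  }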

  proxy_api_key_value = var.proxy_api_key_value

  alb_config = {
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
    ingress_host    = "proxy.customer.example.com"
  }

  dns_config = {
    create_record = true
    zone_id       = "Z123EXAMPLE456"
    record_name   = "proxy.customer.example.com"
  }
}
```

### Existing VPC + bring-your-own Kubernetes secret

```hcl
module "proxy_on_prem" {
  source = "github.com/espressocomputing/espresso-ai-proxy-tf?ref=v0.0.1"

  region   = "us-east-1"
  customer = "<Value from Espresso AI dashboard>"

  create_dedicated_vpc = false
  existing_vpc_config = {
    vpc_id             = "vpc-0123456789abcdef0"
    private_subnet_ids = ["subnet-01aaaa", "subnet-02bbbb"]
    public_subnet_ids  = ["subnet-03cccc", "subnet-04dddd"]
  }

  eks_config = {
    cluster_endpoint_public_access       = true
    cluster_endpoint_public_access_cidrs = ["203.0.113.10/32"]
  }

  proxy_config = {
    image               = "123456789012.dkr.ecr.us-east-1.amazonaws.com/proxy:latest"
    proxy_host          = "proxy.customer.example.com"
    api_key_secret_mode = "BYO_K8S_SECRET"
    api_key_secret_name = "espresso-ai"
  }

  alb_config = {
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
    ingress_host    = "proxy.customer.example.com"
  }
}
```

## Argument reference

### Top-level arguments

* `region`: Required. AWS region for deployment.
* `customer`: Required. Customer identifier from the Espresso AI dashboard, used in resource naming and as the `API_URL` suffix.
* `create_dedicated_vpc`: Optional. Creates a dedicated VPC (`true`) or uses an existing VPC (`false`). Default: `true`.
* `vpc_config`: Optional/conditional. Required when `create_dedicated_vpc = true`.
* `existing_vpc_config`: Optional/conditional. Required when `create_dedicated_vpc = false`.
* `eks_config`: Optional. EKS cluster and node group settings.
* `karpenter_config`: Optional. Karpenter NodePool tuning.
* `proxy_config`: Required. Proxy runtime configuration.
* `proxy_api_key_value`: Optional/conditional, sensitive. Required when `proxy_config.api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"`.
* `alb_config`: Optional. ALB ingress configuration.
* `dns_config`: Optional. Route53 alias record configuration.
* `autoscaling_config`: Optional. Proxy HPA configuration.
* `tags`: Optional. Additional AWS tags. Default: `{}`.

### `vpc_config`

* `vpc_name`: Optional. Default: `espresso-ai-proxy-vpc`.
* `cidr`: Required in dedicated VPC mode.
* `public_subnet_cidrs`: Required in dedicated VPC mode.
* `private_subnet_cidrs`: Required in dedicated VPC mode.
* `availability_zones`: Required in dedicated VPC mode. The number of zones must align with the number of public and private subnets.
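
As an illustration of how the three lists line up, here is a sketch of a three-zone layout. It assumes that "align" means one public and one private subnet per availability zone; the local name `proxy_vpc_config` and the CIDR ranges are made up for this example.

```hcl
locals {
  # Hypothetical three-AZ layout: each list has exactly three entries,
  # and every subnet CIDR is a non-overlapping /20 inside the VPC /16.
  proxy_vpc_config = {
    vpc_name             = "espresso-ai-proxy-vpc"
    cidr                 = "10.80.0.0/16"
    public_subnet_cidrs  = ["10.80.0.0/20", "10.80.16.0/20", "10.80.32.0/20"]
    private_subnet_cidrs = ["10.80.48.0/20", "10.80.64.0/20", "10.80.80.0/20"]
    availability_zones   = ["us-east-1a", "us-east-1b", "us-east-1c"]
  }
}
```

It could then be passed to the module as `vpc_config = local.proxy_vpc_config`.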

### `existing_vpc_config`

* `vpc_id`: Required in existing VPC mode.
* `private_subnet_ids`: Required in existing VPC mode.
* `public_subnet_ids`: Optional. Default: `[]`.

### `eks_config`

* `cluster_name`: Optional. Default: `espresso-ai-proxy`.
* `cluster_version`: Optional. Default: `1.35`.
* `bootstrap_self_managed_addons`: Optional. Default: `false`.
* `cluster_endpoint_public_access`: Optional. Default: `true`.
* `cluster_endpoint_private_access`: Optional. Default: `true`.
* `cluster_endpoint_public_access_cidrs`: Required when public endpoint access is enabled.
* `create_cloudwatch_log_group`: Optional. Default: `false`.
* `cloudwatch_log_group_retention_in_days`: Optional. Default: `90`.
* `instance_types`: Optional. Default: `["c8i.2xlarge", "c8i.4xlarge"]`.
* `node_group_min_size`: Optional. Default: `2`.
* `node_group_desired_size`: Optional. Default: `2`.
* `node_group_max_size`: Optional. Default: `10`.
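
The following sketch shows one possible `eks_config` override: it disables the public API endpoint and raises the node group bounds. The values are illustrative, the local name `proxy_eks_config` is an assumption, and omitted attributes are assumed to fall back to the defaults listed above.

```hcl
locals {
  proxy_eks_config = {
    # Private-only API endpoint, so no public access CIDRs are listed.
    cluster_endpoint_public_access  = false
    cluster_endpoint_private_access = true

    # Illustrative sizing; keep min <= desired <= max.
    node_group_min_size     = 3
    node_group_desired_size = 3
    node_group_max_size     = 12
  }
}
```

Pass it as `eks_config = local.proxy_eks_config`. With public access disabled, Terraform most likely has to run from a network that can reach the cluster's private endpoint.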

### `karpenter_config`

* `instance_types`: Optional. Default: `["c8i.2xlarge", "c8i.4xlarge"]`.
* `capacity_types`: Optional. Default: `["on-demand"]`.
* `cpu_limit`: Optional. Default: `64`.
* `memory_limit`: Optional. Default: `256Gi`.

### `proxy_config`

* `image`: Required. Proxy container image URI.
* `replicas`: Optional. Default: `2`.
* `proxy_host`: Required. Non-empty value injected as `PROXY_HOST`.
* `otel_exporter_otlp_endpoint`: Optional. Injected as `OTEL_EXPORTER_OTLP_ENDPOINT`. Default: `https://metrics.espressocomputing.com:443`.
* `api_key_secret_name`: Optional. Kubernetes secret name for API key injection. Default: `espresso-ai`. The key inside the secret is fixed to `ESPRESSO_AI_API_KEY` and is not configurable.
* `api_key_secret_mode`: Optional. `BYO_K8S_SECRET` or `MANAGED_AWS_SECRETS_MANAGER`. Default: `BYO_K8S_SECRET`.
* `api_key_aws_secret_name`: Optional/conditional. Required when `api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"`.
* `api_url`: Optional. Base URL. Default: `https://api.espressocomputing.com:25831`.

### `alb_config`

* `enable_ingress`: Optional. Enables ALB ingress. Default: `true`.
* `certificate_arn`: Required when ingress is enabled.
* `ingress_host`: Optional. Host rule.
* `scheme`: Optional. `internet-facing` or `internal`. Default: `internet-facing`.
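
As a sketch of the non-default scheme, the block below configures an internal load balancer. The local name `proxy_alb_config` is an assumption, and the certificate ARN and host are the placeholder values from the examples above.

```hcl
locals {
  proxy_alb_config = {
    enable_ingress  = true
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
    ingress_host    = "proxy.customer.example.com"
    # Internal ALBs resolve to private IPs and are reachable only from
    # inside the VPC or from networks connected to it.
    scheme = "internal"
  }
}
```

Pass it as `alb_config = local.proxy_alb_config`.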

### `dns_config`

* `create_record`: Optional. Creates Route53 alias. Default: `false`.
* `zone_id`: Required when `create_record = true`.
* `record_name`: Optional. Falls back to ingress host if omitted.

### `autoscaling_config`

* `min_replicas`: Optional. Default: `2`.
* `max_replicas`: Optional. Default: `10`.
* `target_cpu_utilization`: Optional. Default: `70`.

## Secret modes

* `BYO_K8S_SECRET` (default): The Proxy reads the API key from an existing Kubernetes secret (`api_key_secret_name`) under the fixed key `ESPRESSO_AI_API_KEY`.
* `MANAGED_AWS_SECRETS_MANAGER`: The module provisions an AWS Secrets Manager secret, IRSA, and External Secrets Operator, then syncs the value to the Kubernetes secret.
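
For `BYO_K8S_SECRET`, one way to create the secret is the `kubernetes_secret` resource from the `hashicorp/kubernetes` provider, sketched below. This assumes a Kubernetes provider is already configured against the cluster and that the secret belongs in the namespace exported as `proxy_namespace`; the variable name `espresso_ai_api_key` is hypothetical.

```hcl
variable "espresso_ai_api_key" {
  description = "API key generated in the Espresso AI dashboard."
  type        = string
  sensitive   = true
}

resource "kubernetes_secret" "espresso_ai" {
  metadata {
    # Must match proxy_config.api_key_secret_name.
    name = "espresso-ai"
    # Assumption: the secret lives in the same namespace as the Proxy.
    namespace = module.proxy_on_prem.proxy_namespace
  }

  # The provider base64-encodes values in `data`.
  # The key name is fixed by the module.
  data = {
    ESPRESSO_AI_API_KEY = var.espresso_ai_api_key
  }
}
```

Note that a secret managed this way is also stored in the Terraform state, so the state backend should be access-controlled and encrypted.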

## Outputs

The module exports:

* `vpc_id`
* `public_subnet_ids`
* `private_subnet_ids`
* `eks_cluster_name`
* `eks_cluster_endpoint`
* `eks_cluster_security_group_id`
* `proxy_namespace`
* `proxy_service_name`
* `proxy_service_load_balancer_hostname`
* `proxy_ingress_load_balancer_hostname`
* `proxy_hpa_name`
* `proxy_dns_fqdn`
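
To surface these values after `terraform apply`, they can be re-exported from the root configuration. A minimal sketch, assuming the module block is named `proxy_on_prem` as in the examples above; the root output names are arbitrary.

```hcl
output "proxy_ingress_hostname" {
  description = "Load balancer hostname reported for the Proxy ingress."
  value       = module.proxy_on_prem.proxy_ingress_load_balancer_hostname
}

output "proxy_eks_cluster_name" {
  description = "EKS cluster name, e.g. for `aws eks update-kubeconfig --name <cluster>`."
  value       = module.proxy_on_prem.eks_cluster_name
}
```

`terraform output proxy_ingress_hostname` then prints the value on demand.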

## How to deploy

Deployment typically takes around 20-30 minutes.

```bash
terraform init
terraform plan
terraform apply
```

## Best practices

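* Supply sensitive variables such as `proxy_api_key_value` through `TF_VAR_`-prefixed environment variables (for example `TF_VAR_proxy_api_key_value`) or a `.tfvars` file that is excluded from version control.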
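
For instance, the `proxy_api_key_value` variable from the first example could be set in a variable definitions file. The file name below is an arbitrary choice; Terraform automatically loads files whose names end in `.auto.tfvars`.

```hcl
# secrets.auto.tfvars (example file name; add it to .gitignore)
proxy_api_key_value = "<API key generated in the Espresso AI dashboard>"
```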
