Proxy on-prem Terraform deployment (AWS)
This guide explains how to deploy Espresso AI's Proxy Service on your AWS infrastructure with Terraform.
You can deploy either:
In a dedicated VPC that Terraform creates.
In an existing VPC that you provide.
Prerequisites
Access to an AWS account with IAM permissions for VPC, EKS, IAM roles, EC2/load balancers, Route53 (if used), and Secrets Manager (if used).
In the Espresso AI dashboard, go to Proxy Onboarding and:
Enter your AWS account ID so we can grant ECR access for the Proxy image.
Copy your customer name.
Generate an API key for Espresso API authentication.
What this module creates
VPC (optional) or uses your existing VPC/subnets.
EKS cluster and node group.
Karpenter for node autoscaling.
AWS Load Balancer Controller.
Proxy deployment, service, and HPA in Kubernetes.
Optional Route53 record.
Optional managed API key flow via AWS Secrets Manager + External Secrets.
Example usage
Dedicated VPC + managed secret + DNS
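The following is a minimal sketch of this scenario, built only from the arguments documented in the argument reference below. It assumes the nested arguments are objects whose optional attributes can be omitted. The module label espresso_proxy, the source URL, CIDRs, hostnames, ARNs, and IDs are hypothetical placeholders, not values from Espresso AI.

```hcl
# Root-level variable holding the API key generated in the dashboard.
variable "proxy_api_key_value" {
  type      = string
  sensitive = true
}

module "espresso_proxy" {
  # Hypothetical placeholder: substitute the module source provided by Espresso AI.
  source = "git::https://example.com/your-org/espresso-proxy-terraform.git//aws"

  region   = "us-east-1"
  customer = "acme" # placeholder for the customer name copied from Proxy Onboarding

  # Dedicated VPC mode: vpc_config is required.
  create_dedicated_vpc = true
  vpc_config = {
    cidr                 = "10.0.0.0/16"
    public_subnet_cidrs  = ["10.0.0.0/24", "10.0.1.0/24"]
    private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24"]
    availability_zones   = ["us-east-1a", "us-east-1b"] # count matches the subnet lists
  }

  eks_config = {
    # Required because the public endpoint is enabled by default.
    cluster_endpoint_public_access_cidrs = ["203.0.113.0/24"]
  }

  proxy_config = {
    image               = "<proxy image URI provided by Espresso AI>"
    proxy_host          = "proxy.example.com" # placeholder value
    api_key_secret_mode = "MANAGED_AWS_SECRETS_MANAGER"
  }

  # Required in managed secret mode.
  proxy_api_key_value = var.proxy_api_key_value

  alb_config = {
    # Required because ingress is enabled by default.
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/00000000-0000-0000-0000-000000000000"
    ingress_host    = "proxy.example.com"
  }

  dns_config = {
    create_record = true
    zone_id       = "Z0123456789EXAMPLE"
    # record_name omitted: falls back to the ingress host.
  }
}
```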
Existing VPC + bring-your-own Kubernetes secret
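A comparable sketch for this scenario, under the same assumptions: the module label, source URL, VPC and subnet IDs, and certificate ARN are hypothetical placeholders. It also assumes a Kubernetes secret named espresso-ai containing the key ESPRESSO_AI_API_KEY is created outside this module.

```hcl
module "espresso_proxy" {
  # Hypothetical placeholder: substitute the module source provided by Espresso AI.
  source = "git::https://example.com/your-org/espresso-proxy-terraform.git//aws"

  region   = "us-east-1"
  customer = "acme" # placeholder for the customer name copied from Proxy Onboarding

  # Existing VPC mode: existing_vpc_config is required.
  create_dedicated_vpc = false
  existing_vpc_config = {
    vpc_id             = "vpc-0123456789abcdef0"
    private_subnet_ids = ["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"]
    # public_subnet_ids is optional and defaults to [].
  }

  eks_config = {
    # Required because the public endpoint is enabled by default.
    cluster_endpoint_public_access_cidrs = ["203.0.113.0/24"]
  }

  proxy_config = {
    image               = "<proxy image URI provided by Espresso AI>"
    proxy_host          = "proxy.internal.example.com" # placeholder value
    api_key_secret_mode = "BYO_K8S_SECRET"             # the default, shown for clarity
    api_key_secret_name = "espresso-ai"                # secret you manage yourself
  }

  alb_config = {
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/00000000-0000-0000-0000-000000000000"
    scheme          = "internal"
  }
}
```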
Argument reference
Top-level arguments
region: Required. AWS region for deployment.
customer: Required. Customer identifier used in naming and API_URL suffixing.
create_dedicated_vpc: Optional. Creates a dedicated VPC (true) or uses an existing VPC (false). Default: true.
vpc_config: Optional/conditional. Required when create_dedicated_vpc = true.
existing_vpc_config: Optional/conditional. Required when create_dedicated_vpc = false.
eks_config: Optional. EKS cluster and node group settings.
karpenter_config: Optional. Karpenter NodePool tuning.
proxy_config: Required. Proxy runtime configuration.
proxy_api_key_value: Optional/conditional, sensitive. Required when proxy_config.api_key_secret_mode = MANAGED_AWS_SECRETS_MANAGER.
alb_config: Optional. ALB ingress configuration.
dns_config: Optional. Route53 alias record configuration.
autoscaling_config: Optional. Proxy HPA configuration.
tags: Optional. Additional AWS tags. Default: {}.
vpc_config
vpc_name: Optional. Default: espresso-ai-proxy-vpc.
cidr: Required in dedicated VPC mode.
public_subnet_cidrs: Required in dedicated VPC mode.
private_subnet_cidrs: Required in dedicated VPC mode.
availability_zones: Required in dedicated VPC mode and must align with subnet counts.
existing_vpc_config
vpc_id: Required in existing VPC mode.
private_subnet_ids: Required in existing VPC mode.
public_subnet_ids: Optional. Default: [].
eks_config
cluster_name: Optional. Default: espresso-ai-proxy.
cluster_version: Optional. Default: 1.35.
bootstrap_self_managed_addons: Optional. Default: false.
cluster_endpoint_public_access: Optional. Default: true.
cluster_endpoint_private_access: Optional. Default: true.
cluster_endpoint_public_access_cidrs: Required when public endpoint access is enabled.
create_cloudwatch_log_group: Optional. Default: false.
cloudwatch_log_group_retention_in_days: Optional. Default: 90.
instance_types: Optional. Default: ["c8i.2xlarge", "c8i.4xlarge"].
node_group_min_size: Optional. Default: 2.
node_group_desired_size: Optional. Default: 2.
node_group_max_size: Optional. Default: 10.
karpenter_config
instance_types: Optional. Default: ["c8i.2xlarge", "c8i.4xlarge"].
capacity_types: Optional. Default: ["on-demand"].
cpu_limit: Optional. Default: 64.
memory_limit: Optional. Default: 256Gi.
node_cap: Optional. Default: 10.
proxy_config
image: Required. Proxy container image URI.
replicas: Optional. Default: 2.
proxy_host: Required. Non-empty value injected as PROXY_HOST.
otel_exporter_otlp_endpoint: Optional. Injected as OTEL_EXPORTER_OTLP_ENDPOINT. Default: https://metrics.espressocomputing.com:443.
api_key_secret_name: Optional. Kubernetes secret name for API key injection. Default: espresso-ai. The key name within the secret is fixed to ESPRESSO_AI_API_KEY and is not configurable.
api_key_secret_mode: Optional. BYO_K8S_SECRET or MANAGED_AWS_SECRETS_MANAGER. Default: BYO_K8S_SECRET.
api_key_aws_secret_name: Optional. AWS Secrets Manager secret name used in managed mode. Default: /espresso-ai/proxy/api-key.
api_url: Optional. Base URL. Default: https://api.espressocomputing.com:25831.
env_vars: Optional. Map of environment variable key/value pairs. Currently supported keys:
key | type | definition
EXCLUDE_QUERY_TEXT | bool | Default: false. Whether to exclude query text on requests to Espresso AI's API. Note: enabling this will limit supported functionality.
alb_config
enable_ingress: Optional. Enables ALB ingress. Default: true.
certificate_arn: Required when ingress is enabled.
ingress_host: Optional. Host rule.
scheme: Optional. internet-facing or internal. Default: internet-facing.
dns_config
create_record: Optional. Creates Route53 alias. Default: false.
zone_id: Required when create_record = true.
record_name: Optional. Falls back to ingress host if omitted.
autoscaling_config
min_replicas: Optional. Default: 2.
max_replicas: Optional. Default: 10.
target_cpu_utilization: Optional. Default: 70.
Secret modes
BYO_K8S_SECRET (default): The Proxy reads from an existing Kubernetes secret (api_key_secret_name) using the fixed key ESPRESSO_AI_API_KEY.
MANAGED_AWS_SECRETS_MANAGER: The module provisions an AWS Secrets Manager secret, IRSA, and the External Secrets Operator, and syncs the value to a Kubernetes secret.
Outputs
The module exports:
vpc_id
public_subnet_ids
private_subnet_ids
eks_cluster_name
eks_cluster_endpoint
eks_cluster_security_group_id
proxy_namespace
proxy_service_name
proxy_service_load_balancer_hostname
proxy_ingress_load_balancer_hostname
proxy_hpa_name
proxy_dns_fqdn
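To surface these values from your own root configuration, you can re-export them. This is a small sketch that assumes the module block is labelled espresso_proxy (a hypothetical label); the root output names are illustrative as well.

```hcl
# Re-export selected module outputs from the root configuration.
# "espresso_proxy" is an assumed module label; adjust it to match your module block.
output "proxy_ingress_hostname" {
  description = "Hostname of the load balancer created for the Proxy ingress."
  value       = module.espresso_proxy.proxy_ingress_load_balancer_hostname
}

output "proxy_dns_fqdn" {
  description = "FQDN exported by the module for the Proxy DNS record."
  value       = module.espresso_proxy.proxy_dns_fqdn
}

output "eks_cluster_name" {
  description = "Name of the EKS cluster running the Proxy."
  value       = module.espresso_proxy.eks_cluster_name
}
```

After an apply, terraform output proxy_ingress_hostname prints the corresponding value.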
How to deploy
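Run the standard Terraform workflow (terraform init, terraform plan, terraform apply) from the directory that contains your module configuration. Deployment typically takes around 20 to 30 minutes.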
Best practices
Manage sensitive variables (for example, proxy_api_key_value) via environment variables or a .tfvars file.
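As an illustration of the .tfvars approach, the fragment below assumes your root configuration declares a variable named proxy_api_key_value with sensitive = true and passes it to the module. The file name is hypothetical.

```hcl
# secrets.auto.tfvars (hypothetical file name; keep it out of version control).
# Terraform loads *.auto.tfvars files automatically from the working directory.
proxy_api_key_value = "<your Espresso AI API key>"

# Alternative without a file: export the value as an environment variable
# before running Terraform, using Terraform's TF_VAR_<name> convention:
#   TF_VAR_proxy_api_key_value=<your Espresso AI API key>
```

Either route keeps the key out of the main configuration files. The value is still written to Terraform state, so the state backend should be access-controlled.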
Version migrations
The v0.1.0 → v0.2.0 change is a path-only refactor: the AWS configuration moved from the repo root into an aws/ subdirectory, so the module source URL now needs the //aws suffix. No resource addresses changed, so existing state continues to apply cleanly.
After updating the source, terraform plan should report zero changes. If it shows any resource being destroyed, recreated, or replaced, stop and investigate before running apply.
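The source change can be sketched as follows. The repository URL is a hypothetical placeholder, and the ?ref= pins assume the versions are published as Git tags. Only the //aws subdirectory suffix comes from this guide.

```hcl
module "espresso_proxy" {
  # v0.1.0: configuration lived at the repository root (placeholder URL).
  # source = "git::https://example.com/your-org/espresso-proxy-terraform.git?ref=v0.1.0"

  # v0.2.0: configuration lives under aws/, selected with the //aws suffix.
  source = "git::https://example.com/your-org/espresso-proxy-terraform.git//aws?ref=v0.2.0"

  # All other arguments stay exactly as they were.
}
```

Because the module source changed, Terraform needs terraform init to be run again before terraform plan.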