<aside> ℹ️
Kubernetes Metrics Server Installation Guide: https://kubernetes-sigs.github.io/metrics-server/
</aside>
Before creating the HPAs, deploy the Metrics Server in your cluster to collect the metrics used for scaling:
Download the manifest file:
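For example, the release manifest can be fetched directly from GitHub (the URL below is the one referenced in the installation guide linked above; adjust it if you pin a specific version):
# Download the latest Metrics Server manifest
curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml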
Create the necessary private ECR repositories for the images referenced in the manifest file through the console or CLI, then reference them in Terraform if needed:
kubernetes-sigs/metrics-server (in components.yaml)

/* modules/ecr/main.tf */

## Metrics Server
data "aws_ecr_repository" "metrics_server" {
  name = "kubernetes-sigs/metrics-server"
}

/* modules/ecr/outputs.tf */

output "helper_urls" {
  value = {
    # ...
    metrics_server = data.aws_ecr_repository.metrics_server.repository_url
  }
}
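If you prefer the CLI over the console, the repository can be created like this; a minimal sketch, where the repository name must match the one the data source above expects:
# Create the private ECR repository for the Metrics Server image
aws ecr create-repository \
  --repository-name kubernetes-sigs/metrics-server \
  --region us-east-1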
terraform validate && terraform fmt
terraform plan -out tf.plan
terraform apply "tf.plan"
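Pushing to a private ECR registry requires the local Docker client to be authenticated first; assuming the us-east-1 region and account ID used throughout this guide:
# Authenticate Docker to the private ECR registry (replace <aws_account_id> with your own)
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com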
From the local machine, pull the image indicated in the manifest file, then push it to your own private repository:
docker pull --platform linux/arm64 registry.k8s.io/metrics-server/metrics-server:v0.8.0
docker tag registry.k8s.io/metrics-server/metrics-server:v0.8.0 <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.8.0
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.8.0
Modify the components.yaml manifest file:
# components.yaml
image: <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.8.0
Then apply the Metrics Server to the cluster:
# From the bastion host
mkdir -p helpers/metrics-server
# From local machine, copy the file through SCP to the bastion host
scp -i <path_to_access_key> ./components.yaml ec2-user@<bastion_eks_instance_id>:/home/ec2-user/helpers/metrics-server
# From bastion host
kubectl apply -f ./helpers/metrics-server/components.yaml
# Verify the metrics server
kubectl get deployment metrics-server -n kube-system
kubectl top nodes
Create a manifest to scale each app's deployment. Note that Utilization targets are measured against the pods' resource requests, so each Deployment must define CPU and memory requests:
# hpa-frontend.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coffeeshop-frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coffeeshop-frontend-deploy
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
# From local machine
scp -i <path_to_access_key> ./hpa-*.yaml ec2-user@<bastion_eks_instance_id>:/home/ec2-user/manifests
# From bastion host (kubectl does not expand globs itself, so let the shell iterate)
for f in ./manifests/hpa-*.yaml; do kubectl apply -f "$f"; done
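Once applied, you can confirm that each HPA is tracking live metrics (the TARGETS column should show percentages instead of <unknown>):
# From bastion host
kubectl get hpa
kubectl describe hpa coffeeshop-frontend-hpa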
<aside> ℹ️
Cluster Autoscaler on AWS Installation Guide: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
</aside>
Create the necessary IAM permissions for the CA to access AWS resources:
Create the role with an inline policy for the CA:
/* modules/eks/main.tf */

# Cluster Autoscaler
## IAM Role for CA
resource "aws_iam_role" "cluster_autoscaler_role" {
  name = "${var.project_name}-cluster-autoscaler-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRoleWithWebIdentity"
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.eks_cluster.arn
        }
        Condition = {
          StringEquals = {
            "${replace(aws_iam_openid_connect_provider.eks_cluster.url, "https://", "")}:sub" = "system:serviceaccount:kube-system:cluster-autoscaler"
            "${replace(aws_iam_openid_connect_provider.eks_cluster.url, "https://", "")}:aud" = "sts.amazonaws.com"
          }
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "cluster_autoscaler_policy" {
  name = "EKSClusterAutoscaler"
  role = aws_iam_role.cluster_autoscaler_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "autoscaling:DescribeAutoScalingGroups",
          "autoscaling:DescribeAutoScalingInstances",
          "autoscaling:DescribeLaunchConfigurations",
          "autoscaling:DescribeScalingActivities",
          "ec2:DescribeImages",
          "ec2:DescribeInstanceTypes",
          "ec2:DescribeLaunchTemplateVersions",
          "ec2:GetInstanceTypesFromInstanceRequirements",
          "eks:DescribeNodegroup"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "autoscaling:SetDesiredCapacity",
          "autoscaling:TerminateInstanceInAutoScalingGroup"
        ]
        Resource = "*"
      }
    ]
  })
}
Output the role's ARN so we can annotate the k8s ServiceAccount with it later (for IRSA):
/* modules/eks/outputs.tf */

output "cluster_autoscaler_role_arn" {
  value = aws_iam_role.cluster_autoscaler_role.arn
}

/* outputs.tf (root) */

output "cluster_autoscaler_role_arn" {
  value = module.eks.cluster_autoscaler_role_arn
}
terraform validate && terraform fmt
terraform plan -out tf.plan
terraform apply "tf.plan"
We will create the k8s ServiceAccount for the Cluster Autoscaler later (by modifying the existing manifest file from the official guide).
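When that time comes, the ARN to paste into the ServiceAccount annotation can be read back from the root outputs:
terraform output cluster_autoscaler_role_arn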
Create a VPC Endpoint for the autoscaling service so the private cluster can reach the EC2 Auto Scaling API:

/* modules/eks/main.tf */
resource "aws_vpc_endpoint" "autoscaling" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region_primary}.autoscaling"
vpc_endpoint_type = "Interface"
subnet_ids = [
var.subnet_ids.eks1,
var.subnet_ids.eks2
]
security_group_ids = [aws_eks_cluster.main.vpc_config[0].cluster_security_group_id]
private_dns_enabled = true
tags = {
Name = "${var.project_name}-endpoint-autoscaling"
}
}
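Once applied, you can sanity-check the endpoint from the AWS CLI (region assumed to be us-east-1, as elsewhere in this guide):
# Confirm the Interface endpoint for the Auto Scaling API exists and is available
aws ec2 describe-vpc-endpoints \
  --filters Name=service-name,Values=com.amazonaws.us-east-1.autoscaling \
  --query 'VpcEndpoints[].State'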
Install the Cluster Autoscaler (Auto-Discovery Setup method):
Download the manifest file:
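For example (URL as referenced in the installation guide linked above; pin a tag instead of master if you want a fixed version):
curl -LO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml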
Create the necessary private ECR repositories for the images referenced in the manifest file through the console or CLI, then reference them in Terraform if needed:
autoscaling/cluster-autoscaler (in cluster-autoscaler-autodiscover.yaml)

/* modules/ecr/main.tf */

## Cluster Autoscaler
data "aws_ecr_repository" "cluster_autoscaler" {
  name = "autoscaling/cluster-autoscaler"
}

/* modules/ecr/outputs.tf */

output "helper_urls" {
  value = {
    # ...
    cluster_autoscaler = data.aws_ecr_repository.cluster_autoscaler.repository_url
  }
}
terraform validate && terraform fmt
terraform plan -out tf.plan
terraform apply "tf.plan"
From the local machine, pull the image indicated in the manifest file, then push it to your own repository:
docker pull --platform linux/arm64 registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1
docker tag registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1 <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.32.1
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.32.1
Modify the cluster-autoscaler-autodiscover.yaml manifest file. Besides changing the image path, there are three IMPORTANT changes you need to make:
1. Change the cluster name in the --node-group-auto-discovery tags, so the CA discovers the correct Auto Scaling groups.
2. Add the eks.amazonaws.com/role-arn annotation (the role ARN output earlier) to the CA's ServiceAccount, so IRSA grants it the IAM permissions.
3. Add the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation to the CA's Deployment pod template, to make sure the CA does not accidentally evict the node that its own pod is running on.
# cluster-autoscaler-autodiscover.yaml
# Change the image to the private ECR image
image: <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.32.1

# Change the cluster name in the Auto Scaling group's tag
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eks-demo-cluster

## Add annotation to the ServiceAccount
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: <cluster_autoscaler_role_arn>
---
## Add annotation to the Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  # ...
  template:
    metadata:
      # ...
      annotations:
        # ...
        cluster-autoscaler.kubernetes.io/safe-to-evict: 'false'
# ...
Then apply the CA to the cluster:
# From the bastion host
mkdir -p helpers/cluster-autoscaler
# From local machine, copy the file through SCP to the bastion host
scp -i <path_to_access_key> ./cluster-autoscaler-autodiscover.yaml ec2-user@<bastion_eks_instance_id>:/home/ec2-user/helpers/cluster-autoscaler
# From bastion host
kubectl apply -f ./helpers/cluster-autoscaler/cluster-autoscaler-autodiscover.yaml
# Verify the Cluster Autoscaler
kubectl get deployment -n kube-system cluster-autoscaler
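It is also worth confirming that the CA pod is running and that the IRSA annotation landed on the ServiceAccount:
# From bastion host
kubectl get pods -n kube-system -l app=cluster-autoscaler
kubectl describe serviceaccount cluster-autoscaler -n kube-system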
For extra measure, check that the current node group's Auto Scaling group (ASG) has these two tags, so the CA knows which ASGs to scale. If not, add them:
k8s.io/cluster-autoscaler/enabled: true
k8s.io/cluster-autoscaler/eks-demo-cluster: owned
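These can be checked or added from the AWS CLI as well; a sketch, assuming <asg_name> is the name of your node group's Auto Scaling group:
# Check the existing tags on the node group's ASG
aws autoscaling describe-tags \
  --filters Name=auto-scaling-group,Values=<asg_name>

# Add the discovery tags if they are missing
aws autoscaling create-or-update-tags --tags \
  ResourceId=<asg_name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true \
  ResourceId=<asg_name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/eks-demo-cluster,Value=owned,PropagateAtLaunch=true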
You can test scaling: temporarily remove the HPA, then manually scale up one of the apps' deployments:
kubectl delete hpa coffeeshop-frontend-hpa
kubectl scale --replicas=10 deployment/coffeeshop-frontend-deploy
Some pods should be stuck in Pending because, with only 2 nodes, there are no compute resources left for them.
→ The CA will scale up gradually by adding more nodes (instances), and stop when it reaches the maximum size of the node group (= 6 according to the cluster configuration back in step 3.2) or when there are no more pending pods in the cluster.
# You can check the CA logs to verify
kubectl get pods -n kube-system
kubectl logs -f <cluster_autoscaler_pod_name> -n kube-system
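After the test, you can scale the deployment back down and re-apply the HPA manifest from earlier so autoscaling takes over again:
kubectl scale --replicas=2 deployment/coffeeshop-frontend-deploy
kubectl apply -f ./manifests/hpa-frontend.yaml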


