Before creating the HPAs, deploy the Metrics Server in your cluster (https://kubernetes-sigs.github.io/metrics-server/) so the HPA controller has resource metrics to scale on:
Create a private ECR repository called kubernetes-sigs/metrics-server
From the local machine, pull the image referenced in the manifest file, then push it to your own repository → the private EKS cluster can then fetch it through the ECR VPC endpoint.
docker pull --platform linux/arm64 registry.k8s.io/metrics-server/metrics-server:v0.7.2
docker tag registry.k8s.io/metrics-server/metrics-server:v0.7.2 <aws_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.7.2
docker push <aws_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.7.2
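The pull/tag/push steps above can be wrapped in a small helper script; this is a sketch in which the ECR registry value is a placeholder and the `ecr_ref` function (a name introduced here, not from the original) only assembles the target image reference:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the private ECR reference for a mirrored image:
#   ecr_ref <ecr_registry> <repo_name> <source_image_ref>
ecr_ref() {
  local tag="${3##*:}"   # take the tag after the last ':'
  printf '%s/%s:%s\n' "$1" "$2" "$tag"
}

# Hypothetical usage -- the <aws_id> registry is a placeholder:
#   SRC=registry.k8s.io/metrics-server/metrics-server:v0.7.2
#   DST="$(ecr_ref <aws_id>.dkr.ecr.us-east-1.amazonaws.com kubernetes-sigs/metrics-server "$SRC")"
#   docker pull --platform linux/arm64 "$SRC"
#   docker tag "$SRC" "$DST"
#   docker push "$DST"
```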
Modify the components.yaml manifest file:
# components.yaml
# Change the deployment's image to the private ECR image
image: <aws_id>.dkr.ecr.us-east-1.amazonaws.com/kubernetes-sigs/metrics-server:v0.7.2
Then apply the Metrics Server to the cluster:
# From local machine, copy the file through SCP to the bastion host
scp -i <path_to_access_key> ./components.yaml ec2-user@<instance_id>:/home/ec2-user/helpers/metrics-components.yaml
# From bastion host
kubectl apply -f ./helpers/metrics-components.yaml
# Verify the metrics server
kubectl get deployment metrics-server -n kube-system
kubectl top nodes
Create an HPA manifest for each app's deployment:
# hpa-frontend.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coffeeshop-frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coffeeshop-frontend-deploy
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
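Utilization targets are computed as a percentage of the pods' resource requests, so each target deployment must declare them or the HPA will report unknown metrics. A minimal sketch of the relevant deployment excerpt (the container name and request values are assumptions, not from the original):

```yaml
# Excerpt from the frontend deployment (hypothetical values);
# the HPA's averageUtilization is measured against these requests.
spec:
  template:
    spec:
      containers:
        - name: frontend
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
```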
# From local machine
scp -i <path_to_access_key> ./hpa-*.yaml ec2-user@<instance_id>:/home/ec2-user/manifests
# From bastion host
for f in ./manifests/hpa-*.yaml; do kubectl apply -f "$f"; done
Create the necessary IAM permissions for CA to access AWS resources (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md):
Create an IAM policy called ClusterAutoscalerEKSDemoPolicy for the CA:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:GetInstanceTypesFromInstanceRequirements",
        "eks:DescribeNodegroup"
      ],
      "Resource": ["*"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": ["*"]
    }
  ]
}
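Saved to a local JSON file, the policy can be created with the AWS CLI. This is a sketch: the file name ca-policy.json is an assumption, and the `policy_arn` helper (introduced here) only derives the ARN that the eksctl step below expects:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Derive the policy ARN from the account id and policy name
# (this is the value later passed to --attach-policy-arn).
policy_arn() {
  printf 'arn:aws:iam::%s:policy/%s\n' "$1" "$2"
}

# Hypothetical usage -- assumes the JSON above was saved as ca-policy.json:
#   aws iam create-policy \
#     --policy-name ClusterAutoscalerEKSDemoPolicy \
#     --policy-document file://ca-policy.json
#   policy_arn <AWS_ACCOUNT_ID> ClusterAutoscalerEKSDemoPolicy
```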
From the bastion host, create the service account for CA with eksctl:
eksctl create iamserviceaccount \
  --cluster eks-demo-cluster \
  --namespace kube-system \
  --name cluster-autoscaler \
  --attach-policy-arn arn:aws:iam::<AWS_ACCOUNT_ID>:policy/ClusterAutoscalerEKSDemoPolicy \
  --override-existing-serviceaccounts \
  --region us-east-1 \
  --approve
If the cluster is private-only, also create an EC2 Auto Scaling interface VPC endpoint so the CA can reach the Auto Scaling API from the private subnets:
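A sketch of creating an EC2 Auto Scaling interface endpoint with the AWS CLI: the VPC, subnet, and security-group IDs are placeholders, and the `autoscaling_service_name` helper (introduced here) only formats the regional service name:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Interface-endpoint service name for the Auto Scaling API in a region.
autoscaling_service_name() {
  printf 'com.amazonaws.%s.autoscaling\n' "$1"
}

# Hypothetical usage -- the IDs below are placeholders:
#   aws ec2 create-vpc-endpoint \
#     --vpc-id <vpc_id> \
#     --vpc-endpoint-type Interface \
#     --service-name "$(autoscaling_service_name us-east-1)" \
#     --subnet-ids <private_subnet_ids> \
#     --security-group-ids <endpoint_sg_id> \
#     --private-dns-enabled
```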
Install the Cluster Autoscaler:
Download the cluster-autoscaler-autodiscover.yaml example manifest from the kubernetes/autoscaler repository (under cluster-autoscaler/cloudprovider/aws/examples).
Create a private ECR repository called: autoscaling/cluster-autoscaler
From the local machine, pull the image referenced in the manifest file, then push it to your own repository:
docker pull --platform linux/arm64 registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
docker tag registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2 <aws_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.26.2
docker push <aws_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.26.2
Modify the cluster-autoscaler-autodiscover.yaml manifest file:
# cluster-autoscaler-autodiscover.yaml
# Change the deployment's image to the private ECR image
image: <aws_id>.dkr.ecr.us-east-1.amazonaws.com/autoscaling/cluster-autoscaler:v1.26.2
# Change the cluster name in the container's command
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eks-demo-cluster
Then apply the CA to the cluster:
# From local machine, copy the file through SCP to the bastion host
scp -i <path_to_access_key> ./cluster-autoscaler-autodiscover.yaml ec2-user@<instance_id>:/home/ec2-user/helpers/cluster-autoscaler-autodiscover.yaml
# From bastion host
kubectl apply -f ./helpers/cluster-autoscaler-autodiscover.yaml
# Verify the Cluster Autoscaler
kubectl get deployment -n kube-system cluster-autoscaler
Make sure to add this annotation, to prevent the CA from removing the node where its own pod is running:
kubectl -n kube-system \
  annotate deployment.apps/cluster-autoscaler \
  cluster-autoscaler.kubernetes.io/safe-to-evict="false"
Check whether the current node group's Auto Scaling group (ASG) has these two tags, so the CA knows which ASGs to scale. If not, add them:
k8s.io/cluster-autoscaler/enabled: true
k8s.io/cluster-autoscaler/eks-demo-cluster: owned
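If the tags are missing, they can be added with the AWS CLI. This is a sketch: the ASG name is a placeholder, and the `asg_tag` helper (introduced here) only formats the tag argument string:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Format one ASG tag for aws autoscaling create-or-update-tags:
#   asg_tag <asg_name> <key> <value>
asg_tag() {
  printf 'ResourceId=%s,ResourceType=auto-scaling-group,Key=%s,Value=%s,PropagateAtLaunch=true\n' \
    "$1" "$2" "$3"
}

# Hypothetical usage -- <asg_name> is a placeholder:
#   aws autoscaling create-or-update-tags --tags \
#     "$(asg_tag <asg_name> k8s.io/cluster-autoscaler/enabled true)" \
#     "$(asg_tag <asg_name> k8s.io/cluster-autoscaler/eks-demo-cluster owned)"
```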
You can test scaling:
Temporarily remove the HPA, then scale a deployment up:
kubectl delete hpa coffeeshop-frontend-hpa
kubectl scale --replicas=10 deployment/coffeeshop-frontend-deploy
The cluster configuration back in step 3.2 set the node group's max size to 6 → the system will scale out gradually and stop once the node group reaches 6 nodes, or when there are no more pending pods in the cluster. Follow the CA's decisions in its logs:
kubectl logs -f <cluster_autoscaler_pod_name> -n kube-system