• Create a VPC endpoint so that the private cluster can reach CloudWatch Logs:

    • Service: com.amazonaws.us-east-1.logs
    • VPC & Subnet: EKS subnet 1 & EKS subnet 2
    • Security group: ClusterSharedNodeSecurityGroup
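
    The same endpoint can also be created from the AWS CLI. The following is only a minimal sketch: the function name `create_logs_endpoint` and the example IDs in the comments are hypothetical, and it assumes an interface endpoint with private DNS enabled (which in turn assumes DNS support and DNS hostnames are turned on for the VPC).

    ```shell
    # Hypothetical helper: create an interface endpoint for CloudWatch Logs.
    # $1 = VPC ID, $2 = security group ID, remaining arguments = subnet IDs.
    create_logs_endpoint() {
      local vpc_id="$1" sg_id="$2"
      shift 2
      aws ec2 create-vpc-endpoint \
        --vpc-id "$vpc_id" \
        --vpc-endpoint-type Interface \
        --service-name com.amazonaws.us-east-1.logs \
        --subnet-ids "$@" \
        --security-group-ids "$sg_id" \
        --private-dns-enabled
    }

    # Example call (placeholder IDs, replace with the values of your cluster):
    #   create_logs_endpoint vpc-EXAMPLE sg-EXAMPLE subnet-EXAMPLE1 subnet-EXAMPLE2
    ```

    Pass the IDs of the two EKS subnets and of ClusterSharedNodeSecurityGroup listed above.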
  • Install the CloudWatch Observability EKS add-on to enable CloudWatch Container Insights:

    • Create an IAM role for the CloudWatch agent:

      eksctl create iamserviceaccount \
        --name cloudwatch-agent \
        --namespace amazon-cloudwatch \
        --cluster eks-demo-cluster \
        --role-name eksctl-cloudwatch-agent-role \
        --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
        --override-existing-serviceaccounts \
        --region us-east-1 \
        --role-only \
        --approve
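
    • Optionally, look up the ARN of the role that was just created instead of assembling it by hand. This is a small sketch; the function name `agent_role_arn` is hypothetical, and it assumes your AWS CLI credentials are allowed to call `iam get-role`:

      ```shell
      # Hypothetical helper: print the ARN of an IAM role.
      # $1 = role name
      agent_role_arn() {
        aws iam get-role \
          --role-name "$1" \
          --query 'Role.Arn' \
          --output text
      }

      # Example call:
      #   ROLE_ARN=$(agent_role_arn eksctl-cloudwatch-agent-role)
      ```

      The resulting value can then be passed to `--service-account-role-arn` when creating the add-on.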
      
    • Create the add-on to enable the CloudWatch agent on every node:

      aws eks create-addon \
        --addon-name amazon-cloudwatch-observability \
        --cluster-name eks-demo-cluster \
        --service-account-role-arn arn:aws:iam::<aws_account_id>:role/eksctl-cloudwatch-agent-role
        
      # Verify add-on
      kubectl get pods -n amazon-cloudwatch
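
    • Optionally, wait until the add-on reports as active before checking the console. The sketch below uses the `addon-active` waiter of the AWS CLI followed by `describe-addon`; the function name `wait_for_addon` is hypothetical:

      ```shell
      # Hypothetical helper: block until an EKS add-on is active, then print its status.
      # $1 = cluster name, $2 = add-on name
      wait_for_addon() {
        aws eks wait addon-active \
          --cluster-name "$1" \
          --addon-name "$2" &&
        aws eks describe-addon \
          --cluster-name "$1" \
          --addon-name "$2" \
          --query 'addon.status' \
          --output text
      }

      # Example call:
      #   wait_for_addon eks-demo-cluster amazon-cloudwatch-observability
      ```

      If the waiter times out, the status is not printed and the function returns a non-zero exit code.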
      
  • Check Container Insights in the console:

    • You can see all the clusters that have been created in your account. In this example, there is only the one cluster created earlier

      image.png

    • You can find the metrics for an individual node or pod of the cluster in the performance monitoring dashboard

      image.png

  • Check application metrics and logs:

    • The CloudWatch Observability add-on that was just added to the cluster offers two types of observability: granular container-level monitoring with Container Insights and application-level monitoring with Application Signals

      image.png

    • To check the application metrics, select each service you want to monitor and enable Application Signals for it, then wait a few minutes for the agent to restart that service's containers in your cluster

      image.png

      image.png

    • Then, for each service, you can check its service operations to see the health and latency of its API requests

      image.png

  • You can also check the application logs at a specific point in time by clicking a trace in the graph; CloudWatch then runs the query for you

    image.png

    image.png
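
  • A roughly similar log search can be run from the CLI with CloudWatch Logs Insights. This is a sketch and not necessarily the query the console generates: the function name `query_app_logs` is hypothetical, and it assumes the container logs land in the log group `/aws/containerinsights/<cluster>/application` and that a fixed five-second pause is enough for the query to finish:

    ```shell
    # Hypothetical helper: search recent application logs with Logs Insights.
    # $1 = cluster name, $2 = search term, $3 = lookback window in seconds (default 900)
    query_app_logs() {
      local cluster="$1" term="$2" lookback="${3:-900}"
      local now query_id
      now=$(date +%s)
      # Start the query over the assumed Container Insights application log group.
      query_id=$(aws logs start-query \
        --log-group-name "/aws/containerinsights/${cluster}/application" \
        --start-time "$((now - lookback))" \
        --end-time "$now" \
        --query-string "fields @timestamp, @message | filter @message like /${term}/ | sort @timestamp desc | limit 20" \
        --query 'queryId' \
        --output text)
      # Give the query a moment to run; a long query may still be in progress after this.
      sleep 5
      aws logs get-query-results --query-id "$query_id"
    }

    # Example call:
    #   query_app_logs eks-demo-cluster ERROR 3600
    ```

    If the returned status is still "Running", call `aws logs get-query-results` again with the same query ID after a short wait.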