# Logging in Amazon EKS
<a name="logging"></a>

Logging is a critical aspect of managing and maintaining applications that run on Amazon EKS. Effective logging practices in Amazon EKS environments help developers, operations teams, and system administrators gain valuable insights into the behavior, performance, and health of their containerized applications and their underlying infrastructure.

Implementing a robust logging strategy in Amazon EKS is essential for several reasons:
+ **Troubleshooting**: Logs help identify and diagnose issues quickly, which reduces downtime and improves overall system reliability.
+ **Compliance**: Many industries require comprehensive logging for auditing and regulation purposes.
+ **Security**: Log analysis can help you detect and investigate potential security threats or breaches.
+ **Performance optimization**: Logs provide insights into application and system performance, so you can identify bottlenecks and optimize resource utilization.
+ **Monitoring and alerting**: Log data can be used to set up monitoring systems and trigger alerts for specific events or conditions.

**Topics**
+ [Types of logging](log-types.md)
+ [Best practices](logging-best-practices.md)
+ [Important considerations](logging-considerations.md)

# Types of logging in Amazon EKS
<a name="log-types"></a>

In Amazon EKS, logging involves capturing, storing, and analyzing various types of log data that's generated by different components of the [Kubernetes](https://kubernetes.io/) cluster, including:
+ **System logs**: Information about the underlying [Amazon Elastic Compute Cloud (Amazon EC2)](https://aws.amazon.com/pm/ec2/) instances or [AWS Fargate](https://aws.amazon.com/fargate/) nodes
+ **Kubernetes component logs**: Data from core Kubernetes components such as the [API server](https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/), [scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/), and [controller manager](https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/)
+ **Container runtime logs**: Information from the container runtime, such as [Docker](https://www.docker.com/blog/containerd-vs-docker/) or [containerd](https://containerd.io/)
+ **Application logs**: Output from containerized applications

To manage logs in your Amazon EKS environment effectively, you typically employ a combination of AWS services, third-party tools, and best practices. This might include using [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/), [Fluent Bit](https://fluentbit.io/), [Elasticsearch](https://www.elastic.co/elasticsearch), [Kibana](https://www.elastic.co/kibana), and other logging and analysis tools to collect, store, and visualize log data.

The following sections explore various aspects of logging in Amazon EKS, including best practices, tools, and techniques for implementing a comprehensive logging strategy in your Kubernetes clusters on AWS.

## System logs
<a name="system-logs"></a>

Logging for underlying EC2 instances or Fargate nodes in Amazon EKS involves different approaches depending on the node type.

To implement logging for EC2 instances in Amazon EKS, you can use the following tools:
+ [CloudWatch agent](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html): Install and configure the CloudWatch agent on your EC2 instances. Configure it to collect system logs such as `/var/log/messages` and `/var/log/secure`. You can use user data scripts or configuration management tools to automate this process.
+ [Fluent Bit](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-logs-FluentBit.html): Deploy Fluent Bit as a DaemonSet to collect logs from all nodes. Configure it to forward logs to [CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html) or other centralized logging systems.
+ [Container Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights.html): Enable Container Insights in your EKS cluster to automatically collect metrics and logs from EC2 instances.
+ Custom scripts: Develop custom scripts to collect specific logs and send them to your preferred logging destination.
+ [SSM Agent](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html): Use AWS Systems Manager Agent (SSM Agent) to collect and forward logs to CloudWatch Logs.
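
As an illustration of the CloudWatch agent option above, the following Python sketch generates the `logs` section of an agent configuration file for `/var/log/messages` and `/var/log/secure`. The function name, the cluster name, and the log group naming scheme are hypothetical choices made for this example. Check the generated JSON against the CloudWatch agent documentation before you deploy it.

```python
import json


def build_agent_log_config(log_group_prefix: str, paths: list[str]) -> dict:
    """Build the `logs` section of a CloudWatch agent configuration file.

    `log_group_prefix` is a hypothetical naming convention for this sketch.
    """
    collect_list = []
    for path in paths:
        collect_list.append({
            "file_path": path,
            # One log group per file, for example <prefix>/var/log/messages.
            "log_group_name": f"{log_group_prefix}{path}",
            # {instance_id} is a placeholder that the agent resolves at runtime.
            "log_stream_name": "{instance_id}",
        })
    return {"logs": {"logs_collected": {"files": {"collect_list": collect_list}}}}


# Example usage: print a configuration that a user data script could write to disk.
config = build_agent_log_config(
    "/eks/example-cluster/nodes",  # assumed prefix, not an AWS default
    ["/var/log/messages", "/var/log/secure"],
)
print(json.dumps(config, indent=2))
```

Generating the file from code keeps the list of collected paths in one place, which helps when the same configuration is applied through user data scripts or configuration management tools.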

To implement logging for Fargate nodes in Amazon EKS, use these tools:
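+ [Fargate logging](https://docs.aws.amazon.com/eks/latest/userguide/fargate-logging.html): Amazon EKS on Fargate provides a built-in log router based on Fluent Bit that captures the `stdout` and `stderr` output of your containers. To send these logs to CloudWatch Logs, create a ConfigMap named `aws-logging` in the `aws-observability` namespace and grant the Fargate pod execution role permission to write to CloudWatch Logs.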
+ [Fluent Bit for Fargate](https://github.com/aws/aws-for-fluent-bit): AWS provides a Fluent Bit image specifically for Fargate logging. Deploy it as a sidecar container in your Fargate pods to collect and forward logs.
+ [Container Insights for Fargate](https://aws-otel.github.io/docs/getting-started/container-insights/eks-fargate): Enable Container Insights to collect metrics and logs from Fargate nodes.

## Kubernetes component logs
<a name="kubernetes-logs"></a>

Collecting logs from Kubernetes components such as the API server, scheduler, and controller manager in Amazon EKS requires a slightly different approach from application logging. These components run as part of the Amazon EKS control plane, which is managed by AWS. Here's how you can collect and access these logs:
+ **Enable control plane logging**: You can enable control plane logging for your EKS cluster through the AWS Management Console, [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html), or infrastructure as code (IaC) tools such as [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) or Terraform. When you enable control plane logging, the logs are sent to Amazon CloudWatch Logs. You can view them in the CloudWatch console in the `/aws/eks/<cluster-name>/cluster` log group. Within this log group, each control plane component writes to its own log streams. For the list of log stream names, see [the AWS documentation website](http://docs.aws.amazon.com/prescriptive-guidance/latest/amazon-eks-observability-best-practices/log-types.html).

  To view logs for a specific component, navigate to the cluster log group and filter by the target log stream name.
+ **Use CloudWatch Logs Insights**: You can use [CloudWatch Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html) to perform complex queries on your logs.
+ **Export logs to Amazon S3**: For long-term storage or further analysis, you can export logs to [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html).
+ **Use third-party tools**: You can use tools such as Fluent Bit to collect these logs and forward them to other logging systems such as Elasticsearch or Splunk.
+ **Use AWS CloudTrail**: The [AWS CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html) service can provide additional insights into API calls made to your EKS cluster.
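
The first option above, enabling control plane logging, can also be scripted. The following Python sketch builds the `logging` payload for the Amazon EKS `UpdateClusterConfig` API and derives the log group name described earlier. The helper names are hypothetical. The five log type identifiers come from the Amazon EKS API rather than from this guide, so confirm them for your cluster version. The call itself uses `boto3`, a third-party package that must be installed separately and that needs AWS credentials.

```python
# Log type identifiers accepted by the Amazon EKS API (assumed current; verify before use).
CONTROL_PLANE_LOG_TYPES = ("api", "audit", "authenticator", "controllerManager", "scheduler")


def control_plane_log_group(cluster_name: str) -> str:
    """Return the CloudWatch Logs group that receives control plane logs."""
    return f"/aws/eks/{cluster_name}/cluster"


def build_cluster_logging(enabled_types) -> dict:
    """Build the `logging` argument that enables the given control plane log types."""
    enabled = list(enabled_types)
    unknown = set(enabled) - set(CONTROL_PLANE_LOG_TYPES)
    if unknown:
        raise ValueError(f"unknown control plane log types: {sorted(unknown)}")
    return {"clusterLogging": [{"types": enabled, "enabled": True}]}


def enable_control_plane_logging(cluster_name: str, enabled_types) -> dict:
    """Apply the logging configuration to a cluster (requires boto3 and credentials)."""
    import boto3  # imported here so that the helpers above work without boto3

    eks = boto3.client("eks")
    return eks.update_cluster_config(
        name=cluster_name,
        logging=build_cluster_logging(enabled_types),
    )
```

For example, `enable_control_plane_logging("example-cluster", ["api", "audit"])` would request API server and audit logs for a hypothetical cluster named `example-cluster`, and `control_plane_log_group("example-cluster")` returns the log group to open in the CloudWatch console.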

## Container runtime logs
<a name="runtime-logs"></a>

Container runtime logging in Amazon EKS involves capturing and managing logs from the container runtime, which is typically `containerd`. You can approach it in the following ways:
+ Directly access the logs on Amazon EC2 nodes. For self-managed EC2 nodes, you can directly access the container runtime logs on the host from these locations:
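  + Container logs written through `containerd`: `/var/log/containers/` (symbolic links to the files under `/var/log/pods/`). The `containerd` daemon's own logs typically go to the systemd journal (for example, `journalctl -u containerd`).
  + Docker logs (if you're using the Docker runtime, which applies only to clusters that run Kubernetes versions earlier than 1.24): `/var/log/docker.log`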
+ Use a DaemonSet for log collection. Deploy a log collection agent (such as Fluent Bit) as a DaemonSet to collect logs from all nodes.
+ Configure the CloudWatch agent to collect container runtime logs.
+ Enable Container Insights to collect container runtime metrics and logs.
+ Use Fargate. For Fargate nodes, container runtime logs are automatically collected and can be accessed through CloudWatch Logs.
+ Implement custom logging solutions by using tools such as Fluent Bit or Logstash.
+ Set up [CloudWatch alarms](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html) or use tools such as Prometheus to monitor for specific patterns or issues in container runtime logs.
+ Consider using third-party logging solutions that integrate well with Kubernetes and Amazon EKS, such as Datadog, Splunk, or the Elastic Stack (ELK Stack).
+ Use log aggregation tools to collect logs from multiple sources and forward them to a centralized logging system.
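
If you read the files under `/var/log/containers/` directly or write a custom collector, it helps to know their layout. On `containerd` nodes, each line generally follows the CRI logging format: a timestamp, the stream name, a tag (`P` for a partial line, `F` for a full line), and the message. The following Python sketch parses one such line. The class and function names are hypothetical, and the format should be verified against the files on your own nodes.

```python
from typing import NamedTuple


class CriLogLine(NamedTuple):
    timestamp: str  # RFC 3339 timestamp written by the runtime
    stream: str     # "stdout" or "stderr"
    partial: bool   # True when the runtime split a long line (tag "P")
    message: str


def parse_cri_line(line: str) -> CriLogLine:
    """Parse one line in the CRI logging format: <timestamp> <stream> <P|F> <message>."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) == 3:
        parts.append("")  # the message can be empty
    if len(parts) != 4:
        raise ValueError(f"not a CRI log line: {line!r}")
    timestamp, stream, tag, message = parts
    if stream not in ("stdout", "stderr"):
        raise ValueError(f"unexpected stream: {stream!r}")
    if tag not in ("P", "F"):
        raise ValueError(f"unexpected tag: {tag!r}")
    return CriLogLine(timestamp, stream, tag == "P", message)
```

Log agents such as Fluent Bit ship with a parser for this format, so a hand-written parser is mainly useful for ad hoc debugging on a node.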

## Application logs
<a name="app-logs"></a>

Application logs in Amazon EKS are a crucial part of maintaining and troubleshooting your applications. To implement application logging in Amazon EKS, you can choose from these options:
+ Write logs to `stdout`/`stderr`: The simplest and most Kubernetes-native way to handle application logs is to write them to `stdout` and `stderr`. Kubernetes automatically captures these streams.
+ Implement log aggregation: Use a log aggregator such as Fluent Bit to collect logs from all your pods.
+ Configure log routing: Configure your log aggregator to route logs to your desired destination (such as CloudWatch Logs or Elasticsearch).
+ Use CloudWatch Container Insights: Enable Container Insights for comprehensive logging and monitoring.
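
The first option, writing to `stdout`, combines well with a structured format. The following Python sketch uses only the standard library to emit one JSON object per log line. The field names (`timestamp`, `level`, `logger`, `message`) and the helper names are illustrative choices for this example, not a format that Amazon EKS or Kubernetes requires.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or to `stream` if given)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger
```

An application would call `build_logger("orders")` once at startup and then log as usual, for example `logger.info("order %s created", order_id)`. A log aggregator can then parse each line as JSON without custom regular expressions.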

# Best practices for logging in Amazon EKS
<a name="logging-best-practices"></a>

The following best practices help create a robust, scalable, and efficient logging system for your Amazon EKS environment, and provide better troubleshooting, monitoring, and overall management of your Kubernetes clusters.
+ **Centralize log collection**: Use a centralized logging solution such as CloudWatch Logs, Elasticsearch, or a third-party service to aggregate logs from all components. This provides a single point of access for log analysis and simplifies management.
+ **Implement structured logging**: Use structured log formats such as JSON so that logs can be parsed and searched more easily. Include relevant metadata such as timestamps, log levels, and source identifiers.
+ **Use log levels appropriately**: Implement proper log levels (such as `DEBUG`, `INFO`, `WARN`, and `ERROR`) in your applications. Configure production environments to log at appropriate levels to avoid excessive logging.
+ **Log to `stdout` and `stderr`**: Configure applications and containers to write logs to `stdout` and `stderr` instead of to log files. Kubernetes can then capture these streams and forward them to your chosen logging solution. This approach follows the [12-factor app methodology](https://12factor.net/logs) and aligns with cloud-native best practices.
+ **Use Kubernetes DaemonSets for log collection**: Deploy log collection agents (such as Fluent Bit) as DaemonSets to ensure that they run on every node in your cluster.
+ **Implement retention policies**: Define and enforce log retention policies to comply with regulations and to manage storage costs.
+ **Secure log data**: Encrypt logs in transit and at rest. Implement access controls to restrict who can view and manage logs.
+ **Monitor log ingestion**: Set up alerts for log ingestion failures or delays to ensure continuous logging.
+ **Use Kubernetes annotations and labels**: Use Kubernetes annotations and labels to enrich your logs with metadata, which improves searchability and filtering.
+ **Implement distributed tracing**: Use distributed tracing tools such as [AWS X-Ray](https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html) or Jaeger to correlate logs across microservices.
+ **Optimize log volume**: Be selective about what you log to avoid unnecessary costs and performance issues. Use sampling for high-volume, low-value logs.
+ **Implement log aggregation**: Use tools such as Logstash to aggregate logs from multiple sources before sending them to your central logging system.
+ **Use AWS services when possible**: Services such as CloudWatch Logs and Container Insights provide seamless integration with other AWS services.
+ **Implement log analysis and visualization**: Use tools such as CloudWatch Logs Insights, Elasticsearch with Kibana, or third-party solutions for log analysis and visualization.
+ **Implement automated log analysis**: Use machine learning and AI-powered tools to detect anomalies and patterns in your logs automatically.
+ **Document your logging strategy**: Maintain clear documentation of your logging architecture, practices, and tools for your team.
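
To make the log volume practice concrete, the following Python sketch shows one possible way to sample high-volume, low-value logs inside an application. It keeps every record at or above a threshold level and only one in every `rate` records below it. The class name and the counter-based approach are assumptions for this example. Random sampling, or sampling in the log agent instead of the application, are equally valid alternatives.

```python
import logging


class SamplingFilter(logging.Filter):
    """Keep all records at or above `min_level` and 1 in `rate` of the rest."""

    def __init__(self, rate: int, min_level: int = logging.WARNING):
        super().__init__()
        if rate < 1:
            raise ValueError("rate must be at least 1")
        self.rate = rate
        self.min_level = min_level
        self._seen = 0  # count of low-level records seen so far (not thread-safe)

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= self.min_level:
            return True
        keep = self._seen % self.rate == 0
        self._seen += 1
        return keep
```

Attach the filter to a handler with `handler.addFilter(SamplingFilter(rate=10))`. Warnings and errors pass through unchanged, while debug and informational records are reduced to roughly a tenth of their original volume.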

# Important considerations for logging in Amazon EKS
<a name="logging-considerations"></a>

This section discusses important considerations to keep in mind when you implement logging in Amazon EKS.
+ **Performance impact**: Excessive logging can affect application performance. Be mindful of the volume and frequency of logs generated.
+ **Cost management**: Log storage and processing can incur significant costs, especially at scale. Implement log retention policies and consider using log aggregation to reduce costs.
+ **Security and compliance**: Make sure that logs don't contain sensitive information such as passwords or personal data. Implement encryption for logs in transit and at rest. Consider compliance requirements such as General Data Protection Regulation (GDPR) or Health Insurance Portability and Accountability Act (HIPAA) when you handle logs.
+ **Scalability**: Make sure that your logging solution can scale with your cluster size and log volume. Consider using buffering and batching for log transmission.
+ **Log retention**: Define and implement appropriate log retention periods. Balance compliance requirements against storage costs.
+ **Access control**: Implement proper AWS Identity and Access Management (IAM) roles and policies for log access. Follow the [least privilege principle](https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/sec_permissions_least_privileges.html) for log management.
+ **Log consistency**: Use consistent log formats across different applications and services. Use structured logging for easier parsing and analysis.
+ **Time synchronization**: Synchronize time across all nodes to get consistent timestamps in logs.
+ **Resource allocation**: Allocate appropriate resources (such as CPU and memory) for logging agents. Monitor the resource usage of logging components.
+ **Fargate considerations**: Fargate has specific logging mechanisms that differ from EC2-based nodes. Understand the limitations and capabilities of [Fargate logging](https://docs.aws.amazon.com/eks/latest/userguide/fargate-logging.html).
+ **Multi-tenant clusters**: In multi-tenant environments, make sure that logs are properly isolated between tenants.
+ **Log parsing and analysis**: Consider the tools and skills required for effective log analysis. Implement log parsing for structured data extraction.
+ **Monitoring the logging system**: Set up monitoring for the logging infrastructure itself. Generate alerts for logging system failures or backlogs.
+ **Network impact**: Be aware of the network bandwidth used by log transmission. Consider using compression for log data.
+ **Kubernetes events**: Don't overlook Kubernetes events as a source of important information.
+ **Control plane logging**: Understand the implications and costs of enabling control plane logging.
+ **Debugging capabilities**: Make sure that your logging solution allows for easy debugging and troubleshooting.
+ **Integration with existing tools**: Consider how your Amazon EKS logging solution integrates with existing monitoring and alerting tools.
+ **Testing**: Regularly test your logging setup, especially after cluster upgrades.
+ **Documentation**: Maintain clear documentation of your logging architecture and practices.
+ **Log aggregation latency**: Be aware of any latency in log aggregation and how it might affect real-time monitoring.
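
The security and compliance consideration above can be partly addressed in the application before logs leave the pod. The following Python sketch redacts two kinds of sensitive values from log messages. The patterns (key-value pairs such as `password=...`, and email addresses) and all names are illustrative assumptions. A real deployment needs patterns that match its own data and should not rely on redaction as the only safeguard.

```python
import logging
import re

# Illustrative patterns only: extend them to match the data your services handle.
_PATTERNS = [
    (re.compile(r"(?i)\b(password|token|secret)=\S+"), r"\1=[REDACTED]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Replace sensitive substrings in `text` with placeholders."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Redact the fully formatted message of every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already formatted
        return True
```

Attaching `RedactingFilter()` to a handler applies the redaction to every record that the handler emits, including messages from third-party libraries that log through the same handler.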