

# Observability
<a name="observability"></a>

Observability is the extent to which a system's current state can be inferred from the data it emits. The data emitted is commonly referred to as telemetry.

The AWS SDK for Kotlin can provide all three common telemetry signals: metrics, traces, and logs. You can wire up a [`TelemetryProvider`](https://docs.aws.amazon.com/smithy-kotlin/api/latest/telemetry-api/aws.smithy.kotlin.runtime.telemetry/-telemetry-provider/index.html) to send telemetry data to an observability backend (such as [AWS X-Ray](https://docs.aws.amazon.com/xray/?icmpid=docs_homepage_devtools) or [Amazon CloudWatch](https://docs.aws.amazon.com/cloudwatch/?icmpid=docs_homepage_mgmtgov)) and then act on it.

By default, only logging is enabled and other telemetry signals are disabled in the SDK. This topic explains how to enable and configure telemetry output.

**Important**  
`TelemetryProvider` is currently an experimental API. You must explicitly opt in to use it.

## Configure a `TelemetryProvider`
<a name="observability-conf-telemetry-provider"></a>

You can configure a `TelemetryProvider` in your application globally for all service clients or for individual clients. The following examples use a hypothetical `getConfiguredProvider()` function to demonstrate the `TelemetryProvider` API operations. The [Telemetry providers](observability-telemetry-providers.md) section describes the implementations provided by the SDK. If a provider isn’t supported, you can implement your own support or [open a feature request on GitHub](https://github.com/awslabs/aws-sdk-kotlin/issues/new/choose).

### Configure the default global telemetry provider
<a name="observability-conf-telemetry-provider-global"></a>

By default, every service client attempts to use the globally available telemetry provider. This way, you can set the provider once, and all clients will use it. Set the global provider only once, before you instantiate any service clients.

To use the global telemetry provider, first update your project dependencies to add the telemetry defaults module as shown in the following Gradle snippet.

(Replace *X.Y.Z* with the latest version, which is listed on the [smithy-kotlin releases page](https://github.com/smithy-lang/smithy-kotlin/releases/latest).)

```
dependencies {
    implementation(platform("aws.smithy.kotlin:bom:X.Y.Z"))
    implementation("aws.smithy.kotlin:telemetry-defaults")
    ...
}
```

Then set the global telemetry provider before creating a service client as shown in the following code.

```
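import aws.sdk.kotlin.services.s3.S3Client
import aws.smithy.kotlin.runtime.telemetry.GlobalTelemetryProvider
import aws.smithy.kotlin.runtime.telemetry.TelemetryProvider
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Set the global provider once, before any service client is created.
    val myTelemetryProvider = getConfiguredProvider()
    GlobalTelemetryProvider.set(myTelemetryProvider)

    S3Client.fromEnvironment().use { s3 ->
        // …
    }
}

// Hypothetical helper: replace the body with your own provider setup.
fun getConfiguredProvider(): TelemetryProvider {
    TODO("TODO - configure a provider")
}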
```

### Configure a telemetry provider for a specific service client
<a name="observability-conf-telemetry-provider-client"></a>

You can configure an individual service client with its own telemetry provider instead of the global one, as shown in the following example.

```
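import aws.sdk.kotlin.services.s3.S3Client
import aws.smithy.kotlin.runtime.telemetry.TelemetryProvider
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // The provider set here applies only to this client.
    S3Client.fromEnvironment {
        telemetryProvider = getConfiguredProvider()
    }.use { s3 ->
        // ...
    }
}

// Hypothetical helper: replace the body with your own provider setup.
fun getConfiguredProvider(): TelemetryProvider {
    TODO("TODO - configure a provider")
}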
```

# Metrics
<a name="observability-telemetry-metrics"></a>

The following table lists the telemetry metrics that the SDK emits. [Configure a telemetry provider](observability.md#observability-conf-telemetry-provider) to make the metrics observable.


**What metrics are emitted?**  

| Metric name | Units | Type | Attributes | Description | 
| --- | --- | --- | --- | --- | 
| smithy.client.call.duration | s | Histogram | rpc.service, rpc.method | Overall call duration (including retries) | 
| smithy.client.call.attempts | {attempt} | MonotonicCounter | rpc.service, rpc.method | The number of attempts for an individual operation | 
| smithy.client.call.errors | {error} | MonotonicCounter | rpc.service, rpc.method, exception.type | The number of errors for an operation | 
| smithy.client.call.attempt_duration | s | Histogram | rpc.service, rpc.method | The time it takes to connect to the service, send the request, and get back HTTP status code and headers (including time queued waiting to be sent) | 
| smithy.client.call.resolve_endpoint_duration | s | Histogram | rpc.service, rpc.method | The time it takes to resolve an endpoint (endpoint resolver, not DNS) for the request | 
| smithy.client.call.serialization_duration | s | Histogram | rpc.service, rpc.method | The time it takes to serialize a message body | 
| smithy.client.call.deserialization_duration | s | Histogram | rpc.service, rpc.method | The time it takes to deserialize a message body | 
| smithy.client.call.auth.signing_duration | s | Histogram | rpc.service, rpc.method, auth.scheme_id | The time it takes to sign a request | 
| smithy.client.call.auth.resolve_identity_duration | s | Histogram | rpc.service, rpc.method, auth.scheme_id | The time it takes to acquire an identity (such as AWS credentials or a bearer token) from an Identity Provider | 
| smithy.client.http.connections.acquire_duration | s | Histogram |  | The time it takes a request to acquire a connection | 
| smithy.client.http.connections.limit | {connection} | [Async]UpDownCounter |  | The maximum open connections allowed/configured for the HTTP client | 
| smithy.client.http.connections.usage | {connection} | [Async]UpDownCounter | state: idle \| acquired | Current state of the connection pool | 
| smithy.client.http.connections.uptime | s | Histogram |  | The amount of time a connection has been open | 
| smithy.client.http.requests.usage | {request} | [Async]UpDownCounter | state: queued \| in-flight | The current state of HTTP client request concurrency | 
| smithy.client.http.requests.queued_duration | s | Histogram |  | The amount of time a request spent queued and waiting to be executed by the HTTP client | 
| smithy.client.http.bytes_sent | By | MonotonicCounter | server.address | The total number of bytes sent by the HTTP client | 
| smithy.client.http.bytes_received | By | MonotonicCounter | server.address | The total number of bytes received by the HTTP client | 

The table columns are described as follows:
+ **Metric name**–The name of the emitted metric.
+ **Units**–The unit of measure for the metric. Units are given in the [UCUM](https://unitsofmeasure.org/ucum) case-sensitive ("c/s") notation.
+ **Type**–The type of instrument used to capture the metric.
+ **Attributes**–The set of attributes (dimensions) emitted with the metric.
+ **Description**–A description of what the metric is measuring.

# Logging
<a name="logging"></a>

The AWS SDK for Kotlin configures an [SLF4J](https://www.slf4j.org/manual.html) compatible logger as the default `LoggerProvider` of the telemetry provider. Because SLF4J is an abstraction layer, you can use any one of several logging systems at runtime. Supported logging systems include the [Java Logging APIs](https://docs.oracle.com/javase/8/docs/technotes/guides/logging/), [Log4j 2](https://logging.apache.org/log4j/2.x/), and [Logback](https://logback.qos.ch/).

**Warning**  
We recommend that you only use wire logging for debugging purposes. (Wire logging is discussed below.) Turn it off in your production environments because it can log sensitive data such as email addresses, security tokens, API keys, passwords, and AWS Secrets Manager secrets. Wire logging logs the full request or response without encryption, even for an HTTPS call.   
For large requests (such as uploading a file to Amazon S3) or responses, verbose wire logging can also significantly impact your application’s performance.

## Example Log4j 2 logging configuration
<a name="log4j2-example"></a>

While you can use any SLF4J-compatible logging library, this example enables log output from the SDK in JVM programs using Log4j 2:

**Gradle dependencies**

(Replace *X.Y.Z* with the latest version, which you can find on [Maven Central](https://search.maven.org/#search|gav|1|g:org.apache.logging.log4j%20AND%20a:log4j-slf4j2-impl).)

```
implementation("org.apache.logging.log4j:log4j-slf4j2-impl:X.Y.Z")
```

**Log4j 2 configuration file**

Create a file named `log4j2.xml` in your `resources` directory (for example, `<project-dir>/src/main/resources`). Add the following XML configuration to the file:

```
<Configuration status="ERROR">
    <Appenders>
        <Console name="Out">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} %-5p %c:%L %X - %encode{%m}{CRLF}%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Out"/>
        </Root>
    </Loggers>
</Configuration>
```

This configuration includes the `%X` specifier in the `pattern` attribute that enables MDC (mapped diagnostic context) logging.

The SDK adds the following MDC elements for each operation.

**rpc**  
The name of the invoked RPC, for example `S3.GetObject`.

**sdkInvocationId**  
A unique ID assigned by the service client for the operation. The ID correlates all logging events related to the invocation of a single operation.

## Specify log mode for wire-level messages
<a name="sdk-log-mode"></a>

By default, the AWS SDK for Kotlin doesn't log wire-level messages because they might contain sensitive data from API requests and responses. However, sometimes you need this level of detail for debugging purposes. 

With the Kotlin SDK, you can set a log mode in code or using environment settings to enable debug messaging for the following:
+ HTTP requests
+ HTTP responses

The log mode is backed by a bit-field where each bit is a flag (mode) and values are additive. You can combine one request mode and one response mode.
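
To make the bit-field idea concrete, the following standalone sketch models each flag as a single bit and combines flags with `+`. The `SketchLogMode` type and its bit positions are hypothetical and exist only for illustration. They are not the SDK's `LogMode` implementation.

```kotlin
// Hypothetical stand-in for a bit-field log mode; not the SDK's LogMode class.
class SketchLogMode private constructor(val mask: Int) {
    companion object {
        // Each flag occupies its own bit (the bit positions are assumptions).
        val LogRequest = SketchLogMode(1 shl 0)
        val LogRequestWithBody = SketchLogMode(1 shl 1)
        val LogResponse = SketchLogMode(1 shl 2)
        val LogResponseWithBody = SketchLogMode(1 shl 3)
    }

    // Combining modes ORs their bits, so adding the same flag twice is harmless.
    operator fun plus(other: SketchLogMode): SketchLogMode =
        SketchLogMode(mask or other.mask)

    // True when every bit of the given mode is set in this value.
    fun isEnabled(mode: SketchLogMode): Boolean = (mask and mode.mask) == mode.mask
}

fun main() {
    val mode = SketchLogMode.LogRequestWithBody + SketchLogMode.LogResponse
    println(mode.isEnabled(SketchLogMode.LogRequestWithBody))  // prints true
    println(mode.isEnabled(SketchLogMode.LogResponseWithBody)) // prints false
}
```

Because each flag has its own bit, a combined value can be checked for any individual flag without tracking how it was assembled.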

### Set log mode in code
<a name="set-log-mode-programmatically"></a>

To opt into additional logging, set the `logMode` property when you construct a service client.

The following example shows how to enable logging of requests (with the body) and the response (without the body).

```
import aws.sdk.kotlin.services.dynamodb.DynamoDbClient
import aws.smithy.kotlin.runtime.client.LogMode

// ...

val client = DynamoDbClient {
    // ...
    logMode = LogMode.LogRequestWithBody + LogMode.LogResponse
}
```

A log mode value set during service client construction overrides any log mode value set from the environment.

### Set log mode from the environment
<a name="set-log-mode-from-enviironment"></a>

To set a log mode globally for all service clients not explicitly configured in code, use one of the following:
+ JVM system property: `sdk.logMode`
+ Environment variable: `SDK_LOG_MODE`

The following case-insensitive values are available:
+ `LogRequest`
+ `LogRequestWithBody`
+ `LogResponse`
+ `LogResponseWithBody`

To create a combined log mode using settings from the environment, you separate the values with a pipe (`|`) symbol.

The following examples set the same log mode as the code example in the previous section. The value is quoted so that the shell doesn't interpret the pipe symbol.

```
# Environment variable.
export SDK_LOG_MODE="LogRequestWithBody|LogResponse"
```

```
# JVM system property.
java -Dsdk.logMode="LogRequestWithBody|LogResponse" ...
```
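
The rules for environment values (case-insensitive names joined with `|`) can be sketched as follows. This is a standalone illustration: `WireLogFlag` and `parseLogModes` are hypothetical names, and the SDK's own parser might differ, for example in how it treats whitespace or unknown values.

```kotlin
// Hypothetical names for illustration; these are not SDK types.
enum class WireLogFlag { LogRequest, LogRequestWithBody, LogResponse, LogResponseWithBody }

// Splits a setting such as "LogRequestWithBody|LogResponse" into flags,
// matching names case-insensitively.
// Assumption: an unknown name is reported as an error rather than ignored.
fun parseLogModes(setting: String): Set<WireLogFlag> =
    setting.split('|')
        .map { it.trim() }
        .filter { it.isNotEmpty() }
        .map { name ->
            WireLogFlag.values().firstOrNull { it.name.equals(name, ignoreCase = true) }
                ?: throw IllegalArgumentException("Unknown log mode: $name")
        }
        .toSet()

fun main() {
    // Mixed casing resolves to the same flags as the canonical spelling.
    println(parseLogModes("logrequestwithbody|LOGRESPONSE"))
    // prints [LogRequestWithBody, LogResponse]
}
```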

**Note**  
You must also configure a compatible SLF4J logger and set the logging level to DEBUG to enable wire-level logging.

# Telemetry providers
<a name="observability-telemetry-providers"></a>

The SDK currently supports [OpenTelemetry](https://opentelemetry.io/) (OTel) as a provider. The SDK might offer additional telemetry providers in the future.

**Topics**
+ [Configure the OpenTelemetry-based telemetry provider](observability-telemetry-providers-otel.md)

# Configure the OpenTelemetry-based telemetry provider
<a name="observability-telemetry-providers-otel"></a>

The SDK for Kotlin provides an implementation of the `TelemetryProvider` interface backed by OpenTelemetry.

## Prerequisites
<a name="observability-telemetry-providers-otel-prereqs"></a>

Update your project dependencies to add the OpenTelemetry provider as shown in the following Gradle snippet. Replace each *X.Y.Z* placeholder with the latest version of the corresponding artifact: see the [smithy-kotlin releases page](https://github.com/smithy-lang/smithy-kotlin/releases/latest) for the smithy-kotlin BOM and [Maven Central](https://search.maven.org/#search|gav|1|g:io.opentelemetry.instrumentation%20AND%20a:opentelemetry-instrumentation-bom) for the OpenTelemetry instrumentation BOM.

```
dependencies {
    implementation(platform("aws.smithy.kotlin:bom:X.Y.Z"))
    implementation(platform("io.opentelemetry.instrumentation:opentelemetry-instrumentation-bom:X.Y.Z"))
    implementation("aws.smithy.kotlin:telemetry-provider-otel")

    // OPTIONAL: If you use Log4j, the following entry enables the ability to export logs through OTel.
    runtimeOnly("io.opentelemetry.instrumentation:opentelemetry-log4j-appender-2.17")
}
```

## Configure the SDK
<a name="observability-telemetry-providers-otel-conf"></a>

The following code configures a service client by using the OpenTelemetry telemetry provider.

```
import aws.sdk.kotlin.services.s3.S3Client
import aws.smithy.kotlin.runtime.telemetry.otel.OpenTelemetryProvider
import io.opentelemetry.api.GlobalOpenTelemetry
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    // Wrap the globally registered OpenTelemetry instance in the SDK's provider.
    val otelProvider = OpenTelemetryProvider(GlobalOpenTelemetry.get())

    // Set the provider in the client configuration block, not inside use { }.
    S3Client.fromEnvironment {
        telemetryProvider = otelProvider
    }.use { s3 ->
        // …
    }
}
```

**Note**  
A discussion of how to configure the OpenTelemetry SDK is outside the scope of this guide. The [OpenTelemetry Java documentation](https://opentelemetry.io/docs/instrumentation/java/) describes the available approaches: [manually](https://opentelemetry.io/docs/instrumentation/java/manual/), automatically through the [Java agent](https://opentelemetry.io/docs/instrumentation/java/automatic/), or through the (optional) [collector](https://opentelemetry.io/docs/collector/).

## Resources
<a name="observability-telemetry-providers-otel-res"></a>

The following resources are available to help you get started with OpenTelemetry.
+ [AWS Distro for OpenTelemetry](https://aws-otel.github.io/docs/introduction) - AWS OTel Distro homepage
+ [aws-otel-java-instrumentation](https://github.com/aws-observability/aws-otel-java-instrumentation) - AWS Distro for OpenTelemetry Java Instrumentation Library
+ [aws-otel-lambda](https://github.com/aws-observability/aws-otel-lambda) - AWS managed OpenTelemetry Lambda layers
+ [aws-otel-collector](https://github.com/aws-observability/aws-otel-collector) - AWS Distro for OpenTelemetry Collector
+ [AWS Observability Best Practices](https://aws-observability.github.io/observability-best-practices/) - General best practices for observability specific to AWS