Using an OpenSearch Ingestion pipeline with OpenTelemetry Collector - Amazon OpenSearch Service



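You can use the OpenTelemetry Collector to ingest logs, traces, and metrics into OpenSearch Ingestion pipelines. A single pipeline can ingest all logs, traces, and metrics into different indexes on a domain or collection. You can also use a pipeline to ingest only logs, only traces, or only metrics.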

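Prerequisites

When you set up the OpenTelemetry configuration file, you must configure the following for ingestion to work:

  • The ingestion role needs the osis:Ingest permission to interact with the pipeline. For more information, see Ingestion role.

  • The endpoint value must include your pipeline endpoint. For example, https://pipeline-endpoint.us-east-1.osis.amazonaws.com.

  • The service value must be osis.

  • The compression option of the OTLP/HTTP Exporter must match the compression option on the pipeline's selected source.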

extensions:
  sigv4auth:
    region: "region"
    service: "osis"

exporters:
  otlphttp:
    logs_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/logs"
    metrics_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/metrics"
    traces_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/traces"
    auth:
      authenticator: sigv4auth
    compression: none

service:
  extensions: [sigv4auth]
  pipelines:
    traces:
      receivers: [jaeger]
      exporters: [otlphttp]
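The configuration above wires only a traces pipeline. As a rough sketch of how the same exporter might carry all three signals, the following variant adds an otlp receiver and separate logs, metrics, and traces pipelines. The receiver choice and the pipeline layout are assumptions for illustration and do not come from the original example. The endpoint, region, and compression values are copied from the example above.

```yaml
# Sketch only: the otlp receiver and the three pipelines are illustrative assumptions.
receivers:
  otlp:
    protocols:
      grpc:
      http:

extensions:
  sigv4auth:
    region: "region"
    service: "osis"          # must be osis, as listed in the prerequisites

exporters:
  otlphttp:
    # Endpoints copied from the example above; adjust them to your pipeline.
    logs_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/logs"
    metrics_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/metrics"
    traces_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/traces"
    auth:
      authenticator: sigv4auth
    compression: none        # must match the compression option on the pipeline source

service:
  extensions: [sigv4auth]
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
```

Each signal gets its own Collector pipeline so that it can be enabled or disabled without touching the shared exporter.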

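Step 1: Configure the pipeline role

After you set up the OpenTelemetry Collector configuration, set up the pipeline role that you want to use in your pipeline configuration. The pipeline role needs no specific permissions for the OTLP source. It needs only the permissions that grant the pipeline access to the OpenSearch domain or collection.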

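Step 2: Create the pipeline

You can then configure an OpenSearch Ingestion pipeline like the following, which specifies OTLP as the source. You can also configure OpenTelemetry logs, metrics, and traces as individual sources.

OTLP source pipeline configuration: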

version: 2
otlp-pipeline:
  source:
    otlp:
      logs_path: /otlp-pipeline/v1/logs
      traces_path: /otlp-pipeline/v1/traces
      metrics_path: /otlp-pipeline/v1/metrics
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"
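The OTLP configuration above writes every signal to one index. To send logs, traces, and metrics to different indexes from a single pipeline, one possible approach is conditional routing. The sketch below assumes that the pipeline supports a route section and the getEventType() expression function, and that the event type values are LOG, TRACE, and METRIC. Verify these against the OpenSearch Ingestion and Data Prepper documentation for your version before relying on them. The index names are hypothetical.

```yaml
# Sketch only: the route section, getEventType(), and the index names are assumptions.
version: 2
otlp-pipeline:
  source:
    otlp:
      logs_path: /otlp-pipeline/v1/logs
      traces_path: /otlp-pipeline/v1/traces
      metrics_path: /otlp-pipeline/v1/metrics
  route:
    - logs: 'getEventType() == "LOG"'
    - traces: 'getEventType() == "TRACE"'
    - metrics: 'getEventType() == "METRIC"'
  sink:
    - opensearch:
        routes: [logs]
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "otel-logs-%{yyyy.MM.dd}"      # hypothetical index name
        index_type: custom
        aws:
          region: "region"
    - opensearch:
        routes: [traces]
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "otel-traces-%{yyyy.MM.dd}"    # hypothetical index name
        index_type: custom
        aws:
          region: "region"
    - opensearch:
        routes: [metrics]
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "otel-metrics-%{yyyy.MM.dd}"   # hypothetical index name
        index_type: custom
        aws:
          region: "region"
```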

OpenTelemetry Logs pipeline configuration:

version: 2
otel-logs-pipeline:
  source:
    otel_logs_source:
      path: /otel-logs-pipeline/v1/logs
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"

OpenTelemetry Metrics pipeline configuration:

version: 2
otel-metrics-pipeline:
  source:
    otel_metrics_source:
      path: /otel-metrics-pipeline/v1/metrics
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"

OpenTelemetry Traces pipeline configuration:

version: 2
otel-trace-pipeline:
  source:
    otel_trace_source:
      path: /otel-traces-pipeline/v1/traces
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"

You can use a preconfigured blueprint to create this pipeline. For more information, see Working with blueprints.

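Cross-account connectivity

OpenSearch Ingestion pipelines with OpenTelemetry sources support cross-account ingestion. Amazon OpenSearch Ingestion lets you share a pipeline across AWS accounts, from a virtual private cloud (VPC) to a pipeline endpoint in a separate VPC. For more information, see Configuring OpenSearch Ingestion pipelines for cross-account ingestion.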

Limitations

OpenSearch Ingestion pipelines can't receive requests larger than 20 MB. You set the request size limit with the max_request_length option, which defaults to 10mb.
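As a minimal sketch of raising the limit, the fragment below sets max_request_length on the OTLP source from the earlier example. The placement of the option directly under the otlp source is an assumption based on how the option is described here. The sink is omitted for brevity.

```yaml
# Sketch only: source section of the pipeline; the sink is unchanged and omitted.
version: 2
otlp-pipeline:
  source:
    otlp:
      logs_path: /otlp-pipeline/v1/logs
      traces_path: /otlp-pipeline/v1/traces
      metrics_path: /otlp-pipeline/v1/metrics
      max_request_length: 20mb   # default is 10mb; 20mb is the stated upper limit
```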

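Recommended CloudWatch alarms for OpenTelemetry sources

The following CloudWatch metrics are recommended for monitoring the performance of your ingestion pipeline. These metrics can help you identify the amount of data processed from exports, the number of events processed from streams, the errors in processing exports and stream events, and the number of documents written to the destination. You can set up CloudWatch alarms to perform an action when one of these metrics exceeds a specified value for a specified amount of time.

CloudWatch metrics for the OTLP source are formatted as {pipeline-name}.otlp.{logs | traces | metrics}.{metric-name}. For example, otel-pipeline.otlp.metrics.requestTimeouts.count.

If you use the individual OpenTelemetry sources, metrics are formatted as {pipeline-name}.{source-name}.{metric-name}. For example, trace-pipeline.otel_trace_source.requestTimeouts.count.

All three OpenTelemetry data types have the same metrics. For brevity, the following table lists the metrics only for logs data from the OTLP source.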
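To illustrate how a metric name in this format could be used in an alarm, here is a hedged AWS CloudFormation sketch that alarms on request timeouts for logs. The AWS/OSIS namespace and the PipelineName dimension are assumptions to verify in the CloudWatch console for your pipeline. The resource name, threshold, and period are hypothetical values chosen for illustration.

```yaml
# Sketch only: namespace, dimension, and threshold values are assumptions.
Resources:
  OtlpLogsRequestTimeoutsAlarm:            # hypothetical resource name
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Requests to the OTLP logs source are timing out
      Namespace: AWS/OSIS                  # assumed namespace for pipeline metrics
      MetricName: otel-pipeline.otlp.logs.requestTimeouts.count
      Dimensions:
        - Name: PipelineName               # assumed dimension name
          Value: otel-pipeline
      Statistic: Sum
      Period: 300                          # evaluate over 5-minute windows
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      TreatMissingData: notBreaching
```

The same pattern would apply to traces or metrics by replacing logs in the metric name, or to an individual source by using the {pipeline-name}.{source-name}.{metric-name} form.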

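Metric Description
otel-pipeline.BlockingBuffer.bufferUsage.value Indicates how much of the buffer is being used.
otel-pipeline.otlp.logs.requestTimeouts.count The number of requests that have timed out.
otel-pipeline.otlp.logs.requestsReceived.count The number of requests received by the OpenTelemetry Collector.
otel-pipeline.otlp.logs.badRequests.count The number of malformed requests received by the OpenTelemetry Collector.
otel-pipeline.otlp.logs.requestsTooLarge.count The number of requests received by the OpenTelemetry Collector that exceed the 20mb maximum.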

otel-pipeline.otlp.logs.internalServerError.count The number of HTTP 500 errors received from the OpenTelemetry Collector.
otel-pipeline.opensearch.bulkBadRequestErrors.count Count of errors during bulk requests due to malformed request.
otel-pipeline.opensearch.bulkRequestLatency.avg Average latency for bulk write requests made to OpenSearch.
otel-pipeline.opensearch.bulkRequestNotFoundErrors.count Number of bulk requests that failed because the target data could not be found.
otel-pipeline.opensearch.bulkRequestNumberOfRetries.count Number of retries by OpenSearch Ingestion pipelines to write to the OpenSearch cluster.
otel-pipeline.opensearch.bulkRequestSizeBytes.sum Total size in bytes of all bulk requests made to OpenSearch.
otel-pipeline.opensearch.documentErrors.count Number of errors when sending documents to OpenSearch. The documents causing the errors will be sent to the DLQ.
otel-pipeline.opensearch.documentsSuccess.count Number of documents successfully written to an OpenSearch cluster or collection.
otel-pipeline.opensearch.documentsSuccessFirstAttempt.count Number of documents successfully indexed in OpenSearch on the first attempt.
otel-pipeline.opensearch.documentsVersionConflictErrors.count Count of errors due to version conflicts in documents during processing.
otel-pipeline.opensearch.PipelineLatency.avg Average latency of the OpenSearch Ingestion pipeline to process the data, from reading from the source to writing to the destination.
otel-pipeline.opensearch.PipelineLatency.max Maximum latency of the OpenSearch Ingestion pipeline to process the data, from reading from the source to writing to the destination.
otel-pipeline.opensearch.recordsIn.count Count of records successfully ingested into OpenSearch. This metric is essential for tracking the volume of data being processed and stored.
otel-pipeline.opensearch.s3.dlqS3RecordsFailed.count Number of records that failed to write to DLQ.
otel-pipeline.opensearch.s3.dlqS3RecordsSuccess.count Number of records that are written to DLQ.
otel-pipeline.opensearch.s3.dlqS3RequestLatency.count Count of latency measurements for requests to the Amazon S3 dead-letter queue.
otel-pipeline.opensearch.s3.dlqS3RequestLatency.sum Total latency for all requests to the Amazon S3 dead-letter queue.
otel-pipeline.opensearch.s3.dlqS3RequestSizeBytes.sum Total size in bytes of all requests made to the Amazon S3 dead-letter queue.
otel-pipeline.recordsProcessed.count Total number of records processed in the pipeline, a key metric for overall throughput.
otel-pipeline.opensearch.bulkRequestInvalidInputErrors.count Count of bulk request errors in OpenSearch due to invalid input, crucial for monitoring data quality and operational issues.