Using OpenSearch Ingestion pipelines with the OpenTelemetry Collector
You can use the OpenTelemetry Collector to send logs, metrics, and traces to an OpenSearch Ingestion pipeline.
Prerequisites
Set up the OpenTelemetry configuration file
- The ingestion role requires the `osis:Ingest` permission to interact with the pipeline. For more information, see Ingestion roles.
- The endpoint values must contain your pipeline endpoint. For example, `https://pipeline-endpoint.us-east-1.osis.amazonaws.com`.
- The service value must be `osis`.
- The compression option on the OTLP/HTTP exporter must match the compression option on the pipeline's chosen source.
```yaml
extensions:
  sigv4auth:
    region: "region"
    service: "osis"

exporters:
  otlphttp:
    logs_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/logs"
    metrics_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/metrics"
    traces_endpoint: "https://pipeline-endpoint.us-east-1.osis.amazonaws.com/v1/traces"
    auth:
      authenticator: sigv4auth
    compression: none

service:
  extensions: [sigv4auth]
  pipelines:
    traces:
      receivers: [jaeger]
      exporters: [otlphttp]
```
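The `osis:Ingest` permission for the ingestion role can be granted with an IAM permissions policy along these lines. This is a sketch, not an authoritative policy: the Region, account ID, and pipeline name in the ARN are placeholders you would replace with your own values.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "osis:Ingest",
      "Resource": "arn:aws:osis:us-east-1:123456789012:pipeline/my-pipeline"
    }
  ]
}
```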
Step 1: Configure the pipeline role
After you set up the OpenTelemetry Collector configuration, configure the pipeline role that you want to use in your pipeline configuration. The pipeline role doesn't need any permissions specific to the OTLP source; it only needs the permissions that grant the pipeline access to your OpenSearch domain or collection.
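For a domain sink, those permissions commonly take a shape like the following sketch. This assumes an OpenSearch Service domain (not a serverless collection); the account ID and domain name are placeholders.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "es:DescribeDomain",
      "Resource": "arn:aws:es:*:123456789012:domain/*"
    },
    {
      "Effect": "Allow",
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/mydomain/*"
    }
  ]
}
```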
Step 2: Create the pipeline
You can then configure an OpenSearch Ingestion pipeline like the following, which specifies OTLP as the source. You can also configure OpenTelemetry logs, metrics, and traces as individual sources.
OTLP source pipeline configuration:
```yaml
version: 2
otlp-pipeline:
  source:
    otlp:
      logs_path: /otlp-pipeline/v1/logs
      traces_path: /otlp-pipeline/v1/traces
      metrics_path: /otlp-pipeline/v1/metrics
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"
```
OpenTelemetry Logs pipeline configuration:
```yaml
version: 2
otel-logs-pipeline:
  source:
    otel_logs_source:
      path: /otel-logs-pipeline/v1/logs
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"
```
OpenTelemetry Metrics pipeline configuration:
```yaml
version: 2
otel-metrics-pipeline:
  source:
    otel_metrics_source:
      path: /otel-metrics-pipeline/v1/metrics
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"
```
OpenTelemetry Traces pipeline configuration:
```yaml
version: 2
otel-trace-pipeline:
  source:
    otel_trace_source:
      path: /otel-traces-pipeline/v1/traces
  sink:
    - opensearch:
        hosts: ["https://search-mydomain.region.es.amazonaws.com"]
        index: "ss4o_metrics-otel-%{yyyy.MM.dd}"
        index_type: custom
        aws:
          region: "region"
```
You can use a preconfigured blueprint to create this pipeline. For more information, see Using blueprints.
Cross-account connectivity
OpenSearch Ingestion pipelines with an OpenTelemetry source support cross-account ingestion. Amazon OpenSearch Ingestion lets you share a pipeline across AWS accounts, from a virtual private cloud (VPC) to pipeline endpoints in separate VPCs. For more information, see Configuring cross-account ingestion for OpenSearch Ingestion pipelines.
Limitations
OpenSearch Ingestion pipelines can't receive any request larger than 20 MB. You set this value in the `max_request_length` option, which defaults to 10 MB.
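As a sketch, raising the limit on the OTLP source from the earlier configuration might look like the following. This assumes `max_request_length` is set directly under the source, alongside the path options shown above; check the source reference for the exact placement.

```yaml
otlp-pipeline:
  source:
    otlp:
      logs_path: /otlp-pipeline/v1/logs
      # Assumed placement: raise the request size cap from the 10 MB default
      max_request_length: 20mb
```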
Recommended CloudWatch alarms for OpenTelemetry sources
We recommend the following CloudWatch metrics for monitoring the performance of your ingestion pipeline. These metrics help you identify the amount of data processed from exporters, the number of events processed from streams, errors while processing exports and stream events, and the number of documents written to the destination. You can configure CloudWatch alarms to perform an action when one of these metrics exceeds a specified value for a specified amount of time.
CloudWatch metrics for the OTLP source are formatted as `{pipeline-name}.otlp.{logs | traces | metrics}.{metric-name}`. For example, `otel-pipeline.otlp.metrics.requestTimeouts.count`.
If you use the individual OpenTelemetry sources, the metrics are instead formatted as `{pipeline-name}.{source-name}.{metric-name}`. For example, `trace-pipeline.otel_trace_source.requestTimeouts.count`.
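The two naming schemes above can be sketched as simple string templates. The helper function names here are illustrative only, not part of any AWS API:

```python
def otlp_metric_name(pipeline: str, data_type: str, metric: str) -> str:
    """Build a metric name for the combined OTLP source:
    {pipeline-name}.otlp.{logs | traces | metrics}.{metric-name}
    """
    return f"{pipeline}.otlp.{data_type}.{metric}"


def source_metric_name(pipeline: str, source: str, metric: str) -> str:
    """Build a metric name for an individual OpenTelemetry source:
    {pipeline-name}.{source-name}.{metric-name}
    """
    return f"{pipeline}.{source}.{metric}"


# Reproduce the two examples from the text:
print(otlp_metric_name("otel-pipeline", "metrics", "requestTimeouts.count"))
# → otel-pipeline.otlp.metrics.requestTimeouts.count
print(source_metric_name("trace-pipeline", "otel_trace_source", "requestTimeouts.count"))
# → trace-pipeline.otel_trace_source.requestTimeouts.count
```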
All three OpenTelemetry data types emit the same metrics, but for brevity the following table lists them only for OTLP source log-type data.
| Metric | Description |
|---|---|
| `otel-pipeline.BlockingBuffer.bufferUsage.value` | Indicates how much of the buffer is in use. |
| `otel-pipeline.otlp.logs.requestTimeouts.count` | The number of requests that timed out. |
| `otel-pipeline.otlp.logs.requestsReceived.count` | The number of requests received from the OpenTelemetry Collector. |
| `otel-pipeline.otlp.logs.badRequests.count` | The number of malformed requests received from the OpenTelemetry Collector. |
| `otel-pipeline.otlp.logs.requestsTooLarge.count` | The number of requests received from the OpenTelemetry Collector that exceed the 20 MB maximum. |
| `otel-pipeline.otlp.logs.internalServerError.count` | The number of HTTP 500 errors received from the OpenTelemetry Collector. |
| `otel-pipeline.opensearch.bulkBadRequestErrors.count` | Count of errors during bulk requests due to malformed requests. |
| `otel-pipeline.opensearch.bulkRequestLatency.avg` | Average latency for bulk write requests made to OpenSearch. |
| `otel-pipeline.opensearch.bulkRequestNotFoundErrors.count` | Number of bulk requests that failed because the target data could not be found. |
| `otel-pipeline.opensearch.bulkRequestNumberOfRetries.count` | Number of retries by OpenSearch Ingestion pipelines to write to the OpenSearch cluster. |
| `otel-pipeline.opensearch.bulkRequestSizeBytes.sum` | Total size in bytes of all bulk requests made to OpenSearch. |
| `otel-pipeline.opensearch.documentErrors.count` | Number of errors when sending documents to OpenSearch. The documents causing the errors will be sent to the DLQ. |
| `otel-pipeline.opensearch.documentsSuccess.count` | Number of documents successfully written to an OpenSearch cluster or collection. |
| `otel-pipeline.opensearch.documentsSuccessFirstAttempt.count` | Number of documents successfully indexed in OpenSearch on the first attempt. |
|  | Count of errors due to version conflicts in documents during processing. |
|  | Average latency of the OpenSearch Ingestion pipeline to process the data, from reading at the source to writing at the destination. |
| `otel-pipeline.opensearch.PipelineLatency.max` | Maximum latency of the OpenSearch Ingestion pipeline to process the data, from reading at the source to writing at the destination. |
| `otel-pipeline.opensearch.recordsIn.count` | Count of records successfully ingested into OpenSearch. This metric is essential for tracking the volume of data being processed and stored. |
| `otel-pipeline.opensearch.s3.dlqS3RecordsFailed.count` | Number of records that failed to write to the DLQ. |
| `otel-pipeline.opensearch.s3.dlqS3RecordsSuccess.count` | Number of records written to the DLQ. |
| `otel-pipeline.opensearch.s3.dlqS3RequestLatency.count` | Count of latency measurements for requests to the Amazon S3 dead-letter queue. |
| `otel-pipeline.opensearch.s3.dlqS3RequestLatency.sum` | Total latency for all requests to the Amazon S3 dead-letter queue. |
| `otel-pipeline.opensearch.s3.dlqS3RequestSizeBytes.sum` | Total size in bytes of all requests made to the Amazon S3 dead-letter queue. |
| `otel-pipeline.recordsProcessed.count` | Total number of records processed in the pipeline, a key metric for overall throughput. |
|  | Count of bulk request errors in OpenSearch due to invalid input, crucial for monitoring data quality and operational issues. |