Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Connector/Servicegraph] Servicegraph Connector are not giving correct metrics of spans #34170

Open
VijayPatil872 opened this issue Jul 19, 2024 · 2 comments
Labels
bug Something isn't working connector/servicegraph needs triage New item requiring triage

Comments

@VijayPatil872
Copy link

VijayPatil872 commented Jul 19, 2024

Component(s)

connector/servicegraph

What happened?

Description

I am using servicegraph connector to generate service graph and metrics from span. the metrics are emitted by the connector are fluctuating up and down.
We are using service graphs connector to build service graph. We have deployed a layer of Collectors containing the load-balancing exporter in front of traces Collectors doing the span metrics and service graph connector processing. The load-balancing exporter is used to hash the trace ID consistently and determine which collector backend should receive spans for that trace.
the service graph exporting the metrics to Grafana mimir with prometheusremotewrite exporter.

Steps to Reproduce

Expected Result

The metrics are emitted by the connector should be correct

Actual Result

image

Collector version

0.104.0

Environment information

No response

OpenTelemetry Collector configuration

config:        
  exporters:


    prometheusremotewrite/mimir-default-processor-spanmetrics:
      endpoint: 
      headers:
        x-scope-orgid: ********
      resource_to_telemetry_conversion:
        enabled: true
      timeout: 30s
      tls:
        insecure: true
      remote_write_queue:
        enabled: true
        queue_size: 100000
        num_consumers: 500        

    prometheusremotewrite/mimir-default-servicegraph:
      endpoint: 
      headers:
        x-scope-orgid: **********
      resource_to_telemetry_conversion:
        enabled: true
      timeout: 30s  
      tls:
        insecure: true
      remote_write_queue:
        enabled: true
        queue_size: 100000
        num_consumers: 500


  connectors:
    spanmetrics:
      histogram:
        explicit:
          buckets: [100ms, 500ms, 2s, 5s, 10s, 20s, 30s]
      aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
      metrics_flush_interval: 15s
      metrics_expiration: 5m
      exemplars:
        enabled: false
      dimensions:
        - name: http.method
        - name: http.status_code
        - name: cluster
        - name: collector.hostname
      events:
        enabled: true
        dimensions:
          - name: exception.type
      resource_metrics_key_attributes:
        - service.name
        - telemetry.sdk.language
        - telemetry.sdk.name
    servicegraph:
      latency_histogram_buckets: [100ms, 250ms, 1s, 5s, 10s]
      store:
        ttl: 2s
        max_items: 10

  receivers:
    otlp:
      protocols:
        http:
          endpoint: ${env:MY_POD_IP}:*****
        grpc:
          endpoint: ${env:MY_POD_IP}:*****
  service:


    pipelines:
      traces/connector-pipeline:
        exporters:
          - otlphttp/tempo-processor-default
          - spanmetrics
          - servicegraph
        processors:
          - batch          
          - memory_limiter
        receivers:
          - otlp
     
      metrics/spanmetrics:
        exporters:
          - debug
          - prometheusremotewrite/mimir-default-processor-spanmetrics
        processors:
          - batch          
          - memory_limiter
        receivers:
          - spanmetrics

      metrics/servicegraph:
        exporters:
          - debug
          - prometheusremotewrite/mimir-default-servicegraph
        processors:
          - batch          
          - memory_limiter
        receivers:
          - servicegraph

Log output

No response

Additional context

No response

@VijayPatil872 VijayPatil872 added bug Something isn't working needs triage New item requiring triage labels Jul 19, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@VijayPatil872
Copy link
Author

any update on the issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working connector/servicegraph needs triage New item requiring triage
Projects
None yet
Development

No branches or pull requests

1 participant