Welcome to the next pillar of the observability stack every SRE should understand: Tracing
If you haven’t yet set up the other pillars (monitoring, alerting, and logging), I recommend starting there first. You can follow the complete series in the link below, which walks through each component step by step before diving into tracing.
Why Tracing Matters
From an SRE perspective, tracing is the most critical — and often the most undervalued — pillar of the observability stack. Metrics and alerts are excellent at telling you something is broken, but they completely fall short when it comes to explaining why. In real production environments, it’s common to see all dashboards green, SLIs within thresholds, and alerts silent — while users still complain that the application feels slow. This is where tracing becomes non-negotiable. Tracing gives you request-level truth: how a request propagates across services, where latency is actually introduced, which downstream dependency is slowing things down, what status codes are returned, and how retries or fan-outs behave. For an SRE, this depth is not a “nice to have”; it is the difference between guessing and knowing, between reactive firefighting and confident, data-driven debugging.
Introduction
In this section, we will implement distributed tracing using a dummy microservices-based application. The service is designed to mimic a real-world architecture, where a single request fans out to multiple downstream dependencies such as Redis, PostgreSQL, and other internal services. As requests flow through the system, we will generate end-to-end traces, store them in Tempo, and visualize them in Grafana. This setup allows us to observe how each request propagates across services, identify latency bottlenecks, and understand dependency behavior — exactly the kind of visibility required to debug and operate modern microservice architectures with confidence.
HotROD: Our Trace Generator for This Blog
HotROD (“Hot Rides on Demand”) is an open-source demo microservices application provided by the Jaeger project. It simulates a ride-sharing service composed of several services that communicate with each other, making it ideal for generating distributed trace data that reflects real-world latency, service calls, and dependencies. HotROD is widely used in the community to illustrate how tracing works end-to-end because it naturally emits spans across multiple services without requiring manual instrumentation in every component.
We run HotROD as a Kubernetes pod and configure it to send trace spans to Tempo so we can visualize them in Grafana. Below is the deployment manifest we use:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hotrod
  name: hotrod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hotrod
  template:
    metadata:
      labels:
        app: hotrod
    spec:
      containers:
        - name: hotrod
          image: jaegertracing/example-hotrod:1.41.0
          args:
            - all
          env:
            - name: JAEGER_AGENT_HOST
              value: tempo.monitoring.svc.cluster.local
            - name: JAEGER_AGENT_PORT
              value: "6831"
          ports:
            - containerPort: 8080
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: hotrod
  labels:
    app: hotrod
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 8080
      targetPort: 8080
  selector:
    app: hotrod
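Assuming the manifest above is saved locally (the file name here is just an illustration), apply it to the cluster:
# File name is an example; use whatever you saved the manifest as.
kubectl apply -f hotrod.yaml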
In this configuration, HotROD emits spans using the Jaeger Thrift protocol by specifying JAEGER_AGENT_HOST and JAEGER_AGENT_PORT. Tempo accepts Jaeger traces natively, so this setup works without additional collectors.
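For this to work, Tempo must have its Jaeger receiver enabled. The fragment below is an illustrative piece of Tempo configuration (your Helm chart values may already set this); treat it as a sketch rather than a drop-in config.
# Illustrative tempo.yaml fragment: accept Jaeger Thrift compact spans on UDP 6831.
distributor:
  receivers:
    jaeger:
      protocols:
        thrift_compact:
          endpoint: 0.0.0.0:6831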
Can HotROD Be Used with OpenTelemetry?
Yes. HotROD can be launched with an OTLP exporter (--otel-exporter=otlp). In this mode, it emits traces using the OpenTelemetry Protocol (OTLP) instead of native Jaeger Thrift:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hotrod-otel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hotrod-otel
  template:
    metadata:
      labels:
        app: hotrod-otel
    spec:
      containers:
        - name: hotrod
          image: jaegertracing/example-hotrod:latest
          args:
            - all
            - --otel-exporter=otlp
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://otel-collector.monitoring.svc.cluster.local:4317"
          ports:
            - containerPort: 8080
This configures HotROD to emit OTLP traces to an OpenTelemetry Collector running at otel-collector.monitoring.svc.cluster.local:4317.
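For completeness, here is a minimal sketch of what that Collector's configuration could look like, assuming Tempo also has its OTLP gRPC receiver enabled on port 4317; treat it as a starting point rather than a verified config.
# Hypothetical OpenTelemetry Collector config: receive OTLP from HotROD
# and forward the traces to Tempo over OTLP gRPC.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlp:
    endpoint: tempo.monitoring.svc.cluster.local:4317
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]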
Once the deployment is applied and the HotROD pod is running in the Kubernetes cluster, you can access the UI by port-forwarding the pod to your local machine.
kubectl port-forward $(kubectl get pods | grep hotrod | awk '{print $1}') 8080
After the port-forward is active, open your browser and navigate to http://localhost:8080 to reach the HotROD UI. Click the available buttons to generate requests; each action triggers distributed tracing spans across the simulated services.
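If you would rather generate traffic from the command line, the UI buttons call HotROD's /dispatch endpoint under the hood; the loop below is a simple sketch (the customer ID is an arbitrary example).
# Fire 20 ride requests through the port-forward; each one produces a trace.
for i in $(seq 1 20); do
  curl -s "http://localhost:8080/dispatch?customer=123" > /dev/null
done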
Grafana Tempo
At this stage, traces are actively being generated and exported, but they are not yet stored anywhere. To persist and query these traces, we now need a backend designed specifically for distributed tracing. This is where Tempo comes into the picture.
Tempo is a natural fit because it is cost-efficient, scalable, and index-free. Unlike traditional tracing backends, Tempo stores traces cheaply (object storage–friendly) and relies on metrics and logs for trace discovery. This makes it ideal for high-cardinality, high-volume tracing workloads commonly seen in microservice architectures. Tempo also integrates seamlessly with Grafana, which keeps the observability stack consistent and simple. We will install Tempo using Helm for a quick and production-aligned setup.
git clone https://github.com/sanskar153/Grafana-Tempo.git
cd Grafana-Tempo
helm install tempo tempo -n monitoring --debug
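Before moving on, it is worth confirming that the Tempo pod is running and that its service is reachable inside the cluster (the release is named tempo, so the service should resolve as tempo.monitoring.svc.cluster.local, which is what the HotROD deployment points at):
kubectl get pods -n monitoring | grep tempo
kubectl get svc -n monitoring tempo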
Now it’s time to install Grafana for trace visualization. For a quick and straightforward setup, we will use Helm.
Run the following commands to add the Grafana Helm repository and install Grafana in the monitoring namespace:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install my-grafana grafana/grafana --namespace monitoring
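You can confirm the Grafana pod has come up before fetching the credentials:
kubectl get pods -n monitoring | grep grafana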
Once Grafana is installed, retrieve the admin credentials by running:
kubectl get secret --namespace monitoring my-grafana \ -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Next, port-forward the Grafana pod to access the UI locally:
kubectl port-forward $(kubectl get pods -n monitoring | grep grafana | awk '{print $1}') -n monitoring 3000
Now open your browser and navigate to http://localhost:3000, log in using the retrieved credentials, and Grafana will be ready for configuration and visualization.
Configuring Tempo as a Grafana Data Source
Now that Tempo and Grafana are up and running, the next step is to configure Tempo as a data source in Grafana.
Follow these steps:
- Navigate to Grafana → Connections → Data sources.
- Search for and select Tempo.
- In the URL field, add the Tempo service endpoint:
http://tempo.monitoring.svc.cluster.local:3200
- Scroll to the bottom and click Save & Test.
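If you prefer to manage this as code instead of clicking through the UI, Grafana also supports data source provisioning. A sketch of such a provisioning file is below (typically mounted under /etc/grafana/provisioning/datasources/ or passed via the Grafana Helm chart's datasources value); the data source name is just an example.
# Illustrative Grafana data source provisioning file for Tempo.
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo.monitoring.svc.cluster.local:3200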
If the configuration is correct, Grafana will display a “Data source is working” message, confirming successful connectivity with Tempo.
At this point, Grafana is fully configured to query and visualize traces stored in Tempo. To explore the traces, navigate to the Explore section in Grafana, select Tempo as the data source, and choose the Search query type. You will now be able to see all the generated traces along with their individual spans. Each trace provides detailed insights into the request flow, including which services were called, the duration of each span, HTTP status codes, and the associated request and response metadata. This view gives a complete, end-to-end picture of how a request traverses your system.
With this, the tracing setup is complete. You can now visualize end-to-end traces and individual spans, gaining deep visibility into how requests flow through your services. Beyond exploration, this data becomes even more powerful when combined with dashboards — for example, identifying the busiest services or highlighting requests that take longer than one second. These insights help SREs move from reactive debugging to proactive performance optimization.
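If your Tempo version supports TraceQL, questions like these can be asked directly in Explore. The queries below are illustrative, and the service name is just an example.
Traces containing a span slower than one second:
{ duration > 1s }
Error spans from a specific service:
{ resource.service.name = "frontend" && status = error }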
This brings the blog to a close. I encourage you to explore tracing further, experiment with different queries and dashboards, and adapt this setup to your own production environments. If you have any questions or run into issues, feel free to reach out. And finally, a shout-out to all the SREs who debug late into the night, chase elusive latency spikes, and quietly keep systems reliable at scale — your work may not always be visible, but it is foundational to everything running smoothly.