If you’re new to Tracer or want a conceptual overview, see How Tracer fits in your stack.
What Datadog does well
Datadog is designed to provide broad observability across many systems and services. It provides:
- Centralized dashboards for metrics, logs, and traces
- Agent-based telemetry collection across hosts, containers, and services
- Alerting based on thresholds, anomalies, and service health
- Integrations across cloud platforms, orchestration systems, and application frameworks
What Datadog does not observe
Datadog organizes telemetry around services, hosts, and applications. While it can collect detailed telemetry, it does not natively observe execution behavior in terms of pipeline or task semantics. It does not show:
- Execution behavior inside processes or containers as execution units
- CPU vs I/O vs memory contention during individual tasks
- Short-lived subprocesses that do not align with service boundaries
- Idle or blocked execution hidden by aggregate utilization
- How telemetry maps directly to pipeline runs, tasks, or tools
- How cost relates to observed execution rather than to infrastructure or services
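To make the point about aggregate utilization concrete, here is a minimal sketch using made-up numbers. The task names, the per-task CPU fractions, and the 10% "stalled" cutoff are all illustrative assumptions, not output from Datadog or Tracer.

```python
# Hypothetical per-task CPU utilization (fraction of one core) on a 4-core host.
per_task_cpu = {
    "task-a": 0.95,
    "task-b": 0.92,
    "task-c": 0.97,
    "task-d": 0.02,  # blocked, for example waiting on disk or a lock
}

# Host-level view: a single average across all cores.
host_util = sum(per_task_cpu.values()) / len(per_task_cpu)

# Task-level view: anything below an assumed 10% cutoff counts as stalled.
STALLED_BELOW = 0.10
stalled = [name for name, util in per_task_cpu.items() if util < STALLED_BELOW]

print(f"host average: {host_util:.0%}")
print(f"stalled tasks: {stalled}")
```

The host average looks like a healthy, busy machine, while the per-task view shows that one task is making almost no progress.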
Why this gap matters
Scientific and data pipelines often involve heterogeneous tools, nested execution, and short-lived processes orchestrated by workflow engines or schedulers. When relying on general-purpose observability alone:
- Performance bottlenecks must be inferred from service-level telemetry
- Idle or blocked execution can appear as normal utilization
- Cost is attributed to infrastructure or services rather than execution units
- Diagnosing variability between runs requires manual investigation
What Tracer adds
Tracer observes execution directly from the host and container runtime and adds:
- Observed CPU, memory, disk, and network behavior
- Visibility into short-lived processes and nested tools
- Attribution by pipeline, run, task, or execution unit
- Cost mapping aligned with observed runtime activity
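As an illustration of attribution and cost mapping, the sketch below rolls per-process CPU usage up to a (run, task) key and converts it to cost. It is not Tracer's data model or API: the record fields, the sample values, and the price per core-hour are hypothetical and chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical price; real pricing depends on the instance type and provider.
PRICE_PER_CORE_HOUR = 0.04

# Hypothetical per-process records, each already labeled with its run and task.
records = [
    {"run": "run-1", "task": "align", "pid": 101, "cpu_core_s": 3600.0},
    {"run": "run-1", "task": "align", "pid": 102, "cpu_core_s": 1800.0},
    {"run": "run-1", "task": "sort", "pid": 103, "cpu_core_s": 900.0},
]


def cost_by_task(records, price_per_core_hour):
    """Sum observed core-seconds per (run, task) and convert them to cost."""
    totals = defaultdict(float)
    for rec in records:
        core_hours = rec["cpu_core_s"] / 3600.0
        totals[(rec["run"], rec["task"])] += core_hours * price_per_core_hour
    return dict(totals)


costs = cost_by_task(records, PRICE_PER_CORE_HOUR)
for (run, task), cost in sorted(costs.items()):
    print(f"{run}/{task}: ${cost:.2f}")
```

Keying on the execution unit rather than the host means that two short-lived processes belonging to the same task are billed together, even if they never appear as a service.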
Example: service telemetry versus observed execution
Datadog dashboards show elevated resource usage during pipeline runs. Tracer reveals that:
- CPU usage is low across most tasks
- Execution time is dominated by disk I/O wait
- Multiple short-lived helper processes drive runtime variability
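The kind of conclusion in this example can be sketched as a simple classification over per-task timings. The `TaskSample` fields, the sample values, and the 50% threshold below are assumptions made for illustration; they do not describe how Tracer measures or labels tasks.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    name: str
    wall_s: float     # wall-clock duration of the task
    cpu_s: float      # user + system CPU time
    io_wait_s: float  # time spent blocked on disk I/O


def dominant_factor(task: TaskSample, threshold: float = 0.5) -> str:
    """Label a task by whichever factor takes at least `threshold` of wall time."""
    if task.wall_s <= 0:
        return "unknown"
    if task.io_wait_s / task.wall_s >= threshold:
        return "io_wait"
    if task.cpu_s / task.wall_s >= threshold:
        return "cpu"
    return "mixed"


# Hypothetical samples; a real pipeline would supply measured values.
samples = [
    TaskSample("align", wall_s=120.0, cpu_s=18.0, io_wait_s=90.0),
    TaskSample("sort", wall_s=60.0, cpu_s=45.0, io_wait_s=5.0),
    TaskSample("index", wall_s=30.0, cpu_s=10.0, io_wait_s=8.0),
]

summary = {t.name: dominant_factor(t) for t in samples}
print(summary)
```

With timings like these, a dashboard showing elevated host usage would not by itself reveal that the longest task spends most of its time waiting on disk.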
Observability comparison
This comparison highlights the difference between service-level observability and execution-level observation.
What Tracer does not replace
Tracer is not a general-purpose observability platform.
- It does not replace Datadog for monitoring unrelated services or applications
- It does not replace dashboards built from arbitrary business or application metrics
- It does not replace organization-wide alerting across all systems
- Its alerting is focused on execution behavior, not all service events
When to use Tracer with Datadog
Tracer is most useful alongside Datadog when teams need to:
- Understand pipeline behavior beyond service-level telemetry
- Diagnose performance issues involving short-lived or nested execution
- Attribute resource usage and cost to workflows or tools
- Reduce manual correlation across metrics, logs, and traces

