Datadog + Omlet Agent

Your Datadog.
Supercharged.

An AI-powered agent that creates dashboards, configures alerts, educates your team, and optimizes your entire Datadog environment.

01 / Dashboard Creation

Dashboards Built in Seconds

Describe what you want to monitor and the agent discovers your metrics, builds widgets, and assembles production-ready Datadog dashboards.

  • Auto-discovers your metrics and tags
  • Creates widgets with proper queries (avg, p95, p99)
  • Builds multi-section dashboards with template variables
  • Exports shareable dashboard JSON
omlet-agent
you > set up a dashboard for our API latency
Discovering metrics via Datadog API...
Found 14 matching metrics for trace.http.request.duration
Tags: service, resource_name, env, region
Building dashboard widgets...
+ Timeseries: P50 / P95 / P99 latency by service
+ Top List: Slowest endpoints (p99 > 500ms)
+ Heatmap: Request duration distribution
+ Query Value: Current error rate by region
Creating dashboard via API...
Dashboard created: API Latency Overview
4 widgets · 3 template variables · shared with team
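
As a rough illustration of the "Exports shareable dashboard JSON" step, the sketch below assembles a dashboard definition as a plain Python dict and serializes it. The field names (layout_type, template_variables, widgets, requests with a q query string) follow the general shape of Datadog's v1 dashboard JSON, but treat them as assumptions to verify against the current API reference. The function name build_latency_dashboard and the exact queries are hypothetical, and no API call is made.

```python
import json


def build_latency_dashboard(metric, template_tags):
    """Assemble a dashboard definition as a plain dict (nothing is sent to Datadog)."""

    def widget(widget_type, title, query):
        # One widget with a single query-string request (assumed schema).
        return {
            "definition": {
                "type": widget_type,
                "title": title,
                "requests": [{"q": query}],
            }
        }

    # Scope every query by the template variables, e.g. {$service,$env,$region}.
    scope = ",".join(f"${tag}" for tag in template_tags)

    return {
        "title": "API Latency Overview",
        "layout_type": "ordered",
        "template_variables": [
            {"name": tag, "prefix": tag, "default": "*"} for tag in template_tags
        ],
        "widgets": [
            widget("timeseries", "Latency by service",
                   f"avg:{metric}{{{scope}}} by {{service}}"),
            widget("toplist", "Slowest endpoints",
                   f"top(avg:{metric}{{{scope}}} by {{resource_name}}, 10, 'mean', 'desc')"),
            widget("heatmap", "Request duration distribution",
                   f"avg:{metric}{{{scope}}} by {{service}}"),
        ],
    }


dashboard = build_latency_dashboard(
    "trace.http.request.duration", ["service", "env", "region"]
)
print(json.dumps(dashboard, indent=2))
```

Keeping the definition as data makes it easy to review or version-control before anything is created in a real account.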
02 / Alert Configuration

Intelligent Alert Setup

The agent analyzes your services, identifies monitoring gaps, and configures Datadog monitors with the right thresholds, notification channels, and escalation paths.

P95 Latency Monitor (Critical)
avg(last_5m):trace.http.request.duration.p95{service:api} > 500ms
Notify: PagerDuty → #ops-critical
Error Rate Spike (Warning)
sum(last_10m):trace.http.request.errors{service:api}.as_rate() > 0.05
Notify: Slack → #eng-alerts
Disk Usage Threshold (Info)
avg(last_15m):system.disk.in_use{*} by {host} > 0.85
Notify: Email → infra-team@company.com
Memory Leak Detection (Warning)
linear(last_1h):system.mem.used{*} by {host}.trend() > 0
Notify: Slack → #infra-alerts
Alert Coverage Analysis
  • 12 Monitors Created
  • 3 Notification Channels
  • 2 Composite Monitors
  • 4 SLO Monitors
Added missing coverage for memory leak detection and disk saturation across 8 hosts
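
To make the monitor cards above more concrete, here is a minimal sketch of how one might compose a monitor definition as data. It assumes the documented metric-monitor query form, time_aggr(time_window):space_aggr:metric{scope} [by {key}] operator threshold, which differs slightly from the shorthand shown on the cards (it includes a space aggregator and a bare numeric threshold). The helper name build_metric_monitor, the @-handles, and the conversion of 500 ms to 0.5 (assuming the metric is reported in seconds) are hypothetical. Nothing is sent to Datadog.

```python
def build_metric_monitor(name, metric, scope, window, critical, notify,
                         time_aggr="avg", space_aggr="avg", group_by=None):
    """Build a monitor definition dict with a query and a matching threshold."""
    # Optional grouping clause, e.g. " by {host}".
    by_clause = ""
    if group_by:
        by_clause = " by {" + ",".join(group_by) + "}"

    query = f"{time_aggr}({window}):{space_aggr}:{metric}{{{scope}}}{by_clause} > {critical}"

    return {
        "name": name,
        "type": "metric alert",
        "query": query,
        # Notification handles are appended to the message body.
        "message": f"{name} crossed {critical}. " + " ".join(notify),
        # The threshold in options mirrors the one embedded in the query.
        "options": {"thresholds": {"critical": critical}},
    }


latency_monitor = build_metric_monitor(
    name="P95 Latency Monitor",
    metric="trace.http.request.duration",
    scope="service:api",
    window="last_5m",
    critical=0.5,                       # assumed seconds, i.e. 500 ms
    notify=["@pagerduty-ops-critical"]  # hypothetical handle
)

disk_monitor = build_metric_monitor(
    name="Disk Usage Threshold",
    metric="system.disk.in_use",
    scope="*",
    window="last_15m",
    critical=0.85,
    notify=["@infra-team@company.com"],
    group_by=["host"],
)

print(latency_monitor["query"])
print(disk_monitor["query"])
```

Generating the query and the options threshold from the same value avoids the two drifting apart when a threshold is tuned later.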
📚 Omlet Agent — Datadog Educator
What's the difference between gauge, rate, and count metric types in Datadog?
Gauge — A snapshot value at a point in time. Your system.mem.used metrics are gauges. They show the current state.

Rate — A per-second derivative. Your trace.http.request.hits is submitted as a rate. Useful for throughput.

Count — A raw total over an interval. Your aws.elb.request_count is a count. Aggregates with sum, not avg.
When should I use .as_rate() vs .as_count() in queries?
Use .as_rate() when you want per-second normalization—great for comparing intervals of different lengths.

Use .as_count() for raw totals, like "how many errors in the last 5 minutes." Your error rate alert compares against a rate threshold, so it correctly uses .as_rate() instead.
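
The difference between the two modifiers comes down to simple arithmetic. The sketch below is a simplified model, not Datadog's implementation: it treats a query window as a list of per-bucket event counts, with hypothetical values and function names.

```python
def as_count(bucket_counts):
    """Raw total of events across all buckets in the window."""
    return sum(bucket_counts)


def as_rate(bucket_counts, bucket_seconds):
    """Events per second: the total divided by the window length in seconds."""
    window_seconds = len(bucket_counts) * bucket_seconds
    return sum(bucket_counts) / window_seconds


# Five one-minute buckets of error counts (hypothetical data).
five_minutes = [30, 20, 10, 40, 50]
# A single one-minute bucket from a different window.
one_minute = [60]

print(as_count(five_minutes))        # total errors in the 5-minute window
print(as_rate(five_minutes, 60))     # errors per second over 5 minutes
print(as_rate(one_minute, 60))       # errors per second over 1 minute
```

The raw totals (150 vs. 60) suggest the longer window is worse, while the per-second rates (0.5 vs. 1.0) show the shorter window is actually the busier one. That is the sense in which rates make intervals of different lengths comparable.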
03 / Datadog Education

Learn Datadog as You Go

The agent understands your environment and teaches Datadog concepts using your own metrics, services, and configurations as examples.

  • Explains metric types, tags, and query syntax
  • Teaches dashboard and monitor best practices
  • Walks through Datadog APIs and integrations
  • Context-aware answers using your actual environment
04 / Recommendations

Actionable Recommendations

The agent audits your Datadog setup and delivers targeted recommendations to improve performance, reduce costs, and close monitoring gaps.

Coverage
Add custom tags to APM traces
Enable service-level filtering by adding environment and team tags to your trace spans.
Performance
Switch from avg to p99 for latency SLOs
Your current SLOs use avg(), which masks tail latency. p99 better reflects user experience.
Cost
Enable log-based metrics
Replace 3 high-volume log queries with log-based metrics to reduce indexing costs by ~40%.
Coverage
Add anomaly detection monitors
5 services have no anomaly detection. Static thresholds miss gradual performance degradation.
Optimization Summary
  • ~$2.4k Monthly Savings
  • 47 Unused Metrics
  • 8 Missing Monitors
  • 34% Tag Cardinality Reduction
Implementing all recommendations would save ~$2.4k/mo and improve alert coverage by 67%
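
The avg-versus-p99 recommendation above is easy to see with a small worked example. The latency values below are invented for illustration, and the percentile uses the simple nearest-rank method, which may differ from how Datadog computes percentiles.

```python
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile (integer pct, 1-100) by the nearest-rank method."""
    ordered = sorted(values)
    # ceil(pct * n / 100) computed in integer math to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# 98 fast requests at 100 ms and 2 slow requests at 3000 ms (hypothetical).
latencies_ms = [100] * 98 + [3000] * 2

average_ms = mean(latencies_ms)
p99_ms = percentile_nearest_rank(latencies_ms, 99)

print(f"avg = {average_ms} ms, p99 = {p99_ms} ms")
```

Here the average is 158 ms while the p99 is 3000 ms. Against a 500 ms target, an avg-based SLO would look healthy even though 2% of requests take three seconds.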
05 / Environment Optimization

Optimize Your Datadog Spend

The agent continuously audits your Datadog environment to eliminate waste, right-size configurations, and keep costs predictable.

  • Identifies unused custom metrics
  • Optimizes log pipelines and exclusion filters
  • Right-sizes APM sampling rates
  • Audits tag cardinality to prevent cost spikes
app.datadoghq.com/account/usage
  • 1,247 Custom Metrics (↓ 12% after audit)
  • 8.2 GB Daily Log Volume (↓ 34% with filters)
  • 94% APM Trace Coverage (↓ optimized sampling)
Monthly Cost by Category (chart): Infra, Logs, APM, RUM, Metrics, Synth
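
To show why auditing tag cardinality matters for cost, the sketch below counts distinct time series under the usual rule that each unique combination of metric name and tag values is billed as one custom metric. The metric name, tags, and data points are hypothetical, and count_series is an illustrative helper, not part of any Datadog client.

```python
def count_series(points, keep_tags):
    """Count distinct (metric, tag-value combination) pairs, keeping only keep_tags."""
    series = set()
    for point in points:
        kept = tuple(sorted(
            (key, value) for key, value in point["tags"].items() if key in keep_tags
        ))
        series.add((point["metric"], kept))
    return len(series)


# Two hosts, each emitting points tagged with three distinct request IDs.
points = [
    {
        "metric": "checkout.latency",
        "tags": {"host": host, "env": "prod", "request_id": f"req-{i}"},
    }
    for host in ("web-1", "web-2")
    for i in range(3)
]

with_request_id = count_series(points, {"host", "env", "request_id"})
without_request_id = count_series(points, {"host", "env"})

print(with_request_id, without_request_id)
```

With the unbounded request_id tag, the series count grows with traffic (6 here). Dropping it leaves one series per host (2). The same pattern at production scale is what drives cost spikes.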

Transform Your Datadog Experience

Let Omlet Agent handle the complexity of Datadog so your team can focus on building.