llm-obs-experiment-py-bootstrap

Skill

from datadog-labs/agent-skills

What it does

Official Datadog skills for AI agents — monitors, logs, APM traces, documentation search, LLM observability, browser SDK, and audit trail investigations.

Same repository

datadog-labs/agent-skills (35 items)

llm-obs-experiment-py-bootstrap

Installation

Vibe Index Install: installs to .claude/skills/

npx vibeindex add datadog-labs/agent-skills --skill llm-obs-experiment-py-bootstrap

skills.sh Install: ⚠ installs to .agents/skills/

npx skills add datadog-labs/agent-skills --skill llm-obs-experiment-py-bootstrap

Manual Install: copy the SKILL.md content and save it to the path below

~/.claude/skills/llm-obs-experiment-py-bootstrap/SKILL.md

SKILL.md

172 installs

Added May 22, 2026

More from this repository (10)

dd-pup (Skill)

A skill for the Datadog CLI (pup) built in Rust with OAuth2 authentication. Enables searching logs, listing monitors, querying metrics, triaging security signals, and managing incidents and downtimes.

agent-skills (Skill)

Five essential Datadog skills for AI agents including CLI commands, monitor management, log search, APM traces, and documentation search. Compatible with Claude Code, Codex CLI, Gemini CLI, Cursor, and other agents.

dd-logs (Skill)

A Datadog agent skill for log management including search, pipelines, archives, and cost control, using the Datadog Pup CLI tool.

dd-apm (Skill)

A Datadog agent skill for APM (Application Performance Monitoring) including distributed tracing, service maps, and performance analysis using the Datadog Pup CLI.

dd-monitors (Skill)

A Datadog agent skill for monitor management including creating, updating, muting, and alerting best practices using the Datadog Pup CLI tool.

dd-docs (Skill)

A Datadog agent skill for documentation lookup using docs.datadoghq.com/llms.txt and linked Markdown pages, enabling efficient access to Datadog's product documentation.

llm-obs-trace-rca (Skill)

Root-causes production LLM failures by analyzing eval judge verdicts and runtime errors across Datadog LLM Observability traces. Outputs a failure taxonomy that can seed evaluator generation via the companion eval-bootstrap skill.

llm-obs-experiment-analyzer (Skill)

Analyzes and compares offline LLM experiments in Datadog LLM Observability, supporting single experiment analysis, baseline/candidate comparison, targeted questions, and optional export to Datadog notebooks.

llm-obs-eval-bootstrap (Skill)

Generates evaluator code from Datadog LLM Observability production traces, optionally seeded by root-cause analysis output. Part of the Datadog LLMO eval pipeline for diagnosing failures and building automated evaluators.

llm-obs-eval-pipeline (Skill)

A Datadog skill that runs an end-to-end LLM Observability evaluation pipeline: classifying user sessions, diagnosing failures via root cause analysis of eval judge verdicts, and bootstrapping evaluator code to capture discovered failure patterns. Integrates with Datadog's LLM Observability and RUM data via MCP.