google-agents-cli-observability
google/agents-cli
Set up tracing, logging, and monitoring for ADK agents with Cloud Trace, BigQuery, and third-party platforms.
What is google-agents-cli-observability?
This skill guides you through configuring observability for Google ADK (Agent Development Kit) agents. It covers Cloud Trace (distributed tracing, always enabled), Prompt-Response Logging (GenAI interactions to GCS/BigQuery), BigQuery Agent Analytics (structured agent events), and third-party integrations (AgentOps, Phoenix, MLflow, etc.). Use it when you need to monitor deployed agents, debug production traffic, or set up logging and tracing.
- Enable Cloud Trace for distributed tracing of agent execution flow, latency, and errors via OpenTelemetry spans
- Configure Prompt-Response Logging to capture GenAI interactions (model, tokens, timing) exported to GCS and BigQuery
- Set up BigQuery Agent Analytics plugin for structured agent events (LLM calls, tool use, outcomes)
- Integrate third-party observability platforms (AgentOps, Phoenix, MLflow, Monocle, Weave, Arize AX, Freeplay)
- Provision infrastructure (service account, GCS bucket, BigQuery dataset) via Terraform for logging and analytics
- Troubleshoot observability issues and configure privacy modes for prompt-response capture
How to install google-agents-cli-observability
`npx skills add https://github.com/google/agents-cli --skill google-agents-cli-observability`
Prerequisites:
- agents-cli installed (via `uv tool install google-agents-cli`)
- Google Cloud project with appropriate IAM permissions
- For Prompt-Response Logging and BigQuery Analytics: run `agents-cli infra single-project --project PROJECT_ID` to provision Terraform resources before first deploy
- For Agent Runtime deployments: infrastructure must be provisioned before `agents-cli deploy` to avoid state mismatch
How to use google-agents-cli-observability
1. Run `agents-cli infra single-project --project PROJECT_ID` to provision the GCS bucket, BigQuery dataset, and service account (required for logging and analytics).
2. Cloud Trace is enabled by default; confirm by viewing traces in Cloud Console → Trace → Trace explorer.
3. For Prompt-Response Logging, set the `LOGS_BUCKET_NAME` environment variable and verify that the service account has `storage.objectCreator` on the bucket.
4. Enable BigQuery Agent Analytics with the `--bq-analytics` flag at scaffold time, then check `references/bigquery-agent-analytics.md` for configuration.
5. If needed, choose a third-party integration (AgentOps, Phoenix, MLflow, etc.) and follow the platform-specific setup in the ADK docs.
6. For Agent Runtime deployments, apply Terraform before the first deploy; for existing SDK-deployed instances, either switch to Terraform-managed or manually set env vars and IAM permissions.
7. Check `references/cloud-trace-and-logging.md` for environment variables, verification commands, and enabling or disabling observability locally.
Use cases
- Debug agent latency and execution flow using Cloud Trace span hierarchies
- Audit LLM interactions and ensure compliance with prompt-response logging to BigQuery
- Analyze agent behavior with structured events (tool use, outcomes) in BigQuery dashboards
- Monitor production ADK agents deployed to Agent Runtime, Cloud Run, or GKE
- Integrate with team collaboration platforms like AgentOps or Weave for session replays and timeline views
- ADK agent developers debugging production deployments
- DevOps engineers setting up monitoring infrastructure for agent applications
- Data analysts building custom dashboards from BigQuery agent events
- Teams using third-party observability platforms (AgentOps, Phoenix, MLflow, etc.)
- Compliance and audit teams requiring GenAI interaction logging
google-agents-cli-observability FAQ
Cloud Trace works out of the box with no setup. Prompt-Response Logging and BigQuery Agent Analytics require running `agents-cli infra single-project` to provision a GCS bucket, BigQuery dataset, and service account via Terraform. This must be done before the first `agents-cli deploy` for Agent Runtime deployments.
Cloud Trace (always on) provides distributed tracing of execution flow. Prompt-Response Logging captures GenAI interactions to GCS/BigQuery. BigQuery Agent Analytics logs structured agent events for analytics. Third-party integrations (AgentOps, Phoenix, etc.) offer specialized features like session replays or custom dashboards. You can combine them.
Cloud Trace spans are visible in Cloud Console → Trace → Trace explorer. Prompt-response logs go to GCS (JSONL) and BigQuery (via log sinks and external tables). BigQuery Agent Analytics events appear in a dedicated BigQuery table. Third-party platforms have their own dashboards.
You have two options: (1) Delete the SDK-deployed Reasoning Engine, run `agents-cli infra single-project`, then redeploy with Terraform; or (2) Keep the SDK instance and manually set environment variables and IAM permissions on the running instance via the vertexai client API. Option 1 is cleaner but loses sessions.
By default, only metadata (model name, tokens, timing) is logged. Capture is controlled by the `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` environment variable: `NO_CONTENT` (the default in scaffolded projects) logs metadata only, and `false` disables capture entirely. The OTel GenAI semantic conventions also define modes such as `span_only` and `span_and_event`, but the scaffolded `setup_telemetry()` collapses every non-`false` value to `NO_CONTENT`.
Full instructions (SKILL.md)
Source of truth, from google/agents-cli.
name: google-agents-cli-observability description: > This skill should be used when the user wants to "set up tracing", "monitor my ADK agent", "configure logging", "add observability", "debug production traffic", or needs guidance on monitoring deployed ADK (Agent Development Kit) agents. Covers Cloud Trace, prompt-response logging, BigQuery Agent Analytics, third-party integrations (AgentOps, Phoenix, MLflow, etc.), and troubleshooting. Part of the Google ADK (Agent Development Kit) skills suite. Do NOT use for deployment setup (use google-agents-cli-deploy) or API code patterns (use google-agents-cli-adk-code). metadata: author: Google license: Apache-2.0 version: 0.6.1 requires: bins: - agents-cli install: "uv tool install google-agents-cli"
ADK Observability Guide
Cloud Trace works out of the box, with no infrastructure needed. Prompt-response logging and BigQuery Agent Analytics require Terraform-provisioned infrastructure (service account, GCS bucket, BigQuery dataset). Run `agents-cli infra single-project --project PROJECT_ID` to provision these resources. See `references/cloud-trace-and-logging.md` for details, env vars, and verification commands. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` first.
Order of operations for agent_runtime deployments
For `deployment_target = agent_runtime`, run `agents-cli infra single-project` before the first `agents-cli deploy`. The Terraform module owns the entire Reasoning Engine resource (display_name, service account, deployment spec, env vars), so applying it after an SDK-based deploy creates a state mismatch: Terraform has no record of the SDK-deployed instance and cannot layer env vars onto it without taking ownership of the whole resource.
If you have already run `agents-cli deploy`, you have two options:
- Switch to Terraform-managed. Delete the SDK-deployed Reasoning Engine, then run `agents-cli infra single-project` followed by `agents-cli deploy`. Sessions and any in-flight state on the previous instance are lost.
- Keep the SDK-deployed instance. Skip `infra single-project` and set the observability env vars on the running instance directly via the `vertexai` client `update` API. You will also need to grant the instance's service account the IAM permissions required to emit telemetry: writing to the logs GCS bucket, BigQuery dataset access, log writer, etc. See `deployment/terraform/single-project/iam.tf` and `telemetry.tf` in your scaffolded project for the full set of bindings the Terraform module would otherwise provision. Terraform-managed env vars are not available in this mode.
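The ordering rule above can be sketched as a small planning helper. This is only an illustration: `plan_commands` and its parameters are hypothetical names, the helper does not run anything, and it assumes the two CLI invocations are exactly as written in this guide.

```python
from typing import List


def plan_commands(
    project_id: str,
    already_deployed: bool = False,
    keep_sdk_instance: bool = False,
) -> List[List[str]]:
    """Return the CLI invocations, in order, for an agent_runtime deployment.

    Hypothetical helper: nothing is executed here, the caller decides how
    (and whether) to run each command.
    """
    infra = ["agents-cli", "infra", "single-project", "--project", project_id]
    deploy = ["agents-cli", "deploy"]

    if already_deployed and keep_sdk_instance:
        # Option 2: Terraform is skipped entirely; env vars and IAM
        # bindings are set by hand on the running instance.
        return []

    # First deploy, or option 1 once the SDK-deployed Reasoning Engine has
    # been deleted manually: infrastructure always comes before deploy.
    return [infra, deploy]


for cmd in plan_commands("PROJECT_ID"):
    print(" ".join(cmd))
```

For the "keep the SDK-deployed instance" path the plan is empty on purpose, since neither command applies in that mode.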
Reference Files
| File | Contents |
|---|---|
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |
Observability Tiers
Choose the right level of observability based on your needs:
| Tier | What It Does | Scope | Default State | Best For |
|---|---|---|---|---|
| Cloud Trace | Distributed tracing — execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| Prompt-Response Logging | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| BigQuery Agent Analytics | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (--bq-analytics at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| Third-Party Integrations | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |
Ask the user which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.
Cloud Trace
ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.
Span Hierarchy
    invocation
    └── agent_run (one per agent in the chain)
        ├── call_llm (model request/response)
        └── execute_tool (tool execution)
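To make the hierarchy concrete, here is a minimal sketch that models it as a plain tree and finds the slowest leaf, which is often the first question when debugging latency. The `Span` class, the helper functions, and the durations are all hypothetical; this is not the OpenTelemetry API and the numbers are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for a trace span: a name, a duration, children."""
    name: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, parents before children."""
    lines = [f"{'  ' * depth}{span.name} ({span.duration_ms:.0f} ms)"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


def slowest_leaf(span: Span) -> Span:
    """Return the leaf span with the largest duration."""
    if not span.children:
        return span
    return max((slowest_leaf(c) for c in span.children),
               key=lambda s: s.duration_ms)


# Illustrative durations only; real values come from Trace explorer.
trace = Span("invocation", 1200, [
    Span("agent_run", 1150, [
        Span("call_llm", 800),
        Span("execute_tool", 300),
    ]),
])

print("\n".join(render(trace)))
print("slowest leaf:", slowest_leaf(trace).name)
```

In a multi-agent chain there would be one `agent_run` subtree per agent, and the same traversal applies.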
Setup by Deployment Type
| Deployment | Setup |
|---|---|
| Agent Runtime | Automatic — traces are exported to Cloud Trace by default |
| Cloud Run (scaffolded) | Automatic — setup_telemetry() configures Cloud Trace/Logging exporters |
| GKE (scaffolded) | Automatic — setup_telemetry() configures Cloud Trace/Logging exporters |
| Cloud Run / GKE (manual) | Configure OpenTelemetry exporter in your app |
| Local dev | Works with agents-cli playground; traces visible in Cloud Console |
View traces: Cloud Console → Trace → Trace explorer
For detailed setup instructions (Agent Runtime CLI/SDK, Cloud Run, custom deployments), fetch https://adk.dev/integrations/cloud-trace/index.md.
Prompt-Response Logging
Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL) and BigQuery (via direct log sinks and external tables). Privacy-preserving by default — only metadata is logged unless explicitly configured otherwise.
Key env var: OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT — OTel GenAI semantic-conventions standard (modes: span_only, event_only, span_and_event, no_content). The scaffolded setup_telemetry() collapses every non-false value to NO_CONTENT (metadata-only); false disables capture. Logging is disabled locally unless LOGS_BUCKET_NAME is set.
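The behaviour described above can be sketched as a small resolver. This is an approximation under stated assumptions, not the scaffolded `setup_telemetry()` itself: `effective_capture_mode` is a hypothetical name, treating an unset variable as `NO_CONTENT` is an assumption, and so is the case-insensitive comparison against `false`.

```python
from typing import Mapping, Optional

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def effective_capture_mode(env: Mapping[str, str]) -> Optional[str]:
    """Return "NO_CONTENT" when metadata-only logging is active, else None.

    Hypothetical helper mirroring the rules in this section.
    """
    if not env.get("LOGS_BUCKET_NAME"):
        # No bucket configured: prompt-response logging stays off,
        # which is the local default.
        return None

    # Assumption: an unset variable behaves like the scaffolded default.
    raw = env.get(CAPTURE_VAR, "NO_CONTENT")
    if raw.strip().lower() == "false":
        # Explicitly disabled.
        return None

    # Every other value, including span_only or span_and_event,
    # collapses to metadata-only capture.
    return "NO_CONTENT"


print(effective_capture_mode({"LOGS_BUCKET_NAME": "my-logs", CAPTURE_VAR: "span_only"}))
```

Passing `os.environ` instead of a literal dict would evaluate the current process environment the same way.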
For scaffolded project details (Terraform resources, env vars, privacy modes, enabling/disabling, verification commands), see references/cloud-trace-and-logging.md.
For ADK logging docs (log levels, configuration, debugging), fetch https://adk.dev/observability/logging/index.md.
BigQuery Agent Analytics Plugin
Optional plugin that logs structured agent events to BigQuery. Enable with --bq-analytics at scaffold time. See references/bigquery-agent-analytics.md for details.
Third-Party Integrations
ADK supports several third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.
| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
|---|---|---|---|
| AgentOps | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| Arize AX | Commercial platform, production monitoring, evaluation dashboards | Low | No (SaaS) |
| Phoenix | Open-source, custom evaluators, experiment testing | Low | Yes |
| MLflow | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| Monocle | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| Weave | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| Freeplay | Prompt management + evals + observability in one platform | Low | No (SaaS) |
Ask the user which platform they prefer — present the trade-offs and let them choose. For setup details, fetch the relevant ADK docs page from the Deep Dive table below.
Troubleshooting
| Issue | Solution |
|---|---|
| No traces in Cloud Trace | Verify setup_telemetry() runs at startup and the service account has the cloudtrace.agent role |
| Prompt-response data not appearing | Check LOGS_BUCKET_NAME is set; verify SA has storage.objectCreator on the bucket; check app logs for telemetry setup warnings |
| Privacy mode misconfigured | Check OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT value — use NO_CONTENT for metadata-only, false to disable |
| BigQuery Analytics not logging | Verify plugin is configured in app/agent.py; check BQ_ANALYTICS_DATASET_ID env var is set |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under execute_tool — check trace explorer filters |
| High telemetry costs | Switch to NO_CONTENT mode; reduce BigQuery retention; disable unused tiers |
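Several rows in the table come down to a missing or unexpected environment variable, so a quick preflight check may save a round trip to the console. The sketch below is a hypothetical helper (`preflight` is not part of agents-cli); it only inspects the variable names mentioned in this guide and does not verify IAM roles or that `setup_telemetry()` actually runs.

```python
from typing import List, Mapping

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def preflight(env: Mapping[str, str], bq_analytics: bool = False) -> List[str]:
    """Return warnings for settings the troubleshooting table flags."""
    warnings: List[str] = []

    if not env.get("LOGS_BUCKET_NAME"):
        warnings.append(
            "LOGS_BUCKET_NAME is unset: prompt-response data will not appear")

    # Only relevant when the project was scaffolded with --bq-analytics.
    if bq_analytics and not env.get("BQ_ANALYTICS_DATASET_ID"):
        warnings.append(
            "BQ_ANALYTICS_DATASET_ID is unset: BigQuery Analytics will not log")

    mode = env.get(CAPTURE_VAR)
    if mode is not None and mode not in ("NO_CONTENT", "false"):
        warnings.append(
            f"{CAPTURE_VAR}={mode!r}: use NO_CONTENT for metadata-only "
            "or false to disable")

    return warnings


for line in preflight({"LOGS_BUCKET_NAME": "my-logs"}, bq_analytics=True):
    print("WARN:", line)
```

Calling `preflight(os.environ, bq_analytics=True)` inside the deployed container checks the real environment; an empty list means none of these three settings looks wrong.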
Deep Dive: ADK Docs (WebFetch URLs)
For detailed documentation beyond what this skill covers, fetch these pages:
| Topic | URL |
|---|---|
| Observability overview | https://adk.dev/observability/index.md |
| Agent activity logging | https://adk.dev/observability/logging/index.md |
| Cloud Trace integration | https://adk.dev/integrations/cloud-trace/index.md |
| BigQuery Agent Analytics | https://adk.dev/integrations/bigquery-agent-analytics/index.md |
| AgentOps | https://adk.dev/integrations/agentops/index.md |
| Arize AX | https://adk.dev/integrations/arize-ax/index.md |
| Phoenix (Arize) | https://adk.dev/integrations/phoenix/index.md |
| MLflow tracing | https://adk.dev/integrations/mlflow-tracing/index.md |
| Monocle | https://adk.dev/integrations/monocle/index.md |
| W&B Weave | https://adk.dev/integrations/weave/index.md |
| Freeplay | https://adk.dev/integrations/freeplay/index.md |
Related Skills
- `/google-agents-cli-deploy`: Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-workflow`: Development workflow, coding guidelines, and operational rules
- `/google-agents-cli-adk-code`: ADK Python API quick reference for writing agent code