service-mesh-observability

Name: service-mesh-observability
Author: wshobson

How to install service-mesh-observability

npx skills add https://github.com/wshobson/agents --skill service-mesh-observability

Full instructions (SKILL.md)

Source of truth, from wshobson/agents.

name: service-mesh-observability
description: Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.

Service Mesh Observability

A guide to observability patterns for Istio, Linkerd, and other service mesh deployments.

When to Use This Skill

Setting up distributed tracing across services
Implementing service mesh metrics and dashboards
Debugging latency and error issues
Defining SLOs for service communication
Visualizing service dependencies
Troubleshooting mesh connectivity

Core Concepts

1. Three Pillars of Observability

┌─────────────────────────────────────────────────────┐
│                  Observability                       │
├─────────────────┬─────────────────┬─────────────────┤
│     Metrics     │     Traces      │      Logs       │
│                 │                 │                 │
│ • Request rate  │ • Span context  │ • Access logs   │
│ • Error rate    │ • Latency       │ • Error details │
│ • Latency P50   │ • Dependencies  │ • Debug info    │
│ • Saturation    │ • Bottlenecks   │ • Audit trail   │
└─────────────────┴─────────────────┴─────────────────┘

2. Golden Signals for Mesh

Signal	Description	Alert Threshold
Latency	Request duration P50, P99	P99 > 500ms
Traffic	Requests per second	Anomaly detection
Errors	5xx error rate	> 1%
Saturation	Resource utilization	> 80%
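
The thresholds in the table above can be sketched as a simple check over one window of request data. This is a minimal illustration, not the skill's own template: the WindowStats shape, the function names, and the nearest-rank percentile method are assumptions made for the example. Traffic is left out because the table calls for anomaly detection rather than a fixed threshold.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Hypothetical per-service stats for one time window."""
    latencies_ms: list[float]  # one entry per request in the window
    error_5xx: int             # number of 5xx responses in the window
    utilization: float         # resource utilization as a fraction, 0.0 to 1.0


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in the range (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def golden_signal_alerts(stats: WindowStats) -> list[str]:
    """Return the names of the signals that breach the table's thresholds."""
    alerts = []
    total = len(stats.latencies_ms)
    if total and percentile(stats.latencies_ms, 99) > 500:  # P99 > 500ms
        alerts.append("latency")
    if total and stats.error_5xx / total > 0.01:            # 5xx rate > 1%
        alerts.append("errors")
    if stats.utilization > 0.80:                            # saturation > 80%
        alerts.append("saturation")
    return alerts


if __name__ == "__main__":
    window = WindowStats(
        latencies_ms=[100.0] * 98 + [600.0, 700.0],
        error_5xx=2,
        utilization=0.5,
    )
    print(golden_signal_alerts(window))
```

In a real mesh these checks would more likely be expressed as alerting rules in the metrics backend; the Python form only makes the arithmetic behind each threshold explicit.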

Templates and detailed worked examples

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.

Best Practices

Do's

Sample appropriately - 100% in dev, 1-10% in prod
Use trace context - Propagate headers consistently
Set up alerts - For golden signals
Correlate metrics/traces - Use exemplars
Retain strategically - Hot/cold storage tiers
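
The first two items above (sampling rates and consistent header propagation) can be illustrated together. The sketch below assumes the W3C Trace Context traceparent header format (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags) and uses a deterministic decision based on the low 64 bits of the trace ID, so every service reaches the same sampling decision for a given trace. The function names and the exact bucketing rule are assumptions for illustration; real tracing SDKs and mesh proxies ship their own samplers and propagators.

```python
import re

# traceparent: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header: str):
    """Parse a traceparent header; return None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    _version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid in the Trace Context format.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def should_sample(trace_id: str, ratio: float) -> bool:
    """Deterministic ratio sampling keyed on the low 64 bits of the trace ID."""
    if ratio >= 1.0:
        return True   # e.g. 100% in dev
    if ratio <= 0.0:
        return False
    bucket = int(trace_id[-16:], 16)
    return bucket < int(ratio * 2**64)


def outgoing_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Build the header to attach to downstream requests."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


if __name__ == "__main__":
    incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    ctx = parse_traceparent(incoming)
    print(ctx)
    print(should_sample(ctx["trace_id"], 0.05))
```

Keying the decision on the trace ID rather than a fresh random number means a trace is either kept in full or dropped in full, which avoids partial traces when services sample independently.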

Don'ts

Don't over-sample - Storage costs add up
Don't ignore cardinality - Limit label values
Don't skip dashboards - Visualize dependencies
Don't forget costs - Monitor observability costs
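
As a rough illustration of limiting label values, the sketch below collapses request paths into route-like templates and caps the number of distinct values a label may take. The helper names, the ":id" and ":uuid" placeholders, and the "other" overflow bucket are all hypothetical choices for this example, not part of any mesh or metrics library.

```python
import re

_NUMERIC = re.compile(r"\d+")
_UUID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def normalize_path(path: str) -> str:
    """Replace ID-like path segments so the label has few distinct values."""
    segments = []
    for segment in path.split("?")[0].split("/"):  # drop the query string
        if _NUMERIC.fullmatch(segment):
            segments.append(":id")
        elif _UUID.fullmatch(segment):
            segments.append(":uuid")
        else:
            segments.append(segment)
    return "/".join(segments)


class LabelLimiter:
    """Allow at most max_values distinct label values; bucket the rest."""

    def __init__(self, max_values: int) -> None:
        self.max_values = max_values
        self._seen: set[str] = set()

    def limit(self, value: str) -> str:
        if value in self._seen:
            return value
        if len(self._seen) < self.max_values:
            self._seen.add(value)
            return value
        return "other"  # overflow bucket keeps series count bounded


if __name__ == "__main__":
    limiter = LabelLimiter(max_values=100)
    raw = "/users/12345/orders/9?page=2"
    print(limiter.limit(normalize_path(raw)))
```

Normalizing first and capping second keeps the cap as a safety net: most traffic maps to a small, stable set of templates, and only unexpected paths fall into the overflow bucket.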
