Skill

Pass

Audit score 90

prompt-engineering-patterns

wshobson/agents

Master advanced prompt engineering patterns to optimize LLM performance and reliability.

What is prompt-engineering-patterns?

This skill teaches advanced prompt engineering techniques for production LLM applications, including few-shot learning, chain-of-thought reasoning, structured outputs, and prompt optimization. Use it when you need to design complex prompts, improve consistency, debug prompt issues, or implement specialized reasoning patterns.

Few-shot learning with semantic similarity and dynamic example selection
Chain-of-thought prompting including zero-shot, few-shot, and self-consistency techniques
Structured outputs using JSON mode and Pydantic schema enforcement
Iterative prompt optimization with A/B testing and performance measurement
Template systems with variable interpolation and conditional sections
System prompt design for behavior control and output formatting

How to install prompt-engineering-patterns

npx skills add https://github.com/wshobson/agents --skill prompt-engineering-patterns

How to use prompt-engineering-patterns

1. Define your output schema using Pydantic BaseModel with Field descriptions
2. Create a ChatPromptTemplate with system and user message roles
3. Initialize your LLM and apply structured output constraints using with_structured_output()
4. Build a chain combining the prompt template and structured LLM
5. Test extensively on diverse inputs and track performance metrics like accuracy and token usage
6. Iterate on prompt wording, examples, and structure based on performance results

Use cases

Good for

Designing SQL generation prompts with structured output validation
Building few-shot learning systems that dynamically select relevant examples
Implementing chain-of-thought reasoning for complex multi-step problems
Optimizing prompts for production by measuring accuracy, consistency, and token usage
Creating reusable prompt templates with role-based composition

Who it's for

LLM application developers
Prompt engineers optimizing production systems
AI assistants requiring specialized behavior
Teams building few-shot learning systems
Developers implementing structured reasoning patterns

prompt-engineering-patterns FAQ

When should I use few-shot vs. zero-shot prompting?

Start with zero-shot for simplicity. Use few-shot when zero-shot produces inconsistent results or when you need to teach the model a specific format or reasoning style. Balance example count with context window constraints.
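A minimal sketch of the difference, using plain string assembly. The helper `build_prompt`, the "Input:/Output:" layout, and the sentiment examples are assumptions introduced for illustration; they are not part of the skill.

```python
from typing import Optional


def build_prompt(task: str, query: str, examples: Optional[list[tuple[str, str]]] = None) -> str:
    """Return a zero-shot prompt when no examples are given, else a few-shot prompt."""
    parts = [task.strip(), ""]
    # Each demonstration is an input-output pair in the same layout as the final query.
    for example_input, example_output in examples or []:
        parts += [f"Input: {example_input}", f"Output: {example_output}", ""]
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)


task = "Classify the sentiment as positive or negative."
zero_shot = build_prompt(task, "The battery died in a day.")
few_shot = build_prompt(
    task,
    "The battery died in a day.",
    examples=[("I love this phone.", "positive"), ("The screen cracked immediately.", "negative")],
)
print(few_shot)
```

Keeping both variants behind one function makes it cheap to start zero-shot and add examples only when results turn out inconsistent.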

How do I handle prompts that exceed token limits?

Reduce example count, use semantic similarity to select only the most relevant examples, shorten explanations, or implement dynamic example retrieval that adapts to available context.
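One way to adapt the example count to the available context is a greedy budget filter. This is a sketch under stated assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, and `fit_examples` is a hypothetical helper that expects its input already ranked by relevance.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def fit_examples(examples: list[str], budget_tokens: int) -> list[str]:
    """Keep examples in priority order until the token budget is spent."""
    kept, used = [], 0
    for example in examples:  # assumed to be sorted most-relevant first
        cost = estimate_tokens(example)
        if used + cost > budget_tokens:
            continue  # skip this one; a shorter example later may still fit
        kept.append(example)
        used += cost
    return kept


ranked = ["a" * 40, "b" * 400, "c" * 20]  # placeholder example texts
print(len(fit_examples(ranked, budget_tokens=20)))
```

In practice the estimate would be replaced by the tokenizer of the target model, since character heuristics can be far off for code or non-English text.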

What's the difference between chain-of-thought and structured outputs?

Chain-of-thought elicits step-by-step reasoning in natural language; structured outputs enforce a specific schema (JSON/Pydantic) for reliable parsing. Use both together for reasoning + reliable extraction.
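The combination can be sketched with the standard library alone: the prompt asks for reasoning first and a machine-readable answer second, and the parser checks both fields. `PROMPT_TEMPLATE`, `ReasonedAnswer`, and the sample reply are hypothetical, and a dataclass stands in here for the Pydantic schema the skill recommends.

```python
import json
from dataclasses import dataclass

PROMPT_TEMPLATE = (
    "Solve the problem. Think step by step, then respond with JSON only, "
    'using the keys "reasoning" (string) and "answer" (number).\n\n'
    "Problem: {problem}"
)


@dataclass
class ReasonedAnswer:
    reasoning: str  # free-text chain of thought
    answer: float   # machine-readable final result


def parse_reasoned_answer(raw: str) -> ReasonedAnswer:
    """Parse the model's JSON reply and check the two expected fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("reasoning"), str):
        raise ValueError("missing or non-string 'reasoning'")
    answer = data.get("answer")
    if isinstance(answer, bool) or not isinstance(answer, (int, float)):
        raise ValueError("missing or non-numeric 'answer'")
    return ReasonedAnswer(reasoning=data["reasoning"], answer=float(answer))


# Hand-written stand-in for a model reply, for illustration only.
sample_reply = '{"reasoning": "3 boxes of 4 is 12; 12 minus 2 is 10.", "answer": 10}'
parsed = parse_reasoned_answer(sample_reply)
print(parsed.answer)
```

Listing the reasoning key before the answer key is a deliberate choice in this sketch: it nudges the model to write its steps before committing to a result.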

How do I measure if my prompt optimization is working?

Track metrics like accuracy (correctness), consistency (reproducibility), latency (response time), token usage, success rate (valid outputs), and user satisfaction. Compare baseline vs. optimized versions.
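A baseline-versus-optimized comparison can be as small as the sketch below. The function `summarize` and the result lists are made up for illustration; `None` is assumed to mark an output that failed to parse.

```python
from typing import Optional


def summarize(outputs: list[Optional[str]], expected: list[str]) -> dict[str, float]:
    """Compute accuracy and success rate over one evaluation run."""
    total = len(expected)
    valid = sum(1 for out in outputs if out is not None)
    correct = sum(1 for out, want in zip(outputs, expected) if out == want)
    return {"accuracy": correct / total, "success_rate": valid / total}


# Made-up results for illustration only.
expected = ["A", "B", "C", "D"]
baseline = summarize(["A", None, "C", "X"], expected)
optimized = summarize(["A", "B", "C", "X"], expected)
delta = {name: optimized[name] - baseline[name] for name in baseline}
print(delta)
```

Both runs must use the same evaluation set, otherwise the difference reflects the inputs rather than the prompt.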

Should I version control my prompts?

Yes. Treat prompts as code with version control, documentation of intent, and change tracking. This enables rollback, comparison, and understanding why specific wording was chosen.
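Alongside a version control system, a content hash gives each prompt a stable identifier that can be logged with every request. This is a sketch only; `PromptVersion` and its fields are hypothetical names, not something the skill defines.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    name: str
    template: str
    intent: str  # why the prompt is worded this way

    @property
    def version_id(self) -> str:
        """Short content hash: changes whenever the template text changes."""
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]


v1 = PromptVersion("sql-gen", "Convert this to SQL: {query}", "Initial baseline.")
v2 = PromptVersion("sql-gen", "Convert this to PostgreSQL: {query}", "Pin the dialect to reduce ambiguity.")
print(v1.version_id, v2.version_id)
```

Logging `version_id` next to each output makes it possible to trace a regression back to the exact wording that produced it.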

Full instructions (SKILL.md)

Source of truth, from wshobson/agents.

name: prompt-engineering-patterns
description: >-
  This skill should be used when the user asks to "optimize a prompt", "improve prompt performance", "design a prompt template", "write better prompts", "debug prompt issues", "use chain-of-thought", "structured prompting", "few-shot prompting", or wants to apply advanced prompt engineering patterns for production LLM applications.

Prompt Engineering Patterns

Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability.

When to Use This Skill

Designing complex prompts for production LLM applications
Optimizing prompt performance and consistency
Implementing structured reasoning patterns (chain-of-thought, tree-of-thought)
Building few-shot learning systems with dynamic example selection
Creating reusable prompt templates with variable interpolation
Debugging and refining prompts that produce inconsistent outputs
Implementing system prompts for specialized AI assistants
Using structured outputs (JSON mode) for reliable parsing

Core Capabilities

1. Few-Shot Learning

Example selection strategies (semantic similarity, diversity sampling)
Balancing example count with context window constraints
Constructing effective demonstrations with input-output pairs
Dynamic example retrieval from knowledge bases
Handling edge cases through strategic example selection
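Similarity-based example selection can be sketched without any external service. The code below swaps in bag-of-words cosine similarity as a stand-in for embedding similarity, which is an assumption made to keep the example self-contained; `select_examples` and the example pool are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude substitute for an embedding vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_examples(query: str, pool: list[dict], k: int = 2) -> list[dict]:
    """Return the k examples whose 'input' text is most similar to the query."""
    query_vector = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(query_vector, bag_of_words(ex["input"])), reverse=True)
    return ranked[:k]


pool = [
    {"input": "count users by country", "output": "SELECT country, COUNT(*) FROM users GROUP BY country"},
    {"input": "list orders from last week", "output": "SELECT * FROM orders WHERE created_at >= NOW() - INTERVAL '7 days'"},
    {"input": "count orders by status", "output": "SELECT status, COUNT(*) FROM orders GROUP BY status"},
]
chosen = select_examples("count users by signup month", pool, k=2)
print([ex["input"] for ex in chosen])
```

Replacing `bag_of_words` and `cosine` with an embedding model and a vector index would leave `select_examples` structurally unchanged.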

2. Chain-of-Thought Prompting

Step-by-step reasoning elicitation
Zero-shot CoT with "Let's think step by step"
Few-shot CoT with reasoning traces
Self-consistency techniques (sampling multiple reasoning paths)
Verification and validation steps
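Self-consistency reduces to sampling several reasoning paths and taking a majority vote over their final answers. In this sketch the sampled completions are hand-written stand-ins, and the "Answer:" marker is an assumed output convention that the prompt would have to request.

```python
from collections import Counter


def extract_answer(reasoning_path: str) -> str:
    """Take the text after the last 'Answer:' marker (an assumed output convention)."""
    return reasoning_path.rsplit("Answer:", 1)[-1].strip()


def self_consistent_answer(paths: list[str]) -> tuple[str, float]:
    """Majority-vote over final answers; return the winner and its vote share."""
    votes = Counter(extract_answer(path) for path in paths)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(paths)


# Hand-written stand-ins for completions sampled at a non-zero temperature.
sampled = [
    "15 - 6 = 9, then 9 + 4 = 13. Answer: 13",
    "Start with 15, remove 6, add 4. Answer: 13",
    "15 - 4 = 11. Answer: 11",
]
answer, agreement = self_consistent_answer(sampled)
print(answer, agreement)
```

The vote share doubles as a cheap confidence signal: low agreement suggests the question deserves a verification step or human review.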

3. Structured Outputs

JSON mode for reliable parsing
Pydantic schema enforcement
Type-safe response handling
Error handling for malformed outputs
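Error handling for malformed outputs often takes the form of a validate-and-retry loop. The sketch below assumes a caller-supplied `call_model` function (hypothetical, taking a prompt and returning raw text) and checks only for required keys; a Pydantic model would do stricter validation.

```python
import json
from typing import Callable


def parse_with_retry(
    call_model: Callable[[str], str],
    base_prompt: str,
    required_keys: set[str],
    max_attempts: int = 3,
) -> dict:
    """Call the model until it returns a JSON object containing the required keys."""
    prompt = base_prompt
    last_error = "no attempts made"
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc.msg}"
        else:
            if isinstance(data, dict) and required_keys.issubset(data):
                return data
            last_error = "JSON did not contain the required keys"
        # Feed the failure back so the next attempt can correct it.
        prompt = f"{base_prompt}\n\nYour previous reply was rejected ({last_error}). Reply with valid JSON only."
    raise ValueError(f"no valid output after {max_attempts} attempts: {last_error}")


# Fake model for illustration: fails once, then returns valid JSON.
replies = iter(["Sure! Here is the query...", '{"query": "SELECT 1", "explanation": "Trivial."}'])
result = parse_with_retry(lambda _prompt: next(replies), "Return SQL as JSON.", {"query", "explanation"})
print(result["query"])
```

Capping the attempts keeps cost and latency bounded, and raising after the last failure forces the caller to decide on a fallback.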

4. Prompt Optimization

Iterative refinement workflows
A/B testing prompt variations
Measuring prompt performance metrics (accuracy, consistency, latency)
Reducing token usage while maintaining quality
Handling edge cases and failure modes
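A/B testing prompt variations needs two pieces: a stable assignment of users to variants and a tally of outcomes. The following is a sketch with hypothetical helper names and made-up events; it does not include a significance test, which a real comparison would need before declaring a winner.

```python
import hashlib
from collections import defaultdict


def assign_variant(user_id: str, variants: list[str]) -> str:
    """Deterministically map a user to a variant so repeat requests see the same prompt."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return variants[int.from_bytes(digest[:4], "big") % len(variants)]


def success_rates(events: list[tuple[str, bool]]) -> dict[str, float]:
    """Aggregate (variant, succeeded) events into a per-variant success rate."""
    totals, wins = defaultdict(int), defaultdict(int)
    for variant, succeeded in events:
        totals[variant] += 1
        wins[variant] += int(succeeded)
    return {variant: wins[variant] / totals[variant] for variant in totals}


variants = ["prompt_a", "prompt_b"]
# Made-up outcomes for illustration only.
events = [("prompt_a", True), ("prompt_a", False), ("prompt_b", True), ("prompt_b", True)]
print(assign_variant("user-42", variants), success_rates(events))
```

Hashing the user ID instead of choosing at random avoids storing assignments while still keeping each user's experience consistent.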

5. Template Systems

Variable interpolation and formatting
Conditional prompt sections
Multi-turn conversation templates
Role-based prompt composition
Modular prompt components
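Variable interpolation and conditional sections can be illustrated with `string.Template` from the standard library. The section layout and the `render_prompt` helper are assumptions for this sketch; the skill's own examples use LangChain's `ChatPromptTemplate` instead.

```python
from string import Template
from typing import Optional

# Modular components: each section is its own small template.
BASE = Template("You are $role.\n\nTask: $task")
CONTEXT_SECTION = Template("\n\nContext:\n$context")
FORMAT_SECTION = Template("\n\nRespond in this format:\n$output_format")


def render_prompt(
    role: str,
    task: str,
    context: Optional[str] = None,
    output_format: Optional[str] = None,
) -> str:
    """Interpolate variables and append optional sections only when they have content."""
    prompt = BASE.substitute(role=role, task=task)
    if context:
        prompt += CONTEXT_SECTION.substitute(context=context)
    if output_format:
        prompt += FORMAT_SECTION.substitute(output_format=output_format)
    return prompt


print(render_prompt(
    "a careful SQL reviewer",
    "Review the query for injection risks.",
    output_format="A bulleted list",
))
```

`Template.substitute` raises `KeyError` when a variable is missing, which surfaces template mistakes early instead of sending a half-filled prompt to the model.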

6. System Prompt Design

Setting model behavior and constraints
Defining output formats and structure
Establishing role and expertise
Safety guidelines and content policies
Context setting and background information

Quick Start

import asyncio

from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

# Define structured output schema
class SQLQuery(BaseModel):
    query: str = Field(description="The SQL query")
    explanation: str = Field(description="Brief explanation of what the query does")
    tables_used: list[str] = Field(description="List of tables referenced")

# Initialize model with structured output
# (ChatAnthropic reads the API key from the ANTHROPIC_API_KEY environment variable)
llm = ChatAnthropic(model="claude-sonnet-4-6")
structured_llm = llm.with_structured_output(SQLQuery)

# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert SQL developer. Generate efficient, secure SQL queries.
    Always use parameterized queries to prevent SQL injection.
    Explain your reasoning briefly."""),
    ("user", "Convert this to SQL: {query}")
])

# Create chain
chain = prompt | structured_llm

# Use: ainvoke is a coroutine, so await it inside an async function
async def main() -> None:
    result = await chain.ainvoke({
        "query": "Find all users who registered in the last 30 days"
    })
    print(result.query)
    print(result.explanation)

asyncio.run(main())

Detailed patterns and worked examples

Detailed pattern documentation lives in references/details.md. Read that file when the overview above does not cover your case.

Best Practices

Be Specific: Vague prompts produce inconsistent results
Show, Don't Tell: Examples are more effective than descriptions
Use Structured Outputs: Enforce schemas with Pydantic for reliability
Test Extensively: Evaluate on diverse, representative inputs
Iterate Rapidly: Small changes can have large impacts
Monitor Performance: Track metrics in production
Version Control: Treat prompts as code with proper versioning
Document Intent: Explain why prompts are structured as they are

Common Pitfalls

Over-engineering: Starting with complex prompts before trying simple ones
Example pollution: Using examples that don't match the target task
Context overflow: Exceeding token limits with excessive examples
Ambiguous instructions: Leaving room for multiple interpretations
Ignoring edge cases: Not testing on unusual or boundary inputs
No error handling: Assuming outputs will always be well-formed
Hardcoded values: Not parameterizing prompts for reuse

Success Metrics

Track these KPIs for your prompts:

Accuracy: Correctness of outputs
Consistency: Reproducibility across similar inputs
Latency: Response time (P50, P95, P99)
Token Usage: Average tokens per request
Success Rate: Percentage of valid, parseable outputs
User Satisfaction: Ratings and feedback
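The latency and token KPIs can be computed with the standard `statistics` module. This is a sketch: the function names are hypothetical, the inclusive interpolation method is one of several reasonable choices, and the sample data is synthetic.

```python
import statistics


def latency_percentiles(latencies_ms: list[float]) -> dict[str, float]:
    """Return P50, P95, and P99 using inclusive interpolation over the observed samples."""
    # n=100 yields 99 cut points; index i - 1 holds the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def average_tokens(token_counts: list[int]) -> float:
    """Mean tokens per request."""
    return statistics.fmean(token_counts)


# Synthetic measurements for illustration only.
latencies = [float(ms) for ms in range(101)]
print(latency_percentiles(latencies))
print(average_tokens([100, 200, 300]))
```

Tail percentiles are only meaningful with enough samples; with a handful of requests, P99 is effectively just the maximum.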
