backtesting-frameworks
wshobson/agents
Build robust backtesting systems that avoid look-ahead bias and survivorship bias and account for transaction costs.
What is backtesting-frameworks?
A framework for developing production-grade backtesting systems for trading strategies. Use this when validating trading algorithms, building backtesting infrastructure, or comparing strategy alternatives while avoiding common pitfalls like overfitting and data leakage.
- Mitigate look-ahead bias using point-in-time data
- Handle survivorship bias by including delisted securities
- Account for transaction costs with realistic cost models
- Implement walk-forward analysis for robust validation
- Structure proper train/validation/test data splits
- Perform Monte Carlo analysis to understand uncertainty
How to install backtesting-frameworks
npx skills add https://github.com/wshobson/agents --skill backtesting-frameworks
How to use backtesting-frameworks
1. Understand the five main backtesting biases: look-ahead, survivorship, overfitting, selection, and transaction
2. Structure your historical data into training, validation, and test sets with no data leakage
3. Use point-in-time data to avoid look-ahead bias
4. Include realistic transaction costs in your simulations
5. Implement walk-forward analysis with rolling windows instead of single train/test splits
6. Reserve a final test set for performance evaluation only
Use cases
- Developing and validating trading strategy backtests
- Building backtesting infrastructure for multiple strategies
- Comparing alternative strategies with out-of-sample testing
- Implementing walk-forward analysis for parameter selection
- Avoiding overfitting through proper data partitioning
Who is it for
- Quantitative traders
- Strategy developers
- Algorithmic trading engineers
- Financial technologists building backtesting systems
backtesting-frameworks FAQ
Look-ahead bias occurs when your backtest uses future information that wouldn't be available at the time of trading. Avoid it by using point-in-time data and ensuring your data pipeline only accesses information available up to each decision point.
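One way to picture the "only information available up to each decision point" rule is to lag every signal by one bar. The sketch below is illustrative only: the moving-average rule, the function names `sma` and `lagged_positions`, and the sample prices are assumptions made for this example, not part of the skill.

```python
from typing import List, Optional


def sma(prices: List[float], window: int, end: int) -> Optional[float]:
    """Average of the `window` prices ending at index `end`, or None if too few."""
    if end + 1 < window:
        return None
    chunk = prices[end - window + 1 : end + 1]
    return sum(chunk) / window


def lagged_positions(prices: List[float], window: int) -> List[int]:
    """Position held during day t is decided from closes up to day t-1 only."""
    positions = [0] * len(prices)
    for t in range(1, len(prices)):
        avg = sma(prices, window, t - 1)  # never reads prices[t] or later
        if avg is not None and prices[t - 1] > avg:
            positions[t] = 1
    return positions


# Hypothetical daily closes, for illustration only.
closes = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
positions = lagged_positions(closes, window=2)
```

A quick sanity check for this kind of code is to alter a future price and confirm that earlier positions do not change.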
Survivorship bias occurs when you only test on securities that survived to the present. Including delisted securities gives a realistic picture of strategy performance, because many of the securities a strategy would have traded later failed or were delisted.
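A minimal sketch of a survivorship-aware universe, assuming you have listing and delisting dates for every security. The `Listing` class, `universe_as_of` function, and the tickers are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Listing:
    symbol: str
    listed: date
    delisted: Optional[date] = None  # None means still trading today


def universe_as_of(listings: List[Listing], as_of: date) -> List[str]:
    """Symbols that were tradable on `as_of`, including ones delisted later."""
    return sorted(
        item.symbol
        for item in listings
        if item.listed <= as_of and (item.delisted is None or as_of < item.delisted)
    )


# Hypothetical tickers and dates, for illustration only.
listings = [
    Listing("AAA", date(2005, 1, 3)),
    Listing("BBB", date(2006, 6, 1), delisted=date(2009, 3, 2)),
    Listing("CCC", date(2012, 5, 18)),
]
universe_2008 = universe_as_of(listings, date(2008, 1, 2))
universe_2015 = universe_as_of(listings, date(2015, 1, 2))
```

Building the universe from today's constituents would silently drop `BBB` from the 2008 backtest, which is exactly the bias described above.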
Walk-forward analysis uses rolling windows where you train on historical data, validate on the next period, then move the window forward. Use it instead of a single train/test split to get more robust estimates and avoid overfitting to specific time periods.
Limit the number of parameters you optimize, always reserve an out-of-sample test set that you don't touch during development, use walk-forward analysis, and consider pre-registering your strategy before testing.
Be careful with adjusted price data (split and dividend adjusted). Understand what adjustments were applied and ensure they're applied consistently. Point-in-time data should reflect prices as they were known at each historical date.
Full instructions (SKILL.md)
Source of truth, from wshobson/agents.
name: backtesting-frameworks
description: Build robust backtesting systems for trading strategies with proper handling of look-ahead bias, survivorship bias, and transaction costs. Use when developing trading algorithms, validating strategies, or building backtesting infrastructure.
Backtesting Frameworks
Build robust, production-grade backtesting systems that avoid common pitfalls and produce reliable strategy performance estimates.
When to Use This Skill
- Developing trading strategy backtests
- Building backtesting infrastructure
- Validating strategy performance
- Avoiding common backtesting biases
- Implementing walk-forward analysis
- Comparing strategy alternatives
Core Concepts
1. Backtesting Biases
| Bias | Description | Mitigation |
|---|---|---|
| Look-ahead | Using future information | Point-in-time data |
| Survivorship | Only testing on survivors | Use delisted securities |
| Overfitting | Curve-fitting to history | Out-of-sample testing |
| Selection | Cherry-picking strategies | Pre-registration |
| Transaction | Ignoring trading costs | Realistic cost models |
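As a rough illustration of the "realistic cost models" mitigation in the table, the sketch below charges a per-share commission with a minimum ticket fee plus slippage in basis points. Every default value (`0.005` per share, `5` bps, `1.0` minimum) and both function names are assumptions for the example, not figures from the skill; real costs depend on broker, venue, and order size.

```python
def trade_cost(
    shares: float,
    price: float,
    commission_per_share: float = 0.005,  # assumed broker fee
    slippage_bps: float = 5.0,            # assumed slippage, in basis points
    min_commission: float = 1.0,          # assumed minimum ticket fee
) -> float:
    """Estimated cost of one fill: commission plus slippage on notional."""
    if shares == 0:
        return 0.0
    notional = abs(shares) * price
    commission = max(min_commission, abs(shares) * commission_per_share)
    slippage = notional * slippage_bps / 10_000
    return commission + slippage


def net_pnl(shares: float, entry: float, exit_price: float) -> float:
    """Round-trip P&L after paying costs on both the entry and the exit."""
    gross = shares * (exit_price - entry)
    return gross - trade_cost(shares, entry) - trade_cost(shares, exit_price)


# Hypothetical trade: buy 1000 shares at 50.00, sell at 50.50.
net = net_pnl(1000, 50.0, 50.5)
```

Under these assumed parameters the 500.00 gross profit shrinks by roughly 12% once both legs are costed, which is why a strategy with thin per-trade edge can look profitable only when costs are ignored.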
2. Proper Backtest Structure
Historical Data
│
▼
┌─────────────────────────────────────────┐
│ Training Set │
│ (Strategy Development & Optimization) │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Validation Set │
│ (Parameter Selection, No Peeking) │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Test Set │
│ (Final Performance Evaluation) │
└─────────────────────────────────────────┘
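The three-way structure in the diagram can be sketched as a chronological split. This is a minimal example that assumes the rows are already sorted by time; the function name `chronological_split` and the 60/20/20 proportions are illustrative choices, not values prescribed by the skill.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    rows: Sequence[T], train_frac: float = 0.6, val_frac: float = 0.2
) -> Tuple[List[T], List[T], List[T]]:
    """Split time-ordered rows into train/validation/test without shuffling."""
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    # Slices are contiguous and ordered, so no later row leaks into an earlier set.
    return list(rows[:train_end]), list(rows[train_end:val_end]), list(rows[val_end:])


# Ten hypothetical observations, indexed in time order.
days = list(range(10))
train, val, test = chronological_split(days)
```

The key design choice is that nothing is shuffled: a random split would place future observations in the training set and reintroduce look-ahead bias.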
3. Walk-Forward Analysis
Window 1: [Train──────][Test]
Window 2:     [Train──────][Test]
Window 3:         [Train──────][Test]
Window 4:             [Train──────][Test]
                                     ─────▶ Time
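The rolling windows above can be generated with a few lines of index arithmetic. The sketch below is one possible layout, assuming fixed-length training windows that advance by one test period at a time; the function name and parameters are hypothetical.

```python
from typing import Iterator, Optional, Tuple


def walk_forward_windows(
    n_obs: int, train_size: int, test_size: int, step: Optional[int] = None
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs that roll forward through time."""
    step = test_size if step is None else step  # default: non-overlapping tests
    if min(train_size, test_size, step) <= 0:
        raise ValueError("train_size, test_size and step must be positive")
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


# Ten hypothetical observations, four for training and two for testing per window.
windows = list(walk_forward_windows(n_obs=10, train_size=4, test_size=2))
```

Each test range starts right after its training range ends, so parameters fitted on a window are only ever scored on data that follows it.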
Detailed worked examples and patterns
Detailed sections (starting with ## Implementation Patterns) live in references/details.md. Read that file when the navigation summary above is insufficient.
Best Practices
Do's
- Use point-in-time data - Avoid look-ahead bias
- Include transaction costs - Realistic estimates
- Test out-of-sample - Always reserve data
- Use walk-forward - Not just train/test
- Monte Carlo analysis - Understand uncertainty
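For the Monte Carlo item, one simple approach is to bootstrap the strategy's per-period returns and look at the spread of outcomes. The sketch below is an assumption-laden example: it resamples returns independently, which ignores autocorrelation and volatility clustering, and the function names, seed, and sample returns are all hypothetical.

```python
import random
from typing import List


def bootstrap_total_returns(
    returns: List[float], n_paths: int = 1000, seed: int = 42
) -> List[float]:
    """Resample returns with replacement and return sorted compounded totals."""
    rng = random.Random(seed)  # fixed seed so the experiment is reproducible
    totals = []
    for _ in range(n_paths):
        sample = rng.choices(returns, k=len(returns))
        growth = 1.0
        for r in sample:
            growth *= 1.0 + r
        totals.append(growth - 1.0)
    return sorted(totals)


def percentile(sorted_values: List[float], q: float) -> float:
    """Nearest-rank style percentile of an already sorted list, q in [0, 1]."""
    idx = min(len(sorted_values) - 1, max(0, int(q * len(sorted_values))))
    return sorted_values[idx]


# Hypothetical daily strategy returns, for illustration only.
daily = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012, -0.004]
totals = bootstrap_total_returns(daily, n_paths=500)
low, high = percentile(totals, 0.05), percentile(totals, 0.95)
```

The interval between `low` and `high` gives a rough sense of how much of the backtested result could be luck in the ordering and selection of returns; a block bootstrap would be a more faithful choice when returns are serially dependent.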
Don'ts
- Don't overfit - Limit parameters
- Don't ignore survivorship - Include delisted
- Don't use adjusted data carelessly - Understand adjustments
- Don't optimize on full history - Reserve test set
- Don't ignore capacity - Market impact matters