run-train
lllllllama/rigorpilot-skills
Execute and document deep learning training runs with reproducibility and status tracking.
What is run-train?
Rigor Train executes pre-selected training commands for deep learning research repositories and writes standardized evidence (logs, configs, checkpoints, metrics) to train_outputs/. Use it when you have a runnable training command and need startup verification, a short-run check, a full training kickoff, or resume handling. It is not meant for environment setup, sweeps, or exploratory work.
- Executes selected training commands conservatively with bounded evidence collection
- Records training command, config, seed, logs, checkpoints, and metrics to standardized train_outputs/ directory
- Supports multiple run modes: startup verification, short-run verification, full kickoff, and resume
- Generates structured output files including SUMMARY.md, COMMANDS.md, LOG.md, and status.json
- Preserves reproducibility context and runtime assumptions for comparability
- Tracks partial, blocked, resumed, and kicked-off training states clearly
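The reproducibility context described above can be pictured as a small record captured before the run starts. The sketch below is illustrative only: the function name `capture_context` and every field name are assumptions, not the schema the skill actually writes.

```python
import hashlib
import platform
from pathlib import Path


def capture_context(command: list[str], config_path: Path, seed: int) -> dict:
    """Collect details that let this run be compared against another one.

    Hypothetical helper: the field names are illustrative, not the skill's schema.
    """
    config_bytes = config_path.read_bytes()
    return {
        "command": command,
        "config_path": str(config_path),
        # Hashing the config makes silent config edits between runs visible.
        "config_sha256": hashlib.sha256(config_bytes).hexdigest(),
        "seed": seed,
        # Runtime assumptions that can affect comparability.
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }
```

Two runs whose `config_sha256` or `seed` differ should not be treated as directly comparable without noting the difference.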
How to install run-train
npx skills add https://github.com/lllllllama/rigorpilot-skills --skill run-train
Prerequisites
- A selected, runnable training command ready to execute
- Environment and assets already set up (this skill does not handle setup)
- Access to the repository's training scripts and configuration files
How to use run-train
1. Provide the training command and run mode (startup verification, short-run, full kickoff, or resume).
2. Specify the checkpoint path if resuming a previous run.
3. Execute the skill to run the training command.
4. Review the generated train_outputs/ directory for SUMMARY.md, logs, metrics, and status.json.
5. Check COMPARABILITY_REPORT.md to verify that the reproducibility context is preserved.
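The steps above amount to running one command under a bound and leaving evidence behind. The following is a minimal sketch of that idea, not the contents of scripts/run_training.py: the function `run_bounded`, the state labels "completed" and "failed", and the status.json fields are all assumptions made for illustration.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_bounded(command: list[str], out_dir: Path, timeout_s: float) -> dict:
    """Run one training command with a time bound and record what happened.

    Hypothetical sketch: the status fields and state labels are illustrative.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    log_path = out_dir / "LOG.md"
    with log_path.open("w", encoding="utf-8") as log:
        try:
            # Stream stdout and stderr straight into the log file.
            proc = subprocess.run(
                command, stdout=log, stderr=subprocess.STDOUT, timeout=timeout_s
            )
            returncode = proc.returncode
            state = "completed" if returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # The bound was reached and the child was stopped: evidence is partial.
            returncode = None
            state = "partial"
    status = {
        "command": command,
        "state": state,
        "returncode": returncode,
        "log": str(log_path),
    }
    (out_dir / "status.json").write_text(json.dumps(status, indent=2), encoding="utf-8")
    return status


if __name__ == "__main__":
    # Stand-in for a real training command.
    demo = [sys.executable, "-c", "print('step 1 loss 0.9')"]
    print(run_bounded(demo, Path("train_outputs"), timeout_s=30.0))
```

A short timeout approximates a startup or short-run check, while a full kickoff would use a much larger bound or none at all.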
Use cases
- Verify a training setup works before committing to a long run
- Resume an interrupted training job from the last checkpoint
- Run a quick validation training to confirm hyperparameters and data pipeline
- Execute a full model training with complete evidence collection for reproducibility
- Document training state and metrics for research comparison and auditing
Who is run-train for?
- Deep learning researchers running experiments in established repositories
- ML engineers verifying training pipelines before production deployment
- Research teams needing reproducible training evidence and status tracking
- Scientists resuming interrupted training jobs with checkpoint management
run-train FAQ
This skill executes a single selected training command conservatively. It does not run multi-variant sweeps or exploratory parameter search; use a different approach for those.
This skill does not set up the environment. It assumes your environment and assets are already prepared, so use it only after setup is complete.
If a run is interrupted, the skill records the partial state and can resume from the last checkpoint when you provide the checkpoint path and select resume mode.
All outputs are written to train_outputs/, including SUMMARY.md, COMMANDS.md, LOG.md, SCIENTIFIC_CHANGELOG.md, COMPARABILITY_REPORT.md, and status.json.
The skill is designed for training execution only. Use it only when you need to run a training command.
Full instructions (SKILL.md)
Source of truth, from lllllllama/rigorpilot-skills.
name: run-train
description: Rigor Train skill for deep learning research repositories. Use when a documented or selected training command should be run conservatively for startup verification, short-run verification, full kickoff, or resume, with command, config, seed, log, checkpoint, status, and metric evidence written to standardized train_outputs/. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.
run-train
Use this as the Rigor Train skill. The installed slug remains run-train for
compatibility.
Use the shared operating principles in
../../references/agent-operating-principles.md; this skill should keep
training evidence bounded while leaving repository-specific monitoring details
to the model.
When to apply
- When the training command has already been selected and should be executed conservatively.
- When the researcher wants startup verification, short-run verification, full training kickoff, or resume handling.
- When the run needs structured training status, checkpoint, and metric reporting.
When not to apply
- When the main task is environment setup or asset download.
- When the researcher wants inference-only or evaluation-only execution.
- When the task is speculative exploration, multi-variant sweeps, or autonomous idea implementation.
- When the user still needs repository intake or paper gap resolution.
Clear boundaries
- This skill executes a selected training command and normalizes the resulting evidence.
- It does not choose the overall research goal on its own.
- It does not own exploratory branching or speculative code adaptation.
- It should record partial, blocked, resumed, and kicked-off states clearly.
- It should preserve reproducibility context such as configs, seeds, checkpoints, logs, metrics, and runtime assumptions when available.
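One way to keep the partial, blocked, resumed, and kicked-off states unambiguous is to restrict them to a fixed set and append every change to a history. This is a sketch under assumptions: `RunState`, `record_state`, and the `history` field are hypothetical names, and the real status.json layout may differ.

```python
import json
from enum import Enum
from pathlib import Path


class RunState(str, Enum):
    """The four states named in this skill's boundaries."""

    KICKED_OFF = "kicked-off"
    PARTIAL = "partial"
    BLOCKED = "blocked"
    RESUMED = "resumed"


def record_state(status_path: Path, state: RunState, reason: str) -> dict:
    """Set the current state and append it to a history (hypothetical layout)."""
    if status_path.exists():
        status = json.loads(status_path.read_text(encoding="utf-8"))
    else:
        status = {}
    status["state"] = state.value
    # Keep earlier states so a resumed run still shows that it was interrupted.
    status.setdefault("history", []).append({"state": state.value, "reason": reason})
    status_path.parent.mkdir(parents=True, exist_ok=True)
    status_path.write_text(json.dumps(status, indent=2), encoding="utf-8")
    return status
```

Appending rather than overwriting means the file records how the run reached its current state, which is what makes a resumed run auditable.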
Input expectations
- selected training goal
- runnable training command
- environment and asset assumptions
- run mode such as startup verification, short-run verification, full kickoff, or resume
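The four inputs listed above can be gathered into a single request object and checked before anything runs. The sketch below is hypothetical: `TrainRequest`, the mode strings in `RUN_MODES`, and the `checkpoint_path` field are illustrative names, not an interface defined by the skill.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical spellings of the four run modes named in this document.
RUN_MODES = ("startup-verification", "short-run-verification", "full-kickoff", "resume")


@dataclass
class TrainRequest:
    goal: str
    command: list[str]
    run_mode: str
    assumptions: list[str] = field(default_factory=list)
    checkpoint_path: Optional[str] = None

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means the request is usable."""
        problems = []
        if not self.command:
            problems.append("command is empty")
        if self.run_mode not in RUN_MODES:
            problems.append(f"unknown run mode: {self.run_mode}")
        if self.run_mode == "resume" and not self.checkpoint_path:
            problems.append("resume mode needs a checkpoint path")
        return problems
```

Returning a list of problems instead of raising on the first one lets a blocked run report everything that is missing at once.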
Output expectations
- train_outputs/SUMMARY.md
- train_outputs/COMMANDS.md
- train_outputs/LOG.md
- train_outputs/SCIENTIFIC_CHANGELOG.md
- train_outputs/COMPARABILITY_REPORT.md
- train_outputs/status.json
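After a run, a quick check can confirm that all six expected files were produced. The file names below come from this document; the helper `missing_outputs` itself is a hypothetical convenience, not part of the skill.

```python
from pathlib import Path

# File names taken from the output expectations of this skill.
EXPECTED_OUTPUTS = (
    "SUMMARY.md",
    "COMMANDS.md",
    "LOG.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "status.json",
)


def missing_outputs(out_dir: Path) -> list[str]:
    """Return the expected files that are absent from out_dir (hypothetical helper)."""
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs(Path("train_outputs"))
    print("all outputs present" if not missing else f"missing: {missing}")
```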
Notes
Use references/training-policy.md, ../../references/deep-learning-experiment-principles.md, scripts/run_training.py, and scripts/write_outputs.py.
Related skills
More from lllllllama/rigorpilot-skills and the wider catalog.
analyze-project
Read-only analysis of deep learning repositories to understand structure, configs, and suspicious patterns.
ai-research-explore
Auditable deep learning research exploration with idea gating, fair comparison, and governed experiments.
explore-code
Auditable exploratory code modifications for deep learning research on isolated branches with rollback tracking.
ai-research-reproduction
README-first deep learning repository reproduction with auditable evidence and standardized outputs.
paper-context-resolver
Resolve reproduction-critical paper details when README and repo files leave gaps.
safe-debug
Conservative diagnosis and minimal patching for deep learning training failures without automatic code mutation.