Audit score 70

run-train

lllllllama/rigorpilot-skills

Execute and document deep learning training runs with reproducibility and status tracking.

What is run-train?

run-train (the Rigor Train skill) executes a pre-selected training command in a deep learning research repository and writes standardized evidence (logs, configs, checkpoints, metrics) to train_outputs/. Use it when you have a runnable training command and need startup verification, a short-run check, a full training kickoff, or resume handling. Do not use it for environment setup, sweeps, or exploratory work.

Executes selected training commands conservatively with bounded evidence collection
Records the training command, config, seed, logs, checkpoints, and metrics in the standardized train_outputs/ directory
Supports multiple run modes: startup verification, short-run verification, full kickoff, and resume
Generates structured output files including SUMMARY.md, COMMANDS.md, LOG.md, and status.json
Preserves reproducibility context and runtime assumptions for comparability
Tracks partial, blocked, resumed, and kicked-off training states clearly

How to install run-train

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill run-train

Prerequisites

A selected, runnable training command ready to execute
Environment and assets already set up (this skill does not handle setup)
Access to the repository's training scripts and configuration files

How to use run-train

1. Provide the training command and run mode (startup verification, short-run, full kickoff, or resume)
2. Specify the checkpoint path if resuming a previous run
3. Execute the skill to run the training command
4. Review the generated train_outputs/ directory for SUMMARY.md, logs, metrics, and status.json
5. Check COMPARABILITY_REPORT.md to verify reproducibility context is preserved
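
To make the idea of a bounded startup check concrete, here is a minimal Python sketch. It is not the skill's scripts/run_training.py; the function name `run_bounded`, the state labels, and the timeout behaviour are all assumptions made for illustration. It runs one command for at most a fixed number of seconds, captures stdout and stderr in a log file, and reports what happened.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_bounded(command: Sequence[str], log_path: Path, timeout_s: float) -> str:
    """Run one command for at most timeout_s seconds and log its output.

    Returns a hypothetical state label: "completed", "failed", or
    "timed_out". For a startup check, "timed_out" can simply mean that
    training was still running when the time budget ran out.
    """
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log:
        try:
            result = subprocess.run(
                list(command),
                stdout=log,                # send stdout to the log file
                stderr=subprocess.STDOUT,  # merge stderr into the same log
                timeout=timeout_s,         # child is killed when this expires
                check=False,               # inspect the return code ourselves
            )
        except subprocess.TimeoutExpired:
            return "timed_out"
    return "completed" if result.returncode == 0 else "failed"
```

A call might look like `run_bounded(["python", "train.py"], Path("train_outputs/startup.log"), timeout_s=120)`, where `train.py` and `startup.log` are placeholder names rather than files the skill defines.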

Use cases

Good for

Verify a training setup works before committing to a long run
Resume an interrupted training job from the last checkpoint
Run a quick validation training to confirm hyperparameters and data pipeline
Execute a full model training with complete evidence collection for reproducibility
Document training state and metrics for research comparison and auditing

Who it's for

Deep learning researchers running experiments in established repositories
ML engineers verifying training pipelines before production deployment
Research teams needing reproducible training evidence and status tracking
Scientists resuming interrupted training jobs with checkpoint management

run-train FAQ

Should I use this skill for hyperparameter sweeps?

No. This skill executes a single selected training command conservatively. For multi-variant sweeps or exploratory parameter search, use a different approach.

Can this skill set up my environment or download assets?

No. This skill assumes your environment and assets are already prepared. Use it only after setup is complete.

What happens if training is interrupted?

The skill records the partial state and can resume from the last checkpoint if you provide the checkpoint path and select resume mode.

Where does the training evidence get written?

All outputs are written to train_outputs/ including SUMMARY.md, COMMANDS.md, LOG.md, SCIENTIFIC_CHANGELOG.md, COMPARABILITY_REPORT.md, and status.json.

Can I use this for inference or evaluation only?

No. This skill is designed for training execution. Use it only when you need to run a training command.

Full instructions (SKILL.md)

Source of truth, from lllllllama/rigorpilot-skills.

name: run-train
description: Rigor Train skill for deep learning research repositories. Use when a documented or selected training command should be run conservatively for startup verification, short-run verification, full kickoff, or resume, with command, config, seed, log, checkpoint, status, and metric evidence written to standardized `train_outputs/`. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.

run-train

Use this as the Rigor Train skill. The installed slug remains run-train for compatibility.

Use the shared operating principles in ../../references/agent-operating-principles.md; this skill should keep training evidence bounded while leaving repository-specific monitoring details to the model.

When to apply

When the training command has already been selected and should be executed conservatively.
When the researcher wants startup verification, short-run verification, full training kickoff, or resume handling.
When the run needs structured training status, checkpoint, and metric reporting.

When not to apply

When the main task is environment setup or asset download.
When the researcher wants inference-only or evaluation-only execution.
When the task is speculative exploration, multi-variant sweeps, or autonomous idea implementation.
When the user still needs repository intake or paper gap resolution.

Clear boundaries

This skill executes a selected training command and normalizes the resulting evidence.
It does not choose the overall research goal on its own.
It does not own exploratory branching or speculative code adaptation.
It should record partial, blocked, resumed, and kicked-off states clearly.
It should preserve reproducibility context such as configs, seeds, checkpoints, logs, metrics, and runtime assumptions when available.
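
As an illustration of recording run state clearly, the sketch below writes a small status.json. The real schema of status.json is not given here, so every field name (`state`, `command`, `seed`, `checkpoint`) and the string spellings of the four states are assumptions, as is the helper name `write_status`.

```python
import json
from pathlib import Path
from typing import Optional

# The four states named above; the exact string spellings are assumed.
STATES = {"partial", "blocked", "resumed", "kicked_off"}


def write_status(
    out_dir: Path,
    state: str,
    command: str,
    seed: Optional[int] = None,
    checkpoint: Optional[str] = None,
) -> Path:
    """Write a hypothetical status.json and return its path."""
    if state not in STATES:
        raise ValueError(f"unknown state {state!r}; expected one of {sorted(STATES)}")
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "state": state,
        "command": command,
        "seed": seed,              # None when the seed is not known
        "checkpoint": checkpoint,  # None when no checkpoint exists yet
    }
    path = out_dir / "status.json"
    path.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
    return path
```

Rejecting unknown states up front keeps a run from being recorded under an ambiguous label, and unknown values are stored as explicit nulls rather than omitted.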

Input expectations

selected training goal
runnable training command
environment and asset assumptions
run mode such as startup verification, short-run verification, full kickoff, or resume
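
One way to picture these inputs is as a small typed record. The sketch below is illustrative only: `RunMode`, `TrainRequest`, the field names, and the enum string values are hypothetical, and the rule that resume mode needs a checkpoint path is inferred from the usage steps rather than from the skill's code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class RunMode(Enum):
    # The four run modes listed above; the string values are assumed.
    STARTUP_VERIFICATION = "startup-verification"
    SHORT_RUN_VERIFICATION = "short-run-verification"
    FULL_KICKOFF = "full-kickoff"
    RESUME = "resume"


@dataclass(frozen=True)
class TrainRequest:
    goal: str          # selected training goal
    command: str       # runnable training command
    assumptions: str   # environment and asset assumptions
    mode: RunMode
    checkpoint_path: Optional[str] = None

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means usable."""
        found = []
        if not self.command.strip():
            found.append("command is empty")
        if self.mode is RunMode.RESUME and not self.checkpoint_path:
            found.append("resume mode needs a checkpoint path")
        return found
```

Returning a list of problems instead of raising lets a caller record a blocked state with every reason attached, rather than stopping at the first one.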

Output expectations

train_outputs/SUMMARY.md
train_outputs/COMMANDS.md
train_outputs/LOG.md
train_outputs/SCIENTIFIC_CHANGELOG.md
train_outputs/COMPARABILITY_REPORT.md
train_outputs/status.json
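
A quick check that a run produced all six files might look like the following sketch. The file names come from the list above; the helper name `missing_outputs` is hypothetical, and the check says nothing about whether the contents are valid.

```python
from pathlib import Path
from typing import List, Union

# File names taken from the output expectations listed above.
EXPECTED_OUTPUTS = (
    "SUMMARY.md",
    "COMMANDS.md",
    "LOG.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "status.json",
)


def missing_outputs(out_dir: Union[str, Path] = "train_outputs") -> List[str]:
    """Return the expected file names that are absent from out_dir."""
    root = Path(out_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]
```

An empty list means every expected file is present; anything else names exactly what to look for before treating the run's evidence as complete.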

Notes

Use references/training-policy.md, ../../references/deep-learning-experiment-principles.md, scripts/run_training.py, and scripts/write_outputs.py.
