Audit score 70

run-train

lllllllama/rigorpilot-skills

Execute and document deep learning training runs with reproducibility and status tracking.

What is run-train?

run-train, also called the Rigor Train skill, executes a pre-selected training command for a deep learning research repository and writes standardized evidence (logs, configs, checkpoints, metrics) to train_outputs/. Use it when you have a runnable training command and need startup verification, a short-run check, full training kickoff, or resume handling. It is not meant for environment setup, sweeps, or exploratory work.

  • Executes selected training commands conservatively with bounded evidence collection
  • Records training command, config, seed, logs, checkpoints, and metrics to standardized train_outputs/ directory
  • Supports multiple run modes: startup verification, short-run verification, full kickoff, and resume
  • Generates structured output files including SUMMARY.md, COMMANDS.md, LOG.md, and status.json
  • Preserves reproducibility context and runtime assumptions for comparability
  • Tracks partial, blocked, resumed, and kicked-off training states clearly
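
As an illustration of what "conservative" execution with bounded evidence might mean in practice, the sketch below runs a command under a wall-clock limit and keeps only the tail of its output. It is not the skill's actual implementation: the function name `run_bounded`, its parameters, and the outcome labels are all assumptions made for this example.

```python
import subprocess


def _as_text(data):
    """Normalize captured output, which may be None, bytes, or str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_bounded(command, timeout_s=60.0, tail_lines=20):
    """Run `command` with a time limit and keep only the last output lines.

    The outcome labels ("completed", "failed", "timed_out") are hypothetical
    and are not taken from the skill itself.
    """
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
        outcome = "completed" if proc.returncode == 0 else "failed"
        returncode = proc.returncode
        output = _as_text(proc.stdout) + _as_text(proc.stderr)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child when the limit is reached;
        # whatever output was captured so far may arrive as bytes.
        outcome = "timed_out"
        returncode = None
        output = _as_text(exc.stdout) + _as_text(exc.stderr)
    return {
        "outcome": outcome,
        "returncode": returncode,
        "tail": output.splitlines()[-tail_lines:],
    }
```

For a startup verification, a short `timeout_s` would confirm that the command launches and begins producing output without waiting for a full run to finish.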

How to install run-train

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill run-train
Prerequisites
  • A selected, runnable training command ready to execute
  • Environment and assets already set up (this skill does not handle setup)
  • Access to the repository's training scripts and configuration files

How to use run-train

  1. Provide the training command and run mode (startup verification, short-run, full kickoff, or resume)
  2. Specify any checkpoint path if resuming a previous run
  3. Execute the skill to run the training command
  4. Review the generated train_outputs/ directory for SUMMARY.md, logs, metrics, and status.json
  5. Check COMPARABILITY_REPORT.md to verify reproducibility context is preserved
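
The review in steps 4 and 5 can be partly automated. The sketch below checks that the files the skill is documented to write are actually present under train_outputs/. The file names come from this page; the helper `missing_outputs` is a hypothetical name introduced here, and the check says nothing about the contents of the files.

```python
from pathlib import Path

# File names as listed in the skill's output expectations.
EXPECTED_FILES = (
    "SUMMARY.md",
    "COMMANDS.md",
    "LOG.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "status.json",
)


def missing_outputs(root="train_outputs"):
    """Return the expected evidence files that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every expected file exists; anything else is a prompt to look at why the run stopped short.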

Use cases

Good for
  • Verify a training setup works before committing to a long run
  • Resume an interrupted training job from the last checkpoint
  • Run a quick validation training to confirm hyperparameters and data pipeline
  • Execute a full model training with complete evidence collection for reproducibility
  • Document training state and metrics for research comparison and auditing
Who it's for
  • Deep learning researchers running experiments in established repositories
  • ML engineers verifying training pipelines before production deployment
  • Research teams needing reproducible training evidence and status tracking
  • Scientists resuming interrupted training jobs with checkpoint management

run-train FAQ

Should I use this skill for hyperparameter sweeps?

No. This skill executes a single selected training command conservatively. For multi-variant sweeps or exploratory parameter search, use a different approach.

Can this skill set up my environment or download assets?

No. This skill assumes your environment and assets are already prepared. Use it only after setup is complete.

What happens if training is interrupted?

The skill records the partial state and can resume from the last checkpoint if you provide the checkpoint path and select resume mode.
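
Since resume mode needs a checkpoint path, it can help to locate the most recent checkpoint before invoking the skill. The following is a minimal sketch under two assumptions that this page does not confirm: checkpoints are individual files in one directory, and they match a glob such as `*.pt`. Real repositories often use other layouts, so adjust the pattern accordingly.

```python
from pathlib import Path


def latest_checkpoint(checkpoint_dir, pattern="*.pt"):
    """Return the most recently modified file matching `pattern`, or None.

    Both the directory layout and the "*.pt" default are assumptions;
    the skill itself does not prescribe a checkpoint naming scheme.
    """
    candidates = [p for p in Path(checkpoint_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Modification time is only a heuristic. If the repository encodes the step or epoch in the file name, sorting on that number is usually more reliable.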

Where does the training evidence get written?

All outputs are written to train_outputs/: SUMMARY.md, COMMANDS.md, LOG.md, SCIENTIFIC_CHANGELOG.md, COMPARABILITY_REPORT.md, and status.json.

Can I use this for inference or evaluation only?

No. This skill is designed for training execution. Use it only when you need to run a training command.

Full instructions (SKILL.md)

Source of truth, from lllllllama/rigorpilot-skills.


name: run-train
description: Rigor Train skill for deep learning research repositories. Use when a documented or selected training command should be run conservatively for startup verification, short-run verification, full kickoff, or resume, with command, config, seed, log, checkpoint, status, and metric evidence written to standardized train_outputs/. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.

run-train

Use this as the Rigor Train skill. The installed slug remains run-train for compatibility.

Use the shared operating principles in ../../references/agent-operating-principles.md; this skill should keep training evidence bounded while leaving repository-specific monitoring details to the model.

When to apply

  • When the training command has already been selected and should be executed conservatively.
  • When the researcher wants startup verification, short-run verification, full training kickoff, or resume handling.
  • When the run needs structured training status, checkpoint, and metric reporting.

When not to apply

  • When the main task is environment setup or asset download.
  • When the researcher wants inference-only or evaluation-only execution.
  • When the task is speculative exploration, multi-variant sweeps, or autonomous idea implementation.
  • When the user still needs repository intake or paper gap resolution.

Clear boundaries

  • This skill executes a selected training command and normalizes the resulting evidence.
  • It does not choose the overall research goal on its own.
  • It does not own exploratory branching or speculative code adaptation.
  • It should record partial, blocked, resumed, and kicked-off states clearly.
  • It should preserve reproducibility context such as configs, seeds, checkpoints, logs, metrics, and runtime assumptions when available.
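
To make the last boundary concrete, here is one way reproducibility context could be captured before a run. This is a sketch, not the skill's method: the function `capture_context` and the field names in the returned dictionary are hypothetical, and hashing the config file is just one simple way to detect later changes.

```python
import hashlib
import platform
import shlex
from pathlib import Path


def capture_context(command, seed, config_path=None):
    """Collect a small record of what was run and in which environment.

    All field names are illustrative assumptions, not a documented schema.
    """
    context = {
        "command": shlex.join(command),  # shell-quoted, copy-pasteable form
        "seed": seed,
        "python": platform.python_version(),
        "platform": platform.platform(),
        "config_sha256": None,
    }
    if config_path is not None:
        # A content hash shows whether two runs used the same config bytes.
        data = Path(config_path).read_bytes()
        context["config_sha256"] = hashlib.sha256(data).hexdigest()
    return context
```

Two runs with the same command, seed, and config hash are at least candidates for direct comparison; a mismatch in any of them is worth noting before metrics are compared.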

Input expectations

  • selected training goal
  • runnable training command
  • environment and asset assumptions
  • run mode such as startup verification, short-run verification, full kickoff, or resume
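
The four inputs above can be modeled as a small typed record, which makes it easy to catch an incomplete request before anything is launched. The class and field names below are hypothetical; only the four run modes and the need for a checkpoint path when resuming come from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class RunMode(Enum):
    STARTUP_VERIFICATION = "startup verification"
    SHORT_RUN_VERIFICATION = "short-run verification"
    FULL_KICKOFF = "full kickoff"
    RESUME = "resume"


@dataclass(frozen=True)
class TrainRequest:
    """Illustrative container for the skill's expected inputs."""

    goal: str
    command: Tuple[str, ...]
    mode: RunMode
    assumptions: Tuple[str, ...] = ()
    checkpoint_path: Optional[str] = None  # only meaningful for RESUME

    def problems(self) -> List[str]:
        """Return human-readable reasons the request is not ready to run."""
        issues = []
        if not self.goal.strip():
            issues.append("missing training goal")
        if not self.command:
            issues.append("missing training command")
        if self.mode is RunMode.RESUME and not self.checkpoint_path:
            issues.append("resume mode needs a checkpoint path")
        return issues
```

Returning a list of problems, rather than raising on the first one, lets a caller report everything that is missing in a single pass.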

Output expectations

  • train_outputs/SUMMARY.md
  • train_outputs/COMMANDS.md
  • train_outputs/LOG.md
  • train_outputs/SCIENTIFIC_CHANGELOG.md
  • train_outputs/COMPARABILITY_REPORT.md
  • train_outputs/status.json
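
The page does not document the schema of status.json, so the following is only a sketch of how such a file might be written safely. The state names are taken from the states the skill is said to record (partial, blocked, resumed, kicked-off); the JSON keys, the spelling of those states as strings, and the function `write_status` are assumptions.

```python
import json
import os
from pathlib import Path

# States named in the skill description; the exact strings are assumed.
ALLOWED_STATES = ("partial", "blocked", "resumed", "kicked-off")


def write_status(state, mode, root="train_outputs", **extra):
    """Write a status file atomically and return its path.

    The payload keys ("state", "mode", plus any extras) are hypothetical.
    """
    if state not in ALLOWED_STATES:
        raise ValueError(f"unknown state: {state!r}")
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    payload = {"state": state, "mode": mode, **extra}
    tmp_path = root / "status.json.tmp"
    tmp_path.write_text(
        json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    # os.replace swaps the file in one step, so readers never see a
    # half-written status.json.
    final_path = root / "status.json"
    os.replace(tmp_path, final_path)
    return final_path
```

Writing to a temporary file and renaming it matters when another process polls status.json while training is still running.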

Notes

Use references/training-policy.md, ../../references/deep-learning-experiment-principles.md, scripts/run_training.py, and scripts/write_outputs.py.