PluginBench
Skill
Pass
Audit score 90

safe-debug

lllllllama/rigorpilot-skills

Conservative diagnosis and minimal patching for deep learning training failures without automatic code mutation.

What is safe-debug?

Safe-debug is a rigor-focused debugging skill for deep learning research. Use it when you have a concrete error (traceback, CUDA OOM, shape mismatch, NaN loss, checkpoint failure) and want systematic root-cause diagnosis with explicit human approval before any code changes.

  • Diagnoses training and inference failures from tracebacks and error symptoms
  • Narrows root cause without modifying code by default
  • Proposes minimal, targeted patches with explicit approval gates
  • Separates debug fixes from research contributions to preserve experiment integrity
  • Asks for a branch or savepoint before medium-risk or high-risk changes
  • Generates structured diagnosis, patch plan, and status outputs

How to install safe-debug

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill safe-debug
Claude Code
Cursor
Windsurf
Cline

How to use safe-debug

  1. Paste the full traceback or error message into the chat
  2. Describe the context (what you were training or running when it failed)
  3. Let safe-debug generate debug_outputs/DIAGNOSIS.md with root-cause analysis
  4. Review debug_outputs/PATCH_PLAN.md for the proposed minimal fixes
  5. Explicitly approve before any code modifications are applied
  6. Check debug_outputs/status.json for summary and next steps

Use cases

Good for
  • Debugging CUDA out-of-memory errors during training
  • Investigating shape mismatches in tensor operations
  • Diagnosing NaN loss or training divergence symptoms
  • Resolving checkpoint loading failures
  • Identifying root causes of inference failures before attempting fixes
Who it's for
  • Deep learning researchers
  • ML engineers debugging training pipelines
  • Anyone needing conservative, audit-ready debugging with clear separation of fixes from research code

safe-debug FAQ

Will this skill automatically fix my code?

No. Safe-debug diagnoses first, then proposes minimal patches, and applies them only after explicit human approval. Code is not modified by default.

What if the fix changes my experiment's meaning?

Safe-debug will flag this explicitly. Debug fixes are not automatically research contributions; any change affecting experiment comparability is called out.

When should I not use this skill?

Do not use it for broad repository walkthroughs, speculative experimentation, large refactors, or general code familiarization without an active failure.

What outputs does this skill produce?

Three files under debug_outputs/: DIAGNOSIS.md (root-cause analysis), PATCH_PLAN.md (proposed fixes), and status.json (summary and next steps).

Does this skill require a branch or savepoint?

For medium-risk or high-risk changes, yes. Safe-debug will ask for a branch or savepoint to be created before proceeding.

Full instructions (SKILL.md)

Source of truth, from lllllllama/rigorpilot-skills.


name: safe-debug
description: Rigor Debug / Rigor Audit skill for deep learning research work. Use when the user pastes a traceback, terminal error, CUDA OOM, checkpoint load failure, shape mismatch, NaN loss symptom, or training failure and wants conservative diagnosis before any patching, with debug fixes clearly separated from research contributions. Do not use for broad refactoring, speculative adaptation, automatic exploratory patching, or general repository familiarization.

safe-debug

Use this as the Rigor Debug / Rigor Audit skill. The installed slug remains safe-debug for compatibility.

Use the shared operating principles in ../../references/agent-operating-principles.md; this skill should guide conservative diagnosis without blocking the model from finding the local root cause.

When to apply

  • The user provides a traceback, terminal error, or concrete training or inference failure symptom.
  • The user wants diagnosis, root-cause narrowing, and minimal patch suggestions before code is changed.
  • The user wants a safe debug flow with explicit human approval before mutation.

When not to apply

  • When the user wants a broad repository walkthrough without an active failure.
  • When the task is speculative experimentation or code adaptation.
  • When the user is asking for a large refactor or readability rewrite.

Clear boundaries

  • Diagnose first.
  • Do not modify repository code by default.
  • If a patch is needed, propose the smallest fix and require explicit approval first.
  • Escalate savepoint or branch creation before medium-risk or high-risk changes.
  • A debug fix is not automatically a research contribution; if it changes experiment meaning or comparability, say so explicitly.
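
The boundaries above amount to a small decision rule, which can be sketched in Python. This is an illustration only, not code from the skill: the names `Risk`, `PatchProposal`, and `review` are hypothetical, and the skill itself does not publish a risk taxonomy beyond "medium-risk or high-risk".

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    # Hypothetical risk levels; the skill only names "medium" and "high".
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class PatchProposal:
    # Hypothetical record of one proposed fix.
    summary: str
    risk: Risk
    changes_experiment_meaning: bool = False


def review(proposal, approved, savepoint_exists):
    """Return (blockers, notes); a patch may be applied only if blockers is empty."""
    blockers = []
    notes = []
    # Every patch needs explicit human approval, whatever its risk.
    if not approved:
        blockers.append("explicit human approval required")
    # Medium-risk and high-risk changes also need a savepoint or branch.
    if proposal.risk in (Risk.MEDIUM, Risk.HIGH) and not savepoint_exists:
        blockers.append("create a savepoint or branch first")
    # A fix that alters the experiment is flagged, not silently applied.
    if proposal.changes_experiment_meaning:
        notes.append("changes experiment meaning or comparability")
    return blockers, notes


proposal = PatchProposal(
    "lower batch size to avoid CUDA OOM",
    Risk.MEDIUM,
    changes_experiment_meaning=True,
)
blockers, notes = review(proposal, approved=True, savepoint_exists=False)
print(blockers)
print(notes)
```

In this sketch the comparability flag is a note rather than a blocker, matching the instruction to "say so explicitly" instead of refusing the fix.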

Output expectations

  • debug_outputs/DIAGNOSIS.md
  • debug_outputs/PATCH_PLAN.md
  • debug_outputs/status.json
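
As a quick sanity check after a run, the three expected files can be verified with a few lines of Python. This is a minimal sketch that assumes the outputs are written relative to the repository root; the helper name `missing_outputs` is hypothetical, and the contents of status.json are not inspected because its schema is not documented here.

```python
from pathlib import Path

# File names taken from the output expectations listed above.
EXPECTED_OUTPUTS = ("DIAGNOSIS.md", "PATCH_PLAN.md", "status.json")


def missing_outputs(root="."):
    """Return the expected output files that are absent under <root>/debug_outputs."""
    out_dir = Path(root) / "debug_outputs"
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]


missing = missing_outputs(".")
if missing:
    print("Missing debug outputs:", ", ".join(missing))
else:
    print("All debug outputs are present.")
```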

Notes

Use references/debug-policy.md, ../../references/research-rigor-principles.md, and the shared references/research-pitfall-checklist.md.