Skill

Pass

Audit score 90

safe-debug

lllllllama/rigorpilot-skills

Conservative diagnosis and minimal patching for deep learning training failures without automatic code mutation.

What is safe-debug?

Safe-debug is a rigor-focused debugging skill for deep learning research. Use it when you have a concrete error (traceback, CUDA OOM, shape mismatch, NaN loss, checkpoint failure) and want systematic root-cause diagnosis with explicit human approval before any code changes.

Diagnoses training and inference failures from tracebacks and error symptoms
Narrows root cause without modifying code by default
Proposes minimal, targeted patches with explicit approval gates
Separates debug fixes from research contributions to preserve experiment integrity
Escalates branch/savepoint creation for medium or high-risk changes
Generates structured diagnosis, patch plan, and status outputs
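
As a rough illustration of the triage implied by the list above, the sketch below sorts a pasted error message into one of the symptom families this page names. The `classify_symptom` function, the category labels, and the regular expressions are all assumptions made for this example (loosely modeled on common PyTorch-style messages); they are not part of the skill, which relies on the model reading the traceback directly.

```python
import re

# Hypothetical symptom families and patterns, for illustration only.
# Order matters: checkpoint-load errors often also mention "size mismatch",
# so they are checked before the generic shape-mismatch pattern.
SYMPTOM_PATTERNS = [
    ("cuda_oom", re.compile(r"out of memory", re.IGNORECASE)),
    ("checkpoint_load", re.compile(r"loading state_dict|missing key|unexpected key", re.IGNORECASE)),
    ("shape_mismatch", re.compile(r"size mismatch|shape mismatch|shapes cannot be multiplied", re.IGNORECASE)),
    ("nan_loss", re.compile(r"\bnan\b", re.IGNORECASE)),
]


def classify_symptom(error_text):
    """Return the first matching symptom label, or "unknown" if none match."""
    for label, pattern in SYMPTOM_PATTERNS:
        if pattern.search(error_text):
            return label
    return "unknown"


if __name__ == "__main__":
    print(classify_symptom("RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB"))
```

A label like this only narrows where to look first; it does not identify the root cause, and nothing in the sketch touches repository code.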

How to install safe-debug

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill safe-debug

How to use safe-debug

1. Paste the full traceback or error message into the chat
2. Describe the context (what you were training or running when it failed)
3. Let safe-debug generate DIAGNOSIS.md with root-cause analysis
4. Review PATCH_PLAN.md for the proposed minimal fixes
5. Explicitly approve before any code modifications are applied
6. Check debug_outputs/status.json for the summary and next steps
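
For the last step, a small script can print whatever the status file contains. This is a minimal sketch: the schema of status.json is not documented on this page, so the code assumes only that the file holds a top-level JSON object and lists its fields without expecting any particular key. The helper names `load_status` and `summarize_status` are hypothetical.

```python
import json
from pathlib import Path


def load_status(path="debug_outputs/status.json"):
    """Return the parsed status file, or None if it has not been written yet."""
    status_path = Path(path)
    if not status_path.is_file():
        return None
    with status_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize_status(status):
    """Render top-level fields one per line, assuming a JSON object at the root."""
    return [f"{key}: {json.dumps(value)}" for key, value in sorted(status.items())]


if __name__ == "__main__":
    status = load_status()
    if status is None:
        print("No debug_outputs/status.json found; has safe-debug run yet?")
    else:
        print("\n".join(summarize_status(status)))
```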

Use cases

Good for

Debugging CUDA out-of-memory errors during training
Investigating shape mismatches in tensor operations
Diagnosing NaN loss or training divergence symptoms
Resolving checkpoint loading failures
Identifying root causes of inference failures before attempting fixes

Who it's for

Deep learning researchers
ML engineers debugging training pipelines
Anyone needing conservative, audit-ready debugging with clear separation of fixes from research code

safe-debug FAQ

Will this skill automatically fix my code?

No. Safe-debug diagnoses first and proposes minimal patches only after explicit human approval. Code is not modified by default.

What if the fix changes my experiment's meaning?

Safe-debug will flag this explicitly. Debug fixes are not automatically research contributions; any change affecting experiment comparability is called out.

When should I not use this skill?

Do not use it for broad repository walkthroughs, speculative experimentation, large refactors, or general code familiarization without an active failure.

What outputs does this skill produce?

Three files: DIAGNOSIS.md (root-cause analysis), PATCH_PLAN.md (proposed fixes), and status.json (summary and next steps).

Does this skill require a branch or savepoint?

For medium-risk or high-risk changes, yes: safe-debug escalates branch or savepoint creation before proceeding.

Full instructions (SKILL.md)

Source of truth, from lllllllama/rigorpilot-skills.

name: safe-debug
description: Rigor Debug / Rigor Audit skill for deep learning research work. Use when the user pastes a traceback, terminal error, CUDA OOM, checkpoint load failure, shape mismatch, NaN loss symptom, or training failure and wants conservative diagnosis before any patching, with debug fixes clearly separated from research contributions. Do not use for broad refactoring, speculative adaptation, automatic exploratory patching, or general repository familiarization.

safe-debug

Use this as the Rigor Debug / Rigor Audit skill. The installed slug remains safe-debug for compatibility.

Use the shared operating principles in ../../references/agent-operating-principles.md; this skill should guide conservative diagnosis without blocking the model from finding the local root cause.

When to apply

The user provides a traceback, terminal error, or concrete training or inference failure symptom.
The user wants diagnosis, root-cause narrowing, and minimal patch suggestions before code is changed.
The user wants a safe debug flow with explicit human approval before mutation.

When not to apply

When the user wants a broad repository walkthrough without an active failure.
When the task is speculative experimentation or code adaptation.
When the user is asking for a large refactor or readability rewrite.

Clear boundaries

Diagnose first.
Do not modify repository code by default.
If a patch is needed, propose the smallest fix and require explicit approval first.
Escalate savepoint or branch creation before medium-risk or high-risk changes.
A debug fix is not automatically a research contribution; if it changes experiment meaning or comparability, say so explicitly.
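
The boundaries above can be read as a simple gate in front of any patch. The sketch below is one possible encoding, not the skill's implementation: the `PatchProposal` class, the `check_gate` function, the three risk labels, and the message strings are assumptions introduced for this example.

```python
from dataclasses import dataclass

# Hypothetical risk labels; the skill text only distinguishes
# medium-risk and high-risk changes from everything else.
RISK_LEVELS = ("low", "medium", "high")


@dataclass
class PatchProposal:
    summary: str
    risk: str  # one of RISK_LEVELS
    changes_experiment_meaning: bool = False


def check_gate(proposal, approved=False, savepoint_created=False):
    """Return (allowed, blockers, warnings) for a proposed patch."""
    if proposal.risk not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {proposal.risk!r}")
    blockers = []
    warnings = []
    # No code is modified without explicit human approval.
    if not approved:
        blockers.append("explicit human approval is required")
    # Medium-risk and high-risk changes need a savepoint or branch first.
    if proposal.risk in ("medium", "high") and not savepoint_created:
        blockers.append("create a savepoint or branch first")
    # A debug fix that alters the experiment is flagged, not silently applied.
    if proposal.changes_experiment_meaning:
        warnings.append("fix changes experiment meaning or comparability")
    return (not blockers, blockers, warnings)


if __name__ == "__main__":
    proposal = PatchProposal("lower batch size to avoid OOM", "medium", True)
    print(check_gate(proposal, approved=True))
```

In this sketch a change to experiment meaning produces a warning instead of a blocker, reflecting the instruction to "say so explicitly"; a stricter policy could treat it as a blocker.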

Output expectations

debug_outputs/DIAGNOSIS.md
debug_outputs/PATCH_PLAN.md
debug_outputs/status.json
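
To make the expected layout concrete, the following sketch writes the three files into a debug_outputs directory. Only the file names and the directory come from the list above; the `write_debug_outputs` helper, the placeholder Markdown, and the `code_modified` and `approval_required` keys in status.json are hypothetical, since the actual file contents are not specified here.

```python
import json
import tempfile
from pathlib import Path


def write_debug_outputs(root, diagnosis_md, patch_plan_md, status):
    """Write the three expected files under <root>/debug_outputs and list them."""
    out_dir = Path(root) / "debug_outputs"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "DIAGNOSIS.md").write_text(diagnosis_md, encoding="utf-8")
    (out_dir / "PATCH_PLAN.md").write_text(patch_plan_md, encoding="utf-8")
    (out_dir / "status.json").write_text(
        json.dumps(status, indent=2) + "\n", encoding="utf-8"
    )
    return sorted(p.name for p in out_dir.iterdir())


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so no real repository is touched.
    with tempfile.TemporaryDirectory() as tmp:
        names = write_debug_outputs(
            tmp,
            "# Diagnosis\n(placeholder)\n",
            "# Patch plan\n(placeholder)\n",
            {"code_modified": False, "approval_required": True},  # hypothetical keys
        )
        print(names)
```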

Notes

Use references/debug-policy.md, ../../references/research-rigor-principles.md, and the shared references/research-pitfall-checklist.md.

Related skills

More from lllllllama/rigorpilot-skills and the wider catalog.

analyze-project

lllllllama/rigorpilot-skills

Read-only analysis of deep learning repositories to understand structure, configs, and suspicious patterns.

84k installs · Audited

ai-research-explore

lllllllama/rigorpilot-skills

Auditable deep learning research exploration with idea gating, fair comparison, and governed experiments.

84k installs

explore-code

lllllllama/rigorpilot-skills

Auditable exploratory code modifications for deep learning research on isolated branches with rollback tracking.

84k installs

ai-research-reproduction

lllllllama/rigorpilot-skills

README-first deep learning repository reproduction with auditable evidence and standardized outputs.

84k installs · Audited

paper-context-resolver

lllllllama/rigorpilot-skills

Resolve reproduction-critical paper details when README and repo files leave gaps.

84k installs

run-train

lllllllama/rigorpilot-skills

Execute and document deep learning training runs with reproducibility and status tracking.

84k installs