parallel-data-enrichment
parallel-web/parallel-agent-skills
Bulk enrich company, people, or product data with web-sourced fields like CEO names, funding, and contact info.
What is parallel-data-enrichment?
Adds missing fields to lists of entities by querying web sources. Use this to augment CSV files or inline data with enriched information like executive names, funding rounds, or contact details. Supports context chaining from prior research tasks.
- Enriches company records with CEO names, founding dates, and funding information
- Adds contact information and executive details to people records
- Augments product data with sourced metadata from the web
- Accepts CSV files or inline JSON data as input
- Chains context from previous research interactions via interaction ID
- Returns results as JSON array of input/output object pairs
How to install parallel-data-enrichment
npx skills add https://github.com/parallel-web/parallel-agent-skills --skill parallel-data-enrichment
- parallel-cli installed and authenticated
- Internet access
- Data in CSV format or as inline JSON
How to use parallel-data-enrichment
1. Prepare your data as a CSV file or an inline JSON array of objects
2. Run `parallel-cli enrich run` with your data source, specifying the fields to add via `--intent` or `--enriched-columns`
3. Include the `--no-wait` flag so the command returns immediately with a taskgroup_id and monitoring URL
4. Use `parallel-cli enrich poll` with the taskgroup_id to retrieve results (timeout 540 seconds)
5. Results are saved as a JSON file containing an array of {input, output} objects; re-run the poll command if it times out on large datasets
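The steps above can be sketched as two small helpers that assemble the command lines. This is a minimal illustration, not part of the skill: the helper names `build_run_argv` and `build_poll_argv` are hypothetical, and only flags that appear elsewhere on this page are used.

```python
import json
import shlex


def build_run_argv(rows, intent, target):
    """Assemble the `enrich run` command for inline data (hypothetical helper)."""
    return [
        "parallel-cli", "enrich", "run",
        "--data", json.dumps(rows),   # inline JSON array of objects
        "--intent", intent,           # plain-language description of fields to add
        "--target", target,
        "--no-wait",                  # return immediately with a taskgroup_id
        "--json",
    ]


def build_poll_argv(taskgroup_id, output_path, timeout_s=540):
    """Assemble the `enrich poll` command (hypothetical helper)."""
    return [
        "parallel-cli", "enrich", "poll", taskgroup_id,
        "--timeout", str(timeout_s),
        "--output", output_path,
    ]


rows = [{"company": "Google"}, {"company": "Microsoft"}]
# shlex.join quotes each argument so the printed line is safe to paste into a shell.
print(shlex.join(build_run_argv(rows, "CEO name and founding year", "output.csv")))
print(shlex.join(build_poll_argv("<taskgroup_id>", "/tmp/enrichment-companies.json")))
```

Building the arguments as a list and quoting them at the end avoids hand-escaping the JSON passed to `--data`.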
Use cases
- Bulk-enrich a CSV of 500 companies with CEO names and recent funding rounds before outreach
- Add missing contact info and titles to a list of discovered prospects from a prior research task
- Augment product records with sourced metadata and company details for competitive analysis
- Enrich a manually curated list of people with verified job titles and company affiliations
- Sales and business development teams building prospect lists
- Researchers compiling datasets that require verified external data
- Product managers gathering competitive intelligence
- Anyone needing to augment existing data with web-sourced enrichment
parallel-data-enrichment FAQ
Enrichment duration depends on the number of rows and fields requested. The skill will inform you upfront that it may take several minutes. Use --no-wait to avoid blocking while it runs server-side.
Yes. If you have the interaction_id from a prior research task, pass it via --previous-interaction-id to carry context forward and enrich entities discovered earlier without restating prior findings.
Results are always returned as JSON, regardless of input format. The file contains an array of {input, output} objects where input is your original row and output contains the enriched fields.
For large datasets, enrichment may exceed the 9-minute tool timeout. The enrichment continues running server-side. Re-run the same poll command to continue waiting and retrieve results when ready.
A 403 error typically indicates insufficient balance. Run `parallel-cli balance get` to check, then `parallel-cli balance add` if needed before retrying the enrichment command.
Full instructions (SKILL.md)
Source of truth, from parallel-web/parallel-agent-skills.
name: parallel-data-enrichment
description: "Bulk data enrichment. Adds web-sourced fields (CEO names, funding, contact info) to lists of companies, people, or products. Use for enriching CSV files or inline data. Supports multi-turn: pass --previous-interaction-id from a prior research task to carry context forward."
user-invocable: true
argument-hint: <file or entities> with <fields to add>
compatibility: Requires parallel-cli and internet access.
allowed-tools: Bash(parallel-cli:*)
metadata:
  author: parallel
Data Enrichment
Enrich: $ARGUMENTS
Before starting
Inform the user that enrichment may take several minutes depending on the number of rows and fields requested.
Optional: Suggest output columns
If the user gave a vague intent ("enrich these companies with useful info") and you're not sure what columns to add, ask the API for a suggestion before kicking off the run:
parallel-cli enrich suggest "Find CEO and recent funding info" --json
The response is an envelope: {title, processor, enriched_columns, warnings}. Extract just the enriched_columns array (not the whole envelope) and pass it as the value of --enriched-columns on enrich run, in place of --intent — the two flags are alternative ways to specify what to enrich, not combined. If suggest returned a processor, pass it through explicitly via --processor on the run call (it's a tuned recommendation for the schema). Skip this whole section if the user already specified the fields they want.
`enrich suggest` requires `parallel-cli` ≥ 0.3.0. If it errors with anything resembling `no such command` / `No such command` / `unknown command`, do not bail: skip the suggestion step, fall through to step 1 with `--intent`, complete the run, and mention `parallel-cli update` (or `pipx upgrade parallel-web-tools`) in the final response so the user picks up the feature next time.
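The envelope handling described in this section can be sketched as follows. The helper name `run_args_from_suggestion` is hypothetical, the sample envelope is invented for illustration (the real shape of each column entry is not documented here), and the sketch assumes `--enriched-columns` accepts the array serialized as a JSON string and that `processor` is a plain string.

```python
import json


def run_args_from_suggestion(envelope):
    """Turn an `enrich suggest --json` envelope into flags for `enrich run`.

    Only the enriched_columns array is forwarded, never the whole envelope,
    and --intent is deliberately not emitted alongside it.
    """
    # Assumption: the CLI accepts the array as a JSON-encoded string.
    args = ["--enriched-columns", json.dumps(envelope["enriched_columns"])]
    processor = envelope.get("processor")
    if processor:
        # Pass the suggested processor through explicitly when one was returned.
        args += ["--processor", processor]
    return args


# Invented sample envelope; field contents are placeholders, not real API output.
sample = {
    "title": "Example suggestion",
    "processor": "example-processor",
    "enriched_columns": [{"name": "ceo_name"}, {"name": "latest_funding"}],
    "warnings": [],
}
print(run_args_from_suggestion(sample))
```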
Step 1: Start the enrichment
Use ONE of these command patterns (substitute user's actual data):
For inline data:
parallel-cli enrich run --data '[{"company": "Google"}, {"company": "Microsoft"}]' --intent "CEO name and founding year" --target "output.csv" --no-wait --json
For CSV file:
parallel-cli enrich run --source-type csv --source "input.csv" --target "output.csv" --source-columns '[{"name": "company", "description": "Company name"}]' --intent "CEO name and founding year" --no-wait --json
If this is a follow-up to a previous research task and you have its interaction_id, add context chaining:
parallel-cli enrich run --data '...' --intent "..." --target "output.csv" --no-wait --json --previous-interaction-id "$INTERACTION_ID"
The enrichment will run with the full context of that prior research — so you can enrich entities discovered earlier without restating what was already found. Note: enrichment does not itself produce a new interaction_id, so you cannot chain a further follow-up off of an enrichment.
IMPORTANT: Always include --no-wait so the command returns immediately instead of blocking.
Parse the --json output to extract taskgroup_id and url. The output is {taskgroup_id, url, num_runs} — there is no interaction_id field, do not look for one. Immediately tell the user:
- Enrichment has been kicked off
- The monitoring URL where they can track progress
Tell them they can background the polling step to continue working while it runs.
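Extracting the two fields and composing the first message to the user might look like the sketch below. It assumes the command's stdout contains only the JSON object described above; the helper names and the sample identifiers are hypothetical.

```python
import json


def parse_kickoff(stdout_text):
    """Extract taskgroup_id, url and num_runs from `enrich run --no-wait --json` output."""
    payload = json.loads(stdout_text)
    # There is no interaction_id in this payload, so none is looked up.
    return payload["taskgroup_id"], payload["url"], payload.get("num_runs")


def kickoff_message(url):
    """Compose the message shown to the user right after kickoff."""
    return f"Enrichment has been kicked off. Track progress at: {url}"


# Placeholder output for illustration only; real identifiers will differ.
sample_stdout = '{"taskgroup_id": "tg_example", "url": "https://example.invalid/tg_example", "num_runs": 2}'
taskgroup_id, url, num_runs = parse_kickoff(sample_stdout)
print(kickoff_message(url))
```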
Step 2: Poll for results
Pick a concrete output path (e.g., /tmp/enrichment-acme.json). Note: the file is JSON regardless of the extension you choose — it's an array of {input, output} objects, not a CSV. Name it .json to avoid confusing yourself or the user.
parallel-cli enrich poll "$TASKGROUP_ID" --timeout 540 --output "/tmp/enrichment-<descriptive-name>.json"
Important:
- Use `--timeout 540` (9 minutes) to stay within tool execution limits
- The `--target` from step 1 is unused in `--no-wait` mode; only `--output` here determines where results are saved, and the file is always JSON
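Because the saved file is always JSON, a user who needs a spreadsheet-friendly file has to convert it. The following is one possible sketch, not something the skill provides: the helper name `pairs_to_csv` is hypothetical, and it assumes each `input` and `output` object is a flat mapping of column name to value.

```python
import csv
import io


def pairs_to_csv(pairs):
    """Flatten a list of {input, output} objects into CSV text.

    Assumption: input and output are flat dicts. If a key appears in both,
    the enriched (output) value wins.
    """
    columns = []
    flat_rows = []
    for pair in pairs:
        row = {**pair["input"], **pair["output"]}
        for key in row:
            if key not in columns:
                columns.append(key)  # keep first-seen column order
        flat_rows.append(row)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(flat_rows)  # missing fields are written as empty cells
    return buffer.getvalue()


# Placeholder rows for illustration; values are not real enrichment results.
sample = [
    {"input": {"company": "Acme"}, "output": {"ceo": "Example CEO A"}},
    {"input": {"company": "Globex"}, "output": {"ceo": "Example CEO B", "founded": "1999"}},
]
print(pairs_to_csv(sample))
```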
If the poll times out
Enrichment of large datasets can take longer than 9 minutes. If the poll exits without completing:
- Tell the user the enrichment is still running server-side
- Re-run the same `parallel-cli enrich poll` command to continue waiting
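The re-run behaviour can be sketched as a bounded loop. This is an illustration under stated assumptions: the helper names are hypothetical, and since the poll command's exit codes are not documented here, the sketch treats a parseable JSON array at the output path as the signal that the poll completed.

```python
import json
import subprocess


def results_ready(output_path):
    """Assumption: a parseable JSON array at output_path means the poll finished."""
    try:
        with open(output_path, encoding="utf-8") as fh:
            return isinstance(json.load(fh), list)
    except (OSError, ValueError):
        # Missing file or incomplete/invalid JSON: not ready yet.
        return False


def poll_until_ready(taskgroup_id, output_path, max_attempts=3, run=subprocess.run):
    """Re-run the same poll command until results appear or attempts run out.

    Returns the attempt number that succeeded, or None if none did.
    `run` is injectable so the loop can be exercised without the CLI.
    """
    argv = [
        "parallel-cli", "enrich", "poll", taskgroup_id,
        "--timeout", "540",
        "--output", output_path,
    ]
    for attempt in range(1, max_attempts + 1):
        run(argv, check=False)
        if results_ready(output_path):
            return attempt
        print("Enrichment is still running server-side; polling again.")
    return None
```

Use a fresh output path for each enrichment, since a leftover file from an earlier run would make this readiness check pass immediately.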
Response format
After step 1: Share the monitoring URL (for tracking progress).
After step 2:
- Report number of rows enriched
- Preview the first few rows from the output file (it's a JSON array of `{input, output}` objects)
- Tell the user the full path to the output file
Do NOT re-share the monitoring URL after completion — the results are in the output file.
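A post-completion report along these lines could be produced with a short helper. The name `summarize_results` is hypothetical, and the sketch assumes the output file is the JSON array of `{input, output}` objects described above.

```python
import json


def summarize_results(output_path, preview_rows=3):
    """Build the step-2 report: row count, a short preview, and the file path."""
    with open(output_path, encoding="utf-8") as fh:
        pairs = json.load(fh)

    lines = [f"Rows enriched: {len(pairs)}"]
    for pair in pairs[:preview_rows]:
        # Show each original row next to its enriched fields.
        lines.append(f"{json.dumps(pair['input'])} -> {json.dumps(pair['output'])}")
    lines.append(f"Full results: {output_path}")
    # The monitoring URL is intentionally left out of the completion report.
    return "\n".join(lines)
```

For example, `print(summarize_results("/tmp/enrichment-acme.json"))` would print the count, up to three preview rows, and the path.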
Setup
If parallel-cli is not found, install and authenticate:
/parallel:parallel-cli-setup
If any parallel-cli enrich command returns 403, tell the user balance is likely required. Offer to run parallel-cli balance get, and if needed ask for explicit confirmation before running parallel-cli balance add <amount_cents>. Then retry the original enrichment command.
Related skills
More from parallel-web/parallel-agent-skills and the wider catalog.
parallel-deep-research
Exhaustive multi-source research for complex topics when users explicitly request deep, comprehensive, or thorough investigation.
parallel-web-search
Fast, cost-effective web search for current information and research queries.
parallel-web-extract
Token-efficient URL content extraction for webpages, articles, PDFs, and JavaScript-heavy sites.
status
Check the status of a running research task by its run ID.
result
Retrieve completed research task results by run ID using Parallel CLI.
parallel-monitor
Continuously track the web for changes on a recurring cadence. Use when the user asks to 'monitor', 'track changes to', 'watch', or 'alert me when' something on the web changes — e.g., 'Track price changes for iPhone 16', 'Alert me when Tesla files a new 8-K', 'Monitor competitor pricing pages weekly'. Also use to list, inspect, update, or delete existing monitors.