transloadit-media-processing
github/awesome-copilot
Encode, transform, and pipeline video, audio, image, and document files via Transloadit's cloud robots.
What is transloadit-media-processing?
This skill enables an AI coding agent to process video, audio, image, and document files using Transloadit's cloud-based processing robots. Use it when a task requires encoding video, generating thumbnails, resizing/watermarking images, extracting audio, concatenating clips, OCR, or building multi-step media pipelines.
- Provides access to 86+ Transloadit robots for video, audio, image, and document processing
- Encodes video into formats like HLS, MP4, and WebM, and generates thumbnails
- Resizes, crops, watermarks, and converts images between formats (JPEG, PNG, WebP, AVIF, HEIF)
- Extracts/transcodes audio and concatenates audio or video clips
- Runs OCR, speech-to-text/text-to-speech, content moderation, and object detection via document/AI robots
- Chains multiple processing steps into a single pipeline (Assembly) using the 'use' field, with optional reusable Templates
How to install transloadit-media-processing
npx skills add https://github.com/github/awesome-copilot --skill transloadit-media-processing
- A free Transloadit account (https://transloadit.com/signup) with API key and secret
- Node.js/npx available to run @transloadit/mcp-server or @transloadit/node
- MCP-compatible IDE/agent (e.g. VS Code with GitHub Copilot) if using the MCP server option
How to use transloadit-media-processing
1. Sign up for a free Transloadit account and get API credentials at https://transloadit.com/c/-/api-credentials
2. Choose a setup option: configure the MCP server in your IDE (e.g. .vscode/mcp.json with TRANSLOADIT_KEY and TRANSLOADIT_SECRET) or use the @transloadit/node CLI directly
3. Define a pipeline as a JSON 'steps' object, choosing robots (e.g. /video/encode, /image/resize, /document/ocr) and chaining them with the 'use' field, starting from ':original' for the input file
4. Run the pipeline: via MCP tools like create_assembly/create_template, or via CLI with 'npx -y @transloadit/node assemblies create --steps ... --wait --input ./file'
5. For batch jobs, use /http/import to pull files from URLs or cloud storage before processing, and use presets (e.g. hls-1080p, mp3, webp) for common output targets
6. Optionally save reusable pipelines as Templates with ${variable} placeholders for dynamic values
Use cases
- Encoding uploaded video to HLS or MP4 for adaptive streaming
- Generating thumbnails or animated GIFs from a video file
- Resizing, cropping, and watermarking a batch of product images
- Extracting or transcoding audio tracks (e.g. to MP3 or FLAC) and concatenating multiple clips
- OCR'ing scanned PDF documents or building a multi-step pipeline that resizes, optimizes, and uploads images to S3
- Developers building media upload/transformation pipelines
- Engineers needing automated video/audio encoding or thumbnail generation
- Teams processing documents (OCR) or images at scale via API
- Coding agent users (Copilot/Cursor) who want to script media workflows without manual Transloadit dashboard use
transloadit-media-processing FAQ
Yes, a free Transloadit account is required to get API credentials (key and secret).
No, processing runs on Transloadit's cloud infrastructure via assemblies; the agent only configures and triggers jobs.
Yes, you can save a set of steps as a reusable Template on Transloadit and pass dynamic values via ${variables}.
Use the /http/import robot to pull files from URLs, S3, GCS, Azure, FTP, or Dropbox before processing.
The MCP server (@transloadit/mcp-server) is recommended for Copilot/IDE integration; the CLI (@transloadit/node) works for direct command-line use.
Full instructions (SKILL.md)
Source of truth, from github/awesome-copilot.
name: transloadit-media-processing
description: 'Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.'
license: MIT
compatibility: Requires a free Transloadit account (https://transloadit.com/signup). Uses the @transloadit/mcp-server MCP server or the @transloadit/node CLI.
Transloadit Media Processing
Process, transform, and encode media files using Transloadit's cloud infrastructure. Supports video, audio, images, and documents with 86+ specialized processing robots.
When to Use This Skill
Use this skill when you need to:
- Encode video to HLS, MP4, WebM, or other formats
- Generate thumbnails or animated GIFs from video
- Resize, crop, watermark, or optimize images
- Convert between image formats (JPEG, PNG, WebP, AVIF, HEIF)
- Extract or transcode audio (MP3, AAC, FLAC, WAV)
- Concatenate video or audio clips
- Add subtitles or overlay text on video
- OCR documents (PDF, scanned images)
- Run speech-to-text or text-to-speech
- Apply AI-based content moderation or object detection
- Build multi-step media pipelines that chain operations together
Setup
Option A: MCP Server (recommended for Copilot)
Add the Transloadit MCP server to your IDE config. This gives the agent direct access
to Transloadit tools (create_template, create_assembly, list_assembly_notifications, etc.).
VS Code / GitHub Copilot (.vscode/mcp.json or user settings):
{
  "servers": {
    "transloadit": {
      "command": "npx",
      "args": ["-y", "@transloadit/mcp-server", "stdio"],
      "env": {
        "TRANSLOADIT_KEY": "YOUR_AUTH_KEY",
        "TRANSLOADIT_SECRET": "YOUR_AUTH_SECRET"
      }
    }
  }
}
Get your API credentials at https://transloadit.com/c/-/api-credentials
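If you would rather not paste credentials into the file by hand, the config above can be generated from environment variables. The following is a minimal sketch, not part of any Transloadit package: `build_mcp_config` is a hypothetical helper name, and the script assumes TRANSLOADIT_KEY and TRANSLOADIT_SECRET are already exported in your shell.

```python
import json
import os


def build_mcp_config(key: str, secret: str) -> dict:
    """Return the .vscode/mcp.json structure shown above as a Python dict."""
    return {
        "servers": {
            "transloadit": {
                "command": "npx",
                "args": ["-y", "@transloadit/mcp-server", "stdio"],
                "env": {
                    "TRANSLOADIT_KEY": key,
                    "TRANSLOADIT_SECRET": secret,
                },
            }
        }
    }


if __name__ == "__main__":
    # Falls back to the placeholder strings from the example config when unset.
    config = build_mcp_config(
        os.environ.get("TRANSLOADIT_KEY", "YOUR_AUTH_KEY"),
        os.environ.get("TRANSLOADIT_SECRET", "YOUR_AUTH_SECRET"),
    )
    print(json.dumps(config, indent=2))
```

Redirect the output to .vscode/mcp.json if the result looks right, and keep that file out of version control since it then contains your secret.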
Option B: CLI
If you prefer running commands directly:
npx -y @transloadit/node assemblies create \
  --steps '{"encoded": {"robot": "/video/encode", "use": ":original", "preset": "hls-1080p"}}' \
  --wait \
  --input ./my-video.mp4
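Hand-quoting JSON inside a shell command gets error-prone once the steps grow. One way around that is to build the argument list in a script. The sketch below only uses the flags shown in the command above (--steps, --wait, --input); `build_assembly_command` is a hypothetical helper name, and nothing is executed unless you uncomment the last line.

```python
import json
import shlex


def build_assembly_command(steps: dict, input_path: str) -> list[str]:
    """Build the argv list for the CLI invocation shown above."""
    return [
        "npx", "-y", "@transloadit/node",
        "assemblies", "create",
        "--steps", json.dumps(steps),
        "--wait",
        "--input", input_path,
    ]


if __name__ == "__main__":
    steps = {
        "encoded": {
            "robot": "/video/encode",
            "use": ":original",
            "preset": "hls-1080p",
        }
    }
    argv = build_assembly_command(steps, "./my-video.mp4")
    # shlex.join quotes the JSON so the printed line can be pasted into a POSIX shell.
    print(shlex.join(argv))
    # To actually run it (assumes Node.js is installed and credentials are configured):
    # import subprocess; subprocess.run(argv, check=True)
```

Passing the list to subprocess.run without a shell avoids quoting issues entirely, which is why the helper returns a list rather than a string.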
Core Workflows
Encode Video to HLS (Adaptive Streaming)
{
  "steps": {
    "encoded": {
      "robot": "/video/encode",
      "use": ":original",
      "preset": "hls-1080p"
    }
  }
}
Generate Thumbnails from Video
{
  "steps": {
    "thumbnails": {
      "robot": "/video/thumbs",
      "use": ":original",
      "count": 8,
      "width": 320,
      "height": 240
    }
  }
}
Resize and Watermark Images
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1200,
      "height": 800,
      "resize_strategy": "fit"
    },
    "watermarked": {
      "robot": "/image/resize",
      "use": "resized",
      "watermark_url": "https://example.com/logo.png",
      "watermark_position": "bottom-right",
      "watermark_size": "15%"
    }
  }
}
OCR a Document
{
  "steps": {
    "recognized": {
      "robot": "/document/ocr",
      "use": ":original",
      "provider": "aws",
      "format": "text"
    }
  }
}
Concatenate Audio Clips
{
  "steps": {
    "imported": {
      "robot": "/http/import",
      "url": ["https://example.com/clip1.mp3", "https://example.com/clip2.mp3"]
    },
    "concatenated": {
      "robot": "/audio/concat",
      "use": "imported",
      "preset": "mp3"
    }
  }
}
Multi-Step Pipelines
Steps can be chained using the "use" field. Each step references a previous step's output:
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1920
    },
    "optimized": {
      "robot": "/image/optimize",
      "use": "resized"
    },
    "exported": {
      "robot": "/s3/store",
      "use": "optimized",
      "bucket": "my-bucket",
      "path": "processed/${file.name}"
    }
  }
}
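A mistyped step name in a "use" field is easy to miss in a longer pipeline. The sketch below is a local sanity check that could run before submitting the steps; it is not Transloadit's own validation. `check_use_references` is a hypothetical helper, and it only handles "use" given as a single string or a list of strings. Any other shape is skipped.

```python
def check_use_references(steps: dict) -> list[str]:
    """Return a list of problems; an empty list means every reference resolves."""
    problems = []
    known = set(steps) | {":original"}
    for name, config in steps.items():
        use = config.get("use")
        if use is None:
            # Import-style steps such as /http/import have no "use" field.
            continue
        if isinstance(use, str):
            refs = [use]
        elif isinstance(use, list):
            refs = use
        else:
            # Other shapes are outside the scope of this sketch.
            continue
        for ref in refs:
            if ref == name:
                problems.append(f"step '{name}' references itself")
            elif ref not in known:
                problems.append(f"step '{name}' references undefined step '{ref}'")
    return problems


if __name__ == "__main__":
    pipeline = {
        "resized": {"robot": "/image/resize", "use": ":original", "width": 1920},
        "optimized": {"robot": "/image/optimize", "use": "resized"},
        "exported": {"robot": "/s3/store", "use": "optimised", "bucket": "my-bucket"},
    }
    # "optimised" is a deliberate typo, so one problem is reported.
    for problem in check_use_references(pipeline):
        print(problem)
```

The check only confirms that names resolve. Whether a robot accepts the file type produced by the step it consumes is still decided by Transloadit at run time.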
Key Concepts
- Assembly: A single processing job. Created via create_assembly (MCP) or assemblies create (CLI).
- Template: A reusable set of steps stored on Transloadit. Created via create_template (MCP) or templates create (CLI).
- Robot: A processing unit (e.g., /video/encode, /image/resize). See full list at https://transloadit.com/docs/transcoding/
- Steps: JSON object defining the pipeline. Each key is a step name, each value configures a robot.
- :original: Refers to the uploaded input file.
Tips
- Use --wait with the CLI to block until processing completes.
- Use preset values (e.g., "hls-1080p", "mp3", "webp") for common format targets instead of specifying every parameter.
- Chain "use": "step_name" to build multi-step pipelines without intermediate downloads.
- For batch processing, use /http/import to pull files from URLs, S3, GCS, Azure, FTP, or Dropbox.
- Templates can include ${variables} for dynamic values passed at assembly creation time.
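Before saving a set of steps as a Template, it can help to list every ${...} placeholder it contains so you know which values need to be available when the Assembly runs. The sketch below does this locally with a regular expression. `find_placeholders` is a hypothetical helper, and the script only reports names; it does not substitute them or model how Transloadit resolves them.

```python
import re

# Matches ${...} and captures the text between the braces.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def find_placeholders(value) -> set[str]:
    """Recursively collect placeholder names from strings, lists, and dicts."""
    if isinstance(value, str):
        return set(PLACEHOLDER.findall(value))
    if isinstance(value, dict):
        value = list(value.values())
    if isinstance(value, list):
        found = set()
        for item in value:
            found |= find_placeholders(item)
        return found
    # Numbers, booleans, and None cannot contain placeholders.
    return set()


if __name__ == "__main__":
    steps = {
        "optimized": {"robot": "/image/optimize", "use": ":original"},
        "exported": {
            "robot": "/s3/store",
            "use": "optimized",
            "bucket": "my-bucket",
            "path": "processed/${file.name}",
        },
    }
    print(sorted(find_placeholders(steps)))  # ['file.name']
```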