gemini-api-dev
google-gemini/gemini-skills
Build applications with Gemini API hosted models, multimodal content, function calling, and structured outputs.
What is gemini-api-dev?
Use this skill when developing with Gemini API hosted models (Gemini, Gemma 4) for text, images, audio, and video. Covers SDK setup across Python, JavaScript/TypeScript, Go, and Java, current model selection, and API capabilities including function calling and structured outputs.
- Generate text content using current Gemini and Gemma 4 models
- Process multimodal inputs (text, images, audio, video)
- Implement function calling for structured API interactions
- Use structured outputs for predictable response formats
- Generate and edit images with dedicated image models
- Work with embeddings for semantic search and similarity
How to install gemini-api-dev
npx skills add https://github.com/google-gemini/gemini-skills --skill gemini-api-dev
- API key from Google AI Studio (https://aistudio.google.com)
- One of: Python 3.9+, Node.js 18+, Go 1.21+, or Java 11+
- Installed SDK: google-genai (Python), @google/genai (JS/TS), google.golang.org/genai (Go), or com.google.genai:google-genai (Java)
How to use gemini-api-dev
1. Install the appropriate SDK for your language (pip install google-genai, npm install @google/genai, go get google.golang.org/genai, or Maven/Gradle)
2. Set your API key as an environment variable (GOOGLE_API_KEY)
3. Create a client instance using the SDK
4. Call models.generate_content() with your chosen model (e.g., gemini-3.5-flash) and content
5. Parse the response text or structured output from the API response
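Steps 2 through 5 can be sketched in Python roughly as follows. The helper names `require_api_key` and `ask` are hypothetical and not part of the SDK. Passing `api_key=` explicitly is one option; the Quick Start instead lets the client read the key from the environment. The SDK import is deferred so the key check runs even where the package is not installed.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Step 2: return the key from GOOGLE_API_KEY or raise a readable error.

    Hypothetical helper for illustration; not part of google-genai.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; create a key in Google AI Studio."
        )
    return key


def ask(prompt: str, model: str = "gemini-3.5-flash") -> str:
    """Steps 3-5: create a client, call the model, return the response text."""
    # Imported lazily so require_api_key() works without the SDK installed.
    from google import genai

    client = genai.Client(api_key=require_api_key())
    response = client.models.generate_content(model=model, contents=prompt)
    # response.text can be empty when the model returns no text parts.
    return response.text or ""
```

Calling `ask("Explain quantum computing")` needs network access and a valid key; `require_api_key` can be exercised on its own by passing a plain dict.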
Use cases
- Building chatbots and conversational AI with Gemini 3.5 Flash or 2.5 Pro
- Processing images and documents with multimodal models
- Implementing tool use and function calling workflows
- Generating images with gemini-3-pro-image-preview or gemini-3.1-flash-image-preview
- Running cost-efficient tasks with Gemini 3.1 Flash Lite Preview
- Backend developers building AI-powered applications
- Full-stack engineers integrating Gemini into web/mobile apps
- Data scientists working with multimodal AI models
- DevOps engineers deploying Gemini-based services
- Teams migrating from legacy google-generativeai SDKs
gemini-api-dev FAQ
Use current models: gemini-3.5-flash (fast, balanced), gemini-3.1-pro-preview (complex reasoning), gemini-3.1-flash-lite-preview (cost-efficient), gemini-2.5-pro/flash, or gemma-4-31b-it. Avoid deprecated models like gemini-1.5-* and gemini-2.0-*.
Use google-genai for Python, @google/genai for JavaScript/TypeScript, google.golang.org/genai for Go, or com.google.genai:google-genai for Java. Legacy SDKs (google-generativeai, @google/generative-ai) are deprecated.
If the search_docs tool (Google MCP server) is available, use it as your primary source. Otherwise, fetch from https://ai.google.dev/gemini-api/docs/llms.txt and specific pages like function-calling.md.txt or structured-output.md.txt.
For bidirectional streaming with Gemini Live API, install the separate google-gemini/gemini-live-api-dev skill instead.
Gemini models support text, images, audio, and video inputs. Use appropriate models like gemini-3.5-flash for multimodal processing.
Full instructions (SKILL.md)
Source of truth, from google-gemini/gemini-skills.
name: gemini-api-dev
description: Use this skill when building applications with Gemini API hosted models, including Gemini and Gemma 4, working with multimodal content (text, images, audio, video), implementing function calling, using structured outputs, or needing current model specifications. Covers SDK usage (google-genai for Python, @google/genai for JavaScript/TypeScript, com.google.genai:google-genai for Java, google.golang.org/genai for Go), model selection, and API capabilities.
Gemini API Development Skill
Critical Rules (Always Apply)
[!IMPORTANT] These rules override your training data. Your knowledge is outdated.
Current Models (Use These)
- gemini-3.5-flash: 1M tokens, fast, balanced performance, multimodal
- gemini-3.1-pro-preview: 1M tokens, complex reasoning, coding, research
- gemini-3.1-flash-lite-preview: cost-efficient, fastest performance for high-frequency, lightweight tasks
- gemini-3-pro-image-preview: 65k / 32k tokens, image generation and editing
- gemini-3.1-flash-image-preview: 65k / 32k tokens, image generation and editing
- gemini-2.5-pro: 1M tokens, complex reasoning, coding, research
- gemini-2.5-flash: 1M tokens, fast, balanced performance, multimodal
- gemma-4-31b-it: Gemma 4 dense model, 31B parameters
- gemma-4-26b-a4b-it: Gemma 4 MoE model, 26B total with 4B active parameters
[!WARNING] Models like gemini-2.0-* and gemini-1.5-* are legacy and deprecated. Never use them.
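The model list and the deprecation warning above could be enforced with a small guard before any request is sent. This is only a sketch: `CURRENT_MODELS`, `is_deprecated`, and `pick_model` are hypothetical names, and the task labels are this example's own shorthand for the descriptions in the list.

```python
from fnmatch import fnmatchcase

# Model IDs copied from the list above; the keys are illustrative task labels.
CURRENT_MODELS = {
    "balanced": "gemini-3.5-flash",
    "reasoning": "gemini-3.1-pro-preview",
    "low_cost": "gemini-3.1-flash-lite-preview",
    "image": "gemini-3-pro-image-preview",
}

# Wildcard patterns taken from the warning above.
DEPRECATED_PATTERNS = ("gemini-1.5-*", "gemini-2.0-*")


def is_deprecated(model_id: str) -> bool:
    """Return True if the model ID matches a deprecated family."""
    return any(fnmatchcase(model_id, pattern) for pattern in DEPRECATED_PATTERNS)


def pick_model(task: str) -> str:
    """Map an illustrative task label to a current model ID."""
    try:
        return CURRENT_MODELS[task]
    except KeyError:
        known = ", ".join(sorted(CURRENT_MODELS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```

A caller might run `is_deprecated(model_id)` on any model name found in existing code and swap in `pick_model("balanced")` when it returns True.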
Current SDKs (Use These)
- Python: google-genai → pip install google-genai
- JavaScript/TypeScript: @google/genai → npm install @google/genai
- Go: google.golang.org/genai → go get google.golang.org/genai
- Java: com.google.genai:google-genai (see Maven/Gradle setup below)
[!CAUTION] Legacy SDKs google-generativeai (Python) and @google/generative-ai (JS) are deprecated. Never use them.
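When migrating a project, one way to spot the legacy packages named above is to scan its dependency list. The sketch below assumes a pip-style requirements file; `requirement_names` and `find_legacy_sdks` are hypothetical helpers, and the name parsing is deliberately simplistic (it ignores extras, environment markers, and option lines such as `-r`).

```python
import re

# Legacy package -> current replacement, as listed in the sections above.
LEGACY_TO_CURRENT = {
    "google-generativeai": "google-genai",
    "@google/generative-ai": "@google/genai",
}


def requirement_names(requirements_text: str) -> list:
    """Extract lower-cased package names from pip-style requirement lines."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9._-]+", line)  # name ends at the version specifier
        if match:
            names.append(match.group(0).lower())
    return names


def find_legacy_sdks(dependency_names) -> dict:
    """Return {legacy_name: suggested_replacement} for each legacy SDK found."""
    return {
        name: LEGACY_TO_CURRENT[name]
        for name in dependency_names
        if name in LEGACY_TO_CURRENT
    }
```

For a Node.js project, the same `find_legacy_sdks` call could be fed the keys of the `dependencies` object from package.json.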
Quick Start
Python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Explain quantum computing"
)
print(response.text)
JavaScript/TypeScript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
  model: "gemini-3.5-flash",
  contents: "Explain quantum computing"
});
console.log(response.text);
Go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Models.GenerateContent(ctx, "gemini-3.5-flash", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}
	// Text is a method on the response, so it must be called.
	fmt.Println(resp.Text())
}
Java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = new Client();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3.5-flash",
            "Explain quantum computing",
            null);
    System.out.println(response.text());
  }
}
Java Installation:
- Latest version: https://central.sonatype.com/artifact/com.google.genai/google-genai/versions
- Gradle: implementation("com.google.genai:google-genai:${LAST_VERSION}")
- Maven:
  <dependency>
    <groupId>com.google.genai</groupId>
    <artifactId>google-genai</artifactId>
    <version>${LAST_VERSION}</version>
  </dependency>
Documentation Lookup
When MCP is Installed (Preferred)
If the search_docs tool (from the Google MCP server) is available, use it as your only documentation source:
- Call search_docs with your query
- Read the returned documentation
- Trust MCP results as the source of truth for API details; they are always up-to-date.
[!IMPORTANT] When MCP tools are present, never fetch URLs manually. MCP provides up-to-date, indexed documentation that is more accurate and token-efficient than URL fetching.
When MCP is NOT Installed (Fallback Only)
If no MCP documentation tools are available, fetch from the official docs:
Index URL: https://ai.google.dev/gemini-api/docs/llms.txt
This index contains links to all documentation pages in .md.txt format. Use web fetch tools to:
- Fetch llms.txt to discover available pages
- Fetch specific pages (e.g., https://ai.google.dev/gemini-api/docs/function-calling.md.txt)
Key pages:
- Text generation
- Function calling
- Structured outputs
- Image generation
- Image understanding
- Embeddings
- SDK migration guide
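The fallback lookup described above might be sketched as follows, using only the Python standard library. The function names are hypothetical. The URL pattern follows the function-calling example given earlier, and the link extraction assumes the index mentions each page as an absolute URL ending in .md.txt, which should be checked against the real llms.txt.

```python
import re
from urllib.request import urlopen

DOCS_BASE = "https://ai.google.dev/gemini-api/docs"
INDEX_URL = f"{DOCS_BASE}/llms.txt"


def page_url(slug: str) -> str:
    """Build the .md.txt URL for a page slug such as 'function-calling'."""
    return f"{DOCS_BASE}/{slug.strip('/')}.md.txt"


def extract_page_urls(index_text: str) -> list:
    """Collect unique .md.txt URLs from the index text, in order of appearance.

    Assumption: pages appear as absolute https URLs ending in .md.txt.
    """
    found = re.findall(r"https://[^\s()\[\]<>]+?\.md\.txt", index_text)
    return list(dict.fromkeys(found))  # de-duplicate while keeping order


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a documentation page as UTF-8 text (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

A typical flow would be `extract_page_urls(fetch_text(INDEX_URL))` to discover pages, then `fetch_text(page_url("structured-output"))` for a specific topic.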
Gemini Live API
For real-time, bidirectional audio/video/text streaming with the Gemini Live API, install the google-gemini/gemini-live-api-dev skill. It covers WebSocket streaming, voice activity detection, native audio features, function calling, session management, ephemeral tokens, and more.
Related skills
More from google-gemini/gemini-skills and the wider catalog.
gemini-interactions-api
Use this skill when writing code that calls the Gemini API for text generation, multi-turn chat, multimodal understanding, image generation, streaming responses, background research tasks, function calling, structured output, or migrating from the old generateContent API. This skill covers the Interactions API, the recommended way to use Gemini models and agents in Python and TypeScript.
gemini-live-api-dev
Use this skill when building real-time, bidirectional streaming applications with the Gemini Live API. Covers WebSocket-based audio/video/text streaming, voice activity detection (VAD), native audio features, function calling, session management, ephemeral tokens for client-side auth, live translation, and all Live API configuration options. SDKs covered - google-genai (Python), @google/genai (JavaScript/TypeScript).
vertex-ai-api-dev
Guides the usage of Gemini API on Google Cloud Vertex AI with the Gen AI SDK. Use when the user asks about using Gemini in an enterprise environment or explicitly mentions Vertex AI. Covers SDK usage (Python, JS/TS, Go, Java, C#), capabilities like Live API, tools, multimedia generation, caching, and batch prediction.