Gemini

Comprehensive Gemini integration supporting Veo 3 video generation, Gemini Flash text generation, Nano Banana image generation, chat completions, and multimodal AI capabilities via the Google Gemini API.

Category: artificial intelligence

Auth: NO_AUTH

Actions: 8

About

What is Gemini?

Comprehensive Gemini integration supporting Veo 3 video generation, Gemini Flash text generation, Nano Banana image generation, chat completions, and multimodal AI capabilities via the Google Gemini API. On Nagent, Gemini is exposed as a fully configurable artificial intelligence integration that any agent can call, with 8 actions and no authentication required. No code is required to wire Gemini into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.

What can you build with Gemini?

Agent builders use Gemini to automate tasks that teams previously handled manually. Concrete examples, each a single agent step in Nagent, include:

Count Tokens (Gemini) — Counts the number of tokens in text using Gemini tokenization.
Embed Content (Gemini) — Generates text embeddings using Gemini embedding models.
Generate Content (Gemini) — Generates text content or speech audio from prompts using Gemini models.
Generate Image (Nano Banana) — Generates images from text prompts using Gemini models (Nano Banana).
Generate Videos (Veo) — Generates videos from text prompts using Google's Veo models.
Get Videos Operation (Veo) (Deprecated) — DEPRECATED: Use WaitForVideo instead.

Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Gemini into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.

What You Can Do

Actions (8)

Every operation an agent can call against Gemini, with input parameters and output schema. Drop these into any step of an agent built in Helix.

Count Tokens (Gemini): GEMINI_COUNT_TOKENS

Counts the number of tokens in text using Gemini tokenization. Useful for estimating costs, checking input limits, and optimizing prompts before making API calls.

Input parameters

text (string, required): Text to count tokens for.

model (string, optional): Model to use for token counting. Must be a model that supports the countTokens method. Use the ListModels action to see available models and their supported methods.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
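
One way to use token counts for checking input limits is to pack text into chunks that each stay under a budget. The sketch below is a minimal illustration, not Nagent's implementation: split_to_fit and fake_count are hypothetical names, and fake_count is a whitespace word counter standing in for whatever returns the count from GEMINI_COUNT_TOKENS.

```python
from typing import Callable, List


def split_to_fit(paragraphs: List[str], max_tokens: int,
                 count_tokens: Callable[[str], int]) -> List[str]:
    """Greedily pack paragraphs into chunks whose token count stays <= max_tokens."""
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = para if not current else current + "\n\n" + para
        if count_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single oversized paragraph becomes its own chunk; the caller decides what to do with it.
            current = para
    if current:
        chunks.append(current)
    return chunks


def fake_count(text: str) -> int:
    """Stand-in counter (assumption): real counts would come from GEMINI_COUNT_TOKENS."""
    return len(text.split())
```

Passing the counter in as a callable keeps the packing logic testable without calling the action on every run.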

Embed Content (Gemini): GEMINI_EMBED_CONTENT

Generates text embeddings using Gemini embedding models. Converts text into numerical vectors for semantic search, similarity comparison, clustering, and classification tasks.

Input parameters

text (string, required): The text content to generate embeddings for.

model (string, optional): Embedding model to use. Options: 'text-embedding-004' (768 dimensions, default), 'gemini-embedding-001' (3072 dimensions, latest).

title (string, optional): Title for the content. Use with task_type='RETRIEVAL_DOCUMENT' to improve embedding quality for document search.

task_type (string, optional): Specifies the intended use case to optimize the embedding. Options: 'RETRIEVAL_QUERY' (search queries), 'RETRIEVAL_DOCUMENT' (documents to be searched), 'SEMANTIC_SIMILARITY' (text similarity), 'CLASSIFICATION' (categorization), 'CLUSTERING' (grouping), 'QUESTION_ANSWERING' (question-document matching). Note: some task types, such as 'CODE_RETRIEVAL_QUERY', may only be supported by certain models.

output_dimensionality (integer, optional): Truncate the embedding to this number of dimensions. Only supported by the 'gemini-embedding-001' model. Recommended values: 768, 1536, or 3072. Lower dimensions reduce storage but may affect quality.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
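
For the semantic search and similarity use cases mentioned above, embeddings are typically compared with cosine similarity. The following is a minimal sketch that assumes the vectors have already been parsed out of the action's data field into plain lists of floats; cosine_similarity and rank_documents are illustrative names, not part of the integration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a retrieval setup, the query vector would come from a call with task_type='RETRIEVAL_QUERY' and the document vectors from calls with task_type='RETRIEVAL_DOCUMENT', all using the same model and dimensionality.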

Generate Content (Gemini): GEMINI_GENERATE_CONTENT

Generates text content or speech audio from prompts using Gemini models. Supports text generation models (Gemini Flash, Pro) and text-to-speech models with configurable parameters. Generated text is nested at results[i].response.data.text. Output may be wrapped in markdown fences (e.g., ```html...```) or preceded by explanatory prose; strip these before file writing or rendering.

Input parameters

model (string, optional): Model to use for generation. Text generation models: 'gemini-2.5-flash' (default, fast and efficient), 'gemini-2.5-pro' (advanced reasoning), 'gemini-2.0-flash' (previous generation), 'gemini-2.0-flash-lite' (cost-optimized). Text-to-speech models: 'gemini-2.5-flash-preview-tts' (low latency), 'gemini-2.5-pro-preview-tts' (high quality). Note: TTS models require the voice_name parameter and return audio data instead of text.

top_k (integer, optional): Top-k sampling parameter.

top_p (number, optional): Nucleus sampling parameter (0.0 to 1.0).

prompt (string, required): The text prompt for content generation. Example: 'Write a short poem about the ocean' or 'Explain quantum computing in simple terms'. For TTS models, include style instructions in the prompt (e.g., 'Say cheerfully: Hello!').

voice_name (optional): Prebuilt voice for text-to-speech generation. The complete list of 30 official Gemini TTS voices is documented at: https://ai.google.dev/gemini-api/docs/speech-generation

temperature (number, optional): Controls randomness (0.0 to 2.0).

stop_sequences (array, optional): Sequences where generation should stop.

safety_settings (array, optional): Safety filter settings.

max_output_tokens (integer, optional): Maximum number of tokens to generate. If the response has finishReason='MAX_TOKENS', the output was truncated; narrow the prompt scope or increase this value and regenerate.

system_instruction (string, optional): System instruction to guide the model's behavior.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
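
Because generated text may arrive wrapped in markdown fences or preceded by explanatory prose, a post-processing step is usually needed before writing it to a file. The sketch below shows one possible approach using a regular expression; extract_payload is a hypothetical helper, and it assumes the first fenced block is the content you want.

```python
import re

FENCE = "`" * 3  # three backticks, built this way so the example stays readable

# Opening fence with an optional language tag, the body, then the closing fence.
_FENCED_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)\n?" + FENCE, re.DOTALL)


def extract_payload(text: str) -> str:
    """Return the body of the first fenced block, or the trimmed text if there is none."""
    match = _FENCED_BLOCK.search(text)
    if match:
        return match.group(1).strip()
    return text.strip()
```

If a response can contain several fenced blocks, or prose that should be kept, this heuristic is too blunt and the step should be adapted to the expected output.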

Generate Image (Nano Banana): GEMINI_GENERATE_IMAGE

Generates images from text prompts using Gemini models (Nano Banana). Supports models: 'gemini-2.5-flash-image' (GA stable, fast), 'gemini-3-pro-image-preview' (Nano Banana Pro: advanced, with 4K resolution, thinking mode, and up to 14 reference images), and 'gemini-2.0-flash-exp-image-generation' (2.0 Flash experimental). Returns one image per call; images are uploaded to S3. Parse the response at data.image.s3url or the text-type entry in data.content; prefer the URL to avoid base64 blobs. Always validate s3url before treating a call as successful, because a 200 response may contain only text with no image. Store s3url immediately, as URLs can expire. Output formats are raster only (JPG/PNG/WebP); request PNG for transparency. Concurrent usage may trigger HTTP 429/RESOURCE_EXHAUSTED, so keep concurrency ≤3 and use exponential backoff (1s→2s→4s, ~5 retries). Note: never set SYNC_TO_WORKBENCH to true when calling this action through RUBE_MULTI_EXECUTE_TOOL.

Input parameters

model (string, optional; one of "gemini-2.5-flash-image" | "gemini-3-pro-image-preview" | "gemini-2.0-flash-exp-image-generation"): Model to use for image generation. Options: 'gemini-2.5-flash-image' (GA stable, fast), 'gemini-3-pro-image-preview' (advanced with 4K, thinking mode), 'gemini-2.0-flash-exp-image-generation' (2.0 Flash experimental).

top_k (integer, optional): Top-k sampling parameter.

top_p (number, optional): Nucleus sampling parameter (0.0 to 1.0).

prompt (string, required): Text prompt for image generation. Sensitive, trademarked, or explicit content triggers HTTP 400 (PROHIBITED_CONTENT or IMAGE_RECITATION) with no image returned; rephrase into neutral, policy-compliant language rather than retrying identical prompts.

timeout (number, optional): Request timeout in seconds. Default is 300 seconds (5 minutes). Increase for complex prompts or high-resolution images. Minimum 120 seconds, maximum 600 seconds.

image_size (string, optional; one of "1K" | "2K" | "4K"): Output resolution (only for 'gemini-3-pro-image-preview').

temperature (number, optional): Controls randomness (0.0 to 2.0).

aspect_ratio (string, optional; one of "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9"): Aspect ratio for the generated image. Not supported by the 'gemini-2.0-flash-exp-image-generation' model. Accepted values: '1:1', '4:5', '16:9', '9:16'. Unsupported strings will fail or silently default to 1:1.

safety_settings (array, optional): Safety filter settings. List of objects specifying content categories to filter and threshold levels. Each setting requires 'category' (HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, or HARM_CATEGORY_DANGEROUS_CONTENT) and 'threshold' (BLOCK_NONE, BLOCK_LOW_AND_ABOVE, BLOCK_MEDIUM_AND_ABOVE, or BLOCK_ONLY_HIGH).

max_output_tokens (integer, optional): Maximum number of tokens to generate (max 32,768). For image generation, images consume tokens based on resolution: 1K/2K consume 1,120 tokens, 4K consumes 2,000 tokens. If set too low, the API may return a MAX_TOKENS finish reason with no image. If not specified, the API uses its default, which is sufficient for image generation.

system_instruction (string, optional): System instruction to guide image generation behavior.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
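
The retry guidance for this action (exponential backoff of 1s, 2s, 4s over roughly 5 retries, plus validating s3url) can be sketched as follows. This is an illustration under assumptions: RateLimited is a hypothetical exception standing in for however your step surfaces HTTP 429/RESOURCE_EXHAUSTED, call is any zero-argument function that runs the action, and the parsed data is assumed to be a dict with the image URL at data["image"]["s3url"].

```python
import time
from typing import Any, Callable, Dict, Optional


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 / RESOURCE_EXHAUSTED response."""


def call_with_backoff(call: Callable[[], Dict[str, Any]],
                      max_retries: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Run call(), doubling the wait after each rate-limit error (1s, 2s, 4s, ...)."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller decide what to do
            sleep(delay)
            delay *= 2
    raise RuntimeError("unreachable")  # the loop always returns or raises


def image_url(data: Dict[str, Any]) -> Optional[str]:
    """Return a non-empty s3url, or None when the response carried only text."""
    image = data.get("image") or {}
    url = image.get("s3url")
    return url if isinstance(url, str) and url.strip() else None
```

Treat a None result from image_url as a failed generation even when the call itself succeeded, and store a returned URL right away since it can expire.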

Generate Videos (Veo): GEMINI_GENERATE_VIDEOS

Generates videos from text prompts using Google's Veo models. Returns an operation_name for tracking; pass it verbatim (no edits) to GEMINI_WAIT_FOR_VIDEO or GEMINI_GET_VIDEOS_OPERATION. Jobs take 30–180+ seconds; wait 10s before first poll, then poll every 10–30s (allow up to 12 min). Successful results include data.video_file.s3url — missing s3url means failure. If done=true but no video_file, check raiMediaFilteredReasons (safety block); revise prompt and regenerate. Text-only; cannot accept image inputs. Max ~3–5 concurrent jobs; 429 RESOURCE_EXHAUSTED requires exponential backoff. For retries, always start a fresh call — never reuse a failed operation_name.

Input parameters

seed (integer, optional): Seed value for reproducibility. Only supported by Veo 3/3.1 models (VEO_3, VEO_3_FAST, VEO_3_1, VEO_3_1_FAST). VEO_2 does not support seed; using seed with VEO_2 results in a validation error.

model (string, optional; one of "veo-2.0-generate-001" | "veo-3.0-generate-001" | "veo-3.0-fast-generate-001" | "veo-3.1-generate-preview" | "veo-3.1-fast-generate-preview"): Veo model for video generation. Available enum values: VEO_3 (default, recommended), VEO_2, VEO_3_FAST, VEO_3_1 (newest), VEO_3_1_FAST (newest). Avoid preview model ID variants (e.g., '*generate-preview*'), which fail to produce downloadable URIs. Use only stable IDs: veo-2.0-generate-001 or veo-3.0-generate-001.

prompt (string, required): Text prompt for Veo video generation. Must be a non-empty string describing the video to generate.

resolution (string, optional; one of "720p" | "1080p"): Supported resolutions for video generation.

aspect_ratio (string, optional; one of "16:9" | "9:16"): Supported aspect ratios for video generation.

negative_prompt (string, optional): Text describing content to avoid in the generated video (e.g., 'cartoon, drawing, low quality').

duration_seconds (integer, optional; one of "4" | "5" | "6" | "7" | "8"): Supported video durations in seconds. Model-specific restrictions apply:
- Veo 2: supports 5, 6, 7, or 8 seconds (4 seconds is not supported).
- Veo 3/3.1 models: support 4, 6, or 8 seconds (5 and 7 seconds are not supported).

person_generation (string, optional; one of "dont_allow" | "allow_adult" | "allow_all"): Person generation safety settings for video generation. Model-specific restrictions apply:
- Veo 2: supports DONT_ALLOW, ALLOW_ADULT, and ALLOW_ALL.
- Veo 3/3.1 models: only ALLOW_ALL is supported (requires allowlist access).

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
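
The model-specific restrictions on seed, duration_seconds, and person_generation are easy to violate by accident, so a pre-flight check can save a failed call. The sketch below encodes only the rules stated in the parameter descriptions above; check_veo_request is a hypothetical helper, and treating 'veo-3.0-generate-001' as the default model ID is an assumption based on the VEO_3 default.

```python
from typing import List, Optional

VEO_2 = "veo-2.0-generate-001"
VEO_3_FAMILY = {
    "veo-3.0-generate-001",
    "veo-3.0-fast-generate-001",
    "veo-3.1-generate-preview",
    "veo-3.1-fast-generate-preview",
}


def check_veo_request(prompt: str,
                      model: str = "veo-3.0-generate-001",
                      seed: Optional[int] = None,
                      duration_seconds: Optional[int] = None,
                      person_generation: Optional[str] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look valid."""
    problems: List[str] = []
    if not prompt.strip():
        problems.append("prompt must be a non-empty string")
    if model != VEO_2 and model not in VEO_3_FAMILY:
        problems.append(f"unknown model: {model}")
        return problems  # the remaining rules depend on the model family
    is_veo_2 = model == VEO_2
    if seed is not None and is_veo_2:
        problems.append("seed is not supported by Veo 2")
    if duration_seconds is not None:
        allowed = {5, 6, 7, 8} if is_veo_2 else {4, 6, 8}
        if duration_seconds not in allowed:
            problems.append(f"duration_seconds must be one of {sorted(allowed)} for {model}")
    if person_generation is not None:
        allowed_pg = {"dont_allow", "allow_adult", "allow_all"} if is_veo_2 else {"allow_all"}
        if person_generation not in allowed_pg:
            problems.append(f"person_generation must be one of {sorted(allowed_pg)} for {model}")
    return problems
```

Returning a list rather than raising on the first problem lets an agent step report every issue in one pass.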

Get Videos Operation (Veo) (Deprecated): GEMINI_GET_VIDEOS_OPERATION

DEPRECATED: Use WaitForVideo instead. Checks status of a Veo video generation operation. Use operation_name from GenerateVideos to track progress. Wait several seconds after starting GenerateVideos before the first call to avoid OPERATION_NOT_FOUND. Poll at 10–30s intervals; use exponential backoff on HTTP 429 RESOURCE_EXHAUSTED; cap total polling at ~15 minutes. The operation is complete when done=true AND a valid video URI is present; done=true without video_file indicates that safety filtering blocked the output, so inspect raiMediaFilteredReasons and rephrase the prompt. The video URL is at generatedSamples[].video.uri; persist it promptly, as URLs are time-limited. Keep concurrent polling to 3–5 parallel calls to avoid rate limits. If WaitForVideo times out, continue polling here using the same operation_name rather than starting a new GenerateVideos job.

Input parameters

operation_name (string, required): The operation resource name from GEMINI_GENERATE_VIDEOS. Accepts either the full resource name 'models/{model}/operations/{operation_id}' or just the operation ID. If only the operation ID is provided, it is expanded to use the default model 'veo-3.0-generate-001'. Pass it exactly as returned; do not truncate or edit. Never reuse an operation_name from a failed job; start a new GenerateVideos call instead.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
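
The completion rules described for this action (done=true plus a video URI means success; done=true without a video points to safety filtering) can be expressed as a small classifier. This sketch assumes the polled result has been parsed into a dict with done, generatedSamples, and raiMediaFilteredReasons at the top level; the actual nesting in the data field may differ, and classify_operation is a hypothetical name.

```python
from typing import Any, Dict, Optional, Tuple


def classify_operation(op: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Map a polled operation to (status, detail).

    status is one of "pending", "ready", "filtered", or "failed";
    detail is the video URI for "ready" and the joined reasons for "filtered".
    """
    if not op.get("done"):
        return ("pending", None)
    # Assumed layout: generatedSamples[].video.uri, as named in the description above.
    for sample in op.get("generatedSamples") or []:
        uri = (sample.get("video") or {}).get("uri")
        if uri:
            return ("ready", uri)
    reasons = op.get("raiMediaFilteredReasons")
    if reasons:
        return ("filtered", "; ".join(reasons))
    return ("failed", None)
```

A "filtered" or "failed" result calls for a revised prompt and a fresh GenerateVideos call, not another poll of the same operation_name.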

List Models (Gemini API): GEMINI_LIST_MODELS

Lists available Gemini and Veo models with their capabilities and limits. Useful for discovering supported models and their features before making generation requests. Before calling video generation tools, verify model availability here — preview Veo models (e.g., veo-3.0-generate-preview) may be unavailable or return missing video URIs; prefer stable models like veo-2.0-generate-001.

Input parameters

page_size (integer, optional): Maximum number of models to return per page (default 50, max 1000).

page_token (string, optional): Token from a previous response's nextPageToken to retrieve the next page.

filter_prefix (string, optional): Filter models by name prefix (client-side). Leave empty to get all models.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
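
Paging through results with page_token and nextPageToken follows the usual loop: request a page, collect its items, and repeat until no token comes back. The sketch below is illustrative only. fetch_page is a hypothetical callable standing in for the GEMINI_LIST_MODELS step, and the response is assumed to be a dict with a "models" list (each entry having a "name") and an optional "nextPageToken".

```python
from typing import Any, Callable, Dict, List, Optional


def list_all_models(fetch_page: Callable[..., Dict[str, Any]],
                    page_size: int = 50,
                    filter_prefix: str = "") -> List[Dict[str, Any]]:
    """Collect every page of models, then apply a client-side name-prefix filter."""
    models: List[Dict[str, Any]] = []
    token: Optional[str] = None
    while True:
        page = fetch_page(page_size=page_size, page_token=token)
        models.extend(page.get("models", []))
        token = page.get("nextPageToken")
        if not token:
            break  # no further pages
    return [m for m in models if m.get("name", "").startswith(filter_prefix)]
```

An empty filter_prefix matches every name, which mirrors the "leave empty to get all models" behaviour described for the parameter.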

Wait and Download Video (Veo): GEMINI_WAIT_FOR_VIDEO

Polls a Veo video generation operation until completion, then downloads and returns the video as a FileDownloadable. Generation takes 30–120+ seconds (up to ~10–12 min); long waits are normal, not failures. On completion, the URL is nested at data.video_file.s3url — validate it is non-empty before downstream use. A done=true response without a valid s3url indicates safety filter rejection (check raiMediaFilteredReasons) or quota exhaustion — adjust the prompt and regenerate. On timeout, use GEMINI_GET_VIDEOS_OPERATION with incremental backoff before starting a new job. Keep parallel jobs to 3–5 to avoid 429 RESOURCE_EXHAUSTED errors.

Input parameters

operation_name (string, required): The full operation name returned by GEMINI_GENERATE_VIDEOS. Format: 'models/<model-id>/operations/<operation-id>', where <operation-id> is an alphanumeric string (e.g., 'models/veo-3.0-generate-001/operations/m8dl4dtqqzg8'). Do not use placeholder values like '...'; use the exact operation_name string from the generate videos response. It must come from a generate-video operation (VEO_2, VEO_3, VEO_3_FAST models), not a generate-preview operation (VEO_3_1, VEO_3_1_FAST models). Do not reuse an operation_name from a failed GEMINI_GENERATE_VIDEOS job; always start a new GEMINI_GENERATE_VIDEOS call for retried requests.

Output

data (string, required): Data from the action execution.

error (string, optional): Error, if any, that occurred during the execution of the action.

successful (boolean, required): Whether the action execution was successful.
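
The operation_name format rules above lend themselves to a quick check before the wait step runs. This is a minimal sketch: operation_name_problem is a hypothetical helper, and the character set allowed in the model ID is an assumption inferred from the model IDs listed on this page.

```python
import re
from typing import Optional

# 'models/<model-id>/operations/<operation-id>' with an alphanumeric operation ID.
_OPERATION_NAME = re.compile(
    r"^models/(?P<model>[A-Za-z0-9.\-]+)/operations/(?P<op>[A-Za-z0-9]+)$"
)


def operation_name_problem(name: str) -> Optional[str]:
    """Return a description of what is wrong with name, or None if it looks usable."""
    match = _OPERATION_NAME.match(name)
    if not match:
        return "expected 'models/<model-id>/operations/<operation-id>'"
    if "generate-preview" in match.group("model"):
        return "operation comes from a generate-preview model, which this action does not accept"
    return None
```

The check only validates the shape of the string; it cannot tell whether the operation belongs to a failed job, so retried requests still need a new GEMINI_GENERATE_VIDEOS call.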

On the Nagent Platform

Agents using Gemini

No publicly available marketplace agent uses this tool yet. There are 63 agents privately built on Nagent that already use Gemini.

Build on Nagent

Build an agent that uses Gemini

Connect Gemini to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.

Frequently Asked

Building with Gemini on Nagent

The five questions agent builders ask before adopting a new integration.

How do I connect Gemini to my Nagent agent?

Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Gemini, and click "Connect Now." The integration is listed as NO_AUTH, so there is no sign-in step and no credentials for you to enter. Once connected, Gemini is available to any agent in your workspace.

Do I need to write code to use Gemini?

No. Nagent provides no-code integration for every tool. Once Gemini is connected, you configure its 8 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.

How do I configure Gemini actions and triggers in Helix?

Helix — Nagent's agentic agent builder — lets you drop Gemini steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Gemini event fires, the agent kicks off automatically.

What input and output schemas does Gemini support?

Every Gemini action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.

Can I extend Gemini with custom logic?

Yes. While Gemini ships with 8 pre-built artificial intelligence actions, you can layer custom logic around them inside Helix — pre/post-processing steps, conditional branches, retries, or stitching Gemini together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab — forward-deployed engineers who build Gemini-based workflows tailored to your business.