Run large language models locally or in the cloud with Ollama
Run large language models locally or in the cloud with Ollama On Nagent, Ollama is exposed as a fully-configurable ai models integration that any agent can call — 8 actions, and API key authentication. No code is required to wire Ollama into your workflow — connect it once via the External Integrations panel and reuse it across every agent you build.
Agent builders use Ollama to automate the kinds of tasks ai models teams previously handled manually. Concrete examples — each one is a single agent step in Nagent — include:
Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Ollama into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.
Every operation an agent can call against Ollama, with input parameters and output schema. Drop these into any step of an agent built in Helix.
OLLAMA_CHATTool to send a chat message with conversation history to Ollama. Use when you need to have a multi-turn conversation with an LLM model.
Input parameters
Model name to use for generating responses
Enables thinking output; accepts true/false or 'high'/'medium'/'low'
Optional function tools the model may call during chat
Format for response: 'json' or JSON schema object
Returns streamed partial responses; defaults to false
Runtime options controlling text generation (ModelOptions)
Returns log probabilities of output tokens
Chat history as message objects with role and content
Model keep-alive duration (e.g., '5m' or 0 to unload immediately)
Number of most likely tokens per position when logprobs enabled
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_GENERATETool to generate text responses from Ollama models with optional raw mode. Use raw=true to bypass prompt templating when you need full control over the prompt for debugging or custom processing. Note that raw mode will not return a context.
Input parameters
When true, bypasses prompt templating and returns the raw response from the model. Use this for debugging or when you need full control over the prompt. Note: raw mode will not return a context.
Model name to use for generation. Examples: 'llama2', 'mistral', 'gemma3:4b'.
Enables thinking output. Accepts true/false or 'high'/'medium'/'low' for verbosity level.
Structured output format. Use 'json' for JSON output or provide a JSON schema object.
Array of base64-encoded images for multimodal models.
Text for the model to generate a response from. Required unless using raw mode with a complete prompt.
When true, returns partial responses as stream. Default is false for non-streaming responses.
Text that appears after the user prompt. Used for fill-in-the-middle models.
System prompt for the model to generate a response from.
Runtime options for model generation behavior.
Whether to return log probabilities of output tokens.
Model keep-alive duration. Examples: '5m' for 5 minutes, 0 to unload immediately.
Number of most likely tokens to return at each position. Requires logprobs=true.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_LIST_MODELSTool to list all available Ollama models and their details. Use when you need to fetch installed models with metadata including name, size, last modified timestamp, digest, and format information.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_OPEN_AI_CHAT_COMPLETIONSTool to create OpenAI-compatible chat completions using Ollama models. Use when you need conversational AI responses with OpenAI API format compatibility.
Input parameters
Number of chat completion choices to generate. Default: 1
Seed for reproducible outputs. Same seed with same parameters produces same results
Stop sequences where the API will stop generating further tokens. Can be a string or array of strings
Unique identifier for the end-user for monitoring and abuse detection
Model identifier to use for completion (e.g., 'llama2', 'mistral', 'gemma3:4b')
Array of tool/function definitions available for the model to call
Nucleus sampling parameter. Alternative to temperature. Default: 1.0
Enable streaming responses. If true, tokens are sent as server-sent events. Default: false
Array of message objects representing the conversation history
Modify likelihood of specified tokens appearing in the completion. Map of token IDs to bias values (-100 to 100)
Maximum number of tokens to generate in the completion
Sampling temperature for randomness (0-2). Higher values make output more random. Default: 1.0
Controls which tool the model should use ('none', 'auto', or specific tool name)
Additional streaming options.
Response format specification for structured outputs.
Penalize tokens that have appeared to encourage new topics (-2.0 to 2.0). Default: 0
Penalize frequent tokens to reduce repetition (-2.0 to 2.0). Default: 0
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_OPEN_AI_COMPLETIONSTool to create OpenAI-compatible text completions using Ollama models. Use when you need text generation with OpenAI API format compatibility beyond chat-based interactions.
Input parameters
Number of completions to generate for the prompt. Default: 1
Returns the prompt embedded within the completion response. Default: false
Random seed for reproducible outputs. Same seed with same parameters should yield same result
Sequences where the model should stop generating. Can be a single string or array of strings
Unique identifier representing the end-user for tracking and abuse monitoring
Model identifier to use for completion (e.g., 'llama2', 'mistral', 'deepseek-v3.2')
Nucleus sampling parameter controlling diversity (0-1). Recommended to alter either temperature or top_p, not both. Default: 1.0
Text prompt for completion generation. Note: Currently only accepts a string, not arrays
Enable streaming responses as Server-Sent Events. Default: false
Text that comes after the completion. Used for insertion mode
Generates multiple completions server-side and returns the best one. Default: 1
Number of log probabilities to return (up to 5)
Modify likelihood of specified tokens appearing. Maps token IDs to bias values (-100 to 100)
Maximum number of tokens to generate in the completion. Total of prompt + max_tokens cannot exceed model's context length
Controls randomness in output (0-2). Lower values make output more focused, higher values more random. Default: 1.0
Additional streaming options.
Penalizes tokens based on whether they appear in text so far (-2.0 to 2.0). Positive values encourage new topics. Default: 0
Penalizes tokens based on frequency in text so far (-2.0 to 2.0). Positive values decrease repetition. Default: 0
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_OPEN_AI_LIST_MODELSTool to list available models using OpenAI-compatible API format. Use when you need to retrieve locally available Ollama models with metadata following OpenAI's model list format.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_SHOWTool to show comprehensive information about an Ollama model. Use when you need to retrieve model details, parameters, template, license, or system prompt.
Input parameters
The name of the model to retrieve information about.
When enabled, includes large verbose fields in the response.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
OLLAMA_VERSIONTool to get the version of Ollama running locally. Use to check which version of Ollama is currently installed.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether or not the action execution was successful or not
No publicly available marketplace agent is found using this tool yet. There are 56 agents privately built on Nagent that already use Ollama.
Build on Nagent
Connect Ollama to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.
The five questions agent builders ask before adopting a new integration.
Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Ollama, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Ollama is available to any agent in your workspace.
No. Nagent provides no-code integration for every tool. Once Ollama is connected, you configure its 8 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.
Helix — Nagent's agentic agent builder — lets you drop Ollama steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Ollama event fires, the agent kicks off automatically.
Every Ollama action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.
Yes. While Ollama ships with 8 pre-built ai models actions, you can layer custom logic around them inside Helix — pre/post-processing steps, conditional branches, retries, or stitching Ollama together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab — forward-deployed engineers who build Ollama-based workflows tailored to your business.