ScrapeGraphAI is an AI-powered web scraping API that enables developers to extract structured data from any website using natural language prompts. Website: https://scrapegraphai.com
On Nagent, Scrapegraph Ai is exposed as a fully configurable AI web scraping integration that any agent can call, with 27 actions and API key authentication. No code is required to wire Scrapegraph Ai into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.
Agent builders use Scrapegraph Ai to automate the kinds of tasks AI web scraping teams previously handled manually. Concrete examples, each a single agent step in Nagent, include:
Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Scrapegraph Ai into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.
Every operation an agent can call against Scrapegraph Ai, with input parameters and output schema. Drop these into any step of an agent built in Helix.
SCRAPEGRAPH_AI_CONVERT_WEBPAGE_TO_MARKDOWN_V2
Tool to convert any webpage into clean, well-formatted Markdown with full parameter control. Use when you need advanced options like stealth mode, custom headers, or webhook notifications. Supports all Markdownify API parameters.
Input parameters
If true, return mock test data instead of actual conversion. Useful for testing without consuming credits. Default: false.
Browser interaction steps to perform before extracting content (e.g., clicking buttons, filling forms, scrolling).
Whether to return streaming response. Default: false.
Optional headers to send with the request, including cookies and user agent. Use to customize request behavior or handle authentication.
Enable stealth mode to bypass bot protection using advanced anti-detection techniques. Adds +4 credits to the request cost. Default: false.
The number of milliseconds to wait before scraping the website. Useful for pages that load content dynamically. Default: 3000ms.
Include ScrapeGraphAI branding in the response. Default: false.
Webhook URL to send the job result to when processing completes. Enables async notification of completion.
The URL of the webpage to convert to Markdown. Must be a valid HTTP/HTTPS URL.
The country code to use for the scrape (e.g., 'US', 'GB', 'FR'). Determines the geographic location for the request.
Enable rendering of heavy JavaScript. Use for Single Page Applications (SPAs) that require full JavaScript execution. Default: false.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
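The URL constraint and the documented defaults above (3000 ms wait, stealth off) can be checked before the action is called. The following is a minimal sketch, not the integration's own code: this page lists only parameter descriptions, so every key name used here (`website_url`, `wait_ms`, `stealth`, `headers`) is a hypothetical placeholder.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True if url has an http/https scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_markdownify_input(url, wait_ms=3000, stealth=False, headers=None):
    """Assemble an input dict using the defaults documented above.

    All key names are hypothetical; check the action schema for the real ones.
    """
    if not is_http_url(url):
        raise ValueError(f"not a valid HTTP/HTTPS URL: {url!r}")
    if wait_ms < 0:
        raise ValueError("wait_ms must be non-negative")
    payload = {"website_url": url, "wait_ms": wait_ms, "stealth": stealth}
    if headers:
        # Copy so later changes by the caller do not alter the payload.
        payload["headers"] = dict(headers)
    return payload


print(build_markdownify_input("https://example.com/docs"))
```

Rejecting a malformed URL locally avoids spending a request on input that the action would refuse anyway.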
SCRAPEGRAPH_AI_GENERATE_SCHEMA
Generate or modify a JSON schema based on a search query for structured data extraction. Use when you need a schema template for scraping specific data fields.
Input parameters
The user's search query describing what data structure to generate. Be specific about the fields and structure you need (e.g., 'Extract product name, price, and availability from an e-commerce product page').
Optional existing JSON schema to modify or extend. If provided, the API will refine this schema based on the user_prompt.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
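For orientation, a JSON Schema for the example prompt above ("product name, price, and availability") might look like the following. This is an illustrative sketch written by hand; the schema the action actually returns may differ in field names and types, and `missing_required` is a hypothetical helper that performs only a shallow check, not full JSON Schema validation.

```python
import json

# Hand-written example schema; field names and types are assumptions.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "availability": {"type": "string"},
    },
    "required": ["name", "price", "availability"],
}


def missing_required(record: dict, schema: dict) -> list:
    """List the required keys that are absent from record."""
    return [key for key in schema.get("required", []) if key not in record]


print(json.dumps(product_schema, indent=2))
print(missing_required({"name": "Desk lamp", "price": 19.99}, product_schema))
# prints ['availability']
```

A schema of this shape can also be passed back in as the existing schema to refine.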
SCRAPEGRAPH_AI_GET_AGENTIC_SCRAPER_HISTORY
Retrieve paginated history of agentic scraper jobs. Use to view past scraping requests, their status, and results.
Input parameters
Page number for pagination. Use to navigate through pages of history results.
Number of history records to return per page. Maximum is typically 100.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_CRAWLER_HISTORY
Retrieve the history of crawler jobs for your account. Returns paginated list of past crawler requests with their status, results, and metadata.
Input parameters
Page number for pagination. Must be a positive integer.
Number of crawler history records to return per page. Must be between 1 and 100.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
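The history actions share the same page-number and page-size inputs, so walking a full history can be sketched as one loop. This is a sketch under stated assumptions: `fetch_page` is a hypothetical callable that wraps whichever history action you use and returns that page's records as a list, and the stop rule (a page shorter than the page size) is assumed, since the output payload shape is not shown on this page.

```python
from typing import Callable, Iterator, List


def iter_history(fetch_page: Callable[[int, int], List[dict]],
                 page_size: int = 10,
                 max_pages: int = 1000) -> Iterator[dict]:
    """Yield records page by page until a short or empty page is returned."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    for page in range(1, max_pages + 1):  # pages start at 1
        records = fetch_page(page, page_size)
        yield from records
        if len(records) < page_size:
            break  # assumed end-of-history signal


# Usage with an in-memory stand-in for the real action call.
fake_jobs = [{"job": n} for n in range(25)]


def fake_fetch(page: int, size: int) -> List[dict]:
    return fake_jobs[(page - 1) * size: page * size]


print(len(list(iter_history(fake_fetch, page_size=10))))  # prints 25
```

The `max_pages` cap guards against looping forever if the stop rule never triggers.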
SCRAPEGRAPH_AI_GET_CREDITS
Retrieve remaining and used credits for your ScrapeGraphAI account. Useful for checking credit availability before bulk scraping operations to avoid mid-run failures.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_ENDPOINT_SUGGESTIONS
Tool to get AI-powered suggestions for creating scraping endpoints. Use when you need to identify what data can be extracted from a website and how to structure the scraping logic.
Input parameters
Natural language description of what data you want to extract. Be specific about the type of information needed (e.g., 'product listings', 'user profiles', 'news articles').
The website URL to analyze for scraping opportunities. Must be a valid URL with https:// or http:// protocol.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_LIVE_SESSION_URL
Tool to get a URL for a live browser session. Use when you need to interact with a webpage in real-time through a controlled browser environment.
Input parameters
The URL of the webpage to open in the live browser session. Must be a valid HTTP/HTTPS URL.
Timeout for the live session in seconds. Default is 300 seconds (5 minutes).
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_MARKDOWNIFY_HISTORY
Tool to retrieve the history of markdownify webpage-to-Markdown conversion jobs. Use when you need to view past markdownify requests and their statuses.
Input parameters
Page number for pagination. Defaults to 1.
Number of records to return per page. Defaults to 10.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_SCRAPE_HISTORY
Retrieve the history of scrape jobs from your ScrapeGraphAI account. Use this to check the status of past scrapes, view results, and track credit usage.
Input parameters
Page number for pagination. Starts at 1.
Number of scrape requests to return per page. Maximum depends on API limits.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_SEARCHSCRAPER_HISTORY
Get the history of searchscraper jobs with pagination support. Use this to retrieve past searchscraper requests, their status, and results.
Input parameters
Page number for pagination. Must be 1 or greater.
Number of records per page. Must be between 1 and 100.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_SITEMAP_HISTORY
Tool to retrieve the history of sitemap extraction jobs. Use when you need to view past sitemap extraction requests, their status, and results.
Input parameters
Page number for pagination. Defaults to 1.
Number of items per page. Defaults to 10.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_SMARTSCRAPER_HISTORY
Tool to retrieve the history of smartscraper jobs. Use when you need to view past scraping requests and their results.
Input parameters
Page number for pagination (starts at 1). Use to navigate through multiple pages of history.
Number of records to return per page (1-100). Default is 10.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_USAGE_TIMELINE
Tool to retrieve usage timeline statistics for your ScrapeGraphAI account. Use when you need to visualize or analyze service usage patterns over time.
Input parameters
Time range for usage timeline data. Choose '7' for last 7 days, '14' for last 14 days, '30' for last 30 days, or 'all' for complete history.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_GET_WEBHOOK_LOGS
Tool to retrieve webhook delivery logs for a crawler job. Use when you need to check the status and history of webhook notifications sent for a specific crawler execution.
Input parameters
The unique identifier of the crawler job to retrieve webhook logs for. Obtained from starting a crawler job.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_LIST_SCHEDULED_JOBS
Retrieve a paginated list of all scheduled scraping jobs for your account. Use this action to view and manage your scheduled jobs, including their configuration, cron schedules, and active status. Supports filtering by service type and active status.
Input parameters
Page number for pagination. Must be 1 or greater.
Filter jobs by active status. Use 'true' to show only active jobs, 'false' for inactive jobs, or omit to show all jobs.
Number of jobs to return per page. Must be between 1 and 100.
Filter jobs by service type (e.g., 'smartscraper', 'markdownify', 'searchscraper'). If not provided, returns jobs of all service types.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_MARKDOWNIFY_STATUS
Check the status and retrieve results of a Markdownify webpage-to-Markdown conversion job. Use this action to poll for the status of an async Markdownify request started via SCRAPEGRAPH_AI_MARKDOWNIFY. Note: The ScrapeGraph AI API typically returns completed results synchronously, so this status endpoint is primarily useful for long-running conversions of large or complex webpages.
Input parameters
The unique identifier (UUID) of the Markdownify request returned by the SCRAPEGRAPH_AI_MARKDOWNIFY action
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_SAVE_ENDPOINT
Tool to save custom scraping endpoint configurations to ScrapeGraphAI. Use when you need to create reusable scraping endpoints with specific parameters and extraction logic.
Input parameters
List of endpoint configurations to save. Each configuration defines a custom scraping endpoint with its parameters and extraction logic.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_SEARCH_SCRAPER
Perform AI-powered web searches with structured, parsed results. Some sites block scrapers and return empty bodies; treat these as unrecoverable for that URL. JS-rendered pages may yield incomplete content.
Input parameters
If true, returns a mock response for testing without consuming credits. Default is false.
The search query describing what you want to find on the web.
ISO 639-1 language code for the search (e.g., 'en').
Number of websites to search (3-20). Each website incurs credit costs. Default is 3. Insufficient credits returns a 402 error; verify balance before setting values above 3.
If true (default), uses AI to extract structured data (10 credits/page). If false, returns raw markdown content (2 credits/page).
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
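The credit figures above make it possible to estimate a search's cost before running it and to compare that against the balance reported by SCRAPEGRAPH_AI_GET_CREDITS. The sketch below assumes one billed page per website searched, which this page does not state explicitly; the argument names `num_websites` and `extraction_mode` are hypothetical.

```python
def search_scraper_cost(num_websites: int = 3, extraction_mode: bool = True) -> int:
    """Estimate credits for one search, assuming one billed page per website."""
    if not 3 <= num_websites <= 20:
        raise ValueError("num_websites must be between 3 and 20")
    per_page = 10 if extraction_mode else 2  # AI extraction vs. raw markdown
    return num_websites * per_page


def can_afford(remaining_credits: int, num_websites: int = 3,
               extraction_mode: bool = True) -> bool:
    """Return True if the remaining balance covers the estimated cost."""
    return remaining_credits >= search_scraper_cost(num_websites, extraction_mode)


print(search_scraper_cost())           # prints 30
print(search_scraper_cost(20, False))  # prints 40
print(can_afford(25))                  # prints False
```

Running this check first is one way to avoid the 402 error described above when raising the number of websites.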
SCRAPEGRAPH_AI_SEARCH_SCRAPER_STATUS
Check the status and results of an asynchronous SearchScraper job.
Input parameters
The unique identifier (UUID) of the SearchScraper request obtained from the SearchScraper action response.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_SMART_CRAWLER_STATUS
Check the status and retrieve results of a SmartCrawler web crawling job. Use this action to poll for completion and get the extracted content from a previously started SmartCrawler job. Returns the job status, crawled URLs, page content in markdown/HTML format, and LLM extraction results (if enabled). Implement a polling timeout (e.g., max retries or elapsed time cap) to avoid indefinite loops when waiting for long-running jobs.
Input parameters
The unique identifier of the SmartCrawler task to check status for. Obtained from starting a SmartCrawler job.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_SMART_SCRAPER_START
Start AI-powered web scraping with natural language extraction prompts. When `wait` is false (default), returns a `request_id`; poll for results using SCRAPEGRAPH_AI_SMART_SCRAPER_STATUS. Check `error` and `job_status` fields in the response before using extracted data.
Input parameters
If true, return mock test data instead of actual extraction. Useful for testing without consuming credits.
If true, wait for job completion and return full results. If false (default), return request_id immediately for async polling. Poll using SCRAPEGRAPH_AI_SMART_SCRAPER_STATUS with the returned `request_id`.
Browser interaction actions to perform before extraction (e.g., click buttons, fill forms). Each step is a dict with action type and parameters.
Cookies to include in the request for authentication or session management.
Custom HTTP headers for the scraping request (e.g., User-Agent, Accept-Language). Used when fetching website_url.
Enable anti-bot detection techniques. Adds +4 credit cost per request. Use for sites with bot protection.
Return extracted content as plain text instead of JSON. Useful for simple text extraction.
Number of pages to scrape for paginated content (1-100). Default is 1.
Natural language description of what information to extract from the webpage. Be specific about the data you want (e.g., 'Extract the product name, price, and description').
Full URL of the webpage to scrape (must include https://). Required if website_html and website_markdown are not provided.
Raw HTML content to scrape (max 2MB). Use this if you already have the page HTML. Required if website_url and website_markdown are not provided.
JSON Schema defining the structure of extracted data. Helps ensure consistent, structured output.
Enable enhanced JavaScript rendering for Single Page Applications (SPAs) and heavy JS sites. May increase processing time.
Raw Markdown content to scrape (max 2MB). Use this if you already have the page content as Markdown. Required if website_url and website_html are not provided.
Number of scroll iterations for infinite scroll pages (0-50). Default is 0. Use for pages that load content on scroll.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
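The source rules above (at least one of `website_url`, `website_html`, or `website_markdown`; a 2MB cap on raw content; `https://` on the URL) can be enforced before the action runs. This is a minimal sketch rather than the integration's validation logic: `user_prompt` is an assumed name for the extraction prompt parameter, and the 2MB limit is interpreted as 2 * 1024 * 1024 bytes of UTF-8.

```python
MAX_CONTENT_BYTES = 2 * 1024 * 1024  # "max 2MB", assumed to mean binary megabytes


def build_smart_scraper_input(user_prompt, website_url=None, website_html=None,
                              website_markdown=None, wait=False):
    """Validate the content source and return an input dict for the action."""
    if not user_prompt or not user_prompt.strip():
        raise ValueError("user_prompt must not be empty")
    sources = {
        "website_url": website_url,
        "website_html": website_html,
        "website_markdown": website_markdown,
    }
    provided = {key: value for key, value in sources.items() if value}
    if not provided:
        raise ValueError("provide website_url, website_html, or website_markdown")
    for key in ("website_html", "website_markdown"):
        content = provided.get(key)
        if content is not None and len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
            raise ValueError(f"{key} exceeds the 2MB limit")
    url = provided.get("website_url")
    if url is not None and not url.startswith("https://"):
        raise ValueError("website_url must include https://")
    return {"user_prompt": user_prompt, "wait": wait, **provided}


print(build_smart_scraper_input("Extract the product name and price",
                                website_url="https://example.com/item/1"))
```

With `wait` left at false, the action returns a `request_id` for polling instead of the extracted data.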
SCRAPEGRAPH_AI_SMART_SCRAPER_STATUS
Check the status and retrieve results of a SmartScraper web scraping job. Use this action to poll for completion after starting a SmartScraper job with wait=false. The request_id is returned by the Start SmartScraper action. Typical workflow:
1. Start a scraping job with SCRAPEGRAPH_AI_SMART_SCRAPER_START (wait=false).
2. Use the returned request_id to check status with this action.
3. Poll until status is 'completed' or 'failed'.
4. When status is 'completed', read the extracted data from the 'result' field.
Check the 'error' field before consuming 'result': a 'failed' status populates 'error' instead of 'result'.
Input parameters
The unique identifier (UUID) of the SmartScraper request returned by the Start SmartScraper action
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
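The polling workflow for this action can be sketched as a loop with an elapsed-time cap. Everything here is illustrative: `check_status` is a hypothetical callable that wraps SCRAPEGRAPH_AI_SMART_SCRAPER_STATUS and returns its payload as a dict, and the `status`, `result`, and `error` keys follow the field names mentioned in the action description rather than a published schema.

```python
import time
from typing import Callable


def poll_until_done(check_status: Callable[[str], dict],
                    request_id: str,
                    interval_s: float = 2.0,
                    timeout_s: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep):
    """Poll until the job completes; raise on failure or when time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        response = check_status(request_id)
        status = response.get("status")
        # Check the error field first: a failed job has no usable result.
        if status == "failed" or response.get("error"):
            raise RuntimeError(f"job {request_id} failed: {response.get('error')}")
        if status == "completed":
            return response.get("result")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {request_id} not finished after {timeout_s}s")
        sleep(interval_s)


# Usage with canned responses standing in for the real action.
responses = iter([
    {"status": "processing"},
    {"status": "completed", "result": {"title": "Example"}},
])
print(poll_until_done(lambda _id: next(responses), "req-123", sleep=lambda s: None))
# prints {'title': 'Example'}
```

Passing `sleep` as a parameter keeps the loop testable without real delays.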
SCRAPEGRAPH_AI_START_SMART_CRAWLER
Tool to start a multi-page web crawl using SmartCrawler for AI-powered data extraction. Use when you need to extract structured data from multiple pages of a website. Returns immediately with a task_id - use the status check action to monitor progress and retrieve results.
Input parameters
The starting URL for the crawl. Must be a valid HTTP or HTTPS URL.
Maximum crawl depth - how many link levels to follow from the start URL. Default is 1.
Crawl rules for filtering URLs during the crawl process.
Instructions for what data to extract during the crawl. Required when extraction_mode=true. Be specific about the information you want extracted.
JSON Schema object defining the structure of extracted data. Helps ensure consistent, structured output in extraction mode.
Maximum number of links to crawl per depth level. If not specified (null), unlimited breadth is allowed. Ignored when sitemap=true.
Whether to use sitemap.xml for URL discovery instead of link following. When enabled, ignores breadth parameter. Default is false.
Enable stealth mode to bypass bot detection and anti-scraping measures. Adds +4 credits per page. Use for sites with bot protection. Default is false.
Maximum total number of pages to crawl across all depth levels. Default is 10.
Number of pages to process in each batch during the crawl. Higher values may speed up large crawls. Default is 1.
Webhook URL to receive the job completion notification. The result will be POSTed to this URL when the crawl finishes.
Whether to cache the website content for faster subsequent crawls. Default is false.
When true, enables AI-powered extraction using LLM (default, 10 credits/page). When false, enables markdown conversion mode (NO AI processing, 2 credits/page, 80% cheaper).
Enable enhanced JavaScript rendering for Single Page Applications (SPAs) and sites with heavy JavaScript. May increase processing time. Default is false.
Whether to restrict crawling to only pages on the same domain as the starting URL. Default is true.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
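The per-page prices above give a simple upper bound on what a crawl can cost. The sketch below assumes the stealth surcharge stacks on top of either mode and that every page up to the maximum is billed; `max_pages` and `stealth` are hypothetical argument names, while `extraction_mode` is the name used in the parameter descriptions.

```python
def crawl_cost_upper_bound(max_pages: int = 10, extraction_mode: bool = True,
                           stealth: bool = False) -> int:
    """Return the most credits a crawl could use under the stated assumptions."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    per_page = 10 if extraction_mode else 2  # AI extraction vs. markdown conversion
    if stealth:
        per_page += 4  # stealth surcharge per page
    return max_pages * per_page


print(crawl_cost_upper_bound())                              # prints 100
print(crawl_cost_upper_bound(extraction_mode=False))         # prints 20
print(crawl_cost_upper_bound(extraction_mode=True, stealth=True))  # prints 140

# The "80% cheaper" figure follows from the two per-page prices.
print((10 - 2) * 100 // 10)  # prints 80
```

A crawl that finds fewer pages than the maximum would cost less than this bound.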
SCRAPEGRAPH_AI_SUBMIT_FEEDBACK
Submit feedback and ratings for completed ScrapeGraphAI requests.
Input parameters
Star rating from 0 (lowest) to 5 (highest)
UUID of the request/session this feedback is for
Optional comments about the request
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_SUBMIT_PRODUCT_FEEDBACK
Submit product feedback for ScrapeGraphAI. Use to provide ratings, comments, suggestions, and other feedback about the product itself.
Input parameters
Your name
Your email address
Any issues you've encountered
Overall rating from 1 (lowest) to 5 (highest)
Your company name
What you disliked about the product
Your use cases for the product
What you liked most about the product
Whether the setup process was easy
Whether you consent to be contacted about your feedback
UUID of the feedback submission
Preferred method of contact
How you discovered ScrapeGraphAI
How likely you are to recommend this product (0-10 scale)
How frequently you use the product
Features you'd like to see added
Your suggestions for product improvements
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPEGRAPH_AI_TOONIFY
Tool to convert JSON data to TOON (Token-Oriented Object Notation) format. Use when you need to reduce token usage for LLM processing while maintaining data structure.
Input parameters
The JSON data to convert to TOON (Token-Oriented Object Notation) format. Can be any valid JSON object or array. TOON format reduces token usage by 30-60% compared to JSON while maintaining structure and readability. If not provided, returns empty response.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
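To give a feel for where the savings come from, the sketch below encodes a uniform list of flat records in the tabular layout described in public TOON documentation, where field names are declared once in a header instead of being repeated per record. It is a simplified illustration only: it ignores quoting, escaping, nesting, and type rules, and `to_toon_table` is a hypothetical helper, not what the action runs.

```python
import json


def to_toon_table(key: str, rows: list) -> str:
    """Encode flat dicts that share the same keys as a TOON-style table."""
    if not rows:
        raise ValueError("rows must not be empty")
    fields = list(rows[0].keys())
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")
    field_list = ",".join(fields)
    header = f"{key}[{len(rows)}]{{{field_list}}}:"  # e.g. items[2]{sku,qty}:
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


rows = [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]
toon = to_toon_table("items", rows)
print(toon)
# Character counts are only a rough proxy for token counts.
print(len(toon), "<", len(json.dumps({"items": rows})))
```

The gain grows with the number of records, because the per-record key names in JSON are replaced by a single header line.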
SCRAPEGRAPH_AI_VALIDATE_API_KEY
Validate your ScrapeGraphAI API key to ensure it is active and authorized. Use this action to check API key validity before making other API calls.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
No publicly available marketplace agents use this tool yet. There are 75 agents privately built on Nagent that already use Scrapegraph Ai.
Build on Nagent
Connect Scrapegraph Ai to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.
The five questions agent builders ask before adopting a new integration.
Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Scrapegraph Ai, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Scrapegraph Ai is available to any agent in your workspace.
No. Nagent provides no-code integration for every tool. Once Scrapegraph Ai is connected, you configure its 27 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.
Helix — Nagent's agentic agent builder — lets you drop Scrapegraph Ai steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Scrapegraph Ai event fires, the agent kicks off automatically.
Every Scrapegraph Ai action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.
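As a rough picture of what build-time validation against such a schema involves, here is a minimal sketch. It is not Helix's implementation: `ParamSpec` and `validate_inputs` are hypothetical names, and the only fields modelled are the four listed above (name, type, required flag, description).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParamSpec:
    name: str
    type: type
    required: bool
    description: str


def validate_inputs(specs: List[ParamSpec], inputs: dict) -> List[str]:
    """Return a list of problems; an empty list means the inputs are valid."""
    errors = []
    known = {spec.name for spec in specs}
    for spec in specs:
        if spec.name not in inputs:
            if spec.required:
                errors.append(f"missing required parameter: {spec.name}")
            continue
        if not isinstance(inputs[spec.name], spec.type):
            errors.append(f"{spec.name}: expected {spec.type.__name__}")
    errors.extend(f"unknown parameter: {key}" for key in inputs if key not in known)
    return errors


status_specs = [ParamSpec("request_id", str, True, "UUID of the SmartScraper request")]
print(validate_inputs(status_specs, {}))
# prints ['missing required parameter: request_id']
print(validate_inputs(status_specs, {"request_id": "req-123"}))
# prints []
```

Collecting all problems in one pass, instead of stopping at the first, lets a builder fix a step's configuration in a single round.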
Yes. While Scrapegraph Ai ships with 27 pre-built AI web scraping actions, you can layer custom logic around them inside Helix: pre- and post-processing steps, conditional branches, retries, or stitching Scrapegraph Ai together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab, whose forward-deployed engineers build Scrapegraph Ai-based workflows tailored to your business.