ScrapingAnt is a web scraping API service that enables data extraction from websites through headless Chrome browsers, rotating proxies, CAPTCHA/Cloudflare bypass, LLM-ready markdown output, and AI-powered structured data extraction. On Nagent, Scrapingant is exposed as a fully configurable AI web scraping integration that any agent can call, with 9 actions and API key authentication. No code is required to wire Scrapingant into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.
Agent builders use Scrapingant to automate the kinds of tasks AI web scraping teams previously handled manually. Concrete examples, each a single agent step in Nagent, include:
Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Scrapingant into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.
Every operation an agent can call against Scrapingant, with input parameters and output schema. Drop these into any step of an agent built in Helix.
SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN
This tool extracts content from a given URL and converts it to Markdown. It is particularly useful for preparing text for Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems. It supports the GET, POST, PUT, and DELETE methods.
Input parameters
The URL of the web page to scrape and convert to Markdown.
HTTP method to use for the request.
Enables the use of a headless browser for scraping. Default is true.
Cookies to include with the request.
Base64-encoded JavaScript to execute on the page after it loads.
Specifies the type of proxy to use.
Specifies the country for the proxy (e.g., US, GB).
List of resource types to block (e.g., image, script, stylesheet, font, media, websocket, other).
CSS selector to wait for before returning the result.
Returns the raw HTML as received from the server, without JavaScript rendering. Default is false.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
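As a rough illustration of how the inputs above fit together, the sketch below assembles an input payload for this action in Python. The key names (`url`, `browser`, `js_snippet`, `wait_for_selector`) and the helper `build_markdown_input` are assumptions made for illustration; the actual field names are the ones shown in the Helix schema. The one detail taken from the parameter list is that the JavaScript must be Base64-encoded.

```python
import base64


def build_markdown_input(url, js_source=None, wait_for_selector=None):
    """Assemble a hypothetical input payload for the Markdown extraction action."""
    # Key names are assumed for illustration, not confirmed by the schema.
    payload = {"url": url, "browser": True}  # headless browser defaults to on
    if js_source is not None:
        # The action expects the script Base64-encoded, not as raw text.
        encoded = base64.b64encode(js_source.encode("utf-8")).decode("ascii")
        payload["js_snippet"] = encoded
    if wait_for_selector is not None:
        payload["wait_for_selector"] = wait_for_selector
    return payload


payload = build_markdown_input(
    "https://example.com/article",
    js_source="window.scrollTo(0, document.body.scrollHeight);",
    wait_for_selector="article",
)
print(payload)
```

Encoding the script in a helper keeps the raw JavaScript readable in your own notes while the agent step receives the encoded form.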
SCRAPINGANT_EXTRACT_DATA_WITH_AI
This tool allows you to extract structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing what data you want to extract, and the tool returns the extracted data in a structured format. It supports additional parameters for browser rendering, proxies, and cookies to handle dynamic content and localization.
Input parameters
The URL of the page to extract data from.
Cookies to use for the request. (e.g. cookie1=value1; cookie2=value2)
Proxy type to use for the request. (datacenter, residential)
Return text content of the page. (default: false)
Proxy country to use for the request. (e.g. US, GB, DE)
Enable browser rendering. (default: true)
Wait for a specific selector to appear on the page before extracting data.
A free-form text describing the data you want to extract.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
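The cookie parameter above takes a single string of the form `cookie1=value1; cookie2=value2`. A minimal Python sketch of building that string from a dictionary and pairing it with a free-form query follows. The helper `format_cookies` and the key names `url`, `query`, and `cookies` are hypothetical, chosen only for this example.

```python
def format_cookies(cookies):
    """Join a dict of cookies into the 'name1=value1; name2=value2' form."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Key names below are assumptions; check the Helix schema for the real ones.
ai_input = {
    "url": "https://example.com/products/42",
    "query": "Extract the product title, price, and availability.",
    "cookies": format_cookies({"session": "abc123", "currency": "USD"}),
}
print(ai_input["cookies"])  # session=abc123; currency=USD
```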
SCRAPINGANT_GET_API_CREDITS_USAGE
This tool retrieves the current API credit usage status for the authenticated ScrapingAnt account. It enables users to monitor their consumption of API credits, check their current usage against the subscription limits, and manage their API credits effectively.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
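Each action on this page reports the same three outputs: the data, an error if one occurred, and a success flag. If you post-process a result in a custom step, a small guard like the one below can make failures explicit. This is only a sketch: it assumes the result arrives as a dictionary with the hypothetical keys `data`, `error`, and `successful`, and the sample values are made up.

```python
def unwrap_result(result):
    """Return the data payload, or raise if the action reported a failure."""
    # "successful", "error", and "data" are assumed key names.
    if not result.get("successful", False):
        message = result.get("error") or "action failed without an error message"
        raise RuntimeError(message)
    return result.get("data")


# Made-up sample result, for illustration only.
sample = {"successful": True, "error": None, "data": {"example_field": 1}}
print(unwrap_result(sample))
```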
SCRAPINGANT_GET_V1_USAGE
[DEPRECATED - Use v2] Tool to get the current subscription status and API credits usage information. This is the legacy v1 endpoint which is no longer actively maintained.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPINGANT_SCRAPE_WEB_PAGE
This tool scrapes a web page using the ScrapingAnt API. It fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.
Input parameters
URL of the web page to scrape.
Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered.
Cookies to pass with the scraping request.
Base64 encoded JavaScript snippet to execute on the page. Requires headless browser.
Specifies the type of proxy to use.
Specifies the country for the proxy.
List of resource types to block. Requires headless browser.
CSS selector to wait for before returning the result. Requires headless browser.
Enable to return the raw HTML from the server without JavaScript rendering. Requires headless browser. Defaults to False.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
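Several of the parameters above are marked as requiring the headless browser. A pre-flight check along the following lines could catch a configuration that sets them while the browser is disabled. The key names in `BROWSER_ONLY_KEYS` and the function `check_browser_options` are hypothetical stand-ins for the schema's real field names.

```python
# Assumed names for the options documented as "Requires headless browser".
BROWSER_ONLY_KEYS = (
    "js_snippet",
    "block_resource",
    "wait_for_selector",
    "return_page_source",
)


def check_browser_options(params):
    """Return the browser-only options that are set while the browser is off."""
    if params.get("browser", True):  # the headless browser defaults to on
        return []
    return [key for key in BROWSER_ONLY_KEYS if params.get(key)]


conflicts = check_browser_options(
    {"url": "https://example.com", "browser": False, "wait_for_selector": "#content"}
)
print(conflicts)  # ['wait_for_selector']
```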
SCRAPINGANT_SCRAPE_WEBPAGE_POST
Tool to perform a POST request through ScrapingAnt's proxy to scrape a webpage. Use when you need to scrape pages that require POST method, such as form submissions or APIs that only accept POST requests. Data is forwarded transparently to the target web page.
Input parameters
URL of the target web page to scrape using POST request.
Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered.
Cookies to pass with the scraping request to the target site. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2
POST data to send to the target web page. This will be forwarded transparently to the target. Useful for form submissions and APIs requiring POST requests.
Base64 encoded JavaScript snippet to execute once the page is loaded. Requires browser=True.
Specifies the type of proxy to use for the request. Defaults to datacenter.
Specifies the country for the proxy. If not specified, a random country will be used.
List of resource types to block. Prevents cloud browser from loading specified resource types. Requires browser=True.
CSS selector to wait for before returning the result. Requires browser=True.
Enable to return the raw HTML from the server without JavaScript rendering. Requires browser=True. Defaults to False.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
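Because the POST data is forwarded to the target unchanged, it needs to be in whatever encoding the target expects. For an ordinary HTML form, that is usually URL-encoded key/value pairs. The sketch below shows one way to prepare such a body with Python's standard library; the helper `build_post_input` and the key names `url`, `data`, and `browser` are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_post_input(url, form_fields):
    """Build a hypothetical POST action input with a URL-encoded form body."""
    return {
        "url": url,
        "data": urlencode(form_fields),  # e.g. "q=web+scraping&page=2"
        "browser": False,  # assumed choice: no JavaScript rendering needed here
    }


post_input = build_post_input(
    "https://example.com/search", {"q": "web scraping", "page": 2}
)
print(post_input["data"])  # q=web+scraping&page=2
```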
SCRAPINGANT_SCRAPE_WEBPAGE_PUT
Tool to perform a PUT request through ScrapingAnt's proxy to scrape a webpage that requires PUT method. Use when the target webpage requires PUT method for data submission. Data is forwarded transparently to the target web page.
Input parameters
URL of the web page to scrape using PUT method.
Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered.
Cookies to pass with the scraping request to the target site. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2
Base64 encoded JavaScript snippet to execute on the page once loaded. Requires headless browser.
Specifies the type of proxy to use. Defaults to datacenter.
Content-Type header to use for the PUT request. Will be sent as Ant-Content-Type header.
Request body data to send with the PUT request. This data will be forwarded transparently to the target web page.
Specifies the country for the proxy. If not specified, request will be made from a random country.
List of resource types to block from loading. Requires headless browser.
CSS selector to wait for before returning the result. Requires headless browser.
Enable to return the raw HTML from the server without JavaScript rendering. Requires headless browser. Defaults to False.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
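For a PUT request, the body and the Content-Type header should agree. A minimal sketch of pairing a JSON body with a matching content type is shown below. The helper `build_put_input` and the key names `url`, `content_type`, and `data` are hypothetical; only the fact that the content type is forwarded as an `Ant-Content-Type` header comes from the parameter description above.

```python
import json


def build_put_input(url, body):
    """Build a hypothetical PUT action input with a JSON-serialized body."""
    return {
        "url": url,
        # Per the parameter list, this is sent as the Ant-Content-Type header.
        "content_type": "application/json",
        "data": json.dumps(body),
    }


put_input = build_put_input("https://example.com/api/items/7", {"name": "widget"})
print(put_input["data"])
```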
SCRAPINGANT_SCRAPE_WEBPAGE_V1_POST
[DEPRECATED - Use v2] Tool to scrape a webpage using POST method with ScrapingAnt's v1 API. Returns JSON with content, cookies, and status_code. This is the legacy v1 endpoint which is no longer actively maintained. Use the v2 endpoints for new implementations.
Input parameters
URL of the web page to scrape.
Enables using headless browser for scraping. Default: true. If false, JavaScript will not be rendered.
Cookie data to include with the scraping request. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2
Base64 encoded JavaScript snippet to run once the page is loaded. Requires browser=true.
Proxy classification to use. Default is datacenter. Options: datacenter or residential.
Enables returning text only content from the page. Default: false. If true, returns plain text instead of HTML.
Geographic location for proxy requests. If not specified, a random country will be used.
CSS selector of the element our service will wait for before returning the result. Requires browser=true.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT
Scrapes a web page and returns comprehensive data including HTML content, plain text, cookies, HTTP headers, XHR/Fetch requests, and iframe content. This tool uses ScrapingAnt's extended endpoint, which provides much richer data than standard scraping:
- Full HTML and extracted plain text content
- All cookies and HTTP response headers from the target page
- Captured XHR/Fetch API requests made by the page (useful for finding hidden APIs)
- Content from embedded iframes
Best used when you need more than just the HTML, such as when analyzing cookies, headers, or JavaScript API calls made by a page. For simple HTML scraping, consider using the basic scrape tool instead for lower API credit usage.
Input parameters
The full URL of the web page to scrape (must include protocol, e.g., https://).
Enable or disable headless browser rendering. When True (default), JavaScript is executed and dynamic content is loaded. Set to False for faster scraping of static pages.
Custom cookies to send with the request. Format: 'name1=value1; name2=value2'.
Maximum time in seconds to wait for the page to load. Must be between 5 and 60. Default is 60 seconds.
Type of proxy to use: 'datacenter' (faster, cheaper) or 'residential' (better for anti-bot sites). Default is 'datacenter'.
Two-letter country code (ISO 3166-1 alpha-2) for geographic proxy location (e.g., 'US', 'GB', 'DE').
CSS selector to wait for before returning the page content. Useful for pages with dynamic content that loads after initial page load.
When True, returns the raw HTML from the server without JavaScript rendering. Useful for faster scraping when JS execution is not needed.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
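Two of the inputs above carry explicit constraints: the timeout must be between 5 and 60 seconds, and the proxy country is a two-letter ISO 3166-1 alpha-2 code. The following sketch checks both before a step is configured. The function `validate_extended_input` and the key names `timeout` and `proxy_country` are assumptions for illustration, and the country check only verifies the shape of the code, not that the country is supported.

```python
def validate_extended_input(timeout=60, proxy_country=None):
    """Check the documented timeout range and the country-code format."""
    if not 5 <= timeout <= 60:
        raise ValueError(f"timeout must be between 5 and 60 seconds, got {timeout}")
    checked = {"timeout": timeout}
    if proxy_country is not None:
        code = proxy_country.strip().upper()
        # Shape check only: two ASCII letters, e.g. "US", "GB", "DE".
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"proxy_country must be a two-letter code, got {proxy_country!r}")
        checked["proxy_country"] = code
    return checked


print(validate_extended_input(timeout=30, proxy_country="de"))
```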
No publicly available marketplace agent uses this tool yet. There are 72 agents privately built on Nagent that already use Scrapingant.
Build on Nagent
Connect Scrapingant to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.
The five questions agent builders ask before adopting a new integration.
Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Scrapingant, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Scrapingant is available to any agent in your workspace.
No. Nagent provides no-code integration for every tool. Once Scrapingant is connected, you configure its 9 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.
Helix — Nagent's agentic agent builder — lets you drop Scrapingant steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Scrapingant event fires, the agent kicks off automatically.
Every Scrapingant action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.
Yes. While Scrapingant ships with 9 pre-built AI web scraping actions, you can layer custom logic around them inside Helix: pre/post-processing steps, conditional branches, retries, or stitching Scrapingant together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab, whose forward-deployed engineers build Scrapingant-based workflows tailored to your business.