Scrapingant

Category: AI web scraping

Auth: API_KEY

Actions: 9

About

What is Scrapingant?

ScrapingAnt is a web scraping API service that extracts data from websites using headless Chrome browsers, rotating proxies, CAPTCHA/Cloudflare bypass, LLM-ready Markdown output, and AI-powered structured data extraction. On Nagent, Scrapingant is exposed as a fully configurable AI web scraping integration that any agent can call, with 9 actions and API key authentication. No code is required to wire Scrapingant into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.

What can you build with Scrapingant?

Agent builders use Scrapingant to automate tasks that web scraping teams previously handled manually. Examples, each a single agent step in Nagent, include:

Extract Content as Markdown: Extracts content from a given URL and converts it into Markdown format.
Extract Data with AI: Extracts structured data from a web page using ScrapingAnt's AI-powered extraction capabilities.
Get API Credits Usage: Retrieves the current API credit usage status for the authenticated ScrapingAnt account.
Get V1 Usage (Deprecated): [DEPRECATED - Use v2] Gets the current subscription status and API credits usage information.
Scrape Web Page: Scrapes a web page using the ScrapingAnt API.
Scrape Webpage via POST: Performs a POST request through ScrapingAnt's proxy to scrape a webpage.

Every action and trigger is paired with a structured input/output schema (shown in the sections below), so when you wire Scrapingant into Helix, Nagent's agentic agent builder, the editor knows exactly what each step expects and produces. Configure once, then deploy across your Nagent agents.

What You Can Do

Actions (9)

Every operation an agent can call against Scrapingant, with input parameters and output schema. Drop these into any step of an agent built in Helix.

Extract Content as Markdown

SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN

This tool extracts content from a given URL and converts it into Markdown format. It is particularly useful for preparing text for large language models (LLMs) and Retrieval-Augmented Generation (RAG) systems. It supports GET, POST, PUT, and DELETE methods.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | The URL of the web page to scrape and convert to Markdown. |
| method | string ("get", "post", "put", or "delete") | Optional | HTTP method to use for the request. |
| browser | boolean | Optional | Enables the use of a headless browser for scraping. Default is true. |
| cookies | string | Optional | Cookies to include with the request. |
| js_snippet | string | Optional | Base64-encoded JavaScript to execute on the page after it loads. |
| proxy_type | string | Optional | Specifies the type of proxy to use. |
| proxy_country | string | Optional | Specifies the country for the proxy (e.g., US, GB). |
| block_resource | array | Optional | List of resource types to block (e.g., image, script, stylesheet, font, media, websocket, other). |
| wait_for_selector | string | Optional | CSS selector to wait for before returning the result. |
| return_page_source | boolean | Optional | Returns the raw HTML as received from the server, without JavaScript rendering. Default is false. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
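The js_snippet parameter expects Base64-encoded JavaScript, which is easy to get wrong by hand. The sketch below shows one way to prepare the inputs for this action before pasting them into an agent step. The helper names `encode_js_snippet` and `build_markdown_input` are hypothetical and are not part of Nagent or ScrapingAnt; only the parameter names come from the table above.

```python
import base64
from typing import Optional


def encode_js_snippet(js_source: str) -> str:
    """Base64-encode JavaScript source for the js_snippet parameter."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def build_markdown_input(url: str,
                         js_source: Optional[str] = None,
                         wait_for_selector: Optional[str] = None) -> dict:
    """Assemble inputs for SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN.

    Hypothetical helper: it only builds a dict keyed by the documented
    parameter names and leaves unset optional parameters out.
    """
    params: dict = {"url": url}
    if js_source is not None:
        params["js_snippet"] = encode_js_snippet(js_source)
    if wait_for_selector is not None:
        params["wait_for_selector"] = wait_for_selector
    return params


example = build_markdown_input(
    "https://example.com/docs",
    js_source="window.scrollTo(0, document.body.scrollHeight);",
    wait_for_selector="main",
)
```

Decoding `example["js_snippet"]` with `base64.b64decode` gives back the original script, which is a quick way to check a snippet before using it.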

Extract Data with AI

SCRAPINGANT_EXTRACT_DATA_WITH_AI

This tool allows you to extract structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing what data you want to extract, and the tool returns the extracted data in a structured format. It supports additional parameters for browser rendering, proxies, and cookies to handle dynamic content and localization.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | The URL of the page to extract data from. |
| cookies | string | Optional | Cookies to use for the request (e.g., cookie1=value1; cookie2=value2). |
| proxy_type | string | Optional | Proxy type to use for the request (datacenter or residential). |
| return_text | boolean | Optional | Return the text content of the page. Default is false. |
| proxy_country | string | Optional | Proxy country to use for the request (e.g., US, GB, DE). |
| enable_javascript | boolean | Optional | Enable browser rendering. Default is true. |
| wait_for_selector | string | Optional | Wait for a specific selector to appear on the page before extracting data. |
| extract_properties | string | Required | Free-form text describing the data you want to extract. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
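Because extract_properties is free-form text and data comes back as a string, a small amount of glue can help on both sides of this action. The sketch below is illustrative only: `build_extract_input` and `parse_extracted` are hypothetical helpers, joining field names with commas is just one possible way to phrase the query, and the assumption that data may contain JSON is not confirmed by the schema above.

```python
import json
from typing import Any, List


def build_extract_input(url: str, fields: List[str],
                        use_browser: bool = True) -> dict:
    """Assemble inputs for SCRAPINGANT_EXTRACT_DATA_WITH_AI."""
    if not fields:
        raise ValueError("describe at least one field to extract")
    return {
        "url": url,
        # Free-form text; a comma-separated list is an assumed convention.
        "extract_properties": ", ".join(fields),
        "enable_javascript": use_browser,
    }


def parse_extracted(data: str) -> Any:
    """Return parsed JSON if data is valid JSON, else the raw string."""
    try:
        return json.loads(data)
    except json.JSONDecodeError:
        return data


params = build_extract_input("https://example.com/p/1",
                             ["product title", "price"])
```

Falling back to the raw string keeps downstream steps working even if the action returns plain text rather than JSON.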

Get API Credits Usage

SCRAPINGANT_GET_API_CREDITS_USAGE

This tool retrieves the current API credit usage status for the authenticated ScrapingAnt account. It enables users to monitor their consumption of API credits, check their current usage against the subscription limits, and manage their API credits effectively.

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
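Every action on this page returns the same three output fields, so a post-processing step can treat them uniformly. The sketch below assumes the output arrives as a dictionary with the keys data, error, and successful; `ActionError` and `unwrap` are hypothetical names used for illustration.

```python
class ActionError(RuntimeError):
    """Raised when an action reports successful = false."""


def unwrap(result: dict) -> str:
    """Return the data field, or raise with the reported error."""
    if not result.get("successful", False):
        message = result.get("error") or "action failed without an error message"
        raise ActionError(message)
    return result["data"]


usage = unwrap({"data": "credits payload", "successful": True})
```

Raising on failure keeps a failed scrape from being passed to later steps as if it were valid data.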

Get V1 Usage (Deprecated)

SCRAPINGANT_GET_V1_USAGE

[DEPRECATED - Use v2] Tool to get the current subscription status and API credits usage information. This is the legacy v1 endpoint, which is no longer actively maintained.

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |

Scrape Web Page

SCRAPINGANT_SCRAPE_WEB_PAGE

This tool scrapes a web page using the ScrapingAnt API. It fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | URL of the web page to scrape. |
| browser | boolean | Optional | Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered. |
| cookies | string | Optional | Cookies to pass with the scraping request. |
| js_snippet | string | Optional | Base64-encoded JavaScript snippet to execute on the page. Requires headless browser. |
| proxy_type | string ("datacenter" or "residential") | Optional | Specifies the type of proxy to use. |
| proxy_country | string | Optional | Specifies the country for the proxy. |
| block_resource | array | Optional | List of resource types to block. Requires headless browser. |
| wait_for_selector | string | Optional | CSS selector to wait for before returning the result. Requires headless browser. |
| return_page_source | boolean | Optional | Enable to return the raw HTML from the server without JavaScript rendering. Requires headless browser. Defaults to False. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
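Several parameters above are marked "Requires headless browser", so setting them together with browser set to False is likely a configuration mistake. The sketch below shows one way to catch that before running a step. `BROWSER_ONLY` and `browser_conflicts` are hypothetical names, and how ScrapingAnt itself reacts to such a combination is not stated in this page.

```python
# Parameters the table above marks as "Requires headless browser".
BROWSER_ONLY = (
    "js_snippet",
    "block_resource",
    "wait_for_selector",
    "return_page_source",
)


def browser_conflicts(params: dict) -> list:
    """Return browser-only parameters that are set while browser is False.

    browser defaults to True, so an absent key never produces conflicts.
    """
    if params.get("browser", True):
        return []
    return [name for name in BROWSER_ONLY if params.get(name)]


conflicts = browser_conflicts({
    "url": "https://example.com",
    "browser": False,
    "js_snippet": "YWxlcnQoMSk7",
    "wait_for_selector": "#content",
})
```

Here `conflicts` lists js_snippet and wait_for_selector, the two settings that would have no effect without a browser.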

Scrape Webpage via POST

SCRAPINGANT_SCRAPE_WEBPAGE_POST

Tool to perform a POST request through ScrapingAnt's proxy to scrape a webpage. Use it when you need to scrape pages that require the POST method, such as form submissions or APIs that only accept POST requests. Data is forwarded transparently to the target web page.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | URL of the target web page to scrape using a POST request. |
| browser | boolean | Optional | Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered. |
| cookies | string | Optional | Cookies to pass with the scraping request to the target site. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2 |
| post_data | object | Optional | POST data to send to the target web page. This will be forwarded transparently to the target. Useful for form submissions and APIs requiring POST requests. |
| js_snippet | string | Optional | Base64-encoded JavaScript snippet to execute once the page is loaded. Requires browser=True. |
| proxy_type | string ("datacenter" or "residential") | Optional | Specifies the type of proxy to use for the request. Defaults to datacenter. |
| proxy_country | string | Optional | Specifies the country for the proxy. If not specified, a random country will be used. |
| block_resource | array | Optional | List of resource types to block. Prevents the cloud browser from loading the specified resource types. Requires browser=True. |
| wait_for_selector | string | Optional | CSS selector to wait for before returning the result. Requires browser=True. |
| return_page_source | boolean | Optional | Enable to return the raw HTML from the server without JavaScript rendering. Requires browser=True. Defaults to False. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
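The cookies parameter uses a name=value format joined by semicolons, while post_data is passed as an object. The sketch below builds both from plain dictionaries. `format_cookies` and `build_post_input` are hypothetical helpers, and the rejection of reserved characters is an added safety check rather than documented behavior.

```python
from typing import Optional


def format_cookies(cookies: dict) -> str:
    """Join cookies as cookie_name1=cookie_value1;cookie_name2=cookie_value2."""
    for name, value in cookies.items():
        # Assumed safeguard: these characters would corrupt the joined string.
        if ";" in name or "=" in name or ";" in str(value):
            raise ValueError(f"cookie {name!r} contains a reserved character")
    return ";".join(f"{name}={value}" for name, value in cookies.items())


def build_post_input(url: str, form: dict,
                     cookies: Optional[dict] = None) -> dict:
    """Assemble inputs for SCRAPINGANT_SCRAPE_WEBPAGE_POST."""
    params: dict = {"url": url, "post_data": dict(form)}
    if cookies:
        params["cookies"] = format_cookies(cookies)
    return params


login = build_post_input(
    "https://example.com/search",
    form={"q": "laptops", "page": 1},
    cookies={"session": "abc123", "lang": "en"},
)
```

In this example `login["cookies"]` is the single string `session=abc123;lang=en`, matching the format given in the table.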

Scrape Webpage with PUT

SCRAPINGANT_SCRAPE_WEBPAGE_PUT

Tool to perform a PUT request through ScrapingAnt's proxy to scrape a webpage. Use it when the target webpage requires the PUT method for data submission. Data is forwarded transparently to the target web page.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | URL of the web page to scrape using the PUT method. |
| browser | boolean | Optional | Enable to use a headless browser for scraping. Defaults to True. If False, JavaScript will not be rendered. |
| cookies | string | Optional | Cookies to pass with the scraping request to the target site. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2 |
| js_snippet | string | Optional | Base64-encoded JavaScript snippet to execute on the page once loaded. Requires headless browser. |
| proxy_type | string ("datacenter" or "residential") | Optional | Specifies the type of proxy to use. Defaults to datacenter. |
| content_type | string | Optional | Content-Type header to use for the PUT request. Will be sent as the Ant-Content-Type header. |
| request_body | string | Optional | Request body data to send with the PUT request. This data will be forwarded transparently to the target web page. |
| proxy_country | string | Optional | Specifies the country for the proxy. If not specified, the request will be made from a random country. |
| block_resource | array | Optional | List of resource types to block from loading. Requires headless browser. |
| wait_for_selector | string | Optional | CSS selector to wait for before returning the result. Requires headless browser. |
| return_page_source | boolean | Optional | Enable to return the raw HTML from the server without JavaScript rendering. Requires headless browser. Defaults to False. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
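Unlike post_data in the POST action, request_body here is a string, so structured data has to be serialized first and paired with a matching content_type. The sketch below does this for a JSON payload. `build_put_input` is a hypothetical helper, and whether a given target site expects JSON is an assumption made for the example.

```python
import json


def build_put_input(url: str, payload: dict) -> dict:
    """Assemble inputs for SCRAPINGANT_SCRAPE_WEBPAGE_PUT with a JSON body."""
    return {
        "url": url,
        # Per the table above, this is forwarded as the Ant-Content-Type header.
        "content_type": "application/json",
        # request_body is a string, so the payload is serialized here.
        "request_body": json.dumps(payload, separators=(",", ":"), sort_keys=True),
    }


update = build_put_input(
    "https://example.com/api/profile",
    {"plan": "pro", "name": "Ada"},
)
```

Sorting keys and using compact separators makes the serialized body deterministic, which helps when comparing runs.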

Scrape Webpage (v1 POST - Deprecated)

SCRAPINGANT_SCRAPE_WEBPAGE_V1_POST

[DEPRECATED - Use v2] Tool to scrape a webpage using the POST method with ScrapingAnt's v1 API. Returns JSON with content, cookies, and status_code. This is the legacy v1 endpoint, which is no longer actively maintained. Use the v2 endpoints for new implementations.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | URL of the web page to scrape. |
| browser | boolean | Optional | Enables using a headless browser for scraping. Default: true. If false, JavaScript will not be rendered. |
| cookies | string | Optional | Cookie data to include with the scraping request. Format: cookie_name1=cookie_value1;cookie_name2=cookie_value2 |
| js_snippet | string | Optional | Base64-encoded JavaScript snippet to run once the page is loaded. Requires browser=true. |
| proxy_type | string ("datacenter" or "residential") | Optional | Proxy type to use. Default is datacenter. |
| return_text | boolean | Optional | Enables returning text-only content from the page. Default: false. If true, returns plain text instead of HTML. |
| proxy_country | string | Optional | Geographic location for proxy requests. If not specified, a random country will be used. |
| wait_for_selector | string | Optional | CSS selector of the element the service will wait for before returning the result. Requires browser=true. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
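The description above says the v1 endpoint returns JSON with content, cookies, and status_code. The sketch below parses such a payload into a typed record. It assumes the JSON arrives as the data string and that cookies is a string and status_code is an integer; neither assumption is confirmed here, and `V1ScrapeResult` and `parse_v1_data` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class V1ScrapeResult:
    content: str      # page content returned by the v1 endpoint
    cookies: str      # assumed to be a cookie string
    status_code: int  # assumed to be the target page's HTTP status


def parse_v1_data(data: str) -> V1ScrapeResult:
    """Parse the v1 JSON payload (assumed to be carried in the data field)."""
    raw = json.loads(data)
    return V1ScrapeResult(
        content=raw["content"],
        cookies=raw.get("cookies", ""),
        status_code=int(raw["status_code"]),
    )


page = parse_v1_data(
    '{"content": "<html></html>", "cookies": "a=1", "status_code": 200}'
)
```

A later step could then branch on `page.status_code` instead of searching the raw string.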

Scrape with Extended JSON Output

SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT

Scrapes a web page and returns comprehensive data including HTML content, plain text, cookies, HTTP headers, XHR/Fetch requests, and iframe content. This tool uses ScrapingAnt's extended endpoint, which provides much richer data than standard scraping:

- Full HTML and extracted plain text content
- All cookies and HTTP response headers from the target page
- Captured XHR/Fetch API requests made by the page (useful for finding hidden APIs)
- Content from embedded iframes

Best used when you need more than just the HTML, such as analyzing cookies, headers, or JavaScript API calls made by a page. For simple HTML scraping, consider using the basic scrape tool instead for lower API credit usage.

Input parameters

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | The full URL of the web page to scrape (must include the protocol, e.g., https://). |
| browser | boolean | Optional | Enable or disable headless browser rendering. When True (default), JavaScript is executed and dynamic content is loaded. Set to False for faster scraping of static pages. |
| cookies | string | Optional | Custom cookies to send with the request. Format: 'name1=value1; name2=value2'. |
| timeout | integer | Optional | Maximum time in seconds to wait for the page to load. Must be between 5 and 60. Default is 60 seconds. |
| proxy_type | string | Optional | Type of proxy to use: 'datacenter' (faster, cheaper) or 'residential' (better for anti-bot sites). Default is 'datacenter'. |
| proxy_country | string | Optional | Two-letter country code (ISO 3166-1 alpha-2) for geographic proxy location (e.g., 'US', 'GB', 'DE'). |
| wait_for_selector | string | Optional | CSS selector to wait for before returning the page content. Useful for pages with dynamic content that loads after the initial page load. |
| return_page_source | boolean | Optional | When True, returns the raw HTML from the server without JavaScript rendering. Useful for faster scraping when JS execution is not needed. |

Output

| Prop | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Required | Data from the action execution. |
| error | string | Optional | Error, if any occurred during the execution of the action. |
| successful | boolean | Required | Whether the action execution was successful. |
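This action states the tightest input constraints on the page: a URL with a protocol, a timeout between 5 and 60 seconds, two proxy types, and a two-letter country code. The sketch below checks those constraints ahead of time. `validate_extended_input` is a hypothetical helper, and the uppercase-only country check is an assumption based on the examples in the table.

```python
import re


def validate_extended_input(params: dict) -> list:
    """Return a list of problems found in the inputs (empty if none)."""
    problems = []
    url = params.get("url", "")
    if not url.startswith(("http://", "https://")):
        problems.append("url must include a protocol such as https://")
    timeout = params.get("timeout")
    if timeout is not None and not (isinstance(timeout, int) and 5 <= timeout <= 60):
        problems.append("timeout must be an integer between 5 and 60")
    proxy_type = params.get("proxy_type")
    if proxy_type is not None and proxy_type not in ("datacenter", "residential"):
        problems.append("proxy_type must be 'datacenter' or 'residential'")
    country = params.get("proxy_country")
    # Assumption: codes are written in uppercase, as in 'US', 'GB', 'DE'.
    if country is not None and not re.fullmatch(r"[A-Z]{2}", country):
        problems.append("proxy_country must be a two-letter country code")
    return problems


issues = validate_extended_input(
    {"url": "example.com", "timeout": 90, "proxy_country": "Germany"}
)
```

For the example input, `issues` contains three entries: the missing protocol, the out-of-range timeout, and the invalid country code.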

On the Nagent Platform

Agents using Scrapingant

No publicly available marketplace agent uses this tool yet. However, 72 agents privately built on Nagent already use Scrapingant.

Build on Nagent

Build an agent that uses Scrapingant

Connect Scrapingant to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.

Frequently Asked

Building with Scrapingant on Nagent

The five questions agent builders ask before adopting a new integration.

How do I connect Scrapingant to my Nagent agent?

Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Scrapingant, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Scrapingant is available to any agent in your workspace.

Do I need to write code to use Scrapingant?

No. Nagent provides no-code integration for every tool. Once Scrapingant is connected, you configure its 9 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.

How do I configure Scrapingant actions and triggers in Helix?

Helix — Nagent's agentic agent builder — lets you drop Scrapingant steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Scrapingant event fires, the agent kicks off automatically.

What input and output schemas does Scrapingant support?

Every Scrapingant action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.
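To make the idea of a typed schema concrete, the sketch below models a parameter as name, type, required flag, and description, then checks a set of inputs against it. This is an illustration of the concept only and is not Helix's validator. `ParamSpec`, `AI_EXTRACT_SCHEMA`, and `check_inputs` are hypothetical names, and the schema covers just three parameters of the Extract Data with AI action.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParamSpec:
    name: str
    type: type
    required: bool
    description: str


# Partial, illustrative schema for SCRAPINGANT_EXTRACT_DATA_WITH_AI.
AI_EXTRACT_SCHEMA = [
    ParamSpec("url", str, True, "The URL of the page to extract data from."),
    ParamSpec("extract_properties", str, True, "Free-form description of the data."),
    ParamSpec("enable_javascript", bool, False, "Enable browser rendering."),
]


def check_inputs(schema: List[ParamSpec], inputs: dict) -> List[str]:
    """Report missing required, wrongly typed, and unknown parameters."""
    errors = []
    known = {spec.name for spec in schema}
    for spec in schema:
        if spec.name not in inputs:
            if spec.required:
                errors.append(f"missing required parameter: {spec.name}")
        elif not isinstance(inputs[spec.name], spec.type):
            errors.append(f"{spec.name} should be {spec.type.__name__}")
    errors.extend(f"unknown parameter: {name}" for name in inputs if name not in known)
    return errors


errors = check_inputs(AI_EXTRACT_SCHEMA, {"url": "https://example.com"})
```

With only a URL supplied, `errors` reports that extract_properties is missing, which is the kind of mistake a build-time check is meant to surface.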

Can I extend Scrapingant with custom logic?

Yes. While Scrapingant ships with 9 pre-built ai web scraping actions, you can layer custom logic around them inside Helix — pre/post-processing steps, conditional branches, retries, or stitching Scrapingant together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab — forward-deployed engineers who build Scrapingant-based workflows tailored to your business.
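As a rough picture of what a retry step around a Scrapingant action might do, the sketch below retries a call until its output reports success. It does not use any Nagent or Helix API: `call_with_retries` is a hypothetical helper, and `action` stands for any callable that takes an input dictionary and returns the data, error, and successful fields described above.

```python
import time
from typing import Callable


def call_with_retries(action: Callable[[dict], dict], params: dict,
                      attempts: int = 3, base_delay: float = 0.0) -> dict:
    """Call action(params) until it succeeds or attempts run out.

    Waits base_delay * 2**attempt seconds between tries (exponential
    backoff) and returns the last result either way.
    """
    result = {"data": "", "error": "no attempts made", "successful": False}
    for attempt in range(attempts):
        result = action(params)
        if result.get("successful"):
            return result
        if attempt < attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return result


def always_fails(params: dict) -> dict:
    """Stand-in action used only to demonstrate the wrapper."""
    return {"data": "", "error": "timeout", "successful": False}


final = call_with_retries(always_fails, {"url": "https://example.com"}, attempts=2)
```

Returning the last failed result, rather than raising, lets a conditional branch decide what to do next, for example switching proxy_type to residential.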
