Scrape Do

Category: ai web scraping

Auth: API_KEY

Actions: 16

About

What is Scrape Do?

Scrape.do is a web scraping API that offers rotating residential, data-center, and mobile proxies, headless browser support, and session management. It is designed to bypass anti-bot protections (e.g., Cloudflare, Akamai) and extract data at scale in formats such as JSON and HTML. On Nagent, Scrape Do is exposed as a configurable AI web scraping integration that any agent can call, with 16 actions and API key authentication. No code is required to wire Scrape Do into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.

What can you build with Scrape Do?

Agent builders use Scrape Do to automate tasks that web scraping teams previously handled manually. Concrete examples, each a single agent step in Nagent, include:

Cancel Async Job: cancels an asynchronous scraping job.
Create Async Scraping Job: creates an asynchronous scraping job with specified targets and options.
Get Account Information: retrieves account information and usage statistics from Scrape.do.
Get Amazon Product Offers: gets all seller offers for any Amazon product.
Get Amazon product details: extracts structured product data from Amazon product detail pages (PDP).
Get Amazon raw HTML: gets raw HTML from any Amazon page with ZIP code geo-targeting.

Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Scrape Do into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.

What You Can Do

Actions (16)

Every operation an agent can call against Scrape Do, with input parameters and output schema. Drop these into any step of an agent built in Helix.

Cancel Async Job (SCRAPE_DO_CANCEL_ASYNC_JOB)

Tool to cancel an asynchronous scraping job. Use when you need to stop processing of pending tasks in a job. Completed tasks remain available.

Input parameters

token (string, required): Authentication token for Scrape.do API
job_id (string, required): Unique identifier of the job to cancel

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
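Every action on this page returns the same three output fields (data, error, successful). As a minimal sketch of how a custom post-processing step might check that envelope, assuming the result arrives as a plain dictionary with those keys (the function and exception names here are hypothetical, not part of Nagent or Scrape.do):

```python
from typing import Any, Mapping


class ActionError(RuntimeError):
    """Raised when an action result reports failure (hypothetical helper)."""


def unwrap_result(result: Mapping[str, Any]) -> str:
    """Return the `data` field, or raise if `successful` is false."""
    if not result.get("successful", False):
        # `error` is optional in the schema, so fall back to a generic message.
        raise ActionError(result.get("error") or "action failed without an error message")
    return result["data"]
```

Checking successful before reading data keeps a failed scrape from being passed downstream as if it were page content.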

Create Async Scraping Job (SCRAPE_DO_CREATE_ASYNC_JOB)

Tool to create an asynchronous scraping job with specified targets and options. Use when you need to scrape multiple URLs in parallel without waiting for results. Returns a job ID immediately so you can poll for results later via the Get Async Job Details action (SCRAPE_DO_GET_ASYNC_JOB).

Input parameters

Body (string, optional): HTTP request body for POST/PUT/PATCH requests
Super (boolean, optional): Use residential/mobile proxy networks (default: false)
Device (string: "desktop" | "mobile" | "tablet", optional): Device types for scraping emulation.
(name and type not shown) (optional): HTTP methods for async scraping requests.
Output (string: "raw" | "markdown", optional): Output format for scraped content.
Render (object, optional): Options for headless browser rendering.
GeoCode (string, optional): Country code for geo-targeting (e.g., 'us', 'gb', 'de')
Headers (object, optional): Custom HTTP headers to send with requests
Targets (array, required): Array of target URLs to scrape. Each URL will be processed asynchronously.
Timeout (integer, optional): Total request timeout in milliseconds (default: 60000)
SessionID (string, optional): Sticky session ID to reuse the same IP address across requests
SetCookies (string, optional): Cookies to include with the request
WebhookURL (string, optional): Webhook URL to send results to when the job completes
DisableRetry (boolean, optional): Disable automatic retry mechanism (default: false)
RetryTimeout (integer, optional): Retry timeout per request in milliseconds (default: 15000)
ForwardHeaders (boolean, optional): Use only provided headers, discard default headers (default: false)
WebhookHeaders (object, optional): Additional headers to send with the webhook notification
RegionalGeoCode (string, optional): Regional code for more specific geo-targeting
DisableRedirection (boolean, optional): Disable following HTTP redirects (default: false)
TransparentResponse (boolean, optional): Return raw target website response without processing (default: false)

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
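To make the input schema above concrete, here is a minimal sketch that assembles a Create Async Scraping Job input and checks the enumerated values before the step runs. The function name is hypothetical, and rejecting an empty Targets array is an assumption; the field names, allowed values, and defaults come from the parameter list above.

```python
from typing import Any, Dict, List, Optional

ALLOWED_DEVICES = {"desktop", "mobile", "tablet"}
ALLOWED_OUTPUTS = {"raw", "markdown"}


def build_async_job_input(
    targets: List[str],
    *,
    device: Optional[str] = None,
    output: Optional[str] = None,
    webhook_url: Optional[str] = None,
    timeout_ms: int = 60000,        # documented default
    retry_timeout_ms: int = 15000,  # documented default
) -> Dict[str, Any]:
    """Assemble and sanity-check input for SCRAPE_DO_CREATE_ASYNC_JOB."""
    if not targets:
        # Assumption: an empty Targets array is treated as invalid.
        raise ValueError("Targets must contain at least one URL")
    if device is not None and device not in ALLOWED_DEVICES:
        raise ValueError(f"Device must be one of {sorted(ALLOWED_DEVICES)}")
    if output is not None and output not in ALLOWED_OUTPUTS:
        raise ValueError(f"Output must be one of {sorted(ALLOWED_OUTPUTS)}")

    payload: Dict[str, Any] = {
        "Targets": list(targets),
        "Timeout": timeout_ms,
        "RetryTimeout": retry_timeout_ms,
    }
    # Only include optional fields the caller actually set.
    if device is not None:
        payload["Device"] = device
    if output is not None:
        payload["Output"] = output
    if webhook_url is not None:
        payload["WebhookURL"] = webhook_url
    return payload
```

For example, build_async_job_input(["https://example.com/a"], output="markdown") yields a dictionary with Targets, Timeout, RetryTimeout, and Output set, and no Device key.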

Get Account Information (SCRAPE_DO_GET_ACCOUNT_INFO)

Retrieves account information and usage statistics from Scrape.do. This action makes a GET request to the Scrape.do info endpoint to fetch:

- Subscription status
- Concurrent request limits and usage
- Monthly request limits and remaining requests
- Real-time usage statistics

Rate limit: Maximum 10 requests per minute. Use remaining request counts to monitor credits proactively: different scraping operations (e.g., rendered-page requests) consume varying credit amounts, and exhausting credits mid-run causes failures.

Input parameters

token (string, required): Authentication token for Scrape.do API

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
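Because this action is limited to 10 requests per minute, an agent that checks credits in a loop may want a client-side guard. The sketch below is one possible approach, a sliding-window throttle; the class name is hypothetical and the clock is injectable only so the behavior can be tested without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteThrottle:
    """Allow at most `max_calls` acquisitions per `window_s` seconds."""

    def __init__(
        self,
        max_calls: int = 10,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Return True and record the call if the window has room."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_s:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A step would call try_acquire() before requesting account information and skip or delay the check when it returns False.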

Get Amazon Product Offers (SCRAPE_DO_GET_AMAZON_OFFERS)

Get all seller offers for any Amazon product. Retrieves every seller listing including pricing, shipping costs, seller information, and Buy Box status in structured JSON format. Use when you need to compare prices across multiple sellers or find the best deal for a specific product.

Input parameters

asin (string, required): Amazon Standard Identification Number (10-character product ID)
geocode (string, required): Country code for Amazon marketplace (e.g., us, gb, de, jp, fr, es, it, ca)
zipcode (string, required): Postal/ZIP code formatted according to country requirements
super_mode (boolean, optional): Enable residential/mobile proxies for higher success rates. Costs 10x credits
include_html (boolean, optional): When true, includes the full raw HTML alongside structured JSON

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
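The three required inputs above can be validated before the step runs. This is a sketch under stated assumptions: the schema only says an ASIN is a 10-character product ID, so the uppercase-alphanumeric pattern and the lowercasing of geocode (to match the documented examples) are assumptions, and the function name is hypothetical.

```python
import re
from typing import Any, Dict

# Assumption: ASINs are 10 uppercase letters or digits.
_ASIN_RE = re.compile(r"[A-Z0-9]{10}")


def build_offers_input(
    asin: str,
    geocode: str,
    zipcode: str,
    *,
    super_mode: bool = False,
    include_html: bool = False,
) -> Dict[str, Any]:
    """Assemble input for SCRAPE_DO_GET_AMAZON_OFFERS."""
    asin = asin.strip().upper()
    if not _ASIN_RE.fullmatch(asin):
        raise ValueError("asin must be a 10-character product ID")
    if not geocode.strip() or not zipcode.strip():
        raise ValueError("geocode and zipcode are required")
    return {
        "asin": asin,
        "geocode": geocode.strip().lower(),
        "zipcode": zipcode.strip(),
        "super_mode": super_mode,      # costs 10x credits when True
        "include_html": include_html,
    }
```

Leaving super_mode off by default reflects the documented 10x credit cost; enabling it is a deliberate choice for products that fail on standard proxies.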

Get Amazon product details (SCRAPE_DO_GET_AMAZON_PRODUCT)

Extract structured product data from Amazon product detail pages (PDP). Returns comprehensive product information including title, pricing, ratings, images, best seller rankings, and technical specifications in JSON format.

Input parameters

asin (string, required): Amazon Standard Identification Number (10-character product ID)
geocode (string, required): Country code (e.g., us, gb, de, jp, fr, ca)
zipcode (string, required): Postal code formatted according to country requirements
language (string, optional): Language code in ISO 639-1 format (e.g., EN, DE, FR)
super_mode (boolean, optional): Enable residential/mobile proxies for higher success rates. Costs 10x credits
include_html (boolean, optional): When true, includes the full raw HTML alongside structured JSON

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Get Amazon raw HTML (SCRAPE_DO_GET_AMAZON_RAW_HTML)

Tool to get raw HTML from any Amazon page with ZIP code geo-targeting. Use when you need complete unprocessed HTML source from Amazon URLs with location-based targeting. Ideal for scraping pages not covered by other structured endpoints.

Input parameters

url (string, required): Full Amazon URL to scrape (e.g., https://www.amazon.com/dp/B08N5WRWNW)
super (boolean, optional): Enable residential/mobile proxies for higher success rates. Costs 10x credits. Default is false.
output (string, optional): Output format; must be 'html' for raw HTML content
geocode (string, required): Country code for geo-targeting (e.g., us, gb, de, jp)
timeout (integer, optional): Request timeout in milliseconds
zipcode (string, required): Postal code formatted according to country requirements (e.g., 10001 for US, SW1A 1AA for UK)
language (string, optional): Language code in ISO 639-1 format (e.g., EN, DE, FR, ES)

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Get Async API Account Information (SCRAPE_DO_GET_ASYNC_ACCOUNT_INFO)

Tool to get account information for the Async API including concurrency limits and usage statistics. Use when you need to check available concurrency slots, active jobs, or remaining credits for Async API operations.

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Get Async Job Details (SCRAPE_DO_GET_ASYNC_JOB)

Tool to retrieve details and status of a specific asynchronous scraping job. Use when you need to check the progress, status, or results of a previously created async job. Returns job metadata including creation time, completion time, task counts, and detailed task list.

Input parameters

jobID (string, required): Unique identifier of the job to retrieve

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
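Checking a job's progress usually means calling this action repeatedly until the job finishes. The sketch below shows one way to structure that loop. Since this page does not document the format of the data string, both the fetch callable and the is_done predicate are supplied by the caller; all names here are hypothetical.

```python
import time
from typing import Any, Callable, Mapping


def poll_until(
    fetch: Callable[[], Mapping[str, Any]],
    is_done: Callable[[str], bool],
    *,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch` until `is_done(data)` is true, then return `data`."""
    for attempt in range(max_attempts):
        result = fetch()
        if not result.get("successful", False):
            raise RuntimeError(result.get("error") or "job lookup failed")
        if is_done(result["data"]):
            return result["data"]
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

Capping the number of attempts keeps a stuck job from holding an agent step open indefinitely; when a WebhookURL was set at job creation, polling may not be needed at all.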

Get Async Task Result (SCRAPE_DO_GET_ASYNC_TASK)

Tool to retrieve the result of a specific task within an asynchronous job. Returns the scraped content for that particular URL. Use when you need to check the status and result of a previously submitted async scraping task.

Input parameters

token (string, required): Authentication token for Scrape.do API
job_id (string, required): Unique identifier of the job
task_id (string, required): Unique identifier of the task within the job

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Scrape webpage using scrape.do (SCRAPE_DO_GET_PAGE)

A tool to scrape web pages using scrape.do's API service. Makes a basic GET request to fetch webpage content while handling anti-bot protections and proxy rotation automatically. Does not execute JavaScript by default, so pages requiring client-side rendering (SPAs, dynamically loaded content) will return incomplete HTML; use SCRAPE_DO_GET_RENDER_PAGE or set render=true for those cases.

Input parameters

url (string, required): Target web page URL to scrape
super (boolean, optional): Use residential and mobile proxy networks
width (integer, optional): Browser viewport width (requires render=true)
device (string: "desktop" | "mobile" | "tablet", optional): Specify device type (desktop, mobile, tablet)
height (integer, optional): Browser viewport height (requires render=true)
output (string: "raw" | "markdown", optional): Output format (raw or markdown)
render (boolean, optional): Enable headless browser rendering. Use for JS-heavy pages, SPAs, or sites with anti-bot JS challenges. Increase `timeout` when enabling to ensure full page load before cutoff.
timeout (integer, optional): Maximum request timeout in ms (5000-120000)
geo_code (string, optional): Choose country for target web page (e.g. 'us', 'gb')
return_json (boolean, optional): Return network requests in JSON format
set_cookies (string, optional): Set cookies for target web page
extra_headers (boolean, optional): Add/modify headers
retry_timeout (integer, optional): Maximum retry timeout in ms (5000-55000)
custom_headers (boolean, optional): Handle all request headers
block_resources (boolean, optional): Block CSS and image sources
disable_redirection (boolean, optional): Disable request redirection

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
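Several of the parameters above carry constraints: timeout and retry_timeout have documented ranges, and width and height require render=true. A minimal sketch of a pre-flight check for those rules follows; the function name is hypothetical, and it reports problems as a list rather than raising so that all issues surface at once.

```python
from typing import Any, Dict, List


def check_get_page_input(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a SCRAPE_DO_GET_PAGE input."""
    problems: List[str] = []
    if not params.get("url"):
        problems.append("url is required")

    timeout = params.get("timeout")
    if timeout is not None and not 5000 <= timeout <= 120000:
        problems.append("timeout must be between 5000 and 120000 ms")

    retry_timeout = params.get("retry_timeout")
    if retry_timeout is not None and not 5000 <= retry_timeout <= 55000:
        problems.append("retry_timeout must be between 5000 and 55000 ms")

    # Viewport settings only apply when the headless browser is enabled.
    if not params.get("render", False):
        for key in ("width", "height"):
            if key in params:
                problems.append(f"{key} requires render=true")
    return problems
```

An empty list means the input passed these checks; it does not guarantee the scrape itself will succeed.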

List Asynchronous Scraping Jobs (SCRAPE_DO_LIST_ASYNC_JOBS)

Tool to list all asynchronous scraping jobs. Returns a paginated list of jobs with their status and metadata. Use when you need to retrieve job history or monitor job statuses. Supports pagination with up to 100 jobs per page.

Input parameters

page (integer, optional): Page number for pagination (default: 1, minimum: 1)
page_size (integer, optional): Number of jobs per page (default: 10, maximum: 100)

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
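Retrieving a full job history means walking the pages one at a time. The following sketch shows one way to do that with the documented page and page_size inputs. The fetch_page callable is a hypothetical stand-in for invoking the action and parsing its data into a list, and stopping when a page comes back short is an assumption about how the last page looks.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_jobs(
    fetch_page: Callable[[int, int], List[T]],
    page_size: int = 100,
) -> Iterator[T]:
    """Yield jobs from every page, starting at page 1."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    page = 1
    while True:
        jobs = fetch_page(page, page_size)
        yield from jobs
        # Assumption: a short (or empty) page means there are no more jobs.
        if len(jobs) < page_size:
            return
        page += 1
```

Using the maximum page size of 100 keeps the number of action calls low when the history is long.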

Use Scrape.do Proxy Mode (SCRAPE_DO_PROXY_MODE)

This tool implements the Proxy Mode functionality of scrape.do, which allows routing requests through their proxy server. It provides an alternative way to access web scraping capabilities by handling complex JavaScript-rendered pages, geolocation-based routing, device simulation, and built-in anti-bot and retry mechanisms.

Input parameters

url (string, required): The target URL to scrape
device (string, optional): Device type to simulate (desktop, mobile, tablet)
render (boolean, optional): Enable/disable JavaScript rendering
geo_code (string, optional): Geographic location for the request (e.g., 'us', 'uk')
custom_headers (boolean, optional): Whether to forward custom headers to the target website

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Scrape URL using POST method (SCRAPE_DO_SCRAPE_URL_POST)

Tool to scrape web pages using POST method via scrape.do API. Use when you need to send POST requests to target websites with custom request body data. Supports all parameters from the GET endpoint plus request body customization for POST/PUT/PATCH methods.

Input parameters

url (string, required): Target web page URL to scrape with POST request
body (string, optional): HTTP request body for POST request. Can be JSON string, form data, or plain text
super (boolean, optional): Enable residential/mobile proxies. Costs 10x credits
device (string: "desktop" | "mobile" | "tablet", optional): Device types for scraping emulation.
render (boolean, optional): Enable JavaScript rendering with headless browser
geoCode (string, optional): Country code for geo-targeting (e.g. 'us', 'gb', 'de')
timeout (integer, optional): Total request timeout in milliseconds (5000-120000)
sessionId (string, optional): Sticky session ID to reuse the same IP address across multiple requests
setCookies (string, optional): Cookies to include with the request (format: key1=value1; key2=value2)
customHeaders (boolean, optional): Enable sending custom headers with the request

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
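Two of the inputs above are strings with an expected shape: body (which can be a JSON string) and setCookies (key1=value1; key2=value2). The sketch below shows one way to produce both from ordinary Python values. The function names are hypothetical, and cookie values are joined as-is with no escaping, which is an assumption.

```python
import json
from typing import Any, Dict, Mapping, Optional


def format_cookies(cookies: Mapping[str, str]) -> str:
    """Join cookies in the documented 'key1=value1; key2=value2' format."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_post_input(
    url: str,
    json_body: Optional[Any] = None,
    cookies: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble input for SCRAPE_DO_SCRAPE_URL_POST."""
    if not url:
        raise ValueError("url is required")
    payload: Dict[str, Any] = {"url": url}
    if json_body is not None:
        # `body` is typed as a string, so serialize structured data first.
        payload["body"] = json.dumps(json_body)
    if cookies:
        payload["setCookies"] = format_cookies(cookies)
    return payload
```

Serializing with json.dumps rather than hand-writing the string avoids quoting mistakes in nested request bodies.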

Search Amazon products (SCRAPE_DO_SEARCH_AMAZON)

Tool to search Amazon and scrape product listings with structured results. Performs keyword searches and returns structured product data including titles, prices, ratings, Prime status, sponsored flags, and position rankings in JSON format. Use when you need to search for products on Amazon marketplace or gather product information from search results.

Input parameters

page (integer, optional): Page number for pagination (default: 1)
super (boolean, optional): Enable residential/mobile proxies for higher success rates. Costs 10x credits (default: false)
geocode (string, required): Country code for Amazon marketplace (e.g., us, gb, de, jp, ca, fr, it, es, in)
keyword (string, required): Search query term (will be URL-encoded automatically)
zipcode (string, required): Postal/ZIP code formatted according to country requirements (e.g., 10001 for US, SW1A 1AA for UK)
language (string, optional): Language code in ISO 639-1 format (e.g., EN, DE, FR, ES)
include_html (boolean, optional): When true, includes the full raw HTML alongside structured JSON (default: false)

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
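To collect more than the first page of search results, an agent can run this action once per page. As a rough sketch, the hypothetical helper below produces one input per page. Because the keyword is URL-encoded automatically, it is passed through unencoded.

```python
from typing import Any, Dict, List


def build_search_pages(
    keyword: str,
    geocode: str,
    zipcode: str,
    pages: int = 1,
) -> List[Dict[str, Any]]:
    """Return one SCRAPE_DO_SEARCH_AMAZON input per result page."""
    if not keyword.strip():
        raise ValueError("keyword is required")
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return [
        {
            "keyword": keyword,  # raw text; encoding is handled by the action
            "geocode": geocode,
            "zipcode": zipcode,
            "page": number,      # pagination starts at 1
        }
        for number in range(1, pages + 1)
    ]
```

Each dictionary in the returned list can feed a separate agent step, or a loop over a single step.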

Block specific URLs during scraping (SCRAPE_DO_SET_BLOCK_URLS)

This tool blocks specific URLs during the scraping process. It is particularly useful for blocking unwanted resources such as analytics scripts, advertisements, or any other URLs that might interfere with scraping or slow it down. Specifying URL patterns to block gives granular control, which improves scraping performance and helps maintain privacy.

Input parameters

urls (array, required): List of URL patterns to block during scraping. Can be full URLs or patterns.

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

Set Regional Geolocation for Scraping (SCRAPE_DO_SET_REGIONAL_GEO_CODE)

This tool sets broader geographical targeting by specifying a region code instead of a specific country code. This is useful when you want to scrape content from an entire region rather than a specific country. Note that this feature requires super mode to be enabled and is only available for Business Plan or higher subscriptions.

Input parameters

url (string, required): The target URL to scrape with the specified regional geo code
(name and type not shown) (required): The region code to target for scraping requests

Output

data (string, required): Data from the action execution
error (string, optional): Error, if any, that occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful

On the Nagent Platform

Agents using Scrape Do

No publicly available marketplace agent uses this tool yet. However, 74 agents built privately on Nagent already use Scrape Do.

Build on Nagent

Build an agent that uses Scrape Do

Connect Scrape Do to any Nagent agent in minutes — no API key management, no boilerplate. Just configure and deploy.

Frequently Asked

Building with Scrape Do on Nagent

The five questions agent builders ask before adopting a new integration.

How do I connect Scrape Do to my Nagent agent?

Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Scrape Do, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Scrape Do is available to any agent in your workspace.

Do I need to write code to use Scrape Do?

No. Nagent provides no-code integration for every tool. Once Scrape Do is connected, you configure its 16 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.

How do I configure Scrape Do actions and triggers in Helix?

Helix — Nagent's agentic agent builder — lets you drop Scrape Do steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Scrape Do event fires, the agent kicks off automatically.

What input and output schemas does Scrape Do support?

Every Scrape Do action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.

Can I extend Scrape Do with custom logic?

Yes. While Scrape Do ships with 16 pre-built ai web scraping actions, you can layer custom logic around them inside Helix — pre/post-processing steps, conditional branches, retries, or stitching Scrape Do together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab — forward-deployed engineers who build Scrape Do-based workflows tailored to your business.
