Scrape.do is a web scraping API offering rotating residential, data-center, and mobile proxies with headless browser support and session management to bypass anti-bot protections (e.g., Cloudflare, Akamai) and extract data at scale in formats like JSON and HTML. On Nagent, Scrape Do is exposed as a fully configurable AI web scraping integration that any agent can call, with 16 actions and API key authentication. No code is required to wire Scrape Do into your workflow: connect it once via the External Integrations panel and reuse it across every agent you build.
Agent builders use Scrape Do to automate the kinds of tasks that web scraping teams previously handled manually. Each action described below runs as a single agent step in Nagent.
Every action and trigger is paired with a structured input/output schema (visible in the sections below), so when you wire Scrape Do into Helix — our agentic agent builder — the editor knows exactly what each step expects and produces. Configure once, deploy anywhere across your Nagent agents.
Every operation an agent can call against Scrape Do, with input parameters and output schema. Drop these into any step of an agent built in Helix.
SCRAPE_DO_CANCEL_ASYNC_JOB
Tool to cancel an asynchronous scraping job. Use when you need to stop processing of pending tasks in a job. Completed tasks remain available.
Input parameters
Authentication token for Scrape.do API
Unique identifier of the job to cancel
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_CREATE_ASYNC_JOB
Tool to create an asynchronous scraping job with specified targets and options. Use when you need to scrape multiple URLs in parallel without waiting for results. Returns a job ID immediately for polling results later via the get job status action.
Input parameters
HTTP request body for POST/PUT/PATCH requests
Use residential/mobile proxy networks (default: false)
Device types for scraping emulation.
HTTP methods for async scraping requests.
Output format for scraped content.
Options for headless browser rendering.
Country code for geo-targeting (e.g., 'us', 'gb', 'de')
Custom HTTP headers to send with requests
Array of target URLs to scrape. Each URL will be processed asynchronously.
Total request timeout in milliseconds (default: 60000)
Sticky session ID to reuse same IP address across requests
Cookies to include with the request
Webhook URL to send results to when job completes
Disable automatic retry mechanism (default: false)
Retry timeout per request in milliseconds (default: 15000)
Use only provided headers, discard default headers (default: false)
Additional headers to send with webhook notification
Regional code for more specific geo-targeting
Disable following HTTP redirects (default: false)
Return raw target website response without processing (default: false)
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
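The defaults listed for the async job inputs can be captured in a small request builder. This is a minimal sketch, not the integration's real schema: the class name, every field name, and the `GET` default for the method are assumptions made for illustration. Only the numeric and boolean defaults come from the parameter descriptions above.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AsyncJobRequest:
    """Hypothetical request shape; field names are illustrative only."""

    targets: list[str]
    method: str = "GET"                    # assumed default, not documented above
    use_residential_proxies: bool = False  # documented default: false
    timeout_ms: int = 60000                # documented default: 60000
    retry_timeout_ms: int = 15000          # documented default: 15000
    disable_retry: bool = False            # documented default: false
    disable_redirects: bool = False        # documented default: false
    geo_code: Optional[str] = None
    webhook_url: Optional[str] = None
    headers: dict[str, str] = field(default_factory=dict)

    def to_payload(self) -> dict:
        if not self.targets:
            raise ValueError("at least one target URL is required")
        # Drop unset optional values so the payload carries only explicit choices.
        return {k: v for k, v in asdict(self).items() if v not in (None, {})}
```

Calling `AsyncJobRequest(targets=[...]).to_payload()` yields a dictionary with the documented defaults filled in and the unset optional fields omitted.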
SCRAPE_DO_GET_ACCOUNT_INFO
Retrieves account information and usage statistics from Scrape.do. This action makes a GET request to the Scrape.do info endpoint to fetch:
- Subscription status
- Concurrent request limits and usage
- Monthly request limits and remaining requests
- Real-time usage statistics
Rate limit: maximum 10 requests per minute. Use the remaining request counts to monitor credits proactively, as different scraping operations (e.g., rendered-page requests) consume varying credit amounts and exhaustion mid-run causes failures.
Input parameters
Authentication token for Scrape.do API
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
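Because the account info action is limited to 10 requests per minute, an agent that checks credits often may want a client-side guard. The following is a minimal sliding-window sketch; the class and method names are hypothetical, and it only counts calls locally rather than reading any limit state from Scrape.do.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allows at most max_calls within any window_s-second window."""

    def __init__(self, max_calls: int = 10, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._calls: deque = deque()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_s:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A caller would check `try_acquire()` before each account info request and skip or delay the request when it returns `False`.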
SCRAPE_DO_GET_AMAZON_OFFERS
Get all seller offers for any Amazon product. Retrieves every seller listing including pricing, shipping costs, seller information, and Buy Box status in structured JSON format. Use when you need to compare prices across multiple sellers or find the best deal for a specific product.
Input parameters
Amazon Standard Identification Number (10-character product ID)
Country code for Amazon marketplace (e.g., us, gb, de, jp, fr, es, it, ca)
Postal/ZIP code formatted according to country requirements
Enable residential/mobile proxies for higher success rates. Costs 10x credits
When true, includes the full raw HTML alongside structured JSON
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_GET_AMAZON_PRODUCT
Extract structured product data from Amazon product detail pages (PDP). Returns comprehensive product information including title, pricing, ratings, images, best seller rankings, and technical specifications in JSON format.
Input parameters
Amazon Standard Identification Number (10-character product ID)
Country code (e.g., us, gb, de, jp, fr, ca)
Postal code formatted according to country requirements
Language code in ISO 639-1 format (e.g., EN, DE, FR)
Enable residential/mobile proxies for higher success rates. Costs 10x credits
When true, includes the full raw HTML alongside structured JSON
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
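Two details from the Amazon inputs above are easy to check before a step runs: the ASIN is a 10-character ID, and residential/mobile proxies cost 10x credits. The sketch below assumes ASINs consist of uppercase letters and digits and takes the base credit cost as a caller-supplied value, since the source does not state it. Both function names are hypothetical.

```python
import re

# Assumption: an ASIN is exactly 10 uppercase alphanumeric characters.
_ASIN_RE = re.compile(r"[A-Z0-9]{10}")


def is_valid_asin(asin: str) -> bool:
    """Return True when the string looks like a 10-character ASIN."""
    return _ASIN_RE.fullmatch(asin) is not None


def estimate_credits(requests: int, base_cost: int, use_residential: bool) -> int:
    """Estimate credits for a batch; residential/mobile proxies cost 10x."""
    if requests < 0 or base_cost < 0:
        raise ValueError("requests and base_cost must be non-negative")
    multiplier = 10 if use_residential else 1
    return requests * base_cost * multiplier
```

Comparing the estimate with the remaining request count from the account info action gives an agent a way to stop before credits run out mid-run.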
SCRAPE_DO_GET_AMAZON_RAW_HTML
Tool to get raw HTML from any Amazon page with ZIP code geo-targeting. Use when you need complete unprocessed HTML source from Amazon URLs with location-based targeting. Ideal for scraping pages not covered by other structured endpoints.
Input parameters
Full Amazon URL to scrape (e.g., https://www.amazon.com/dp/B08N5WRWNW)
Enable residential/mobile proxies for higher success rates. Costs 10x credits. Default is false.
Output format - must be 'html' for raw HTML content
Country code for geo-targeting (e.g., us, gb, de, jp)
Request timeout in milliseconds
Postal code formatted according to country requirements (e.g., 10001 for US, SW1A 1AA for UK)
Language code in ISO 639-1 format (e.g., EN, DE, FR, ES)
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_GET_ASYNC_ACCOUNT_INFO
Tool to get account information for the Async API including concurrency limits and usage statistics. Use when you need to check available concurrency slots, active jobs, or remaining credits for Async API operations.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_GET_ASYNC_JOB
Tool to retrieve details and status of a specific asynchronous scraping job. Use when you need to check the progress, status, or results of a previously created async job. Returns job metadata including creation time, completion time, task counts, and detailed task list.
Input parameters
Unique identifier of the job to retrieve
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
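Creating an async job returns a job ID right away, so results are typically collected by polling the job status action. The sketch below shows one way to structure that loop. The status strings in `TERMINAL_STATUSES`, the `status` key, and the `fetch_job` callable are all assumptions; the real payload shape should be taken from the action's output schema.

```python
import time
from typing import Callable

# Hypothetical status names; the real values are not given in this document.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def wait_for_job(fetch_job: Callable[[str], dict], job_id: str,
                 interval_s: float = 2.0, max_polls: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_job(job_id) until the job reaches a terminal status."""
    for attempt in range(max_polls):
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Skip the final sleep: no further poll follows it.
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Passing `fetch_job` and `sleep` as arguments keeps the loop independent of any particular client and lets it run in tests without real delays.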
SCRAPE_DO_GET_ASYNC_TASK
Tool to retrieve the result of a specific task within an asynchronous job. Returns the scraped content for that particular URL. Use when you need to check the status and result of a previously submitted async scraping task.
Input parameters
Authentication token for Scrape.do API
Unique identifier of the job
Unique identifier of the task within the job
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_GET_PAGE
A tool to scrape web pages using scrape.do's API service. Makes a basic GET request to fetch webpage content while handling anti-bot protections and proxy rotation automatically. Does not execute JavaScript by default: pages requiring client-side rendering (SPAs, dynamically loaded content) will return incomplete HTML; use SCRAPE_DO_GET_RENDER_PAGE or set render=true for those cases.
Input parameters
Target web page URL to scrape
Use residential & mobile proxy networks
Browser viewport width (requires render=true)
Specify device type (desktop, mobile, tablet)
Browser viewport height (requires render=true)
Output format (raw or markdown)
Enable headless browser rendering. Use for JS-heavy pages, SPAs, or sites with anti-bot JS challenges. Increase `timeout` when enabling so the page can load fully before the cutoff.
Maximum request timeout in ms (5000-120000)
Choose country for target web page (e.g. 'us', 'gb')
Return network requests in JSON format
Set cookies for target web page
Add/modify headers
Maximum retry timeout in ms (5000-55000)
Handle all request headers
Block CSS and image sources
Disable request redirection
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
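The numeric ranges quoted for the page inputs (request timeout 5000-120000 ms, retry timeout 5000-55000 ms) can be validated before a request is assembled. The sketch below builds only a query string. `render` and `timeout` are names used in the descriptions above, while `url`, `retryTimeout`, and the function name are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def build_page_query(url: str, render: bool = False,
                     timeout_ms: Optional[int] = None,
                     retry_timeout_ms: Optional[int] = None) -> str:
    """Validate the documented ranges and return a URL-encoded query string."""
    if timeout_ms is not None and not 5000 <= timeout_ms <= 120000:
        raise ValueError("timeout must be between 5000 and 120000 ms")
    if retry_timeout_ms is not None and not 5000 <= retry_timeout_ms <= 55000:
        raise ValueError("retry timeout must be between 5000 and 55000 ms")

    params: dict = {"url": url}
    if render:
        params["render"] = "true"
    if timeout_ms is not None:
        params["timeout"] = timeout_ms
    if retry_timeout_ms is not None:
        params["retryTimeout"] = retry_timeout_ms  # hypothetical parameter name
    # urlencode percent-encodes the target URL so it survives as one value.
    return urlencode(params)
```

For a JavaScript-heavy page, a call such as `build_page_query("https://example.com/app", render=True, timeout_ms=30000)` pairs rendering with a longer timeout, as the `render` description recommends.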
SCRAPE_DO_LIST_ASYNC_JOBS
Tool to list all asynchronous scraping jobs. Returns paginated list of jobs with their status and metadata. Use when you need to retrieve job history or monitor job statuses. Supports pagination with up to 100 jobs per page.
Input parameters
Page number for pagination (default: 1, minimum: 1)
Number of jobs per page (default: 10, maximum: 100)
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
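The job list is paginated (page starts at 1, at most 100 jobs per page), so collecting the full history means walking pages until a short page appears. This is a sketch under assumptions: `list_page` is a hypothetical callable standing in for the list action, and it is assumed to return a plain list of job dictionaries.

```python
from typing import Callable, Iterator


def iter_jobs(list_page: Callable[[int, int], list],
              page_size: int = 10) -> Iterator[dict]:
    """Yield every job by requesting pages 1, 2, 3, ... in order."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    page = 1
    while True:
        jobs = list_page(page, page_size)
        yield from jobs
        # A page shorter than page_size means there is nothing left to fetch.
        if len(jobs) < page_size:
            return
        page += 1
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page, which is the price of not relying on a total-count field.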
SCRAPE_DO_PROXY_MODE
This tool implements the Proxy Mode functionality of scrape.do, which allows routing requests through their proxy server. It provides an alternative way to access web scraping capabilities by handling complex JavaScript-rendered pages, geolocation-based routing, device simulation, and built-in anti-bot and retry mechanisms.
Input parameters
The target URL to scrape
Device type to simulate (desktop, mobile, tablet)
Enable/disable JavaScript rendering
Geographic location for the request (e.g., 'us', 'uk')
Whether to forward custom headers to the target website
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_SCRAPE_URL_POST
Tool to scrape web pages using POST method via scrape.do API. Use when you need to send POST requests to target websites with custom request body data. Supports all parameters from GET endpoint plus request body customization for POST/PUT/PATCH methods.
Input parameters
Target web page URL to scrape with POST request
HTTP request body for POST request. Can be JSON string, form data, or plain text
Enable residential/mobile proxies. Costs 10x credits
Device types for scraping emulation.
Enable JavaScript rendering with headless browser
Country code for geo-targeting (e.g. 'us', 'gb', 'de')
Total request timeout in milliseconds (5000-120000)
Sticky session ID to reuse same IP address across multiple requests
Cookies to include with the request (format: key1=value1; key2=value2)
Enable sending custom headers with the request
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
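The cookies input expects the format `key1=value1; key2=value2`. A small helper can produce that string from a dictionary and reject names or values that would break the format. The helper name and its validation rules are illustrative assumptions, not part of the integration.

```python
def format_cookies(cookies: dict[str, str]) -> str:
    """Join cookies as 'key1=value1; key2=value2', keeping insertion order."""
    for name, value in cookies.items():
        # ';' separates pairs and '=' separates name from value,
        # so neither may appear where it would be misread.
        if not name or ";" in name or "=" in name or ";" in value:
            raise ValueError(f"cookie {name!r} cannot be encoded in this format")
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```

For example, `format_cookies({"session": "abc", "lang": "en"})` returns `session=abc; lang=en`.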
SCRAPE_DO_SEARCH_AMAZON
Tool to search Amazon and scrape product listings with structured results. Performs keyword searches and returns structured product data including titles, prices, ratings, Prime status, sponsored flags, and position rankings in JSON format. Use when you need to search for products on Amazon marketplace or gather product information from search results.
Input parameters
Page number for pagination (default: 1)
Enable residential/mobile proxies for higher success rates. Costs 10x credits (default: false)
Country code for Amazon marketplace (e.g., us, gb, de, jp, ca, fr, it, es, in)
Search query term (will be URL-encoded automatically)
Postal/ZIP code formatted according to country requirements (e.g., 10001 for US, SW1A 1AA for UK)
Language code in ISO 639-1 format (e.g., EN, DE, FR, ES)
When true, includes the full raw HTML alongside structured JSON (default: false)
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
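The search query is URL-encoded automatically, so an agent should pass the raw term rather than a pre-encoded one. The snippet below uses Python's standard `urllib.parse.quote_plus` to illustrate what encoding does and why encoding twice corrupts the term. It does not claim to reproduce the exact encoding the action applies.

```python
from urllib.parse import quote_plus

raw_query = "wireless headphones & case"

# Encoding once turns spaces into '+' and '&' into '%26'.
encoded_once = quote_plus(raw_query)

# Encoding an already-encoded term escapes the '+' and '%' again,
# so the search would run on a literal, garbled string.
encoded_twice = quote_plus(encoded_once)

print(encoded_once)
print(encoded_twice)
```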
SCRAPE_DO_SET_BLOCK_URLS
This tool allows users to block specific URLs during the scraping process. It's particularly useful for blocking unwanted resources like analytics scripts, advertisements, or any other URLs that might interfere with the scraping process or slow it down. It provides granular control by allowing users to specify URL patterns to block, thereby improving scraping performance and maintaining privacy.
Input parameters
List of URL patterns to block during scraping. Can be full URLs or patterns.
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
SCRAPE_DO_SET_REGIONAL_GEO_CODE
This tool allows users to set a broader geographical targeting by specifying a region code instead of a specific country code. This is useful when you want to scrape content from an entire region rather than a specific country. Note that this feature requires super mode to be enabled and is only available for Business Plan or higher subscriptions.
Input parameters
The target URL to scrape with the specified regional geo code
The region code to target for scraping requests
Output
Data from the action execution
Error if any occurred during the execution of the action
Whether the action execution was successful
No publicly available marketplace agent uses this tool yet. 74 privately built agents on Nagent already use Scrape Do.
The five questions agent builders ask before adopting a new integration.
Open the External Integrations panel inside Nagent (app.nagent.ai/externalIntegration), find Scrape Do, and click "Connect Now." You'll authenticate with an API key — Nagent handles credential storage and refresh automatically. Once connected, Scrape Do is available to any agent in your workspace.
No. Nagent provides no-code integration for every tool. Once Scrape Do is connected, you configure its 16 actions directly in the agent builder UI — no API calls, no boilerplate, no schema management.
Helix — Nagent's agentic agent builder — lets you drop Scrape Do steps into any workflow visually. Pick an action (e.g., one of those listed above), fill in the inputs (Helix knows the required vs. optional schema for each parameter), and connect it to upstream/downstream steps. Triggers run as the entry point of an agent, so when a Scrape Do event fires, the agent kicks off automatically.
Every Scrape Do action and trigger ships with a fully-typed schema — input parameters with name, type, required flag, and description, plus the output payload shape. The schemas are documented in the sections above. Helix uses these schemas to validate your configuration at build time and to type-check the data flowing between steps.
Yes. While Scrape Do ships with 16 pre-built AI web scraping actions, you can layer custom logic around them inside Helix: pre/post-processing steps, conditional branches, retries, or stitching Scrape Do together with other connected tools. For deeper customization, talk to our team about Nagent's Agentic AI Lab, whose forward-deployed engineers build Scrape Do-based workflows tailored to your business.
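The typed schemas mentioned in the answers above (each input parameter carrying a name, type, and required flag) lend themselves to a build-time check. The sketch below is a rough illustration of that idea only: `Param`, `validate_inputs`, and the example parameter names are hypothetical and do not describe how Helix validates configurations internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    """One schema entry: parameter name, expected Python type, required flag."""

    name: str
    type: type
    required: bool = False


def validate_inputs(schema: list[Param], inputs: dict) -> list[str]:
    """Return human-readable problems; an empty list means the inputs pass."""
    errors = []
    known = {p.name for p in schema}
    for p in schema:
        if p.name not in inputs:
            if p.required:
                errors.append(f"missing required parameter: {p.name}")
        elif not isinstance(inputs[p.name], p.type):
            got = type(inputs[p.name]).__name__
            errors.append(f"{p.name}: expected {p.type.__name__}, got {got}")
    # Flag anything the schema does not declare.
    errors.extend(f"unknown parameter: {k}" for k in inputs if k not in known)
    return errors
```

Returning a list of problems instead of raising on the first one lets a builder UI show every misconfigured field at once.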