Written By Hanzala Saleem
Updated At July 01, 2026 | 8 min read
You are building a competitor monitoring tool, an SEO reporting dashboard, or a compliance archive, and you have hit the same wall every developer hits eventually: do you need a screenshot API or a scraping API?
The honest answer is that most teams end up needing both, just not for the same reason. A screenshot API gives you a pixel-accurate visual record of what a page looked like. A web scraping API gives you the structured data buried inside that page. Confusing the two, or picking the wrong one for a task, wastes engineering time and produces incomplete monitoring pipelines.
This article breaks down the actual technical difference, when each tool wins, where they overlap, and how to combine them without maintaining two separate infrastructures.
Every web monitoring or automation task boils down to one question: are you trying to prove what a page looked like, or are you trying to extract what it says?
A pricing page redesign, a broken checkout flow, or a competitor's new hero section are visual problems. You need an image or PDF that shows exactly what a human visitor saw, rendered in a real browser, with layout, fonts, and images intact.
A product price, a stock status, an email address, or an article body are data problems. You need clean text or structured HTML you can parse, store in a database, and run comparisons against.
Teams that reach for a scraping-only tool when they actually need visual proof end up with JSON that says "price changed" but no image to show why or when. Teams that reach for a screenshot-only tool when they need structured data end up manually eyeballing images instead of running automated diffs. Both mistakes are avoidable once you understand what each API actually does under the hood.
A screenshot API opens a real browser, navigates to a URL, waits for the page to render, and returns an image or PDF. In ScreenshotAPI's case, that browser is a full, isolated Chromium instance, the open-source browser that Google Chrome is built on. This matters because modern sites are not static HTML documents. React, Vue, Angular, and Next.js apps only render correctly once scripts have executed, components have hydrated, and async data has loaded, all of which has to happen before the capture.
A basic capture with ScreenshotAPI looks like this:
```javascript
// Node.js: capture a full page screenshot
const axios = require("axios");
const fs = require("fs");

const query =
  "https://shot.screenshotapi.net/screenshot" +
  "?token=YOUR_API_KEY" +
  "&url=" + encodeURIComponent("https://example.com") +
  "&output=image" +
  "&file_type=png" +
  "&full_page=true" +
  "&block_ads=true" +
  "&no_cookie_banners=true" +
  "&lazy_load=true";

// Request raw bytes so the PNG is written to disk unmodified
axios
  .get(query, { responseType: "arraybuffer" })
  .then((response) => {
    fs.writeFileSync("screenshot.png", response.data);
  })
  .catch((error) => {
    console.error("Screenshot request failed:", error.message);
  });
```
That single request handles a full page render, strips ads and cookie consent popups using the built-in blocking engine, and triggers lazy-loaded content before capturing. Doing this yourself with a self-hosted headless browser means managing browser launch flags, request interception rules, and custom scroll logic for every site you monitor. It also means owning the operational side: memory leaks, Chrome version drift, and the CVE patch cycle that comes with running your own Chromium fleet.
Screenshot APIs also handle format flexibility that raw scraping tools do not. You can request PNG, JPG, WebP, or PDF output, capture a specific element with a CSS selector instead of the full page, emulate mobile or tablet viewports, force dark mode, or inject cookies to capture an authenticated dashboard. None of that is scraping. All of it is visual rendering.
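As a rough illustration of element-level capture, the sketch below builds a request URL in Python. The `token`, `url`, `output`, `file_type`, and `selector` parameters are all named in this article, and the endpoint is copied from the Node.js example above. The helper name `build_element_capture_url` and the `#pricing-table` selector are hypothetical, so treat this as a sketch to check against the provider's documentation rather than a definitive request format.

```python
from urllib.parse import urlencode

# Endpoint copied from the Node.js example above.
ENDPOINT = "https://shot.screenshotapi.net/screenshot"


def build_element_capture_url(token, page_url, selector, file_type="png"):
    """Build a capture URL that targets one element (hypothetical helper)."""
    params = {
        "token": token,
        "url": page_url,
        "output": "image",
        "file_type": file_type,
        # CSS selector of the element to capture instead of the full page.
        "selector": selector,
    }
    # urlencode percent-escapes the target URL and the selector for us.
    return f"{ENDPOINT}?{urlencode(params)}"


capture_url = build_element_capture_url(
    "YOUR_API_KEY", "https://example.com/pricing", "#pricing-table"
)
print(capture_url)
```

Building the query with `urlencode` rather than string concatenation avoids broken requests when a selector contains characters such as `#` or spaces.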
A scraping API's job is different. Instead of returning pixels, it returns the content itself: HTML markup, plain text, or structured data you can parse programmatically.
ScreenshotAPI's scraping functionality runs through the same rendering pipeline as its screenshot capture, which is an important distinction from traditional scrapers that fetch raw HTML without executing JavaScript. Because the page is fully rendered in a real browser first, dynamic content, API-loaded data, and DOM modifications are captured, not just what exists in the original server response.
```python
# Python: extract text, HTML, and Markdown from a rendered page
import requests

url = "https://shot.screenshotapi.net/v3/screenshot"
params = {
    "token": "YOUR_API_KEY",
    "url": "https://example.com/pricing",
    "output": "json",
    "extract_text": "true",
    "extract_html": "true",
    "extract_markdown": "true",
}

# Fail loudly on HTTP errors instead of parsing an error body as data
response = requests.get(url, params=params, timeout=60)
response.raise_for_status()
data = response.json()
print(data)
```
This request returns a JSON payload with URLs to a .txt, .html, and .md file generated from the fully rendered page. You get a plain text version for indexing or diffing, raw HTML for structural analysis, and Markdown for feeding content into LLM or RAG pipelines. If you also need every image referenced on the page, adding get_image_urls=true returns a structured list without a separate crawl step.
Note that web scraping output only works when output is set to json. It is not supported alongside video or scrolling screenshot render modes, since those are visual-only outputs.
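The `output=json` constraint is easy to trip over when parameters are assembled dynamically. A minimal pre-flight check might look like the following sketch. It uses only parameter names from this article; `check_extraction_params` is a hypothetical helper, and it deliberately does not check the video or scrolling render modes because their parameter names are not given here.

```python
EXTRACT_FLAGS = ("extract_text", "extract_html", "extract_markdown")


def check_extraction_params(params):
    """Return a list of problems with an extraction request (empty if none)."""
    problems = []
    # Treat a flag as enabled only when it is explicitly set to "true".
    wants_extraction = any(
        str(params.get(flag, "")).lower() == "true" for flag in EXTRACT_FLAGS
    )
    if wants_extraction and params.get("output") != "json":
        problems.append("extraction flags require output=json")
    return problems


bad = check_extraction_params({"output": "image", "extract_text": "true"})
good = check_extraction_params({"output": "json", "extract_text": "true"})
print(bad)
print(good)
```

Running a check like this before sending the request turns a silently missing extraction result into an explicit error on your side.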
| Feature | Screenshot API | Web Scraping API |
|---|---|---|
| Primary output | PNG, JPG, WebP, PDF | Text, HTML, Markdown, JSON |
| Renders JavaScript | Yes, full Chromium execution | Yes, on the rendered DOM |
| Use case | Visual proof, archiving, QA | Data extraction, monitoring, indexing |
| Output is human readable | Directly viewable | Requires parsing |
| Output is machine parseable | No, requires image analysis or OCR | Yes, structured by default |
| Handles lazy load, infinite scroll | Yes, via full page + lazy load params | Yes, same rendering pipeline |
| Ad and cookie banner blocking | Built in, 20,000+ rules | Built in, same blocking engine |
| Storage | Direct upload to S3, Google Cloud, Wasabi | Same, plus JSON payload |
Common scenarios and the tool that fits each:

| Scenario | Best Fit | Why |
|---|---|---|
| Competitor pricing changed, need proof | Screenshot API | You need to see the actual layout and context |
| Tracking price as a numeric field in a database | Scraping API | You need a clean, parseable value |
| Visual regression testing in CI/CD | Screenshot API | Pixel comparison across viewports |
| Feeding web content to an LLM or RAG pipeline | Scraping API | Markdown or text output is directly usable |
| Compliance archiving of a legal disclosure page | Screenshot API (PDF) | Full-page visual record that preserves layout and text as displayed |
| SEO content audits across many pages | Both | Screenshot for layout, extract_text for content analysis |
| Social preview image generation | Screenshot API | Rendered image is the deliverable itself |
| Lead generation from directory listings | Scraping API | Structured contact data, not visuals |
Self-hosting compared with a managed API:

| Factor | Self-hosted Puppeteer or Playwright | Managed API (Screenshot or Scraping) |
|---|---|---|
| Infrastructure cost | Servers, memory, DevOps time | Included in plan |
| Setup time | Days to weeks | Minutes |
| Scaling | Manual orchestration, queues, autoscaling | Automatic |
| Chrome version management | Your responsibility, weekly patch cycle | Handled by provider |
| Ad and cookie blocking | Custom rule maintenance | Prebuilt, updated centrally |
| Reliability across environments | Depends on host machine | Same Chromium version every request |
If you have used Playwright directly for full page captures before, our guide on taking a full page screenshot in Playwright walks through the manual setup and where teams typically hit friction with lazy-loaded content and scroll timing. The web scraping with Playwright guide covers the equivalent self-hosted path for extraction workflows.
SaaS competitor monitoring. An engineering lead wants to track five competitor pricing pages daily. A scheduled screenshot captures the visual layout so the team can see design changes at a glance, while extract_text=true on the same request pulls the actual price strings for automated diffing. One API call, two outputs. This is the exact workflow covered in how to monitor competitor websites with a screenshot API.
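To make the automated diffing step concrete, here is a small sketch that compares price strings between two extracted text snapshots. It assumes you have already downloaded the .txt output for two different days. The regular expression, the helper names, and the sample text are illustrative assumptions, and real pricing pages will likely need a pattern tuned to their currency and formatting.

```python
import re

# Matches simple prices such as $19, $1,299, or €49.99 (illustrative pattern).
PRICE_PATTERN = re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d{1,2})?")


def extract_prices(text):
    """Return every price-like string found in the text, in page order."""
    return PRICE_PATTERN.findall(text)


def diff_prices(previous_text, current_text):
    """Report price strings that disappeared or appeared between snapshots."""
    old, new = extract_prices(previous_text), extract_prices(current_text)
    return {
        "removed": [price for price in old if price not in new],
        "added": [price for price in new if price not in old],
    }


yesterday = "Starter $19/mo. Pro $49/mo. Enterprise: contact us."
today = "Starter $19/mo. Pro $59/mo. Enterprise: contact us."
changes = diff_prices(yesterday, today)
print(changes)
```

When `changes` is non-empty, that is the moment to attach the screenshot from the same capture so a reviewer can see the change in context.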
Legal and compliance archiving. A fintech company needs timestamped, tamper-evident records of terms of service pages for regulatory audits. Here, the screenshot API wins outright. A full page PDF, stored directly to a private S3 bucket, preserves layout and text exactly as a regulator would need to see it. Structured text alone would not hold up as visual evidence.
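One common way to make an archived file tamper-evident is to record a cryptographic hash of it at capture time. The sketch below shows that idea using only the Python standard library. The function names and record fields are hypothetical, and whether a hash plus timestamp satisfies a given regulator is an assumption you would need to confirm with your compliance team.

```python
import hashlib
from datetime import datetime, timezone


def build_archive_record(pdf_bytes, source_url, captured_at=None):
    """Describe a captured PDF with a SHA-256 digest and a UTC timestamp."""
    captured_at = captured_at or datetime.now(timezone.utc)
    return {
        "source_url": source_url,
        "captured_at": captured_at.isoformat(),
        "sha256": hashlib.sha256(pdf_bytes).hexdigest(),
        "size_bytes": len(pdf_bytes),
    }


def verify_archive_record(pdf_bytes, record):
    """Return True if the file still matches the digest stored at capture."""
    return hashlib.sha256(pdf_bytes).hexdigest() == record["sha256"]


# Placeholder bytes stand in for a real PDF downloaded from the API.
pdf_bytes = b"%PDF-1.7 placeholder"
record = build_archive_record(pdf_bytes, "https://example.com/terms")
print(record["sha256"], verify_archive_record(pdf_bytes, record))
```

Storing the record separately from the PDF itself means a later edit to the file shows up as a digest mismatch.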
AI agent and RAG pipelines. A team building an internal AI research assistant needs to feed live web pages into an LLM. Raw HTML is noisy and token-expensive. extract_markdown=true strips the page down to headings, lists, and links in a clean format built for language models. This is the approach detailed in feeding web pages to AI agents with ScreenshotAPI and in the webpage to Markdown guide.
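Once you have the Markdown output, a typical next step in a RAG pipeline is splitting it into chunks before embedding. The following is a minimal sketch of heading-based chunking, assuming the .md file has already been downloaded into a string. The `chunk_markdown` helper and the 2,000-character limit are illustrative choices, not part of ScreenshotAPI.

```python
import re

# An ATX heading: one to six '#' characters followed by text.
HEADING = re.compile(r"^#{1,6}\s+\S")


def chunk_markdown(markdown, max_chars=2000):
    """Split Markdown at headings, also breaking sections that grow too long.

    A single line longer than max_chars is kept whole, so chunks can
    occasionally exceed the limit.
    """
    chunks, current, size = [], [], 0
    for line in markdown.splitlines():
        starts_section = HEADING.match(line) is not None
        too_big = size + len(line) + 1 > max_chars
        if current and (starts_section or too_big):
            chunks.append("\n".join(current).strip())
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that were only blank lines.
    return [chunk for chunk in chunks if chunk]


sample = "# Pricing\nStarter is $19.\n\n## FAQ\nCancel anytime.\n"
chunks = chunk_markdown(sample)
print(chunks)
```

Splitting at headings keeps each chunk topically coherent, which tends to matter more for retrieval quality than hitting an exact size.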
QA visual regression. A frontend team wants to catch UI bugs before functional tests do: a shifted button, a broken font load, a layout that collapsed on mobile. This is purely visual. Desktop, tablet, and mobile viewport captures on every deployment, compared against a baseline with a tool like pixelmatch, catch what unit tests miss.
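The core of a baseline comparison can be sketched in a few lines. The function below is a simplified stand-in, not pixelmatch's algorithm: pixelmatch is a JavaScript library that does more, such as anti-aliasing detection. This version only counts pixels whose channels differ beyond a tolerance. It assumes both screenshots have already been decoded into raw RGBA byte buffers of the same dimensions, and `diff_ratio` is a hypothetical name.

```python
def diff_ratio(baseline, candidate, width, height, tolerance=0):
    """Return the fraction of pixels that differ between two RGBA buffers."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    expected_len = width * height * 4
    if len(baseline) != expected_len or len(candidate) != expected_len:
        raise ValueError("buffers must be width * height * 4 bytes (RGBA)")
    changed = 0
    # Walk the buffers one pixel (four channel bytes) at a time.
    for i in range(0, expected_len, 4):
        if any(
            abs(baseline[i + c] - candidate[i + c]) > tolerance
            for c in range(4)
        ):
            changed += 1
    return changed / (width * height)


# A 2x2 white image, and a copy with the first pixel turned black.
white = bytes([255, 255, 255, 255] * 4)
edited = bytes([0, 0, 0, 255]) + white[4:]
ratio = diff_ratio(white, edited, 2, 2)
print(ratio)
```

In a CI job you would decode each PNG to RGBA first (for example with Pillow's `Image.open(path).convert("RGBA").tobytes()`) and fail the build when the ratio exceeds a threshold your team agrees on.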
Ask these three questions in order.
1. Do you need to prove what the page looked like, or what it said? If the answer is "looked like," you need a screenshot API. If the answer is "said," you need a scraping API.
2. Will a human ever need to view the output directly? Screenshots are for human eyes: reports, evidence, design reviews, social previews. Extracted text and HTML are for machines: databases, diff engines, search indexes, LLM context windows.
3. Do you need both from the same page, at the same time? This is more common than picking one and sticking with it. Competitor monitoring, SEO audits, and e-commerce tracking usually need a visual record and a structured data point from the identical render. Since ScreenshotAPI runs scraping through the same Chromium rendering pipeline as its screenshot capture, you can request output=json with extract_text=true and still receive a screenshot URL in the same response, avoiding two separate calls, two separate rendering costs, and two chances for the page state to drift between captures.
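The three questions above can be reduced to a small mapping from needs to request parameters. This is a sketch under the assumptions stated in this article: `output=json` with `extract_text=true` still returns a screenshot URL, and `output=image` returns the image directly. The `plan_request` helper is hypothetical.

```python
def plan_request(needs_visual_record, needs_structured_data):
    """Map the answers to the questions above onto request parameters."""
    if not (needs_visual_record or needs_structured_data):
        raise ValueError("nothing to capture: pick a visual record, data, or both")
    if needs_structured_data:
        # output=json carries the extraction results; per this article the
        # response still includes a screenshot URL, covering the "both" case.
        return {"output": "json", "extract_text": "true"}
    # Visual record only: ask for the image itself.
    return {"output": "image", "file_type": "png"}


visual_only = plan_request(True, False)
both = plan_request(True, True)
print(visual_only, both)
```

Merge the returned dictionary with your `token` and `url` before sending the request.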
- Use full_page=true and lazy_load=true together when the page has infinite scroll or below-the-fold content you need in both the image and the extracted text.
- Set block_ads=true and no_cookie_banners=true on every request. Clean captures produce cleaner extracted text, since consent banners and ad slots otherwise pollute your .txt output.
- Use fresh=true sparingly. Cached responses do not count toward your quota, so only force a fresh render when you specifically need the current state, not a cached one.
- Use the selector parameter to capture only the component under test rather than the full page. Smaller images mean faster diffs.
Is a screenshot API the same as a web scraping API?
No. A screenshot API returns a visual image or PDF of a rendered webpage. A web scraping API returns structured data such as text, HTML, or Markdown extracted from that same page. Some providers, including ScreenshotAPI, support both through the same rendering engine and the same request.
Can I get a screenshot and extracted data from a single request?
Yes. Setting output=json along with extract_text=true, extract_html=true, or extract_markdown=true returns structured content while the request still generates the visual render, so you are not paying for two separate captures of the same page.
Which one should I use for competitor price monitoring?
Both, used together. The scraping output gives you a parseable price value you can diff automatically, while the screenshot gives you visual context for what actually changed on the page, which matters when a price change comes with a redesign or a new promotion.
Does extraction work on JavaScript-heavy sites?
Yes. Because extraction runs on the fully rendered DOM after JavaScript execution, dynamic content loaded by React, Vue, Angular, or Next.js applications is captured correctly, unlike traditional scrapers that only read the initial server response.