Web Scraper

⚡ Built-in: Data & Enrichment

Extract public contact data from company websites: emails, social media links, and phone numbers. AVG (GDPR)-aware: scraping is always opt-in, respects robots.txt, and uses an identifiable User-Agent.

⚡ Built-in: No Setup Required

This integration is built into the WebsitePublisher platform. All endpoints are available immediately, and no API key is needed.

Endpoints (2)

POST scrape

Extract public contact data from a company website. Scrapes the homepage and, unless disabled, common contact page paths such as /contact, /over-ons, and /about. Returns the emails, social media links, and phone numbers it finds.

AVG (GDPR) notice: only publicly available data is scraped. The user is responsible for any further processing of personal data that is found.

Field	Type	Required	Description
`url`	url	✓ Yes	Company website URL (e.g. "https://example.nl"). Must include https:// or http://.
`include_contact_page`	boolean	No	Also scrape common contact pages (/contact, /contactus, /over-ons, /about). Default: true.
`respect_robots`	boolean	No	Check and respect robots.txt before scraping. Default: true.
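
As a rough illustration of how the three fields above fit together, here is a minimal Python sketch that builds and sends a scrape request. It rests on several assumptions that are not stated on this page: the base URL `BASE_URL` is a placeholder, the request body is assumed to be JSON, and the helper names `build_scrape_payload` and `scrape` are hypothetical. The response format is not documented here, so the sketch simply returns whatever JSON comes back.

```python
import json
import urllib.request
from urllib.parse import urlparse

# ASSUMPTION: placeholder address. The real endpoint URL is not given in this document.
BASE_URL = "https://api.example.invalid/integrations/web-scraper"


def build_scrape_payload(url, include_contact_page=True, respect_robots=True):
    """Build the request body for POST scrape from the documented fields."""
    parsed = urlparse(url)
    # The documentation requires the URL to include http:// or https://.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must start with http:// or https://")
    return {
        "url": url,
        "include_contact_page": include_contact_page,  # documented default: true
        "respect_robots": respect_robots,              # documented default: true
    }


def scrape(url, **options):
    """Send the request (ASSUMPTION: JSON body) and return the parsed JSON reply."""
    body = json.dumps(build_scrape_payload(url, **options)).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `build_scrape_payload("https://example.nl", include_contact_page=False)` would produce a body that scrapes only the homepage while still respecting robots.txt.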

POST robots-check

Check if a website allows scraping by reading its robots.txt. Returns allowed/disallowed status for our User-Agent.

Field	Type	Required	Description
`url`	url	✓ Yes	Website URL to check robots.txt for.
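
To make the allowed/disallowed idea concrete, the sketch below evaluates a robots.txt file locally with Python's standard `urllib.robotparser`. It is not the platform's implementation, only an illustration of the kind of check the endpoint describes. The User-Agent token `ExampleScraperBot`, the helper `is_allowed`, and the sample rules are all hypothetical; the scraper's real User-Agent string is not stated on this page.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: hypothetical User-Agent token used only for this illustration.
USER_AGENT = "ExampleScraperBot"

# Hypothetical robots.txt content for a site that admits our bot but blocks others.
SAMPLE_ROBOTS = """\
User-agent: ExampleScraperBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/contact", "/private/staff"):
        url = f"https://example.nl{path}"
        status = "allowed" if is_allowed(SAMPLE_ROBOTS, url) else "disallowed"
        print(f"{url}: {status}")
```

Because the result depends on the User-Agent, the same robots.txt can allow one crawler and block another, which is why the endpoint reports the status specifically for the scraper's own User-Agent.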

MCP Tool Names

When using this integration through an AI assistant (Claude, ChatGPT, Cursor, etc.), the endpoints are available as MCP tools:

Endpoint	MCP Tool Name
scrape	`web-scraper_scrape`
robots-check	`web-scraper_robots_check`
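
For readers curious about what an AI assistant sends under the hood, the following sketch builds a Model Context Protocol `tools/call` message for one of the tool names above. MCP clients normally construct this message themselves, so it is shown only for orientation. It assumes the tool arguments mirror the endpoint fields documented earlier; the helper name `mcp_tool_call` is hypothetical.

```python
import json


def mcp_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# ASSUMPTION: the tool accepts the same fields as the POST scrape endpoint.
message = mcp_tool_call(
    "web-scraper_scrape",
    {"url": "https://example.nl", "respect_robots": True},
)
print(json.dumps(message, indent=2))
```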
