Web Scraper
Extracts public contact data from company websites: emails, social media links, and phone numbers. GDPR-aware (AVG): scraping is always opt-in, robots.txt is respected by default, and requests use an identifiable User-Agent.
⚡ Built-in — No Setup Required
This integration is built into the WebsitePublisher platform. All endpoints are available immediately — no API key needed.
Endpoints (2)
POST
scrape
Extracts public contact data from a company website. Scrapes the homepage and common contact page paths (such as /contact, /over-ons, /about) and returns the emails, social media links, and phone numbers it finds. GDPR (AVG) notice: only publicly available data is scraped. The user is responsible for any further processing of personal data found.
| Field | Type | Required | Description |
|---|---|---|---|
| url | url | ✓ Yes | Company website URL (e.g. "https://example.nl"). Must include https:// or http://. |
| include_contact_page | boolean | No | Also scrape common contact pages (/contact, /contactus, /over-ons, /about). Default: true. |
| respect_robots | boolean | No | Check and respect robots.txt before scraping. Default: true. |
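The fields above can be assembled into a request as in the following minimal sketch. It only builds the request and does not send it. The endpoint URL `SCRAPE_ENDPOINT` and the helper `build_scrape_request` are hypothetical, and the JSON request body is an assumption; this page does not state the actual endpoint address or body encoding.

```python
import json
from urllib.parse import urlparse
from urllib.request import Request

# Hypothetical placeholder: the real endpoint URL is not given on this page.
SCRAPE_ENDPOINT = "https://platform.example/integrations/web-scraper/scrape"


def build_scrape_request(url, include_contact_page=True, respect_robots=True):
    """Build (but do not send) a POST request for the scrape endpoint."""
    # The url field must include https:// or http://.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must include https:// or http://")

    # Both optional fields default to true, matching the table above.
    body = {
        "url": url,
        "include_contact_page": include_contact_page,
        "respect_robots": respect_robots,
    }
    # Assumption: the endpoint accepts a JSON body.
    return Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scrape_request("https://example.nl", include_contact_page=False)
print(request.get_method(), json.loads(request.data))
```

Validating the scheme on the client side avoids a round trip for the most likely input mistake, a bare domain such as `example.nl`.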
POST
robots-check
Check if a website allows scraping by reading its robots.txt. Returns allowed/disallowed status for our User-Agent.
| Field | Type | Required | Description |
|---|---|---|---|
| url | url | ✓ Yes | Website URL to check robots.txt for. |
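To illustrate what an allowed/disallowed decision for a User-Agent means, here is a local sketch using Python's standard `urllib.robotparser`. It is not the platform's implementation. The User-Agent token `ExampleScraperBot`, the helper `is_allowed`, and the sample robots.txt content are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent token; the platform's actual value is not stated here.
USER_AGENT = "ExampleScraperBot"


def is_allowed(robots_txt, page_url, user_agent=USER_AGENT):
    """Return True if the given robots.txt text permits user_agent to fetch page_url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Made-up robots.txt content for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.nl/contact"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.nl/private/team"))
```

With these sample rules, the contact page is permitted and anything under `/private/` is not.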
MCP Tool Names
When using this integration through an AI assistant (Claude, ChatGPT, Cursor, etc.), the endpoints are available as MCP tools:
| Endpoint | MCP Tool Name |
|---|---|
| scrape | web-scraper_scrape |
| robots-check | web-scraper_robots_check |
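An AI assistant normally issues these tool calls for you. For reference, the sketch below builds the JSON-RPC message that the Model Context Protocol uses for a `tools/call` request. The helper `build_tool_call` is hypothetical, and it assumes the tool arguments mirror the endpoint fields documented above, which this page does not confirm.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build an MCP tools/call message as a JSON-RPC 2.0 request dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumption: argument names match the scrape endpoint's fields.
message = build_tool_call("web-scraper_scrape", {"url": "https://example.nl"})
print(json.dumps(message, indent=2))
```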