Vetcare Scraping Service

A fast HTTP API that reads public veterinary clinic pages and summarizes their key information.

What it does

Give the API a clinic website URL. It loads the page, reads the visible text, follows helpful links like "Services" or "Team", and returns a concise summary: name, contacts, accepted pets, services, team roles, and more.
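As a rough illustration of the link-following step, the sketch below collects anchors from a page and keeps those whose visible text looks helpful. The hint words, the `LinkCollector` class, and the `helpful_links` function are hypothetical names chosen for this example; the service's actual selection rules are not documented here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical hint words; the real service may use a different list.
HELPFUL_HINTS = ("services", "team", "contact", "about")


class LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def helpful_links(html, base_url):
    """Return absolute URLs of links whose text contains a hint word."""
    parser = LinkCollector()
    parser.feed(html)
    found = []
    for href, text in parser.links:
        label = text.lower()
        if any(hint in label for hint in HELPFUL_HINTS):
            found.append(urljoin(base_url, href))
    return found
```

Matching on the visible link text rather than the URL keeps the filter readable, but it would miss icon-only links; a fuller version might also inspect the href itself.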

Results are cached for speed. You can force a fresh read when needed.
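The caching behavior can be pictured with a minimal in-memory sketch. The `PageCache` class, its one-hour expiry, and the `fetch` callback are assumptions made for illustration; the service's real cache backend and expiry policy are not described here.

```python
import time


class PageCache:
    """Tiny in-memory cache keyed by URL (illustrative only)."""

    def __init__(self, ttl_seconds=3600.0):
        # The TTL is an assumed value, not a documented setting.
        self.ttl = ttl_seconds
        self._store = {}

    def get_or_fetch(self, url, fetch, use_cache=True):
        now = time.monotonic()
        entry = self._store.get(url)
        # Serve from cache only when allowed and the entry is still fresh.
        if use_cache and entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        # Otherwise do a fresh read and remember the result.
        value = fetch(url)
        self._store[url] = (now, value)
        return value
```

Note that a forced fresh read still updates the stored entry, so later cached requests see the newest result.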

How it works

1. Your request: URL + use_cache
2. Smart fetch: HTTP/2, pooling, timeouts
3. Understand the page: Readability + links
4. Detect pets & services: API keywords + local terms
5. Team & contact: Phone, socials, roles
6. Response: JSON (orjson) + gzip
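The pet and service detection step can be sketched as a keyword lookup over the page text. The keyword table below is a hypothetical stand-in for what the Vetcare API and local term lists would supply, and `detect` is an illustrative function name, not part of the service.

```python
import re

# Hypothetical keyword table mixing English and Finnish terms;
# in the real service these would come from the Vetcare API.
PET_KEYWORDS = {
    "dog": ["dog", "dogs", "koira", "koirat"],
    "cat": ["cat", "cats", "kissa", "kissat"],
}


def detect(text, keywords):
    """Return the sorted labels whose terms appear as whole words in text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sorted(
        label
        for label, terms in keywords.items()
        if any(term in words for term in terms)
    )
```

Whole-word matching avoids false hits such as "cat" inside "location", at the cost of missing inflected forms that are not listed, which is why a richer keyword list improves results.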

What information we extract

Request/Response sequence

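Participants: Client, Scraper API, Clinic website, Vetcare API

1. Client to Scraper API: POST /scrape (url, use_cache)
2. Scraper API to Clinic website: fetch HTML (cached if allowed)
3. Clinic website to Scraper API: HTML
4. Scraper API to Vetcare API: get pets/services keywords
5. Vetcare API to Scraper API: IDs + keywords
6. Scraper API to Client: JSON summary (gzip + orjson)

Parsing and detection run in parallel for speed.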
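One way to run two extraction steps side by side is a small thread pool, sketched below. The functions `extract_phones`, `detect_pets`, and `summarize`, the phone pattern, and the pet list are all assumptions for illustration; the service's own concurrency model is not specified in this document.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def extract_phones(text):
    """Find phone-like digit runs (a deliberately loose, assumed pattern)."""
    return re.findall(r"\+?\d[\d\s-]{6,}\d", text)


def detect_pets(text):
    """Report which of a few example pet words occur in the text."""
    lowered = text.lower()
    return [pet for pet in ("dog", "cat", "rabbit") if pet in lowered]


def summarize(text):
    """Run both steps concurrently and merge their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        phones = pool.submit(extract_phones, text)
        pets = pool.submit(detect_pets, text)
        return {"phones": phones.result(), "pets": pets.result()}
```

Threads pay off mainly when a step waits on I/O, such as requesting keywords from another API; for purely CPU-bound parsing in Python the gain is limited.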

Flow overview

1. Input URL (use_cache? yes/no)
2. Fetch & parse HTML
3. Extract content
4. Detect pets/services
5. Build JSON response
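The final step, building a compressed JSON response, can be sketched with the standard library. The service itself names orjson for serialization; the `json` module is swapped in here only so the example has no third-party dependency, and both function names are hypothetical.

```python
import gzip
import json


def encode_response(summary):
    """Serialize a summary dict to UTF-8 JSON and gzip it."""
    body = json.dumps(summary, ensure_ascii=False).encode("utf-8")
    return gzip.compress(body)


def decode_response(payload):
    """Reverse of encode_response, as a client would do."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))
```

Using `ensure_ascii=False` keeps non-ASCII clinic names compact in the payload instead of expanding them into escape sequences.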

How to use

Endpoint: POST /scrape
Auth: x-api-key: 12345
Request: { "url": "https://example.com", "use_cache": true }
Fresh read: use_cache: false forces a re-fetch
Typical time: ~0.5s cached · ~4–5s fresh

Example:
curl -X POST 'http://localhost:9001/scrape' \
  -H 'accept: application/json' \
  -H 'x-api-key: 12345' \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://www.helsinki.fi/fi/yliopistollinen-elainsairaala","use_cache":false}'
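The same call can be made from Python using only the standard library. This is a minimal sketch: the host, port, and API key mirror the example values in this section, while the function names and the 30-second timeout are assumptions.

```python
import json
import urllib.request


def build_scrape_request(clinic_url, use_cache=True,
                         base_url="http://localhost:9001", api_key="12345"):
    """Build the POST /scrape request without sending it."""
    payload = json.dumps({"url": clinic_url, "use_cache": use_cache}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/scrape",
        data=payload,
        method="POST",
        headers={
            "accept": "application/json",
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
    )


def scrape(clinic_url, use_cache=True):
    """Send the request and return the parsed JSON summary."""
    request = build_scrape_request(clinic_url, use_cache)
    # Timeout is an assumed value, chosen to exceed the typical fresh read.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())
```

Calling `scrape("https://example.com", use_cache=False)` against a running instance would force a fresh read, matching the curl example.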

Tip: Update Vetcare API keywords to improve detection across languages.