E-Commerce / SaaS · A pricing-intelligence platform
Daily price tracking across 12 marketplace catalogues
Replaced an ageing in-house crawler with a 12-source pipeline that runs nightly and feeds a Postgres warehouse.
Coverage uplift: 62% → 99.4%
Daily records: ~2.1M
Manual patch hours per week: 14h → 0h
The challenge
The customer's previous crawler had grown into 3,000 lines of Python that broke whenever a source site shipped a layout change. Coverage had silently dropped to ~62% across their 12 priority marketplaces, and their analytics team was patching CSVs by hand each morning.
What we built
We rebuilt the pipeline as 12 isolated Playwright projects with shared selector definitions, residential proxy rotation, automatic retry on selector drift, and a daily cron schedule running on Apify Cloud. Each run writes deltas straight into the customer's Postgres warehouse, with a versioned snapshot kept on S3.
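The shared-selector and drift-retry pattern can be sketched as follows. This is a minimal illustration, not the production code: the `SelectorSet` and `extract_field` names are hypothetical, and the page lookup is abstracted to a plain callable so the sketch stays dependency-free (in the real pipeline it would wrap a Playwright query such as `page.query_selector`).

```python
from dataclasses import dataclass


@dataclass
class SelectorSet:
    """One logical field with several candidate CSS selectors, newest first.

    Shared across all 12 scraper projects so a selector fix lands everywhere.
    """
    field_name: str
    candidates: list


def extract_field(query, selectors: SelectorSet) -> str:
    """Try each candidate selector until one matches.

    `query` is any callable mapping a CSS selector to matched text or None.
    Falling through the candidate list is what "selector drift" means here:
    the page changed and no known selector matches any more.
    """
    for css in selectors.candidates:
        value = query(css)
        if value is not None:
            return value
    raise LookupError(f"selector drift: no candidate matched {selectors.field_name!r}")


# Usage with a dict standing in for a rendered page:
price = SelectorSet("price", ["span.price-v2", "span.price"])
dom = {"span.price": "£9.99"}  # old markup still live on this source
print(extract_field(dom.get, price))
```

When every candidate fails, the `LookupError` is what the pipeline's retry and alerting logic would catch, rather than silently emitting an empty record.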
The outcome
Coverage now sits at 99.4% across the 12 sources. The analytics team has stopped patching files entirely; layout-drift alerts route to Slack, and the engineer on rotation patches the affected scraper the same day.
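A layout-drift alert like the one described above can be sketched with Slack's standard incoming-webhook shape (a JSON body with a `text` field, POSTed to a webhook URL). The function names and message wording here are illustrative assumptions, not the customer's actual alerting code.

```python
import json
from urllib import request


def drift_alert_payload(source: str, selector: str, page_url: str) -> dict:
    """Build the Slack incoming-webhook body for one drift event."""
    return {
        "text": (
            f":warning: Layout drift on *{source}*\n"
            f"Selector `{selector}` matched nothing at {page_url}"
        )
    }


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON to a Slack incoming webhook; returns HTTP status."""
    req = request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.status
```

Routing the alert through a webhook keeps the scrapers themselves stateless: they only report the failed selector and URL, and the engineer on rotation picks the fix up from the channel.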