Looking to hire me? See Services.
Coming soon
First posts queued, in order of priority. If a topic here is what you're wrestling with, just email me - I'll send notes ahead of the writeup.
Why most scraping projects fail
The recurring failure modes I see across audits: source choice before architecture, cost-per-document never measured, anti-bot strategy that can't survive a single vendor change, and "scrape now, structure later" pipelines that never get structured. A field guide based on a decade of production systems.
Crawl economics: what a page actually costs you
Most teams price scraping at proxy cost. The real cost is proxy + compute + retry amplification + anti-bot vendor + human babysitting + data-quality cleanup + replacement when the source breaks. A model for cost-per-acquired-document and where the actual margins live.
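To make the cost terms above concrete, here is a rough sketch of how they might roll up into a single cost-per-acquired-document figure. It is not the model from the upcoming post: the class and function names, the split into per-request, labor, and fixed costs, and the assumption that retry amplification is roughly 1 / success rate (independent retries until success) are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class CrawlCosts:
    """Hypothetical monthly cost inputs, all in the same currency."""
    proxy_per_request: float           # proxy spend per HTTP request
    compute_per_request: float         # fetch + render + parse compute per request
    antibot_vendor_monthly: float      # flat fee for an anti-bot / unblocker vendor
    babysitting_hours: float           # human time spent keeping crawlers alive
    cleanup_hours: float               # human time spent fixing data quality
    hourly_rate: float                 # loaded cost of one human hour
    source_replacement_monthly: float  # amortized cost of rebuilding broken sources


def cost_per_document(costs: CrawlCosts, documents: int, success_rate: float) -> float:
    """Estimate the fully loaded cost of one acquired document.

    Assumes each failed request is retried until it succeeds, so the expected
    number of requests per document is 1 / success_rate.
    """
    if documents <= 0:
        raise ValueError("documents must be positive")
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")

    requests = documents / success_rate  # retry amplification
    variable = requests * (costs.proxy_per_request + costs.compute_per_request)
    labor = (costs.babysitting_hours + costs.cleanup_hours) * costs.hourly_rate
    fixed = costs.antibot_vendor_monthly + costs.source_replacement_monthly
    return (variable + labor + fixed) / documents


def proxy_only_cost(costs: CrawlCosts) -> float:
    """The naive figure: proxy spend per document, ignoring everything else."""
    return costs.proxy_per_request


# Made-up numbers, purely for illustration.
example = CrawlCosts(
    proxy_per_request=0.001,
    compute_per_request=0.0005,
    antibot_vendor_monthly=500.0,
    babysitting_hours=10.0,
    cleanup_hours=20.0,
    hourly_rate=50.0,
    source_replacement_monthly=250.0,
)
loaded = cost_per_document(example, documents=1_000_000, success_rate=0.5)
naive = proxy_only_cost(example)
print(f"naive: {naive:.5f}  loaded: {loaded:.5f}  ratio: {loaded / naive:.2f}x")
```

With these invented inputs the loaded figure comes out several times the proxy-only figure, and the gap grows as the success rate drops, which is the kind of difference the proxy-only view hides.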
When Bright Data is the wrong solution
Managed scraping vendors are good at exactly the problems they're designed for and quietly terrible at the rest. A decision framework for build vs. buy on proxies, scraping APIs, and full managed data feeds - written from the buyer side, not the vendor side.
Why RAG fails without an acquisition strategy
"We'll just embed everything" is not an acquisition strategy. Why most AI products plateau on data quality and what a real ingestion pipeline looks like - source selection, extraction tier, structure-first vs. structure-later, refresh cadence, and the part nobody talks about: continuous coverage.
Get notified
No newsletter system yet - just email me and I'll add you to a small "ship list" that gets a one-line note when each log goes live.