Andrew Harris

Data Acquisition & Engineering Leader

Selected Projects

Open-source projects demonstrating expertise in web data acquisition, agentic AI systems, and full-stack development.

Serpopotamus

Enterprise-grade Google Search and Maps SERP scraping platform. Features a Flask-based web UI for managing scrape jobs, PostgreSQL for persistent storage, intelligent rate limiting and proxy rotation, and batch CSV processing for bulk operations. Demonstrates the same architectural patterns used to build ZoomInfo's SERP infrastructure, which handles 500M+ queries per month.

Python, Flask, PostgreSQL, SERP API, Rate Limiting, Proxy Management
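
The project description does not say how Serpopotamus limits its request rate. As an illustration only, one common approach is a token bucket, sketched below. The class name `TokenBucket` and its `rate`, `capacity`, and `clock` parameters are hypothetical and are not taken from the project.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second.

    Hypothetical sketch; not the project's actual implementation.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return True on success, False otherwise."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill in proportion to elapsed time, never exceeding the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A scrape worker would call `try_acquire()` before each request and wait or requeue the job when it returns `False`.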

Courtwise

Agentic RAG platform designed to make legal information accessible. Combines autonomous web crawling with semantic search to index and query legal documents. Uses LLM-powered extraction to identify relevant statutes, case law, and procedural information, then surfaces answers through a conversational interface. Built to demonstrate production-ready agentic AI architecture.

Python, LangChain, RAG, Vector DB, LLM, Semantic Search, Web Crawling

Uktena

Universal web scraper with a 3-tier hybrid fallback strategy for maximum reliability. Combines stealth browser automation with residential proxy rotation and intelligent request fingerprinting. Designed to handle sites with aggressive bot detection while maintaining ethical scraping practices and rate limits.

Python, Playwright, Stealth Browser, Residential Proxies, Fingerprinting
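
The description does not name Uktena's three tiers, so the sketch below only illustrates the general shape of a tiered fallback: try the cheapest strategy first and escalate on failure. The function `fetch_with_fallback`, the `Fetcher` type, and the tier names in the usage note are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a URL and returns the page HTML, or None/"" if it got nothing useful.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, tiers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) tier in order.

    Returns (tier_name, html) from the first tier that yields content.
    Raises RuntimeError listing every tier's failure if none succeed.
    """
    errors = []
    for name, fetcher in tiers:
        try:
            html = fetcher(url)
        except Exception as exc:  # a blocked or failed tier should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if html:
            return name, html
        errors.append(f"{name}: empty response")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

In practice the tiers might be ordered as a plain HTTP request, then a stealth browser session, then the same browser routed through a residential proxy, so that the more expensive strategies run only when the cheaper ones are blocked.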

Yelptopus

Yelp business data extraction platform with deep pagination support. Features a Flask web UI for configuring and monitoring scrape jobs, PostgreSQL storage with deduplication, and real-time progress tracking. Handles Yelp's pagination limits through intelligent cursor management.

Python, Flask, PostgreSQL, Deep Pagination, Real-time Tracking

Catchmark

Full-stack fishing spot discovery platform for anglers. React/TypeScript frontend with interactive maps, Express.js backend API, and PostGIS-powered geospatial queries for location-based search. Demonstrates modern full-stack architecture with spatial data handling.

React, TypeScript, Express.js, PostgreSQL, PostGIS, Geospatial