Full-Stack Data Acquisition Engineer
Overview
Headquartered in Los Angeles, California, Right Balance provides top-tier technology talent for innovative companies in the US. We’re in the top 50 companies to watch in LA.
Engagement Details
The client is an AI-driven competitive intelligence platform built for modern e-commerce brands. They provide real-time visibility into market shifts by tracking hundreds of thousands of brands, retailers, and marketplaces across categories and channels. Their platform delivers insights on pricing, promotions, creative strategy, merchandising, and overall site experience, helping brands make faster and smarter decisions. Backed by experienced ecommerce operators and strong funding, they’re building a category-defining product at the intersection of AI and commerce intelligence.
The Role
We’re looking for a full-stack engineer who specializes in large-scale, resilient web scraping. Your job is to own our data acquisition systems end-to-end: architecture, development, anti-bot evasion, proxy strategy, observability, and ongoing operational excellence.
This is not a “learn scraping on the job” role. We need someone who has deeply lived in anti-bot environments, shipping scrapers that survive behind Cloudflare, Akamai, PerimeterX, and the many smaller, nastier systems used across retail.
You will work closely with the CTO and platform team to scale our ingestion engine safely, predictably, and with high fidelity across tens of thousands of domains.
What You’ll Do
Scraper Engineering
- Build and maintain high-reliability scrapers for eCommerce product pages, category pages, promotions, and landing pages.
- Architect multi-step flows (e.g., pagination, dynamic rendering, multi-URL extraction, login/checkout flows).
- Implement sophisticated anti-bot evasion strategies: browser fingerprinting, TLS/client-hello tuning, header randomization, cookie management, session persistence, rendering strategies, etc.
Anti-Bot & Proxy Strategy
- Design and manage large proxy pools (residential, ISP, mobile).
- Choose when to use headless browsers, stealth browsers, or HTML-only extraction.
- Implement robust retry, fallback, and fingerprint-rotation logic.
Captcha & Hard-Target Handling
- Integrate captcha solving (local models, external solvers, hybrid approaches).
- Engineer detection logic for Cloudflare challenges, bot traps, geometric challenges, and behavioral detection patterns.
Platform & Infrastructure
- Deploy scrapers as serverless functions, containers, or distributed workers on AWS/GCP.
- Implement job orchestration, rate limiting, circuit breakers, and adaptive throttling.
- Build real-time monitoring for failures, block rates, response anomalies, and deltas in structured outputs.
Data Output Quality & Reliability
- Match the expected output format for ingestion into data warehouses such as Databricks or similar.
- Ensure extracted content (HTML, screenshots, product data) meets strict accuracy and freshness SLAs.
- Collaborate with AI and app teams to tune extraction schemas, detect regressions, and scale ingestion capacity.
Why Join Now
- Join a well-funded AI startup defining the new Market and Competitive Intelligence category.
- Work directly with experienced founders who’ve built and scaled leading ecommerce tech companies.
- Competitive compensation, benefits, equity, and a flexible remote culture.
- We bring the full team together 2-3 times per year to connect and keep our remote culture strong.
Availability: Full-time.
What’s in it for you
- Learn and evolve your skills using the latest and greatest technology tools in a rapidly growing company.
- Learn from the best people around you. We constantly challenge the status quo and invent new ways of building a great product.
- 100% remote. Work from anywhere, whether from the comfort of your home, a shared co-working space, an RV on the beach, or as a nomad in another country.
- Work on challenging problems, innovate, and positively impact many people's lives while having fun doing it.
Required Qualifications
- Upper-intermediate to fluent spoken and written English; able to hold a real-time conversation.
- 5+ years of full-time, hands-on scraping experience or 5+ years of QA automation experience.
- 5+ years building high-volume scrapers or crawling systems (Python, TypeScript/Node, or Go).
- Hands-on experience bypassing modern anti-bot systems (Cloudflare, Akamai, PerimeterX, DataDome, etc.).
- Deep understanding of browser automation (Playwright/Puppeteer/Selenium), stealth plugins, and fingerprinting.
- Expert-level proxy handling: residential, mobile, ISP, rotation strategies, reputation management.
- Experience solving or bypassing complex captcha systems.
- Strong AWS or GCP experience: Lambda/Cloud Run, S3/GCS, EventBridge/Cloud Scheduler, queues, logging, tracing.
- Strong full-stack skills (you can write the scrapers and the systems that run and monitor them).
- Opinionated ideas about reliability, scaling, testing, and operational excellence.
Nice to haves
- Experience building scrapers on top of Playwright clusters, Browserless, Puppeteer clusters, or Splash-like renderers.
- Prior experience with eCommerce data extraction at scale.
- Experience speed-optimizing scrapers and minimizing COGS through architectural choices.
- Familiarity with vector databases (Qdrant), Supabase/Postgres, or multi-agent AI workflows.
- Bachelor’s degree in Computer Science or equivalent demonstrated ability.
Frequently Asked Questions
What are your typical clients?
The majority of our clients are venture-backed startups at the growth stage. Usually, at this stage, the company has already achieved product-market fit and is looking to expand rapidly. That’s where we bring the best engineering practices, strong architecture, the latest technologies, and consistent processes to help companies scale.
What is the length of your engagements?
Most of our long-term full-time engagements last multiple years. This allows you to grow your career with the client company, taking on more responsibilities over time.
What’s your company size?
The Right Balance team is 55+ engineers, growing to 75+ by the end of the year. The client’s current team is 5+ people. The timing is great to be a part of a rapidly growing team making meaningful contributions.
What happens if the engagement is completed?
Most of our engagements are long-term in nature. That said, if the current engagement is ramping down, we’ll present you with more long-term opportunities to transition into.
What are your core values?
Client First: we only win when our clients win. We treat client challenges as our own.
Ownership: we embrace responsibility, taking on challenges, getting them to completion, and enjoying getting things done.
Quality: we’re passionate about achieving quality outcomes by applying meticulous attention to detail.

