Web Scraping Engineer - Remote Eligible

Design and implement a large-scale web crawling system for data acquisition.
Remote
Mid-Level
4 weeks ago

✨ About The Role

- The Web Scraping Engineer will be responsible for designing and implementing the architecture of a large-scale crawling system.
- The role involves maintaining various components of the data acquisition infrastructure, including building new crawlers and maintaining existing ones.
- The engineer will develop tools to facilitate scraping at scale and monitor the health of crawlers to ensure data quality.
- Collaboration with product and business teams is necessary to understand and anticipate requirements for data gathering systems.
- The position requires maintaining all aspects of a scraping pipeline, from building and maintaining spiders to monitoring their health and performance.

⚡ Requirements

- The ideal candidate will have over 3 years of experience with Python, particularly in data wrangling and cleaning.
- A strong background in data crawling and scraping at scale, with experience managing over 100 spiders, is essential.
- Familiarity with various scraping libraries and monitoring tools, such as BeautifulSoup and Selenium, is highly recommended.
- The candidate should possess experience in bypassing bot detection techniques and protecting web scrapers against common issues like site bans and IP leaks.
- Experience with cloud environments like GCP or AWS, as well as containerization tools like Docker, is important for this role.

Engineering
About Legalist
Tech-enabled asset management firm