Data Crawling Engineer
3 weeks ago
Join Our Team as a Data Crawling Engineer
Are you an engineer passionate about leveraging AI to tackle real-world challenges? We're a leading player in the AI space, backed by prominent investors and driven by a commitment to innovation. Our team is dedicated to pushing the boundaries of what's possible with AI, and we're looking for motivated individuals to join us in our mission.
About Us
At our company, we're all about harnessing the power of AI to deepen the partnership between humans and computers. Our approach spans the entire AI stack, from user experience to optimizing models for maximum efficiency.
The Data Acquisition (Crawler) Engineer will develop and maintain the systems that collect, store, and process data from various sources. Your primary responsibility will be to design, build, and maintain efficient, reliable web crawlers and data acquisition systems that support our model training.
Responsibilities:
Architecting and building a large-scale distributed web crawler system.
Designing and implementing web crawlers and scrapers to automatically extract data from websites, addressing challenges such as dynamic content and scaling to handle large data volumes.
Developing data acquisition pipelines to ingest, transform, and store substantial volumes of data efficiently.
Creating a highly scalable system and optimizing crawler performance for maximum efficiency.
Monitoring and troubleshooting crawler activities to detect and resolve issues promptly.
Collaborating closely with the data infrastructure team and data researchers to enhance the quality of the data.
What We Are Looking For:
Previous experience with large-scale web crawling is essential for this role.
5 years of experience in data-intensive applications and distributed systems.
Proficiency in high-performance programming languages such as Go, Rust, or C++.
Strong understanding of orchestration and containerization frameworks like Docker and Kubernetes.
Experience building on GCP or AWS services.
Keywords: Data Crawling, Web Crawling, AWS, C++, Docker, Kubernetes