Sr DevOps Engineer

2 weeks ago


Anaconda, United States Devopshunt Full time

Job Details
Job Type: Full-time
Location: Remote

Full Job Description

About Anaconda
With more than 45 million users, Anaconda is the most popular operating system for AI, providing access to the foundational open-source Python packages used in modern AI, data science, and machine learning through a seamless platform. Our enterprise-grade solutions enable corporate, research, and academic institutions around the world to harness the power of open source for competitive advantage, groundbreaking research, and a better world. To learn more, visit Anaconda.

Here is what people love most about working here: We’re not just a company, we’re part of a movement. Our dedicated employees and user community are democratizing data science and creating and promoting open-source technologies for a better world.

Summary:
Anaconda is seeking a talented Senior DevOps Engineer to join our rapidly growing company. This is an excellent opportunity for you to leverage your experience and skills and apply them to the world of data science, artificial intelligence, and machine learning.

What You'll Do:
- Design and implement scalable AWS infrastructure, with particular focus on Lambda functions, RDS, and message bus architectures.
- Build and maintain robust MLOps pipelines for deploying and monitoring LLMs in production environments.
- Develop and optimize real-time communication systems using WebSockets and WebRTC for ML inference services.
- Create and maintain Python packages with C extensions, focusing on performance optimization and reliability.
- Design and implement comprehensive monitoring and telemetry systems across our infrastructure.
- Manage and optimize Kubernetes clusters for ML workloads, ensuring efficient resource utilization and high availability.
- Architect and maintain efficient CI/CD pipelines for both infrastructure and application deployments.
- Collaborate with AI and research teams to understand and implement infrastructure requirements for new ML models and features.
- Optimize system performance and cost efficiency across our AWS infrastructure.
- Lead technical discussions and provide expertise in infrastructure and deployment strategies.
- Implement and maintain security best practices across our infrastructure.
- Participate in on-call rotations and lead incident response efforts when necessary.

What You Need:
- 7+ years of software engineering experience, with at least 4 years focused on infrastructure and DevOps.
- Deep expertise with AWS services, particularly Lambda, RDS, and message bus architectures.
- Strong experience with Kubernetes in production environments.
- Extensive experience building and maintaining production ML deployment pipelines.
- Expert-level Python programming skills and experience building Python packages.
- Proven experience with C/C++ programming, particularly in building Python extensions.
- Strong background in WebSocket and WebRTC implementations.
- Demonstrated experience with monitoring and telemetry systems.
- Experience with high-performance, distributed systems.
- Strong understanding of security best practices in cloud environments.
- Bachelor's degree in Computer Science, Engineering, or related field.
- Experience with CI/CD pipelines and infrastructure automation.
- Proven track record of optimizing system performance and reliability.
- Team attitude: “I am not done until WE are done.”
- Embody our core values: Great People, Great Product, Great Performance.
- Care deeply about fostering an environment where people of all backgrounds and experiences can flourish.

What Will Make You Stand Out:
- Experience with the Rust programming language.
- Knowledge of WASM deployments and optimization.
- Experience with Llama.cpp or similar ML optimization frameworks.
- Contributions to open-source infrastructure or MLOps tools.
- Experience with large-scale LLM deployments.
- Advanced degree in Computer Science or related field.
- Experience with multi-region AWS deployments.
- Background in network optimization and protocols.
- Track record of building developer tools and platforms.
- Experience working in a fast-paced startup environment.
- Experience working in an open-source, AI, or data science-oriented company.

Why You'll Like Working Here:
- Unique opportunity to translate strong open-source adoption and user enthusiasm into commercial product growth.
- Dynamic company that rewards high-performers.
- On the cutting edge of enterprise application of data science, machine learning, and AI.
- Collaborative team environment that values multiple perspectives and clear thinking.
- Employees-first culture.
- Medical, Dental, Vision, HSA, Life, and 401K.
- Paid parental leave for both parents.
- Monthly productivity stipend.
- Quarterly Snake days (company-wide bonus day off).
- 100% remote.

An Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or protected veteran status and will not be discriminated against on the basis of disability.

This job post expires 30 days from its original post date.