Cloud Engineering Infrastructure Development Lead
6 days ago
Job Summary: We are seeking a highly skilled and experienced Senior Principal Software Engineer to join our Cloud Engineering Infrastructure Development team at Oracle. As a key member of our team, you will be responsible for designing, developing, and performance tuning the networking stack required to run distributed AI/ML/HPC workloads across thousands of GPUs.
Key Responsibilities:
- Design and develop high-performance networking systems for AI/ML/HPC workloads
- Collaborate with cross-functional teams to integrate networking solutions with distributed systems
- Develop and maintain collective communications libraries and GPU frameworks
- Troubleshoot and optimize network performance for large-scale distributed systems
- Stay up-to-date with industry trends and emerging technologies in cloud networking and AI/ML/HPC
Requirements:
- 10+ years of experience in software development with a focus on high-performance networking and distributed systems
- 3+ years of experience with RDMA over InfiniBand networks, including setup, troubleshooting, tuning, and scaling
- 3+ years of experience with collective communications libraries and GPU frameworks
- Proficient in data structures, algorithms, and operating systems
- Excellent organizational, verbal, and written communication skills
- Bachelor's degree in Computer Science or a related engineering field
Preferred Qualifications:
- Master's or PhD in Computer Science or a related engineering field
- Experience with distributed workload managers like Slurm or Kubernetes (K8s)
- Experience with ML training frameworks like PyTorch or TensorFlow
- Experience with Linux performance tools
- Experience in SDN, NFV, cloud networking, and infrastructure-as-a-service
What We Offer:
- Competitive salary range: $96,800 to $251,600 per annum
- Eligibility for bonus, equity, and compensation deferral
- Comprehensive benefits package, including medical, dental, and vision insurance, short-term and long-term disability, life insurance, and more
-
Software Engineer
2 weeks ago
Santa Clara, California, United States Oracle Full time
Software Engineer - Cloud Engineering Infrastructure Development: Oracle is seeking a skilled Software Engineer to design, develop, and troubleshoot software programs for various purposes, including file storage, databases, applications, and tools networks. Key Responsibilities: Collaborate with cross-functional teams to define and develop software for tasks...
-
Lead Software Engineer, Cloud Infrastructure
2 weeks ago
Santa Clara, California, United States NVIDIA Full time
NVIDIA is seeking talented engineers to enhance its AI Infrastructure. We are looking for individuals with a robust programming foundation, profound knowledge of distributed systems, and a strong grasp of software testing and deployment methodologies. Excellent communication and organizational skills are essential. We value innovative thinkers who can...
-
Cloud Infrastructure Architect
6 days ago
Santa Clara, California, United States Palo Alto Networks Full time
About the Role: Palo Alto Networks is seeking a highly skilled Senior Principal Cloud Reliability Engineer to join our team. As a key member of our cloud infrastructure team, you will be responsible for designing, building, and operating reliable and secure cloud infrastructure. Key Responsibilities: Contribute to the success of our cloud infrastructure team by...
-
Cloud Infrastructure Architect
4 days ago
Santa Clara, California, United States eTeam Full time
Job Description. Job Title: Cloud Infrastructure Architect. Location: Remote (with occasional travel). Job Type: Full-time. About eTeam: eTeam is a leading provider of cloud-based solutions, dedicated to delivering innovative and secure infrastructure to our clients. Job Summary: We are seeking an experienced Cloud Infrastructure Architect to join our team. The ideal...
-
Senior Cloud Infrastructure Software Engineer
2 weeks ago
Santa Clara, California, United States NVIDIA Full time
The NVIDIA GPU Cloud (NGC) team is seeking experienced software engineers to develop NVIDIA's advanced compute cloud solutions. These solutions encompass software for managing hardware and network provisioning to create a multi-tenant infrastructure. As a software engineer, you will collaborate with fellow engineers, product architects, and product managers...
-
Cloud Infrastructure Engineer
2 weeks ago
Santa Ana, California, United States Rancho Santiago Community College District Full time
Position Title: Cloud Infrastructure Engineer. Overview: The Cloud Infrastructure Engineer is responsible for overseeing the district's cloud computing resources, focusing on Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) solutions. This role requires a deep understanding of cloud architecture, security protocols, and operational...
-
Senior Cloud Reliability Engineer
3 days ago
Santa Clara, California, United States Geospatial And Cloud Analytics Inc Full time
About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our team at Geospatial And Cloud Analytics Inc. As a key member of our engineering team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale cloud infrastructure. Key Responsibilities: Design and implement...
-
Senior Cloud Infrastructure Architect
2 weeks ago
Santa Clara, California, United States Astera Labs Full time
Astera Labs stands at the forefront of innovative connectivity solutions, enabling the full potential of AI and cloud infrastructure. Our Intelligent Connectivity Platform seamlessly integrates PCIe, CXL, and Ethernet semiconductor-based solutions alongside the COSMOS software suite, delivering a software-defined architecture that is both scalable and...
-
Cloud Infrastructure Specialist
7 days ago
Santa Clara, California, United States NVIDIA Full time
Job Summary: NVIDIA is seeking a highly skilled Senior Cloud Engineer to join its Infrastructure, Planning and Processes organization. As a Senior Cloud Engineer, you will be part of a fast-paced team that develops and maintains NVIDIA's internal cloud provisioning product for GPUs and Tegra systems. Key Responsibilities: Design and implement scalable, resilient...
-
Cloud Infrastructure Architect
2 weeks ago
Santa Clara, California, United States TechStar Group Full time
Job Title: Cloud Infrastructure Architect. Job Summary: We are seeking a highly skilled Cloud Infrastructure Architect to join our team at TechStar Group. As a key member of our infrastructure team, you will be responsible for designing, implementing, and managing our cloud infrastructure to ensure high levels of performance, availability, and security. Key...
-
Senior Cloud Engineer
4 days ago
Santa Clara, California, United States NVIDIA Full time
About the Role: NVIDIA is seeking a seasoned Cloud Engineer to join its fast-paced Infrastructure, Planning and Processes organization. As a Senior Cloud Engineer, you will be part of a dynamic team that develops and maintains NVIDIA's internal cloud provisioning product for GPUs and Tegra systems. Key Responsibilities: Design and implement scalable, resilient...
-
Cloud Security Engineering Manager
2 days ago
Santa Clara, California, United States Palo Alto Networks Full time
About the Role: We are seeking a highly experienced Senior Manager, Software Engineering to lead our Cloud Security Engineering team at Palo Alto Networks. As a key member of our engineering team, you will be responsible for driving and delivering our next-generation virtualization products and cloud security offerings. Key Responsibilities: Lead and expand a...
-
Reliable Cloud Infrastructure Specialist
7 days ago
Santa Clara, California, United States ServiceNow Full time
Job Description. Overview: The ServiceNow SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the ServiceNow cloud infrastructure. Our SREs are empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in...
-
Senior Cloud Infrastructure Engineer
2 weeks ago
Santa Monica, California, United States GoodRx Full time
GoodRx serves as a pivotal healthcare marketplace in the United States, assisting millions of individuals each month in locating trustworthy health information and securing discounts on their healthcare expenses. Since its inception, GoodRx has facilitated savings of $60 billion for consumers, providing access to prescription discounts accepted at over...
-
Global Infrastructure Expansion Lead
4 days ago
Santa Clara, California, United States Cryptoware Technologies Inc Full time
Job Description. Job Summary: Cryptoware Technologies Inc is seeking a highly skilled Global Infrastructure Expansion Lead to join our team. As a key member of our engineering team, you will be responsible for leading the effort of global expansion of our globe-spanning infrastructure. Key Responsibilities: Lead the effort of global expansion of our globe-spanning...
-
Santa Clara, California, United States XPENG Motors Full time
About XPeng Motors: Xpeng Motors is a leading innovator in the electric vehicle industry, dedicated to designing, developing, and manufacturing cutting-edge smart electric vehicles that seamlessly integrate advanced Internet, AI, and autonomous driving technologies. Job Summary: We are seeking a highly skilled Senior Staff AI Infrastructure Site Reliability...
-
Senior IT Engineer
2 weeks ago
Santa Clara, California, United States OmniVision Technologies Full time
About OmniVision Technologies: We are a leading developer of advanced digital imaging solutions, providing a diverse culture that works together on the development of cutting-edge imaging technology, products, and solutions. Job Summary: We are seeking a highly skilled Senior IT Engineer to lead our cloud infrastructure team. The successful candidate will be...
-
Lead Cloud Software Engineer
2 weeks ago
Santa Clara, California, United States NVIDIA Full time
We are looking for a Lead Cloud Software Engineer to become a vital member of the DRIVE Sim Cloud team at NVIDIA. In this position, you will play a key role in shaping the future of autonomous vehicle technology. You will thrive in a fast-paced environment where creativity and challenging conventional methods are encouraged. Your proficiency in backend...
-
Lead Software Engineer
2 weeks ago
Santa Clara, California, United States Palo Alto Networks Full time
Job Overview. Company Overview: Palo Alto Networks is dedicated to safeguarding our digital existence. Our mission is to be the premier cybersecurity partner, ensuring a secure and safe environment for everyone. Vision: We envision a future where each day is more secure than the last. Our foundation is built on innovation and a commitment to redefining the...
-
Lead Software Engineer
1 week ago
Santa Clara, California, United States Palo Alto Networks Full time
Job Overview. Your Career Journey: Utilize your expertise in backend Java cloud engineering to contribute to cutting-edge cloud software and web applications. Join us in deploying and scaling the next generation of cloud security, leveraging big data and analytics. We are seeking a Principal Engineer to be part of the team dedicated to developing our latest cloud...