Staff Software Engineer, AI/ML Infrastructure, GPUs
2 days ago
- Bachelor's degree or equivalent practical experience.
- 8 years of experience in software development.
- 5 years of experience testing and launching software products, and 3 years of experience with software design and architecture.
- 5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware architecture.
- 5 years of experience with ML design and ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine tuning).
- Master's degree or PhD in Engineering, Computer Science, or a related technical field.
- 8 years of experience with data structures/algorithms.
- 3 years of experience in a technical leadership role leading project teams and setting technical direction.
- 3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
- Experience working with GPUs.
Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
As a part of the Cloud Graphics Processing Unit (GPU) team, you will be central to AI innovation, building and maintaining an industry-leading GPU fleet and AI Platform that empowers Google Cloud's training and inference customers with unparalleled computational resources. You will be working at the intersection of hardware, software, and applied AI, managing the entire life-cycle of GPU offerings, from launching new GPU families to ensuring optimal reliability and operational excellence, while pushing the boundaries of accelerated computing to deliver the foundational infrastructure that fuels AI advancements for a varied, rapidly evolving customer base.
Google Cloud accelerates every organization's ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google's cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
The US base salary range for this full-time position is $197,000-$291,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.
Responsibilities
- Drive technical roadmaps to ensure the Google Cloud GPU fleet remains industry-leading, while anticipating future needs and advancements in AI.
- Collaborate with internal and external partners to effectively execute on the mission of delivering unparalleled compute acceleration.
- Lead and drive operational excellence across the entire GPU fleet, ensuring reliability, performance, and scalability.
- Hire, grow, and mentor engineers to help build a strong and cohesive team that excels in a fast-paced environment.
- Advocate for the customer, collaborating with Google's AI customers to understand their needs and translate them into product and platform enhancements.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.