Site Reliability Engineer for AI Platform - San Jose, California - Adobe
1 month ago
About the Role
We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe and work on the AI Training Platform. As a key member of the team, you'll be responsible for ensuring the highest uptime and Quality of Service (QoS) for our customers.
Key Responsibilities
- Design and implement methodologies to increase reliability, scalability, security, and efficiency.
- Collaborate with cross-functional teams to define service level objectives (SLOs) and service level indicators (SLIs) that represent and measure service quality.
- Develop and maintain globally distributed, multi-cloud environments to support our AI platform.
- Automate common, repeatable tasks at large scale to streamline operational procedures.
- Identify areas to improve service resiliency through techniques such as chaos engineering and performance/load testing.
Requirements
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, and 5+ years of relevant industry experience.
- Experience building and scaling distributed systems, as well as experience with containerization and orchestration technologies like Kubernetes.
- Production-level expertise with container orchestration engines and a proven understanding of modern continuous development techniques and pipelines.
- Fundamental programming skills, ideally with practical experience in one (and preferably both) of the following languages: Python, Go.
- Good knowledge of infrastructure configuration management tools like Ansible and Terraform.
- Experience with observability and tracing tools like InfluxDB, Prometheus, and the Elastic Stack.
- An understanding of AI/ML, including ML frameworks, public cloud, and commercial AI/ML solutions.
About Adobe
At Adobe, we're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen. We're committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.
Compensation and Benefits
Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $124,000 to $234,200 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience.
At Adobe, we're proud to be an Equal Employment Opportunity and affirmative action employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.