Site Reliability Engineer
1 month ago
Our client, a Series A startup in the Generative AI space, is hiring a Site Reliability Engineer to join the team. The company is backed by one of the leading venture capital firms in the industry, making this an exciting opportunity to join a SaaS company that is revolutionizing its industry.
Responsibilities:
- Perform root cause analysis to identify and resolve system or application issues in a timely and effective manner.
- Design and implement a broad range of automated tests to ensure system reliability and performance.
- Build scalable and cost-effective observability patterns in Datadog or other monitoring providers.
- Monitor and analyze SLIs to ensure adherence to SLAs and SLOs.
- Collaborate with development and operations teams to improve system reliability and developer experience.
- Develop and maintain monitoring and alerting systems to proactively address issues.
- Implement best practices for incident management and disaster recovery.
- Plan and implement capacity upgrades, ensuring scalability and performance.
- Define, monitor, and manage SLAs, ensuring service levels meet or exceed expectations.
- Ensure systems comply with security and regulatory requirements.
Skillset:
- Experienced in Kubernetes and Helm.
- Expertise in observability and monitoring tools such as Prometheus, Grafana, Datadog, or ELK.
- Experience in Azure cloud.
- Strong understanding of microservices architecture, including Postgres and AI systems.
- Expertise in automated testing frameworks and tools.
- Experience with monitoring and analytics tools to track SLIs, SLAs, and SLOs.
- Excellent problem-solving skills, attention to detail, and a tenacious attitude.
- Proficiency in programming languages such as TypeScript and Python.
- Strong scripting skills in Bash, PowerShell, or similar.
- Understanding of networking principles and experience with network troubleshooting.
This is a full-time, remote position open only to US citizens due to potential security clearance requirements.
Benefits:
- Salary: $140k – $175k.
- Stock options.
- Benefits package.
Interested? Apply now via the link below or email your resume directly to matthew@alldus.com for consideration.
-
Site Reliability Engineer
6 days ago
Washington, United States OpenAI Full time
About the Team: Join the engineering teams that bring OpenAI's ideas safely to the world! The Applied Engineering team works across research, engineering, product, and design to bring OpenAI's technology to consumers and businesses. We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly...
-
Site Reliability Engineer
1 month ago
Washington, United States Harbor Compliance Full time
Site Reliability Engineer - Full-time Remote. Advance Your Career with Cutting-Edge Infrastructure at Harbor Compliance. About Harbor Compliance: Harbor Compliance is committed to simplifying the regulatory challenges of businesses and nonprofits through innovative technology solutions. As we continue to grow, we seek a Site Reliability Engineer who is passionate...
-
Site Reliability Engineer
2 weeks ago
Washington, DC, United States Alldus International Consulting Ltd Full time
Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing their industry. Responsibilities: As the Site Reliability Engineer, you will...
-
Cloud Site Reliability Engineer
2 weeks ago
Washington, United States ZipRecruiter Full time
Job Description: We are seeking a skilled Cloud Site Reliability Engineer, legally authorized to work in the US. Do you have an interest in Infrastructure Engineering, software architecture design and cloud computing? SRE/Cloud Engineers are responsible for creating infrastructure designs and guiding the development and implementation of cloud...
-
Cloud Site Reliability Engineer
3 weeks ago
Washington, United States ZipRecruiter Full time
Job Description: We are seeking a skilled Cloud Site Reliability Engineer, legally authorized to work in the US. Do you have an interest in Infrastructure Engineering, software architecture design, and cloud computing? SRE/Cloud Engineers are responsible for creating infrastructure designs and guiding the development and implementation of cloud applications,...
-
Site Reliability Engineer
1 month ago
Washington, United States Palantir Technologies Full time
A World-Changing Company: Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role: We’re looking for Site Reliability Engineers who...
-
Reliability Engineering Specialist
3 weeks ago
Washington, Washington, D.C., United States AlmrStaffing Full time
Site Reliability Engineer - TS/SCI. Washington D.C. Full Time. IT. Mid Level. We are seeking a skilled Site Reliability Engineer (SRE) to drive continuous improvements in observability, performance, and reliability for our federal government client. The ideal candidate will ensure robust and reliable technology services, enhancing the overall customer experience. The...
-
Reliability Engineering Lead
1 day ago
Washington, United States Sparibis Full time
About the Position: We are seeking an experienced Senior Site Reliability Engineer to join our team at Sparibis. As a key member of our technology group, you will be responsible for ensuring the stability and availability of our cloud-based systems. With a strong background in software engineering and DevOps, you will design and implement end-to-end continuous...
-
Site Reliability Engineer
1 month ago
Washington, United States Palantir Technologies Full time
A World-Changing Company: Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role: Palantir has been selected as the prime contractor...
-
Washington, United States Infoblox Full time
At Infoblox, we're revolutionizing cloud-first networking and security services. As a Top 25 Cyber Security Company and one of Inc.'s Best Workplaces for 2020, our solutions empower organizations to deliver seamless network experiences. Our customers are among the largest enterprises worldwide, and we're seeking talented individuals to join our Incident...
-
Site Reliability Engineer
3 weeks ago
Washington, United States Harbor Compliance Full time
Harbor Compliance is a leading provider of innovative technology solutions for businesses and nonprofits. We are committed to simplifying regulatory challenges through cutting-edge infrastructure. About the Role: We are seeking an experienced Site Reliability Engineer to join our team. As a key member of our IT Services department, you will be responsible for...
-
Senior Site Reliability Engineer
4 weeks ago
Washington, United States Sparibis Full time
Location: 100% remote
Years' Experience: 10+ years of experience
Education: Bachelor's degree
Work Authorization: United States Citizenship is required as part of the eligibility criteria to be able to obtain a security clearance.
Clearance: Applicants must be able to obtain and maintain a Public Trust security clearance.
Key Skills: Must experience...
-
Senior System Reliability Engineer
1 day ago
Washington, United States CoStar Realty Information, Inc. Full time
Job Description: As a Senior Site Reliability Engineer at CoStar Realty Information, Inc., you will play a crucial role in improving the availability, reliability, and performance of our applications. Our team is responsible for ensuring that our software systems are scalable, secure, and efficient. If you have expertise in designing, analyzing,...
-
Reliability Engineer
4 weeks ago
Washington, United States Saint-Gobain Full time
Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of safety, environmental impact, quality, service, and efficiency standards within...
-
Reliability Engineer
4 weeks ago
Washington, United States Saint Gobain Full time
Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of safety, environmental impact, quality, service, and efficiency standards within...
-
Reliability Engineer
4 weeks ago
Washington, United States Saint Gobain Glass Full time
Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of safety, environmental impact, quality, service, and efficiency standards within...
-
Reliability Engineer
1 month ago
Washington, United States Saint-Gobain Full time
Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of safety, environmental impact, quality, service, and efficiency standards within...
-
Reliability Engineer
1 month ago
Washington, United States Northern Star Mining Services Limited Full time
Ready to pursue your professional journey with Northern Star? As an ASX 50 global-scale gold miner, we have sizeable operations in Western Australia and Alaska. With unparalleled pathways for advancement and avenues for personal growth, we stand as Australia’s premier gold employer. Your journey starts here. At Northern Star, we live by our STARR Core...