Site Reliability Engineer
2 hours ago
Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing their industry.
Responsibilities:
- Perform root cause analysis to identify and resolve system or application issues in a timely and effective manner.
- Design and implement a broad range of automated tests to ensure system reliability and performance.
- Build scalable and cost-effective observability patterns in Datadog or other monitoring providers.
- Monitor and analyze SLIs to ensure adherence to SLAs and SLOs.
- Collaborate with development and operations teams to improve system reliability and developer experience.
- Develop and maintain monitoring and alerting systems to proactively address issues.
- Implement best practices for incident management and disaster recovery.
- Plan and implement capacity upgrades, ensuring scalability and performance.
- Define, monitor, and manage SLAs, ensuring service levels meet or exceed expectations.
- Ensure systems comply with security and regulatory requirements.
Skillset:
- Experience with Kubernetes and Helm.
- Expertise in observability and monitoring tools such as Prometheus, Grafana, Datadog, or ELK.
- Experience in Azure cloud.
- Strong understanding of microservices architecture, including Postgres and AI systems.
- Expertise in automated testing frameworks and tools.
- Experience with monitoring and analytics tools to track SLIs, SLAs, and SLOs.
- Excellent problem-solving skills and attention to detail. Tenacious attitude.
- Proficiency in programming languages such as TypeScript and Python.
- Strong scripting skills in Bash, PowerShell, or similar.
- Understanding of networking principles and experience with network troubleshooting.
This is a full-time, remote position and is only open to US Citizens due to potential security clearance requirements.
Benefits:
- Salary: $140k – $175k.
- Stock options.
- Benefits package.
Interested? Apply now in the link below or email your resume directly to for consideration.
Site Reliability Engineer
3 weeks ago
Washington, United States Alldus International Consulting Ltd Full time
Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing their industry. Responsibilities: As the Site Reliability Engineer, you will perform...
-
Washington, United States Google Inc. Full time
Site Reliability Engineering, Transformative Compute Site Reliability Engineering. Google, Warsaw, Poland. Minimum Qualifications: Bachelor’s degree in Computer Science, a related field, or equivalent practical experience. 5 years of experience building and developing infrastructure, distributed systems or networks, or experience with...
-
Site Reliability Engineer
1 day ago
Sunnyvale, CA, United States Natcast, Inc. Full time
Natcast (short for The National Center for the Advancement of Semiconductor Technology) is a new, purpose-built, non-profit entity created to operate the National Semiconductor Technology Center (NSTC) consortium, established by the CHIPS Act of the U.S. government. Working at Natcast represents an opportunity to help extend America’s leadership in...
-
Site Reliability Engineer
3 weeks ago
Washington, United States Harbor Compliance Full time
Site Reliability Engineer - Full-time Remote. Advance Your Career with Cutting-Edge Infrastructure at Harbor Compliance. About Harbor Compliance: Harbor Compliance is committed to simplifying the regulatory challenges of businesses and nonprofits through innovative technology solutions. As we continue to grow, we seek a Site Reliability Engineer who is passionate...
-
Site Reliability Engineer
4 weeks ago
Annapolis Junction, MD, United States Maximus Full time
General information: Job Posting Title: Site Reliability Engineer; Date: Wednesday, October 16, 2024; City: Annapolis Junction; State: MD; Country: United States; Working time: Full-time.
Description & Requirements: Maximus is seeking a Site Reliability Engineer to provide expertise to a federal client in support of their mission critical systems in defense of our...
-
Site Reliability Engineer
4 weeks ago
Duluth, GA, United States BlueSky Resource Solutions Full time
Job Title: Site Reliability Engineer – Observability. Overview: We are seeking a Site Reliability Engineer III to develop and maintain our observability platform. This role focuses on ensuring the reliability, performance, and scalability of microservices, Kubernetes clusters, and cloud infrastructure. You'll collaborate with cross-functional teams to deliver...
-
Site Reliability Engineer
1 hour ago
Miami, FL, United States Royal Caribbean Group Full time
Site Reliability Engineer. Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to be the...
-
Site Reliability Engineer
4 weeks ago
Fairfax, VA, United States Apex Systems Full time
We are seeking talented professionals to join our successful and growing team in building the next-generation Continuous Diagnostics and Mitigation (CDM) Cyber data solution. The CDM Program is the Cybersecurity and Infrastructure Security Agency’s (CISA) dynamic approach to strengthening the cybersecurity of Federal networks and systems through better...
-
Redwood City, CA, United States C3 AI Full time
We are looking for an Associate Site Reliability Engineer / Site Reliability Engineer to join our team at our HQ in Redwood City, CA. Responsibilities: Maximize system uptime and availability, ensuring functional and performance SLAs. Establish end-to-end monitoring and alerting on all critical aspects. Solve complex problems for critical services...
-
Site Reliability Engineer
4 weeks ago
Newton, MA, United States Intelliswift Software Full time
Title: Site Reliability Engineer. Location: Newton, MA, Hybrid. Duration: 6 Months. Pay rate: $38.73 per hour on W2.
We are seeking a skilled Site Reliability Engineer (SRE) Level 2 to join our dynamic team. The ideal candidate will have a strong technical background, excellent problem-solving skills, and a passion for enhancing system reliability and...
-
Site Reliability Engineer
4 weeks ago
Portland, OR, United States Matlen Silver Full time
Compensation: $70 - $75/Hour. Hybrid: 2 Days Onsite Portland, Oregon. Domain: Retail/Supply Chain. Job Title: Site Reliability Engineer.
Position Summary: As a Site Reliability Engineer/DevOps Engineer, you will be responsible for ensuring the availability, performance, and reliability of Fulfillment Technology solutions for our client to support omni-channel...
-
Site Reliability Engineer IN
2 weeks ago
Indianapolis, IN, United States BCforward Full time
Site Reliability Engineer. BCforward is currently seeking a highly motivated Site Reliability Engineer for an opportunity in Remote! Position Title: Site Reliability Engineer. Location: Remote. Anticipated Start Date: 12/10/2024. Please note this is the target date and is subject to change. BCforward will send official notice ahead of a confirmed start date. Expected...
-
Site Reliability Engineer
2 hours ago
Sunnyvale, CA, United States Apple Inc. Full time
Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. The people here at Apple don’t just create products...
-
Principal Site Reliability Engineer
2 hours ago
Sunnyvale, CA, United States Microsoft Full time
There has never been a more exciting time to be working in healthcare at Microsoft. Our Health & Life Sciences Solutions organization is an interdisciplinary team of product managers, designers, engineers, and clinicians who are designing, developing and deploying next-generation healthcare solutions powered by the Microsoft Cloud for healthcare...
-
Site Reliability Engineer
3 weeks ago
Miami, FL, United States INSPYR Solutions Full time
Title: Site Reliability Engineer. Location: Miami, FL. Duration: 6+ months. Compensation: $55.00 - $60.00. Work Requirements: US Citizen, GC Holders or Authorized to Work in the U.S. Site Reliability...
-
Site Reliability Engineer
4 weeks ago
Austin, TX, United States Sustainable Talent Full time
Join Sustainable Talent as an Engineering Technician (Site Reliability Engineer) supporting Nvidia and their IPP Platform Group (Infrastructure, Planning and Process)! This is a W-2 full-time contract with openings in Hillsboro, OR and Austin, TX. We offer competitive pay of $35-45 per hour based on factors like experience, education, location, etc. and provide...
-
Site Reliability Engineer
23 hours ago
Washington, DC, United States Palantir Technologies Full time
Site Reliability Engineer - Security Infrastructure. Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role: Our products...