Current jobs related to Site Reliability Engineer - Washington - UnitedHealth Group


  • Washington, United States Cinder LLC Full time

    [Full Time] Site Reliability Engineer at Cinder (United States) Site Reliability Engineer Cinder United States Date Posted: 31 Oct, 2022 Work Location: Washington, DC, United States Salary Offered: $110 — $220 yearly Job Type: Full Time Experience Required: 1+ years Remote Work: Yes Stock Options: No Vacancies: 1 available About Cinder Cinder provides a...


  • Washington, United States Varada Consulting Full time

    Site Reliability Engineer. Job Location: Washington, DC; Hybrid. Overview: Varada Consulting, LLC is seeking a full-time, highly skilled and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications through automation, monitoring, and...


  • Washington, United States Alldus Full time

    Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing their industry. Responsibilities: As the Site Reliability Engineer, you will...


  • Washington, United States StaffWorthy Inc. Full time

    We are a leading technology services provider with a rich history of assembling exceptional teams dedicated to delivering outstanding solutions. For over two decades, we have been committed to excellence, with a mission centered around our passion for our people and the value they deliver to our customers. Responsibilities Monitor platform and containerized...


  • Washington, United States System One Full time

    Site Reliability Engineer Work Location: 3 days onsite DC - JBAB, 2 days remote Clearance: Active TS/SCI with ability to clear PSD As a Site Reliability Engineer (SRE), you’ll continuously drive improvements in observability, performance, and reliability, with the goal to make an impact across the federal government. What You’ll Do Monitor platform and...


  • Washington, United States Mount Indie Full time

    Job Description: As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal to make an impact across the federal government. This role requires a current TS/SCI that has been obtained within the last 51 months and the ability to pass additional background...


  • Washington, United States Kansas Action for Children, Inc Full time

    at T-Mobile USA, Inc. in Overland Park, Kansas, United States. Job Description: Be unstoppable with us! T-Mobile is synonymous with innovation, and you could be part of the team that disrupted an entire industry! We reinvented customer service, brought real 5G to the nation, and now we're shaping the future of technology in wireless and beyond. Our work is as...


  • Washington, United States CruitZi, INC Full time

    Job Description: Our client is currently hiring a full-time Sr. Site Reliability Engineer (SRE), who will play a vital role in continuously driving improvements in observability, performance, and reliability, aiming to make a substantial impact across the federal government. This role is hybrid, requiring travel to downtown Washington, DC, at...


  • Washington, United States Karsun Solutions Full time

    About the Role: We are seeking a highly skilled and experienced Site Reliability Engineering Manager to join our team at Karsun Solutions. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our systems and services. Key Responsibilities: Lead a team of engineers in designing, implementing, and maintaining robust...


  • Washington, United States Veterans Enterprise Technology Solutions Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Veterans Enterprise Technology Solutions. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our systems and applications. Key Responsibilities: Monitor and analyze system performance to identify...


  • Washington, United States Veterans Enterprise Technology Solutions Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Veterans Enterprise Technology Solutions. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our infrastructure. Key Responsibilities: Monitor and Maintain Infrastructure: Continuously monitor our...


  • Washington, United States Kansas Action for Children, Inc Full time

    About the Role: We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Kansas Action for Children, Inc. in Overland Park, Kansas, United States. This is an exciting opportunity for a technical professional who is passionate about innovation and wants to be part of a team that is reshaping the future of technology in the wireless...


  • Washington, United States Karsun Solutions Full time

    We are seeking a highly skilled and experienced Site Reliability Manager to join our team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our systems and services. They will lead a team of engineers in designing, implementing, and maintaining robust infrastructure and automation solutions. The ideal...


  • Washington, United States Red Frog Solutions Full time

    Site Reliability Engineer - SRE - (TS/SCI) Full Time Perm Washington D.C. (Hybrid - 3 days onsite, 2 days remote) $180K - $200K Salary Plus Competitive Benefits As a Site Reliability Engineer (SRE), you will play a vital role in continuously driving improvements in observability, performance, and reliability, aiming to make a substantial impact across the...


  • Washington, United States Tik Tok Full time

    About the Role: TikTok is a leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our platform. Key Responsibilities: Collaborate with infrastructure, product, and platform engineering teams to operate and...


  • Washington, United States MetroStar Systems Full time

    ***$25k Sign-On Bonus for this role*** As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal to make an impact across the highest levels of government. If you think you can see yourself delivering our mission and pursuing our goals with us, then check out the job...


  • Washington, United States Veterans Enterprise Technology Solutions Full time

    Overview: Staffing Pros, a division of VETS Inc., is recruiting for a full-time Site Reliability Engineer. This position will work a rotating hybrid schedule: 3 days onsite at JBAB, 2 days remote. An Active Top Secret SCI clearance is required for this role. If you have additional questions not answered by the information contained within this posting,...


  • Washington, United States Palantir Technologies Full time

    About the Role: Palantir Technologies is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications. Key Responsibilities: Collaborate with cross-functional teams to design, implement, and maintain...

Site Reliability Engineer

1 month ago


Washington, United States UnitedHealth Group Full time

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.

As a Site Reliability Engineer (SRE), you will employ software engineering to automate critical IT operations tasks, including production system management, change management, and incident response. You will be responsible for design review and control; prediction, estimation, and apportionment methodology; failure mode and effects analysis; the planning, operation and analysis of reliability testing and field failures; and the development and administration of reliability information systems for failure analysis, design and performance improvement, and reliability program management over the entire product life cycle. You will help ensure swift incident response and scalable emergency handling, fostering greater reliability and resilience in managing complex systems. You will support our efforts in optimizing system performance and ensuring the reliability of our technology ecosystem.

You’ll enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

Primary Responsibilities:

  • System Reliability and Incident Management: Ensure the reliability, availability, and performance of services. Respond to, troubleshoot, and resolve service outages or degradation. Lead post-incident reviews and drive root cause analysis and mitigation.
  • Monitoring and Performance Tuning: Develop and maintain advanced monitoring and alerting systems to detect and mitigate issues proactively. Continuously measure and optimize system performance, identifying bottlenecks and points of failure.
  • Continuous Improvement: Advocate for and implement changes to improve system reliability and scalability. Innovate new ways to manage and automate operations tasks.
  • Collaboration and Advocacy: Work closely with development teams to incorporate best practices and influence architecture, code health, and operational processes. Promote a culture of shared responsibility for production stability and performance. Integrate SRE principles into the engineering workflow.
  • Capacity Planning and Scalability: Forecast and plan for infrastructure needs. Implement scalable systems and resource allocation strategies to handle growth and peaks in demand.
  • Documentation and Knowledge Sharing: Create and maintain detailed documentation of the systems, processes, and procedures. Facilitate knowledge sharing through regular technical presentations and training sessions.
  • Configure, implement, and manage/optimize end-to-end APM solutions, with a focus on Dynatrace, AppDynamics, Splunk, or other relevant tools.
  • Work closely with IT teams to seamlessly integrate APM solutions into the existing infrastructure and applications.
  • Develop and maintain customized dashboards, reports, and alerts to offer real-time insights into the health and performance of the system.
  • Collaborate with diverse teams to understand business requirements and configure APM solutions to meet performance monitoring needs.
  • Conduct system analysis, troubleshooting, and optimization across various applications and infrastructure components.
  • Provide support to internal stakeholders and support teams regarding tweaking configurations, troubleshooting, and tool-specific nuances.
  • Manage performance continuously, measuring it and working with stakeholders to improve it.
  • Build quality frameworks that provide a feedback loop to stakeholders to ease and improve APM product management, patching systems and implementing security controls.
  • Document automation procedures to improve the velocity and quality of the effort.
  • Handle software release management, configuration management and transition to stakeholders.
  • Request feedback from teams and perform tool implementation assessments, offering recommendations for improvements to enhance system reliability and responsiveness.

Required Qualifications:

  • Must possess an industry-recognized Reliability Engineer Certification (CRE).
  • 4+ years' hands-on experience with scripting languages (e.g., Python, PowerShell) for automation and customization across various APM tools.
  • 4+ years' experience monitoring software performance in terms of service-level agreements (SLAs), service-level indicators (SLIs), and service-level objectives (SLOs).
  • 4+ years' experience with APM features such as real user monitoring, synthetic monitoring, and effective root cause analysis.
  • 4+ years' experience with one or more of the following platforms: Salesforce, Pega, Appian, Microsoft Power Platform.

Preferred Qualifications:

  • ITIL Foundation Certification.
  • Bachelor's degree in computer science or equivalent technical degree.
  • Understanding of application architecture, infrastructure, and cloud environments.
  • Proficiency in configuring and customizing multiple APM tools like Dynatrace, Splunk, AppDynamics for optimal performance monitoring.
  • Additional certifications (e.g., Salesforce Developer, Quality Engineer Certification (CQE), etc.) are highly desirable.

Soft Skills:

  • Ability to communicate both verbally and in written form.
  • Excellent communication skills to collaborate effectively with cross-functional teams and convey technical concepts to non-technical stakeholders.
  • Strong problem-solving skills, including the ability to analyze complex systems and identify performance bottlenecks.

Telecommuting Requirements:

  • Must have reliable internet service that allows for effective telecommuting.
  • Must be able to obtain and maintain a government security clearance.
  • All work must be conducted in the United States.
  • Must be eligible to work in the United States.

*All Telecommuters will be required to adhere to UnitedHealth Group’s Telecommuter Policy.

California, Colorado, Nevada, Connecticut, New York, New Jersey, Rhode Island, Hawaii, Washington, or Washington, D.C. Residents Only: The salary range for California, Colorado, Nevada, Connecticut, New York, New Jersey, Rhode Island, Hawaii, Washington, or Washington, D.C. residents is $70,200 to $137,800 per year. Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. UnitedHealth Group complies with all minimum wage laws as applicable. In addition to your salary, UnitedHealth Group offers benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with UnitedHealth Group, you’ll find a far-reaching choice of benefits and incentives.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.

Diversity creates a healthier atmosphere: UnitedHealth Group is an Equal Employment Opportunity / Affirmative Action employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law.

UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.