Site Reliability Engineer

2 weeks ago


New York, United States CloudFit Software Full time

CloudFit Software is looking for a Site Reliability Engineer (SRE) to work directly on increasing the quality, performance, and reliability of the CloudFit Managed Applications and Services systems. Our SREs help maintain accountability for production workloads while making it easier to run modern applications and services in the cloud. As a team, we bring together significant and complementary capabilities with tooling, infrastructure, monitoring and insights in new ways to innovate in operations. We truly value people who enjoy learning, celebrating mistakes to show the way forward, and those who are happy to continuously improve. If this excites you, we’d love to connect.

Responsibilities:

  • Work across CloudFit Software Platform and other internal systems to improve platforms and processes that result in improved reliability and maintainability for all
  • Work across CloudFit to drive tools that help deliver insights and automation to simplify the complex world of scaled services
  • Communicate effectively and partner well with other disciplines of the project team to deliver high-quality solutions from ideas to production code
  • Be accountable for the production health of a managed service or application, responding in kind to escalations, and driving repairs to prevent future recurrences of issues
  • Write clean and thorough design documents and code that exemplify quality, simplicity, and maintainability
  • Be a mentor for design reviews, code, and test cases
  • Design systems that prioritize the customer perspective and experience
  • Quickly adapt and apply new technologies, tools, methods, and processes from both internal and external sources
  • Design and influence design, implementation, and architectural direction
  • Drive architectural consolidation and simplification
  • Exemplify the CloudFit values of leveraging the work of others and helping others be successful through your behaviors and actions
  • Travel to support customers in air-gapped environments
  • Must be able to report on-site in Lynchburg, VA

What makes you the perfect fit for this role:

  • Bachelor’s in Computer Science/Engineering or 1+ years of experience in software engineering, infrastructure design, system engineering, QA/testing automation, DevOps
  • 1+ years of programming/development experience with modern programming/development languages: Python, PowerShell, Go, C#, Java, C++
  • Ability to manage competing priorities and work well under pressure or timelines
  • Familiar with SRE methodologies and passionate about solving operational problems through automation and software engineering
  • Ability to communicate effectively with peers and management within the organization

Some nice-to-haves (but not required):

  • Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes generally, microservices, and so on
  • Understanding of observability in distributed systems
  • 1+ years of experience with Linux/Unix or Windows system administration
  • Experience with working in a team, including coordinating large projects, communicating well, and exercising initiative when presented with problems
  • Practical experience running large-scale online systems is always an advantage
  • Familiar with git, GitLab, or GitHub
  • Exposure to containers and Kubernetes, as well as automation tools such as Helm
  • Current Federal Security Clearance or ability to obtain (US Citizenship required)

The salary range for this role is $60k - $150k, plus benefits and perks. Please note the national salary range listed in the job posting reflects the new hire salary range across levels and U.S. locations that would be applicable to the position. Final salary will be commensurate with the candidate’s final level and location. Also, the range represents salary only and does not include our total rewards package.

Perks & Benefits:

CloudFit benefits are designed with wellness in mind, to support what matters the most:

  • CloudFit covers over 90% of medical, 100% of dental and vision, including buy-up options for you and your family
  • Financial Wellness: 401k, HSA, FSAs, Personal Financial Advisors
  • 3 Weeks of PTO, 13 Paid Company Holidays, 8 Giving Back Hours
  • Wellness Reimbursement Program

Career development for employees at every level is a priority at CloudFit:

  • Unlimited Certification Reimbursement
  • User-Friendly Career Development Tools & Resources

About CloudFit:

CloudFit Software provides customers the ability to quickly iterate on the development of Cloud Solutions while optimizing security, networking, operations, and monitoring through an engineering mindset. At CloudFit Software, we instrument solutions that attempt to reduce the human element by applying automation and engineering to every aspect of a solution's lifecycle. CloudFit focuses on delivering managed scenarios that meet customer business needs. This allows us to provide core operational outcomes while integrating back with the customer’s existing operations teams.

Please be aware many of our positions require the ability to obtain a security clearance. Security clearances may only be granted to U.S. citizens. In addition, applicants who accept a conditional offer of employment may be subject to government security investigation(s) and must meet eligibility requirements for access to classified information.

CloudFit is an equal opportunities employer and welcomes applications from all sections of society and does not discriminate on grounds of race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, or any other basis as protected by applicable law.



  • New York, United States Automatic Data Processing Full time

    ADP is hiring a Site Reliability Engineer. Do you thrive in a challenging environment, love production systems, and are you curious by nature, with a thirst for pushing the limits? Are you inspired by transformation and making an impact on the lives of millions o...


  • New York, United States Unreal Gigs Full time

    Job Summary: We are in search of a Site Reliability Engineer to join our tech startup specializing in infrastructure and authorization solutions. As a Site Reliability Engineer, you'll be pivotal in ensuring the reliability, availability, and performance of our systems. Your role will involve designing, implementing, and...


  • New York, United States Unreal Gigs Full time

    Job Summary: We are in search of a Site Reliability Engineer to join our tech startup specializing in infrastructure and authorization solutions. As a Site Reliability Engineer, you'll be pivotal in ensuring the reliability, availability, and performance of our systems. Your role will involve designing, implementing, and maintaining scalable infrastructure...


  • New York, United States RedTech Recruitment Full time

    Site Reliability Engineer – Graduates considered. We are excited to be able to offer this Site Reliability Engineer role working for an industry-leading software company. This company has won several awards and is pioneering in its machine learning technology. Founded 8 years ago, with a team of 150 brilliant engineers, they are already renowned as having...


  • New York, United States Hyperion Industries Full time

    Company Description: Join us on an exhilarating mission at Hyperion, a VC-backed startup working with Tim Hwang, CEO of FiscalNote (NYSE: NOTE). Our co-founders, with their extensive AI and engineering backgrounds from Google, Amazon, Workday, and Instacart, are leading the charge. Our mission is to revolutionize Site Reliability Engineering (SRE) with an...


  • New York, United States Mondrian Alpha Full time

    An industry-leading systematic trading fund is seeking highly skilled Site Reliability Engineers to join a team responsible for engineering and supporting the company's critical infrastructure platforms. This team also handles the centralized development infrastructure and works alongside engineering teams across the business to assure the optimal route of...


  • New York, United States ICTerGezocht Full time

    Location: Amsterdam. Vacancy in brief: Ever thought of how many people log in to the app or Internet Banking website each month? Over five million! The objective of the Personal Banking Grid is to ensure that each visit is not only secure but also a personal and smooth experience. As a Site Reliability Engineer, you play a key role in this mission. You will...


  • New York, United States Instabase Full time

    At Instabase, we're passionate about democratizing access to cutting-edge AI innovation to enable any organization to solve previously unsolvable unstructured data problems in their industry. With customers representing some of the largest and most complex organizations in the world, and investors like Greylock, Andreessen Horowitz, and Index Ventures, our...


  • New York, United States InterEx Group Full time

    Senior Site Reliability Engineer. Primary accountabilities: improve the reliability of mission-critical solutions, applications, and platforms; software development for enterprises; continuous improvement identification and implementation; manage risks and resolve issues that affect applications; lead efforts to troubleshoot and/or debug issues in any...


  • New York, New York, United States Instabase Full time

    At Instabase, we're passionate about democratizing access to cutting-edge AI innovation to enable any organization to solve previously unsolvable unstructured data problems in their industry. With customers representing some of the largest and most complex organizations in the world, and investors like Greylock, Andreessen Horowitz, and Index Ventures, our...


  • New York, United States Hebbia Full time

    About Hebbia: The user interface for AGI. Hebbia is AI that works the way you work. Designed to be generally capable, it can tackle even the most complex tasks, citing answers over any number of sources. By showing its work, Hebbia empowers users to collaborate with AI on each step and validate responses instead of blindly trusting them. Our mission is to...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in NJ. Contract Duration: Long-term Engagement. Compensation: $50 per hour. Note: No OPT/CPT candidates will be considered. We are seeking a highly skilled Senior Site Reliability Engineer (SRE) with subject matter expertise. The ideal candidate will possess exceptional communication skills and the...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in New Jersey. Contract Duration: Long-term. Compensation: $50 per hour. This role requires a highly skilled individual with a strong background in Site Reliability Engineering. The ideal candidate will possess exceptional communication abilities and the confidence to engage with executive-level teams. Key...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in New Jersey. Contract Duration: Long-term Engagement. Compensation: $50 per hour. This role requires a highly skilled individual with a strong background in Site Reliability Engineering. The ideal candidate will possess exceptional communication abilities and the confidence to engage with executive-level...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in New Jersey. Contract Duration: Long-term Engagement. Compensation: $50 per hour. This role requires a highly skilled individual with a strong background in Site Reliability Engineering. The ideal candidate will possess: exceptional communication skills, with the ability to engage confidently with...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in New Jersey. Contract Duration: Long-term Engagement. Compensation: $50 per hour. This role requires a seasoned professional with a strong background in Site Reliability Engineering. The ideal candidate will possess exceptional communication skills and the confidence to engage with executive-level...


  • New York, New York, United States Astir IT Solutions, Inc. Full time

    Position: Senior Site Reliability Engineer. Location: Onsite in New Jersey. Contract Duration: Long-term. Compensation: $50 per hour. This role requires a highly skilled individual with a proven track record in Site Reliability Engineering. The ideal candidate will possess: exceptional communication abilities and the confidence to engage with executive-level...