Site Reliability Engineer

2 days ago


Culver City, United States V-Soft Consulting Group, Inc. Full time

Role: Site Reliability Engineer (Data Center)


Number of positions: 2

  • Location: 5 days per week on-site at one of these 3 locations:
  • Culver City, CA 90230
  • Mountain View, CA 94041
  • Bellevue, WA 98004

The ideal candidate will have experience with system operations and running large-scale, massively distributed infrastructure.

Responsibilities:

  • Data monitoring and alerting, data quality assurance and anomaly detection.
  • Document team processes and policies, including methods of engagement and SLOs.
  • Analyze, design, and implement solutions at the system level to remove bottlenecks and improve edge service performance.
  • Implement monitoring and alerting to improve issue detection and response.
  • Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
  • Participate in on-call rotations, with responsibility for resolving or escalating incoming events.
  • Maintain and operate a Linux and Kubernetes environment.

Qualifications:

  • Bachelor's degree or above, majoring in Computer Science or related fields, with at least 5 years of related work experience.
  • 3+ years’ experience working with Unix/Linux systems, from kernel to shell and beyond.
  • 3+ years’ experience working with system libraries, file systems, and client-server protocols.
  • Experience reading Python scripts for platform operations.
  • Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment.
  • Experience in developing and operating one or more of the following systems: OpenStack, Kubernetes, Nginx, ipvs, ELK stack, Hadoop, etc.



  • Culver City, California, United States V-Soft Consulting Group, Inc. Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at V-Soft Consulting Group, Inc. as a Data Center Expert. In this role, you will be responsible for ensuring the reliability and performance of our data center infrastructure. Key Responsibilities: Data Monitoring and Alerting: Design and implement data monitoring and...


  • Oklahoma City, Oklahoma, United States Ford Motor Company Full time

    Site Reliability Engineering at Ford Motor Company plays a critical role in maintaining and improving the reliability, scalability, and performance of our services. You will work closely with our development teams to build and maintain large-scale, distributed systems and ensure our products meet our high standards for availability and user...


  • Oklahoma City, United States PAYCOM PAYROLL LLC Full time

    Site reliability engineers will be dedicated full-time to creating software tools, metrics and processes that improve the reliability of applications, sites, and systems in production. The Site Reliability Engineer is primarily responsible for ensuring the integrity, functionality, and reliability of applications and sites. RESPONSIBILITIES: Develop software to...


  • Oklahoma City, United States Paycom Payroll Llc Full time

    Site reliability engineers will be dedicated full-time to creating software tools, metrics and processes that improve the reliability of applications, sites, and systems in production. The Site Reliability Engineer is primarily responsible for ensuring the integrity, functionality, and reliability of applications and sites. RESPONSIBILITIES: Develop software to...


  • Redwood City, United States 1872 Consulting Full time

    Site Reliability Engineer - 100% Remote Role Summary: Site Reliability Engineers (SREs) are responsible for working with different developer teams to keep our systems running smoothly. They are a blend of pragmatic operators and software craftspeople that apply excellent problem-solving and communication skills to develop or configure tools that will...


  • Oklahoma City, United States Paycom Online Full time

    Site reliability engineers will be dedicated full-time to creating software tools, metrics and processes that improve the reliability of applications, sites, and systems in production. The Site Reliability Engineer is primarily responsible for ensuring the integrity, functionality, and reliability of applications and sites. RESPONSIBILITIES Develop...


  • Jersey City, New Jersey, United States JPMorganChase Full time

    Job Description: Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Community & Consumer Banking - Infrastructure & Production Management Team, you hold a leadership role...


  • Culver City, United States ICON Consultants, LP Full time

    Duration: 1 year with possible extension. Special requirements: Candidates must be US citizens or permanent residents; green card holders qualify as long as they are located in the US. The ideal candidate will have experience with system operations and running large-scale, massively distributed infrastructure. Responsibilities: Data monitoring and alerting, data...


  • Oklahoma City, Oklahoma, United States Thegradcafe Full time

    Position Overview: This is a full-time role for a Senior Site Reliability Engineer with a software development organization specializing in manufacturing and mechanical engineering. Opportunity: Join a distributed team dedicated to enhancing manufacturing processes and reducing production costs for physical products. Work Environment: This position is hybrid,...


  • Oklahoma City, Oklahoma, United States Zoom Full time

    Site Reliability Engineer - Workvivo. What you can expect: As a Site Reliability Engineer, you will run the production environment by monitoring availability and taking a holistic view of system health. You will build software and systems to manage platform infrastructure and applications. Your work will help improve reliability, quality, and time-to-market of...


  • Jersey City, New Jersey, United States The Goldman Sachs Group Full time

    About the Role: At The Goldman Sachs Group, we're seeking a highly skilled Site Reliability Engineering Specialist to join our Platforms team. As a key member of our global engineering team, you'll be responsible for designing, developing, and operating distributed systems that provide observability for our mission-critical applications and platform...


  • Culver City, California, United States Apple Full time

    Summary: We are part of Apple's Hardware Reliability Engineering team, dedicated to collaborating with various iOS hardware engineering groups. Our mission is to enhance and ensure the durability and dependability of Apple's innovative products. In this role, you will engage with multiple engineering teams throughout the entire product lifecycle, from initial...


  • Foster City, United States Zoox Full time

    Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service from designing systems that are easy to maintain and fault-tolerant through...


  • Jersey City, New Jersey, United States JPMorganChase Full time

    Job Description: Guide and shape the future of technology at a globally recognized firm, driven by pride in ownership. As a Senior Manager of Site Reliability Engineering at JPMorgan Chase within the Corporate Technology, you are the non-functional requirement owner and champion for the applications in your remit. You are a key influencer in your team's...


  • Jersey City, New Jersey, United States Devexperts Full time

    Company Description: Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you'll become a part of a company that fosters self-improvement and actively seeks...


  • Oklahoma City, United States Allied Reliability Full time

    Overview: The primary focus of this role is improving the productivity and efficiency of our chemical manufacturing processes through developments of existing and to be developed control systems. You will be accountable for developing and implementing carefully designed and engineered solutions to plant operations control for improved efficiency and uptime...


  • Jersey City, United States Fidelity Investments Full time

    Job Description: The Role: As a member of the TechOps SRE team, you'll work closely with our engineering partners to help enable and drive initiatives from design to implementation. Our highly available multi-region Kubernetes (AWS EKS) environments are best-in-class and central to our enterprise-grade infrastructure strategy. These growing environments...