Site Reliability Engineer

2 days ago


Los Angeles CA United States CV Library Full time

Position Title: Site Reliability Engineer (SRE for Datacenter)

Location: REMOTE

Pay Rate: $100/hr (+benefits)

Assignment Length: 3-month W2 Contract

Industry: Technology

The ideal candidate will have experience with system operations and running large-scale, massively distributed infrastructure.

Responsibilities:
  1. Perform data monitoring and alerting, data quality assurance, and anomaly detection.
  2. Document team processes and policies, including methods of engagement and SLOs.
  3. Analyze, design, and implement solutions at the system level to remove bottlenecks and improve edge service performance.
  4. Implement monitoring and alerting to improve issue detection and response.
  5. Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
  6. Participate in on-call rotations and take responsibility for resolving or escalating incoming events.
  7. Maintain and operate a Linux and Kubernetes environment.
Qualifications:
  1. 3+ years’ experience working with Unix/Linux systems, from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
  2. Experience reading Python scripts for platform operations.
  3. Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment.
  4. Experience in developing and operating one or more of the following systems: OpenStack, Kubernetes, Nginx, ipvs, ELK stack, Hadoop, etc.
  5. Bachelor's degree or above, majoring in Computer Science or related fields, with at least 2 years of related work experience.

Special Requirements: Candidates must be US citizens or permanent residents. Green card and H1B holders are accepted, as long as the person is located in the US.



  • Sunnyvale, CA, United States Natcast, Inc. Full time

    Natcast (short for The National Center for the Advancement of Semiconductor Technology) is a new, purpose-built, non-profit entity created to operate the National Semiconductor Technology Center (NSTC) consortium, established by the CHIPS Act of the U.S. government. Working at Natcast represents an opportunity to help extend America’s leadership in...


  • Redwood City, CA, United States C3 AI Full time

    We are looking for an Associate Site Reliability Engineer / Site Reliability Engineer to join our team at our HQ in Redwood City, CA. Responsibilities: Maximize system uptime and availability, ensuring functional and performance SLAs. Establish end-to-end monitoring and alerting on all critical aspects. Solve complex problems for critical services...


  • Sunnyvale, CA, United States Apple Inc. Full time

    Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. The people here at Apple don’t just create products —...


  • Sunnyvale, CA, United States Microsoft Full time

    There has never been a more exciting time to be working in healthcare at Microsoft. Our Health & Life Sciences Solutions organization is an interdisciplinary team of product managers, designers, engineers, and clinicians who are designing, developing and deploying next-generation healthcare solutions powered by the Microsoft Cloud for healthcare...


  • Los Angeles, California, United States CoSM Full time

    Cosm is a global technology company that specializes in creating immersive experiences. We help our partners design and build spaces and content that blur the lines between the physical and virtual worlds across various markets. Our team brings together innovation, creativity, and expertise to power the immersive experiences of the future as Cosm. About the...


  • Chicago, IL, United States WEX, Inc. Full time

    The WEX Site Reliability Engineering (SRE) team is seeking an entry-level Site Reliability Engineer Level 1 who is passionate about learning and growing in the field of software development and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Benefits...


  • San Francisco, CA, United States Earnest Current Job Openings Full time

    The Site Reliability Engineer II position will report to the Lead Cloud Engineer. As an SRE II Engineer, you will: Set up and maintain comprehensive monitoring, create and refine playbooks, build dashboards, and adopt industry-standard practices to enhance the reliability and resilience of our site and systems. Develop and manage IaC to ensure reliable,...


  • Los Angeles, United States Saxon Global Full time

    Looking for a highly motivated Site Reliability Engineer who is capable of building and running large-scale, massively distributed, fault-tolerant systems. The individual will work with teams across the organization, ensure the reliability of core services, and keep an eye on capacity and performance. This is for a migration from AWS into GCP. Knowledge and experience with...


  • Annapolis Junction, MD, United States Maximus Full time

    General information: Job Posting Title: Site Reliability Engineer; Date: Wednesday, October 16, 2024; City: Annapolis Junction; State: MD; Country: United States; Working time: Full-time. Description & Requirements: Maximus is seeking a Site Reliability Engineer to provide expertise to a federal client in support of their mission critical systems in defense of our...


  • Duluth, GA, United States BlueSky Resource Solutions Full time

    Job Title: Site Reliability Engineer – Observability. Overview: We are seeking a Site Reliability Engineer III to develop and maintain our observability platform. This role focuses on ensuring the reliability, performance, and scalability of microservices, Kubernetes clusters, and cloud infrastructure. You'll collaborate with cross-functional teams to deliver...


  • Miami, FL, United States Royal Caribbean Group Full time

    Site Reliability Engineer. Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to be the...


  • Fairfax, VA, United States Apex Systems Full time

    We are seeking talented professionals to join our successful and growing team in building the next-generation Continuous Diagnostics and Mitigation (CDM) Cyber data solution. The CDM Program is the Cybersecurity and Infrastructure Security Agency’s (CISA) dynamic approach to strengthening the cybersecurity of Federal networks and systems through better...


  • Newton, MA, United States Intelliswift Software Full time

    Title: Site Reliability Engineer. Location: Newton, MA (Hybrid). Duration: 6 Months. Pay rate: $38.73 per hour on W2. We are seeking a skilled Site Reliability Engineer (SRE) Level 2 to join our dynamic team. The ideal candidate will have a strong technical background, excellent problem-solving skills, and a passion for enhancing system reliability and...


  • Washington, DC, United States Alldus International Consulting Ltd Full time

    Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing their industry. Responsibilities: As the Site Reliability Engineer, you will...


  • Los Angeles, California, United States StubHub Full time

    At StubHub, we're redefining the live event experience on a global scale. Our team is looking for a talented Senior Site Reliability Engineer to design and develop next-generation technologies and complex features. This role will be based in either our New York, NY or Los Angeles, CA office and has a hybrid (3 in-person days per week) work schedule. As a...


  • Portland, OR, United States Matlen Silver Full time

    Compensation: $70 - $75/Hour. Hybrid: 2 Days Onsite, Portland, Oregon. Domain: Retail/Supply Chain. Job Title: Site Reliability Engineer. Position Summary: As a Site Reliability Engineer/DevOps Engineer, you will be responsible for ensuring the availability, performance, and reliability of Fulfillment Technology solutions for our client to support omni-channel...


  • San Diego, CA, United States Apple Inc. Full time

    Atlassian Services Site Reliability Engineer. The Atlassian Services Site Reliability Engineer (SRE) role resides within the Software Delivery organization, which is at the core of the Apple software release process. This role is responsible for applying SRE practices in maintaining Atlassian services, which are used by software engineers and project managers...


  • Los Angeles, CA, United States Management Recruiters of Raleigh Full time

    Our client, a Global Petrochemical & Plastics Company, has an excellent opportunity in its world-class ethane cracker and polymers facility for a Senior Instrumentation Reliability Engineer. This brand new, state-of-the-art facility is among the largest of its kind in the US and is one of the most extensively instrumented facilities in the world. The site...


  • Indianapolis, IN, United States BCforward Full time

    Site Reliability Engineer. BCforward is currently seeking a highly motivated Site Reliability Engineer for a remote opportunity! Position Title: Site Reliability Engineer. Location: Remote. Anticipated Start Date: 12/10/2024. Please note this is the target date and is subject to change. BCforward will send official notice ahead of a confirmed start date. Expected...