Staff Site Reliability Engineer

4 weeks ago


MD United States Experis ManpowerGroup Sp. z o.o. Full time

Responsibilities:

  • Support critical applications and ensure the stability of the applications by performing proactive maintenance activities.
  • Engage in automation activities.
  • Support applications and infrastructure built on newer technologies such as Kubernetes containers, Kafka, Grafana, Prometheus, and Elastic.
  • Perform root cause analysis and remediation.
  • Good knowledge of cloud and VMware infrastructure.
  • Good knowledge of F5 load balancers and TCP layer architecture.
  • Good experience with Kubernetes and Docker (preferably vendor products such as OpenShift or MKE).
  • Basic knowledge of Ansible and YAML scripting.
  • Working knowledge of production support processes such as incident, change, and problem management, call triaging, and escalation procedures.
  • Ability to write and maintain scripts to monitor system activity, including application smoke tests during pre- and post-production implementations.
  • Monitor application performance (e.g. memory, logging, latency).
  • Write SQL queries for data analytics.
  • Release code into Test and Production environments using industry-standard deployment tools.
  • Support application deployment using Chef/Jenkins.
  • Support client-escalated issues specific to applications (e.g. increased latency, transactional issues, features not working as expected).
  • Implement and maintain performance monitoring dashboards using industry-standard tools (Splunk, ThousandEyes, Keynote, Runscope, Ghost Inspector, Evolven, Graphite, etc.).

Experience:

  • 6 or more years of work experience with a Bachelor's Degree, or 4 or more years of relevant experience with an Advanced Degree (e.g. Masters, MBA, JD, MD), or up to 3 years of relevant experience with a PhD.
  • Experience in an application support organization working in 24/7 environments.
  • Experience working with relational databases (RDBMS), NoSQL databases, MySQL DML/DDL, and Oracle.
  • Exceptional analytical, problem-solving, and oral and written communication skills.
  • Basic knowledge of Active/Active application setups.
  • Experience in production support within a globally distributed team.
  • Working experience with Java, J2EE, and Python technologies.
  • Experience with ServiceNow and ticketing workflows is preferred.
  • Working experience with Splunk or other monitoring tools and processes will be advantageous.
  • Prior working experience with Card and transaction domains will be advantageous.
  • Should have a technical and business mindset.
  • ISO 9000 and ITIL experience will be advantageous.
  • Understanding of core networking concepts such as routing, protocols, subnets, DNS, certificates, load balancers, and firewalls.
  • Demonstrated proficiency in troubleshooting, root cause analysis, application design, and implementing major components for large projects.

Offer:

  • Annual bonus.
  • Pension plan.
  • Life Assurance.
  • Lunch Allowance.
  • Medical Insurance.
  • Health and fitness financial bonus.
  • Eye care reimbursement.
  • Stable employment conditions based on an employment contract.
  • A wide training package (soft and technical training offer, access to the e-learning platform, possibility of co-financing courses and certification).
  • and more.