Site Reliability Engineer
1 month ago
RESPONSIBILITIES
Develop software to detect unusual error activity. Implement workflows and processes that are designed to identify and reduce the overall number of application/system errors.
Collaborate with software development as part of the SDLC to design and implement availability, reliability, and error monitoring solutions in their applications.
Take responsibility for removing, isolating, or remediating error, debug, warning, and other kinds of messages in existing logs to improve overall log content and usefulness.
Limit system downtime by defining and enforcing standards for incident response, error tracking, monitoring, and alerting, with the goal of improving established reliability metrics.
Respond effectively to escalated site reliability issues at any time of day while on call.
Conduct regular research on best practices and new technology for monitoring, alerting, error tracking and detection, and application performance.
Education/Certification:
Bachelor's degree in Computer Science, MIS, or related field
Experience:
3+ years of experience using alerting and telemetry tools such as Grafana, Prometheus, Splunk, Dynatrace, and others
2+ years of experience with Splunk SPL
2+ years of experience with at least one programming language such as PHP, Python, Java, or .NET
PREFERRED QUALIFICATIONS
Experience:
1+ years of experience with CI/CD
1+ years of experience with containers and container orchestration, such as Docker and Kubernetes
1+ years of experience with Prom
1+ years of experience with SQL
Skills/Abilities:
Troubleshooting in a large-scale networked environment
Knowledge of Paycom's applications, systems, and database
Paycom is an equal opportunity employer and prohibits discrimination and harassment of any kind. Paycom makes employment decisions on the basis of business needs, job requirements, individual qualifications and merit. Paycom wants to have the best available people in every job. Therefore, Paycom does not permit its employees to harass, discriminate or retaliate against other employees or applicants because of race, color, religion, sex, sexual orientation, gender identity, pregnancy, national origin, military and veteran status, age, physical or mental disability, genetic characteristic, reproductive health decisions, family or parental status or any other consideration made unlawful by applicable laws. Equal employment opportunity will be extended to all persons in all aspects of the employer-employee relationship. This policy applies to all terms and conditions of employment, including, but not limited to, hiring, training, promotion, discipline, compensation, benefits, and separation of employment. The Human Resources Department has overall responsibility for this policy and maintains reporting and monitoring procedures. Any questions or concerns should be referred to the Human Resources Department. To learn more about Paycom's affirmative action policy, equal employment opportunity, or to request an accommodation - Click on the link to find more information: