Site Reliability Engineer
3 weeks ago
SRE/Dynatrace Lead
Contract to hire - W2
Remote - Candidate MUST be in Georgia or Alabama
Job Description
The Site Reliability Engineer Lead will work with stakeholders to define SLOs and SLIs as well as develop the overall SRE strategy and roadmap. The ideal candidate will develop a depth of understanding of how all the systems work together, how they fail, how they can be improved and how they can be designed and monitored automatically and manually in the most optimized way – and share that understanding through dashboarding, reporting, demonstrations, presentations, and guidance with leadership and other engineers across the organization.
This position will assist in the support and maintenance of a Continuous Integration and Continuous Delivery automation framework across multiple tools and platforms. This person will play a vital role on the team and must demonstrate a solid understanding of site reliability, operability, infrastructure, and other DevOps groups. They should possess a high degree of automation skill and expertise, along with a software development background, preferably including the development of shared/common components and integrations. The role calls for extensive C#.NET and PowerShell experience with Azure DevOps/GitHub, Visual Studio, Kubernetes/Google/AWS/Apigee and other cloud tools, CI/CD pipelines, and the Continuous/Shift-Left principles and processes in general. The ideal candidate thrives in a fast-paced, dynamic environment and adapts readily to the ever-evolving changes that come with technology and doing business. The person in this role will interact closely with several other Business Units to ensure a clear understanding of business and technical needs; timely delivery of requirements; and service level agreements, events, and operations. This person will also collaborate and interface with those who have operational responsibility for software infrastructure performance, SDLC development, application security, system monitoring, risk, and compliance, including testing and documentation, and will present findings and possible solutions to leadership and technical teams to address application/system challenges impacting service levels.
Responsibilities:
Team Leadership:
Lead and mentor a team of SREs, fostering a collaborative and high-performing culture.
Set clear expectations, provide regular feedback, and empower team members to excel.
Drive continuous learning and skill development within the team.
Collaborate and cooperate with Enterprise, Infrastructure, and Architects on DevOps toolchain decisions.
Be a collaborative, highly experienced, and supportive team player with strong communication and planning skills.
Give attention to detail and demonstrate a high commitment to customer satisfaction.
Product Ownership:
Collaborate with leadership and prospective business teams to understand, ensure, and anticipate their SLA and SLO commitments.
Translate user requirements into clear and actionable product features.
Define the overall vision for the team’s products and services.
Follow engineering best practices and participate in new tool evaluations.
Operational Excellence:
Define and implement best practices for incident response, monitoring, and alerting.
Identify, assess, and integrate various open-source technologies and cloud services.
Collaborate with cross-functional teams (product development, operations, etc.) to improve system reliability.
Develop and maintain incident playbooks and post-mortem processes.
Database SQL experience with Oracle, Microsoft SQL Server, TOAD, MongoDB, others.
Develop internal process automations using observability tools (such as Splunk and Dynatrace), DevOps and DevSecOps tools, and AI technologies.
Assume ownership of automation processes; problem resolution and root cause analysis; strong automation, problem-solving skills, and ability to follow through to completion.
Detailed, hands-on experience with public cloud resources and services such as Microsoft Azure, Docker, and other similar tools.
Develop a new automation framework to utilize and define/refine existing build and deploy processes.
Be a contributor toward development/testing of common services to support Monitoring, containerization, CI/CD.
Infrastructure and Automation:
Automate repetitive tasks and processes to improve efficiency.
Champion infrastructure as code (IaC) principles.
Work closely with Development, QA, Product Management, and Operations teams to make sure product releases ship on time and with quality.
Performance and Scalability:
Monitor system performance, identify bottlenecks, and optimize resource utilization.
Plan for capacity growth and scalability.
Collaborate with development teams to ensure efficient code deployment.
Incident Management:
Serve as a technical contributor in incident response efforts during critical incidents.
Coordinate with stakeholders to minimize downtime and impact.
Drive root cause analysis and preventive measures.
Technology Stack:
Our SREs work with a variety of tools and platforms, including but not limited to:
Dynatrace
Splunk
Grafana
OpenTelemetry
GitHub Enterprise
Azure DevOps
Microsoft SQL Server and Oracle RDBMS & Oracle Cloud
Primarily .NET development environment
Qualifications:
Bachelor’s Degree in Computer Science (or closely related field) with at least 5 years of relevant IT experience
Strong verbal/written communication skills
Strong analytical and problem-solving skills
Strong PowerShell Scripting, Python, Bash, Perl, Shell, and other scripting skills
At minimum, 5 years of experience as a Site Reliability Engineer
Experience leading projects and teams
A background within Object Oriented software development
An understanding of monitoring, performance management, automation, and cloud infrastructure
Exceptional architectural, design and development skills
Excellent verbal and written communication skills
Works well in a team environment and as an individual contributor on solo efforts with minimal supervision
5+ years of DevOps experience in application (vendor & internal apps) delivery and transformation to deployment automation
Proven ability to follow priorities and timelines
Track record & passion for bringing high-quality automation, including continuous deployments, automated service failover, etc. into a team
Knowledge of good coding practices and improving code quality
Extensive experience in creating robust release management processes using PowerShell and other automation scripting technologies
Extensive experience in Configuration Management/Deployment tools such as SCCM / Azure DevOps (TFS)
Experience explaining and promoting DevOps culture, automation, lean practices, measurement, and sharing across internal teams and with business partners
Experience with building CI/CD Delivery solutions using tools such as Azure DevOps (TFS), GitHub, Docker
Experience building an SRE practice
Experience within financial services
Experience working in an Agile environment
Experience in systems administration activities and networking technologies
Experience with organizational transformations to Agile methodology and practices within a more DevOps-centric ecosystem supporting process and delivery automation and collaboration.
Experience architecting solutions to support and host customer-facing products with a focus on stability, scalability, security, testability, and maintainability
The hourly pay range for this position is $55 to $70/hr (dependent on factors including but not limited to client requirements, experience, statutory considerations, and location). Benefits available to full-time employees: medical, dental, vision, disability, life insurance, 401k and commuter benefits. Note: Disclosure as required by the Equal Pay for Equal Work Act (CO), NYC Pay Transparency Law, and sb5761 (WA).
About Synergis
Synergis serves a myriad of clients across nearly all industries, from start-ups to Fortune 100 companies. The outcomes of these relationships are demonstrated in a growing list of more than 300 clients and industry recognition by Inc. magazine and the Atlanta Business Chronicle. From its foundation in 1997, Synergis has been successfully recruiting and placing IT professionals in all areas of information technology. For more information about Synergis, please visit the company website at www.synergishr.com.
Synergis is proud to be an Equal Opportunity Employer. We value diversity and do not discriminate on the basis of race, color, ethnicity, national origin, religion, age, gender, gender identity, political affiliation, sexual orientation, marital status, disability, military/veteran status, or any other status protected by applicable law.
For immediate consideration, please forward your resume to Sumner Pirkle at spirkle@synergishr.com. If you require assistance or an accommodation in the application or employment process, please contact us at spirkle@synergishr.com