Senior Site Reliability Engineer/DevOps

4 weeks ago


New York NY, United States Tickets.com Full time

Tickets.com, an MLB company, delivers innovative, cutting-edge technologies to enable frictionless and unforgettable fan experiences in venues across the globe. Tickets.com is changing the landscape of the live sports and entertainment industry, delivering new digital venue and ticketing experiences to millions of fans. Our Technology team builds platforms and products that provide a new smart ticketing solution and venue experience. Using cutting-edge technology, our platform and applications are consumed by fans, stadiums, and MLB teams.
We are assembling a world-class team to build on these experiences and to scale platforms and products that anticipate emerging opportunities, including dynamic pricing and offers and digital, contactless ticketing. Tickets.com is looking for a Site Reliability Engineer passionate about building engaging products for our fans.
You will lead the planning, management, and scaling of mission-critical transactional and analytic datastores and pipelines so that the business's complex data requirements are met and it can access high-quality data quickly, reliably, and safely.
Ability to work both independently with little supervision and in a team environment
Ensure availability, security, integrity, and recovery of data, pipelines, and data stores
Define and configure relevant database metrics to ensure observability
Create and maintain dashboards and reports to visualize database performance and health
Create monitoring and alerting to trigger on error conditions, degradation symptoms and defined SLOs, as well as outages
Develop and implement data store maintenance plans, including performing integrity checks, updating statistics, and monitoring security and hardware resource utilization
Work with peers to roll out changes to production environments and help mitigate and prevent data-related production incidents
Work on automation of data store infrastructure and help engineering succeed by providing self-service tools
Resolve performance, capacity, replication, and other distributed data, pipeline, and data store issues
Support and debug data production issues across services and all levels of the stack
Continuously identify opportunities for process improvement and automation to enhance database performance, reliability, and efficiency
Minimum of a bachelor’s degree in Computer Science, MIS, or a related field and five (5) years of relevant experience, including software or reliability engineering, database administration, or data store programming, or an equivalent combination of education, training, and experience.
Ability to communicate strong opinions clearly and effectively on how to use technologies such as cloud, microframeworks, DevOps, automation, and observability tools
Demonstrable experience engineering automation of triggers, alerts, and remediation and a passion for reducing toil
Have written code in a compiled language that runs in production somewhere
Have written code in interpreted languages
Experience in Oracle 19c, Postgres, Mongo, Change Data Capture, data and data store monitoring, management and support
Experience in Linux OS and shell scripting
Extensive experience in performance tuning and analysis
Capacity planning for all aspects of a data store system (storage, compute, memory, etc.)
Understanding of networking and connectivity and how it relates to a data store environment
Ability to work non-standard shifts including nights and/or weekend on-call responsibilities
Background in data engineering, data science, or even data analysis
Understand industry best practices for scaling data platforms
REST, Web, Graph
Database Solutions – Oracle, MySQL, MSSQL, CloudSQL, NoSQL
Cloud Providers: Oracle Cloud Infrastructure, Google Cloud Platform, AWS
Real-time log/event monitoring – DataDog, Stackdriver, Oracle Enterprise Manager, Oracle Cloud Monitoring, SolarWinds, Splunk, SumoLogic, OpenTelemetry
Secured Access and control – Okta SSO and MFA, MS Active Directory, DataSafe
Software Development tools – Jira, Git, Jenkins, ArgoCD, Terraform
Medical
Dental
401K Retirement Plan
Paid Time Off (PTO, STO, Holidays including Year-End Holiday Break)
Legal Plan
Tuition Reimbursement
Flexible Hybrid Work Environment


