Site Reliability Engineer
3 weeks ago
Founded in 1999, Scaleway is the cloud subsidiary of the Iliad group, one of Europe's leading telecommunications companies. Our mission is to foster a more responsible digital industry by helping developers and businesses build, deploy, and adapt applications to any infrastructure.
From our offices in Paris and Lille, we refine the Scaleway cloud ecosystem every day, and we are its first users.
Our roughly 25,000 customers choose us for our multi-AZ redundancy, smooth user experience, carbon-neutral data centers, and native tools for managing multi-cloud architectures. Our products include fully managed solutions for bare metal, containerization, and serverless architectures, offering a responsible choice in cloud computing.
Join our dynamic team of nearly 600 employees from diverse backgrounds, in a stimulating, international environment that combines technical excellence, creativity, and sharing.
About the job
Scaleway is looking for a Site Reliability Engineer to join our teams.
Reporting to a Lead SRE, you will be responsible for ensuring that we can reliably serve our products to users around the world. We expect you to have a strong background in development and system administration. Our systems evolve constantly, and the tools we use to observe them and keep them resilient must evolve accordingly.
Minimum qualifications
- Previous experience as a developer in Go, Python or Rust
- Experience in system programming with usual scripting languages (bash, Python)
- Demonstrated ability to troubleshoot production system failures
- A great attitude and desire to work with a team
- Passion for incremental improvements to tooling and a love of all things automation
- Experience with Linux systems (Ubuntu/Debian)
- Experience with cloud environment architecture (bare metal, virtual machines, containers, orchestrators)
- Good understanding of computer networks: TCP/IP, DNS, load balancing, IPv6, BGP and network virtualisation
- Understanding of written and spoken English, with the ability to write technical documentation in English and to speak English when needed
Preferred qualifications
- Experience with infrastructure as code and continuous deployment
- Experience dealing with physical hardware automation
- Experience with monitoring & logging systems
- Experience administering relational databases
- Knowledge of one cloud platform and related use-cases
- Initiative in proposing new solutions and defending them
- Team player, willing to share knowledge, opinions, and participate in regular team rituals
- Good communication and coaching skills
Responsibilities
- Create new tools and documentation, or optimize existing ones, to help identify, diagnose and remediate production incidents, automating as much as possible
- Troubleshoot high-impact issues working with multiple engineering teams
- Take on-call responsibilities, mitigate issues encountered in production and provide the best real-time response to our customers
- Ensure a high quality of service for our customers by leveraging observability and monitoring technologies
- Manage the lifecycle of products in production
- Help implement best practices in stability, resiliency, scalability, security and performance across our systems
Technical Stack
- Python, Go, Rust
- RabbitMQ
- PostgreSQL
- HAProxy, Nginx, REST APIs / Flask
- S3 API
- Sentry, Prometheus, Grafana, Elasticsearch, Fluentd, Kibana
- Ansible, AWX, Foreman, Salt
- GitLab, Nexus
- Ubuntu, Debian, CentOS
- Jira, Confluence, Slack, G Suite
Location
This position is based in our offices in Paris or Lille (France).
If you don't tick every box, feel free to apply anyway. Don't limit yourself to a job description; you never know.