Site Reliability Engineer, Infrastructure

1 month ago


Mountain View, United States Tik Tok Full time
Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

Why Join Us
At TikTok, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for more than 1 billion users on our platform. We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes. Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility. Join us and make impact happen with a career at TikTok.

Our infrastructure team operates a large network of POPs around the world hosting edge services, such as traffic acceleration, CDN cache, gaming, etc. We are seeking experienced frontend developers to build the Kubernetes-based platform (PaaS) to manage the lifecycle (creation, deployment, CI/CD etc.) and provide management tools (monitoring, alerting, logging, etc.) for edge services on our globally distributed POPs.

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed infrastructure. Our SREs are tasked with ensuring that infrastructure services are reliable, fault-tolerant, efficiently scalable and cost-effective. You will have the opportunity to manage a variety of complex systems at scale, including systems that administer hyperscale data centers and public cloud, global content distribution networks (CDNs), and load balancers that handle Tbps of traffic.

Responsibilities
• Build, expand and operate ByteDance's global infrastructure, including large-scale systems in public and private clouds, data centers and content delivery networks.
• Build tools, automations, visualizations and monitors to facilitate the operation and optimization of the global infrastructure.
• Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
• Help improve the whole lifecycle of infrastructure services, from inception and design through development, deployment, user support and refinement.

Qualifications

Minimum qualifications
• Master's degree (or Bachelor's degree with 3+ years of experience) in Computer Engineering, Electrical Engineering, Computer Science or a related major.
• 3+ years of experience working with Unix/Linux systems, from kernel to shell and beyond, including experience with system libraries, file systems, and client-server protocols.
• 3+ years of experience in one or more programming languages such as Java, C++, or Go, or scripting experience in Shell and Python.

Preferred qualifications
• Self-driven and capable of coping with ambiguity and moving projects from concept to delivery.
• Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
• Experience in designing, analyzing and building automation and tools for large-scale systems.
• Experience in building solutions with AWS, Google Cloud, Azure and other cloud services.
• Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment.
• Experience in developing and operating one or more of the following systems: OpenStack, Kubernetes, Nginx, IPVS, ELK stack, Hadoop, etc.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations during our recruitment process. If you need assistance or an accommodation, please reach out to us at USRC[redacted]
