Site Reliability Engineer


New York, United States Material Bank Full time
Job Description

Material Bank is a fast-paced, high-growth technology company that created the world's largest material marketplace for the architecture and design industry, providing the fastest and most powerful way to start and manage a design project. Learn more about us at www.materialbank.com or see below.

--

Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other Material Bank production systems running reliably and efficiently.

SREs at Material Bank specialize in systems (operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with an interest in algorithms and distributed systems.

What you'll do:

  • Join an on-call rotation (OpsGenie) to respond to incidents that affect Material Bank's availability, and support service engineers handling multiple customer incidents.
  • Prevent incidents from reoccurring.
  • Run our infrastructure with Terraform, GitHub CI/CD, Kubernetes, and ECS.
  • Build monitoring with New Relic and Prometheus that alerts on symptoms rather than on outages.
  • Document EVERYTHING so your findings turn into run-books/SOPs and then into automation.
  • Improve operational processes (such as deployments and upgrades) to make them as uneventful as possible.
  • Design, build, and maintain core infrastructure that enables Material Bank to scale to support thousands of concurrent users.
  • Debug production issues across services and levels of the stack.
  • Plan the growth of Material Bank infrastructure.

What you'll bring:

  • Think about systems: edge cases, failure modes, behaviors, specific implementations.
  • Know your way around Linux.
  • Have strong programming skills: Shell and Python.
  • Collaborate and communicate asynchronously.
  • Document all the things to inform and mentor others.
  • Have a bias for action.
  • Deliver quickly and effectively, and iterate fast.
  • Have experience with Nginx, container technologies, Kubernetes, Terraform, Kafka, or similar technologies.

Projects you can work on:

  • Coding infrastructure automation with Terraform and common CI/CD tools
  • Improving our monitoring and building new metrics
  • Helping release managers deploy and fix new versions of Material Bank in all geographies
  • Planning, preparing for, and executing the provisioning of new infrastructure in our future expansions
  • Developing a relationship with our product and business teams to define their SLAs, iterate on those SLAs, and improve their reliability
  • Defining SLOs and error budgets

What you'll get from us:

  • Our people: If you thrive in an inclusive, innovative, and fast-paced organization, look no further. You will work alongside some of the brightest minds. Join a genuinely fun and supportive workplace where we keep our employees consistently engaged through internal communication and corporate events.
  • Relaxation and Celebrations: Generous PTO, Sick Days, Paid National Holidays, and even more (ask us about this when we connect).
  • Health Benefits: We contribute to your medical, dental, vision and short-term/long-term disability plans and have a strong employee assistance program.
  • Plan for your Retirement: 401(k) eligibility after your first 90 days of employment.
  • Giving Back: We sponsor multiple events throughout the year to help out our communities. You will receive time off to give back as well.
  • Growth: We'll help you take your career to the next level. We want you to be creative and take initiative, which will allow you to grow and create within the company. Most importantly, be the best at what matters.
  • Flexible Work Schedules: With business units and employees across the globe, Material Technologies has embraced a hybrid working model, allowing department leaders to decide on the best approach for their respective teams, whether that be remote, in person, or a little of both.

About Material Bank

Material Bank is the world's largest material marketplace for the architecture and design industry, providing the fastest and most powerful way to search and sample materials. Material Bank connects design professionals to hundreds of manufacturers by facilitating brand discovery, rep engagement, and material sampling.

Material Bank has transformed the way an entire industry discovers and samples materials. By removing the friction that exists in the process, we drive business between architects and designers (members) and our Brand Partners (clients).

Our powerful material database and proprietary robotic distribution facility allow members to order samples until midnight (ET) to be delivered free of charge anywhere in the US, in one box, by 10:30 AM the next morning.

Connect with us and discover your career at Material Bank.

--

Material Bank is proud to be an equal opportunity employer. We value diversity, and all applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, age, national origin, veteran or disability status or other status protected under any applicable federal, state or local law.
