Senior Infrastructure Engineer

1 month ago


San Diego, United States Hillbot Full time

About Us:

Hillbot is a pioneering start-up headquartered in San Diego, co-founded by leading scientists in artificial intelligence. Our mission is to shape the future of robotics by merging cutting-edge Generative AI with advanced robotics technologies. We strive to develop comprehensive robot foundation models that will revolutionize the field and set new industry standards. We are seeking a highly skilled Senior Infrastructure Engineer who is passionate about infrastructure, data, and machine learning, and ready to take on the challenge of building from the ground up.


Key Responsibilities:

Design and Development

  • Collaborate closely with researchers and engineers to design and implement scalable, reliable, and efficient infrastructure solutions for processing and analyzing large volumes of multimedia data.
  • Set up and maintain software architecture to support the scaling of data processing and training of transformer-based models.
  • Implement and manage cloud-based infrastructure using platforms like AWS, Google Cloud, or Azure.
  • Build and maintain containerization and orchestration systems using Docker and Kubernetes to ensure reproducible software practices.
  • Deploy distributed services on public networks, utilizing reverse proxies to securely expose them.

Software and System Optimization

  • Monitor and troubleshoot software and hardware issues on large GPU clusters, including GPU connectivity issues, storage system outages, stalls, Kubernetes-related errors, and networking degradations or outages.
  • Resolve software and hardware issues quickly where possible; otherwise, communicate effectively with the relevant vendors.
  • Improve existing storage systems and set up new scalable storage solutions for data processing and model training.
  • Take a quantitative and rigorous approach to measuring and improving code, pipeline, cost, and developer efficiency.

Implementation and Development Support

  • Partner with software engineers to enhance and support developer operations.
  • Contribute to SDKs and APIs used internally.
  • Educate team members and document best practices for coding, testing, and deployment operations.
  • Build and rebuild containers effortlessly, ensuring reproducible software practices using Docker, Kubernetes, and other containerization technologies.

Continuous Learning

  • Stay updated with the latest advancements in infrastructure technologies, machine learning tools, and software solutions relevant to our implementations.
  • Identify opportunities to improve software efficiency and usability, driving initiatives to implement these enhancements.

Leadership

  • Mentor and guide junior engineers, fostering a culture of continuous learning and improvement.
  • Lead projects and initiatives, ensuring timely and successful delivery of solutions.


Required Qualifications:

  • Education: Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, or a related field.
  • Experience:
    • 5+ years of relevant work/research experience in infrastructure engineering, particularly supporting machine learning and data science workloads.
    • Proven experience designing large-scale data processing systems and storage systems for model training, and analyzing their performance bottlenecks.
    • Expertise with cloud platforms (AWS, Google Cloud, Azure) and containerization technologies (Docker, Kubernetes).
    • Knowledge of GPU infrastructure (monitoring, GPU/RoCE/networking management commands, Ansible) and experience with hardware acceleration technologies (GPUs, TPUs).
    • Understanding of networking concepts, including routing, firewalls, certificates, and using reverse proxies to securely expose distributed services.
    • Strong programming skills in Python and at least one of C/C++ (both are a plus).
    • Demonstrated proficiency with software development best practices (e.g., test-driven development) and version control systems (Git).
  • Skills:
    • Strong analytical and problem-solving abilities.
    • Excellent communication and teamwork skills.
    • Ability to work in a fast-paced, dynamic environment and adapt to changing priorities.


Preferred Qualifications:

  • Solid understanding of distributed, high-performance SQL and NoSQL databases and experience with data management technologies for real-time data analytics (e.g., cloud-native databases, HTAP solutions, Apache Arrow).
  • Familiarity with frameworks such as TensorFlow, PyTorch, Keras, and deployment libraries like GStreamer, ONNX, TorchScript, TensorRT.
  • Experience and enthusiasm for mentoring junior engineers.


What We Offer:

  • Opportunity to build and shape the infrastructure stack from the ground up in a rapidly growing company.
  • Impactful role in driving innovative infrastructure strategies critical to the growth and success of Hillbot.ai.
  • Collaborative and inclusive work environment that values creativity, initiative, and professional growth.
  • Competitive salary and benefits package.
  • Visa and immigration support.
  • Unlimited PTO.
  • Employer 401k match.


How to Apply:

If you are passionate about infrastructure, data, and machine learning and ready to take on the challenge of building from the ground up, we want to hear from you. Please send your resume and a cover letter detailing your relevant experience and why you are the perfect fit for this role to hr@hillbot.ai.


