Join an innovative and dynamic AI technology company dedicated to transforming the way video content is experienced worldwide!
Responsibilities:
- Design and elevate our scalable infrastructure for AI/ML workloads, ensuring top-tier availability for model serving and seamless content processing.
- Develop efficient CI/CD pipelines and implement infrastructure as code using tools like Terraform and CloudFormation to automate provisioning and deployments.
- Be a champion for developer experience by crafting self-service platforms, enhancing internal tooling, and establishing standardized development environments.
- Orchestrate the management of containerized workloads on Kubernetes, expertly handling resource allocation for GPU clusters and video processing tasks.
- Implement robust observability solutions to monitor system health, track LLM costs, and evaluate performance metrics effectively.
- Collaborate closely with engineering teams to promote and optimize DevOps best practices, while also managing cloud costs and maximizing resource utilization.
Requirements:
- A seasoned professional with 4+ years of DevOps/SRE experience and strong expertise in cloud platforms (AWS, GCP, or Azure).
- A proficient user of infrastructure as code tools (Terraform, CloudFormation), containerization technologies (Docker, Kubernetes), and CI/CD methodologies.
- Experience in developer enablement, including building internal tools, fostering self-service platforms, and enhancing developer productivity.
- Strong scripting capabilities in languages such as Python, Bash, or Go, along with hands-on experience in monitoring and observability tools (like Prometheus, Grafana, or DataDog).
- A background in database and messaging systems (PostgreSQL, MongoDB, Redis, Kafka/RabbitMQ) is highly valued.
- Excellent problem-solving abilities and communication skills, with a knack for recognizing and addressing developer pain points.
Bonus: Experience with ML infrastructure, GPU cluster management, and platform engineering will set you apart!
Urban Recruits specialises in connecting companies of all sizes (from startups to large tech companies, defense and cybersecurity) with the industry's brightest minds. We focus on recruiting top-tier developers, DevOps engineers, QA professionals, data experts and more. Our tailored approach ensures that each placement not only meets your technical needs but also aligns with your company's unique culture and growth goals.