Job Description
A leading fintech company is looking for a Cloud Infrastructure Engineer (MLOps) to work on its payment system and deliver the best payment experience for customers. This role involves leveraging cloud-based AI/ML services to build infrastructure; developing, deploying, and maintaining ML models; collaborating with cross-functional teams; and ensuring scalable and efficient AI solutions, particularly on Amazon Web Services (AWS).
Main Responsibilities
- The role primarily focuses on managing cloud infrastructure for AI/ML projects using AWS, ensuring best practices in security, cost-efficiency, and high availability, while also monitoring and maintaining the operational health of deployed services.
- It involves the design, development, and deployment of machine learning models using services like SageMaker, in collaboration with data scientists and engineers to build scalable workflows. The position requires optimizing model performance, implementing CI/CD pipelines, and setting up cloud-based development environments.
- The role also includes deploying monitoring and observability solutions to manage production models effectively.
- Data management is another key responsibility, involving handling both structured and unstructured data using AWS services like S3, RDS, DynamoDB, and Redshift, while ensuring data security and regulatory compliance.
- The role emphasizes strong collaboration with DevOps, Data Engineering, and Product teams, and requires clear communication of technical concepts to non-technical stakeholders.
- Continuous improvement is encouraged through staying current with evolving AI/ML and AWS technologies, and by automating development and deployment processes to empower teams.
- The tech stack includes a broad range of AWS services, infrastructure-as-code tools like Terraform, CI/CD with GitHub Actions, monitoring tools such as Prometheus and Grafana, and ML workflow tools like MLflow, Jupyter, Argo Workflows, and Airflow.