A leading Fintech company is looking for an SRE. We strive to ensure high availability and top-level performance so that our users receive flawless, reliable service that exceeds expectations. We are looking for experienced SREs who can deliver insights into system bottlenecks and ensure system reliability and scalability, while increasing the number of services that our company offers. We are looking for individuals who can bring informed and unique viewpoints, enjoy collaborating with a cross-functional team, and actively push boundaries to develop reliable and scalable solutions and positive user experiences.
Main Responsibilities
- Analyse current technologies used in the company and develop monitoring and notification tools to improve observability and visibility.
- Ensure system stability by pre-emptively verifying failure scenarios and implementing solutions to reduce MTTR.
- Develop solutions to improve system performance with a focus on high availability, scalability and resilience.
- Integrate telemetry and alerting platforms to track and improve reliability of systems.
- Implement industry best practices for system development, configuration management and system deployment.
- Ensure seamless flow of information between teams by documenting knowledge gained.
- Be up to date on modern technologies and trends to advocate for inclusion within products if they add value.
- Participate in incident management including troubleshooting production issues, driving root cause analysis (RCA) and actively sharing lessons learned to improve system reliability and internal knowledge.