
Turing
A US-based company that is revitalizing businesses by providing them with modern and relevant customer service technology is looking for a Site Reliability Engineer. The selected candidate will be expected to have a strong track record of delivering modern, enterprise-grade, cloud-native architectures with high levels of automation. The company has developed a cutting-edge customer service platform that helps businesses improve service delivery and develop healthy, profitable client relationships. This is a long-term, full-time position that requires some overlap with the IST/PST time zones.
Job Responsibilities:
- Scale the SRE automation and monitoring efforts with an infrastructure-as-code mindset
- Contribute to various strategies for all essential services
- Collaborate with the development team and stakeholders to understand the application service level agreement
- Collaboratively design, implement, deploy, and monitor cloud-based infrastructures that meet requirements
- Structure development and release cycles to ensure smooth and successful product and update releases
- Drive the DevSecOps roadmap, playing a leading role in SOC2/HIPAA/ISO27001/GDPR and other certifications
- Advocate for alignment with application teams, security, and business
- Work collaboratively in Sprint and Scrum ceremonies
- Rapidly implement automation as required using infrastructure as code
- Ensure that internal environments stay synchronized with external environments
- Monitor the application and infrastructure in production, and act on emerging issues to remediate them
- Rotate, store and monitor logs so that engineering is aware of application behavior in production
Job Requirements:
- Bachelor's/Master's degree in Engineering, Computer Science (or equivalent experience)
- At least 3 years of relevant experience as a software engineer
- Extensive systems administration experience
- Solid working experience with Linux, Docker, and Terraform
- Extensive experience with Ansible, DNS, and networking
- Prior troubleshooting experience is essential
- In-depth knowledge of and experience with cloud platforms such as AWS, Azure, or GCP
- Nice to have: some experience with Prometheus, Okta, and JumpCloud