PhonePe: Site Reliability Engineer – Azure

Category: Details
Company: PhonePe
Job Title: Site Reliability Engineer – Azure
Location: Bangalore
Experience: 4–8 years
Role Focus: Manage, scale, and ensure high availability of PhonePe's core Azure-based cloud infrastructure; implement automation, monitoring, and networking solutions
Key Responsibilities:
- Configure and maintain Azure VMs, storage, Cosmos DB, ADX
- Manage complex networking: Azure Firewall, Route Tables, VN Gateways, ExpressRoute, BGP
- Automate BAU tasks with Terraform, SaltStack, Ansible, scripting
- Manage databases: MySQL, Aerospike, replication, backups
- Implement monitoring (Prometheus, VictoriaMetrics, Riemann) and logging (Loki) with Grafana dashboards
- Ensure security/compliance with SOC and Infosec
- Incident management, DR planning, capacity & performance management
Technical Skills:
- Cloud: Microsoft Azure core services
- OS: Ubuntu/Linux administration
- Scripting: Python, Go, Java, Bash
- Monitoring/Observability: Prometheus, VictoriaMetrics, Riemann, Grafana, Loki
- IaC & Config Mgmt: Terraform, SaltStack, Ansible
- Databases: MySQL, Aerospike, InfluxDB, Elasticsearch
- Core infra: Nginx, HAProxy, RMQ, Docker
- Networking: DNS, IPsec, BGP, ExpressRoute
Scope / Scale: Large-scale, mission-critical infrastructure supporting 600+ million users and 330+ million transactions/day
Soft Skills: Ownership, accountability, communication, mentoring (for senior roles), SLO/SLI management, toil reduction, cost optimization
Benefits: Same as other PhonePe full-time roles: insurance, wellness, parental support, mobility, retirement, education, car lease, salary advance
Key Differentiator: Focused on cloud infrastructure reliability, automation, networking, and database availability in a high-volume Azure environment
