Job Location | Bangalore |
Education | Not Mentioned |
Salary | Not Disclosed |
Industry | Internet / E-Commerce |
Functional Area | General / Other Software |
Employment Type | Full-time |
Operations Engineer (Service Reliability Engineer (SRE)) 2

About the Role

Operations Engineers (OEs), also called Service Reliability Engineers (SREs), at Flipkart are good developers with an excellent operations mindset. As a Platform OE you will be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning for the platforms and services provided by Flipkart. You will make sure that our platforms and applications are highly available and that Service Level Agreements (SLAs) are met, and you will own all the SLIs and SLOs of the services. You will work directly on scrum teams with our Software Development Engineers (SDEs), using your interest in operations and your development skills to ensure new features follow SRE best practices and are supportable. You will also solve greenfield problems in automation and benchmarking at scale.

What You'll Do

- Build, coach, and mentor teams of Operations Engineers.
- Build and improve configuration and automation tools to remove manual steps in deploying, upgrading, etc.
- Monitor and resolve issues in all environments. Ensure SLA/SLO and uptime targets are met. Alert appropriately, build self-healing capabilities into the platforms, involve people when needed, and log tickets. Participate in a 24x7 on-call rotation.
- Ensure that availability, reliability, security, and similar considerations are built in, reviewed, and adhered to at every stage of product development.
- Own the RCA lifecycle for platform issues, and be answerable to stakeholders (internal and external) on most of the service internals.
- Have a viewpoint on distributed systems performance, and be able to drive capacity plans and scale requirements.
- Identify bottlenecks and tuning areas as long as major code changes are not necessary. For example, if you are working on a Hive benchmark and the MySQL connection pool is not externally configurable and its expansion policy is becoming a problem, you should be able to make the code changes, build, expose the config, and continue the benchmark.
- Partner with the developer and DevOps teams in on-call load sharing, and handle 24/7 platform support.

What You'll Need

- BTech or MTech in CS or equivalent, with 5+ years working with highly available platforms in web-scale organizations. Demonstrated experience of around 1-2 years as a developer is good to have.
- Good troubleshooting skills for always-available, high-scale systems.
- Ability to effectively collect all the relevant data points and debugging artefacts/snapshots so that debugging at a later stage can be as effective as possible.
- Expert-level knowledge of at least one configuration management system (Ansible, Puppet, etc.).
- Understanding of standard networking basics such as HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting, and load balancing, as well as DB sharding, partitions, etc.
- Excellent written and verbal communication skills.
- Understanding of CI/CD and the ability to architect a workflow or a deployment plan.
- Ability to write software to automate API-driven tasks at scale using Python, Go, etc., and to develop application components wherever required using Scala, Python, C++, and Java.
Keyskills :
load balancing, automation tools, emergency response, data management, change management, bias for action, OSI model, capacity planning, management system, assembly drawings, contract management, consulting, service level, BOM