Site Reliability Engineer 3

4.00 to 9.00 Years   Bangalore   25 Nov, 2021
Job Location: Bangalore
Education: Not Mentioned
Salary: Not Disclosed
Industry: Internet / E-Commerce
Functional Area: IT Operations / EDP / MIS, Site Engineering / Project Management
Employment Type: Full-time

Job Description

About Flipkart

Flipkart is India's largest e-commerce marketplace with a registered customer base of over 150 million. In the 10 years since we started, Flipkart has come to offer over 100 million products across 120+ categories including Smartphones, Books, Media, Consumer Electronics, Furniture, Fashion and Lifestyle.

Launched in October 2007, Flipkart is known for its path-breaking services like Cash-on-Delivery, No-Cost-EMI and 10-day replacement policy. Flipkart was the pioneer in offering services like In-a-Day Guarantee (65 cities) and Same-Day-Guarantee (13 cities) at scale. With over 1,20,000 registered sellers, Flipkart has redefined the way brands and MSMEs do business online.

Site Reliability Engineer 3 (SRE3)

About the Role

Site Reliability Engineers at Flipkart are developers with an excellent operations mindset. As a Site Reliability Engineer, you will build solutions to scale our platforms and applications reliably for high availability and make sure Service Level Objectives (SLOs) are met. You will own all the SLOs of various Flipkart services across tiers. You will work directly with our Software Development teams to reduce the toil of developing, deploying and maintaining our software by adopting engineered solutions and reliability engineering best practices. You will be responsible for solving greenfield problems in reliability engineering and benchmarking, at scale.

What You'll Do

  • Help our engineers adopt the Flipkart Reliability Engineering playbook by abstracting the context and complexities of a hybrid cloud.
  • Build, coach and mentor teams of Site Reliability Engineers.
  • Ensure that availability, reliability, security and similar considerations are built in, reviewed and adhered to at every stage of product development.
  • Monitor and resolve issues in all environments. Ensure SLOs are met. Alert appropriately, build self-healing capabilities into the platforms, involve people when needed, and log tickets. Participate in a 24x7 on-call rotation.
  • Run periodic resilience (chaos) experiments and continuously verify the state of reliability.
  • Build and improve configuration and automation tools to remove toil in developing, deploying and maintaining software.
  • Own the RCA lifecycle for platform issues, and be answerable to stakeholders (internal and external) on most of the service internals.
  • Have a viewpoint on distributed systems performance, and be able to drive capacity plans and scale requirements.
  • Identify bottlenecks and tuning areas as long as major code changes are not necessary. For example, if you are working on a Hive benchmark and the MySQL connection pool is not externally configurable and its expansion policy is becoming a problem, you should be able to make code changes, build, expose the config and continue the benchmark.
  • Partner with the developer and DevOps teams in on-call load sharing, and handle 24/7 platform support.
What You'll Need
  • BTech or MTech in CS or equivalent with 5+ years working with highly available platforms in web-scale organizations. Demonstrated experience of around 1-2 years as a developer is good to have.
  • Good troubleshooting skills for always-available, high-scale systems.
  • Ability to effectively collect all the relevant data points and debugging artefacts/snapshots so that debugging at a later stage can be as effective as possible.
  • Expert-level knowledge of at least one configuration management system (Ansible, Puppet, etc.).
  • Understanding of standard networking basics such as HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting and load balancing, as well as DB sharding, partitions etc.
  • Excellent written and verbal communication skills.
  • Understanding of CI/CD and the ability to architect the workflow or a deployment plan.
  • Write software to automate API-driven tasks at scale using Python, Go etc., and develop application components wherever required using Scala, Python, C++ and Java.
Open Positions: 1
Skills Required: Programming/Scripting, Distributed Systems, Unix System, Troubleshooting, Observability, Hybrid Cloud, Cloud Infrastructure Design.
Location: Bangalore, Karnataka
Years Of Exp: 5 to 10 Years

Keyskills:
java, academics, acp, algorithms, android, osi model, hybrid cloud, service level, load balancing, automation tools, management system, distributed systems, software development, consumer electronics

© 2019 Hireejobs All Rights Reserved