Job Location | Chennai |
Education | Not Mentioned |
Salary | Not Disclosed |
Industry | Recruitment Services |
Functional Area | General / Other Software |
Employment Type | Full-time |
Site Reliability Engineer in Chennai

We at Hippo Video work on the latest video solutions. As a startup, we are growing 2X every quarter. As customer traffic grows, we will be scaling up resources, rearchitecting data flows, adopting containerization, developing a data lake, and adding monitoring at various levels to be proactive. We have started building our SRE team. If you are a startup enthusiast and ready for challenges, jump in and grab the opportunity.

What you'll do:

You will build, operate, and maintain a platform for Cloud services. This will include technologies such as AWS services (Elastic Beanstalk, Kubernetes, S3, and more), Postgres/RDS, Redis, API gateways, authentication services, 3rd party integrations, and more.

Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.

Maintain services by measuring and monitoring availability, latency, and overall system health. You'll implement modern systems observability solutions including monitoring, alerting, metrics, logging, and APM & distributed tracing.

Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.

Be on-call for services that the SRE team owns. Practice sustainable incident response and post-incident analysis by acting as an incident manager. You'll follow our existing incident management process and recommend improvements to that process.

Who you are (our ideal candidate will have some or all of these qualifications):

You have 5+ years of relevant experience.

You have an expert-level understanding of, and at least 3 years of working experience with, AWS in a production environment. You're comfortable deploying and operating services using AWS technologies and have an expert understanding of the various offerings available.

You've built and supported systems using cloud-native (CNCF) technologies at scale.

You are interested in designing, analyzing, and troubleshooting large-scale distributed systems.

You understand what it means to operate infrastructure as code, and have experience developing services and automation to do so.

You have a great ability to debug and optimize code and automate routine tasks to eliminate toil.

You have a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership, initiative, grit, and drive.

You have designed and implemented applications and systems that scale, are resilient to failure, and are observable.

You have practical experience developing and improving applications written in Ruby and Java.

If this sounds like the right career opportunity for you, we'd be happy to talk!

Skills: AWS, Kubernetes, Jenkins
Keyskills :
java, academics, acp, algorithms, android, 3rd party integrations, system design, problem solving, support services, capacity planning, design consulting, working experience, incident management, api