hireejobs

Manager - Site Reliability Engineering

5.00 to 9.00 Years   Hyderabad   08 Oct, 2020
Job Location: Hyderabad
Education: Not Mentioned
Salary: Not Disclosed
Industry: IT - Software
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

Responsibilities:
We're looking for someone with a balance of technical expertise, leadership skills, and managerial experience: a leader with the ability to set technical direction on incident bridges and marshal resources accordingly. This person will ensure that investigations follow appropriate triage and troubleshooting paths while ensuring projects meet deadlines. The ideal candidate will drive continuous improvement while helping us streamline how we do Operations. This position will foster and maintain strong relationships with other connected areas of the business, ensuring the SRE team is a vital stakeholder in any process and procedural enhancements. The leader in this role must demonstrate a strong focus on engineering practices, service ownership, agile practices, and people management skills. You will be responsible for managing and supervising the day-to-day responsibilities of front-line Site Reliability Engineers.

    • Experience successfully coaching, managing, and leading teams to achieve goals.
    • Incident management: able to perform in key support roles during major incidents (e.g. Sev0, Sev1, Sev2).
    • Experience with project management methodologies; able to multitask and excel in dynamic environments.
    • Drives the team to be proactive in diagnostics, detection, and configuration of applications while driving service ownership. Uses data to make decisions, addresses problems head-on, and is customer-focused.
    • Problem management experience: drives quality RCAs and partners with the Global Solutions team and Service Owners to fix issues permanently.
    • Oversees process improvements by working with other cross-cloud service owners (Developers, DBAs, Network, etc.) to build positive relationships and influence when necessary.
    • Involved in the automation and tooling of manual and repetitive processes.
    • Ensure work carried out by the team meets compliance policy and directives.
Required skills/Experience:
    • We have a 24x7 work environment, and candidates should be open to working any shift (AMER/APAC/EMEA).
    • Weekends will not necessarily fall on Saturday and Sunday; the role involves rotations (a non-standard work week, i.e. five working days, with the two days off falling on any days of the week).
    • 10 years of Infrastructure Engineering or Operations experience.
    • 5 years managing Site Reliability, NOC, or mixed Operations teams preferably in globally distributed environments.
    • Experience working in a 24/7/365 Operations team, managing large data centers and infrastructure
    • Past experience in Incident Management and a strong understanding of ITIL service operations and Scrum methodologies
    • Expertise with enterprise monitoring systems such as Nagios and Splunk
    • Passionate about employee development with experience successfully coaching individuals to achieve goals
    • Strong communication, organizational, analytical, and problem-solving skills. Understands the importance of communicating with urgency, transparency, and clarity.
    • Advanced Windows administration and troubleshooting and experience with Linux in a large data center environment.
    • Results-driven communicator who uses analytical tools and dashboards to tell a story and influence others.
    • Passionate about engineering productivity and service ownership and the success of our customers
    • Experience designing, developing, debugging, and operating resilient distributed systems that run across thousands of compute nodes in multiple data centers.
Preferred:
    • Experience with traditional data centers as well as knowledge of or experience with any of the public clouds (AWS, GCP, or Azure)
    • Incident/Escalation Management Experience
    • Experience with integrating new functions and onboarding new services into Site Reliability
    • Analytics/BI Background
Education:
    • MS in Computer Science or related field, or
    • BS in Computer Science plus relevant job-related experience

Keyskills :
java, academics, acp, algorithms, android, equal employment opportunity, data center, standard work, problem solving, computer science, people management, leadership skills, project management


© 2019 Hireejobs All Rights Reserved