Director, Site Reliability Engineering

15.00 to 20.00 Years Hyderabad 18 Jul, 2019

Job Location	Hyderabad
Education	Not Mentioned
Salary	Not Disclosed
Industry	IT - Software
Functional Area	General / Other Software
Employment Type	Full-time

Job Description

Engineering, Infrastructure and Operations
Hyderabad, India

Description

Director, Site Reliability Engineering

ServiceNow, The Enterprise IT Cloud Company, is the industry-leading cloud platform provider for building enterprise applications. We are redefining markets and changing the perception of enterprise software. Our cloud platform allows enterprise IT to bring together business strategy, application design and operations in a powerfully simple solution. To sustain our explosive growth, we are looking for drivers: people who thrive on responsibility and live for the next big challenge. We seek to employ the brightest and most forward-thinking talent on the planet, and we want to hire people who have their best work ahead of them, not behind them. Accelerate your career and succeed in an environment where you can make an impact daily. We invite you to join in to stand out.

Overview

ServiceNow is seeking an experienced individual to lead our Hyderabad SRE team, a team of software and system engineers with a broad set of expertise who resolve issues identified in our cloud. The SRE team operates 24/7/365, providing global coverage with a presence in the UK, Ireland, the US, India and Australia. It monitors the health of our cloud and provides value-added troubleshooting and outage management across all infrastructure and application issues. The team serves as the front line of infrastructure support through event monitoring, incident response and mitigation, providing world-class service to our cloud customer base. The team is responsible for the continued high availability and reliability of our SaaS platform and all infrastructure elements that support the environment. The candidate will be responsible for developing requirements, workflows, tools, automations and communications pertaining to the SRE team.
Candidates must maintain professional communications with external teams to ensure timely restoration of services and the continuous improvement of the services delivered by the SRE team and by ServiceNow. The candidate will be responsible for clearly articulating SRE strategy across the business and driving it forward, particularly in the following areas:

- Onboard complex technical availability issues across the DB, systems and application stack, and drive automated solutions to remove the need for manual intervention
- Build a world-class group of engineers capable of driving forward the availability and reliability of our cloud
- Drive monitoring of all systems to a proactive position
- Own the ongoing investigation of reliability and availability issues and work with partner teams to remove persistent issues
- Drive a culture of continuous improvement
- Build a sustainable process framework
- Build a culture of continuous learning within the team

Reporting to the Senior Director of Site Reliability Engineering, the successful candidate ensures that terabytes of data and all cloud services are highly available 24x7, reliable and scalable for our rapidly growing cloud services. The SRE Director provides the SRE staff with direct management and incident leadership, including prioritization of all efforts related to projects, tasks and goals. In addition, the Director will lead continuous improvement activity and drive the continuing development of the team and the individuals within it. The SRE team in Hyderabad is a new initiative, so the Director will be responsible for driving the maturity of the team and its services to ensure a sustainable approach.

Responsibilities

Team Management: The successful candidate will directly manage the Hyderabad SRE team. They will be responsible for recruiting and developing the team and the services it delivers.
Duties include performance reviews, objective setting, work prioritization and overseeing all staff activity, including tasks, projects and goals. In addition, this role establishes career paths and implements training programs as required. This position is responsible for all incidents and escalations pertaining to the SRE team and for the associated processes and workflows, with particular focus on maintaining the performance and availability of the supported environments. The candidate will participate in the continued development and execution of SRE management processes, including Incident, Problem, Configuration and Change management. The role is accountable for effectively onboarding and preparing all SRE engineers for new technology, systems and automation that will be supported or used by the team and by the wider Site Reliability Engineering team in the application support and development areas. The successful candidate will bring a DevOps mentality to the teams and will drive the team towards automated solutions to manual and repetitive tasks.

Process and Procedures: The successful candidate will ensure appropriate rigor, discipline, consistency and predictability are applied across the entire organization with respect to how changes are scheduled, executed and measured. They will analyze current procedures and processes and drive continuous improvement efforts to ensure the SRE team provides a quality service across all functional areas. This will include setting up and continuously monitoring KPIs and metrics pertaining to individual and team performance. They will also provide documentation and training to internal departments to facilitate day-to-day operations throughout the company, and define, share and deliver insightful analysis across all metrics for the Operations teams.

Required Qualifications

- Highly experienced in hands-on operations in a technical setting, with responsibility for personnel management and for managing operations, and a thorough understanding of operational process, the ServiceNow application, underlying technology and development process
- Strong understanding or experience of operating in Cloud Operations as it pertains to Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)
- Strong working knowledge of operating in a follow-the-sun operational model, including geographic knowledge, talent acquisition, cultural dynamics, and cross-shift handovers and communications
- Comprehensive knowledge of principles, methods and techniques used across ITIL processes, preferably ITIL v3
- Outstanding communication skills, both written and verbal, and very strong interpersonal skills
- A working understanding of the technology associated with operating a service or platform in the cloud, including datacenter, networking, application and relational databases
- Attention to detail and the ability to work independently and lead a team
- Bachelor's degree in Computer Science, Information Systems or an equivalent technical discipline, or similar work experience
- Director-level experience within a Site Reliability and/or DevOps environment would be highly advantageous
- Strong problem-solving and analytical skills with an aptitude for learning new technologies

Desired skills:

- Experience with the ServiceNow platform, including scripting, tuning and troubleshooting, is highly preferred
- Scripting: basic shell scripting, Python, JavaScript
- Solid understanding of and experience with databases, mainly MySQL/MariaDB (and Oracle)
- Solid understanding of schemas, table spaces, indexing and database performance optimization
- Familiarity with or knowledge of the CentOS/RedHat operating system (admin/root level)
- Demonstrated experience leading individual projects, including scheduling and reporting

Work Environment,

Keyskills :
java, android, shell scripting, problem solving, analytical skills, datacenter


