hireejobs

Director, Site Reliability Engineering

15.00 to 20.00 Years   Hyderabad   18 Jul, 2019
Job Location: Hyderabad
Education: Not Mentioned
Salary: Not Disclosed
Industry: IT - Software
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

Engineering, Infrastructure and Operations
Hyderabad, India

Description
Director, Site Reliability Engineering

ServiceNow, The Enterprise IT Cloud Company, is the industry-leading cloud platform provider for building enterprise applications. We are redefining markets and changing the perception of enterprise software. Our cloud platform allows enterprise IT to bring together business strategy, application design and operations in a powerfully simple solution. To sustain our explosive growth, we are looking for drivers: people who thrive on responsibility and live for the next big challenge. We seek to employ the brightest and most forward-thinking talent on the planet, and we want to hire people who have their best work ahead of them, not behind them. Accelerate your career and succeed in an environment where you can make an impact daily. We invite you to join in to stand out.

Overview
ServiceNow is seeking an experienced individual to lead our Hyderabad SRE team, a team of software and system engineers with a broad set of expertise who resolve issues identified in our cloud. The SRE team operates 24/7/365, providing global coverage with a presence in the UK, Ireland, the US, India and Australia. It monitors the health of our cloud, providing value-added troubleshooting and outage management across all infrastructure and application issues. The team serves as the front line of infrastructure support through event monitoring, incident response and mitigation, providing world-class service to our cloud customer base. The team is responsible for the continued high availability and reliability of our SaaS platform and all infrastructure elements that support the environment. The candidate will be responsible for developing requirements, workflows, tools, automations and communications pertaining to the SRE team.
Candidates must maintain professional communications with external teams to ensure timely restoration of services and the continuous improvement of the services delivered by the SRE team and by ServiceNow. The candidate will be responsible for clearly articulating SRE strategy across the business and driving it forward, particularly in the following areas:
- Onboard complex technical availability issues across the DB, systems and application stack, and drive automated solutions to remove the need for manual intervention
- Build a world-class group of engineers capable of driving forward the availability and reliability of our cloud
- Drive monitoring of all systems to a proactive position
- Own the ongoing investigation of reliability and availability issues and work with partner teams to remove persistent issues
- Drive a culture of continuous improvement
- Build a sustainable process framework
- Build a culture of continuous learning within the team

Reporting to the Senior Director of Site Reliability Engineering, the successful candidate ensures that terabytes of data and all cloud services are highly available 24x7, reliable and scalable for our rapidly growing cloud services. The SRE Director provides the SRE staff with direct management and incident leadership, including prioritization of all efforts related to projects, tasks and goals. In addition, the Director will lead continuous improvement activity and drive the continuing development of the team and the individuals within it. The SRE team in Hyderabad is a new initiative, so the Director will be responsible for driving the maturity of the team and its services to ensure a sustainable approach.

Responsibilities
Team Management: The successful candidate will directly manage the Hyderabad SRE team. They will be responsible for recruiting and developing the team and the services it delivers.
Duties include performance reviews, objective setting, work prioritization and overseeing all staff activity, including tasks, projects and goals. In addition, this role establishes career paths and implements training programs as required.

This position is responsible for all incidents and escalations as they pertain to the SRE team, and for the associated processes and workflows, with particular focus on maintaining the performance and availability of the supported environments. The candidate will participate in the continued development and execution of SRE management processes, including Incident, Problem, Configuration and Change management. The role is accountable for effectively onboarding and preparing all SRE engineers for new technology, systems and automation that will be supported or used by the team and by the wider Site Reliability Engineering team in the application support and development areas.

The successful candidate will bring a DevOps mentality to the teams and will drive the team towards automated solutions to manual and repetitive tasks.

Process and Procedures: The successful candidate will ensure appropriate rigor, discipline, consistency and predictability are applied across the entire organization with respect to how changes are scheduled, executed and measured. They will analyze current procedures and processes and drive continuous improvement efforts to ensure the SRE team provides a quality service across all functional areas. This will include setting up and continuously monitoring KPIs and metrics pertaining to individual and team performance. They will also provide documentation and training to internal departments to facilitate day-to-day operations throughout the company, and will define, share and deliver insightful analysis across all metrics for the Operations teams.

Required Qualifications
- Highly experienced in hands-on operations in a technical setting, with responsibility for personnel management and managing Operations, and a thorough understanding of operational process, the ServiceNow application, the underlying technology and the development process
- Strong understanding or experience of operating in Cloud Operations as it pertains to Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)
- Strong working knowledge of operating in a follow-the-sun operational model, including geographic knowledge, talent acquisition, cultural dynamics, and cross-shift handovers and communications
- Comprehensive knowledge of principles, methods and techniques used across ITIL processes, preferably ITIL v3
- Outstanding communication skills, both written and verbal, and very strong interpersonal skills
- A working understanding of the technology associated with operating a service or platform in the cloud, including datacenter, networking, application and relational databases
- Attention to detail and the ability to work independently and lead a team
- Bachelor's degree in Computer Science or Information Systems or an equivalent technical discipline, or similar work experience
- Director-level experience within a Site Reliability and/or DevOps environment would be highly advantageous
- Strong problem-solving and analytical skills with an aptitude for learning new technologies

Desired skills:
- Experience with the ServiceNow platform, scripting, tuning and troubleshooting is highly preferred
- Scripting: basic shell scripting, Python, JavaScript
- Solid understanding of and experience with databases, mainly MySQL/MariaDB (and Oracle)
- Solid understanding of schemas, table spaces, indexing and database performance optimization
- Familiarity with or knowledge of the CentOS/Red Hat operating system (admin/root level)
- Demonstrated experience leading individual projects, including scheduling and reporting

Work Environment,

Key skills:
Java, Android, shell scripting, problem solving, analytical skills, datacenter

© 2019 Hireejobs All Rights Reserved