Job Location: Pune
Education: Not Mentioned
Salary: Not Disclosed
Industry: IT - Software
Functional Area: DBA / Data Warehousing
Employment Type: Full-time
Technical / Process Skills:
- Work with Apache Spark, HDFS, AWS EMR, Spark Streaming, GraphX, MLlib, Cassandra, Elasticsearch, YARN, Hadoop, Hive, AWS cloud services, and SQL.
- Work with machine learning / deep learning libraries (MLlib, TensorFlow, PyTorch) to implement solutions that solve or automate real-world tasks such as prediction, image processing, object detection, natural language processing, anomaly detection, and text-to-speech.
- Build compact models that can run on edge devices such as IoT hardware, performing edge computing and serving predictions locally.
- Design, implement, and automate the deployment of distributed systems for collecting and processing large data sources.
- Write ETL and ELT jobs, as well as Spark/Hadoop jobs, to perform computation on large-scale datasets.
- Design streaming applications using Apache Spark and Apache Kafka for real-time computation.
- Design complex data models and schemas for structured and semi-structured datasets in SQL and NoSQL environments.
- Deploy and test solutions on cloud platforms such as Amazon EMR, Google Dataproc, and Google Cloud Dataflow.
- Explore and analyze data using visualization tools such as Tableau and Qlik.
- Write unit tests, perform code reviews, and collaborate with the team to implement best coding practices.
- Explore big data technologies to design new product architectures and build POCs for them.

Requirements:
- Proficiency in at least one of Scala, Java, or Python.
- Minimum 1-2 years of experience with Apache Spark.
- Experience working with streaming environments (Spark Streaming / Flink).
- Experience with the Hadoop ecosystem (Hadoop MapReduce, HDFS, Pig, Sqoop, Impala, Hive, Presto).
- Good experience using the Spark and Hadoop frameworks on Amazon EMR.
- Strong knowledge of data modelling and design principles in SQL and NoSQL environments.
- Strong experience building ELT and ETL pipelines and their components.
- Experience or familiarity with visualization tools such as Tableau, Qlik, or Grafana.
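The streaming work described above would in practice use Spark Structured Streaming with a Kafka source; purely as an illustration of the underlying idea (a tumbling-window aggregation over a stream of events), here is a minimal stdlib-Python sketch. The function name and event format are hypothetical, not part of this posting, and real Spark/Kafka code would look quite different:

```python
from collections import defaultdict

def windowed_counts(events, window_seconds=60):
    """Group (timestamp, key) events into fixed tumbling windows and
    count occurrences per key -- the same shape of computation that a
    Spark Structured Streaming window aggregation performs at scale."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Align each event's timestamp to the start of its window.
        window_start = ts - (ts % window_seconds)
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

events = [(0, "click"), (10, "view"), (61, "click"), (65, "click")]
print(windowed_counts(events))
# {0: {'click': 1, 'view': 1}, 60: {'click': 2}}
```

In a real streaming engine the windows would be emitted incrementally as watermarks advance, rather than computed over a finished list.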
- Strong experience developing REST APIs and consuming data from external web APIs.
- Comfortable with source control systems (GitHub) and Linux environments.
- Experience with machine learning / deep learning platforms (Spark ML, scikit-learn, DL4J, TensorFlow).
- Experience processing text with natural language processing libraries such as CoreNLP, OpenNLP, or spaCy.
- Experience with interactive notebooks such as Jupyter and Zeppelin.
- Experience with infrastructure tools such as Docker, Kubernetes, and Mesos.
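The ETL/ELT pipeline work the posting asks for follows an extract-transform-load shape regardless of engine; as a hedged, stdlib-only sketch of that shape (the column names and cleaning rules here are invented for illustration, and a production job would use Spark rather than plain Python):

```python
import csv
import io
import json

def extract(csv_text):
    """Extract: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: normalise names, cast amounts, drop incomplete rows."""
    return [
        {"name": r["name"].strip().title(), "amount": float(r["amount"])}
        for r in rows
        if r["amount"]  # skip rows with a missing amount
    ]

def load(rows):
    """Load: serialise to JSON lines, standing in for a real sink
    (a warehouse table, HDFS path, or Kafka topic in practice)."""
    return "\n".join(json.dumps(r) for r in rows)

raw = "name,amount\n alice ,10.5\nBOB,2\ncarol,\n"
print(load(transform(extract(raw))))
```

The same three stages map directly onto a Spark job: a source reader, a chain of DataFrame transformations, and a writer.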
Keyskills:
Hadoop, Java, Hive, serexperience, Apache Kafka, data modeling, social media, control system, big data, data models, Apache Spark, machine learning, deep learning, Apache web server