AI & Data
In this age of disruption, organizations need to navigate the future with confidence, making clear, data-driven decisions that deliver enterprise value in a dynamic business environment.
The AI & Data team leverages the power of data, analytics, robotics, science and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. The offering portfolio helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets.
AI & Data will work with our clients to:
• Implement large-scale data ecosystems, including data management, governance, and the integration of structured and unstructured data, to generate insights leveraging cloud-based platforms
• Leverage automation, cognitive, and science-based techniques to manage data, predict scenarios, and prescribe actions
• Drive operational efficiency by maintaining their data ecosystems, sourcing analytics expertise, and providing As-a-Service offerings for continuous insights and improvements
PySpark Consultant
The position is suited for individuals with a demonstrated ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
Education and Experience
Education:
B.Tech/M.Tech/MCA/MS
Experience:
3-6 years of experience designing and implementing the migration of enterprise legacy systems to a big data ecosystem for data warehousing projects.
Required Skills
• Excellent knowledge of Apache Spark and strong Python programming experience
• Deep technical understanding of distributed computing and broader awareness of different Spark versions
• Strong UNIX operating system concepts and shell scripting knowledge
• Hands-on experience using Spark and Python
• Deep experience developing data processing tasks using PySpark, such as reading data from external sources, merging datasets, performing data enrichment, and loading into target data destinations (see the first sketch after this list)
• Experience deploying and operationalizing code; knowledge of scheduling tools such as Airflow, Control-M, etc. is preferred (see the scheduling sketch after this list)
• Working experience with the AWS ecosystem, Google Cloud, BigQuery, etc. is an added advantage
• Hands-on experience with AWS S3 filesystem operations
• Good knowledge of Hadoop, Hive, and the Cloudera/Hortonworks Data Platform
• Exposure to Jenkins or an equivalent CI/CD tool, and to Git repositories
• Experience handling CDC (change data capture) operations for huge volumes of data (see the CDC sketch after this list)
• Understanding of, and operating experience with, the Agile delivery model
• Experience in Spark-related performance tuning
• Well versed in design documents such as HLD, TDD, etc.
• Well versed in historical data loads and overall framework concepts
• Experience participating in different kinds of testing, such as unit testing, system testing, and user acceptance testing
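For illustration, the first sketch below shows the kind of PySpark data processing task described above: reading from external sources, merging datasets, enriching them, and loading into a target destination. This is a minimal sketch, not a reference implementation; the bucket paths, dataset names, and columns are assumptions.

```python
# A minimal sketch of the kind of PySpark task described above. All bucket
# paths, dataset names, and columns are hypothetical assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Read data from external sources (hypothetical S3 locations)
orders = spark.read.parquet("s3a://raw-bucket/orders/")
customers = spark.read.parquet("s3a://raw-bucket/customers/")

# Merge: join order records with customer attributes
merged = orders.join(customers, on="customer_id", how="left")

# Enrich: derive a partition column from the order date
enriched = merged.withColumn("order_year", F.year(F.col("order_date")))

# Load into the target destination, partitioned for downstream queries
enriched.write.mode("overwrite").partitionBy("order_year").parquet(
    "s3a://curated-bucket/orders_enriched/"
)
```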
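Operationalizing such a job with a scheduler could look like the following scheduling sketch, assuming Airflow 2.x with the Apache Spark provider installed; the DAG id, schedule, connection, and script path are hypothetical placeholders.

```python
# A minimal Airflow DAG sketch (assuming Airflow 2.x with the Apache Spark
# provider installed). DAG id, schedule, connection, and script path are
# hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_orders_etl",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # run once per day
    catchup=False,                  # do not backfill past runs
) as dag:
    run_etl = SparkSubmitOperator(
        task_id="run_pyspark_etl",
        application="/opt/jobs/orders_etl.py",  # hypothetical PySpark script
        conn_id="spark_default",                # Spark connection in Airflow
    )
```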
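For CDC, one common pattern is to collapse a stream of change records to the latest state per business key. The sketch below assumes each record carries an updated_at timestamp and an op flag (I/U/D); the paths and key column are likewise assumptions.

```python
# A minimal sketch of one common CDC pattern: collapse change records to the
# latest state per business key. Paths, the key column, the updated_at
# timestamp, and the op flag (I/U/D) are all assumptions.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cdc-sketch").getOrCreate()

# Hypothetical landing zone of change records
cdc_events = spark.read.parquet("s3a://raw-bucket/customer_cdc/")

# Rank records per key by recency and keep only the newest one
w = Window.partitionBy("customer_id").orderBy(F.col("updated_at").desc())
latest = (
    cdc_events.withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Drop keys whose latest operation is a delete, then load the target
latest.filter(F.col("op") != "D").write.mode("overwrite").parquet(
    "s3a://curated-bucket/customer_current/"
)
```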
Preferred Skills
• Exposure to PySpark, Cloudera/Hortonworks, Hadoop, and Hive
• Exposure to AWS S3/EC2 and Apache Airflow
• Participation in client interactions/meetings is desirable
• Participation in code tuning is desirable
Benefits to help you thrive
At Deloitte, we know that great people make a great organization. Our comprehensive rewards program helps us deliver a distinctly Deloitte experience that empowers our professionals to thrive mentally, physically, and financially—and live their purpose. To support our professionals and their loved ones, we offer a broad range of benefits. Eligibility requirements may be based on role, tenure, type of employment, and/or other criteria. Learn more about what working at Deloitte can mean for you.