AI & Data
In this age of disruption, organizations need to navigate the future with confidence, making clear, data-driven decisions that deliver enterprise value in a dynamic business environment.
The AI & Data team leverages the power of data, analytics, robotics, science and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. The offering portfolio helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets.
AI & Data will work with our clients to:
- Implement large-scale data ecosystems including data management, governance and the integration of structured and unstructured data to generate insights leveraging cloud-based platforms
- Leverage automation, cognitive and science-based techniques to manage data, predict scenarios and prescribe actions
- Drive operational efficiency by maintaining their data ecosystems, sourcing analytics expertise and providing As-a-Service offerings for continuous insights and improvements
PySpark Sr. Consultant
The position is suited for individuals who have demonstrated the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
Education and Experience
- Education: B.Tech/M.Tech/MCA/MS
- Experience: 6-9 years of experience in designing and implementing the migration of enterprise legacy systems to a big data ecosystem for data warehousing projects.
Required Skills:
- Must have excellent knowledge of Apache Spark and strong Python programming experience
- Deep technical understanding of distributed computing and broad awareness of differences across Spark versions
- Strong UNIX operating system concepts and shell scripting knowledge
- Hands-on experience using Spark & Python
- Deep experience developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations (see the first sketch after this list)
- Experience deploying and operationalizing code; knowledge of scheduling tools such as Airflow, Control-M, etc. is preferred
- Working experience with the AWS ecosystem, Google Cloud, BigQuery, etc. is an added advantage
- Hands-on experience with AWS S3 filesystem operations
- Good knowledge of Hadoop, Hive, and the Cloudera/Hortonworks Data Platform
- Should have exposure to Jenkins or an equivalent CI/CD tool and a Git repository
- Experience handling change data capture (CDC) operations on large data volumes (a common deduplication pattern is sketched after this list)
- Should understand and have working experience with the Agile delivery model
- Should have experience with Spark-related performance tuning (a tuning sketch follows this list)
- Should be well versed in design documents such as high-level designs (HLD) and technical design documents (TDD)
- Should be well versed in historical data loads and overall framework concepts
- Should have participated in different kinds of testing, such as unit testing, system testing, and user acceptance testing
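To make the PySpark requirement concrete, here is a minimal sketch of a read-merge-enrich-load task. All bucket names, paths, and column names are hypothetical; a real job would add schema validation and error handling.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-enrichment").getOrCreate()

# Read data from external sources (bucket names and layouts are assumptions).
orders = spark.read.parquet("s3a://raw-bucket/orders/")
customers = spark.read.option("header", True).csv("s3a://raw-bucket/customers.csv")

# Merge and enrich: join orders to customer attributes and derive a value column.
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("order_value", F.col("quantity") * F.col("unit_price"))
)

# Load into the target destination, partitioned for downstream consumers.
enriched.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://curated-bucket/orders_enriched/"
)
```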
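For the CDC requirement, one common pattern is to deduplicate a change feed by keeping only the latest event per business key before applying it to the target. The feed layout, operation codes, and paths below are assumptions for illustration.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("cdc-apply").getOrCreate()

# Hypothetical CDC feed: one row per change event, with an op code (I/U/D)
# and a change timestamp.
changes = spark.read.parquet("s3a://raw-bucket/customer_changes/")

# Keep only the most recent event per key.
latest = Window.partitionBy("customer_id").orderBy(F.col("change_ts").desc())
current = (
    changes.withColumn("rn", F.row_number().over(latest))
           .filter(F.col("rn") == 1)
           .drop("rn")
)

# Drop deletes, then overwrite the current snapshot of the target table.
current.filter(F.col("op") != "D").write.mode("overwrite").parquet(
    "s3a://curated-bucket/customers_current/"
)
```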
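Spark performance tuning typically starts with shuffle partition sizing, broadcast joins for small dimension tables, and caching of reused datasets. The values below are illustrative rather than recommendations; appropriate settings depend on cluster size and data volume.

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("tuned-job")
    # Match shuffle parallelism to the data volume instead of the default 200.
    .config("spark.sql.shuffle.partitions", "400")
    # Adaptive query execution re-optimizes shuffles at runtime (Spark 3.x).
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

facts = spark.read.parquet("s3a://curated-bucket/orders_enriched/")
dims = spark.read.parquet("s3a://curated-bucket/dim_region/")

# Broadcast the small dimension table to avoid shuffling the large fact table.
joined = facts.join(F.broadcast(dims), on="region_id", how="left")

# Cache a dataset that multiple downstream actions will reuse.
joined.cache()
joined.groupBy("region_name").count().show()
```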
Preferred Skills:
- Exposure to PySpark, Cloudera/Hortonworks, Hadoop, and Hive.
- Exposure to AWS S3/EC2 and Apache Airflow (a minimal DAG sketch follows this list)
- Participation in client interactions/meetings is desirable.
- Participation in code-tuning is desirable.
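As a sketch of how such a job might be scheduled, the following Airflow DAG submits a PySpark script daily via spark-submit. The DAG id, schedule, and script path are assumptions, and operator imports vary across Airflow versions (this uses the Airflow 2.x core BashOperator).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical daily schedule for the enrichment job sketched earlier.
with DAG(
    dag_id="orders_enrichment_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_job = BashOperator(
        task_id="spark_submit_enrichment",
        bash_command=(
            "spark-submit --master yarn --deploy-mode cluster "
            "/opt/jobs/orders_enrichment.py"
        ),
    )
```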
Benefits to help you thrive
At Deloitte, we know that great people make a great organization. Our comprehensive rewards program helps us deliver a distinctly Deloitte experience that empowers our professionals to thrive mentally, physically, and financially—and live their purpose. To support our professionals and their loved ones, we offer a broad range of benefits. Eligibility requirements may be based on role, tenure, type of employment, and/or other criteria. Learn more about what working at Deloitte can mean for you.