Strategy & Analytics
AI & Data
In this age of disruption, organizations need to navigate the future with confidence, making clear, data-driven decisions that deliver enterprise value in a dynamic business environment.
The AI & Data team leverages the power of data, analytics, robotics, science and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. Together with the Strategy practice, our Strategy & Analytics portfolio helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets.
AI & Data will work with our clients to:
- Implement large-scale data ecosystems including data management, governance and the integration of structured and unstructured data to generate insights leveraging cloud-based platforms
- Leverage automation, cognitive and science-based techniques to manage data, predict scenarios and prescribe actions
- Drive operational efficiency by maintaining their data ecosystems, sourcing analytics expertise and providing As-a-Service offerings for continuous insights and improvements
Education and Experience
• Education: B.Tech/M.Tech/MCA/MS
• 3-9 years of experience
PySpark
• Excellent knowledge of Apache Spark and strong Python programming experience
• Deep technical understanding of distributed computing and broader awareness of different Spark versions
• Strong UNIX operating system concepts and shell scripting knowledge
• Hands-on experience using Spark & Python
• Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging datasets, performing data enrichment, and loading into target data destinations (a minimal sketch follows this list)
• Experience in deploying and operationalizing code; knowledge of scheduling tools such as Airflow, Control-M, etc. is preferred
• Working experience with the AWS ecosystem, Google Cloud, BigQuery, etc. is an added advantage
• Hands-on experience with AWS S3 filesystem operations
• Good knowledge of Hadoop, Hive, and Cloudera/Hortonworks Data Platform
• Should have experience in Spark-related performance tuning
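For context, a minimal PySpark sketch of the read-merge-enrich-load pattern described above; the file paths, dataset names, and columns are illustrative assumptions, not drawn from any actual engagement.

```python
# Minimal PySpark sketch: read from external sources, merge, enrich, and load.
# Paths, column names, and the orders/customers datasets are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("enrich_and_load").getOrCreate()

# Read from external sources (CSV and Parquet assumed here)
orders = (spark.read.option("header", True).option("inferSchema", True)
          .csv("s3a://example-bucket/raw/orders/"))
customers = spark.read.parquet("s3a://example-bucket/raw/customers/")

# Merge and enrich: join, derive a column, drop malformed rows
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("order_value", F.col("quantity") * F.col("unit_price"))
          .filter(F.col("order_id").isNotNull())
)

# Load into the target destination, partitioned for downstream consumption
enriched.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/orders_enriched/"
)
```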
Python Developer
• Excellent knowledge of the Python programming language, along with knowledge of at least one Python web framework (Django, Flask, FastAPI, Pyramid)
• Extensive experience with Pandas/NumPy DataFrames, slicing, data wrangling, and aggregations.
• Lambda functions and decorators.
• Vectorized operations on Pandas DataFrames/Series.
• Application of the applymap, apply, and map functions (see the sketch after this list).
• Understanding of how to choose a framework based on specific needs and requirements.
• Understanding of Python's threading limitations and multi-process architectures
• Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3
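As a rough illustration of the Pandas items above (vectorized operations, lambda functions, and the map/apply/applymap family), a small self-contained sketch; the DataFrame contents are invented for the example.

```python
# Pandas sketch: vectorized column math vs. map/apply/applymap.
# The tiny DataFrame below is invented purely for illustration.
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, np.nan], "qty": [1, 2, 3]})

# Vectorized operation on whole columns (preferred where possible)
df["total"] = df["price"] * df["qty"]

# Series.map with a lambda: element-wise transform of a single column
df["qty_label"] = df["qty"].map(lambda q: "bulk" if q >= 2 else "single")

# DataFrame.apply along rows (axis=1): combine several columns per row
df["summary"] = df.apply(lambda r: f"{r['qty']} x {r['price']}", axis=1)

# DataFrame.applymap: element-wise over a whole (numeric) frame
rounded = df[["price", "total"]].applymap(
    lambda x: round(x, 1) if pd.notna(x) else x
)
print(rounded)
```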
Amazon Web Services
• A minimum of 3 years of experience in Cloud Operations
• High degree of knowledge using AWS services such as Lambda, Glue, S3, Redshift, SNS, SQS, and more (a brief sketch follows this list).
• Strong scripting experience with Python, the ability to write SQL queries, and strong analytical skills.
• Experience working on CI/CD/DevOps is nice to have.
• Proven experience with agile/iterative methodologies implementing Cloud projects.
• Ability to translate business requirements and technical requirements into technical design.
• Good knowledge of end-to-end project delivery methodologies for implementing cloud projects.
• Strong UNIX operating system concepts and shell scripting knowledge
• Good knowledge of cloud computing technologies and current computing trends.
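By way of illustration only, a hedged boto3 sketch tying together a few of the services listed above (Lambda, S3, SQS); the queue URL and the event shape assume a standard S3 put-event trigger and are not drawn from any specific project.

```python
# Hypothetical AWS Lambda handler: read the triggering S3 object and
# forward a summary message to SQS. Queue URL and event shape are assumptions.
import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # hypothetical

def handler(event, context):
    # Assumes an S3 put-event trigger; process the first record
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Read the object body from S3
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Publish a small summary message to SQS for downstream processing
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"bucket": bucket, "key": key, "bytes": len(body)}),
    )
    return {"statusCode": 200}
```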
Data Scientist
• Experience in descriptive & predictive analytics
• Education: Bachelor's/Master's in a quantitative field. Analytics certificate programs from a premier institute will be preferred.
• Should have hands-on experience implementing & executing Data Science projects throughout the entire lifecycle
• Expertise in sentiment analysis, entity extraction, document classification, Natural Language Processing (NLP) & Natural Language Generation (NLG)
• Strong understanding of text pre-processing & normalization techniques such as tokenization (a brief sketch follows this list)
• Strong knowledge of Python is a must
• Strong SQL querying skills & data processing using Spark
• Hands on experience in data mining with Spark
• Strong expertise in any commercial data visualization tool such as Tableau, Qlik, Spotfire, etc.
• Good understanding of Hadoop ecosystem
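To make the text-analytics expectations concrete, a minimal scikit-learn sketch of normalization plus document classification; the toy documents, labels, and the choice of TF-IDF with logistic regression are assumptions for illustration, not a prescribed approach.

```python
# Minimal sketch: simple text normalization + document classification.
# The toy documents and labels below are invented for illustration.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def normalize(text: str) -> str:
    # Lowercase and strip punctuation as a basic pre-processing step
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())

docs = [
    "The product arrived late and support was unhelpful.",
    "Great service, the team resolved my issue quickly.",
    "Terrible experience, I want a refund.",
    "Fantastic quality, will definitely buy again.",
]
labels = ["negative", "positive", "negative", "positive"]

# TfidfVectorizer handles tokenization and term weighting;
# LogisticRegression performs the classification.
model = make_pipeline(TfidfVectorizer(preprocessor=normalize), LogisticRegression())
model.fit(docs, labels)

print(model.predict(["support never replied to my emails"]))
```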
PL/SQL
• Develop, test, and maintain PL/SQL scripts, stored procedures, triggers, and functions.
• Optimize and tune PL/SQL code for performance improvements.
• Provide technical support for PL/SQL-related issues, including troubleshooting and resolving database problems.
• Assist end-users and other IT staff with PL/SQL-related inquiries and issues.
• Monitor database performance and implement necessary changes to improve efficiency.
• Perform regular database maintenance tasks, such as backups, restores, and updates.
• Manage database security by implementing and maintaining access controls.
• Collaborate with database administrators to ensure optimal database performance and availability.
• Create and maintain comprehensive documentation for PL/SQL scripts, procedures, and database configurations.
• Generate reports and provide analysis based on database queries and data extraction.
• Design and implement data migration and integration processes using PL/SQL.
• Utilize version control systems to manage PL/SQL code changes and ensure proper versioning.
• Ensure that all PL/SQL code adheres to company standards and complies with industry best practices and regulatory requirements.
Azure Databricks
• Experience in data engineering with a proven track record in using Databricks on Azure
• Strong knowledge of Python, SQL, and PySpark; Scala is optional
• Experience with cloud services such as cloud databases, ADLS Gen2 storage accounts, Azure Key Vault, Cosmos DB, Azure Data Factory, and Azure Synapse is a plus
• Experience in building metadata-driven ingestion and data quality (DQ) frameworks using PySpark
• Strong understanding of Lakehouse, Apache Spark, Delta Lake, and other big data technologies.
• Experience working with data toolsets, including data warehouses, data marts, data lakes, 3NF, and dimensional models
• Experience in building pipelines using Delta Live Tables, Auto Loader, and Databricks Workflows for orchestration (a brief sketch follows this list). Experience with Apache Airflow is a plus.
• Experience with Databricks Unity Catalog is a plus.
• Experience in performance optimization in Databricks/Apache Spark
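As one hedged example of the ingestion pattern referenced above, a short Auto Loader stream writing to a Delta table; the storage paths, checkpoint location, and target table name are placeholders, and the snippet assumes it runs in a Databricks notebook where `spark` is predefined.

```python
# Sketch of a Databricks Auto Loader (cloudFiles) stream landing data in Delta.
# Paths, checkpoint, and table name are hypothetical; `spark` is the
# SparkSession provided by the Databricks notebook environment.
from pyspark.sql import functions as F

raw_path = "abfss://landing@examplestore.dfs.core.windows.net/events/"     # placeholder
checkpoint = "abfss://meta@examplestore.dfs.core.windows.net/chk/events/"  # placeholder

stream = (
    spark.readStream.format("cloudFiles")                  # Auto Loader source
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", checkpoint)  # schema tracking
         .load(raw_path)
         .withColumn("ingest_ts", F.current_timestamp())
)

(stream.writeStream.format("delta")
       .option("checkpointLocation", checkpoint)
       .trigger(availableNow=True)                         # incremental batch run
       .toTable("bronze.events"))
```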
Full Stack – Data Engineer
• Excellent knowledge of Apache Spark and strong Python programming experience.
• Deep technical understanding of distributed computing and broader awareness of different Spark versions
• Strong UNIX operating system concepts and shell scripting knowledge
• Hands-on experience using Spark & Python
• Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging datasets, performing data enrichment, and loading into target data destinations.
• Externally certified in one of the cloud services (foundational or advanced)- (AWS, GCP, Azure, Snowflake, Databricks)
• Experience in deploying and operationalizing code; knowledge of scheduling tools such as Airflow, Control-M, etc. is preferred.
• Experience in creating visualizations in Tableau, Power BI, Qlik, Looker, or other reporting tools
Oracle Analytics Cloud (OAC)
• Excellent development skills utilizing all aspects of OAC
• Strong knowledge of RPD, OAC reports, Data Visualizations, SQL
• Experience with one or more relational databases on Oracle
• Exposure to Oracle cloud migrations and OBIEE upgrades
• Experience in developing applications on Oracle Cloud Infrastructure
• Experience working with the ODI tool
• Fair understanding of the agile development process
• Exposure to Oracle Cloud scripts using cURL
• Excellent understanding of the Oracle Cloud platform and services
Oracle Data Integrator (ODI)
• Expertise in the Oracle ODI toolset and Oracle PL/SQL
• Minimum of 2-3 end-to-end DWH implementations
• Should have experience in developing ETL processes - ETL control tables, error logging, auditing, data quality, etc. Should be able to implement reusability, parameterization, workflow design, etc.
• Design and develop complex mappings, Process Flows and ETL scripts
• Must be well versed and hands-on in using and customizing Knowledge Modules (KM)
• Setting up topology, building objects in Designer, monitoring Operator, different types of KMs, Agents, etc.
• Packaging components, database operations such as aggregate, pivot, union, etc.
• Using ODI mappings, error handling, automation using ODI, load plans, and migration of objects
• Integrate ODI with multiple Source / Target
• Experience in Data Migration using SQL loader, import/export
2ndInnings