AI & Data
In this age of disruption, organizations need to navigate the future with confidence, making clear, data-driven decisions that deliver enterprise value in a dynamic business environment.
The AI & Data team leverages the power of data, analytics, robotics, science and cognitive technologies to uncover hidden relationships in vast troves of data, generate insights, and inform decision-making. Together, the offering portfolio helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets.
AI & Data will work with our clients to:
- Implement large-scale data ecosystems including data management, governance and the integration of structured and unstructured data to generate insights leveraging cloud-based platforms
- Leverage automation, cognitive and science-based techniques to manage data, predict scenarios and prescribe actions
- Drive operational efficiency by maintaining their data ecosystems, sourcing analytics expertise and providing As-a-Service offerings for continuous insights and improvements
Work you’ll do
• Translate functional requirements into technical design
• Recommend design alternatives for data ingestion, processing and provisioning layers
• Design and develop data ingestion programs to process large data sets in batch mode using Hive, Pig and Sqoop
• Design and develop data integration programs using commercial ETL tools such as Informatica, DataStage and SnapLogic
• Design and develop data integration programs using open source and open standard ETL tools such as Talend and Pentaho Kettle
• Develop data ingestion programs to ingest real-time data from live sources using Apache Kafka, Spark Streaming and related technologies
• Work in large teams developing and delivering solutions to support large-scale data management platforms, following an Agile methodology
• Monitor data ingestion processes end to end and optimize overall data processing lead times
• Develop test scenarios and test scripts to validate data loaded into the Hadoop platform
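To illustrate the real-time ingestion work described above: a streaming job (whether built on Spark Streaming or a Kafka consumer) typically groups live events into fixed time windows and aggregates within each window. The sketch below is a minimal, dependency-free Python illustration of that tumbling-window pattern; the event data and function name are hypothetical, not part of any specific platform.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, key) events into fixed-size tumbling windows
    and count occurrences per key -- the kind of continuous aggregation
    a Spark Streaming or Kafka consumer job performs on live data."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Align each event to the start of its window.
        window_start = ts - (ts % window_seconds)
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# Simulated click-stream events: (epoch seconds, event type)
events = [
    (100, "click"), (103, "view"), (104, "click"),
    (110, "click"), (117, "view"), (121, "click"),
]

print(tumbling_window_counts(events, 10))
# Events fall into 10-second windows starting at 100, 110, and 120.
```

In a production pipeline the same grouping would be expressed declaratively (for example, a windowed aggregation over a Kafka source), but the windowing logic is the same.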
Qualifications
Required:
• 6–9 years of technology consulting experience
• A minimum of 2 years of experience designing and developing big data solutions on the Hadoop platform
• Ability to translate business requirements and technical requirements into technical design
• Good knowledge of end-to-end project delivery methodology for implementing big data projects
• Deep technical understanding of distributed computing and broader awareness of different Hadoop distributions
• Experience designing solution architectures for different use cases using Hadoop and ecosystem tools
• Strong knowledge of UNIX operating system concepts and shell scripting
• Hands-on experience using Hadoop (preferably Hadoop 2 with YARN), MapReduce, R, Pig, Hive, Sqoop, and HBase
• Exposure to search tools such as Elasticsearch and Lucene
• Extensive experience in object-oriented programming in Java, including optimizing memory usage and JVM configuration in distributed programming environments
• Exposure to metadata management techniques within Hadoop technology architecture
• Ability to operate independently with clear focus on schedule and outcomes
• Proficient with algorithm development, including statistical and probabilistic analysis, clustering, recommendation systems, natural language processing, and performance analysis
• Understanding of machine learning frameworks such as Apache Mahout and data mining algorithms such as Bayesian classification and clustering
• Experience with building APIs for provisioning data to downstream systems by leveraging different frameworks
Preferred:
• Production experience with Apache Spark (Spark SQL and Spark Streaming) or Apache Storm
• Exposure to different NoSQL databases within the Hadoop ecosystem
• Exposure to public, private, and hybrid cloud platforms such as AWS, Azure and Google Cloud