Data Engineering – Software Engineer II
Join Deloitte’s Enterprise Performance practice as a Software Engineer II, Data Engineering, where you will support the design, development, and maintenance of data pipelines and backend data systems across cloud and on-premises platforms. In this role, you will help deliver reliable, well-governed data solutions that support analytics, reporting, and Generative AI (GenAI) and large language model (LLM) use cases. You will work with cross-functional stakeholders to translate business and technical requirements into scalable data engineering solutions that deliver enterprise-ready data.
Work you'll do
As a Software Engineer II, Data Engineering, on the Enterprise Performance team, you will design and support data engineering solutions that enable analytics, reporting, and GenAI applications.
• Design, develop, and maintain data pipelines and backend systems across cloud and on-premises environments.
• Build and support extract, transform, load (ETL) and extract, load, transform (ELT) workflows using Python, Amazon Web Services (AWS) Lambda, AWS Glue, Apache Airflow, and batch or streaming frameworks.
• Develop data processing solutions for GenAI and LLM use cases, including data cleaning, transformation, embedding workflows, vector store ingestion, and retrieval-augmented generation (RAG) pipeline enablement.
• Administer and optimize databases, schemas, queries, and stored procedures across Oracle, PostgreSQL, MySQL, Microsoft SQL Server, Amazon Redshift, Google BigQuery, and Snowflake environments.
• Implement validation, logging, anomaly detection, and documentation practices to support data quality, operational reliability, and maintainability.
The team
The Enterprise Performance team leverages deep industry knowledge, strong analytical skills, and practical approaches to address clients’ toughest business challenges. Our professionals in the Supply Chain Network Operations (SCNO) service line within Enterprise Performance help organizations achieve sustainable competitive advantage throughout their operations, spanning product development, planning, sourcing, manufacturing, logistics, and distribution. We excel at translating strategic objectives into tangible, measurable outcomes at the operational level. By aligning high-level goals with frontline execution, we ensure our clients realize real value and improved performance across every stage of their supply chain and operations.
Location: Bengaluru / Hyderabad / Pune / Chennai
Shift Timings: 11 AM to 8 PM or 2 PM to 11 PM IST, as per business requirements
Qualifications
Required:
• Bachelor of Engineering, Bachelor of Technology, Master of Computer Applications, Master of Technology, or equivalent degree from an accredited university
• 4-7 years of experience in data engineering, Python, and database administration with SQL or PL/SQL
• Experience developing data pipelines using AWS Lambda, AWS Glue, Apache Airflow, or batch and streaming frameworks
• Experience designing data processing workflows for Generative AI (GenAI) or large language model (LLM) applications, including data cleaning, embedding workflows, vector store ingestion, or retrieval-augmented generation (RAG)
• Experience administering or optimizing Oracle, PostgreSQL, MySQL, Microsoft SQL Server, Amazon Redshift, Google BigQuery, or Snowflake environments
• Experience developing Python-based parsing or transformation logic and SQL or PL/SQL-based data solutions
• Experience implementing data validation, parameterized queries, logging, anomaly detection, or technical documentation in data engineering workflows
Preferred:
• Experience with Pinecone, FAISS, OpenSearch, or similar vector database technologies
• Experience with PySpark for distributed data processing
• Experience supporting cloud-based data engineering solutions on AWS, Google Cloud Platform (GCP), or Microsoft Azure
• Experience migrating schemas, queries, or data pipelines across database platforms
• Experience supporting analytics or reporting use cases through governed enterprise data pipelines
• Experience troubleshooting performance issues across data pipelines, orchestration tools, and database platforms