Artificial Intelligence & Engineering
AI & Engineering leverages cutting-edge engineering capabilities to help build, deploy, and operate integrated/verticalized sector solutions in software, data, artificial intelligence (AI), network, and hybrid cloud infrastructure. These solutions are powered by engineering for business advantage, helping transform mission-critical operations.
Join our AI & Engineering team to help transform technology platforms, drive innovation, and make a significant impact on our clients’ achievements. You’ll work alongside talented professionals reimagining and re-engineering operations and processes that are critical to businesses.
Position Summary
Level: Consultant
As an experienced Consultant at Deloitte Consulting, you will be responsible for independently delivering high-quality work products within agreed timelines. As needed, you will mentor and/or direct junior team members and liaise with onsite/offshore teams to understand functional requirements.
Work you’ll do
As an Observability Engineer, you will play a critical role in strengthening end-to-end visibility, reliability, and performance across cloud-native platforms. You will lead the migration from the Elastic Stack (ELK) to Amazon OpenSearch Service on Amazon Web Services (AWS), ensuring a seamless transition of logging pipelines, dashboards, alerting strategies, and historical data retention. This includes redesigning ingestion flows, optimizing index structures, and improving query performance to enable scalable, real-time observability.
You will design and implement comprehensive observability solutions that help teams understand system behavior, accelerate troubleshooting, and make data-driven decisions. You’ll build and maintain robust monitoring, logging, and alerting frameworks using modern tools and best practices. This includes standardizing log formats, developing Grok patterns, defining index lifecycle management (ILM) policies, and ensuring efficient storage and search strategies that support both operational and business metrics.
You will also automate infrastructure provisioning and deployment pipelines using continuous integration/continuous delivery (CI/CD) practices to enable consistent, repeatable, and secure releases across environments. This work extends to Kubernetes and containerized workloads, where you will enhance visibility into cluster health, application performance, and distributed service interactions.
Collaboration is central to the role. You will work closely with development, DevOps, site reliability engineering (SRE), and cloud engineering teams to troubleshoot issues, optimize performance, and embed observability as a core engineering practice. You will mentor junior engineers and promote best-in-class standards, fostering proactive monitoring, fast incident response, and continuous improvement.
The team
At Hybrid Cloud Infrastructure, we deliver solutions spanning Hybrid Cloud, Advanced Connectivity, AI Data Centers, High-Performance Computing, and AI Infrastructure to help clients achieve their desired outcomes. Our offerings include engineered transformation services for hybrid cloud infrastructure and platforms, prioritizing resiliency, optimization, and extensive automation. We integrate advanced connectivity with AI infrastructure and enterprise networks to boost operational efficiency and enable real-time data processing. Additionally, we provide comprehensive management of all facets of operations for hybrid cloud infrastructure and field operations.
Qualifications
Must Have Skills / Project Experience / Certifications
- 3–6 years of hands-on experience with OpenSearch and ELK.
- Strong administration experience with Elasticsearch, Logstash, Kibana, and Beats (Filebeat/Metricbeat).
- Strong experience with OpenSearch alerting, deployment models, and migration strategies.
- Experience migrating and creating visualizations, dashboards, rules, alerts, and triggers using GitHub Actions pipelines.
- Experience automating snapshot backups and restores into OpenSearch using pipelines.
- Expertise in log management, Grok patterns, ILM policies, and Query DSL.
- Proficiency with Kubernetes and Docker.
- Experience with AWS services including EC2, S3, RDS, and Lambda.
- Strong understanding of Identity and Access Management (IAM), Virtual Private Cloud (VPC), and cloud security best practices.
- CI/CD pipeline development using GitHub Actions.
- Proven leadership in large-scale infrastructure migrations.
- Strong communication and cross-team collaboration skills.
- Ability to work effectively in fast-paced, dynamic environments.
Good to Have Skills / Project Experience / Certifications
- Experience with application performance monitoring (APM) and metrics monitoring tools (e.g., Datadog, New Relic, Prometheus).
- SRE/DevOps mindset with incident management exposure.
- Knowledge of Infrastructure as Code (IaC) using Terraform and/or AWS CloudFormation.
- Experience with Nginx (reverse proxying, load balancing, performance tuning).
- Knowledge of WebLogic administration, deployment automation, and troubleshooting.
- Familiarity with service-oriented architecture (SOA) concepts, integration patterns, and legacy modernization.
- Exposure to Microsoft Azure services (monitoring, logging, infrastructure components).
- Working knowledge of Oracle Database performance tuning, query optimization, and availability best practices.
- Networking fundamentals (DNS, routing, firewalls, load balancers).
- Experience with incident and change management aligned to ITIL practices.
- ELK Certification.
Education
- B.E./B.Tech/M.C.A./M.Sc. (CS) degree or equivalent from an accredited university.
Location
- Bengaluru / Hyderabad / Pune / Chennai
Shift Timings
- 24x7