Hey! I'm Dileep 👋

Thanks for stopping by—I'm really glad you're here.

A Little About My Journey

I grew up in India, dreaming about technology and what it could do. A few years back, I took the leap and moved to the United States to study Computer Engineering, and that decision changed everything. Today, I'm a Senior Data Engineer building cloud platforms that transform complex data into meaningful insights.

But there's more to my story than just work.

Who I Am Beyond Work

I believe life gets interesting when you stay curious about everything.

When I'm not working, you'll find me:

  • 📸 Behind my camera—capturing stories through photography and drone shots
  • 🏍️ On long rides—there's something freeing about the open road
  • 🥾 Hiking trails—nature has a way of putting everything in perspective
  • 📈 Analyzing patterns—markets, trends, how things work
  • ✈️ Traveling—new places, new perspectives, always learning

Each of these passions fuels my creativity and makes me better at what I do.

Let's Connect

Thanks for visiting. Feel free to explore and stay in touch.

Cheers,
Dileep

DILEEP KUMAR REDDY KAPU

Senior Data Engineer | Multi-Cloud Expert | Lakehouse & GenAI Platforms
📍 Albuquerque, NM — Open to Relocation
📱 +1 (505) 364-5197

Professional Summary

Senior Cloud/Data Engineer with 6+ years of experience designing and leading enterprise-scale data platforms in HIPAA-regulated healthcare environments. Expert in AWS, Azure, Snowflake, and distributed Spark architectures, with a track record of modernizing legacy systems, establishing governed lakehouse foundations and multi-cloud data platforms, and enabling AI-ready data ecosystems. Focused on building secure, scalable, and cost-optimized platforms that drive measurable business impact.

Professional Experience

Senior Data Engineer
New Mexico Health Care Authority — Santa Fe, NM
Jun 2025 – Present
  • Defined and led the implementation of a secure multi-cloud ingestion architecture bridging Azure clinical systems with AWS analytics, establishing a HIPAA-compliant platform serving 1M+ residents with 99.9% reliability
  • Standardized Terraform-based infrastructure patterns across networking, IAM, and data services to eliminate configuration drift, enforce security baselines, and enable repeatable, auditable multi-environment deployments
  • Designed a Snowflake-centric ELT architecture leveraging AWS Glue and Snowpipe to consolidate 20M+ fragmented health records, reducing query latency by 40% and enabling near real-time statewide analytics
  • Built CI/CD pipelines using GitHub Actions and Terraform to replace manual deployments, enabling zero-downtime schema evolution while improving release reliability and deployment velocity
  • Implemented platform-wide security controls using IAM, VPC isolation, KMS encryption, and CloudWatch observability to enforce least-privilege access, reducing audit exceptions by 35% and improving incident MTTR by 25%
  • Partnered with analytics leadership to define a GenAI readiness roadmap by establishing governed, metadata-rich data foundations supporting future RAG and LLM analytics in HIPAA-regulated environments

Senior Data Engineer (Contract)
Optum (Client: UnitedHealth Group) — Albuquerque, NM
Jun 2024 – May 2025
  • Architected and governed 25+ production-grade pipelines integrating 12+ disparate healthcare data domains, processing 3–5TB monthly to enable scalable enterprise analytics across claims, member, and provider ecosystems
  • Built Delta Lake–based ELT frameworks to address schema drift and incremental processing challenges, enabling reliable analytics consumption across 7+ downstream claims, member, and provider reporting teams
  • Orchestrated batch workflows using Airflow with idempotent DAGs and automated backfills to eliminate manual reruns, reducing operational intervention by 40% and improving data consistency across recurring workloads
  • Re-engineered Spark workloads through partition strategy redesign, caching optimization, and resource allocation tuning, reducing processing latency by 35% and accelerating new data source onboarding by 25%

Data Engineer
University of New Mexico (UNM Health — Information Technologies) — Albuquerque, NM
Jan 2023 – May 2024
  • Built a serverless ingestion pipeline using AWS SES, Lambda, S3, and Python-based validation logic to replace manual registration processing, cutting processing time for 200K+ annual submissions from hours to under five minutes while improving data quality controls
  • Designed and maintained a 40+ table MySQL data warehouse using structured ETL workflows and optimized SQL transformations, centralizing academic and administrative datasets for 500+ stakeholders and improving reporting accuracy and consistency
  • Automated registration and payment workflows using Lambda, API Gateway, and Python-driven REST integrations to eliminate 70% of manual effort, enabling reliable execution of 250+ conference sessions during a five-day event
  • Scaled event-driven ingestion and analytics pipelines for a national academic conference, mentoring 10+ graduate students on AWS serverless and data engineering best practices while supporting reliable operations and approximately $120K in annual revenue

Big Data Engineer
Carelon Global Solutions (Elevance Health) — Bangalore, India
Jun 2020 – Dec 2022
  • Engineered large-scale batch and streaming pipelines using Spark, Hive, and Airflow to process healthcare provider and claims datasets, improving data availability and reliability for enterprise analytics teams
  • Led the migration of enterprise analytics workloads from on-prem Hadoop to AWS-managed services, redesigning processing architectures to improve platform reliability and increase operational efficiency by 15% through elastic infrastructure adoption
  • Automated repetitive DataOps tasks using Python-based frameworks, generating $250K+ in annual operational savings and earning a High Impact Award for improving delivery timelines and platform stability
  • Implemented early-stage GenAI automation to enhance pipeline observability and reduce manual intervention, improving reliability across distributed data processing workflows

Key Skills

Cloud & Lakehouse Platforms

AWS (S3, Glue, EMR, Athena)
Lambda & Serverless
IAM, KMS, CloudWatch
AWS Bedrock
Azure (Fabric, ADF, Synapse)
Snowflake & Snowpipe
Delta Lake (Medallion)
Apache Iceberg

Data Engineering & Architecture

ETL/ELT Pipeline Design
dbt (Analytics Engineering)
Apache Spark & PySpark
Apache Airflow
Apache Kafka
Batch & Streaming Processing
Schema Evolution
Dimensional Modeling
Data Quality Validation

Infrastructure & DevOps

Terraform (IaC)
CI/CD (GitHub Actions)
Docker & Kubernetes
RBAC & Security
Observability
Performance Tuning

Programming & Systems

Python
SQL & Spark SQL
MySQL & PostgreSQL
REST API Integrations
Linux/Unix

GenAI & ML

RAG Architectures
LangChain Workflows
Embeddings
Vector Databases (Pinecone, Milvus)
Prompt Engineering
Knowledge Graph Integration

Featured Projects

🏔️ Iceberg Lakehouse with Terraform

Designed a serverless Iceberg-based lakehouse architecture to enable ACID transactions and time-travel analytics on S3, reducing dependency on traditional warehouse workloads. Provisioned and managed cloud infrastructure using Terraform (IaC), enabling reproducible deployments, cost control, and environment isolation.

AWS S3 | Glue | Athena | Apache Iceberg | Terraform
🔗 View on GitHub

Real-Time Agentic Streaming Data Pipeline

Designed a real-time, event-driven streaming architecture to ingest and process high-velocity data with low-latency guarantees. Architected agent-driven enrichment and routing logic to support autonomous analytics and downstream AI workflows.

Kafka | AWS MSK | Lambda | Spark Streaming
🔗 View on GitHub

🏥 Healthcare Data Lakehouse

Multi-cloud platform consolidating 20M+ fragmented health records with HIPAA compliance. Snowflake-centric architecture enabling near real-time analytics for statewide healthcare insights.

Snowflake | AWS | Azure | Terraform | HIPAA

Education

Master of Science in Computer Engineering
University of New Mexico, Albuquerque, NM
Dec 2024

Coursework: Distributed Systems, Advanced Cloud Computing Architectures, Database & Query Optimization

Bachelor of Engineering in Electronics & Communication Engineering
Gitam University, India
May 2021
📄 Download Resume

PHOTOGRAPHY & DRONE SHOTS

Capturing moments. Exploring perspectives.

VENTURES

Something's Brewing... ☕🍫

BUILDING SOMETHING NEW

Beyond data engineering and photography, I'm exploring entrepreneurship with a close friend.

More details coming soon.

WANT TO FOLLOW THE JOURNEY?

Stay tuned. 😊

– Dileep