Hello, I'm Harsh Kashyap
Data engineer, AI practitioner and full-stack developer with 4+ years of experience building production data pipelines, LLM-powered apps and relational systems. M.Sc. in Bioinformatics & Computational Biology; fluent in life-sciences data and biological databases (NCBI, UniProt, RNAcentral).
🧬 Genomics
⚙ Data Pipelines
From raw sequence and operational data to dependable, query-ready resources and dashboards.
Pipelines, ETL, event-driven orchestration
Claude API, prompt & agentic workflows
React / Next.js + PostgreSQL · Supabase
Genomics, RNA, NCBI / UniProt databases
Power BI & DAX, Tableau, Looker Studio
AppSheet, Apps Script, n8n, Zapier
Production systems, AI workflows and bioinformatics builds — end-to-end ownership.
An LLM-integrated agentic assistant that lets researchers query biological databases (NCBI, UniProt) in natural language — with ML-based query understanding and intent classification. An early example of agentic AI workflow design.
Full-stack scheduling app with Gantt charts, weekly planning views and relational filtering for manufacturing workflows. Modular architecture, documented API design.
Event-driven pipeline triggering branded HTML emails on dispatch confirmation — communication latency cut from hours to seconds with zero manual intervention.
A BERT-based semantic similarity model with optimised text preprocessing and embedding workflows for high-accuracy contextual understanding.
End-to-end automated KPI pipeline — data ingestion, computation, scheduling and dashboard delivery — demonstrating full pipeline ownership.
Open to Bioinformatics Data Engineering roles — including RNA / sequence resource teams.
© Harsh Kashyap · Meerut, India · +91 73515 34994