Tohid Shaikh
Backend Engineer · ML Engineer · Distributed Systems
Backend and ML engineer with hands-on experience building distributed systems, LLM-integrated pipelines, and data-intensive applications. From encrypted payment meshes to custom vector databases — I build the infrastructure, not just the model.
ABOUT
Building systems that actually work at scale.
I architect and ship production-grade backend systems and ML pipelines — not toy demos. My work spans encrypted payment mesh networks, custom vector databases built from scratch, and large-scale data engines processing 5M+ records.
I don't just train models; I engineer the infrastructure that makes them production-ready. Gossip-protocol networks, multithreaded DPI engines, vector search with sub-200ms ANN query latency, RAG pipelines: I own the full stack from algorithm design to deployment.
As an ML Engineer Intern at Edunet Foundation × IBM SkillsBuild and at 3Skill, I delivered measurable results. At 3Skill, I raised fraud recall from 61% to 84% on real-world financial transaction data and cut the false-negative rate by 38%.
Actively seeking backend engineering, ML infrastructure, or full-stack roles at companies building at scale — where I can own systems end-to-end and ship things that matter.
PROJECTS
Things I've built
01 · Backend Systems
UPI Offline Mesh
Python · Flask · SQLAlchemy · RSA-OAEP · AES-256-GCM · Multithreading
Offline UPI payment system using a gossip-based mesh network. Hybrid RSA-OAEP + AES-256-GCM encryption secures payment packets across virtual bridge nodes. Thread-safe idempotency cache with optimistic locking prevents duplicate settlements.
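A minimal sketch of the duplicate-settlement guard described above, not the project's actual code. It assumes each payment packet carries a unique ID, and it uses a plain mutex instead of the optimistic locking the project uses, to keep the example short. `IdempotencyCache`, `claim`, and `settle_concurrently` are hypothetical names.

```python
import threading


class IdempotencyCache:
    """Remembers packet IDs that have already been claimed for settlement."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()

    def claim(self, packet_id: str) -> bool:
        """Return True only for the first caller that presents packet_id."""
        with self._lock:
            if packet_id in self._seen:
                return False
            self._seen.add(packet_id)
            return True


def settle_concurrently(cache, packet_id, workers=8):
    """Simulate several bridge nodes delivering the same packet at once."""
    results = []
    results_lock = threading.Lock()

    def deliver():
        won = cache.claim(packet_id)
        with results_lock:
            results.append(won)

    threads = [threading.Thread(target=deliver) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However many nodes relay the same packet, exactly one `claim` call succeeds, so only that caller would go on to settle the payment.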
02 · AI & ML
Vector Database + RAG Pipeline
Python · Flask · HNSW · KD-Tree · Ollama · REST API
Vector database from scratch with HNSW, KD-Tree, and Brute-Force search. Sub-200ms ANN query latency. RAG pipeline using Ollama for document-grounded question answering with semantic chunking and real-time concurrent retrieval.
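The brute-force search mode is the simplest of the three and serves as the exact baseline that HNSW and KD-Tree results can be checked against. The sketch below is illustrative only: it assumes cosine similarity as the metric (the project's metric is not stated here), and `brute_force_search` and the dict-based index are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_search(index, query, k=3):
    """Score every stored vector against the query and return the top-k IDs.

    index maps a document ID to its embedding (assumed layout).
    """
    scored = (
        (cosine_similarity(vec, query), doc_id) for doc_id, vec in index.items()
    )
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]
```

Brute force costs one similarity computation per stored vector, which is the cost that approximate indexes such as HNSW trade a little accuracy to avoid.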
03 · Data Engineering
UIDAI Aadhaar — Project DRAM
Python · Pandas · SciPy · Plotly · Streamlit
5M+ Aadhaar records across 12 data sources. 807 districts classified using Z-score anomaly detection and custom UER metric. 10x query speedup via vectorized boolean masking. Presented at UIDAI National Innovation Challenge 2026.
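Z-score flagging can be illustrated in a few lines. This is a plain-Python stand-in for the Pandas version, with an assumed threshold of 2.0 and toy district names; `flag_anomalies` is a hypothetical helper and the UER metric is not reproduced here.

```python
import statistics


def flag_anomalies(values_by_district, threshold=2.0):
    """Return districts whose value lies more than `threshold` standard
    deviations from the mean, mapped to their z-scores."""
    values = list(values_by_district.values())
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return {}  # all values identical, nothing stands out
    z_scores = {d: (v - mean) / std for d, v in values_by_district.items()}
    return {d: z for d, z in z_scores.items() if abs(z) > threshold}
```

In Pandas the same filter becomes a single boolean mask over a z-score column, which is where the vectorized speedup comes from.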
04 · Systems & Networking
Deep Packet Inspection Engine
Python · Multithreading · TCP/IP · TLS · PCAP
10-module DPI engine parsing raw PCAP files across Ethernet/IP/TCP/UDP layers. 20+ apps detected via TLS SNI fingerprinting with stateful 5-tuple flow tracking. Multithreaded Reader → Load Balancer → Fast Path pipeline with consistent hashing.
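The load-balancing step can be sketched as a consistent hash ring keyed on the 5-tuple, so every packet of a flow reaches the same Fast Path worker. This is an assumption-laden illustration: the worker names, the SHA-256 hash, and the 64 virtual nodes per worker are all hypothetical choices, not details taken from the engine.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Assigns flows to workers; adding a worker moves only a share of flows."""

    def __init__(self, workers, replicas=64):
        # Each worker gets `replicas` virtual points for a more even spread.
        self._ring = sorted(
            (_hash(f"{w}#{i}"), w) for w in workers for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def worker_for(self, flow):
        """flow is a 5-tuple: (src_ip, src_port, dst_ip, dst_port, protocol)."""
        key = "|".join(str(field) for field in flow)
        # First ring point clockwise from the flow's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

Because the mapping depends only on the 5-tuple, per-flow state can live on a single worker without cross-thread sharing. Treating both directions of a connection as one flow would additionally require normalizing the tuple before hashing.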
// work history
Where I've Worked
ML Engineer Intern
Edunet Foundation × IBM SkillsBuild
- Built a Random Forest classification pipeline on 12K records, achieving 91.2% accuracy and 0.87 F1-score through structured feature engineering and cross-validated model selection.
- Designed modular preprocessing workflows using Pandas and Scikit-learn, eliminating data leakage and ensuring reproducibility across training and evaluation environments.
ML Engineer Intern
3Skill
- Developed a fraud detection model on 20K+ imbalanced financial transactions; applied SMOTE oversampling to improve recall from 61% to 84%, reaching 92% accuracy and 0.89 ROC-AUC.
- Reduced false-negative rate by 38% through targeted feature selection and threshold tuning on skewed real-world transaction data.
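Threshold tuning, mentioned in the last bullet, can be sketched as follows. This is a generic illustration rather than the model's actual code: it covers only the threshold step (not SMOTE), assumes the classifier outputs a fraud score per transaction, and uses an assumed minimum-precision constraint. `confusion` and `pick_threshold` are hypothetical helpers.

```python
def confusion(threshold, scores, labels):
    """Count true positives, false positives, and false negatives when
    every score >= threshold is predicted as fraud (label 1)."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 1:
            fn += 1
    return tp, fp, fn


def pick_threshold(scores, labels, min_precision=0.5):
    """Return (threshold, recall, precision) with the highest recall among
    candidate thresholds whose precision stays at or above min_precision."""
    best = None
    for t in sorted(set(scores)):
        tp, fp, fn = confusion(t, scores, labels)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision >= min_precision and (
            best is None or (recall, precision) > (best[1], best[2])
        ):
            best = (t, recall, precision)
    return best
```

Lowering the threshold below the default 0.5 catches more fraud (fewer false negatives) at the cost of more false alarms, which is why the search is bounded by a precision floor.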
// tech stack
Technical Arsenal
Languages · Backend & Systems · AI / LLM · ML & Data · Core CS · Tools
CERTIFICATIONS
Verified credentials
All credentials are publicly verifiable on issuer platforms.
// get in touch
Let's build something great.
Open to full-time roles, internships, and interesting projects. If you're building at scale, let's talk.