Hi, I am
Shubham Mishra
A Deep Learning Enthusiast.
Interested in leveraging novel machine learning techniques to solve meaningful problems. Scroll down to know more about my projects and experiences.
About Me
Hola! My name is Shubham, and I enjoy helping AI see, listen and communicate. I'm in the 4th year of my undergraduate studies at LNCT Bhopal.
My Goal: I'm interested in developing architectures and pipelines that give deeper insight into how a particular model works and how it can generalize better to Out-Of-Distribution (OOD) data. I'm also interested in frameworks that help explain how Language Models can perform better on NLI tasks, handle bias and fairness, and become less prone to hallucinations. I specialize in building RAG systems, fine-tuning models, and optimizing model inference, while also having a little knack for the development side of projects.
I also write on Medium for the TheDeepHub publication, detailing the implementation of various deep learning architectures (ViTs, CLIP, GPT, etc.). Additionally, I'm into reading philosophy and have a knack for music; you'll find me with headphones all the time.
Here are a few technologies I'm fairly proficient with:
- PyTorch
- Python
- C/C++
- TensorFlow

Where I’ve Worked
Artificial Intelligence Intern @ Wysa
Aug 2024 - Jan 2025
- Optimizing models for faster inference and performance.
- Migrating Wysa's internal NLP-AI tools from Flask to FastAPI-based implementation.
- Refining and structuring sensitive mental health data.
- Training and fine-tuning language models on challenging datasets and problem domains to boost the NLP capabilities of the Wysa mental health app, directly enhancing the experience of over a million active users.
Some Things I’ve Built
Featured Project
Graph-Enhanced Visual Language Processing
Graph Vision is a Python library that builds a topological map of an environment by connecting neighboring image segments and capturing each segment's spatial and semantic feature embeddings.
It localizes objects relative to one another in the topology from language descriptions of the objects, using Dijkstra's algorithm. Check it out on PyPI
- Python
- PyTorch
- VLM
- Graphs
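For readers unfamiliar with the path-finding step, here is a minimal sketch of Dijkstra's algorithm over a small segment graph. It is illustrative only and is not Graph Vision's API: the function name `shortest_path`, the `segments` graph, its node labels, and its edge weights are all hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (cost, path). If target is unreachable, returns (inf, []).
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break  # the distance to target is final once it is popped
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))

    if target not in dist:
        return float("inf"), []

    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical segment graph: nodes are labeled image segments,
# edge weights stand in for distances between neighboring segments.
segments = {
    "mug": {"table": 1.0},
    "table": {"mug": 1.0, "chair": 2.0, "lamp": 4.0},
    "chair": {"table": 2.0, "lamp": 1.0},
    "lamp": {"table": 4.0, "chair": 1.0},
}

cost, path = shortest_path(segments, "mug", "lamp")
print(cost, path)
```

In this toy graph the route through the chair is cheaper than the direct table-to-lamp edge, which is the kind of relative placement a topological map makes queryable.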
Featured Project
Segmentation for Tumor Detection
Trained a UNet model for segmenting tumours in brain MRI scans. Achieved a validation Dice score of ~0.9. Available as a Docker image on DockerHub, and deployed on Hugging Face Spaces
- PyTorch
- Deep Learning
- Streamlit
- Docker
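As a reminder of what the Dice score measures, here is a minimal sketch for binary masks: twice the overlap divided by the total foreground in both masks. It is a plain-Python illustration, not the project's evaluation code; the function name `dice_score`, the `eps` smoothing term, and the toy masks are assumptions.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks given as nested lists.

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    eps is an assumed smoothing term so two empty masks score 1.0.
    """
    pred_flat = [int(bool(v)) for row in pred for v in row]
    target_flat = [int(bool(v)) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p & t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    return (2 * intersection + eps) / (total + eps)


# Toy 2x3 masks: the prediction marks 3 pixels, the ground truth 2,
# and they overlap on 2 pixels.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
ground_truth = [[0, 1, 1],
                [0, 0, 0]]

print(round(dice_score(predicted_mask, ground_truth), 3))
```

A score of 1.0 means the masks match exactly, and 0 means no overlap.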
Featured Project
Pool of ML-Models
A personal GitHub repository containing a variety of DL architectures implemented from scratch using PyTorch and einops for dealing with high-dimension tensors.
The architectures are mainly important Vision Transformers such as Swin, DINO, MAE, CvT, etc.
- PyTorch
- ViTs
Featured Project
Idefics-OCR
Fine-tuned the HuggingFaceM4/idefics2-8b model on the nielsr/docvqa_1200_examples_donut dataset for document visual question answering (VQA).
Also check out Phi3-TheFineTunedOne, a DPO-trained version of Phi-3-mini-4k-instruct.
- Transformers
- PEFT
- LoRA
- Data Collation
Featured Project
Generate Study Resources
A Flask-based web application that generates study resources from PDFs, using LLMs to create MCQs, flash cards, and Q&As.
Improved content accuracy by using tightly constrained model prompts to reduce hallucinations.
Implemented the front end in HTML, CSS, and Vue.js.
- Flask
- HTML
- CSS
- Vue.js
Featured Project
LipReading With LipNet
Developed a 3D-convolution and bidirectional LSTM model that predicts the spoken sentence from features extracted from lip movements across video frames, based on End-to-End Sentence-level Lipreading.
Used CTC loss to handle the variable-length alignment between input frames and the spoken sentence, and initialized weights with He (Kaiming normal) initialization to avoid blank-index predictions.
- PyTorch
- OpenCV
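To show where the blank index fits in, here is a minimal sketch of greedy CTC decoding: take the most likely class per frame, collapse consecutive repeats, then drop blanks. It is an illustration only, not the project's code; the function name `ctc_greedy_decode`, the choice of `0` as the blank index, and the sample frame predictions are assumptions.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse per-frame class ids into a label sequence, CTC-style.

    Consecutive repeats are merged and the blank id is removed, so a
    blank between two identical ids keeps both of them.
    """
    decoded = []
    previous = None
    for idx in frame_ids:
        if idx != previous and idx != blank:
            decoded.append(idx)
        previous = idx
    return decoded


# Hypothetical per-frame argmax output for 8 video frames
# (0 is assumed to be the blank index).
frames = [0, 3, 3, 0, 3, 5, 5, 0]
print(ctc_greedy_decode(frames))

# A model that predicts only blanks decodes to an empty sentence,
# which is the failure mode that careful initialization helps avoid.
print(ctc_greedy_decode([0, 0, 0, 0]))
```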
Other Noteworthy Projects
End-To-End Movie Recommendation System
Built an end-to-end recommendation system with an IMDb dataset. Created the dataset myself by dynamically web scraping the official IMDb site. Earned a bronze medal on Kaggle.
Sentiment Classification on 50K IMDb dataset
Experimented with various classical machine learning models such as logistic regression, support vector machine, and MLP, as well as a bidirectional LSTM, for classifying the sentiment of movie reviews.
Decoding Dark Matter with Deep Learning
Used deep learning techniques to classify various substructures of dark matter, with all ROC scores > 0.99.
Also used self-supervised learning methods for classifying strong gravitational lensing images.
Music Genre Classification
Used Librosa to create MFCC maps of 30-second WAV files and trained a Convolutional Neural Network on the MFCC data, achieving an accuracy of 87%.
Autoencoders for Credit Card Fraud Detection
Trained an autoencoder for credit card fraud detection to learn a small latent-space representation of the data and use that representation for downstream classification tasks.
NutritionAI App
This Gradio web application takes an input image of the nutrition facts table on the back of a product, shows contents like starch, fat, protein, etc., and gives health advice regarding nutrition content and proportions.
What’s Next?
Get In Touch
If you are curious to know more about me, please send me a message, and I shall get back to you!
Say Hello