AI engineer who bridges theory and application across domains, from fine-tuning mathematical reasoning and vision models to protein bioinformatics and large-scale graph analysis. Specializes in rapid domain mastery and building systems that solve real problems with intelligent design.
Network Configuration Overview
I am an AI/ML Research Engineer with expertise in Deep Learning, NLP, and traditional Machine Learning, driven by a passion for solving complex problems with intelligent systems. I completed my Master's in Machine Learning at Stevens Institute of Technology with a 3.93 GPA, where I honed my skills in both theoretical foundations and real-world applications.
I have built and optimized machine learning systems across NLP, deep learning, and computational biology, tackling challenges in text classification, retrieval-augmented generation, large-scale model fine-tuning, and predictive analytics. I thrive in environments where rapid learning and adaptation are key.
With a strong foundation in research and hands-on implementation, I am always eager to tackle new challenges, quickly learn emerging technologies, and refine complex models into production-ready solutions. My ability to adapt to new domains while optimizing performance makes me a valuable asset in any machine learning-driven environment.
Model Training & Optimization Journey
Neural Architectures in Action
Fine-tuned a DeepSeek-R1-Distill-Qwen-1.5B model that outperformed Claude 3.5 Sonnet on mathematical reasoning tasks. Reduced trainable parameters by 98.8% and doubled inference speed using LoRA adaptation and the Unsloth framework.
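To give a feel for why LoRA shrinks the trainable parameter count so sharply, here is a minimal sketch of the arithmetic for a single weight matrix. The layer size and rank below are hypothetical values chosen for illustration; they are not the configuration used in this project, and the overall reduction for a full model also depends on which modules are adapted.

```python
def lora_trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Fraction of parameters trained when one d_out x d_in weight uses LoRA.

    LoRA freezes the original weight W and learns a low-rank update B @ A,
    where B is d_out x rank and A is rank x d_in.
    """
    full_params = d_out * d_in
    lora_params = rank * (d_out + d_in)
    return lora_params / full_params


# Hypothetical layer size and rank, for illustration only.
fraction = lora_trainable_fraction(d_out=2048, d_in=2048, rank=8)
print(f"trainable share of this matrix: {fraction:.2%}")
```

Because the adapter cost grows with `d_out + d_in` rather than `d_out * d_in`, the savings increase as layers get wider.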
High-accuracy protein localization system using Meta AI's ESM2-3B, achieving 84.79% top-3 and 92.09% top-5 accuracy across 12 cellular locations. Optimized training performance through mixed precision and gradient checkpointing.
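For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. A minimal sketch in plain Python, using made-up scores rather than model outputs from this project:

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Share of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy scores for 3 samples over 4 classes (illustrative values only).
toy_scores = [
    [0.1, 0.5, 0.3, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
toy_labels = [2, 0, 3]

print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```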
Built a production-ready LLM optimization pipeline on NVIDIA NeMo Curator's data filtering algorithms. Implemented prompt complexity classification and task-aware data selection, achieving a 6.7% reduction in validation loss while keeping training efficient on resource-constrained hardware.
Built production-ready document QA system enabling natural language queries on PDF documents. Features secure file upload, intelligent text chunking, vector embeddings storage, and real-time retrieval-augmented generation with modern React interface.
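The chunking strategy used in the system is not spelled out here, so the following is only a common baseline: fixed-size character windows with overlap, so that text cut at a boundary still appears whole in a neighboring chunk. The function name and default sizes are assumptions made for this sketch.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(pieces)
```

Each chunk would then be embedded and stored in the vector index; at query time the closest chunks are retrieved and passed to the generator.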
Processed 500,000+ relationships across 442,275 nodes in a large-scale dependency graph using Neo4j and NetworkX. Implemented Node2Vec embeddings combined with handcrafted features, achieving perfect classification metrics.
Real-time neural network visualization system with React frontend and FastAPI backend. Features live visualization of hidden layer activations, inter-layer connections via SVG, and prediction probabilities as users draw digits on interactive grid.
Built a multi-label classification system using BERT transformers to automatically categorize and rank StackOverflow questions across 50 programming language tags. Trained with BCEWithLogitsLoss and evaluated ranking quality with NDCG (Normalized Discounted Cumulative Gain), enabling question prioritization and tag-based relevance scoring.
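As a small illustration of the ranking metric, NDCG compares the discounted gain of the predicted tag ordering with that of the ideal ordering. The sketch below uses the linear-gain form and toy inputs; it is not the project's evaluation code.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain: relevance at rank r is divided by log2(r + 1)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores: Sequence[float], relevances: Sequence[float], k: int) -> float:
    """DCG of the predicted ordering divided by DCG of the ideal ordering."""
    # Order items by predicted score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy example: three tags, only the first is truly relevant,
# but the model ranks it last.
print(ndcg_at_k(scores=[0.1, 0.9, 0.5], relevances=[1, 0, 0], k=3))
```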
English-to-Spanish neural machine translation using a bidirectional LSTM encoder-decoder architecture. Implements character-level tokenization with temperature-based multinomial sampling for more diverse translations, with quality assessed via BLEU.
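Temperature-based sampling can be sketched in a few lines: the decoder's logits are divided by a temperature before being turned into sampling weights, so low temperatures approach greedy decoding and high temperatures flatten the distribution. The function name and inputs here are assumptions for illustration, not the project's decoder.

```python
import math
import random
from typing import Sequence


def sample_with_temperature(
    logits: Sequence[float], temperature: float, rng: random.Random
) -> int:
    """Draw one index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating to avoid overflow.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
toy_logits = [0.0, 5.0, 1.0]  # made-up scores for three characters
cold = [sample_with_temperature(toy_logits, 0.01, rng) for _ in range(100)]
warm = [sample_with_temperature(toy_logits, 5.0, rng) for _ in range(100)]
print(set(cold), set(warm))
```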
Implemented a complete 2-layer neural network with manual backpropagation using only NumPy, tested on the MNIST and Fashion-MNIST datasets. Features Xavier initialization, numerically stable softmax, and custom learning rate scheduling.
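Two of the ingredients named above are easy to show in isolation. The sketch below is written in dependency-free Python rather than NumPy, and it assumes the uniform variant of Xavier initialization; the project's own code may differ in both respects.

```python
import math
import random


def xavier_uniform(fan_in: int, fan_out: int, rng: random.Random) -> list[list[float]]:
    """Sample a fan_in x fan_out weight matrix from U(-limit, limit),
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)] for _ in range(fan_in)]


def stable_softmax(logits: list[float]) -> list[float]:
    """Softmax that subtracts the maximum logit first, so exp() never overflows."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


weights = xavier_uniform(784, 128, random.Random(0))
# A naive math.exp(1000.0) would raise OverflowError; the shifted version is fine.
probs = stable_softmax([1000.0, 1001.0, 1002.0])
print(len(weights), len(weights[0]), probs)
```

Shifting by the maximum does not change the result, because the common factor cancels between numerator and denominator.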
Feature Space & Computational Stack