I'm a Deep Learning Engineer with a strong focus on optimizing Large Language Models (LLMs) and deep learning frameworks. I enjoy fine-tuning, merging, and evaluating LLMs, inspired by llm.c by Andrej Karpathy. I also love exploring and rewriting CUDA kernels to maximize NVIDIA GPU utilization.
- axolotl-finetune: Implemented single- and multi-GPU fine-tuning for LLaMA models and ran Nous evaluation benchmarks; model quantization support is coming soon.
- llama.c: Implemented the Llama 3 architecture with custom CUDA C/C++ kernels to achieve high-performance model pretraining on NVIDIA GPUs.
I'm also deeply interested in cutting-edge ML research, particularly the evolution of LLMs and improving their pre-training efficiency.
Feel free to explore my work and repositories!
Get in touch