This repo lists some interesting LLM-related papers, with keywords summarizing each paper's main ideas and, for some entries, a small illustrative code sketch.
- Fine-Tuning LLaMA for Multi-Stage Text Retrieval
- Keywords: Dense Retriever (RepLLaMA), Pointwise Reranker (RankLLaMA), Contrastive loss, MS MARCO, BEIR
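
  A minimal sketch of the contrastive (InfoNCE-style) loss used to train dense retrievers such as RepLLaMA, assuming one positive passage and a set of hard negatives per query; the variable names and batch layout are illustrative, not the paper's code:

  ```python
  import torch
  import torch.nn.functional as F

  def contrastive_loss(q_emb, pos_emb, neg_emb, temperature=0.05):
      """InfoNCE-style loss: pull each query toward its positive passage and
      away from its hard negatives."""
      # q_emb: (B, D), pos_emb: (B, D), neg_emb: (B, N, D), all L2-normalized
      pos_scores = (q_emb * pos_emb).sum(-1, keepdim=True)      # (B, 1)
      neg_scores = torch.einsum("bd,bnd->bn", q_emb, neg_emb)   # (B, N)
      logits = torch.cat([pos_scores, neg_scores], dim=1) / temperature
      labels = torch.zeros(q_emb.size(0), dtype=torch.long)     # positive sits at index 0
      return F.cross_entropy(logits, labels)

  # toy usage
  B, N, D = 4, 7, 16
  q = F.normalize(torch.randn(B, D), dim=-1)
  pos = F.normalize(torch.randn(B, D), dim=-1)
  neg = F.normalize(torch.randn(B, N, D), dim=-1)
  print(contrastive_loss(q, pos, neg))
  ```
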
- Improving Text Embeddings with Large Language Models
- Keywords: Synthetic Data Generation, Mistral-7B, Contrastive loss, Multilingual Retrieval, BEIR, MTEB
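
  A minimal sketch of last-token pooling, the trick used here to get a single embedding out of a decoder-only model like Mistral-7B; it assumes right-padded batches, and the function name is illustrative:

  ```python
  import torch
  import torch.nn.functional as F

  def last_token_pool(hidden_states, attention_mask):
      """Take each sequence's final non-padding hidden state as its embedding."""
      # hidden_states: (B, T, D), attention_mask: (B, T) with 1 for real tokens
      last_idx = attention_mask.sum(dim=1) - 1              # index of last real token
      batch_idx = torch.arange(hidden_states.size(0))
      emb = hidden_states[batch_idx, last_idx]              # (B, D)
      return F.normalize(emb, dim=-1)

  # toy usage: two right-padded sequences
  h = torch.randn(2, 5, 8)
  mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
  print(last_token_pool(h, mask).shape)  # torch.Size([2, 8])
  ```
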
- QLoRA: Efficient Finetuning of Quantized LLMs
- Keywords: Low Rank Adapters (LoRA), 4-bit NormalFloat (NF4), Double Quantization, Paged Optimizers
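
  A from-scratch sketch of a LoRA adapter wrapped around a frozen linear layer; in QLoRA the frozen base weights would additionally be stored in 4-bit NormalFloat (NF4) with double quantization, which is omitted here for simplicity:

  ```python
  import torch
  import torch.nn as nn

  class LoRALinear(nn.Module):
      """Frozen base layer plus a trainable low-rank update:
      y = W x + (alpha / r) * B(A x)."""
      def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
          super().__init__()
          self.base = base
          for p in self.base.parameters():
              p.requires_grad_(False)                  # freeze pretrained weights
          self.lora_A = nn.Linear(base.in_features, r, bias=False)
          self.lora_B = nn.Linear(r, base.out_features, bias=False)
          nn.init.zeros_(self.lora_B.weight)           # adapter starts as a no-op
          self.scaling = alpha / r

      def forward(self, x):
          return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

  # toy usage
  layer = LoRALinear(nn.Linear(64, 64), r=8)
  print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
  ```
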
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- Keywords: Self-attention, IO-aware, GPU high bandwidth memory (HBM), GPU SRAM, Block-Sparse Attention
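
  A plain-PyTorch sketch of the tiling and online-softmax idea behind FlashAttention for a single head: keys/values are processed block by block with running max/normalizer statistics, so the full attention matrix is never materialized (the real kernel also handles masking and schedules the HBM-to-SRAM transfers explicitly):

  ```python
  import torch

  def tiled_attention(q, k, v, block_size=64):
      """Exact attention computed one key/value block at a time."""
      Tq, d = q.shape
      scale = d ** -0.5
      out = torch.zeros_like(q)
      running_max = torch.full((Tq, 1), float("-inf"))
      running_sum = torch.zeros(Tq, 1)
      for start in range(0, k.shape[0], block_size):
          kb, vb = k[start:start + block_size], v[start:start + block_size]
          scores = (q @ kb.T) * scale                            # (Tq, block)
          block_max = scores.max(dim=-1, keepdim=True).values
          new_max = torch.maximum(running_max, block_max)
          correction = torch.exp(running_max - new_max)          # rescale old stats
          p = torch.exp(scores - new_max)
          running_sum = running_sum * correction + p.sum(dim=-1, keepdim=True)
          out = out * correction + p @ vb
          running_max = new_max
      return out / running_sum

  # toy check against ordinary attention
  q, k, v = torch.randn(128, 32), torch.randn(256, 32), torch.randn(256, 32)
  ref = torch.softmax((q @ k.T) * k.shape[-1] ** -0.5, dim=-1) @ v
  print(torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5))  # True
  ```
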
- Efficient Memory Management for Large Language Model Serving with PagedAttention
- Keywords: KV cache, PagedAttention, Classical Virtual Memory, Paging techniques, vLLM
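
  A toy sketch of the bookkeeping idea behind PagedAttention: the KV cache is split into fixed-size blocks and each sequence keeps a block table mapping logical positions to physical blocks, much like virtual-memory paging; class and method names are illustrative, not vLLM's API:

  ```python
  class PagedKVCache:
      """Toy block-table bookkeeping. Real vLLM stores key/value tensors per
      block; here we only track token ids to show the indirection."""
      def __init__(self, num_blocks: int, block_size: int):
          self.block_size = block_size
          self.free_blocks = list(range(num_blocks))  # physical block ids
          self.storage = {}                           # physical block id -> tokens
          self.block_tables = {}                      # sequence id -> physical block ids

      def append(self, seq_id: int, token: int):
          table = self.block_tables.setdefault(seq_id, [])
          if not table or len(self.storage[table[-1]]) == self.block_size:
              block = self.free_blocks.pop()          # allocate a block on demand
              self.storage[block] = []
              table.append(block)
          self.storage[table[-1]].append(token)

      def tokens(self, seq_id: int):
          return [t for b in self.block_tables.get(seq_id, []) for t in self.storage[b]]

  # toy usage: 10 tokens land in three 4-slot physical blocks
  cache = PagedKVCache(num_blocks=8, block_size=4)
  for t in range(10):
      cache.append(seq_id=0, token=t)
  print(cache.block_tables[0], cache.tokens(0))
  ```
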
- Mixtral of Experts
- Keywords: Mixtral 8x7B, Sparse Mixture of Experts (SMoE)
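
  A sketch of a sparse mixture-of-experts feed-forward layer with top-2 routing, roughly the structure Mixtral 8x7B uses (8 experts, 2 active per token); the dimensions and names are illustrative:

  ```python
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class SparseMoE(nn.Module):
      """Router picks the top-k experts per token; outputs are combined with
      the softmax of the selected router logits."""
      def __init__(self, dim=32, hidden=64, num_experts=8, top_k=2):
          super().__init__()
          self.top_k = top_k
          self.router = nn.Linear(dim, num_experts, bias=False)
          self.experts = nn.ModuleList([
              nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
              for _ in range(num_experts)
          ])

      def forward(self, x):                           # x: (tokens, dim)
          logits = self.router(x)                     # (tokens, num_experts)
          weights, experts = logits.topk(self.top_k, dim=-1)
          weights = F.softmax(weights, dim=-1)        # renormalize over chosen experts
          out = torch.zeros_like(x)
          for slot in range(self.top_k):
              for e in range(len(self.experts)):
                  mask = experts[:, slot] == e        # tokens routed to expert e
                  if mask.any():
                      out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
          return out

  # toy usage
  moe = SparseMoE()
  print(moe(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
  ```
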
- Mistral 7B
- Keywords: Grouped-Query Attention (GQA), Sliding Window Attention (SWA), Rolling Buffer Cache, Pre-fill and Chunking
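
  A toy sketch of the rolling buffer cache that pairs with sliding-window attention: position i writes to slot i % window, so cache memory stays constant however long the sequence gets; the names are illustrative:

  ```python
  class RollingKVCache:
      """Keep only the last `window` cache entries by overwriting slots modulo
      the window size."""
      def __init__(self, window: int):
          self.window = window
          self.buffer = [None] * window   # would hold (key, value) tensors in practice

      def insert(self, pos: int, kv):
          self.buffer[pos % self.window] = kv

      def visible(self, pos: int):
          """Entries a token at `pos` may attend to (at most `window` most recent)."""
          start = max(0, pos - self.window + 1)
          return [self.buffer[p % self.window] for p in range(start, pos + 1)]

  # toy usage
  cache = RollingKVCache(window=4)
  for pos in range(10):
      cache.insert(pos, kv=f"kv{pos}")
  print(cache.visible(9))  # ['kv6', 'kv7', 'kv8', 'kv9']
  ```
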
- Llama 2: Open Foundation and Fine-Tuned Chat Models
- Keywords: Grouped-Query Attention (GQA), Context Length 4k, 2.0T Tokens
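
  A minimal sketch of grouped-query attention, where several query heads share one key/value head and the KV cache shrinks accordingly (causal masking omitted for brevity; the names and shapes are illustrative):

  ```python
  import torch

  def grouped_query_attention(q, k, v, num_kv_heads):
      """Repeat each KV head across its group of query heads, then attend.
      MHA is num_kv_heads == num_query_heads; MQA is num_kv_heads == 1."""
      # q: (B, Hq, T, d), k/v: (B, Hkv, T, d)
      group = q.shape[1] // num_kv_heads
      k = k.repeat_interleave(group, dim=1)           # (B, Hq, T, d)
      v = v.repeat_interleave(group, dim=1)
      scores = q @ k.transpose(-2, -1) * q.shape[-1] ** -0.5
      return torch.softmax(scores, dim=-1) @ v

  # toy usage: 8 query heads sharing 2 KV heads
  B, Hq, Hkv, T, d = 1, 8, 2, 16, 32
  q, k, v = torch.randn(B, Hq, T, d), torch.randn(B, Hkv, T, d), torch.randn(B, Hkv, T, d)
  print(grouped_query_attention(q, k, v, Hkv).shape)  # torch.Size([1, 8, 16, 32])
  ```
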
- LLaMA: Open and Efficient Foundation Language Models
- Keywords: Pre-normalization, SwiGLU activation function, Rotary Positional Embeddings (RoPE), Context Length 2k, 1.0T Tokens
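
  A minimal sketch of rotary positional embeddings (RoPE), using one common convention that pairs adjacent channels; LLaMA-style implementations pair channels differently, but the rotation idea is the same:

  ```python
  import torch

  def rotary_embedding(x, base=10000.0):
      """Rotate each channel pair by an angle proportional to token position,
      so query-key dot products depend on relative distance."""
      # x: (T, d) with even d
      T, d = x.shape
      pos = torch.arange(T, dtype=torch.float32)[:, None]                    # (T, 1)
      inv_freq = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)   # (d/2,)
      angles = pos * inv_freq                                                # (T, d/2)
      cos, sin = angles.cos(), angles.sin()
      x1, x2 = x[:, 0::2], x[:, 1::2]                                        # channel pairs
      rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
      return rotated.flatten(start_dim=1)                                    # back to (T, d)

  # toy usage
  print(rotary_embedding(torch.randn(6, 8)).shape)  # torch.Size([6, 8])
  ```
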
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
- Keywords: RL from AI Feedback (RLAIF), Generating AI labels, Self Improvement
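
  A sketch of the AI-feedback labeling step: a judge model is asked to compare two candidate responses, and the parsed preference becomes the label used for reward-model training; `query_llm` is a hypothetical stand-in for whatever judge-model call you use:

  ```python
  def build_judge_prompt(prompt: str, response_a: str, response_b: str) -> str:
      """Format a pairwise comparison prompt for the AI labeler."""
      return (
          "Given the prompt and two candidate responses, reply with the single "
          "letter of the better response.\n\n"
          f"Prompt: {prompt}\n\nResponse A: {response_a}\n\nResponse B: {response_b}\n\n"
          "Better response (A or B):"
      )

  def ai_preference_label(prompt, response_a, response_b, query_llm):
      """Build one preference example labeled by the judge model instead of a human."""
      verdict = query_llm(build_judge_prompt(prompt, response_a, response_b)).strip().upper()
      chosen, rejected = (response_a, response_b) if verdict.startswith("A") else (response_b, response_a)
      return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

  # toy usage with a stand-in judge that always answers "A"
  fake_judge = lambda judge_prompt: "A"
  print(ai_preference_label("Explain RLAIF.", "RLAIF replaces human raters with an LLM.", "idk", fake_judge))
  ```
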
- A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
- Keywords: LLM Hallucination, RAG, Knowledge Retrieval, CoNLI, CoVe
- A Survey of Large Language Models
- Keywords: PLMs, LLMs, Pre-training, Adaptation tuning, Capacity evaluation
- RAGAS, a framework for evaluating RAG pipelines: https://docs.ragas.io/en/stable/
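
  A minimal usage sketch, loosely following an earlier version of the RAGAS quickstart; the metric and dataset-column names may have changed in the current release and a configured LLM API key is assumed, so check the linked docs:

  ```python
  from datasets import Dataset
  from ragas import evaluate
  from ragas.metrics import faithfulness, answer_relevancy, context_precision

  # one evaluation row: the question, the retrieved contexts, the generated
  # answer, and a reference answer (needed by some metrics)
  rows = {
      "question": ["What does PagedAttention partition?"],
      "contexts": [["PagedAttention partitions the KV cache into fixed-size blocks."]],
      "answer": ["It partitions the KV cache into fixed-size blocks."],
      "ground_truth": ["The KV cache."],
  }

  result = evaluate(
      Dataset.from_dict(rows),
      metrics=[faithfulness, answer_relevancy, context_precision],
  )
  print(result)
  ```
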