Reading

A list of Mixture of Experts (MoE) and Large Language Model (LLM) papers focusing on model upcycling.

This repository collects important papers related to Mixture of Experts (MoE) and Large Language Models (LLMs), along with links to their arXiv pages and, where available, GitHub code.

| # | Paper Title | Year | Link | Code |
|---|-------------|------|------|------|
| 1 | Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence | 2024 | Arxiv | No code available |
| 2 | Upcycling Large Language Models into Mixture of Experts | 2024 | Arxiv | No code available |
| 3 | Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts | 2024 | Arxiv | No code available |
| 4 | Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts | 2024 | Arxiv | No code available |
| 5 | Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization | 2024 | Arxiv | GitHub |
| 6 | Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | 2024 | Arxiv | No code available |
| 7 | Scaling Laws for Fine-Grained Mixture of Experts | 2024 | Arxiv | GitHub |
| 8 | Scaling Expert Language Models with Unsupervised Domain Discovery | 2023 | Arxiv | GitHub |
| 9 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 2023 | Arxiv | GitHub |
| 10 | Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models | 2023 | Arxiv | GitHub |
| 11 | Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | 2023 | Arxiv | GitHub |
| 12 | Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging | 2024 | Arxiv | No code available |

Advances in Weight Generation and Retrieval for Language Models

| # | Paper Title | Year | Link | Code |
|---|-------------|------|------|------|
| 1 | Representing Model Weights with Language using Tree Experts | 2024 | Arxiv | No code available |
| 2 | Deep Linear Probe Generators for Weight Space Learning | 2024 | Arxiv | GitHub |
| 3 | Knowledge Fusion By Evolving Weights of Language Models | 2024 | Arxiv | GitHub |

- Vector Quantization Prompting for Continual Learning
- Historical Test-time Prompt Tuning for Vision Foundation Models

Contributing

Feel free to open a pull request if you find new papers or code related to MoE and LLMs. Let's keep this list growing!
