Reading

A list of Mixture of Experts (MoE) and Large Language Model (LLM) papers focusing on model upcycling.

This repository collects important papers related to Mixture of Experts (MoE) and Large Language Models (LLMs), along with links to their arXiv pages and, where available, GitHub code.

| # | Paper Title | Year | Link | Code |
|---|-------------|------|------|------|
| 1 | Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence | 2024 | Arxiv | No code available |
| 2 | Upcycling Large Language Models into Mixture of Experts | 2024 | Arxiv | No code available |
| 3 | Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts | 2024 | Arxiv | No code available |
| 4 | Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts | 2024 | Arxiv | No code available |
| 5 | Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization | 2024 | Arxiv | GitHub |
| 6 | Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | 2024 | Arxiv | No code available |
| 7 | Scaling Laws for Fine-Grained Mixture of Experts | 2024 | Arxiv | GitHub |
| 8 | Scaling Expert Language Models with Unsupervised Domain Discovery | 2023 | Arxiv | GitHub |
| 9 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 2023 | Arxiv | GitHub |
| 10 | Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models | 2023 | Arxiv | GitHub |
| 11 | Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | 2023 | Arxiv | GitHub |
| 12 | Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging | 2024 | Arxiv | No code available |

Advances in Weight Generation and Retrieval for Language Models

| # | Paper Title | Year | Link | Code |
|---|-------------|------|------|------|
| 1 | Representing Model Weights with Language using Tree Experts | 2024 | Arxiv | No code available |
| 2 | Deep Linear Probe Generators for Weight Space Learning | 2024 | Arxiv | GitHub |
| 3 | Knowledge Fusion By Evolving Weights of Language Models | 2024 | Arxiv | GitHub |

- Vector Quantization Prompting for Continual Learning
- Historical Test-time Prompt Tuning for Vision Foundation Models

Contributing

Feel free to open a pull request if you find new papers or code related to MoE and LLMs. Let's keep this list growing!
