GitHub - mikudehuane/MPDA: Official Implementation of "On-Device Learning for Model Personalization with Large-Scale Cloud-Coordinated Domain Adaption" [KDD 2022]

Public dataset evaluation for MPDA.

Dependencies

See requirements.txt for the dependent pip packages.

Data Preprocess

First, please download the raw data from https://files.grouplens.org/datasets/movielens/ml-20m.zip and http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Electronics_5.json.gz. Unzip and place the downloaded files in '/root/data/'. Then, run the following commands for MovieLens data preprocessing:

echo "preprocess"
python scripts/preprocess/movielens/preprocess.py || exit

echo "split"
python scripts/preprocess/movielens/split.py || exit

echo "intersect"
python scripts/preprocess/movielens/get_users_with_train_and_test.py || exit

Run the following commands for Amazon data preprocessing:

echo "preprocess"
python scripts/preprocess/amazon/preprocess.py || exit

echo "split"
python scripts/preprocess/movielens/split.py -ifd={data_fd}/Amazon/Electronics_5/processed -ts=1385078400 || exit

data_pfd="{data_fd}/Amazon/Electronics_5/processed"
train_data_fd="${data_pfd}/ts=1385078400_train"
eval_data_fd="${data_pfd}/ts=1385078400_test"
examine_user_list_fp="${data_pfd}/ts=1385078400_user-intersect.json"

echo "intersect"
python scripts/preprocess/movielens/get_users_with_train_and_test.py -tfd=${train_data_fd} -tefd=${eval_data_fd} -ofp=${examine_user_list_fp} || exit

Initial Model

The initial models (cloud models) are trained via train_global_model.py. The resulted models have been placed in cloud_models in the dataset-name_model-name format.

Evaluation

Run transfer.py for evaluating MPDA and the baselines. Use --help option to show the option list.

We run the scripts on the PAI platform, which itself handles parallel invocation and computation resource allocation. For other users, we here provide the commands for running in PC environment in commands. Each script corresponds to one run with a specific group of hyper-parameters. E.g., transfer_amazon-din_m-50.sh corresponds to the run using DIN on Amazon Electronics dataset with 50 matched users. For compatibility, the scripts specify CPU as the computing device, and you can change the "device" option for running on other devices. Please spread the tasks to multiple GPUs for accelerating in practice. The default setting in the scripts with 15 cpu tasks might cost unacceptably long time to complete.

Visualizing

Please run visualize to produce the files necessary for visualizing, and use tensorboard to view the results.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
cloud_models		cloud_models
commands		commands
scripts		scripts
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Dependencies

Data Preprocess

Initial Model

Evaluation

Visualizing

About

Releases

Packages

Languages

mikudehuane/MPDA

Folders and files

Latest commit

History

Repository files navigation

Dependencies

Data Preprocess

Initial Model

Evaluation

Visualizing

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages