muli edited this page Oct 4, 2014 · 2 revisions

There are several options to start a job.

Run on the local machine

script/local.sh starts several processes on the local machine. The example below runs sparse logistic regression on the rcv1 dataset (use data/rcv1_binary.sh to prepare the data) with 1 server and 2 workers.

scheduler="role:SCHEDULER,hostname:'127.0.0.1',port:8000,id:'H'"
W0="role:WORKER,hostname:'127.0.0.1',port:8001,id:'W0'"
W1="role:WORKER,hostname:'127.0.0.1',port:8002,id:'W1'"
S0="role:SERVER,hostname:'127.0.0.1',port:8010,id:'S0'"
arg="-num_servers 1 -num_workers 2 -num_threads 2 -app ../config/rcv1_l1lr.conf"
bin="../bin/ps"

${bin} ${arg} -scheduler ${scheduler} -my_node ${scheduler} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${W0} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${S0} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${W1} &

Each invocation of bin starts one node instance, which requires three kinds of arguments. First, -my_node tells the node who it is. Second, -scheduler tells it where the scheduler is. Third, it accepts job-level arguments: the number of servers, the number of workers, how many threads a node can use, and the application configuration, which is a file in protobuf ASCII format.
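The three kinds of arguments can be assembled programmatically when the number of workers or servers changes. The following is a minimal sketch, not part of script/local.sh: the helper node_spec and the port numbering scheme are assumptions made for illustration, and the commands are only printed (a dry run) rather than executed.

```shell
# Hypothetical helper (not in the repository): format a node descriptor
# in the same style as the variables used in local.sh.
# $1 = role, $2 = port, $3 = id
node_spec() {
    printf "role:%s,hostname:'127.0.0.1',port:%s,id:'%s'" "$1" "$2" "$3"
}

num_workers=2
num_servers=1
arg="-num_servers $num_servers -num_workers $num_workers -num_threads 2 -app ../config/rcv1_l1lr.conf"
bin="../bin/ps"

scheduler=$(node_spec SCHEDULER 8000 H)

# Collect one descriptor per line: scheduler first, then workers, then servers.
nodes="$scheduler"
i=0
while [ "$i" -lt "$num_workers" ]; do
    # Assumed port scheme: workers start at 8001.
    nodes="$nodes
$(node_spec WORKER $((8001 + i)) "W$i")"
    i=$((i + 1))
done
i=0
while [ "$i" -lt "$num_servers" ]; do
    # Assumed port scheme: servers start at 8010.
    nodes="$nodes
$(node_spec SERVER $((8010 + i)) "S$i")"
    i=$((i + 1))
done

# Dry run: print the launch command for each node instead of starting it.
echo "$nodes" | while IFS= read -r node; do
    echo "$bin $arg -scheduler $scheduler -my_node $node &"
done
```

Every printed command shares the same -scheduler value and job-level arguments; only -my_node differs from one node to the next.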

Start the job by mpirun

You can start the job over multiple machines with mpirun. An example is script/mpi_root.sh config/rcv1_mpi.conf. Unlike local.sh, here you can provide a hostfile listing the hostnames of all available machines, which can often be found in /etc/hosts. (If you omit the hostfile, only the local machine will be used.) You also need to provide the network interface so the program can find the IP address. The other arguments are similar to those of local.sh.

You can create more node instances than there are machines in the hostfile. Also remember to start the job on one of the machines listed in the hostfile.

num_workers=2
num_servers=2
num_threads=4
app_conf=../config/rcv1_l1lr.conf

network_port=8000
network_interface=eth0
hostfile=../config/hosts

Other options

Support for more launchers, such as YARN, Docker, …, will come soon.
