usage
There are several ways to start a job. script/local.sh starts several processes on the local machine. Here is an example that runs sparse logistic regression on the rcv1 dataset (use data/rcv1_binary.sh to prepare the data) with 1 server and 2 workers.
scheduler="role:SCHEDULER,hostname:'127.0.0.1',port:8000,id:'H'"
W0="role:WORKER,hostname:'127.0.0.1',port:8001,id:'W0'"
W1="role:WORKER,hostname:'127.0.0.1',port:8002,id:'W1'"
S0="role:SERVER,hostname:'127.0.0.1',port:8010,id:'S0'"
arg="-num_servers 1 -num_workers 2 -num_threads 2 -app ../config/rcv1_l1lr.conf"
bin="../bin/ps"
${bin} ${arg} -scheduler ${scheduler} -my_node ${scheduler} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${W0} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${S0} &
${bin} ${arg} -scheduler ${scheduler} -my_node ${W1} &
Each invocation of bin starts a node instance, which requires three kinds of arguments. First, it needs to know who it is, via -my_node. Second, it needs to know where the scheduler is, via -scheduler. Third, it accepts arguments such as the number of servers, the number of workers, how many threads a node can use, and the configuration of the application, which is a file in protobuf ASCII format.
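The three kinds of arguments can be seen side by side in a small dry-run sketch. It assumes the node descriptor format shown in the local example (role, hostname, port, id); make_node is a hypothetical helper, not part of the project, and the commands are only printed, not executed.

```shell
# Hypothetical helper (not part of the project): build a node descriptor
# in the same format as the local example. Arguments: role, port, id.
make_node() {
    printf "role:%s,hostname:'127.0.0.1',port:%s,id:'%s'" "$1" "$2" "$3"
}

# Where the scheduler is: passed to every node via -scheduler.
scheduler="$(make_node SCHEDULER 8000 H)"

# Shared job arguments: node counts, threads, and the application config.
arg="-num_servers 1 -num_workers 2 -num_threads 2 -app ../config/rcv1_l1lr.conf"
bin="../bin/ps"

# Dry run: print one command per node instead of launching it.
# Only -my_node differs between the commands.
for node in "$scheduler" \
            "$(make_node SERVER 8010 S0)" \
            "$(make_node WORKER 8001 W0)" \
            "$(make_node WORKER 8002 W1)"; do
    echo "${bin} ${arg} -scheduler ${scheduler} -my_node ${node}"
done
```

The output is four command lines that share the same -scheduler and job arguments and differ only in -my_node.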
You can start the job over multiple machines with mpirun. An example is script/mpi_root.sh config/rcv1_mpi.conf. Slightly different from local.sh, here you can provide the hostnames of all available machines in hostfile; these are often listed in /etc/hosts. (If you skip hostfile, only the local machine will be used.) Next, you need to provide the network interface so the program can find the IP address. The other arguments are similar to those of local.sh.
You can create more node instances than the actual number of machines in the hostfile. Also remember to start the job at one of the machines listed in the hostfile.
num_workers=2
num_servers=2
num_threads=4
app_conf=../config/rcv1_l1lr.conf
network_port=8000
network_interface=eth0
hostfile=../config/hosts
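Because the settings above are plain key=value lines, a shell script can read them by sourcing the file. The following is a minimal sketch of that idea, not the actual contents of mpi_root.sh. It assumes that the job needs one scheduler in addition to the servers and workers, as in the local example.

```shell
# Write the example settings to a temporary file so the sketch is self-contained.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
num_workers=2
num_servers=2
num_threads=4
app_conf=../config/rcv1_l1lr.conf
network_port=8000
network_interface=eth0
hostfile=../config/hosts
EOF

# Each line is a valid shell assignment, so sourcing the file defines the variables.
. "$conf"
rm -f "$conf"

# Assumption: one scheduler plus all servers and workers, as in the local example.
num_nodes=$((num_workers + num_servers + 1))

echo "node instances to start: ${num_nodes}"
echo "interface: ${network_interface}, port: ${network_port}, hostfile: ${hostfile}"
```

Under that assumption, this configuration needs five node instances, which may be more than the number of machines listed in the hostfile.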
More options, such as yarn, docker, …, will come soon.