forked from ArchiveTeam/ArchiveBot
-
Notifications
You must be signed in to change notification settings - Fork 0
/
INSTALL.backend
127 lines (87 loc) · 5.46 KB
/
INSTALL.backend
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
To run the backend, you will need:
- a Redis 2.8+ server
- a CouchDB server
- a Ruby 1.9 installation (use of rvm is suggested)
- ZeroMQ 4.0.5 (earlier API-compatible versions may work, but they have not been tested)
- Bundler
- ExecJS supported runtime (for the dashboard)
(see https://github.com/sstephenson/execjs)
(Little known fact: ArchiveBot is made to be as hard as possible to set
up. If you have trouble with these instructions, drop by in IRC for
help, file an issue, or submit improvements through a pull request. You
can also take a look at the .travis.yml integration test config file.)
Quick install, for Debian and Debian-esque systems:
sudo apt-get update
sudo apt-get install bundler couchdb git tmux
git clone https://github.com/ArchiveTeam/ArchiveBot.git
cd ArchiveBot
git submodule update --init
bundle install
Next, install Redis. It is recommended that you build Redis from source (from http://redis.io/); version 2.8 or higher is recommended. On Debian/Ubuntu, you can do this, using version 2.8.17 as an example:
apt-get install build-essential tcl8.5
wget http://download.redis.io/releases/redis-2.8.17.tar.gz
tar xzf redis-2.8.17.tar.gz
cd redis-2.8.17
make
make test
sudo make install
If you also want to set up Redis as a daemonized (always-running) service on your Debian/Ubuntu machine on port 6379, follow up with this:
cd utils
sudo ./install_server.sh
(and then hit enter a bunch of times to accept the default values)
Next we need to configure CouchDB. But first, check to make sure it installed correctly and is currently running on your machine:
curl http://127.0.0.1:5984/
If it's running, you should get back something like this:
{"couchdb":"Welcome","uuid":"610e43c2778c3be750ad5fff8cadd108","version":"1.5.0","vendor":{"version":"14.04","name":"Ubuntu"}}
Now we need to load up CouchDB with the "archivebot" and "archivebot_logs" databases. You can do this from the command line with CURL:
curl -X PUT http://127.0.0.1:5984/archivebot
curl -X PUT http://127.0.0.1:5984/archivebot_logs
If that works, you should get this back as a response each time:
{"ok":true}
Now, go to the db/design_docs folder in ArchiveBot (full link: https://github.com/ArchiveTeam/ArchiveBot/tree/master/db/design_docs ). You might have installed it somewhere like /home/archivebot/ArchiveBot/db/design_docs . The four design documents in there need to be uploaded to the new archivebot database you just created. You can use CURL or you can use the Futon web interface at http://localhost:5984/_utils/index.html where you can copy and paste the content of the JSON files into new documents manually. If you want to use CURL instead, do this:
cd /home/archivebot/ArchiveBot/db/design_docs (or wherever you put your files)
curl -X PUT http://127.0.0.1:5984/archivebot/_design/archive_urls -d @archive_urls.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/ignore_patterns -d @ignore_patterns.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/jobs -d @jobs.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/user_agents -d @user_agents.json
Finally, you're going to need to install an IRC server (until such time as the ArchiveBot code is changed to allow for alternate ways of sending it instructions, such as Twitter). On Debian/Ubuntu, do this:
apt-get install ircd-hybrid
pico /etc/ircd-hybrid/ircd.conf (or use nano or vim, or whatever -- edit this file to add passwords and such)
sudo /etc/init.d/ircd-hybrid restart
Once that's all in place, run the following:
redis-server (unless it's already running -- and make sure that it does not have a password)
cd /home/archivebot/ArchiveBot/bot
bundle exec ruby bot.rb \
-s 'irc://your-irc-host:6667' \
-r 'redis://your-redis-host:6379/0' \
-c '#archivebot' -n 'YourBot'
The bot should join the IRC channel.
You can run the dashboard webapp on the same machine, or a different machine. From the root of ArchiveBot's repository, run
export REDIS_URL=redis://your-redis-host:6379/0
export UPDATES_CHANNEL=updates
export FIREHOSE_SOCKET_URL=tcp://127.0.0.1:12345
plumbing/updates-listener | plumbing/log-firehose
In another terminal, run
export FIREHOSE_SOCKET_URL=tcp://127.0.0.1:12345
plumbing/firehose-client | bundle exec ruby dashboard/app.rb -u http://your-dashboard-host:8080
Configure twitter_conf.json if you want to post Twitter Tweets.
The last part of ArchiveBot is a set of maintenance tasks. They are currently split between the cogs and plumbing directories; eventually, they will all move to plumbing.
In cogs:
1. Configure twitter_conf.json if you want to post Twitter Tweets.
2. Run the cogs with bundle exec ruby cogs/start.rb.
In plumbing:
1. bundle install (yes, again -- the plumbing currently has its own Gemfile)
2. export REDIS_URL=redis://your-redis-host:6379/0
3. export UPDATES_CHANNEL=updates
4. In separate terminals, tmux panes, or screen sessions, run
a. ./analyzer
b. ./trimmer > /dev/null
c. COUCHDB_URL=http://your-couchdb-url:5984/db-name ./recorder
The trimmer prints all the data it trims to standard output in the form
IDENT JSON
IDENT JSON
...
For the EFNet ArchiveBot, we redirect it to /dev/null because we currently don't do anything with that data.
To upgrade, run `git pull` and restart all programs.
bot.rb, dashboard/app.rb, and cogs/start.rb accept a --help option. Run
them with --help to see accepted options.