GT Big Data Club Installation Guide

Hi, this is the installation guide for GT Big Data Club. It contains instructions on how to install everything that you need to start hacking with us. Check out the section that you're interested in, grab a soda, and start installing!

Bootstrapping Scripts

This repository comes with bootstrapping scripts, in case you don't want to read through this documentation.

  1. Run the appropriate script for your operating system:

Windows: Run scripts/windows.cmd

Linux: Run scripts/linux.sh

Mac: Run scripts/mac.sh

NOTE: The above scripts will download a package manager to your computer to simplify downloading and updating packages in the future.

  2. Install Miniconda

Then, install the packages required by running this command:

conda env create -f environment.yml

Now, anytime you want to run any Big Data Club stuff, simply run

source activate big-data-club

and you will have the required packages in that shell! On conda 4.4 and later, conda activate big-data-club does the same thing.
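To confirm that the environment is active, a quick check from Python can help. This is a minimal sketch: the package names below are a hypothetical list chosen for illustration, since the actual contents of environment.yml are not reproduced in this guide.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical package list; substitute the names from environment.yml.
    wanted = ["numpy", "scipy", "flask", "requests"]
    missing = missing_packages(wanted)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

If anything shows up as missing, the environment is probably not activated in the current shell.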

General

There are certain tools and technologies that all parts of the team interact with. These are:

Package Manager

Package managers make it easier to download programs, handle updates, and set your PATH variable. Each operating system has its own package managers.

Windows: Chocolatey

Linux: apt-get

Mac: Homebrew

Using a package manager will make the following steps quite trivial, as there will be no need to open up your browser at all!

While others exist, the bootstrapping scripts use the package managers listed above. Feel free to use your favorite package manager!
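The OS-to-package-manager pairing above can be sketched as a small lookup. The function name `package_manager_for` is made up for this example, and the code only reports which manager this guide uses; it does not install anything.

```python
import platform

# Mapping taken from the list above; keys are the values platform.system() returns.
PACKAGE_MANAGERS = {
    "Windows": "Chocolatey",
    "Linux": "apt-get",
    "Darwin": "Homebrew",  # platform.system() reports macOS as "Darwin"
}


def package_manager_for(system=None):
    """Return the package manager this guide uses for an OS name, or None if it is not covered."""
    if system is None:
        system = platform.system()
    return PACKAGE_MANAGERS.get(system)


if __name__ == "__main__":
    print(package_manager_for() or "No package manager listed for this OS")
```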

Git

A free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

You can download Git by going to this link.

If you would like a visual client for Git, GitHub offers a cross-platform app here.
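Once Git is installed, one way to check that it is reachable from your shell is to look it up on PATH and ask for its version. This is a sketch only; `tool_version` is a hypothetical helper name, and it assumes the tool accepts a `--version` flag (Git does).

```python
import shutil
import subprocess


def tool_version(command):
    """Return the output of `<command> --version`, or None if the command is not on PATH."""
    if shutil.which(command) is None:
        return None
    result = subprocess.run([command, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print(tool_version("git") or "git was not found on PATH")
```

The same helper should work for other command-line tools that support `--version`.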

MongoDB

An extremely popular NoSQL database that organises data as documents of key-value pairs, instead of using tables and rows.

The installer for MongoDB can be obtained from the MongoDB website.

Also, optionally install RoboMongo, an admin tool for MongoDB.
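To get a feel for what "documents of key-value pairs" means, here is a small illustration using a plain Python dictionary. It makes no MongoDB driver calls, and the record and its field names are invented for the example.

```python
import json

# A hypothetical "member" record written as a document: nested key-value pairs.
member = {
    "name": "Ada",
    "year": 2,
    "interests": ["machine learning", "databases"],
    "contact": {"email": "ada@example.com"},
}

# A relational schema would typically spread this over several tables
# (members, interests, contacts); a document keeps it in one place.
print(json.dumps(member, indent=2))
print(member["contact"]["email"])
```

Documents stored in MongoDB have roughly this shape, which is why they map naturally onto Python dictionaries.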

Python and Pip

Python is a popular, high-level programming language with a multitude of uses. Pip is an installation tool that makes installing Python libraries relatively painless.

You can download the latest version of Python here.

Conda

Conda makes it easier to switch between different versions of Python, and pre-bundles several scientific computing packages for Python.

Server Frameworks

Flask

Flask is a Python microframework for writing web servers.
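A minimal sketch of what a Flask server looks like, assuming Flask is installed in the active environment. The route and message are placeholders, not part of any club project.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # Returning a string sends it as the response body with status 200.
    return "Hello from the Big Data Club!"

# Calling app.run() here would start Flask's built-in development server.
```

Flask's test client can exercise the route without starting a server: `app.test_client().get("/")` returns the response object for the request.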

Node.js and npm

An open source, cross-platform runtime environment for server-side and networking applications. npm is a package manager that comes bundled with Node.js.

The installer for Node.js can be found here.

Useful Python Libraries

NumPy is the fundamental package for scientific computing with Python.

SciPy builds on NumPy with additional routines for optimization, integration, linear algebra, and statistics.

NLTK is a leading platform for building Python programs to work with human language data.

scikit-learn is a leading Python library for data mining and data analysis.

Beautiful Soup is a Python library for parsing HTML.

Requests is a Python library for making HTTP requests.
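As a small taste of the first library in the list, here is a NumPy sketch, assuming NumPy is installed in the active environment. The numbers are made-up sample data.

```python
import numpy as np

# Hypothetical sample data: daily page views for one week.
views = np.array([120, 135, 150, 110, 160, 170, 155])

mean_views = views.mean()
# A boolean mask selects the days above the weekly average.
above_average = views[views > mean_views]

print(mean_views)
print(above_average)
```

Operations like `views > mean_views` apply to the whole array at once, which is the main reason NumPy code tends to avoid explicit loops.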