Skip to content

Pangool-Flow is an experimental module on top of Pangool (http://pangool.net) which adds automatic flow building and management, parallel execution and high-level constructs.

Notifications You must be signed in to change notification settings

datasalt/pangool-flow

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Pangool-flow

Pangool-flow is an experimental module on top of Pangool (http://pangool.net).

Pangool-flow adds to Pangool:

  • The possibility of chaining Pangool MapReduce Jobs and executing the resultant flow.
  • Parallel execution of Jobs in a Flow.
  • High-level constructs (operations) similar to those in Cascading, called "Ops".

The difference between Pangool-flow and other flow-based APIs such as Pig or Cascading is that Pangool-flow is built around the MapReduce abstraction. So, each step in the flow is represented by a MapReduceStep. Therefore, it is possible to tune every MapReduceStep as much as needed, and so there is no flexibility tradeoff involved in using pangool-flow.

About

Pangool-Flow is an experimental module on top of Pangool (http://pangool.net) which adds automatic flow building and management, parallel execution and high-level constructs.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published