Getting Started

Download the Pangool bootstrap project, and start coding with Pangool.

Developing with Pangool

Maven artifacts are available for Pangool. We also provide a "bootstrap" project to simplify getting started with Pangool.

Maven artifact for Hadoop 0.20.X and 1.X

  <dependency>
    <groupId>com.datasalt.pangool</groupId>
    <artifactId>pangool-core</artifactId>
    <version>0.70</version>
  </dependency> 
   

Also include a Hadoop dependency for the Hadoop version you are using, preferably with scope "provided", since the cluster supplies the Hadoop classes at runtime.
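
For a Hadoop 0.20.X or 1.X cluster, the Hadoop dependency might look like the following sketch. The artifact org.apache.hadoop:hadoop-core is the usual one for these release lines. The version 1.2.1 is only an assumption; replace it with the version your cluster runs.

```xml
  <!-- Assumed version: replace 1.2.1 with your cluster's Hadoop version -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
    <!-- "provided": available at compile time, not bundled into the job jar -->
    <scope>provided</scope>
  </dependency>
```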

Maven artifact for Hadoop 2.X and YARN

  <dependency>
    <groupId>com.datasalt.pangool</groupId>
    <artifactId>pangool-core</artifactId>
    <version>0.70</version>
    <classifier>mr2</classifier>
  </dependency>
   

Also include a Hadoop dependency for the Hadoop version you are using (preferably with scope "provided").
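
For Hadoop 2.X, a matching Hadoop dependency could be declared as in the sketch below. It assumes the org.apache.hadoop:hadoop-client artifact, which pulls in the MapReduce 2 client libraries. The version 2.7.7 is an assumption; replace it with the version your cluster runs.

```xml
  <!-- Assumed version: replace 2.7.7 with your cluster's Hadoop version -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.7</version>
    <!-- "provided": available at compile time, not bundled into the job jar -->
    <scope>provided</scope>
  </dependency>
```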

Bootstrap project

Download our bootstrap project, a Maven project ready to be used as the basis of your project.

Using Pangool Bootstrap

The Pangool bootstrap project contains everything needed to start working with Pangool. It is ready to build, run, and manage dependencies, and it is easy to use from Eclipse.


Requirements:


Download bootstrap project

Download the bootstrap project from here. Decompress it into a folder.


Compiling & testing

Run the following command to compile the project and run its tests:

mvn install

Executing sort example

The bootstrap project contains a simple Pangool example that sorts its input. The following command runs this example in local mode (no Hadoop cluster is needed):

cd app-module
mvn exec:java -Dexec.classpathScope="compile" -Dexec.mainClass="com.datasalt.pangool.bootstrap.Driver" -Dexec.args="sort pom.xml pom.xml.copy"

Executing with Hadoop

After mvn install completes successfully, a jar file ready to run on a Hadoop cluster will be in the folder app-module/target. You can run it with Hadoop as follows:

mvn install
hadoop dfs -put pom.xml pom.xml
cd app-module/target
hadoop jar pangool-bootstrap-*-job.jar sort pom.xml pomout

Importing into Eclipse

Run the following command from the project folder to generate an Eclipse project:

mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true

Then import the project into Eclipse from the File > Import... menu.

