com.splout.db.hadoop
Class TableBuilder

java.lang.Object
  extended by com.splout.db.hadoop.TableBuilder

public class TableBuilder
extends java.lang.Object

This builder can be used to obtain Table beans. These beans can then be used to obtain a TablespaceSpec through TablespaceBuilder.


Nested Class Summary
static class TableBuilder.TableBuilderException
          Exception thrown if a Table cannot be built because required data is missing or the data specified is inconsistent.
 
Constructor Summary
TableBuilder(org.apache.hadoop.conf.Configuration hadoopConf)
          Hadoop configuration, no schema: The input files will contain the Schema (e.g. Tuple files / Cascading files).
TableBuilder(com.datasalt.pangool.io.Schema schema)
          Fixed schema constructor: for example, if we use textual files.
TableBuilder(java.lang.String tableName, org.apache.hadoop.conf.Configuration hadoopConf)
          Schema-less constructor with explicit table name.
TableBuilder(java.lang.String tableName, com.datasalt.pangool.io.Schema schema)
          Fixed schema + explicit table name.
 
Method Summary
 TableBuilder addCascadingTable(org.apache.hadoop.fs.Path path, java.lang.String[] columnNames)
           
 TableBuilder addCascadingTable(org.apache.hadoop.fs.Path inputPath, java.lang.String[] columnNames, org.apache.hadoop.conf.Configuration conf)
           
 TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path)
           
 TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path, char separator, char quoteCharacter, char escapeCharacter, boolean hasHeader, boolean strictQuotes, java.lang.String nullString)
           
 TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path, char separator, char quoteCharacter, char escapeCharacter, boolean hasHeader, boolean strictQuotes, java.lang.String nullString, com.datasalt.pangool.io.Schema fileSchema, RecordProcessor recordProcessor)
           
 TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path, com.datasalt.pangool.io.Schema fileSchema, RecordProcessor recordProcessor)
           
 TableBuilder addCSVTextFile(java.lang.String path)
           
 TableBuilder addCSVTextFile(java.lang.String path, char separator, char quoteCharacter, char escapeCharacter, boolean hasHeader, boolean strictQuotes, java.lang.String nullString)
           
 TableBuilder addCSVTextFile(java.lang.String path, char separator, char quoteCharacter, char escapeCharacter, boolean hasHeader, boolean strictQuotes, java.lang.String nullString, com.datasalt.pangool.io.Schema fileSchema, RecordProcessor recordProcessor)
           
 TableBuilder addCSVTextFile(java.lang.String path, com.datasalt.pangool.io.Schema fileSchema, RecordProcessor recordProcessor)
           
 TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat)
           
 TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat, java.util.Map<java.lang.String,java.lang.String> specificContext, RecordProcessor recordProcessor)
           
 TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat, RecordProcessor recordProcessor)
           
 TableBuilder addFile(TableInput tableFile)
           
 TableBuilder addFixedWidthTextFile(org.apache.hadoop.fs.Path path, com.datasalt.pangool.io.Schema schema, int[] fields, boolean hasHeader, java.lang.String nullString, RecordProcessor recordProcessor)
           
 TableBuilder addHiveTable(java.lang.String dbName, java.lang.String tableName)
           
 TableBuilder addHiveTable(java.lang.String dbName, java.lang.String tableName, org.apache.hadoop.conf.Configuration conf)
           
 TableBuilder addTupleFile(org.apache.hadoop.fs.Path path)
           
 TableBuilder addTupleFile(org.apache.hadoop.fs.Path path, RecordProcessor recordProcessor)
           
 Table build()
           
 TableBuilder createIndex(java.lang.String... indexFields)
           
 TableBuilder finalSQL(java.lang.String... finalSQLStatements)
           
 TableBuilder initialSQL(java.lang.String... initialSQLStatements)
           
 TableBuilder insertionSortOrder(com.datasalt.pangool.tuplemr.OrderBy orderBy)
           
 TableBuilder partitionBy(java.lang.String... partitionByFields)
           
 TableBuilder partitionByJavaScript(java.lang.String javascript)
           
 TableBuilder postInsertsSQL(java.lang.String... postInsertsSQLStatements)
           
 TableBuilder preInsertsSQL(java.lang.String... preInsertsSQLStatements)
           
 TableBuilder replicateToAll()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TableBuilder

public TableBuilder(com.datasalt.pangool.io.Schema schema)
Fixed schema constructor: for example, if we use textual files. The table name is extracted from the Schema name.


TableBuilder

public TableBuilder(java.lang.String tableName,
                    com.datasalt.pangool.io.Schema schema)
Fixed schema + explicit table name.


TableBuilder

public TableBuilder(org.apache.hadoop.conf.Configuration hadoopConf)
Hadoop configuration, no schema: The input files will contain the Schema (e.g. Tuple files / Cascading files).


TableBuilder

public TableBuilder(java.lang.String tableName,
                    org.apache.hadoop.conf.Configuration hadoopConf)
Schema-less constructor with explicit table name.

Method Detail

addFixedWidthTextFile

public TableBuilder addFixedWidthTextFile(org.apache.hadoop.fs.Path path,
                                          com.datasalt.pangool.io.Schema schema,
                                          int[] fields,
                                          boolean hasHeader,
                                          java.lang.String nullString,
                                          RecordProcessor recordProcessor)

addCSVTextFile

public TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path,
                                   char separator,
                                   char quoteCharacter,
                                   char escapeCharacter,
                                   boolean hasHeader,
                                   boolean strictQuotes,
                                   java.lang.String nullString,
                                   com.datasalt.pangool.io.Schema fileSchema,
                                   RecordProcessor recordProcessor)

addCSVTextFile

public TableBuilder addCSVTextFile(java.lang.String path,
                                   char separator,
                                   char quoteCharacter,
                                   char escapeCharacter,
                                   boolean hasHeader,
                                   boolean strictQuotes,
                                   java.lang.String nullString,
                                   com.datasalt.pangool.io.Schema fileSchema,
                                   RecordProcessor recordProcessor)

addCSVTextFile

public TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path,
                                   char separator,
                                   char quoteCharacter,
                                   char escapeCharacter,
                                   boolean hasHeader,
                                   boolean strictQuotes,
                                   java.lang.String nullString)

addCSVTextFile

public TableBuilder addCSVTextFile(java.lang.String path,
                                   char separator,
                                   char quoteCharacter,
                                   char escapeCharacter,
                                   boolean hasHeader,
                                   boolean strictQuotes,
                                   java.lang.String nullString)

addCSVTextFile

public TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path,
                                   com.datasalt.pangool.io.Schema fileSchema,
                                   RecordProcessor recordProcessor)

addCSVTextFile

public TableBuilder addCSVTextFile(java.lang.String path,
                                   com.datasalt.pangool.io.Schema fileSchema,
                                   RecordProcessor recordProcessor)

addCSVTextFile

public TableBuilder addCSVTextFile(org.apache.hadoop.fs.Path path)

addCSVTextFile

public TableBuilder addCSVTextFile(java.lang.String path)

addHiveTable

public TableBuilder addHiveTable(java.lang.String dbName,
                                 java.lang.String tableName)
                          throws java.io.IOException
Throws:
java.io.IOException

addHiveTable

public TableBuilder addHiveTable(java.lang.String dbName,
                                 java.lang.String tableName,
                                 org.apache.hadoop.conf.Configuration conf)
                          throws java.io.IOException
Throws:
java.io.IOException

addCascadingTable

public TableBuilder addCascadingTable(org.apache.hadoop.fs.Path path,
                                      java.lang.String[] columnNames)
                               throws java.io.IOException
Throws:
java.io.IOException

addCascadingTable

public TableBuilder addCascadingTable(org.apache.hadoop.fs.Path inputPath,
                                      java.lang.String[] columnNames,
                                      org.apache.hadoop.conf.Configuration conf)
                               throws java.io.IOException
Throws:
java.io.IOException

addCustomInputFormatFile

public TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path,
                                             org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat)
                                      throws java.io.IOException
Throws:
java.io.IOException

addCustomInputFormatFile

public TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path,
                                             org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat,
                                             RecordProcessor recordProcessor)
                                      throws java.io.IOException
Throws:
java.io.IOException

addCustomInputFormatFile

public TableBuilder addCustomInputFormatFile(org.apache.hadoop.fs.Path path,
                                             org.apache.hadoop.mapreduce.InputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> inputFormat,
                                             java.util.Map<java.lang.String,java.lang.String> specificContext,
                                             RecordProcessor recordProcessor)
                                      throws java.io.IOException
Throws:
java.io.IOException

addTupleFile

public TableBuilder addTupleFile(org.apache.hadoop.fs.Path path)
                          throws java.io.IOException
Throws:
java.io.IOException

addTupleFile

public TableBuilder addTupleFile(org.apache.hadoop.fs.Path path,
                                 RecordProcessor recordProcessor)
                          throws java.io.IOException
Throws:
java.io.IOException

initialSQL

public TableBuilder initialSQL(java.lang.String... initialSQLStatements)
Parameters:
initialSQLStatements - SQL statements that will be executed at the start of the process, just after some default PRAGMA statements and just before the CREATE TABLE statements.

preInsertsSQL

public TableBuilder preInsertsSQL(java.lang.String... preInsertsSQLStatements)
Parameters:
preInsertsSQLStatements - SQL statements that will be executed just after the CREATE TABLE statements but just before the INSERT statements used to insert data.

postInsertsSQL

public TableBuilder postInsertsSQL(java.lang.String... postInsertsSQLStatements)
Parameters:
postInsertsSQLStatements - SQL statements that will be executed just after all data is inserted but just before the CREATE INDEX statements.

finalSQL

public TableBuilder finalSQL(java.lang.String... finalSQLStatements)
Parameters:
finalSQLStatements - SQL statements that will be executed at the end of the process, just after the CREATE INDEX statements.
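
Taken together, the parameter descriptions of initialSQL, preInsertsSQL, postInsertsSQL and finalSQL imply a fixed ordering of phases. The following is a minimal, self-contained sketch of that ordering only. It does not use or reproduce the Splout implementation: the class SqlHookOrder, its methods, the table name demo and every SQL string are hypothetical placeholders chosen for illustration, and nothing is executed against a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of com.splout.db.hadoop.
public class SqlHookOrder {

    /** Concatenates the given phases in the order they are passed. */
    @SafeVarargs
    static List<String> assemble(List<String>... phases) {
        List<String> script = new ArrayList<>();
        for (List<String> phase : phases) {
            script.addAll(phase);
        }
        return script;
    }

    /** Builds a sample script; all statements are placeholders. */
    static List<String> demoScript() {
        return assemble(
            Arrays.asList("PRAGMA example_default = 1"),            // default PRAGMA statements (placeholder)
            Arrays.asList("PRAGMA example_initial = 1"),            // initialSQL(...)
            Arrays.asList("CREATE TABLE demo (a TEXT, b INTEGER)"), // CREATE TABLE statements
            Arrays.asList("SELECT 'pre-inserts hook'"),             // preInsertsSQL(...)
            Arrays.asList("INSERT INTO demo VALUES ('x', 1)"),      // INSERT statements
            Arrays.asList("SELECT 'post-inserts hook'"),            // postInsertsSQL(...)
            Arrays.asList("CREATE INDEX idx_demo_a ON demo (a)"),   // CREATE INDEX statements
            Arrays.asList("SELECT 'final hook'"));                  // finalSQL(...)
    }

    public static void main(String[] args) {
        // Print the statements in the order the descriptions above imply.
        for (String statement : demoScript()) {
            System.out.println(statement);
        }
    }
}
```

Read this way, statements that must see an empty table belong in preInsertsSQL, statements that should run on loaded but not yet indexed data belong in postInsertsSQL, and statements that rely on the indexes belong in finalSQL.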

createIndex

public TableBuilder createIndex(java.lang.String... indexFields)

partitionBy

public TableBuilder partitionBy(java.lang.String... partitionByFields)

partitionByJavaScript

public TableBuilder partitionByJavaScript(java.lang.String javascript)
                                   throws TableBuilder.TableBuilderException
Throws:
TableBuilder.TableBuilderException

replicateToAll

public TableBuilder replicateToAll()

addFile

public TableBuilder addFile(TableInput tableFile)

insertionSortOrder

public TableBuilder insertionSortOrder(com.datasalt.pangool.tuplemr.OrderBy orderBy)
                                throws TableBuilder.TableBuilderException
Throws:
TableBuilder.TableBuilderException

build

public Table build()
            throws TableBuilder.TableBuilderException
Throws:
TableBuilder.TableBuilderException


Copyright © 2012-2013 Datasalt Systems S.L. All Rights Reserved.