com.splout.db.hadoop
Class TupleSQLite4JavaOutputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
          extended by com.splout.db.hadoop.TupleSQLite4JavaOutputFormat
All Implemented Interfaces:
java.io.Serializable

public class TupleSQLite4JavaOutputFormat
extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
implements java.io.Serializable

An OutputFormat that accepts Pangool Tuples and writes them to SQLite files through sqlite4java. Tuples written to it must conform to a particular schema: they must contain an integer "_partition" field, which determines the database file each Tuple goes to (a file named <partition>.db).

The different schemas that will be given to this OutputFormat are defined in the constructor by providing a TableSpec for each table. A TableSpec also carries information such as pre-SQL and post-SQL statements, but most notably contains a Schema from which a CREATE TABLE statement can be derived automatically. Note that the Schema provided to TableSpec doesn't need to contain a "_partition" field or be nullable.
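As an illustration of how a CREATE TABLE statement could be derived automatically from a schema's field names and types, here is a minimal self-contained sketch. The class and method names below are hypothetical and are not Splout's actual implementation; they only show the general derivation idea described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CreateTableSketch {

  // Derive a CREATE TABLE statement from an ordered (field name -> SQL type) map.
  // LinkedHashMap preserves the schema's field order in the generated columns.
  static String createTable(String tableName, Map<String, String> fields) {
    StringJoiner cols = new StringJoiner(", ");
    for (Map.Entry<String, String> f : fields.entrySet()) {
      cols.add(f.getKey() + " " + f.getValue());
    }
    return "CREATE TABLE " + tableName + " (" + cols + ");";
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("id", "INTEGER");
    fields.put("name", "TEXT");
    System.out.println(createTable("users", fields));
  }
}
```

The real derivation in Splout additionally has to map Pangool types to SQLite types and handle the "_partition" field; this sketch only captures the name-and-type-to-DDL step.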

See Also:
Serialized Form

Nested Class Summary
static class TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException
          Exception thrown if the OutputFormat cannot be instantiated because the specified parameters are inconsistent or invalid.
 class TupleSQLite4JavaOutputFormat.TupleSQLRecordWriter
          A RecordWriter that accepts an int (the partition) and a Tuple, converts the Tuple into SQL, assigns the partition that comes in the key, and delegates to a SQLiteOutputFormat.SQLRecordWriter.
 
Field Summary
static org.apache.commons.logging.Log LOG
           
static java.lang.String PARTITION_TUPLE_FIELD
           
 
Constructor Summary
TupleSQLite4JavaOutputFormat(int batchSize, TableSpec... dbSpec)
          This OutputFormat receives a list of TableSpec.
 
Method Summary
protected static java.lang.String[] getCreateIndexes(TableSpec... tableSpecs)
           
protected static java.lang.String[] getCreateTables(TableSpec... tableSpecs)
           
 org.apache.hadoop.mapreduce.RecordWriter<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputPath
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARTITION_TUPLE_FIELD

public static final java.lang.String PARTITION_TUPLE_FIELD
See Also:
Constant Field Values

LOG

public static org.apache.commons.logging.Log LOG
Constructor Detail

TupleSQLite4JavaOutputFormat

public TupleSQLite4JavaOutputFormat(int batchSize,
                                    TableSpec... dbSpec)
                             throws TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException
This OutputFormat receives a list of TableSpec: the different tables that will be created. Incoming Pangool Tuples identify, through their Schema, which table they belong to. The batch size is the number of SQL statements to execute before issuing a COMMIT.

Throws:
TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException
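The batch-commit semantics described above (execute batchSize statements, then COMMIT, flushing any remainder on close) can be sketched as follows. This is an illustrative stand-in, not Splout's actual writer: the class names and the in-memory commit log are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchWriterSketch {

  // Simulates batch-commit behavior: count executed statements and
  // record a COMMIT point every batchSize statements.
  static class BatchWriter {
    final int batchSize;
    int pending = 0;   // statements since the last COMMIT
    int executed = 0;  // total statements executed
    final List<Integer> commitPoints = new ArrayList<>(); // executed-count at each COMMIT

    BatchWriter(int batchSize) { this.batchSize = batchSize; }

    void execute(String sql) {
      executed++;
      pending++;
      if (pending == batchSize) {
        commit();
      }
    }

    void commit() {
      if (pending > 0) {
        commitPoints.add(executed);
        pending = 0;
      }
    }

    // Closing the writer flushes any statements still pending.
    void close() { commit(); }
  }

  public static void main(String[] args) {
    BatchWriter w = new BatchWriter(3);
    for (int i = 0; i < 7; i++) w.execute("INSERT ...");
    w.close();
    System.out.println(w.commitPoints); // [3, 6, 7]
  }
}
```

With batchSize = 3 and 7 statements, COMMITs happen after statements 3 and 6, and close() flushes the seventh; larger batch sizes trade durability granularity for fewer transaction boundaries.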
Method Detail

getCreateTables

protected static java.lang.String[] getCreateTables(TableSpec... tableSpecs)
                                             throws TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException
Throws:
TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException

getCreateIndexes

protected static java.lang.String[] getCreateIndexes(TableSpec... tableSpecs)
                                              throws TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException
Throws:
TupleSQLite4JavaOutputFormat.TupleSQLiteOutputFormatException

getRecordWriter

public org.apache.hadoop.mapreduce.RecordWriter<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                                                           throws java.io.IOException,
                                                                                                                                  java.lang.InterruptedException
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
Throws:
java.io.IOException
java.lang.InterruptedException


Copyright © 2012-2013 Datasalt Systems S.L. All Rights Reserved.