com.splout.db.hadoop
Class SQLiteOutputFormat
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
com.splout.db.hadoop.SQLiteOutputFormat
- All Implemented Interfaces:
- java.io.Serializable
public class SQLiteOutputFormat
- extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
- implements java.io.Serializable
Low-level Pangool OutputFormat that can be used to generate partitioned SQL views in Hadoop. It accepts Tuples that
have a "sql" string field and a "partition" integer field; each partition generates a separate .db file.
Furthermore, the OutputFormat accepts a list of initial SQL statements that are executed for each partition when
its database is created (e.g. CREATE TABLE and the like). It also accepts finalization statements (e.g. CREATE INDEX).
This OutputFormat can be used as a basis for creating more complex OutputFormats such as
TupleSQLite4JavaOutputFormat.
Note that using this OutputFormat directly can result in poorly performing Jobs, since it cannot cache
PreparedStatements and has to create a new Statement for every SQL string it receives.
- See Also:
- Serialized Form
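Because this OutputFormat creates a new Statement for every SQL string it receives, each record must carry a complete, self-contained SQL statement in its "sql" field, along with a "partition" integer choosing the target .db file. A minimal sketch of how such values might be produced; the table name, columns, and partitioning scheme here are purely illustrative assumptions, not part of Splout's API:

```java
// Sketch: building the "sql" and "partition" values that SQLiteOutputFormat
// expects in each Tuple. Table and column names are hypothetical.
public class SqlTupleSketch {

    // Each record must be a full SQL statement, because this OutputFormat
    // creates a new Statement per string instead of caching PreparedStatements.
    static String insertSql(int id, String name) {
        // Naive quote-escaping for illustration only; real code should
        // sanitize values properly before embedding them in SQL text.
        return String.format("INSERT INTO users VALUES (%d, '%s');",
                id, name.replace("'", "''"));
    }

    // A simple hash partitioner deciding which partition (.db file)
    // a row lands in. Any deterministic scheme would work.
    static int partitionFor(int id, int nPartitions) {
        return Math.abs(Integer.hashCode(id)) % nPartitions;
    }

    public static void main(String[] args) {
        String sql = insertSql(1, "O'Brien");
        int partition = partitionFor(1, 4);
        System.out.println(sql + " -> partition " + partition);
    }
}
```

In a real job, each (sql, partition) pair would be set on a Pangool Tuple conforming to SQLiteOutputFormat.SCHEMA and emitted through the framework.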
Field Summary
static org.apache.commons.logging.Log LOG
static com.datasalt.pangool.io.Schema SCHEMA
Constructor Summary
SQLiteOutputFormat(java.lang.String[] initSqlStatements, java.lang.String[] endSqlStatements, int batchSize)
Method Summary
org.apache.hadoop.mapreduce.RecordWriter<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputPath
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LOG
public static org.apache.commons.logging.Log LOG
SCHEMA
public static final com.datasalt.pangool.io.Schema SCHEMA
SQLiteOutputFormat
public SQLiteOutputFormat(java.lang.String[] initSqlStatements,
java.lang.String[] endSqlStatements,
int batchSize)
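A hedged sketch of arguments one might pass to this constructor: initSqlStatements run once per partition database when it is created (e.g. CREATE TABLE), endSqlStatements run during finalization (e.g. CREATE INDEX), and batchSize presumably controls how many statements are grouped per batch/commit. The schema and batch size below are illustrative assumptions, not values taken from Splout:

```java
// Illustrative arguments for the
// SQLiteOutputFormat(initSqlStatements, endSqlStatements, batchSize) constructor.
// The table schema and batch size are hypothetical.
public class SqliteOutputFormatArgs {

    static final String[] INIT_SQL = {
        // Executed when each partition's .db file is created.
        "CREATE TABLE users (id INTEGER, name TEXT);"
    };

    static final String[] END_SQL = {
        // Executed after all records are written (finalization).
        "CREATE INDEX idx_users_id ON users(id);"
    };

    // Assumed meaning: statements per batch/commit; 1000 is an arbitrary choice.
    static final int BATCH_SIZE = 1000;

    public static void main(String[] args) {
        // In a real job one would pass these to the constructor, roughly:
        //   new SQLiteOutputFormat(INIT_SQL, END_SQL, BATCH_SIZE)
        System.out.println(INIT_SQL.length + " init, " + END_SQL.length + " end");
    }
}
```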
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws java.io.IOException,
java.lang.InterruptedException
- Specified by:
getRecordWriter
in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<com.datasalt.pangool.io.ITuple,org.apache.hadoop.io.NullWritable>
- Throws:
java.io.IOException
java.lang.InterruptedException
Copyright © 2012-2013 Datasalt Systems S.L.. All Rights Reserved.