public class ParallelRunner extends Object implements Closeable
This class is intended to be used in the following pattern. This example uses the serializeToFile() method.
Closer closer = Closer.create();
try {
  // Do stuff
  ParallelRunner runner = closer.register(new ParallelRunner(threads, fs));
  runner.serializeToFile(state1, outputFilePath1);
  // Submit more serialization tasks
  runner.serializeToFile(stateN, outputFilePathN);
  // Do stuff
} catch (Throwable e) {
  throw closer.rethrow(e);
} finally {
  closer.close();
}
Note that calling close() will wait for all submitted tasks to complete and then stop the
ParallelRunner by shutting down the ExecutorService.
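The close() behavior described above can be illustrated with a small stand-in built only on java.util.concurrent. This is a sketch under stated assumptions, not the ParallelRunner source: the class name CloseWaitsSketch, its submit() method, and the runAndCount() helper are hypothetical and exist only for this illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; this is not the ParallelRunner source.
class CloseWaitsSketch implements Closeable {
  private final ExecutorService executor;
  private final List<Future<Void>> futures = new ArrayList<>();

  CloseWaitsSketch(int threads) {
    this.executor = Executors.newFixedThreadPool(threads);
  }

  // Submit a task and return immediately, remembering its Future.
  void submit(Callable<Void> task) {
    futures.add(executor.submit(task));
  }

  // Block until every submitted task has finished, then shut the pool down.
  @Override
  public void close() throws IOException {
    try {
      for (Future<Void> future : futures) {
        future.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException(e);
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    } finally {
      executor.shutdown();
    }
  }

  // Runs `tasks` short tasks and returns how many had finished when close() returned.
  static int runAndCount(int tasks) {
    AtomicInteger done = new AtomicInteger();
    CloseWaitsSketch runner = new CloseWaitsSketch(2);
    try {
      for (int i = 0; i < tasks; i++) {
        runner.submit(() -> {
          Thread.sleep(20);
          done.incrementAndGet();
          return null;
        });
      }
    } finally {
      try {
        runner.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    return done.get();
  }

  public static void main(String[] args) {
    System.out.println(runAndCount(5));
  }
}
```

Because close() blocks on every Future before shutting the pool down, the counter read after close() reflects all submitted tasks.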
| Modifier and Type | Class and Description |
|---|---|
| static class | ParallelRunner.FailPolicy - Policies indicating how ParallelRunner should handle failure of tasks. |
| static class | ParallelRunner.NamedFuture - A future with a name / message for reporting. |
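To make the two nested types more concrete, the following sketch shows one way a failure policy and a named future could interact. Everything in it is an assumption made for illustration: the policy names STOP_ON_FIRST_FAILURE and COLLECT_ALL_FAILURES, the Named class, and the failedNames() and demo() helpers are hypothetical, and the actual FailPolicy constants are not listed on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; names and behavior are assumptions, not the library's API.
class FailPolicySketch {
  // Hypothetical policy names chosen for this example only.
  enum Policy { STOP_ON_FIRST_FAILURE, COLLECT_ALL_FAILURES }

  // A future paired with a name that is used only for reporting.
  static final class Named {
    final String name;
    final Future<Void> future;

    Named(String name, Future<Void> future) {
      this.name = name;
      this.future = future;
    }
  }

  // Waits on the futures in submission order and returns the names of the failed
  // tasks it observed; under STOP_ON_FIRST_FAILURE it stops at the first failure.
  static List<String> failedNames(List<Named> tasks, Policy policy)
      throws InterruptedException {
    List<String> failed = new ArrayList<>();
    for (Named task : tasks) {
      try {
        task.future.get();
      } catch (ExecutionException e) {
        failed.add(task.name);
        if (policy == Policy.STOP_ON_FIRST_FAILURE) {
          break;
        }
      }
    }
    return failed;
  }

  // Submits one succeeding and two failing tasks, then applies the given policy.
  static List<String> demo(Policy policy) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    Callable<Void> ok = () -> null;
    Callable<Void> bad = () -> {
      throw new IllegalStateException("boom");
    };
    try {
      List<Named> tasks = new ArrayList<>();
      tasks.add(new Named("ok", executor.submit(ok)));
      tasks.add(new Named("bad-1", executor.submit(bad)));
      tasks.add(new Named("bad-2", executor.submit(bad)));
      return failedNames(tasks, policy);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(demo(Policy.STOP_ON_FIRST_FAILURE));
    System.out.println(demo(Policy.COLLECT_ALL_FAILURES));
  }
}
```

The name carried by each future is what makes the failure report readable: the caller learns which task failed, not just that one did.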
| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_PARALLEL_RUNNER_THREADS |
| static String | PARALLEL_RUNNER_THREADS_KEY |
| Constructor and Description |
|---|
| ParallelRunner(int threads, org.apache.hadoop.fs.FileSystem fs) |
| ParallelRunner(int threads, org.apache.hadoop.fs.FileSystem fs, ParallelRunner.FailPolicy failPolicy) |
| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| void | deletePath(org.apache.hadoop.fs.Path path, boolean recursive) - Delete a Path. |
| <T extends State> void | deserializeFromFile(T state, org.apache.hadoop.fs.Path inputFilePath) - Deserialize a State object from a file. |
| <T extends State> void | deserializeFromSequenceFile(Class<? extends org.apache.hadoop.io.Writable> keyClass, Class<T> stateClass, org.apache.hadoop.fs.Path inputFilePath, Collection<T> states, boolean deleteAfter) - Deserialize a list of State objects from a Hadoop SequenceFile. |
| void | movePath(org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFs, org.apache.hadoop.fs.Path dst, boolean overwrite, com.google.common.base.Optional<String> group) - Move a Path. |
| void | movePath(org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFs, org.apache.hadoop.fs.Path dst, com.google.common.base.Optional<String> group) - Move a Path. |
| void | renamePath(org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dst, com.google.common.base.Optional<String> group) - Rename a Path. |
| <T extends State> void | serializeToFile(T state, org.apache.hadoop.fs.Path outputFilePath) - Serialize a State object into a file. |
| void | submitCallable(Callable<Void> callable, String name) - Submit a callable to the thread pool. |
| void | waitForTasks() - Wait for all tasks submitted to this parallel runner, up to the default timeout (infinite if not specified). |
| void | waitForTasks(long timeoutInMills) - Wait for all submitted tasks to complete. |
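The submit-then-wait flow behind submitCallable and waitForTasks(long) can be sketched with plain java.util.concurrent types. This is a hedged illustration, not the library's implementation: WaitForTasksSketch and its methods are hypothetical, and applying the timeout to each task separately is an assumption, since this page does not say how the timeout is applied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; not the ParallelRunner implementation.
class WaitForTasksSketch {
  private final ExecutorService executor = Executors.newFixedThreadPool(2);
  private final List<Future<Void>> pending = new ArrayList<>();

  // Submit a task and return immediately.
  void submit(Callable<Void> task) {
    pending.add(executor.submit(task));
  }

  // Wait for each pending task, allowing up to timeoutMillis per task (an assumption),
  // then forget this batch so that new tasks can be submitted.
  void waitForTasks(long timeoutMillis) throws IOException {
    try {
      for (Future<Void> future : pending) {
        future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException(e);
    } catch (ExecutionException | TimeoutException e) {
      throw new IOException(e);
    } finally {
      pending.clear();
    }
  }

  // Interrupt anything still running and stop the pool.
  void stop() {
    executor.shutdownNow();
  }

  // Submits two batches with a wait after each; returns the number of completed tasks.
  static int twoBatches() {
    WaitForTasksSketch runner = new WaitForTasksSketch();
    AtomicInteger done = new AtomicInteger();
    Callable<Void> task = () -> {
      done.incrementAndGet();
      return null;
    };
    try {
      runner.submit(task);
      runner.submit(task);
      runner.waitForTasks(5_000);
      runner.submit(task); // the runner is reused after the first wait
      runner.waitForTasks(5_000);
      return done.get();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      runner.stop();
    }
  }

  // Returns true when a slow task makes the wait give up with an IOException.
  static boolean slowTaskTimesOut() {
    WaitForTasksSketch runner = new WaitForTasksSketch();
    try {
      runner.submit(() -> {
        Thread.sleep(2_000);
        return null;
      });
      runner.waitForTasks(50);
      return false;
    } catch (IOException e) {
      return true;
    } finally {
      runner.stop();
    }
  }

  public static void main(String[] args) {
    System.out.println(twoBatches());
    System.out.println(slowTaskTimesOut());
  }
}
```

Clearing the pending list after each wait is what lets the same runner accept a second batch in this sketch.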
public static final String PARALLEL_RUNNER_THREADS_KEY
public static final int DEFAULT_PARALLEL_RUNNER_THREADS
public ParallelRunner(int threads,
org.apache.hadoop.fs.FileSystem fs)
public ParallelRunner(int threads,
org.apache.hadoop.fs.FileSystem fs,
ParallelRunner.FailPolicy failPolicy)
public <T extends State> void serializeToFile(T state, org.apache.hadoop.fs.Path outputFilePath)
Serialize a State object into a file.
public <T extends State> void deserializeFromFile(T state, org.apache.hadoop.fs.Path inputFilePath)
Deserialize a State object from a file.
public <T extends State> void deserializeFromSequenceFile(Class<? extends org.apache.hadoop.io.Writable> keyClass, Class<T> stateClass, org.apache.hadoop.fs.Path inputFilePath, Collection<T> states, boolean deleteAfter)
Deserialize a list of State objects from a Hadoop SequenceFile.
This method submits a task to deserialize the State objects and returns immediately
after the task is submitted.
Type Parameters:
T - the State object type
Parameters:
stateClass - the Class object of the State class
inputFilePath - the input SequenceFile to read from
states - a Collection object to store the deserialized State objects
deleteAfter - a flag telling whether to delete the SequenceFile afterwards
public void deletePath(org.apache.hadoop.fs.Path path,
boolean recursive)
Delete a Path.
This method submits a task to delete a Path and returns immediately
after the task is submitted.
Parameters:
path - path to be deleted
public void renamePath(org.apache.hadoop.fs.Path src,
org.apache.hadoop.fs.Path dst,
com.google.common.base.Optional<String> group)
Rename a Path.
This method submits a task to rename a Path and returns immediately
after the task is submitted.
Parameters:
src - path to be renamed
dst - new path after rename
group - an optional group name for the destination path
public void movePath(org.apache.hadoop.fs.Path src,
org.apache.hadoop.fs.FileSystem dstFs,
org.apache.hadoop.fs.Path dst,
com.google.common.base.Optional<String> group)
Move a Path.
This method submits a task to move a Path and returns immediately
after the task is submitted.
Parameters:
src - path to be moved
dstFs - the destination FileSystem
dst - the destination path
group - an optional group name for the destination path
public void movePath(org.apache.hadoop.fs.Path src,
org.apache.hadoop.fs.FileSystem dstFs,
org.apache.hadoop.fs.Path dst,
boolean overwrite,
com.google.common.base.Optional<String> group)
Move a Path.
This method submits a task to move a Path and returns immediately
after the task is submitted.
Parameters:
src - path to be moved
dstFs - the destination FileSystem
dst - the destination path
overwrite - true to overwrite the destination
group - an optional group name for the destination path
public void submitCallable(Callable<Void> callable, String name)
Submit a callable to the thread pool.
This method submits a task and returns immediately.
Parameters:
callable - the callable to submit
name - the name for the future
public void waitForTasks(long timeoutInMills)
throws IOException
Wait for all submitted tasks to complete.
The ParallelRunner can be reused after this call.
Throws:
IOException
public void waitForTasks()
throws IOException
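Wait for all tasks submitted to this parallel runner, up to the default timeout (infinite if not specified).
Throws:
IOException
public void close()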
throws IOException
Specified by:
close in interface Closeable
close in interface AutoCloseable
Throws:
IOException
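Since close() is also specified by AutoCloseable, an instance can in principle be managed with a try-with-resources statement, which calls close() when the block exits. The sketch below demonstrates only that language mechanism with a hypothetical stand-in (RecordingRunner and its submit() method are assumptions for illustration); it does not exercise ParallelRunner itself.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of try-with-resources ordering with a Closeable.
class TryWithResourcesSketch {
  // Stand-in that records what happens to it; not the ParallelRunner class.
  static final class RecordingRunner implements Closeable {
    private final List<String> log;

    RecordingRunner(List<String> log) {
      this.log = log;
    }

    void submit(String name) {
      log.add("submit " + name);
    }

    @Override
    public void close() {
      log.add("close");
    }
  }

  // Returns the recorded events in the order they occurred.
  static List<String> demo() {
    List<String> log = new ArrayList<>();
    try (RecordingRunner runner = new RecordingRunner(log)) {
      runner.submit("a");
      runner.submit("b");
    }
    log.add("after");
    return log;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [submit a, submit b, close, after]
  }
}
```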