public abstract class FileMergingSnapshotManagerBase extends Object implements FileMergingSnapshotManager
Base implementation of FileMergingSnapshotManager.

| Modifier and Type | Class and Description |
|---|---|
| protected static class | FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack: This class wraps a DirectoryStreamStateHandle with a count of references held by ongoing checkpoints. |

Nested classes/interfaces inherited from interface FileMergingSnapshotManager: FileMergingSnapshotManager.SpaceStat, FileMergingSnapshotManager.SubtaskKey

| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.flink.core.fs.Path | checkpointDir |
| protected PhysicalFilePool.Type | filePoolType: Type of physical file pool. |
| protected org.apache.flink.core.fs.FileSystem | fs: The FileSystem that this manager works on. |
| protected Executor | ioExecutor: The executor for I/O operations in this manager. |
| protected Object | lock: Guard for initFileSystem(FileSystem, Path, Path, Path, int), restoreStateHandles(long, SubtaskKey, Stream<SegmentFileStateHandle>) and uploadedStates. |
| protected org.apache.flink.core.fs.Path | managedExclusiveStateDir: The private state files are merged across subtasks, so there is only one directory for merged files within one TM per job. |
| protected FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack | managedExclusiveStateDirHandle: The DirectoryStreamStateHandle with its ongoing checkpoint reference count for the private state directory, one for each TaskManager and job. |
| protected long | maxPhysicalFileSize: Max size for a physical file. |
| protected float | maxSpaceAmplification |
| protected FileMergingMetricGroup | metricGroup: The metric group for file merging snapshot manager. |
| protected PhysicalFile.PhysicalFileDeleter | physicalFileDeleter |
| protected org.apache.flink.core.fs.Path | sharedStateDir |
| protected boolean | shouldSyncAfterClosingLogicalFile: File-system dependent value. |
| protected FileMergingSnapshotManager.SpaceStat | spaceStat: The current space statistic, updated on file creation/deletion. |
| protected org.apache.flink.core.fs.Path | taskOwnedStateDir |
| protected TreeMap<Long,Set<LogicalFile>> | uploadedStates |
| protected int | writeBufferSize: The buffer size for writing files to the file system. |

| Constructor and Description |
|---|
| FileMergingSnapshotManagerBase(String id, long maxFileSize, PhysicalFilePool.Type filePoolType, float maxSpaceAmplification, Executor ioExecutor, org.apache.flink.metrics.MetricGroup parentMetricGroup) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| boolean | couldReusePreviousStateHandle(StreamStateHandle stateHandle): Check whether previous state handles can still be reused, considering the space amplification. |
| FileMergingCheckpointStateOutputStream | createCheckpointStateOutputStream(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope): Create a new FileMergingCheckpointStateOutputStream. |
| protected LogicalFile | createLogicalFile(PhysicalFile physicalFile, long startOffset, long length, FileMergingSnapshotManager.SubtaskKey subtaskKey): Create a logical file on a physical file. |
| protected PhysicalFile | createPhysicalFile(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope): Create a physical file in the right location (managed directory), which is specified by the scope of this checkpoint and the current subtask. |
| protected PhysicalFilePool | createPhysicalPool(): Create a physical pool according to filePoolType. |
| protected void | deletePhysicalFile(org.apache.flink.core.fs.Path filePath, long size): Delete a physical file at the given file path. |
| protected void | discardCheckpoint(long checkpointId): The callback that is triggered when all subtasks have discarded the checkpoint (aborted or subsumed). |
| void | discardSingleLogicalFile(LogicalFile logicalFile, long checkpointId) |
| protected org.apache.flink.core.fs.Path | generatePhysicalFilePath(org.apache.flink.core.fs.Path dirPath): Generate a file path for a physical file. |
| String | getId() |
| LogicalFile | getLogicalFile(LogicalFile.LogicalFileId fileId) |
| org.apache.flink.core.fs.Path | getManagedDir(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope): Get the managed directory of the file-merging snapshot manager, created in FileMergingSnapshotManager.initFileSystem(FileSystem, Path, Path, Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(SubtaskKey). |
| DirectoryStreamStateHandle | getManagedDirStateHandle(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope): Get the DirectoryStreamStateHandle of the managed directory, created in FileMergingSnapshotManager.initFileSystem(FileSystem, Path, Path, Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(SubtaskKey). |
| protected abstract PhysicalFile | getOrCreatePhysicalFileForCheckpoint(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope): Get a reused physical file or create one. |
| void | initFileSystem(org.apache.flink.core.fs.FileSystem fileSystem, org.apache.flink.core.fs.Path checkpointBaseDir, org.apache.flink.core.fs.Path sharedStateDir, org.apache.flink.core.fs.Path taskOwnedStateDir, int writeBufferSize): Initialize the file system, recording the checkpoint path the manager should work with. |
| void | notifyCheckpointAborted(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId): This method is called as a notification once a distributed checkpoint has been aborted. |
| void | notifyCheckpointComplete(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId): Notifies the manager that the checkpoint with the given checkpointId completed and was committed. |
| void | notifyCheckpointStart(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId): org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl uses this method to let the file merging manager know that an ongoing checkpoint may reference the managed dirs. |
| void | notifyCheckpointSubsumed(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId): This method is called as a notification once a distributed checkpoint has been subsumed. |
| void | registerSubtaskForSharedStates(FileMergingSnapshotManager.SubtaskKey subtaskKey): Register a subtask and create the managed directory for shared states. |
| void | restoreStateHandles(long checkpointId, FileMergingSnapshotManager.SubtaskKey subtaskKey, Stream<SegmentFileStateHandle> stateHandles): Restore and re-register the SegmentFileStateHandles into FileMergingSnapshotManager. |
| protected abstract void | returnPhysicalFileForNextReuse(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, PhysicalFile physicalFile): Try to return an existing physical file to the manager for next reuse. |
| void | reusePreviousStateHandle(long checkpointId, Collection<? extends StreamStateHandle> stateHandles): A callback method which is called when previous state handles are reused by following checkpoint(s). |
| void | unregisterSubtask(FileMergingSnapshotManager.SubtaskKey subtaskKey): Unregister a subtask. |

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface FileMergingSnapshotManager: isFileMergingHandle

protected final Executor ioExecutor

protected final Object lock
Guard for initFileSystem(FileSystem, Path, Path, Path, int), restoreStateHandles(long, SubtaskKey, Stream<SegmentFileStateHandle>) and uploadedStates.

protected TreeMap<Long,Set<LogicalFile>> uploadedStates

protected org.apache.flink.core.fs.FileSystem fs
The FileSystem that this manager works on.

protected org.apache.flink.core.fs.Path checkpointDir

protected org.apache.flink.core.fs.Path sharedStateDir

protected org.apache.flink.core.fs.Path taskOwnedStateDir

protected int writeBufferSize

protected boolean shouldSyncAfterClosingLogicalFile

protected long maxPhysicalFileSize

protected PhysicalFilePool.Type filePoolType

protected final float maxSpaceAmplification

protected PhysicalFile.PhysicalFileDeleter physicalFileDeleter

protected org.apache.flink.core.fs.Path managedExclusiveStateDir

protected FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack managedExclusiveStateDirHandle
The DirectoryStreamStateHandle with its ongoing checkpoint reference count for the private state directory, one for each TaskManager and job.

protected FileMergingSnapshotManager.SpaceStat spaceStat

protected FileMergingMetricGroup metricGroup

public FileMergingSnapshotManagerBase(String id, long maxFileSize, PhysicalFilePool.Type filePoolType, float maxSpaceAmplification, Executor ioExecutor, org.apache.flink.metrics.MetricGroup parentMetricGroup)
public void initFileSystem(org.apache.flink.core.fs.FileSystem fileSystem,
org.apache.flink.core.fs.Path checkpointBaseDir,
org.apache.flink.core.fs.Path sharedStateDir,
org.apache.flink.core.fs.Path taskOwnedStateDir,
int writeBufferSize)
throws IllegalArgumentException
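Description copied from interface: FileMergingSnapshotManager
Initialize the file system, recording the checkpoint path the manager should work with.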
The layout of the checkpoint directory:

    /user-defined-checkpoint-dir
        /{job-id} (checkpointBaseDir)
            |
            + --shared/
                |
                + --subtask-1/
                    + -- merged shared state files
                + --subtask-2/
                    + -- merged shared state files
            + --taskowned/
                + -- merged private state files
            + --chk-1/
            + --chk-2/
            + --chk-3/

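The layout above can be read as a mapping from checkpoint scope to a managed directory. The following is a minimal sketch of that mapping using only java.nio.file; the class, the `Scope` enum and the `managedDir` helper are hypothetical names for illustration and are not part of the Flink API. It assumes shared state goes under `shared/{subtask}` and private state directly under `taskowned`, as drawn in the layout; the actual manager may add further subdirectories.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ManagedDirLayoutSketch {

    /** Hypothetical stand-in for the two checkpoint scopes. */
    enum Scope { SHARED, EXCLUSIVE }

    /**
     * Resolves the directory that merged files of the given scope would live in.
     * subtaskDirName is a hypothetical directory name such as "subtask-1".
     */
    static Path managedDir(Path checkpointBaseDir, String subtaskDirName, Scope scope) {
        if (scope == Scope.SHARED) {
            // Shared state files are merged per subtask: {base}/shared/{subtask}/
            return checkpointBaseDir.resolve("shared").resolve(subtaskDirName);
        }
        // Private state files are merged across subtasks: {base}/taskowned/
        return checkpointBaseDir.resolve("taskowned");
    }

    public static void main(String[] args) {
        Path base = Paths.get("user-defined-checkpoint-dir", "job-id");
        System.out.println(managedDir(base, "subtask-1", Scope.SHARED));
        System.out.println(managedDir(base, "subtask-2", Scope.SHARED));
        System.out.println(managedDir(base, "subtask-1", Scope.EXCLUSIVE));
    }
}
```
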
Directories are initialized in this method rather than in the constructor because the
FileMergingSnapshotManager itself belongs to the TaskStateManager, which is
initialized when a task is received, while the base directories for checkpoints are created by
FsCheckpointStorageAccess when the state backend initializes for each subtask. After the
checkpoint directories are initialized, the managed subdirectories are initialized here.
Note: This method may be called several times. The implementation should be
idempotent and throw IllegalArgumentException when any of the path parameters changes
across calls.
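
The idempotency requirement in the note above could be met along the following lines. This is a minimal sketch under stated assumptions, not the actual Flink implementation: the class is hypothetical, java.nio.file.Path stands in for Flink's Path, and the file system and write buffer size arguments are left out. The first call records the three paths; later calls succeed only if every path is unchanged.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class IdempotentInitSketch {
    private final Object lock = new Object();
    private Path checkpointDir; // stays null until the first call
    private Path sharedStateDir;
    private Path taskOwnedStateDir;

    /** May be called once per subtask; only the first call records the paths. */
    public void initFileSystem(Path checkpointBaseDir, Path sharedDir, Path taskOwnedDir) {
        synchronized (lock) {
            if (checkpointDir == null) {
                checkpointDir = checkpointBaseDir;
                sharedStateDir = sharedDir;
                taskOwnedStateDir = taskOwnedDir;
                // A real manager would also create its managed subdirectories here.
                return;
            }
            // Repeated call: accept it only if every path is unchanged.
            if (!checkpointDir.equals(checkpointBaseDir)
                    || !sharedStateDir.equals(sharedDir)
                    || !taskOwnedStateDir.equals(taskOwnedDir)) {
                throw new IllegalArgumentException(
                        "The checkpoint paths changed across initFileSystem calls.");
            }
        }
    }

    public static void main(String[] args) {
        IdempotentInitSketch manager = new IdempotentInitSketch();
        Path base = Paths.get("checkpoints", "job-id");
        manager.initFileSystem(base, base.resolve("shared"), base.resolve("taskowned"));
        // Same arguments again: accepted without changing any state.
        manager.initFileSystem(base, base.resolve("shared"), base.resolve("taskowned"));
        try {
            manager.initFileSystem(base, base.resolve("other"), base.resolve("taskowned"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```
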
Specified by: initFileSystem in interface FileMergingSnapshotManager
Parameters:
fileSystem - The filesystem to write to.
checkpointBaseDir - The base directory for checkpoints.
sharedStateDir - The directory for shared checkpoint data.
taskOwnedStateDir - The name of the directory for state not owned/released by the master, but by the TaskManagers.
writeBufferSize - The buffer size for writing files to the file system.
Throws:
IllegalArgumentException - thrown if these three paths are not deterministic across calls.

public void registerSubtaskForSharedStates(FileMergingSnapshotManager.SubtaskKey subtaskKey)
Description copied from interface: FileMergingSnapshotManager
Register a subtask and create the managed directory for shared states.
Specified by: registerSubtaskForSharedStates in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying a subtask.
See Also: initFileSystem for layout information.

public void unregisterSubtask(FileMergingSnapshotManager.SubtaskKey subtaskKey)
Description copied from interface: FileMergingSnapshotManager
Unregister a subtask.
Specified by: unregisterSubtask in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying a subtask.

protected LogicalFile createLogicalFile(@Nonnull PhysicalFile physicalFile, long startOffset, long length, @Nonnull FileMergingSnapshotManager.SubtaskKey subtaskKey)
Create a logical file on a physical file.
Parameters:
physicalFile - the underlying physical file.
startOffset - the offset in the physical file that the logical file starts from.
length - the length of the logical file.
subtaskKey - the id of the subtask that the logical file belongs to.

@Nonnull protected PhysicalFile createPhysicalFile(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope) throws IOException
Create a physical file in the right location (managed directory), which is specified by the scope of this checkpoint and the current subtask.
Parameters:
subtaskKey - the SubtaskKey of the current subtask.
scope - the scope of the checkpoint.
Throws:
IOException - if anything goes wrong with file system.

public FileMergingCheckpointStateOutputStream createCheckpointStateOutputStream(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Create a new FileMergingCheckpointStateOutputStream. According to the file merging strategy, the streams returned by multiple calls to this function may share the same underlying physical file, and each stream writes to a segment of the physical file.
Specified by: createCheckpointStateOutputStream in interface FileMergingSnapshotManager
Parameters:
subtaskKey - The subtask key identifying the subtask.
checkpointId - ID of the checkpoint.
scope - The state's scope, whether it is exclusive or shared.

protected org.apache.flink.core.fs.Path generatePhysicalFilePath(org.apache.flink.core.fs.Path dirPath)
Generate a file path for a physical file.
Parameters:
dirPath - the parent directory path for the physical file.

protected final void deletePhysicalFile(org.apache.flink.core.fs.Path filePath, long size)
Delete a physical file at the given file path.
Parameters:
filePath - the given file path to delete.

protected final PhysicalFilePool createPhysicalPool()
Create a physical pool according to filePoolType.

@Nonnull protected abstract PhysicalFile getOrCreatePhysicalFileForCheckpoint(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope) throws IOException
Get a reused physical file or create one.
Basic logic of file reusing: whenever a physical file is needed, this method is called
with the information necessary for acquiring a file. The file will not be reused until
it is written and returned to the reuse pool by calling returnPhysicalFileForNextReuse(SubtaskKey, long, PhysicalFile).
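
The acquire-and-return cycle described above can be sketched with a simple queue. This is an illustration under stated assumptions rather than the behaviour of any concrete subclass: `FileStub` and the class itself are hypothetical, the subtask key, checkpoint ID and scope arguments are omitted, and the rule that a file at the size limit is not pooled again is an assumed policy suggested by the maxPhysicalFileSize field.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class PhysicalFileReuseSketch {

    /** Hypothetical stand-in for a physical file; only its size is tracked. */
    static final class FileStub {
        final int id;
        long size;

        FileStub(int id) {
            this.id = id;
        }
    }

    private final Queue<FileStub> reusePool = new ArrayDeque<>();
    private final long maxPhysicalFileSize;
    private int nextId;

    PhysicalFileReuseSketch(long maxPhysicalFileSize) {
        this.maxPhysicalFileSize = maxPhysicalFileSize;
    }

    /** Takes a file out of the pool, or creates one if the pool is empty. */
    synchronized FileStub getOrCreatePhysicalFile() {
        // poll() removes the file, so no other caller can get it while it is in use.
        FileStub pooled = reusePool.poll();
        return pooled != null ? pooled : new FileStub(nextId++);
    }

    /** Offers a written file for reuse; files at the size limit are not pooled (assumed policy). */
    synchronized void returnPhysicalFileForNextReuse(FileStub file) {
        if (file.size < maxPhysicalFileSize) {
            reusePool.add(file);
        }
    }

    public static void main(String[] args) {
        PhysicalFileReuseSketch manager = new PhysicalFileReuseSketch(100);
        FileStub first = manager.getOrCreatePhysicalFile();
        FileStub second = manager.getOrCreatePhysicalFile(); // first is still in use
        System.out.println("distinct files while in use: " + (first != second));
        first.size += 40; // pretend a segment was written
        manager.returnPhysicalFileForNextReuse(first);
        System.out.println("reused after return: " + (manager.getOrCreatePhysicalFile() == first));
    }
}
```

Removing the file from the pool on acquisition is what keeps it from being handed out twice before it is returned.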
Parameters:
subtaskKey - the subtask key for the caller
checkpointId - the checkpoint id
scope - checkpoint scope
Throws:
IOException - thrown if anything goes wrong with file system.

protected abstract void returnPhysicalFileForNextReuse(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, PhysicalFile physicalFile) throws IOException
Try to return an existing physical file to the manager for next reuse.
Basic logic of file reusing: see getOrCreatePhysicalFileForCheckpoint(SubtaskKey, long, CheckpointedStateScope).
Parameters:
subtaskKey - the subtask key for the caller
checkpointId - the checkpoint in which this physical file was requested.
physicalFile - the physical file being returned.
Throws:
IOException - thrown if anything goes wrong with file system.
See Also: #getOrCreatePhysicalFileForCheckpoint(SubtaskKey, long, CheckpointedStateScope)

protected void discardCheckpoint(long checkpointId) throws IOException
The callback that is triggered when all subtasks have discarded the checkpoint (aborted or subsumed).
Parameters:
checkpointId - the discarded checkpoint id.
Throws:
IOException - if anything goes wrong with file system.

public void notifyCheckpointStart(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId)
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl uses this method to let the file merging manager know that an ongoing checkpoint may reference the managed dirs.
Specified by: notifyCheckpointStart in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been started.

public void notifyCheckpointComplete(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
Notifies the manager that the checkpoint with the given checkpointId completed and was committed.
Specified by: notifyCheckpointComplete in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been completed.
Throws:
Exception - thrown if anything goes wrong with the listener.

public void notifyCheckpointAborted(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
This method is called as a notification once a distributed checkpoint has been aborted.
Specified by: notifyCheckpointAborted in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been aborted.
Throws:
Exception - thrown if anything goes wrong with the listener.

public void notifyCheckpointSubsumed(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
This method is called as a notification once a distributed checkpoint has been subsumed.
Specified by: notifyCheckpointSubsumed in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been subsumed.
Throws:
Exception - thrown if anything goes wrong with the listener.

public void reusePreviousStateHandle(long checkpointId, Collection<? extends StreamStateHandle> stateHandles)
Description copied from interface: FileMergingSnapshotManager
A callback method which is called when previous state handles are reused by following checkpoint(s).
Specified by: reusePreviousStateHandle in interface FileMergingSnapshotManager
Parameters:
checkpointId - the checkpoint that reuses the handles.
stateHandles - the handles to be reused.

public boolean couldReusePreviousStateHandle(StreamStateHandle stateHandle)
Description copied from interface: FileMergingSnapshotManager
Check whether previous state handles can still be reused, considering the space amplification.
Specified by: couldReusePreviousStateHandle in interface FileMergingSnapshotManager
Parameters:
stateHandle - the handle to be reused.

public void discardSingleLogicalFile(LogicalFile logicalFile, long checkpointId) throws IOException
Throws:
IOException

public org.apache.flink.core.fs.Path getManagedDir(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Get the managed directory of the file-merging snapshot manager, created in FileMergingSnapshotManager.initFileSystem(FileSystem, Path, Path, Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(SubtaskKey).
Specified by: getManagedDir in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
scope - the checkpoint scope.

public DirectoryStreamStateHandle getManagedDirStateHandle(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Get the DirectoryStreamStateHandle of the managed directory, created in FileMergingSnapshotManager.initFileSystem(FileSystem, Path, Path, Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(SubtaskKey).
Specified by: getManagedDirStateHandle in interface FileMergingSnapshotManager
Parameters:
subtaskKey - the subtask key identifying the subtask.
scope - the checkpoint scope.
Returns:
the DirectoryStreamStateHandle for one subtask in the specified checkpoint scope.

public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Throws:
IOException

@VisibleForTesting public String getId()

public void restoreStateHandles(long checkpointId, FileMergingSnapshotManager.SubtaskKey subtaskKey, Stream<SegmentFileStateHandle> stateHandles)
Description copied from interface: FileMergingSnapshotManager
Restore and re-register the SegmentFileStateHandles into FileMergingSnapshotManager.
Specified by: restoreStateHandles in interface FileMergingSnapshotManager
Parameters:
checkpointId - the restored checkpoint id.
subtaskKey - the subtask key identifying the subtask.
stateHandles - the restored segment file handles.

@VisibleForTesting public LogicalFile getLogicalFile(LogicalFile.LogicalFileId fileId)

Copyright © 2014–2025 The Apache Software Foundation. All rights reserved.