public class PermanentBlobCache extends AbstractBlobCache implements JobPermanentBlobService
When requesting BLOBs via getFile(JobID, PermanentBlobKey), the cache will first
attempt to serve the file from its local cache. Only if the local cache does not contain the
desired BLOB will it try to download it from a distributed HA file system (if available) or the
BLOB server.
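
The lookup order described above can be sketched as follows. This is a minimal, self-contained model and not Flink's implementation: `TieredBlobLookupSketch`, `BlobSource`, and the string keys are hypothetical names introduced only for illustration, and a missing BLOB is reported through an unchecked wrapper around `FileNotFoundException` to keep the example short.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of the lookup order; these are not Flink classes. */
class TieredBlobLookupSketch {

    /** Hypothetical stand-in for a remote place that may hold a BLOB. */
    interface BlobSource {
        Optional<byte[]> fetch(String key);
    }

    private final Map<String, byte[]> localCache = new HashMap<>();
    private final BlobSource haStore; // null when no HA store is available
    private final BlobSource blobServer;
    int remoteFetches = 0; // counts lookups that had to leave the local cache

    TieredBlobLookupSketch(BlobSource haStore, BlobSource blobServer) {
        this.haStore = haStore;
        this.blobServer = blobServer;
    }

    byte[] get(String key) {
        // 1. Serve from the local cache when possible.
        byte[] cached = localCache.get(key);
        if (cached != null) {
            return cached;
        }
        remoteFetches++;
        // 2. Otherwise try the HA store (if any), then the BLOB server.
        Optional<byte[]> remote =
                haStore == null ? Optional.<byte[]>empty() : haStore.fetch(key);
        if (!remote.isPresent()) {
            remote = blobServer.fetch(key);
        }
        byte[] data = remote.orElseThrow(
                () -> new UncheckedIOException(
                        new FileNotFoundException("BLOB not found: " + key)));
        // 3. Remember the result so later requests stay local.
        localCache.put(key, data);
        return data;
    }

    public static void main(String[] args) {
        Map<String, byte[]> server = new HashMap<>();
        server.put("jar-1", new byte[] {42});
        TieredBlobLookupSketch cache =
                new TieredBlobLookupSketch(null, k -> Optional.ofNullable(server.get(k)));
        cache.get("jar-1"); // fetched from the (fake) BLOB server
        cache.get("jar-1"); // served from the local cache
        System.out.println("remote fetches: " + cache.remoteFetches);
    }
}
```

In the sketch, the second request for the same key never reaches a remote source, which is the property the class description relies on.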
If files for a job are not needed any more, they will enter a staged, i.e. deferred, cleanup. Files may thus still be accessible upon recovery and do not need to be re-downloaded.
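
One way to picture the staged cleanup is a per-job record that stores a reference count plus a "keep until" timestamp. The sketch below assumes exactly that model; `StagedCleanupSketch`, `JobState`, the `retentionMillis` parameter, and the explicit `now` arguments are hypothetical and chosen so the behaviour can be tested without a real clock or timer.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical model of deferred cleanup; not Flink's implementation. */
class StagedCleanupSketch {

    private static final class JobState {
        int references;
        long keepUntil = -1; // -1 means no cleanup is scheduled
    }

    private final Map<String, JobState> jobs = new HashMap<>();
    private final long retentionMillis; // assumed grace period before deletion

    StagedCleanupSketch(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    void register(String jobId) {
        JobState state = jobs.computeIfAbsent(jobId, k -> new JobState());
        state.references++;
        state.keepUntil = -1; // a new user cancels any pending cleanup
    }

    void release(String jobId, long now) {
        JobState state = jobs.get(jobId);
        if (state == null) {
            return;
        }
        state.references = Math.max(0, state.references - 1);
        if (state.references == 0) {
            // Do not delete right away: only mark the job as expiring.
            state.keepUntil = now + retentionMillis;
        }
    }

    /** One pass of the periodic cleanup; returns the number of jobs removed. */
    int cleanup(long now) {
        int removed = 0;
        Iterator<Map.Entry<String, JobState>> it = jobs.entrySet().iterator();
        while (it.hasNext()) {
            JobState state = it.next().getValue();
            if (state.references == 0 && state.keepUntil >= 0 && now >= state.keepUntil) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean isCached(String jobId) {
        return jobs.containsKey(jobId);
    }

    public static void main(String[] args) {
        StagedCleanupSketch cache = new StagedCleanupSketch(1000);
        cache.register("job");
        cache.release("job", 0);
        System.out.println("removed early: " + cache.cleanup(500));
        System.out.println("removed after grace period: " + cache.cleanup(1000));
    }
}
```

Because a released job survives until the grace period ends, a job that is re-registered during recovery finds its files still in place.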
Fields inherited from class AbstractBlobCache: blobClientConfig, blobView, log, numFetchRetries, readWriteLock, serverAddress, shutdownHook, shutdownRequested, storageDir, tempFileCounter

| Constructor and Description |
|---|
| PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, InetSocketAddress serverAddress) |
| PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker) |
| PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, InetSocketAddress serverAddress) Instantiates a new cache for permanent BLOBs which are also available in an HA store. |
| PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker) |

| Modifier and Type | Method and Description |
|---|---|
| protected void | cancelCleanupTask() Cancels any cleanup task that subclasses may be executing. |
| File | getFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key) Returns the path to a local copy of the file associated with the provided job ID and blob key. |
| int | getNumberOfReferenceHolders(org.apache.flink.api.common.JobID jobId) |
| File | getStorageLocation(org.apache.flink.api.common.JobID jobId, BlobKey key) Returns a file handle to the file associated with the given blob key on the blob server. |
| byte[] | readFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey blobKey) Returns the content of the file for the BLOB with the provided job ID and blob key. |
| void | registerJob(org.apache.flink.api.common.JobID jobId) Registers use of job-related BLOBs. |
| void | releaseJob(org.apache.flink.api.common.JobID jobId) Unregisters use of job-related BLOBs and allows them to be released. |

Methods inherited from class AbstractBlobCache: close, getFileInternal, getPort, getStorageDir, setBlobServerAddress

@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig,
File storageDir,
BlobView blobView,
@Nullable
InetSocketAddress serverAddress)
throws IOException
Throws:
IOException

@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig,
File storageDir,
BlobView blobView,
@Nullable
InetSocketAddress serverAddress,
BlobCacheSizeTracker blobCacheSizeTracker)
throws IOException
Throws:
IOException

public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig,
org.apache.flink.util.Reference<File> storageDir,
BlobView blobView,
@Nullable
InetSocketAddress serverAddress)
throws IOException
Parameters:
blobClientConfig - global configuration
storageDir - storage directory for the cached blobs
blobView - (distributed) HA blob store file system to retrieve files from first
serverAddress - address of the BlobServer to use for fetching files from, or null if none yet
Throws:
IOException - thrown if the (local or distributed) file storage cannot be created or is not usable

@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig,
org.apache.flink.util.Reference<File> storageDir,
BlobView blobView,
@Nullable
InetSocketAddress serverAddress,
BlobCacheSizeTracker blobCacheSizeTracker)
throws IOException
Throws:
IOException

public void registerJob(org.apache.flink.api.common.JobID jobId)
Using any other method to access BLOBs, e.g. getFile(org.apache.flink.api.common.JobID, org.apache.flink.runtime.blob.PermanentBlobKey), is only valid
between calls to registerJob(JobID) and releaseJob(JobID).
Specified by:
registerJob in interface JobPermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
See Also:
releaseJob(JobID)

public void releaseJob(org.apache.flink.api.common.JobID jobId)
Specified by:
releaseJob in interface JobPermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
See Also:
registerJob(JobID)

public int getNumberOfReferenceHolders(org.apache.flink.api.common.JobID jobId)
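
The pairing of registerJob and releaseJob is easiest to keep correct with a try/finally block. The sketch below uses a hypothetical in-memory stand-in, `FakeCache`, that borrows the three method names from this page but takes plain strings instead of JobID; it illustrates the calling pattern only and says nothing definitive about how PermanentBlobCache stores its counters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the register/release calling pattern. */
class JobReferenceSketch {

    /** In-memory stand-in; not the real PermanentBlobCache. */
    static final class FakeCache {
        private final Map<String, Integer> refs = new HashMap<>();

        synchronized void registerJob(String jobId) {
            refs.merge(jobId, 1, Integer::sum);
        }

        synchronized void releaseJob(String jobId) {
            // Simplification: drop the entry as soon as the count reaches zero.
            refs.computeIfPresent(jobId, (k, v) -> v > 1 ? v - 1 : null);
        }

        synchronized int getNumberOfReferenceHolders(String jobId) {
            return refs.getOrDefault(jobId, 0);
        }
    }

    /** Accesses the cache only between registerJob and releaseJob. */
    static int runTask(FakeCache cache, String jobId) {
        cache.registerJob(jobId);
        try {
            // BLOB access (e.g. a getFile call) would go here.
            return cache.getNumberOfReferenceHolders(jobId);
        } finally {
            // Always release, even if the BLOB access throws.
            cache.releaseJob(jobId);
        }
    }

    public static void main(String[] args) {
        FakeCache cache = new FakeCache();
        System.out.println("holders while running: " + runTask(cache, "job-a"));
        System.out.println("holders afterwards: " + cache.getNumberOfReferenceHolders("job-a"));
    }
}
```

Releasing in `finally` keeps the reference count balanced when a task fails, so the job's files can still become eligible for cleanup.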
public File getFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key) throws IOException
We will first attempt to serve the BLOB from the local storage. If the BLOB is not in
there, we will try to download it from the HA store, or directly from the BlobServer.
Specified by:
getFile in interface PermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
key - blob key associated with the requested file
Throws:
FileNotFoundException - if the BLOB does not exist
IOException - if any other error occurs when retrieving the file

public byte[] readFile(org.apache.flink.api.common.JobID jobId,
PermanentBlobKey blobKey)
throws IOException
The method will first attempt to serve the BLOB from the local cache. If the BLOB is not
in the cache, the method will try to download it from the HA store, or directly from the
BlobServer.
Compared to getFile, readFile reads the file completely while still holding
the write lock under which the file is accessed. This avoids the scenario in which a path is
returned and the file is then deleted concurrently by another thread before it can be read.
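
The difference between handing out a path and reading under the lock can be sketched with a plain `ReentrantReadWriteLock`. Everything here is an assumption made for illustration: `LockedReadSketch`, `getPath`, `readAll`, and `delete` are hypothetical names, and the sequential `demo` method only shows that a returned path can outlive the file, not an actual race between threads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical contrast between returning a path and reading under a lock. */
class LockedReadSketch {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Path file;

    LockedReadSketch(Path file) {
        this.file = file;
    }

    /** getFile-like: the lock is released before the caller touches the file. */
    Path getPath() {
        lock.writeLock().lock();
        try {
            return file;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** readFile-like: the bytes are fully read before the lock is released. */
    byte[] readAll() {
        lock.writeLock().lock();
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Cleanup-like: deletion takes the same lock, so it cannot interleave with readAll. */
    void delete() {
        lock.writeLock().lock();
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns {content was read completely, previously returned path still exists}. */
    static boolean[] demo() {
        try {
            Path tmp = Files.createTempFile("blob", ".bin");
            Files.write(tmp, new byte[] {1, 2, 3});
            LockedReadSketch cache = new LockedReadSketch(tmp);
            Path stale = cache.getPath();    // only a path, no content yet
            byte[] content = cache.readAll(); // content captured under the lock
            cache.delete();                   // simulates cleanup by another thread
            return new boolean[] {content.length == 3, Files.exists(stale)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("content read: " + result[0] + ", path still valid: " + result[1]);
    }
}
```

After the delete, the bytes obtained through `readAll` remain usable while the path from `getPath` points at nothing, which is the hazard the paragraph above describes.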
Specified by:
readFile in interface PermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
blobKey - BLOB key associated with the requested file
Throws:
FileNotFoundException - if the BLOB does not exist
IOException - if any other error occurs when retrieving the file

@VisibleForTesting
public File getStorageLocation(org.apache.flink.api.common.JobID jobId, BlobKey key) throws IOException
Parameters:
jobId - ID of the job this blob belongs to (or null if job-unrelated)
key - identifying the file
Throws:
IOException - if creating the directory fails

protected void cancelCleanupTask()
Description copied from class: AbstractBlobCache
Cancels any cleanup task that subclasses may be executing. This is called during AbstractBlobCache.close().
Specified by:
cancelCleanupTask in class AbstractBlobCache

Copyright © 2014–2025 The Apache Software Foundation. All rights reserved.