public abstract class AbstractHaServices extends Object implements HighAvailabilityServices
Implementations return the leader name for each component from getLeaderPathForResourceManager(), getLeaderPathForDispatcher(), getLeaderPathForJobManager(org.apache.flink.api.common.JobID), and getLeaderPathForRestServer(). The returned leader name is the ConfigMap name in Kubernetes and the child path in ZooKeeper.
close() and cleanupAllData() destroy the resources; subclasses supply the backend-specific steps through internalClose() and internalCleanup().
The abstract class is also responsible for determining which component services are reused. For example, jobResultStore is created once and can be reused many times.
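
The leader-name hooks described above can be illustrated with a minimal, self-contained sketch. Nothing here is Flink code: `LeaderNamesSketch`, its nested classes, the use of a plain `String` in place of `JobID`, and the concrete path and ConfigMap naming schemes are all assumptions chosen only to show how one set of hooks can map to a child path in one backend and a ConfigMap name in another.

```java
/** Hypothetical stand-in for the four leader-path hooks; not the Flink class. */
public class LeaderNamesSketch {

    /** Mirrors the shape of the abstract hooks; String stands in for JobID. */
    abstract static class LeaderNames {
        abstract String getLeaderPathForResourceManager();
        abstract String getLeaderPathForDispatcher();
        abstract String getLeaderPathForJobManager(String jobId);
        abstract String getLeaderPathForRestServer();
    }

    /** ZooKeeper flavour: each leader name is a child path (layout is assumed). */
    static final class ZooKeeperStyleNames extends LeaderNames {
        @Override String getLeaderPathForResourceManager() { return "/resource_manager"; }
        @Override String getLeaderPathForDispatcher() { return "/dispatcher"; }
        @Override String getLeaderPathForJobManager(String jobId) { return "/" + jobId; }
        @Override String getLeaderPathForRestServer() { return "/rest_server"; }
    }

    /** Kubernetes flavour: each leader name is a ConfigMap name (scheme is assumed). */
    static final class KubernetesStyleNames extends LeaderNames {
        private final String clusterId;

        KubernetesStyleNames(String clusterId) { this.clusterId = clusterId; }

        @Override String getLeaderPathForResourceManager() { return clusterId + "-resourcemanager-leader"; }
        @Override String getLeaderPathForDispatcher() { return clusterId + "-dispatcher-leader"; }
        @Override String getLeaderPathForJobManager(String jobId) { return clusterId + "-" + jobId + "-jobmanager-leader"; }
        @Override String getLeaderPathForRestServer() { return clusterId + "-restserver-leader"; }
    }

    public static void main(String[] args) {
        LeaderNames zk = new ZooKeeperStyleNames();
        LeaderNames k8s = new KubernetesStyleNames("my-cluster");
        // The same hook yields a backend-specific leader name.
        System.out.println(zk.getLeaderPathForJobManager("job-1"));
        System.out.println(k8s.getLeaderPathForJobManager("job-1"));
    }
}
```

The base class only ever asks for a name; what that name denotes is left entirely to the backend-specific subclass.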

| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.flink.configuration.Configuration | configuration: The runtime configuration. |
| protected Executor | ioExecutor: The executor to run external IO operations on. |
| protected org.slf4j.Logger | logger |

Fields inherited from interface HighAvailabilityServices: DEFAULT_JOB_ID, DEFAULT_LEADER_ID

| Modifier | Constructor and Description |
|---|---|
| protected | AbstractHaServices(org.apache.flink.configuration.Configuration config, LeaderElectionDriverFactory driverFactory, Executor ioExecutor, BlobStoreService blobStoreService, JobResultStore jobResultStore) |

| Modifier and Type | Method and Description |
|---|---|
| void | cleanupAllData(): Deletes all data stored by high availability services in external stores. |
| void | close(): Closes the high availability services, releasing all resources. |
| BlobStore | createBlobStore(): Creates the BLOB store in which BLOBs are stored in a highly-available fashion. |
| protected abstract CheckpointRecoveryFactory | createCheckpointRecoveryFactory(): Create the checkpoint recovery factory for the job manager. |
| protected abstract JobGraphStore | createJobGraphStore(): Create the submitted job graph store for the job manager. |
| protected abstract LeaderRetrievalService | createLeaderRetrievalService(String leaderName): Create leader retrieval service with specified leaderName. |
| CheckpointRecoveryFactory | getCheckpointRecoveryFactory(): Gets the checkpoint recovery factory for the job manager. |
| LeaderElection | getClusterRestEndpointLeaderElection(): Gets the LeaderElection for the cluster's rest endpoint. |
| LeaderRetrievalService | getClusterRestEndpointLeaderRetriever(): Get the leader retriever for the cluster's rest endpoint. |
| LeaderElection | getDispatcherLeaderElection(): Gets the LeaderElection for the cluster's dispatcher. |
| LeaderRetrievalService | getDispatcherLeaderRetriever(): Gets the leader retriever for the dispatcher. |
| JobGraphStore | getJobGraphStore(): Gets the submitted job graph store for the job manager. |
| LeaderElection | getJobManagerLeaderElection(org.apache.flink.api.common.JobID jobID): Gets the LeaderElection for the job with the given JobID. |
| LeaderRetrievalService | getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID): Gets the leader retriever for the job JobMaster which is responsible for the given job. |
| LeaderRetrievalService | getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID, String defaultJobManagerAddress): Gets the leader retriever for the job JobMaster which is responsible for the given job. |
| JobResultStore | getJobResultStore(): Gets the store that holds information about the state of finished jobs. |
| protected abstract String | getLeaderPathForDispatcher(): Get the leader path for Dispatcher. |
| protected abstract String | getLeaderPathForJobManager(org.apache.flink.api.common.JobID jobID): Get the leader path for specific JobManager. |
| protected abstract String | getLeaderPathForResourceManager(): Get the leader path for ResourceManager. |
| protected abstract String | getLeaderPathForRestServer(): Get the leader path for RestServer. |
| LeaderElection | getResourceManagerLeaderElection(): Gets the LeaderElection for the cluster's resource manager. |
| LeaderRetrievalService | getResourceManagerLeaderRetriever(): Gets the leader retriever for the cluster's resource manager. |
| CompletableFuture<Void> | globalCleanupAsync(org.apache.flink.api.common.JobID jobID, Executor executor): globalCleanupAsync is expected to be called from the main thread. |
| protected abstract void | internalCleanup(): Clean up the meta data in the distributed system(e.g. |
| protected abstract void | internalCleanupJobData(org.apache.flink.api.common.JobID jobID): Clean up the meta data in the distributed system(e.g. |
| protected abstract void | internalClose(): Closes the components which is used for external operations(e.g. |

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface HighAvailabilityServices: closeWithOptionalClean, getWebMonitorLeaderElection, getWebMonitorLeaderRetriever

protected final org.slf4j.Logger logger

protected final Executor ioExecutor
The executor to run external IO operations on.

protected final org.apache.flink.configuration.Configuration configuration
The runtime configuration.

protected AbstractHaServices(org.apache.flink.configuration.Configuration config, LeaderElectionDriverFactory driverFactory, Executor ioExecutor, BlobStoreService blobStoreService, JobResultStore jobResultStore)

public LeaderRetrievalService getResourceManagerLeaderRetriever()
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the cluster's resource manager.
Specified by: getResourceManagerLeaderRetriever in interface HighAvailabilityServices

public LeaderRetrievalService getDispatcherLeaderRetriever()
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the dispatcher.
Specified by: getDispatcherLeaderRetriever in interface HighAvailabilityServices

public LeaderRetrievalService getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID)
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the job JobMaster which is responsible for the given job.
Specified by: getJobManagerLeaderRetriever in interface HighAvailabilityServices
Parameters:
jobID - The identifier of the job.

public LeaderRetrievalService getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID, String defaultJobManagerAddress)
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the job JobMaster which is responsible for the given job.
Specified by: getJobManagerLeaderRetriever in interface HighAvailabilityServices
Parameters:
jobID - The identifier of the job.
defaultJobManagerAddress - JobManager address which will be returned by a static leader retrieval service.

public LeaderRetrievalService getClusterRestEndpointLeaderRetriever()
Description copied from interface: ClientHighAvailabilityServices
Get the leader retriever for the cluster's rest endpoint.
Specified by: getClusterRestEndpointLeaderRetriever in interface ClientHighAvailabilityServices
Specified by: getClusterRestEndpointLeaderRetriever in interface HighAvailabilityServices

public LeaderElection getResourceManagerLeaderElection()
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's resource manager.
Specified by: getResourceManagerLeaderElection in interface HighAvailabilityServices

public LeaderElection getDispatcherLeaderElection()
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's dispatcher.
Specified by: getDispatcherLeaderElection in interface HighAvailabilityServices

public LeaderElection getJobManagerLeaderElection(org.apache.flink.api.common.JobID jobID)
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the job with the given JobID.
Specified by: getJobManagerLeaderElection in interface HighAvailabilityServices

public LeaderElection getClusterRestEndpointLeaderElection()
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's rest endpoint.
Specified by: getClusterRestEndpointLeaderElection in interface HighAvailabilityServices

public CheckpointRecoveryFactory getCheckpointRecoveryFactory() throws Exception
Description copied from interface: HighAvailabilityServices
Gets the checkpoint recovery factory for the job manager.
Specified by: getCheckpointRecoveryFactory in interface HighAvailabilityServices
Throws: Exception

public JobGraphStore getJobGraphStore() throws Exception
Description copied from interface: HighAvailabilityServices
Gets the submitted job graph store for the job manager.
Specified by: getJobGraphStore in interface HighAvailabilityServices
Throws: Exception - if the submitted job graph store could not be created

public JobResultStore getJobResultStore() throws Exception
Description copied from interface: HighAvailabilityServices
Gets the store that holds information about the state of finished jobs.
Specified by: getJobResultStore in interface HighAvailabilityServices
Throws: Exception - if job result store could not be created

public BlobStore createBlobStore()
Description copied from interface: HighAvailabilityServices
Creates the BLOB store in which BLOBs are stored in a highly-available fashion.
Specified by: createBlobStore in interface HighAvailabilityServices

public void close() throws Exception
Description copied from interface: HighAvailabilityServices
Closes the high availability services, releasing all resources.
This method does not delete or clean up any data stored in external stores (file systems, ZooKeeper, etc). Another instance of the high availability services will be able to recover the job.
If an exception occurs during closing services, this method will attempt to continue closing other services and report exceptions only after all services have been attempted to be closed.
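
The "keep closing, report afterwards" behaviour described above can be sketched as follows. This is an illustration of the general pattern only, not the class's actual implementation: `CloseAllSketch` and `closeAll` are hypothetical names, and attaching later failures to the first one through `Throwable.addSuppressed` is an assumption about how the collected exceptions might be reported.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: close every resource, then report failures once at the end. */
public class CloseAllSketch {

    /**
     * Attempts to close every resource in order. A failure does not stop the loop;
     * the first exception is rethrown after the loop with later ones suppressed on it.
     */
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<AutoCloseable> services = new ArrayList<>();
        services.add(() -> { throw new IllegalStateException("leader election service failed"); });
        services.add(() -> System.out.println("blob store closed"));
        services.add(() -> { throw new IllegalStateException("job graph store failed"); });
        try {
            closeAll(services);
        } catch (Exception e) {
            System.out.println("reported: " + e.getMessage()
                    + ", suppressed: " + e.getSuppressed().length);
        }
    }
}
```

The same shape applies to a multi-step cleanup: every step runs, and the caller sees a single exception that carries the rest.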
Specified by: close in interface AutoCloseable
Specified by: close in interface HighAvailabilityServices
Throws: Exception - Thrown, if an exception occurred while closing these services.

public void cleanupAllData() throws Exception
Description copied from interface: HighAvailabilityServices
Deletes all data stored by high availability services in external stores.
After this method was called, any job or session that was managed by these high availability services will be unrecoverable.
If an exception occurs during cleanup, this method will attempt to continue the cleanup and report exceptions only after all cleanup steps have been attempted.
Specified by: cleanupAllData in interface HighAvailabilityServices
Throws: Exception - if an error occurred while cleaning up data stored by them.

public CompletableFuture<Void> globalCleanupAsync(org.apache.flink.api.common.JobID jobID, Executor executor)
Description copied from interface: GloballyCleanableResource
globalCleanupAsync is expected to be called from the main thread. Heavy IO tasks should be outsourced into the passed cleanupExecutor. Thread-safety must be ensured.
Specified by: globalCleanupAsync in interface GloballyCleanableResource
Specified by: globalCleanupAsync in interface HighAvailabilityServices
Parameters:
jobID - The JobID of the job for which the local data should be cleaned up.
executor - The fallback executor for IO-heavy operations.

protected abstract LeaderRetrievalService createLeaderRetrievalService(String leaderName)
Create leader retrieval service with specified leaderName.
Parameters:
leaderName - ConfigMap name in Kubernetes or child node path in Zookeeper.

protected abstract CheckpointRecoveryFactory createCheckpointRecoveryFactory() throws Exception
Create the checkpoint recovery factory for the job manager.
Throws: Exception

protected abstract JobGraphStore createJobGraphStore() throws Exception
Create the submitted job graph store for the job manager.
Throws: Exception - if the submitted job graph store could not be created

protected abstract void internalClose() throws Exception
Throws: Exception - if the close operation failed

protected abstract void internalCleanup() throws Exception
If an exception occurs during internal cleanup, we will continue the cleanup in cleanupAllData() and report exceptions only after all cleanup steps have been attempted.
Throws: Exception - if the cleanup operation on external storage fails.

protected abstract void internalCleanupJobData(org.apache.flink.api.common.JobID jobID) throws Exception
Parameters:
jobID - The identifier of the job to clean up.
Throws: Exception - if the cleanup operation on external storage fails.

protected abstract String getLeaderPathForResourceManager()
Get the leader path for ResourceManager.

protected abstract String getLeaderPathForDispatcher()
Get the leader path for Dispatcher.

protected abstract String getLeaderPathForJobManager(org.apache.flink.api.common.JobID jobID)
Get the leader path for specific JobManager.
Parameters:
jobID - job id

protected abstract String getLeaderPathForRestServer()
Get the leader path for RestServer.
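
As a closing illustration of the globalCleanupAsync contract (called from the main thread, with heavy IO handed to the passed executor), here is a minimal sketch using only the JDK. It is not the Flink implementation: `AsyncCleanupSketch`, the `JobCleanup` interface, and the use of a `String` job identifier are hypothetical stand-ins, and wrapping a checked failure in `CompletionException` is one assumed way of surfacing it through the returned future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: return immediately, run the blocking cleanup on the given executor. */
public class AsyncCleanupSketch {

    /** Stand-in for a blocking, backend-specific per-job cleanup step. */
    interface JobCleanup {
        void cleanupJobData(String jobId) throws Exception;
    }

    /**
     * Schedules the blocking cleanup on the supplied executor so the calling
     * thread is not held up by IO. A failure completes the future exceptionally.
     */
    static CompletableFuture<Void> globalCleanupAsync(
            JobCleanup cleanup, String jobId, Executor executor) {
        return CompletableFuture.runAsync(
                () -> {
                    try {
                        cleanup.cleanupJobData(jobId);
                    } catch (Exception e) {
                        throw new CompletionException(e);
                    }
                },
                executor);
    }

    public static void main(String[] args) {
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        try {
            globalCleanupAsync(
                            id -> System.out.println(
                                    "cleaned " + id + " on " + Thread.currentThread().getName()),
                            "job-1",
                            ioExecutor)
                    .join();
        } finally {
            ioExecutor.shutdown();
        }
    }
}
```

Because the cleanup runs on another thread, any state it touches has to be safe for concurrent access, which is the point of the thread-safety note in the method description.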
Copyright © 2014–2025 The Apache Software Foundation. All rights reserved.