Flink State: State Backend Analysis
创始人
2024-05-27 16:52:30

Analysis of Flink's state implementation

state

             State
               |
               +-------------------InternalKvState
               |                         |
          MergingState                   |
               |                         |
               +-----------------InternalMergingState
               |                         |
      +--------+------+                  |
      |               |                  |
 ReducingState    ListState        +-----+-----------------+
      |               |            |                       |
      +-----------+   +-----------   -----------------InternalListState
                  |                |
                  +---------InternalReducingState

MemoryState

Class diagram (classes shown): AbstractHeapState, HeapMapState, InternalMapState, InternalKvState, State, AbstractHeapMergingState, HeapListState, InternalListState, AbstractHeapAppendingState, InternalMergingState, InternalAppendingState, HeapValueState, InternalValueState

RocksDBState

Class diagram (classes shown): State, InternalKvState, AbstractRocksDBState, RocksDBMapState, RocksDBListState, RocksDBValueState, RocksDBReducingState, RocksDBAggregatingState
class RocksDBMapState<K, N, UK, UV> extends AbstractRocksDBState<K, N, Map<UK, UV>> {
    private TypeSerializer<UK> userKeySerializer;
    private TypeSerializer<UV> userValueSerializer;

    private RocksDBMapState(
            ColumnFamilyHandle columnFamily,
            TypeSerializer<N> namespaceSerializer,
            TypeSerializer<Map<UK, UV>> valueSerializer,
            Map<UK, UV> defaultValue,
            RocksDBKeyedStateBackend<K> backend);

    public TypeSerializer<K> getKeySerializer();
    public TypeSerializer<N> getNamespaceSerializer();
    public TypeSerializer<Map<UK, UV>> getValueSerializer();

    public UV get(UK userKey) {
        // Reads directly from RocksDB.
        byte[] rawKeyBytes =
                serializeCurrentKeyWithGroupAndNamespacePlusUserKey(userKey, userKeySerializer);
        byte[] rawValueBytes = backend.db.get(columnFamily, rawKeyBytes);
        return (rawValueBytes == null
                ? null
                : deserializeUserValue(dataInputView, rawValueBytes, userValueSerializer));
    }

    public void put(UK userKey, UV userValue) {
        // Writes directly to RocksDB.
        byte[] rawKeyBytes =
                serializeCurrentKeyWithGroupAndNamespacePlusUserKey(userKey, userKeySerializer);
        byte[] rawValueBytes = serializeValueNullSensitive(userValue, userValueSerializer);
        // backend is the RocksDBKeyedStateBackend; backend.db is the RocksDB instance it holds.
        backend.db.put(columnFamily, writeOptions, rawKeyBytes, rawValueBytes);
    }

    public void putAll(Map<UK, UV> map);
    public void remove(UK userKey);
    public boolean contains(UK userKey);
    public Iterable<Map.Entry<UK, UV>> entries();
    public Iterable<UK> keys();
    public Iterable<UV> values();
    public boolean isEmpty();
    public void clear();

    static <UK, UV, K, N, SV, S extends State, IS extends S> IS create(
            StateDescriptor<S, SV> stateDesc,
            Tuple2<ColumnFamilyHandle, RegisteredKeyValueStateBackendMetaInfo<N, SV>> registerResult,
            RocksDBKeyedStateBackend<K> backend) {
        // The backend is passed in here.
        return (IS)
                new RocksDBMapState<>(
                        registerResult.f0,
                        registerResult.f1.getNamespaceSerializer(),
                        (TypeSerializer<Map<UK, UV>>) registerResult.f1.getStateSerializer(),
                        (Map<UK, UV>) stateDesc.getDefaultValue(),
                        backend);
    }
}
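The get and put paths above both serialize a composite key (key group, current key, namespace, user key) and then read or write raw bytes. The sketch below illustrates that idea with the JDK only: a sorted in-memory map stands in for a RocksDB column family. The class name ByteStoreMapStateSketch, the byte layout, and the key-group computation are assumptions made for illustration and are not Flink's actual serialization format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeMap;

/** Hypothetical stand-in for a byte-store-backed map state; not Flink code. */
class ByteStoreMapStateSketch {
    // Sorted byte[] -> byte[] map standing in for one RocksDB column family.
    private final TreeMap<byte[], byte[]> store =
            new TreeMap<byte[], byte[]>((a, b) -> Arrays.compareUnsigned(a, b));

    private final String namespace;
    private String currentKey;
    private int currentKeyGroup;

    ByteStoreMapStateSketch(String namespace) {
        this.namespace = namespace;
    }

    /** The backend switches the "current key" before each record is processed. */
    void setCurrentKey(String key, int numberOfKeyGroups) {
        this.currentKey = key;
        // Simplified key-group assignment (assumption): hash modulo the group count.
        this.currentKeyGroup = Math.floorMod(key.hashCode(), numberOfKeyGroups);
    }

    /** Assumed layout: keyGroup | currentKey | namespace | userKey. */
    private byte[] compositeKey(String userKey) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeShort(currentKeyGroup);
            out.writeUTF(currentKey);
            out.writeUTF(namespace);
            out.writeUTF(userKey);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Every read serializes the key and goes straight to the byte store. */
    String get(String userKey) {
        byte[] raw = store.get(compositeKey(userKey));
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    /** Every write serializes key and value and goes straight to the byte store. */
    void put(String userKey, String userValue) {
        store.put(compositeKey(userKey), userValue.getBytes(StandardCharsets.UTF_8));
    }

    int size() {
        return store.size();
    }
}
```

Because the current key and namespace are part of the stored key, the same user key written under two different current keys produces two separate entries in the store.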

Backends and checkpoints

Class diagram (classes shown): AbstractKeyedStateBackend, RocksDBKeyedStateBackend, CheckpointableKeyedStateBackend, KeyedStateBackend, Snapshotable, HeapKeyedStateBackend, OperatorStateBackend, DefaultOperatorStateBackend, OperatorStateStore
public interface Snapshotable<S extends StateObject> {
    RunnableFuture<S> snapshot(
            long checkpointId,
            long timestamp,
            @Nonnull CheckpointStreamFactory streamFactory,
            @Nonnull CheckpointOptions checkpointOptions)
            throws Exception;
}
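The snapshot method returns a RunnableFuture rather than a finished result, which lets the caller decide when and on which thread the snapshot work runs. A minimal JDK-only sketch of that contract follows; SnapshotableSketch, CountingBackendSketch, and the runAndGet helper are hypothetical names, and the stream factory and checkpoint options parameters are left out for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

/** Hypothetical, simplified counterpart of the Snapshotable contract. */
interface SnapshotableSketch<S> {
    RunnableFuture<S> snapshot(long checkpointId, long timestamp);
}

/** Toy backend that counts events per key and can snapshot its counters. */
class CountingBackendSketch implements SnapshotableSketch<Map<String, Long>> {
    private final Map<String, Long> counts = new HashMap<>();

    void increment(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    @Override
    public RunnableFuture<Map<String, Long>> snapshot(long checkpointId, long timestamp) {
        // Synchronous part: take a point-in-time copy while processing is paused.
        final Map<String, Long> frozen = new HashMap<>(counts);
        // Asynchronous part: returned unstarted; the caller chooses when to run it.
        return new FutureTask<Map<String, Long>>(() -> Collections.unmodifiableMap(frozen));
    }

    /** Runs the future on the calling thread and unwraps checked exceptions. */
    static <T> T runAndGet(RunnableFuture<T> future) {
        future.run();
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Updates made after snapshot() returns do not appear in the result, because the copy is taken in the synchronous part before the future is handed back.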

FSBackend

  • FsStateBackend.createKeyedStateBackend creates a HeapKeyedStateBackend.
  • FsStateBackend.createOperatorStateBackend creates a DefaultOperatorStateBackend.
  • DefaultOperatorStateBackend creates PartitionableListState, which implements State.
Class diagram (classes shown): AbstractFileStateBackend, FsStateBackend, AbstractStateBackend, CheckpointStorage, StateBackend, ConfigurableStateBackend
public interface StateBackend extends java.io.Serializable {

    default String getName() {
        return this.getClass().getSimpleName();
    }

    <K> CheckpointableKeyedStateBackend<K> createKeyedStateBackend(
            Environment env,
            JobID jobID,
            String operatorIdentifier,
            TypeSerializer<K> keySerializer,
            int numberOfKeyGroups,
            KeyGroupRange keyGroupRange,
            TaskKvStateRegistry kvStateRegistry,
            TtlTimeProvider ttlTimeProvider,
            MetricGroup metricGroup,
            @Nonnull Collection<KeyedStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws Exception;

    OperatorStateBackend createOperatorStateBackend(
            Environment env,
            String operatorIdentifier,
            @Nonnull Collection<OperatorStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws Exception;

    /** Whether the state backend uses Flink's managed memory. */
    default boolean useManagedMemory() {
        return false;
    }
}
public class FsStateBackend extends AbstractFileStateBackend implements ConfigurableStateBackend {

    public CheckpointStorageAccess createCheckpointStorage(JobID jobId) throws IOException {
        checkNotNull(jobId, "jobId");
        return new FsCheckpointStorageAccess(
                getCheckpointPath(),
                getSavepointPath(),
                jobId,
                getMinFileSizeThreshold(),
                getWriteBufferSize());
    }

    public <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
            Environment env,
            JobID jobID,
            String operatorIdentifier,
            TypeSerializer<K> keySerializer,
            int numberOfKeyGroups,
            KeyGroupRange keyGroupRange,
            TaskKvStateRegistry kvStateRegistry,
            TtlTimeProvider ttlTimeProvider,
            MetricGroup metricGroup,
            @Nonnull Collection<KeyedStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws BackendBuildingException {
        TaskStateManager taskStateManager = env.getTaskStateManager();
        LocalRecoveryConfig localRecoveryConfig = taskStateManager.createLocalRecoveryConfig();
        HeapPriorityQueueSetFactory priorityQueueSetFactory =
                new HeapPriorityQueueSetFactory(keyGroupRange, numberOfKeyGroups, 128);
        LatencyTrackingStateConfig latencyTrackingStateConfig =
                latencyTrackingConfigBuilder.setMetricGroup(metricGroup).build();
        // The keyed backend is built by HeapKeyedStateBackendBuilder.
        return new HeapKeyedStateBackendBuilder<>(
                        kvStateRegistry,
                        keySerializer,
                        env.getUserCodeClassLoader().asClassLoader(),
                        numberOfKeyGroups,
                        keyGroupRange,
                        env.getExecutionConfig(),
                        ttlTimeProvider,
                        latencyTrackingStateConfig,
                        stateHandles,
                        AbstractStateBackend.getCompressionDecorator(env.getExecutionConfig()),
                        localRecoveryConfig,
                        priorityQueueSetFactory,
                        isUsingAsynchronousSnapshots(),
                        cancelStreamRegistry)
                .build();
    }

    @Override
    public OperatorStateBackend createOperatorStateBackend(
            Environment env,
            String operatorIdentifier,
            @Nonnull Collection<OperatorStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws BackendBuildingException {
        // The operator backend is built by DefaultOperatorStateBackendBuilder.
        return new DefaultOperatorStateBackendBuilder(
                        env.getUserCodeClassLoader().asClassLoader(),
                        env.getExecutionConfig(),
                        isUsingAsynchronousSnapshots(),
                        stateHandles,
                        cancelStreamRegistry)
                .build();
    }
}
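The FsStateBackend excerpt shows the StateBackend acting as a factory: it keeps working state on the heap (through the keyed and operator backends it builds) and only uses the file system as the checkpoint location. The JDK-only sketch below reduces that factory shape to its essentials. All class and method names ending in Sketch, as well as checkpointLocation, are hypothetical and greatly simplified compared with the real interfaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical keyed backend: working state lives in a heap map. */
class HeapKeyedBackendSketch {
    private final Map<String, String> values = new HashMap<>();

    void put(String key, String value) {
        values.put(key, value);
    }

    String get(String key) {
        return values.get(key);
    }
}

/** Hypothetical operator backend: non-keyed state kept as a plain list. */
class ListOperatorBackendSketch {
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        items.add(item);
    }

    List<String> items() {
        return new ArrayList<>(items);
    }
}

/** Simplified counterpart of the StateBackend factory contract. */
interface StateBackendSketch {
    default String getName() {
        return getClass().getSimpleName();
    }

    HeapKeyedBackendSketch createKeyedStateBackend(String operatorIdentifier);

    ListOperatorBackendSketch createOperatorStateBackend(String operatorIdentifier);
}

/** File-system flavored backend: heap working state, checkpoints addressed by a path. */
class FileBackendSketch implements StateBackendSketch {
    private final String checkpointPath;

    FileBackendSketch(String checkpointPath) {
        this.checkpointPath = checkpointPath;
    }

    /** Where checkpoints of a job would be written (illustrative only). */
    String checkpointLocation(String jobId) {
        return checkpointPath + "/" + jobId;
    }

    @Override
    public HeapKeyedBackendSketch createKeyedStateBackend(String operatorIdentifier) {
        // Each operator gets its own heap-based keyed backend.
        return new HeapKeyedBackendSketch();
    }

    @Override
    public ListOperatorBackendSketch createOperatorStateBackend(String operatorIdentifier) {
        return new ListOperatorBackendSketch();
    }
}
```

The point of the split is that the storage choice for checkpoints (a path here) is independent of how working state is held while the job runs.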

memory backend

  • MemoryStateBackend.createOperatorStateBackend creates a DefaultOperatorStateBackend.
  • MemoryStateBackend.createKeyedStateBackend creates a HeapKeyedStateBackend.
  • Ultimately, HeapMapState::create is called to create the state.
Class diagram (classes shown): AbstractFileStateBackend, MemoryStateBackend, ConfigurableStateBackend, AbstractStateBackend, CheckpointStorage, StateBackend
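For contrast with the RocksDB path, a heap-based map state keeps live Java objects in memory and needs no serialization on access. The sketch below assumes a nested map (current key, then namespace, then the user map) as a simplified stand-in for the heap backend's state table. HeapMapStateSketch and its static create factory are hypothetical names that only mirror the shape of HeapMapState::create.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical heap-based map state; a simplified stand-in, not Flink code. */
class HeapMapStateSketch<K, N, UK, UV> {
    // currentKey -> namespace -> user map, all held as plain objects on the heap.
    private final Map<K, Map<N, Map<UK, UV>>> stateTable;
    private K currentKey;
    private N currentNamespace;

    private HeapMapStateSketch(Map<K, Map<N, Map<UK, UV>>> stateTable) {
        this.stateTable = stateTable;
    }

    /** Static factory, mirroring the shape of a create(...) entry point. */
    static <K, N, UK, UV> HeapMapStateSketch<K, N, UK, UV> create() {
        return new HeapMapStateSketch<K, N, UK, UV>(new HashMap<>());
    }

    void setCurrentKey(K key) {
        this.currentKey = key;
    }

    void setCurrentNamespace(N namespace) {
        this.currentNamespace = namespace;
    }

    /** Returns the user map for the current key and namespace, or null if absent. */
    private Map<UK, UV> userMap() {
        Map<N, Map<UK, UV>> byNamespace = stateTable.get(currentKey);
        return byNamespace == null ? null : byNamespace.get(currentNamespace);
    }

    UV get(UK userKey) {
        Map<UK, UV> map = userMap();
        return map == null ? null : map.get(userKey);
    }

    void put(UK userKey, UV userValue) {
        stateTable
                .computeIfAbsent(currentKey, k -> new HashMap<>())
                .computeIfAbsent(currentNamespace, n -> new HashMap<>())
                .put(userKey, userValue);
    }

    boolean contains(UK userKey) {
        Map<UK, UV> map = userMap();
        return map != null && map.containsKey(userKey);
    }
}
```

Reads and writes here are ordinary map operations, which is why heap-backed state is fast but bounded by available memory.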

Flink checkpoint

Class diagram (classes and members shown):
  • CheckpointStorage: resolveCheckpoint(String externalPointer), createCheckpointStorage(JobID jobId)
  • RocksDBStateBackend: checkpointStreamBackend : StateBackend
  • CheckpointStorageAccess, AbstractFsCheckpointStorageAccess, FsCheckpointStorageAccess, MemoryBackendCheckpointStorageAccess
  • RestoreOperation, RocksDBRestoreOperation, RocksDBFullRestoreOperation, RocksDBHeapTimersFullRestoreOperation, RocksDBIncrementalRestoreOperation
  • RocksDBSnapshotOperation, RocksDBIncrementalSnapshotOperation, RocksDBNativeFullSnapshotOperation

