Flink State: State Backend Analysis
Founder
2024-05-27 16:52:30

An analysis of how Flink implements state

State

             State
               |
               +-------------------InternalKvState
               |                         |
          MergingState                   |
               |                         |
               +-----------------InternalMergingState
               |                         |
      +--------+------+                  |
      |               |                  |
 ReducingState    ListState        +-----+-----------------+
      |               |            |                       |
      +-----------+   +-----------   -----------------InternalListState
                  |                |
                  +---------InternalReducingState
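
The diagram separates user-facing interfaces (State, ListState, ...) from internal ones (InternalKvState, InternalListState, ...). A minimal sketch of why that split is useful, assuming a heavily simplified hierarchy: the nested interfaces below are hypothetical stand-ins that only borrow Flink's names, InternalMergingState is collapsed away, and SimpleListState is invented for illustration. User code sees only the ListState view, while the runtime switches the namespace through the internal interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified mirror of the public/internal state split. Not Flink's real API. */
final class StateHierarchySketch {
    private StateHierarchySketch() {}

    /** User-facing root interface. */
    interface State {
        void clear();
    }

    /** User-facing list state: no notion of namespaces. */
    interface ListState<T> extends State {
        void add(T value);

        Iterable<T> get();
    }

    /** Internal root interface: adds what only the runtime should call. */
    interface InternalKvState<N> extends State {
        void setCurrentNamespace(N namespace);
    }

    /** Internal list state: the user-facing view plus the internal hooks. */
    interface InternalListState<N, T> extends InternalKvState<N>, ListState<T> {}

    /** Illustrative implementation that keeps one list per namespace. */
    static final class SimpleListState<N, T> implements InternalListState<N, T> {
        private final Map<N, List<T>> valuesByNamespace = new HashMap<>();
        private N currentNamespace;

        @Override
        public void setCurrentNamespace(N namespace) {
            this.currentNamespace = namespace;
        }

        @Override
        public void add(T value) {
            valuesByNamespace
                    .computeIfAbsent(currentNamespace, ns -> new ArrayList<>())
                    .add(value);
        }

        @Override
        public Iterable<T> get() {
            return valuesByNamespace.getOrDefault(currentNamespace, Collections.emptyList());
        }

        @Override
        public void clear() {
            valuesByNamespace.remove(currentNamespace);
        }
    }
}
```

Handing the same object to user code as a ListState hides setCurrentNamespace, so only the runtime can move the state between namespaces such as windows.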

MemoryState

[Class diagram: AbstractHeapState, HeapMapState, InternalMapState, InternalKvState, State, AbstractHeapMergingState, HeapListState, InternalListState, AbstractHeapAppendingState, InternalMergingState, InternalAppendingState, HeapValueState, InternalValueState]

RocksDBState

[Class diagram: State, InternalKvState, AbstractRocksDBState, RocksDBMapState, RocksDBListState, RocksDBValueState, RocksDBReducingState, RocksDBAggregatingState]

// Abridged excerpt: most method bodies are omitted.
class RocksDBMapState<K, N, UK, UV> extends AbstractRocksDBState<K, N, Map<UK, UV>> {
    private TypeSerializer<UK> userKeySerializer;
    private TypeSerializer<UV> userValueSerializer;

    private RocksDBMapState(
            ColumnFamilyHandle columnFamily,
            TypeSerializer<N> namespaceSerializer,
            TypeSerializer<Map<UK, UV>> valueSerializer,
            Map<UK, UV> defaultValue,
            RocksDBKeyedStateBackend<K> backend);

    public TypeSerializer<K> getKeySerializer();
    public TypeSerializer<N> getNamespaceSerializer();
    public TypeSerializer<Map<UK, UV>> getValueSerializer();

    public UV get(UK userKey) {
        // Reads directly from RocksDB.
        byte[] rawKeyBytes =
                serializeCurrentKeyWithGroupAndNamespacePlusUserKey(userKey, userKeySerializer);
        byte[] rawValueBytes = backend.db.get(columnFamily, rawKeyBytes);
        return (rawValueBytes == null
                ? null
                : deserializeUserValue(dataInputView, rawValueBytes, userValueSerializer));
    }

    public void put(UK userKey, UV userValue) {
        // Writes directly to RocksDB.
        byte[] rawKeyBytes =
                serializeCurrentKeyWithGroupAndNamespacePlusUserKey(userKey, userKeySerializer);
        byte[] rawValueBytes = serializeValueNullSensitive(userValue, userValueSerializer);
        // backend is the RocksDBKeyedStateBackend; backend.db is its RocksDB instance.
        backend.db.put(columnFamily, writeOptions, rawKeyBytes, rawValueBytes);
    }

    public void putAll(Map<UK, UV> map);
    public void remove(UK userKey);
    public boolean contains(UK userKey);
    public Iterable<Map.Entry<UK, UV>> entries();
    public Iterable<UK> keys();
    public Iterable<UV> values();
    public boolean isEmpty();
    public void clear();

    static <UK, UV, K, N, SV, S extends State, IS extends S> IS create(
            StateDescriptor<S, SV> stateDesc,
            Tuple2<ColumnFamilyHandle, RegisteredKeyValueStateBackendMetaInfo<N, SV>> registerResult,
            RocksDBKeyedStateBackend<K> backend) {
        // The backend is passed in here.
        return (IS) new RocksDBMapState<>(
                registerResult.f0,
                registerResult.f1.getNamespaceSerializer(),
                (TypeSerializer<Map<UK, UV>>) registerResult.f1.getStateSerializer(),
                (Map<UK, UV>) stateDesc.getDefaultValue(),
                backend);
    }
}
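
The get and put paths above flatten the key group, current key, namespace, and user map key into one byte array and then go straight to the store. A minimal sketch of that idea, under loud assumptions: a HashMap keyed by ByteBuffer stands in for the RocksDB column family, the byte layout (one key-group byte followed by writeUTF strings) is invented for illustration and is not Flink's actual encoding, and CompositeKeyMapState is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical map state that stores each user entry under one flattened byte[] key. */
final class CompositeKeyMapState {
    // Stand-in for a RocksDB column family. ByteBuffer gives content-based equals/hashCode.
    private final Map<ByteBuffer, byte[]> store = new HashMap<>();
    private final int numberOfKeyGroups; // assumed <= 128 so the group fits in one byte
    private String currentKey;
    private String currentNamespace = "";

    CompositeKeyMapState(int numberOfKeyGroups) {
        this.numberOfKeyGroups = numberOfKeyGroups;
    }

    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    void setCurrentNamespace(String namespace) {
        this.currentNamespace = namespace;
    }

    // Assumed layout: [keyGroup][currentKey][namespace][userKey].
    private ByteBuffer compositeKey(String userKey) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(Math.floorMod(currentKey.hashCode(), numberOfKeyGroups));
            out.writeUTF(currentKey);
            out.writeUTF(currentNamespace);
            out.writeUTF(userKey);
            out.flush();
            return ByteBuffer.wrap(bytes.toByteArray());
        } catch (IOException e) {
            // Writing to an in-memory stream should not fail.
            throw new IllegalStateException(e);
        }
    }

    /** Serializes key and value, then writes straight to the store. */
    void put(String userKey, String userValue) {
        store.put(compositeKey(userKey), userValue.getBytes(StandardCharsets.UTF_8));
    }

    /** Serializes the key, reads straight from the store, and deserializes the value. */
    String get(String userKey) {
        byte[] raw = store.get(compositeKey(userKey));
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }
}
```

Because every access pays for serialization, a state object like this holds no user data itself; it is only a typed view over the shared store.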

Backends and checkpoints

[Class diagram: AbstractKeyedStateBackend, RocksDBKeyedStateBackend, CheckpointableKeyedStateBackend, KeyedStateBackend, Snapshotable, HeapKeyedStateBackend, OperatorStateBackend, DefaultOperatorStateBackend, OperatorStateStore]

public interface Snapshotable<S extends StateObject> {
    RunnableFuture<S> snapshot(
            long checkpointId,
            long timestamp,
            @Nonnull CheckpointStreamFactory streamFactory,
            @Nonnull CheckpointOptions checkpointOptions)
            throws Exception;
}
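
snapshot returns a RunnableFuture rather than a finished result, so the caller decides when and on which thread the expensive part runs. A minimal sketch of that split, assuming a toy backend: SnapshotSketch, its counter state, and the runAndGet helper are hypothetical, and copying the map stands in for whatever a real backend does in its synchronous phase.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

/** Hypothetical backend whose snapshot has a synchronous and a deferred part. */
final class SnapshotSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    void increment(String key) {
        counts.merge(key, 1, Integer::sum);
    }

    /**
     * Synchronous part: freeze a copy of the state on the caller's thread.
     * Deferred part: the returned future, which does nothing until someone runs it.
     */
    RunnableFuture<Map<String, Integer>> snapshot(long checkpointId) {
        final Map<String, Integer> frozen = new HashMap<>(counts);
        // A real backend would write `frozen` to a checkpoint stream for `checkpointId` here.
        return new FutureTask<Map<String, Integer>>(() -> Collections.unmodifiableMap(frozen));
    }

    /** Illustrative helper: runs the future on the current thread and unwraps the result. */
    static <T> T runAndGet(RunnableFuture<T> future) {
        future.run();
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Updates made after snapshot() returns but before the future runs do not leak into the result, which is the property that lets processing continue while the checkpoint is written.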

FSBackend

  • FsStateBackend.createKeyedStateBackend creates a HeapKeyedStateBackend.
  • FsStateBackend.createOperatorStateBackend creates a DefaultOperatorStateBackend.
  • DefaultOperatorStateBackend creates PartitionableListState, which is a subtype of State.
[Class diagram: AbstractFileStateBackend, FsStateBackend, AbstractStateBackend, CheckpointStorage, StateBackend, ConfigurableStateBackend]

public interface StateBackend extends java.io.Serializable {
    default String getName() {
        return this.getClass().getSimpleName();
    }

    <K> CheckpointableKeyedStateBackend<K> createKeyedStateBackend(
            Environment env,
            JobID jobID,
            String operatorIdentifier,
            TypeSerializer<K> keySerializer,
            int numberOfKeyGroups,
            KeyGroupRange keyGroupRange,
            TaskKvStateRegistry kvStateRegistry,
            TtlTimeProvider ttlTimeProvider,
            MetricGroup metricGroup,
            @Nonnull Collection<KeyedStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws Exception;

    OperatorStateBackend createOperatorStateBackend(
            Environment env,
            String operatorIdentifier,
            @Nonnull Collection<OperatorStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws Exception;

    /** Whether the state backend uses Flink's managed memory. */
    default boolean useManagedMemory() {
        return false;
    }
}

public class FsStateBackend extends AbstractFileStateBackend implements ConfigurableStateBackend {

    public CheckpointStorageAccess createCheckpointStorage(JobID jobId) throws IOException {
        checkNotNull(jobId, "jobId");
        return new FsCheckpointStorageAccess(
                getCheckpointPath(),
                getSavepointPath(),
                jobId,
                getMinFileSizeThreshold(),
                getWriteBufferSize());
    }

    public <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
            Environment env,
            JobID jobID,
            String operatorIdentifier,
            TypeSerializer<K> keySerializer,
            int numberOfKeyGroups,
            KeyGroupRange keyGroupRange,
            TaskKvStateRegistry kvStateRegistry,
            TtlTimeProvider ttlTimeProvider,
            MetricGroup metricGroup,
            @Nonnull Collection<KeyedStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws BackendBuildingException {
        TaskStateManager taskStateManager = env.getTaskStateManager();
        LocalRecoveryConfig localRecoveryConfig = taskStateManager.createLocalRecoveryConfig();
        HeapPriorityQueueSetFactory priorityQueueSetFactory =
                new HeapPriorityQueueSetFactory(keyGroupRange, numberOfKeyGroups, 128);
        LatencyTrackingStateConfig latencyTrackingStateConfig =
                latencyTrackingConfigBuilder.setMetricGroup(metricGroup).build();
        // Note: the keyed backend is built by HeapKeyedStateBackendBuilder.
        return new HeapKeyedStateBackendBuilder<>(
                        kvStateRegistry,
                        keySerializer,
                        env.getUserCodeClassLoader().asClassLoader(),
                        numberOfKeyGroups,
                        keyGroupRange,
                        env.getExecutionConfig(),
                        ttlTimeProvider,
                        latencyTrackingStateConfig,
                        stateHandles,
                        AbstractStateBackend.getCompressionDecorator(env.getExecutionConfig()),
                        localRecoveryConfig,
                        priorityQueueSetFactory,
                        isUsingAsynchronousSnapshots(),
                        cancelStreamRegistry)
                .build();
    }

    @Override
    public OperatorStateBackend createOperatorStateBackend(
            Environment env,
            String operatorIdentifier,
            @Nonnull Collection<OperatorStateHandle> stateHandles,
            CloseableRegistry cancelStreamRegistry)
            throws BackendBuildingException {
        // Note: the operator backend is built by DefaultOperatorStateBackendBuilder.
        return new DefaultOperatorStateBackendBuilder(
                        env.getUserCodeClassLoader().asClassLoader(),
                        env.getExecutionConfig(),
                        isUsingAsynchronousSnapshots(),
                        stateHandles,
                        cancelStreamRegistry)
                .build();
    }
}
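
createKeyedStateBackend receives both numberOfKeyGroups and a keyGroupRange: keys are hashed into a fixed number of key groups, and each parallel subtask's backend owns one contiguous range of them. A rough sketch of that arithmetic, with assumptions stated plainly: KeyGroupSketch and its method names are hypothetical, a plain hashCode replaces the murmur hash Flink applies, and the range formulas follow the shape of Flink's KeyGroupRangeAssignment but should be checked against the source before relying on them.

```java
/** Hypothetical helper showing how keys map to key groups and key groups to subtasks. */
final class KeyGroupSketch {
    private KeyGroupSketch() {}

    /** Maps a key to a key group in [0, numberOfKeyGroups). Plain hashCode is an assumption. */
    static int keyGroupFor(Object key, int numberOfKeyGroups) {
        return Math.floorMod(key.hashCode(), numberOfKeyGroups);
    }

    /** First key group owned by the given subtask. */
    static int rangeStart(int numberOfKeyGroups, int parallelism, int subtaskIndex) {
        return (subtaskIndex * numberOfKeyGroups + parallelism - 1) / parallelism;
    }

    /** Last key group (inclusive) owned by the given subtask. */
    static int rangeEnd(int numberOfKeyGroups, int parallelism, int subtaskIndex) {
        return ((subtaskIndex + 1) * numberOfKeyGroups - 1) / parallelism;
    }

    /** True if the subtask's key group range contains the key's group. */
    static boolean owns(Object key, int numberOfKeyGroups, int parallelism, int subtaskIndex) {
        int group = keyGroupFor(key, numberOfKeyGroups);
        return group >= rangeStart(numberOfKeyGroups, parallelism, subtaskIndex)
                && group <= rangeEnd(numberOfKeyGroups, parallelism, subtaskIndex);
    }

    public static void main(String[] args) {
        int numberOfKeyGroups = 128;
        int parallelism = 4;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            System.out.println(
                    "subtask " + subtask + " owns key groups ["
                            + rangeStart(numberOfKeyGroups, parallelism, subtask) + ", "
                            + rangeEnd(numberOfKeyGroups, parallelism, subtask) + "]");
        }
    }
}
```

Since the number of key groups is fixed while parallelism can change, rescaling only moves whole key groups between subtasks instead of rehashing every key.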

Memory backend

  • MemoryStateBackend.createOperatorStateBackend creates a DefaultOperatorStateBackend.
  • MemoryStateBackend.createKeyedStateBackend creates a HeapKeyedStateBackend.
  • In the end, HeapMapState::create is called to create the state.
[Class diagram: AbstractFileStateBackend, MemoryStateBackend, ConfigurableStateBackend, AbstractStateBackend, CheckpointStorage, StateBackend]
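
Unlike the RocksDB variant, a heap-backed map state keeps live Java objects, so reads and writes involve no serialization. A minimal sketch of that idea, assuming nested HashMaps indexed by namespace and then key: HeapMapStateSketch is a hypothetical class, its static create only echoes the factory style of HeapMapState::create, and the real state table in Flink is organized differently (for example per key group).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical heap map state: namespace -> key -> user map, all held as plain objects. */
final class HeapMapStateSketch<K, N, UK, UV> {
    private final Map<N, Map<K, Map<UK, UV>>> table = new HashMap<>();
    private K currentKey;
    private N currentNamespace;

    private HeapMapStateSketch() {}

    /** Static factory, in the spirit of the create methods used by the backends. */
    static <K, N, UK, UV> HeapMapStateSketch<K, N, UK, UV> create() {
        return new HeapMapStateSketch<>();
    }

    void setCurrentKey(K key) {
        this.currentKey = key;
    }

    void setCurrentNamespace(N namespace) {
        this.currentNamespace = namespace;
    }

    /** Stores the value object itself; nothing is serialized. */
    void put(UK userKey, UV userValue) {
        table.computeIfAbsent(currentNamespace, ns -> new HashMap<>())
                .computeIfAbsent(currentKey, k -> new HashMap<>())
                .put(userKey, userValue);
    }

    /** Returns the stored object, or null if the current key and namespace have no entry. */
    UV get(UK userKey) {
        Map<K, Map<UK, UV>> byKey = table.get(currentNamespace);
        if (byKey == null) {
            return null;
        }
        Map<UK, UV> userMap = byKey.get(currentKey);
        return userMap == null ? null : userMap.get(userKey);
    }
}
```

The trade-off against the RocksDB sketch of the same operations is memory: everything lives on the JVM heap, so state size is bounded by it.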

Flink checkpoint

[Class diagram]
  • CheckpointStorage: +resolveCheckpoint(String externalPointer), +createCheckpointStorage(JobID jobId)
  • RocksDBStateBackend: +checkpointStreamBackend : StateBackend
  • Checkpoint storage access: CheckpointStorageAccess, AbstractFsCheckpointStorageAccess, FsCheckpointStorageAccess, MemoryBackendCheckpointStorageAccess
  • Restore: RestoreOperation, RocksDBRestoreOperation, RocksDBFullRestoreOperation, RocksDBHeapTimersFullRestoreOperation, RocksDBIncrementalRestoreOperation
  • Snapshot: RocksDBSnapshotOperation, RocksDBIncrementalSnapshotOperation, RocksDBNativeFullSnapshotOperation
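
The snapshot classes come in a full and an incremental flavor. The core difference can be sketched as a set difference: an incremental checkpoint uploads only the files that no earlier checkpoint has uploaded and references the rest. This is a simplified illustration under assumptions: IncrementalSnapshotSketch, filesToUpload, and the file names are hypothetical, and the real operations also handle metadata and shared-state registration.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper contrasting full and incremental snapshot uploads. */
final class IncrementalSnapshotSketch {
    private IncrementalSnapshotSketch() {}

    /**
     * Returns the files a checkpoint has to upload.
     *
     * @param currentFiles files that make up the state right now
     * @param alreadyUploaded files that earlier checkpoints uploaded and can be referenced
     * @param incremental true to skip files that are already uploaded
     */
    static Set<String> filesToUpload(
            Set<String> currentFiles, Set<String> alreadyUploaded, boolean incremental) {
        Set<String> result = new TreeSet<>(currentFiles);
        if (incremental) {
            // Only the delta since the previous checkpoints needs to be written.
            result.removeAll(alreadyUploaded);
        }
        return result;
    }
}
```

A full snapshot is self-contained but rewrites everything each time, while an incremental one is cheaper to take but depends on files from earlier checkpoints at restore time.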

References

https://www.jianshu.com/p/569a7e67c1b3
https://blog.csdn.net/u010942041/article/details/114944767
https://cloud.tencent.com/developer/article/1792720
https://blog.51cto.com/dataclub/5351042
https://www.cnblogs.com/lighten/p/13234350.html
https://cloud.tencent.com/developer/article/1765572
https://blog.csdn.net/m0_63475429/article/details/127417649
https://blog.csdn.net/Direction_Wind/article/details/125646616
