解决spark sql读取hudi表出现偶然读不出来数据问题

相关版本

hadoop 3.2.0

spark 3.3.0

hudi 0.12.0

问题分析

用beeline连接spark thriftserver或者kyuubi(spark 3.3.0)查询hudi mor表,发现对于同一个spark SQL在同一个beeline session里面不同时间查到的东西都是一样的。比如我用select count(*) from xxx。除此之外还有个问题就是,在同一个beeline session里面再过一段时间后,由于有些文件被合并了,再查会报以前的log文件找不到的问题。

查看同一个beeline session中,两条SQL的执行计划对应的org.apache.hudi.MergeOnReadSnapshotRelation@3a576875一摸一样

但是上述问题的话,如果把beeline退出来,再进去就不会出现了

问题复现

  1. 创建flink任务,实时写入mor表

create catalog hudi with(
'type' = 'hudi',
'mode' = 'hms',
'hive.conf.dir'='/etc/hive/conf'
);

create database if not exists hudi.hudidb;

CREATE TABLE sourceT (
  uuid varchar(20),
  name varchar(10),
  age int,
  ts timestamp(3),
  `partition` varchar(20)
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '100'
);

create table hudi.hudidb.t2_20221024_5(
  uuid varchar(20),
  name varchar(10),
  age int,
  ts STRING,
  `partition` varchar(20)
)
partitioned by (`partition`)
with (
  'connector' = 'hudi',
  'table.type' = 'MERGE_ON_READ',
  'index.type' = 'BUCKET',
  'hive_sync.skip_ro_suffix' = 'true',
  'write.precombine.field' = 'ts',
  'hoodie.datasource.write.recordkey.field' = 'uuid',
  'use.hive.schema' = 'true'
);

insert into hudi.hudidb.t2_20221024_5(uuid, name, age, ts) select uuid, name, age, cast(ts as string) from sourceT;

报错

Logs for container_e288_1666319426871_0014_01_000004
ResourceManager
RM Home
NodeManager
Tools
22/10/24 17:45:58 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 18131@host121
22/10/24 17:45:58 INFO SignalUtils: Registering signal handler for TERM
22/10/24 17:45:58 INFO SignalUtils: Registering signal handler for HUP
22/10/24 17:45:58 INFO SignalUtils: Registering signal handler for INT
22/10/24 17:45:58 INFO SecurityManager: Changing view acls to: spark
22/10/24 17:45:58 INFO SecurityManager: Changing modify acls to: spark
22/10/24 17:45:58 INFO SecurityManager: Changing view acls groups to: 
22/10/24 17:45:58 INFO SecurityManager: Changing modify acls groups to: 
22/10/24 17:45:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(spark); groups with view permissions: Set(); users  with modify permissions: Set(spark); groups with modify permissions: Set()
22/10/24 17:45:59 INFO TransportClientFactory: Successfully created connection to host117/10.45.46.117:38306 after 73 ms (0 ms spent in bootstraps)
22/10/24 17:45:59 WARN SparkConf: The configuration key 'spark.yarn.principal' has been deprecated as of Spark 3.0 and may be removed in the future. Please use the new key 'spark.kerberos.principal' instead.
22/10/24 17:45:59 WARN SparkConf: The configuration key 'spark.yarn.keytab' has been deprecated as of Spark 3.0 and may be removed in the future. Please use the new key 'spark.kerberos.keytab' instead.
22/10/24 17:45:59 INFO SparkHadoopUtil: Updating delegation tokens for current user.
22/10/24 17:45:59 INFO SecurityManager: Changing view acls to: spark
22/10/24 17:45:59 INFO SecurityManager: Changing modify acls to: spark
22/10/24 17:45:59 INFO SecurityManager: Changing view acls groups to: 
22/10/24 17:45:59 INFO SecurityManager: Changing modify acls groups to: 
22/10/24 17:45:59 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(spark); groups with view permissions: Set(); users  with modify permissions: Set(spark); groups with modify permissions: Set()
22/10/24 17:45:59 INFO TransportClientFactory: Successfully created connection to host117/10.45.46.117:38306 after 2 ms (0 ms spent in bootstraps)
22/10/24 17:45:59 INFO DiskBlockManager: Created local directory at /hadoop/yarn/local/usercache/spark/appcache/application_1666319426871_0014/blockmgr-b9151b84-c682-43c7-8bcb-08a9d5b935f0
22/10/24 17:45:59 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
22/10/24 17:45:59 INFO YarnCoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@host117:38306
22/10/24 17:46:00 INFO ResourceUtils: ==============================================================
22/10/24 17:46:00 INFO ResourceUtils: No custom resources configured for spark.executor.
22/10/24 17:46:00 INFO ResourceUtils: ==============================================================
22/10/24 17:46:00 INFO YarnCoarseGrainedExecutorBackend: Successfully registered with driver
22/10/24 17:46:00 INFO Executor: Starting executor ID 4 on host host121
22/10/24 17:46:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43288.
22/10/24 17:46:00 INFO NettyBlockTransferService: Server created on host121:43288
22/10/24 17:46:00 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/10/24 17:46:00 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(4, host121, 43288, None)
22/10/24 17:46:00 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(4, host121, 43288, None)
22/10/24 17:46:00 INFO BlockManager: external shuffle service port = 7337
22/10/24 17:46:00 INFO BlockManager: Registering executor with local external shuffle service.
22/10/24 17:46:00 INFO TransportClientFactory: Successfully created connection to host121/10.45.46.121:7337 after 1 ms (0 ms spent in bootstraps)
22/10/24 17:46:00 INFO BlockManager: Initialized BlockManager: BlockManagerId(4, host121, 43288, None)
22/10/24 17:46:00 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): 'file:/hadoop/yarn/local/usercache/spark/appcache/application_1666319426871_0014/container_e288_1666319426871_0014_01_000004/__app__.jar,file:/hadoop/yarn/local/usercache/spark/appcache/application_1666319426871_0014/container_e288_1666319426871_0014_01_000004/__app__.jar'
22/10/24 17:46:00 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 60
22/10/24 17:46:00 INFO Executor: Running task 0.0 in stage 5.0 (TID 60)
22/10/24 17:46:00 INFO TorrentBroadcast: Started reading broadcast variable 14 with 1 pieces (estimated total size 4.0 MiB)
22/10/24 17:46:00 INFO TransportClientFactory: Successfully created connection to host117/10.45.46.117:36971 after 1 ms (0 ms spent in bootstraps)
22/10/24 17:46:00 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes in memory (estimated size 9.1 KiB, free 366.3 MiB)
22/10/24 17:46:00 INFO TorrentBroadcast: Reading broadcast variable 14 took 100 ms
22/10/24 17:46:00 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 21.0 KiB, free 366.3 MiB)
22/10/24 17:46:01 INFO CodeGenerator: Code generated in 250.698903 ms
22/10/24 17:46:02 INFO TorrentBroadcast: Started reading broadcast variable 13 with 1 pieces (estimated total size 4.0 MiB)
22/10/24 17:46:02 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 32.2 KiB, free 366.2 MiB)
22/10/24 17:46:02 INFO TorrentBroadcast: Reading broadcast variable 13 took 16 ms
22/10/24 17:46:02 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 533.1 KiB, free 365.7 MiB)
22/10/24 17:46:02 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:02 INFO HoodieTableConfig: Loading table properties from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5/.hoodie/hoodie.properties
22/10/24 17:46:03 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20221024174559919__deltacommit__INFLIGHT]}
22/10/24 17:46:03 ERROR AbstractHoodieLogRecordReader: Got IOException when reading log file
java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 60)
org.apache.hudi.exception.HoodieIOException: IOException when reading log file 
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:349)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  ... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 64
22/10/24 17:46:03 INFO Executor: Running task 0.1 in stage 5.0 (TID 64)
22/10/24 17:46:03 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieTableConfig: Loading table properties from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5/.hoodie/hoodie.properties
22/10/24 17:46:03 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20221024174559919__deltacommit__INFLIGHT]}
22/10/24 17:46:03 ERROR AbstractHoodieLogRecordReader: Got IOException when reading log file
java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 ERROR Executor: Exception in task 0.1 in stage 5.0 (TID 64)
org.apache.hudi.exception.HoodieIOException: IOException when reading log file 
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:349)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  ... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 65
22/10/24 17:46:03 INFO Executor: Running task 0.2 in stage 5.0 (TID 65)
22/10/24 17:46:03 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieTableConfig: Loading table properties from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5/.hoodie/hoodie.properties
22/10/24 17:46:03 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20221024174559919__deltacommit__INFLIGHT]}
22/10/24 17:46:03 ERROR AbstractHoodieLogRecordReader: Got IOException when reading log file
java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 ERROR Executor: Exception in task 0.2 in stage 5.0 (TID 65)
org.apache.hudi.exception.HoodieIOException: IOException when reading log file 
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:349)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  ... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 66
22/10/24 17:46:03 INFO Executor: Running task 0.3 in stage 5.0 (TID 66)
22/10/24 17:46:03 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieTableConfig: Loading table properties from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5/.hoodie/hoodie.properties
22/10/24 17:46:03 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://host117:8020/apps/spark/warehouse/hudidb.db/t2_20221024_5
22/10/24 17:46:03 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20221024174559919__deltacommit__INFLIGHT]}
22/10/24 17:46:03 ERROR AbstractHoodieLogRecordReader: Got IOException when reading log file
java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more
22/10/24 17:46:03 ERROR Executor: Exception in task 0.3 in stage 5.0 (TID 66)
org.apache.hudi.exception.HoodieIOException: IOException when reading log file 
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:349)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
  at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:197)
  at org.apache.hudi.HoodieMergeOnReadRDD$RecordMergingFileIterator.<init>(HoodieMergeOnReadRDD.scala:278)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:132)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:136)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:865)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:852)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1009)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
  at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:321)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:333)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:474)
  at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114)
  at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:219)
  ... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/spark/warehouse/hudidb.db/t2_20221024_5/.00000003-cfe8-48ce-b255-adcb0eaf7ed8_20221024173219931.log.1_3-4-0
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:153)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1946)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:739)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:432)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
  at org.apache.hadoop.ipc.Client.call(Client.java:1457)
  at org.apache.hadoop.ipc.Client.call(Client.java:1367)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
  at com.sun.proxy.$Proxy31.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:320)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
  at com.sun.proxy.$Proxy32.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:863)
  ... 38 more

解决方法

refresh table xxx

或者设置如下参数,也就是metadata的过期时间,将其设置为hudi clean清理周期以内

spark.sql.metadataCacheTTLSeconds  1

本文为从大数据到人工智能博主「xiaozhch5」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。

原文链接:https://lrting.top/backend/10522/

未经允许不得转载:木盒主机 » 解决spark sql读取hudi表出现偶然读不出来数据问题

赞 (0)

相关推荐

    暂无内容!