dask fails with "connect hdfs error" when reading HDFS files

admin, 2024-04-03

Problem details:

/arrow/cpp/src/arrow/filesystem/hdfs.cc:51: Failed to disconnect hdfs client: IOError: HDFS hdfsFS::Disconnect failed, errno: 9 (Bad file descriptor)
Traceback (most recent call last):
  File "/home/tdops/fucheng.pan/ray-code/read.py", line 15, in <module>
    print(fs.FileSystem.from_uri("hdfs:"))
  File "pyarrow/_fs.pyx", line 347, in pyarrow._fs.FileSystem.from_uri
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: HDFS connection failed

The versions of dask and the related libraries were:

pyarrow                   15.0.2
dask                      2024.4.0
dask-expr                 1.0.6
fsspec                    2024.3.1

Solution:

First downgrade fsspec to version 2022.8.2. If the error persists, also downgrade pyarrow to 7.0.0.
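
As a minimal sketch of the downgrade steps above, the following Python script compares the installed fsspec and pyarrow versions against the two pinned versions and prints the matching `pip install` commands. The helper names (`parse_version`, `installed_version`, `pip_downgrade_commands`) are made up for this illustration and are not part of dask, fsspec, or pyarrow. The version comparison is deliberately simple and only looks at the leading numeric components.

```python
from importlib import metadata

# Pins taken from the fix described above (fsspec first, pyarrow second).
SUGGESTED_PINS = {"fsspec": "2022.8.2", "pyarrow": "7.0.0"}


def parse_version(text):
    """Turn '2024.3.1' into (2024, 3, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pip_downgrade_commands(installed, pins=SUGGESTED_PINS):
    """Return pip commands for every package that is newer than its pin."""
    commands = []
    for package, pin in pins.items():
        current = installed.get(package)
        if current is not None and parse_version(current) > parse_version(pin):
            commands.append(f"pip install {package}=={pin}")
    return commands


if __name__ == "__main__":
    # Hypothetical usage: inspect the current environment and print suggestions.
    installed = {name: installed_version(name) for name in SUGGESTED_PINS}
    for command in pip_downgrade_commands(installed):
        print(command)
```

The commands are printed in the order the fix suggests. Run the fsspec one first and retry the HDFS read before applying the pyarrow one.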

Original discussion of the fix: https://github.com/apache/arrow/issues/29187
