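On CentOS 7.9, after installing the GPU driver and CUDA, running a large model fails with an error that suggests running python -m bitsandbytes. Running that command fails as well: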
CUDA SETUP: Loading binary /usr/local/python3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /usr/local/python3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=117 make cuda11x
python setup.py install
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/python3/lib/python3.9/runpy.py", line 147, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/local/python3/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
Identifying the problem:
The error is most likely caused by the system libstdc++ not providing the CXXABI_1.3.9 symbol version:
/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found
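The key line can be picked apart mechanically. The following is a minimal sketch, not part of bitsandbytes: the regular expression and the function name parse_missing_version are hypothetical, and the pattern assumes the dynamic loader message has exactly the shape shown above.

```python
import re

# Matches loader messages of the form seen in the log above:
#   <library>: version `<version>' not found (required by <other library>)
# The "(required by ...)" part is treated as optional.
LOADER_ERROR = re.compile(
    r"(?P<library>\S+): version `(?P<version>[^']+)' not found"
    r"(?: \(required by (?P<required_by>[^)]+)\))?"
)


def parse_missing_version(message):
    """Return a dict with library, version and required_by, or None if no match."""
    match = LOADER_ERROR.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = "/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found"
    print(parse_missing_version(line))
```

This makes it explicit that the library being complained about is /lib64/libstdc++.so.6 and that the missing piece is a symbol version, not a missing CUDA library.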
Solution:
1. Check the gcc version
gcc -v
Output:
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
2. Inspect the symbol versions provided by the system's shared library
strings /usr/lib64/libstdc++.so.6 | grep GLIBC
and
strings /usr/lib64/libstdc++.so.6 | grep CXXABI
The output shows that CXXABI_1.3.9 is missing; the newest version available is CXXABI_1.3.7. The first command prints:
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBC_2.3
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.4
GLIBC_2.3.2
GLIBCXX_DEBUG_MESSAGE_LENGTH
The second command prints:
CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_TM_1
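The same check can be scripted. Below is a rough Python equivalent of the strings | grep pipeline, offered as a sketch: the helper names symbol_versions and has_version are hypothetical, and the approach simply scans the raw bytes of the file for version tags instead of parsing the ELF structure properly.

```python
import re
from pathlib import Path


def symbol_versions(lib_path, prefix="CXXABI"):
    """List numeric version tags such as 1.3.7 found for the given prefix.

    Tags without a numeric suffix (for example CXXABI_TM_1) are ignored.
    """
    data = Path(lib_path).read_bytes()
    pattern = re.compile(re.escape(prefix.encode()) + rb"_(\d+(?:\.\d+)*)")
    found = {m.group(1).decode() for m in pattern.finditer(data)}
    # Sort numerically so that 1.3.10 comes after 1.3.9.
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def has_version(lib_path, wanted):
    """Check for a full tag such as CXXABI_1.3.9 or GLIBCXX_3.4.19."""
    prefix, _, number = wanted.partition("_")
    return number in symbol_versions(lib_path, prefix)


if __name__ == "__main__":
    lib = "/usr/lib64/libstdc++.so.6"  # path used in this article
    if Path(lib).exists():
        print(symbol_versions(lib))
        print(has_version(lib, "CXXABI_1.3.9"))
```

On the system described here, has_version would be expected to return False for CXXABI_1.3.9 before the upgrade.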
3. Find the location and version of the current shared library
find / -name "libstdc++.so.6*"
Output:
/usr/lib64/libstdc++.so.6
/usr/lib64/libstdc++.so.6.0.19
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.py
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyc
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyo
/usr/local/cuda-12.1/nsight-systems-2023.1.2/host-linux-x64/libstdc++.so.6
/usr/local/cuda-12.1/nsight-compute-2023.1.1/host/linux-desktop-glibc_2_11_3-x64/libstdc++.so.6
Then run:
[root@dev ~]# cd /usr/lib64
[root@dev lib64]# ls -l libstdc++.so*
lrwxrwxrwx. 1 root root 19 Jul 31 09:44 libstdc++.so.6 -> libstdc++.so.6.0.19
-rwxr-xr-x. 1 root root 995840 Sep 30 2020 libstdc++.so.6.0.19
libstdc++.so.6 is a symlink that points to libstdc++.so.6.0.19.
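For scripting, the symlink can be resolved without reading ls output. This is a small sketch using only the standard library; the function name resolved_soversion is hypothetical.

```python
import os
import re


def resolved_soversion(link_path):
    """Follow symlinks and return (real_path, version_tuple_or_None)."""
    real = os.path.realpath(link_path)
    match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", real)
    version = tuple(int(p) for p in match.group(1).split(".")) if match else None
    return real, version


if __name__ == "__main__":
    print(resolved_soversion("/usr/lib64/libstdc++.so.6"))
```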
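4. Conclusion
The cause is that the libstdc++ the system links against is too old to provide the required CXXABI version, so upgrading gcc should fix it. With the diagnosis clear, let's get started.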
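To judge which gcc release is new enough, the sketch below compares the installed gcc version against the first release that provides a given CXXABI tag. The table FIRST_GCC_WITH_CXXABI and both function names are hypothetical helpers; the version pairs are my reading of the libstdc++ ABI documentation and should be verified there before relying on them. The sketch also assumes the system libstdc++ matches the system gcc, as it does on a stock CentOS install.

```python
import re

# Assumed mapping: first GCC release whose libstdc++ provides each CXXABI tag.
# Verify against the libstdc++ ABI policy documentation.
FIRST_GCC_WITH_CXXABI = {
    "CXXABI_1.3.7": (4, 8, 0),
    "CXXABI_1.3.8": (4, 9, 0),
    "CXXABI_1.3.9": (5, 1, 0),
    "CXXABI_1.3.13": (11, 1, 0),
}


def parse_gcc_version(gcc_v_line):
    """Extract (major, minor, patch) from the last line printed by gcc -v."""
    match = re.search(r"gcc version (\d+)\.(\d+)\.(\d+)", gcc_v_line)
    if match is None:
        raise ValueError("no gcc version found in: " + gcc_v_line)
    return tuple(int(part) for part in match.groups())


def needs_upgrade(gcc_v_line, required):
    """True if the given gcc is older than the first release providing `required`."""
    return parse_gcc_version(gcc_v_line) < FIRST_GCC_WITH_CXXABI[required]


if __name__ == "__main__":
    line = "gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)"
    print(needs_upgrade(line, "CXXABI_1.3.9"))
```

Under that assumed table, any gcc from 5.1.0 onward would satisfy CXXABI_1.3.9, so the latest release is sufficient but not strictly required.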
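Upgrading gcc
All gcc releases are listed at http://ftp.gnu.org/gnu/gcc/. Pick the one you need; if unsure, pick the latest.
Download the GCC source. This walkthrough uses gcc-11.2.0:
wget http://ftp.gnu.org/gnu/gcc/gcc-11.2.0/gcc-11.2.0.tar.gz
Extract: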
tar -zxvf gcc-11.2.0.tar.gz
Download the prerequisites:
cd gcc-11.2.0
./contrib/download_prerequisites
Create a build directory and configure:
# still inside the gcc-11.2.0 directory
mkdir build
cd build
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
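Build:
make
# The build can take 1-3 hours. To keep it running after you log out, start it in the background with nohup instead of the plain make above.
# Passing -j with the number of CPU cores (for example make -j"$(nproc)") usually shortens the build.
nohup make &
Once the build finishes, install: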
make install
Check the gcc version:
gcc -v
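If it still reports the old version, reboot the server and check again.
Creating the symlink (almost done)
Run the command from the beginning again: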
strings /usr/lib64/libstdc++.so.6 | grep CXXABI
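CXXABI_1.3.9 is only listed once /usr/lib64/libstdc++.so.6 points to the newly built library. make install places the new library under /usr/local/lib64 by default, so if the output is unchanged at this point, continue with the steps below.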
Locate the newest shared library produced by the GCC build:
find / -name "libstdc++.so*"
Output:
/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/usr/lib/gcc/x86_64-redhat-linux/4.8.2/32/libstdc++.so
/usr/lib/gcc/x86_64-redhat-linux/4.8.2/libstdc++.so
/usr/lib64/libstdc++.so.6
/usr/lib64/libstdc++.so.6.0.19
/usr/lib64/libstdc++.so.6.0.29
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.py
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyc
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyo
/usr/local/lib64/libstdc++.so.6.0.29
/usr/local/lib64/libstdc++.so.6
/usr/local/lib64/libstdc++.so
/usr/local/lib64/libstdc++.so.6.0.29-gdb.py
/usr/local/cuda-12.1/nsight-systems-2023.1.2/host-linux-x64/libstdc++.so.6
/usr/local/cuda-12.1/nsight-compute-2023.1.1/host/linux-desktop-glibc_2_11_3-x64/libstdc++.so.6
A newer version is available at /usr/local/lib64/libstdc++.so.6.0.29.
Next, make /usr/lib64/libstdc++.so.6 point to it.
Create the symlink:
cd /usr/lib64
cp /usr/local/lib64/libstdc++.so.6.0.29 /usr/lib64/
rm libstdc++.so.6
ln -s libstdc++.so.6.0.29 libstdc++.so.6
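Between the rm and the ln -s above there is a short window in which libstdc++.so.6 does not exist, and any C++ program started in that window would fail to load. A possible way to avoid that is to create the new link under a temporary name and rename it over the old one. This is a sketch only: repoint_symlink is a hypothetical helper, and it must run as root to modify /usr/lib64.

```python
import os


def repoint_symlink(link_path, new_target):
    """Atomically make link_path a symlink to new_target.

    new_target may be relative to the directory that contains link_path.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    if not os.path.exists(os.path.join(link_dir, new_target)):
        raise FileNotFoundError(new_target)
    tmp_path = f"{link_path}.tmp-{os.getpid()}"
    os.symlink(new_target, tmp_path)
    try:
        # rename() replaces the old link in a single step on POSIX systems.
        os.replace(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Paths from this article; uncomment to apply (requires root).
    # repoint_symlink("/usr/lib64/libstdc++.so.6", "libstdc++.so.6.0.29")
    pass
```

The old libstdc++.so.6.0.19 file stays in place, so the change can be reverted by pointing the link back at it.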
Verify:
[aigc@dev lib64]$ ls -l libstdc++.so*
lrwxrwxrwx 1 root root 19 Aug 9 11:48 libstdc++.so.6 -> libstdc++.so.6.0.29
-rwxr-xr-x. 1 root root 995840 Sep 30 2020 libstdc++.so.6.0.19
-rwxr-xr-x 1 root root 14595752 Aug 9 11:46 libstdc++.so.6.0.29
libstdc++.so.6 now points to libstdc++.so.6.0.29.
Check the shared library again:
strings /usr/lib64/libstdc++.so.6 | grep CXXABI
CXXABI_1.3.9 is now present, along with several newer versions.
The gcc upgrade is complete. Now run again:
python -m bitsandbytes
The problem is fully resolved.
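The bitsandbytes error message itself mentions LD_LIBRARY_PATH. If replacing the system-wide symlink is undesirable, a less invasive variant may be to leave /usr/lib64 untouched and put /usr/local/lib64 first on the library search path for the affected process only. The sketch below is an assumption about how that could be wired up, not something tested in this article; env_with_lib_dir and run_bitsandbytes_check are hypothetical names.

```python
import os
import subprocess
import sys


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries and any existing copy of lib_dir to avoid duplicates.
    others = [p for p in current.split(":") if p and p != lib_dir]
    env["LD_LIBRARY_PATH"] = ":".join([lib_dir, *others])
    return env


def run_bitsandbytes_check(lib_dir="/usr/local/lib64"):
    """Run python -m bitsandbytes in a child process using the adjusted path."""
    result = subprocess.run(
        [sys.executable, "-m", "bitsandbytes"],
        env=env_with_lib_dir(lib_dir),
    )
    return result.returncode


if __name__ == "__main__":
    sys.exit(run_bitsandbytes_check())
```

The variable has to be set before the Python process starts, which is why the check runs in a child process instead of modifying os.environ in place.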