System information
- Base image:
docker pull nvidia/cuda:12.4.1-cudnn-devel-rockylinux8
- Python version:
3.10
- PyTorch version:
2.6.0+cu124
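Before building, it can help to confirm that the container really has the versions listed above, since the extension is compiled against whichever PyTorch is installed. The following is a minimal sketch, not part of FlashInfer: `version_matches` and `check_build_env` are hypothetical helper names, and the expected values are simply the ones from this list.

```shell
# Succeeds when ACTUAL equals EXPECTED or extends it with a further
# dot-separated component (3.10 accepts 3.10.14, but not 3.11.0 or 3.100).
version_matches() {
    expected=$1
    actual=$2
    case "$actual" in
        "$expected" | "$expected".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the live environment with the versions listed above.
# Not called automatically: it needs python and torch to be installed.
check_build_env() {
    py=$(python -c 'import platform; print(platform.python_version())') || return 1
    pt=$(python -c 'import torch; print(torch.__version__)') || return 1
    version_matches 3.10 "$py" || { echo "unexpected Python: $py" >&2; return 1; }
    version_matches 2.6.0+cu124 "$pt" || { echo "unexpected PyTorch: $pt" >&2; return 1; }
    echo "ok: Python $py, PyTorch $pt"
}
```

Running `check_build_env` inside the container prints one `ok:` line on a match and returns non-zero otherwise.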
Build and install from source
# Download the source
git clone -b v0.2.7.post1 https://github.com/flashinfer-ai/flashinfer.git --recursive
cd flashinfer
# Install dependencies
yum install -y gcc-toolset-11
pip install ninja build
source /opt/rh/gcc-toolset-11/enable
# Compile the AOT kernels
TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0a" \
MAX_JOBS=4 \
CMAKE_BUILD_TYPE=Release \
python -m flashinfer.aot
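TORCH_CUDA_ARCH_LIST above covers four architectures. If the wheel only needs to run on the GPUs in the build machine, a narrower list can shorten compilation. The sketch below derives such a list; `arch_list_from_caps` and `detect_arch_list` are hypothetical helper names. It assumes a driver recent enough for `nvidia-smi` to support the `compute_cap` query field, and it rewrites 9.0 to 9.0a only to mirror the list used in this guide.

```shell
# Read one compute capability per line on stdin (e.g. "8.6") and print a
# space-separated, de-duplicated list suitable for TORCH_CUDA_ARCH_LIST.
# Assumption: 9.0 is rewritten to 9.0a to mirror the list used in this guide.
arch_list_from_caps() {
    tr -d ' \r' \
        | sed -e '/^$/d' -e 's/^9\.0$/9.0a/' \
        | sort -u \
        | paste -sd ' ' -
}

# Query the local GPUs. Not called automatically: it needs an NVIDIA driver.
detect_arch_list() {
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader | arch_list_from_caps
}
```

With these defined, `TORCH_CUDA_ARCH_LIST="$(detect_arch_list)"` could stand in for the fixed list in the build commands. Keep the full list when the wheel has to run on other GPU generations.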
# Build the wheel
TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0a" \
MAX_JOBS=4 \
CMAKE_BUILD_TYPE=Release \
python -m build -v --no-isolation --wheel
# Install
pip install ./dist/flashinfer_python-*.whl
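After installation, a quick check can confirm which version was built and that the package imports. This is a minimal sketch under stated assumptions: `wheel_version` and `verify_install` are hypothetical helper names, and the version is read from the wheel filename by relying on the standard wheel naming scheme, where the version is the second `-`-separated field.

```shell
# Print the version field of a wheel filename. Wheel names follow
# {distribution}-{version}-...-{platform}.whl, so it is the second field.
wheel_version() {
    basename "$1" | cut -d- -f2
}

# Report what was built and confirm the package imports.
# Not called automatically: it needs the wheel in ./dist and the package installed.
verify_install() {
    for whl in ./dist/flashinfer_python-*.whl; do
        [ -e "$whl" ] || { echo "no wheel found in ./dist" >&2; return 1; }
        echo "built: $(wheel_version "$whl")"
    done
    pip show flashinfer-python | grep '^Version:' || return 1
    python -c 'import flashinfer' && echo "import ok"
}
```

Run `verify_install` from the repository root. If the import fails, change to another directory and retry, so that Python picks up the installed package rather than the source tree.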
References
- https://docs.flashinfer.ai/installation.html#install-from-source