Building and Installing FlashInfer from Source

System information

  • System image: docker pull nvidia/cuda:12.4.1-cudnn-devel-rockylinux8
  • Python version: 3.10
  • PyTorch version: 2.6.0+cu124
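
Before building, it can help to confirm that the container actually matches the versions listed above. The following is a minimal sketch, not part of the original procedure: `report` is a hypothetical helper introduced here for illustration, and the `compute_cap` query field is assumed to be supported by the installed `nvidia-smi` (older drivers may not provide it, in which case the line simply prints "not found").

```shell
#!/usr/bin/env bash
# Print the versions relevant to the build. Each check falls back to
# "not found" so the script still runs on a machine without CUDA or PyTorch.

# Hypothetical helper: report <label> <command...>
report() {
  local label="$1"; shift
  local out
  if out="$("$@" 2>/dev/null)"; then
    printf '%-8s %s\n' "$label:" "$out"
  else
    printf '%-8s %s\n' "$label:" "not found"
  fi
}

report python python -c 'import platform; print(platform.python_version())'
report torch  python -c 'import torch; print(torch.__version__, "CUDA", torch.version.cuda)'
# grep exits non-zero when nvcc is missing, so the fallback message is printed.
report nvcc   sh -c 'nvcc --version | grep release'
report gpu    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
```

The compute capability printed on the `gpu` line is the value that the `TORCH_CUDA_ARCH_LIST` setting used in the build commands needs to cover.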

Build and install from source

# Download the source (with submodules) and enter the repository
git clone -b v0.2.7.post1 https://github.com/flashinfer-ai/flashinfer.git --recursive
cd flashinfer

# Install dependencies
yum install -y gcc-toolset-11
pip install ninja build
source /opt/rh/gcc-toolset-11/enable

# Compile the AOT kernels
TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0a" \
  MAX_JOBS=4 \
  CMAKE_BUILD_TYPE=Release \
  python -m flashinfer.aot
# Build the wheel
TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0a" \
  MAX_JOBS=4 \
  CMAKE_BUILD_TYPE=Release \
  python -m build -v --no-isolation --wheel

# Install
pip install ./dist/flashinfer_python-*.whl
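
After the install step, a quick sanity check can confirm that the wheel is registered with pip and that the package imports. This is a sketch under assumptions: `check_install` is a hypothetical helper, the distribution name `flashinfer-python` is inferred from the wheel filename above, and the version is read with `getattr` because the presence of a `__version__` attribute is assumed rather than confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check_install [distribution-name] [module-name]
check_install() {
  local pkg="${1:-flashinfer-python}" mod="${2:-flashinfer}"
  # Is the distribution known to pip?
  pip show "$pkg" >/dev/null 2>&1 || { echo "$pkg: not installed"; return 1; }
  # Does the module import? Print its version if it exposes one.
  python -c "import $mod; print('$mod', getattr($mod, '__version__', 'unknown'))"
}

check_install || echo "install check failed"
```

If the import fails even though pip lists the package, running the check from a directory other than the source checkout rules out the local `flashinfer/` folder shadowing the installed wheel.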

References

  1. https://docs.flashinfer.ai/installation.html#install-from-source