DeepSeek-OCR (GPU Setup | CUDA 11.8 | vLLM 0.8.5)
DeepSeek-OCR is an AI model that reads and understands text in images and documents with high accuracy. It runs on GPUs through vLLM, which allows very fast processing of documents such as scanned PDFs and invoices.
Minimum Requirements for DeepSeek-OCR
Hardware Requirements
Component | Minimum | Recommended
GPU | NVIDIA A30 (16–24GB VRAM) | A100 / RTX 6000 Ada (40–80GB VRAM)
CUDA Compute Capability | ≥ 7.0 | ≥ 7.0
VRAM | 16GB | 96GB+
System RAM | 64GB | 128GB
Storage | 40GB free | 60GB+
Software Requirements
Ubuntu 22.04 / 24.04
NVIDIA Driver ≥ 520
CUDA Toolkit 11.8
Python 3.10
vLLM 0.8.5 (cu118 build)
Conda environment (Miniconda)
Verify GPU
nvidia-smi
If GPU details appear → continue. If not → install NVIDIA drivers first.
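The check above can also be scripted. The following is a minimal sketch, not part of the DeepSeek-OCR repository, that compares each detected GPU against the minimum figures listed under Hardware Requirements. It assumes the installed driver supports the `compute_cap` query field of `nvidia-smi`; the helper names and the approximate VRAM threshold are illustrative choices, not values from the guide.

```python
import shutil
import subprocess

# Roughly 16GB. nvidia-smi reports MiB and often a little less than the
# nominal card size, so this threshold is deliberately approximate (assumption).
MIN_VRAM_MIB = 15000
MIN_COMPUTE_CAP = (7, 0)  # from the hardware requirements


def parse_gpu_line(line):
    """Parse one CSV line of the form 'name, memory_mib, compute_cap'."""
    name, memory_mib, compute_cap = [f.strip() for f in line.rsplit(",", 2)]
    cap = tuple(int(part) for part in compute_cap.split("."))
    return name, int(memory_mib), cap


def meets_minimum(memory_mib, compute_cap):
    """True if a GPU satisfies both the VRAM and compute capability minimums."""
    return memory_mib >= MIN_VRAM_MIB and compute_cap >= MIN_COMPUTE_CAP


def query_gpus():
    """Return parsed GPU entries, or an empty list if nvidia-smi is unusable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,memory.total,compute_cap",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    gpus = []
    for line in result.stdout.splitlines():
        try:
            gpus.append(parse_gpu_line(line))
        except ValueError:
            continue  # skip blank lines or fields reported as unavailable
    return gpus


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No GPU information available; install the NVIDIA driver first.")
    for name, memory_mib, cap in gpus:
        status = "OK" if meets_minimum(memory_mib, cap) else "below minimum"
        print(f"{name}: {memory_mib} MiB, compute capability {cap[0]}.{cap[1]} -> {status}")
```

Comparing compute capability as a tuple of integers avoids the rounding surprises that a float comparison could introduce.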
Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
eval "$($HOME/miniconda/bin/conda shell.bash hook)"
Create Conda Environment
conda create -n deepseek-ocr python=3.10 -y
If conda refuses to create the environment until the channel Terms of Service are accepted, run these:
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
Re-create environment
conda create -n deepseek-ocr python=3.10 -y
conda activate deepseek-ocr
Install PyTorch (CUDA 11.8 build)
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu118
The download and install can take about 10 minutes.
Verify PyTorch GPU access
python - <<'PY'
import torch, sys
print("torch:", torch.__version__)
print("torch.cuda.runtime:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
PY
Clone DeepSeek-OCR and install requirements
git clone https://github.com/deepseek-ai/DeepSeek-OCR.git
cd DeepSeek-OCR
pip install -r requirements.txt
Install CUDA Toolkit 11.8 (Required for FlashAttention)
Download installer
# Download the CUDA 11.8 Toolkit installer
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
Install toolkit only (not the driver)
sudo sh cuda_11.8.0_520.61.05_linux.run --toolkit --silent --override
This will take about 5-10 minutes to download and install.
Verify installation
ls -la /usr/local/cuda-11.8/bin/nvcc
Set environment variables (put CUDA 11.8 FIRST in PATH)
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
Verify the correct nvcc is now found
which nvcc
nvcc --version
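Reading the release number by eye is easy to get wrong when several toolkits are installed. As a rough sketch (the function names are hypothetical, not part of any toolkit), the release reported by whichever `nvcc` comes first on PATH can be extracted and compared with 11.8. It assumes the output contains a `release X.Y` fragment, as current nvcc builds print.

```python
import re
import shutil
import subprocess

EXPECTED_RELEASE = "11.8"


def parse_nvcc_release(output):
    """Return the X.Y value after 'release' in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_nvcc_release():
    """Run the first nvcc found on PATH and return its release, or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"],
                            capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_nvcc_release()
    if release == EXPECTED_RELEASE:
        print(f"nvcc release {release} is active")
    else:
        print(f"expected nvcc release {EXPECTED_RELEASE}, found {release}")
```

If the script reports a different release, the usual cause is another CUDA directory appearing earlier in PATH.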
Install GCC-11 (required for CUDA 11.8 build tools)
sudo apt-get install -y gcc-11 g++-11
Set GCC-11 as the compiler for this session
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
# Verify
gcc-11 --version
g++-11 --version
pip install psutil
Install FlashAttention
pip install flash-attn --no-build-isolation
This should work now. The compilation will take 5-10 minutes.
Make CUDA & GCC Paths Permanent in Conda Env
# Run after FlashAttention has installed successfully
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
cat > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh << 'EOF'
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
EOF
Install vLLM 0.8.5 (CUDA 11.8 wheel)
(If you already installed the correct vllm, this will just confirm/force it.)
# Prefer the cu118 wheel index; this installs vllm 0.8.5 built for cu118
pip install --upgrade "vllm==0.8.5" --extra-index-url https://wheels.vllm.ai/cu118
# Quick check
python -c "import vllm, sys; print('vllm', getattr(vllm,'__version__',None))"
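At this point the toolchain settings and the three main packages are all in place, so they can be checked together. This is a minimal sketch under the guide's own values (CUDA 11.8 in /usr/local/cuda-11.8, GCC-11 in /usr/bin); the function names are hypothetical, and the distribution names passed to `importlib.metadata` are assumed to match the names used with pip earlier.

```python
import os
from importlib import metadata

EXPECTED_ENV = {
    "CUDA_HOME": "/usr/local/cuda-11.8",
    "CC": "/usr/bin/gcc-11",
    "CXX": "/usr/bin/g++-11",
}
PACKAGES = ("torch", "flash-attn", "vllm")  # assumed distribution names


def env_problems(environ):
    """Return a list of human-readable problems found in the environment."""
    problems = []
    for name, expected in EXPECTED_ENV.items():
        actual = environ.get(name)
        if actual != expected:
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    # The guide asks for CUDA 11.8 to come FIRST, so look at the first
    # PATH entry that mentions cuda at all.
    cuda_bin = EXPECTED_ENV["CUDA_HOME"] + "/bin"
    entries = environ.get("PATH", "").split(os.pathsep)
    first_cuda = next((e for e in entries if "cuda" in e), None)
    if first_cuda != cuda_bin:
        problems.append(f"first CUDA entry on PATH is {first_cuda!r}, expected {cuda_bin!r}")
    return problems


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for package in packages:
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None
    return versions


if __name__ == "__main__":
    for problem in env_problems(os.environ):
        print("WARNING:", problem)
    for package, version in installed_versions().items():
        print(f"{package}: {version or 'not installed'}")
```

Run it inside the activated deepseek-ocr environment; a CUDA 11.8 build of torch normally shows a version string ending in `+cu118`.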
Prepare input / output paths and a sample image
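mkdir -p ~/deepseek_input ~/deepseek_output
Open the DeepSeek-OCR-master/DeepSeek-OCR-vllm directory inside the cloned DeepSeek-OCR repository and edit config.py so that its input and output paths point at the directories created above:
cd DeepSeek-OCR-master/DeepSeek-OCR-vllm
nano config.py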
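Python does not expand `~` inside ordinary string literals, so paths written into config.py are safer as absolute paths. The sketch below prints the absolute forms of the two directories created above, ready to paste. The setting names `INPUT_PATH` and `OUTPUT_PATH` are an assumption about what config.py expects, and `sample.png` is a hypothetical file name; check both against your copy of the file.

```python
from pathlib import Path


def resolve_dirs(home=None):
    """Return absolute input and output directories under the given home."""
    base = Path(home) if home is not None else Path.home()
    return base / "deepseek_input", base / "deepseek_output"


def config_lines(input_file, output_dir):
    """Format the two assignments as Python source lines."""
    # Assumed setting names; confirm them in config.py before pasting.
    return [
        f"INPUT_PATH = {str(input_file)!r}",
        f"OUTPUT_PATH = {str(output_dir)!r}",
    ]


if __name__ == "__main__":
    input_dir, output_dir = resolve_dirs()
    sample = input_dir / "sample.png"  # hypothetical sample image name
    if not sample.exists():
        print(f"# note: {sample} does not exist yet; copy an image there first")
    for line in config_lines(sample, output_dir):
        print(line)
```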

Run DeepSeek-OCR
time python run_dpsk_ocr_image.py
If the command finishes without errors, the DeepSeek-OCR model ran successfully.