# Deployment and Optimization of DeepSeek-OCR

DeepSeek-OCR is an AI model that reads and understands text in images and documents with high accuracy. It runs on GPUs through vLLM, which allows fast processing of scanned PDFs and invoices.

#### Minimum Requirements for DeepSeek-OCR

#### **Hardware Requirements**

| Component               | Minimum                    | Recommended                        |
| ----------------------- | -------------------------- | ---------------------------------- |
| GPU                     | NVIDIA A30 (16–24GB VRAM)  | A100 / RTX 6000 Ada (40–80GB VRAM) |
| CUDA Compute Capability | ≥ 7.0                      | ≥ 8.0                              |
| VRAM                    | 16GB                       | 96GB+                              |
| System RAM              | 64GB                       | 128GB                              |
| Storage                 | 40GB free                  | 60GB+                              |
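The minimums in the table can be checked programmatically. The following is a minimal sketch, not part of the DeepSeek-OCR tooling: the helper names (`parse_gpu_line`, `meets_minimum`, `query_gpus`) are hypothetical, and it assumes your `nvidia-smi` supports the `compute_cap` query field, which older drivers may not.

```python
import subprocess

# Minimums taken from the hardware table above.
MIN_VRAM_GB = 16
MIN_COMPUTE_CAP = 7.0


def parse_gpu_line(line):
    """Parse one CSV line of the form 'name, memory in MiB, compute capability'."""
    name, mem_mib, cap = [field.strip() for field in line.split(",")]
    return {"name": name, "vram_gb": int(mem_mib) / 1024, "compute_cap": float(cap)}


def meets_minimum(gpu):
    """Return True if the GPU satisfies both the VRAM and compute-capability minimums."""
    return gpu["vram_gb"] >= MIN_VRAM_GB and gpu["compute_cap"] >= MIN_COMPUTE_CAP


def query_gpus():
    """Ask nvidia-smi for name, total memory (MiB), and compute capability of each GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total,compute_cap",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            status = "OK" if meets_minimum(gpu) else "below minimum"
            print(f"{gpu['name']}: {gpu['vram_gb']:.1f} GB, CC {gpu['compute_cap']} -> {status}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("nvidia-smi query failed:", exc)
```

Note that the memory reported by the driver can be slightly lower than the nominal card size, so a card sold as 16GB may fall just under the threshold used here.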

#### **Software Requirements**

* Ubuntu **22.04 / 24.04**
* NVIDIA Driver **≥ 520**
* CUDA Toolkit **11.8**
* Python **3.10**
* PyTorch 2.6.0 (CU118)
* vLLM 0.8.5 (cu118 build)
* Conda environment (Miniconda)

#### Verify GPU

```
nvidia-smi
```

Checks if your GPU is properly detected by the OS and NVIDIA drivers.

<figure><img src="/files/zrZHJPgXJ9MuuH7nxq3J" alt=""><figcaption></figcaption></figure>

> If GPU details appear, continue.\
> If not, install the NVIDIA drivers first.

#### Install Miniconda

Download and install Conda, which isolates dependencies for DeepSeek-OCR.

```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
```

Downloads the Miniconda installer and saves it as `miniconda.sh`.

```
bash miniconda.sh -b -p $HOME/miniconda
```

Installs Miniconda silently (`-b`) into `$HOME/miniconda` (`-p`).

```
eval "$($HOME/miniconda/bin/conda shell.bash hook)"
```

Initializes Conda in the current shell so that the `conda` command works in your terminal.

<figure><img src="/files/fu0KNjOdrQEz5nc5QEwr" alt=""><figcaption></figcaption></figure>

#### Create Conda Environment

```
conda create -n deepseek-ocr python=3.10 -y
```

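{% hint style="info" %}
If `conda create` stops with a Terms of Service (ToS) message, accept the ToS for the default channels with the commands below, then re-create the environment.
{% endhint %}

Accept the ToS: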

```
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
```

Re-create the environment (only needed if the first attempt failed), then activate it:

```
conda create -n deepseek-ocr python=3.10 -y
conda activate deepseek-ocr
```

Creates an isolated environment in which DeepSeek-OCR and its dependencies can be installed safely.

<figure><img src="/files/kpz8CoL9ZHaa6hH56Oef" alt=""><figcaption></figcaption></figure>

#### Install PyTorch (CUDA 11.8 build)

```
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu118
```

Installs GPU-enabled PyTorch 2.6.0 that matches CUDA 11.8.

Verify PyTorch GPU access

```
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("torch.cuda.runtime:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
PY
```

Checks if PyTorch can detect and use your GPU properly.

<figure><img src="/files/G3Pf3xAxHBPTSpb5RRlL" alt=""><figcaption></figcaption></figure>

#### Clone DeepSeek-OCR and install requirements

```
git clone https://github.com/deepseek-ai/DeepSeek-OCR.git
cd DeepSeek-OCR
pip install -r requirements.txt
```

Downloads the DeepSeek-OCR code and installs required Python libraries.

#### After Cloning DeepSeek-OCR, You Will See These Files

<figure><img src="/files/d53w3dkoj877Pa7DvCCH" alt=""><figcaption></figcaption></figure>

#### Install CUDA Toolkit 11.8 (Required for FlashAttention)

Download installer

```
# Download the CUDA 11.8 Toolkit installer
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
```

<figure><img src="/files/3lwlyNj9nWFgxaKCrvTk" alt=""><figcaption></figcaption></figure>

Install toolkit only (not the driver)

```
sudo sh cuda_11.8.0_520.61.05_linux.run --toolkit --silent --override
```

Installs CUDA 11.8 compiler tools required to build FlashAttention.

This will take about 5-10 minutes to download and install.

Verify installation

```
ls -la /usr/local/cuda-11.8/bin/nvcc
```

Set environment variables (put CUDA 11.8 FIRST in PATH)

```
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
```

Verify that the correct `nvcc` is now found:

```
which nvcc
nvcc --version
```

<figure><img src="/files/NME5UA3SH4rLwzP8mSBx" alt=""><figcaption></figcaption></figure>
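To confirm the `nvcc` release without reading the output by eye, the version string can be parsed. This is a small sketch under the assumption that `nvcc --version` prints a line containing `release X.Y`, as current CUDA toolkits do; `parse_nvcc_release` is a hypothetical helper name.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Extract the 'release X.Y' number from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        out = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=True
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("nvcc could not be run:", exc)
    else:
        # This guide expects the CUDA 11.8 toolkit to be first on PATH.
        print("nvcc release:", parse_nvcc_release(out), "(expected 11.8)")
```

If the reported release is not 11.8, another CUDA toolkit is probably earlier on `PATH` than `/usr/local/cuda-11.8/bin`.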

#### Install GCC-11 (required for CUDA 11.8 build tools)

```
sudo apt-get install -y gcc-11 g++-11
```

Set GCC-11 as the compiler for this session

```
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
```

```
# Verify
gcc-11 --version
g++-11 --version
```

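Install `psutil` so that it is available to the FlashAttention build in the next step, which runs with `--no-build-isolation` and therefore uses the packages already installed in the environment:

```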
pip install psutil
```

#### Install FlashAttention

```
pip install flash-attn --no-build-isolation
```

<figure><img src="/files/Xcj55bVoRX0Mv3DzuoUd" alt=""><figcaption></figcaption></figure>

With CUDA 11.8 and GCC-11 in place, the build should now succeed. Compilation takes about 5-10 minutes.
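A successful `pip install` does not guarantee that the compiled extension loads, so it can help to import the package once. The sketch below uses a hypothetical helper, `try_import`; the only assumption about FlashAttention is that its import name is `flash_attn`.

```python
import importlib


def try_import(module_name):
    """Import a module and return (ok, detail): its version on success, the error on failure."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or loader errors from a broken binary extension
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    ok, detail = try_import("flash_attn")
    print("flash_attn:", "OK, version " + detail if ok else "FAILED, " + detail)
```

An error that mentions an undefined symbol or a missing CUDA library usually means the extension was built against a different PyTorch or CUDA version than the one currently installed.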

#### Make CUDA & GCC Paths Permanent in Conda Env

```
# After successful installation
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
cat > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh << 'EOF'
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
EOF
```

Ensures that the correct CUDA and GCC versions load automatically every time you activate the environment.
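After running `conda deactivate` and `conda activate deepseek-ocr`, the variables written to `env_vars.sh` should be visible to any process started from that shell. As a rough sketch (the `check_env` helper is hypothetical), the expected values can be compared against the live environment:

```python
import os

# Values written to env_vars.sh in the step above.
EXPECTED_ENV = {
    "CUDA_HOME": "/usr/local/cuda-11.8",
    "CC": "/usr/bin/gcc-11",
    "CXX": "/usr/bin/g++-11",
}


def check_env(env):
    """Return a list of problems found in the given environment mapping (empty if none)."""
    problems = []
    for name, wanted in EXPECTED_ENV.items():
        if env.get(name) != wanted:
            problems.append(f"{name} is {env.get(name)!r}, expected {wanted!r}")
    cuda_bin = EXPECTED_ENV["CUDA_HOME"] + "/bin"
    if cuda_bin not in env.get("PATH", "").split(os.pathsep):
        problems.append(f"{cuda_bin} is not on PATH")
    cuda_lib = EXPECTED_ENV["CUDA_HOME"] + "/lib64"
    if cuda_lib not in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append(f"{cuda_lib} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    issues = check_env(os.environ)
    print("environment OK" if not issues else "\n".join(issues))
```

This only checks that the CUDA directories are present on the search paths, not that they come first; `which nvcc` remains the quickest way to confirm ordering.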

#### Install vLLM 0.8.5 (CUDA 11.8 wheel)

> If the correct vLLM version is already installed, this command simply confirms or re-pins it.

```
# Prefer the cu118 wheel index; this installs vLLM 0.8.5 built for cu118
pip install --upgrade "vllm==0.8.5" --extra-index-url https://wheels.vllm.ai/cu118

# Quick check
python -c "import vllm; print('vllm', getattr(vllm, '__version__', None))"
```

Installs the version of vLLM that works correctly with CUDA 11.8 and PyTorch 2.6.

<figure><img src="/files/uguDiQwYwjmW5681Qxbp" alt=""><figcaption></figcaption></figure>
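Because installing one package can pull in a different version of another, it may be worth confirming that every pin from this guide is still in place. The following sketch reads installed versions from package metadata without importing the packages; `installed_version` and `check_versions` are hypothetical helper names, and local build tags such as `+cu118` are stripped before comparison.

```python
from importlib import metadata

# Versions pinned earlier in this guide.
EXPECTED = {
    "torch": "2.6.0",
    "torchvision": "0.21.0",
    "torchaudio": "2.6.0",
    "vllm": "0.8.5",
}


def installed_version(package):
    """Return the installed version without any local '+tag' suffix, or None if missing."""
    try:
        return metadata.version(package).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected):
    """Map each package name to a tuple (installed, expected, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, wanted, found == wanted)
    return report


if __name__ == "__main__":
    for package, (found, wanted, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: installed={found} expected={wanted} {status}")
```

A mismatch for `torch` after the vLLM install suggests that pip replaced the CUDA 11.8 build, in which case the PyTorch GPU check from earlier is worth repeating.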

#### Prepare Input/Output Paths and a Sample Image

```
mkdir -p ~/deepseek_input ~/deepseek_output
```

Creates directories to store images and OCR output.

Now open the config file and set the input and output paths to the directories you just created:

```
nano DeepSeek-OCR-vllm/config.py
```

<figure><img src="/files/gfKD5NDu6eUo17xlOEgl" alt=""><figcaption></figcaption></figure>
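Python does not expand `~` in a path string by itself, so absolute paths are the safer choice in `config.py`. The sketch below prints the absolute form of the two directories created earlier. The variable names `INPUT_PATH` and `OUTPUT_PATH` and the file name `sample.png` are assumptions for illustration; use whatever names your copy of `config.py` actually defines.

```python
from pathlib import Path

# Directories created with `mkdir -p` in the previous step.
input_dir = Path("~/deepseek_input").expanduser()
output_dir = Path("~/deepseek_output").expanduser()

# Hypothetical config values: the variable names and the image file name are
# assumptions, so match them to what your config.py defines.
INPUT_PATH = str(input_dir / "sample.png")
OUTPUT_PATH = str(output_dir)

print("INPUT_PATH  =", repr(INPUT_PATH))
print("OUTPUT_PATH =", repr(OUTPUT_PATH))
```

Copy the printed values into the config file rather than typing `~/...` paths by hand.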

Save and exit:

* **CTRL + O**, then **Enter**: save the file
* **CTRL + X**: exit the editor

#### **Run DeepSeek-OCR**

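Run the script from the directory that contains `run_dpsk_ocr_image.py`. The `time` prefix reports how long the run takes.

```
time python run_dpsk_ocr_image.py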
```

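**If the command finishes without errors, DeepSeek-OCR ran successfully.** Check the output directory you set in `config.py` for the results.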
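To see what the run produced, the output directory can be listed. This is a generic sketch that makes no assumption about the names of the files DeepSeek-OCR writes; `list_outputs` is a hypothetical helper, and `~/deepseek_output` is the directory created earlier in this guide.

```python
from pathlib import Path


def list_outputs(output_dir):
    """Return (relative path, size in bytes) for every file under output_dir, sorted by path."""
    root = Path(output_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    files = list_outputs("~/deepseek_output")
    if not files:
        print("No output files found; check the output path in config.py.")
    for name, size in files:
        print(f"{size:>10} bytes  {name}")
```

An empty listing after a run that reported no errors usually points to a different output path in `config.py` than the one being inspected.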


