Running DeepSeek Janus Pro 7B on Windows with ComfyUI: Step-by-Step Guide

DeepSeek Janus Pro 7B is a multimodal model designed to unify understanding and generation tasks across text and images. Its architecture decouples visual encoding into separate pathways while keeping a single unified transformer, a design reported to outperform traditional models on multimodal benchmarks.

This article is a step-by-step guide to installing and running DeepSeek Janus Pro 7B on Windows using ComfyUI. It covers system requirements, installation steps, common pitfalls with troubleshooting fixes, and optimization tips.

System Requirements

Before proceeding with the installation, ensure that your system meets the following requirements:

  • Operating System: Windows 10 or later
  • CPU: Minimum of 8 cores; recommended 16 cores for optimal performance
  • RAM: At least 32 GB; 64 GB is recommended for better performance
  • GPU: NVIDIA GPU with CUDA support (RTX series recommended)
  • Disk Space: Minimum of 100 GB free space for installation and model storage
  • Software:
    • Python 3.8–3.11 (added to PATH during installation)
    • Git for repository cloning
    • CUDA Toolkit (a version supported by your installed NVIDIA driver)
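
The software and disk requirements above can be checked with a short script before you start. This is a minimal sketch using only the Python standard library: the thresholds are copied from the list, and the helper names (`python_version_ok`, `free_disk_gb`, `tool_on_path`) are made up for this example. It does not check CPU cores, RAM, or the GPU model.

```python
import shutil
import sys

# Thresholds taken from the requirements list above.
MIN_PY, MAX_PY = (3, 8), (3, 11)
MIN_FREE_GB = 100


def python_version_ok(version=None):
    """Return True if the (major, minor) version is within 3.8-3.11."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor <= MAX_PY


def free_disk_gb(path="."):
    """Free space on the drive containing `path`, in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def tool_on_path(name):
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    print("Free disk space OK:", free_disk_gb() >= MIN_FREE_GB)
    print("git on PATH:", tool_on_path("git"))
    print("nvidia-smi on PATH:", tool_on_path("nvidia-smi"))
```

If `nvidia-smi` is not found, the NVIDIA driver is probably missing or not on PATH, which is worth fixing before installing the CUDA Toolkit.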

Pre-Installation Checklist

  1. Update NVIDIA drivers via GeForce Experience or the NVIDIA Driver Portal.
  2. Disable antivirus temporarily to avoid installation interruptions.
  3. Ensure stable internet for downloading large model files (20–50 GB).

Step 1: Install Python and Git

  1. Download Python:
    • Visit the official Python website and download a Windows installer for a version in the supported 3.8–3.11 range (the newest release may fall outside it).
    • During installation, check the box that says "Add Python to PATH."
  2. Install Git:
    • Download Git from the official Git website.
    • Follow the installation prompts and use default settings.

Step 2: Install CUDA Toolkit (if using NVIDIA GPU)

To leverage your NVIDIA GPU’s capabilities, install the CUDA Toolkit:

  1. Visit the NVIDIA CUDA Toolkit website.
  2. Select your operating system and follow the instructions to download and install the CUDA Toolkit.

Step 3: Create a Virtual Environment

Creating a virtual environment helps manage dependencies effectively:

  1. Open Command Prompt (cmd).
  2. Run the following commands:
python -m venv deepseek-env
deepseek-env\Scripts\activate
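
To confirm that the environment is really active before installing anything, you can ask the interpreter itself. The sketch below relies on the standard behavior that `sys.prefix` and `sys.base_prefix` differ inside a venv; the function name `in_virtualenv` is just a label for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Interpreter:", sys.executable)
```

When the environment is active, the interpreter path printed should sit inside the deepseek-env folder.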

Step 4: Clone the DeepSeek Janus Pro Repository

Use Git to clone the DeepSeek Janus Pro repository:

git clone https://github.com/deepseek-ai/Janus.git
cd Janus

Step 5: Install Required Packages

While in the cloned directory, install the required packages:

pip install -r requirements.txt

This command installs all necessary dependencies listed in requirements.txt.
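
A quick way to see whether the install can actually use your GPU is to ask PyTorch. This sketch assumes that requirements.txt pulled PyTorch into the environment; if it did not, the function says so instead of failing. The name `cuda_report` is hypothetical.

```python
def cuda_report():
    """Describe whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} installed, but no CUDA device is visible"
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__} sees CUDA device 0: {name}"


if __name__ == "__main__":
    print(cuda_report())
```

If PyTorch is installed but no CUDA device is visible, a CPU-only PyTorch build or an outdated driver is a likely cause.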

Step 6: Download DeepSeek Janus Pro Models

You need to download the specific model files for Janus Pro:

  1. Go to the deepseek-ai/Janus-Pro-7B model page on Hugging Face and download the model files.
  2. Place these files in a directory structure as follows:
Janus/
└── models/
    └── Janus-Pro-7B/
        ├── config.json
        ├── pytorch_model.bin
        └── tokenizer.json
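
Large downloads sometimes end up incomplete or in the wrong folder, so it can help to verify the layout. The sketch below only checks for the three file names shown in the tree above. The actual model repository may ship additional or differently named files, so treat `EXPECTED_FILES` as an assumption to adjust.

```python
from pathlib import Path

# File names copied from the directory layout above.
EXPECTED_FILES = ("config.json", "pytorch_model.bin", "tokenizer.json")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(Path("models") / "Janus-Pro-7B")
    print("All files present" if not missing else f"Missing: {missing}")
```

Run it from inside the Janus folder so that the relative path models/Janus-Pro-7B resolves correctly.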

Step 7: Install ComfyUI

ComfyUI is a node-based interface for building and running generative-model workflows. Clone it outside the Janus folder (for example, in the parent directory) so the two repositories stay separate:

  1. Clone the ComfyUI repository:
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
  2. Install ComfyUI dependencies:
pip install -r requirements.txt

Step 8: Install ComfyUI-Janus-Pro Plugin

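To integrate Janus Pro with ComfyUI, install its plugin as a custom node. ComfyUI loads plugins from its custom_nodes folder, so from the ComfyUI directory clone the repository there and install its requirements. The plugin may expect the model under ComfyUI's own models folder rather than inside the Janus checkout; check the plugin's README for the expected path and current steps.

git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git custom_nodes\ComfyUI-Janus-Pro
pip install -r custom_nodes\ComfyUI-Janus-Pro\requirements.txt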

Step 9: Launch ComfyUI

After installation, launch ComfyUI from the ComfyUI directory with:

python main.py

This starts a local server, typically accessible at http://localhost:8188.
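
If the browser shows nothing, it helps to know whether the server is listening at all. The following sketch uses only the standard library and assumes the default address mentioned above; `server_is_up` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:8188", timeout=3.0):
    """Return True if an HTTP server answers at `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("ComfyUI reachable:", server_is_up())
```

A result of False while the launch window is still open usually means the server is still loading or is bound to a different port; the console output shows the address it actually uses.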

Step 10: Testing Your Setup

Once ComfyUI is running, test that everything works by generating an image through the web interface. The call below is illustrative pseudocode for the request you are making (a prompt plus an image count); it is not a function provided by ComfyUI or Janus:

generate_image(prompt="a futuristic cityscape", num_images=4)

This corresponds to generating four images from the prompt.

First-Time Setup: Quick Test

  1. In ComfyUI, load the Janus-Pro-7B workflow template.
  2. Enter a prompt (e.g., “cyberpunk city at sunset”).
  3. Adjust parameters:
    • num_images: Start with 1 to test speed.
    • resolution: Use 512x512 for faster generation.
  4. Click “Queue Prompt” to generate.
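
The "Queue Prompt" action can also be scripted, because ComfyUI accepts workflows as JSON posted to its /prompt endpoint. The sketch below only builds that request and assumes you have exported your own workflow in ComfyUI's API format. The file name janus_workflow_api.json and the helper `build_queue_request` are hypothetical, and the node names inside the workflow depend on the plugin version, so they are not shown here.

```python
import json
import urllib.request


def build_queue_request(workflow, server="http://localhost:8188"):
    """Build (but do not send) a POST request that queues `workflow`."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "janus_workflow_api.json" is a hypothetical file name: export your own
    # workflow from ComfyUI in API format and point this at it.
    try:
        with open("janus_workflow_api.json", encoding="utf-8") as handle:
            workflow = json.load(handle)
    except FileNotFoundError:
        print("Export a workflow in API format first.")
    else:
        with urllib.request.urlopen(build_queue_request(workflow)) as response:
            print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the payload before anything reaches the server.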

Troubleshooting Common Issues

Issue 1: CUDA Out of Memory

Fix: Reduce batch size or image resolution. Use FP16 precision:

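import torch
model = model.half()  # "model" is your already-loaded Janus model; converts weights to FP16
torch.cuda.empty_cache()  # release cached GPU memory that is no longer in use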
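
To see why FP16 helps, it is enough to multiply the parameter count by the bytes per parameter. The sketch below reads "7B" as roughly 7 billion parameters, which is an approximation, and counts weights only. Activations, caches, and framework overhead come on top.

```python
def weights_gib(num_params, bytes_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    params = 7e9  # assumption: "7B" read as roughly 7 billion parameters
    print(f"FP32 weights: {weights_gib(params, 4):.1f} GiB")
    print(f"FP16 weights: {weights_gib(params, 2):.1f} GiB")
```

Under that assumption the weights need about 26 GiB in FP32 and about 13 GiB in FP16, which is why half precision is often the difference between fitting on a consumer GPU and running out of memory.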

Issue 2: Missing Dependencies

  • Fix: Reinstall requirements with pip install --force-reinstall -r requirements.txt.
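
Before reinstalling everything, you can find out which packages are actually missing. This sketch uses the standard `importlib.util.find_spec`; the module list is an assumption about what this stack typically imports, not a copy of requirements.txt, so adjust it to your setup.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these are typical import names for this stack; adjust the
    # list to match what requirements.txt actually installs.
    wanted = ["torch", "transformers", "PIL", "numpy"]
    missing = missing_modules(wanted)
    print("Missing modules:", missing if missing else "none")
```

Run it with the virtual environment activated, otherwise it reports on the system Python instead.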

Issue 3: Slow Performance

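  • Fix: On systems with both integrated and NVIDIA graphics, open Windows Settings > System > Display > Graphics settings and set Python to use the high-performance (NVIDIA) GPU. Also confirm that generation is running on the GPU rather than falling back to the CPU.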

Issue 4: Hugging Face Download Errors

  • Fix: Use huggingface-cli login to authenticate before downloading models.

Pro Tips for Optimal Performance

  • VRAM Management: Close background apps like browsers or gaming clients.
  • Precision: Use FP16 mode for quicker inferences.
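  • Hardware: For multi-GPU setups, choose which GPUs a process can see by setting the CUDA_VISIBLE_DEVICES environment variable before launching (in Command Prompt: set CUDA_VISIBLE_DEVICES=0,1).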
  • Updates: Regularly pull the latest Janus and ComfyUI commits with git pull.
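
The GPU selection tip can also be applied from a launcher script. The sketch below builds an environment with CUDA_VISIBLE_DEVICES set and leaves the actual launch to you; `env_for_gpus` is a hypothetical helper name.

```python
import os


def env_for_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment that exposes only `gpu_ids` to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


if __name__ == "__main__":
    env = env_for_gpus([0, 1])
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
    # Pass `env` to subprocess.run(..., env=env) when launching ComfyUI, so the
    # variable is set before any CUDA library initializes.
```

Setting the variable on the child process instead of inside an already running Python session avoids the common mistake of changing it after CUDA has initialized, when it no longer has any effect.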

Exploring Advanced Features

  1. Multi-Prompt Fusion: Combine text and image prompts for hybrid outputs.
  2. Batch Processing: Generate multiple images in parallel by increasing num_images.
  3. Custom Workflows: Save successful configurations as JSON templates in ComfyUI.

Conclusion

By following this detailed guide, you should be able to successfully install and run DeepSeek Janus Pro 7B on Windows using ComfyUI. This powerful multimodal framework opens up new possibilities for projects involving text-to-image generation and visual understanding.