Install and Run Mochi 1 on Ubuntu: A Complete Guide

Mochi 1 is an open-source AI video generation model developed by Genmo that turns text prompts into short video clips. This guide walks content creators, marketers, and developers through installing, running, and tuning Mochi 1 on Ubuntu.

System Requirements

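Check your Ubuntu system against the specifications below. The GPU figures are this guide's estimates; the upstream repository has cited considerably higher VRAM needs for its reference code, so confirm the current requirements in its README before choosing hardware.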

Component | Minimum Requirement | Recommended
OS | Ubuntu 20.04 LTS | Ubuntu 22.04 LTS
CPU | Intel i5 / AMD Ryzen 5 | Intel i7 / AMD Ryzen 7
RAM | 16 GB | 32 GB
GPU | NVIDIA GPU (8 GB VRAM) | NVIDIA RTX 3090 (24 GB VRAM)
Python | 3.8+ | 3.10+
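
The requirements above can be checked with a short preflight script. The sketch below is an illustration and not part of Mochi: it reads total RAM from /proc/meminfo, asks nvidia-smi for each GPU's total memory, and compares both against the minimum column. The function names, the thresholds, and the 10% tolerance are assumptions made for this example.

```python
import shutil
import subprocess
import sys

# Minimums copied from the requirements table (assumed thresholds for this sketch).
MIN_PYTHON = (3, 8)
MIN_RAM_GB = 16
MIN_VRAM_GB = 8


def ram_gb_from_meminfo(text):
    """Return total RAM in GiB parsed from the contents of /proc/meminfo."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)  # the value is in kB
    raise ValueError("MemTotal not found")


def vram_gb_from_smi(output):
    """Parse 'name, MiB' lines from nvidia-smi into (name, GiB) pairs."""
    gpus = []
    for line in output.strip().splitlines():
        name, mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mib) / 1024))
    return gpus


def main():
    print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)
    try:
        with open("/proc/meminfo") as f:
            ram = ram_gb_from_meminfo(f.read())
        # Reported RAM sits slightly under the nominal size, hence the 10% margin.
        print(f"RAM: {ram:.1f} GiB, OK: {ram >= MIN_RAM_GB * 0.9}")
    except OSError:
        print("RAM: could not read /proc/meminfo")
    if shutil.which("nvidia-smi") is None:
        print("GPU: nvidia-smi not found (driver missing or not on PATH)")
        return
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print("GPU: nvidia-smi failed:", result.stderr.strip())
        return
    for name, vram in vram_gb_from_smi(result.stdout):
        print(f"{name}: {vram:.1f} GiB VRAM, OK: {vram >= MIN_VRAM_GB * 0.9}")


if __name__ == "__main__":
    main()
```

Run it with python3 before starting the installation; a missing nvidia-smi usually means the NVIDIA driver is not installed yet.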

Step-by-Step Installation

Step 1: Update Your System

To ensure all packages are up-to-date, run the following command in your terminal:

sudo apt update && sudo apt upgrade -y

Step 2: Install Required Packages

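Install the essential dependencies: Git, Python, pip, and the venv module (needed if you isolate the Python packages in a virtual environment, as the troubleshooting section recommends):

sudo apt install git python3 python3-pip python3-venv -y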

Step 3: Install CUDA and cuDNN (For NVIDIA GPUs)

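To make full use of your GPU, install CUDA and cuDNN:

  1. Install CUDA:
    • Follow NVIDIA's CUDA Toolkit installation instructions for your Ubuntu release.
  2. Install cuDNN:
    • Download it from the NVIDIA Developer website.
    • Extract the archive and copy the headers to /usr/local/cuda/include and the libraries to /usr/local/cuda/lib64.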
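
Once both are installed, it helps to confirm which versions are actually present. The following is a minimal sketch, assuming the toolkit lives under /usr/local/cuda as in this guide and that cuDNN was copied there from the archive (packages installed through apt place headers elsewhere). It parses the release number from nvcc --version and the version macros from the cuDNN header; the function names are made up for the example.

```python
import re
import shutil
import subprocess
from pathlib import Path


def cuda_release(nvcc_output):
    """Extract the 'release X.Y' version from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cudnn_version(header_text):
    """Read CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL from a cuDNN header."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{key}\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def main(cuda_home="/usr/local/cuda"):
    nvcc = shutil.which("nvcc") or str(Path(cuda_home, "bin", "nvcc"))
    try:
        out = subprocess.run([nvcc, "--version"],
                             capture_output=True, text=True).stdout
        print("CUDA release:", cuda_release(out))
    except OSError:
        print("nvcc not found; is the CUDA Toolkit installed?")
    # cuDNN 8 and later keep the version in cudnn_version.h; older ones use cudnn.h.
    for name in ("cudnn_version.h", "cudnn.h"):
        header = Path(cuda_home, "include", name)
        if header.is_file():
            version = cudnn_version(header.read_text(errors="ignore"))
            if version:
                print("cuDNN version:", version)
                return
    print("cuDNN headers not found under", cuda_home)


if __name__ == "__main__":
    main()
```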

Step 4: Clone the Mochi Repository

Clone the Mochi GitHub repository and navigate into it:

git clone https://github.com/genmoai/mochi.git
cd mochi

Step 5: Install Python Dependencies

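Inside the mochi directory, install the required Python packages, ideally inside a virtual environment so they cannot conflict with system packages. If the version you cloned does not ship a requirements.txt, use the install command from its README instead: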

pip install -r requirements.txt
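
After the install finishes, a quick check can confirm that the packages landed in the environment you expect and that the GPU is visible to Python. This sketch assumes PyTorch is among the installed dependencies (Mochi is a PyTorch model, but the exact package list is defined by the repository, not by this example); the function names are illustrative.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def torch_status():
    """Describe whether PyTorch is importable and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check also works without PyTorch

    if not torch.cuda.is_available():
        return f"torch {torch.__version__} installed, but no CUDA device visible"
    return f"torch {torch.__version__} sees {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print("virtual environment:", in_virtualenv())
    print(torch_status())
```

If the first line prints False, the packages went into the system or user site-packages, which is the usual source of the dependency conflicts mentioned in the troubleshooting section.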

Step 6: Configure Environment Variables

Set up CUDA environment variables by adding the following lines to your ~/.bashrc file:

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

Apply the changes:

source ~/.bashrc
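
To confirm that the new shell picked up both variables, the entries can be checked programmatically. The snippet below is a small sketch that assumes the /usr/local/cuda prefix used above; has_entry and check_cuda_env are names invented for this example.

```python
import os


def has_entry(path_value, entry):
    """Check whether a colon-separated path variable contains `entry`."""
    wanted = os.path.normpath(entry)
    return any(os.path.normpath(p) == wanted
               for p in path_value.split(":") if p)


def check_cuda_env(env=os.environ, cuda_home="/usr/local/cuda"):
    """Map each variable from this step to whether it holds the CUDA entry."""
    return {
        "PATH": has_entry(env.get("PATH", ""), cuda_home + "/bin"),
        "LD_LIBRARY_PATH": has_entry(env.get("LD_LIBRARY_PATH", ""),
                                     cuda_home + "/lib64"),
    }


if __name__ == "__main__":
    for name, ok in check_cuda_env().items():
        print(f"{name}: {'ok' if ok else 'missing CUDA entry'}")
```

A "missing CUDA entry" result after editing ~/.bashrc usually means the file was not re-sourced in the current terminal.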

Step 7: Run Mochi

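To start video generation, run the script with a text prompt. The script name and flags used in this guide may differ in the version you cloned, so check the repository README (or the script's --help output, if it provides one) before relying on them: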

python run_mochi.py --text "Your text prompt here"

Running Your First Video Generation

Generate a video from a text prompt:

python run_mochi.py --text "A futuristic cityscape at sunset" --output video.mp4

Example output: a clip of roughly 5 seconds at 24 FPS, saved as video.mp4. The actual duration and frame rate depend on the model version and its settings.
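
The generated file can be inspected to see what was actually produced. The sketch below assumes ffprobe (shipped with FFmpeg, which this guide does not otherwise install) is on the PATH; it reads the frame rate and duration of the first video stream and estimates the frame count, so a 5-second clip at 24 FPS should report about 120 frames. The script name check_video.py and the function names are hypothetical.

```python
import json
import subprocess
import sys
from fractions import Fraction


def summarize_stream(ffprobe_json):
    """Return (fps, duration_seconds) for the first video stream."""
    stream = json.loads(ffprobe_json)["streams"][0]
    fps = float(Fraction(stream["r_frame_rate"]))  # e.g. "24/1" becomes 24.0
    return fps, float(stream["duration"])


def probe(path):
    """Run ffprobe on `path` and summarize its first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate,duration",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return summarize_stream(result.stdout)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python check_video.py VIDEO")
    else:
        fps, seconds = probe(sys.argv[1])
        frames = round(fps * seconds)
        print(f"{fps:g} FPS, {seconds:.2f} s, about {frames} frames")
```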

Optimizing Performance

GPU Memory Management

  • Reduce Batch Size: For 8 GB VRAM, use --batch_size 1.
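  • Adjust Resolution: Lower --height and --width (default: 512x512). Memory use scales with the pixel count per frame, so halving both dimensions leaves a quarter of the pixels.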
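
The two tips above can be combined into a simple fallback loop that retries at a lower resolution when a run fails. This is only a sketch: it uses the script name and flags exactly as they appear in this guide, the resolution ladder is an arbitrary choice, and it retries on any non-zero exit code instead of detecting out-of-memory errors specifically.

```python
import subprocess
import sys

# Candidate (height, width) pairs, largest first. These values are illustrative.
RESOLUTIONS = [(512, 512), (448, 448), (384, 384), (320, 320)]


def build_command(prompt, height, width, batch_size=1):
    """Assemble the argv for one run, using the flags named in this guide."""
    return [
        sys.executable, "run_mochi.py",
        "--text", prompt,
        "--height", str(height),
        "--width", str(width),
        "--batch_size", str(batch_size),
    ]


def run_with_fallback(prompt, runner=subprocess.run):
    """Try each resolution in turn and return the first one that succeeds."""
    for height, width in RESOLUTIONS:
        result = runner(build_command(prompt, height, width))
        if result.returncode == 0:
            return height, width
    return None  # every attempt failed


if __name__ == "__main__" and len(sys.argv) > 1:
    print("succeeded at:", run_with_fallback(sys.argv[1]))
```

The runner parameter exists so the loop can be exercised without a GPU by passing a stand-in function.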

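Mixed Precision Inference

Add --fp16 to run inference in 16-bit precision, which roughly halves the memory needed for weights and activations compared with 32-bit and is usually faster on recent NVIDIA GPUs:

python run_mochi.py --text "A forest waterfall" --fp16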

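Monitor GPU Usage

Watch memory and utilization in a second terminal while a generation runs; the display refreshes every second:

watch -n 1 nvidia-smi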
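
When the peak matters more than the live view, for example to decide whether a higher resolution would still fit, the same data can be sampled and reduced to a running maximum. The sketch below assumes a single GPU (with several GPUs nvidia-smi emits one line per device per sample) and uses a hypothetical script name, gpu_peak.py; the query flags are standard nvidia-smi options.

```python
import itertools
import subprocess
import sys

# -l 1 makes nvidia-smi print a fresh sample every second.
QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits", "-l", "1"]


def parse_sample(line):
    """Turn one 'used, total' line (values in MiB) into a pair of integers."""
    used, total = (int(field) for field in line.split(","))
    return used, total


def track_peak(lines):
    """Yield (running peak of used memory, total memory) for each sample."""
    peak = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between samples
        used, total = parse_sample(line)
        peak = max(peak, used)
        yield peak, total


def main(seconds):
    """Sample once per second for `seconds` samples, printing the peak so far."""
    proc = subprocess.Popen(QUERY, stdout=subprocess.PIPE, text=True)
    try:
        for peak, total in itertools.islice(track_peak(proc.stdout), seconds):
            print(f"peak {peak} / {total} MiB")
    finally:
        proc.terminate()


if __name__ == "__main__" and len(sys.argv) == 2:
    main(int(sys.argv[1]))
```

Start it with a duration in seconds (for instance python gpu_peak.py 120) just before launching a generation in another terminal.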

Troubleshooting Common Issues

Issue | Solution
CUDA Out of Memory | Reduce --batch_size or resolution.
Dependency Conflicts | Use a virtual environment.
cuDNN Not Detected | Verify CUDA/cuDNN paths in ~/.bashrc.

Advanced Features

1. Custom Checkpoints

Download community-trained models from the Mochi Community Hub and load them by passing --checkpoint alongside your usual options:

python run_mochi.py --checkpoint custom_model.ckpt
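
Checkpoint files are large and come from third parties, so it is worth verifying a download before loading it. The sketch below assumes the publisher provides a SHA-256 checksum next to the file, which this guide does not guarantee; verify_checkpoint.py and the function names are hypothetical.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large checkpoints need little RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA-256 with a published checksum, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    ok = matches(sys.argv[1], sys.argv[2])
    print("checksum ok" if ok else "checksum MISMATCH")
    sys.exit(0 if ok else 1)
```

Usage: python verify_checkpoint.py custom_model.ckpt followed by the published checksum. A mismatch means the download is incomplete or the file differs from the one the publisher hashed.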

2. ComfyUI Integration

Use the visual interface for workflow management:

  1. Install ComfyUI.
  2. Import Mochi 1 nodes for drag-and-drop video generation.

Conclusion

With Mochi 1 running on Ubuntu, you can generate videos from text prompts on your own hardware. This guide covered setting up the environment, tuning performance for limited GPU memory, and using custom checkpoints and ComfyUI. Watch the Mochi GitHub repository for new releases and updated instructions.