Run SpatialLM on Windows: Step by Step Installation Guide

SpatialLM is a large language model designed for spatial reasoning and 3D scene understanding. It processes 3D point cloud data generated from videos or other sources and outputs structured representations, such as architectural layouts or object mappings.

Running SpatialLM on Windows requires specific configurations and tools. This guide provides a step-by-step walkthrough to install, configure, and use SpatialLM on a Windows system efficiently.

Overview of SpatialLM

SpatialLM combines SLAM (Simultaneous Localization and Mapping) with large language models to generate spatially coherent 3D maps. It has applications in architecture, robotics, interior design, autonomous navigation, and human-computer interaction.

Key Features:

  • Input Sources: Monocular video sequences, RGBD images, LiDAR sensors.
  • Outputs: Structured 3D layouts with semantic categories (walls, doors, furniture).
  • Models: Two versions available:
    • SpatialLM-Llama (1 billion parameters).
    • SpatialLM-Qwen (0.5 billion parameters).

Prerequisites

Before installing SpatialLM, ensure your system meets the following requirements:

Hardware Requirements

  • NVIDIA GPU with CUDA support.
  • At least 8 GB of RAM.
  • Sufficient disk space for model files and dependencies (~20 GB recommended).

Software Requirements

  • Windows 11 (preferred for compatibility with WSL2).
  • Python 3.11.
  • CUDA Toolkit (Version 12.4).
  • Conda for environment management.
  • Poetry for dependency installation.
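
The software requirements above can be checked before starting. The following is a minimal sketch using only the Python standard library; the executable names in REQUIRED_TOOLS are assumptions about how the tools appear on PATH, and Poetry is left out because it is installed later in this guide.

```python
import shutil
import sys

# Usual executable names for the tools this guide relies on.
# These names are an assumption; your installation may expose them differently.
REQUIRED_TOOLS = ["git", "conda", "nvidia-smi"]


def python_ok(version_info=sys.version_info, required=(3, 11)):
    """Return True when the interpreter's major.minor equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if python_ok():
        print("Python 3.11: ok")
    else:
        print("Python 3.11 expected, found %d.%d" % tuple(sys.version_info[:2]))
    for tool in missing_tools():
        print("Not found on PATH:", tool)
```

Run it from the same terminal you plan to use for the installation, since PATH can differ between PowerShell and a WSL shell.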

Setting Up Windows for Machine Learning

  1. Enable Hardware Virtualization:
    • Check virtualization status in Task Manager (Performance tab, CPU, "Virtualization" field).
    • Enable it through BIOS settings if disabled.
  2. Install Windows Subsystem for Linux (WSL):
    • Open PowerShell as Administrator.
    • Run wsl --install to install WSL2 and Ubuntu LTS.
  3. Update PowerShell:
    • Download the latest version from the official Microsoft website.
    • Install it and set it as the default terminal application.
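
After WSL is installed, it can be useful to confirm that a given terminal is actually running inside it. The sketch below relies on a common heuristic, not an official API: WSL kernels report "microsoft" in /proc/version. The function names are illustrative.

```python
def is_wsl(kernel_version_text):
    """Return True when a /proc/version string looks like a WSL kernel.

    Heuristic: WSL kernels include the word "microsoft" in this string.
    """
    return "microsoft" in kernel_version_text.lower()


def read_kernel_version(path="/proc/version"):
    """Read the kernel version string, or return '' when the file is absent."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # On native Windows there is no /proc/version at all.
        return ""


if __name__ == "__main__":
    print("Running inside WSL:", is_wsl(read_kernel_version()))
```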

Installation Steps

Follow these steps to install SpatialLM on your Windows machine:

Step 1: Clone the Repository

Open a terminal (the Ubuntu shell under WSL, if you installed it in the previous section) and execute:

git clone https://github.com/manycore-research/SpatialLM.git
cd SpatialLM

Step 2: Create a Conda Environment

Set up a Conda environment with CUDA support:

conda create -n spatiallm python=3.11
conda activate spatiallm
conda install -y nvidia/label/cuda-12.4.0::cuda-toolkit conda-forge::sparsehash
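
To confirm that the environment now exposes the expected CUDA toolkit, one option is to parse the output of nvcc --version. This is a minimal sketch, assuming nvcc is on PATH inside the activated environment and prints its usual "release X.Y" line; the helper names are illustrative.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_toolkit_version():
    """Run `nvcc --version`; return (major, minor), or None if unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # nvcc is not installed or not on PATH.
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit found:", installed_toolkit_version())
    print("This guide expects: (12, 4)")
```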

Step 3: Install Dependencies Using Poetry

Install Poetry and dependencies:

pip install poetry
poetry config virtualenvs.create false --local
poetry install
poe install-torchsparse

Note: Building the wheel for torchsparse might take some time.
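
Once the build finishes, a quick import check can catch a failed installation before you attempt inference. The sketch below only tests whether the modules can be located; it does not import them. The list in EXPECTED_MODULES is an assumption about which packages the steps above provide.

```python
import importlib.util

# Modules assumed to be installed by the steps above (adjust as needed).
EXPECTED_MODULES = ["torch", "torchsparse"]


def missing_modules(names=EXPECTED_MODULES):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```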

Running Inference

Once installed, you can run inference using preprocessed point clouds or your own video data.

Step 1: Download Example Point Cloud Data

Use the Hugging Face CLI to download sample data:

huggingface-cli download manycore-research/SpatialLM-Testset pcd/scene0000_00.ply --repo-type dataset --local-dir .
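
A truncated or failed download is easy to miss. As a sanity check, the header of the downloaded file can be read to confirm it is a PLY file and to see how many points it declares. This is a minimal sketch of a PLY header reader based on the general PLY layout (a "ply" magic line, "format" and "element vertex" lines, then "end_header"); it does not read the point data itself, and the function names are illustrative.

```python
def parse_ply_header(lines):
    """Return (format_name, vertex_count) from an iterable of header lines (bytes)."""
    lines = iter(lines)
    if next(lines, b"").strip() != b"ply":
        raise ValueError("not a PLY file: missing 'ply' magic line")
    fmt, vertices = None, None
    for raw in lines:
        tokens = raw.decode("ascii", errors="replace").split()
        if not tokens:
            continue
        if tokens[0] == "format" and len(tokens) >= 2:
            fmt = tokens[1]
        elif tokens[:2] == ["element", "vertex"] and len(tokens) >= 3:
            vertices = int(tokens[2])
        elif tokens[0] == "end_header":
            break  # everything after this line is point data
    return fmt, vertices


def read_ply_header(path):
    """Open `path` in binary mode and parse its PLY header."""
    with open(path, "rb") as handle:
        return parse_ply_header(handle)


# Example call, once the download has completed:
#   read_ply_header("pcd/scene0000_00.ply")
```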

Step 2: Execute Inference Script

Run the inference script to process the point cloud:

python inference.py --point_cloud pcd/scene0000_00.ply --output scene0000_00.txt --model_path manycore-research/SpatialLM-Llama-1B
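
The output file can be inspected with a few lines of Python before visualizing it. The sketch below assumes each line of the layout file has the shape name=Type(arg1,arg2,...), for example a wall_0=Wall(...) entry; that format is an assumption, so check it against your own scene0000_00.txt and adjust the pattern if it differs. The sample values in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape: name=Type(arg1,arg2,...). Lines that do not match are skipped.
ENTITY_LINE = re.compile(r"^\s*(\w+)\s*=\s*(\w+)\((.*)\)\s*$")


def parse_layout(text):
    """Return a list of (name, type, [args as strings]) tuples."""
    entities = []
    for line in text.splitlines():
        match = ENTITY_LINE.match(line)
        if match:
            name, kind, args = match.groups()
            entities.append((name, kind, [arg.strip() for arg in args.split(",")]))
    return entities


def count_by_type(entities):
    """Count how many entities of each type the layout contains."""
    return Counter(kind for _, kind, _ in entities)


if __name__ == "__main__":
    # Hypothetical sample in the assumed format, not real model output.
    sample = "wall_0=Wall(0.0,0.0,0.0,4.0,0.0,0.0,2.8,0.0)\ndoor_0=Door(wall_0,2.0,0.0,1.0,0.9,2.0)"
    print(count_by_type(parse_layout(sample)))
```

To use it on real output, read the file first, for example parse_layout(open("scene0000_00.txt").read()).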

Step 3: Visualize Results

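Use rerun to visualize the point cloud together with the predicted layout. rerun opens .rrd recordings rather than the layout text file, so first convert the layout with the repository's visualize.py script:

# Convert the predicted layout to Rerun format
python visualize.py --point_cloud pcd/scene0000_00.ply --layout scene0000_00.txt --save scene0000_00.rrd
# Open the recording in the Rerun viewer
rerun scene0000_00.rrd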

Applications of SpatialLM

SpatialLM has diverse applications across multiple industries:

Interior Design and Architecture

  • Generate detailed floor plans from video inputs.
  • Optimize layouts based on user preferences.

Robotics

  • Enable robots to navigate complex environments using spatial reasoning.
  • Provide step-by-step navigation instructions based on mapped spaces.

Human Interaction

  • Act as an intelligent assistant for spatial queries.
  • Suggest modifications or improvements in room layouts.

Troubleshooting Tips

If you encounter issues during installation or execution:

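  1. CUDA Errors: Run nvidia-smi and confirm that the CUDA version reported by your GPU driver is at least the toolkit version (12.4); update the driver if it is older.
  2. Dependency Installation Issues: Make sure the spatiallm Conda environment is active and that poetry config virtualenvs.create false --local was run inside the repository, so that Poetry installs into that environment.
  3. Visualization Problems: Confirm that rerun is installed in the active environment and that the file you pass to it is in a format it can open.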
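
For the CUDA compatibility check in the first tip, the header printed by nvidia-smi includes a "CUDA Version" field, which is the highest CUDA version the installed driver supports. The sketch below parses that field and compares it with the toolkit version used in this guide. It is a deliberately conservative check, and the function names are illustrative.

```python
import re


def parse_driver_cuda_version(nvidia_smi_output):
    """Extract (major, minor) from the 'CUDA Version' field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_supports_toolkit(driver_cuda, toolkit=(12, 4)):
    """Return True when the driver's CUDA version is at least `toolkit`."""
    return driver_cuda is not None and tuple(driver_cuda) >= tuple(toolkit)


if __name__ == "__main__":
    # Paste the header line from your own nvidia-smi output here.
    header = "| NVIDIA-SMI 550.54.14    Driver Version: 550.54.14    CUDA Version: 12.4     |"
    version = parse_driver_cuda_version(header)
    print("Driver CUDA version:", version)
    print("Supports toolkit 12.4:", driver_supports_toolkit(version))
```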

Conclusion

Running SpatialLM on Windows provides a powerful tool for spatial understanding tasks without requiring specialized hardware beyond a CUDA-capable NVIDIA GPU. By leveraging WSL2 and open-source tools like Conda and Poetry, users can map spaces and generate structured outputs for various applications.
