Run SpatialLM on Windows: Step-by-Step Installation Guide

SpatialLM is an AI model designed for spatial reasoning and 3D scene understanding. It processes 3D point cloud data generated from videos or other sources and outputs structured representations, such as architectural layouts or object mappings.
Running SpatialLM on Windows requires specific configurations and tools. This guide provides a step-by-step walkthrough to install, configure, and use SpatialLM on a Windows system efficiently.
Overview of SpatialLM
SpatialLM combines SLAM (Simultaneous Localization and Mapping) with large language models to generate spatially coherent 3D maps. It has applications in architecture, robotics, interior design, autonomous navigation, and human-computer interaction.
Key Features:
- Input Sources: Monocular video sequences, RGBD images, LiDAR sensors.
- Outputs: Structured 3D layouts with semantic categories (walls, doors, furniture).
- Models: Two versions available:
  - SpatialLM-Llama (1 billion parameters).
  - SpatialLM-Qwen (0.5 billion parameters).
Prerequisites
Before installing SpatialLM, ensure your system meets the following requirements:
Hardware Requirements
- NVIDIA GPU with CUDA support.
- At least 8 GB of RAM.
- Sufficient disk space for model files and dependencies (~20 GB recommended).
Software Requirements
- Windows 11 (preferred for compatibility with WSL2).
- Python 3.11.
- CUDA Toolkit (Version 12.4).
- Conda for environment management.
- Poetry for dependency installation.
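Part of the checklist above can be verified with a short script. The following is a minimal sketch using only the Python standard library. The tool names in REQUIRED_TOOLS are assumptions about what is on your PATH, and RAM is not checked because the standard library has no portable way to read it. Run it again inside the activated Conda environment to confirm the Python version there.

```python
import shutil
import sys

# Executable names assumed to be on PATH; adjust for your setup.
REQUIRED_TOOLS = ("git", "conda", "nvidia-smi")
MIN_FREE_GB = 20  # free disk space recommended by this guide


def check_prerequisites(path=".", tools=REQUIRED_TOOLS, min_free_gb=MIN_FREE_GB):
    """Return a dict mapping each check name to True (passed) or False."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    report = {
        "python_3_11": sys.version_info[:2] == (3, 11),
        "disk_space": free_gb >= min_free_gb,
    }
    for tool in tools:
        # shutil.which returns None when the executable is not found on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

A MISSING line only means the check failed from the current shell. For example, conda may be installed but not yet initialized for that shell.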
Setting Up Windows for Machine Learning
- Enable Hardware Virtualization:
  - Check virtualization status in Task Manager (Performance tab, CPU view).
  - Enable it through BIOS settings if disabled.
- Install Windows Subsystem for Linux (WSL):
  - Open PowerShell as Administrator.
  - Run the following command to install WSL2 and Ubuntu LTS:
    wsl --install
- Update PowerShell:
  - Download the latest version from the official Microsoft website.
  - Install it and set it as the default terminal application.
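Once WSL is installed, it can be useful to confirm that a given shell is really running inside WSL2 before building anything. The sketch below relies on a common heuristic rather than an official API: WSL kernels usually include "microsoft" in /proc/version, and WSL2 kernels usually include "WSL2". Custom kernels may not follow this convention, so treat the result as a hint.

```python
from pathlib import Path


def classify_kernel(version_text):
    """Classify a /proc/version string using the usual WSL naming heuristic."""
    text = version_text.lower()
    if "microsoft" not in text:
        return "not WSL"
    return "WSL2" if "wsl2" in text else "WSL (version unclear)"


def detect_wsl(proc_version=Path("/proc/version")):
    """Read the kernel version string; a missing file means we are not on Linux."""
    try:
        return classify_kernel(proc_version.read_text())
    except OSError:
        return "not WSL"


if __name__ == "__main__":
    print(detect_wsl())
```

From PowerShell, the command wsl --list --verbose reports the WSL version of each installed distribution and is the more direct check on the Windows side.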
Installation Steps
Follow these steps to install SpatialLM on your Windows machine:
Step 1: Clone the Repository
Open a terminal (the Ubuntu shell, if you are working inside WSL2) and execute:
git clone https://github.com/manycore-research/SpatialLM.git
cd SpatialLM
Step 2: Create a Conda Environment
Set up a Conda environment with CUDA support:
conda create -n spatiallm python=3.11
conda activate spatiallm
conda install -y nvidia/label/cuda-12.4.0::cuda-toolkit conda-forge::sparsehash
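To confirm that the environment picked up the intended toolkit, you can parse the output of nvcc --version. This is a small sketch, assuming nvcc is on PATH inside the activated environment and prints its usual "release X.Y" line. The helper names are illustrative and not part of SpatialLM.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from nvcc's 'release X.Y' line, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run nvcc --version if nvcc is on PATH and return the parsed release."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or version not recognized")
    else:
        print(f"CUDA toolkit {release[0]}.{release[1]}; this guide expects 12.4")
```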
Step 3: Install Dependencies Using Poetry
Install Poetry and dependencies:
pip install poetry
poetry config virtualenvs.create false --local
poetry install
poe install-torchsparse
Note: Building the wheel for torchsparse might take some time.
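After the dependencies finish installing, a quick sanity check can confirm that the packages are importable and that PyTorch can see the GPU. The sketch below assumes the installation provides packages named torch and torchsparse. It only looks them up and queries PyTorch's CUDA status, and it does not exercise SpatialLM itself.

```python
import importlib.util


def package_status(names=("torch", "torchsparse")):
    """Report which packages can be located, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def cuda_status():
    """Summarize PyTorch's view of CUDA, or note that torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch  # imported lazily so the script still runs without torch

    status = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if status["cuda_available"]:
        status["device"] = torch.cuda.get_device_name(0)
    return status


if __name__ == "__main__":
    print(package_status())
    print(cuda_status())
```

If cuda_available is False while built_for_cuda shows a version, the usual suspects are the GPU driver or, under WSL2, GPU passthrough rather than the Python packages.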
Running Inference
Once installed, you can run inference using preprocessed point clouds or your own video data.
Step 1: Download Example Point Cloud Data
Use Hugging Face CLI to download sample data:
huggingface-cli download manycore-research/SpatialLM-Testset pcd/scene0000_00.ply --repo-type dataset --local-dir .
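Before running inference, you may want to confirm that the download is a valid point cloud. PLY files begin with a plain-text header, even when the body is binary, so the header can be read with the standard library alone. The function below is an illustrative helper, not part of SpatialLM, and it makes no assumption about which properties the sample file contains.

```python
def read_ply_header(path):
    """Return the format, element counts, and property names from a PLY header."""
    header = {"format": None, "elements": {}, "properties": {}}
    current = None  # element that the following "property" lines belong to
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with the 'ply' magic line")
        for raw in f:
            parts = raw.decode("ascii", errors="replace").split()
            if not parts or parts[0] == "comment":
                continue
            if parts[0] == "end_header":
                return header  # stop before the (possibly binary) body
            if parts[0] == "format":
                header["format"] = parts[1]
            elif parts[0] == "element":
                current = parts[1]
                header["elements"][current] = int(parts[2])
                header["properties"][current] = []
            elif parts[0] == "property" and current is not None:
                # The property name is always the last token on the line.
                header["properties"][current].append(parts[-1])
    raise ValueError(f"{path} has no end_header line")
```

Calling read_ply_header("pcd/scene0000_00.ply") after the download should return a non-zero vertex count. A ValueError suggests a truncated or incorrect file.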
Step 2: Execute Inference Script
Run the inference script to process the point cloud:
python inference.py --point_cloud pcd/scene0000_00.ply --output scene0000_00.txt --model_path manycore-research/SpatialLM-Llama-1B
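If you have several point clouds, the same command can be scripted. The sketch below reuses only the flags shown above (--point_cloud, --output, --model_path) and assumes it is run from the repository root so that inference.py resolves. The folder layout and function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

MODEL = "manycore-research/SpatialLM-Llama-1B"


def build_inference_command(point_cloud, output_dir=".", model_path=MODEL):
    """Build the argument list for one inference run; output is named after the input."""
    point_cloud = Path(point_cloud)
    output = Path(output_dir) / f"{point_cloud.stem}.txt"
    return [
        sys.executable, "inference.py",
        "--point_cloud", str(point_cloud),
        "--output", str(output),
        "--model_path", model_path,
    ]


def run_folder(folder="pcd", output_dir="."):
    """Run inference on every .ply file in a folder, stopping at the first failure."""
    for ply in sorted(Path(folder).glob("*.ply")):
        subprocess.run(build_inference_command(ply, output_dir), check=True)
```

Calling run_folder("pcd") processes each file in turn. Each run is a separate process, so the model is reloaded every time, which is simple but not the fastest approach for large batches.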
Step 3: Visualize Results
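Use rerun to visualize the processed data. First convert the predicted layout to Rerun format with the repository's visualize.py script, then open the resulting file in the viewer:
python visualize.py --point_cloud pcd/scene0000_00.ply --layout scene0000_00.txt --save scene0000_00.rrd
rerun scene0000_00.rrd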
Applications of SpatialLM
SpatialLM has diverse applications across multiple industries:
Interior Design and Architecture
- Generate detailed floor plans from video inputs.
- Optimize layouts based on user preferences.
Robotics
- Enable robots to navigate complex environments using spatial reasoning.
- Provide step-by-step navigation instructions based on mapped spaces.
Human Interaction
- Act as an intelligent assistant for spatial queries.
- Suggest modifications or improvements in room layouts.
Troubleshooting Tips
If you encounter issues during installation or execution:
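- CUDA Errors: Ensure your NVIDIA driver supports the installed CUDA Toolkit version (12.4 in this guide); nvidia-smi reports the highest CUDA version the driver supports.
- Dependency Installation Issues: Verify that Poetry is configured to install into the active Conda environment (virtualenvs.create set to false, as in Step 3).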
- Visualization Problems: Check that rerun is installed properly and supports your output format.
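For the CUDA tip above, the driver side can be checked programmatically. The sketch below assumes nvidia-smi is on PATH and prints its usual "CUDA Version: X.Y" banner, which indicates the newest CUDA runtime the driver supports. The helper names are illustrative.

```python
import re
import shutil
import subprocess


def parse_driver_cuda(smi_output):
    """Extract (major, minor) from nvidia-smi's 'CUDA Version: X.Y' banner."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_supports(toolkit, smi_output):
    """True when the driver's reported CUDA version is at least the toolkit's."""
    driver = parse_driver_cuda(smi_output)
    return driver is not None and driver >= toolkit


if __name__ == "__main__":
    smi = shutil.which("nvidia-smi")
    if smi is None:
        print("nvidia-smi not found; is the NVIDIA driver installed?")
    else:
        out = subprocess.run([smi], capture_output=True, text=True, check=False).stdout
        print("driver reports CUDA:", parse_driver_cuda(out))
        print("supports toolkit 12.4:", driver_supports((12, 4), out))
```

If the second line prints False, updating the Windows NVIDIA driver is usually the first thing to try.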
Conclusion
Running SpatialLM on Windows provides a powerful tool for spatial understanding tasks without requiring specialized hardware beyond a CUDA-capable NVIDIA GPU. By leveraging WSL2, CUDA-enabled GPUs, and open-source tools like Conda and Poetry, users can efficiently map spaces and generate structured outputs for various applications.