Run Nari Dia 1.6B on Mac: Installation Guide

Nari Dia 1.6B is an advanced, open-source text-to-speech (TTS) model developed by Nari Labs. With 1.6 billion parameters, it is designed to generate highly realistic, multi-speaker conversational audio.

Its open weights and code, Apache 2.0 license, and dialogue-centric features make it a compelling alternative to commercial TTS services like ElevenLabs.

What is Nari Dia 1.6B?

  • Model Size: 1.6 billion parameters, optimized for capturing intricate speech patterns.
  • Dialogue Generation: Supports scripts with multiple speakers using simple tags (e.g., [S1], [S2]).
  • Non-Verbal Communication: Can generate sounds like laughter, coughs, and throat clearing when specified in the input.
  • Audio Conditioning: Allows users to influence voice output via audio samples, enabling emotion and tone control.
  • Open Source: Released under Apache 2.0, with open weights and code available on Hugging Face.
  • Language Support: Currently, only English is supported.

Hardware and Software Requirements

A. Hardware Requirements

  • GPU Dependency: Dia 1.6B is designed for CUDA-enabled NVIDIA GPUs and needs about 10GB of GPU memory for full performance. In practice, that means a mid-range to high-end GPU (e.g., RTX 3070/4070 or better).
  • Mac Hardware: Macs, including the latest Apple Silicon models (M1, M2, M3), have no NVIDIA GPU and therefore no CUDA support, which is the main obstacle to running Dia 1.6B at full speed on a Mac.
  • CPU Support: Official CPU support is planned but not yet available. When it arrives, the model will run on CPU, but with significant performance limitations.
  • RAM: At least 16GB of system RAM is recommended, especially if running in a virtualized or emulated environment. (A quick way to check your Mac's chip and memory is shown below.)
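
If you are not sure whether your Mac meets these requirements, the following Terminal commands (standard macOS utilities, independent of Dia) report the chip and the installed memory:

```bash
# Print the CPU / Apple Silicon chip in this Mac
sysctl -n machdep.cpu.brand_string

# Print installed RAM in GB (hw.memsize reports bytes)
echo "$(($(sysctl -n hw.memsize) / 1024 / 1024 / 1024)) GB RAM"
```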

B. Software Requirements

  • Operating System: macOS (latest version recommended for compatibility and security).
  • Python: Python 3.8 or later.
  • Git: For cloning the repository.
  • uv (Recommended): A fast Python package manager (pip install uv).
  • PyTorch: The model requires PyTorch 2.0+ with CUDA 12.6 for GPU acceleration. On a Mac, PyTorch can only be installed without CUDA support, i.e., as a CPU build (see the sketch after this list).
  • Other Dependencies: Hugging Face Transformers, Gradio, and Descript Audio Codec (handled by the setup script).
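
As a minimal sketch, this is how a CPU-only PyTorch build is typically installed on macOS (run it inside the project's virtual environment once it exists; the exact versions Dia expects may differ from whatever pip resolves here):

```bash
# The default macOS wheels ship without CUDA, so this installs the CPU build
pip3 install torch torchaudio

# Confirm the installed version
python3 -c "import torch; print(torch.__version__)"
```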

Challenges of Running Dia 1.6B on Mac

A. Lack of Native CUDA Support

Apple Silicon and Intel-based Macs do not natively support NVIDIA’s CUDA, which is required for optimal Dia 1.6B performance. This means:

  • No direct GPU acceleration: Running on Mac will default to CPU, resulting in much slower inference (you can confirm this with the check below).
  • Workarounds: Advanced users may attempt to use cloud GPUs, Docker containers with GPU passthrough (on supported hardware), or emulation/virtualization, but these are complex and not officially supported.
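
To see what PyTorch can actually use on your machine, you can run the one-liner below. It checks for CUDA and for Apple's MPS backend; note that MPS showing as available does not mean Dia will use it, since the project targets CUDA, but a False for CUDA confirms that inference will fall back to CPU:

```bash
# On a Mac this typically prints "CUDA: False" and, on Apple Silicon, "MPS: True"
python3 -c "import torch; print('CUDA:', torch.cuda.is_available(), 'MPS:', torch.backends.mps.is_available())"
```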

B. Alternatives for Mac Users

  • Try Online Demo: Use the Hugging Face ZeroGPU Space for a cloud-hosted demo without local setup.
  • Wait for CPU/Quantized Versions: Nari Labs plans to release CPU-compatible and quantized versions, which will lower hardware requirements and improve accessibility for Mac users.
  • Explore Orpheus.CPP: Other TTS models like Orpheus.CPP can be run on Mac CPUs, though with different features and quality.

Step-by-Step Installation Guide

A. Preparing Your Mac

  1. Update macOS: Ensure your system is up to date for best compatibility and security.
  2. Install Homebrew (if not already installed):

     ```bash
     /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
     ```

  3. Install Python and Git:

     ```bash
     brew install python git
     ```

  4. (Optional) Install uv:

     ```bash
     pip3 install uv
     ```
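
A quick sanity check that the tools above are on your PATH (each command should print a version string rather than "command not found"):

```bash
brew --version
python3 --version
git --version
uv --version   # only if you installed uv in step 4
```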

B. Clone the Dia Repository

Open Terminal and run:

```bash
git clone https://github.com/nari-labs/dia.git
cd dia
```

C. Set Up Python Environment

Using uv (Recommended):

```bash
uv run app.py
```

  • The first run will install all dependencies and download the model weights. This may take some time.

Manual Setup (Alternative):

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
```

D. Running the Application

  • The application will launch a Gradio web interface, allowing you to enter text, select speakers, and generate audio (see the note below on opening it in your browser).
  • On Mac, expect slower performance since inference will run on CPU.
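
Gradio prints a local URL in the Terminal when the app starts; its default is http://127.0.0.1:7860, although app.py may bind a different port, so prefer whatever address is actually printed. On macOS you can open it directly from the Terminal:

```bash
# Replace the URL with the one shown in the Gradio startup output
open http://127.0.0.1:7860
```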

Using Dia 1.6B: Features and Workflow

A. Dialogue Input Format

  • Use tags to designate speakers:

    ```text
    [S1] Hello, how are you?
    [S2] I'm fine, thank you! (laughs)
    ```

  • Non-verbal cues (e.g., (laughs), (coughs)) are supported in the script.

B. Audio Conditioning

  • Upload a short audio sample to clone a voice or set emotional tone.
  • This feature enables custom voices and expressive speech output.

C. Gradio UI

  • The Gradio interface provides fields to input text, select speaker tags, upload audio samples, and listen to generated speech.

Performance Considerations

  • CPU Inference: On Mac, generation will be significantly slower than on a CUDA-enabled GPU. Expect long wait times for audio synthesis, especially for longer scripts.
  • Resource Usage: The model is memory-intensive; ensure you have sufficient RAM and disk space (the commands below show one way to watch memory usage while generating).
  • Future Improvements: Quantized and CPU-optimized versions are expected to improve performance and accessibility for Mac users.
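
If generation stalls or macOS starts swapping heavily, it can help to watch memory while a script is synthesizing. These are standard macOS utilities, not part of Dia:

```bash
# One-line summary of system-wide memory pressure
memory_pressure | tail -n 1

# Non-interactive snapshot of the heaviest processes by memory
top -o mem -l 1 | head -n 20
```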

Troubleshooting and Tips

  • Dependency Issues: Ensure all required Python packages are installed. If issues arise, try updating pip and reinstalling the dependencies (see the sketch after this list).
  • Model Download Problems: Check your internet connection and available disk space.
  • Slow Performance: This is expected on Mac due to lack of GPU acceleration. Consider using cloud-based inference for faster results.
  • Audio Output Issues: Verify your Mac’s sound settings and output device configuration.
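
A minimal sketch of the "update pip and reinstall" step, assuming the manual virtual-environment setup from section C (if you used uv, rerunning uv run app.py will normally re-resolve dependencies instead):

```bash
# From inside the dia directory, with the virtual environment activated
source .venv/bin/activate

# Upgrade pip itself, then force-reinstall the project's dependencies
pip install --upgrade pip
pip install --force-reinstall -r requirements.txt
```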

Alternative Approaches for Mac Users

A. Use Hugging Face ZeroGPU Space

  • No installation required.
  • Try Dia 1.6B online with limited usage.

B. Wait for CPU/Quantized Support

  • Monitor Nari Labs’ announcements for updates on CPU-compatible releases.

C. Explore Other TTS Options

  • Orpheus.CPP: Can run on Mac CPU, supports text-to-speech with less hardware demand, though with different feature sets.
  • Other Open-Source TTS Models: Research alternatives that are optimized for CPU or Apple Silicon.

Conclusion

Running Nari Dia 1.6B on a Mac is possible, but with significant performance limitations due to the lack of native CUDA GPU support. The model’s open-source nature and advanced dialogue capabilities make it an exciting tool for developers, researchers, and hobbyists interested in TTS and voice cloning.

For the best experience, use a CUDA-enabled GPU on a supported system, or leverage cloud-based demos until Mac-optimized versions are available.
