How to Install Void AI and Connect It to Local Models (Ollama & LM Studio)
Learn how to install Void AI, the open-source Cursor alternative, and run it with local models via Ollama or LM Studio — with zero cloud dependencies.
Void AI is a free, open-source AI code editor built on VS Code that lets you connect any language model — local or cloud — directly to your development environment. Unlike Cursor or GitHub Copilot, Void never routes your code through a third-party server unless you explicitly choose a cloud provider. This guide covers installing Void AI with local models on Windows, macOS, and Linux, and connecting it to both Ollama and LM Studio — so you have a single reference regardless of your platform or preferred local inference backend.
What Is Void AI?
Void is a fork of VS Code maintained by an open-source community and backed by Y Combinator. It ships all the familiar VS Code features — extensions, themes, keybindings, settings sync — and adds a first-class AI layer that connects directly to the model provider of your choice.
Where it differs from Cursor:
- Fully open-source under the MIT license — you can inspect every line of code that handles your prompts
- No mandatory cloud account — run entirely offline with a local model
- Bring your own model — Ollama, LM Studio, OpenAI, Anthropic, Gemini, and any OpenAI-compatible endpoint are all first-class options
- Checkpoints — every AI-driven code change creates a snapshot you can review and roll back
For a detailed feature comparison with Cursor, see our Cursor AI vs Void AI comparison.
Why Use Local Models with Void AI?
Running local models inside Void AI gives you four things a cloud-only setup cannot:
- Privacy — your code never leaves your machine. Critical for proprietary codebases, client work, or security-sensitive projects.
- Zero per-token cost — once the model is downloaded, inference is free regardless of how many completions you request.
- Offline operation — works on a plane, in a restricted network, or anywhere with no internet.
- Model freedom — swap between Llama 4, Qwen 3, Mistral, DeepSeek, Devstral, and others without changing subscriptions or waiting for a provider to support a new model.
The trade-off is hardware: local inference is CPU- or GPU-bound on your machine. For most coding tasks on a modern laptop, a 7B–14B parameter quantised model is fast enough for inline edits and chat. Autocomplete benefits from a smaller, faster model in the 1.5B–3B range.
Installing Void AI on Windows, macOS, and Linux
Void distributes signed installers for all major platforms.
Step 1 — Download Void
Go to voideditor.com and click the download button for your operating system. Void is available as:
- Windows: .exe installer (x64 and ARM64)
- macOS: .dmg for Apple Silicon and Intel
- Linux: .deb, .rpm, and .AppImage
Step 2 — Install
Windows: Run the .exe installer and follow the setup wizard. Void adds itself to the PATH so you can launch it from a terminal with the void command.
macOS: Open the .dmg and drag Void into your Applications folder. On first launch, macOS may warn about an unidentified developer — right-click the app and choose Open to bypass this once.
Linux (Debian/Ubuntu):
sudo dpkg -i void-editor_*.deb
sudo apt-get install -f  # resolve any missing dependencies
Linux (AppImage):
chmod +x Void-*.AppImage
./Void-*.AppImage
Step 3 — First Launch Onboarding
On first launch, Void presents a full-screen onboarding wizard. You must configure at least one AI provider before the editor opens. If you plan to use a local model, you can skip cloud provider sign-up and configure Ollama or LM Studio directly here — which is exactly what the next two sections cover.
If you already use VS Code, you can import your settings, extensions, and keybindings in the same onboarding screen in one click.
Connecting Void AI to Local Models via Ollama
Ollama is the simplest way to run local models with Void AI. Void detects a running Ollama instance automatically at http://127.0.0.1:11434.
Step 1 — Install Ollama
Download Ollama from ollama.com. On macOS, Ollama runs as a menu bar app after installation. On Linux:
curl -fsSL https://ollama.com/install.sh | sh
On Windows, run the .exe installer — Ollama starts as a background service automatically.
Step 2 — Pull a model
Open a terminal and pull at least one coding model:
# Capable coding model for chat and inline edits
ollama pull qwen2.5-coder:7b
# Fast model for autocomplete (lower latency)
ollama pull qwen2.5-coder:1.5b
Verify both are available with ollama list.
Step 3 — Configure Ollama in Void
- Open Void and press Ctrl+Shift+P (Cmd+Shift+P on macOS), then search "Void Settings"
- Under AI Providers, click Add Provider and select Ollama
- The endpoint defaults to http://127.0.0.1:11434 — leave it unchanged for local Ollama
- Click Refresh Models — Void queries Ollama and lists all pulled models
- Assign your 7B model to Chat and the 1.5B model to Autocomplete
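If Refresh Models comes back empty, it helps to query the same endpoint Void talks to and see what Ollama itself reports. Here is a small sketch using only Python's standard library and Ollama's /api/tags endpoint; the function names are illustrative, not part of Void or Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # the same default endpoint Void queries


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama instance which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))

# With Ollama running, list_local_models() should return the models you
# pulled earlier, e.g. ['qwen2.5-coder:7b', 'qwen2.5-coder:1.5b'].
```

If this script returns your models but Void still lists none, the problem is in Void's provider configuration rather than Ollama.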
For platform-specific Ollama setup details, see our dedicated guides for Windows, macOS, and Ubuntu.
Connecting Void AI to Local Models via LM Studio
LM Studio is a GUI application for downloading and running GGUF models locally. It exposes an OpenAI-compatible REST API that Void connects to as a generic provider.
Step 1 — Install and configure LM Studio
Download LM Studio from lmstudio.ai. Once open:
- Search for a model in the Discover tab (e.g., Qwen2.5-Coder-7B-Instruct)
- Download your chosen model
- Go to the Local Server tab
- Select your downloaded model and click Start Server
- LM Studio serves at http://localhost:1234 by default
Step 2 — Add LM Studio as an OpenAI-compatible provider in Void
- Open Void Settings → AI Providers → Add Provider
- Choose OpenAI Compatible (not "OpenAI")
- Set Base URL to http://localhost:1234/v1
- Leave the API Key field blank — LM Studio requires no auth on localhost
- Enter the Model ID exactly as shown in LM Studio's server panel
- Click Save, then test with a quick chat message
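Because LM Studio speaks the OpenAI chat-completions protocol, you can also sanity-check the endpoint outside Void before wiring it up. A minimal standard-library sketch — the model ID below is a placeholder you would replace with whatever LM Studio's server panel actually shows:

```python
import json
import urllib.request

LMSTUDIO_URL = "http://localhost:1234/v1"  # note the trailing /v1


def chat_payload(model_id: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def ask(model_id: str, prompt: str, base_url: str = LMSTUDIO_URL) -> str:
    """POST to the /chat/completions endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(model_id, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# With the server running: ask("qwen2.5-coder-7b-instruct", "Write a hello world in Go")
```

If this call succeeds but Void's test chat fails, compare the model ID and base URL in Void character-for-character against what worked here.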
Using Void AI's Core Features
Once a local model is connected, Void exposes three primary AI interaction modes plus its signature checkpoint system.
Tab Autocomplete
As you type, Void sends surrounding context to your local model and suggests completions inline. Press Tab to accept the suggestion or keep typing to dismiss it. For best latency, assign a 1.5B–3B model specifically to autocomplete — this keeps response times under 300ms on most hardware while reserving your larger model for chat and edits.
Inline Edit — Ctrl+K / Cmd+K
Select any code block, press Ctrl+K (Windows/Linux) or Cmd+K (macOS), and type a natural-language instruction:
Refactor this function to use async/await instead of callbacks
Void submits the selection and your instruction to the model and shows a diff of the proposed change. You can accept, reject, or modify the suggestion before it is applied.
AI Chat — Ctrl+L / Cmd+L
Press Ctrl+L to open the chat panel. Drag files into the chat context, use @filename to reference specific files, or use @folder to include entire directories. Chat is ideal for asking questions about code structure, debugging assistance, or generating new functions from a description.
Checkpoints
Every time Void applies an AI edit via Ctrl+K or chat, it records a checkpoint — a named snapshot with a full diff. The checkpoint panel in the sidebar shows a timeline of AI-driven changes. You can revert a specific checkpoint without undoing subsequent manual edits, which is significantly more useful than Ctrl+Z when rolling back a bad AI suggestion made ten edits ago.
Which Local Model Should You Use with Void AI?
The right model depends on your hardware and primary use case. A practical two-model setup works well: a small fast model for autocomplete and a larger model for chat and inline edits.
- Autocomplete (low latency): Qwen2.5-Coder 1.5B Q8 — needs approximately 2 GB VRAM or 4 GB RAM
- Chat and inline edit on a mid-range GPU (6 GB VRAM): Qwen2.5-Coder 7B Q4
- Chat and inline edit on a high-end GPU (10 GB VRAM): Qwen2.5-Coder 14B Q4
- CPU-only, no GPU: Qwen2.5-Coder 3B Q4 — needs approximately 8 GB RAM
- Agentic and multi-file refactor (12 GB VRAM): DeepSeek-Coder-V2 Lite Q4
All of the models above are available via ollama pull or as GGUF downloads for LM Studio.
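A rough way to check whether a model fits your hardware: multiply the parameter count by the bytes per weight for the quantisation level, then add headroom for the KV cache and runtime. The sketch below uses a flat 20% overhead factor, which is an assumption for estimation only, not a measured value:

```python
# Approximate bytes per weight at common quantisation levels
BYTES_PER_PARAM = {"Q4": 0.5, "Q8": 1.0, "FP16": 2.0}


def est_memory_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    """Rough memory footprint in GB: weights plus ~20% assumed KV-cache/runtime overhead."""
    weights_gb = params_billions * BYTES_PER_PARAM[quant]
    return round(weights_gb * overhead, 1)

# Consistent with the sizing guidance above:
# est_memory_gb(1.5, "Q8") -> 1.8  (matches the ~2 GB autocomplete figure)
# est_memory_gb(7, "Q4")   -> 4.2  (fits a 6 GB GPU with room for context)
# est_memory_gb(14, "Q4")  -> 8.4  (fits a 10 GB GPU)
```

Treat the result as a floor, not a ceiling — long context windows grow the KV cache well beyond this estimate.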
Troubleshooting Common Void AI Issues
Ollama models not appearing in Void
Verify Ollama is running: open a terminal and run ollama list. If it hangs or errors, restart Ollama. Then in Void AI Settings, navigate to the Ollama provider and click Refresh Models. On Linux with an AppImage install, check that the OLLAMA_HOST environment variable is not set to a non-default value.
Autocomplete feels too slow
Switch to a smaller model for autocomplete. Qwen2.5-Coder 1.5B responds in under 200ms on most hardware. If you are on CPU-only hardware and find even small models too slow, disable autocomplete entirely in Void Settings under AI and rely on Ctrl+K inline edits for on-demand suggestions instead.
LM Studio endpoint returns 404 or connection refused
Confirm LM Studio's server is running — the server tab should show a green Running badge. Check that the base URL in Void is http://localhost:1234/v1 with the trailing /v1 included. Omitting it causes 404 errors. Also confirm the model ID in Void matches the exact model ID shown in LM Studio's server panel.
VS Code extensions not available in Void
Void uses the Open VSX Registry instead of the Microsoft Extension Marketplace due to licensing constraints. Most popular extensions — ESLint, Prettier, GitLens, language packs — are available on Open VSX. If a specific extension is missing, download the .vsix file from the VS Code Marketplace and install it via Extensions: Install from VSIX.
Tip: If migrating from VS Code, use Void's built-in migration tool during onboarding. It imports your extensions list and attempts to resolve each one from Open VSX automatically.