How to Install Void AI and Connect It to Local Models (Ollama & LM Studio)
Learn how to install Void AI, the open-source Cursor alternative, and run it with local models via Ollama or LM Studio — with zero cloud dependencies.
Void AI is a free, open-source AI code editor built on VS Code that lets you connect any language model — local or cloud — directly to your development environment. Unlike Cursor or GitHub Copilot, Void never routes your code through a third-party server unless you explicitly choose a cloud provider. This guide covers installing Void AI with local models on Windows, macOS, and Linux, and connecting it to both Ollama and LM Studio — so you have a single reference regardless of your platform or preferred local inference backend.
What Is Void AI?
Void is a fork of VS Code maintained by an open-source community and backed by Y Combinator. It ships all the familiar VS Code features — extensions, themes, keybindings, settings sync — and adds a first-class AI layer that connects directly to the model provider of your choice.
Where it differs from Cursor:
- Fully open-source under the MIT license — you can inspect every line of code that handles your prompts
- No mandatory cloud account — run entirely offline with a local model
- Bring your own model — Ollama, LM Studio, OpenAI, Anthropic, Gemini, and any OpenAI-compatible endpoint are all first-class options
- Checkpoints — every AI-driven code change creates a snapshot you can review and roll back
For a detailed feature comparison with Cursor, see our Cursor AI vs Void AI comparison.
Why Use Local Models with Void AI?
Running local models inside Void AI gives you four things a cloud-only setup cannot:
- Privacy — your code never leaves your machine. Critical for proprietary codebases, client work, or security-sensitive projects.
- Zero per-token cost — once the model is downloaded, inference is free regardless of how many completions you request.
- Offline operation — works on a plane, in a restricted network, or anywhere with no internet.
- Model freedom — swap between Llama 4, Qwen 3, Mistral, DeepSeek, Devstral, and others without changing subscriptions or waiting for a provider to support a new model.
The trade-off is hardware: local inference is CPU- or GPU-bound on your machine. For most coding tasks on a modern laptop, a 7B–14B parameter quantised model is fast enough for inline edits and chat. Autocomplete benefits from a smaller, faster model in the 1.5B–3B range.
Installing Void AI on Windows, macOS, and Linux
Void distributes signed installers for all major platforms.
Step 1 — Download Void
Go to voideditor.com and click the download button for your operating system. Void is available as:
- Windows: .exe installer (x64 and ARM64)
- macOS: .dmg for Apple Silicon and Intel
- Linux: .deb, .rpm, and .AppImage
Step 2 — Install
Windows: Run the .exe installer and follow the setup wizard. Void adds itself to the PATH so you can launch it from a terminal with the void command.
macOS: Open the .dmg and drag Void into your Applications folder. On first launch, macOS may warn about an unidentified developer — right-click the app and choose Open to bypass this once.
Linux (Debian/Ubuntu):
sudo dpkg -i void-editor_*.deb
sudo apt-get install -f  # resolve any missing dependencies
Linux (AppImage):
chmod +x Void-*.AppImage
./Void-*.AppImage
Step 3 — First Launch Onboarding
On first launch, Void presents a full-screen onboarding wizard. You must configure at least one AI provider before the editor opens. If you plan to use a local model, you can skip cloud provider sign-up and configure Ollama or LM Studio directly here — which is exactly what the next two sections cover.
If you already use VS Code, you can import your settings, extensions, and keybindings in the same onboarding screen in one click.
Connecting Void AI to Local Models via Ollama
Ollama is the simplest way to run local models with Void AI. Void detects a running Ollama instance automatically at http://127.0.0.1:11434.
Step 1 — Install Ollama
Download Ollama from ollama.com. On macOS, Ollama runs as a menu bar app after installation. On Linux:
curl -fsSL https://ollama.com/install.sh | sh
On Windows, run the .exe installer — Ollama starts as a background service automatically.
Step 2 — Pull a model
Open a terminal and pull at least one coding model:
# Capable coding model for chat and inline edits
ollama pull qwen2.5-coder:7b
# Fast model for autocomplete (lower latency)
ollama pull qwen2.5-coder:1.5b
Verify both are available with ollama list.
Step 3 — Configure Ollama in Void
- Open Void and press Ctrl+Shift+P (Cmd+Shift+P on macOS), then search "Void Settings"
- Under AI Providers, click Add Provider and select Ollama
- The endpoint defaults to http://127.0.0.1:11434 — leave it unchanged for local Ollama
- Click Refresh Models — Void queries Ollama and lists all pulled models
- Assign your 7B model to Chat and the 1.5B model to Autocomplete
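If Refresh Models comes back empty, it helps to query the same endpoint Void talks to and see what Ollama itself reports. Here is a small sketch using only Python's standard library and Ollama's /api/tags endpoint; the function names are illustrative, not part of Void or Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # the same default endpoint Void queries


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama instance which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))

# With Ollama running, list_local_models() should return the models you
# pulled earlier, e.g. ['qwen2.5-coder:7b', 'qwen2.5-coder:1.5b'].
```

If this script returns your models but Void still lists none, the problem is in Void's provider configuration rather than Ollama.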
For platform-specific Ollama setup details, see our dedicated guides for Windows, macOS, and Ubuntu.
Connecting Void AI to Local Models via LM Studio
LM Studio is a GUI application for downloading and running GGUF models locally. It exposes an OpenAI-compatible REST API that Void connects to as a generic provider.
Step 1 — Install and configure LM Studio
Download LM Studio from lmstudio.ai. Once open:
- Search for a model in the Discover tab (e.g., Qwen2.5-Coder-7B-Instruct)
- Download your chosen model
- Go to the Local Server tab
- Select your downloaded model and click Start Server
- LM Studio serves at http://localhost:1234 by default
Step 2 — Add LM Studio as an OpenAI-compatible provider in Void
- Open Void Settings → AI Providers → Add Provider
- Choose OpenAI Compatible (not "OpenAI")
- Set Base URL to http://localhost:1234/v1
- Leave the API Key field blank — LM Studio requires no auth on localhost
- Enter the Model ID exactly as shown in LM Studio's server panel
- Click Save, then test with a quick chat message
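Because LM Studio speaks the OpenAI chat-completions protocol, you can also sanity-check the endpoint outside Void before wiring it up. A minimal standard-library sketch — the model ID below is a placeholder you would replace with whatever LM Studio's server panel actually shows:

```python
import json
import urllib.request

LMSTUDIO_URL = "http://localhost:1234/v1"  # note the trailing /v1


def chat_payload(model_id: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def ask(model_id: str, prompt: str, base_url: str = LMSTUDIO_URL) -> str:
    """POST to the /chat/completions endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(model_id, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# With the server running: ask("qwen2.5-coder-7b-instruct", "Write a hello world in Go")
```

If this call succeeds but Void's test chat fails, compare the model ID and base URL in Void character-for-character against what worked here.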
Using Void AI's Core Features
Once a local model is connected, Void exposes three primary AI interaction modes plus its signature checkpoint system.
Tab Autocomplete
As you type, Void sends surrounding context to your local model and suggests completions inline. Press Tab to accept the suggestion or keep typing to dismiss it. For best latency, assign a 1.5B–3B model specifically to autocomplete — this keeps response times under 300ms on most hardware while reserving your larger model for chat and edits.
Inline Edit — Ctrl+K / Cmd+K
Select any code block, press Ctrl+K (Windows/Linux) or Cmd+K (macOS), and type a natural-language instruction:
Refactor this function to use async/await instead of callbacks
Void submits the selection and your instruction to the model and shows a diff of the proposed change. You can accept, reject, or modify the suggestion before it is applied.
AI Chat — Ctrl+L / Cmd+L
Press Ctrl+L to open the chat panel. Drag files into the chat context, use @filename to reference specific files, or use @folder to include entire directories. Chat is ideal for asking questions about code structure, debugging assistance, or generating new functions from a description.
Checkpoints
Every time Void applies an AI edit via Ctrl+K or chat, it records a checkpoint — a named snapshot with a full diff. The checkpoint panel in the sidebar shows a timeline of AI-driven changes. You can revert a specific checkpoint without undoing subsequent manual edits, which is significantly more useful than Ctrl+Z when rolling back a bad AI suggestion made ten edits ago.
Which Local Model Should You Use with Void AI?
The right model depends on your hardware and primary use case. A practical two-model setup works well: a small fast model for autocomplete and a larger model for chat and inline edits.
- Autocomplete (low latency): Qwen2.5-Coder 1.5B Q8 — needs approximately 2 GB VRAM or 4 GB RAM
- Chat and inline edit on a mid-range GPU (6 GB VRAM): Qwen2.5-Coder 7B Q4
- Chat and inline edit on a high-end GPU (10 GB VRAM): Qwen2.5-Coder 14B Q4
- CPU-only, no GPU: Qwen2.5-Coder 3B Q4 — needs approximately 8 GB RAM
- Agentic and multi-file refactor (12 GB VRAM): DeepSeek-Coder-V2 Lite Q4
All of the models above are available via ollama pull or as GGUF downloads for LM Studio.
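A rough way to check whether a model fits your hardware: multiply the parameter count by the bytes per weight for the quantisation level, then add headroom for the KV cache and runtime. The sketch below uses a flat 20% overhead factor, which is an assumption for estimation only, not a measured value:

```python
# Approximate bytes per weight at common quantisation levels
BYTES_PER_PARAM = {"Q4": 0.5, "Q8": 1.0, "FP16": 2.0}


def est_memory_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    """Rough memory footprint in GB: weights plus ~20% assumed KV-cache/runtime overhead."""
    weights_gb = params_billions * BYTES_PER_PARAM[quant]
    return round(weights_gb * overhead, 1)

# Consistent with the sizing guidance above:
# est_memory_gb(1.5, "Q8") -> 1.8  (matches the ~2 GB autocomplete figure)
# est_memory_gb(7, "Q4")   -> 4.2  (fits a 6 GB GPU with room for context)
# est_memory_gb(14, "Q4")  -> 8.4  (fits a 10 GB GPU)
```

Treat the result as a floor, not a ceiling — long context windows grow the KV cache well beyond this estimate.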
Troubleshooting Common Void AI Issues
Ollama models not appearing in Void
Verify Ollama is running: open a terminal and run ollama list. If it hangs or errors, restart Ollama. Then in Void AI Settings, navigate to the Ollama provider and click Refresh Models. On Linux with an AppImage install, check that the OLLAMA_HOST environment variable is not set to a non-default value.
Autocomplete feels too slow
Switch to a smaller model for autocomplete. Qwen2.5-Coder 1.5B responds in under 200ms on most hardware. If you are on CPU-only hardware and find even small models too slow, disable autocomplete entirely in Void Settings under AI and rely on Ctrl+K inline edits for on-demand suggestions instead.
LM Studio endpoint returns 404 or connection refused
Confirm LM Studio's server is running — the server tab should show a green Running badge. Check that the base URL in Void is http://localhost:1234/v1 with the trailing /v1 included. Omitting it causes 404 errors. Also confirm the model ID in Void matches the exact model ID shown in LM Studio's server panel.
VS Code extensions not available in Void
Void uses the Open VSX Registry instead of the Microsoft Extension Marketplace due to licensing constraints. Most popular extensions — ESLint, Prettier, GitLens, language packs — are available on Open VSX. If a specific extension is missing, download the .vsix file from the VS Code Marketplace and install it via Extensions: Install from VSIX.
Tip: If migrating from VS Code, use Void's built-in migration tool during onboarding. It imports your extensions list and attempts to resolve each one from Open VSX automatically.