LLM Setup
AppStudio runs AI 100% locally. No OpenAI key, no cloud, no per-query costs.
How it works
AppStudio communicates with any OpenAI-compatible local server running on your machine. Pick whichever server you already use, or choose one below based on what fits your setup. AppStudio then detects the models that server exposes and selects the best one automatically.
Model priority order (highest quality first):
qwen3:14b → qwen3:8b → qwen2.5:14b → qwen2.5:7b → llama3.2
The first model in this list that is installed and available will be used. You can override this in Settings → Advanced.
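The selection rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not AppStudio's actual code: the function name `pick_model`, the exact-name matching, and the stripping of Ollama's implicit `:latest` tag are all assumptions.

```python
from typing import Iterable, Optional

# Priority order copied from the list above (highest quality first).
MODEL_PRIORITY = ["qwen3:14b", "qwen3:8b", "qwen2.5:14b", "qwen2.5:7b", "llama3.2"]


def _normalize(name: str) -> str:
    """Drop a trailing ':latest' tag so 'llama3.2:latest' matches 'llama3.2'."""
    suffix = ":latest"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def pick_model(installed: Iterable[str], override: Optional[str] = None) -> Optional[str]:
    """Return the manual override if set, else the first installed priority model."""
    if override:
        # Hypothetical stand-in for the Settings -> Advanced override.
        return override
    available = {_normalize(name) for name in installed}
    for candidate in MODEL_PRIORITY:
        if candidate in available:
            return candidate
    return None  # nothing from the priority list is installed
```

For example, `pick_model(["llama3.2:latest", "qwen3:8b"])` would return `"qwen3:8b"`, because it appears earlier in the priority list than `llama3.2`.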
Ollama
Best for: developers who prefer a CLI workflow. Runs as a background service — no UI needed, just pull a model and AppStudio auto-connects.
Get it from ollama.com — available for Windows, macOS, and Linux.
Run the installer. Ollama starts automatically as a system tray app / background service and listens on http://localhost:11434.
ollama pull qwen3:14b
On 8 GB VRAM or less, use the smaller variant:
ollama pull qwen3:8b
No configuration needed. AppStudio detects Ollama on http://localhost:11434 automatically.
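If you want to confirm what a client would see on that port, Ollama's `/api/tags` endpoint lists the installed models. The sketch below queries it with the Python standard library; the helper names `parse_tags` and `list_ollama_models` are hypothetical, and whether AppStudio uses this particular endpoint for detection is an assumption.

```python
import json
from typing import List
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # default address from the steps above


def parse_tags(payload: dict) -> List[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in payload.get("models", [])]


def list_ollama_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> List[str]:
    """Ask a running Ollama instance which models are installed."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_tags(json.load(resp))
```

Calling `list_ollama_models()` while Ollama is running should return names such as `qwen3:14b`; if Ollama is not running, `urlopen` raises a `URLError`.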
LM Studio
Best for: users who prefer a GUI model browser with built-in download management. No command line needed.
Get it from lmstudio.ai.
Use the built-in model browser. Search for qwen for the best results with AppStudio.
In LM Studio, go to the Local Server tab and click Start Server. The server runs on port 1234 by default.
Go to AppStudio → Settings and set Base URL to:
http://localhost:1234/v1
Jan
Best for: privacy-focused users who want an all-in-one desktop app for managing and running local models.
Get it from jan.ai.
Use Jan's model hub to download Qwen or a similar capable model.
In Jan's settings, turn on the local API server.
Go to AppStudio → Settings and set Base URL to:
http://localhost:1337/v1
Custom / Other server
Any OpenAI-compatible server works. In AppStudio → Settings → Advanced:
- Base URL: your server's URL (e.g. http://localhost:8080/v1)
- Model name: optionally override the auto-detected model name
Compatible servers include KoboldCpp, text-generation-webui (with its OpenAI extension), the llama.cpp server, and more.
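A quick way to check that a Base URL is correct is to request the model list that OpenAI-compatible servers expose at `/models` under the base path. The following is a minimal sketch, assuming the server implements that endpoint with the usual `{"data": [{"id": ...}]}` response shape; the function names are hypothetical.

```python
import json
from typing import List
from urllib.request import urlopen


def models_url(base_url: str) -> str:
    """Build the model-list URL from a Base URL such as http://localhost:8080/v1."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> List[str]:
    """Extract model ids from an OpenAI-style model-list response."""
    return [item["id"] for item in payload.get("data", [])]


def list_models(base_url: str, timeout: float = 3.0) -> List[str]:
    """Fetch the model ids a local OpenAI-compatible server reports."""
    with urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

The same check applies to the LM Studio and Jan URLs given earlier, for example `list_models("http://localhost:1234/v1")`.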
Hardware requirements
| Model | VRAM needed | Quality and analysis time |
|---|---|---|
| qwen3:14b | ~10 GB | Best quality, 2–4 min |
| qwen3:8b | ~6 GB | Good quality, 3–5 min |
| qwen2.5:7b | ~5 GB | Decent quality, 4–6 min |
| llama3.2 | ~4 GB | Fallback, variable |
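As a rough aid for choosing a model, the table can be turned into a lookup that returns the largest listed model fitting a given amount of VRAM. This is a sketch only: the figures are the approximate values from the table, treated here as hard thresholds, and `suggest_model` is a hypothetical helper rather than anything AppStudio provides.

```python
from typing import Optional

# (model, approximate VRAM in GB) taken from the table above, best model first.
VRAM_TABLE = [
    ("qwen3:14b", 10.0),
    ("qwen3:8b", 6.0),
    ("qwen2.5:7b", 5.0),
    ("llama3.2", 4.0),
]


def suggest_model(vram_gb: float) -> Optional[str]:
    """Return the first (highest-quality) model whose VRAM need fits."""
    for name, needed_gb in VRAM_TABLE:
        if vram_gb >= needed_gb:
            return name
    return None  # less VRAM than the smallest listed model needs
```

With 8 GB of VRAM this suggests `qwen3:8b`, which matches the advice in the Ollama section.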
Troubleshooting
- "Model not found": your server is running but no model is loaded, or the model name doesn't match.
  - Ollama: run `ollama list` to see installed models. Pull one with `ollama pull qwen3:14b`.
  - LM Studio / Jan: make sure a model is selected and loaded in the server tab before starting the server.
- "Connection refused": AppStudio can't reach your local server. Make sure it's running:
  - Ollama: check your system tray for the Ollama icon, or run `ollama serve` in a terminal.
  - LM Studio: open LM Studio → Local Server tab → click Start Server.
  - Jan: open Jan → Settings → enable the API server.
  - Custom server: verify the Base URL in AppStudio → Settings matches your server's address and port.
- Analysis hangs or times out: the model is likely swapping to RAM (out of VRAM), which is very slow. Try a smaller model. For Ollama: `ollama pull qwen3:8b`. For LM Studio/Jan: switch to a smaller quant in the model browser.
- Wrong model being used: AppStudio auto-selects based on its priority list. To force a specific model, go to Settings → Advanced and set the model name manually.
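To tell a "Connection refused" problem apart from a model problem, it can help to test whether anything is listening on the configured port at all. The sketch below does a plain TCP connect using the standard library; `host_port` and `is_reachable` are hypothetical helper names, and a successful connect only shows that some process is listening, not that it is the expected server.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def host_port(base_url: str) -> Tuple[str, int]:
    """Split a Base URL into host and port, using scheme defaults if no port is given."""
    parsed = urlparse(base_url)
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname or "localhost", parsed.port or default_port


def is_reachable(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the server's host and port succeeds."""
    try:
        with socket.create_connection(host_port(base_url), timeout=timeout):
            return True
    except OSError:  # covers connection refused, timeouts, and DNS failures
        return False
```

If `is_reachable("http://localhost:11434")` returns `False`, start the server as described above before checking models; if it returns `True` but analysis still fails, the model name or loading state is the more likely cause.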
Still stuck? See the full Troubleshooting guide →