LLM Setup

AppStudio runs AI 100% locally. No OpenAI key, no cloud, no per-query costs.

How it works

AppStudio communicates with any OpenAI-compatible local server running on your machine. Pick whichever server you already use — or choose one below based on what fits your setup. It auto-detects available models and selects the best one automatically.

Model priority order (highest quality first):

qwen3:14b  →  qwen3:8b  →  qwen2.5:14b  →  qwen2.5:7b  →  llama3.2

The first model in this list that is installed and available will be used. You can override this in Settings → Advanced.

OLLAMA

CLI-first · auto-detected · background service · Setup →

LM STUDIO

GUI model browser · beginner-friendly · Setup →

JAN

Privacy-first · all-in-one UI · Setup →

ANY OPENAI-COMPATIBLE

Koboldcpp, LlamaFile, llama.cpp server · Setup →

Ollama

Best for: developers who prefer a CLI workflow. Runs as a background service — no UI needed, just pull a model and AppStudio auto-connects.

Download Ollama

Get it from ollama.com — available for Windows, macOS, and Linux.

Install and launch

Run the installer. Ollama starts automatically as a system tray app / background service and listens on http://localhost:11434.

Pull the recommended model

ollama pull qwen3:14b

On 8 GB VRAM or less, use the smaller variant:

ollama pull qwen3:8b

AppStudio auto-connects

No configuration needed. AppStudio detects Ollama on http://localhost:11434 automatically.

VRAM note: qwen3:14b requires approximately 10 GB VRAM. If you have an RTX 3060 12 GB or better, this is the recommended model for quality analysis results.

LM Studio

Best for: users who prefer a GUI model browser with built-in download management. No command line needed.

Download LM Studio

Get it from lmstudio.ai.

Download a model inside LM Studio

Use the built-in model browser. Search for qwen for the best results with AppStudio.

Start the local server

In LM Studio, go to the Local Server tab and click Start Server. The server runs on port 1234 by default.

Set the Base URL in AppStudio

Go to AppStudio → Settings and set Base URL to:

http://localhost:1234/v1

Jan

Best for: privacy-focused users who want an all-in-one desktop app for managing and running local models.

Download Jan

Get it from jan.ai.

Install a model

Use Jan's model hub to download Qwen or a similar capable model.

Enable the API server

In Jan's settings, turn on the local API server.

Set the Base URL in AppStudio

Go to AppStudio → Settings and set Base URL to:

http://localhost:1337/v1

Custom / Other server

Any OpenAI-compatible server works. In AppStudio → Settings → Advanced:

Base URL — your server's URL (e.g. http://localhost:8080/v1)
Model name — optionally override the auto-detected model name

Compatible servers include Koboldcpp, text-generation-webui (with OpenAI extension), llama.cpp server, and more.

Hardware requirements

Model	VRAM needed	Speed (analysis)
`qwen3:14b`	~10 GB	Best quality, 2–4 min
`qwen3:8b`	~6 GB	Good quality, 3–5 min
`qwen2.5:7b`	~5 GB	Decent quality, 4–6 min
`llama3.2`	~4 GB	Fallback, variable

CPU-only inference: AppStudio works without a GPU but is very slow — analysis may take 15–30+ minutes per run. A discrete GPU (NVIDIA, AMD, or Apple Silicon) is strongly recommended for a usable experience.

Troubleshooting

"Model not found" — your server is running but no model is loaded or the model name doesn't match.
- Ollama: run ollama list to see installed models. Pull one with ollama pull qwen3:14b.
- LM Studio / Jan: make sure a model is selected and loaded in the server tab before starting the server.
"Connection refused" — AppStudio can't reach your local server. Make sure it's running:
- Ollama: check your system tray for the Ollama icon, or run ollama serve in a terminal.
- LM Studio: open LM Studio → Local Server tab → click Start Server.
- Jan: open Jan → Settings → enable the API server.
- Custom server: verify the Base URL in AppStudio → Settings matches your server's address and port.
Analysis hangs or times out — the model is likely swapping to RAM (out of VRAM), which is very slow. Try a smaller model. For Ollama: ollama pull qwen3:8b. For LM Studio/Jan: switch to a smaller quant in the model browser.
Wrong model being used — AppStudio auto-selects based on its priority list. To force a specific model, go to Settings → Advanced and set the model name manually.

Still stuck? See the full Troubleshooting guide →