LLM Setup

AppStudio runs AI 100% locally. No OpenAI key, no cloud, no per-query costs.

How it works

AppStudio communicates with any OpenAI-compatible local server running on your machine. Pick whichever server you already use — or choose one below based on what fits your setup. It auto-detects available models and selects the best one automatically.

Model priority order (highest quality first):

qwen3:14b  →  qwen3:8b  →  qwen2.5:14b  →  qwen2.5:7b  →  llama3.2

The first model in this list that is installed and available will be used. You can override this in Settings → Advanced.

OLLAMA
CLI-first · auto-detected · background service · Setup →
LM STUDIO
GUI model browser · beginner-friendly · Setup →
JAN
Privacy-first · all-in-one UI · Setup →
ANY OPENAI-COMPATIBLE
Koboldcpp, LlamaFile, llama.cpp server · Setup →

Ollama

Best for: developers who prefer a CLI workflow. Runs as a background service — no UI needed, just pull a model and AppStudio auto-connects.

1
Download Ollama

Get it from ollama.com — available for Windows, macOS, and Linux.

2
Install and launch

Run the installer. Ollama starts automatically as a system tray app / background service and listens on http://localhost:11434.

3
Pull the recommended model
ollama pull qwen3:14b

On 8 GB VRAM or less, use the smaller variant:

ollama pull qwen3:8b
4
AppStudio auto-connects

No configuration needed. AppStudio detects Ollama on http://localhost:11434 automatically.

VRAM note: qwen3:14b requires approximately 10 GB VRAM. If you have an RTX 3060 12 GB or better, this is the recommended model for quality analysis results.

LM Studio

Best for: users who prefer a GUI model browser with built-in download management. No command line needed.

1
Download LM Studio

Get it from lmstudio.ai.

2
Download a model inside LM Studio

Use the built-in model browser. Search for qwen for the best results with AppStudio.

3
Start the local server

In LM Studio, go to the Local Server tab and click Start Server. The server runs on port 1234 by default.

4
Set the Base URL in AppStudio

Go to AppStudio → Settings and set Base URL to:

http://localhost:1234/v1

Jan

Best for: privacy-focused users who want an all-in-one desktop app for managing and running local models.

1
Download Jan

Get it from jan.ai.

2
Install a model

Use Jan's model hub to download Qwen or a similar capable model.

3
Enable the API server

In Jan's settings, turn on the local API server.

4
Set the Base URL in AppStudio

Go to AppStudio → Settings and set Base URL to:

http://localhost:1337/v1

Custom / Other server

Any OpenAI-compatible server works. In AppStudio → Settings → Advanced:

  • Base URL — your server's URL (e.g. http://localhost:8080/v1)
  • Model name — optionally override the auto-detected model name

Compatible servers include Koboldcpp, text-generation-webui (with OpenAI extension), llama.cpp server, and more.

Hardware requirements

Model VRAM needed Speed (analysis)
qwen3:14b ~10 GB Best quality, 2–4 min
qwen3:8b ~6 GB Good quality, 3–5 min
qwen2.5:7b ~5 GB Decent quality, 4–6 min
llama3.2 ~4 GB Fallback, variable
CPU-only inference: AppStudio works without a GPU but is very slow — analysis may take 15–30+ minutes per run. A discrete GPU (NVIDIA, AMD, or Apple Silicon) is strongly recommended for a usable experience.

Troubleshooting

  • "Model not found" — your server is running but no model is loaded or the model name doesn't match.
    • Ollama: run ollama list to see installed models. Pull one with ollama pull qwen3:14b.
    • LM Studio / Jan: make sure a model is selected and loaded in the server tab before starting the server.
  • "Connection refused" — AppStudio can't reach your local server. Make sure it's running:
    • Ollama: check your system tray for the Ollama icon, or run ollama serve in a terminal.
    • LM Studio: open LM Studio → Local Server tab → click Start Server.
    • Jan: open Jan → Settings → enable the API server.
    • Custom server: verify the Base URL in AppStudio → Settings matches your server's address and port.
  • Analysis hangs or times out — the model is likely swapping to RAM (out of VRAM), which is very slow. Try a smaller model. For Ollama: ollama pull qwen3:8b. For LM Studio/Jan: switch to a smaller quant in the model browser.
  • Wrong model being used — AppStudio auto-selects based on its priority list. To force a specific model, go to Settings → Advanced and set the model name manually.

Still stuck? See the full Troubleshooting guide →