mirror of
https://github.com/ggml-org/llama.vscode.git
synced 2026-05-07 01:15:23 +00:00
Page:
Local ai runner
Pages
Agent commands
Chat with AI about llama vscode
Chat with AI
Code completion
Copilot chat model provider
Custom eval tool
Custom tool
Delete models
Edit agent
Edit with AI
Env
Generate commit message
Health check
Home
How to use
Llama agent
Local ai runner
Manage agents
Manage chat models
Manage chats
Manage completion models
Manage embeddings models
Manage envs
Manage tools models
Mcp support
Menu
Model selection
More context files
Parallel completions
Rules
Skills
Statusbar
Subagents
Update todos tool
Use cases
No results
3
Local ai runner
igardev edited this page 2025-08-19 09:05:29 +03:00
Table of Contents
Use as local AI runner (as LM Studio, Ollama, etc.)
Overview
llama-vscode could be used as a local AI runner (as LM Studio, Ollama, etc.) . Models are searched in Huggingface. After a model is selected, llama-vscode automatically downloads it and starts a llama-server with it. With this the user could start chatting with an AI.
How to use it
- From llama-vscode menu select "Use as local AI runner" - llama view will be opened with buttons "llama.cpp", "Add", "Select", "Chat".
- Click "llama.cpp" button to install/upgrade llama.cpp (if not yet done). The installation for Windows (with winget) and Mac (with brew) is automatic. For Linux, the user should do it manually (download the latest llama.cpp package for Linux and add the bin folder to the PATH)
- Click "Add" button, enter search words to see a list of models from Huggingface, select a model, select quantization. If prefered - accept to start the model immediately. (not needed if the model is already added)
- Click "Select" button and select a model to run (not needed if the model is already started in the previous step)
- Click "Chat" button - a web page for chat with AI will be shown in VS Code
Enjoy talking with local AI.
https://github.com/user-attachments/assets/e75e96de-878b-43db-a45b-47cc0c554697