- The API key is used (if needed and provided) when fetching the list of models while adding an OpenAI-compatible provider
- Multiline field for Edit with AI
- Qwen3.5 models (2B, 4B, 9B) added to the predefined list - good for tools and chat
With this change, llama.vscode can provide models for VS Code Copilot:
1. Start a tools model from llama-vscode (local or external)
2. In VS Code Copilot, open the models list -> Other Models -> Manage Models
3. Make the models you want to use visible (click to the left of the model name). All models available from the application serving the tools model are shown.
4. Select the desired model in Copilot and start using it
Unneeded Copilot tools can be unchecked to reduce the context size when a local model is used.
* Read SOUL.md and USER.md files from the project root and add them to the prompt if they exist (similar to OpenClaw).
* Subagents implemented
- new agent Unit Test Writer
- new tool create_agent
- new agent "Agent creator"
* Update documentation for llama-vscode
- Only if `llama-vscode-rules.md` is not present to preserve existing functionality
- TODO: Implement nested AGENTS.md
- See: https://agents.md/
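The precedence described above can be sketched as follows. This is a minimal illustration, assuming the new rules file is a top-level `AGENTS.md` (inferred from the TODO and the link, not stated explicitly); the function name `chooseRulesFile` is hypothetical and is not taken from the extension's code.

```typescript
// Hypothetical helper: pick which rules file to load from the project root.
// Assumption: AGENTS.md is only a fallback, so existing projects that already
// use llama-vscode-rules.md keep their current behavior.
function chooseRulesFile(filesInProjectRoot: string[]): string | undefined {
  if (filesInProjectRoot.includes("llama-vscode-rules.md")) {
    return "llama-vscode-rules.md";
  }
  if (filesInProjectRoot.includes("AGENTS.md")) {
    return "AGENTS.md";
  }
  // Neither file exists: no file-based rules are added.
  return undefined;
}
```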
Co-authored-by: Caleb Sawyer <caleb.sawyer@gsdeng.com>
- Add health check for models (works only with llama.cpp server)
- Visible in environment view
New settings:
- Health_check_interval_s: The interval in seconds for the health check
- Health_check_compl_enabled: Enable/disable health check for completion model
- Health_check_chat_enabled: Enable/disable health check for chat model
- Health_check_embs_enabled: Enable/disable health check for embedding model
- Health_check_tools_enabled: Enable/disable health check for tools model
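A rough sketch of how these settings could drive a health check. The type and function names are hypothetical, and the status mapping assumes the llama.cpp server `GET /health` behavior as I understand it (200 when ready, 503 while the model is loading); the extension's real logic may differ.

```typescript
// Hypothetical types; not the extension's actual implementation.
type ModelKind = "compl" | "chat" | "embs" | "tools";
type HealthState = "ok" | "loading" | "down";

interface HealthCheckSettings {
  intervalSeconds: number;             // mirrors Health_check_interval_s
  enabled: Record<ModelKind, boolean>; // mirrors Health_check_<kind>_enabled
}

// Map the HTTP status of a /health request to a displayable state.
// Assumption: 200 = ready, 503 = still loading, anything else = unreachable.
function classifyHealth(httpStatus: number | undefined): HealthState {
  if (httpStatus === 200) return "ok";
  if (httpStatus === 503) return "loading";
  return "down";
}

// Return only the model kinds whose health check is enabled.
function kindsToCheck(settings: HealthCheckSettings): ModelKind[] {
  const kinds: ModelKind[] = ["compl", "chat", "embs", "tools"];
  return kinds.filter((kind) => settings.enabled[kind]);
}
```

A timer firing every `intervalSeconds` would then request `/health` for each kind returned by `kindsToCheck` and show the classified state in the environment view.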
- Setting max_parallel_completions determines how many completions to generate in parallel (default 3)
- Shortcuts - Alt+] - next completion, Alt+[ - previous completion
- Requires llama.cpp after December 6, 2025 (commit c42712b) but is backward compatible (generates one completion for older versions)
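The next/previous shortcuts can be pictured as moving an index over the list of generated completions. The class below is a hypothetical sketch; in particular, wrapping around at either end is an assumption, not something the notes above confirm.

```typescript
// Hypothetical sketch of cycling through parallel completions.
class CompletionCycler {
  private index = 0;
  private readonly completions: string[];

  constructor(completions: string[]) {
    this.completions = completions;
  }

  current(): string | undefined {
    return this.completions[this.index];
  }

  // Alt+] : move to the next completion (assumed to wrap around).
  next(): string | undefined {
    if (this.completions.length === 0) return undefined;
    this.index = (this.index + 1) % this.completions.length;
    return this.current();
  }

  // Alt+[ : move to the previous completion (assumed to wrap around).
  previous(): string | undefined {
    if (this.completions.length === 0) return undefined;
    const n = this.completions.length;
    this.index = (this.index - 1 + n) % n;
    return this.current();
  }
}
```

With an older llama.cpp that returns a single completion, the list has one element and both shortcuts keep showing it.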
- Skills (https://agentskills.io/home) can now be parsed by the LLM and added to the prompt
- The skills_folder setting determines where the skill descriptions are located. If empty, the <project_folder>/skills folder is used by default
- Anthropic models support skills best. I guess the open source models will catch up.
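The default-folder rule for skills_folder can be sketched like this. The function name `resolveSkillsFolder` is hypothetical, and joining with "/" is a simplification for illustration.

```typescript
// Hypothetical helper: resolve where skill descriptions are read from.
// An empty skills_folder setting falls back to <project_folder>/skills.
function resolveSkillsFolder(skillsFolderSetting: string, projectFolder: string): string {
  const configured = skillsFolderSetting.trim();
  if (configured !== "") {
    return configured;
  }
  // Drop trailing separators so the result does not contain "//".
  const base = projectFolder.replace(/[\\/]+$/, "");
  return `${base}/skills`;
}
```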
An agent can now have a default tools model. If it has one, selecting the agent also selects that tools model. However, it is still possible to change the tools model while keeping the same agent.
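A small sketch of this selection behavior, using hypothetical names (`Agent`, `AgentSelection`, `defaultToolsModel`) that are not taken from the extension's code:

```typescript
// Hypothetical model of the agent / tools model selection.
interface Agent {
  name: string;
  defaultToolsModel?: string;
}

class AgentSelection {
  agent: Agent | undefined = undefined;
  toolsModel: string | undefined = undefined;

  // Selecting an agent also selects its default tools model, if it has one.
  selectAgent(agent: Agent): void {
    this.agent = agent;
    if (agent.defaultToolsModel !== undefined) {
      this.toolsModel = agent.defaultToolsModel;
    }
  }

  // The tools model can still be changed without changing the agent.
  selectToolsModel(model: string): void {
    this.toolsModel = model;
  }
}
```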
- Fix the embeddings ranking - the wrong query was used
- Removed the free DeepSeek model from OpenRouter as it no longer works
- Setting endpoint_tools is now enough to start the agent
- Agent View is no longer hidden when the tools model is deselected
- Refactoring
* menu.ts is refactored - services classes are extracted
* Refactor menu.ts model - extract services
- agent "Ask" added for questions about the project without changing the files
- predefined free models from OpenRouter added (and xAI removed as it is not free anymore)
- Some bugs fixed
---------
Co-authored-by: igardev <ivailo.gardev@akros.ch>
- Fix bug for adding files by the agent
- Extract agent service from menu
- Adding agent from menu is now possible (not very user friendly, but working)
- File menu.ts refactored
- Predefined lists added for completion models, chat models, embeddings models, tools models and for envs
- Bugfixes
- If chat model is not selected, but a tools model is selected, it is used for generating commit messages, editing code with AI and in search_source tool
- xAI Grok 4 free (from OpenRouter) added to the initial models
- Chat with AI with project context removed (agent does it better)
- Chat with AI about llama-vscode is now with agent, not using webui
- Agent - new buttons "Tools Model" and "Agent" - possibility to view the selected model and agent and to change them.
- Added rules - setting agent_rules or llama-vscode-rules.md
- Added agent commands - setting agent_commands/llama-vscode menu "agent commands..." (shortcuts for frequently used prompts; in the agent, press "/" and select an agent command).
- Generate commit message now checks if there is a running chat model (or endpoint_chat)
- In the Agent UI, the tokens of a response are shown immediately as they arrive, not only when the complete response is received
- Bug fixes for Edit with AI
- tools_custom and context_custom settings are added
- -fa option is removed from huggingface download command
- The "Add model" menu command is replaced with two commands: "Add local model" and "Add external model"
Setting ask_install_llamacpp added to control if llama-vscode should ask the user to install llama.cpp
Setting upgrade_llamacpp_hours added to control how often llama-vscode should ask the user to upgrade llama.cpp
If the user cancels the llama.cpp installation on startup, llama-vscode offers to disable future installation popups
If the user cancels the llama.cpp upgrade on startup, llama-vscode offers to disable future upgrade popups (sets upgrade_llamacpp_hours to more than 8 years)
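To make the "more than 8 years" value concrete: 8 years is 8 × 365 × 24 = 70080 hours (ignoring leap days), so any larger value of upgrade_llamacpp_hours effectively silences the prompt. The helper names below are hypothetical, and the exact value the extension stores is not given in these notes.

```typescript
// Hypothetical sketch of the upgrade prompt timing.
const HOURS_PER_YEAR = 365 * 24; // 8760, ignoring leap days
const MS_PER_HOUR = 3_600_000;

function hoursForYears(years: number): number {
  return years * HOURS_PER_YEAR;
}

// Ask again only when at least upgradeHours have passed since the last prompt.
function shouldAskForUpgrade(lastAskedMs: number, nowMs: number, upgradeHours: number): boolean {
  const elapsedHours = (nowMs - lastAskedMs) / MS_PER_HOUR;
  return elapsedHours >= upgradeHours;
}
```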
- Changes history added
- Chats can be selected, deleted, exported, and imported
- llama-vscode UI (agent) is shown in a separate view now, not as part of Explorer view.
- Agent entity added - agents with different system prompts and default tools could be selected
- Fixed showing tables in llama agent
- Local envs with gpt-oss 20B added (also available for import from here )
* Increase the space for llama agent,
* fix a bug for showing llama-agent.
* Update the documentation for llama-vscode
* Envs with local gpt-oss for agent removed
Llama Agent UI improved - look and feel, status, etc.
New menus for managing completion models, chat models, embeddings models and tools models
Concept of selected models - for completion, chat, embeddings and tools
Orchestra concept introduced. An orchestra is a group of models. Starting (selecting) or stopping an orchestra starts (selects) or stops all of its models
Import/Export orchestra and models from/to file implemented
OpenAI gpt-oss 20B added as a local one in tools models and chat models
Predefined Orchestras for different use cases - only completion, chat + completion, chat + agent, etc.
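The orchestra idea can be illustrated with a small data model. Everything here is a hypothetical sketch: the interfaces, the `running` flag, and the use of JSON for import/export are assumptions, since the notes above do not describe the file format.

```typescript
// Hypothetical data model for an orchestra (a named group of models).
interface OrchestraModel {
  kind: "completion" | "chat" | "embeddings" | "tools";
  name: string;
  running: boolean;
}

interface Orchestra {
  name: string;
  models: OrchestraModel[];
}

// Starting an orchestra starts (selects) every model in the group.
function startOrchestra(orchestra: Orchestra): void {
  for (const model of orchestra.models) {
    model.running = true;
  }
}

// Stopping an orchestra stops every model in the group.
function stopOrchestra(orchestra: Orchestra): void {
  for (const model of orchestra.models) {
    model.running = false;
  }
}

// Assumed JSON round trip for export/import.
function exportOrchestra(orchestra: Orchestra): string {
  return JSON.stringify(orchestra, null, 2);
}

function importOrchestra(json: string): Orchestra {
  return JSON.parse(json) as Orchestra;
}
```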
- Llama Agent UI in Explorer view
- OpenRouter API model selection (assumes your OpenRouter key is in setting Api_key_tools)
- MCP Support
- 9 internal tools available for use
- custom_tool - returns the content of a file or a web page
- custom_eval_tool - write your own tool in TypeScript/JavaScript
- Attach the selection to the context
- Configure maximum loops for Llama Agent
* feat: enhance text editor functionality
- Added methods to expand selection to full lines
- Implemented functions to remove leading spaces from text
- Added functions to add leading spaces to text
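One plausible use of the leading-space helpers is to strip the common indentation from a selection before sending it to the model and restore it afterwards. The sketch below assumes that purpose; the function names and signatures are hypothetical and may not match `text-editor.ts`.

```typescript
// Count the space characters at the start of a line.
function countLeadingSpaces(line: string): number {
  let count = 0;
  while (count < line.length && line[count] === " ") {
    count++;
  }
  return count;
}

// Remove the indentation shared by all non-blank lines.
// Returns the new text and how many spaces were removed per line.
function removeLeadingSpaces(text: string): { text: string; removed: number } {
  const lines = text.split("\n");
  const nonBlank = lines.filter((line) => line.trim() !== "");
  if (nonBlank.length === 0) {
    return { text, removed: 0 };
  }
  const removed = Math.min(...nonBlank.map(countLeadingSpaces));
  const stripped = lines.map((line) => (line.trim() === "" ? line : line.slice(removed)));
  return { text: stripped.join("\n"), removed };
}

// Add the given number of spaces in front of every non-blank line.
function addLeadingSpaces(text: string, count: number): string {
  const prefix = " ".repeat(count);
  return text
    .split("\n")
    .map((line) => (line.trim() === "" ? line : prefix + line))
    .join("\n");
}
```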
* fix: don't send requests for updating the context if the completions are disabled
* refactor: remove unused code and optimize performance
- Removed duplicate code and optimized performance in `architect.ts` and `text-editor.ts`
* feat: add RAG configuration option `rag_max_files` to limit the number of files indexed for RAG search
* fix: update cosine similarity logic
- Updated cosine similarity function to use chunk.embedding instead of getting embedding again
- Fixed edge case where chunk.embedding is empty
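A minimal sketch of cosine similarity over a stored `chunk.embedding`, including the empty-embedding edge case. The `Chunk` shape and the choice to return 0 for empty, mismatched, or zero-length vectors are assumptions for illustration, not the extension's confirmed behavior.

```typescript
// Hypothetical chunk shape: the embedding is computed once and stored.
interface Chunk {
  content: string;
  embedding: number[];
}

// cos(a, b) = (a · b) / (|a| * |b|); returns 0 when it is undefined.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || b.length === 0 || a.length !== b.length) {
    return 0;
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks by similarity to the query, reusing each stored embedding
// instead of requesting it again.
function rankChunks(queryEmbedding: number[], chunks: Chunk[]): Chunk[] {
  return [...chunks].sort(
    (x, y) =>
      cosineSimilarity(queryEmbedding, y.embedding) -
      cosineSimilarity(queryEmbedding, x.embedding)
  );
}
```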
* feat: update menu items
- Added "Start all models" item with description
* feat: update chat edit text prompt
- Improve formatting for instructions and original text
- Remove redundant chunks section
- Navigate to the first difference after opening diff panel
* feat: update configuration options for llama.cpp server API keys
- Added `llama-vscode.api_key_chat` and `llama-vscode.api_key_embeddings` configuration options
- Updated `llama-vscode.api_key` to use new key names
- Edit with AI - don't send chunks, navigate to the first change in the diff panel
* bug: update API key configuration
- Updated API key configuration for chat and embeddings endpoints
---------
Co-authored-by: igardev <ivailo.gardev@akros.ch>