18 commits

Author SHA1 Message Date
igardev
8e42441edc
Generate multiple completions in parallel (#148)
- Setting max_parallel_completions determines how many completions to generate in parallel (default 3)
- Shortcuts - Alt+] - next completion, Alt+[ - previous completion
- Requires llama.cpp after December 6, 2025 (commit c42712b) but is backward compatible (generates one completion for older versions)
2026-01-05 15:43:33 +02:00
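The next/previous shortcuts from #148 amount to moving an index over the completions returned by one request. The following is only a minimal sketch: the class name `CompletionCycler` and the wrap-around behavior at both ends are assumptions for illustration, not taken from the extension's source.

```typescript
// Hypothetical helper, not the extension's actual code: holds the
// completions generated for one request and cycles through them.
class CompletionCycler {
  private index = 0;

  constructor(private readonly completions: string[]) {}

  // The completion currently shown, or undefined if none were generated.
  current(): string | undefined {
    return this.completions[this.index];
  }

  // Alt+] : advance to the next completion, wrapping to the first (assumed).
  next(): string | undefined {
    const n = this.completions.length;
    if (n === 0) return undefined;
    this.index = (this.index + 1) % n;
    return this.current();
  }

  // Alt+[ : go back to the previous completion, wrapping to the last (assumed).
  previous(): string | undefined {
    const n = this.completions.length;
    if (n === 0) return undefined;
    this.index = (this.index - 1 + n) % n;
    return this.current();
  }
}
```

With an older llama.cpp that returns a single completion, both shortcuts in this sketch simply keep showing that one completion.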
igardev
1d5f9387ab
Enable skills usage (#147)
- Skills (https://agentskills.io/home) can now be parsed by the LLM and added to the prompt
- The skills_folder setting determines where the skill descriptions are located. If empty, the <project_folder>/skills folder is used by default
- Anthropic models support skills best. I guess the open source models will catch up.
2025-12-31 17:36:24 +02:00
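The fallback rule for skills_folder in #147 can be sketched as a small resolver. The function name `resolveSkillsFolder` and the plain string joining are assumptions made for this example; real code would more likely use a path library.

```typescript
// Hypothetical sketch of the documented default: an empty skills_folder
// setting falls back to <project_folder>/skills.
function resolveSkillsFolder(skillsFolder: string, projectFolder: string): string {
  if (skillsFolder.trim() !== "") {
    return skillsFolder;
  }
  // Drop trailing slashes or backslashes so the result has a single separator.
  const base = projectFolder.replace(/[\\/]+$/, "");
  return base + "/skills";
}
```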
igardev
e5a56f269d
Image selection for agent context (#140)
* Image selection is now possible in Agent view

* Add qwen3 VL 30B Instruct as a predefined model
2025-11-18 17:37:01 +02:00
stoperro
5e565da06b
Working multi-file edit PoC. (#130) 2025-10-25 21:34:08 +03:00
igardev
13bf27699e
Menu refactor and Ask agent added (#124)
* menu.ts is refactored - services classes are extracted

* - Refactor menu.ts model - extract services
- agent "Ask" added for questions about the project without changing the files
- predefined free models from OpenRouter added (and xAI removed as it is no longer free)
- Some bugs fixed

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>
2025-10-05 15:44:04 +03:00
igardev
c26950cb2e
Chats, separate view (#100)
- Changes history added
- Chats can be selected, deleted, exported, imported
- llama-vscode UI (agent) is shown in a separate view now, not as part of Explorer view.
2025-08-27 12:48:37 +03:00
igardev
08c0a22a73
Agent entity added (#97)
- Agent entity added - agents with different system prompts and default tools could be selected
- Fixed showing tables in llama agent
- Local envs with gpt-oss 20B added (also available for import from here)
2025-08-18 10:48:55 +03:00
igardev
d55a8c5c0a
Remove typescript module from the source code. (#91) 2025-08-11 18:32:37 +03:00
igardev
8e0531f906
Add models from huggingface (#89)
* Adding a model from huggingface implemented

* Tool llama_vscode_help is added
2025-08-11 17:40:11 +03:00
igardev
61d1de7a07
Improvements for Llama Agent, introduction of Orchestra concept. (#84)
Llama Agent UI improved - look and feel, status, etc.
New menus for managing completion models, chat models, embeddings models and tools models
Concept of selected models - for completion, chat, embeddings and tools
Orchestra concept introduced. Orchestra is a group of models. Starting(selecting)/stopping orchestra starts(selects)/stops all the models
Import/Export orchestra and models from/to file implemented
OpenAI gpt-oss 20B added as a local one in tools models and chat models
Predefined Orchestras for different use cases - only completion, chat + completion, chat + agent, etc.
2025-08-07 23:04:56 +03:00
igardev
e49bb9f128
Agent support added (#79)
- Llama Agent UI in Explorer view
- OpenRouter API model selection (assumes your OpenRouter key is in setting Api_key_tools)
- MCP Support
- 9 internal tools available for use
- custom_tool - returns the content of a file or a web page
- custom_eval_tool - write your own tool in TypeScript/JavaScript
- Attach the selection to the context
- Configure maximum loops for Llama Agent
2025-08-01 15:59:16 +03:00
adfnekc
3a5db548df
feat: add commit message generation feature (#58)
* feat: add commit message generation feature

- Implemented a new command to generate git commit messages using AI
- Added a new prompt template for generating commit messages
- Integrated the feature with the VS Code Git extension
![20250507143421_rec_](https://github.com/user-attachments/assets/25f5d1ae-3673-4416-ba52-7615969c1bb3)
2025-05-10 12:39:49 +03:00
igardev
d375aedd91
Add RAG search for Ask With AI with project context (#56)
* Add RAG search for Ask With AI with project context

* Remove duplicated call for getting context.

* Chat with project supports providing files as context with @ prefix (e.g. @test.cpp)

* Reindex files if rag settings are changed

* Add menu item for starting embedding server on mac.

* Improve excluding the files from .gitignore; reduce the memory usage of the BM25 algorithm.

* Improve updating file chunks on save; progress bar for calculating embeddings for RAG search.

* Add prefix llama-vscode for the shortcut commands. This way it is easier to filter them.

* Removed sending extra context chunks to the chat server. Show error in case of problem with embeddings server. If embeddings server endpoint is not available - shows message and uses only BM25 filtering.

* Typing error fix in translations

* style : fix whitespaces + disable extra context for chat edit

* config : adjust params

* menu : fix embedding commands

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-05-09 12:35:37 +03:00
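The @ prefix for passing files as chat context (#56) implies some step that pulls file names out of the question text. A minimal sketch of that step follows; the function name `extractFileMentions` and the exact matching rule (an @ at the start or after whitespace, followed by non-whitespace characters) are assumptions, not the extension's actual parser.

```typescript
// Hypothetical sketch: collect the distinct file names mentioned with an
// "@" prefix, in order of first appearance.
function extractFileMentions(query: string): string[] {
  // "@" must start the text or follow whitespace, so e-mail addresses
  // such as user@example.com are not treated as mentions.
  const pattern = /(?:^|\s)@(\S+)/g;
  const mentions: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(query)) !== null) {
    const name = match[1];
    if (mentions.indexOf(name) === -1) {
      mentions.push(name);
    }
  }
  return mentions;
}
```

Note that this simple rule keeps trailing punctuation attached to a mention (for example a comma directly after the file name), so a real implementation would probably need to trim it.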
igardev
de06d78dc8
Suggestion from the beginning of the line (#48)
If between the cursor and the start of the line there are only spaces and tabs - generate suggestion as if the cursor was at the start of the line.
2025-03-21 07:59:09 +02:00
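The rule in #48 can be expressed as a small normalization of the cursor column before the request is built. This is a sketch under assumptions: the function name `effectiveColumn` and its signature are invented for illustration.

```typescript
// Hypothetical sketch: if only spaces and tabs (or nothing) precede the
// cursor on its line, treat the cursor as being at column 0.
function effectiveColumn(lineText: string, cursorColumn: number): number {
  const beforeCursor = lineText.slice(0, cursorColumn);
  return /^[ \t]*$/.test(beforeCursor) ? 0 : cursorColumn;
}
```

For example, with the cursor after four spaces of indentation the sketch returns 0, while a cursor after `  x = ` keeps its original column.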
Georgi Gerganov
3bad0ed952
release : v0.0.7 2025-02-08 09:10:25 +02:00
ohmeow
f2fc58306a
endpoints : add experimental OpenAI support (#16)
* initial openai compatible api endpoint integration

* fix watch

* added openAiClientModel to config; tested with local vllm server

* fixed config and completions to work with FIM models by default

* remove unnecessary try catch

* core : remove repeating suffix of a suggestion + fix speculative FIM (#18)

* Remove repeating suffix of a suggestion

* If linesuffix is empty - cut the repeating suffix of the suggestion.

* If there is a linesuffix, suggest only one line, don't make hidden second request

* Fix the caching of the future suggestion in case of max inputPrefix length.

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>

* core : disable trimming of suggestions

* release : v0.0.6

* readme : add CPU-only configs

* fixed configuration/settings UI

* fixed conflicts

* fix watch

* fixed

* fixes

* update version

* readme : add example

* core : fix cutting the lines of a suggestion (#22)

* Fix the problem with cutting the lines of a suggestion after the first one.

* Remove the less important checks on cutting the suggestion.

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>

* Fix manual trigger without cache + accept always on pressing a Tab (#25)

* Ensure Ctrl+Shift+L always makes a new request to the servers.

* If a suggestion is visible - pressing a Tab always accepts it.

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>

* fixed conflicts

* fix watch

* fixed

* fixes

* initial openai compatible api endpoint integration

* added openAiClientModel to config; tested with local vllm server

* fixed config and completions to work with FIM models by default

* fixed

* make api key optional for openai compatible endpoints as well

* updated to work with llama.cpp without api key

* removed this.handleOpenAICompletion() call from prepareLlamaForNextCompletion per @igardev

* updated package-lock.json after build

---------

Co-authored-by: igardev <49397134+igardev@users.noreply.github.com>
Co-authored-by: igardev <ivailo.gardev@akros.ch>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-02-08 08:50:42 +02:00
Georgi Gerganov
9407bdd5d3
release : v0.0.6 2025-01-31 09:42:34 +02:00
igardev
cc07186197
First version of llama.vscode extension (#2)
* First draft version of llama.vscode plugin

* Update the instructions on how to run llama.cpp server on Mac in Readme file. Removed unneeded imports and unused variables.

* Fix the problem with not sending extra context.

* Reduce the number of requests on fast typing or deleting. Fix error on suggestion when the cursor is on the last line.

* Reduce last completion if the typed chars are the first chars of it

* Small fixes and improvements
- Next word in case of end of line should be the first word of the next line
- Similar for next line
- Avoid sending requests on accepting next word or next line.

* Fix the problem with wrong prompt sending on removing chars with backspace.
No caching if the suggestion is empty.
nindent parameter added in the request to llama-server

* Revert the publisher name change as it results in error on creating the installation file

* - ctrl+shift+l forces triggering a request (no cache)
- Status message improved
- other minor fixes

* - n_indent added in the request to llama server
- ctrl+alt+c - copy chunks in the clipboard

* - n_indent parameter is now correctly sent to llama server
- Search in cache extended - now searches for a match that partially or completely includes the prompt
- Messages in the status bar are now short (for users, not for developers)
- Setting for choosing language for the status bar messages is added.

* - n_indent parameter is now correctly sent to llama server
- Search in cache extended - now searches for a match that partially or completely includes the prompt
- Messages in the status bar are now short (for users, not for developers)
- Setting for choosing language for the status bar messages is added.

* Readme file updated and other small refactorings.

* Run async slow pick chunks operations

* Fix error on search in cache.
Don't send a request if one is still running.

* Fix error on ignoring the result of the last request in some situations.

* Change the key assignment:
Copy chunks and cache: Ctrl+Shift+,
Accept next word: Ctrl+right arrow

* - Improve the time calculation for status bar
- Show additional info only if show_info is true. Show basic info (thinking..., no suggestion + time) always

* Fix the error on accepting a line when the cursor is at the last line.

* Minor whitespace clean-up

* More whitespaces

* Update readme

---------

Co-authored-by: igardev <ivailo.gardev@akros.ch>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-21 14:05:38 +02:00
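One item in #2, "Reduce last completion if the typed chars are the first chars of it", describes reusing the previous suggestion instead of sending a new request. A minimal sketch of that idea, assuming a hypothetical function `reuseCompletion` that is not taken from the extension's source:

```typescript
// Hypothetical sketch: if the newly typed characters are a prefix of the
// last completion, return what is left of it; otherwise return undefined
// to signal that a new request is needed.
function reuseCompletion(lastCompletion: string, typed: string): string | undefined {
  if (typed.length > 0 && lastCompletion.indexOf(typed) === 0) {
    return lastCompletion.slice(typed.length);
  }
  return undefined;
}
```

Typing `con` while `console.log(x);` is suggested would leave `sole.log(x);` in this sketch, with no request sent.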