I cannot load the Qwen3 model, so I decided to compile manually using VS and the CUDA Toolkit (both installed successfully). cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="86;89" ...
Ever noticed your computer acting sluggish or warning you about low storage? Temporary files could be the sneaky culprit. Windows creates these files while installing apps, loading web pages, or ...
One solution would be to move common_chat_parse_* either to a dedicated file, chat-templates.cpp, or to the existing chat-parser.cpp. This won't reduce the total compile time, but will provide a ...