feat(runner.sh): use runner.sh to select the LLM at run time 69c6372 yusufs committed on Dec 26, 2024
feat(llama3.2): use the Llama model first to save cost, until we want to test Sailor 92a4a4a yusufs committed on Nov 29, 2024
feat(llama3.2): run llama3.2 in bfloat16 with cache dtype fp8 and the same model length 38d356a yusufs committed on Nov 29, 2024
feat(sail/Sailor-4B-Chat): try increasing gpu-memory-utilization to 0.9 before changing the token length 4a9e328 yusufs committed on Nov 29, 2024
feat(llama3.2): use Llama-3.2-3B-Instruct 0cb88a4f764b7a12671c53f0838cd831a0843b95 8b37c20 yusufs committed on Nov 29, 2024
feat(dep_sizes.txt): remove dep_sizes.txt during build; it is not needed 8e49b3b yusufs committed on Nov 27, 2024
feat(download_model.py): remove download_model.py during build; it causes a large image size c360fd3 yusufs committed on Nov 27, 2024
docs(Dockerfile): add comment about estimated image size after compile 8dc2050 yusufs committed on Nov 27, 2024
feat(add-model): always download the model during build; it will be cached in subsequent builds 8679a35 yusufs committed on Nov 27, 2024
feat(reduce-max-num-batched-tokens): reduce max-num-batched-tokens even though the error says to reduce max_model_len 13a5c22 yusufs committed on Nov 27, 2024
feat(change-model): change to sail/Sailor-4B-Chat 89a866a7041e6ec023dd462adeca8e28dd53c83e d90e4d6 yusufs committed on Nov 27, 2024
fix(cmd): fix 'error: failed to solve: dockerfile parse error on line 19: unknown instruction: "python3",' de6b236 yusufs committed on Nov 27, 2024
feat(sailor-chat): add sail/Sailor-4B-Chat with the same context length 586265c yusufs committed on Nov 27, 2024