Documentation
File Structure:
- docs/ - Documentation files
- code/ - Code files
- storage/ - Storage files
- vectorstores/ - Vector Databases
- .env - Environment Variables
- Dockerfile - Dockerfile for Hugging Face
- .chainlit - Chainlit Configuration
- chainlit.md - Chainlit README
- README.md - Repository README
- .gitignore - Gitignore file
- requirements.txt - Python Requirements
- .gitattributes - Gitattributes file
Code Structure
- code/main.py - Main Chainlit App
- code/config.yaml - Configuration File to set Embedding-related, Vector Database-related, and Chat Model-related parameters
- code/modules/vector_db.py - Vector Database Creation
- code/modules/chat_model_loader.py - Chat Model Loader (Creates the Chat Model)
- code/modules/constants.py - Constants (Loads the Environment Variables, Prompts, Model Paths, etc.)
- code/modules/data_loader.py - Loads and Chunks the Data
- code/modules/embedding_model.py - Creates the Embedding Model to Embed the Data
- code/modules/llm_tutor.py - Creates the RAG LLM Tutor
  - The function qa_bot() loads the vector database and the chat model, and sets the prompt to pass to the chat model.
- code/modules/helpers.py - Helper Functions
Storage and Vectorstores
- storage/data/ - Data Storage (Put your PDF files under this directory, and URLs in the urls.txt file)
- storage/models/ - Model Storage (Put your local LLMs under this directory)
- vectorstores/ - Vector Databases (Stores the Vector Databases generated from code/modules/vector_db.py)
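To make the storage layout concrete, here is a minimal sketch of how inputs could be gathered from the data directory. It is not the repository's code/modules/data_loader.py; the function name collect_sources is hypothetical, and the sketch assumes that urls.txt sits directly inside storage/data/ and holds one URL per line.

```python
from pathlib import Path


def collect_sources(data_dir):
    """Return (pdf_paths, urls) found under data_dir.

    Assumptions (not confirmed by the repository): urls.txt lives directly
    in data_dir and lists one URL per line; blank lines are ignored.
    """
    data_dir = Path(data_dir)

    # PDF files placed directly under the data directory, in a stable order.
    pdf_paths = sorted(data_dir.glob("*.pdf"))

    # URLs are optional: a missing urls.txt simply yields an empty list.
    urls = []
    urls_file = data_dir / "urls.txt"
    if urls_file.exists():
        for line in urls_file.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:
                urls.append(line)

    return pdf_paths, urls
```

Calling collect_sources("storage/data") from the repository root would return the PDF paths and the URL list that a loader could then chunk and embed.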
Useful Configurations
Set these in code/config.yaml:
- ["embedding_options"]["embedd_files"] - If set to True, embeds the files from the storage directory every time you run the chainlit command. If set to False, uses the stored vector database.
- ["embedding_options"]["expand_urls"] - If set to True, fetches and reads the data from all the links under the URL provided. If set to False, reads only the data at the URL provided.
- ["embedding_options"]["search_top_k"] - Number of sources that the retriever returns
- ["llm_params"]["use_history"] - Whether to use history in the prompt
- ["llm_params"]["memory_window"] - Number of interactions to keep track of in the history
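The sketch below illustrates how these options might be interpreted. It is an assumption-laden illustration rather than the repository's implementation: the config dictionary is a hypothetical in-memory mirror of code/config.yaml with made-up values, and the helper names vector_db_action and select_history do not come from the codebase.

```python
# Hypothetical in-memory mirror of code/config.yaml; the values are illustrative.
config = {
    "embedding_options": {
        "embedd_files": False,
        "expand_urls": True,
        "search_top_k": 3,
    },
    "llm_params": {
        "use_history": True,
        "memory_window": 2,
    },
}


def vector_db_action(embedding_options):
    """Decide whether to re-embed the stored files or reuse the saved database."""
    return "rebuild" if embedding_options["embedd_files"] else "load"


def select_history(history, llm_params):
    """Return the most recent interactions that should go into the prompt."""
    if not llm_params["use_history"]:
        return []
    window = llm_params["memory_window"]
    # Guard against window == 0: history[-0:] would return the whole list.
    return history[-window:] if window > 0 else []


history = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
action = vector_db_action(config["embedding_options"])
recent = select_history(history, config["llm_params"])
```

With the values above, action is "load" and recent holds the last two interactions, which matches the description of memory_window as the number of interactions kept in the history.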
LlamaCpp
Hugging Face Models
- Download the .gguf files for your local LLM from Hugging Face (Example: https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF)
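As an alternative to downloading through the browser, the file could be fetched with the huggingface_hub package. This is a sketch under assumptions: it requires `pip install huggingface_hub`, the helper names are hypothetical, and the exact .gguf filename must be taken from the model repository's file list, since the source does not name one.

```python
from pathlib import Path

# storage/models/ is where the README says local LLMs should be placed.
MODELS_DIR = Path("storage/models")


def local_model_path(filename, models_dir=MODELS_DIR):
    """Path where a downloaded .gguf file is expected to live."""
    return Path(models_dir) / filename


def download_gguf(repo_id, filename, models_dir=MODELS_DIR):
    """Download one .gguf file from a Hugging Face model repo into models_dir.

    Imported lazily so the rest of this module works without the package.
    """
    from huggingface_hub import hf_hub_download

    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    downloaded = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        local_dir=models_dir,
    )
    return Path(downloaded)
```

For the example repository above, repo_id would be "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF", and filename would be whichever quantized .gguf file you choose from that repository.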