|
### macOS
|
|
|
#### CPU |
|
|
|
Choose way to install Rust. |
|
|
|
* Native Rust: |
|
|
|
Install [Rust](https://www.geeksforgeeks.org/how-to-install-rust-in-macos/): |
|
|
|
```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
|
|
|
Open a new shell and test the installation, as shown below.
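For example (the exact version string will vary):

```bash
# rustup installs to ~/.cargo/bin; a fresh shell picks up the updated PATH
rustc --version
cargo --version
```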
|
|
|
When running on a Mac with Intel hardware (not Apple Silicon M1), you may run
into `clang: error: the clang compiler does not support '-march=native'` during `pip install`.
If so, set `ARCHFLAGS` during the install, as shown below.
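For example, to force the x86_64 architecture for the entire install:

```bash
# Intel Mac workaround for the -march=native clang error
ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt
```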
|
|
|
If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++ |
|
compiler on your computer. |
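On macOS the simplest option is usually the Xcode Command Line Tools, which provide `clang`/`clang++` (a suggestion, not a project requirement; any working C++ toolchain should do):

```bash
# Installs Apple's command line developer tools, including the clang C++ compiler
xcode-select --install
```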
|
|
|
Set up the environment:
|
```bash
conda create -n h2ogpt python=3.10
conda activate h2ogpt
pip install -r requirements.txt
```
|
|
|
* Conda Rust: |
|
|
|
If native Rust does not work, try the conda approach: create a conda environment with Python 3.10 and Rust.
|
```bash
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
pip install -r requirements.txt
```
|
To run CPU mode with the default model:
|
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
|
For the above, ignore the CLI output saying `0.0.0.0` and instead point your browser at http://localhost:7860 (for Windows/macOS) or at the public live URL printed by the server (disable the shared link with `--share=False`). To support document Q/A, jump to [Install Optional Dependencies](#document-qa-dependencies).
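For example, the same command as above with the shared link disabled:

```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path --share=False
```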
|
|
|
--- |
|
|
|
#### GPU (MPS Mac M1) |
|
|
|
1. Create a conda environment with Python 3.10 and Rust.
|
```bash
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
```
|
2. Install torch dependencies from the nightly build to get the latest MPS support.
|
```bash
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
|
3. To verify that torch can use MPS, run the Python script below.
|
```python
import torch

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")
```
|
Expected output:
|
```bash
tensor([1.], device='mps:0')
```
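As a quick equivalent check without a script file:

```bash
# Prints True if PyTorch can see the MPS backend
python -c "import torch; print(torch.backends.mps.is_available())"
```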
|
4. Install the remaining h2oGPT requirements:
|
```bash
pip install -r requirements.txt
```
|
5. Run h2oGPT (without document Q/A): |
|
```bash
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --cli=True
```
|
For the above, ignore the CLI output saying `0.0.0.0` and instead point your browser at http://localhost:7860 (for Windows/macOS) or at the public live URL printed by the server (disable the shared link with `--share=False`).
|
|
|
To support document Q/A jump to [Install Optional Dependencies](#document-qa-dependencies). |
|
|
|
--- |
|
|
|
#### Document Q/A dependencies |
|
|
|
```bash
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: for supporting unstructured package
python -m nltk.downloader all
```
|
To support Word and Excel documents, install LibreOffice: https://www.libreoffice.org/download/download-libreoffice/.
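If you use Homebrew, installing the LibreOffice cask may be a convenient alternative to the direct download (either route works):

```bash
# Installs LibreOffice via the Homebrew cask
brew install --cask libreoffice
```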
|
To support OCR, install tesseract and other dependencies: |
|
```bash
brew install libmagic
brew link libmagic
brew install poppler
brew install tesseract
# Modern Homebrew no longer supports install options such as --all-languages;
# the tesseract-lang formula provides the additional language packs instead.
brew install tesseract-lang
```
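To sanity-check the OCR toolchain afterwards (optional; each command should print a version string):

```bash
tesseract --version   # tesseract OCR engine
pdfinfo -v            # poppler PDF utilities
```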
|
|
|
Then for document Q/A with UI using CPU: |
|
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
|
or MPS: |
|
```bash
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --langchain_mode=MyData --score_model=None
```
|
For the above, ignore the CLI output saying `0.0.0.0` and instead point your browser at http://localhost:7860 (for Windows/macOS) or at the public live URL printed by the server (disable the shared link with `--share=False`).
|
|
|
See [CPU](README_CPU.md) and [GPU](README_GPU.md) for some other general aspects about using h2oGPT on CPU or GPU, such as which models to try. |
|
|
|
--- |
|
|
|
#### GPU with LLaMa |
|
|
|
**Note**: Currently `llama-cpp-python` only supports v3 ggml 4-bit quantized models for MPS, so use LLaMa models whose filenames end with `ggmlv3` and a `q4_x.bin` suffix.
|
|
|
1. Install dependencies |
|
```bash
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
```
|
2. Install the latest `llama-cpp-python`, which supports macOS Metal GPU as of version 0.1.62 (afterwards, verify that v0.1.62 or higher is installed, as shown below).
|
```bash
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
```
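To confirm the installed version (the Metal build requires 0.1.62 or higher, per step 2):

```bash
# Shows the installed llama-cpp-python version
pip show llama-cpp-python
```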
|
3. Edit the following settings in `.env_gpt4all` (an example excerpt follows this list):
   - Uncomment the line with `n_gpu_layers=20`
   - Set your preferred model on the line with `model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin`
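After editing, the relevant lines in `.env_gpt4all` might look like the following (illustrative; substitute your own model filename):

```bash
# .env_gpt4all (excerpt)
n_gpu_layers=20
model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin
```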
|
4. Run the LLaMa model:
|
```bash
python generate.py --base_model='llama' --cli=True
```
|
|