### MACOS
#### CPU
Choose a way to install Rust.
* Native Rust:
Install [Rust](https://www.geeksforgeeks.org/how-to-install-rust-in-macos/):
```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
Open a new shell and test: `rustc --version`
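If the new shell does not pick up Rust, you can load cargo's environment manually (a minimal sketch, assuming the default rustup install location):
```bash
# rustup writes an env file under ~/.cargo by default; source it so the
# current shell can find rustc and cargo
source "$HOME/.cargo/env"
rustc --version   # e.g. rustc 1.70.0
cargo --version
```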
When running a Mac with Intel hardware (not M1), you may run
into `clang: error: the clang compiler does not support '-march=native'` during pip install.
If so, set your ARCHFLAGS during pip install, e.g.: `ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt`
If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++
compiler on your computer.
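One common way to get a C/C++ toolchain on macOS (an assumption here; any compatible compiler should work) is Xcode's command line tools:
```bash
# Installs Apple clang, make, and related build tools
xcode-select --install
# Afterwards, verify the compiler is visible
clang++ --version
```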
Setup environment:
```bash
conda create -n h2ogpt python=3.10
conda activate h2ogpt
pip install -r requirements.txt
```
* Conda Rust:
If native Rust does not work, try the conda approach instead: create a conda environment with Python 3.10 and Rust.
```bash
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
pip install -r requirements.txt
```
To run in CPU mode with the default model:
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
For the above, ignore the CLI output saying `0.0.0.0`, and instead point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with `--share=False`). To support document Q/A, jump to [Install Optional Dependencies](#document-qa-dependencies).
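Once the server is up, a quick way to confirm it is listening (assuming the default port 7860) is:
```bash
# Prints the HTTP status code; expect 200 once the UI is ready
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:7860
```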
---
#### GPU (MPS Mac M1)
1. Create a conda environment with Python 3.10 and Rust.
```bash
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
```
2. Install torch dependencies from the nightly build to get the latest MPS support:
```bash
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
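Optionally confirm the nightly wheel is the one being imported (a quick sanity check; your exact version string will differ):
```bash
# Nightly builds carry a ".dev" suffix in the version, e.g. 2.1.0.dev20230601
python -c "import torch; print(torch.__version__)"
```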
3. To verify that torch can use MPS, run the Python script below:
```python
import torch

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")
```
Expected output:
```bash
tensor([1.], device='mps:0')
```
4. Install the remaining h2oGPT requirements:
```bash
pip install -r requirements.txt
```
5. Run h2oGPT (without document Q/A):
```bash
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --cli=True
```
The above runs in CLI mode. If you instead launch the UI (omit `--cli=True`), ignore the CLI output saying `0.0.0.0` and point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with `--share=False`).
To support document Q/A jump to [Install Optional Dependencies](#document-qa-dependencies).
---
#### Document Q/A dependencies
```bash
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: for supporting unstructured package
python -m nltk.downloader all
```
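After the installs above, you can spot-check a couple of the key imports (module names assumed from the requirements file names; adjust if your versions differ):
```bash
# LangChain for document Q/A
python -c "import langchain; print('langchain', langchain.__version__)"
# PyMuPDF (optional) imports under the name "fitz"
python -c "import fitz; print(fitz.__doc__)"
```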
To support Word and Excel documents, download LibreOffice: https://www.libreoffice.org/download/download-libreoffice/.
To support OCR, install tesseract and other dependencies:
```bash
brew install libmagic
brew link libmagic
brew install poppler
brew install tesseract --all-languages
```
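To verify the OCR toolchain installed cleanly (assuming Homebrew's install locations are on your `PATH`):
```bash
# tesseract should report its version and the installed language packs
tesseract --version
tesseract --list-langs | head
# poppler provides pdftotext; -v prints its version
pdftotext -v
```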
Then for document Q/A with UI using CPU:
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
or MPS:
```bash
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --langchain_mode=MyData --score_model=None
```
For the above, ignore the CLI output saying `0.0.0.0`, and instead point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with `--share=False`).
See [CPU](README_CPU.md) and [GPU](README_GPU.md) for some other general aspects about using h2oGPT on CPU or GPU, such as which models to try.
---
#### GPU with LLaMa
**Note**: Currently `llama-cpp-python` only supports v3 ggml 4-bit quantized models for MPS, so use LLaMa models whose filenames end with `ggmlv3` and `q4_x.bin`.
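For example, one way to fetch a matching quantization (the repository and filename below are illustrative assumptions; substitute whichever v3 q4 model you prefer):
```bash
# Illustrative download -- adjust the repo/filename to the model you actually want
curl -LO https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GGML/resolve/main/WizardLM-7B-uncensored.ggmlv3.q4_0.bin
```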
1. Install dependencies
```bash
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
```
2. Install the latest `llama-cpp-python`, which supports the macOS Metal GPU as of version 0.1.62 (after this step you should have `llama-cpp-python` v0.1.62 or higher installed):
```bash
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
```
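To confirm the rebuilt wheel is in place (a minimal version check; Metal offload itself shows up at runtime as `ggml_metal_init` lines in the model load log):
```bash
# Should print 0.1.62 or higher after the reinstall above
python -c "import llama_cpp; print(llama_cpp.__version__)"
```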
3. Edit the following settings in `.env_gpt4all` (see the sketch after this list):
- Uncomment the line with `n_gpu_layers=20`
- Change the model name to your preferred model on the line with `model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin`
4. Run the LLaMa model:
```bash
python generate.py --base_model='llama' --cli=True
```
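As referenced in step 3, a minimal sketch of the edited `.env_gpt4all` lines (the filename is only an example; point it at whichever `ggmlv3` `q4_x` model you downloaded):
```bash
# .env_gpt4all (excerpt) -- values are illustrative
n_gpu_layers=20
model_path_llama=WizardLM-7B-uncensored.ggmlv3.q4_0.bin
```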