### MACOS

#### CPU

Choose a way to install Rust.

* Native Rust:

    Install [Rust](https://www.geeksforgeeks.org/how-to-install-rust-in-macos/):

    ```bash
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    ```

    Open a new shell and test: `rustc --version`. If `rustc` is not found, see the sketch after this list.

    When running a Mac with Intel hardware (not M1), you may run into
    `clang: error: the clang compiler does not support '-march=native'` during pip install.
    If so, set `ARCHFLAGS` during the pip install, e.g.: `ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt`

    If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++
    compiler on your computer (see the sketch after this list).

    Set up the environment:
    ```bash
    conda create -n h2ogpt python=3.10
    conda activate h2ogpt
    pip install -r requirements.txt
    ```

* Conda Rust:

    If native Rust does not work, try the conda route: create a conda environment with Python 3.10 and Rust.
    ```bash
    conda create -n h2ogpt python=3.10 rust
    conda activate h2ogpt
    pip install -r requirements.txt
    ```
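
If `rustc` is not found in a new shell, or a wheel build fails for want of a C++ compiler, the sketch below covers both fixes; the cargo env path is rustup's default install location, and Xcode Command Line Tools supply Apple's clang/clang++:

```bash
# Make rustup's toolchain visible in the current shell (rustup's default location)
source "$HOME/.cargo/env"
rustc --version

# Install Apple's Command Line Tools, which include a C++ compiler (clang++)
xcode-select --install
```
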
To run CPU mode with the default model, do:
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
For the above, ignore the CLI output saying `0.0.0.0`, and instead point your browser at http://localhost:7860 (for Windows/macOS) or the public live URL printed by the server (disable the shared link with `--share=False`). To support document Q/A, jump to [Install Optional Dependencies](#document-qa-dependencies).
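
The `--user_path=user_path` flag above points at a local folder whose documents are indexed for `UserData` mode (once the optional document Q/A dependencies below are installed); a minimal setup sketch, with a hypothetical example file:

```bash
# Create the folder that --user_path refers to and drop a document into it
mkdir -p user_path
cp ~/Documents/some_report.pdf user_path/   # hypothetical example file
```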

---

#### GPU (MPS Mac M1)

1. Create conda environment with Python 3.10 and Rust.
   ```bash
   conda create -n h2ogpt python=3.10 rust
   conda activate h2ogpt
   ```
2. Install torch dependencies from the nightly build to get the latest MPS support
   ```bash
   pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
   ```
3. To verify whether torch uses MPS, run the Python script below.
   ```python
   import torch

   # Allocate a tensor on the Metal (MPS) device if it is available
   if torch.backends.mps.is_available():
       mps_device = torch.device("mps")
       x = torch.ones(1, device=mps_device)
       print(x)
   else:
       print("MPS device not found.")
   ```
   Expected output:
   ```bash
   tensor([1.], device='mps:0')
   ```
4. Install the other h2oGPT requirements
   ```bash
   pip install -r requirements.txt
   ```
5. Run h2oGPT (without document Q/A):
   ```bash
   python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --cli=True
   ```
For the above, note that `--cli=True` runs a terminal chat; if you drop it to get the web UI, ignore the CLI output saying `0.0.0.0` and instead point your browser at http://localhost:7860 (for Windows/macOS) or the public live URL printed by the server (disable the shared link with `--share=False`).
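
Some operators are not yet implemented on the MPS backend; PyTorch can transparently fall back to CPU for those if you export `PYTORCH_ENABLE_MPS_FALLBACK` before launching (a standard PyTorch environment variable; the command is the one from step 5):

```bash
# Fall back to CPU for ops the MPS backend does not implement yet
export PYTORCH_ENABLE_MPS_FALLBACK=1
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --cli=True
```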

To support document Q/A jump to [Install Optional Dependencies](#document-qa-dependencies).

---

#### Document Q/A dependencies

```bash
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: for supporting the unstructured package
python -m nltk.downloader all
```
To support Word and Excel documents, download LibreOffice: https://www.libreoffice.org/download/download-libreoffice/.
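If you prefer Homebrew, LibreOffice is also available as a cask; a sketch (the cask installs to /Applications, so the binary is referenced by its full path here):

```bash
brew install --cask libreoffice
# Verify the install (the cask does not put soffice on PATH by default)
/Applications/LibreOffice.app/Contents/MacOS/soffice --version
```
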
To support OCR, install Tesseract and other dependencies:
```bash
brew install libmagic
brew link libmagic
brew install poppler
brew install tesseract
brew install tesseract-lang  # Homebrew no longer supports --all-languages; this formula provides the language packs
```
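
A quick sanity check that the OCR/PDF toolchain landed on your PATH (assumes the Python `magic` bindings were pulled in by the requirements files above):

```bash
tesseract --version
pdftotext -v   # provided by poppler
python -c "import magic; print(magic.from_file('/bin/ls'))"
```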

Then, for document Q/A with the UI on CPU:
```bash
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
```
or MPS:
```bash
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --langchain_mode=MyData --score_model=None
```
For the above, ignore the CLI output saying `0.0.0.0`, and instead point your browser at http://localhost:7860 (for Windows/macOS) or the public live URL printed by the server (disable the shared link with `--share=False`).
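
To confirm the server is listening without opening a browser, a quick sketch using the default port from above:

```bash
# Expect an HTTP status line once Gradio has finished loading
curl -sI http://localhost:7860 | head -n 1
```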

See [CPU](README_CPU.md) and [GPU](README_GPU.md) for some other general aspects about using h2oGPT on CPU or GPU, such as which models to try.

---

#### GPU with LLaMa

**Note**: Currently `llama-cpp-python` only supports v3 GGML 4-bit quantized models for MPS, so use LLaMa models whose filenames end with `ggmlv3` and `q4_x.bin`.

1. Install dependencies
    ```bash
    # Required for Doc Q/A: LangChain:
    pip install -r reqs_optional/requirements_optional_langchain.txt
    # Required for CPU: LLaMa/GPT4All:
    pip install -r reqs_optional/requirements_optional_gpt4all.txt
    ```
2. Install the latest `llama-cpp-python`, which supports the macOS Metal GPU as of version 0.1.62 (after this step you should have `llama-cpp-python` v0.1.62 or higher installed)
    ```bash
    pip uninstall llama-cpp-python -y
    CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
    ```
3. Edit the settings below in `.env_gpt4all` (an illustrative excerpt follows this list)
    - Uncomment the line with `n_gpu_layers=20`
    - Change the model name to your preferred model on the line with `model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin`
4. Run the LLaMa model
    ```bash
    python generate.py --base_model='llama' --cli=True
    ```
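
For reference, the relevant `.env_gpt4all` lines from step 3 might look like this after editing (values are illustrative, taken from this guide; keep whatever model you actually downloaded):

```bash
# .env_gpt4all excerpt: offload 20 layers to the Metal GPU
n_gpu_layers=20
model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin
```

To double-check that step 2 gave you a recent enough build, verifying the installed version is a reasonable smoke test:

```bash
python -c "import llama_cpp; print(llama_cpp.__version__)"   # expect 0.1.62 or higher
```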