# macOS

Supports CPU and MPS (Metal, on Apple M1/M2).

- [Install](#install)
- [Run](#run)
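
If you are unsure which applies to your machine, here is a quick check using standard macOS commands (`arm64` means Apple Silicon with MPS support; `x86_64` means Intel, CPU only):
```bash
# Apple Silicon (M1/M2) reports arm64; Intel Macs report x86_64
uname -m
# Human-readable CPU model, e.g. "Apple M2"
sysctl -n machdep.cpu.brand_string
```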

## Install
* Download and install [Miniconda](https://docs.conda.io/en/latest/miniconda.html#macos-installers) for Python 3.10.
* Run Miniconda.
* Set up the environment, with Rust provided by Conda:
    ```bash
    conda create -n h2ogpt python=3.10 rust
    conda activate h2ogpt
    ```
* Install dependencies:
    ```bash
    git clone https://github.com/h2oai/h2ogpt.git
    cd h2ogpt

    # fix any bad env
    pip uninstall -y pandoc pypandoc pypandoc-binary
    pip install --upgrade pip
    python -m pip install --upgrade setuptools
    
    # Install Torch:
    pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu -c reqs_optional/reqs_constraints.txt
    ```
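    To confirm the install (optional; prints the Torch version and whether the MPS backend is available, both standard `torch` attributes):
    ```bash
    python -c "import torch; print(torch.__version__, torch.backends.mps.is_available())"
    ```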
* Install document question-answer dependencies:
    ```bash
    # Required for Doc Q/A: LangChain:
    pip install -r reqs_optional/requirements_optional_langchain.txt -c reqs_optional/reqs_constraints.txt
  
    # Required for CPU: LLaMa/GPT4All:
    pip uninstall -y llama-cpp-python llama-cpp-python-cuda
    export CMAKE_ARGS=-DLLAMA_METAL=on
    export FORCE_CMAKE=1
    pip install -r reqs_optional/requirements_optional_llamacpp_gpt4all.txt -c reqs_optional/reqs_constraints.txt --no-cache-dir

    # Audio support (librosa):
    pip install librosa -c reqs_optional/reqs_constraints.txt
    # Optional: PyMuPDF/ArXiv:
    pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt -c reqs_optional/reqs_constraints.txt
    # Optional: Selenium/PlayWright:
    pip install -r reqs_optional/requirements_optional_langchain.urls.txt -c reqs_optional/reqs_constraints.txt
    # Optional: DocTR OCR:
    conda install weasyprint pygobject -c conda-forge -y
    pip install -r reqs_optional/requirements_optional_doctr.txt -c reqs_optional/reqs_constraints.txt
    # Optional: for supporting unstructured package
    python -m nltk.downloader all
    ```
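    To verify that `llama-cpp-python` built and imports correctly (optional; `llama_cpp.__version__` is provided by the package):
    ```bash
    python -c "import llama_cpp; print(llama_cpp.__version__)"
    ```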
* To support Word and Excel documents, download [LibreOffice](https://www.libreoffice.org/download/download-libreoffice/).
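    If you use Homebrew, the cask may be more convenient (assumes [Homebrew](https://brew.sh/) is installed; `libreoffice` is the cask name):
    ```bash
    brew install --cask libreoffice
    ```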
* To support OCR, install [Tesseract](https://tesseract-ocr.github.io/tessdoc/Installation.html) and related system libraries:
    ```bash
    brew install libmagic
    brew link libmagic
    brew install poppler
    brew install tesseract
    brew install tesseract-lang
    brew install rubberband
    brew install pygobject3 gtk4
    brew install libjpeg
    brew install libpng
    brew install wget
    ```
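    To confirm the OCR toolchain afterwards (optional; `--list-langs` shows the language packs installed by `tesseract-lang`):
    ```bash
    tesseract --version
    tesseract --list-langs
    ```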

See the [FAQ](FAQ.md#adding-models) for how to run various models.  See [CPU](README_CPU.md) and [GPU](README_GPU.md) for general guidance on using h2oGPT on CPU or GPU, such as which models to try.

## Run 

For information on how to run h2oGPT offline, see [Offline](README_offline.md#tldr).

In your terminal, run:
```bash
python generate.py --base_model=TheBloke/zephyr-7B-beta-GGUF --prompt_type=zephyr --max_seq_len=4096
```
Or you can run it from a file called `run.sh` that contains the following:
```bash
#!/bin/bash
python generate.py --base_model=TheBloke/zephyr-7B-beta-GGUF --prompt_type=zephyr --max_seq_len=4096
```
and run `sh run.sh` from a terminal in the directory that contains `run.sh`.
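
You can also make the script executable and run it directly (standard shell usage, nothing h2oGPT-specific):
```bash
chmod +x run.sh
./run.sh
```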

To run the latest Llama 3.1 GGUF model:
```bash
python generate.py --base_model=llama --model_path_llama=https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q6_K_L.gguf?download=true --tokenizer_base_model=meta-llama/Meta-Llama-3.1-8B-Instruct --max_seq_len=8192
```
For more information about Llama 3 models, see the [FAQ](https://github.com/h2oai/h2ogpt/blob/main/docs/FAQ.md#llama-3-or-other-chat-template-based-models).

---

## Issues
* Metal M1/M2 only:
    To verify whether Torch uses MPS, run the Python script below:
    ```python
    import torch
    if torch.backends.mps.is_available():
        mps_device = torch.device("mps")
        x = torch.ones(1, device=mps_device)
        print(x)
    else:
        print("MPS device not found.")
    ```
    Expected output:
    ```text
    tensor([1.], device='mps:0')
    ```
* If you see `ld: library not found for -lSystem`, set the following, then retry the `pip install` commands from scratch:
    ```bash
    export LDFLAGS=-L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib
    ```
* If Conda's Rust has issues, you can download and install [native Rust](https://www.geeksforgeeks.org/how-to-install-rust-in-macos/):
    ```bash
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    # enter a new shell and test:
    rustc --version
    ```
* When running a Mac with Intel hardware (not M1/M2), you may run into
    ```text
    clang: error: the clang compiler does not support '-march=native'
    ```
    during `pip install`.  If so, set `ARCHFLAGS` during the install, e.g.:
    ```bash
    ARCHFLAGS="-arch x86_64" pip install -r requirements.txt -c reqs_optional/reqs_constraints.txt
    ```
* If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++ compiler on your computer.
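    On macOS, the usual way to get a C/C++ toolchain is Apple's Command Line Tools (standard macOS command; skip if Xcode is already installed):
    ```bash
    xcode-select --install
    ```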
* If you see the error `TypeError: Trying to convert BFloat16 to the MPS backend but it does not have support for that dtype.`:
    ```bash
    pip install -U torch==2.3.1
    pip install -U torchvision==0.18.1
    ```
  * BFloat16 support was added to macOS in Sonoma (14.0).