h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b

ilu000 committed on Jun 5, 2023

Commit a8bb3b0 • 1 Parent(s): ddf17cf

Update README.md

Files changed (1)

README.md +17 -6

@@ -24,7 +24,7 @@ This model was trained using [H2O LLM Studio](https://github.com/h2oai/h2o-llmst
 ## Usage
-To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers`, `accelerate` and `torch` libraries installed.
+To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers`, `accelerate`, `torch` and `einops` libraries installed.
 ```bash
 pip install transformers==4.29.2
@@ -68,7 +68,7 @@ print(generate_text.preprocess("Why is drinking water so healthy?")["prompt_text
 <|prompt|>Why is drinking water so healthy?<|endoftext|><|answer|>
 ```
-Alternatively, if you prefer to not use `trust_remote_code=True` you can download [h2oai_pipeline.py](h2oai_pipeline.py), store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer:
+Alternatively, you can download [h2oai_pipeline.py](h2oai_pipeline.py), store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer:
 ```python
@@ -79,12 +79,14 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained(
     "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b",
     use_fast=False,
-    padding_side="left"
+    padding_side="left",
+    trust_remote_code=True,
 )
 model = AutoModelForCausalLM.from_pretrained(
     "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b",
     torch_dtype=torch.float16,
-    device_map={"": "cuda:0"}
+    device_map={"": "cuda:0"},
+    trust_remote_code=True,
 )
 generate_text = H2OTextGenerationPipeline(model=model, tokenizer=tokenizer)
@@ -112,8 +114,17 @@ model_name = "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b"  # either local folder o
 # You can find an example prompt in the experiment logs.
 prompt = "<|prompt|>How are you?<|endoftext|><|answer|>"
-tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
-model = AutoModelForCausalLM.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(
+    model_name,
+    use_fast=False,
+    trust_remote_code=True,
+)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.float16,
+    device_map={"": "cuda:0"},
+    trust_remote_code=True,
+)
 model.cuda().eval()
 inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
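
Review note: the net effect of this commit on the manual loading path can be summarized in a small sketch. This is not part of the README; `MODEL_NAME`, `build_prompt` and `load_model_and_tokenizer` are hypothetical helper names introduced here for illustration. The sketch assumes a CUDA GPU and the `transformers`, `accelerate`, `torch` and `einops` packages named in the diff, and it only reuses the `from_pretrained` arguments that appear in the changed lines.

```python
# Illustrative sketch only: helper names below are hypothetical, not from the README.
MODEL_NAME = "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b"


def build_prompt(question: str) -> str:
    """Wrap a question in the prompt template shown in the README."""
    return f"<|prompt|>{question}<|endoftext|><|answer|>"


def load_model_and_tokenizer(model_name: str = MODEL_NAME):
    """Load the tokenizer and model with the arguments added in this commit.

    The imports are local so that build_prompt works without torch installed.
    Calling this function needs a CUDA GPU and downloads the model weights.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_name,
        use_fast=False,
        padding_side="left",
        trust_remote_code=True,  # added by this commit
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,
        device_map={"": "cuda:0"},
        trust_remote_code=True,  # added by this commit
    )
    return model.eval(), tokenizer


if __name__ == "__main__":
    # Only the prompt helper runs here; loading the model is left to the caller.
    print(build_prompt("How are you?"))
```

Because the commit now passes `trust_remote_code=True` on every loading path, the removed wording that offered the manual pipeline as a way to avoid that flag no longer applies, which matches the rewritten "Alternatively, ..." sentence in the diff.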