- Availability: The model checkpoint can be accessed on Hugging Face: Tamnemtf/llama-2-7b-vi-oscar_mini
- The model was fine-tuned from the base model ngoan/Llama-2-7b-vietnamese-20k

## How to Use
```python
# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

# Load the entire model on GPU 0
device_map = {"": 0}
```
```python
import torch
from transformers import BitsAndBytesConfig

# Build the bitsandbytes quantization config from the settings above
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)
```
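As a rough sanity check on why 4-bit loading is used here, the following back-of-envelope estimate compares weight memory at different precisions (the 7B parameter count is taken from the model name; activation and KV-cache overhead is ignored):

```python
# Approximate weight-only memory for a 7B-parameter model
params = 7e9

fp16_gb = params * 2 / 1024**3    # 2 bytes per weight in float16
int4_gb = params * 0.5 / 1024**3  # 0.5 bytes per weight in 4-bit (nf4)

print(f"fp16: ~{fp16_gb:.1f} GB, 4-bit: ~{int4_gb:.1f} GB")
```

At roughly 3.3 GB of weights instead of about 13 GB, the 4-bit model fits comfortably on a 16 GB T4.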
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'Tamnemtf/llama-2-7b-vi-oscar_mini'

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map
)
model.config.use_cache = False
model.config.pretraining_tp = 1

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training
```
```python
from transformers import pipeline

# Run a text generation pipeline with the fine-tuned model
prompt = "Canh chua cá lau là món gì ?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])
```
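Llama-2 chat checkpoints generally expect prompts wrapped in an `[INST]` instruction template. A small helper makes this reusable (the template shown is the standard Llama-2 single-turn format; that this fine-tune expects exactly this format is an assumption, not stated on the original card):

```python
def build_prompt(user_message: str) -> str:
    # Standard Llama-2 single-turn instruction template (assumed format)
    return f"<s>[INST] {user_message} [/INST]"

print(build_prompt("Canh chua cá lau là món gì ?"))
```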
To make the model easy for students to try out, here is a sample notebook that runs it on Colab with a T4 GPU:
https://colab.research.google.com/drive/1ME_k-gUKSY2NbB7GQRk3sqz56CKsSV5C?usp=sharing
## Contact
nguyndantdm6@gmail.com