Ramikan-BR committed
Commit • 4fe6e8d
1 Parent(s): 09879c8
Update README.md
README.md CHANGED

@@ -198,9 +198,81 @@ Step Training Loss
4   0.331900
5   0.276100

Quick test 1 after training the last part of the dataset:
# alpaca_prompt = Copied from above
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Continue the fibonnaci sequence.",  # instruction
            "1, 1, 2, 3, 5, 8",  # input
            "",  # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)  # generate the completion
tokenizer.batch_decode(outputs)  # decode to text (the "AI Response" below)

AI Response: ['<s> Below is an instruction that describes a task. Write a response that appropriately completes the request.\n### Input:\nContinue the fibonnaci sequence.\n\n### Output:\n1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 420, 787, 1444, 2881, 4765, 8640']
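The `alpaca_prompt` string referenced by the `# alpaca_prompt = Copied from above` comment is defined in an earlier notebook cell that is not part of this diff. As a rough, hypothetical reconstruction only, inferred from the `### Input:` / `### Output:` layout of the responses shown here (and not the exact string used for training), it looks something like this:

```python
# Hypothetical reconstruction -- the real template lives in an earlier notebook cell
# and may differ in wording.
alpaca_prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Input:
{}

### Output:
{}"""
```

With a two-slot template like this, `alpaca_prompt.format(instruction, input, output)` places the instruction under `### Input:` and the input string under `### Output:` as a seed for the model to continue; the trailing empty output argument is simply ignored by `str.format`.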
Quick test 2 after training the last part of the dataset:

# alpaca_prompt = Copied from above
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Continue the fibonnaci sequence.",  # instruction
            "1, 1, 2, 3, 5, 8",  # input
            "",  # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
AI Response: <s> Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Input:
Continue the fibonnaci sequence.

### Output:
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 420, 787, 1444, 2881, 4765, 8640, 17281, 31362, 65325, 128672, 251345, 410000, 720000, 1280000,
Quick test 3 after training the last part of the dataset:

if False:
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "lora_model",  # YOUR MODEL YOU USED FOR TRAINING
        max_seq_length = max_seq_length,
        dtype = dtype,
        load_in_4bit = load_in_4bit,
    )
    FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

# alpaca_prompt = You MUST copy from above!

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "What is a famous tall tower in Paris?",  # instruction
            "",  # input
            "",  # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64)
AI Response: <s> Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Input:
What is a famous tall tower in Paris?

### Output:
The famous tall tower in Paris is the Eiffel Tower. It is a 300-meter-tall steel tower located in the heart of Paris, France. The tower was built in 18892 and is a popular tourist attraction. It is also a symbol of the city
The same prompt can also be generated without streaming, returning the decoded text instead of printing it:

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)
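The `if False:` block in quick test 3 reloads the adapters from a local `lora_model` folder, which is not created anywhere in this snippet. In the usual Unsloth workflow it is written right after training, roughly like this (a sketch that only assumes `model` and `tokenizer` are the objects returned by the training run):

```python
# Sketch: persist the LoRA adapters and tokenizer so the "lora_model" folder
# referenced in quick test 3 exists. Assumes `model`/`tokenizer` come from training.
model.save_pretrained("lora_model")      # saves only the LoRA adapter weights
tokenizer.save_pretrained("lora_model")  # saves the tokenizer files alongside them
```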
This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
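For context, the kind of Unsloth + TRL setup that statement refers to is sketched below. The base model name, the toy dataset, and all hyperparameters are placeholders for illustration, not the values used for this checkpoint, and newer trl releases move some of these arguments onto `SFTConfig`:

```python
# Minimal Unsloth + TRL fine-tuning sketch (placeholder values throughout).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

max_seq_length = 2048
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/tinyllama-bnb-4bit",  # placeholder base model
    max_seq_length = max_seq_length,
    dtype = None,            # auto-detect
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

# Toy stand-in for the real, already-formatted dataset (a single "text" column).
dataset = Dataset.from_dict({"text": ["### Input:\nSay hi.\n\n### Output:\nHi!"]})

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
```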