{"cells":[{"cell_type":"markdown","source":["To run this, press \"*Runtime*\" and press \"*Run all*\" on a **free** Tesla T4 Google Colab instance!\n","
\n","\n","To install Unsloth on your own computer, follow the installation instructions on our Github page [here](https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions).\n","\n","You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), & [how to save it](#Save) (eg for Llama.cpp).\n","\n","**[NEW] Gemma2-9b is trained on 8 trillion tokens! Gemma2-27b is 13 trillion!**"],"metadata":{"id":"IqM-T1RTzY6C"}},{"cell_type":"code","execution_count":1,"metadata":{"id":"2eSvM9zX_2d3","executionInfo":{"status":"ok","timestamp":1727858074739,"user_tz":-540,"elapsed":53054,"user":{"displayName":"최현진","userId":"12812404047517020320"}}},"outputs":[],"source":["%%capture\n","!pip install unsloth\n","# Also get the latest nightly Unsloth!\n","!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir \"unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git\"\n","\n","# Install Flash Attention 2 for softcapping support\n","import torch\n","if torch.cuda.get_device_capability()[0] >= 8:\n"," !pip install --no-deps packaging ninja einops \"flash-attn>=2.6.3\""]},{"cell_type":"markdown","source":["* We support Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, Qwen, TinyLlama, Vicuna, Open Hermes etc\n","* We support 16bit LoRA or 4bit QLoRA. Both 2x faster.\n","* `max_seq_length` can be set to anything, since we do automatic RoPE Scaling via [kaiokendev's](https://kaiokendev.github.io/til) method.\n","* With [PR 26037](https://github.com/huggingface/transformers/pull/26037), we support downloading 4bit models **4x faster**! [Our repo](https://huggingface.co/unsloth) has Llama, Mistral 4bit models.\n","* [**NEW**] We make Phi-3 Medium / Mini **2x faster**! See our [Phi-3 Medium notebook](https://colab.research.google.com/drive/1hhdhBa1j_hsymiW9m-WzxQtgqTH_NHqi?usp=sharing)"],"metadata":{"id":"r2v_X2fA0Df5"}},{"cell_type":"code","execution_count":2,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":336,"referenced_widgets":["0a6c54edc08e44ee9cf092414a4e42ec","e68d1f22e75343999966d2ff409acc73","47a560c6e53b43a4bd24357339046c8f","909cbc94cb6d4a9e80ad41f5fcc369c9","bd62bd3d631e4429b2a0347b0cc5654c","9b395d60d64643b6804f2fa417df90e8","5f6369fe1652456da152c5189b9cf200","d3ba225fcb7e4cdca542551317b31569","89d85d9cfc154254ac2472f278be1a69","08ceb05830554399be0cf251f6256970","998157211dfb4d21a8dd0d8efa3fc245","deacfebe3871438abdc90b60dad9a32f","95025404591f4567b3db485c571095f9","1b584563488348579f132aa98d2dfe09","d78989601c354f1b8131a394efc440c2","3c772a9f001448d99d763219593686c3","d354121e9f904394b8b6af25fab434d3","395e60e1600f42739c905da1720cc187","eee99d979cbb42428313f7c55270298e","fdc41bad820143588d82734f6147b13f","0e63f87958bf4e80a5f582abe41f2364","742265a4480140ed9b3578b3098c8fc0","8a3ec41fc0be4616997556aab65744fd","db34ae736e5045f49ec389ccf26c7f5a","fcd9b941f5014d7587b2f00798721374","824f8c48ed134646bef3ea76baad855c","b1d7d85fb45746f4b57988d76ae26426","1546ea46587d480eb29e6c1844ad2f80","91f32f1839bd409783c9bca5e9ad0bcc","60b1328a8ab54a47be35e74cfabf9288","c38137c015c2457a807d403ef4cd3cc6","87cf68af10c0499199a65096b0beb78c","e8511def2cbb47d1b0269b5816e6c03b","fa50ca8ae1374b748d69e4b6541a5a72","4b8f823425a94cba9f0104848ee1e896","f0e7e6be84924623900da021faa743b6","d7b289713a004ca7804204792a2b6ed4","bbac4079f11e4e52a6108b6fb3e4a2e5","680be5d57ddf4bcf92f3a9b4ed413684","ac3f56ad9f354790a3a31101d196a78e","23428cd9c0584adea897d1bef50839af","49af41ecf8084d83bfdc37d25303e576","c4c646038ea64aa29c534bef36f5d06e","38909ccb0
6854dd886e7be9a8f7657c1","20ac96f8e5674c1bb3204efafb951d7d","16321c4d05a14e049a393dca08309ad6","3c75650d4a5647079a638dd7566dd8ec","490aec0bf7d642aa8d6fa219c23fa4f6","26e8a4b2d9d941c190ca828b4f0bc6ac","470b70df4d94475d91156919b93e5712","21b2d11c391b4442b0bda4f0a30bd625","3fcf984878dc4ec6b9a184e5103da8f3","deca135788e84a34a209df86101f6343","87473024bbea4171a03d41967520aa99","cb2030414a9048a88cb39f09ddc98189","15737b9be0544c10b7903dabda8653b1","8310ad421f0042a7868a45f87b845b44","627cb05685c64ddfa3bba3dab7471d11","ec5b4f91155d47d1a4df6e720b7de3cb","eebe0e888799428394efff69c629a1db","45c1af5a461747f5b2389977fdf851ce","afcbcc7413214d31ba00d881d5e04cc8","913a12d90a894664ab2219551561113e","801783d19f094cf2a10669ce2dc61f2d","b1091be9f09441c0b359986f144bec62","d8ad7fed7e7f4d858122658213c2ff4f"]},"id":"QmUBVEnvCDJv","executionInfo":{"status":"ok","timestamp":1727858188216,"user_tz":-540,"elapsed":113484,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"2e01341f-3fc8-4a9a-d242-c49603bb6762"},"outputs":[{"output_type":"stream","name":"stdout","text":["🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n","==((====))== Unsloth 2024.9.post4: Fast Gemma2 patching. Transformers = 4.44.2.\n"," \\\\ /| GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.\n","O^O/ \\_/ \\ Pytorch: 2.4.1+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.\n","\\ / Bfloat16 = FALSE. FA [Xformers = 0.0.28.post1. FA2 = False]\n"," \"-____-\" Free Apache license: http://github.com/unslothai/unsloth\n","Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n"]},{"output_type":"display_data","data":{"text/plain":["model.safetensors: 0%| | 0.00/6.13G [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"0a6c54edc08e44ee9cf092414a4e42ec"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["generation_config.json: 0%| | 0.00/190 [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"deacfebe3871438abdc90b60dad9a32f"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["tokenizer_config.json: 0%| | 0.00/46.4k [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"8a3ec41fc0be4616997556aab65744fd"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["tokenizer.model: 0%| | 0.00/4.24M [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"fa50ca8ae1374b748d69e4b6541a5a72"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["special_tokens_map.json: 0%| | 0.00/636 [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"20ac96f8e5674c1bb3204efafb951d7d"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["tokenizer.json: 0%| | 0.00/17.5M [00:00, ?B/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"15737b9be0544c10b7903dabda8653b1"}},"metadata":{}}],"source":["from unsloth import FastLanguageModel\n","import torch\n","max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!\n","dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+\n","load_in_4bit = True # Use 4bit quantization to reduce memory usage. 
Can be False.\n","\n","# 4bit pre quantized models we support for 4x faster downloading + no OOMs.\n","fourbit_models = [\n"," \"unsloth/Meta-Llama-3.1-8B-bnb-4bit\", # Llama-3.1 15 trillion tokens model 2x faster!\n"," \"unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit\",\n"," \"unsloth/Meta-Llama-3.1-70B-bnb-4bit\",\n"," \"unsloth/Meta-Llama-3.1-405B-bnb-4bit\", # We also uploaded 4bit for 405b!\n"," \"unsloth/Mistral-Nemo-Base-2407-bnb-4bit\", # New Mistral 12b 2x faster!\n"," \"unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit\",\n"," \"unsloth/mistral-7b-v0.3-bnb-4bit\", # Mistral v3 2x faster!\n"," \"unsloth/mistral-7b-instruct-v0.3-bnb-4bit\",\n"," \"unsloth/Phi-3.5-mini-instruct\", # Phi-3.5 2x faster!\n"," \"unsloth/Phi-3-medium-4k-instruct\",\n"," \"unsloth/gemma-2-9b-bnb-4bit\",\n"," \"unsloth/gemma-2-27b-bnb-4bit\", # Gemma 2x faster!\n","] # More models at https://huggingface.co/unsloth\n","\n","model, tokenizer = FastLanguageModel.from_pretrained(\n"," model_name = \"unsloth/gemma-2-9b\",\n"," max_seq_length = max_seq_length,\n"," dtype = dtype,\n"," load_in_4bit = load_in_4bit,\n"," # token = \"hf_...\", # use one if using gated models like meta-llama/Llama-2-7b-hf\n",")"]},{"cell_type":"markdown","source":["We now add LoRA adapters so we only need to update 1 to 10% of all parameters!"],"metadata":{"id":"SXd9bTZd1aaL"}},{"cell_type":"code","execution_count":3,"metadata":{"id":"6bZsfBuZDeCL","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1727858393516,"user_tz":-540,"elapsed":9745,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"39f4b630-aee4-45fe-b29f-a0008020fa24"},"outputs":[{"output_type":"stream","name":"stderr","text":["Unsloth 2024.9.post4 patched 42 layers with 42 QKV layers, 42 O layers and 42 MLP layers.\n"]}],"source":["model = FastLanguageModel.get_peft_model(\n"," model,\n"," r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n"," target_modules = [\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n"," \"gate_proj\", \"up_proj\", \"down_proj\",],\n"," lora_alpha = 64, # 2x the rank\n"," lora_dropout = 0, # Supports any, but = 0 is optimized\n"," bias = \"none\", # Supports any, but = \"none\" is optimized\n"," # [NEW] \"unsloth\" uses 30% less VRAM, fits 2x larger batch sizes!\n"," use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for very long context\n"," random_state = 3407,\n"," use_rslora = False, # We support rank stabilized LoRA\n"," loftq_config = None, # And LoftQ\n",")"]},
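{"cell_type":"markdown","source":["As a quick sanity check, `print_trainable_parameters()` (a PEFT method available on the model returned above) reports how few weights LoRA actually trains: with `r = 32` on gemma-2-9b this is roughly 108M parameters, about 1% of the model."],"metadata":{}},{"cell_type":"code","source":["# Optional sanity check: only the LoRA adapter weights are trainable (~1% of the model).\n","model.print_trainable_parameters()"],"metadata":{},"execution_count":null,"outputs":[]},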
{"cell_type":"markdown","source":["Note: loading the base model in 4bit and adding LoRA adapters (as above) is already QLoRA. The commented-out variant below additionally shows LoftQ initialization."],"metadata":{"id":"0EWofs1z6myP"}},{"cell_type":"code","source":["# model = FastLanguageModel.get_peft_model(\n","# model,\n","# r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n","# target_modules = [\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n","# \"gate_proj\", \"up_proj\", \"down_proj\",],\n","# lora_alpha = 16,\n","# lora_dropout = 0, # Supports any, but = 0 is optimized\n","# bias = \"none\", # Supports any, but = \"none\" is optimized\n","# # [NEW] \"unsloth\" uses 30% less VRAM, fits 2x larger batch sizes!\n","# use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for very long context\n","# random_state = 3407,\n","# use_rslora = False, # We support rank stabilized LoRA\n","# # LoftQ is configured via peft.LoftQConfig (needs `from peft import LoftQConfig`):\n","# loftq_config = LoftQConfig(loftq_bits = 4), # 4bit LoftQ initialization\n","# )"],"metadata":{"id":"jJ0SNz2Q6mlZ"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["# Load Data"],"metadata":{"id":"ZskSVjBKU9-L"}},{"cell_type":"code","source":["from google.colab import drive\n","drive.mount('/content/drive')"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"9ZUKAZ_oVImS","executionInfo":{"status":"ok","timestamp":1727858444817,"user_tz":-540,"elapsed":20771,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"63fd8d1d-1696-424c-d6df-41643a0783cd"},"execution_count":4,"outputs":[{"output_type":"stream","name":"stdout","text":["Mounted at /content/drive\n"]}]},{"cell_type":"code","source":["cd /content/drive/MyDrive/Google MLB 2024/Gemma Sprint/data"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"FYdtxQp9Vw6K","executionInfo":{"status":"ok","timestamp":1727858483832,"user_tz":-540,"elapsed":381,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"132b2066-89e1-41de-a689-04d7f3d7f3ec"},"execution_count":6,"outputs":[{"output_type":"stream","name":"stdout","text":["/content/drive/MyDrive/Google MLB 2024/Gemma Sprint/data\n"]}]},{"cell_type":"code","source":["from datasets import load_dataset\n","\n","# Path to the JSON file\n","json_file = \"우리말샘_QA_dataset_전체.json\""],"metadata":{"id":"GaE6btRMVA50","executionInfo":{"status":"ok","timestamp":1727858493412,"user_tz":-540,"elapsed":397,"user":{"displayName":"최현진","userId":"12812404047517020320"}}},"execution_count":8,"outputs":[]},{"cell_type":"markdown","source":["\n","### Data Prep\n","We keep the [Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html) prompt format (see [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) for the original data), but train on the custom Korean QA dataset loaded above: `우리말샘_QA_dataset_전체.json` (\"우리말샘\" is Urimalsaem, the Open Korean Dictionary). You can replace this code section with your own data prep.\n","\n","**[NOTE]** To train only on completions (ignoring the user's input) read TRL's docs [here](https://huggingface.co/docs/trl/sft_trainer#train-on-completions-only) - a commented sketch is also included just before the trainer below.\n","\n","**[NOTE]** Remember to add the **EOS_TOKEN** to the tokenized output!!
Otherwise you'll get infinite generations!\n","\n","If you want to use the `llama-3` template for ShareGPT datasets, try our conversational [notebook](https://colab.research.google.com/drive/1XamvWYinY6FOSX9GLvnqSjjsNflxdhNc?usp=sharing).\n","\n","For text completions like novel writing, try this [notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing)."],"metadata":{"id":"vITh0KVJ10qX"}},{"cell_type":"code","execution_count":9,"metadata":{"id":"LjY75GoYUCB8","colab":{"base_uri":"https://localhost:8080/","height":81,"referenced_widgets":["ef997c8aba40422780940ca0747c4614","5f25d2216afe408b8d8b4ccb75da5840","e2fb0d0600c046008617abc45da4962c","0fe489f0907946ff8b755226f02cab45","3b0e9f300feb4867bd3850aa96d0e63a","b1ffa237c46d447bbaf88ca050d8b6de","e9744595dd2640209c6dd70c9c09eb84","411bc890289345fbb8c1eec04cb8ee55","0015b82880034b85981aeb78daa91ab2","492f49a7f115445daebec8851e199de8","96258cd3ae914460bba000d71d416e54","1faf886b995e472e8666ed01d78e2ee6","94ec88e90b95488d876d7d3e6e876d4f","c53b9f12fc0b410aa94fc94fb7f91e7e","96e7acc07a894f0d8c12df3f47d62a54","9565507bc58a40e58b536c1333d0da06","483f3084119141ce8a0429cd31e903f6","eac58a2cf5564476a0d06e8da5419a8f","79bf783adbb341d8b47fd5d3876f799b","64599a388e374269a5bbb0df3228824f","ea2ce876b2d142989e0fc9615baa6b2f","006650d347494172b83d27a97f0d0fae"]},"executionInfo":{"status":"ok","timestamp":1727858514436,"user_tz":-540,"elapsed":19039,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"4b14276c-e70a-4e6c-b2bb-bf96b8660846","collapsed":true},"outputs":[{"output_type":"display_data","data":{"text/plain":["Generating train split: 0 examples [00:00, ? examples/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"ef997c8aba40422780940ca0747c4614"}},"metadata":{}},{"output_type":"display_data","data":{"text/plain":["Map: 0%| | 0/1181401 [00:00, ? examples/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"1faf886b995e472e8666ed01d78e2ee6"}},"metadata":{}}],"source":["alpaca_prompt = \"\"\"Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request.\n","\n","### Instruction:\n","{}\n","\n","### Input:\n","{}\n","\n","### Response:\n","{}\"\"\"\n","\n","EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN\n","def formatting_prompts_func(examples):\n"," instructions = examples[\"instruction\"]\n"," inputs = examples[\"input\"]\n"," outputs = examples[\"output\"]\n"," texts = []\n"," for instruction, input, output in zip(instructions, inputs, outputs):\n"," # Must add EOS_TOKEN, otherwise your generation will go on forever!\n"," text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN\n"," texts.append(text)\n"," return { \"text\" : texts, }\n","pass"]},{"cell_type":"code","source":["# Load data\n","from datasets import load_dataset\n","\n","dataset = load_dataset(\"json\", data_files=json_file)\n","\n","# Shuffle data\n","shuffled_dataset = dataset[\"train\"].shuffle(seed=3407)\n","\n","# Split train/test data (90:10)\n","train_test_split = shuffled_dataset.train_test_split(test_size=0.1)\n","train_dataset = train_test_split[\"train\"]\n","test_dataset = train_test_split[\"test\"]\n","dataset = train_dataset.map(formatting_prompts_func, batched = True,)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":49,"referenced_widgets":["674ecbbecc64450198992a7735460e07","b4bb93af94a14918910064559e6f3c53","d837748b23514b769c1b78f6d1a62ac9","5fde99f56d9e43869fcf073c61df7bd9","35fb0e04580f4a3cb946f9611d5cd210","35834a01131b4d56af416433e2c28a08","bb94d7eaffe14e17bd2e1a5bd59b47a2","92296ea7e5a6487a9571d98364417b98","b2a5d3c1a9ea4cd6a34c6023476f290c","eccfa1faa3b84d3cb0138ef2f30bfa9f","53a4c3365930429095c818ecb14b1c80"]},"id":"TWBIGzAhaUUY","executionInfo":{"status":"ok","timestamp":1727858884024,"user_tz":-540,"elapsed":26339,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"7f142857-c093-4cf0-9524-e17795fa462a"},"execution_count":11,"outputs":[{"output_type":"display_data","data":{"text/plain":["Map: 0%| | 0/1063260 [00:00, ? examples/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"674ecbbecc64450198992a7735460e07"}},"metadata":{}}]},{"cell_type":"markdown","source":["\n","### Train the model\n","Now let's use Hugging Face TRL's `SFTTrainer`! More docs here: [TRL SFT docs](https://huggingface.co/docs/trl/sft_trainer). We do 60 steps to speed things up, but for a full run you can set `num_train_epochs = 1` and remove `max_steps = 60`. We also support TRL's `DPOTrainer`!"],"metadata":{"id":"idAEIeSQ3xdS"}},
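{"cell_type":"markdown","source":["Optional: to train on completions only (masking the instruction and input from the loss, as noted in the Data Prep section), TRL provides `DataCollatorForCompletionOnlyLM`. The sketch below is left commented out: it assumes the `### Response:` marker from `alpaca_prompt` and requires `packing = False`."],"metadata":{}},{"cell_type":"code","source":["# Optional sketch (not executed in this run): mask everything before\n","# \"### Response:\" so the loss is computed only on the answer tokens.\n","# from trl import DataCollatorForCompletionOnlyLM\n","# collator = DataCollatorForCompletionOnlyLM(\n","# response_template = \"### Response:\", # must match alpaca_prompt exactly\n","# tokenizer = tokenizer,\n","# )\n","# Then pass `data_collator = collator` to the SFTTrainer below."],"metadata":{},"execution_count":null,"outputs":[]},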
examples/s]"],"application/vnd.jupyter.widget-view+json":{"version_major":2,"version_minor":0,"model_id":"4758ab6d84ed4eb99ea0474c6eb97469"}},"metadata":{}},{"output_type":"stream","name":"stderr","text":["max_steps is given, it will override any value given in num_train_epochs\n"]}],"source":["from trl import SFTTrainer\n","from transformers import TrainingArguments\n","from unsloth import is_bfloat16_supported\n","\n","trainer = SFTTrainer(\n"," model = model,\n"," tokenizer = tokenizer,\n"," train_dataset = dataset,\n"," dataset_text_field = \"text\",\n"," max_seq_length = max_seq_length,\n"," dataset_num_proc = 2,\n"," packing = False, # Can make training 5x faster for short sequences.\n"," args = TrainingArguments(\n"," per_device_train_batch_size = 2,\n"," gradient_accumulation_steps = 4,\n"," warmup_steps = 5,\n"," max_steps = 60,\n"," learning_rate = 2e-4,\n"," fp16 = not is_bfloat16_supported(),\n"," bf16 = is_bfloat16_supported(),\n"," logging_steps = 1,\n"," optim = \"adamw_8bit\",\n"," weight_decay = 0.01,\n"," lr_scheduler_type = \"linear\",\n"," seed = 3407,\n"," output_dir = \"outputs\",\n"," ),\n",")"]},{"cell_type":"code","execution_count":13,"metadata":{"id":"2ejIt2xSNKKp","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1727859087657,"user_tz":-540,"elapsed":11,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"88ba7eef-ac5a-4050-b1db-9443d4786b34"},"outputs":[{"output_type":"stream","name":"stdout","text":["GPU = Tesla T4. Max memory = 14.748 GB.\n","7.029 GB of memory reserved.\n"]}],"source":["#@title Show current memory stats\n","gpu_stats = torch.cuda.get_device_properties(0)\n","start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n","max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n","print(f\"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.\")\n","print(f\"{start_gpu_memory} GB of memory reserved.\")"]},{"cell_type":"code","execution_count":14,"metadata":{"id":"yqxqAZ7KJ4oL","colab":{"base_uri":"https://localhost:8080/","height":1000},"outputId":"5800975d-db6a-420b-e7a2-71788a48cfcb","executionInfo":{"status":"ok","timestamp":1727859483680,"user_tz":-540,"elapsed":396030,"user":{"displayName":"최현진","userId":"12812404047517020320"}}},"outputs":[{"output_type":"stream","name":"stderr","text":["==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1\n"," \\\\ /| Num examples = 1,063,260 | Num Epochs = 1\n","O^O/ \\_/ \\ Batch size per device = 2 | Gradient Accumulation steps = 4\n","\\ / Total batch size = 8 | Total steps = 60\n"," \"-____-\" Number of trainable parameters = 108,036,096\n"]},{"output_type":"display_data","data":{"text/plain":["Step | \n","Training Loss | \n","
---|---\n","
1 | 3.503900\n","
2 | 3.631400\n","
3 | 2.507900\n","
4 | 2.205400\n","
5 | 1.860200\n","
6 | 1.689300\n","
7 | 1.448900\n","
8 | 1.335800\n","
9 | 1.153200\n","
10 | 1.111900\n","
11 | 1.322200\n","
12 | 1.216700\n","
13 | 1.264900\n","
14 | 1.095600\n","
15 | 1.223100\n","
16 | 1.429300\n","
17 | 0.979800\n","
18 | 1.066300\n","
19 | 0.908500\n","
20 | 1.270700\n","
21 | 1.190300\n","
22 | 0.921000\n","
23 | 1.206700\n","
24 | 1.217600\n","
25 | 0.934000\n","
26 | 1.111900\n","
27 | 1.058500\n","
28 | 1.057700\n","
29 | 1.373700\n","
30 | 1.173400\n","
31 | 1.126800\n","
32 | 0.824900\n","
33 | 1.174500\n","
34 | 0.995500\n","
35 | 1.107000\n","
36 | 1.103700\n","
37 | 1.178300\n","
38 | 1.075800\n","
39 | 1.021900\n","
40 | 1.062400\n","
41 | 1.183900\n","
42 | 0.990900\n","
43 | 1.172100\n","
44 | 1.002000\n","
45 | 0.996900\n","
46 | 1.125000\n","
47 | 1.103000\n","
48 | 0.770400\n","
49 | 1.296300\n","
50 | 1.099700\n","
51 | 0.834700\n","
52 | 0.980000\n","
53 | 0.996300\n","
54 | 1.041800\n","
55 | 0.963500\n","
56 | 0.983300\n","
57 | 1.141800\n","
58 | 0.878200\n","
59 | 0.947400\n","
60 | 1.079900\n","
"]},"metadata":{}}],"source":["trainer_stats = trainer.train()"]},{"cell_type":"code","execution_count":15,"metadata":{"id":"pCqnaKmlO1U9","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1727859979984,"user_tz":-540,"elapsed":380,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"41137a73-c4b7-4166-a4ea-9405793a4d8f"},"outputs":[{"output_type":"stream","name":"stdout","text":["391.8304 seconds used for training.\n","6.53 minutes used for training.\n","Peak reserved memory = 9.307 GB.\n","Peak reserved memory for training = 2.278 GB.\n","Peak reserved memory % of max memory = 63.107 %.\n","Peak reserved memory for training % of max memory = 15.446 %.\n"]}],"source":["#@title Show final memory and time stats\n","used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n","used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n","used_percentage = round(used_memory /max_memory*100, 3)\n","lora_percentage = round(used_memory_for_lora/max_memory*100, 3)\n","print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n","print(f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\")\n","print(f\"Peak reserved memory = {used_memory} GB.\")\n","print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n","print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n","print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")"]},{"cell_type":"markdown","source":["\n","### Inference\n","Let's run the model! You can change the instruction and input - leave the output blank!"],"metadata":{"id":"ekOmTR1hSNcr"}},{"cell_type":"code","source":["# alpaca_prompt = Copied from above\n","FastLanguageModel.for_inference(model) # Enable native 2x faster inference\n","inputs = tokenizer(\n","[\n"," alpaca_prompt.format(\n"," \"Continue the fibonnaci sequence.\", # instruction\n"," \"1, 1, 2, 3, 5, 8\", # input\n"," \"\", # output - leave this blank for generation!\n"," )\n","], return_tensors = \"pt\").to(\"cuda\")\n","\n","outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)\n","tokenizer.batch_decode(outputs)"],"metadata":{"id":"kR3gIAX-SM2q","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1727860012915,"user_tz":-540,"elapsed":12755,"user":{"displayName":"최현진","userId":"12812404047517020320"}},"outputId":"0658031a-6e3d-4b69-d09a-6b1ccc7d68c4"},"execution_count":16,"outputs":[{"output_type":"execute_result","data":{"text/plain":["['