{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "47fPyWltjSqE"
},
"outputs": [],
"source": [
"!pip install transformers sentencepiece"
]
},
{
"cell_type": "code",
"source": [
"from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer\n",
"\n",
"hi_text = \"जीवन एक चॉकलेट बॉक्स की तरह है।\"\n",
"chinese_text = \"生活就像一盒巧克力。\"\n",
"\n",
"model = M2M100ForConditionalGeneration.from_pretrained(\"facebook/m2m100_1.2B\")\n",
"model.eval()\n",
"\"\"\"\n",
"在PyTorch中,`model.eval()`是用来将模型设置为评估(evaluation)模式的方法。在深度学习中,训练和评估两个阶段的模型行为可能会有所不同。以下是`model.eval()`的主要作用:\n",
"\n",
"1. **Batch Normalization和Dropout的影响:**\n",
"- 在训练阶段,`Batch Normalization`和`Dropout`等层的行为通常是不同的。在训练时,`Batch Normalization`使用批次统计信息来规范化输入,而`Dropout`层会随机丢弃一些神经元。在评估阶段,我们通常希望使用整个数据集的统计信息来规范化,而不是每个批次的统计信息,并且不再需要随机丢弃神经元。因此,通过执行`model.eval()`,模型会切换到评估模式,从而确保这些层的行为在评估时是正确的。\n",
"\n",
"2. **梯度计算的关闭:**\n",
"- 在评估模式下,PyTorch会关闭自动求导(autograd)的计算图,这样可以避免不必要的梯度计算和内存消耗。在训练时,我们通常需要计算梯度以进行反向传播和参数更新,而在评估时,我们只对模型的前向传播感兴趣,因此关闭梯度计算可以提高评估的速度和减少内存使用。\n",
"\n",
"总的来说,执行`model.eval()`是为了确保在评估阶段模型的行为和性能是正确的,并且可以提高评估时的效率。\n",
"\"\"\"\n",
"tokenizer = M2M100Tokenizer.from_pretrained(\"facebook/m2m100_1.2B\")"
],
"metadata": {
"id": "ziPisPX_jXNC"
},
"execution_count": null,
"outputs": []
},
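{
"cell_type": "code",
"source": [
"# A minimal sketch, not part of the original notebook: model.eval() only toggles\n",
"# the training flag on modules such as Dropout; it does not disable gradient\n",
"# tracking. torch.no_grad() is what skips autograd bookkeeping at inference time.\n",
"import torch\n",
"\n",
"print(model.training)  # False, because model.eval() was called above\n",
"\n",
"tokenizer.src_lang = \"hi\"\n",
"encoded = tokenizer(hi_text, return_tensors=\"pt\")\n",
"with torch.no_grad():\n",
"    encoder_out = model.get_encoder()(**encoded)\n",
"print(encoder_out.last_hidden_state.requires_grad)  # False inside no_grad()"
],
"metadata": {},
"execution_count": null,
"outputs": []
},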
{
"cell_type": "code",
"source": [
"# translate Hindi to French\n",
"tokenizer.src_lang = \"hi\"\n",
"encoded_hi = tokenizer(hi_text, return_tensors=\"pt\")\n",
"generated_tokens = model.generate(\n",
" **encoded_hi, forced_bos_token_id=tokenizer.get_lang_id(\"fr\")\n",
")\n",
"tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "00h7PwrOjehw",
"outputId": "eb4e92ec-5e00-452d-8ead-d06d2e23b78e"
},
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['La vie est comme une boîte de chocolat.']"
]
},
"metadata": {},
"execution_count": 3
}
]
},
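{
"cell_type": "code",
"source": [
"# A hedged sketch, not part of the original notebook: the same encoded Hindi input\n",
"# can be decoded into other target languages just by changing forced_bos_token_id,\n",
"# which tokenizer.get_lang_id() maps from a language code to the token that forces\n",
"# that target language at the start of generation.\n",
"for target in [\"en\", \"de\", \"ta\"]:\n",
"    out = model.generate(\n",
"        **encoded_hi, forced_bos_token_id=tokenizer.get_lang_id(target)\n",
"    )\n",
"    print(target, tokenizer.batch_decode(out, skip_special_tokens=True)[0])"
],
"metadata": {},
"execution_count": null,
"outputs": []
},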
{
"cell_type": "code",
"source": [
"# translate Chinese to English\n",
"tokenizer.src_lang = \"zh\"\n",
"encoded_zh = tokenizer(chinese_text, return_tensors=\"pt\")\n",
"generated_tokens = model.generate(\n",
" **encoded_zh, forced_bos_token_id=tokenizer.get_lang_id(\"en\")\n",
")\n",
"tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ifzvH6Ezj62j",
"outputId": "c5c6307d-5811-4978-f565-709e22d4a16b"
},
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['Life is like a box of chocolate.']"
]
},
"metadata": {},
"execution_count": 4
}
]
},
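{
"cell_type": "code",
"source": [
"# A small helper (the function name is my own, not part of the original notebook)\n",
"# wrapping the pattern above: set the tokenizer's source language, encode,\n",
"# generate with forced_bos_token_id for the target language, then decode.\n",
"def translate(text, src_lang, tgt_lang):\n",
"    tokenizer.src_lang = src_lang\n",
"    encoded = tokenizer(text, return_tensors=\"pt\")\n",
"    tokens = model.generate(\n",
"        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)\n",
"    )\n",
"    return tokenizer.batch_decode(tokens, skip_special_tokens=True)[0]\n",
"\n",
"print(translate(chinese_text, \"zh\", \"fr\"))\n",
"print(translate(hi_text, \"hi\", \"en\"))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},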
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "YwHxXY-RkDPH"
},
"execution_count": null,
"outputs": []
}
]
} |