nicholasKluge
/

Aira-2-portuguese-124M


nicholasKluge committed on Nov 5, 2023

Commit 239b81d • 1 Parent(s): 922df51

Upload lm_evaluation_harness_pt.ipynb

Files changed (1)

lm_evaluation_harness_pt.ipynb +81 -0

lm_evaluation_harness_pt.ipynb ADDED

	@@ -0,0 +1,81 @@

+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Ac6wadk3rmkK"
+      },
+      "source": [
+        "# LM Evaluation Harness (by [EleutherAI](https://www.eleuther.ai/) & [Laiviet](https://github.com/laiviet/lm-evaluation-harness))\n",
+        "\n",
+        "This [`LM-Evaluation-Harness`](https://github.com/EleutherAI/lm-evaluation-harness) provides a unified framework to test generative language models on a large number of different evaluation tasks. For a complete list of available tasks, see the [task table](https://github.com/EleutherAI/lm-evaluation-harness/blob/master/docs/task_table.md), or scroll to the bottom of the page.\n",
+        "\n",
+        "1. Clone the [Laiviet fork of lm-evaluation-harness](https://github.com/laiviet/lm-evaluation-harness), which the cell below uses, and install the necessary libraries (`sentencepiece` is required for the Llama tokenizer)."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "UA5I86u91e0A"
+      },
+      "outputs": [],
+      "source": [
+        "!git clone --branch main https://github.com/laiviet/lm-evaluation-harness.git\n",
+        "!cd lm-evaluation-harness && pip install -e . -q\n",
+        "!pip install cohere tiktoken sentencepiece -q"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "pnHoAVK25QZn"
+      },
+      "outputs": [],
+      "source": [
+        "!huggingface-cli login --token $HF_TOKEN\n",
+        "!cd lm-evaluation-harness && python main.py \\\n",
+        "    --model hf-auto \\\n",
+        "    --model_args pretrained=nicholasKluge/Aira-2-portuguese-124M \\\n",
+        "    --tasks arc_pt,truthfulqa_pt \\\n",
+        "    --device cuda:0 \\\n",
+        "    --model_alias Aira-2-portuguese-124M \\\n",
+        "    --task_alias open_llm"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "4Bm78wiZ4Own"
+      },
+      "source": [
+        "## Task Table 📚\n",
+        "\n",
+        "| Task Name     | Train | Val | Test | Val/Test Docs | Metrics       |\n",
+        "|---------------|-------|-----|------|--------------:|---------------|\n",
+        "| arc_pt        | ✓     | ✓   | ✓    | 1172          | acc, acc_norm |\n",
+        "| hellaswag_pt  | ✓     | ✓   |      | 10042         | acc, acc_norm |\n",
+        "| mmlu_pt       |       | ✓   | ✓    | 1662          | acc, acc_norm |\n",
+        "| truthfulqa_pt |       | ✓   |      | 817           | mc1, mc2      |"
+      ]
+    }
+  ],
+  "metadata": {
+    "accelerator": "GPU",
+    "colab": {
+      "provenance": [],
+      "machine_shape": "hm",
+      "gpuType": "T4"
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
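
The evaluation cell in the notebook above passes a fixed set of flags to `main.py`. As a minimal sketch (not part of the commit), the same invocation could be assembled in Python so that the model ID and task list are easy to swap. The helper name `build_eval_command` is hypothetical, and the flags are copied from the notebook cell rather than checked against the fork's command-line parser.

```python
import shlex


def build_eval_command(model_id, tasks, device="cuda:0", task_alias="open_llm"):
    """Assemble the `main.py` invocation from the notebook as an argv list.

    Hypothetical helper: the flag names mirror the notebook cell and are
    assumed to be accepted by the fork's `main.py`.
    """
    # The notebook uses the repository name (without the owner) as the alias.
    model_alias = model_id.split("/")[-1]
    return [
        "python", "main.py",
        "--model", "hf-auto",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--device", device,
        "--model_alias", model_alias,
        "--task_alias", task_alias,
    ]


cmd = build_eval_command(
    "nicholasKluge/Aira-2-portuguese-124M", ["arc_pt", "truthfulqa_pt"]
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To actually run it (requires the cloned repository and a GPU):
# import subprocess
# subprocess.run(cmd, cwd="lm-evaluation-harness", check=True)
```

Building the command as a list keeps each argument separate, so a model ID or task name is never split or reinterpreted by the shell.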