{ "cells": [ { "cell_type": "markdown", "id": "883e3354-1538-4f98-bf42-67552215bba3", "metadata": {}, "source": [ "# Setup" ] }, { "cell_type": "code", "execution_count": 34, "id": "b45bd52f-03e9-419f-8110-1013ff45fb1b", "metadata": { "tags": [] }, "outputs": [], "source": [ "from huggingface_hub import InferenceClient, login" ] }, { "cell_type": "code", "execution_count": 35, "id": "dc9f0411-8bf2-4a20-a6ea-331a2a486b8e", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fcb8e82880df4053899633bfe3c2220f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='
\n", "
\n", " \n", "
\n", "
\n", " Warning: You will need to point to a model/deployment that is running.\n", "
\n", "\n" ] }, { "cell_type": "code", "execution_count": 36, "id": "84e6cb89-30d3-4ef5-8063-07783798e045", "metadata": {}, "outputs": [], "source": [ "MODEL = \"CohereForAI/c4ai-command-r-plus\"\n", "client = InferenceClient(MODEL)" ] }, { "cell_type": "markdown", "id": "f5fe63f8-dea2-4c61-b6ce-29f173e4c4eb", "metadata": {}, "source": [ "# Translation" ] }, { "cell_type": "markdown", "id": "c5bee77b-da1c-43b3-ab14-d15e871f7502", "metadata": {}, "source": [ "## Baseline\n", "\n", "For our baseline, we translate with a simple system prompt and a simple instruction." ] }, { "cell_type": "markdown", "id": "6cea05d6-afb7-4829-b130-d4bcfe549acb", "metadata": {}, "source": [ "### Analysis" ] }, { "cell_type": "markdown", "id": "a98b9b67-e68b-43b2-b8e9-0ed1cf85591f", "metadata": {}, "source": [ "### System Prompt\n", "This is a pretty basic system prompt. We give the model a role and an assumed level of expertise. We also push for traits like \"highly motivated and detail-oriented\".\n", "\n", "> You are a skilled translator with extensive experience in English to Arabic translations. You possess a deep understanding of the linguistic, cultural, and contextual nuances essential for accurate and effective translation between these languages. Highly motivated and detail-oriented, you are committed to delivering translations that maintain the integrity and intent of the original text. Your role is crucial in ensuring clear and precise communication in our multilingual system." ] }, { "cell_type": "code", "execution_count": null, "id": "032c86d2-868e-4fa6-b03e-58f1c41434cc", "metadata": { "tags": [] }, "outputs": [], "source": [ "system_prompt = \"\"\"You are a skilled translator with extensive experience in English to Arabic translations. You possess a deep understanding of the linguistic, cultural, and contextual nuances essential for accurate and effective translation between these languages. 
Highly motivated and detail-oriented, you are committed to delivering translations that maintain the integrity and intent of the original text. Your role is crucial in ensuring clear and precise communication in our multilingual system.\"\"\"" ] }, { "cell_type": "markdown", "id": "803ddeba-03de-4f13-95d1-5fb097058cf2", "metadata": {}, "source": [ "### Prompt\n", "> Translate this from english to arabic: {en_input}.\n", ">\n", "> Translation: \n", "\n", "Again, we use a simple prompt to get a translation." ] }, { "cell_type": "code", "execution_count": 53, "id": "b7f1722c-c484-4e22-a025-53f95943fc76", "metadata": {}, "outputs": [], "source": [ "def baseline_chat_completion(system_prompt, en_input):\n", "    \"\"\"\n", "    Translate `en_input` from English to Arabic using the given system prompt.\n", "\n", "    Returns the full chat completion response, including token usage.\n", "    \"\"\"\n", "    messages = [\n", "        {\"role\": \"system\", \"content\": system_prompt},\n", "        {\"role\": \"user\", \"content\": f\"Translate this from english to arabic: {en_input}.\\nTranslation: \"},\n", "    ]\n", "    return client.chat_completion(messages, max_tokens=10_000)" ] }, { "cell_type": "code", "execution_count": 50, "id": "96a0ba0b-be47-4eb0-bbc2-c82b0ea1b72e", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "120" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "en_input = \"Float like a butterfly sting like a bee – his hands can’t hit what his eyes can’t see.\"\n", "response = baseline_chat_completion(system_prompt, en_input)" ] }, { "cell_type": "markdown", "id": "2bca574c-461d-4822-b0dd-b12a3b9846b3", "metadata": {}, "source": [ "### Token Cost\n", "Here we can see that the cost is quite low: the prompt uses only 120 tokens." 
] }, { "cell_type": "code", "execution_count": 52, "id": "4e305b1e-56e0-44da-8c17-496cbcc35fad", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "120" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response.usage.prompt_tokens" ] }, { "cell_type": "code", "execution_count": 51, "id": "ef24fe6b-d801-4f3e-95ad-cb7f67247bc3", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'يرفرف مثل الفراشة ويلسع كالنحلة - يديه لا يمكن أن تصيب ما لا تستطيع عينيه رؤيته.'" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response.choices[0].message.content" ] }, { "cell_type": "markdown", "id": "3a9cdf02-d590-4bcf-a7a8-e6b7817ba715", "metadata": {}, "source": [ "## Purpose-Driven Translation\n", "\n", "[arXiv:2308.01391](https://arxiv.org/pdf/2308.01391)" ] }, { "cell_type": "code", "execution_count": null, "id": "12e57d07-9e86-426b-be42-e932699d1fe2", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 5 }