---
title: NLP App
emoji: ⚡
colorFrom: indigo
colorTo: indigo
sdk: streamlit
sdk_version: 1.31.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

## NLP App

# Streamlit app with NLP 💡

Elbrus Bootcamp | Phase-2 | Team Project

## Team 🧑🏻‍💻

1. [Awlly](https://github.com/Awlly)
2. [sakoser](https://github.com/sakoser)
3. [whoisida](https://github.com/whoisida)

## Task 📌

Create three services: one that classifies movie reviews as good, neutral, or bad; one that classifies user input as toxic or non-toxic; and a GPT-2-based text generation service trained to emulate a particular author's writing.

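As a rough illustration of the final step in both classifiers, the sketch below turns raw model scores into the labels named in the task. It uses only the standard library. The label order in `REVIEW_LABELS`, the function names, and the 0.5 toxicity threshold are assumptions made for this example, not details taken from the project code.

```python
import math
from typing import List, Sequence, Tuple

# Assumed label order for the three-class review model (hypothetical).
REVIEW_LABELS = ("bad", "neutral", "good")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities, subtracting the max for stability."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def review_label(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely review category and its probability."""
    if len(logits) != len(REVIEW_LABELS):
        raise ValueError("expected one score per review category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return REVIEW_LABELS[best], probs[best]


def toxicity_label(logit: float, threshold: float = 0.5) -> Tuple[str, float]:
    """Map a single raw score to 'toxic' or 'non-toxic' via a sigmoid."""
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if logit >= 0:
        prob = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        prob = e / (1.0 + e)
    return ("toxic" if prob >= threshold else "non-toxic"), prob


label, confidence = review_label([0.1, 0.2, 2.5])  # label == "good"
```

Keeping this mapping separate from the models means the Streamlit pages could share it regardless of which classifier produced the scores.
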
## Contents 📝

1. Movie review classification using LSTM, ruBert, and BOW 💨 [Dataset](https://drive.google.com/file/d/1c92sz81bEfOw-rutglKpmKGm6rySmYbt/view?usp=sharing)
2. Toxicity classification of user input (toxic or non-toxic) using ruBert-tiny-toxicity 📑 [Dataset](https://drive.google.com/file/d/1O7orH9CrNEhnbnA5KjXji8sgrn6iD5n-/view?usp=drive_link)
3. GPT-2-based text generation service

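The bag-of-words (BOW) option in item 1 can be pictured with a minimal, standard-library sketch: each review becomes a vector of word counts over a shared vocabulary. The tokenization rule and the function names here are assumptions for illustration; the project's own `preprocessing` module may clean text differently.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of word characters (Cyrillic included)."""
    return re.findall(r"\w+", text.lower())


def build_vocabulary(texts: List[str]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Count how often each vocabulary token occurs; unknown tokens are ignored."""
    vector = [0] * len(vocab)
    for token, count in Counter(tokenize(text)).items():
        index = vocab.get(token)
        if index is not None:
            vector[index] = count
    return vector


reviews = ["Great film, great cast", "Boring film"]
vocab = build_vocabulary(reviews)
features = [vectorize(review, vocab) for review in reviews]
# features == [[2, 1, 1, 0], [0, 1, 0, 1]]
```

Vectors like these would then be passed to a linear classifier, for example the `LogisticRegression` imported in the Libraries section.
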
## Deployment 🎈

The service is deployed on [Hugging Face](https://huggingface.co/spaces/Awlly/NLP_app)

## Libraries 📖

```python
# Standard library
import os
import re
import string
import unicodedata
from dataclasses import dataclass
from typing import Tuple

# Third-party
import joblib
import matplotlib.pyplot as plt
import nltk
import numpy as np
import pydensecrf.densecrf as dcrf
import pydensecrf.utils as dcrf_utils
import streamlit as st
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchutils as tu
from sklearn.linear_model import LogisticRegression
from torch.utils.data import DataLoader, Dataset, TensorDataset, random_split
from torchvision import datasets
from torchvision import transforms as T
from torchvision.datasets import ImageFolder
from torchvision.io import read_image
from tqdm import tqdm
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Local modules
from preprocessing import data_preprocessing, preprocess_single_string
```

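For orientation, here is a hedged sketch of how the `GPT2LMHeadModel` and `GPT2Tokenizer` imports are commonly used for sampling-based generation. The public `gpt2` checkpoint stands in for the team's fine-tuned weights, which are not named in this README. The function names and default sampling values are likewise assumptions.

```python
def check_sampling_params(temperature: float, top_k: int, top_p: float) -> None:
    """Reject sampling settings that this sketch does not support."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")


def generate_text(
    prompt: str,
    model_name: str = "gpt2",  # stand-in; swap in the fine-tuned checkpoint
    max_new_tokens: int = 50,
    temperature: float = 1.0,
    top_k: int = 50,
    top_p: float = 0.95,
) -> str:
    """Sample a continuation of `prompt` from a GPT-2 style model."""
    check_sampling_params(temperature, top_k, top_p)

    # Imported lazily so the module loads even where torch is not installed.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time")` downloads the checkpoint on first use. Lower `temperature` or `top_p` values make the output more conservative.
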
## Guide 📜

#### How to run locally?

1. To create a Python virtual environment for running the code, enter:

   `python3 -m venv my-env`

2. Activate the new environment:

   * Windows: `my-env\Scripts\activate.bat`
   * macOS and Linux: `source my-env/bin/activate`

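To confirm that activation worked, one option is a short standard-library check run with the environment's interpreter. This is a sketch; the function name is an assumption, and it relies on the fact that inside a `venv` environment `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    status = "active" if in_virtual_environment() else "not active"
    print(f"Virtual environment is {status}: {sys.prefix}")
```
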