---
title: NLP App
emoji: 
colorFrom: indigo
colorTo: indigo
sdk: streamlit
sdk_version: 1.31.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

## NLP App
# Streamlit app with NLP 💡
Elbrus Bootcamp | Phase-2 | Team Project 

## Team🧑🏻‍💻
1. [Awlly](https://github.com/Awlly)
2. [sakoser](https://github.com/sakoser)
3. [whoisida](https://github.com/whoisida)

## Task 📌
Build three services: one that classifies movie reviews as good, neutral, or bad; one that classifies user input as toxic or non-toxic; and a GPT-2-based text generation service trained to emulate a particular author's writing.

## Contents 📝
1. Movie review classification using LSTM, ruBert, and BOW 💨 [Dataset](https://drive.google.com/file/d/1c92sz81bEfOw-rutglKpmKGm6rySmYbt/view?usp=sharing)
2. Toxicity classification of user input (toxic or non-toxic) using ruBert-tiny-toxicity 📑 [Dataset](https://drive.google.com/file/d/1O7orH9CrNEhnbnA5KjXji8sgrn6iD5n-/view?usp=drive_link)
3. GPT-2-based text generation service
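
The bag-of-words (BOW) approach named in item 1 can be illustrated with a short sketch. This is not the project's code: the function names (`tokenize`, `build_vocabulary`, `vectorize`) and the toy reviews are hypothetical, and tokenization is assumed to be a plain lowercase word split. It only shows how a review might be turned into a token-count vector.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_vocabulary(texts: list[str]) -> dict[str, int]:
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(text: str, vocabulary: dict[str, int]) -> list[int]:
    """Return the token-count vector of `text`; unknown tokens are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical toy data, used only to demonstrate the shapes involved.
reviews = ["Great movie, great cast", "Boring movie"]
vocab = build_vocabulary(reviews)
features = [vectorize(review, vocab) for review in reviews]
```

Count vectors like `features` could then be passed to a linear classifier such as scikit-learn's `LogisticRegression`, which appears in the Libraries list; whether the project combines them exactly this way is an assumption.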

## Deployment 🎈
The service is deployed on [Hugging Face](https://huggingface.co/spaces/Awlly/NLP_app).

## Libraries 📖
```python
import os
import unicodedata
import nltk
from dataclasses import dataclass
import joblib
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision.datasets import ImageFolder
from torchvision import datasets
from torchvision import transforms as T
from torchvision.io import read_image
from torch.utils.data import Dataset, random_split
import torchutils as tu
from transformers import GPT2LMHeadModel, GPT2Tokenizer
from typing import Tuple
from tqdm import tqdm
from transformers import AutoModel, AutoTokenizer
from transformers import AutoModelForSequenceClassification
import pydensecrf.densecrf as dcrf
import pydensecrf.utils as dcrf_utils
from preprocessing import data_preprocessing
import streamlit as st
import string
from sklearn.linear_model import LogisticRegression
import re
from preprocessing import preprocess_single_string
```


## Guide 📜 
#### How to run locally?

1. Create a Python virtual environment for running the code:

    ``python3 -m venv my-env``

2. Activate the new environment:

    * Windows: ```my-env\Scripts\activate.bat```
    * macOS and Linux: ```source my-env/bin/activate```