---
library_name: transformers
tags: []
---
# SUMMARY

A model I am using to learn fine-tuning of 'DialoGPT-medium':
- on a self-made dataset
- with self-made special tokens
- fine-tuned over multiple rounds on a ~30K-example dataset (work in progress)

If you are interested in how I got to this point and how I created the datasets, you can visit:

[Crafting GPT2 for Personalized AI-Preparing Data the Long Way](https://medium.com/@deeokay/the-soul-in-the-machine-crafting-gpt2-for-personalized-ai-9d38be3f635f)
## DECLARING NEW SPECIAL TOKENS
```python
special_tokens_dict = {
    'eos_token': '<|STOP|>',
    'bos_token': '<|STOP|>',
    'pad_token': '<|PAD|>',
    'additional_special_tokens': ['<|BEGIN_QUERY|>', '<|END_QUERY|>',
                                  '<|BEGIN_ANALYSIS|>', '<|END_ANALYSIS|>',
                                  '<|BEGIN_RESPONSE|>', '<|END_RESPONSE|>',
                                  '<|BEGIN_SENTIMENT|>', '<|END_SENTIMENT|>',
                                  '<|BEGIN_CLASSIFICATION|>', '<|END_CLASSIFICATION|>']
}

tokenizer.add_special_tokens(special_tokens_dict)
model.resize_token_embeddings(len(tokenizer))

tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.bos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids('<|PAD|>')
```
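Optionally, a quick sanity check after the block above (nothing here is required; it only prints what the tokenizer and model now report):

```python
# Confirm the special tokens resolved to IDs and the embedding matrix was resized
print("eos:", tokenizer.eos_token, tokenizer.eos_token_id)
print("pad:", tokenizer.pad_token, tokenizer.pad_token_id)
print("vocab size:", len(tokenizer))
print("embedding rows:", model.get_input_embeddings().weight.shape[0])
print("<|BEGIN_QUERY|> id:", tokenizer.convert_tokens_to_ids('<|BEGIN_QUERY|>'))
```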
The order of tokens is as follows:
```python
def combine_text(user_prompt, analysis, sentiment, new_response, classification):
    user_q = f"<|STOP|><|BEGIN_QUERY|>{user_prompt}<|END_QUERY|>"
    analysis = f"<|BEGIN_ANALYSIS|>{analysis}<|END_ANALYSIS|>"
    new_response = f"<|BEGIN_RESPONSE|>{new_response}<|END_RESPONSE|>"
    sentiment = f"<|BEGIN_SENTIMENT|>Sentiment: {sentiment}<|END_SENTIMENT|><|STOP|>"
    classification = f"<|BEGIN_CLASSIFICATION|>{classification}<|END_CLASSIFICATION|>"
    return user_q + analysis + new_response + classification + sentiment
```
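For illustration, here is a made-up row run through `combine_text` (all field values below are invented):

```python
sample = combine_text(
    user_prompt="What is the capital of France?",
    analysis="The user is asking a simple geography question.",
    sentiment="Neutral",
    new_response="The capital of France is Paris.",
    classification="question",
)
print(sample)
# One long string in the order:
# <|STOP|><|BEGIN_QUERY|>...<|END_QUERY|><|BEGIN_ANALYSIS|>...<|END_ANALYSIS|>
# <|BEGIN_RESPONSE|>...<|END_RESPONSE|><|BEGIN_CLASSIFICATION|>...<|END_CLASSIFICATION|>
# <|BEGIN_SENTIMENT|>Sentiment: Neutral<|END_SENTIMENT|><|STOP|>
```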
## INFERENCING
I am currently testing two ways; if anyone knows a better one, please let me know!
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
models_folder = "Deeokay/DialoGPT-special-tokens-medium4"
model = AutoModelForCausalLM.from_pretrained(models_folder)
tokenizer = AutoTokenizer.from_pretrained(models_folder)
# Device configuration <<change as needed>>
device = torch.device("cpu")
model.to(device)
```
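If a GPU is available, a common alternative (plain PyTorch, not specific to this model) is to pick the device dynamically:

```python
# Prefer CUDA when present, otherwise stay on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
```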
### OPTION 1 INFERENCE
```python
import time

class Stopwatch:
    def __init__(self):
        self.start_time = None
        self.end_time = None

    def start(self):
        self.start_time = time.time()

    def stop(self):
        self.end_time = time.time()

    def elapsed_time(self):
        if self.start_time is None:
            return "Stopwatch hasn't been started"
        if self.end_time is None:
            return "Stopwatch hasn't been stopped"
        return self.end_time - self.start_time

stopwatch1 = Stopwatch()

def generate_response(input_text, max_length=250):
    stopwatch1.start()

    # Prepare the input
    # input_text = f"<|BEGIN_QUERY|>{input_text}<|END_QUERY|><|BEGIN_ANALYSIS|>{input_text}<|END_ANALYSIS|><|BEGIN_RESPONSE|>"
    input_text = f"<|BEGIN_QUERY|>{input_text}<|END_QUERY|><|BEGIN_ANALYSIS|>"
    input_ids = tokenizer.encode(input_text, return_tensors="pt").to(device)

    # Create attention mask
    attention_mask = torch.ones_like(input_ids).to(device)

    # Generate
    output = model.generate(
        input_ids,
        max_new_tokens=max_length,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        attention_mask=attention_mask,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.convert_tokens_to_ids('<|STOP|>'),
    )

    stopwatch1.stop()
    return tokenizer.decode(output[0], skip_special_tokens=False)
```
### OPTION 2 INFERENCE
```python
# Reuses the Stopwatch class defined in OPTION 1 above
stopwatch2 = Stopwatch()

def generate_response2(input_text, max_length=250):
    stopwatch2.start()

    # Prepare the input
    # input_text = f"<|BEGIN_QUERY|>{input_text}<|END_QUERY|><|BEGIN_ANALYSIS|>{input_text}<|END_ANALYSIS|><|BEGIN_RESPONSE|>"
    input_text = f"<|BEGIN_QUERY|>{input_text}<|END_QUERY|><|BEGIN_ANALYSIS|>"
    input_ids = tokenizer.encode(input_text, return_tensors="pt").to(device)

    # Create attention mask
    attention_mask = torch.ones_like(input_ids).to(device)

    # 2ND OPTION FOR: Generate
    output = model.generate(
        input_ids,
        max_new_tokens=max_length,
        attention_mask=attention_mask,
        do_sample=True,
        temperature=0.4,
        top_k=60,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

    stopwatch2.stop()
    return tokenizer.decode(output[0], skip_special_tokens=False)
```
### DECODING ANSWER
For when I need just the response:
```python
def decode(text):
    full_text = text

    # Extract the response part
    start_token = "<|BEGIN_RESPONSE|>"
    end_token = "<|END_RESPONSE|>"
    start_idx = full_text.find(start_token)
    end_idx = full_text.find(end_token)

    if start_idx != -1 and end_idx != -1:
        response = full_text[start_idx + len(start_token):end_idx].strip()
    else:
        response = full_text.strip()

    return response
```
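For example, on an abbreviated, made-up generation (purely illustrative):

```python
generated = (
    "<|BEGIN_QUERY|>Who is Steve Jobs?<|END_QUERY|>"
    "<|BEGIN_ANALYSIS|>The user wants a short biography.<|END_ANALYSIS|>"
    "<|BEGIN_RESPONSE|>Steve Jobs co-founded Apple.<|END_RESPONSE|>"
    "<|BEGIN_CLASSIFICATION|>question<|END_CLASSIFICATION|>"
    "<|BEGIN_SENTIMENT|>Sentiment: Neutral<|END_SENTIMENT|><|STOP|>"
)
print(decode(generated))  # -> "Steve Jobs co-founded Apple."
```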
### MY SETUP
I use the stopwatches to time the responses and run both inference options to compare them.
```python
input_text = "Who is Steve Jobs and what was his contribution?"
response1_full = generate_response(input_text)
#response1 = decode(response1_full)
print(f"Input: {input_text}")
print("=======================================")
print(f"Response1: {response1_full}")
elapsed1 = stopwatch1.elapsed_time()
print(f"Process took {elapsed1:.4f} seconds")
print("=======================================")
response2_full = generate_response2(input_text)
#response2 = decode(response2_full)
print(f"Response2: {response2_full}")
elapsed2 = stopwatch2.elapsed_time()
print(f"Process took {elapsed2:.4f} seconds")
print("=======================================")
```
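If you also want the analysis, sentiment, or classification segments rather than just the response, a generic extractor along the same lines as `decode` works. This is only a sketch (the `extract_segment` name is mine, not part of the model):

```python
def extract_segment(full_text, name):
    # Pull the text between <|BEGIN_<name>|> and <|END_<name>|>, or None if missing
    start_token = f"<|BEGIN_{name}|>"
    end_token = f"<|END_{name}|>"
    start_idx = full_text.find(start_token)
    end_idx = full_text.find(end_token)
    if start_idx == -1 or end_idx == -1:
        return None
    return full_text[start_idx + len(start_token):end_idx].strip()

print(extract_segment(response2_full, "SENTIMENT"))       # e.g. "Sentiment: Neutral"
print(extract_segment(response2_full, "CLASSIFICATION"))  # e.g. "question"
```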
### Out-of-Scope Use
Anything that requires factual accuracy: trust it at your own risk!
It has never been tested on mathematical knowledge.
That said, I quite enjoy how the responses feel closer to what I had in mind.