	---
	license: mit
	language:
	- en
	pipeline_tag: text2text-generation
	---
	# T5-Base Job Description to Resume JSON

	This model fine-tunes google/t5-base to convert job descriptions into structured resume JSON data.

	## Model description

	This model is T5-base fine-tuned on a dataset of 10,000 job description and resume pairs. Given a job description as input, it generates a JSON representation of a resume tailored to that job.

	Base model: google/t5-base

	Fine-tuning task: Text-to-JSON conversion

	Training data: 10,000 job description and resume pairs

	## Intended uses & limitations

	Intended uses:
	- Generating structured resume data from job descriptions
	- Assisting job seekers in tailoring resumes to specific job postings
	- Automating parts of the resume creation process

	Limitations:
	- The model's output quality depends on the input job description's detail and clarity
	- Generated resumes may require human review and editing
	- The model may not capture nuanced or industry-specific requirements
	- The model does not emit "{" or "}" (the T5 tokenizer does not cover them) and uses "LB>" and "RB>" in their place, respectively; replace these placeholders before parsing the output as JSON
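
The brace placeholders in the last limitation mean that raw output has to be post-processed before it can be parsed. Below is a minimal sketch of that step, assuming the mapping used in the usage example on this card (`LB>` becomes `{`, `RB>` becomes `}`). The helper name `decode_resume_output` and the sample string are hypothetical, and returning `None` for unparseable output is one possible design choice, not part of the model.

```python
import json

def decode_resume_output(raw_text):
    """Restore braces in raw model output and try to parse it as JSON.

    Returns the parsed object, or None if the text is not valid JSON
    (for example, when generation was cut off at max_length).
    """
    restored = raw_text.replace("LB>", "{").replace("RB>", "}")
    try:
        return json.loads(restored)
    except json.JSONDecodeError:
        return None

# Hypothetical raw output, for illustration only (not real model output)
sample = 'LB>"title": "Data Analyst", "skills": ["SQL", "Python"]RB>'
parsed = decode_resume_output(sample)
print(parsed)
```

A `None` result is a signal to route the output to human review rather than to use it directly.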

	## Training data

	The model was trained on 10,000 pairs of job descriptions and corresponding resume JSON data. The data distribution and any potential biases in the training set are not specified.

	## Training procedure

	The model was fine-tuned using the standard T5 text-to-text framework. Specific hyperparameters and training details are not provided.
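
Since the training details are not provided, the following is only a sketch of how one job description and resume pair could be turned into a source/target text pair for the text-to-text setup. It assumes the prompt prefix from the usage example on this card and the `LB>`/`RB>` brace placeholders. The function name `format_training_pair` and the sample record are hypothetical, and this is not necessarily the preprocessing used for this model.

```python
import json

# Assumption: the same prefix that the usage example on this card sends to the model
PROMPT_PREFIX = "generate resume JSON for the following job: "

def format_training_pair(job_description, resume):
    """Build (source, target) strings for one text-to-text training example."""
    source = PROMPT_PREFIX + job_description.strip()
    # Serialize the resume, then swap braces for the placeholder strings
    target = json.dumps(resume, ensure_ascii=False)
    target = target.replace("{", "LB>").replace("}", "RB>")
    return source, target

# Hypothetical record, for illustration only
source, target = format_training_pair(
    "Data Analyst with SQL experience",
    {"title": "Data Analyst", "skills": ["SQL"]},
)
print(source)
print(target)
```

Note that this simple replacement also rewrites any literal braces inside string values, so a real pipeline would need to handle that case.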

	## How to get started with the model

	Use the code below to get started with the model.

	<details>
	<summary> Click to expand </summary>

	```python
	# Requires the transformers, sentencepiece, and torch packages
	from transformers import T5Tokenizer, T5ForConditionalGeneration

	def load_model_and_tokenizer(model_path):
	    """
	    Load the base T5 tokenizer and the fine-tuned model from model_path.
	    """
	    tokenizer = T5Tokenizer.from_pretrained("google-t5/t5-base")
	    model = T5ForConditionalGeneration.from_pretrained(model_path)
	    return tokenizer, model

	def generate_text(prompt, tokenizer, model):
	    """
	    Generate text using the model based on the given prompt.
	    """
	    # Encode the input prompt as a tensor of token IDs
	    input_ids = tokenizer(prompt, return_tensors="pt", padding=True).input_ids

	    # Generate the output using the model
	    outputs = model.generate(input_ids, max_length=512, num_return_sequences=1)

	    # Decode the output tensor to human-readable text
	    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
	    return generated_text

	def main():
	    model_path = "nakamoto-yama/t5-resume-generation"
	    print(f"Loading model from {model_path}")
	    tokenizer, model = load_model_and_tokenizer(model_path)

	    # Prompt in a loop until the user types 'exit'
	    while True:
	        prompt = input("Enter a job description or title ('exit' to quit): ")
	        if prompt.strip().lower() == 'exit':
	            break
	        response = generate_text(f"generate resume JSON for the following job: {prompt}", tokenizer, model)
	        # The model emits "LB>" and "RB>" in place of "{" and "}"
	        response = response.replace("LB>", "{").replace("RB>", "}")
	        print(f"Generated Response: {response}")

	if __name__ == "__main__":
	    main()
	```

	See the [Hugging Face T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Model) docs and a [Colab Notebook](https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/main/notebooks/t5-trivia.ipynb) created by the T5 developers for more examples of the base model.
	</details>

	## Ethical considerations

	This model automates part of the resume creation process, which could have implications for job seeking and hiring practices. Users should be aware of potential biases in the training data that may affect the generated resumes.

	## Additional information

	For more details on the base T5 model, refer to the [T5 paper](https://arxiv.org/abs/1910.10683) and the [google/t5-base model card](https://huggingface.co/google/t5-base).