RoBERTa base BNE trained with data from the descriptive text corpus of the CelebA dataset
Overview
- Language: Spanish
- Data: CelebA_RoBERTa_Sp.
- Architecture: roberta-base
Description
To improve on the performance of the RoBERTa-large-bne encoder, this model was trained on the corpus generated in this repository, using a Siamese network with a cosine-similarity loss function. The following steps were followed:
- Import the sentence-transformers and torch libraries used to implement the encoder.
- Split the training corpus into two parts: a training set of 249,000 sentences and a validation set of 1,000 sentences.
- Load the training/validation data for the model. Two lists store the information; each entry consists of a pair of descriptive sentences and their similarity value.
- Use RoBERTa-large-bne as the baseline model for training.
- Train with a Siamese network in which, for each pair of sentences A and B from the training corpus, the similarity of their embedding vectors u and v is computed with the cosine-similarity metric (CosineSimilarityLoss()) and compared with the reference similarity value from the training corpus. Model performance during training was measured with Spearman's correlation coefficient between the reference similarity vector and the computed similarity vector.
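The objective in the last step can be sketched in plain torch: CosineSimilarityLoss() effectively minimizes the mean-squared error between cos(u, v) and the reference similarity label. In this sketch the embeddings and labels are random stand-ins, not real data from the corpus:

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings u, v for pairs of sentences A and B.
# In the real setup these come from the shared RoBERTa-large-bne encoder.
torch.manual_seed(0)
u = torch.randn(4, 1024)  # batch of 4 sentence-A embeddings
v = torch.randn(4, 1024)  # batch of 4 sentence-B embeddings
labels = torch.tensor([0.9, 0.1, 0.5, 0.75])  # illustrative reference similarities

# The Siamese objective: predicted cos(u, v) vs. the reference similarity.
predicted = F.cosine_similarity(u, v, dim=1)
loss = F.mse_loss(predicted, labels)
print(loss.item())
```

During training, this scalar loss is backpropagated through the shared encoder so that sentence pairs with high reference similarity end up with embeddings pointing in similar directions.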
The total training time with the sentence-transformers library in Python was 42 days, with exclusive use of all available GPUs on the server.
Spearman's correlation on 1,000 test sentences was compared between the base model and our trained model. As the following table shows, our model obtains better results (correlation closer to 1).
| Models | Spearman's correlation |
|---|---|
| RoBERTa-base-bne | 0.827176427 |
| RoBERTa-celebA-Sp | 0.999913276 |
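Spearman's correlation measures rank agreement between the reference and predicted similarity vectors; it can be computed with scipy.stats.spearmanr. The values below are illustrative stand-ins, not the real test data:

```python
from scipy.stats import spearmanr

# Reference similarities from the corpus vs. similarities predicted
# by an encoder (illustrative values only).
reference = [0.9, 0.1, 0.5, 0.75, 0.3]
predicted = [0.88, 0.15, 0.52, 0.70, 0.35]

correlation, p_value = spearmanr(reference, predicted)
print(correlation)  # 1.0 here, since both lists rank the pairs identically
```

A correlation close to 1 means the model ranks sentence pairs almost exactly as the reference similarities do, even if the absolute values differ.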
How to use
Downloading the model produces a directory called roberta-large-bne-celebAEs-UNI that contains its main files. To use the model, run the following Python code:
```python
from sentence_transformers import SentenceTransformer

model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')

caption = ('La mujer tiene pomulos altos. Su cabello es de color negro. '
           'Tiene las cejas arqueadas y la boca ligeramente abierta. '
           'La joven y atractiva mujer sonriente tiene mucho maquillaje. '
           'Lleva aretes, collar y lapiz labial.')

vector = model_sbert.encode(caption)
print(vector)
```
Results
As a result, the encoder generates a numeric vector of dimension 1024.

```
>>> print(vector)
[0.2, 0.5, 0.45, ..., 0.9]
>>> len(vector)
1024
```
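The embeddings produced this way are typically compared with cosine similarity, for example to rank how closely two captions match. A minimal sketch, with short illustrative vectors standing in for the real 1024-dimensional embeddings:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Illustrative 4-dimensional stand-ins for real encoder output.
vec_a = [0.2, 0.5, 0.45, 0.9]
vec_b = [0.25, 0.48, 0.5, 0.85]

print(cosine_similarity(vec_a, vec_b))  # close to 1 for similar captions
```

With the real model, the same comparison can be done on the vectors returned by `model_sbert.encode(...)`.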
More information
To see more detailed information about the implementation, visit the following link.
Licensing information
This model is available under the CC BY-NC 4.0 license.
Citation information
If you use the RoBERTa+CelebA model in your work, please cite the paper published in Information Processing & Management:
```bibtex
@article{YAURILOZANO2024103667,
  title = {Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish},
  journal = {Information Processing & Management},
  volume = {61},
  number = {3},
  pages = {103667},
  year = {2024},
  issn = {0306-4573},
  doi = {10.1016/j.ipm.2024.103667},
  url = {https://www.sciencedirect.com/science/article/pii/S030645732400027X},
  author = {Eduardo Yauri-Lozano and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro}
}
```
Authors
Universidad Nacional de Ingeniería, Ontology Engineering Group, Universidad Politécnica de Madrid.
Contributors
See the full list of contributors and more resources here.