---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4895
- loss:OnlineContrastiveLoss
base_model: bowphs/SPhilBerta
widget:
- source_sentence: 'Query: Sed, ut ardentissimus poeta testatur, Quidquid a multis peccatur, inultum est: Multitudo peccantium impetrabiliorem fecit impiis veniam, ut qui redacti in laicos pristina sacrilegii sui debuerant scelera deplorare, nunc resupini in pontificali solio sedeant, et ructent nobis simulatae fidei nauseas, immo opertae perfidiae aperta compendia.'
  sentences:
  - 'Candidate: Capena grandi porta qua pluit gutta Phrygiumque Matris Almo qua lavat ferrum, Horatiorum qua viret sacer campus Et qua pusilli fervet Herculis fanum, Faustine, plena Bassus ibat in raeda, Omnis beati copias trahens ruris.'
  - 'Candidate: Si qua videbuntur chartis tibi, lector, in istis Sive obscura nimis sive latina parum, Non meus est error: nocuit librarius illis, Dum properat versus adnumerare tibi.'
  - 'Candidate: ecce, nefas visu, mediis altaribus anguis exit et extinctis ignibus exta rapit, consulitur Phoebus: sors est ita reddita:'
- source_sentence: 'Query: ille malum uirus serpentibus addidit atris praedarique lupos iussit, id est odium et inuidiam et dolum hominibus inseuit, ut tam essent quam serpentes uenenati, tam rapaces quam lupi.'
  sentences:
  - 'Candidate: quis negat?'
  - 'Candidate: quid moraris emori?'
  - 'Candidate: Et simul a medio media de parte secatur,'
- source_sentence: 'Query: scintilla uigoris paterni lucet in filio et similitudo morum per speculum carnis erumpens: ingentes animos angusto in pectore uersat.'
  sentences:
  - 'Candidate: quod si mihi nullum aliud esset officium in omni vita reliquum nisi ut erga duces ipsos et principes atque auctores salutis meae satis gratus iudicarer, tamen exiguum reliquae vitae tempus non modo ad referendam verum etiam ad commemorandam gratiam mihi relictum putarem.'
  - 'Candidate: uno enim maledicto bis a me patriam servatam esse concedis, semel cum id feci quod omnes non negent immortalitati, si fieri potest, mandandum, tu supplicio puniendum putasti, iterum cum tuum multorumque praeter te inflammatum in bonos omnis impetum meo corpore excepi, ne eam civitatem quam servassem inermis armatus in discrimen adducerem.'
  - 'Candidate: Ac siquem potuit spatiosa senectus spectatorem operum multorum reddere, vixi annos bis centum; nunc tertia vivitur aetas.'
- source_sentence: 'Query: si quid itaque in me potest esse consilii, si experto creditur, hoc primum moneo, hoc obtestor, ut sponsa Christi uinum fugiat pro ueneno.'
  sentences:
  - 'Candidate: Inscripta est basis indicatque nomen.'
  - 'Candidate: Erras, si tibi cunnus hic videtur, Ad quem mentula pertinere desit.'
  - 'Candidate: Chthonius quoque Teleboasque ense iacent nostro: ramum prior ille bifurcum gesserat, hic iaculum; iaculo mihi vulnera fecit:'
- source_sentence: 'Query: Quia ergo insanivit Israel, et percussus fornicationis spiritu, incredibili furore bacchatus est, ideo non multo post tempore, sed dum propheto, dum spiritus hos regit artus, pascet eos Dominus quasi agnum in latitudine.'
  sentences:
  - 'Candidate: Haec omnia vidi inflammari,'
  - 'Candidate: ut tuus amicus, Crasse, Granius non esse sextantis.'
  - 'Candidate: Te solum in bella secutus, Post te fata sequar: neque enim sperare secunda Fas mihi, nec liceat.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- cosine_mcc
model-index:
- name: SentenceTransformer based on bowphs/SPhilBerta
  results:
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: latin intertext
      type: latin_intertext
    metrics:
    - type: cosine_accuracy
      value: 0.9597902097902098
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.6651543378829956
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.7513227513227515
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.6328521966934204
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.8352941176470589
      name: Cosine Precision
    - type: cosine_recall
      value: 0.6826923076923077
      name: Cosine Recall
    - type: cosine_ap
      value: 0.8119318372417907
      name: Cosine AP
    - type: cosine_mcc
      value: 0.7335872874320771
      name: Cosine MCC
---

# SentenceTransformer based on bowphs/SPhilBerta

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [bowphs/SPhilBerta](https://huggingface.co/bowphs/SPhilBerta). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [bowphs/SPhilBerta](https://huggingface.co/bowphs/SPhilBerta)
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("julian-schelb/SPhilBerta-latin-intertextuality-v1")

# Run inference
sentences = [
    'Query: Quia ergo insanivit Israel, et percussus fornicationis spiritu, incredibili furore bacchatus est, ideo non multo post tempore, sed dum propheto, dum spiritus hos regit artus, pascet eos Dominus quasi agnum in latitudine.',
    'Candidate: Te solum in bella secutus, Post te fata sequar: neque enim sperare secunda Fas mihi, nec liceat.',
    'Candidate: ut tuus amicus, Crasse, Granius non esse sextantis.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Binary Classification

* Dataset: `latin_intertext`
* Evaluated with [`BinaryClassificationEvaluator`](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

| Metric                    | Value      |
|:--------------------------|:-----------|
| cosine_accuracy           | 0.9598     |
| cosine_accuracy_threshold | 0.6652     |
| cosine_f1                 | 0.7513     |
| cosine_f1_threshold       | 0.6329     |
| cosine_precision          | 0.8353     |
| cosine_recall             | 0.6827     |
| **cosine_ap**             | **0.8119** |
| cosine_mcc                | 0.7336     |
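The tuned thresholds above can turn raw cosine similarities into binary intertextuality decisions. Below is a minimal sketch of that final step, reusing a query/candidate pair from the widget examples; 0.6329 is the `cosine_f1_threshold` from the table (use the `cosine_accuracy_threshold` of 0.6652 to maximize accuracy instead):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("julian-schelb/SPhilBerta-latin-intertextuality-v1")

query = "Query: si quid itaque in me potest esse consilii, si experto creditur, hoc primum moneo, hoc obtestor, ut sponsa Christi uinum fugiat pro ueneno."
candidates = [
    "Candidate: Inscripta est basis indicatque nomen.",
    "Candidate: Erras, si tibi cunnus hic videtur, Ad quem mentula pertinere desit.",
]

embeddings = model.encode([query] + candidates)
# Cosine similarity between the query and each candidate: shape [1, 2]
scores = model.similarity(embeddings[:1], embeddings[1:])

F1_THRESHOLD = 0.6329  # cosine_f1_threshold from the evaluation table
for candidate, score in zip(candidates, scores[0]):
    is_intertext = float(score) >= F1_THRESHOLD
    print(f"{float(score):.4f} {'match' if is_intertext else 'no match'}: {candidate[:50]}")
```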
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,895 training samples
* Columns: `query`, `match`, and `label`
* Approximate statistics based on the first 1000 samples:
  |      | query  | match  | label |
  |:-----|:-------|:-------|:------|
  | type | string | string | int   |
* Samples:
  | query | match | label |
  |:------|:------|:------|
  | Query: quod et illustris poeta testatur dicens: sed fugit interea, fugit irreparabile tempus et iterum: Rhaebe, diu, res si qua diu mortalibus ulla est, uiximus. | Candidate: omnino si ego evolo mense Quintili in Graeciam, sunt omnia faciliora; sed cum sint ea tempora ut certi nihil esse possit quid honestum mihi sit, quid liceat, quid expediat, quaeso, da operam ut illum quam honestissime copiosissimeque tueamur. | 0 |
  | Query: Non solum in Ecclesia morantur oves, nec mundae tantum aves volitant; sed frumentum in agro seritur, interque nitentia culta Lappaeque et tribuli, et steriles dominantur avenae. | Candidate: atque hoc in loco, si facultas erit, exemplis uti oportebit, quibus in simili excusatione non sit ignotum, et contentione, magis illis ignoscendum fuisse, et deliberationis partibus, turpe aut inutile esse concedi eam rem, quae ab adversario commissa sit: permagnum esse et magno futurum detrimento, si ea res ab iis, qui potestatem habent vindicandi, neglecta sit. | 0 |
  | Query: adiuratus enim per eundem patrem et spes surgentis Iuli, nequaquam pepercit tums accensus et ira. | Candidate: factus olor niveis pendebat in aere pennis. | 0 |
* Loss: [OnlineContrastiveLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)

### Evaluation Dataset

#### Unnamed Dataset

* Size: 1,144 evaluation samples
* Columns: `query`, `match`, and `label`
* Approximate statistics based on the first 1000 samples:
  |      | query  | match  | label |
  |:-----|:-------|:-------|:------|
  | type | string | string | int   |
* Samples:
  | query | match | label |
  |:------|:------|:------|
  | Query: qui uero pauperes sunt et tenui substantiola uidenturque sibi scioli, pomparum ferculis similes procedunt ad publicum, ut caninam exerceant facundiam. | Candidate: cogitat reliquas colonias obire. | 0 |
  | Query: nec uarios discet mentiri lana colores, ipse sed in pratis aries iam suaue rubenti murice, iam croceo mutabit uellera luto, sponte sua sandyx pascentis uestiet agnos. | Candidate: loquitur ad voluntatem; quicquid denunciatum est, facit, assectatur, assidet, muneratur. | 0 |
  | Query: credite experto, quasi Christianus Christianis loquor: uenenata sunt illius dogmata, aliena a scripturis sanctis, uim scripturis facientia. | Candidate: ignoscunt mihi, revocant in consuetudinem pristinam teque, quod in ea permanseris, sapientiorem quam me dicunt fuisse. | 0 |
* Loss: [OnlineContrastiveLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)

### Training Hyperparameters

#### Non-Default Hyperparameters

- `overwrite_output_dir`: True
- `eval_strategy`: steps
- `per_device_train_batch_size`: 32
- `learning_rate`: 2e-05
- `weight_decay`: 0.01
- `num_train_epochs`: 4
- `warmup_steps`: 1958
- `prompts`: {'query': 'Query: ', 'match': 'Candidate: '}
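For reference, a comparable fine-tuning run with these non-default settings could look roughly as follows. This is a hedged sketch, not the original training script: the two placeholder datasets and the `output_dir` name are illustrative, while the column names (`query`, `match`, `label`), the loss, and all argument values come from this card.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import OnlineContrastiveLoss

# Start from the base model this card fine-tunes
model = SentenceTransformer("bowphs/SPhilBerta")

# Placeholder rows; the real datasets have 4,895 / 1,144 rows with these columns
train_dataset = Dataset.from_dict({
    "query": ["first query text", "second query text"],
    "match": ["first candidate text", "second candidate text"],
    "label": [1, 0],  # 1 = intertextual pair, 0 = unrelated
})
eval_dataset = Dataset.from_dict({
    "query": ["held-out query text"],
    "match": ["held-out candidate text"],
    "label": [0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="sphilberta-latin-intertextuality",  # illustrative name
    overwrite_output_dir=True,
    eval_strategy="steps",
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=4,
    warmup_steps=1958,
    # Prefixes prepended to each column before tokenization
    prompts={"query": "Query: ", "match": "Candidate: "},
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=OnlineContrastiveLoss(model),
)
trainer.train()
```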
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: True
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.01
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 1958
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: {'query': 'Query: ', 'match': 'Candidate: '}
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>
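Note the `prompts` entry above: during training, the `query` column was prefixed with `Query: ` and the `match` column with `Candidate: `, so inference inputs should carry the same prefixes. The usage example prepends them by hand; equivalently, `encode` can apply a prefix to every input via its `prompt` argument. A short sketch, assuming the same two prefixes:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("julian-schelb/SPhilBerta-latin-intertextuality-v1")

# encode() prepends the given prompt string to each input before embedding
query_emb = model.encode(["scintilla uigoris paterni lucet in filio"], prompt="Query: ")
cand_emb = model.encode(["Inscripta est basis indicatque nomen."], prompt="Candidate: ")

print(model.similarity(query_emb, cand_emb))  # 1x1 cosine similarity matrix
```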
### Training Logs

| Epoch  | Step | Training Loss | Validation Loss | latin_intertext_cosine_ap |
|:------:|:----:|:-------------:|:---------------:|:-------------------------:|
| 0.6494 | 50   | 0.6022        | 0.1430          | 0.7392                    |
| 1.2987 | 100  | 0.5519        | 0.1191          | 0.7579                    |
| 1.9481 | 150  | 0.4728        | 0.1021          | 0.7794                    |
| 2.5974 | 200  | 0.4001        | 0.0934          | 0.7917                    |
| 3.2468 | 250  | 0.2689        | 0.0917          | 0.8048                    |
| 3.8961 | 300  | 0.2210        | 0.0834          | 0.8119                    |

### Framework Versions

- Python: 3.10.8
- Sentence Transformers: 4.1.0
- Transformers: 4.53.0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```