How to use Trelis/all-MiniLM-L12-v2-ft-triplets-10Qs with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Trelis/all-MiniLM-L12-v2-ft-triplets-10Qs")
sentences = [
"What is the purpose of the Rollball in Touch Rugby?",
" Attacking Team from scoring a Try.\nFIT Playing Rules - 5th Edition\nCOPYRIGHT © Touch Football Australia 2020\n15\n19 Advantage \n19.1\tWhere a Defending Team player is Offside at a Tap or Rollball and attempts \nto interfere with play, the Referee will allow Advantage or award a Penalty, \nwhichever is of greater Advantage to the Attacking Team.\n19.2\tShould the Attacking Team in the act of taking the Advantage subsequently \ninfringe, the Ruling on the initial Infringement will apply.\n20 Misconduct \n20.1\tMisconduct warranting Penalty, Forced Interchange, Sin Bin or Dismissal \nincludes:\n20.1.1\tContinuous or regular breaches of the Rules;\n20.1.2\tSwearing towards another player, Referee, spectator or other match \t\nofficial;\n20.1.3\tDisputing decisions of Referees or other match official(s);\n20.1.4\tUsing more than the necessary physical force to make a Touch;\n20.1.5\tPoor sportsmanship;\n20.1.6\tTripping, striking, or otherwise assaulting another player, Referee, \nspectator or other match official; or\n20.1.7\tAny other action that is contrary to the spirit of the game.\n21 Forced Interchange \n21.1\tWhere the Referee deems it necessary to implement a Forced Interchange \nfollowing an Infringement, the Referee is to stop the match, direct the ball to \nbe placed on the Mark, advise the offending player of the reason for the Forced \nInterchange, direct that player to return to the Interchange Area, display the \nrelevant signal and award a Penalty to the non-offending Team.\n22 Sin Bin \n22.1\tThe on-field Referee is required to indicate the commencement and the end of \nthe Sin Bin time.\n22.2\tAny player sent to the Sin Bin must stand in the Sin Bin Area at the opposition’s \nend of the Field of Play and on the same side as their Interchange Area. 
\n22.3\tAny player sent to the Sin Bin must return to the Interchange Area prior to re-\nentering the Field of Play.\n22.4\tAny action that causes the Touch Count to restart will result in a continuation of \nthat Possession. For the avoidance of",
" The Rollball \n \n13.1\tThe attacking player is to position on the Mark, face the opponent’s Try Line, \nmake a genuine attempt to stand parallel to the Sidelines, place the ball on the \nground between the feet in a controlled manner and:\n13.1.1\tstep Forward over the ball; or\n13.1.2\troll the ball back between the feet no more than one (1) metre; or\n13.1.3\tpass a foot over the ball.\nRuling = A Change of Possession to the Defending Team at the point of the Infringement.\n13.2\tA player must perform the Rollball on the Mark.\nRuling = A Penalty to the Defending Team at the point of the Infringement.\n13.3\tA player must not perform a Voluntary Rollball.\nRuling = A Penalty to the Defending Team at the point of the Infringement.\n13.4\tA player must not delay in performing the Rollball.\nRuling = A Penalty to the Defending Team at the point of the Infringement.\n13.5\tA player may only perform a Rollball at the Mark under the following \ncircumstances:\n13.5.1\twhen a Touch has been made; or\n13.5.2\twhen Possession changes following the sixth Touch; or\n13.5.3\twhen Possession changes due to the ball being dropped or passed and \ngoes to the ground; or\n13.5.4\twhen Possession changes due to an Infringement by an attacking player \nat a Penalty, a Tap or a Rollball; or\nFIT Playing Rules - 5th Edition\nCOPYRIGHT © Touch Football Australia 2020\n11\n13.5.5\twhen Possession changes after the Half is Touched or when the Half \nplaces the ball on or over the Try Line; or\n13.5.6\tin replacement of a Penalty Tap; or\n13.5.7\twhen so directed by the Referee.\n13.6\tA player is to perform a Rollball seven (7) metres in-field under the following \ncircumstances:\n13.6.1\twhen a Change of Possession takes place due to a player in Possession \nmaking contact with the Sideline or any ground outside the Field of Play, \nprior to a Touch being made; or\n13.6.2\twhen the ball",
"1\twhen a Change of Possession takes place due to a player in Possession \nmaking contact with the Sideline or any ground outside the Field of Play, \nprior to a Touch being made; or\n13.6.2\twhen the ball not in Possession of a player makes contact with the \nSideline or any ground outside the Field of Play.\n13.7\tA player may not perform a Tap in replacement of a Rollball.\nRuling = The offending Team must return to the Mark and perform the Rollball.\n13.8\tAn attacking player, other than the player performing the Rollball, may receive \nthe ball at the Rollball and shall do so without delay. That player is referred to as \nthe Half.\n13.9\tThe Half may control the ball with a foot prior to picking up the ball. \n13.10\tA player ceases to be the Half once the ball is passed to another player.\n13.11\tDefending players are not to interfere with the performance of the Rollball or the \nHalf. \nRuling = A Penalty to the Attacking Team at a point ten (10) metres directly Forward of the \nInfringement.\n13.12\tPlayers of the Defending Team must not move Forward of the Onside position \nuntil the Half has made contact with the ball, unless directed to do so by the \nReferee or in accordance with 13.12.1.\n13.12.1\tWhen the Half is not within one (1) metre of the Rollball, Onside players \nof the Defending Team may move Forward as soon as the player \nperforming the Rollball releases the ball. 
If the Half is not in position and \na defending player moves Forward and makes contact with the ball, a \nChange of Possession results.\n13.13\tIf in the act of performing the Rollball, the Attacking player makes contact with \nthe Sideline or any ground outside the Field of Play a Change of Possession will \noccur with the Rollball to be taken seven (7) metres in field.\n13.14\tAfter a Touch is made between the Dead Ball Line and the seven (7) metre line, \nan Attacking Team is permitted to Rollball on the seven (7) metre line at a point \ndirectly in line with where the Touch was made.\nFIT Playing Rules - 5th Edition\n12\nCOPYRIGHT © Touch Football Australia"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L12-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
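The architecture printout above lists mean-token pooling followed by a Normalize() module. As a rough illustration of what those two steps compute, here is a minimal pure-Python sketch. The function name and the toy 2-dimensional vectors are made up for illustration (the real model works on 384-dimensional token embeddings), and the library's actual implementation may differ in detail.

```python
import math

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average the unmasked token vectors, then scale the result to unit length."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding positions (mask value 0)
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    mean = [s / max(count, 1) for s in summed]
    # Fall back to 1.0 so an all-zero vector does not cause a division by zero.
    norm = math.sqrt(sum(m * m for m in mean)) or 1.0
    return [m / norm for m in mean]

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool_and_normalize(tokens, mask)
print(sentence_vector)
```

Because the output has unit length, a dot product between two such vectors equals their cosine similarity.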
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Trelis/all-MiniLM-L12-v2-ft-triplets-10Qs")
# Run inference
sentences = [
'What is the ruling if the referee causes obstruction on either an attacking or defending player, including when the ball makes contact with the referee?',
'fringement occurs in the In-Goal Area. \n16.4\tPlayers in the Defending Team may not obstruct or interfere with an attacking \nplayer.\nRuling = A Penalty to the non-offending Team at the point of the Infringement or on the \nseven (7) metre line if the Infringement occurs in the In-Goal Area. \n16.5\tShould a supporting, attacking player cause an apparent and involuntary or \naccidental Obstruction and the player in Possession ceases movement to allow \na Touch to be made, the Touch is to count.\n16.6\tIf the Referee causes Obstruction on either an attacking player or a defending \nplayer including when the ball makes contact with the Referee, play should \npause and recommence with a Rollball at the Mark where the interference \noccurred and the Touch count remains unchanged.\n17\u2002 Interchange \n17.1\tPlayers may Interchange at any time. \n17.2\tThere is no limit to the number of times a player may Interchange.\n17.3\tInterchange players must remain in their Interchange Area for the duration of \nthe match.\n17.4\tInterchanges may only occur after the player leaving the Field of Play has \nentered the Interchange Area. '
'\n17.5\tPlayers leaving or entering the Field of Play shall not hinder or obstruct play.\nRuling = A Penalty to the non-offending Team at the point of the Infringement.\n17.6\tPlayers entering the Field of Play must take up an Onside position before \nbecoming involved in play.\nFIT Playing Rules - 5th Edition\n14\nCOPYRIGHT © Touch Football Australia 2020\nRuling = A Penalty to the non-offending Team at the point of the Infringement.\n17.7\tWhen an intercept has occurred or a line break made, players are not permitted \nto Interchange until the next Touch has been made or ball becomes Dead.\nRuling A = If a player enters the Field of Play and prevents the scoring of a Try, a Penalty Try \nwill be awarded and the offending player sent to the Sin Bin.\nRuling B = If a player enters the Field of Play but does not impede the scoring of a Try the \noffending player will be sent to the Sin Bin.\n17.8\tFollowing a Try, players may Interchange at will, without having to wait for',
' Player\nThe player who replaces another player during Interchange. There is \na maximum of eight (8) substitute players in any Team and except \nwhen interchanging, in the Sin Bin, dismissed or on the Field of Play, \nthey must remain in the Substitution Box.\nTap and Tap Penalty\nThe method of commencing the match, recommencing the match \nafter Half Time and after a Try has been scored. The Tap is also the \nmethod of recommencing play when a Penalty is awarded. The Tap \nis taken by placing the ball on the ground at or behind the Mark, \nreleasing both hands from the ball, tapping the ball gently with either \nfoot or touching the foot on the ball. The ball must not roll or move \nmore than one (1) metre in any direction and must be retrieved \ncleanly, without touching the ground again. The player may face any \ndirection and use either foot. Provided it is at the Mark, the ball does \nnot have to be lifted from the ground prior to a Tap being taken.\nTeam\nA group of players constituting one (1) side in a competition match.\nTFA\nTouch Football Australia Limited\nTouch\nAny contact between the player in Possession and a defending \nplayer. A Touch includes contact on the ball, hair or clothing and may \nbe made by a defending player or by the player in Possession.\nTouch Count\nThe progressive number of Touches that each Team has before a \nChange of Possession, from zero (0) to six (6).\nTry\nThe result of any attacking player, except the Half, placing the ball on \nor over the Team’s Attacking Try Line before being Touched.\nTry Lines\nThe lines separating the In-Goal Areas from the Field of Play. '
'See \nAppendix 1.\nVoluntary Rollball\nThe player in Possession performs a Rollball before a Touch is made \nwith a defending player.\nWing\nThe player outside the Link player.\nWinner\nThe Team that scores the most Tries during the match.\nFIT Playing Rules - 5th Edition\n4\nCOPYRIGHT © Touch Football Australia 2020\n Rules of Play \n Mode of Play \nThe object of the game of Touch is for each Team to score Tries and to prevent the \nopposition from scoring. The ball may be passed, knocked or handed between players \nof the Attacking Team who may in turn run',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
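In the example above the first sentence is a question and the remaining ones are rule passages, so the first row of the similarity matrix holds the query-to-passage scores. The sketch below shows how such scores could be turned into a ranking. It assumes cosine similarity, uses made-up 3-dimensional vectors in place of real 384-dimensional embeddings, and both helper names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)

# Toy stand-ins for model.encode(...) output.
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_passages(query, passages)
print(ranking)  # [1, 2, 0]
```

With real embeddings, the same ranking could be read from the first row of `similarities`, skipping the query's similarity with itself.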
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 0.0001
num_train_epochs: 5
lr_scheduler_type: cosine
warmup_ratio: 0.3

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
learning_rate: 0.0001
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 5
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.3
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | loss |
|---|---|---|---|
| 0.1667 | 2 | 4.8893 | - |
| 0.3333 | 4 | 4.9073 | - |
| 0.5 | 6 | 4.8582 | - |
| 0.6667 | 8 | 4.8634 | 4.8319 |
| 0.8333 | 10 | 4.81 | - |
| 1.0 | 12 | 4.8214 | - |
| 1.1667 | 14 | 4.6917 | - |
| 1.3333 | 16 | 4.571 | 4.6944 |
| 1.5 | 18 | 4.5726 | - |
| 1.6667 | 20 | 4.6054 | - |
| 1.8333 | 22 | 4.4568 | - |
| 2.0 | 24 | 4.5025 | 4.5390 |
| 2.1667 | 26 | 4.3231 | - |
| 2.3333 | 28 | 4.1362 | - |
| 2.5 | 30 | 4.3427 | - |
| 2.6667 | 32 | 4.2574 | 4.4695 |
| 2.8333 | 34 | 4.3008 | - |
| 3.0 | 36 | 4.1244 | - |
| 3.1667 | 38 | 4.0408 | - |
| 3.3333 | 40 | 4.1497 | 4.3349 |
| 3.5 | 42 | 4.0795 | - |
| 3.6667 | 44 | 3.8948 | - |
| 3.8333 | 46 | 4.1476 | - |
| 4.0 | 48 | 4.0925 | 4.2929 |
| 4.1667 | 50 | 3.7692 | - |
| 4.3333 | 52 | 4.058 | - |
| 4.5 | 54 | 3.8418 | - |
| 4.6667 | 56 | 4.049 | 4.3185 |
| 4.8333 | 58 | 4.184 | - |
| 5.0 | 60 | 4.0321 | - |
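The hyperparameters listed before the table combine learning_rate: 0.0001, lr_scheduler_type: cosine and warmup_ratio: 0.3, and the table ends at step 60. As a rough illustration of the learning-rate curve these settings imply, the sketch below assumes linear warmup followed by cosine decay to zero, with the warmup length rounded up to a whole step. The function name is hypothetical and the trainer's exact behaviour may differ.

```python
import math

def cosine_lr_with_warmup(step, total_steps=60, warmup_ratio=0.3, base_lr=1e-4):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    # Assumption: the warmup length is the ratio times total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Print the rate at a few points in the 60-step run.
for step in (0, 9, 18, 39, 60):
    print(step, cosine_lr_with_warmup(step))
```

Under these assumptions the rate peaks at step 18, roughly midway through epoch 2, and then decays for the remaining 42 steps.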
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
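The second citation above refers to the triplet loss, which the model name ("ft-triplets") suggests was used for fine-tuning. This card does not state the distance function or margin, so the sketch below is only a generic illustration that assumes Euclidean distance and a placeholder margin of 1.0. The function names and toy vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy triplet: the positive is close to the anchor, the negative is far away.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
print(triplet_loss(anchor, positive, negative))  # 0.0
print(triplet_loss(anchor, negative, positive))  # 5.0
```

In a retrieval setting like the one in this card, the anchor would be a question, the positive a relevant rule passage and the negative an irrelevant one.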
Base model
microsoft/MiniLM-L12-H384-uncased