diff --git "a/README.md" "b/README.md" --- "a/README.md" +++ "b/README.md" @@ -1,3 +1,6105 @@ ---- -license: mit ---- +--- +language: +- multilingual +- af +- am +- ar +- as +- az +- be +- bg +- bn +- br +- bs +- ca +- cs +- cy +- da +- de +- el +- en +- eo +- es +- et +- eu +- fa +- fi +- fr +- fy +- ga +- gd +- gl +- gu +- ha +- he +- hi +- hr +- hu +- hy +- id +- is +- it +- ja +- jv +- ka +- kk +- km +- kn +- ko +- ku +- ky +- la +- lo +- lt +- lv +- mg +- mk +- ml +- mn +- mr +- ms +- my +- ne +- nl +- 'no' +- om +- or +- pa +- pl +- ps +- pt +- ro +- ru +- sa +- sd +- si +- sk +- sl +- so +- sq +- sr +- su +- sv +- sw +- ta +- te +- th +- tl +- tr +- ug +- uk +- ur +- uz +- vi +- xh +- yi +- zh +license: mit +model-index: +- name: multilingual-e5-large + results: + - dataset: + config: en + name: MTEB AmazonCounterfactualClassification (en) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: test + type: mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 79.05970149253731 + - type: ap + value: 43.486574390835635 + - type: f1 + value: 73.32700092140148 + task: + type: Classification + - dataset: + config: de + name: MTEB AmazonCounterfactualClassification (de) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: test + type: mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 71.22055674518201 + - type: ap + value: 81.55756710830498 + - type: f1 + value: 69.28271787752661 + task: + type: Classification + - dataset: + config: en-ext + name: MTEB AmazonCounterfactualClassification (en-ext) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: test + type: mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 80.41979010494754 + - type: ap + value: 29.34879922376344 + - type: f1 + value: 67.62475449011278 + task: + type: Classification + - dataset: + config: ja + name: MTEB AmazonCounterfactualClassification (ja) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: test + type: 
mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 77.8372591006424 + - type: ap + value: 26.557560591210738 + - type: f1 + value: 64.96619417368707 + task: + type: Classification + - dataset: + config: default + name: MTEB AmazonPolarityClassification + revision: e2d317d38cd51312af73b3d32a06d1a08b442046 + split: test + type: mteb/amazon_polarity + metrics: + - type: accuracy + value: 93.489875 + - type: ap + value: 90.98758636917603 + - type: f1 + value: 93.48554819717332 + task: + type: Classification + - dataset: + config: en + name: MTEB AmazonReviewsClassification (en) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 47.564 + - type: f1 + value: 46.75122173518047 + task: + type: Classification + - dataset: + config: de + name: MTEB AmazonReviewsClassification (de) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 45.400000000000006 + - type: f1 + value: 44.17195682400632 + task: + type: Classification + - dataset: + config: es + name: MTEB AmazonReviewsClassification (es) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 43.068 + - type: f1 + value: 42.38155696855596 + task: + type: Classification + - dataset: + config: fr + name: MTEB AmazonReviewsClassification (fr) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 41.89 + - type: f1 + value: 40.84407321682663 + task: + type: Classification + - dataset: + config: ja + name: MTEB AmazonReviewsClassification (ja) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 40.120000000000005 + - type: f1 + value: 39.522976223819114 + task: + type: Classification + 
- dataset: + config: zh + name: MTEB AmazonReviewsClassification (zh) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 38.832 + - type: f1 + value: 38.0392533394713 + task: + type: Classification + - dataset: + config: default + name: MTEB ArguAna + revision: None + split: test + type: arguana + metrics: + - type: map_at_1 + value: 30.725 + - type: map_at_10 + value: 46.055 + - type: map_at_100 + value: 46.900999999999996 + - type: map_at_1000 + value: 46.911 + - type: map_at_3 + value: 41.548 + - type: map_at_5 + value: 44.297 + - type: mrr_at_1 + value: 31.152 + - type: mrr_at_10 + value: 46.231 + - type: mrr_at_100 + value: 47.07 + - type: mrr_at_1000 + value: 47.08 + - type: mrr_at_3 + value: 41.738 + - type: mrr_at_5 + value: 44.468999999999994 + - type: ndcg_at_1 + value: 30.725 + - type: ndcg_at_10 + value: 54.379999999999995 + - type: ndcg_at_100 + value: 58.138 + - type: ndcg_at_1000 + value: 58.389 + - type: ndcg_at_3 + value: 45.156 + - type: ndcg_at_5 + value: 50.123 + - type: precision_at_1 + value: 30.725 + - type: precision_at_10 + value: 8.087 + - type: precision_at_100 + value: 0.9769999999999999 + - type: precision_at_1000 + value: 0.1 + - type: precision_at_3 + value: 18.54 + - type: precision_at_5 + value: 13.542000000000002 + - type: recall_at_1 + value: 30.725 + - type: recall_at_10 + value: 80.868 + - type: recall_at_100 + value: 97.653 + - type: recall_at_1000 + value: 99.57300000000001 + - type: recall_at_3 + value: 55.619 + - type: recall_at_5 + value: 67.71000000000001 + task: + type: Retrieval + - dataset: + config: default + name: MTEB ArxivClusteringP2P + revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d + split: test + type: mteb/arxiv-clustering-p2p + metrics: + - type: v_measure + value: 44.30960650674069 + task: + type: Clustering + - dataset: + config: default + name: MTEB ArxivClusteringS2S + revision: 
f910caf1a6075f7329cdf8c1a6135696f37dbd53 + split: test + type: mteb/arxiv-clustering-s2s + metrics: + - type: v_measure + value: 38.427074197498996 + task: + type: Clustering + - dataset: + config: default + name: MTEB AskUbuntuDupQuestions + revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 + split: test + type: mteb/askubuntudupquestions-reranking + metrics: + - type: map + value: 60.28270056031872 + - type: mrr + value: 74.38332673789738 + task: + type: Reranking + - dataset: + config: default + name: MTEB BIOSSES + revision: d3fb88f8f02e40887cd149695127462bbcf29b4a + split: test + type: mteb/biosses-sts + metrics: + - type: cos_sim_pearson + value: 84.05942144105269 + - type: cos_sim_spearman + value: 82.51212105850809 + - type: euclidean_pearson + value: 81.95639829909122 + - type: euclidean_spearman + value: 82.3717564144213 + - type: manhattan_pearson + value: 81.79273425468256 + - type: manhattan_spearman + value: 82.20066817871039 + task: + type: STS + - dataset: + config: de-en + name: MTEB BUCC (de-en) + revision: d51519689f32196a32af33b075a01d0e7c51e252 + split: test + type: mteb/bucc-bitext-mining + metrics: + - type: accuracy + value: 99.46764091858039 + - type: f1 + value: 99.37717466945023 + - type: precision + value: 99.33194154488518 + - type: recall + value: 99.46764091858039 + task: + type: BitextMining + - dataset: + config: fr-en + name: MTEB BUCC (fr-en) + revision: d51519689f32196a32af33b075a01d0e7c51e252 + split: test + type: mteb/bucc-bitext-mining + metrics: + - type: accuracy + value: 98.29407880255337 + - type: f1 + value: 98.11248073959938 + - type: precision + value: 98.02443319392472 + - type: recall + value: 98.29407880255337 + task: + type: BitextMining + - dataset: + config: ru-en + name: MTEB BUCC (ru-en) + revision: d51519689f32196a32af33b075a01d0e7c51e252 + split: test + type: mteb/bucc-bitext-mining + metrics: + - type: accuracy + value: 97.79009352268791 + - type: f1 + value: 97.5176076665512 + - type: precision + value: 
97.38136473848286 + - type: recall + value: 97.79009352268791 + task: + type: BitextMining + - dataset: + config: zh-en + name: MTEB BUCC (zh-en) + revision: d51519689f32196a32af33b075a01d0e7c51e252 + split: test + type: mteb/bucc-bitext-mining + metrics: + - type: accuracy + value: 99.26276987888363 + - type: f1 + value: 99.20133403545726 + - type: precision + value: 99.17500438827453 + - type: recall + value: 99.26276987888363 + task: + type: BitextMining + - dataset: + config: default + name: MTEB Banking77Classification + revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 + split: test + type: mteb/banking77 + metrics: + - type: accuracy + value: 84.72727272727273 + - type: f1 + value: 84.67672206031433 + task: + type: Classification + - dataset: + config: default + name: MTEB BiorxivClusteringP2P + revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 + split: test + type: mteb/biorxiv-clustering-p2p + metrics: + - type: v_measure + value: 35.34220182511161 + task: + type: Clustering + - dataset: + config: default + name: MTEB BiorxivClusteringS2S + revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 + split: test + type: mteb/biorxiv-clustering-s2s + metrics: + - type: v_measure + value: 33.4987096128766 + task: + type: Clustering + - dataset: + config: default + name: MTEB CQADupstackRetrieval + revision: None + split: test + type: BeIR/cqadupstack + metrics: + - type: map_at_1 + value: 25.558249999999997 + - type: map_at_10 + value: 34.44425000000001 + - type: map_at_100 + value: 35.59833333333333 + - type: map_at_1000 + value: 35.706916666666665 + - type: map_at_3 + value: 31.691749999999995 + - type: map_at_5 + value: 33.252916666666664 + - type: mrr_at_1 + value: 30.252666666666666 + - type: mrr_at_10 + value: 38.60675 + - type: mrr_at_100 + value: 39.42666666666666 + - type: mrr_at_1000 + value: 39.48408333333334 + - type: mrr_at_3 + value: 36.17441666666665 + - type: mrr_at_5 + value: 37.56275 + - type: ndcg_at_1 + value: 30.252666666666666 + - type: 
ndcg_at_10 + value: 39.683 + - type: ndcg_at_100 + value: 44.68541666666667 + - type: ndcg_at_1000 + value: 46.94316666666668 + - type: ndcg_at_3 + value: 34.961749999999995 + - type: ndcg_at_5 + value: 37.215666666666664 + - type: precision_at_1 + value: 30.252666666666666 + - type: precision_at_10 + value: 6.904166666666667 + - type: precision_at_100 + value: 1.0989999999999995 + - type: precision_at_1000 + value: 0.14733333333333334 + - type: precision_at_3 + value: 16.037666666666667 + - type: precision_at_5 + value: 11.413583333333333 + - type: recall_at_1 + value: 25.558249999999997 + - type: recall_at_10 + value: 51.13341666666666 + - type: recall_at_100 + value: 73.08366666666667 + - type: recall_at_1000 + value: 88.79483333333334 + - type: recall_at_3 + value: 37.989083333333326 + - type: recall_at_5 + value: 43.787833333333325 + task: + type: Retrieval + - dataset: + config: default + name: MTEB ClimateFEVER + revision: None + split: test + type: climate-fever + metrics: + - type: map_at_1 + value: 10.338 + - type: map_at_10 + value: 18.360000000000003 + - type: map_at_100 + value: 19.942 + - type: map_at_1000 + value: 20.134 + - type: map_at_3 + value: 15.174000000000001 + - type: map_at_5 + value: 16.830000000000002 + - type: mrr_at_1 + value: 23.257 + - type: mrr_at_10 + value: 33.768 + - type: mrr_at_100 + value: 34.707 + - type: mrr_at_1000 + value: 34.766000000000005 + - type: mrr_at_3 + value: 30.977 + - type: mrr_at_5 + value: 32.528 + - type: ndcg_at_1 + value: 23.257 + - type: ndcg_at_10 + value: 25.733 + - type: ndcg_at_100 + value: 32.288 + - type: ndcg_at_1000 + value: 35.992000000000004 + - type: ndcg_at_3 + value: 20.866 + - type: ndcg_at_5 + value: 22.612 + - type: precision_at_1 + value: 23.257 + - type: precision_at_10 + value: 8.124 + - type: precision_at_100 + value: 1.518 + - type: precision_at_1000 + value: 0.219 + - type: precision_at_3 + value: 15.679000000000002 + - type: precision_at_5 + value: 12.117 + - type: recall_at_1 + 
value: 10.338 + - type: recall_at_10 + value: 31.154 + - type: recall_at_100 + value: 54.161 + - type: recall_at_1000 + value: 75.21900000000001 + - type: recall_at_3 + value: 19.427 + - type: recall_at_5 + value: 24.214 + task: + type: Retrieval + - dataset: + config: default + name: MTEB DBPedia + revision: None + split: test + type: dbpedia-entity + metrics: + - type: map_at_1 + value: 8.498 + - type: map_at_10 + value: 19.103 + - type: map_at_100 + value: 27.375 + - type: map_at_1000 + value: 28.981 + - type: map_at_3 + value: 13.764999999999999 + - type: map_at_5 + value: 15.950000000000001 + - type: mrr_at_1 + value: 65.5 + - type: mrr_at_10 + value: 74.53800000000001 + - type: mrr_at_100 + value: 74.71799999999999 + - type: mrr_at_1000 + value: 74.725 + - type: mrr_at_3 + value: 72.792 + - type: mrr_at_5 + value: 73.554 + - type: ndcg_at_1 + value: 53.37499999999999 + - type: ndcg_at_10 + value: 41.286 + - type: ndcg_at_100 + value: 45.972 + - type: ndcg_at_1000 + value: 53.123 + - type: ndcg_at_3 + value: 46.172999999999995 + - type: ndcg_at_5 + value: 43.033 + - type: precision_at_1 + value: 65.5 + - type: precision_at_10 + value: 32.725 + - type: precision_at_100 + value: 10.683 + - type: precision_at_1000 + value: 1.978 + - type: precision_at_3 + value: 50 + - type: precision_at_5 + value: 41.349999999999994 + - type: recall_at_1 + value: 8.498 + - type: recall_at_10 + value: 25.070999999999998 + - type: recall_at_100 + value: 52.383 + - type: recall_at_1000 + value: 74.91499999999999 + - type: recall_at_3 + value: 15.207999999999998 + - type: recall_at_5 + value: 18.563 + task: + type: Retrieval + - dataset: + config: default + name: MTEB EmotionClassification + revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 + split: test + type: mteb/emotion + metrics: + - type: accuracy + value: 46.5 + - type: f1 + value: 41.93833713984145 + task: + type: Classification + - dataset: + config: default + name: MTEB FEVER + revision: None + split: test + type: fever 
+ metrics: + - type: map_at_1 + value: 67.914 + - type: map_at_10 + value: 78.10000000000001 + - type: map_at_100 + value: 78.333 + - type: map_at_1000 + value: 78.346 + - type: map_at_3 + value: 76.626 + - type: map_at_5 + value: 77.627 + - type: mrr_at_1 + value: 72.74199999999999 + - type: mrr_at_10 + value: 82.414 + - type: mrr_at_100 + value: 82.511 + - type: mrr_at_1000 + value: 82.513 + - type: mrr_at_3 + value: 81.231 + - type: mrr_at_5 + value: 82.065 + - type: ndcg_at_1 + value: 72.74199999999999 + - type: ndcg_at_10 + value: 82.806 + - type: ndcg_at_100 + value: 83.677 + - type: ndcg_at_1000 + value: 83.917 + - type: ndcg_at_3 + value: 80.305 + - type: ndcg_at_5 + value: 81.843 + - type: precision_at_1 + value: 72.74199999999999 + - type: precision_at_10 + value: 10.24 + - type: precision_at_100 + value: 1.089 + - type: precision_at_1000 + value: 0.11299999999999999 + - type: precision_at_3 + value: 31.268 + - type: precision_at_5 + value: 19.706000000000003 + - type: recall_at_1 + value: 67.914 + - type: recall_at_10 + value: 92.889 + - type: recall_at_100 + value: 96.42699999999999 + - type: recall_at_1000 + value: 97.92 + - type: recall_at_3 + value: 86.21 + - type: recall_at_5 + value: 90.036 + task: + type: Retrieval + - dataset: + config: default + name: MTEB FiQA2018 + revision: None + split: test + type: fiqa + metrics: + - type: map_at_1 + value: 22.166 + - type: map_at_10 + value: 35.57 + - type: map_at_100 + value: 37.405 + - type: map_at_1000 + value: 37.564 + - type: map_at_3 + value: 30.379 + - type: map_at_5 + value: 33.324 + - type: mrr_at_1 + value: 43.519000000000005 + - type: mrr_at_10 + value: 51.556000000000004 + - type: mrr_at_100 + value: 52.344 + - type: mrr_at_1000 + value: 52.373999999999995 + - type: mrr_at_3 + value: 48.868 + - type: mrr_at_5 + value: 50.319 + - type: ndcg_at_1 + value: 43.519000000000005 + - type: ndcg_at_10 + value: 43.803 + - type: ndcg_at_100 + value: 50.468999999999994 + - type: ndcg_at_1000 + value: 
53.111 + - type: ndcg_at_3 + value: 38.893 + - type: ndcg_at_5 + value: 40.653 + - type: precision_at_1 + value: 43.519000000000005 + - type: precision_at_10 + value: 12.253 + - type: precision_at_100 + value: 1.931 + - type: precision_at_1000 + value: 0.242 + - type: precision_at_3 + value: 25.617 + - type: precision_at_5 + value: 19.383 + - type: recall_at_1 + value: 22.166 + - type: recall_at_10 + value: 51.6 + - type: recall_at_100 + value: 76.574 + - type: recall_at_1000 + value: 92.192 + - type: recall_at_3 + value: 34.477999999999994 + - type: recall_at_5 + value: 41.835 + task: + type: Retrieval + - dataset: + config: default + name: MTEB HotpotQA + revision: None + split: test + type: hotpotqa + metrics: + - type: map_at_1 + value: 39.041 + - type: map_at_10 + value: 62.961999999999996 + - type: map_at_100 + value: 63.79899999999999 + - type: map_at_1000 + value: 63.854 + - type: map_at_3 + value: 59.399 + - type: map_at_5 + value: 61.669 + - type: mrr_at_1 + value: 78.082 + - type: mrr_at_10 + value: 84.321 + - type: mrr_at_100 + value: 84.49600000000001 + - type: mrr_at_1000 + value: 84.502 + - type: mrr_at_3 + value: 83.421 + - type: mrr_at_5 + value: 83.977 + - type: ndcg_at_1 + value: 78.082 + - type: ndcg_at_10 + value: 71.229 + - type: ndcg_at_100 + value: 74.10900000000001 + - type: ndcg_at_1000 + value: 75.169 + - type: ndcg_at_3 + value: 66.28699999999999 + - type: ndcg_at_5 + value: 69.084 + - type: precision_at_1 + value: 78.082 + - type: precision_at_10 + value: 14.993 + - type: precision_at_100 + value: 1.7239999999999998 + - type: precision_at_1000 + value: 0.186 + - type: precision_at_3 + value: 42.737 + - type: precision_at_5 + value: 27.843 + - type: recall_at_1 + value: 39.041 + - type: recall_at_10 + value: 74.96300000000001 + - type: recall_at_100 + value: 86.199 + - type: recall_at_1000 + value: 93.228 + - type: recall_at_3 + value: 64.105 + - type: recall_at_5 + value: 69.608 + task: + type: Retrieval + - dataset: + config: default + 
name: MTEB ImdbClassification + revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 + split: test + type: mteb/imdb + metrics: + - type: accuracy + value: 90.23160000000001 + - type: ap + value: 85.5674856808308 + - type: f1 + value: 90.18033354786317 + task: + type: Classification + - dataset: + config: default + name: MTEB MSMARCO + revision: None + split: dev + type: msmarco + metrics: + - type: map_at_1 + value: 24.091 + - type: map_at_10 + value: 36.753 + - type: map_at_100 + value: 37.913000000000004 + - type: map_at_1000 + value: 37.958999999999996 + - type: map_at_3 + value: 32.818999999999996 + - type: map_at_5 + value: 35.171 + - type: mrr_at_1 + value: 24.742 + - type: mrr_at_10 + value: 37.285000000000004 + - type: mrr_at_100 + value: 38.391999999999996 + - type: mrr_at_1000 + value: 38.431 + - type: mrr_at_3 + value: 33.440999999999995 + - type: mrr_at_5 + value: 35.75 + - type: ndcg_at_1 + value: 24.742 + - type: ndcg_at_10 + value: 43.698 + - type: ndcg_at_100 + value: 49.145 + - type: ndcg_at_1000 + value: 50.23800000000001 + - type: ndcg_at_3 + value: 35.769 + - type: ndcg_at_5 + value: 39.961999999999996 + - type: precision_at_1 + value: 24.742 + - type: precision_at_10 + value: 6.7989999999999995 + - type: precision_at_100 + value: 0.95 + - type: precision_at_1000 + value: 0.104 + - type: precision_at_3 + value: 15.096000000000002 + - type: precision_at_5 + value: 11.183 + - type: recall_at_1 + value: 24.091 + - type: recall_at_10 + value: 65.068 + - type: recall_at_100 + value: 89.899 + - type: recall_at_1000 + value: 98.16 + - type: recall_at_3 + value: 43.68 + - type: recall_at_5 + value: 53.754999999999995 + task: + type: Retrieval + - dataset: + config: en + name: MTEB MTOPDomainClassification (en) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 93.66621067031465 + - type: f1 + value: 93.49622853272142 + task: + type: Classification + - dataset: + config: de + 
name: MTEB MTOPDomainClassification (de) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 91.94702733164272 + - type: f1 + value: 91.17043441745282 + task: + type: Classification + - dataset: + config: es + name: MTEB MTOPDomainClassification (es) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 92.20146764509674 + - type: f1 + value: 91.98359080555608 + task: + type: Classification + - dataset: + config: fr + name: MTEB MTOPDomainClassification (fr) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 88.99780770435328 + - type: f1 + value: 89.19746342724068 + task: + type: Classification + - dataset: + config: hi + name: MTEB MTOPDomainClassification (hi) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 89.78486912871998 + - type: f1 + value: 89.24578823628642 + task: + type: Classification + - dataset: + config: th + name: MTEB MTOPDomainClassification (th) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 88.74502712477394 + - type: f1 + value: 89.00297573881542 + task: + type: Classification + - dataset: + config: en + name: MTEB MTOPIntentClassification (en) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 77.9046967624259 + - type: f1 + value: 59.36787125785957 + task: + type: Classification + - dataset: + config: de + name: MTEB MTOPIntentClassification (de) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 74.5280360664976 + - type: f1 + value: 57.17723440888718 + task: + type: Classification + - 
dataset: + config: es + name: MTEB MTOPIntentClassification (es) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 75.44029352901934 + - type: f1 + value: 54.052855531072964 + task: + type: Classification + - dataset: + config: fr + name: MTEB MTOPIntentClassification (fr) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 70.5606013153774 + - type: f1 + value: 52.62215934386531 + task: + type: Classification + - dataset: + config: hi + name: MTEB MTOPIntentClassification (hi) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 73.11581211903908 + - type: f1 + value: 52.341291845645465 + task: + type: Classification + - dataset: + config: th + name: MTEB MTOPIntentClassification (th) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 74.28933092224233 + - type: f1 + value: 57.07918745504911 + task: + type: Classification + - dataset: + config: af + name: MTEB MassiveIntentClassification (af) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 62.38063214525892 + - type: f1 + value: 59.46463723443009 + task: + type: Classification + - dataset: + config: am + name: MTEB MassiveIntentClassification (am) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 56.06926698049766 + - type: f1 + value: 52.49084283283562 + task: + type: Classification + - dataset: + config: ar + name: MTEB MassiveIntentClassification (ar) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 60.74983187626093 + - type: 
f1 + value: 56.960640620165904 + task: + type: Classification + - dataset: + config: az + name: MTEB MassiveIntentClassification (az) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 64.86550100874243 + - type: f1 + value: 62.47370548140688 + task: + type: Classification + - dataset: + config: bn + name: MTEB MassiveIntentClassification (bn) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 63.971082716879636 + - type: f1 + value: 61.03812421957381 + task: + type: Classification + - dataset: + config: cy + name: MTEB MassiveIntentClassification (cy) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 54.98318762609282 + - type: f1 + value: 51.51207916008392 + task: + type: Classification + - dataset: + config: da + name: MTEB MassiveIntentClassification (da) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.45527908540686 + - type: f1 + value: 66.16631905400318 + task: + type: Classification + - dataset: + config: de + name: MTEB MassiveIntentClassification (de) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.32750504371216 + - type: f1 + value: 66.16755288646591 + task: + type: Classification + - dataset: + config: el + name: MTEB MassiveIntentClassification (el) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.09213180901143 + - type: f1 + value: 66.95654394661507 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveIntentClassification (en) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 
+ split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 73.75588433086752 + - type: f1 + value: 71.79973779656923 + task: + type: Classification + - dataset: + config: es + name: MTEB MassiveIntentClassification (es) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.49428379287154 + - type: f1 + value: 68.37494379215734 + task: + type: Classification + - dataset: + config: fa + name: MTEB MassiveIntentClassification (fa) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.90921318090115 + - type: f1 + value: 66.79517376481645 + task: + type: Classification + - dataset: + config: fi + name: MTEB MassiveIntentClassification (fi) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.12104909213181 + - type: f1 + value: 67.29448842879584 + task: + type: Classification + - dataset: + config: fr + name: MTEB MassiveIntentClassification (fr) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.34095494283793 + - type: f1 + value: 67.01134288992947 + task: + type: Classification + - dataset: + config: he + name: MTEB MassiveIntentClassification (he) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 67.61264290517822 + - type: f1 + value: 64.68730512660757 + task: + type: Classification + - dataset: + config: hi + name: MTEB MassiveIntentClassification (hi) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 67.79757901815738 + - type: f1 + value: 65.24938539425598 + task: + type: Classification + - 
dataset: + config: hu + name: MTEB MassiveIntentClassification (hu) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.68728984532616 + - type: f1 + value: 67.0487169762553 + task: + type: Classification + - dataset: + config: hy + name: MTEB MassiveIntentClassification (hy) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 62.07464694014795 + - type: f1 + value: 59.183532276789286 + task: + type: Classification + - dataset: + config: id + name: MTEB MassiveIntentClassification (id) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.04707464694015 + - type: f1 + value: 67.66829629003848 + task: + type: Classification + - dataset: + config: is + name: MTEB MassiveIntentClassification (is) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 62.42434431741762 + - type: f1 + value: 59.01617226544757 + task: + type: Classification + - dataset: + config: it + name: MTEB MassiveIntentClassification (it) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.53127101546738 + - type: f1 + value: 68.10033760906255 + task: + type: Classification + - dataset: + config: ja + name: MTEB MassiveIntentClassification (ja) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 72.50504371217215 + - type: f1 + value: 69.74931103158923 + task: + type: Classification + - dataset: + config: jv + name: MTEB MassiveIntentClassification (jv) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - 
type: accuracy + value: 57.91190316072628 + - type: f1 + value: 54.05551136648796 + task: + type: Classification + - dataset: + config: ka + name: MTEB MassiveIntentClassification (ka) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 51.78211163416275 + - type: f1 + value: 49.874888544058535 + task: + type: Classification + - dataset: + config: km + name: MTEB MassiveIntentClassification (km) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 47.017484868863484 + - type: f1 + value: 44.53364263352014 + task: + type: Classification + - dataset: + config: kn + name: MTEB MassiveIntentClassification (kn) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 62.16207128446537 + - type: f1 + value: 59.01185692320829 + task: + type: Classification + - dataset: + config: ko + name: MTEB MassiveIntentClassification (ko) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.42501681237391 + - type: f1 + value: 67.13169450166086 + task: + type: Classification + - dataset: + config: lv + name: MTEB MassiveIntentClassification (lv) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 67.0780094149294 + - type: f1 + value: 64.41720167850707 + task: + type: Classification + - dataset: + config: ml + name: MTEB MassiveIntentClassification (ml) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 65.57162071284466 + - type: f1 + value: 62.414138683804424 + task: + type: Classification + - dataset: + config: mn + name: MTEB MassiveIntentClassification (mn) 
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 61.71149966375252 + - type: f1 + value: 58.594805125087234 + task: + type: Classification + - dataset: + config: ms + name: MTEB MassiveIntentClassification (ms) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 66.03900470746471 + - type: f1 + value: 63.87937257883887 + task: + type: Classification + - dataset: + config: my + name: MTEB MassiveIntentClassification (my) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 60.8776059179556 + - type: f1 + value: 57.48587618059131 + task: + type: Classification + - dataset: + config: nb + name: MTEB MassiveIntentClassification (nb) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.87895090786819 + - type: f1 + value: 66.8141299430347 + task: + type: Classification + - dataset: + config: nl + name: MTEB MassiveIntentClassification (nl) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.45057162071285 + - type: f1 + value: 67.46444039673516 + task: + type: Classification + - dataset: + config: pl + name: MTEB MassiveIntentClassification (pl) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 71.546738399462 + - type: f1 + value: 68.63640876702655 + task: + type: Classification + - dataset: + config: pt + name: MTEB MassiveIntentClassification (pt) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 70.72965702757229 + - type: f1 + value: 
68.54119560379115 + task: + type: Classification + - dataset: + config: ro + name: MTEB MassiveIntentClassification (ro) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 68.35574983187625 + - type: f1 + value: 65.88844917691927 + task: + type: Classification + - dataset: + config: ru + name: MTEB MassiveIntentClassification (ru) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 71.70477471418964 + - type: f1 + value: 69.19665697061978 + task: + type: Classification + - dataset: + config: sl + name: MTEB MassiveIntentClassification (sl) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 67.0880968392737 + - type: f1 + value: 64.76962317666086 + task: + type: Classification + - dataset: + config: sq + name: MTEB MassiveIntentClassification (sq) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 65.18493611297916 + - type: f1 + value: 62.49984559035371 + task: + type: Classification + - dataset: + config: sv + name: MTEB MassiveIntentClassification (sv) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 71.75857431069265 + - type: f1 + value: 69.20053687623418 + task: + type: Classification + - dataset: + config: sw + name: MTEB MassiveIntentClassification (sw) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 58.500336247478145 + - type: f1 + value: 55.2972398687929 + task: + type: Classification + - dataset: + config: ta + name: MTEB MassiveIntentClassification (ta) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test 
+ type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 62.68997982515132 + - type: f1 + value: 59.36848202755348 + task: + type: Classification + - dataset: + config: te + name: MTEB MassiveIntentClassification (te) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 63.01950235373235 + - type: f1 + value: 60.09351954625423 + task: + type: Classification + - dataset: + config: th + name: MTEB MassiveIntentClassification (th) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 68.29186281102892 + - type: f1 + value: 67.57860496703447 + task: + type: Classification + - dataset: + config: tl + name: MTEB MassiveIntentClassification (tl) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 64.77471418964357 + - type: f1 + value: 61.913983147713836 + task: + type: Classification + - dataset: + config: tr + name: MTEB MassiveIntentClassification (tr) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.87222595830532 + - type: f1 + value: 66.03679033708141 + task: + type: Classification + - dataset: + config: ur + name: MTEB MassiveIntentClassification (ur) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 64.04505716207127 + - type: f1 + value: 61.28569169817908 + task: + type: Classification + - dataset: + config: vi + name: MTEB MassiveIntentClassification (vi) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 69.38466711499663 + - type: f1 + value: 67.20532357036844 + task: + type: Classification + - dataset: + config: 
zh-CN + name: MTEB MassiveIntentClassification (zh-CN) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 71.12306657700067 + - type: f1 + value: 68.91251226588182 + task: + type: Classification + - dataset: + config: zh-TW + name: MTEB MassiveIntentClassification (zh-TW) + revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 66.20040349697378 + - type: f1 + value: 66.02657347714175 + task: + type: Classification + - dataset: + config: af + name: MTEB MassiveScenarioClassification (af) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 68.73907195696032 + - type: f1 + value: 66.98484521791418 + task: + type: Classification + - dataset: + config: am + name: MTEB MassiveScenarioClassification (am) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 60.58843308675185 + - type: f1 + value: 58.95591723092005 + task: + type: Classification + - dataset: + config: ar + name: MTEB MassiveScenarioClassification (ar) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 66.22730329522528 + - type: f1 + value: 66.0894499712115 + task: + type: Classification + - dataset: + config: az + name: MTEB MassiveScenarioClassification (az) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 66.48285137861465 + - type: f1 + value: 65.21963176785157 + task: + type: Classification + - dataset: + config: bn + name: MTEB MassiveScenarioClassification (bn) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + 
metrics: + - type: accuracy + value: 67.74714189643578 + - type: f1 + value: 66.8212192745412 + task: + type: Classification + - dataset: + config: cy + name: MTEB MassiveScenarioClassification (cy) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 59.09213180901143 + - type: f1 + value: 56.70735546356339 + task: + type: Classification + - dataset: + config: da + name: MTEB MassiveScenarioClassification (da) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 75.05716207128448 + - type: f1 + value: 74.8413712365364 + task: + type: Classification + - dataset: + config: de + name: MTEB MassiveScenarioClassification (de) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.69737726967047 + - type: f1 + value: 74.7664341963 + task: + type: Classification + - dataset: + config: el + name: MTEB MassiveScenarioClassification (el) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.90383322125084 + - type: f1 + value: 73.59201554448323 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveScenarioClassification (en) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 77.51176866173503 + - type: f1 + value: 77.46104434577758 + task: + type: Classification + - dataset: + config: es + name: MTEB MassiveScenarioClassification (es) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.31069266980496 + - type: f1 + value: 74.61048660675635 + task: + type: Classification + - dataset: + config: fa + name: MTEB 
MassiveScenarioClassification (fa) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 72.95225285810356 + - type: f1 + value: 72.33160006574627 + task: + type: Classification + - dataset: + config: fi + name: MTEB MassiveScenarioClassification (fi) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.12373907195696 + - type: f1 + value: 73.20921012557481 + task: + type: Classification + - dataset: + config: fr + name: MTEB MassiveScenarioClassification (fr) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.86684599865501 + - type: f1 + value: 73.82348774610831 + task: + type: Classification + - dataset: + config: he + name: MTEB MassiveScenarioClassification (he) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 71.40215198386012 + - type: f1 + value: 71.11945183971858 + task: + type: Classification + - dataset: + config: hi + name: MTEB MassiveScenarioClassification (hi) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 72.12844653665098 + - type: f1 + value: 71.34450495911766 + task: + type: Classification + - dataset: + config: hu + name: MTEB MassiveScenarioClassification (hu) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.52252858103566 + - type: f1 + value: 73.98878711342999 + task: + type: Classification + - dataset: + config: hy + name: MTEB MassiveScenarioClassification (hy) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: 
accuracy + value: 64.93611297915265 + - type: f1 + value: 63.723200467653385 + task: + type: Classification + - dataset: + config: id + name: MTEB MassiveScenarioClassification (id) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.11903160726295 + - type: f1 + value: 73.82138439467096 + task: + type: Classification + - dataset: + config: is + name: MTEB MassiveScenarioClassification (is) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 67.15198386012105 + - type: f1 + value: 66.02172193802167 + task: + type: Classification + - dataset: + config: it + name: MTEB MassiveScenarioClassification (it) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.32414256893072 + - type: f1 + value: 74.30943421170574 + task: + type: Classification + - dataset: + config: ja + name: MTEB MassiveScenarioClassification (ja) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 77.46805648957633 + - type: f1 + value: 77.62808409298209 + task: + type: Classification + - dataset: + config: jv + name: MTEB MassiveScenarioClassification (jv) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 63.318762609280434 + - type: f1 + value: 62.094284066075076 + task: + type: Classification + - dataset: + config: ka + name: MTEB MassiveScenarioClassification (ka) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 58.34902488231338 + - type: f1 + value: 57.12893860987984 + task: + type: Classification + - dataset: + config: km + name: MTEB 
MassiveScenarioClassification (km) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 50.88433086751849 + - type: f1 + value: 48.2272350802058 + task: + type: Classification + - dataset: + config: kn + name: MTEB MassiveScenarioClassification (kn) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 66.4425016812374 + - type: f1 + value: 64.61463095996173 + task: + type: Classification + - dataset: + config: ko + name: MTEB MassiveScenarioClassification (ko) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 75.04707464694015 + - type: f1 + value: 75.05099199098998 + task: + type: Classification + - dataset: + config: lv + name: MTEB MassiveScenarioClassification (lv) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 70.50437121721586 + - type: f1 + value: 69.83397721096314 + task: + type: Classification + - dataset: + config: ml + name: MTEB MassiveScenarioClassification (ml) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 69.94283792871553 + - type: f1 + value: 68.8704663703913 + task: + type: Classification + - dataset: + config: mn + name: MTEB MassiveScenarioClassification (mn) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 64.79488903833222 + - type: f1 + value: 63.615424063345436 + task: + type: Classification + - dataset: + config: ms + name: MTEB MassiveScenarioClassification (ms) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: 
accuracy + value: 69.88231338264963 + - type: f1 + value: 68.57892302593237 + task: + type: Classification + - dataset: + config: my + name: MTEB MassiveScenarioClassification (my) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 63.248150638870214 + - type: f1 + value: 61.06680605338809 + task: + type: Classification + - dataset: + config: nb + name: MTEB MassiveScenarioClassification (nb) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.84196368527236 + - type: f1 + value: 74.52566464968763 + task: + type: Classification + - dataset: + config: nl + name: MTEB MassiveScenarioClassification (nl) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.8285137861466 + - type: f1 + value: 74.8853197608802 + task: + type: Classification + - dataset: + config: pl + name: MTEB MassiveScenarioClassification (pl) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 74.13248150638869 + - type: f1 + value: 74.3982040999179 + task: + type: Classification + - dataset: + config: pt + name: MTEB MassiveScenarioClassification (pt) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.49024882313383 + - type: f1 + value: 73.82153848368573 + task: + type: Classification + - dataset: + config: ro + name: MTEB MassiveScenarioClassification (ro) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 71.72158708809684 + - type: f1 + value: 71.85049433180541 + task: + type: Classification + - dataset: + config: ru + name: MTEB 
MassiveScenarioClassification (ru) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 75.137861466039 + - type: f1 + value: 75.37628348188467 + task: + type: Classification + - dataset: + config: sl + name: MTEB MassiveScenarioClassification (sl) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 71.86953597848016 + - type: f1 + value: 71.87537624521661 + task: + type: Classification + - dataset: + config: sq + name: MTEB MassiveScenarioClassification (sq) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 70.27572293207801 + - type: f1 + value: 68.80017302344231 + task: + type: Classification + - dataset: + config: sv + name: MTEB MassiveScenarioClassification (sv) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 76.09952925353059 + - type: f1 + value: 76.07992707688408 + task: + type: Classification + - dataset: + config: sw + name: MTEB MassiveScenarioClassification (sw) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 63.140551445864155 + - type: f1 + value: 61.73855010331415 + task: + type: Classification + - dataset: + config: ta + name: MTEB MassiveScenarioClassification (ta) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 66.27774041694687 + - type: f1 + value: 64.83664868894539 + task: + type: Classification + - dataset: + config: te + name: MTEB MassiveScenarioClassification (te) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: 
accuracy + value: 66.69468728984533 + - type: f1 + value: 64.76239666920868 + task: + type: Classification + - dataset: + config: th + name: MTEB MassiveScenarioClassification (th) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.44653665097512 + - type: f1 + value: 73.14646052013873 + task: + type: Classification + - dataset: + config: tl + name: MTEB MassiveScenarioClassification (tl) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 67.71351714862139 + - type: f1 + value: 66.67212180163382 + task: + type: Classification + - dataset: + config: tr + name: MTEB MassiveScenarioClassification (tr) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.9946200403497 + - type: f1 + value: 73.87348793725525 + task: + type: Classification + - dataset: + config: ur + name: MTEB MassiveScenarioClassification (ur) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 68.15400134498992 + - type: f1 + value: 67.09433241421094 + task: + type: Classification + - dataset: + config: vi + name: MTEB MassiveScenarioClassification (vi) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 73.11365164761264 + - type: f1 + value: 73.59502539433753 + task: + type: Classification + - dataset: + config: zh-CN + name: MTEB MassiveScenarioClassification (zh-CN) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 76.82582380632145 + - type: f1 + value: 76.89992945316313 + task: + type: Classification + - dataset: + config: zh-TW + name: MTEB 
MassiveScenarioClassification (zh-TW) + revision: 7d571f92784cd94a019292a1f45445077d0ef634 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 71.81237390719569 + - type: f1 + value: 72.36499770986265 + task: + type: Classification + - dataset: + config: default + name: MTEB MedrxivClusteringP2P + revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 + split: test + type: mteb/medrxiv-clustering-p2p + metrics: + - type: v_measure + value: 31.480506569594695 + task: + type: Clustering + - dataset: + config: default + name: MTEB MedrxivClusteringS2S + revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 + split: test + type: mteb/medrxiv-clustering-s2s + metrics: + - type: v_measure + value: 29.71252128004552 + task: + type: Clustering + - dataset: + config: default + name: MTEB MindSmallReranking + revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 + split: test + type: mteb/mind_small + metrics: + - type: map + value: 31.421396787056548 + - type: mrr + value: 32.48155274872267 + task: + type: Reranking + - dataset: + config: default + name: MTEB NFCorpus + revision: None + split: test + type: nfcorpus + metrics: + - type: map_at_1 + value: 5.595 + - type: map_at_10 + value: 12.642000000000001 + - type: map_at_100 + value: 15.726 + - type: map_at_1000 + value: 17.061999999999998 + - type: map_at_3 + value: 9.125 + - type: map_at_5 + value: 10.866000000000001 + - type: mrr_at_1 + value: 43.344 + - type: mrr_at_10 + value: 52.227999999999994 + - type: mrr_at_100 + value: 52.898999999999994 + - type: mrr_at_1000 + value: 52.944 + - type: mrr_at_3 + value: 49.845 + - type: mrr_at_5 + value: 51.115 + - type: ndcg_at_1 + value: 41.949999999999996 + - type: ndcg_at_10 + value: 33.995 + - type: ndcg_at_100 + value: 30.869999999999997 + - type: ndcg_at_1000 + value: 39.487 + - type: ndcg_at_3 + value: 38.903999999999996 + - type: ndcg_at_5 + value: 37.236999999999995 + - type: precision_at_1 + value: 43.344 + - type: precision_at_10 + value: 
25.480000000000004 + - type: precision_at_100 + value: 7.672 + - type: precision_at_1000 + value: 2.028 + - type: precision_at_3 + value: 36.636 + - type: precision_at_5 + value: 32.632 + - type: recall_at_1 + value: 5.595 + - type: recall_at_10 + value: 16.466 + - type: recall_at_100 + value: 31.226 + - type: recall_at_1000 + value: 62.778999999999996 + - type: recall_at_3 + value: 9.931 + - type: recall_at_5 + value: 12.884 + task: + type: Retrieval + - dataset: + config: default + name: MTEB NQ + revision: None + split: test + type: nq + metrics: + - type: map_at_1 + value: 40.414 + - type: map_at_10 + value: 56.754000000000005 + - type: map_at_100 + value: 57.457 + - type: map_at_1000 + value: 57.477999999999994 + - type: map_at_3 + value: 52.873999999999995 + - type: map_at_5 + value: 55.175 + - type: mrr_at_1 + value: 45.278 + - type: mrr_at_10 + value: 59.192 + - type: mrr_at_100 + value: 59.650000000000006 + - type: mrr_at_1000 + value: 59.665 + - type: mrr_at_3 + value: 56.141 + - type: mrr_at_5 + value: 57.998000000000005 + - type: ndcg_at_1 + value: 45.278 + - type: ndcg_at_10 + value: 64.056 + - type: ndcg_at_100 + value: 66.89 + - type: ndcg_at_1000 + value: 67.364 + - type: ndcg_at_3 + value: 56.97 + - type: ndcg_at_5 + value: 60.719 + - type: precision_at_1 + value: 45.278 + - type: precision_at_10 + value: 9.994 + - type: precision_at_100 + value: 1.165 + - type: precision_at_1000 + value: 0.121 + - type: precision_at_3 + value: 25.512 + - type: precision_at_5 + value: 17.509 + - type: recall_at_1 + value: 40.414 + - type: recall_at_10 + value: 83.596 + - type: recall_at_100 + value: 95.72 + - type: recall_at_1000 + value: 99.24 + - type: recall_at_3 + value: 65.472 + - type: recall_at_5 + value: 74.039 + task: + type: Retrieval + - dataset: + config: default + name: MTEB QuoraRetrieval + revision: None + split: test + type: quora + metrics: + - type: map_at_1 + value: 70.352 + - type: map_at_10 + value: 84.369 + - type: map_at_100 + value: 
85.02499999999999 + - type: map_at_1000 + value: 85.04 + - type: map_at_3 + value: 81.42399999999999 + - type: map_at_5 + value: 83.279 + - type: mrr_at_1 + value: 81.05 + - type: mrr_at_10 + value: 87.401 + - type: mrr_at_100 + value: 87.504 + - type: mrr_at_1000 + value: 87.505 + - type: mrr_at_3 + value: 86.443 + - type: mrr_at_5 + value: 87.10799999999999 + - type: ndcg_at_1 + value: 81.04 + - type: ndcg_at_10 + value: 88.181 + - type: ndcg_at_100 + value: 89.411 + - type: ndcg_at_1000 + value: 89.507 + - type: ndcg_at_3 + value: 85.28099999999999 + - type: ndcg_at_5 + value: 86.888 + - type: precision_at_1 + value: 81.04 + - type: precision_at_10 + value: 13.406 + - type: precision_at_100 + value: 1.5350000000000001 + - type: precision_at_1000 + value: 0.157 + - type: precision_at_3 + value: 37.31 + - type: precision_at_5 + value: 24.54 + - type: recall_at_1 + value: 70.352 + - type: recall_at_10 + value: 95.358 + - type: recall_at_100 + value: 99.541 + - type: recall_at_1000 + value: 99.984 + - type: recall_at_3 + value: 87.111 + - type: recall_at_5 + value: 91.643 + task: + type: Retrieval + - dataset: + config: default + name: MTEB RedditClustering + revision: 24640382cdbf8abc73003fb0fa6d111a705499eb + split: test + type: mteb/reddit-clustering + metrics: + - type: v_measure + value: 46.54068723291946 + task: + type: Clustering + - dataset: + config: default + name: MTEB RedditClusteringP2P + revision: 282350215ef01743dc01b456c7f5241fa8937f16 + split: test + type: mteb/reddit-clustering-p2p + metrics: + - type: v_measure + value: 63.216287629895994 + task: + type: Clustering + - dataset: + config: default + name: MTEB SCIDOCS + revision: None + split: test + type: scidocs + metrics: + - type: map_at_1 + value: 4.023000000000001 + - type: map_at_10 + value: 10.071 + - type: map_at_100 + value: 11.892 + - type: map_at_1000 + value: 12.196 + - type: map_at_3 + value: 7.234 + - type: map_at_5 + value: 8.613999999999999 + - type: mrr_at_1 + value: 
19.900000000000002 + - type: mrr_at_10 + value: 30.516 + - type: mrr_at_100 + value: 31.656000000000002 + - type: mrr_at_1000 + value: 31.723000000000003 + - type: mrr_at_3 + value: 27.400000000000002 + - type: mrr_at_5 + value: 29.270000000000003 + - type: ndcg_at_1 + value: 19.900000000000002 + - type: ndcg_at_10 + value: 17.474 + - type: ndcg_at_100 + value: 25.020999999999997 + - type: ndcg_at_1000 + value: 30.728 + - type: ndcg_at_3 + value: 16.588 + - type: ndcg_at_5 + value: 14.498 + - type: precision_at_1 + value: 19.900000000000002 + - type: precision_at_10 + value: 9.139999999999999 + - type: precision_at_100 + value: 2.011 + - type: precision_at_1000 + value: 0.33899999999999997 + - type: precision_at_3 + value: 15.667 + - type: precision_at_5 + value: 12.839999999999998 + - type: recall_at_1 + value: 4.023000000000001 + - type: recall_at_10 + value: 18.497 + - type: recall_at_100 + value: 40.8 + - type: recall_at_1000 + value: 68.812 + - type: recall_at_3 + value: 9.508 + - type: recall_at_5 + value: 12.983 + task: + type: Retrieval + - dataset: + config: default + name: MTEB SICK-R + revision: a6ea5a8cab320b040a23452cc28066d9beae2cee + split: test + type: mteb/sickr-sts + metrics: + - type: cos_sim_pearson + value: 83.967008785134 + - type: cos_sim_spearman + value: 80.23142141101837 + - type: euclidean_pearson + value: 81.20166064704539 + - type: euclidean_spearman + value: 80.18961335654585 + - type: manhattan_pearson + value: 81.13925443187625 + - type: manhattan_spearman + value: 80.07948723044424 + task: + type: STS + - dataset: + config: default + name: MTEB STS12 + revision: a0d554a64d88156834ff5ae9920b964011b16384 + split: test + type: mteb/sts12-sts + metrics: + - type: cos_sim_pearson + value: 86.94262461316023 + - type: cos_sim_spearman + value: 80.01596278563865 + - type: euclidean_pearson + value: 83.80799622922581 + - type: euclidean_spearman + value: 79.94984954947103 + - type: manhattan_pearson + value: 83.68473841756281 + - type: 
manhattan_spearman + value: 79.84990707951822 + task: + type: STS + - dataset: + config: default + name: MTEB STS13 + revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca + split: test + type: mteb/sts13-sts + metrics: + - type: cos_sim_pearson + value: 80.57346443146068 + - type: cos_sim_spearman + value: 81.54689837570866 + - type: euclidean_pearson + value: 81.10909881516007 + - type: euclidean_spearman + value: 81.56746243261762 + - type: manhattan_pearson + value: 80.87076036186582 + - type: manhattan_spearman + value: 81.33074987964402 + task: + type: STS + - dataset: + config: default + name: MTEB STS14 + revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 + split: test + type: mteb/sts14-sts + metrics: + - type: cos_sim_pearson + value: 79.54733787179849 + - type: cos_sim_spearman + value: 77.72202105610411 + - type: euclidean_pearson + value: 78.9043595478849 + - type: euclidean_spearman + value: 77.93422804309435 + - type: manhattan_pearson + value: 78.58115121621368 + - type: manhattan_spearman + value: 77.62508135122033 + task: + type: STS + - dataset: + config: default + name: MTEB STS15 + revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 + split: test + type: mteb/sts15-sts + metrics: + - type: cos_sim_pearson + value: 88.59880017237558 + - type: cos_sim_spearman + value: 89.31088630824758 + - type: euclidean_pearson + value: 88.47069261564656 + - type: euclidean_spearman + value: 89.33581971465233 + - type: manhattan_pearson + value: 88.40774264100956 + - type: manhattan_spearman + value: 89.28657485627835 + task: + type: STS + - dataset: + config: default + name: MTEB STS16 + revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 + split: test + type: mteb/sts16-sts + metrics: + - type: cos_sim_pearson + value: 84.08055117917084 + - type: cos_sim_spearman + value: 85.78491813080304 + - type: euclidean_pearson + value: 84.99329155500392 + - type: euclidean_spearman + value: 85.76728064677287 + - type: manhattan_pearson + value: 84.87947428989587 + - type: 
manhattan_spearman + value: 85.62429454917464 + task: + type: STS + - dataset: + config: ko-ko + name: MTEB STS17 (ko-ko) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 82.14190939287384 + - type: cos_sim_spearman + value: 82.27331573306041 + - type: euclidean_pearson + value: 81.891896953716 + - type: euclidean_spearman + value: 82.37695542955998 + - type: manhattan_pearson + value: 81.73123869460504 + - type: manhattan_spearman + value: 82.19989168441421 + task: + type: STS + - dataset: + config: ar-ar + name: MTEB STS17 (ar-ar) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 76.84695301843362 + - type: cos_sim_spearman + value: 77.87790986014461 + - type: euclidean_pearson + value: 76.91981583106315 + - type: euclidean_spearman + value: 77.88154772749589 + - type: manhattan_pearson + value: 76.94953277451093 + - type: manhattan_spearman + value: 77.80499230728604 + task: + type: STS + - dataset: + config: en-ar + name: MTEB STS17 (en-ar) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 75.44657840482016 + - type: cos_sim_spearman + value: 75.05531095119674 + - type: euclidean_pearson + value: 75.88161755829299 + - type: euclidean_spearman + value: 74.73176238219332 + - type: manhattan_pearson + value: 75.63984765635362 + - type: manhattan_spearman + value: 74.86476440770737 + task: + type: STS + - dataset: + config: en-de + name: MTEB STS17 (en-de) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 85.64700140524133 + - type: cos_sim_spearman + value: 86.16014210425672 + - type: euclidean_pearson + value: 86.49086860843221 + - type: euclidean_spearman + value: 
86.09729326815614 + - type: manhattan_pearson + value: 86.43406265125513 + - type: manhattan_spearman + value: 86.17740150939994 + task: + type: STS + - dataset: + config: en-en + name: MTEB STS17 (en-en) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 87.91170098764921 + - type: cos_sim_spearman + value: 88.12437004058931 + - type: euclidean_pearson + value: 88.81828254494437 + - type: euclidean_spearman + value: 88.14831794572122 + - type: manhattan_pearson + value: 88.93442183448961 + - type: manhattan_spearman + value: 88.15254630778304 + task: + type: STS + - dataset: + config: en-tr + name: MTEB STS17 (en-tr) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 72.91390577997292 + - type: cos_sim_spearman + value: 71.22979457536074 + - type: euclidean_pearson + value: 74.40314008106749 + - type: euclidean_spearman + value: 72.54972136083246 + - type: manhattan_pearson + value: 73.85687539530218 + - type: manhattan_spearman + value: 72.09500771742637 + task: + type: STS + - dataset: + config: es-en + name: MTEB STS17 (es-en) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 80.9301067983089 + - type: cos_sim_spearman + value: 80.74989828346473 + - type: euclidean_pearson + value: 81.36781301814257 + - type: euclidean_spearman + value: 80.9448819964426 + - type: manhattan_pearson + value: 81.0351322685609 + - type: manhattan_spearman + value: 80.70192121844177 + task: + type: STS + - dataset: + config: es-es + name: MTEB STS17 (es-es) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 87.13820465980005 + - type: cos_sim_spearman + value: 86.73532498758757 + - type: 
euclidean_pearson + value: 87.21329451846637 + - type: euclidean_spearman + value: 86.57863198601002 + - type: manhattan_pearson + value: 87.06973713818554 + - type: manhattan_spearman + value: 86.47534918791499 + task: + type: STS + - dataset: + config: fr-en + name: MTEB STS17 (fr-en) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 85.48720108904415 + - type: cos_sim_spearman + value: 85.62221757068387 + - type: euclidean_pearson + value: 86.1010129512749 + - type: euclidean_spearman + value: 85.86580966509942 + - type: manhattan_pearson + value: 86.26800938808971 + - type: manhattan_spearman + value: 85.88902721678429 + task: + type: STS + - dataset: + config: it-en + name: MTEB STS17 (it-en) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 83.98021347333516 + - type: cos_sim_spearman + value: 84.53806553803501 + - type: euclidean_pearson + value: 84.61483347248364 + - type: euclidean_spearman + value: 85.14191408011702 + - type: manhattan_pearson + value: 84.75297588825967 + - type: manhattan_spearman + value: 85.33176753669242 + task: + type: STS + - dataset: + config: nl-en + name: MTEB STS17 (nl-en) + revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 84.51856644893233 + - type: cos_sim_spearman + value: 85.27510748506413 + - type: euclidean_pearson + value: 85.09886861540977 + - type: euclidean_spearman + value: 85.62579245860887 + - type: manhattan_pearson + value: 84.93017860464607 + - type: manhattan_spearman + value: 85.5063988898453 + task: + type: STS + - dataset: + config: en + name: MTEB STS22 (en) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 
62.581573200584195 + - type: cos_sim_spearman + value: 63.05503590247928 + - type: euclidean_pearson + value: 63.652564812602094 + - type: euclidean_spearman + value: 62.64811520876156 + - type: manhattan_pearson + value: 63.506842893061076 + - type: manhattan_spearman + value: 62.51289573046917 + task: + type: STS + - dataset: + config: de + name: MTEB STS22 (de) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 48.2248801729127 + - type: cos_sim_spearman + value: 56.5936604678561 + - type: euclidean_pearson + value: 43.98149464089 + - type: euclidean_spearman + value: 56.108561882423615 + - type: manhattan_pearson + value: 43.86880305903564 + - type: manhattan_spearman + value: 56.04671150510166 + task: + type: STS + - dataset: + config: es + name: MTEB STS22 (es) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 55.17564527009831 + - type: cos_sim_spearman + value: 64.57978560979488 + - type: euclidean_pearson + value: 58.8818330154583 + - type: euclidean_spearman + value: 64.99214839071281 + - type: manhattan_pearson + value: 58.72671436121381 + - type: manhattan_spearman + value: 65.10713416616109 + task: + type: STS + - dataset: + config: pl + name: MTEB STS22 (pl) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 26.772131864023297 + - type: cos_sim_spearman + value: 34.68200792408681 + - type: euclidean_pearson + value: 16.68082419005441 + - type: euclidean_spearman + value: 34.83099932652166 + - type: manhattan_pearson + value: 16.52605949659529 + - type: manhattan_spearman + value: 34.82075801399475 + task: + type: STS + - dataset: + config: tr + name: MTEB STS22 (tr) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: 
mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 54.42415189043831 + - type: cos_sim_spearman + value: 63.54594264576758 + - type: euclidean_pearson + value: 57.36577498297745 + - type: euclidean_spearman + value: 63.111466379158074 + - type: manhattan_pearson + value: 57.584543715873885 + - type: manhattan_spearman + value: 63.22361054139183 + task: + type: STS + - dataset: + config: ar + name: MTEB STS22 (ar) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 47.55216762405518 + - type: cos_sim_spearman + value: 56.98670142896412 + - type: euclidean_pearson + value: 50.15318757562699 + - type: euclidean_spearman + value: 56.524941926541906 + - type: manhattan_pearson + value: 49.955618528674904 + - type: manhattan_spearman + value: 56.37102209240117 + task: + type: STS + - dataset: + config: ru + name: MTEB STS22 (ru) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 49.20540980338571 + - type: cos_sim_spearman + value: 59.9009453504406 + - type: euclidean_pearson + value: 49.557749853620535 + - type: euclidean_spearman + value: 59.76631621172456 + - type: manhattan_pearson + value: 49.62340591181147 + - type: manhattan_spearman + value: 59.94224880322436 + task: + type: STS + - dataset: + config: zh + name: MTEB STS22 (zh) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 51.508169956576985 + - type: cos_sim_spearman + value: 66.82461565306046 + - type: euclidean_pearson + value: 56.2274426480083 + - type: euclidean_spearman + value: 66.6775323848333 + - type: manhattan_pearson + value: 55.98277796300661 + - type: manhattan_spearman + value: 66.63669848497175 + task: + type: STS + - dataset: + config: fr + name: MTEB STS22 (fr) + revision: 
6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 72.86478788045507 + - type: cos_sim_spearman + value: 76.7946552053193 + - type: euclidean_pearson + value: 75.01598530490269 + - type: euclidean_spearman + value: 76.83618917858281 + - type: manhattan_pearson + value: 74.68337628304332 + - type: manhattan_spearman + value: 76.57480204017773 + task: + type: STS + - dataset: + config: de-en + name: MTEB STS22 (de-en) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 55.922619099401984 + - type: cos_sim_spearman + value: 56.599362477240774 + - type: euclidean_pearson + value: 56.68307052369783 + - type: euclidean_spearman + value: 54.28760436777401 + - type: manhattan_pearson + value: 56.67763566500681 + - type: manhattan_spearman + value: 53.94619541711359 + task: + type: STS + - dataset: + config: es-en + name: MTEB STS22 (es-en) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 66.74357206710913 + - type: cos_sim_spearman + value: 72.5208244925311 + - type: euclidean_pearson + value: 67.49254562186032 + - type: euclidean_spearman + value: 72.02469076238683 + - type: manhattan_pearson + value: 67.45251772238085 + - type: manhattan_spearman + value: 72.05538819984538 + task: + type: STS + - dataset: + config: it + name: MTEB STS22 (it) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 71.25734330033191 + - type: cos_sim_spearman + value: 76.98349083946823 + - type: euclidean_pearson + value: 73.71642838667736 + - type: euclidean_spearman + value: 77.01715504651384 + - type: manhattan_pearson + value: 73.61712711868105 + - type: manhattan_spearman + value: 77.01392571153896 + task: + type: 
STS + - dataset: + config: pl-en + name: MTEB STS22 (pl-en) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 63.18215462781212 + - type: cos_sim_spearman + value: 65.54373266117607 + - type: euclidean_pearson + value: 64.54126095439005 + - type: euclidean_spearman + value: 65.30410369102711 + - type: manhattan_pearson + value: 63.50332221148234 + - type: manhattan_spearman + value: 64.3455878104313 + task: + type: STS + - dataset: + config: zh-en + name: MTEB STS22 (zh-en) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 62.30509221440029 + - type: cos_sim_spearman + value: 65.99582704642478 + - type: euclidean_pearson + value: 63.43818859884195 + - type: euclidean_spearman + value: 66.83172582815764 + - type: manhattan_pearson + value: 63.055779168508764 + - type: manhattan_spearman + value: 65.49585020501449 + task: + type: STS + - dataset: + config: es-it + name: MTEB STS22 (es-it) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 59.587830825340404 + - type: cos_sim_spearman + value: 68.93467614588089 + - type: euclidean_pearson + value: 62.3073527367404 + - type: euclidean_spearman + value: 69.69758171553175 + - type: manhattan_pearson + value: 61.9074580815789 + - type: manhattan_spearman + value: 69.57696375597865 + task: + type: STS + - dataset: + config: de-fr + name: MTEB STS22 (de-fr) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 57.143220125577066 + - type: cos_sim_spearman + value: 67.78857859159226 + - type: euclidean_pearson + value: 55.58225107923733 + - type: euclidean_spearman + value: 67.80662907184563 + - type: manhattan_pearson + value: 
56.24953502726514 + - type: manhattan_spearman + value: 67.98262125431616 + task: + type: STS + - dataset: + config: de-pl + name: MTEB STS22 (de-pl) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 21.826928900322066 + - type: cos_sim_spearman + value: 49.578506634400405 + - type: euclidean_pearson + value: 27.939890138843214 + - type: euclidean_spearman + value: 52.71950519136242 + - type: manhattan_pearson + value: 26.39878683847546 + - type: manhattan_spearman + value: 47.54609580342499 + task: + type: STS + - dataset: + config: fr-pl + name: MTEB STS22 (fr-pl) + revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: cos_sim_pearson + value: 57.27603854632001 + - type: cos_sim_spearman + value: 50.709255283710995 + - type: euclidean_pearson + value: 59.5419024445929 + - type: euclidean_spearman + value: 50.709255283710995 + - type: manhattan_pearson + value: 59.03256832438492 + - type: manhattan_spearman + value: 61.97797868009122 + task: + type: STS + - dataset: + config: default + name: MTEB STSBenchmark + revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 + split: test + type: mteb/stsbenchmark-sts + metrics: + - type: cos_sim_pearson + value: 85.00757054859712 + - type: cos_sim_spearman + value: 87.29283629622222 + - type: euclidean_pearson + value: 86.54824171775536 + - type: euclidean_spearman + value: 87.24364730491402 + - type: manhattan_pearson + value: 86.5062156915074 + - type: manhattan_spearman + value: 87.15052170378574 + task: + type: STS + - dataset: + config: default + name: MTEB SciDocsRR + revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab + split: test + type: mteb/scidocs-reranking + metrics: + - type: map + value: 82.03549357197389 + - type: mrr + value: 95.05437645143527 + task: + type: Reranking + - dataset: + config: default + name: MTEB SciFact + revision: None + split: 
test + type: scifact + metrics: + - type: map_at_1 + value: 57.260999999999996 + - type: map_at_10 + value: 66.259 + - type: map_at_100 + value: 66.884 + - type: map_at_1000 + value: 66.912 + - type: map_at_3 + value: 63.685 + - type: map_at_5 + value: 65.35499999999999 + - type: mrr_at_1 + value: 60.333000000000006 + - type: mrr_at_10 + value: 67.5 + - type: mrr_at_100 + value: 68.013 + - type: mrr_at_1000 + value: 68.038 + - type: mrr_at_3 + value: 65.61099999999999 + - type: mrr_at_5 + value: 66.861 + - type: ndcg_at_1 + value: 60.333000000000006 + - type: ndcg_at_10 + value: 70.41 + - type: ndcg_at_100 + value: 73.10600000000001 + - type: ndcg_at_1000 + value: 73.846 + - type: ndcg_at_3 + value: 66.133 + - type: ndcg_at_5 + value: 68.499 + - type: precision_at_1 + value: 60.333000000000006 + - type: precision_at_10 + value: 9.232999999999999 + - type: precision_at_100 + value: 1.0630000000000002 + - type: precision_at_1000 + value: 0.11299999999999999 + - type: precision_at_3 + value: 25.667 + - type: precision_at_5 + value: 17.067 + - type: recall_at_1 + value: 57.260999999999996 + - type: recall_at_10 + value: 81.94399999999999 + - type: recall_at_100 + value: 93.867 + - type: recall_at_1000 + value: 99.667 + - type: recall_at_3 + value: 70.339 + - type: recall_at_5 + value: 76.25 + task: + type: Retrieval + - dataset: + config: default + name: MTEB SprintDuplicateQuestions + revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 + split: test + type: mteb/sprintduplicatequestions-pairclassification + metrics: + - type: cos_sim_accuracy + value: 99.74356435643564 + - type: cos_sim_ap + value: 93.13411948212683 + - type: cos_sim_f1 + value: 86.80521991300147 + - type: cos_sim_precision + value: 84.00374181478017 + - type: cos_sim_recall + value: 89.8 + - type: dot_accuracy + value: 99.67920792079208 + - type: dot_ap + value: 89.27277565444479 + - type: dot_f1 + value: 83.9276990718124 + - type: dot_precision + value: 82.04393505253104 + - type: dot_recall + value: 
85.9 + - type: euclidean_accuracy + value: 99.74257425742574 + - type: euclidean_ap + value: 93.17993008259062 + - type: euclidean_f1 + value: 86.69396110542476 + - type: euclidean_precision + value: 88.78406708595388 + - type: euclidean_recall + value: 84.7 + - type: manhattan_accuracy + value: 99.74257425742574 + - type: manhattan_ap + value: 93.14413755550099 + - type: manhattan_f1 + value: 86.82483594144371 + - type: manhattan_precision + value: 87.66564729867483 + - type: manhattan_recall + value: 86 + - type: max_accuracy + value: 99.74356435643564 + - type: max_ap + value: 93.17993008259062 + - type: max_f1 + value: 86.82483594144371 + task: + type: PairClassification + - dataset: + config: default + name: MTEB StackExchangeClustering + revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 + split: test + type: mteb/stackexchange-clustering + metrics: + - type: v_measure + value: 57.525863806168566 + task: + type: Clustering + - dataset: + config: default + name: MTEB StackExchangeClusteringP2P + revision: 815ca46b2622cec33ccafc3735d572c266efdb44 + split: test + type: mteb/stackexchange-clustering-p2p + metrics: + - type: v_measure + value: 32.68850574423839 + task: + type: Clustering + - dataset: + config: default + name: MTEB StackOverflowDupQuestions + revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 + split: test + type: mteb/stackoverflowdupquestions-reranking + metrics: + - type: map + value: 49.71580650644033 + - type: mrr + value: 50.50971903913081 + task: + type: Reranking + - dataset: + config: default + name: MTEB SummEval + revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c + split: test + type: mteb/summeval + metrics: + - type: cos_sim_pearson + value: 29.152190498799484 + - type: cos_sim_spearman + value: 29.686180371952727 + - type: dot_pearson + value: 27.248664793816342 + - type: dot_spearman + value: 28.37748983721745 + task: + type: Summarization + - dataset: + config: default + name: MTEB TRECCOVID + revision: None + split: test + type: 
trec-covid + metrics: + - type: map_at_1 + value: 0.20400000000000001 + - type: map_at_10 + value: 1.6209999999999998 + - type: map_at_100 + value: 9.690999999999999 + - type: map_at_1000 + value: 23.733 + - type: map_at_3 + value: 0.575 + - type: map_at_5 + value: 0.885 + - type: mrr_at_1 + value: 78 + - type: mrr_at_10 + value: 86.56700000000001 + - type: mrr_at_100 + value: 86.56700000000001 + - type: mrr_at_1000 + value: 86.56700000000001 + - type: mrr_at_3 + value: 85.667 + - type: mrr_at_5 + value: 86.56700000000001 + - type: ndcg_at_1 + value: 76 + - type: ndcg_at_10 + value: 71.326 + - type: ndcg_at_100 + value: 54.208999999999996 + - type: ndcg_at_1000 + value: 49.252 + - type: ndcg_at_3 + value: 74.235 + - type: ndcg_at_5 + value: 73.833 + - type: precision_at_1 + value: 78 + - type: precision_at_10 + value: 74.8 + - type: precision_at_100 + value: 55.50000000000001 + - type: precision_at_1000 + value: 21.836 + - type: precision_at_3 + value: 78 + - type: precision_at_5 + value: 78 + - type: recall_at_1 + value: 0.20400000000000001 + - type: recall_at_10 + value: 1.894 + - type: recall_at_100 + value: 13.245999999999999 + - type: recall_at_1000 + value: 46.373 + - type: recall_at_3 + value: 0.613 + - type: recall_at_5 + value: 0.991 + task: + type: Retrieval + - dataset: + config: sqi-eng + name: MTEB Tatoeba (sqi-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95.89999999999999 + - type: f1 + value: 94.69999999999999 + - type: precision + value: 94.11666666666667 + - type: recall + value: 95.89999999999999 + task: + type: BitextMining + - dataset: + config: fry-eng + name: MTEB Tatoeba (fry-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 68.20809248554913 + - type: f1 + value: 63.431048720066066 + - type: precision + value: 61.69143958161298 + - type: recall + 
value: 68.20809248554913 + task: + type: BitextMining + - dataset: + config: kur-eng + name: MTEB Tatoeba (kur-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 71.21951219512195 + - type: f1 + value: 66.82926829268293 + - type: precision + value: 65.1260162601626 + - type: recall + value: 71.21951219512195 + task: + type: BitextMining + - dataset: + config: tur-eng + name: MTEB Tatoeba (tur-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.2 + - type: f1 + value: 96.26666666666667 + - type: precision + value: 95.8 + - type: recall + value: 97.2 + task: + type: BitextMining + - dataset: + config: deu-eng + name: MTEB Tatoeba (deu-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 99.3 + - type: f1 + value: 99.06666666666666 + - type: precision + value: 98.95 + - type: recall + value: 99.3 + task: + type: BitextMining + - dataset: + config: nld-eng + name: MTEB Tatoeba (nld-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.39999999999999 + - type: f1 + value: 96.63333333333333 + - type: precision + value: 96.26666666666668 + - type: recall + value: 97.39999999999999 + task: + type: BitextMining + - dataset: + config: ron-eng + name: MTEB Tatoeba (ron-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96 + - type: f1 + value: 94.86666666666666 + - type: precision + value: 94.31666666666668 + - type: recall + value: 96 + task: + type: BitextMining + - dataset: + config: ang-eng + name: MTEB Tatoeba (ang-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: 
mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 47.01492537313433 + - type: f1 + value: 40.178867566927266 + - type: precision + value: 38.179295828549556 + - type: recall + value: 47.01492537313433 + task: + type: BitextMining + - dataset: + config: ido-eng + name: MTEB Tatoeba (ido-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 86.5 + - type: f1 + value: 83.62537480063796 + - type: precision + value: 82.44555555555554 + - type: recall + value: 86.5 + task: + type: BitextMining + - dataset: + config: jav-eng + name: MTEB Tatoeba (jav-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 80.48780487804879 + - type: f1 + value: 75.45644599303138 + - type: precision + value: 73.37398373983739 + - type: recall + value: 80.48780487804879 + task: + type: BitextMining + - dataset: + config: isl-eng + name: MTEB Tatoeba (isl-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.7 + - type: f1 + value: 91.95666666666666 + - type: precision + value: 91.125 + - type: recall + value: 93.7 + task: + type: BitextMining + - dataset: + config: slv-eng + name: MTEB Tatoeba (slv-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 91.73754556500607 + - type: f1 + value: 89.65168084244632 + - type: precision + value: 88.73025516403402 + - type: recall + value: 91.73754556500607 + task: + type: BitextMining + - dataset: + config: cym-eng + name: MTEB Tatoeba (cym-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 81.04347826086956 + - type: f1 + value: 76.2128364389234 + - type: precision + value: 
74.2 + - type: recall + value: 81.04347826086956 + task: + type: BitextMining + - dataset: + config: kaz-eng + name: MTEB Tatoeba (kaz-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 83.65217391304348 + - type: f1 + value: 79.4376811594203 + - type: precision + value: 77.65797101449274 + - type: recall + value: 83.65217391304348 + task: + type: BitextMining + - dataset: + config: est-eng + name: MTEB Tatoeba (est-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 87.5 + - type: f1 + value: 85.02690476190476 + - type: precision + value: 83.96261904761904 + - type: recall + value: 87.5 + task: + type: BitextMining + - dataset: + config: heb-eng + name: MTEB Tatoeba (heb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 89.3 + - type: f1 + value: 86.52333333333333 + - type: precision + value: 85.22833333333332 + - type: recall + value: 89.3 + task: + type: BitextMining + - dataset: + config: gla-eng + name: MTEB Tatoeba (gla-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 65.01809408926418 + - type: f1 + value: 59.00594446432805 + - type: precision + value: 56.827215807915444 + - type: recall + value: 65.01809408926418 + task: + type: BitextMining + - dataset: + config: mar-eng + name: MTEB Tatoeba (mar-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 91.2 + - type: f1 + value: 88.58 + - type: precision + value: 87.33333333333334 + - type: recall + value: 91.2 + task: + type: BitextMining + - dataset: + config: lat-eng + name: MTEB Tatoeba (lat-eng) + revision: 
9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 59.199999999999996 + - type: f1 + value: 53.299166276284915 + - type: precision + value: 51.3383908045977 + - type: recall + value: 59.199999999999996 + task: + type: BitextMining + - dataset: + config: bel-eng + name: MTEB Tatoeba (bel-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.2 + - type: f1 + value: 91.2 + - type: precision + value: 90.25 + - type: recall + value: 93.2 + task: + type: BitextMining + - dataset: + config: pms-eng + name: MTEB Tatoeba (pms-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 64.76190476190476 + - type: f1 + value: 59.867110667110666 + - type: precision + value: 58.07390192653351 + - type: recall + value: 64.76190476190476 + task: + type: BitextMining + - dataset: + config: gle-eng + name: MTEB Tatoeba (gle-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 76.2 + - type: f1 + value: 71.48147546897547 + - type: precision + value: 69.65409090909091 + - type: recall + value: 76.2 + task: + type: BitextMining + - dataset: + config: pes-eng + name: MTEB Tatoeba (pes-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.8 + - type: f1 + value: 92.14 + - type: precision + value: 91.35833333333333 + - type: recall + value: 93.8 + task: + type: BitextMining + - dataset: + config: nob-eng + name: MTEB Tatoeba (nob-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.89999999999999 + - type: f1 + value: 97.2 + - type: precision + value: 
96.85000000000001 + - type: recall + value: 97.89999999999999 + task: + type: BitextMining + - dataset: + config: bul-eng + name: MTEB Tatoeba (bul-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.6 + - type: f1 + value: 92.93333333333334 + - type: precision + value: 92.13333333333333 + - type: recall + value: 94.6 + task: + type: BitextMining + - dataset: + config: cbk-eng + name: MTEB Tatoeba (cbk-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 74.1 + - type: f1 + value: 69.14817460317461 + - type: precision + value: 67.2515873015873 + - type: recall + value: 74.1 + task: + type: BitextMining + - dataset: + config: hun-eng + name: MTEB Tatoeba (hun-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95.19999999999999 + - type: f1 + value: 94.01333333333335 + - type: precision + value: 93.46666666666667 + - type: recall + value: 95.19999999999999 + task: + type: BitextMining + - dataset: + config: uig-eng + name: MTEB Tatoeba (uig-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 76.9 + - type: f1 + value: 72.07523809523809 + - type: precision + value: 70.19777777777779 + - type: recall + value: 76.9 + task: + type: BitextMining + - dataset: + config: rus-eng + name: MTEB Tatoeba (rus-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.1 + - type: f1 + value: 92.31666666666666 + - type: precision + value: 91.43333333333332 + - type: recall + value: 94.1 + task: + type: BitextMining + - dataset: + config: spa-eng + name: MTEB Tatoeba (spa-eng) + revision: 
9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.8 + - type: f1 + value: 97.1 + - type: precision + value: 96.76666666666668 + - type: recall + value: 97.8 + task: + type: BitextMining + - dataset: + config: hye-eng + name: MTEB Tatoeba (hye-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 92.85714285714286 + - type: f1 + value: 90.92093441150045 + - type: precision + value: 90.00449236298293 + - type: recall + value: 92.85714285714286 + task: + type: BitextMining + - dataset: + config: tel-eng + name: MTEB Tatoeba (tel-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.16239316239316 + - type: f1 + value: 91.33903133903132 + - type: precision + value: 90.56267806267806 + - type: recall + value: 93.16239316239316 + task: + type: BitextMining + - dataset: + config: afr-eng + name: MTEB Tatoeba (afr-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 92.4 + - type: f1 + value: 90.25666666666666 + - type: precision + value: 89.25833333333334 + - type: recall + value: 92.4 + task: + type: BitextMining + - dataset: + config: mon-eng + name: MTEB Tatoeba (mon-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 90.22727272727272 + - type: f1 + value: 87.53030303030303 + - type: precision + value: 86.37121212121211 + - type: recall + value: 90.22727272727272 + task: + type: BitextMining + - dataset: + config: arz-eng + name: MTEB Tatoeba (arz-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 79.03563941299791 + - type: f1 
+ value: 74.7349505840072 + - type: precision + value: 72.9035639412998 + - type: recall + value: 79.03563941299791 + task: + type: BitextMining + - dataset: + config: hrv-eng + name: MTEB Tatoeba (hrv-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97 + - type: f1 + value: 96.15 + - type: precision + value: 95.76666666666668 + - type: recall + value: 97 + task: + type: BitextMining + - dataset: + config: nov-eng + name: MTEB Tatoeba (nov-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 76.26459143968872 + - type: f1 + value: 71.55642023346303 + - type: precision + value: 69.7544932369835 + - type: recall + value: 76.26459143968872 + task: + type: BitextMining + - dataset: + config: gsw-eng + name: MTEB Tatoeba (gsw-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 58.119658119658126 + - type: f1 + value: 51.65242165242165 + - type: precision + value: 49.41768108434775 + - type: recall + value: 58.119658119658126 + task: + type: BitextMining + - dataset: + config: nds-eng + name: MTEB Tatoeba (nds-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 74.3 + - type: f1 + value: 69.52055555555555 + - type: precision + value: 67.7574938949939 + - type: recall + value: 74.3 + task: + type: BitextMining + - dataset: + config: ukr-eng + name: MTEB Tatoeba (ukr-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.8 + - type: f1 + value: 93.31666666666666 + - type: precision + value: 92.60000000000001 + - type: recall + value: 94.8 + task: + type: BitextMining + - dataset: + config: uzb-eng + name: MTEB 
Tatoeba (uzb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 76.63551401869158 + - type: f1 + value: 72.35202492211837 + - type: precision + value: 70.60358255451713 + - type: recall + value: 76.63551401869158 + task: + type: BitextMining + - dataset: + config: lit-eng + name: MTEB Tatoeba (lit-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 90.4 + - type: f1 + value: 88.4811111111111 + - type: precision + value: 87.7452380952381 + - type: recall + value: 90.4 + task: + type: BitextMining + - dataset: + config: ina-eng + name: MTEB Tatoeba (ina-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95 + - type: f1 + value: 93.60666666666667 + - type: precision + value: 92.975 + - type: recall + value: 95 + task: + type: BitextMining + - dataset: + config: lfn-eng + name: MTEB Tatoeba (lfn-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 67.2 + - type: f1 + value: 63.01595782872099 + - type: precision + value: 61.596587301587306 + - type: recall + value: 67.2 + task: + type: BitextMining + - dataset: + config: zsm-eng + name: MTEB Tatoeba (zsm-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95.7 + - type: f1 + value: 94.52999999999999 + - type: precision + value: 94 + - type: recall + value: 95.7 + task: + type: BitextMining + - dataset: + config: ita-eng + name: MTEB Tatoeba (ita-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.6 + - type: f1 + value: 93.28999999999999 + - type: precision + 
value: 92.675 + - type: recall + value: 94.6 + task: + type: BitextMining + - dataset: + config: cmn-eng + name: MTEB Tatoeba (cmn-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.39999999999999 + - type: f1 + value: 95.28333333333333 + - type: precision + value: 94.75 + - type: recall + value: 96.39999999999999 + task: + type: BitextMining + - dataset: + config: lvs-eng + name: MTEB Tatoeba (lvs-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 91.9 + - type: f1 + value: 89.83 + - type: precision + value: 88.92 + - type: recall + value: 91.9 + task: + type: BitextMining + - dataset: + config: glg-eng + name: MTEB Tatoeba (glg-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.69999999999999 + - type: f1 + value: 93.34222222222223 + - type: precision + value: 92.75416666666668 + - type: recall + value: 94.69999999999999 + task: + type: BitextMining + - dataset: + config: ceb-eng + name: MTEB Tatoeba (ceb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 60.333333333333336 + - type: f1 + value: 55.31203703703703 + - type: precision + value: 53.39971108326371 + - type: recall + value: 60.333333333333336 + task: + type: BitextMining + - dataset: + config: bre-eng + name: MTEB Tatoeba (bre-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 12.9 + - type: f1 + value: 11.099861903031458 + - type: precision + value: 10.589187932631877 + - type: recall + value: 12.9 + task: + type: BitextMining + - dataset: + config: ben-eng + name: MTEB Tatoeba (ben-eng) + revision: 
9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 86.7 + - type: f1 + value: 83.0152380952381 + - type: precision + value: 81.37833333333333 + - type: recall + value: 86.7 + task: + type: BitextMining + - dataset: + config: swg-eng + name: MTEB Tatoeba (swg-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 63.39285714285714 + - type: f1 + value: 56.832482993197274 + - type: precision + value: 54.56845238095237 + - type: recall + value: 63.39285714285714 + task: + type: BitextMining + - dataset: + config: arq-eng + name: MTEB Tatoeba (arq-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 48.73765093304062 + - type: f1 + value: 41.555736920720456 + - type: precision + value: 39.06874531737319 + - type: recall + value: 48.73765093304062 + task: + type: BitextMining + - dataset: + config: kab-eng + name: MTEB Tatoeba (kab-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 41.099999999999994 + - type: f1 + value: 36.540165945165946 + - type: precision + value: 35.05175685425686 + - type: recall + value: 41.099999999999994 + task: + type: BitextMining + - dataset: + config: fra-eng + name: MTEB Tatoeba (fra-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.89999999999999 + - type: f1 + value: 93.42333333333333 + - type: precision + value: 92.75833333333333 + - type: recall + value: 94.89999999999999 + task: + type: BitextMining + - dataset: + config: por-eng + name: MTEB Tatoeba (por-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: 
accuracy + value: 94.89999999999999 + - type: f1 + value: 93.63333333333334 + - type: precision + value: 93.01666666666665 + - type: recall + value: 94.89999999999999 + task: + type: BitextMining + - dataset: + config: tat-eng + name: MTEB Tatoeba (tat-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 77.9 + - type: f1 + value: 73.64833333333334 + - type: precision + value: 71.90282106782105 + - type: recall + value: 77.9 + task: + type: BitextMining + - dataset: + config: oci-eng + name: MTEB Tatoeba (oci-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 59.4 + - type: f1 + value: 54.90521367521367 + - type: precision + value: 53.432840025471606 + - type: recall + value: 59.4 + task: + type: BitextMining + - dataset: + config: pol-eng + name: MTEB Tatoeba (pol-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.39999999999999 + - type: f1 + value: 96.6 + - type: precision + value: 96.2 + - type: recall + value: 97.39999999999999 + task: + type: BitextMining + - dataset: + config: war-eng + name: MTEB Tatoeba (war-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 67.2 + - type: f1 + value: 62.25926129426129 + - type: precision + value: 60.408376623376626 + - type: recall + value: 67.2 + task: + type: BitextMining + - dataset: + config: aze-eng + name: MTEB Tatoeba (aze-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 90.2 + - type: f1 + value: 87.60666666666667 + - type: precision + value: 86.45277777777778 + - type: recall + value: 90.2 + task: + type: BitextMining + - dataset: + config: 
vie-eng + name: MTEB Tatoeba (vie-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 97.7 + - type: f1 + value: 97 + - type: precision + value: 96.65 + - type: recall + value: 97.7 + task: + type: BitextMining + - dataset: + config: nno-eng + name: MTEB Tatoeba (nno-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.2 + - type: f1 + value: 91.39746031746031 + - type: precision + value: 90.6125 + - type: recall + value: 93.2 + task: + type: BitextMining + - dataset: + config: cha-eng + name: MTEB Tatoeba (cha-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 32.11678832116788 + - type: f1 + value: 27.210415386260234 + - type: precision + value: 26.20408990846947 + - type: recall + value: 32.11678832116788 + task: + type: BitextMining + - dataset: + config: mhr-eng + name: MTEB Tatoeba (mhr-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 8.5 + - type: f1 + value: 6.787319277832475 + - type: precision + value: 6.3452094433344435 + - type: recall + value: 8.5 + task: + type: BitextMining + - dataset: + config: dan-eng + name: MTEB Tatoeba (dan-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.1 + - type: f1 + value: 95.08 + - type: precision + value: 94.61666666666667 + - type: recall + value: 96.1 + task: + type: BitextMining + - dataset: + config: ell-eng + name: MTEB Tatoeba (ell-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95.3 + - type: f1 + value: 93.88333333333333 + - type: precision + 
value: 93.18333333333332 + - type: recall + value: 95.3 + task: + type: BitextMining + - dataset: + config: amh-eng + name: MTEB Tatoeba (amh-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 85.11904761904762 + - type: f1 + value: 80.69444444444444 + - type: precision + value: 78.72023809523809 + - type: recall + value: 85.11904761904762 + task: + type: BitextMining + - dataset: + config: pam-eng + name: MTEB Tatoeba (pam-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 11.1 + - type: f1 + value: 9.276381801735853 + - type: precision + value: 8.798174603174601 + - type: recall + value: 11.1 + task: + type: BitextMining + - dataset: + config: hsb-eng + name: MTEB Tatoeba (hsb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 63.56107660455487 + - type: f1 + value: 58.70433569191332 + - type: precision + value: 56.896926581464015 + - type: recall + value: 63.56107660455487 + task: + type: BitextMining + - dataset: + config: srp-eng + name: MTEB Tatoeba (srp-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.69999999999999 + - type: f1 + value: 93.10000000000001 + - type: precision + value: 92.35 + - type: recall + value: 94.69999999999999 + task: + type: BitextMining + - dataset: + config: epo-eng + name: MTEB Tatoeba (epo-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.8 + - type: f1 + value: 96.01222222222222 + - type: precision + value: 95.67083333333332 + - type: recall + value: 96.8 + task: + type: BitextMining + - dataset: + config: kzj-eng + name: MTEB Tatoeba (kzj-eng) + 
revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 9.2 + - type: f1 + value: 7.911555250305249 + - type: precision + value: 7.631246556216846 + - type: recall + value: 9.2 + task: + type: BitextMining + - dataset: + config: awa-eng + name: MTEB Tatoeba (awa-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 77.48917748917748 + - type: f1 + value: 72.27375798804371 + - type: precision + value: 70.14430014430013 + - type: recall + value: 77.48917748917748 + task: + type: BitextMining + - dataset: + config: fao-eng + name: MTEB Tatoeba (fao-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 77.09923664122137 + - type: f1 + value: 72.61541257724463 + - type: precision + value: 70.8998380754106 + - type: recall + value: 77.09923664122137 + task: + type: BitextMining + - dataset: + config: mal-eng + name: MTEB Tatoeba (mal-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 98.2532751091703 + - type: f1 + value: 97.69529354682193 + - type: precision + value: 97.42843279961184 + - type: recall + value: 98.2532751091703 + task: + type: BitextMining + - dataset: + config: ile-eng + name: MTEB Tatoeba (ile-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 82.8 + - type: f1 + value: 79.14672619047619 + - type: precision + value: 77.59489247311828 + - type: recall + value: 82.8 + task: + type: BitextMining + - dataset: + config: bos-eng + name: MTEB Tatoeba (bos-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 
94.35028248587571 + - type: f1 + value: 92.86252354048965 + - type: precision + value: 92.2080979284369 + - type: recall + value: 94.35028248587571 + task: + type: BitextMining + - dataset: + config: cor-eng + name: MTEB Tatoeba (cor-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 8.5 + - type: f1 + value: 6.282429263935621 + - type: precision + value: 5.783274240739785 + - type: recall + value: 8.5 + task: + type: BitextMining + - dataset: + config: cat-eng + name: MTEB Tatoeba (cat-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 92.7 + - type: f1 + value: 91.025 + - type: precision + value: 90.30428571428571 + - type: recall + value: 92.7 + task: + type: BitextMining + - dataset: + config: eus-eng + name: MTEB Tatoeba (eus-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 81 + - type: f1 + value: 77.8232380952381 + - type: precision + value: 76.60194444444444 + - type: recall + value: 81 + task: + type: BitextMining + - dataset: + config: yue-eng + name: MTEB Tatoeba (yue-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 91 + - type: f1 + value: 88.70857142857142 + - type: precision + value: 87.7 + - type: recall + value: 91 + task: + type: BitextMining + - dataset: + config: swe-eng + name: MTEB Tatoeba (swe-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.39999999999999 + - type: f1 + value: 95.3 + - type: precision + value: 94.76666666666667 + - type: recall + value: 96.39999999999999 + task: + type: BitextMining + - dataset: + config: dtp-eng + name: MTEB Tatoeba (dtp-eng) + 
revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 8.1 + - type: f1 + value: 7.001008218834307 + - type: precision + value: 6.708329562594269 + - type: recall + value: 8.1 + task: + type: BitextMining + - dataset: + config: kat-eng + name: MTEB Tatoeba (kat-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 87.1313672922252 + - type: f1 + value: 84.09070598748882 + - type: precision + value: 82.79171454104429 + - type: recall + value: 87.1313672922252 + task: + type: BitextMining + - dataset: + config: jpn-eng + name: MTEB Tatoeba (jpn-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.39999999999999 + - type: f1 + value: 95.28333333333333 + - type: precision + value: 94.73333333333332 + - type: recall + value: 96.39999999999999 + task: + type: BitextMining + - dataset: + config: csb-eng + name: MTEB Tatoeba (csb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 42.29249011857708 + - type: f1 + value: 36.981018542283365 + - type: precision + value: 35.415877813576024 + - type: recall + value: 42.29249011857708 + task: + type: BitextMining + - dataset: + config: xho-eng + name: MTEB Tatoeba (xho-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 83.80281690140845 + - type: f1 + value: 80.86854460093896 + - type: precision + value: 79.60093896713614 + - type: recall + value: 83.80281690140845 + task: + type: BitextMining + - dataset: + config: orv-eng + name: MTEB Tatoeba (orv-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: 
accuracy + value: 45.26946107784431 + - type: f1 + value: 39.80235464678088 + - type: precision + value: 38.14342660001342 + - type: recall + value: 45.26946107784431 + task: + type: BitextMining + - dataset: + config: ind-eng + name: MTEB Tatoeba (ind-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.3 + - type: f1 + value: 92.9 + - type: precision + value: 92.26666666666668 + - type: recall + value: 94.3 + task: + type: BitextMining + - dataset: + config: tuk-eng + name: MTEB Tatoeba (tuk-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 37.93103448275862 + - type: f1 + value: 33.15192743764172 + - type: precision + value: 31.57456528146183 + - type: recall + value: 37.93103448275862 + task: + type: BitextMining + - dataset: + config: max-eng + name: MTEB Tatoeba (max-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 69.01408450704226 + - type: f1 + value: 63.41549295774648 + - type: precision + value: 61.342778895595806 + - type: recall + value: 69.01408450704226 + task: + type: BitextMining + - dataset: + config: swh-eng + name: MTEB Tatoeba (swh-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 76.66666666666667 + - type: f1 + value: 71.60705960705961 + - type: precision + value: 69.60683760683762 + - type: recall + value: 76.66666666666667 + task: + type: BitextMining + - dataset: + config: hin-eng + name: MTEB Tatoeba (hin-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 95.8 + - type: f1 + value: 94.48333333333333 + - type: precision + value: 93.83333333333333 + - type: recall + 
value: 95.8 + task: + type: BitextMining + - dataset: + config: dsb-eng + name: MTEB Tatoeba (dsb-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 52.81837160751566 + - type: f1 + value: 48.435977731384824 + - type: precision + value: 47.11291973845539 + - type: recall + value: 52.81837160751566 + task: + type: BitextMining + - dataset: + config: ber-eng + name: MTEB Tatoeba (ber-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 44.9 + - type: f1 + value: 38.88962621607783 + - type: precision + value: 36.95936507936508 + - type: recall + value: 44.9 + task: + type: BitextMining + - dataset: + config: tam-eng + name: MTEB Tatoeba (tam-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 90.55374592833876 + - type: f1 + value: 88.22553125484721 + - type: precision + value: 87.26927252985884 + - type: recall + value: 90.55374592833876 + task: + type: BitextMining + - dataset: + config: slk-eng + name: MTEB Tatoeba (slk-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 94.6 + - type: f1 + value: 93.13333333333333 + - type: precision + value: 92.45333333333333 + - type: recall + value: 94.6 + task: + type: BitextMining + - dataset: + config: tgl-eng + name: MTEB Tatoeba (tgl-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 93.7 + - type: f1 + value: 91.99666666666667 + - type: precision + value: 91.26666666666668 + - type: recall + value: 93.7 + task: + type: BitextMining + - dataset: + config: ast-eng + name: MTEB Tatoeba (ast-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + 
type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 85.03937007874016 + - type: f1 + value: 81.75853018372703 + - type: precision + value: 80.34120734908137 + - type: recall + value: 85.03937007874016 + task: + type: BitextMining + - dataset: + config: mkd-eng + name: MTEB Tatoeba (mkd-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 88.3 + - type: f1 + value: 85.5 + - type: precision + value: 84.25833333333334 + - type: recall + value: 88.3 + task: + type: BitextMining + - dataset: + config: khm-eng + name: MTEB Tatoeba (khm-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 65.51246537396122 + - type: f1 + value: 60.02297410192148 + - type: precision + value: 58.133467727289236 + - type: recall + value: 65.51246537396122 + task: + type: BitextMining + - dataset: + config: ces-eng + name: MTEB Tatoeba (ces-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96 + - type: f1 + value: 94.89 + - type: precision + value: 94.39166666666667 + - type: recall + value: 96 + task: + type: BitextMining + - dataset: + config: tzl-eng + name: MTEB Tatoeba (tzl-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 57.692307692307686 + - type: f1 + value: 53.162393162393165 + - type: precision + value: 51.70673076923077 + - type: recall + value: 57.692307692307686 + task: + type: BitextMining + - dataset: + config: urd-eng + name: MTEB Tatoeba (urd-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 91.60000000000001 + - type: f1 + value: 89.21190476190475 + - type: precision + value: 
88.08666666666667 + - type: recall + value: 91.60000000000001 + task: + type: BitextMining + - dataset: + config: ara-eng + name: MTEB Tatoeba (ara-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 88 + - type: f1 + value: 85.47 + - type: precision + value: 84.43266233766234 + - type: recall + value: 88 + task: + type: BitextMining + - dataset: + config: kor-eng + name: MTEB Tatoeba (kor-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 92.7 + - type: f1 + value: 90.64999999999999 + - type: precision + value: 89.68333333333332 + - type: recall + value: 92.7 + task: + type: BitextMining + - dataset: + config: yid-eng + name: MTEB Tatoeba (yid-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 80.30660377358491 + - type: f1 + value: 76.33044137466307 + - type: precision + value: 74.78970125786164 + - type: recall + value: 80.30660377358491 + task: + type: BitextMining + - dataset: + config: fin-eng + name: MTEB Tatoeba (fin-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.39999999999999 + - type: f1 + value: 95.44 + - type: precision + value: 94.99166666666666 + - type: recall + value: 96.39999999999999 + task: + type: BitextMining + - dataset: + config: tha-eng + name: MTEB Tatoeba (tha-eng) + revision: 9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 96.53284671532847 + - type: f1 + value: 95.37712895377129 + - type: precision + value: 94.7992700729927 + - type: recall + value: 96.53284671532847 + task: + type: BitextMining + - dataset: + config: wuu-eng + name: MTEB Tatoeba (wuu-eng) + revision: 
9080400076fbadbb4c4dcb136ff4eddc40b42553 + split: test + type: mteb/tatoeba-bitext-mining + metrics: + - type: accuracy + value: 89 + - type: f1 + value: 86.23190476190476 + - type: precision + value: 85.035 + - type: recall + value: 89 + task: + type: BitextMining + - dataset: + config: default + name: MTEB Touche2020 + revision: None + split: test + type: webis-touche2020 + metrics: + - type: map_at_1 + value: 2.585 + - type: map_at_10 + value: 9.012 + - type: map_at_100 + value: 14.027000000000001 + - type: map_at_1000 + value: 15.565000000000001 + - type: map_at_3 + value: 5.032 + - type: map_at_5 + value: 6.657 + - type: mrr_at_1 + value: 28.571 + - type: mrr_at_10 + value: 45.377 + - type: mrr_at_100 + value: 46.119 + - type: mrr_at_1000 + value: 46.127 + - type: mrr_at_3 + value: 41.156 + - type: mrr_at_5 + value: 42.585 + - type: ndcg_at_1 + value: 27.551 + - type: ndcg_at_10 + value: 23.395 + - type: ndcg_at_100 + value: 33.342 + - type: ndcg_at_1000 + value: 45.523 + - type: ndcg_at_3 + value: 25.158 + - type: ndcg_at_5 + value: 23.427 + - type: precision_at_1 + value: 28.571 + - type: precision_at_10 + value: 21.429000000000002 + - type: precision_at_100 + value: 6.714 + - type: precision_at_1000 + value: 1.473 + - type: precision_at_3 + value: 27.211000000000002 + - type: precision_at_5 + value: 24.490000000000002 + - type: recall_at_1 + value: 2.585 + - type: recall_at_10 + value: 15.418999999999999 + - type: recall_at_100 + value: 42.485 + - type: recall_at_1000 + value: 79.536 + - type: recall_at_3 + value: 6.239999999999999 + - type: recall_at_5 + value: 8.996 + task: + type: Retrieval + - dataset: + config: default + name: MTEB ToxicConversationsClassification + revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c + split: test + type: mteb/toxic_conversations_50k + metrics: + - type: accuracy + value: 71.3234 + - type: ap + value: 14.361688653847423 + - type: f1 + value: 54.819068624319044 + task: + type: Classification + - dataset: + config: 
default + name: MTEB TweetSentimentExtractionClassification + revision: d604517c81ca91fe16a244d1248fc021f9ecee7a + split: test + type: mteb/tweet_sentiment_extraction + metrics: + - type: accuracy + value: 61.97792869269949 + - type: f1 + value: 62.28965628513728 + task: + type: Classification + - dataset: + config: default + name: MTEB TwentyNewsgroupsClustering + revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 + split: test + type: mteb/twentynewsgroups-clustering + metrics: + - type: v_measure + value: 38.90540145385218 + task: + type: Clustering + - dataset: + config: default + name: MTEB TwitterSemEval2015 + revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 + split: test + type: mteb/twittersemeval2015-pairclassification + metrics: + - type: cos_sim_accuracy + value: 86.53513739047506 + - type: cos_sim_ap + value: 75.27741586677557 + - type: cos_sim_f1 + value: 69.18792902473774 + - type: cos_sim_precision + value: 67.94708725515136 + - type: cos_sim_recall + value: 70.47493403693932 + - type: dot_accuracy + value: 84.7052512368123 + - type: dot_ap + value: 69.36075482849378 + - type: dot_f1 + value: 64.44688376631296 + - type: dot_precision + value: 59.92288500793831 + - type: dot_recall + value: 69.70976253298153 + - type: euclidean_accuracy + value: 86.60666388508076 + - type: euclidean_ap + value: 75.47512772621097 + - type: euclidean_f1 + value: 69.413872536473 + - type: euclidean_precision + value: 67.39562624254472 + - type: euclidean_recall + value: 71.55672823218997 + - type: manhattan_accuracy + value: 86.52917684925792 + - type: manhattan_ap + value: 75.34000110496703 + - type: manhattan_f1 + value: 69.28489190226429 + - type: manhattan_precision + value: 67.24608889992551 + - type: manhattan_recall + value: 71.45118733509234 + - type: max_accuracy + value: 86.60666388508076 + - type: max_ap + value: 75.47512772621097 + - type: max_f1 + value: 69.413872536473 + task: + type: PairClassification + - dataset: + config: default + name: MTEB 
TwitterURLCorpus + revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf + split: test + type: mteb/twitterurlcorpus-pairclassification + metrics: + - type: cos_sim_accuracy + value: 89.01695967710637 + - type: cos_sim_ap + value: 85.8298270742901 + - type: cos_sim_f1 + value: 78.46988128389272 + - type: cos_sim_precision + value: 74.86017897091722 + - type: cos_sim_recall + value: 82.44533415460425 + - type: dot_accuracy + value: 88.19420188613343 + - type: dot_ap + value: 83.82679165901324 + - type: dot_f1 + value: 76.55833777304208 + - type: dot_precision + value: 75.6884875846501 + - type: dot_recall + value: 77.44841392054204 + - type: euclidean_accuracy + value: 89.03054294252338 + - type: euclidean_ap + value: 85.89089555185325 + - type: euclidean_f1 + value: 78.62997658079624 + - type: euclidean_precision + value: 74.92329149232914 + - type: euclidean_recall + value: 82.72251308900523 + - type: manhattan_accuracy + value: 89.0266620095471 + - type: manhattan_ap + value: 85.86458997929147 + - type: manhattan_f1 + value: 78.50685331000291 + - type: manhattan_precision + value: 74.5499861534201 + - type: manhattan_recall + value: 82.90729904527257 + - type: max_accuracy + value: 89.03054294252338 + - type: max_ap + value: 85.89089555185325 + - type: max_f1 + value: 78.62997658079624 + task: + type: PairClassification +tags: +- mteb +- Sentence Transformers +- sentence-similarity +- feature-extraction +- sentence-transformers +- onnx +- teradata + +--- +# A Teradata Vantage compatible Embeddings Model + +# intfloat/multilingual-e5-large + +## Overview of this Model + +An embedding model that maps text (sentences or paragraphs) to a vector. The [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) model is well known for its effectiveness in capturing the semantic meaning of text. It is a state-of-the-art model trained on a large corpus and is capable of generating high-quality text embeddings.
+ +- 559.89M params (Sizes in ONNX format - "int8": 535.01MB, "uint8": 535.01MB) +- 514 maximum input tokens +- 1024 dimensions of output vector +- License: MIT. The released models can be used for commercial purposes free of charge. +- Reference to Original Model: https://huggingface.co/intfloat/multilingual-e5-large + + +## Quickstart: Deploying this Model in Teradata Vantage + +We have pre-converted the model into the ONNX format compatible with BYOM 6.0, eliminating the need for manual conversion. + +**Note:** Ensure you have access to a Teradata Database with BYOM 6.0 installed. + +To get started, download the pre-converted model directly from the Teradata Hugging Face repository. + + +```python + +import teradataml as tdml +import getpass +from huggingface_hub import hf_hub_download + +model_name = "multilingual-e5-large" +number_dimensions_output = 1024 +model_file_name = "model_int8.onnx" + +# Step 1: Download Model from Teradata HuggingFace Page +# With local_dir="./", files keep their repository path, so the model lands in ./onnx/ + +hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"onnx/{model_file_name}", local_dir="./") +hf_hub_download(repo_id=f"Teradata/{model_name}", filename="tokenizer.json", local_dir="./") + +# Step 2: Create Connection to Vantage + +tdml.create_context(host = input('enter your hostname'), + username=input('enter your username'), + password = getpass.getpass("enter your password")) + +# Step 3: Load Models into Vantage +# a) Embedding model +tdml.save_byom(model_id = model_name, # must be unique in the models table + model_file = f"onnx/{model_file_name}", # path of the file downloaded in Step 1 + table_name = 'embeddings_models' ) +# b) Tokenizer +tdml.save_byom(model_id = model_name, # must be unique in the models table + model_file = 'tokenizer.json', + table_name = 'embeddings_tokenizers') + +# Step 4: Test ONNXEmbeddings Function +# Note that ONNXEmbeddings expects the text ('payload') column to be named 'txt'. +# If your column has a different name, rename it in a subquery/CTE.
+input_table = "emails.emails"
+embeddings_query = f"""
+SELECT
+    *
+from mldb.ONNXEmbeddings(
+    on {input_table} as InputTable
+    on (select * from embeddings_models where model_id = '{model_name}') as ModelTable DIMENSION
+    on (select model as tokenizer from embeddings_tokenizers where model_id = '{model_name}') as TokenizerTable DIMENSION
+    using
+        Accumulate('id', 'txt')
+        ModelOutputTensor('sentence_embedding')
+        EnableMemoryCheck('false')
+        OutputFormat('FLOAT32({number_dimensions_output})')
+        OverwriteCachedModel('true')
+    ) a
+"""
+DF_embeddings = tdml.DataFrame.from_query(embeddings_query)
+DF_embeddings
+```
+
+
+
+## What Can I Do with the Embeddings?
+
+Teradata Vantage includes pre-built in-database functions to process embeddings further. Explore the following examples:
+
+- **Semantic Clustering with TD_KMeans:** [Semantic Clustering Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Clustering_Python.ipynb)
+- **Semantic Distance with TD_VectorDistance:** [Semantic Similarity Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Similarity_Python.ipynb)
+- **RAG-Based Application with TD_VectorDistance:** [RAG and Bedrock Query PDF Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/RAG_and_Bedrock_QueryPDF.ipynb)
+
+
+## Deep Dive into Model Conversion to ONNX
+
+**The steps below outline how we converted the open-source Hugging Face model into an ONNX file compatible with the in-database ONNXEmbeddings function.**
+
+You do not need to perform these steps; they are provided solely for documentation and transparency. However, they may be helpful if you wish to convert another model to the required format.
+
+
+### Part 1. Importing and Converting the Model Using Optimum
+
+We start by importing the pre-trained [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) model from Hugging Face.
+
+To enhance performance and ensure compatibility with various execution environments, we use the [Optimum](https://github.com/huggingface/optimum) utility to convert the model into the ONNX (Open Neural Network Exchange) format.
+
+After conversion, we fix the opset in the ONNX file for compatibility with the ONNX Runtime version used in Teradata Vantage.
+
+We generate ONNX files for two precisions: int8 and uint8.
+
+You can find the detailed conversion steps in the file [convert.py](./convert.py).
+
+### Part 2. Running the Model in Python with onnxruntime and Comparing Results
+
+Once the fixes are applied, we test the correctness of the ONNX model by calculating the cosine similarity between two texts twice, once with native SentenceTransformers and once with ONNX Runtime, and comparing the two results.
+
+If the two similarity values agree closely (small differences are expected because the ONNX model is int8), the ONNX model reproduces the behavior of the native model and is suitable for further use in the database.
+
+
+```python
+import onnxruntime as rt
+
+from sentence_transformers.util import cos_sim
+from sentence_transformers import SentenceTransformer
+
+import transformers
+
+
+sentences_1 = 'How is the weather today?'
+sentences_2 = 'What is the current weather like today?'
+
+# Calculate ONNX result
+model_id = "intfloat/multilingual-e5-large"
+tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
+predef_sess = rt.InferenceSession("onnx/model_int8.onnx")
+
+enc1 = tokenizer(sentences_1)
+embeddings_1_onnx = predef_sess.run(None, {"input_ids": [enc1.input_ids],
+                                            "attention_mask": [enc1.attention_mask]})
+
+enc2 = tokenizer(sentences_2)
+embeddings_2_onnx = predef_sess.run(None, {"input_ids": [enc2.input_ids],
+                                            "attention_mask": [enc2.attention_mask]})
+
+
+# Calculate embeddings with SentenceTransformer
+model = SentenceTransformer(model_id, trust_remote_code=True)
+embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True)
+embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True)
+
+# Compare results
+print("Cosine similarity for embeddings calculated with ONNX: " + str(cos_sim(embeddings_1_onnx[1][0], embeddings_2_onnx[1][0])))
+print("Cosine similarity for embeddings calculated with SentenceTransformer: " + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
+```
+
+You can find the detailed ONNX vs. SentenceTransformer result comparison steps in the file [test_local.py](./test_local.py).
+
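Because the ONNX model is stored in int8, its cosine similarity will rarely match the SentenceTransformer value bit for bit, so a tolerance-based check is usually more practical than an equality check. The following is a minimal sketch in plain Python of such a check. The helper names `cosine_similarity` and `similarities_agree`, the toy vectors, and the `0.02` tolerance are illustrative assumptions; they are not taken from `test_local.py` or from any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarities_agree(sim_onnx, sim_reference, tolerance=0.02):
    """Return True when two similarity scores differ by at most `tolerance`.

    The default tolerance is an assumed value for illustration only;
    pick one that suits your model and precision.
    """
    return abs(sim_onnx - sim_reference) <= tolerance


# Toy 3-dimensional vectors standing in for real 1024-dimensional embeddings.
reference_pair = ([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])
quantized_pair = ([0.99, 0.01, 1.0], [1.0, 0.98, 0.01])

sim_reference = cosine_similarity(*reference_pair)
sim_quantized = cosine_similarity(*quantized_pair)
print(similarities_agree(sim_quantized, sim_reference))
```

In practice you would pass the two similarity values printed by the comparison script (converted to plain floats) to `similarities_agree` instead of the toy values used here.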