Add new SentenceTransformer model.
- README.md +58 -31
- config.json +1 -1
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -6,7 +6,7 @@ tags:
|
|
| 6 |
- generated_from_trainer
|
| 7 |
- dataset_size:208
|
| 8 |
- loss:BatchSemiHardTripletLoss
|
| 9 |
-
base_model: BAAI/bge-base-en
|
| 10 |
widget:
|
| 11 |
- source_sentence: '
|
| 12 |
|
|
@@ -362,7 +362,7 @@ metrics:
|
|
| 362 |
- euclidean_accuracy
|
| 363 |
- max_accuracy
|
| 364 |
model-index:
|
| 365 |
-
- name: SentenceTransformer based on BAAI/bge-base-en
|
| 366 |
results:
|
| 367 |
- task:
|
| 368 |
type: triplet
|
|
@@ -372,19 +372,19 @@ model-index:
|
|
| 372 |
type: bge-base-en-v1.5-train
|
| 373 |
metrics:
|
| 374 |
- type: cosine_accuracy
|
| 375 |
-
value: 0.
|
| 376 |
name: Cosine Accuracy
|
| 377 |
- type: dot_accuracy
|
| 378 |
-
value: 0.
|
| 379 |
name: Dot Accuracy
|
| 380 |
- type: manhattan_accuracy
|
| 381 |
-
value: 0.
|
| 382 |
name: Manhattan Accuracy
|
| 383 |
- type: euclidean_accuracy
|
| 384 |
-
value: 0.
|
| 385 |
name: Euclidean Accuracy
|
| 386 |
- type: max_accuracy
|
| 387 |
-
value: 0.
|
| 388 |
name: Max Accuracy
|
| 389 |
- task:
|
| 390 |
type: triplet
|
|
@@ -394,31 +394,46 @@ model-index:
|
|
| 394 |
type: bge-base-en-v1.5-eval
|
| 395 |
metrics:
|
| 396 |
- type: cosine_accuracy
|
| 397 |
-
value: 0
|
| 398 |
name: Cosine Accuracy
|
| 399 |
- type: dot_accuracy
|
| 400 |
-
value: 0.
|
| 401 |
name: Dot Accuracy
|
| 402 |
- type: manhattan_accuracy
|
| 403 |
-
value: 0
|
| 404 |
name: Manhattan Accuracy
|
| 405 |
- type: euclidean_accuracy
|
| 406 |
-
value: 0
|
| 407 |
name: Euclidean Accuracy
|
| 408 |
- type: max_accuracy
|
| 409 |
-
value: 0
|
| 410 |
name: Max Accuracy
|
| 411 |
---
|
| 412 |
|
| 413 |
-
# SentenceTransformer based on BAAI/bge-base-en
|
| 414 |
|
| 415 |
-
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
| 416 |
|
| 417 |
## Model Details
|
| 418 |
|
| 419 |
### Model Description
|
| 420 |
- **Model Type:** Sentence Transformer
|
| 421 |
-
- **Base model:** [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) <!-- at revision
|
| 422 |
- **Maximum Sequence Length:** 512 tokens
|
| 423 |
- **Output Dimensionality:** 768 tokens
|
| 424 |
- **Similarity Function:** Cosine Similarity
|
|
@@ -506,25 +521,37 @@ You can finetune this model on your own dataset.
|
|
| 506 |
* Dataset: `bge-base-en-v1.5-train`
|
| 507 |
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
|
| 508 |
|
| 509 |
-
| Metric | Value
|
| 510 |
-
|
| 511 |
-
| cosine_accuracy | 0.
|
| 512 |
-
| dot_accuracy | 0.
|
| 513 |
-
| manhattan_accuracy | 0.
|
| 514 |
-
| euclidean_accuracy | 0.
|
| 515 |
-
| **max_accuracy** | **0.
|
| 516 |
|
| 517 |
#### Triplet
|
| 518 |
* Dataset: `bge-base-en-v1.5-eval`
|
| 519 |
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
|
| 520 |
|
| 521 |
-
| Metric | Value
|
| 522 |
-
|
| 523 |
-
| cosine_accuracy | 0
|
| 524 |
-
| dot_accuracy | 0.
|
| 525 |
-
| manhattan_accuracy | 0
|
| 526 |
-
| euclidean_accuracy | 0
|
| 527 |
-
| **max_accuracy** | **0
|
| 528 |
|
| 529 |
<!--
|
| 530 |
## Bias, Risks and Limitations
|
|
@@ -713,8 +740,8 @@ You can finetune this model on your own dataset.
|
|
| 713 |
### Training Logs
|
| 714 |
| Epoch | Step | bge-base-en-v1.5-eval_max_accuracy | bge-base-en-v1.5-train_max_accuracy |
|
| 715 |
|:-----:|:----:|:----------------------------------:|:-----------------------------------:|
|
| 716 |
-
| 0 | 0 | - | 0.
|
| 717 |
-
| 5.0 | 65 | 0
|
| 718 |
|
| 719 |
|
| 720 |
### Framework Versions
|
|
|
|
| 6 |
- generated_from_trainer
|
| 7 |
- dataset_size:208
|
| 8 |
- loss:BatchSemiHardTripletLoss
|
| 9 |
+
base_model: BAAI/bge-base-en-v1.5
|
| 10 |
widget:
|
| 11 |
- source_sentence: '
|
| 12 |
|
|
|
|
| 362 |
- euclidean_accuracy
|
| 363 |
- max_accuracy
|
| 364 |
model-index:
|
| 365 |
+
- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
|
| 366 |
results:
|
| 367 |
- task:
|
| 368 |
type: triplet
|
|
|
|
| 372 |
type: bge-base-en-v1.5-train
|
| 373 |
metrics:
|
| 374 |
- type: cosine_accuracy
|
| 375 |
+
value: 0.8461538461538461
|
| 376 |
name: Cosine Accuracy
|
| 377 |
- type: dot_accuracy
|
| 378 |
+
value: 0.15384615384615385
|
| 379 |
name: Dot Accuracy
|
| 380 |
- type: manhattan_accuracy
|
| 381 |
+
value: 0.8509615384615384
|
| 382 |
name: Manhattan Accuracy
|
| 383 |
- type: euclidean_accuracy
|
| 384 |
+
value: 0.8461538461538461
|
| 385 |
name: Euclidean Accuracy
|
| 386 |
- type: max_accuracy
|
| 387 |
+
value: 0.8509615384615384
|
| 388 |
name: Max Accuracy
|
| 389 |
- task:
|
| 390 |
type: triplet
|
|
|
|
| 394 |
type: bge-base-en-v1.5-eval
|
| 395 |
metrics:
|
| 396 |
- type: cosine_accuracy
|
| 397 |
+
value: 1.0
|
| 398 |
name: Cosine Accuracy
|
| 399 |
- type: dot_accuracy
|
| 400 |
+
value: 0.0
|
| 401 |
name: Dot Accuracy
|
| 402 |
- type: manhattan_accuracy
|
| 403 |
+
value: 1.0
|
| 404 |
name: Manhattan Accuracy
|
| 405 |
- type: euclidean_accuracy
|
| 406 |
+
value: 1.0
|
| 407 |
name: Euclidean Accuracy
|
| 408 |
- type: max_accuracy
|
| 409 |
+
value: 1.0
|
| 410 |
+
name: Max Accuracy
|
| 411 |
+
- type: cosine_accuracy
|
| 412 |
+
value: 1.0
|
| 413 |
+
name: Cosine Accuracy
|
| 414 |
+
- type: dot_accuracy
|
| 415 |
+
value: 0.0
|
| 416 |
+
name: Dot Accuracy
|
| 417 |
+
- type: manhattan_accuracy
|
| 418 |
+
value: 1.0
|
| 419 |
+
name: Manhattan Accuracy
|
| 420 |
+
- type: euclidean_accuracy
|
| 421 |
+
value: 1.0
|
| 422 |
+
name: Euclidean Accuracy
|
| 423 |
+
- type: max_accuracy
|
| 424 |
+
value: 1.0
|
| 425 |
name: Max Accuracy
|
| 426 |
---
|
| 427 |
|
| 428 |
+
# SentenceTransformer based on BAAI/bge-base-en-v1.5
|
| 429 |
|
| 430 |
+
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
| 431 |
|
| 432 |
## Model Details
|
| 433 |
|
| 434 |
### Model Description
|
| 435 |
- **Model Type:** Sentence Transformer
|
| 436 |
+
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
|
| 437 |
- **Maximum Sequence Length:** 512 tokens
|
| 438 |
- **Output Dimensionality:** 768 tokens
|
| 439 |
- **Similarity Function:** Cosine Similarity
|
|
|
|
| 521 |
* Dataset: `bge-base-en-v1.5-train`
|
| 522 |
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
|
| 523 |
|
| 524 |
+
| Metric | Value |
|
| 525 |
+
|:-------------------|:----------|
|
| 526 |
+
| cosine_accuracy | 0.8462 |
|
| 527 |
+
| dot_accuracy | 0.1538 |
|
| 528 |
+
| manhattan_accuracy | 0.851 |
|
| 529 |
+
| euclidean_accuracy | 0.8462 |
|
| 530 |
+
| **max_accuracy** | **0.851** |
|
| 531 |
+
|
| 532 |
+
#### Triplet
|
| 533 |
+
* Dataset: `bge-base-en-v1.5-eval`
|
| 534 |
+
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
|
| 535 |
+
|
| 536 |
+
| Metric | Value |
|
| 537 |
+
|:-------------------|:--------|
|
| 538 |
+
| cosine_accuracy | 1.0 |
|
| 539 |
+
| dot_accuracy | 0.0 |
|
| 540 |
+
| manhattan_accuracy | 1.0 |
|
| 541 |
+
| euclidean_accuracy | 1.0 |
|
| 542 |
+
| **max_accuracy** | **1.0** |
|
| 543 |
|
| 544 |
#### Triplet
|
| 545 |
* Dataset: `bge-base-en-v1.5-eval`
|
| 546 |
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
|
| 547 |
|
| 548 |
+
| Metric | Value |
|
| 549 |
+
|:-------------------|:--------|
|
| 550 |
+
| cosine_accuracy | 1.0 |
|
| 551 |
+
| dot_accuracy | 0.0 |
|
| 552 |
+
| manhattan_accuracy | 1.0 |
|
| 553 |
+
| euclidean_accuracy | 1.0 |
|
| 554 |
+
| **max_accuracy** | **1.0** |
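The accuracy rows in the tables above can be read as the fraction of (anchor, positive, negative) triplets in which the positive lies closer to the anchor than the negative, with `max_accuracy` taken as the largest of the per-metric values. The following is a minimal sketch of that idea, not the `TripletEvaluator` implementation: the helper names and the toy 2-dimensional vectors are hypothetical, and the dot-product variant is left out because its comparison convention is not stated in this diff.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def triplet_accuracy(triplets, distance):
    """Fraction of triplets where the positive is strictly closer than the negative."""
    correct = sum(1 for a, p, n in triplets if distance(a, p) < distance(a, n))
    return correct / len(triplets)


# Hypothetical toy embeddings: (anchor, positive, negative).
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive closer under every metric
    ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]),  # negative closer under every metric
    ([0.0, 1.0], [0.1, 0.8], [1.0, 0.2]),  # positive closer under every metric
    ([1.0, 1.0], [2.0, 2.0], [1.0, 0.8]),  # same direction, but far away in space
]

accuracies = {
    "cosine": triplet_accuracy(triplets, cosine_distance),
    "euclidean": triplet_accuracy(triplets, euclidean_distance),
    "manhattan": triplet_accuracy(triplets, manhattan_distance),
}
max_accuracy = max(accuracies.values())
print(accuracies, max_accuracy)
```

The last toy triplet shows why the per-metric values can differ, as they do on the train split here: the positive points in the same direction as the anchor, so it counts as correct under cosine distance, yet it is farther away than the negative under Euclidean and Manhattan distance.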
|
| 555 |
|
| 556 |
<!--
|
| 557 |
## Bias, Risks and Limitations
|
|
|
|
| 740 |
### Training Logs
|
| 741 |
| Epoch | Step | bge-base-en-v1.5-eval_max_accuracy | bge-base-en-v1.5-train_max_accuracy |
|
| 742 |
|:-----:|:----:|:----------------------------------:|:-----------------------------------:|
|
| 743 |
+
| 0 | 0 | - | 0.8510 |
|
| 744 |
+
| 5.0 | 65 | 1.0 | - |
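As a sanity check on the reported numbers, the full-precision train-split values from the model-index metadata can be converted back to counts of correct triplets. This sketch assumes the train evaluator scored exactly 208 triplets, a figure taken from the `dataset_size:208` tag rather than stated for the evaluator itself, so the counts are illustrative.

```python
from fractions import Fraction

# Assumption: the number of evaluated triplets equals the `dataset_size:208` tag.
DATASET_SIZE = 208

# Full-precision values copied from the model-index metadata (train split).
train_metrics = {
    "cosine_accuracy": 0.8461538461538461,
    "dot_accuracy": 0.15384615384615385,
    "manhattan_accuracy": 0.8509615384615384,
    "euclidean_accuracy": 0.8461538461538461,
}

correct_counts = {}
for name, value in train_metrics.items():
    # Closest fraction whose denominator does not exceed the dataset size.
    ratio = Fraction(value).limit_denominator(DATASET_SIZE)
    # Scale the reduced fraction up to a count out of DATASET_SIZE.
    assert DATASET_SIZE % ratio.denominator == 0
    correct_counts[name] = ratio.numerator * (DATASET_SIZE // ratio.denominator)

print(correct_counts)
# {'cosine_accuracy': 176, 'dot_accuracy': 32, 'manhattan_accuracy': 177, 'euclidean_accuracy': 176}
```

Under that assumption, the 0.8510 logged for `bge-base-en-v1.5-train_max_accuracy` corresponds to 177 of 208 triplets, the Manhattan result.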
|
| 745 |
|
| 746 |
|
| 747 |
### Framework Versions
|
config.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
{
|
| 2 |
-
"_name_or_path": "BAAI/bge-base-en",
|
| 3 |
"architectures": [
|
| 4 |
"BertModel"
|
| 5 |
],
|
|
|
|
| 1 |
{
|
| 2 |
+
"_name_or_path": "BAAI/bge-base-en-v1.5",
|
| 3 |
"architectures": [
|
| 4 |
"BertModel"
|
| 5 |
],
|
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 437951328
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4226550437f27f985a4aaa7684a4bfcf05baedd330b64315cbdf0882a4d02c57
|
| 3 |
size 437951328
|
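The `model.safetensors` change above touches only a Git LFS pointer file: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. The size is unchanged, so only the hash differs between the two sides. A minimal sketch of reading such a pointer follows; the `parse_lfs_pointer` helper is hypothetical and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the new side of the diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:4226550437f27f985a4aaa7684a4bfcf05baedd330b64315cbdf0882a4d02c57
size 437951328
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```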