Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
How to use aaa961/finetuned-bge-m3-base-en with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("aaa961/finetuned-bge-m3-base-en")
sentences = [
"Semicolons in Emmet abbreviations inside SASS files <!-- ⚠️⚠️ Do Not Delete This! bug_report_template ⚠️⚠️ -->\r\n<!-- Please read our Rules of Conduct: https://opensource.microsoft.com/codeofconduct/ -->\r\n<!-- Please search existing issues to avoid creating duplicates. -->\r\n<!-- Also please test using the latest insiders build to make sure your issue has not already been fixed: https://code.visualstudio.com/insiders/ -->\r\n\r\n<!-- Use Help > Report Issue to prefill these. -->\r\n- VSCode Version: 1.53.0\r\n- OS Version: Windows 10 Pro 20H2\r\n\r\nSteps to Reproduce:\r\n\r\n1. Open SASS file.\r\n2. Expand any Emmet abbreviation.\r\n3. The expanded line ends with a semicolon, which is unacceptable for SASS.\r\n4. The issue appeared after 1.53.0 update.\r\n\r\n<!-- Launch with `code --disable-extensions` to check. -->\r\nDoes this issue occur when all extensions are disabled?: Yes/No\r\n",
"Breakpoint decorations go to wrong place - Open menuService.ts\r\n- Set some breakpoints\r\n- They show in the wrong place and cover a character, but seemingly get fixed when setting a breakpoint on the next line\r\n\r\n\r\n",
"Breadcrumbs/selectbox dropdown does not relayout after resizing panel <!-- ⚠️⚠️ Do Not Delete This! bug_report_template ⚠️⚠️ -->\r\n<!-- Please read our Rules of Conduct: https://opensource.microsoft.com/codeofconduct/ -->\r\n<!-- Please search existing issues to avoid creating duplicates. -->\r\n<!-- Also please test using the latest insiders build to make sure your issue has not already been fixed: https://code.visualstudio.com/insiders/ -->\r\n\r\n<!-- Use Help > Report Issue to prefill these. -->\r\n- VSCode Version: stable and insiders\r\n\r\nSteps to Reproduce:\r\n\r\n1. Click on a breadcrumb/selectbox so the dropdown shows up\r\n2. Resize panel\r\n3. :bug: Observe the dropdown does not relayout \r\n\r\n\r\n\r\n\r\n",
"Terminal Sticky Scroll disappears when the beginning (not the end) of code block reaches the end of the scope Testing #199240\r\n\r\nSo I was testing the terminal sticky scroll and I noticed that the sticky code block disappears in an abrupt manner when the beginning (not the end) of the multiline terminal command, reaches the end of the scope. To give more context about what I mean, I added the GIF below. I was sort of expecting the code block to start disappearing when the line `> ` reached the end of the output for that command. I noticed that the code block disappeared however when the line `echo hi; \\` reached the end of the output for that run.\r\n\r\nhttps://github.com/microsoft/vscode/assets/61460952/8c92b925-f29a-498a-a8f0-cd438e6d6c6c\r\n\r\nI don't think this is a problem per say, it's a different manner of animating sticky scroll. I was just wondering if this is intended? I think the potential issue with this approach is that the sticky scroll block is hiding the next terminal command and its output while you are scrolling until it disappears. In the editor, sticky scroll starts disappearing progressively when the bottom of the last block/line touches the end of the scope. Perhaps we could explore something similar?\r\n\r\nhttps://github.com/microsoft/vscode/assets/61460952/e3c8fdf2-7dc1-4363-a9f7-ff403a36f533\r\n\r\n\r\n"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
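The printed architecture ends with CLS-token pooling (`pooling_mode_cls_token: True`) followed by `Normalize()`. A minimal pure-Python sketch of those two steps, using made-up 2-dimensional token vectors in place of the real 1024-dimensional transformer outputs (the helper names `cls_pool` and `l2_normalize` are illustrative, not library functions):

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (what a Normalize step does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy input: 3 tokens x 2 dimensions (assumed values, for illustration only).
tokens = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)  # [0.6, 0.8]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.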
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("aaa961/finetuned-bge-m3-base-en")
# Run inference
sentences = [
'Shell integration: bash and zsh don\'t serialize \\n and ; characters Part of https://github.com/microsoft/vscode/issues/155639\r\n\r\nRepro:\r\n\r\n1. Open a bash or zsh session\r\n2. Run:\r\n ```sh\r\n echo "a\r\n … b"\r\n ```\r\n \r\n3. ctrl+alt+r to run recent command, select the last command, 🐛 it\'s run without the new line\r\n \r\n',
'TreeView state out of sync Testing #117304\r\n\r\nRepro: Not Sure\r\n\r\nTest state shows passed in file but still running in tree view.\r\n\r\n\r\n',
'Setting icon and color in createTerminal API no longer works correctly See https://github.com/fabiospampinato/vscode-terminals/issues/77\r\n\r\nLooks like the default tab color/icon change probably regressed this.\r\n\r\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.4264, 0.4315],
# [0.4264, 1.0000, 0.4278],
# [0.4315, 0.4278, 1.0000]])
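One way to read a similarity matrix like the one printed above is to look up, for each sentence, its most similar other sentence. The sketch below hard-codes the scores from that output as plain lists so it runs without downloading the model; `nearest_neighbors` is a hypothetical helper, not part of sentence-transformers.

```python
# Similarity scores copied from the example output above.
similarities = [
    [1.0000, 0.4264, 0.4315],
    [0.4264, 1.0000, 0.4278],
    [0.4315, 0.4278, 1.0000],
]

def nearest_neighbors(sim):
    """For each row i, return (index, score) of the best match j != i."""
    result = []
    for i, row in enumerate(sim):
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        best_score, best_j = max(candidates)
        result.append((best_j, best_score))
    return result

neighbors = nearest_neighbors(similarities)
print(neighbors)  # [(2, 0.4315), (2, 0.4278), (0, 0.4315)]
```

All three off-diagonal scores are close to each other here, so none of the three example issues is a clear near-duplicate of another.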
Dataset: bge-base-en-train, evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 1.0 |
Dataset: bge-base-en-train, evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9524 |
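A triplet accuracy of this kind is commonly computed as the fraction of (anchor, positive, negative) triplets in which the anchor is closer, by cosine similarity, to the positive than to the negative. The sketch below assumes that definition and uses toy 2-dimensional vectors; the library's evaluator may differ in details such as an optional margin.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)

# Toy triplets (assumed values): the first two are ranked correctly, the third is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(round(accuracy, 4))  # 0.6667
```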
Columns: texts and label
| | texts | label |
|---|---|---|
| type | string | int |
| details | | |
| texts | label |
|---|---|
| Branch list is sometimes out of order<br>Type: Bug<br>1. Open a workspace<br>2. Quickly open the branch picker and type main<br>Bug<br>The first time you do this, sometimes you end up with an unordered list:<br>The correct order shows up when you keep start typing or try doing this again:<br>VS Code version: Code - Insiders 1.91.0-insider (Universal) (0354163c1c66b950b0762364f5b4cd37937b624a, 2024-06-26T10:12:33.304Z)<br>OS version: Darwin arm64 23.5.0<br>Modes:<br>\|Item\|Value\|<br>\|---\|---\|<br>\|CPUs\|Apple M2 Max (12 x 2400)\|<br>\|GPU Status\|2d_canvas: unavailable_software canvas_oop_rasterization: disabled_off direct_rendering_display_compositor: disabled_off_ok gpu_compositing: disabled_software multiple_raster_threads: enabled_on ope... | 218 |
| Git Branch Picker Race Condition If I paste the branch too quickly and then press enter, it does not switch to it, but creates a new branch.<br>This breaks muscle memory, as it works when you do it slowly.<br>Once loading completes, it should select the branch again. | 218 |
| links aren't discoverable to screen reader users in markdown documents They're only discoverable via visual distinction and the action that can be taken (IE opening them) is only indicated in the tooltip AFAICT.<br>https://github.com/microsoft/vscode/assets/29464607/09d28b81-c2cc-4477-b1fc-7b1de1baae74 | 177 |
Loss: BatchSemiHardTripletLoss
Columns: texts and label
| | texts | label |
|---|---|---|
| type | string | int |
| details | | |
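For readers unfamiliar with batch semi-hard triplet loss, the idea is roughly this: within a batch, every pair of examples with the same integer label forms an anchor/positive pair, and the negative is chosen from examples with a different label. It is preferably a "semi-hard" one that is farther from the anchor than the positive. The following is a simplified pure-Python sketch under stated assumptions (Euclidean distance, an illustrative `margin=1.0`, 1-dimensional toy embeddings); it is not the library's implementation, which may differ in distance function, default margin, and tie handling.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def semi_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean hinge loss over all anchor/positive pairs in the batch.

    For each pair, the negative is the closest different-label example that is
    still farther than the positive; if none exists, the farthest negative is used.
    """
    losses = []
    n = len(embeddings)
    for a in range(n):
        negatives = [
            euclidean(embeddings[a], embeddings[k])
            for k in range(n)
            if labels[k] != labels[a]
        ]
        if not negatives:
            continue  # no different-label example in the batch
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = euclidean(embeddings[a], embeddings[p])
            farther = [d for d in negatives if d > d_ap]
            d_an = min(farther) if farther else max(negatives)
            losses.append(max(d_ap - d_an + margin, 0.0))
    return sum(losses) / len(losses) if losses else 0.0

# Toy batch: two examples sharing label 218 and one with label 177 (values assumed).
toy_embeddings = [[0.0], [1.0], [1.5]]
toy_labels = [218, 218, 177]

loss = semi_hard_triplet_loss(toy_embeddings, toy_labels, margin=1.0)
print(loss)  # 1.0
```

In the toy batch, the pair (0, 1) contributes 0.5 and the pair (1, 0) contributes 1.5, giving a mean of 1.0. Integer labels such as the `label` column above are what make this in-batch triplet mining possible without pre-built triplets.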
Base model: BAAI/bge-m3