zacbrld/MNLP_M2_document_encoder

Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:5489
loss:MultipleNegativesRankingLoss
text-embeddings-inference

  • sentence-transformers

    How to use zacbrld/MNLP_M2_document_encoder with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("zacbrld/MNLP_M2_document_encoder")
    
    sentences = [
        "Military activity affects the physical geology. This was first noted through the intensive shelling on the Western Front during World War I, which caused the shattering of the bedrock and changed the rocks' permeability. New minerals, rocks, and land-forms are also a byproduct of nuclear testing.",
        "Silicon can form sigma bonds to other silicon atoms (and disilane is the parent of this class of compounds). However, it is difficult to prepare and isolate SinH2n+2 (analogous to the saturated alkane hydrocarbons) with n greater than about 8, as their thermal stability decreases with increases in the number of silicon atoms.  Silanes higher in molecular weight than disilane decompose to polymeric polysilicon hydride and hydrogen.  But with a suitable pair of organic substituents in place of hydrogen on each silicon it is possible to prepare polysilanes (sometimes, erroneously called polysilenes) that are analogues of alkanes. These long chain compounds have surprising electronic properties - high electrical conductivity, for example - arising from sigma delocalization of the electrons in the chain.\nEven silicon–silicon pi bonds are possible. However, these bonds are less stable than the carbon analogues. Disilane and longer silanes are quite reactive compared to alkanes. Disilene and disilynes are quite rare, unlike alkenes and alkynes. Examples of disilynes, long thought to be too unstable to be isolated were reported in 2004.",
        "The increasing sophistication of brain-reading technologies has led many to investigate their potential applications for lie detection. Legally required brain scans arguably violate “the guarantee against self-incrimination” because they differ from acceptable forms of bodily evidence, such as fingerprints or blood samples, in an important way: they are not simply physical, hard evidence, but evidence that is intimately linked to the defendant's mind. Under US law, brain-scanning technologies might also raise implications for the Fourth Amendment, calling into question whether they constitute an unreasonable search and seizure.",
        "Military activity affects the physical geology. This was first noted through the intensive shelling on the Western Front during World War I, which caused the shattering of the bedrock and changed the rocks' permeability. New minerals, rocks, and land-forms are also a byproduct of nuclear testing."
    ]
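    # Encode each passage into a fixed-size embedding vector.
    embeddings = model.encode(sentences)
    
    # Compute the pairwise similarity between all passages.
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # torch.Size([4, 4])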
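
To make the similarity step above more concrete, here is a minimal sketch of a pairwise cosine similarity matrix in plain Python. It assumes the model uses cosine similarity, which is the default in recent sentence-transformers releases unless the model configuration names another function; this has not been checked against this repository's config. The helper `cosine_similarity_matrix` and the toy vectors are hypothetical stand-ins for illustration, not real embeddings from this model.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the pairwise cosine similarity matrix for equal-length vectors."""
    # Pre-compute the Euclidean norm of every vector once.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (nu * nv)
            for v, nv in zip(vectors, norms)
        ]
        for u, nu in zip(vectors, norms)
    ]


# Hypothetical 3-dimensional stand-ins for real embeddings.
# The first and last vectors are identical, mirroring the repeated
# passage at the start and end of the `sentences` list.
toy_embeddings = [
    [1.0, 2.0, 0.5],
    [0.0, 1.0, 3.0],
    [2.0, 0.0, 1.0],
    [1.0, 2.0, 0.5],
]

sims = cosine_similarity_matrix(toy_embeddings)
print(len(sims), len(sims[0]))  # matrix dimensions: one row and column per input
print(round(sims[0][3], 6))     # similarity between the two identical vectors
```

Under that assumption, the matrix is symmetric with ones on the diagonal. Because the first and last passages in `sentences` are identical, the corresponding off-diagonal entries should also be (approximately) 1.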
MNLP_M2_document_encoder
91.8 MB
  • 1 contributor
History: 6 commits
zacbrld
🔁 Fine-tuned on custom STEM corpus
6f1d702 verified 12 months ago
  • 1_Pooling
    Add new SentenceTransformer model 12 months ago
  • .gitattributes
    1.52 kB
    initial commit 12 months ago
  • README.md
    32.4 kB
    🔁 Fine-tuned on custom STEM corpus 12 months ago
  • config.json
    617 Bytes
    Upload model 12 months ago
  • config_sentence_transformers.json
    199 Bytes
    Add new SentenceTransformer model 12 months ago
  • model.safetensors
    90.9 MB
    xet
    🔁 Fine-tuned on custom STEM corpus 12 months ago
  • modules.json
    349 Bytes
    Add new SentenceTransformer model 12 months ago
  • sentence_bert_config.json
    53 Bytes
    Add new SentenceTransformer model 12 months ago
  • special_tokens_map.json
    695 Bytes
    Upload tokenizer 12 months ago
  • tokenizer.json
    712 kB
    🔁 Fine-tuned on custom STEM corpus 12 months ago
  • tokenizer_config.json
    1.46 kB
    Add new SentenceTransformer model 12 months ago
  • vocab.txt
    232 kB
    Upload tokenizer 12 months ago