CrabInHoney/urlbert-tiny-base-v3

	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-classification
	tags:
	- url
	- urls
	- classification
	new_version: CrabInHoney/urlbert-tiny-base-v4
	---
	This is a very small BERT model intended as a base for later fine-tuning on URL analysis tasks.

	It is an updated version of the earlier base model for URL analysis.

	Previous version: https://huggingface.co/CrabInHoney/urlbert-tiny-base-v2
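Since the model is meant to be fine-tuned, one possible starting point is to load the checkpoint with a fresh classification head. The sketch below is only an illustration: the two labels (`benign`, `suspicious`) and the helper names `invert_labels` and `load_for_classification` are assumptions, not something this model card defines. The classification head is randomly initialized, so it needs training on labeled URLs before its predictions mean anything.

```python
from typing import Dict

# Hypothetical label set for illustration; the model card does not define one.
ID2LABEL: Dict[int, str] = {0: "benign", 1: "suspicious"}


def invert_labels(id2label: Dict[int, str]) -> Dict[str, int]:
    """Build the label-to-id mapping from the id-to-label mapping."""
    return {label: idx for idx, label in id2label.items()}


def load_for_classification(model_name: str = "CrabInHoney/urlbert-tiny-base-v3"):
    """Load the base checkpoint with a new, untrained classification head."""
    # Imported here so the label helpers above work without transformers installed.
    from transformers import BertForSequenceClassification, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=invert_labels(ID2LABEL),
    )
    return tokenizer, model
```

Calling `load_for_classification()` downloads the checkpoint and returns a tokenizer and model that can be passed to a training loop of your choice.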

	Model size: 3.69M params

	Tensor type: F32

	Test example:

	from transformers import BertTokenizerFast, BertForMaskedLM, pipeline
	import torch

	# Use the GPU when one is available, otherwise fall back to the CPU.
	device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
	print(f"Device in use: {device}")

	model_name = "CrabInHoney/urlbert-tiny-base-v3"

	tokenizer = BertTokenizerFast.from_pretrained(model_name)
	model = BertForMaskedLM.from_pretrained(model_name)
	model.to(device)

	fill_mask = pipeline(
	    "fill-mask",
	    model=model,
	    tokenizer=tokenizer,
	    device=0 if torch.cuda.is_available() else -1
	)

	# The [MASK] token marks the part of the URL the model should predict.
	sentences = [
	    "http://example.[MASK]/"
	]

	for sentence in sentences:
	    print(f"\nInput sentence: {sentence}")
	    results = fill_mask(sentence)
	    for result in results:
	        token_str = result['token_str']
	        score = result['score']
	        print(f"Predicted word: {token_str}, probability: {score:.4f}")


	Output:

	Input sentence: http://example.[MASK]/
	Predicted word: com, probability: 0.7018
	Predicted word: org, probability: 0.1191
	Predicted word: nl, probability: 0.0406
	Predicted word: net, probability: 0.0294
	Predicted word: ca, probability: 0.0190
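The fill-mask pipeline returns a list of dictionaries that include `token_str` and `score` keys, so the predictions are easy to filter afterwards. The sketch below keeps only candidates above a cutoff; the helper name `rank_candidates` and the `min_score` value of 0.05 are assumptions chosen for illustration, and the sample data simply repeats the scores from the output shown above.

```python
from typing import Dict, List, Tuple


def rank_candidates(results: List[Dict], min_score: float = 0.05) -> List[Tuple[str, float]]:
    """Keep predictions scoring at least min_score, highest score first."""
    kept = [(r["token_str"], r["score"]) for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Scores copied from the sample output for "http://example.[MASK]/".
sample = [
    {"token_str": "com", "score": 0.7018},
    {"token_str": "org", "score": 0.1191},
    {"token_str": "nl", "score": 0.0406},
    {"token_str": "net", "score": 0.0294},
    {"token_str": "ca", "score": 0.0190},
]

print(rank_candidates(sample))  # [('com', 0.7018), ('org', 0.1191)]
```

With the default cutoff only `com` and `org` remain; lowering `min_score` lets more of the long tail through.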