morenolq committed on
Commit 9a97fc9 · verified · 1 Parent(s): c213df1

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -4,8 +4,6 @@ language:
  tags:
  - text2text-generation
  - summarization
- - legal-ai
- - italian-law
  license: mit
  datasets:
  - joelniklaus/Multi_Legal_Pile
@@ -28,6 +26,8 @@ They build upon **BART-IT** ([`morenolq/bart-it`](https://huggingface.co/morenol
  - **Trained on legal documents** such as **statutes, case law, and contracts** 📑
  - **Not fine-tuned for specific tasks** (requires further adaptation)
 
+ ⚠️ This specific model is pre-trained on general-purpose Italian text! Please select the best model from the table below.
+
  ## 📂 Available Models
 
  | Model | Description | Link |
@@ -38,8 +38,8 @@ They build upon **BART-IT** ([`morenolq/bart-it`](https://huggingface.co/morenol
  | **LEGIT-SCRATCH-BART** | Trained from scratch on **Italian legal texts** | [🔗 Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART) |
  | **LEGIT-SCRATCH-BART-LSG-4096** | Trained from scratch with **LSG attention**, supporting **4,096 tokens** | [🔗 Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART-LSG-4096) |
  | **LEGIT-SCRATCH-BART-LSG-16384** | Trained from scratch with **LSG attention**, supporting **16,384 tokens** | [🔗 Link](https://huggingface.co/morenolq/LEGIT-SCRATCH-BART-LSG-16384) |
- | **BART-IT-LSG-4096** | `morenolq/bart-it` with **LSG attention**, supporting **4,096 tokens** (no legal adaptation) | [🔗 Link](https://huggingface.co/morenolq/BART-IT-LSG-4096)
- | **BART-IT-LSG-16384** | `morenolq/bart-it` with **LSG attention**, supporting **16,384 tokens** (no legal adaptation) | [🔗 Link](https://huggingface.co/morenolq/BART-IT-LSG-16384) |
+ | **BART-IT-LSG-4096** | `morenolq/bart-it` with **LSG attention**, supporting **4,096 tokens** (⚠️ no legal adaptation) | [🔗 Link](https://huggingface.co/morenolq/BART-IT-LSG-4096)
+ | **BART-IT-LSG-16384** | `morenolq/bart-it` with **LSG attention**, supporting **16,384 tokens** (⚠️ no legal adaptation) | [🔗 Link](https://huggingface.co/morenolq/BART-IT-LSG-16384) |
 
  ---
 
@@ -66,13 +66,13 @@ They build upon **BART-IT** ([`morenolq/bart-it`](https://huggingface.co/morenol
  from transformers import BartForConditionalGeneration, AutoTokenizer
 
  # Load tokenizer and model
- model_name = "morenolq/LEGIT-BART"
+ model_name = "morenolq/BART-IT-LSG-16384"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = BartForConditionalGeneration.from_pretrained(model_name)
 
  # Example input
  input_text = "<mask> 1234: Il contratto si intende concluso quando..."
- inputs = tokenizer(input_text, return_tensors="pt", max_length=4096, truncation=True)
+ inputs = tokenizer(input_text, return_tensors="pt", max_length=16384, truncation=True)
 
  # Generate summary
  summary_ids = model.generate(inputs.input_ids, max_length=150, num_beams=4, early_stopping=True)
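
The last hunk changes `model_name` and the tokenizer's `max_length` in the same commit, so the two values have to be kept in step by hand. As a minimal sketch (not part of the README), the truncation length could instead be read from the `-LSG-<n>` suffix used in the model table. The helper `max_tokens_for` is hypothetical, and the 1,024-token fallback for checkpoints without an LSG suffix is an assumption based on the usual BART position limit, not something the README states.

```python
import re

# Assumption: checkpoints without an "-LSG-<n>" suffix use the usual
# BART limit of 1,024 tokens. The README does not state this value.
DEFAULT_MAX_TOKENS = 1024


def max_tokens_for(model_name: str, default: int = DEFAULT_MAX_TOKENS) -> int:
    """Hypothetical helper: read the context length from an '-LSG-<n>' suffix."""
    match = re.search(r"-LSG-(\d+)$", model_name)
    return int(match.group(1)) if match else default


if __name__ == "__main__":
    # Model names taken from the table in the README diff.
    for name in ("morenolq/BART-IT-LSG-16384", "morenolq/BART-IT-LSG-4096"):
        print(name, max_tokens_for(name))
```

With this in place, the tokenizer call from the README could pass `max_length=max_tokens_for(model_name)` instead of a hard-coded number, so switching checkpoints would need only one edit.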