---
language: en
license: apache-2.0
model_name: bidaf-9.onnx
tags:
- validated
- text
- machine_comprehension
- bidirectional_attention_flow
---
<!--- SPDX-License-Identifier: MIT -->

# BiDAF

## Description
This model is a neural network for answering a query about a given context paragraph.

## Model

|Model |Download |Download (with sample test data)|ONNX version|Opset version|Accuracy |
|-------------|:--------------|:--------------|:--------------|:--------------|:--------------|
|BiDAF |[41.5 MB](model/bidaf-9.onnx) |[37.3 MB](model/bidaf-9.tar.gz)|1.4 |9 |EM of 68.1 on SQuAD v1.1 |
|BiDAF-int8 |[12 MB](model/bidaf-11-int8.onnx) |[8.7 MB](model/bidaf-11-int8.tar.gz)|1.13.1 |11 |EM of 65.93 on SQuAD v1.1 |
> Compared with the fp32 BiDAF model, the accuracy drop of int8 BiDAF is 0.23% and the performance improvement is 0.89x on SQuAD v1.1.
>
> Performance depends on the test hardware. The data here was collected with an Intel® Xeon® Platinum 8280 Processor (1 socket, 4 cores per instance), CentOS Linux 8.3, with a data batch size of 1.

<hr>

## Inference

### Input to model
Tokenized strings of the context paragraph and the query.

### Preprocessing steps
Tokenize the words and characters of the context and query strings. Word tokens are lower-cased, while character tokens are not. The characters of each word are clamped or padded to a list of length 16. Note that [NLTK](https://www.nltk.org/install.html) is used for word tokenization during preprocessing.

* context_word: [seq, 1,] of string
* context_char: [seq, 1, 1, 16] of string
* query_word: [seq, 1,] of string
* query_char: [seq, 1, 1, 16] of string

The following code shows how to preprocess input strings:

```python
import numpy as np
from nltk import word_tokenize

def preprocess(text):
    tokens = word_tokenize(text)
    # split into lower-case word tokens, in numpy array with shape of (seq, 1)
    words = np.asarray([w.lower() for w in tokens]).reshape(-1, 1)
    # split words into chars, in numpy array with shape of (seq, 1, 1, 16)
    chars = [[c for c in t][:16] for t in tokens]
    chars = [cs + [''] * (16 - len(cs)) for cs in chars]
    chars = np.asarray(chars).reshape(-1, 1, 1, 16)
    return words, chars

# input
context = 'A quick brown fox jumps over the lazy dog.'
query = 'What color is the fox?'
cw, cc = preprocess(context)
qw, qc = preprocess(query)
```
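
The clamp-then-pad step for characters can be checked in isolation. This sketch uses a hypothetical `char_features` helper (not part of the model's API) that applies the same logic to a hand-tokenized list, so it runs without NLTK:

```python
import numpy as np

def char_features(tokens, width=16):
    # clamp each word to at most `width` chars, then pad with '' to exactly `width`
    chars = [list(t)[:width] for t in tokens]
    chars = [cs + [''] * (width - len(cs)) for cs in chars]
    return np.asarray(chars).reshape(-1, 1, 1, width)

demo_tokens = ['a', 'quick', 'brown', 'fox']
cc_demo = char_features(demo_tokens)
print(cc_demo.shape)  # (4, 1, 1, 16)
```

Each word contributes one `(1, 1, 16)` slice: the first characters are the word's own, and the remainder are empty-string padding.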

### Output of model
The model has two outputs:

* start_pos: the answer's start position (0-indexed) in the context.
* end_pos: the answer's inclusive end position (0-indexed) in the context.

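Putting the pieces together, inference can be sketched as follows. The tensor names come from this README; the `build_feed` helper and the onnxruntime calls (shown commented, since onnxruntime is not pinned by this README) are illustrative assumptions:

```python
# input/output tensor names as documented in this README
INPUT_NAMES = ('context_word', 'context_char', 'query_word', 'query_char')
OUTPUT_NAMES = ('start_pos', 'end_pos')

def build_feed(cw, cc, qw, qc):
    # map the preprocessed arrays onto the model's input names
    return dict(zip(INPUT_NAMES, (cw, cc, qw, qc)))

# Hypothetical run with onnxruntime (assumed installed):
# import onnxruntime as ort
# sess = ort.InferenceSession('bidaf-9.onnx')
# answer = sess.run(list(OUTPUT_NAMES), build_feed(cw, cc, qw, qc))
```
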
### Postprocessing steps
The output positions select the answer span from the tokenized context:
```python
# assuming answer holds the np arrays returned for start_pos/end_pos
start = answer[0].item()
end = answer[1].item()
print([w.encode() for w in cw[start:end + 1].reshape(-1)])
```

For this test case, it outputs:
```
[b'brown']
```
<hr>

## Dataset (Train and validation)
The model is trained with [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/).
<hr>

## Validation accuracy
The metric is an Exact Match (EM) score of 68.1, computed over the SQuAD v1.1 dev data.
<hr>

## Quantization
BiDAF-int8 is obtained by quantizing the fp32 BiDAF model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/nlp/onnx_model_zoo/BiDAF/quantization/ptq_dynamic/README.md) to understand how to use Intel® Neural Compressor for quantization.

### Prepare Model
Download the model from the [ONNX Model Zoo](https://github.com/onnx/models).

```shell
wget https://github.com/onnx/models/raw/main/text/machine_comprehension/bidirectional_attention_flow/model/bidaf-9.onnx
```

Convert the opset version to 11 for broader quantization support.

```python
import onnx
from onnx import version_converter

model = onnx.load('bidaf-9.onnx')
model = version_converter.convert_version(model, 11)
onnx.save_model(model, 'bidaf-11.onnx')
```

### Model quantize

Dynamic quantization:

```bash
# --input_model: model path as *.onnx
bash run_tuning.sh --input_model=path/to/model \
                   --dataset_location=path/to/squad/dev-v1.1.json \
                   --output_model=path/to/model_tune
```

<hr>

## Publication/Attribution
Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi. Bidirectional Attention Flow for Machine Comprehension, [paper](https://arxiv.org/abs/1611.01603)

<hr>

## References
* This model is converted from a CNTK model trained with [this implementation](https://github.com/microsoft/CNTK/tree/nikosk/bidaf/Examples/Text/BidirectionalAttentionFlow/squad).
* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
<hr>

## Contributors
* [mengniwang95](https://github.com/mengniwang95) (Intel)
* [yuwenzho](https://github.com/yuwenzho) (Intel)
* [airMeng](https://github.com/airMeng) (Intel)
* [ftian1](https://github.com/ftian1) (Intel)
* [hshen14](https://github.com/hshen14) (Intel)
<hr>

## License
MIT License
<hr>