Created: June 6, 2020 08:21
Model Parameters for extractive QA with SQuAD
SpanBERT parameter details: attention_probs_dropout_prob: 0.1, directionality: bidi, hidden_act: gelu,
hidden_dropout_prob: 0.1, hidden_size: 1024, initializer_range: 0.02, intermediate_size: 4096,
layer_norm_eps: 1e-12, max_position_embeddings: 512, num_attention_heads: 16, num_hidden_layers: 24,
pad_token_id: 0, pooler_fc_size: 768, pooler_num_attention_heads: 12, pooler_num_fc_layers: 3,
pooler_size_per_head: 128, pooler_type: "first_token_transform", type_vocab_size: 2, vocab_size: 28996
ALBERT parameter details: attention_probs_dropout_prob: 0, bos_token_id: 2, classifier_dropout_prob: 0.1,
down_scale_factor: 1, embedding_size: 128, eos_token_id: 3, gap_size: 0, hidden_act: "gelu",
hidden_dropout_prob: 0, hidden_size: 4096, initializer_range: 0.02, inner_group_num: 1,
intermediate_size: 16384, layer_norm_eps: 1e-12, max_position_embeddings: 512, net_structure_type: 0,
num_attention_heads: 64, num_hidden_groups: 1, num_hidden_layers: 12, num_memory_blocks: 0,
output_past: true, pad_token_id: 0, type_vocab_size: 2, vocab_size: 30000
BERT parameter details: attention_probs_dropout_prob: 0.1, hidden_act: gelu, hidden_dropout_prob: 0.1,
hidden_size: 1024, initializer_range: 0.02, intermediate_size: 4096, layer_norm_eps: 1e-12,
max_position_embeddings: 512, num_attention_heads: 16,
num_hidden_layers: 24, pad_token_id: 0, type_vocab_size: 2, vocab_size: 30522
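As a quick sanity check on the three configurations, the core sizes can be captured as plain dictionaries and the per-head attention dimension derived as hidden_size / num_attention_heads. This is an illustrative sketch (the dictionary names are not from the original gist), but the values are transcribed directly from the parameter lists above; all three models come out to 64-dimensional attention heads.

```python
# Core hyperparameters transcribed from the parameter lists above.
spanbert = {"hidden_size": 1024, "num_attention_heads": 16,
            "num_hidden_layers": 24, "intermediate_size": 4096,
            "vocab_size": 28996}
albert = {"hidden_size": 4096, "num_attention_heads": 64,
          "num_hidden_layers": 12, "intermediate_size": 16384,
          "vocab_size": 30000}
bert = {"hidden_size": 1024, "num_attention_heads": 16,
        "num_hidden_layers": 24, "intermediate_size": 4096,
        "vocab_size": 30522}

for name, cfg in [("SpanBERT", spanbert), ("ALBERT", albert), ("BERT", bert)]:
    # Per-head dimension: the hidden state is split evenly across heads.
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    print(f"{name}: per-head dimension = {head_dim}")
```

Note that ALBERT keeps the same 64-dim heads despite its much wider 4096-dim hidden state; it compensates with 64 heads and shares parameters across its 12 layers (num_hidden_groups: 1), which is why it stays practical at that width.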