The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is either a segment of text, or span, from the corresponding reading passage, or the question is unanswerable.
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
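To make the answerable/unanswerable split concrete, here is a minimal sketch that counts both kinds of question. It assumes the nested JSON layout of the released v2.0 files (`data`, `paragraphs`, `qas`, and a boolean `is_impossible` flag on each question); the `count_questions` helper and the tiny `sample` dictionary are hypothetical illustrations, not real SQuAD data.

```python
import json


def count_questions(dataset):
    """Return (answerable, unanswerable) counts for a SQuAD2.0-style dict."""
    answerable = 0
    unanswerable = 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # SQuAD1.1-style entries have no flag, so default to answerable.
                if qa.get("is_impossible", False):
                    unanswerable += 1
                else:
                    answerable += 1
    return answerable, unanswerable


# Hand-written sample in the assumed layout (illustrative only).
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The tower was completed in 1889.",
            "qas": [
                {"id": "q1", "question": "When was the tower completed?",
                 "answers": [{"text": "1889", "answer_start": 27}],
                 "is_impossible": False},
                {"id": "q2", "question": "Who painted the tower in 1901?",
                 "answers": [], "is_impossible": True},
            ],
        }],
    }],
}

counts = count_questions(sample)
print(counts)
```

For a downloaded file, the same function could be applied to `json.load(open(path))`, where `path` points at your local copy of the dev set.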
Explore SQuAD2.0 and model predictions
SQuAD2.0 paper (Rajpurkar & Jia et al. '18)
SQuAD1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles.
Explore SQuAD1.1 and model predictions
SQuAD1.0 paper (Rajpurkar et al. '16)
We've built a few resources to help you get started with the dataset.
Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):
To evaluate your models, we have also made available the evaluation script we will use for official evaluation, along with a sample prediction file that the script will take as input. To run the evaluation, use `python evaluate-v2.0.py <path_to_dev-v2.0> <path_to_predictions>`.
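The following sketch shows how the two leaderboard metrics, Exact Match (EM) and token-level F1, can be computed for a single question. It follows the commonly described normalization (lowercasing, then stripping punctuation and English articles) and is not a copy of the official script; the function names are assumptions made for illustration. It also assumes a prediction is a plain answer string, with the empty string meaning that the system abstains.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Harmonic mean of token precision and recall after normalization."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # Unanswerable case: full credit only if both sides are empty.
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_golds(metric, golds, pred):
    """Score against several reference answers and keep the best match."""
    # An unanswerable question is treated as having one empty reference.
    return max(metric(gold, pred) for gold in (golds or [""]))


print(exact_match("The Eiffel Tower", "eiffel tower!"))
print(f1_score("Eiffel Tower in Paris", "Eiffel Tower"))
```

Averaging these per-question values over the whole dev set and multiplying by 100 gives numbers on the same scale as the EM and F1 columns of the leaderboard.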
Once you have built a model that works to your expectations on the dev set, you can submit it to get official scores on the dev set and a hidden test set. To preserve the integrity of test results, we do not release the test set to the public. Instead, we require you to submit your model so that we can run it on the test set for you. Here's a tutorial walking you through official evaluation of your model:
Submission Tutorial
Because SQuAD is an ongoing effort, we expect the dataset to evolve.
To keep up to date with major changes to the dataset, please subscribe:
Ask us questions at our Google group or at [email protected] and [email protected].
SQuAD2.0 tests the ability of a system to not only answer reading comprehension questions, but also abstain when presented with a question that cannot be answered based on the provided paragraph.
| Rank and date | Model and affiliation | EM | F1 |
|---|---|---|---|
| | Human Performance Stanford University (Rajpurkar & Jia et al. '18) | 86.831 | 89.452 |
| 1 Nov 06, 2019 | ALBERT + DAAF + Verifier (ensemble) PINGAN Omni-Sinitic | 90.002 | 92.425 |
| 2 Sep 18, 2019 | ALBERT (ensemble model) Google Research & TTIC https://arxiv.org/abs/1909.11942 | 89.731 | 92.215 |
| 3 Dec 08, 2019 | ALBERT+Entailment DA (ensemble) CloudWalk | 88.761 | 91.745 |
| 4 Jul 22, 2019 | XLNet + DAAF + Verifier (ensemble) PINGAN Omni-Sinitic | 88.592 | 90.859 |
| 4 Nov 22, 2019 | albert+verifier (single model) Ping An Life Insurance Company AI Team | 88.355 | 91.019 |
| 4 Dec 08, 2019 | ALBERT+Entailment DA Verifier (single model) CloudWalk | 87.847 | 91.265 |
| 4 Sep 16, 2019 | ALBERT (single model) Google Research & TTIC https://arxiv.org/abs/1909.11942 | 88.107 | 90.902 |
| 4 Jul 26, 2019 | UPM (ensemble) Anonymous | 88.231 | 90.713 |
| 5 Aug 04, 2019 | XLNet + SG-Net Verifier (ensemble) Shanghai Jiao Tong University & CloudWalk https://arxiv.org/abs/1908.05147 | 88.174 | 90.702 |
| 6 Nov 15, 2019 | XLNet (single model) Google Brain & CMU | 87.926 | 90.689 |
| 7 Aug 04, 2019 | XLNet + SG-Net Verifier++ (single model) Shanghai Jiao Tong University & CloudWalk https://arxiv.org/abs/1908.05147 | 87.238 | 90.071 |
| 8 Jul 26, 2019 | UPM (single model) Anonymous | 87.193 | 89.934 |
| 8 Nov 27, 2019 | RoBERTa+Verify (ensemble) CW | 86.933 | 90.037 |
| 8 Mar 20, 2019 | BERT + DAE + AoA (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 87.147 | 89.474 |
| 8 Jul 20, 2019 | RoBERTa (single model) Facebook AI | 86.820 | 89.795 |
| 9 Nov 12, 2019 | RoBERTa+Verify (single model) CW | 86.448 | 89.586 |
| 9 Mar 15, 2019 | BERT + ConvLSTM + MTL + Verifier (ensemble) Layer 6 AI | 86.730 | 89.286 |
| 10 Mar 05, 2019 | BERT + N-Gram Masking + Synthetic Self-Training (ensemble) Google AI Language https://github.com/google-research/bert | 86.673 | 89.147 |
| 11 Oct 16, 2019 | Xlnet+Verifier single model | 86.594 | 89.082 |
| 12 Aug 30, 2019 | Xlnet+Verifier (single model) Ping An Life Insurance Company AI Team | 86.572 | 89.063 |
| 12 Dec 09, 2019 | XLNET-V2-123+ (single model) MST/EOI http://tia.today | 86.403 | 89.148 |
| 13 May 21, 2019 | XLNet (single model) Google Brain & CMU | 86.346 | 89.133 |
| 14 May 14, 2019 | SG-Net (ensemble) Shanghai Jiao Tong University https://arxiv.org/abs/1908.05147 | 86.211 | 88.848 |
| 14 Apr 13, 2019 | SemBERT (ensemble) Shanghai Jiao Tong University https://arxiv.org/abs/1909.02209 | 86.166 | 88.886 |
| 14 Sep 29, 2019 | BERTSP (single model) NEUKG http://www.techkg.cn/--please | 85.838 | 88.921 |
| 14 Mar 16, 2019 | BERT + DAE + AoA (single model) Joint Laboratory of HIT and iFLYTEK Research | 85.884 | 88.621 |
| 14 Jul 22, 2019 | SpanBERT (single model) FAIR & UW | 85.748 | 88.709 |
| 15 May 14, 2019 | SG-Net (single model) Shanghai Jiao Tong University https://arxiv.org/abs/1908.05147 | 85.229 | 87.926 |
| 15 Mar 13, 2019 | BERT + ConvLSTM + MTL + Verifier (single model) Layer 6 AI | 84.924 | 88.204 |
| 15 Mar 05, 2019 | BERT + N-Gram Masking + Synthetic Self-Training (single model) Google AI Language https://github.com/google-research/bert | 85.150 | 87.715 |
| 15 Jun 19, 2019 | BNDVnet (single model) PAOS | 85.003 | 87.833 |
| 15 Jan 15, 2019 | BERT + MMFT + ADA (ensemble) Microsoft Research Asia | 85.082 | 87.615 |
| 15 Apr 11, 2019 | SemBERT (single model) Shanghai Jiao Tong University https://arxiv.org/abs/1909.02209 | 84.800 | 87.864 |
| 15 Sep 13, 2019 | xlnet (single model) VerifiedXiaoPAI | 84.642 | 88.000 |
| 15 Apr 16, 2019 | Insight-baseline-BERT (single model) PAII Insight Team | 84.834 | 87.644 |
| 16 Sep 03, 2019 | Hanvon_model (single model) Hanvon_WuHan | 84.721 | 87.117 |
| 17 Jan 10, 2019 | BERT + Synthetic Self-Training (ensemble) Google AI Language https://github.com/google-research/bert | 84.292 | 86.967 |
| 18 Nov 08, 2019 | BERT + Multiple-CNN (ensemble) Kyonggi University (ICL) & KISTI | 84.202 | 86.767 |
| 19 Jul 22, 2019 | Tuned BERT-1seq Large Cased (single model) FAIR & UW | 83.751 | 86.594 |
| 20 Mar 20, 2019 | Bert-raw (ensemble) None | 83.604 | 86.036 |
| 20 Dec 13, 2018 | BERT finetune baseline (ensemble) Anonymous | 83.536 | 86.096 |
| 20 Dec 21, 2018 | PAML+BERT (ensemble model) PINGAN GammaLab | 83.457 | 86.122 |
| 20 Dec 16, 2018 | Lunet + Verifier + BERT (ensemble) Layer 6 AI NLP Team | 83.469 | 86.043 |
| 21 Dec 15, 2018 | Lunet + Verifier + BERT (single model) Layer 6 AI NLP Team | 82.995 | 86.035 |
| 21 Jun 20, 2019 | SENSEFORTH + BERT single https://senseforth.ai | 83.142 | 85.873 |
| 21 Jan 14, 2019 | BERT + MMFT + ADA (single model) Microsoft Research Asia | 83.040 | 85.892 |
| 21 May 14, 2019 | ATB (single model) Anonymous | 82.882 | 86.002 |
| 21 Feb 16, 2019 | Bert-raw (ensemble) None | 83.175 | 85.635 |
| 21 Feb 26, 2019 | BERT with Something (ensemble) Anonymous | 83.051 | 85.737 |
| 21 Jan 10, 2019 | BERT + Synthetic Self-Training (single model) Google AI Language https://github.com/google-research/bert | 82.972 | 85.810 |
| 21 Jul 22, 2019 | Tuned BERT Large Cased (single model) FAIR & UW | 82.803 | 85.863 |
| 21 Mar 11, 2019 | Bert-raw (ensemble) None | 83.119 | 85.510 |
| 21 Feb 15, 2019 | BERT + NeurQuRI (ensemble) 2SAH | 82.803 | 85.703 |
| 22 Feb 27, 2019 | BERT + NeurQuRI (ensemble) 2SAH | 82.713 | 85.584 |
| 22 May 13, 2019 | BERT-Base + QA Pre-training (single model) Anonymous | 82.724 | 85.491 |
| 22 Dec 16, 2018 | PAML+BERT (single model) PINGAN GammaLab | 82.577 | 85.603 |
| 23 Nov 16, 2018 | AoA + DA + BERT (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 82.374 | 85.310 |
| 24 Dec 12, 2018 | BERT finetune baseline (single model) Anonymous | 82.126 | 84.820 |
| 24 Feb 28, 2019 | BERT_s (single model) Anonymous | 81.979 | 84.846 |
| 24 Dec 10, 2018 | Candi-Net+BERT (ensemble) 42Maru NLP Team | 82.126 | 84.624 |
| 25 Feb 28, 2019 | BERT-large+UBFT (single model) anonymous | 81.573 | 84.535 |
| 26 Feb 15, 2019 | BERT + NeurQuRI (single model) 2SAH | 81.257 | 84.342 |
| 26 Feb 25, 2019 | BERT with Something (single model) Anonymous | 81.110 | 84.386 |
| 26 Nov 16, 2018 | AoA + DA + BERT (single model) Joint Laboratory of HIT and iFLYTEK Research | 81.178 | 84.251 |
| 27 Mar 20, 2019 | Bert-raw (single) None | 80.693 | 83.922 |
| 27 Mar 07, 2019 | BERT + UnAnsQ (single model) Anonymous | 80.749 | 83.851 |
| 28 Dec 19, 2018 | Candi-Net+BERT (single model) 42Maru NLP Team | 80.659 | 83.562 |
| 29 Jan 22, 2019 | BERT + NeurQuRI (single model) 2SAH | 80.591 | 83.391 |
| 29 Nov 11, 2019 | BERTlarge (ensemble) SAIL | 80.456 | 83.509 |
| 30 Mar 11, 2019 | Bert-raw (single) None | 80.411 | 83.457 |
| 31 Feb 16, 2019 | Bert-raw (single model) None | 80.343 | 83.243 |
| 31 May 28, 2019 | Bert Single Model https://senseforth.ai | 80.422 | 83.118 |
| 31 Apr 04, 2019 | BISAN-CC (single model) Seoul National University & Hyundai Motors | 80.208 | 83.149 |
| 31 Dec 03, 2018 | PwP+BERT (single model) AITRICS | 80.117 | 83.189 |
| 31 Dec 05, 2018 | Candi-Net+BERT (single model) 42Maru NLP Team | 80.388 | 82.908 |
| 31 Jul 22, 2019 | Original BERT Large Cased (single model) FAIR & UW | 79.971 | 83.266 |
| 31 Feb 19, 2019 | BERT + UDA (single model) Anonymous | 80.005 | 83.208 |
| 32 Apr 10, 2019 | bert (single model) vinda msqjmxx | 79.971 | 83.184 |
| 32 Feb 28, 2019 | ST_bl single model | 80.140 | 82.962 |
| 32 Nov 08, 2018 | BERT (single model) Google AI Language | 80.005 | 83.061 |
| 33 Feb 12, 2019 | BERT + Sparse-Transformer single model | 79.948 | 83.023 |
| 34 Mar 07, 2019 | BERT uncased (single model) Anonymous | 79.745 | 83.020 |
| 34 Dec 06, 2018 | NEXYS_BASE (single model) NEXYS, DGIST R7 | 79.779 | 82.912 |
| 35 Feb 01, 2019 | {bert-finetuning} (single model) ksai | 79.632 | 82.852 |
| 36 Nov 09, 2018 | L6Net + BERT (single model) Layer 6 AI | 79.181 | 82.259 |
| 36 Mar 14, 2019 | {Anonymous} (single model) Anonymous | 78.876 | 82.524 |
| 37 Apr 24, 2019 | BERT + WIAN (ensemble) Infosys Limited | 78.650 | 81.497 |
| 37 Nov 11, 2019 | BERTlarge (single model) SAIL | 78.650 | 81.474 |
| 37 Mar 14, 2019 | BISAN (single model) Seoul National University & Hyundai Motors | 78.481 | 81.531 |
| 38 Dec 26, 2019 | BERT-Large-Cased (single model) sysu | 78.357 | 81.500 |
| 39 Dec 14, 2018 | BERT+AC (single model) Hithink RoyalFlush | 78.052 | 81.174 |
| 40 Nov 06, 2018 | SLQA+BERT (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158 | 77.003 | 80.209 |
| 41 Jan 05, 2019 | synss (single model) bert_finetune | 76.055 | 79.329 |
| 42 Dec 18, 2018 | ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai | 74.746 | 78.227 |
| 42 Nov 05, 2018 | MIR-MRC(F-Net) (single model) Kangwon National University, Natural Language Processing Lab. & ForceWin, KP Lab. | 74.791 | 77.988 |
| 43 May 23, 2019 | {BERTcw} (single model) private | 74.385 | 77.308 |
| 44 Sep 13, 2018 | nlnet (single model) Microsoft Research Asia | 74.272 | 77.052 |
| 45 Dec 29, 2018 | MMIPN Single | 73.505 | 76.424 |
| 46 Apr 20, 2019 | BERT-Base (single model) Dining Philosophers | 73.099 | 76.236 |
| 47 Oct 12, 2018 | YARCS (ensemble) IBM Research AI | 72.670 | 75.507 |
| 48 Nov 14, 2018 | BERT+Answer Verifier (single model) Pingan Tech Olatop Lab | 71.666 | 75.457 |
| 49 Sep 17, 2018 | Unet (ensemble) Fudan University & Liulishuo Lab https://arxiv.org/abs/1810.06638 | 71.417 | 74.869 |
| 49 Apr 24, 2019 | BERT-Base (single) GreenflyAI https://greenfly.ai | 71.699 | 74.430 |
| 49 Aug 15, 2018 | Reinforced Mnemonic Reader + Answer Verifier (single model) NUDT https://arxiv.org/abs/1808.05759 | 71.767 | 74.295 |
| 49 Aug 28, 2018 | SLQA+ (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158 | 71.462 | 74.434 |
| 49 Jan 19, 2019 | {BERT-base} (single-model) Anonymous | 70.763 | 74.449 |
| 49 Sep 14, 2018 | SAN (ensemble model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556 | 71.316 | 73.704 |
| 50 Aug 21, 2018 | FusionNet++ (ensemble) Microsoft Business Applications Group AI Research https://arxiv.org/abs/1711.07341 | 70.300 | 72.484 |
| 50 Sep 26, 2018 | Multi-Level Attention Fusion(MLAF) (single model) Chonbuk National University, Cognitive Computing Lab. | 69.476 | 72.857 |
| 51 Sep 14, 2018 | Unet (single model) Fudan University & Liulishuo Lab | 69.262 | 72.642 |
| 52 Dec 20, 2018 | DocQA + NeurQuRI (single model) 2SAH | 68.766 | 71.662 |
| 53 Aug 21, 2018 | SAN (single model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556 | 68.653 | 71.439 |
| 53 Sep 13, 2018 | BiDAF++ with pair2vec (single model) UW and FAIR | 68.021 | 71.583 |
| 53 Jun 24, 2018 | KACTEIL-MRC(GFN-Net) (single model) Kangwon National University, Natural Language Processing Lab. | 68.213 | 70.878 |
| 53 Jul 13, 2018 | VS^3-NET (single model) Kangwon National University in South Korea | 67.897 | 70.884 |
| 54 Jan 01, 2019 | EBB-Net (single model) Enliple AI | 66.610 | 70.303 |
| 55 Jun 25, 2018 | KakaoNet2 (single model) Kakao NLP Team | 65.719 | 69.381 |
| 56 Sep 13, 2018 | BiDAF++ (single model) UW and FAIR | 65.651 | 68.866 |
| 56 Jul 11, 2018 | abcNet (single model) Fudan University & Liulishuo AI Lab | 65.256 | 69.206 |
| 57 Jun 27, 2018 | BSAE AddText (single model) reciTAL.ai | 63.338 | 67.422 |
| 58 Aug 14, 2018 | eeAttNet (single model) BBD NLP Team https://www.bbdservice.com | 63.327 | 66.633 |
| 58 May 30, 2018 | BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence [modified by Stanford] | 63.372 | 66.251 |
| 59 May 30, 2018 | BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence [modified by Stanford] | 59.332 | 62.305 |
| 60 May 30, 2018 | BiDAF-No-Answer (single model) University of Washington [modified by Stanford] | 59.174 | 62.093 |
| 60 Nov 27, 2018 | Tree-LSTM + BiDAF + ELMo (single model) Carnegie Mellon University | 57.707 | 62.341 |
Here are the Exact Match (EM) and F1 scores evaluated on the test set of SQuAD v1.1.
| Rank and date | Model and affiliation | EM | F1 |
|---|---|---|---|
| | Human Performance Stanford University (Rajpurkar et al. '16) | 82.304 | 91.221 |
| 1 May 21, 2019 | XLNet (single model) Google Brain & CMU | 89.898 | 95.080 |
| 2 Aug 11, 2019 | XLNET-123 (single model) MST/EOI | 89.646 | 94.930 |
| 3 Sep 25, 2019 | BERTSP (single model) NEUKG http://www.techkg.cn/ | 88.912 | 94.584 |
| 3 Jul 21, 2019 | SpanBERT (single model) FAIR & UW | 88.839 | 94.635 |
| 4 Jul 03, 2019 | BERT+WWM+MT (single model) Xiaoi Research | 88.650 | 94.393 |
| 5 Jul 21, 2019 | Tuned BERT-1seq Large Cased (single model) FAIR & UW | 87.465 | 93.294 |
| 6 Oct 05, 2018 | BERT (ensemble) Google AI Language https://arxiv.org/abs/1810.04805 | 87.433 | 93.160 |
| 7 May 14, 2019 | ATB (single model) Anonymous | 86.940 | 92.641 |
| 8 Jul 21, 2019 | Tuned BERT Large Cased (single model) FAIR & UW | 86.521 | 92.617 |
| 8 Jul 04, 2019 | BERT+MT (single model) Xiaoi Research | 86.458 | 92.645 |
| 9 Feb 14, 2019 | KT-NET (single model) Baidu NLP | 85.944 | 92.425 |
| 9 Sep 26, 2018 | nlnet (ensemble) Microsoft Research Asia | 85.954 | 91.677 |
| 9 Feb 28, 2019 | ST_bl single model | 85.430 | 91.976 |
| 10 Nov 21, 2019 | EL-BERT (single model) YeonTaek Oh | 85.335 | 91.807 |
| 11 Mar 14, 2019 | BISAN (single model) Seoul National University & Hyundai Motors | 85.314 | 91.756 |
| 11 Jun 03, 2019 | DPN (single model) Anonymous | 84.978 | 92.019 |
| 11 Oct 05, 2018 | BERT (single model) Google AI Language https://arxiv.org/abs/1810.04805 | 85.083 | 91.835 |
| 11 Jul 10, 2019 | BERT-uncased (single model) Anonymous | 84.926 | 91.932 |
| 11 Feb 16, 2019 | BERT+Sparse-Transformer single model | 85.125 | 91.623 |
| 11 Sep 09, 2018 | nlnet (ensemble) Microsoft Research Asia | 85.356 | 91.202 |
| 11 Jul 21, 2019 | Original BERT Large Cased (single model) FAIR & UW | 84.328 | 91.281 |
| 11 Feb 19, 2019 | WD (single model) Anonymous | 84.402 | 90.561 |
| 11 Jul 11, 2018 | QANet (ensemble) Google Brain & CMU | 84.454 | 90.490 |
| 11 Apr 21, 2019 | Common-sense Governed BERT-123 (single model) Jerry AGI Ragtag | 83.930 | 90.613 |
| 12 Feb 21, 2019 | WD1 (single model) Anonymous | 83.804 | 90.429 |
| 12 Jul 08, 2018 | r-net (ensemble) Microsoft Research Asia | 84.003 | 90.147 |
| 12 May 08, 2019 | Common-sense Governed BERT-123 (single model) MST/EOI | 82.943 | 91.074 |
| 12 Jun 20, 2018 | MARS (ensemble) YUANFUDAO research NLP | 83.982 | 89.796 |
| 13 Mar 19, 2018 | QANet (ensemble) Google Brain & CMU | 83.877 | 89.737 |
| 13 Sep 09, 2018 | nlnet (single model) Microsoft Research Asia | 83.468 | 90.133 |
| 14 Sep 01, 2018 | MARS (single model) YUANFUDAO research NLP | 83.185 | 89.547 |
| 15 Jun 21, 2018 | MARS (single model) YUANFUDAO research NLP | 83.122 | 89.224 |
| 16 Mar 06, 2018 | QANet (ensemble) Google Brain & CMU | 82.744 | 89.045 |
| 16 Jun 20, 2018 | QANet (single) Google Brain & CMU | 82.471 | 89.306 |
| 16 Jan 22, 2018 | Hybrid AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 82.482 | 89.281 |
| 16 Feb 19, 2018 | Reinforced Mnemonic Reader + A2D (ensemble model) Microsoft Research Asia & NUDT | 82.849 | 88.764 |
| 16 May 09, 2018 | MARS (single model) YUANFUDAO research NLP | 82.587 | 88.880 |
| 16 Jan 03, 2018 | r-net+ (ensemble) Microsoft Research Asia | 82.650 | 88.493 |
| 16 Jan 05, 2018 | SLQA+ (ensemble) Alibaba iDST NLP | 82.440 | 88.607 |
| 16 Jul 14, 2019 | BERT (single model) KTNET | 82.062 | 88.947 |
| 16 Feb 27, 2018 | QANet (single model) Google Brain & CMU | 82.209 | 88.608 |
| 16 Feb 02, 2018 | Reinforced Mnemonic Reader (ensemble model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 82.283 | 88.533 |
| 16 Dec 23, 2018 | MMIPN Single | 81.580 | 88.948 |
| 16 Dec 17, 2017 | r-net (ensemble) Microsoft Research Asia http://aka.ms/rnet | 82.136 | 88.126 |
| 16 Dec 17, 2018 | ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai | 81.307 | 88.909 |
| 16 Dec 22, 2017 | AttentionReader+ (ensemble) Tencent DPDAC NLP | 81.790 | 88.163 |
| 17 May 09, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 81.538 | 88.130 |
| 17 Apr 23, 2018 | r-net (single model) Microsoft Research Asia | 81.391 | 88.170 |
| 17 May 09, 2018 | Reinforced Mnemonic Reader + A2D + DA (single model) Microsoft Research Asia & NUDT | 81.401 | 88.122 |
| 17 Apr 03, 2018 | KACTEIL-MRC(GF-Net+) (ensemble) Kangwon National University, Natural Language Processing Lab. | 81.496 | 87.557 |
| 17 Feb 27, 2018 | QANet (single model) Google Brain & CMU | 80.929 | 87.773 |
| 17 Nov 17, 2017 | BiDAF + Self Attention + ELMo (ensemble) Allen Institute for Artificial Intelligence | 81.003 | 87.432 |
| 17 Feb 19, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 80.919 | 87.492 |
| 18 Feb 12, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 80.489 | 87.454 |
| 18 Apr 12, 2018 | AVIQA+ (ensemble) aviqa team | 80.615 | 87.311 |
| 19 Jan 13, 2018 | SLQA+ single model | 80.436 | 87.021 |
| 19 Jan 04, 2018 | {EAZI} (ensemble) Yiwise NLP Group | 80.436 | 86.912 |
| 19 Jan 12, 2018 | EAZI+ (ensemble) Yiwise NLP Group | 80.426 | 86.912 |
| 19 Jan 22, 2018 | Hybrid AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research | 80.027 | 87.288 |
| 19 Mar 20, 2018 | DNET (ensemble) QA geeks | 80.164 | 86.721 |
| 20 Feb 12, 2018 | BiDAF + Self Attention + ELMo + A2D (single model) Microsoft Research Asia & NUDT | 79.996 | 86.711 |
| 21 Jan 03, 2018 | r-net+ (single model) Microsoft Research Asia | 79.901 | 86.536 |
| 21 Feb 23, 2018 | MAMCN+ (single model) Samsung Research | 79.692 | 86.727 |
| 22 Jan 29, 2018 | Reinforced Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 79.545 | 86.654 |
| 22 Dec 05, 2017 | SAN (ensemble model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556 | 79.608 | 86.496 |
| 22 Dec 28, 2017 | SLQA+ (single model) Alibaba iDST NLP | 79.199 | 86.590 |
| 23 Oct 17, 2017 | Interactive AoA Reader+ (ensemble) Joint Laboratory of HIT and iFLYTEK | 79.083 | 86.450 |
| 23 Nov 05, 2018 | MIR-MRC(F-Net) (single model) ForceWin, KP Lab. | 79.083 | 86.288 |
| 24 Jun 01, 2018 | MDReader single model | 79.031 | 86.006 |
| 24 Oct 24, 2017 | FusionNet (ensemble) Microsoft Business AI Solutions Team https://arxiv.org/abs/1711.07341 | 78.978 | 86.016 |
| 25 Oct 22, 2017 | DCN+ (ensemble) Salesforce Research https://arxiv.org/abs/1711.00106 | 78.852 | 85.996 |
| 26 Mar 29, 2018 | KACTEIL-MRC(GF-Net+) (single model) Kangwon National University, Natural Language Processing Lab. | 78.664 | 85.780 |
| 26 Nov 03, 2017 | BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence | 78.580 | 85.833 |
| 27 May 09, 2018 | KakaoNet (single model) Kakao NLP Team | 78.401 | 85.724 |
| 28 Nov 30, 2017 | SLQA (ensemble) Alibaba iDST NLP | 78.328 | 85.682 |
| 28 Mar 19, 2018 | aviqa (ensemble) aviqa team | 78.496 | 85.469 |
| 28 Jan 02, 2018 | Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504 | 78.433 | 85.517 |
| 28 Sep 18, 2018 | BiDAF++ with pair2vec (single model) UW and FAIR | 78.223 | 85.535 |
| 28 Jun 01, 2018 | MDReader0 single model | 78.171 | 85.543 |
| 28 Jan 03, 2018 | MEMEN (single model) Zhejiang University https://arxiv.org/abs/1707.09098 | 78.234 | 85.344 |
| 28 Jan 29, 2018 | test single | 78.087 | 85.348 |
| 29 Jul 25, 2017 | Interactive AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 77.845 | 85.297 |
| 30 Mar 20, 2018 | DNET (single model) QA geeks | 77.646 | 84.905 |
| 31 Sep 18, 2018 | BiDAF++ (single model) UW and FAIR | 77.573 | 84.858 |
| 31 Dec 06, 2017 | AttentionReader+ (single) Tencent DPDAC NLP | 77.342 | 84.925 |
| 31 Dec 13, 2017 | RaSoR + TR + LM (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609 | 77.583 | 84.163 |
| 31 Dec 21, 2017 | Jenga (ensemble) Facebook AI Research | 77.237 | 84.466 |
| 31 Nov 06, 2017 | Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504 | 76.996 | 84.630 |
| 31 Jan 23, 2018 | MARS (single model) YUANFUDAO research NLP | 76.859 | 84.739 |
| 32 May 14, 2018 | VS^3-NET (single model) Kangwon National University in South Korea | 76.775 | 84.491 |
| 32 Nov 01, 2017 | SAN (single model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556 | 76.828 | 84.396 |
| 32 Sep 26, 2018 | {gqa} (single model) FAIR | 77.090 | 83.931 |
| 32 Dec 19, 2017 | FRC (single model) in review | 76.240 | 84.599 |
| 32 Oct 13, 2017 | r-net (single model) Microsoft Research Asia http://aka.ms/rnet | 76.461 | 84.265 |
| 33 Oct 22, 2017 | Conductor-net (ensemble) CMU | 76.146 | 83.991 |
| 34 Sep 08, 2017 | FusionNet (single model) Microsoft Business AI Solutions team https://arxiv.org/abs/1711.07341 | 75.968 | 83.900 |
| 35 Oct 22, 2017 | Interactive AoA Reader+ (single model) Joint Laboratory of HIT and iFLYTEK | 75.821 | 83.843 |
| 35 Oct 18, 2018 | KAR (single model) York University https://arxiv.org/abs/1809.03449 | 76.125 | 83.538 |
| 36 Jul 14, 2017 | smarnet (ensemble) Eigen Technology & Zhejiang University | 75.989 | 83.475 |
| 37 Mar 15, 2018 | AVIQA-v2 (single model) aviqa team | 75.926 | 83.305 |
| 38 Aug 18, 2017 | RaSoR + TR (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609 | 75.789 | 83.261 |
| 39 Oct 23, 2017 | DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106 | 75.087 | 83.081 |
| 39 Nov 01, 2017 | Mixed model (ensemble) Sean | 75.265 | 82.769 |
| 39 May 21, 2017 | MEMEN (ensemble) Eigen Technology & Zhejiang University https://arxiv.org/abs/1707.09098 | 75.370 | 82.658 |
| 39 Nov 17, 2017 | two-attention-self-attention (ensemble) guotong1988 | 75.223 | 82.716 |
| 39 Jul 10, 2017 | DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106 | 74.866 | 82.806 |
| 39 Mar 09, 2017 | ReasoNet (ensemble) MSR Redmond https://arxiv.org/abs/1609.05284 | 75.034 | 82.552 |
| 39 Oct 31, 2017 | SLQA (single model) Alibaba iDST NLP | 74.489 | 82.815 |
| 39 Feb 06, 2018 | Jenga (single model) Facebook AI Research | 74.373 | 82.845 |
| 39 Jan 02, 2018 | Conductor-net (single model) CMU https://arxiv.org/abs/1710.10504 | 74.405 | 82.742 |
| 39 Aug 14, 2018 | eeAttNet (single model) BBD NLP Team https://www.bbdservice.com | 74.604 | 82.501 |
| 40 Feb 13, 2018 | SSR-BiDAF ensemble model | 74.541 | 82.477 |
| 41 Jul 14, 2017 | Mnemonic Reader (ensemble) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 74.268 | 82.371 |
| 42 Dec 23, 2017 | S^3-Net (ensemble) Kangwon National University in South Korea | 74.121 | 82.342 |
| 43 Jul 29, 2017 | SEDT (ensemble model) CMU https://arxiv.org/abs/1703.00572 | 74.090 | 81.761 |
| 44 Jul 06, 2017 | SSAE (ensemble) Tsinghua University | 74.080 | 81.665 |
| 44 Jul 25, 2017 | Interactive AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research | 73.639 | 81.931 |
| 44 Feb 22, 2017 | BiDAF (ensemble) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603 | 73.744 | 81.525 |
| 44 Apr 22, 2017 | SEDT+BiDAF (ensemble) CMU https://arxiv.org/abs/1703.00572 | 73.723 | 81.530 |
| 44 Nov 06, 2017 | Conductor-net (single) CMU https://arxiv.org/abs/1710.10504 | 73.240 | 81.933 |
| 44 Dec 14, 2017 | Jenga (single model) Facebook AI Research | 73.303 | 81.754 |
| 44 Jan 24, 2017 | Multi-Perspective Matching (ensemble) IBM Research https://arxiv.org/abs/1612.04211 | 73.765 | 81.257 |
| 44 May 01, 2017 | jNet (ensemble) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617 | 73.010 | 81.517 |
| 45 Oct 22, 2017 | Conductor-net (single) CMU | 72.590 | 81.415 |
| 45 Apr 12, 2017 | T-gating (ensemble) Peking University | 72.758 | 81.001 |
| 45 Nov 16, 2017 | two-attention-self-attention (single model) guotong1988 | 72.600 | 81.011 |
| 45 Sep 20, 2017 | BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence https://arxiv.org/abs/1710.10723 | 72.139 | 81.048 |
| 45 Mar 03, 2018 | AVIQA (single model) aviqa team | 72.485 | 80.550 |
| 45 Dec 15, 2017 | S^3-Net (single model) Kangwon National University in South Korea | 71.908 | 81.023 |
| 46 Nov 06, 2017 | attention+self-attention (single model) guotong1988 | 71.698 | 80.462 |
| 47 Nov 01, 2016 | Dynamic Coattention Networks (ensemble) Salesforce Research https://arxiv.org/abs/1611.01604 | 71.625 | 80.383 |
| 47 Apr 13, 2017 | QFASE NUS | 71.898 | 79.989 |
| 47 Jul 14, 2017 | smarnet (single model) Eigen Technology & Zhejiang University https://arxiv.org/abs/1710.02772 | 71.415 | 80.160 |
| 48 Jul 14, 2017 | Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 70.995 | 80.146 |
| 48 May 23, 2018 | AttReader (single) College of Computer & Information Science, SouthWest University, Chongqing, China | 71.373 | 79.725 |
| 48 Apr 22, 2018 | MAMCN (single model) Samsung Research | 70.985 | 79.939 |
| 48 Oct 27, 2017 | M-NET (single) UFL | 71.016 | 79.835 |
| 49 Mar 24, 2017 | jNet (single model) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617 | 70.607 | 79.821 |
| 49 Apr 02, 2017 | Ruminating Reader (single model) New York University https://arxiv.org/abs/1704.07415 | 70.639 | 79.456 |
| 49 Mar 14, 2017 | Document Reader (single model) Facebook AI Research https://arxiv.org/abs/1704.00051 | 70.733 | 79.353 |
| 49 Mar 08, 2017 | ReasoNet (single model) MSR Redmond https://arxiv.org/abs/1609.05284 | 70.555 | 79.364 |
| 49 Dec 28, 2016 | FastQAExt German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816 | 70.849 | 78.857 |
| 49 May 13, 2017 | RaSoR (single model) Google NY, Tel-Aviv University https://arxiv.org/abs/1611.01436 | 70.849 | 78.741 |
| 49 Apr 14, 2017 | Multi-Perspective Matching (single model) IBM Research https://arxiv.org/abs/1612.04211 | 70.387 | 78.784 |
| 50 Aug 30, 2017 | SimpleBaseline (single model) Technical University of Vienna | 69.600 | 78.236 |
| 50 Feb 05, 2018 | SSR-BiDAF single model | 69.443 | 78.358 |
| 51 Apr 12, 2017 | SEDT+BiDAF (single model) CMU https://arxiv.org/abs/1703.00572 | 68.478 | 77.971 |
| 52 Jun 25, 2017 | PQMN (single model) KAIST & AIBrain & Crosscert | 68.331 | 77.783 |
| 53 Apr 12, 2017 | T-gating (single model) Peking University | 68.132 | 77.569 |
| 53 Jul 29, 2017 | SEDT (single model) CMU https://arxiv.org/abs/1703.00572 | 68.163 | 77.527 |
| 53 Dec 28, 2016 | FastQA German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816 | 68.436 | 77.070 |
| 53 Jan 22, 2018 | FABIR Single Model https://arxiv.org/abs/1810.09580 | 67.744 | 77.605 |
| 53 Nov 28, 2016 | BiDAF (single model) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603 | 67.974 | 77.323 |
| 54 Oct 26, 2016 | Match-LSTM with Ans-Ptr (Boundary) (ensemble) Singapore Management University https://arxiv.org/abs/1608.07905 | 67.901 | 77.022 |
| 54 Sep 19, 2017 | AllenNLP BiDAF (single model) Allen Institute for AI http://allennlp.org/ | 67.618 | 77.151 |
| 55 Feb 05, 2017 | Iterative Co-attention Network Fudan University | 67.502 | 76.786 |
| 56 Jan 03, 2018 | newtest single model | 66.527 | 75.787 |
| 56 Nov 01, 2016 | Dynamic Coattention Networks (single model) Salesforce Research https://arxiv.org/abs/1611.01604 | 66.233 | 75.896 |
| 57 Oct 26, 2016 | Match-LSTM with Bi-Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905 | 64.744 | 73.743 |
| 58 Sep 21, 2017 | OTF dict+spelling (single) University of Montreal https://arxiv.org/abs/1706.00286 | 64.083 | 73.056 |
| 58 Feb 19, 2017 | Attentive CNN context with LSTM NLPR, CASIA | 63.306 | 73.463 |
| 59 Nov 02, 2016 | Fine-Grained Gating Carnegie Mellon University https://arxiv.org/abs/1611.01724 | 62.446 | 73.327 |
| 59 Sep 21, 2017 | OTF spelling (single) University of Montreal https://arxiv.org/abs/1706.00286 | 62.897 | 72.016 |
| 60 Sep 21, 2017 | OTF spelling+lemma (single) University of Montreal https://arxiv.org/abs/1706.00286 | 62.604 | 71.968 |
| 61 Sep 28, 2016 | Dynamic Chunk Reader IBM https://arxiv.org/abs/1610.09996 | 62.499 | 70.956 |
| 61 Nov 15, 2019 | RQA+IDR (single model) Anonymous | 61.145 | 71.389 |
| 62 Aug 27, 2016 | Match-LSTM with Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905 | 60.474 | 70.695 |
| 63 Aug 27, 2016 | Match-LSTM with Ans-Ptr (Sentence) Singapore Management University https://arxiv.org/abs/1608.07905 | 54.505 | 67.748 |
| 63 Nov 15, 2019 | RQA (single model) Anonymous | 55.827 | 65.467 |
| 64 Aug 22, 2019 | UQA (single model) Anonymous | 53.698 | 64.036 |