The Stanford Question Answering Dataset

What is SQuAD?

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is either a segment of text, or span, from the corresponding reading passage, or the question is unanswerable.
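To make the span-based answer format concrete, here is a minimal sketch. It assumes the nesting used by the released JSON files (data, then paragraphs, then qas, then answers with a character offset), as best understood from the public files. The record itself is a made-up toy example, not taken from the dataset, and the helper names are hypothetical.

```python
# Toy record mirroring the assumed layout of the released JSON files.
# The passage, questions, and ids are invented for illustration only.
sample = {
    "version": "toy",
    "data": [{
        "title": "Example_Article",
        "paragraphs": [{
            "context": "The Amazon River flows through South America.",
            "qas": [
                {
                    "id": "q1",
                    "question": "Through which continent does the Amazon River flow?",
                    "is_impossible": False,
                    # answer_start is a character offset into the context.
                    "answers": [{"text": "South America", "answer_start": 31}],
                },
                {
                    "id": "q2",
                    "question": "How long is the Amazon River?",
                    "is_impossible": True,  # the passage does not say
                    "answers": [],
                },
            ],
        }],
    }],
}


def iter_questions(dataset):
    """Yield (context, qa) pairs from the nested structure."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield paragraph["context"], qa


def answer_spans(context, qa):
    """Recover each answer by slicing the context at its character offset."""
    return [
        context[a["answer_start"]: a["answer_start"] + len(a["text"])]
        for a in qa["answers"]
    ]


spans = {qa["id"]: answer_spans(ctx, qa) for ctx, qa in iter_questions(sample)}
unanswerable = [qa["id"] for _, qa in iter_questions(sample)
                if qa.get("is_impossible", False)]
```

Slicing the context at the stored offset should reproduce the answer text exactly, which is a cheap sanity check to run when loading the data.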

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
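One common way to implement abstention is to threshold a no-answer score. The sketch below assumes a model that produces, for each question, a best candidate span and a probability that the question is unanswerable. The function name, the input layout, and the 0.5 threshold are hypothetical. It also assumes that the prediction file is a JSON object mapping question ids to answer strings, with an empty string meaning "no answer".

```python
import json


def build_predictions(candidates, no_answer_threshold=0.5):
    """Map each question id to an answer string, or "" to abstain.

    candidates: {question_id: (best_span_text, no_answer_probability)}
    The threshold is an assumption; in practice it would be tuned on the dev set.
    """
    predictions = {}
    for question_id, (span_text, p_no_answer) in candidates.items():
        if p_no_answer > no_answer_threshold:
            predictions[question_id] = ""  # abstain
        else:
            predictions[question_id] = span_text
    return predictions


# Hypothetical model outputs for two questions.
candidates = {
    "q1": ("South America", 0.10),
    "q2": ("4,000 miles", 0.90),
}
predictions = build_predictions(candidates)

# Serialized form, ready to be written to a predictions file.
predictions_json = json.dumps(predictions)
```

Raising the threshold makes the system answer more often, trading missed abstentions for fewer wrongly skipped answerable questions.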

Explore SQuAD2.0 and model predictions

SQuAD2.0 paper (Rajpurkar & Jia et al. '18)

SQuAD 1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles.

Explore SQuAD1.1 and model predictions

SQuAD1.0 paper (Rajpurkar et al. '16)

Getting Started

We've built a few resources to help you get started with the dataset.

Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):

To evaluate your models, we have also made available the evaluation script we will use for official evaluation, along with a sample prediction file that the script will take as input. To run the evaluation, use python evaluate-v2.0.py <path_to_dev-v2.0> <path_to_predictions>.
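For intuition about the two reported metrics, the following is a simplified sketch of Exact Match and token-level F1. It is written in the spirit of the evaluation script, not copied from it, and should not be used in place of the official script. The normalization steps (lowercasing, stripping punctuation and English articles, collapsing whitespace) and the treatment of an empty gold answer list as "no answer" are assumptions about how the script behaves. All function names are hypothetical.

```python
import collections
import re
import string

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, prediction):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(gold) == normalize_answer(prediction))


def token_f1(gold, prediction):
    """Harmonic mean of token precision and recall after normalization."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(prediction).split()
    if not gold_tokens or not pred_tokens:
        # If either side is empty (no answer), score 1 only when both are.
        return float(gold_tokens == pred_tokens)
    overlap = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    num_same = sum(overlap.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score_question(gold_answers, prediction):
    """Best (EM, F1) over all gold answers; an empty list means unanswerable."""
    golds = gold_answers or [""]
    return (
        max(exact_match(g, prediction) for g in golds),
        max(token_f1(g, prediction) for g in golds),
    )
```

For example, `score_question(["in Paris France"], "Paris")` gives an EM of 0.0 and an F1 of 0.5, since the single predicted token is correct (precision 1) but covers only one of three gold tokens (recall 1/3).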

Once you have built a model that works to your expectations on the dev set, you can submit it to get official scores on the dev set and a hidden test set. To preserve the integrity of test results, we do not release the test set to the public. Instead, we require you to submit your model so that we can run it on the test set for you. Here's a tutorial walking you through official evaluation of your model:

Submission Tutorial

Because SQuAD is an ongoing effort, we expect the dataset to evolve.

To keep up to date with major changes to the dataset, please subscribe:

Have Questions?

Ask us questions at our Google Group or at [email protected] and [email protected].

Leaderboard

SQuAD2.0 tests the ability of a system to not only answer reading comprehension questions, but also abstain when presented with a question that cannot be answered based on the provided paragraph.

Rank (submission date)	Model (team, reference)	EM	F1
	Human Performance Stanford University (Rajpurkar & Jia et al. '18)	86.831	89.452
1 Nov 06, 2019	ALBERT + DAAF + Verifier (ensemble) PINGAN Omni-Sinitic	90.002	92.425
2 Sep 18, 2019	ALBERT (ensemble model) Google Research & TTIC https://arxiv.org/abs/1909.11942	89.731	92.215
3 Dec 08, 2019	ALBERT+Entailment DA (ensemble) CloudWalk	88.761	91.745
4 Jul 22, 2019	XLNet + DAAF + Verifier (ensemble) PINGAN Omni-Sinitic	88.592	90.859
4 Nov 22, 2019	albert+verifier (single model) Ping An Life Insurance Company AI Team	88.355	91.019
4 Dec 08, 2019	ALBERT+Entailment DA Verifier (single model) CloudWalk	87.847	91.265
4 Sep 16, 2019	ALBERT (single model) Google Research & TTIC https://arxiv.org/abs/1909.11942	88.107	90.902
4 Jul 26, 2019	UPM (ensemble) Anonymous	88.231	90.713
5 Aug 04, 2019	XLNet + SG-Net Verifier (ensemble) Shanghai Jiao Tong University & CloudWalk https://arxiv.org/abs/1908.05147	88.174	90.702
6 Nov 15, 2019	XLNet (single model) Google Brain & CMU	87.926	90.689
7 Aug 04, 2019	XLNet + SG-Net Verifier++ (single model) Shanghai Jiao Tong University & CloudWalk https://arxiv.org/abs/1908.05147	87.238	90.071
8 Jul 26, 2019	UPM (single model) Anonymous	87.193	89.934
8 Nov 27, 2019	RoBERTa+Verify (ensemble) CW	86.933	90.037
8 Mar 20, 2019	BERT + DAE + AoA (ensemble) Joint Laboratory of HIT and iFLYTEK Research	87.147	89.474
8 Jul 20, 2019	RoBERTa (single model) Facebook AI	86.820	89.795
9 Nov 12, 2019	RoBERTa+Verify (single model) CW	86.448	89.586
9 Mar 15, 2019	BERT + ConvLSTM + MTL + Verifier (ensemble) Layer 6 AI	86.730	89.286
10 Mar 05, 2019	BERT + N-Gram Masking + Synthetic Self-Training (ensemble) Google AI Language https://github.com/google-research/bert	86.673	89.147
11 Oct 16, 2019	Xlnet+Verifier single model	86.594	89.082
12 Aug 30, 2019	Xlnet+Verifier (single model) Ping An Life Insurance Company AI Team	86.572	89.063
12 Dec 09, 2019	XLNET-V2-123+ (single model) MST/EOI http://tia.today	86.403	89.148
13 May 21, 2019	XLNet (single model) Google Brain & CMU	86.346	89.133
14 May 14, 2019	SG-Net (ensemble) Shanghai Jiao Tong University https://arxiv.org/abs/1908.05147	86.211	88.848
14 Apr 13, 2019	SemBERT (ensemble) Shanghai Jiao Tong University https://arxiv.org/abs/1909.02209	86.166	88.886
14 Sep 29, 2019	BERTSP (single model) NEUKG http://www.techkg.cn/--please	85.838	88.921
14 Mar 16, 2019	BERT + DAE + AoA (single model) Joint Laboratory of HIT and iFLYTEK Research	85.884	88.621
14 Jul 22, 2019	SpanBERT (single model) FAIR & UW	85.748	88.709
15 May 14, 2019	SG-Net (single model) Shanghai Jiao Tong University https://arxiv.org/abs/1908.05147	85.229	87.926
15 Mar 13, 2019	BERT + ConvLSTM + MTL + Verifier (single model) Layer 6 AI	84.924	88.204
15 Mar 05, 2019	BERT + N-Gram Masking + Synthetic Self-Training (single model) Google AI Language https://github.com/google-research/bert	85.150	87.715
15 Jun 19, 2019	BNDVnet (single model) PAOS	85.003	87.833
15 Jan 15, 2019	BERT + MMFT + ADA (ensemble) Microsoft Research Asia	85.082	87.615
15 Apr 11, 2019	SemBERT (single model) Shanghai Jiao Tong University https://arxiv.org/abs/1909.02209	84.800	87.864
15 Sep 13, 2019	xlnet (single model) VerifiedXiaoPAI	84.642	88.000
15 Apr 16, 2019	Insight-baseline-BERT (single model) PAII Insight Team	84.834	87.644
16 Sep 03, 2019	Hanvon_model (single model) Hanvon_WuHan	84.721	87.117
17 Jan 10, 2019	BERT + Synthetic Self-Training (ensemble) Google AI Language https://github.com/google-research/bert	84.292	86.967
18 Nov 08, 2019	BERT + Multiple-CNN (ensemble) Kyonggi University (ICL) & KISTI	84.202	86.767
19 Jul 22, 2019	Tuned BERT-1seq Large Cased (single model) FAIR & UW	83.751	86.594
20 Mar 20, 2019	Bert-raw (ensemble) None	83.604	86.036
20 Dec 13, 2018	BERT finetune baseline (ensemble) Anonymous	83.536	86.096
20 Dec 21, 2018	PAML+BERT (ensemble model) PINGAN GammaLab	83.457	86.122
20 Dec 16, 2018	Lunet + Verifier + BERT (ensemble) Layer 6 AI NLP Team	83.469	86.043
21 Dec 15, 2018	Lunet + Verifier + BERT (single model) Layer 6 AI NLP Team	82.995	86.035
21 Jun 20, 2019	SENSEFORTH + BERT single https://senseforth.ai	83.142	85.873
21 Jan 14, 2019	BERT + MMFT + ADA (single model) Microsoft Research Asia	83.040	85.892
21 May 14, 2019	ATB (single model) Anonymous	82.882	86.002
21 Feb 16, 2019	Bert-raw (ensemble) None	83.175	85.635
21 Feb 26, 2019	BERT with Something (ensemble) Anonymous	83.051	85.737
21 Jan 10, 2019	BERT + Synthetic Self-Training (single model) Google AI Language https://github.com/google-research/bert	82.972	85.810
21 Jul 22, 2019	Tuned BERT Large Cased (single model) FAIR & UW	82.803	85.863
21 Mar 11, 2019	Bert-raw (ensemble) None	83.119	85.510
21 Feb 15, 2019	BERT + NeurQuRI (ensemble) 2SAH	82.803	85.703
22 Feb 27, 2019	BERT + NeurQuRI (ensemble) 2SAH	82.713	85.584
22 May 13, 2019	BERT-Base + QA Pre-training (single model) Anonymous	82.724	85.491
22 Dec 16, 2018	PAML+BERT (single model) PINGAN GammaLab	82.577	85.603
23 Nov 16, 2018	AoA + DA + BERT (ensemble) Joint Laboratory of HIT and iFLYTEK Research	82.374	85.310
24 Dec 12, 2018	BERT finetune baseline (single model) Anonymous	82.126	84.820
24 Feb 28, 2019	BERT_s (single model) Anonymous	81.979	84.846
24 Dec 10, 2018	Candi-Net+BERT (ensemble) 42Maru NLP Team	82.126	84.624
25 Feb 28, 2019	BERT-large+UBFT (single model) anonymous	81.573	84.535
26 Feb 15, 2019	BERT + NeurQuRI (single model) 2SAH	81.257	84.342
26 Feb 25, 2019	BERT with Something (single model) Anonymous	81.110	84.386
26 Nov 16, 2018	AoA + DA + BERT (single model) Joint Laboratory of HIT and iFLYTEK Research	81.178	84.251
27 Mar 20, 2019	Bert-raw (single) None	80.693	83.922
27 Mar 07, 2019	BERT + UnAnsQ (single model) Anonymous	80.749	83.851
28 Dec 19, 2018	Candi-Net+BERT (single model) 42Maru NLP Team	80.659	83.562
29 Jan 22, 2019	BERT + NeurQuRI (single model) 2SAH	80.591	83.391
29 Nov 11, 2019	BERTlarge (ensemble) SAIL	80.456	83.509
30 Mar 11, 2019	Bert-raw (single) None	80.411	83.457
31 Feb 16, 2019	Bert-raw (single model) None	80.343	83.243
31 May 28, 2019	Bert Single Model https://senseforth.ai	80.422	83.118
31 Apr 04, 2019	BISAN-CC (single model) Seoul National University & Hyundai Motors	80.208	83.149
31 Dec 03, 2018	PwP+BERT (single model) AITRICS	80.117	83.189
31 Dec 05, 2018	Candi-Net+BERT (single model) 42Maru NLP Team	80.388	82.908
31 Jul 22, 2019	Original BERT Large Cased (single model) FAIR & UW	79.971	83.266
31 Feb 19, 2019	BERT + UDA (single model) Anonymous	80.005	83.208
32 Apr 10, 2019	bert (single model) vinda msqjmxx	79.971	83.184
32 Feb 28, 2019	ST_bl single model	80.140	82.962
32 Nov 08, 2018	BERT (single model) Google AI Language	80.005	83.061
33 Feb 12, 2019	BERT + Sparse-Transformer single model	79.948	83.023
34 Mar 07, 2019	BERT uncased (single model) Anonymous	79.745	83.020
34 Dec 06, 2018	NEXYS_BASE (single model) NEXYS, DGIST R7	79.779	82.912
35 Feb 01, 2019	{bert-finetuning} (single model) ksai	79.632	82.852
36 Nov 09, 2018	L6Net + BERT (single model) Layer 6 AI	79.181	82.259
36 Mar 14, 2019	{Anonymous} (single model) Anonymous	78.876	82.524
37 Apr 24, 2019	BERT + WIAN (ensemble) Infosys Limited	78.650	81.497
37 Nov 11, 2019	BERTlarge (single model) SAIL	78.650	81.474
37 Mar 14, 2019	BISAN (single model) Seoul National University & Hyundai Motors	78.481	81.531
38 Dec 26, 2019	BERT-Large-Cased (single model) sysu	78.357	81.500
39 Dec 14, 2018	BERT+AC (single model) Hithink RoyalFlush	78.052	81.174
40 Nov 06, 2018	SLQA+BERT (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158	77.003	80.209
41 Jan 05, 2019	synss (single model) bert_finetune	76.055	79.329
42 Dec 18, 2018	ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai	74.746	78.227
42 Nov 05, 2018	MIR-MRC(F-Net) (single model) Kangwon National University, Natural Language Processing Lab. & ForceWin, KP Lab.	74.791	77.988
43 May 23, 2019	{BERTcw} (single model) private	74.385	77.308
44 Sep 13, 2018	nlnet (single model) Microsoft Research Asia	74.272	77.052
45 Dec 29, 2018	MMIPN Single	73.505	76.424
46 Apr 20, 2019	BERT-Base (single model) Dining Philosophers	73.099	76.236
47 Oct 12, 2018	YARCS (ensemble) IBM Research AI	72.670	75.507
48 Nov 14, 2018	BERT+Answer Verifier (single model) Pingan Tech Olatop Lab	71.666	75.457
49 Sep 17, 2018	Unet (ensemble) Fudan University & Liulishuo Lab https://arxiv.org/abs/1810.06638	71.417	74.869
49 Apr 24, 2019	BERT-Base (single) GreenflyAI https://greenfly.ai	71.699	74.430
49 Aug 15, 2018	Reinforced Mnemonic Reader + Answer Verifier (single model) NUDT https://arxiv.org/abs/1808.05759	71.767	74.295
49 Aug 28, 2018	SLQA+ (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158	71.462	74.434
49 Jan 19, 2019	{BERT-base} (single-model) Anonymous	70.763	74.449
49 Sep 14, 2018	SAN (ensemble model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556	71.316	73.704
50 Aug 21, 2018	FusionNet++ (ensemble) Microsoft Business Applications Group AI Research https://arxiv.org/abs/1711.07341	70.300	72.484
50 Sep 26, 2018	Multi-Level Attention Fusion(MLAF) (single model) Chonbuk National University, Cognitive Computing Lab.	69.476	72.857
51 Sep 14, 2018	Unet (single model) Fudan University & Liulishuo Lab	69.262	72.642
52 Dec 20, 2018	DocQA + NeurQuRI (single model) 2SAH	68.766	71.662
53 Aug 21, 2018	SAN (single model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556	68.653	71.439
53 Sep 13, 2018	BiDAF++ with pair2vec (single model) UW and FAIR	68.021	71.583
53 Jun 24, 2018	KACTEIL-MRC(GFN-Net) (single model) Kangwon National University, Natural Language Processing Lab.	68.213	70.878
53 Jul 13, 2018	VS^3-NET (single model) Kangwon National University in South Korea	67.897	70.884
54 Jan 01, 2019	EBB-Net (single model) Enliple AI	66.610	70.303
55 Jun 25, 2018	KakaoNet2 (single model) Kakao NLP Team	65.719	69.381
56 Sep 13, 2018	BiDAF++ (single model) UW and FAIR	65.651	68.866
56 Jul 11, 2018	abcNet (single model) Fudan University & Liulishuo AI Lab	65.256	69.206
57 Jun 27, 2018	BSAE AddText (single model) reciTAL.ai	63.338	67.422
58 Aug 14, 2018	eeAttNet (single model) BBD NLP Team https://www.bbdservice.com	63.327	66.633
58 May 30, 2018	BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence [modified by Stanford]	63.372	66.251
59 May 30, 2018	BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence [modified by Stanford]	59.332	62.305
60 May 30, 2018	BiDAF-No-Answer (single model) University of Washington [modified by Stanford]	59.174	62.093
60 Nov 27, 2018	Tree-LSTM + BiDAF + ELMo (single model) Carnegie Mellon University	57.707	62.341

SQuAD1.1 Leaderboard

Here are the Exact Match (EM) and F1 scores evaluated on the test set of SQuAD v1.1.

Rank (submission date)	Model (team, reference)	EM	F1
	Human Performance Stanford University (Rajpurkar et al. '16)	82.304	91.221
1 May 21, 2019	XLNet (single model) Google Brain & CMU	89.898	95.080
2 Aug 11, 2019	XLNET-123 (single model) MST/EOI	89.646	94.930
3 Sep 25, 2019	BERTSP (single model) NEUKG http://www.techkg.cn/	88.912	94.584
3 Jul 21, 2019	SpanBERT (single model) FAIR & UW	88.839	94.635
4 Jul 03, 2019	BERT+WWM+MT (single model) Xiaoi Research	88.650	94.393
5 Jul 21, 2019	Tuned BERT-1seq Large Cased (single model) FAIR & UW	87.465	93.294
6 Oct 05, 2018	BERT (ensemble) Google AI Language https://arxiv.org/abs/1810.04805	87.433	93.160
7 May 14, 2019	ATB (single model) Anonymous	86.940	92.641
8 Jul 21, 2019	Tuned BERT Large Cased (single model) FAIR & UW	86.521	92.617
8 Jul 04, 2019	BERT+MT (single model) Xiaoi Research	86.458	92.645
9 Feb 14, 2019	KT-NET (single model) Baidu NLP	85.944	92.425
9 Sep 26, 2018	nlnet (ensemble) Microsoft Research Asia	85.954	91.677
9 Feb 28, 2019	ST_bl single model	85.430	91.976
10 Nov 21, 2019	EL-BERT (single model) YeonTaek Oh	85.335	91.807
11 Mar 14, 2019	BISAN (single model) Seoul National University & Hyundai Motors	85.314	91.756
11 Jun 03, 2019	DPN (single model) Anonymous	84.978	92.019
11 Oct 05, 2018	BERT (single model) Google AI Language https://arxiv.org/abs/1810.04805	85.083	91.835
11 Jul 10, 2019	BERT-uncased (single model) Anonymous	84.926	91.932
11 Feb 16, 2019	BERT+Sparse-Transformer single model	85.125	91.623
11 Sep 09, 2018	nlnet (ensemble) Microsoft Research Asia	85.356	91.202
11 Jul 21, 2019	Original BERT Large Cased (single model) FAIR & UW	84.328	91.281
11 Feb 19, 2019	WD (single model) Anonymous	84.402	90.561
11 Jul 11, 2018	QANet (ensemble) Google Brain & CMU	84.454	90.490
11 Apr 21, 2019	Common-sense Governed BERT-123 (single model) Jerry AGI Ragtag	83.930	90.613
12 Feb 21, 2019	WD1 (single model) Anonymous	83.804	90.429
12 Jul 08, 2018	r-net (ensemble) Microsoft Research Asia	84.003	90.147
12 May 08, 2019	Common-sense Governed BERT-123 (single model) MST/EOI	82.943	91.074
12 Jun 20, 2018	MARS (ensemble) YUANFUDAO research NLP	83.982	89.796
13 Mar 19, 2018	QANet (ensemble) Google Brain & CMU	83.877	89.737
13 Sep 09, 2018	nlnet (single model) Microsoft Research Asia	83.468	90.133
14 Sep 01, 2018	MARS (single model) YUANFUDAO research NLP	83.185	89.547
15 Jun 21, 2018	MARS (single model) YUANFUDAO research NLP	83.122	89.224
16 Mar 06, 2018	QANet (ensemble) Google Brain & CMU	82.744	89.045
16 Jun 20, 2018	QANet (single) Google Brain & CMU	82.471	89.306
16 Jan 22, 2018	Hybrid AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research	82.482	89.281
16 Feb 19, 2018	Reinforced Mnemonic Reader + A2D (ensemble model) Microsoft Research Asia & NUDT	82.849	88.764
16 May 09, 2018	MARS (single model) YUANFUDAO research NLP	82.587	88.880
16 Jan 03, 2018	r-net+ (ensemble) Microsoft Research Asia	82.650	88.493
16 Jan 05, 2018	SLQA+ (ensemble) Alibaba iDST NLP	82.440	88.607
16 Jul 14, 2019	BERT (single model) KTNET	82.062	88.947
16 Feb 27, 2018	QANet (single model) Google Brain & CMU	82.209	88.608
16 Feb 02, 2018	Reinforced Mnemonic Reader (ensemble model) NUDT and Fudan University https://arxiv.org/abs/1705.02798	82.283	88.533
16 Dec 23, 2018	MMIPN Single	81.580	88.948
16 Dec 17, 2017	r-net (ensemble) Microsoft Research Asia http://aka.ms/rnet	82.136	88.126
16 Dec 17, 2018	ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai	81.307	88.909
16 Dec 22, 2017	AttentionReader+ (ensemble) Tencent DPDAC NLP	81.790	88.163
17 May 09, 2018	Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT	81.538	88.130
17 Apr 23, 2018	r-net (single model) Microsoft Research Asia	81.391	88.170
17 May 09, 2018	Reinforced Mnemonic Reader + A2D + DA (single model) Microsoft Research Asia & NUDT	81.401	88.122
17 Apr 03, 2018	KACTEIL-MRC(GF-Net+) (ensemble) Kangwon National University, Natural Language Processing Lab.	81.496	87.557
17 Feb 27, 2018	QANet (single model) Google Brain & CMU	80.929	87.773
17 Nov 17, 2017	BiDAF + Self Attention + ELMo (ensemble) Allen Institute for Artificial Intelligence	81.003	87.432
17 Feb 19, 2018	Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT	80.919	87.492
18 Feb 12, 2018	Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT	80.489	87.454
18 Apr 12, 2018	AVIQA+ (ensemble) aviqa team	80.615	87.311
19 Jan 13, 2018	SLQA+ single model	80.436	87.021
19 Jan 04, 2018	{EAZI} (ensemble) Yiwise NLP Group	80.436	86.912
19 Jan 12, 2018	EAZI+ (ensemble) Yiwise NLP Group	80.426	86.912
19 Jan 22, 2018	Hybrid AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research	80.027	87.288
19 Mar 20, 2018	DNET (ensemble) QA geeks	80.164	86.721
20 Feb 12, 2018	BiDAF + Self Attention + ELMo + A2D (single model) Microsoft Research Asia & NUDT	79.996	86.711
21 Jan 03, 2018	r-net+ (single model) Microsoft Research Asia	79.901	86.536
21 Feb 23, 2018	MAMCN+ (single model) Samsung Research	79.692	86.727
22 Jan 29, 2018	Reinforced Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798	79.545	86.654
22 Dec 05, 2017	SAN (ensemble model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556	79.608	86.496
22 Dec 28, 2017	SLQA+ (single model) Alibaba iDST NLP	79.199	86.590
23 Oct 17, 2017	Interactive AoA Reader+ (ensemble) Joint Laboratory of HIT and iFLYTEK	79.083	86.450
23 Nov 05, 2018	MIR-MRC(F-Net) (single model) ForceWin, KP Lab.	79.083	86.288
24 Jun 01, 2018	MDReader single model	79.031	86.006
24 Oct 24, 2017	FusionNet (ensemble) Microsoft Business AI Solutions Team https://arxiv.org/abs/1711.07341	78.978	86.016
25 Oct 22, 2017	DCN+ (ensemble) Salesforce Research https://arxiv.org/abs/1711.00106	78.852	85.996
26 Mar 29, 2018	KACTEIL-MRC(GF-Net+) (single model) Kangwon National University, Natural Language Processing Lab.	78.664	85.780
26 Nov 03, 2017	BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence	78.580	85.833
27 May 09, 2018	KakaoNet (single model) Kakao NLP Team	78.401	85.724
28 Nov 30, 2017	SLQA (ensemble) Alibaba iDST NLP	78.328	85.682
28 Mar 19, 2018	aviqa (ensemble) aviqa team	78.496	85.469
28 Jan 02, 2018	Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504	78.433	85.517
28 Sep 18, 2018	BiDAF++ with pair2vec (single model) UW and FAIR	78.223	85.535
28 Jun 01, 2018	MDReader0 single model	78.171	85.543
28 Jan 03, 2018	MEMEN (single model) Zhejiang University https://arxiv.org/abs/1707.09098	78.234	85.344
28 Jan 29, 2018	test single	78.087	85.348
29 Jul 25, 2017	Interactive AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research	77.845	85.297
30 Mar 20, 2018	DNET (single model) QA geeks	77.646	84.905
31 Sep 18, 2018	BiDAF++ (single model) UW and FAIR	77.573	84.858
31 Dec 06, 2017	AttentionReader+ (single) Tencent DPDAC NLP	77.342	84.925
31 Dec 13, 2017	RaSoR + TR + LM (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609	77.583	84.163
31 Dec 21, 2017	Jenga (ensemble) Facebook AI Research	77.237	84.466
31 Nov 06, 2017	Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504	76.996	84.630
31 Jan 23, 2018	MARS (single model) YUANFUDAO research NLP	76.859	84.739
32 May 14, 2018	VS^3-NET (single model) Kangwon National University in South Korea	76.775	84.491
32 Nov 01, 2017	SAN (single model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556	76.828	84.396
32 Sep 26, 2018	{gqa} (single model) FAIR	77.090	83.931
32 Dec 19, 2017	FRC (single model) in review	76.240	84.599
32 Oct 13, 2017	r-net (single model) Microsoft Research Asia http://aka.ms/rnet	76.461	84.265
33 Oct 22, 2017	Conductor-net (ensemble) CMU	76.146	83.991
34 Sep 08, 2017	FusionNet (single model) Microsoft Business AI Solutions team https://arxiv.org/abs/1711.07341	75.968	83.900
35 Oct 22, 2017	Interactive AoA Reader+ (single model) Joint Laboratory of HIT and iFLYTEK	75.821	83.843
35 Oct 18, 2018	KAR (single model) York University https://arxiv.org/abs/1809.03449	76.125	83.538
36 Jul 14, 2017	smarnet (ensemble) Eigen Technology & Zhejiang University	75.989	83.475
37 Mar 15, 2018	AVIQA-v2 (single model) aviqa team	75.926	83.305
38 Aug 18, 2017	RaSoR + TR (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609	75.789	83.261
39 Oct 23, 2017	DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106	75.087	83.081
39 Nov 01, 2017	Mixed model (ensemble) Sean	75.265	82.769
39 May 21, 2017	MEMEN (ensemble) Eigen Technology & Zhejiang University https://arxiv.org/abs/1707.09098	75.370	82.658
39 Nov 17, 2017	two-attention-self-attention (ensemble) guotong1988	75.223	82.716
39 Jul 10, 2017	DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106	74.866	82.806
39 Mar 09, 2017	ReasoNet (ensemble) MSR Redmond https://arxiv.org/abs/1609.05284	75.034	82.552
39 Oct 31, 2017	SLQA (single model) Alibaba iDST NLP	74.489	82.815
39 Feb 06, 2018	Jenga (single model) Facebook AI Research	74.373	82.845
39 Jan 02, 2018	Conductor-net (single model) CMU https://arxiv.org/abs/1710.10504	74.405	82.742
39 Aug 14, 2018	eeAttNet (single model) BBD NLP Team https://www.bbdservice.com	74.604	82.501
40 Feb 13, 2018	SSR-BiDAF ensemble model	74.541	82.477
41 Jul 14, 2017	Mnemonic Reader (ensemble) NUDT and Fudan University https://arxiv.org/abs/1705.02798	74.268	82.371
42 Dec 23, 2017	S^3-Net (ensemble) Kangwon National University in South Korea	74.121	82.342
43 Jul 29, 2017	SEDT (ensemble model) CMU https://arxiv.org/abs/1703.00572	74.090	81.761
44 Jul 06, 2017	SSAE (ensemble) Tsinghua University	74.080	81.665
44 Jul 25, 2017	Interactive AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research	73.639	81.931
44 Feb 22, 2017	BiDAF (ensemble) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603	73.744	81.525
44 Apr 22, 2017	SEDT+BiDAF (ensemble) CMU https://arxiv.org/abs/1703.00572	73.723	81.530
44 Nov 06, 2017	Conductor-net (single) CMU https://arxiv.org/abs/1710.10504	73.240	81.933
44 Dec 14, 2017	Jenga (single model) Facebook AI Research	73.303	81.754
44 Jan 24, 2017	Multi-Perspective Matching (ensemble) IBM Research https://arxiv.org/abs/1612.04211	73.765	81.257
44 May 01, 2017	jNet (ensemble) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617	73.010	81.517
45 Oct 22, 2017	Conductor-net (single) CMU	72.590	81.415
45 Apr 12, 2017	T-gating (ensemble) Peking University	72.758	81.001
45 Nov 16, 2017	two-attention-self-attention (single model) guotong1988	72.600	81.011
45 Sep 20, 2017	BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence https://arxiv.org/abs/1710.10723	72.139	81.048
45 Mar 03, 2018	AVIQA (single model) aviqa team	72.485	80.550
45 Dec 15, 2017	S^3-Net (single model) Kangwon National University in South Korea	71.908	81.023
46 Nov 06, 2017	attention+self-attention (single model) guotong1988	71.698	80.462
47 Nov 01, 2016	Dynamic Coattention Networks (ensemble) Salesforce Research https://arxiv.org/abs/1611.01604	71.625	80.383
47 Apr 13, 2017	QFASE NUS	71.898	79.989
47 Jul 14, 2017	smarnet (single model) Eigen Technology & Zhejiang University https://arxiv.org/abs/1710.02772	71.415	80.160
48 Jul 14, 2017	Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798	70.995	80.146
48 May 23, 2018	AttReader (single) College of Computer & Information Science, SouthWest University, Chongqing, China	71.373	79.725
48 Apr 22, 2018	MAMCN (single model) Samsung Research	70.985	79.939
48 Oct 27, 2017	M-NET (single) UFL	71.016	79.835
49 Mar 24, 2017	jNet (single model) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617	70.607	79.821
49 Apr 02, 2017	Ruminating Reader (single model) New York University https://arxiv.org/abs/1704.07415	70.639	79.456
49 Mar 14, 2017	Document Reader (single model) Facebook AI Research https://arxiv.org/abs/1704.00051	70.733	79.353
49 Mar 08, 2017	ReasoNet (single model) MSR Redmond https://arxiv.org/abs/1609.05284	70.555	79.364
49 Dec 28, 2016	FastQAExt German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816	70.849	78.857
49 May 13, 2017	RaSoR (single model) Google NY, Tel-Aviv University https://arxiv.org/abs/1611.01436	70.849	78.741
49 Apr 14, 2017	Multi-Perspective Matching (single model) IBM Research https://arxiv.org/abs/1612.04211	70.387	78.784
50 Aug 30, 2017	SimpleBaseline (single model) Technical University of Vienna	69.600	78.236
50 Feb 05, 2018	SSR-BiDAF single model	69.443	78.358
51 Apr 12, 2017	SEDT+BiDAF (single model) CMU https://arxiv.org/abs/1703.00572	68.478	77.971
52 Jun 25, 2017	PQMN (single model) KAIST & AIBrain & Crosscert	68.331	77.783
53 Apr 12, 2017	T-gating (single model) Peking University	68.132	77.569
53 Jul 29, 2017	SEDT (single model) CMU https://arxiv.org/abs/1703.00572	68.163	77.527
53 Dec 28, 2016	FastQA German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816	68.436	77.070
53 Jan 22, 2018	FABIR Single Model https://arxiv.org/abs/1810.09580	67.744	77.605
53 Nov 28, 2016	BiDAF (single model) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603	67.974	77.323
54 Oct 26, 2016	Match-LSTM with Ans-Ptr (Boundary) (ensemble) Singapore Management University https://arxiv.org/abs/1608.07905	67.901	77.022
54 Sep 19, 2017	AllenNLP BiDAF (single model) Allen Institute for AI http://allennlp.org/	67.618	77.151
55 Feb 05, 2017	Iterative Co-attention Network Fudan University	67.502	76.786
56 Jan 03, 2018	newtest single model	66.527	75.787
56 Nov 01, 2016	Dynamic Coattention Networks (single model) Salesforce Research https://arxiv.org/abs/1611.01604	66.233	75.896
57 Oct 26, 2016	Match-LSTM with Bi-Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905	64.744	73.743
58 Sep 21, 2017	OTF dict+spelling (single) University of Montreal https://arxiv.org/abs/1706.00286	64.083	73.056
58 Feb 19, 2017	Attentive CNN context with LSTM NLPR, CASIA	63.306	73.463
59 Nov 02, 2016	Fine-Grained Gating Carnegie Mellon University https://arxiv.org/abs/1611.01724	62.446	73.327
59 Sep 21, 2017	OTF spelling (single) University of Montreal https://arxiv.org/abs/1706.00286	62.897	72.016
60 Sep 21, 2017	OTF spelling+lemma (single) University of Montreal https://arxiv.org/abs/1706.00286	62.604	71.968
61 Sep 28, 2016	Dynamic Chunk Reader IBM https://arxiv.org/abs/1610.09996	62.499	70.956
61 Nov 15, 2019	RQA+IDR (single model) Anonymous	61.145	71.389
62 Aug 27, 2016	Match-LSTM with Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905	60.474	70.695
63 Aug 27, 2016	Match-LSTM with Ans-Ptr (Sentence) Singapore Management University https://arxiv.org/abs/1608.07905	54.505	67.748
63 Nov 15, 2019	RQA (single model) Anonymous	55.827	65.467
64 Aug 22, 2019	UQA (single model) Anonymous	53.698	64.036