Datasets and Models

Multi-Task

Dataset

  • GLUE Benchmark: The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

    • CoLA, MNLI, MRPC, QNLI, QQP, RTE, SST-2, STS-B, WNLI
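As a minimal sketch, assuming the Hugging Face `datasets` library, which hosts GLUE under the `glue` name (this repo may ship its own loaders), each task can be pulled by its config name:

```python
# Minimal sketch: loading a GLUE task with the Hugging Face `datasets` library.
from datasets import load_dataset

# "cola" can be swapped for any task listed above:
# "mnli", "mrpc", "qnli", "qqp", "rte", "sst2", "stsb", "wnli".
cola = load_dataset("glue", "cola")
print(cola["train"][0])  # e.g. {'sentence': ..., 'label': 1, 'idx': 0}
```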

Reading Comprehension

Dataset

  • HistoryQA: Joseon History Question Answering dataset (SQuAD-style)

  • KorQuAD: KorQuAD is a dataset built for Korean Machine Reading Comprehension. The answer to every question is a sub-span of the corresponding Wikipedia article paragraph. It follows the same format as the Stanford Question Answering Dataset (SQuAD) v1.0.

  • SQuAD: The Stanford Question Answering Dataset is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to every question is a segment of text (a span) from the corresponding reading passage, or, in SQuAD v2.0, the question may be unanswerable.
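All three datasets share this SQuAD-style layout: a context paragraph, a question, and an answer given as text plus a character offset into the context. Below is a minimal sketch of one record, using an invented example and the flattened field names that the Hugging Face `datasets` library uses for SQuAD (the raw JSON nests these fields under `paragraphs`/`qas`):

```python
# Minimal sketch of a SQuAD-style record (invented example; field names follow
# the Hugging Face `datasets` rendering, not the raw nested JSON).
example = {
    "context": "Stanford University was founded in 1885 by Leland Stanford.",
    "question": "When was Stanford University founded?",
    "answers": {"text": ["1885"], "answer_start": [35]},
}

# Every answer is a span of the context: character offset + answer text.
start = example["answers"]["answer_start"][0]
text = example["answers"]["text"][0]
assert example["context"][start:start + len(text)] == text
```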


Regression

Dataset

  • GLUE Benchmark: The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

    • STS-B
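STS-B is the one GLUE task with a continuous target: a sentence-similarity score in [0.0, 5.0] rather than a class label. A minimal sketch, again assuming the Hugging Face `datasets` library:

```python
# Minimal sketch: STS-B labels are floats in [0.0, 5.0], so the task is
# trained as regression (e.g. MSE loss) rather than classification.
from datasets import load_dataset

stsb = load_dataset("glue", "stsb")
ex = stsb["train"][0]
print(ex["sentence1"], ex["sentence2"], ex["label"])  # label is a float
```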


Semantic Parsing

Dataset


Sequence Classification

Dataset

  • GLUE Benchmark: The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

    • CoLA, MNLI, MRPC, QNLI, QQP, RTE, SST-2, WNLI
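For these tasks, a single classification head over a pretrained encoder is the standard setup. A minimal sketch, assuming Hugging Face `transformers` (the checkpoint name is illustrative, not necessarily what this repo uses):

```python
# Minimal sketch: a sequence classification head over a pretrained encoder.
# "bert-base-cased" is an illustrative checkpoint, not this repo's model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased",
    num_labels=2,  # 2 for e.g. SST-2/MRPC; 3 for MNLI
)

inputs = tokenizer("this movie was great!", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```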


Token Classification

Dataset

  • NER - CoNLL 2003: The shared task of CoNLL-2003 concerns language-independent named entity recognition. Named entities are phrases that contain the names of persons, organizations, locations, times, and quantities.
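CoNLL-2003 casts NER as token classification: one tag per token, with B-/I- prefixes marking entity boundaries. A minimal sketch with invented tags (not lines from the actual corpus files):

```python
# Minimal sketch of CoNLL-style token/tag alignment (BIO scheme).
# The sentence and tags are illustrative, not taken from the corpus files.
tokens = ["U.N.", "official", "Ekeus", "heads", "for", "Baghdad", "."]
tags   = ["B-ORG", "O", "B-PER", "O", "O", "B-LOC", "O"]

# One label per token; the model predicts a tag for every input token.
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```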