Introduction to various methodologies for topic classification

snumin44/topic-classification

Topic Classification with Different Approaches

This repository tackles topic classification with the following approaches.

  1. Classification
  2. Masked Language Modeling (MLM)
  3. Matching
  4. Seq2Seq

1. Classification

  • Fine-tunes a pre-trained language model (PLM) with a classifier head on top.
  • This is the approach most commonly used for classification tasks. BLOG
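
As a rough illustration of the classifier head described above, the sketch below applies a linear layer and softmax to a pooled sentence representation and computes the cross-entropy loss used during fine-tuning. The label names, the vector `pooled`, and the `weight`/`bias` values are hypothetical stand-ins; in practice the pooled vector would come from the PLM and the head would be trained jointly with it. This is not the repository's implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(pooled, weight, bias):
    """Linear head: logits[k] = dot(weight[k], pooled) + bias[k]."""
    logits = [sum(w * h for w, h in zip(row, pooled)) + b
              for row, b in zip(weight, bias)]
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs

def cross_entropy(probs, gold):
    """Negative log-likelihood of the gold class."""
    return -math.log(probs[gold])

# Hypothetical values: a 4-dimensional pooled vector and a 3-class head.
LABELS = ["politics", "economy", "sports"]
pooled = [0.2, -1.0, 0.5, 0.3]
weight = [[0.1, 0.0, -0.2, 0.4],
          [0.3, -0.5, 0.1, 0.0],
          [-0.1, 0.2, 0.6, 0.1]]
bias = [0.0, 0.1, -0.1]

pred, probs = classify(pooled, weight, bias)
loss = cross_entropy(probs, gold=1)
print(LABELS[pred], loss)
```

During fine-tuning, the gradient of this loss would update both the head and the PLM parameters.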

example image

2. Masked Language Modeling (MLM)

example image

3. Matching

  • Predicts whether the text entails the label (entailment).
  • Converts the multi-class classification task into a binary classification task. BLOG
  • Reference: Entailment as Few-Shot Learner
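
The conversion from multi-class to binary classification can be sketched as follows: each (text, label) combination becomes a premise/hypothesis pair with a 0/1 entailment target, and at inference time the label whose hypothesis scores highest is selected. The template string, the label set, and `toy_entail_score` (a word-overlap stand-in for a fine-tuned entailment model) are all assumptions made for illustration, not taken from the repository.

```python
TEMPLATE = "This text is about {}."  # hypothetical label template

def build_pairs(text, gold_label, labels, template=TEMPLATE):
    """Turn one multi-class example into len(labels) binary examples."""
    return [(text, template.format(label), int(label == gold_label))
            for label in labels]

def predict(text, labels, entail_score, template=TEMPLATE):
    """Score every label hypothesis and return the best-scoring label."""
    scores = {label: entail_score(text, template.format(label))
              for label in labels}
    return max(scores, key=scores.get), scores

def toy_entail_score(premise, hypothesis):
    """Stand-in scorer (word overlap); a real system would use a model."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().rstrip(".").split())
    return len(p & h) / len(h)

labels = ["sports", "economy", "science"]
text = "The central bank raised rates as the economy slowed"

pairs = build_pairs(text, "economy", labels)
for premise, hypothesis, target in pairs:
    print(target, "|", hypothesis)

best, scores = predict(text, labels, toy_entail_score)
print(best)
```

One consequence of this design is that inference needs one forward pass per candidate label, so cost grows with the size of the label set.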

example image

4. Seq2Seq

  • Solves the classification task with a Seq2Seq model instead of an encoder-only model.
  • Classification is performed on the representation of the last token output by the decoder. BLOG
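
A minimal sketch of the last-token idea, assuming right-padded sequences and an attention mask in which 1 marks real tokens: the hidden state at the last non-padding decoder position is fed to a linear head. The hidden states, mask, and head weights below are hypothetical toy values, and the helper names are illustrative rather than taken from the repository.

```python
def last_token_index(attention_mask):
    """Index of the last real (non-padding) token, where mask value is 1."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("sequence contains no real tokens")

def classify_from_decoder(hidden_states, attention_mask, weight, bias):
    """Apply a linear head to the decoder state of the last real token."""
    last = hidden_states[last_token_index(attention_mask)]
    logits = [sum(w * h for w, h in zip(row, last)) + b
              for row, b in zip(weight, bias)]
    pred = max(range(len(logits)), key=logits.__getitem__)
    return pred, logits

# Hypothetical decoder outputs: 4 positions, hidden size 2, last one is padding.
hidden_states = [[0.1, 0.2], [0.4, -0.3], [1.0, -0.5], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]
weight = [[0.5, 0.5], [1.0, -1.0], [-1.0, 0.0]]  # 3 classes
bias = [0.0, 0.0, 0.0]

pred, logits = classify_from_decoder(hidden_states, attention_mask, weight, bias)
print(pred, logits)
```

Selecting the position through the mask matters here: taking the final position blindly would read the padding state instead of the last real token.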

example image

Citing

@article{schick2020exploiting,
  title={Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2001.07676},
  url={http://arxiv.org/abs/2001.07676},
  year={2020}
}
@article{schick2020small,
  title={It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2009.07118},
  url={http://arxiv.org/abs/2009.07118},
  year={2020}
}
@article{wang2020entailment,
  title={Entailment as Few-Shot Learner},
  author={Sinong Wang and Han Fang and Madian Khabsa and Hanzi Mao and Hao Ma},
  journal={Computing Research Repository},
  volume={arXiv:2104.14690},
  url={http://arxiv.org/abs/2104.14690},
  year={2021}
}
