
Refactor experiment/training/finetuning scripts with documentation #20

Open · cstorm125 opened this issue Jan 27, 2021 · 1 comment
Labels: enhancement (New feature or request)

@cstorm125 (Contributor) wrote:
  • Experiment scripts
  • Finetuning scripts / notebooks
  • Training MLM / finetuning MLM scripts / notebooks
@lalital (Contributor) commented Feb 25, 2021:

[WIP]

2. Finetuning scripts

List of files to refactor

1. Multiclass and Multilabel Sequence Classification Finetuning scripts

Path: ./scripts/downstream/train_sequence_classification_lm_finetuning.py
Branch name: refactor/scripts/lm_finetune_for_seq_cls
Unittest: ./tests/test_finetuner_seq_cls.py

Todo:

  • Move static variables (of type Dict) out of the script and into a shared module (see the sketch after this list)

  • Use a default argument class (e.g. a dataclass) instead of manually creating arguments with ArgumentParser (see the sketch after this list)

  • Make the finetuner accessible via the thai2transformers module, for example:

     from thai2transformers.finetuners import SequenceClassificationFinetuner
     seq_cls_finetuner = SequenceClassificationFinetuner(...)
     # specify base model, tokenizer
     # specify target dataset
     # specify hyperparameters
    
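As a minimal sketch of the first two items above: the static Dicts could move into a shared module, and the manual ArgumentParser calls could be replaced by a typed argument class parsed with transformers' HfArgumentParser. The module path thai2transformers/conf.py and all field names below are hypothetical, not the final API:

     # illustrative only; conf.py and the argument names are assumptions
     from dataclasses import dataclass, field
     from transformers import HfArgumentParser

     # static Dict variables (e.g. label mappings) move out of the script:
     # from thai2transformers.conf import DATASET_METADATA  # hypothetical module

     @dataclass
     class SeqClsFinetuningArguments:
         """Typed replacement for manual parser.add_argument(...) calls."""
         model_name_or_path: str = field(
             metadata={"help": "Pretrained base language model to finetune."}
         )
         dataset_name: str = field(
             metadata={"help": "Target sequence classification dataset."}
         )
         num_train_epochs: int = field(
             default=3, metadata={"help": "Number of finetuning epochs."}
         )

     if __name__ == "__main__":
         parser = HfArgumentParser(SeqClsFinetuningArguments)
         (args,) = parser.parse_args_into_dataclasses()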

2. Token Classification Finetuning scripts

Path: ./scripts/downstream/train_token_classification_lm_finetuning.py
Branch name: refactor/scripts/lm_finetune_for_token_cls

Todo:

  • Make the finetuner accessible via the thai2transformers module (a hypothetical skeleton follows)
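
A hypothetical skeleton of what that could look like inside thai2transformers/finetuners.py; the class and method names are assumptions rather than the final interface:

     class TokenClassificationFinetuner:
         """Wraps model/tokenizer/dataset setup so the downstream script stays thin."""

         def __init__(self, model_name_or_path, tokenizer, dataset, **hyperparams):
             # base model, tokenizer, target dataset and hyperparameters are
             # passed in once instead of being scattered across the script
             self.model_name_or_path = model_name_or_path
             self.tokenizer = tokenizer
             self.dataset = dataset
             self.hyperparams = hyperparams

         def finetune(self):
             # would build a transformers Trainer from the stored
             # configuration and run training; omitted in this sketch
             raise NotImplementedError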
