
Have you tried condenser pretraining on RoBERTa? #17

Open · 1024er opened this issue May 26, 2022 · 1 comment

Comments

@1024er commented May 26, 2022

I pretrained a condenser-roberta-base model with the same data and hyperparameters, but the results on downstream tasks were poor.

Have you ever tried condenser pretraining on RoBERTa-base?

Thank you

@luyug (Owner) commented May 26, 2022

With the same data, no. I have trained a base-architecture RoBERTa Condenser on OpenWebText (an open reproduction of WebText, part of RoBERTa's training data). Compared with the BERT Condenser, it does better on sentence similarity tasks but not on retrieval tasks. As a side note, we previously observed that vanilla RoBERTa-base is typically inferior to vanilla BERT-base on retrieval tasks.

We have just started test runs with condenser-roberta-large, so there is not much to say there yet.
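For readers wondering what "condenser pretraining on RoBERTa" involves, here is a minimal sketch of a Condenser-style head on a RoBERTa backbone, assuming the HuggingFace transformers library. The head depth (n_head_layers) and early-layer index (skip_from) are illustrative assumptions, not the exact settings from this repo, and the full Condenser objective also adds the backbone's own MLM loss, which this sketch omits.

```python
# Sketch of a Condenser-style head on RoBERTa; hyperparameters are assumptions.
import torch
from torch import nn
from transformers import RobertaForMaskedLM
from transformers.models.roberta.modeling_roberta import RobertaLayer


class CondenserRoberta(nn.Module):
    def __init__(self, model_name="roberta-base", n_head_layers=2, skip_from=6):
        super().__init__()
        self.lm = RobertaForMaskedLM.from_pretrained(model_name)
        # A short stack of extra Transformer layers forms the Condenser head.
        self.head = nn.ModuleList(
            [RobertaLayer(self.lm.config) for _ in range(n_head_layers)]
        )
        self.skip_from = skip_from  # which early layer feeds the head (assumed)
        self.loss_fct = nn.CrossEntropyLoss()  # ignores label -100 by default

    def forward(self, input_ids, attention_mask, labels):
        out = self.lm.roberta(
            input_ids, attention_mask=attention_mask, output_hidden_states=True
        )
        hiddens = out.hidden_states
        # The head conditions on the late <s> (CLS) vector plus token states
        # from an earlier layer, pressuring information into the CLS vector.
        cls_late = hiddens[-1][:, :1]
        early_tokens = hiddens[self.skip_from][:, 1:]
        h = torch.cat([cls_late, early_tokens], dim=1)
        ext_mask = self.lm.get_extended_attention_mask(
            attention_mask, input_ids.shape
        )
        for layer in self.head:
            h = layer(h, attention_mask=ext_mask)[0]
        logits = self.lm.lm_head(h)  # reuse the backbone's MLM head
        return self.loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
```

After pretraining, only the backbone is kept for fine-tuning on retrieval or similarity tasks; the head is discarded.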
