
TrainingArgs not recognised in Trainer #559

Open
PrithivirajDamodaran opened this issue Sep 22, 2024 · 2 comments

Comments


PrithivirajDamodaran commented Sep 22, 2024

v1.1.0

The warnings below are thrown by this snippet.

2024-09-22 19:09:50,723 - No TrainingArguments passed, using output_dir=tmp_trainer.
2024-09-22 19:09:50,735 - No loss passed, using losses.CoSENTLoss as a default option.

from setfit import SetFitModel, Trainer, TrainingArguments

:
:
training_args = TrainingArguments(
    output_dir=output_dir,
    eval_strategy=save_strategy,
    save_strategy=save_strategy,
    batch_size=batch_size,
    num_epochs=epochs,
    body_learning_rate=lr,
    warmup_proportion=warmup_proportion,
    logging_dir=f"{output_dir}/logs",
    load_best_model_at_end=True,
    show_progress_bar=True,
    use_amp=use_amp,
    samples_per_label=min_samples,
    loss=CosineSimilarityLoss,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    metric="accuracy",
    column_mapping={"text": "text", "label": "label"},
)

cjuracek-tess commented Oct 1, 2024

I think you need to be more specific about how you're defining your variables: the following example does not raise the loss warning you are describing. I commented out the parameters that are variables, since we don't know their values:

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments, sample_dataset
from sentence_transformers.losses import CosineSimilarityLoss


dataset = load_dataset("sst2")
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
test_dataset = dataset["validation"]
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

training_args = TrainingArguments(
    # output_dir=output_dir,
    # eval_strategy=save_strategy,
    # save_strategy=save_strategy,
    # batch_size=batch_size,
    # num_epochs=epochs,
    # body_learning_rate=lr,
    # warmup_proportion=warmup_proportion,
    # logging_dir=f"{output_dir}/logs",
    # load_best_model_at_end=True,
    show_progress_bar=True,
    # use_amp=use_amp,
    # samples_per_label=min_samples,
    loss=CosineSimilarityLoss,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    metric="accuracy",
    column_mapping={"sentence": "text", "label": "label"},
)

Output:

FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
model_head.pkl not found on HuggingFace Hub, initialising classification head with random weights. You should TRAIN this model on a downstream task to use it for predictions and inference.
Applying column mapping to the training dataset
Applying column mapping to the evaluation dataset
Map: 100%|██████████| 16/16 [00:00<00:00, 5759.43 examples/s]

Process finished with exit code 0

So the problem is likely related to:

  • Your environment (which Python version are you using?)
  • Your configuration of the variables you pass to TrainingArguments / Trainer
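As a quick first step for the environment check above (a generic diagnostic, not something posted in this thread), you can print the interpreter and relevant package versions:

```shell
# Print the Python version, plus the setfit / sentence-transformers versions if installed.
python3 -c "import sys; print('python', sys.version.split()[0])"
python3 -c "import setfit; print('setfit', setfit.__version__)" 2>/dev/null \
    || echo "setfit not installed"
python3 -c "import sentence_transformers; print('sentence-transformers', sentence_transformers.__version__)" 2>/dev/null \
    || echo "sentence-transformers not installed"
```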

@seanfarr788

I think it is just an erroneous warning message; printing `trainer.args` shows the args being set correctly.
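For what it's worth, this kind of warning typically comes from a fallback path that only fires when no args object reaches the trainer at construction time. A toy sketch of that pattern (not SetFit's actual code; `TrainerSketch` is hypothetical) shows why inspecting `trainer.args` afterwards is a sound check:

```python
import logging

logging.basicConfig(format="%(asctime)s - %(message)s")
logger = logging.getLogger(__name__)


class TrainerSketch:
    """Toy stand-in for a Trainer: warns and falls back only when args is None."""

    def __init__(self, args=None):
        if args is None:
            # This branch is the only place the warning is emitted.
            logger.warning("No TrainingArguments passed, using output_dir=tmp_trainer.")
            args = {"output_dir": "tmp_trainer"}
        self.args = args


trainer = TrainerSketch(args={"output_dir": "my_runs"})
print(trainer.args["output_dir"])  # prints "my_runs": the passed args were recognised
```

If `trainer.args` reflects the values you passed, the arguments were recognised and the warning was spurious (or emitted by a different, internally created trainer instance).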
