Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix: mandatory fields are handled better when not filled in #71

Merged
merged 13 commits into from
Sep 22, 2023

Conversation

bedroesb
Copy link
Collaborator

@bedroesb bedroesb commented Jun 23, 2022

This PR fixes certain weird behaviors. Before when the table was not filled in at a column that is mandatory, or just NA, the validator would not complain. It would only complain when a field is missing that has a controlled vocabulary.

Now all mandatory attributes got wrapped with an if statement that stops the script when missing. It is done this way because if the <key></key> is included, but is empty, the validator will not complain, because it is present + now the error will not appear if the platform or instrument model is not given that lowercase can not be applied to nan objects.

  • Terminates the program if a mandatory field is not filled in. This will close nan is filled in when column is mandatory and cell is empty #62
  • Better handling of Na values by setting them in pandas while parsing
  • Extensive testing. Tested with leaving out all types of values in the table.
  • More useful logging. It will tell the user which row and column that is missing a value and in which table.

@bedroesb bedroesb marked this pull request as ready for review July 1, 2022 16:36
@bedroesb bedroesb requested a review from bgruening July 1, 2022 16:37
@bedroesb bedroesb changed the title Start with fixing the nan problem Bug fix: mandatory fields are handled better when not filled in Jul 1, 2022
@bgruening
Copy link
Member

Puh this is gross :)

Can you add a comment to the template generator why this is needed and do you see any way how we can test this?

@bedroesb
Copy link
Collaborator Author

bedroesb commented Jul 1, 2022

@bgruening I know this does not look pretty ;) I found a nicer solution but problem is that the error that was thrown from the XSD validator was about the XML and not about to tsv/xlsx input file + it was not triggered when <key></key> was left empty....

When it comes to testing, we would need many files in the current situation, because it stops the script when it encounters the first one. I don't know if I can keep track in the template of the errors so if the XML generation is done I know in the main python script there where errors 🤔

@bgruening
Copy link
Member

We need a rebase here :(
Sorry I was to late, feel free to merge, looks good to me

@bedroesb
Copy link
Collaborator Author

@bgruening wel good is relative, "Puh this is gross :)" still applies sadly haha! I think it might benefit to have a look in how we could change the way we fill in the XMLs currenlty

@bedroesb bedroesb merged commit c7fde2c into master Sep 22, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nan is filled in when column is mandatory and cell is empty
2 participants