Naming of return-values for T-tests etc #208

Open
Tracked by #279
thomas-haslwanter opened this issue Nov 2, 2021 · 6 comments

@thomas-haslwanter

Some of the column names in the dataframe returned by result = pg.ttest(data) are unfortunately chosen, since they do not comply with the naming rules for Python variables.
As a result, some values can be read out with attribute access, e.g.
result.dof
while those with the non-compliant names can only be accessed with square brackets:
result['p-value']

It would be highly desirable to have names that comply with the Python conventions and requirements for variable names.
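
To make the difference concrete, here is a minimal sketch using plain pandas. The one-row frame and its numbers are made up for illustration and only mimic the shape of the result described above; it is not actual Pingouin output.

```python
import pandas as pd

# Made-up values in a one-row frame; stands in for the result of pg.ttest(data).
result = pd.DataFrame({"dof": [9], "p-value": [0.034]}, index=["T-test"])

dof = result.dof           # attribute access works: "dof" is a valid identifier
pval = result["p-value"]   # the hyphenated label needs square brackets

# Column labels that cannot be reached with attribute access
not_accessible = [c for c in result.columns if not c.isidentifier()]
print(not_accessible)
```

Writing result.p-value would be parsed by Python as result.p minus value, which is why the hyphenated label can never work as an attribute.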

@raphaelvallat raphaelvallat self-assigned this Nov 2, 2021
@raphaelvallat raphaelvallat added the feature request 🚧 New feature or request label Nov 2, 2021
@raphaelvallat
Owner

Thanks @thomas-haslwanter, I agree and I've been wanting to fix this in the next release. We simply need to replace all the "-" in variable names with a "_", e.g. "p-unc" -> "p_unc".

I'll implement that in the next release,

Thanks,
Raphael

@thomas-haslwanter
Author

Don't forget to change the 'CI95%' to 'CI95', since the "%"-sign also causes problems.
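
A rough sketch of what such a renaming rule could look like, covering both the "-" and the "%" case. The helper name to_identifier is hypothetical and not part of Pingouin, and the list of labels is just a sample taken from this thread.

```python
import keyword
import re


def to_identifier(name: str) -> str:
    """Hypothetical helper: turn a column label into a valid Python identifier."""
    cleaned = name.replace("-", "_")       # "p-unc" -> "p_unc"
    cleaned = re.sub(r"\W", "", cleaned)   # drop "%" and any other non-word character
    # Guard against empty names, leading digits and reserved words.
    if not cleaned or cleaned[0].isdigit() or keyword.iskeyword(cleaned):
        cleaned = f"_{cleaned}"
    return cleaned


old_names = ["dof", "p-unc", "p-value", "CI95%"]
new_names = [to_identifier(n) for n in old_names]
print(new_names)
```

With pandas, a function like this can be passed directly to DataFrame.rename(columns=...), so existing results could be converted without touching the library.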

@thomas-haslwanter
Author

And one more thing: values that are a single float should not be returned as a pandas Series object, but simply as a float.
For example, the p-value of the test
result = pg.ttest(before, after)
currently has to be retrieved as
result['p-value']['T-test']
This should be simplified to
result.pval
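
For comparison, a small sketch of how the scalar can be pulled out of a one-row DataFrame with standard pandas indexing. The frame is a made-up stand-in for the real result, assuming a single row labelled 'T-test' as in the example above.

```python
import pandas as pd

# Made-up values; stands in for result = pg.ttest(before, after).
result = pd.DataFrame({"dof": [9], "p-value": [0.034]}, index=["T-test"])

p_chained = result["p-value"]["T-test"]   # chained lookup, as quoted above
p_at = result.at["T-test", "p-value"]     # one label-based scalar lookup

row = result.iloc[0]      # the only row, as a Series
p_row = row["p-value"]    # scalar taken from that Series
```

If the column were renamed to a valid identifier, the last step could also be written with attribute access on the row.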

@raphaelvallat
Owner

Hi @thomas-haslwanter,

This would mean returning the output of most Pingouin functions as a pandas.Series instead of a pandas.DataFrame. While it would indeed be simpler to access the value, I think that the Series output in a Jupyter notebook is harder to read than a traditional DataFrame. This is quite a big conceptual change, so we should discuss it in a separate issue and maybe do a poll.

Thanks,
Raphael

@thomas-haslwanter
Author

Thinking it through, I actually agree with you: using a pd.DataFrame for the result really makes the results MUCH clearer, and should therefore be kept. Either way, thank you for the quick reply!

@raphaelvallat
Owner

Edit: this will not be included in the next release of Pingouin (v0.5.1) which is a minor release, but it should be included in the next major release (0.6.0).
