feat(python): Add show methods to DataFrame and LazyFrame #19634

Open: wants to merge 20 commits into main

Conversation

guilhem-dvr commented Nov 4, 2024

This adds a show method for both DataFrame and LazyFrame objects, taking inspiration from pyspark's show method and taking into account the requirements from @stinodego in #16534.

I chose to expose only the config options that affect the width of the output, to mimic pyspark's truncate option.

I've provided tests, but I'm not super satisfied with them: they could break when changing the default display options. I was thinking of mocking Config, print and display_html, would that be okay?

github-actions bot added the enhancement and python labels Nov 4, 2024

alexander-beedie (Collaborator) commented Nov 5, 2024

> I've provided tests, but I'm not super satisfied with them: they could break when changing the default display options. I was thinking of mocking Config, print and display_html, would that be okay?

No need to mock Config; it can act as a decorator [1], so you could decorate your tests to explicitly set the Config to some known values (eg: the current defaults), and then modify the values away from these defaults inside the test 👍

There are a few other options it would be nice to expose as well, such as tbl_formatting, tbl_cell_alignment, tbl_cell_numeric_alignment, etc.

I'd also change the parameter name n to limit, and allow it to be None (but maintain the default of 5). This would allow the caller to print the entire table (eg: with no limit).

Footnotes

  1. Config as a decorator:
    https://docs.pola.rs/api/python/stable/reference/config.html#use-as-a-decorator

codecov bot commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 62.96296% with 10 lines in your changes missing coverage. Please review.

Project coverage is 79.93%. Comparing base (4360f9d) to head (c841458).
Report is 103 commits behind head on main.

Files with missing lines               Patch %   Lines
py-polars/polars/lazyframe/frame.py    50.00%    6 Missing ⚠️
py-polars/polars/dataframe/frame.py    73.33%    3 Missing and 1 partial ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19634      +/-   ##
==========================================
- Coverage   80.05%   79.93%   -0.13%     
==========================================
  Files        1532     1536       +4     
  Lines      210752   211721     +969     
  Branches     2442     2449       +7     
==========================================
+ Hits       168715   169230     +515     
- Misses      41482    41936     +454     
  Partials      555      555              


guilhem-dvr (Author) commented

@alexander-beedie I've added all config options that impact the display format of a dataframe.

I was thinking of hiding the dataframe shape by default because I find it a bit irrelevant when showing a dataframe, usually you would know what is the shape of the frame you are working with. But with the limitless option and for the sake of consistency I think I will leave it visible.

alexander-beedie (Collaborator) commented

> @alexander-beedie I've added all config options that impact the display format of a dataframe.

Good stuff, will take a look.

> I was thinking of hiding the dataframe shape by default because I find it a bit irrelevant when showing a dataframe, usually you would know what is the shape of the frame you are working with.

Not necessarily - if it's wide (so cols are truncated in the repr) or you've just filtered the data you won't know the shape; definitely want to keep it 👍
