As a follow-up to PR #8522, which adds the XLSXToDocument converter to Haystack, I believe the following would also be very useful.
1. A TableSplitter component
I specifically want a component that can detect whether a single table actually contains multiple tables. For example, this table
really is composed of two separable tables. Ideally, this component could use heuristics to figure out how to split such a table into two smaller ones while optionally preserving the row and column headers.
This type of component is highly relevant for business users who have extremely large Excel sheets that contain many different sub-tables.
2. A TableCleaner component
This component should remove empty rows and columns from a table while optionally retaining the original column and row headers. For example, for the table
,A,B,C
1,,,
2,,,
3,,,
4,,col_a,col_b
5,,1.5,test
I'd like to remove the first three rows (1-3) and column A to end up with
,B,C
4,col_a,col_b
5,1.5,test
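To make the TableSplitter idea more concrete, here is a minimal sketch of one possible heuristic: treat any run of empty rows as a separator between sub-tables. This is only an illustration, not an existing Haystack API. The function name `split_table`, the `keep_headers` flag, and the sample data are all hypothetical, and the sketch assumes a rectangular table whose first row holds column labels and whose first column holds row labels.

```python
def split_table(rows, keep_headers=True):
    """Split a table into sub-tables at runs of empty rows.

    Assumption: rows[0] holds column labels, the first cell of every
    other row is a row label, and all rows have the same length.
    """
    header, body = rows[0], rows[1:]
    blocks, current = [], []
    for row in body:
        if any(cell.strip() for cell in row[1:]):
            current.append(row)      # row has data: extend the current block
        elif current:
            blocks.append(current)   # empty row closes the current block
            current = []
    if current:
        blocks.append(current)
    if keep_headers:
        # Repeat the column labels on top of every sub-table.
        return [[header] + block for block in blocks]
    # Drop both the column labels and the row labels.
    return [[row[1:] for row in block] for block in blocks]


# Hypothetical sample: two sub-tables separated by one empty row (row 3).
table = [
    ["", "A", "B"],
    ["1", "name", "qty"],
    ["2", "apple", "3"],
    ["3", "", ""],
    ["4", "city", "zip"],
    ["5", "Berlin", "10115"],
]
parts = split_table(table)
for part in parts:
    print(part)
```

This only splits along rows. Splitting side-by-side tables could reuse the same function on the transposed table, and a real component would likely need more robust heuristics (for example, tolerating partially empty separator rows).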
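The TableCleaner behavior requested in item 2 could be sketched roughly as follows. This is a standard-library illustration under stated assumptions, not a proposed implementation: `clean_table`, `read_csv`, and the `keep_headers` flag are hypothetical names, the table is assumed to be rectangular, and a cell counts as empty when it contains only whitespace.

```python
import csv
import io


def clean_table(rows, keep_headers=True):
    """Drop empty rows and columns from a table given as a list of rows.

    Assumption: rows[0] holds column labels and the first cell of every
    other row is a row label; labels are ignored when testing emptiness.
    """
    header, body = rows[0], rows[1:]
    # Keep only rows with at least one non-empty data cell.
    body = [r for r in body if any(c.strip() for c in r[1:])]
    # Keep only data columns that are non-empty in at least one kept row.
    keep = [j for j in range(1, len(header)) if any(r[j].strip() for r in body)]
    if keep_headers:
        cleaned = [[header[0]] + [header[j] for j in keep]]
        cleaned += [[r[0]] + [r[j] for j in keep] for r in body]
    else:
        cleaned = [[r[j] for j in keep] for r in body]
    return cleaned


def read_csv(text):
    """Parse CSV text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))


# The example table from the TableCleaner description.
raw = ",A,B,C\n1,,,\n2,,,\n3,,,\n4,,col_a,col_b\n5,,1.5,test\n"
cleaned = clean_table(read_csv(raw))
for row in cleaned:
    print(",".join(row))
# prints:
# ,B,C
# 4,col_a,col_b
# 5,1.5,test
```

Keeping the original labels (`B`, `C`, `4`, `5`) rather than renumbering makes it possible to trace cleaned cells back to their positions in the source sheet, which seems to be the point of the optional header retention.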