Why is Stable Sort Required in SortPreservingMerge? #12430
Unanswered
berkaysynnada
asked this question in
Q&A
Replies: 1 comment
-
for our usecase in InfluxData the order of files is related to insert order and thus we have a specific need to have the sort be stable (so that values for certain streams come before others with the same values) It seems very reasonable for this to be some sort of option / configuration so those who don't need a stable sort could potentially use unstable and faster sorting. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
datafusion/datafusion/physical-plan/src/sorts/sort_preserving_merge.rs
Lines 48 to 67 in 3ece7a7
The SortPreservingMerge implementation notes that "
Note Stable Sort: the merged stream places equal rows from stream 1
" (there is a tie between partition values). Why is this stable sorting necessary when partitions tie?We observed that breaking this contract can reduce latencies, especially with uneven source throughput. Could this be made configurable?
@alamb perhaps you could have some idea about that?
Beta Was this translation helpful? Give feedback.
All reactions