optimize createFpAndTrim with shrinkMutableByteArray# #576
It looks like GHC indeed behaves poorly when shrinking large |
@@ -1023,6 +1028,12 @@ memcpy p q s = void $ c_memcpy p q (fromIntegral s)
memcpyFp :: ForeignPtr Word8 -> ForeignPtr Word8 -> Int -> IO ()
memcpyFp fp fq s = unsafeWithForeignPtr fp $ \p ->
                     unsafeWithForeignPtr fq $ \q -> memcpy p q s

shrinkFp :: ForeignPtr Word8 -> Int -> IO ()
shrinkFp (ForeignPtr _ (PlainPtr marr)) (I# l#) =
I'd write `shrinkFp _ _ = pure ()` instead of `error`, but otherwise looks sane to me as is.
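To make the suggestion concrete, here is a minimal sketch of what a complete `shrinkFp` with the `pure ()` fallback could look like. The body of the first equation is not visible in the excerpt above, so the call to `shrinkMutableByteArray#` is an assumption based on the PR title, and `plainSize` is a hypothetical helper added only to observe the effect.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Data.Word (Word8)
import GHC.Exts (Int (I#), getSizeofMutableByteArray#, shrinkMutableByteArray#)
import GHC.ForeignPtr
  ( ForeignPtr (ForeignPtr)
  , ForeignPtrContents (PlainPtr)
  , mallocPlainForeignPtrBytes
  )
import GHC.IO (IO (IO))

-- | Shrink the buffer behind a ForeignPtr in place.
-- Assumption: only PlainPtr-backed buffers can be shrunk; any other
-- kind of ForeignPtr is left untouched instead of raising an error.
shrinkFp :: ForeignPtr Word8 -> Int -> IO ()
shrinkFp (ForeignPtr _ (PlainPtr marr)) (I# l#) =
  IO $ \s -> case shrinkMutableByteArray# marr l# s of
    s' -> (# s', () #)
shrinkFp _ _ = pure ()

-- | Hypothetical helper (not part of the PR): report the current size
-- of a PlainPtr-backed buffer, or Nothing for other ForeignPtr kinds.
plainSize :: ForeignPtr Word8 -> IO (Maybe Int)
plainSize (ForeignPtr _ (PlainPtr marr)) =
  IO $ \s -> case getSizeofMutableByteArray# marr s of
    (# s', n# #) -> (# s', Just (I# n#) #)
plainSize _ = pure Nothing

-- | Allocate a pinned buffer, shrink it, and return the sizes seen
-- before and after the shrink.
shrinkDemo :: Int -> Int -> IO (Maybe Int, Maybe Int)
shrinkDemo allocated used = do
  fp <- mallocPlainForeignPtrBytes allocated
  before <- plainSize fp
  shrinkFp fp used
  after <- plainSize fp
  pure (before, after)
```

With the fallback equation, a caller that happens to pass a non-`PlainPtr` buffer simply keeps the over-allocated buffer, which is safe as long as the resulting `BS` records the trimmed length.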
if assert (0 <= l' && l' <= l) $ l' >= l
  then return $! BS fp l
  else
    if l < 4096
This condition is the main tricky bit.
- Large `MutableByteArray#`s do not return their memory to the RTS when we shrink them in-place. But they do return their memory to the RTS when we make a copy.
- Small pinned `MutableByteArray#`s may well retain all of their memory regardless of whether we shrink in place or create a copy, because of how they are currently allocated in GHC. (There are probably several GHC issues related to this topic. See for example this one and this one. It's not clear if or when this will be improved.)
So we want to make a copy when:
- We expect the underlying buffer to be a large heap object, and
- the amount of space shrinking would leak is large enough to care about.
If I recall correctly, the exact threshold for when a buffer is allocated as a large heap object depends on both the platform and the GHC version, but has been somewhere around 3K for 64-bit systems and half that for 32-bit systems for some time now.
(This reasoning should be documented in the code.)
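As a starting point for that documentation, the decision could be written down as a small pure function. This is only a sketch: `TrimAction`, `copyThreshold`, and `chooseTrimAction` are hypothetical names, the 4096 cutoff is taken from the `l < 4096` check in the diff, and which branch shrinks and which copies is an assumption, since the excerpt does not show the branch bodies. It mirrors the single check in the diff and does not model how much space would be leaked.

```haskell
-- | What to do with an over-allocated buffer (hypothetical type).
data TrimAction
  = KeepAsIs      -- nothing to trim
  | ShrinkInPlace -- buffer is small: a copy would not free anything extra
  | CopyToFresh   -- buffer is likely a large heap object: a copy lets the
                  -- RTS reclaim the unused tail
  deriving (Eq, Show)

-- | Cutoff taken from the `l < 4096` check in the diff. The real
-- large-object threshold depends on the platform and GHC version.
copyThreshold :: Int
copyThreshold = 4096

-- | Pick an action from the allocated size and the size actually used.
chooseTrimAction :: Int -> Int -> TrimAction
chooseTrimAction allocated used
  | used >= allocated         = KeepAsIs
  | allocated < copyThreshold = ShrinkInPlace
  | otherwise                 = CopyToFresh
```

Keeping the threshold in one named constant would also give the explanatory comment an obvious place to live.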
@oberblastmeister could you please rebase?
Addresses #549