Memory usage keeps increasing on md5sum compute #502

Open
shashi-banger opened this issue Sep 6, 2023 · 10 comments
Labels: bug (Something isn't working)

Comments

shashi-banger commented Sep 6, 2023

Mountpoint for Amazon S3 version

mount-s3 1.0.1-unofficial+7643a22

AWS Region

us-east-1

Describe the running environment

Running in a Docker container on a local PC. Also experienced OOMKilled when running as a pod on AWS EKS.

What happened?

  • Docker started in privileged mode
  • Mount of the S3 bucket was successful
  • When executing md5sum on a 10GB file and monitoring docker stats, memory usage keeps increasing steadily to 2GB and above
  • Also tried chunk-by-chunk processing using the following Python code snippet:
import hashlib

def generate_file_md5(filepath):
    # Read the file in 6 MiB chunks so only one buffer is held in memory at a time.
    bufsize = 2**20 * 6
    buf = bytearray(bufsize)
    bufview = memoryview(buf)
    md5_hash = hashlib.md5()
    with open(filepath, 'rb', buffering=0) as f:
        while True:
            nread = f.readinto(bufview)
            if not nread:
                break
            md5_hash.update(bufview[:nread])
    return md5_hash.hexdigest()

The same behaviour occurs when running the above Python code as well.

Relevant log output

CONTAINER ID   NAME        CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O    PIDS
5363c739f1c3   keen_pike   20.20%    1.071GiB / 7.772GiB   13.78%    13.8GB / 108MB   0B / 270kB   16
@shashi-banger shashi-banger added the bug Something isn't working label Sep 6, 2023
@shashi-banger (Author)

Attaching debug logs for reference

mountpoint-s3-2023-09-06T04-03-12Z.log

passaro (Contributor) commented Sep 8, 2023

Hi, thank you for the feedback. Consider that Mountpoint is optimized for reading large files sequentially and can prefetch data when it detects a sequential read pattern to improve throughput. This may have an impact on the memory usage, depending on the specific access pattern of different applications.

In your use cases with md5sum and the python script, we expect Mountpoint to see mostly sequential reads and start prefetching increasingly large chunks of data, currently up to a maximum size of 2GB. That could explain the behavior you are seeing.

Also, not sure if it's relevant, but we have an open issue around how dropped GetObject requests (e.g. on out-of-order reads) are handled: #510. It may be worth tracking that and re-running your workflow once it is fixed.
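
To make that ramp-up concrete, here is a rough, self-contained Rust sketch of how request sizes could grow while a 10GB file is read sequentially. Only the 2GB cap comes from the comment above; the 1 MiB starting size and the doubling factor are assumptions chosen for illustration, not Mountpoint's actual defaults.

fn main() {
    // Illustration only: the 2 GiB cap is taken from the discussion above; the
    // 1 MiB starting size and the doubling growth are assumed for this sketch.
    let max_request_size: u64 = 2 * 1024 * 1024 * 1024; // 2 GiB cap
    let mut request_size: u64 = 1024 * 1024; // assumed initial request size
    let mut total_fetched: u64 = 0;
    let file_size: u64 = 10 * 1024 * 1024 * 1024; // the 10GB file from the report

    while total_fetched < file_size {
        total_fetched += request_size;
        println!(
            "prefetch request: {:>4} MiB (cumulative: {:>5} MiB)",
            request_size / (1024 * 1024),
            total_fetched / (1024 * 1024)
        );
        // Assume the next sequential request roughly doubles, up to the cap.
        request_size = (request_size * 2).min(max_request_size);
    }
}

With numbers like these, a single in-flight request alone approaches the 2GB cap once the read has been sequential for a while, which lines up with the steadily growing memory reported in docker stats.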

@shashi-banger (Author)

Thank you for the response. Please consider whether it is worth adding a command line option or some configuration to limit the maximum prefetch size. This would allow users to set memory limits for a container more reliably.

Maybe the #510 fix will help. Will retry once it is fixed.

CrawX commented Jan 24, 2024

I second that it would be very convenient to have an option to limit the memory usage for scenarios where memory availability is limited. I understand that this will likely impact performance, but that's still better than getting OOM killed.

If I wanted to modify that behavior to preload only 128MiB, for example, I'd need to modify the constants here, right?

max_request_size: 2 * 1024 * 1024 * 1024,

jamesbornholt (Member)

@CrawX yes, that's the constant you'd want to modify to scale back the prefetcher's aggressiveness.

We're currently looking into a more comprehensive way to limit memory usage; hope to have more to share on that soon!
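
For illustration, here is a minimal, self-contained Rust sketch of the kind of change being discussed. The struct below is a stand-in written for this sketch; in Mountpoint itself the max_request_size field lives in the prefetcher's configuration, and its exact location and surrounding fields may differ between versions.

// Stand-in config struct for this sketch; not Mountpoint's real definition.
struct PrefetcherConfig {
    max_request_size: u64,
}

fn main() {
    // Default behaviour discussed in this thread: requests may grow up to 2 GiB.
    let default_config = PrefetcherConfig { max_request_size: 2 * 1024 * 1024 * 1024 };
    // Patched build: cap requests at 128 MiB to bound per-stream prefetch memory.
    let patched_config = PrefetcherConfig { max_request_size: 128 * 1024 * 1024 };

    println!("default cap: {} MiB", default_config.max_request_size / (1024 * 1024));
    println!("patched cap: {} MiB", patched_config.max_request_size / (1024 * 1024));
}

The trade-off is the one CrawX mentions: a smaller cap bounds memory but reduces the throughput benefit of large sequential prefetches.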

eminden commented Jul 18, 2024

I came across this issue while searching for excessive memory usage. My use case is reading 100 large files sequentially and concurrently. I can confirm that updating max_request_size to 64 MB dramatically reduces the memory usage. But this requires a recompile; is there any plan to make this configurable?
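
A back-of-envelope Rust sketch of why a smaller cap matters so much in this use case. It assumes, pessimistically and purely for illustration, that each concurrent sequential stream can buffer roughly one full max_request_size window at a time; the real prefetcher's buffering behaviour may differ.

fn main() {
    // Rough estimate only: assumes each of the 100 concurrent sequential
    // streams can hold about one max_request_size worth of prefetched data.
    let streams: u64 = 100;
    let default_cap: u64 = 2 * 1024 * 1024 * 1024; // 2 GiB default discussed above
    let patched_cap: u64 = 64 * 1024 * 1024;       // 64 MiB patched value

    let gib = |bytes: u64| bytes as f64 / (1024.0 * 1024.0 * 1024.0);
    println!("worst case with default cap: ~{:.0} GiB", gib(streams * default_cap));
    println!("worst case with patched cap: ~{:.1} GiB", gib(streams * patched_cap));
}

Even if the real per-stream overhead is only a fraction of these worst-case figures, the gap between the two caps is consistent with the dramatic reduction reported.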

@dannycjones (Contributor)

> I came across this issue while searching for excessive memory usage. My use case is reading 100 large files sequentially and concurrently. I can confirm that updating max_request_size to 64 MB dramatically reduces the memory usage. But this requires a recompile; is there any plan to make this configurable?

We don't currently plan to expose this as a configuration. Instead, we're working on improvements that will allow Mountpoint to automatically scale down the amount of prefetching based on available resources. I don't have a date I can share for when this will be completed, but the work is ongoing, and I hope to be able to share more news soon. (The most recent change, refactoring prefetching in preparation for this work: #980.)

Sorry for the delay in responding here!

eminden commented Aug 21, 2024

Thanks @dannycjones, this is great news. As a workaround, we are using Mountpoint with a patched max_request_size value for now. I'll be waiting for your work to be completed.

@dannycjones (Contributor)

I've created this issue, which is where we'll share updates on the automatic prefetcher scaling: #987.

unexge (Contributor) commented Oct 15, 2024

Mountpoint v1.10.0 has been released with some prefetcher improvements that might reduce memory usage. Could you please try upgrading to see if it improves things for you?
