Introduce new ObjectPart type
The `ObjectPart` type represents a slice of the content of an S3 object. It contains the information required to identify the object it belongs to and its offset within that object. It also maintains checksums so that its integrity can be validated. The new type is used in the prefetcher and replaces both `ChecksummedBytes` and `Part`.

Signed-off-by: Alessandro Passaro <[email protected]>
passaro committed Nov 28, 2023
1 parent 99d5277 commit 9cf3ebb
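
As a rough illustration of what the commit message describes, the sketch below models a part that carries its object identity, its offset, and a checksum over its bytes. This is not the code from the commit: the field names, the `PartError` enum, the `into_bytes` signature, and the toy FNV-1a checksum are all assumptions chosen to keep the example self-contained.

```rust
/// Hypothetical identity of an S3 object: its key plus the ETag of one version.
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct ObjectId {
    pub key: String,
    pub etag: String,
}

/// Hypothetical sketch of a checksummed slice of an object's content.
#[derive(Clone, Debug)]
pub struct ObjectPart {
    id: ObjectId,
    offset: u64,
    bytes: Vec<u8>,
    checksum: u32,
}

/// Hypothetical errors returned when a part fails validation.
#[derive(Debug, PartialEq, Eq)]
pub enum PartError {
    WrongObject,
    WrongOffset,
    ChecksumMismatch,
}

/// Toy 32-bit FNV-1a hash, standing in for the real checksum algorithm.
fn toy_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0x811c_9dc5_u32, |hash, byte| (hash ^ u32::from(*byte)).wrapping_mul(0x0100_0193))
}

impl ObjectPart {
    /// Build a part and record a checksum over its bytes.
    pub fn new(id: ObjectId, offset: u64, bytes: Vec<u8>) -> Self {
        let checksum = toy_checksum(&bytes);
        Self { id, offset, bytes, checksum }
    }

    /// Return the bytes only if the part belongs to the expected object,
    /// starts at the expected offset, and still matches its checksum.
    pub fn into_bytes(self, expected_id: &ObjectId, expected_offset: u64) -> Result<Vec<u8>, PartError> {
        if &self.id != expected_id {
            return Err(PartError::WrongObject);
        }
        if self.offset != expected_offset {
            return Err(PartError::WrongOffset);
        }
        if toy_checksum(&self.bytes) != self.checksum {
            return Err(PartError::ChecksumMismatch);
        }
        Ok(self.bytes)
    }
}
```

Bundling the identity and offset with the data means a consumer can reject a part that belongs to a different object version (a different ETag) or a different position, rather than relying on the caller to track that separately.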
Showing 14 changed files with 919 additions and 870 deletions.
458 changes: 2 additions & 456 deletions mountpoint-s3/src/checksums.rs

Large diffs are not rendered by default.

28 changes: 7 additions & 21 deletions mountpoint-s3/src/data_cache.rs
@@ -8,20 +8,12 @@ mod cache_directory;
 mod disk_data_cache;
 mod in_memory_data_cache;

-use mountpoint_s3_client::types::ETag;
 use thiserror::Error;

-pub use crate::checksums::ChecksummedBytes;
 pub use crate::data_cache::cache_directory::ManagedCacheDir;
 pub use crate::data_cache::disk_data_cache::{CacheLimit, DiskDataCache, DiskDataCacheConfig};
 pub use crate::data_cache::in_memory_data_cache::InMemoryDataCache;
-
-/// Struct representing a key for accessing an entry in a [DataCache].
-#[derive(Clone, Debug, Hash, PartialEq, Eq)]
-pub struct CacheKey {
-    pub s3_key: String,
-    pub etag: ETag,
-}
+use crate::part::{ObjectId, ObjectPart};

 /// Indexes blocks within a given object.
 pub type BlockIndex = u64;
@@ -44,26 +36,20 @@ pub type DataCacheResult<Value> = Result<Value, DataCacheError>;
 /// Data cache for fixed-size checksummed buffers.
 ///
 /// TODO: Deletion and eviction of cache entries.
-/// TODO: Some version information (ETag) independent from [CacheKey] to allow smarter eviction?
+/// TODO: Some version information (ETag) independent from [ObjectId] to allow smarter eviction?
 pub trait DataCache {
-    /// Get block of data from the cache for the given [CacheKey] and [BlockIndex], if available.
+    /// Get block of data from the cache for the given [ObjectId] and [BlockIndex], if available.
     ///
     /// Operation may fail due to errors, or return [None] if the block was not available in the cache.
     fn get_block(
         &self,
-        cache_key: &CacheKey,
+        cache_key: &ObjectId,
         block_idx: BlockIndex,
         block_offset: u64,
-    ) -> DataCacheResult<Option<ChecksummedBytes>>;
+    ) -> DataCacheResult<Option<ObjectPart>>;

-    /// Put block of data to the cache for the given [CacheKey] and [BlockIndex].
-    fn put_block(
-        &self,
-        cache_key: CacheKey,
-        block_idx: BlockIndex,
-        block_offset: u64,
-        bytes: ChecksummedBytes,
-    ) -> DataCacheResult<()>;
+    /// Put block of data to the cache.
+    fn put_block(&self, bytes: ObjectPart, block_idx: BlockIndex) -> DataCacheResult<()>;

     /// Returns the block size for the data cache.
     fn block_size(&self) -> u64;
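
To show how the reshaped trait might be implemented and called, here is a minimal, self-contained sketch. The trait signatures follow the diff above. Everything else is a simplified stand-in: the `ObjectId` and `ObjectPart` fields, the single `DataCacheError` variant, the `ToyCache` type, and the rule that a block's offset must equal `block_idx * block_size` are assumptions, not the repository's actual code.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

pub type BlockIndex = u64;

/// Simplified stand-in for the real `ObjectId` (assumed: key plus ETag).
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct ObjectId {
    pub key: String,
    pub etag: String,
}

/// Simplified stand-in for the real `ObjectPart` (no checksums here).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ObjectPart {
    pub id: ObjectId,
    pub offset: u64,
    pub bytes: Vec<u8>,
}

/// Hypothetical error type with a single variant for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum DataCacheError {
    InvalidBlockOffset,
}

pub type DataCacheResult<Value> = Result<Value, DataCacheError>;

/// Trait with the signatures introduced by the diff.
pub trait DataCache {
    fn get_block(
        &self,
        cache_key: &ObjectId,
        block_idx: BlockIndex,
        block_offset: u64,
    ) -> DataCacheResult<Option<ObjectPart>>;

    fn put_block(&self, bytes: ObjectPart, block_idx: BlockIndex) -> DataCacheResult<()>;

    fn block_size(&self) -> u64;
}

/// Hypothetical in-memory cache keyed by object identity and block index.
pub struct ToyCache {
    block_size: u64,
    blocks: Mutex<HashMap<(ObjectId, BlockIndex), ObjectPart>>,
}

impl ToyCache {
    pub fn new(block_size: u64) -> Self {
        Self { block_size, blocks: Mutex::new(HashMap::new()) }
    }
}

impl DataCache for ToyCache {
    fn get_block(
        &self,
        cache_key: &ObjectId,
        block_idx: BlockIndex,
        block_offset: u64,
    ) -> DataCacheResult<Option<ObjectPart>> {
        // Assumed rule: blocks are aligned to multiples of the block size.
        if block_offset != block_idx * self.block_size {
            return Err(DataCacheError::InvalidBlockOffset);
        }
        let blocks = self.blocks.lock().unwrap();
        Ok(blocks.get(&(cache_key.clone(), block_idx)).cloned())
    }

    fn put_block(&self, bytes: ObjectPart, block_idx: BlockIndex) -> DataCacheResult<()> {
        // The key and offset now travel inside the part itself.
        if bytes.offset != block_idx * self.block_size {
            return Err(DataCacheError::InvalidBlockOffset);
        }
        let key = (bytes.id.clone(), block_idx);
        self.blocks.lock().unwrap().insert(key, bytes);
        Ok(())
    }

    fn block_size(&self) -> u64 {
        self.block_size
    }
}
```

Because the part already records which object it belongs to and where it starts, `put_block` no longer needs separate `cache_key` and `block_offset` arguments, which is why its signature shrinks to a single line in the diff.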