OAK-10341 Tree store #1577

Merged
merged 94 commits into from
Sep 6, 2024
Commits
9750a6f
OAK-10341 Tree store
thomasmueller Jul 11, 2024
72ab6b9
OAK-10341 Tree store
thomasmueller Jul 11, 2024
d49b6c7
OAK-10341 Tree store
thomasmueller Jul 11, 2024
2404d14
OAK-10341 Tree store
thomasmueller Jul 12, 2024
1761ddb
OAK-10341 Tree store
thomasmueller Jul 12, 2024
3d6d009
OAK-10944: oak-auth-ldap: update commons-pool2 dependency to 2.12.0 (…
reschke Jul 11, 2024
dc10a12
OAK-10705: oak-standalone: update dependencies (#1411)
mbaedke Jul 11, 2024
f12ccb1
OAK-10905 | Add a configurable async checkpoint creator service (#1560)
nit0906 Jul 12, 2024
7f112fa
OAK-10905 | Add license header to AsyncCheckpointService (#1579)
nit0906 Jul 12, 2024
ba530fe
OAK-10848: commons: remove use of slf4j.event.Level in SystemProperty…
reschke Jul 15, 2024
36cfcfa
OAK-10685: remove unused import of java.io.UnsupportedEncodingException
reschke Jul 16, 2024
cafbd29
OAK-10949: blob-cloud, segment-aws: update aws SDK to 1.12.761 (depen…
reschke Jul 18, 2024
b098fd8
OAK-10954: Update spotbugs plugin to 4.8.6.2 (#1588)
reschke Jul 18, 2024
5fb55e0
OAK-10959: webapp: update Tomcat dependency to 9.0.90 (#1589)
reschke Jul 18, 2024
cd35db7
OAK-10960: blob-cloud, segment: update netty version to 4.1.111 (#1590)
reschke Jul 19, 2024
5d56db1
OAK-10945: Remove usage of Guava Function interface (#1578)
reschke Jul 19, 2024
95ed6c4
OAK-10962: oak-solr-osgi: update zookeeper dependency to 3.9.2 (#1591)
reschke Jul 20, 2024
8db9183
Update build.yml to disable Sonar for now
reschke Jul 20, 2024
8b05cae
OAK-10965 - Make ConsoleIndexingReporter thread safe. (#1592)
nfsantos Jul 22, 2024
6c44805
OAK-6762: Convert oak-blob to OSGi R7 annotations (#1413)
mbaedke Jul 22, 2024
0521c63
OAK-6773: Convert oak-store-composite to OSGi R7 annotations (#1489)
mbaedke Jul 22, 2024
ebefe01
OAK-10951 - Add a new configuration property (#1594)
nfsantos Jul 23, 2024
b37db4c
OAK-10966 - Avoid object creation in PathUtils.isAncestor (#1596)
nfsantos Jul 23, 2024
9528fdd
OAK-10964: bump nimbus-jose-jwt dependency to latest (#1593)
t-rana Jul 23, 2024
818317c
OAK-6773: Convert oak-store-composite to OSGi R7 annotations - fix li…
reschke Jul 23, 2024
25792e7
OAK-10803 -- compress/uncompress property, disabled by default (#1526)
ionutzpi Jul 24, 2024
aefb990
OAK-10974 and OAK-10869 : temporarily disabling flaky tests
stefan-egli Jul 24, 2024
c416850
OAK-10971 - Add a method to test if a path is a direct ancestor of an…
nfsantos Jul 25, 2024
5dd6344
OAK-10803: fix NPE when Mongo is unavailable, remove '*' imports (#1601)
reschke Jul 25, 2024
e411960
OAK-10976 - Avoid unnecessary call to PathUtils.getName in IndexDefin…
nfsantos Jul 26, 2024
632a15b
OAK-10904: Close token refresh executor service after access token is…
t-rana Jul 26, 2024
ce5c7df
OAK-10977 - Cleanup IndexDefinition class (#1604)
nfsantos Jul 29, 2024
5814638
OAK-10978 - Skip Azure compaction when there's not enough garbage in …
dulceanu Jul 29, 2024
8a72ef8
OAK-10966 - Indexing job: create optimized version of PersistedLinked…
nfsantos Jul 29, 2024
560cc48
Revert "OAK-10966 - Indexing job: create optimized version of Persist…
thomasmueller Jul 30, 2024
992e532
Revert "OAK-10978 - Skip Azure compaction when there's not enough gar…
thomasmueller Jul 30, 2024
5af1306
Revert "OAK-10977 - Cleanup IndexDefinition class (#1604)"
thomasmueller Jul 30, 2024
c57579f
Revert "OAK-10904: Close token refresh executor service after access …
thomasmueller Jul 30, 2024
c2be5ba
Revert "OAK-10976 - Avoid unnecessary call to PathUtils.getName in In…
thomasmueller Jul 30, 2024
329b751
Revert "OAK-10803: fix NPE when Mongo is unavailable, remove '*' impo…
thomasmueller Jul 30, 2024
46b91a6
Revert "OAK-10971 - Add a method to test if a path is a direct ancest…
thomasmueller Jul 30, 2024
aea743c
Revert "OAK-10974 and OAK-10869 : temporarily disabling flaky tests"
thomasmueller Jul 30, 2024
1985a5f
Revert "OAK-10803 -- compress/uncompress property, disabled by defaul…
thomasmueller Jul 30, 2024
de1d34c
Revert "OAK-6773: Convert oak-store-composite to OSGi R7 annotations …
thomasmueller Jul 30, 2024
b0df159
Revert "OAK-10964: bump nimbus-jose-jwt dependency to latest (#1593)"
thomasmueller Jul 30, 2024
bddb5cc
Revert "OAK-10966 - Avoid object creation in PathUtils.isAncestor (#1…
thomasmueller Jul 30, 2024
44c72d8
Revert "OAK-10951 - Add a new configuration property (#1594)"
thomasmueller Jul 30, 2024
d744465
Revert "OAK-6773: Convert oak-store-composite to OSGi R7 annotations …
thomasmueller Jul 30, 2024
87d4d92
Revert "OAK-6762: Convert oak-blob to OSGi R7 annotations (#1413)"
thomasmueller Jul 30, 2024
d49014d
Revert "OAK-10965 - Make ConsoleIndexingReporter thread safe. (#1592)"
thomasmueller Jul 30, 2024
501780a
Revert "Update build.yml to disable Sonar for now"
thomasmueller Jul 30, 2024
474745f
Revert "OAK-10962: oak-solr-osgi: update zookeeper dependency to 3.9.…
thomasmueller Jul 30, 2024
9585fd9
Revert "OAK-10945: Remove usage of Guava Function interface (#1578)"
thomasmueller Jul 30, 2024
28acbac
Revert "OAK-10960: blob-cloud, segment: update netty version to 4.1.1…
thomasmueller Jul 30, 2024
52fa899
Revert "OAK-10959: webapp: update Tomcat dependency to 9.0.90 (#1589)"
thomasmueller Jul 30, 2024
063c26b
Revert "OAK-10954: Update spotbugs plugin to 4.8.6.2 (#1588)"
thomasmueller Jul 30, 2024
50e951f
Revert "OAK-10949: blob-cloud, segment-aws: update aws SDK to 1.12.76…
thomasmueller Jul 30, 2024
2128367
Revert "OAK-10685: remove unused import of java.io.UnsupportedEncodin…
thomasmueller Jul 30, 2024
6ea10b7
Revert "OAK-10848: commons: remove use of slf4j.event.Level in System…
thomasmueller Jul 30, 2024
73ce197
Revert "OAK-10905 | Add license header to AsyncCheckpointService (#1…
thomasmueller Jul 30, 2024
556d346
Revert "OAK-10905 | Add a configurable async checkpoint creator serv…
thomasmueller Jul 30, 2024
de30c04
Revert "OAK-10705: oak-standalone: update dependencies (#1411)"
thomasmueller Jul 30, 2024
1ec3ad8
Revert "OAK-10944: oak-auth-ldap: update commons-pool2 dependency to …
thomasmueller Jul 30, 2024
0e464a6
Merge branch 'trunk' into OAK-10341b
thomasmueller Jul 30, 2024
86cc399
Merge branch 'trunk' into OAK-10341b
thomasmueller Jul 30, 2024
9c829f3
OAK-10341 Tree store (bugfix for loggers)
thomasmueller Jul 30, 2024
cca4e59
OAK-10341 Tree store (PipelinedTreeStoreStrategy.java)
thomasmueller Aug 2, 2024
5bf6a11
OAK-10341 Tree store (tests)
thomasmueller Aug 5, 2024
db74b33
Merge branch 'trunk' into OAK-10341b
thomasmueller Aug 5, 2024
cf8c348
OAK-10341 Tree store (use less memory)
thomasmueller Aug 5, 2024
e325f51
OAK-10341 Tree store (fix memory cache calculation)
thomasmueller Aug 6, 2024
7b3d10d
OAK-10341 Tree store (blob prefetch)
thomasmueller Aug 7, 2024
bc04a80
OAK-10341 Tree store (blob prefetch)
thomasmueller Aug 7, 2024
b242328
OAK-10341 Tree store (blob prefetch)
thomasmueller Aug 9, 2024
962dab6
OAK-10341 Tree store (node prefetch)
thomasmueller Aug 14, 2024
1fa3f20
Merge trunk
thomasmueller Aug 14, 2024
927aafc
OAK-10341 Tree store (incremental)
thomasmueller Aug 15, 2024
144fde8
OAK-10341 Tree store (pack files)
thomasmueller Aug 16, 2024
c67ed65
Merge branch 'trunk' into OAK-10341b
thomasmueller Aug 16, 2024
64639a5
OAK-10341 Tree store (pack files)
thomasmueller Aug 16, 2024
a32bf44
OAK-10341 Tree store (tests)
thomasmueller Aug 20, 2024
fb515ef
OAK-10341 Tree store (incremental)
thomasmueller Aug 22, 2024
a884a87
Merge branch 'trunk' into OAK-10341b
thomasmueller Aug 22, 2024
7a31307
OAK-10341 Tree store (compress)
thomasmueller Aug 23, 2024
6776860
OAK-10341 Tree store (traverse included paths)
thomasmueller Aug 26, 2024
7f80b98
OAK-10341 Tree store
thomasmueller Aug 27, 2024
822d0e9
OAK-10341 Tree store
thomasmueller Aug 29, 2024
0d522d7
OAK-10341 Tree store
thomasmueller Aug 29, 2024
d312ca8
OAK-10341 Tree store
thomasmueller Aug 30, 2024
a1bf314
OAK-10341 Tree store
thomasmueller Sep 3, 2024
3d370e8
Merge branch 'trunk' into OAK-10341b
thomasmueller Sep 3, 2024
cee6866
OAK-10341 Tree store (use same blob prefetching configuration as for …
thomasmueller Sep 4, 2024
95cd63e
OAK-10341 Tree store (javadocs)
thomasmueller Sep 5, 2024
785a118
OAK-10341 Tree store (code review)
thomasmueller Sep 6, 2024
@@ -239,6 +239,16 @@ nodeStore, getMongoDocumentStore(), traversalLog))
         return storeList;
     }

+    public IndexStore buildTreeStore() throws IOException, CommitFailedException {
+        String old = System.setProperty(FlatFileNodeStoreBuilder.OAK_INDEXER_SORT_STRATEGY_TYPE,
+                FlatFileNodeStoreBuilder.SortStrategyType.PIPELINED_TREE.name());
+        try {
+            return buildFlatFileStore();
+        } finally {
+            System.setProperty(FlatFileNodeStoreBuilder.OAK_INDEXER_SORT_STRATEGY_TYPE, old);
+        }
+    }
+
     public IndexStore buildStore() throws IOException, CommitFailedException {
         return buildFlatFileStore();
     }
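Review note on buildTreeStore(): System.setProperty returns null when the property was not previously set, and it throws a NullPointerException when passed a null value, so the restore in the finally block above would fail in that case. Below is a minimal sketch of a null-safe save-and-restore. The class name, the helper withProperty, and the property key are hypothetical and used only for illustration; this is not the code merged in this PR.

```java
import java.util.function.Supplier;

public class PropertyRestoreSketch {
    // Hypothetical key, used only for this illustration.
    static final String KEY = "example.sortStrategyType";

    // Temporarily sets a system property, runs the action, then restores
    // the previous state (clearing the property if it was unset before).
    static <T> T withProperty(String key, String value, Supplier<T> action) {
        String old = System.setProperty(key, value);
        try {
            return action.get();
        } finally {
            if (old == null) {
                // setProperty(key, null) would throw a NullPointerException
                System.clearProperty(key);
            } else {
                System.setProperty(key, old);
            }
        }
    }

    public static void main(String[] args) {
        System.clearProperty(KEY);
        String seen = withProperty(KEY, "PIPELINED_TREE", () -> System.getProperty(KEY));
        System.out.println(seen);                    // prints "PIPELINED_TREE"
        System.out.println(System.getProperty(KEY)); // prints "null"
    }
}
```

The same pattern would also cover the case where the property was already set, because the old value is written back unchanged.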
@@ -82,7 +82,15 @@ private AheadOfTimeBlobDownloadingFlatFileStore(FlatFileStore ffs, CompositeInde
         }
     }

-    static boolean isEnabledForIndexes(String indexesEnabledPrefix, List<String> indexPaths) {
+    /**
+     * Whether blob downloading is needed for the given indexes.
+     *
+     * @param indexesEnabledPrefix the comma-separated list of prefixes of the index
+     * definitions that benefit from the download
+     * @param indexPaths the index paths
+     * @return true if any of the indexes start with any of the prefixes
+     */
+    public static boolean isEnabledForIndexes(String indexesEnabledPrefix, List<String> indexPaths) {
         List<String> enableForIndexes = splitAndTrim(indexesEnabledPrefix);
         for (String indexPath : indexPaths) {
             if (enableForIndexes.stream().anyMatch(indexPath::startsWith)) {
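Review note on isEnabledForIndexes(): the hunk above is cut off before the return statements, so here is a self-contained sketch of the prefix check that the Javadoc describes. The implementation of splitAndTrim is not shown in this diff; the version below (split on commas, trim, drop empty entries) is an assumption, as are the class name and the sample index paths.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixMatchSketch {
    // Assumed behavior of splitAndTrim: split on commas, trim each entry,
    // and drop empty entries. The real helper is not part of this diff.
    static List<String> splitAndTrim(String s) {
        return Arrays.stream(s.split(","))
                .map(String::trim)
                .filter(p -> !p.isEmpty())
                .collect(Collectors.toList());
    }

    // Returns true if any index path starts with any of the prefixes.
    static boolean isEnabledForIndexes(String prefixes, List<String> indexPaths) {
        List<String> enabled = splitAndTrim(prefixes);
        for (String indexPath : indexPaths) {
            if (enabled.stream().anyMatch(indexPath::startsWith)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/oak:index/lucene-1", "/oak:index/uuid");
        System.out.println(isEnabledForIndexes("/oak:index/lucene, /oak:index/foo", paths)); // prints "true"
        System.out.println(isEnabledForIndexes("", paths));                                  // prints "false"
    }
}
```

Dropping empty entries matters in this sketch: an empty prefix would match every path, because every string starts with the empty string.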
@@ -29,10 +29,14 @@
 import org.apache.jackrabbit.oak.index.indexer.document.CompositeException;
 import org.apache.jackrabbit.oak.index.indexer.document.CompositeIndexer;
 import org.apache.jackrabbit.oak.index.indexer.document.NodeStateEntryTraverserFactory;
+import org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined.ConfigHelper;
 import org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined.PipelinedStrategy;
+import org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined.PipelinedTreeStoreStrategy;
 import org.apache.jackrabbit.oak.index.indexer.document.indexstore.IndexStore;
 import org.apache.jackrabbit.oak.index.indexer.document.indexstore.IndexStoreSortStrategy;
 import org.apache.jackrabbit.oak.index.indexer.document.indexstore.IndexStoreUtils;
+import org.apache.jackrabbit.oak.index.indexer.document.tree.Prefetcher;
+import org.apache.jackrabbit.oak.index.indexer.document.tree.TreeStore;
 import org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore;
 import org.apache.jackrabbit.oak.plugins.document.RevisionVector;
 import org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore;
@@ -122,7 +126,11 @@ public enum SortStrategyType {
         /**
          * System property {@link #OAK_INDEXER_SORT_STRATEGY_TYPE} if set to this value would result in {@link PipelinedStrategy} being used.
          */
-        PIPELINED
+        PIPELINED,
+        /**
+         * System property {@link #OAK_INDEXER_SORT_STRATEGY_TYPE} if set to this value would result in {@link PipelinedTreeStoreStrategy} being used.
+         */
+        PIPELINED_TREE,
     }

     public FlatFileNodeStoreBuilder(File workDir) {
@@ -224,20 +232,52 @@ public IndexStore build(IndexHelper indexHelper, CompositeIndexer indexer) throw
         entryWriter = new NodeStateEntryWriter(blobStore);
         IndexStoreFiles indexStoreFiles = createdSortedStoreFiles();
         File metadataFile = indexStoreFiles.metadataFile;
-        FlatFileStore store = new FlatFileStore(blobStore, indexStoreFiles.storeFiles.get(0), metadataFile,
-                new NodeStateEntryReader(blobStore),
-                unmodifiableSet(preferredPathElements), algorithm);
+        File file = indexStoreFiles.storeFiles.get(0);
+        IndexStore store;
+        if (file.isDirectory()) {
+            store = buildTreeStoreForIndexing(indexHelper, file);
+        } else {
+            store = new FlatFileStore(blobStore, file, metadataFile,
+                    new NodeStateEntryReader(blobStore),
+                    unmodifiableSet(preferredPathElements), algorithm);
+        }
         if (entryCount > 0) {
             store.setEntryCount(entryCount);
         }
         if (indexer == null || indexHelper == null) {
             return store;
         }
-        if (withAheadOfTimeBlobDownloading) {
-            return AheadOfTimeBlobDownloadingFlatFileStore.wrap(store, indexer, indexHelper);
-        } else {
-            return store;
+        if (withAheadOfTimeBlobDownloading && store instanceof FlatFileStore) {
+            FlatFileStore ffs = (FlatFileStore) store;
+            return AheadOfTimeBlobDownloadingFlatFileStore.wrap(ffs, indexer, indexHelper);
         }
+        return store;
     }

+    public IndexStore buildTreeStoreForIndexing(IndexHelper indexHelper, File file) {
+        TreeStore indexingTreeStore = new TreeStore(
+                "indexing", file,
+                new NodeStateEntryReader(blobStore), 10);
+        indexingTreeStore.setIndexDefinitions(indexDefinitions);
+
+        // use a separate tree store (with a smaller cache)
+        // for prefetching, to avoid cache evictions
+        TreeStore prefetchTreeStore = new TreeStore(
+                "prefetch", file,
+                new NodeStateEntryReader(blobStore), 3);
+        prefetchTreeStore.setIndexDefinitions(indexDefinitions);
+        String blobPrefetchEnableForIndexes = ConfigHelper.getSystemPropertyAsString(
+                AheadOfTimeBlobDownloadingFlatFileStore.BLOB_PREFETCH_ENABLE_FOR_INDEXES_PREFIXES, "");
+        Prefetcher prefetcher = new Prefetcher(prefetchTreeStore, indexingTreeStore);
+        String blobSuffix = "";
+        if (AheadOfTimeBlobDownloadingFlatFileStore.isEnabledForIndexes(
+                blobPrefetchEnableForIndexes, indexHelper.getIndexPaths())) {
+            blobSuffix = ConfigHelper.getSystemPropertyAsString(
+                    AheadOfTimeBlobDownloadingFlatFileStore.BLOB_PREFETCH_BINARY_NODES_SUFFIX, "");
+        }
+        prefetcher.setBlobSuffix(blobSuffix);
+        prefetcher.startPrefetch();
+        return indexingTreeStore;
+    }
+
     public List<IndexStore> buildList(IndexHelper indexHelper, IndexerSupport indexerSupport,
@@ -351,15 +391,24 @@ IndexStoreSortStrategy createSortStrategy(File dir) {
                 log.warn("TraverseWithSortStrategy is deprecated and will be removed in the near future. Use PipelinedStrategy instead.");
                 return new TraverseWithSortStrategy(nodeStateEntryTraverserFactory, preferredPathElements, entryWriter, dir,
                         algorithm, pathPredicate, checkpoint);
-            case PIPELINED:
+            case PIPELINED: {
                 log.info("Using PipelinedStrategy");
                 List<PathFilter> pathFilters = indexDefinitions.stream().map(IndexDefinition::getPathFilter).collect(Collectors.toList());
                 List<String> indexNames = indexDefinitions.stream().map(IndexDefinition::getIndexName).collect(Collectors.toList());
                 indexingReporter.setIndexNames(indexNames);
                 return new PipelinedStrategy(mongoClientURI, mongoDocumentStore, nodeStore, rootRevision,
                         preferredPathElements, blobStore, dir, algorithm, pathPredicate, pathFilters, checkpoint,
                         statisticsProvider, indexingReporter);
-
+            }
+            case PIPELINED_TREE: {
+                log.info("Using PipelinedTreeStoreStrategy");
+                List<PathFilter> pathFilters = indexDefinitions.stream().map(IndexDefinition::getPathFilter).collect(Collectors.toList());
+                List<String> indexNames = indexDefinitions.stream().map(IndexDefinition::getIndexName).collect(Collectors.toList());
+                indexingReporter.setIndexNames(indexNames);
+                return new PipelinedTreeStoreStrategy(mongoClientURI, mongoDocumentStore, nodeStore, rootRevision,
+                        preferredPathElements, blobStore, dir, algorithm, pathPredicate, pathFilters, checkpoint,
+                        statisticsProvider, indexingReporter);
+            }
         }
         throw new IllegalStateException("Not a valid sort strategy value " + sortStrategyType);
     }
@@ -32,7 +32,7 @@
  * histogram are correct but if the histogram overflowed, it may be missing some entries.
  */
 public class BoundedHistogram {
-    private static final Logger LOG = LoggerFactory.getLogger(PipelinedStrategy.class);
+    private static final Logger LOG = LoggerFactory.getLogger(BoundedHistogram.class);
     private final ConcurrentHashMap<String, LongAdder> histogram = new ConcurrentHashMap<>();
     private volatile boolean overflowed = false;
     private final String histogramName;
@@ -22,7 +22,7 @@
 import org.slf4j.LoggerFactory;

 public class ConfigHelper {
-    private static final Logger LOG = LoggerFactory.getLogger(PipelinedStrategy.class);
+    private static final Logger LOG = LoggerFactory.getLogger(ConfigHelper.class);

     public static int getSystemPropertyAsInt(String name, int defaultValue) {
         int result = Integer.getInteger(name, defaultValue);
@@ -160,7 +160,7 @@ public Result call() throws Exception {
                 for (NodeDocument nodeDoc : nodeDocumentBatch) {
                     statistics.incrementMongoDocumentsTraversed();
                     mongoObjectsProcessed++;
-                    if (mongoObjectsProcessed % 50000 == 0) {
+                    if (mongoObjectsProcessed % 50_000 == 0) {
                         LOG.info("Mongo objects: {}, total entries: {}, current batch: {}, Size: {}/{} MB",
                                 mongoObjectsProcessed, totalEntryCount, nseBatch.numberOfEntries(),
                                 nseBatch.sizeOfEntriesBytes() / FileUtils.ONE_MB,