Make it possible to determine WAL format at runtime #304

Open
wants to merge 56 commits into
base: REL_15_STABLE_neon

lubennikovaav and others added 30 commits August 9, 2023 15:31
Most significant changes are:
- `xlog.c` refactoring - some code was moved to `xlogreader.c` and `xlogprefetcher.c`.
- `ThisTimeLineID` refactoring (4a92a1c and e997a0c), which affects walproposer code
- `XLogFileInit` refactoring: multiple commits changed the function signature.
- Resolve Neon-specific initdb and pg_waldump options that conflict with the ones from PostgreSQL.
* Move backpressure throttling implementation to neon extension and add function for monitoring throttling time

* Update src/include/miscadmin.h

Co-authored-by: Heikki Linnakangas <[email protected]>

Disabled by default. The plan is to merge this now, so that we can do
performance testing quickly, and if it helps, rewrite and review it
properly.

Author: Konstantin Knizhnik
Commit a703269 replaced $(INSTALL) with plain "cp" for installing the
server header files. It sped up "make install" significantly, because
the old logic called $(INSTALL) separately for every header file,
whereas plain "cp" could copy all the files in one command. However, we
have long since made it a requirement that $(INSTALL) can also install
multiple files in one command, see commit f1c5247. Switch back to
$(INSTALL).

Discussion: https://www.postgresql.org/message-id/200503252305.j2PN52m23610%40candle.pha.pa.us
Discussion: https://www.postgresql.org/message-id/2415283.1641852217%40sss.pgh.pa.us
to support only extensions that were built against Neon PostgreSQL
Neon generates PG_VERSION files in one format: just the major version number, without a newline. Be consistent with it.
No need to perform WAL recovery in Neon

Co-authored-by: Konstantin Knizhnik <[email protected]>
…ion because spec_token is not wal logged (#223)

* Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not WAL-logged

refer #2587

* Update src/backend/access/heap/heapam.c

Co-authored-by: Heikki Linnakangas <[email protected]>

* Fix shared memory initialization for last written LSN cache

Replace (from,till) with (from,n_blocks) for SetLastWrittenLSNForBlockRange function

* Fast exit from SetLastWrittenLSNForBlockRange for n_blocks == 0
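
The change from a `(from, till)` pair to `(from, n_blocks)` can be illustrated with a small sketch. This is not the Neon implementation: the names `toy_set_last_written_lsn_for_range`, `toy_block_lsn` and `TOY_CACHE_BLOCKS` are hypothetical, and the per-block array stands in for whatever cache the real function updates. It only shows why a count-based range makes the empty case (`n_blocks == 0`) easy to detect and exit from early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;      /* stand-in for an LSN type (assumption) */
typedef uint32_t ToyBlockNo;  /* stand-in for a block number type (assumption) */

#define TOY_CACHE_BLOCKS 64   /* hypothetical fixed-size per-block table */

static ToyLsn toy_block_lsn[TOY_CACHE_BLOCKS];

/*
 * Record `lsn` as the last written LSN for blocks [from, from + n_blocks).
 * Returns the number of blocks whose stored LSN actually advanced.
 */
static size_t
toy_set_last_written_lsn_for_range(ToyLsn lsn, ToyBlockNo from, ToyBlockNo n_blocks)
{
    size_t advanced = 0;

    /* Fast exit: an empty range needs no work at all. */
    if (n_blocks == 0)
        return 0;

    for (ToyBlockNo i = 0; i < n_blocks; i++)
    {
        ToyBlockNo blk = from + i;

        if (blk >= TOY_CACHE_BLOCKS)
            break;              /* outside the toy table; ignore the rest */

        /* Stored LSNs only move forward. */
        if (toy_block_lsn[blk] < lsn)
        {
            toy_block_lsn[blk] = lsn;
            advanced++;
        }
    }
    return advanced;
}
```

With a count, "no blocks" is simply `n_blocks == 0`; with a `(from, till)` pair the caller and callee would first have to agree on whether `till` is inclusive.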
Without this patch, on bootstrap XLP_FIRST_IS_CONTRECORD was always set in the
header of the page where WAL writing continues. This confuses WAL decoding on
safekeepers, making it think decoding starts in the middle of a record, leading
to

 2022-08-12T17:48:13.816665Z ERROR {tid=37}: query handler for 'START_WAL_PUSH postgresql://no_user:@localhost:15050' failed: failed to run ReceiveWalConn

 Caused by:
    0: failed to process ProposerAcceptorMessage
    1: invalid xlog page header: unexpected XLP_FIRST_IS_CONTRECORD at 0/2CF8000

Rebase of a1af529 for v14.
- Refactor the way the WalProposerMain function is called when started
  with --sync-safekeepers. The postgres binary now explicitly loads
  the 'neon.so' library and calls the WalProposerMain in it. This is
  simpler than the global function callback "hook" we previously used.

- Move the WAL redo process code to a new library, neon_walredo.so,
  and use the same mechanism as for --sync-safekeepers to call the
  WalRedoMain function, when launched with --walredo argument.

- Also move the seccomp code to neon_walredo.so library. I kept the
  configure check in the postgres side for now, though.
Fix indentation, remove unused definitions, resolve some FIXMEs.
Previously, we called PrefetchBuffer [NBlkScanned * seqscan_prefetch_buffers]
times in each of those situations, but now only NBlkScanned.

In addition, the prefetch mechanism for the vacuum scans is now based on
blocks instead of tuples - improving the efficiency.
Parallel seqscans didn't take their parallelism into account when determining
which block to prefetch, and vacuum's cleanup scan didn't correctly determine
which blocks would need to be prefetched, and could get into an infinite loop.
* Use prefetch in pg_prewarm extension

* Change prefetch order as suggested in review
* Update prefetch mechanisms:

- **Enable enable_seqscan_prefetch by default**
- Store prefetch distance in the relevant scan structs
- Slow start sequential scan, to accommodate LIMIT clauses.
- Replace seqscan_prefetch_buffer with the relations' tablespaces'
  *_io_concurrency; and drop seqscan_prefetch_buffer as a result.
- Clarify enable_seqscan_prefetch GUC description
- Fix prefetch in pg_prewarm
- Add prefetching to autoprewarm worker
- Fix an issue where we'd incorrectly not prefetch data when hitting a table wraparound. The same issue also resulted in assertion failures in debug builds.
- Fix parallel scan prefetching - we didn't take into account that parallel scans have scan synchronization, too.
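
The "slow start" idea in the list above can be sketched as follows. This is an illustration only, not the code from this commit: the struct `ToyScanPrefetch`, both function names, and the doubling ramp are assumptions. The point is that a scan that stops early (for example under a LIMIT clause) issues only a few prefetches, while a long scan quickly reaches the configured maximum distance.

```c
#include <assert.h>

/* Hypothetical per-scan prefetch state, kept in the scan struct. */
typedef struct ToyScanPrefetch
{
    int distance;       /* current prefetch distance, in blocks */
    int max_distance;   /* cap, e.g. taken from an io_concurrency setting */
} ToyScanPrefetch;

static void
toy_prefetch_init(ToyScanPrefetch *p, int max_distance)
{
    p->distance = 0;
    p->max_distance = max_distance < 0 ? 0 : max_distance;
}

/*
 * Called once per block consumed by the scan. Ramps the distance up
 * (1, 2, 4, ...) until it reaches max_distance, then holds it there.
 * Returns the distance to use for the next prefetch request.
 */
static int
toy_prefetch_advance(ToyScanPrefetch *p)
{
    if (p->distance < p->max_distance)
    {
        int next = (p->distance == 0) ? 1 : p->distance * 2;

        p->distance = (next > p->max_distance) ? p->max_distance : next;
    }
    return p->distance;
}
```

Setting the maximum to 0 keeps the distance at 0, which in this sketch is how prefetching would be switched off.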
knizhnik and others added 24 commits August 9, 2023 15:31
#245)

* Maintain last written LSN for each page to enable prefetch on vacuum, delete and other massive update operations

* Move PageSetLSN in heap_xlog_visible before MarkBufferDirty
- Prefetch the pages in index vacuum's sequential scans
   Implemented in NBTREE, GIST and SP-GIST.
   BRIN does not have a 2nd phase of vacuum, and both GIN and HASH clean up
   their indexes in a non-seqscan fashion: GIN scans the btree from left to
   right, and HASH only scans the initial buckets sequentially.
* Show prefetch statistic in EXPLAIN

refer #2994

* Update heap page LSN in case of VM changes even if wal_redo_hints=off

refer #2807

* Undo occasional changes

* Undo occasional changes
* Show prefetch statistic in EXPLAIN

refer #2994

* Collect per-node prefetch statistics

* Show number of prefetch duplicates in explain
* Implement efficient prefetch for parallel bitmap heap scan

* Change MAX_IO_CONCURRENCY to be power of 2
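
The commit message does not say why a power of 2 is wanted. One common reason, shown here purely as an assumption, is that a ring buffer whose size is a power of 2 can map an ever-increasing sequence number to a slot with a bit mask instead of a modulo. `TOY_RING_SIZE`, `toy_ring_slot` and `toy_is_power_of_two` are hypothetical names, not identifiers from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u    /* hypothetical ring size; must be a power of 2 */

/* True if n is a non-zero power of two. */
static inline int
toy_is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Map a monotonically increasing sequence number to a ring slot.
 * Because TOY_RING_SIZE is a power of 2, masking with (size - 1)
 * gives the same result as seq % TOY_RING_SIZE.
 */
static inline uint32_t
toy_ring_slot(uint64_t seq)
{
    return (uint32_t) (seq & (TOY_RING_SIZE - 1));
}
```

The mask is only equivalent to the modulo when the size is a power of 2, so a check like `toy_is_power_of_two` is worth asserting wherever the size is configured.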
* Avoid errors when accessing indexes of unlogged tables after compute restart

* Support unlogged sequences

* Extract sequence start value from pg_sequence

* Initialize unlogged index under exclusive lock
They will be handled in pageserver, ref neondatabase/neon#3706

This reverts commit ad5e789
This reverts commit 46c44e8

This does *not* revert commit 285cd13. We likely should do that, but
check_restored_datadir_content complains about some diff in init fork contents
after test_pg_regress; this should be sorted out.
Now a similar kind of hack (using malloc() instead of shmem) is
done in the wal-redo extension.
* Adjust prefetch target for parallel bitmap scan

* More fixes for parallel bitmap scan prefetch
* Prefetch for index and index-only scans

* Remove debug logging

* Move prefetch_blocks array to the end of BTScanOpaqueData struct
* Recovery requirements:

Add a condition variable for WAL recovery, allowing backends to wait for recovery up to some record pointer.

* Fix issues w.r.t. WAL when LwLsn is initiated and when recovery starts.
This fixes some test failures that showed up after updating Neon code to do
more precise handling of the request_lsn in replicas' get_page_at_lsn requests.

---------

Co-authored-by: Matthias van de Meent <[email protected]>
* Make it possible to grant self created roles

* Update expected file for create_role test

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
…extended Neon SMGR API (#300)

Co-authored-by: Konstantin Knizhnik <[email protected]>
…m xl_multi_insert_tuple to xl_multi_insert

8 participants