Skip to content
nathan-retrace edited this page Jul 28, 2020 · 18 revisions

Welcome to the retrace-glibc wiki!

Objective: enable enough information to be captured in record mode, that replay can be performed with a high level of accuracy.

Black-box vs White-box record/replay

Some record/replay mechanisms aim for complete deterministic capture and replay. An example of such an approach is Mozilla's RR. In this scenarios ALL non-deterministic program interactions are captured for later replay. When implemented correctly and exact program replica is possible down to memory layouts. For such an apporach to work many system calls need to be captured, thread context switching for example. The load on the system is high, and storage requirements are also high. Also the replay program needs to be identical to the record program. With Retrace, while the business logic is the same in the record and replay executions, the exact execution mechanism may differ.

For this reason an exact mechanism such as RR is infeasible for retrace.

Well bahaved programs

When we talk of well-behaved programs, we're talking about programs which do not exhibit non-deterministic behaviour due to multi-threading race conditions, or other sources of non-determinism, for instance predicating branches on memory addresses. This behaviour is out-of-scope of this discussion.

Captured system calls

We're interested in capturing non-determinism at the boundary layer of the program, excluding such things as memory allocation and thread scheduling details.

Examples include:

  1. file and socket IO
  2. system-time functions
  3. random functions

The question becomes whats the minimum surface area of system calls to record which allows deterministic replay of the pythin business logic

Program Determinism

Given the same program and exactly the same execution hardware and inputs, the result will be identical and the pattern of system calls will be identical on a given thread. As previously stated exact program replication is not feasible, so then the question becomes, across the system calls we're capturing should the pattern of system calls be identical in record and replay?

I would argue it should be. We can easily detect when it isn't given some test program. I propose that we make the assumption that in a 'well-behaved' program the system calls and their inputs 'should' be identical across successive runs. We defensively check for the cases where it isn't and log enough information to enable analysis of why our assumption did not hold.

Global Counter

To get determinism across system calls on different threads a global counter may be used to sequencing. Before a recorded system call returns it gets and increments a global counter atomically.

A good candidate for this counter might be:

https://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html

Recording

A system call is made on a program thread and has a start time and end time. System calls on other threads may interleave execution.

After the system call has finished executing thefollowing information should be captured:

  • System call Id - To identify which system call was being made, e.g open(), select(), read()...
  • Thread Id - the thread identifier making the request
  • system call counter
  • Start time? - not strictly required, but maybe useful for debugging
  • Current Time (nano-seconds, sufficiently high resolution timer)
  • Result of the system call, this may be return code and/or a copy of any buffers written to, e.g. read()
  • parameters? - Used to check if the system call is correct or the record/replay has gone off track

The results of this capture could go to a per thread file, or a single file

Replay

With the assumption that the sequence of captured system calls will be identical in record/replay on any given thread, we can use the above captured information to reconstruct a program execution with the correct sequencing between threads.

A thread may initiate a captured system call. This puts the thread into a waiting state. The thread is released to return its pre-recorded result when the system call counter matches that of the captured system call. The virtual system clock is updated to reflect the time at the end of the system call.

Trapping Errors

If a thread trys to make an unexpected system call, or a system call with different parameters/flags this can be trapped.

Storage format

Message may be saved and read using google protocol buffers. This format is extensible and offers a good mechansim to co-ordinate between record and replay.

Alternative mechanism

An alternative higher level abstraction may be used at a later stage in development, if the above assumptions are proved to be incorrect. Data streams consumed by the program are replicated. The exact internal mechanisms used by the program to consume those streams can now flex slightly.

If possible the exact sequence of system calls should not be relied on to be deterministic. What can be made deterministic is the order in which data from various streams is fed into the program. An example of order being important to business logic would be a shopping platform whereby the first served request reserves the item.

As an example, when a socket is created, that is represented by a file descriptor. When data is read from the socket, that data is written to a file for later replay before being passed back to the calling function.

Timing and relative ordering important in multi-threaded programs. The data recorded should be sufficiently detailed to allow various replay strategies.