Artifact of VLDB'23 "When Database Meets New Storage Device: Understanding and Exposing Performance Mismatches via Configurations"

S3M (performance Mismatch in I/O Size, ParalleliSm and Sequentiality)

Haochen He, Erci Xu, Shanshan Li, Zhouyang Jia, Si Zheng, Yue Yu, Jun Ma, and Xiangke Liao. "When Database Meets New Storage Device: Understanding and Exposing Performance Mismatches via Configurations." PVLDB, 16(7). (TO APPEAR)

confSysTaint is the core of S3M. Built on LLVM IR, it analyzes the control and data dependencies between configuration variables and specific syscalls. The following example shows how it works:

[Screenshot 2022-09-03 12 02 02]

Target syscalls

In Section 4.1 (Test Input Generation), we mention that S3M focuses on four series of syscalls. They were obtained by manually investigating every Linux syscall (335 found in kernel version 5.4.0) against the official manual, followed by cross-checking. This filtering yielded 21 syscalls that may affect I/O size, parallelism, and sequentiality:

read series:    read, pread64, readv, preadv, preadv2, io_submit, io_getevents, madvise, open, mmap
write series:   write, pwrite64, writev, pwritev, pwritev2, io_submit, io_getevents, madvise, open, mmap
sync series:    fsync, fdatasync, syncfs, sync_file_range, fcntl
thread series:  clone (pthread_create), fork
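The series listed above can be written down as a small lookup table, which is convenient when filtering analysis results by syscall. This is only an illustrative Python sketch, not part of confSysTaint: the names `SHARED_IO`, `TARGET_SYSCALLS`, and `series_of` are hypothetical, and placing `fcntl` in the sync series is an assumption about where the series boundary falls.

```python
# Syscalls that appear in both the read and the write series.
SHARED_IO = ["io_submit", "io_getevents", "madvise", "open", "mmap"]

# Series as listed in this README. Placing fcntl in the sync series
# is an assumption; the listing does not mark the boundary explicitly.
TARGET_SYSCALLS = {
    "read": ["read", "pread64", "readv", "preadv", "preadv2"] + SHARED_IO,
    "write": ["write", "pwrite64", "writev", "pwritev", "pwritev2"] + SHARED_IO,
    "sync": ["fsync", "fdatasync", "syncfs", "sync_file_range", "fcntl"],
    "thread": ["clone", "fork"],
}


def series_of(syscall):
    """Return the sorted names of all series that contain `syscall`."""
    return sorted(
        name for name, calls in TARGET_SYSCALLS.items() if syscall in calls
    )
```

A syscall outside the target set maps to an empty list, so the same helper can be used as a filter.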

Scenarios that confSysTaint supports

Data flow

  • Basic LLVM "Use" support
    [Screenshot 2022-09-03 15 27 06]
  • Field-sensitive analysis
    [Screenshot 2022-09-03 15 44 43]
  • Inter-procedural analysis (with pointers)
    [Screenshot 2022-09-03 15 53 39]
    [Screenshot 2022-09-03 16 02 01]
  • Our extended data flow (phi-node)
    [Screenshot 2022-09-03 16 06 28]
    • How to formally determine whether a phi-node will be tainted
      Given a phi-node like:
         phi i32 [ %5, %bb1.i ], [ 0, %bb1 ]
                    pre_node      pre_node2
      
      we check if: [Screenshot 2022-09-03 16 07 53]
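The basic "Use" scenario can be pictured as a worklist traversal over def-use edges. The Python sketch below is a simplified model under stated assumptions, not confSysTaint's implementation: the graph is a plain dictionary rather than LLVM IR, the value names are hypothetical, and it covers neither field sensitivity, pointers, nor the phi-node rule, whose exact condition is given only in the screenshot.

```python
from collections import deque


def propagate_taint(uses, sources):
    """Return every value reachable from `sources` along def-use edges.

    `uses` maps a value name to the values that use it as an operand,
    which mirrors walking the users of an LLVM value.
    """
    tainted = set(sources)
    worklist = deque(sources)
    while worklist:
        value = worklist.popleft()
        for user in uses.get(value, ()):
            if user not in tainted:
                tainted.add(user)
                worklist.append(user)
    return tainted


# Hypothetical def-use graph: a configuration variable is loaded,
# scaled, and passed as an argument of a write call.
uses = {
    "@conf_io_size": ["%load"],
    "%load": ["%mul"],
    "%mul": ["call @write"],
    "%other": ["%add"],  # unrelated chain, must stay untainted
}
tainted = propagate_taint(uses, ["@conf_io_size"])
```

Here the `write` call ends up in `tainted` because its argument derives from the configuration variable, while `%add` does not.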

Control flow

The control-flow analysis is formally defined as follows:

  • Control dependency: a block Y is control dependent on a block X if and only if Y post-dominates at least one, but not all, of X's successors.
    • Transitivity: if A is control dependent on B and B is control dependent on C, then A is control dependent on C.
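The definition above can be made concrete on a toy control-flow graph. The following Python sketch assumes a CFG given as a successor dictionary with a single exit block; the function names and the example graph are hypothetical, and the iterative post-dominator computation is a textbook approach, not necessarily what confSysTaint does on LLVM IR.

```python
def post_dominators(succ, exit_block):
    """For each block, compute the set of blocks that post-dominate it."""
    blocks = set(succ) | {s for ss in succ.values() for s in ss}
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b in blocks - {exit_block}:
            succs = succ.get(b, [])
            # A block is post-dominated by itself and by whatever
            # post-dominates all of its successors.
            common = set.intersection(*(pdom[s] for s in succs)) if succs else set()
            new = common | {b}
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom


def control_dependences(succ, exit_block):
    """Map each block Y to the blocks X it is directly control dependent on."""
    pdom = post_dominators(succ, exit_block)
    deps = {b: set() for b in pdom}
    for x, succs in succ.items():
        for y in pdom:
            hits = sum(1 for s in succs if y in pdom[s])
            # Y post-dominates at least one but not all successors of X.
            if 0 < hits < len(succs):
                deps[y].add(x)
    return deps


def transitive_closure(deps):
    """Apply the transitivity rule until no new dependence is found."""
    closure = {b: set(ds) for b, ds in deps.items()}
    changed = True
    while changed:
        changed = False
        for b in closure:
            extra = set()
            for d in closure[b]:
                extra |= closure[d]
            if not extra <= closure[b]:
                closure[b] |= extra
                changed = True
    return closure


# Hypothetical CFG: "entry" guards "a", and "a" guards "b".
cfg = {
    "entry": ["a", "exit"],
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["exit"],
    "exit": [],
}
deps = control_dependences(cfg, "exit")
closure = transitive_closure(deps)
```

In this graph `b` is directly control dependent only on `a`, and transitivity adds `entry`, since `a` itself executes only when the branch in `entry` goes that way.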

An example, where the yellow square indicates the complicated code structures that motivate the use of the formal definition:
[Screenshot 2022-09-03 16 39 03]

Usage

Dependency

Build

cd tainter
cmake -DCMAKE_CXX_COMPILER=/usr/bin/clang++-10 -DCMAKE_C_COMPILER=/usr/bin/clang-10 -DLLVM_DIR=/usr/lib/llvm-10/cmake . 
make

Run

cd test/demo
../../tainter test.bc test-var.txt

For a real DBMS, use gllvm to obtain the .bc file (e.g., mysqld.bc).

Check results

cat test-records.dat

Example result

Tainted Functions (group by Caller-Functions): 

		Clone_Handle::open_file <------------ func-1 of "srv_unix_file_flush_method"
				Clone_Task_Manager::set_error ----- Tainted Function.

		Clone_Snapshot::update_block_size <-- func-2 of "srv_unix_file_flush_method"
				os_event_set -------------------\ 
				pfs_unlock_mutex_v1              |_ Tainted Function.
				sync_array_object_signalled      |
				ut_dbg_assertion_failed --------/

		Double_write::sync_page_flush <------ func-3 of "srv_unix_file_flush_method"
				__clang_call_terminate ---------\ 
				buf_page_io_complete             |-- Tainted Function.
				fil_flush ----------------------/
        
    ...
