Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement context-sensitive memory model #755

Open
shaobo-he opened this issue Sep 1, 2021 · 1 comment
Open

Implement context-sensitive memory model #755

shaobo-he opened this issue Sep 1, 2021 · 1 comment
Assignees

Comments

@shaobo-he
Copy link
Contributor

shaobo-he commented Sep 1, 2021

Motivation

Small functions that modify/access pointers such as strlen, strcmp, and xmalloc cause DSA to merge nodes unnecessarily. For example,

#include <string.h>

char* x = "hello";
char* y = "world";

__attribute__((always_inline))
int mystrlen(char* p) {
  int z = 0;
  while(*p++)
    z += 1;
  return z;
}
int main(void) {
  return strlen(x) + strlen(y);
  //return mystrlen(x) + mystrlen(y);
}

SMACK only generates one region for both x and y whereas if we choose to use mystrlen, SMACK produces two regions. If we adopt a context-sensitive version of sea-dsa, we will get a region for each call site of strlen.

@shaobo-he shaobo-he self-assigned this Sep 1, 2021
@shaobo-he
Copy link
Contributor Author

shaobo-he commented Sep 8, 2021

Some background information about sea-dsa:

sea-dsa's context-sensitive analysis produces a graph for each function, where each pointer of the function is associated with a cell. Take one of sea-dsa's regression tests as an example,

extern int nd(void);

void f ( int *x , int *y ) {
  *x = 1 ; *y = 2 ;
}

void g (int *p , int *q , int *r , int *s) {
  f (p,q) ;
  f (r,s);
}

int main (int argc, char**argv){

  int x;
  int y;
  int w;
  int z;

  int *a = &x;
  int *b = &y;
  if (nd ()) a = b;
  
  g(a, b, &w, &z);
  return x + y + w + z;
}

Compiling it with O1 and invoking the context-sensitive version of sea-dsa produces the following three graphs.

image

image

image

So a rudimentary attempt to add the context-sensitive memory model is to encode the functions in store-passing style. Namely, each Boogie procedure takes input as the regions which its pointers are associated with and returns the updated regions.

For example, for f, g, and main, their signatures are,

procedure f(x: ref, y: ref, M.in.1: ref[int]) returns (M.out.1: ref[int]);
procedure g(p, q, r, s, M.in.1: ref[int], M.in.2:ref[int]) returns (M.out.1: ref[int], M.out.2: ref[int])
procedure main(M.in.1: ref[int], M.in.2:ref[int]) returns (M.out.1: ref[int], M.out.2: ref[int])

For each procedure, we assign the in version of a region into its out version because parameters are immutable in Boogie. So, for example, we do the following in g:

entry:
M.out.1 := M.in.1;
M.out.2 := M.in.2;

In this way, we always use the out version of a region when encoding load/store instructions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant