Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observable sharing #43

Closed
bruderj15 opened this issue Jul 9, 2024 · 9 comments · Fixed by #76
Closed

Observable sharing #43

bruderj15 opened this issue Jul 9, 2024 · 9 comments · Fixed by #76
Labels
enhancement New feature or request research Research an idea

Comments

@bruderj15
Copy link
Owner

Add observable sharing similar to what ekmett/ersatz does.
Is it as simple though?

From Ersatz.Problem:

generateLiteral :: MonadSAT s m => a -> (Literal -> m ()) -> m Literal
generateLiteral a f = do
  let sn = unsafePerformIO (makeStableName' a)
  use (stableMap.at sn) >>= \ ml -> case ml of
    Just l -> return l
    Nothing -> do
      l <- literalExists
      stableMap.at sn ?= l
      f l
      return l

On the first glance this only seems to share variables. We can easily do that likewise too.
However, where is sharing of entire formula sub-trees?

@bruderj15 bruderj15 added enhancement New feature or request research Research an idea labels Jul 9, 2024
@fabeulous
Copy link

On the first glance this only seems to share variables. We can easily do that likewise too.
However, where is sharing of entire formula sub-trees?

I believe this happens in Ersatz.Bits . Specifically:

runBit :: MonadSAT s m => Bit -> m Literal
runBit (Not c) = negateLiteral `fmap` runBit c
runBit (Var l) = return l
runBit (Run action) = action >>= runBit
runBit b = generateLiteral b $ \out ->
  assertFormula =<< case b of
    And bs    -> formulaAnd out `fmap` mapM runBit (toList bs)
    Xor x y   -> liftM2 (formulaXor out) (runBit x) (runBit y)
    Mux x y p -> liftM3 (formulaMux out) (runBit x) (runBit y) (runBit p)

This generates a fresh literal out per sub-formula b and asserts that out <-> b. Going forward any occurrence of b (with the same StableName) is replaced by out, hence sharing the full sub-tree.

I guess for SMT-Lib the equivalent would turn

let a = 2 * x
in assert $ a === a + a

into something like

(declare-const c1 Int)
(assert (= c1 (* 2 x)))   ; c1 represents a
(declare-const c2 Int)
(assert (= c2 (+ c1 c1))) ; c2 represents a + a
(declare-const c3 Bool)
(assert (= c3 (= c1 c2))) ; c3 represents a === a + a
(assert c3)

@bruderj15
Copy link
Owner Author

Exactly, yes.
I am not sure if we actually need it though.
Ersatz does it because it makes the Tseitin-transformation cheaper.

I am not sure how replacing each node in the expression with a new var will influence solver performance.
It seems to often yield benefits for boolean nodes in my experience.

We may give it a try and evaluate.

@fabeulous
Copy link

I am not sure how replacing each node in the expression with a new var will influence solver performance.

I also don't know how this would impact solver performance.

However not sharing expressions that are shared in the Haskell expressions can lead to the SMT expression growing exponentially compared to the Haskell expression. In something like

let a1 = x + x
    a2 = a1 + a1
    a3 = a2 + a2
    ...
in assert $ an === an 

each new let binding would lead to doubling the size of the SMT expression. With sharing it only grows linearly.
While this comes at the cost of adding linearly many fresh variables, I think that would be worth it.

Maybe as another data point, SBV seems to also introduce a variable per subexpression in newExpr. It goes even further by sharing equivalent subexpressions independent of their StableName.

@bruderj15
Copy link
Owner Author

Yes, that does look quite promising.
I will drop a PR without equivalence checks utilizing commutativity etc for now. This may be extended later on.
Many thanks for your input!

@bruderj15
Copy link
Owner Author

Transforming the GADT is easy. See: 95c121b
However sharing expressions via StableName is not.

assert expr = do
  smt <- get
  qExpr <- case smt^.mlogic of
    Nothing    -> return expr
    Just logic -> if "QF" `isPrefixOf` logic then return expr else quantify expr
  qExpr' <- runSharing qExpr
  modify $ \s -> s & formulas %~ (|> qExpr')
    
-- This is run once on every assertion.
runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = transformM1 (
    \expr ->
      if isLeaf expr
      then return expr
      else case sortSing' expr of -- bring Eq (HaskellTyp t) into scope
        SBoolSort     -> share expr
        SIntSort      -> share expr
        SRealSort     -> share expr
        SBvSort _     -> share expr
        SArraySort _ _-> share expr
        SStringSort   -> share expr

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
share expr = do
  let sn = unsafePerformIO (makeStableName' expr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of   -- _stableMap :: !(HashMap (StableName ()) (SomeKnownSMTSort Expr))
        Just (SomeSMTSort expr') -> case geq (sortSing' expr) (sortSing' expr') of
          Nothing -> makeNode sn expr
          Just Refl -> return expr'
        Nothing -> makeNode sn expr

-- Eval complete expression then make StableName
makeStableName' :: GNFData f => f a -> IO (StableName ())
makeStableName' x = grnf x `seq` fmap unsafeCoerce (makeStableName x)

-- Create aux var and bind to node expr
makeNode :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => StableName () -> Expr t -> m (Expr t)
makeNode sn nodeExpr = do
      nodeVar <- var
      modify $ \s -> s & formulas %~ (|> (nodeVar === nodeExpr))
      stableMap.at sn ?= SomeSMTSort nodeVar
      return nodeVar

This successfully creates an aux var for every node in all asserted expressions.
However sharing fails and we create multiple aux vars for the same expressions.
Printing the StableName's hash of all nodes we transform shows that different StableName's are produced for the same expression.

runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = transformM1 (
    \expr ->
      if let !printer = unsafePerformIO $ (makeStableName' expr) >>= putStrLn . (show expr <>) . show . hashStableName
        in isLeaf expr
      then return expr
      else ...

For example

  res <- solveWith @SMT (debug cvc5 def) $ do
    setLogic "QF_LIA"
    x <- var @IntSort
    let a1 = x + x
        an = a1 === a1
    assert an

    return ()

prints

var_1 8 -- hash of StableName of var_1 is 8
var_1 8
(+ var_1 var_1) 7   
var_18 
var_1 8
(+ var_1 var_1) 6 -- earlier hashed as 7, now 6 - bad - different pointers
(= var_2 var_3) 5

and therefore encodes

(set-option :produce-models true)
(set-logic QF_LIA)
(declare-fun var_1 () Int)
(declare-fun var_2 () Int)
(declare-fun var_3 () Int)
(declare-fun var_4 () Bool)
(assert (= var_2 (+ var_1 var_1)))
(assert (= var_3 (+ var_1 var_1)))
(assert (= var_4 (= var_2 var_3)))
(assert var_4)

If we traverse the expression to the nodes we expect to be shared and print their StableName's within the problem definition, we get the expected same StableName for those shared expressions:

  res <- solveWith @SMT (debug cvc5 def) $ do
    setLogic "QF_LIA"
    x <- var @IntSort
    let a1 = x + x
        an = a1 === a1
     -- traverse an here, make StableNames for nodes, hash and print
    assert an    
    return ()

I cannot figure out why.
I would like to avoid comparison with Eq and Hashable.
This would perform bad for large expressions.

Any further input is highly appreciated.

@fabeulous
Copy link

The issue is likely down to transformM1 traversing the expression bottom up, reallocating the expression before you call makeStableName. That is

uniplate1 f (Plus x y) = Plus <$> f x <*> f y

will allocate a new Expression with a different StableName each time.

A paramorphism is probably more fitting here. Something like:

paraM1 :: (Monad m, Uniplate1 f cs, AllC cs b) => (forall a. AllC cs a => f a -> f a -> m (f a)) -> f b -> m (f b)
paraM1 f x = uniplate1 (paraM1 f) x >>= f x

then using

-- This is run once on every assertion.
runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = paraM1 (
    \origExpr expr ->
      if isLeaf expr
      then return expr
      else case sortSing' expr of
        SBoolSort     -> share origExpr expr
        SIntSort      -> share origExpr expr
        SRealSort     -> share origExpr expr
        SBvSort _     -> share origExpr expr
        SArraySort _ _ -> share origExpr expr
        SStringSort   -> share origExpr expr)

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> Expr t -> m (Expr t)
share origExpr expr = do
  let sn = unsafePerformIO (makeStableName' origExpr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of  
        Just (SomeSMTSort expr') -> case geq (sortSing' expr) (sortSing' expr') of
          Nothing -> makeNode sn expr
          Just Refl -> return expr'
        Nothing -> makeNode sn expr

might work?

I haven't testing any of this, since I currently can't compile the library (I'll create another issue for that).

@fabeulous
Copy link

fabeulous commented Aug 16, 2024

On second thought, while this should produce the correct SMT encoding, it will still do multiple traversals of each shared subexpression. Some kind of "lazy" para that delays the monadic effect of the recursion would be better I think.

lazyParaM1 :: (Monad m, Uniplate1 f cs, AllC cs b) => (forall a. AllC cs a => f a -> m (f a) -> m (f a)) -> f b -> m (f b)
lazyParaM1 f x = f x (uniplate1 (lazyParaM1 f) x)

runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = paraM1 (
    \origExpr expr ->
      if isLeaf origExpr
      then return origExpr
      else case sortSing' origExpr of
        SBoolSort     -> share origExpr expr
        SIntSort      -> share origExpr expr
        SRealSort     -> share origExpr expr
        SBvSort _     -> share origExpr expr
        SArraySort _ _ -> share origExpr expr
        SStringSort   -> share origExpr expr)

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t) -> m (Expr t)
share origExpr expr = do
  let sn = unsafePerformIO (makeStableName' origExpr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of  
        Just (SomeSMTSort expr') -> case geq (sortSing' origExpr) (sortSing' expr') of
          Nothing -> expr >>= makeNode sn
          Just Refl -> return expr'
        Nothing -> expr >>= makeNode sn

@bruderj15 bruderj15 linked a pull request Aug 16, 2024 that will close this issue
@bruderj15
Copy link
Owner Author

Totally right. See 51f7c3c

@bruderj15
Copy link
Owner Author

I am not sure how replacing each node in the expression with a new var will influence solver performance.

I also don't know how this would impact solver performance.

For me, horrendeus for performance.
How does it affect you?
Feel free to reply in #97.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request research Research an idea
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants