Observable sharing #43

bruderj15 · 2024-07-09T10:44:45Z

Add observable sharing similar to what ekmett/ersatz does.
Is it as simple though?

generateLiteral :: MonadSAT s m => a -> (Literal -> m ()) -> m Literal
generateLiteral a f = do
  let sn = unsafePerformIO (makeStableName' a)
  use (stableMap.at sn) >>= \ ml -> case ml of
    Just l -> return l
    Nothing -> do
      l <- literalExists
      stableMap.at sn ?= l
      f l
      return l

On the first glance this only seems to share variables. We can easily do that likewise too.
However, where is sharing of entire formula sub-trees?

fabeulous · 2024-08-14T11:15:57Z

On the first glance this only seems to share variables. We can easily do that likewise too.
However, where is sharing of entire formula sub-trees?

I believe this happens in Ersatz.Bits . Specifically:

runBit :: MonadSAT s m => Bit -> m Literal
runBit (Not c) = negateLiteral `fmap` runBit c
runBit (Var l) = return l
runBit (Run action) = action >>= runBit
runBit b = generateLiteral b $ \out ->
  assertFormula =<< case b of
    And bs    -> formulaAnd out `fmap` mapM runBit (toList bs)
    Xor x y   -> liftM2 (formulaXor out) (runBit x) (runBit y)
    Mux x y p -> liftM3 (formulaMux out) (runBit x) (runBit y) (runBit p)

This generates a fresh literal out per sub-formula b and asserts that out <-> b. Going forward any occurrence of b (with the same StableName) is replaced by out, hence sharing the full sub-tree.

I guess for SMT-Lib the equivalent would turn

let a = 2 * x
in assert $ a === a + a

into something like

(declare-const c1 Int)
(assert (= c1 (* 2 x)))   ; c1 represents a
(declare-const c2 Int)
(assert (= c2 (+ c1 c1))) ; c2 represents a + a
(declare-const c3 Bool)
(assert (= c3 (= c1 c2))) ; c3 represents a === a + a
(assert c3)

bruderj15 · 2024-08-14T13:35:10Z

Exactly, yes.
I am not sure if we actually need it though.
Ersatz does it because it makes the Tseitin-transformation cheaper.

I am not sure how replacing each node in the expression with a new var will influence solver performance.
It seems to often yield benefits for boolean nodes in my experience.

We may give it a try and evaluate.

fabeulous · 2024-08-14T14:47:09Z

I am not sure how replacing each node in the expression with a new var will influence solver performance.

I also don't know how this would impact solver performance.

However not sharing expressions that are shared in the Haskell expressions can lead to the SMT expression growing exponentially compared to the Haskell expression. In something like

let a1 = x + x
    a2 = a1 + a1
    a3 = a2 + a2
    ...
in assert $ an === an

each new let binding would lead to doubling the size of the SMT expression. With sharing it only grows linearly.
While this comes at the cost of adding linearly many fresh variables, I think that would be worth it.

Maybe as another data point, SBV seems to also introduce a variable per subexpression in newExpr. It goes even further by sharing equivalent subexpressions independent of their StableName.

bruderj15 · 2024-08-14T19:13:05Z

Yes, that does look quite promising.
I will drop a PR without equivalence checks utilizing commutativity etc for now. This may be extended later on.
Many thanks for your input!

bruderj15 · 2024-08-16T10:54:55Z

Transforming the GADT is easy. See: 95c121b
However sharing expressions via StableName is not.

assert expr = do
  smt <- get
  qExpr <- case smt^.mlogic of
    Nothing    -> return expr
    Just logic -> if "QF" `isPrefixOf` logic then return expr else quantify expr
  qExpr' <- runSharing qExpr
  modify $ \s -> s & formulas %~ (|> qExpr')
    
-- This is run once on every assertion.
runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = transformM1 (
    \expr ->
      if isLeaf expr
      then return expr
      else case sortSing' expr of -- bring Eq (HaskellTyp t) into scope
        SBoolSort     -> share expr
        SIntSort      -> share expr
        SRealSort     -> share expr
        SBvSort _     -> share expr
        SArraySort _ _-> share expr
        SStringSort   -> share expr

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
share expr = do
  let sn = unsafePerformIO (makeStableName' expr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of   -- _stableMap :: !(HashMap (StableName ()) (SomeKnownSMTSort Expr))
        Just (SomeSMTSort expr') -> case geq (sortSing' expr) (sortSing' expr') of
          Nothing -> makeNode sn expr
          Just Refl -> return expr'
        Nothing -> makeNode sn expr

-- Eval complete expression then make StableName
makeStableName' :: GNFData f => f a -> IO (StableName ())
makeStableName' x = grnf x `seq` fmap unsafeCoerce (makeStableName x)

-- Create aux var and bind to node expr
makeNode :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => StableName () -> Expr t -> m (Expr t)
makeNode sn nodeExpr = do
      nodeVar <- var
      modify $ \s -> s & formulas %~ (|> (nodeVar === nodeExpr))
      stableMap.at sn ?= SomeSMTSort nodeVar
      return nodeVar

This successfully creates an aux var for every node in all asserted expressions.
However sharing fails and we create multiple aux vars for the same expressions.
Printing the StableName's hash of all nodes we transform shows that different StableName's are produced for the same expression.

runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = transformM1 (
    \expr ->
      if let !printer = unsafePerformIO $ (makeStableName' expr) >>= putStrLn . (show expr <>) . show . hashStableName
        in isLeaf expr
      then return expr
      else ...

For example

  res <- solveWith @SMT (debug cvc5 def) $ do
    setLogic "QF_LIA"
    x <- var @IntSort
    let a1 = x + x
        an = a1 === a1
    assert an

    return ()

prints

var_1 8 -- hash of StableName of var_1 is 8
var_1 8
(+ var_1 var_1) 7   
var_18 
var_1 8
(+ var_1 var_1) 6 -- earlier hashed as 7, now 6 - bad - different pointers
(= var_2 var_3) 5

and therefore encodes

(set-option :produce-models true)
(set-logic QF_LIA)
(declare-fun var_1 () Int)
(declare-fun var_2 () Int)
(declare-fun var_3 () Int)
(declare-fun var_4 () Bool)
(assert (= var_2 (+ var_1 var_1)))
(assert (= var_3 (+ var_1 var_1)))
(assert (= var_4 (= var_2 var_3)))
(assert var_4)

If we traverse the expression to the nodes we expect to be shared and print their StableName's within the problem definition, we get the expected same StableName for those shared expressions:

  res <- solveWith @SMT (debug cvc5 def) $ do
    setLogic "QF_LIA"
    x <- var @IntSort
    let a1 = x + x
        an = a1 === a1
     -- traverse an here, make StableNames for nodes, hash and print
    assert an    
    return ()

I cannot figure out why.
I would like to avoid comparison with Eq and Hashable.
This would perform bad for large expressions.

Any further input is highly appreciated.

fabeulous · 2024-08-16T13:49:59Z

The issue is likely down to transformM1 traversing the expression bottom up, reallocating the expression before you call makeStableName. That is

uniplate1 f (Plus x y) = Plus <$> f x <*> f y

will allocate a new Expression with a different StableName each time.

A paramorphism is probably more fitting here. Something like:

paraM1 :: (Monad m, Uniplate1 f cs, AllC cs b) => (forall a. AllC cs a => f a -> f a -> m (f a)) -> f b -> m (f b)
paraM1 f x = uniplate1 (paraM1 f) x >>= f x

then using

-- This is run once on every assertion.
runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = paraM1 (
    \origExpr expr ->
      if isLeaf expr
      then return expr
      else case sortSing' expr of
        SBoolSort     -> share origExpr expr
        SIntSort      -> share origExpr expr
        SRealSort     -> share origExpr expr
        SBvSort _     -> share origExpr expr
        SArraySort _ _ -> share origExpr expr
        SStringSort   -> share origExpr expr)

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> Expr t -> m (Expr t)
share origExpr expr = do
  let sn = unsafePerformIO (makeStableName' origExpr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of  
        Just (SomeSMTSort expr') -> case geq (sortSing' expr) (sortSing' expr') of
          Nothing -> makeNode sn expr
          Just Refl -> return expr'
        Nothing -> makeNode sn expr

might work?

I haven't testing any of this, since I currently can't compile the library (I'll create another issue for that).

fabeulous · 2024-08-16T14:00:04Z

On second thought, while this should produce the correct SMT encoding, it will still do multiple traversals of each shared subexpression. Some kind of "lazy" para that delays the monadic effect of the recursion would be better I think.

lazyParaM1 :: (Monad m, Uniplate1 f cs, AllC cs b) => (forall a. AllC cs a => f a -> m (f a) -> m (f a)) -> f b -> m (f b)
lazyParaM1 f x = f x (uniplate1 (lazyParaM1 f) x)

runSharing :: (KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t)
runSharing = paraM1 (
    \origExpr expr ->
      if isLeaf origExpr
      then return origExpr
      else case sortSing' origExpr of
        SBoolSort     -> share origExpr expr
        SIntSort      -> share origExpr expr
        SRealSort     -> share origExpr expr
        SBvSort _     -> share origExpr expr
        SArraySort _ _ -> share origExpr expr
        SStringSort   -> share origExpr expr)

-- Transformation of a single node
share :: (Equatable (Expr t), KnownSMTSort t, MonadSMT SMT m) => Expr t -> m (Expr t) -> m (Expr t)
share origExpr expr = do
  let sn = unsafePerformIO (makeStableName' origExpr)
   in use (stableMap.at sn) >>= \mexpr' -> case mexpr' of  
        Just (SomeSMTSort expr') -> case geq (sortSing' origExpr) (sortSing' expr') of
          Nothing -> expr >>= makeNode sn
          Just Refl -> return expr'
        Nothing -> expr >>= makeNode sn

bruderj15 · 2024-08-16T17:24:12Z

Totally right. See 51f7c3c

bruderj15 · 2024-09-05T17:13:44Z

I am not sure how replacing each node in the expression with a new var will influence solver performance.

I also don't know how this would impact solver performance.

For me, horrendeus for performance.
How does it affect you?
Feel free to reply in #97.

bruderj15 added enhancement New feature or request research Research an idea labels Jul 9, 2024

bruderj15 linked a pull request Aug 16, 2024 that will close this issue

43 observable sharing #76

Merged

bruderj15 closed this as completed in #76 Aug 21, 2024

bruderj15 mentioned this issue Aug 22, 2024

Sharing equivalent sub-expressions independent of their StableName #81

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Observable sharing #43

Observable sharing #43

bruderj15 commented Jul 9, 2024

fabeulous commented Aug 14, 2024

bruderj15 commented Aug 14, 2024

fabeulous commented Aug 14, 2024

bruderj15 commented Aug 14, 2024

bruderj15 commented Aug 16, 2024

fabeulous commented Aug 16, 2024

fabeulous commented Aug 16, 2024 •

edited

Loading

bruderj15 commented Aug 16, 2024

bruderj15 commented Sep 5, 2024

Observable sharing #43

Observable sharing #43

Comments

bruderj15 commented Jul 9, 2024

fabeulous commented Aug 14, 2024

bruderj15 commented Aug 14, 2024

fabeulous commented Aug 14, 2024

bruderj15 commented Aug 14, 2024

bruderj15 commented Aug 16, 2024

fabeulous commented Aug 16, 2024

fabeulous commented Aug 16, 2024 • edited Loading

bruderj15 commented Aug 16, 2024

bruderj15 commented Sep 5, 2024

fabeulous commented Aug 16, 2024 •

edited

Loading