Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derived Applicative instance causes synthesis speed to blow up #2814

Open
gergoerdi opened this issue Sep 30, 2024 · 7 comments
Open

Derived Applicative instance causes synthesis speed to blow up #2814

gergoerdi opened this issue Sep 30, 2024 · 7 comments

Comments

@gergoerdi
Copy link
Contributor

In the following Clash circuit, the MATRIX_APPLICATIVE_COERCE preprocessor flag enables an Applicative instance for the included Matrix type that is equivalent to what one would get with deriving via (Compose (Vec n) (Vec m)). With the flag disabled, the pure implementation is still the same as the derived one would be, but (<*>) is implemented directly as liftA2 (<*>).

With liftA2 (<*>), synthesis finishes in less than 4 seconds on my notebook. With the coerce-based implementation, it finishes in 1m46s. And this, of course, is a cut-down version of a much more complex circuit where the synthesis time goes from ~1 minute to something that I didn't even bother waiting out.

{-# LANGUAGE DerivingStrategies, DerivingVia, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE InstanceSigs, StandaloneDeriving #-}
{-# LANGUAGE CPP #-}
{-# OPTIONS -Wunused-binds -Wunused-imports #-}
{-# OPTIONS -fconstraint-solver-iterations=5 #-}
module Sudoku where

#define MATRIX_APPLICATIVE_COERCE 0

import Clash.Prelude hiding (lift)
import Clash.Annotations.TH

import Data.Functor.Compose
import Data.Coerce (coerce)

newtype Matrix n m a = FromRows{ matrixRows :: Vec n (Vec m a) }
    deriving stock (Generic)
    deriving newtype (NFDataX, BitPack)
    deriving (Functor) via Compose (Vec n) (Vec m)
    -- deriving (Applicative) via Compose (Vec n) (Vec m)

instance (KnownNat n, KnownNat m) => Applicative (Matrix n m) where
    {-# INLINE pure #-}
    pure :: forall a. a -> Matrix n m a
    pure = coerce (pure @(Compose (Vec n) (Vec m)) @a)

    {-# INLINE (<*>) #-}
    (<*>) :: forall a b. Matrix n m (a -> b) -> Matrix n m a -> Matrix n m b
#if MATRIX_APPLICATIVE_COERCE
    (<*>) = coerce $ (<*>) @(Compose (Vec n) (Vec m)) @a @b
#else
    mf <*> mx = FromRows $ liftA2 (<*>) (matrixRows mf) (matrixRows mx)
#endif

toRowMajorOrder :: Matrix n m a -> Vec (n * m) a
toRowMajorOrder = concat . matrixRows

newtype Grid n m a = Grid{ getGrid :: Matrix n m (Matrix m n a) }
    deriving stock (Generic)
    deriving newtype (NFDataX)
    deriving (Functor, Applicative) via Compose (Matrix n m) (Matrix m n)

deriving newtype instance (KnownNat n, KnownNat m, BitPack a) => BitPack (Grid n m a)

flattenGrid :: Grid n m a -> Vec (n * m * m * n) a
flattenGrid = concat . toRowMajorOrder . fmap toRowMajorOrder . getGrid

headGrid :: forall n m a. (KnownNat n, KnownNat m, 1 <= n * m * m * n) => Grid n m a -> a
headGrid = head @(n * m * m * n - 1) . flattenGrid

type Solvable n m = (KnownNat n, KnownNat m, 1 <= n * m * m * n)

propagator
    :: forall n m dom. (Solvable n m, HiddenClockResetEnable dom)
    => SNat n -> SNat m -> Signal dom ()
propagator _ _ = headGrid units
  where
    units :: Grid n m (Signal dom ())
    units = pure unit
        <*> pure (pure ())
        <*> pure (pure ())
        <*> pure (pure ())
        <*> pure (pure ())

    unit
        :: Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
    unit _ _ _ _ = cell
      where
        cell = register () cell


topEntity
    :: "CLK" ::: Clock System
    -> "RESET" ::: Reset System
    -> "OUT" ::: Signal System ()
topEntity clk rst = withClockResetEnable clk rst enableGen $
  propagator d2 d2

makeTopEntity 'topEntity
@gergoerdi
Copy link
Contributor Author

Even simpler:

{-# LANGUAGE BlockArguments, ViewPatterns, MultiWayIf, RecordWildCards, ApplicativeDo #-}
{-# LANGUAGE DerivingStrategies, DerivingVia, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE InstanceSigs, StandaloneDeriving #-}
{-# LANGUAGE CPP #-}
{-# OPTIONS -Wunused-binds -Wunused-imports #-}
{-# OPTIONS -fconstraint-solver-iterations=5 #-}
module Sudoku (topEntity) where

#define MATRIX_APPLICATIVE_COERCE 1

import Clash.Prelude hiding (lift)
import Clash.Annotations.TH

import Data.Functor.Compose
import Data.Coerce (coerce)

newtype Matrix n m a = FromRows{ matrixRows :: Vec n (Vec m a) }
    deriving stock (Generic)
    deriving newtype (NFDataX, BitPack)
    deriving (Functor) via Compose (Vec n) (Vec m)
    -- deriving (Applicative) via Compose (Vec n) (Vec m)

instance (KnownNat n, KnownNat m) => Applicative (Matrix n m) where
    {-# INLINE pure #-}
    pure :: forall a. a -> Matrix n m a
    pure = coerce (pure @(Compose (Vec n) (Vec m)) @a)

    {-# INLINE (<*>) #-}
    (<*>) :: forall a b. Matrix n m (a -> b) -> Matrix n m a -> Matrix n m b
#if MATRIX_APPLICATIVE_COERCE
    (<*>) = coerce $ (<*>) @(Compose (Vec n) (Vec m)) @a @b
#else
    mf <*> mx = FromRows $ zipWith (<*>) (matrixRows mf) (matrixRows mx)
#endif

headMatrix :: forall n m a. (KnownNat n, KnownNat m, 1 <= n, 1 <= m) => Matrix n m a -> a
headMatrix = head @(m - 1) . head @(n - 1) . matrixRows

type Solvable n m = (KnownNat n, KnownNat m, 1 <= n, 1 <= m)

propagator
    :: forall n m dom. (Solvable n m, HiddenClockResetEnable dom)
    => SNat n -> SNat m -> Signal dom ()
propagator _ _ = headMatrix . headMatrix . getCompose $ units
  where
    units :: Compose (Matrix n m) (Matrix m n) (Signal dom ())
    units = pure unit
        <*> pure (pure ())
        <*> pure (pure ())
        <*> pure (pure ())
        <*> pure (pure ())

    unit
        :: Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
        -> Signal dom ()
    unit _ _ _ _ = cell
      where
        cell = register () cell

topEntity
    :: "CLK" ::: Clock System
    -> "RESET" ::: Reset System
    -> "OUT" ::: Signal System ()
topEntity clk rst = withClockResetEnable clk rst enableGen $
  propagator d2 d2

makeTopEntity 'topEntity

@christiaanb
Copy link
Member

More of an FYI: On GHC 9.10 both are equally fast. With the coerce based version actually requiring fewer transformations.

@christiaanb
Copy link
Member

christiaanb commented Oct 4, 2024

On GHC 9.4, commenting out

{-# RULES
"zipWith$map" forall f xs ys. zipWith (\g a -> g a) (map f xs) ys = zipWith f xs ys
#-}
also makes the coerce based version go through. Achieving a similar number of transformations, and similar speed, as we do on GHC 9.10.
Note that on GHC 9.10 the above rule never fires.

@christiaanb
Copy link
Member

Something to check: perhaps it's due to loss of sharing in the reduceNonRepPrim transformation.

@gergoerdi
Copy link
Contributor Author

I would love to move on GHC 9.10, but I'm out of the loop on where Clash is on GHC compatibility. I thought 9.4 or 9.6 was the last working version?

@DigitalBrains1
Copy link
Member

We have a README.md in the root of the repo with a section GHC compatibility. Both Clash 1.8 and master support GHC 9.8. In fact, for master we also run CI for GHC 9.10, so it has some support. Perhaps we just forgot to update that table to list GHC 9.10.

It would appear the READMEs on the 1.8 branch and the v1.8.1 tag are outdated; that's unfortunate. But the one on the master branch is fairly reliable.

@gergoerdi
Copy link
Contributor Author

Ah, right, it was the Cabal/Stack issue of not picking up the Nat plugins that was my issue earlier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants