Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List fusion #22

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
cabal-dev
dist
.cabal-sandbox/
cabal.sandbox.config
.stack-work/
*.hi
*.o

# This file will be autogenerated on 'cabal build'.
cbits/GmpDerivedConstants.h
16 changes: 12 additions & 4 deletions bitset.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,8 @@ Bug-reports: http://github.com/lambda-llama/bitset/issues
Stability: Experimental
Cabal-Version: >= 1.12
Build-type: Custom
Tested-with: GHC >= 7.4.1
Extra-Source-Files: bin/mkDerivedGmpConstants.c
Tested-with: GHC == 7.4.1, GHC == 7.6.3, GHC == 7.8.4
Extra-Source-Files: bin/mkDerivedGmpConstants.c, include/bitset.h

Source-repository head
Type: git
Expand All @@ -27,9 +27,15 @@ Library
Hs-source-dirs: src
Ghc-options: -Wall -fno-warn-orphans
Default-language: Haskell2010
Other-extensions: CPP, NamedFieldPuns, MagicHash, UnboxedTuples,
BangPatterns, ForeignFunctionInterface,
GHCForeignImportPrim, MagicHash,
UnliftedFFITypes, UnboxedTuples,
GeneralizedNewtypeDeriving, TypeFamilies,
DeriveDataTypeable

C-sources: cbits/gmp-extras.cmm
Include-dirs: cbits
Include-dirs: cbits, include

if os(windows)
Extra-libraries: gmp-10
Expand All @@ -52,6 +58,7 @@ Test-suite bitset-tests
Hs-source-dirs: tests
Ghc-options: -Wall -O2 -fno-warn-orphans
Default-language: Haskell2010
Other-extensions: CPP

Type: exitcode-stdio-1.0
Main-is: Tests.hs
Expand All @@ -66,9 +73,10 @@ Benchmark bitset-benchmarks
Hs-source-dirs: src benchmarks
Ghc-options: -Wall -fno-warn-orphans -O2 -optc-O3 -optc-msse4.1
Default-language: Haskell2010
Other-extensions: CPP, ExistentialQuantification

C-sources: cbits/gmp-extras.cmm
Include-dirs: cbits
Include-dirs: cbits, include
Extra-libraries: gmp

Type: exitcode-stdio-1.0
Expand Down
39 changes: 39 additions & 0 deletions include/bitset.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
/*
* Common macros for bitset
*/

#ifndef HASKELL_BITSET_H
#define HASKELL_BITSET_H

/*
* We use cabal-generated MIN_VERSION_base to adapt to changes of base.
* Nevertheless, as a convenience, we also allow compiling without cabal by
* defining an approximate MIN_VERSION_base if needed. The alternative version
* guesses the version of base using the version of GHC. This is usually
* sufficiently accurate. However, it completely ignores minor version numbers,
* and it makes the assumption that a pre-release version of GHC will ship with
* base libraries with the same version numbers as the final release. This
* assumption is violated in certain stages of GHC development, but in practice
* this should very rarely matter, and will not affect any released version.
*/
#ifndef MIN_VERSION_base
#if __GLASGOW_HASKELL__ >= 711
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major2) == 4)&&((major2)<=9)))
#elif __GLASGOW_HASKELL__ >= 709
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=8)))
#elif __GLASGOW_HASKELL__ >= 707
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=7)))
#elif __GLASGOW_HASKELL__ >= 705
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=6)))
#elif __GLASGOW_HASKELL__ >= 703
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=5)))
#elif __GLASGOW_HASKELL__ >= 701
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=4)))
#elif __GLASGOW_HASKELL__ >= 700
#define MIN_VERSION_base(major1,major2,minor) (((major1)<4)||(((major1) == 4)&&((major2)<=3)))
#else
#define MIN_VERSION_base(major1,major2,minor) (0)
#endif
#endif

#endif
9 changes: 5 additions & 4 deletions src/Data/BitSet/Dynamic.hs
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

#include <bitset.h>

-----------------------------------------------------------------------------
-- |
-- Module : Data.BitSet.Dynamic
Expand Down Expand Up @@ -122,11 +124,10 @@ instance Bits FasterInteger where
isSigned = isSigned . unFI
{-# INLINE isSigned #-}

bitSize = bitSize . unFI
{-# INLINE bitSize #-}
bitSize _ = error "bitSize: FasterInteger does not support bitSize."

#if defined(__GLASGOW_HASKELL__) && (__GLASGOW_HASKELL__ >= 707)
bitSizeMaybe = bitSizeMaybe . unFI
#if MIN_VERSION_base(4,7,0)
bitSizeMaybe _ = Nothing
{-# INLINE bitSizeMaybe #-}
#endif

Expand Down
126 changes: 97 additions & 29 deletions src/Data/BitSet/Generic.hs
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

#include <bitset.h>

module Data.BitSet.Generic
(
-- * Bit set type
Expand Down Expand Up @@ -79,6 +81,9 @@ import Control.Applicative ((<$>))
import Control.DeepSeq (NFData(..))
import Data.Bits (Bits, (.|.), (.&.), complement, bit,
testBit, setBit, clearBit, popCount)
#if MIN_VERSION_base(4,7,0)
import Data.Bits (bitSizeMaybe, isSigned, unsafeShiftR, zeroBits)
#endif
import Data.Data (Typeable)
import Data.Monoid (Monoid(..))
import Foreign (Storable)
Expand All @@ -94,29 +99,35 @@ import qualified Data.List as List
newtype BitSet c a = BitSet { getBits :: c }
deriving (Eq, NFData, Storable, Ord, Typeable)

instance (Enum a, Read a, Bits c, Num c) => Read (BitSet c a) where
instance (Enum a, Read a, Bits c) => Read (BitSet c a) where
readPrec = parens . prec 10 $ do
Ident "fromList" <- lexP
fromList <$> readPrec

instance (Enum a, Show a, Bits c, Num c) => Show (BitSet c a) where
instance (Enum a, Show a, Bits c) => Show (BitSet c a) where
showsPrec p bs = showParen (p > 10) $
showString "fromList " . shows (toList bs)

instance (Enum a, Bits c, Num c) => Monoid (BitSet c a) where
instance Bits c => Monoid (BitSet c a) where
mempty = empty
mappend = union

#if defined(__GLASGOW_HASKELL__) && (__GLASGOW_HASKELL__ >= 707)
instance (Enum a, Bits c, Num c) => IsList (BitSet c a) where
instance (Enum a, Bits c) => IsList (BitSet c a) where
type Item (BitSet c a) = a
fromList = fromList
toList = toList
#endif

#if !MIN_VERSION_base(4,7,0)
zeroBits :: Bits c => c
zeroBits = bit 0 `clearBit` 0
{-# INLINE zeroBits #-}
#endif

-- | /O(1)/. Is the bit set empty?
null :: (Eq c, Num c) => BitSet c a -> Bool
null = (== 0) . getBits
null :: Bits c => BitSet c a -> Bool
null = (== zeroBits) . getBits
{-# INLINE null #-}

-- | /O(1)/. The number of elements in the bit set.
Expand All @@ -136,22 +147,22 @@ notMember x = not . member x

-- | /O(max(n, m))/. Is this a subset? (@s1 `isSubsetOf` s2@) tells whether
-- @s1@ is a subset of @s2@.
isSubsetOf :: (Bits c, Eq c) => BitSet c a -> BitSet c a -> Bool
isSubsetOf :: Bits c => BitSet c a -> BitSet c a -> Bool
isSubsetOf (BitSet bits1) (BitSet bits2) = bits2 .|. bits1 == bits2
{-# INLINE isSubsetOf #-}

-- | /O(max(n, m)/. Is this a proper subset? (ie. a subset but not equal).
isProperSubsetOf :: (Bits c, Eq c) => BitSet c a -> BitSet c a -> Bool
isProperSubsetOf :: Bits c => BitSet c a -> BitSet c a -> Bool
isProperSubsetOf bs1 bs2 = bs1 `isSubsetOf` bs2 && bs1 /= bs2
{-# INLINE isProperSubsetOf #-}

-- | The empty bit set.
empty :: (Enum a, Bits c, Num c) => BitSet c a
empty = BitSet 0
empty :: Bits c => BitSet c a
empty = BitSet zeroBits
{-# INLINE empty #-}

-- | O(1). Create a singleton set.
singleton :: (Enum a, Bits c, Num c) => a -> BitSet c a
singleton :: (Enum a, Bits c) => a -> BitSet c a
singleton = BitSet . bit . fromEnum
{-# INLINE singleton #-}

Expand Down Expand Up @@ -186,7 +197,7 @@ intersection (BitSet bits1) (BitSet bits2) = BitSet $ bits1 .&. bits2

-- | /O(d * n)/ Transform this bit set by applying a function to every
-- value. Resulting bit set may be smaller then the original.
map :: (Enum a, Enum b, Bits c, Num c) => (a -> b) -> BitSet c a -> BitSet c b
map :: (Enum a, Enum b, Bits c) => (a -> b) -> BitSet c a -> BitSet c b
map f = foldl' (\bs -> (`insert` bs) . f) empty
{-# INLINE map #-}

Expand All @@ -195,38 +206,95 @@ map f = foldl' (\bs -> (`insert` bs) . f) empty
-- operator is evaluated before before using the result in the next
-- application. This function is strict in the starting value.
foldl' :: (Enum a, Bits c) => (b -> a -> b) -> b -> BitSet c a -> b
foldl' f acc0 (BitSet bits) = go acc0 (popCount bits) 0 where
go !acc 0 _b = acc
go !acc !n b = if bits `testBit` b
then go (f acc $ toEnum b) (pred n) (succ b)
else go acc n (succ b)
#if MIN_VERSION_base(4,7,0)
-- If the bit set is represented by an unsigned type
-- then we can shift the bits off one by one until we're
-- left with all zeros. If the type is fairly narrow, then
-- this is likely to be cheap. In particular, in this case
-- we don't need to calculate the `popCount` and all shifts
-- are by fixed amounts.
foldl' f acc0 (BitSet bits0)
| not (isSigned bits0) && maybe False (<= 128) (bitSizeMaybe bits0) =
go acc0 bits0 0
where
go !acc !bits !b
| bits == zeroBits = acc
| bits `testBit` 0 = go (f acc $ toEnum b) (bits `unsafeShiftR` 1) (b + 1)
| otherwise = go acc (bits `unsafeShiftR` 1) (b + 1)
#endif
foldl' f acc0 (BitSet bits) = go acc0 (popCount bits) 0
where
go !acc 0 !_b = acc
go !acc n !b = if bits `testBit` b
then go (f acc $ toEnum b) (n - 1) (b + 1)
else go acc n (b + 1)
{-# INLINE foldl' #-}

-- | /O(d * n)/ Reduce this bit set by applying a binary function to
-- all elements, using the given starting value.
foldr :: (Enum a, Bits c) => (a -> b -> b) -> b -> BitSet c a -> b
#if MIN_VERSION_base(4,7,0)
foldr f acc0 (BitSet bits0)
| not (isSigned bits0) && maybe False (<= 128) (bitSizeMaybe bits0) = go bits0 0
where
go !bits !b
| bits == zeroBits = acc0
| bits `testBit` 0 = toEnum b `f` go (bits `unsafeShiftR` 1) (b + 1)
| otherwise = go (bits `unsafeShiftR` 1) (b + 1)
#endif
foldr f acc0 (BitSet bits) = go (popCount bits) 0 where
go 0 _b = acc0
go !n b = if bits `testBit` b
then toEnum b `f` go (pred n) (succ b)
else go n (succ b)
then toEnum b `f` go (n - 1) (b + 1)
else go n (b + 1)
{-# INLINE foldr #-}

-- | /O(d * n)/ Filter this bit set by retaining only elements satisfying
-- predicate.
filter :: (Enum a, Bits c, Num c) => (a -> Bool) -> BitSet c a -> BitSet c a
filter :: (Enum a, Bits c) => (a -> Bool) -> BitSet c a -> BitSet c a
filter f = foldl' (\bs x -> if f x then x `insert` bs else bs) empty
{-# INLINE filter #-}

-- | /O(d * n)/. Convert this bit set set to a list of elements.
toList :: (Enum a, Bits c, Num c) => BitSet c a -> [a]
toList bs = build (\k z -> foldr k z bs)
{-# INLINE [0] toList #-}
toList :: (Enum a, Bits c) => BitSet c a -> [a]
toList bs = foldr (:) [] bs
{-# NOINLINE [0] toList #-}

-- We rewrite toList to a `build` form to fuse with `foldr`. We write
-- `fromList` using a `foldr` form to fuse with `build` and `augment`. The
-- fromList/toList rule is more general than the old `fromList . toList = id`
-- rule. This extra generality fell out naturally from the rule construction,
-- but it seems to be at least somewhat useful; for example, `fromList $ toList
-- xs ++ toList ys` rewrites to the union of `xs` and `ys`.
{-# RULES
"toList" [~1] forall bs . toList bs = build (toListFB bs)
"toList/List" [1] forall bs . toListFB bs (:) [] = toList bs
"fromList/toList" forall bs f cs. toListFB bs fromListFB f cs =
f $! union bs cs
#-}

{-
Explanation of fromList/toList rule:

toListFB bs fromListFB f cs =
foldr fromListFB f bs cs =
foldr (\x r -> \ !acc -> r (insert x acc)) f bs cs =
foldr (\x r !acc -> r (insert x acc)) f bs cs

This last form inserts each element of `bs` into `cs`, accumulating strictly,
then applies `f` to the final result. This is just the same as taking their
*union* and applying `f` to it.
-}

toListFB :: (Enum a, Bits c) => BitSet c a -> (a -> b -> b) -> b -> b
toListFB bs = \k z -> foldr k z bs
{-# INLINE [0] toListFB #-}

-- | /O(d * n)/. Make a bit set from a list of elements.
fromList :: (Enum a, Bits c, Num c) => [a] -> BitSet c a
fromList = BitSet . List.foldl' (\i x -> i `setBit` fromEnum x) 0
{-# INLINE [0] fromList #-}
{-# RULES
"fromList/toList" forall bs. fromList (toList bs) = bs
#-}
fromList :: (Enum a, Bits c) => [a] -> BitSet c a
fromList xs = List.foldr fromListFB id xs empty
{-# INLINE fromList #-}

fromListFB :: (Enum a, Bits c) => a -> (BitSet c a -> b) -> BitSet c a -> b
fromListFB x r = \ !acc -> r (insert x acc)
{-# INLINE [0] fromListFB #-}
14 changes: 7 additions & 7 deletions src/Data/BitSet/Word.hs
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ insert = GS.insert

-- | /O(1)/. Delete an item from the bit set.
delete :: Enum a => a -> BitSet a -> BitSet a
delete = GS.delete
delete x xs = GS.delete x xs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could you explain why the arguments are necessary here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the generated core with -O2. All will be fine if WS.delete is fully
applied and inlines, but if it's partially applied and passed to a
higher-order function that doesn't inline, the Bits dictionary may be
passed at runtime (I believe). The extra parameters there are to fully
apply GS.delete in WS.delete, so GHC will inline it when compiling
WS.delete.
On Jun 15, 2016 2:45 PM, "Sergei Lebedev" [email protected] wrote:

In src/Data/BitSet/Word.hs
#22 (comment):

@@ -124,7 +124,7 @@ insert = GS.insert

-- | /O(1)/. Delete an item from the bit set.
delete :: Enum a => a -> BitSet a -> BitSet a
-delete = GS.delete
+delete x xs = GS.delete x xs

Hmm, could you explain why the arguments are necessary here?


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
https://github.com/lambda-llama/bitset/pull/22/files/231962e0e6a63fe63e1952826f97ca529f20a4e5#r67221750,
or mute the thread
https://github.com/notifications/unsubscribe/ABzi_QRX_uV6tfFowIlj45kg27ygZu9Cks5qMEg4gaJpZM4IyfKO
.

{-# INLINE delete #-}

-- | /O(1)/. The union of two bit sets.
Expand All @@ -149,35 +149,35 @@ intersection = GS.intersection
-- | /O(n)/ Transform this bit set by applying a function to every value.
-- Resulting bit set may be smaller then the original.
map :: (Enum a, Enum b) => (a -> b) -> BitSet a -> BitSet b
map = GS.map
map f = GS.map f
{-# INLINE map #-}

-- | /O(n)/ Reduce this bit set by applying a binary function to all
-- elements, using the given starting value. Each application of the
-- operator is evaluated before before using the result in the next
-- application. This function is strict in the starting value.
foldl' :: Enum a => (b -> a -> b) -> b -> BitSet a -> b
foldl' = GS.foldl'
foldl' f b xs = GS.foldl' f b xs
{-# INLINE foldl' #-}

-- | /O(n)/ Reduce this bit set by applying a binary function to all
-- elements, using the given starting value.
foldr :: Enum a => (a -> b -> b) -> b -> BitSet a -> b
foldr = GS.foldr
foldr c n xs = GS.foldr c n xs
{-# INLINE foldr #-}

-- | /O(n)/ Filter this bit set by retaining only elements satisfying a
-- predicate.
filter :: Enum a => (a -> Bool) -> BitSet a -> BitSet a
filter = GS.filter
filter f = GS.filter f
{-# INLINE filter #-}

-- | /O(n)/. Convert the bit set set to a list of elements.
toList :: Enum a => BitSet a -> [a]
toList = GS.toList
toList xs = GS.toList xs
{-# INLINE toList #-}

-- | /O(n)/. Make a bit set from a list of elements.
fromList :: Enum a => [a] -> BitSet a
fromList = GS.fromList
fromList xs = GS.fromList xs
{-# INLINE fromList #-}
Loading