[work on the IxSet tutorial Jeremy Shaw **20120119235722 Ignore-this: 7f7766079d5cccba98c1873416f2ca ] hunk ./AcidState.lhs 25 +#include "IxSet.lhs" addfile ./IxSet.markdown.lhs hunk ./IxSet.markdown.lhs 1 + +

IxSet: a set with multiple indexed keys

+

To use IxSet you will need to install the optional ixset package.

+ +In the first acid-state example we stored a single value. But in real database we typically need to store a large collection of records. And we want to be able to efficiently search and update those records. For simple key/value pairs we can use `Data.Map`. + +However, in practice, we often want to have *multiple* keys. That is what `IxSet` set offers -- a set-like type which can be indexed by multiple keys. + +IxSet can be found here on hackage. + +In this example, we will use `IxSet` to create a mini-blog. + +
+ +> {-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving, RecordWildCards, TemplateHaskell #-} + +> module Main where + +> import Control.Monad.Reader (ask) +> import Control.Monad.State (get, put) +> import Data.Acid (Update, Query, makeAcidic) +> import Data.Acid.Advanced (update', query') +> import Data.Data (Data, Typeable) +> import Data.IxSet (Indexable(..), IxSet(..), (@=), Proxy(..), getOne, ixFun, ixSet) +> import qualified Data.IxSet as IxSet +> import Data.SafeCopy (SafeCopy, base, deriveSafeCopy) +> import Data.Text (Text) +> import qualified Data.Text as Text +> import Data.Time (UTCTime(..)) + +
+ +The first thing we are going to need is a type to represent a blog post. + +It is convenient to assign a unique id to each blog post so that it can be easily referenced in urls and easily queried in the `IxSet`. In order to keep ourselves sane, we can create a `newtype` wrapper around an `Integer` instead of just using a nameless `Integer`. + +
+ +> newtype PostId = PostId { unPostId :: Integer } +> deriving (Eq, Ord, Data, Enum, Typeable, SafeCopy) + +
+ +Note that in addition to deriving normal classes like `Eq` and `Ord`, we also derive an instance of `SafeCopy`. This is not required by `IxSet` itself, but since we want to store the our blog posts in `acid-state` we will need it there. + +Our blog post will be able to have two status 'draft' and 'published'. We could use a boolean value, but it is easier to understand what `Draft` vs `Published` mean instead of trying to remember what `True` and `False` mean. Additionally, we can easily extend the type with additional states later. + +
+ +> data Status = +> Draft +> | Published +> deriving (Eq, Ord, Data, Typeable) +> +> deriveSafeCopy 0 'base ''Status + +
+ +A now we can create a simple record which represents a single blog post: + +
+ +> data Post = Post +> { postId :: PostId +> , title :: Text +> , author :: Text +> , body :: Text +> , date :: UTCTime +> , status :: Status +> , tags :: [Text] +> } +> deriving (Eq, Ord, Data, Typeable) +> +> deriveSafeCopy 0 'base ''Post + +
+ +Each `IxSet` key needs to have a unique type. Looking at `Post` it seems like that could be trouble -- because we have multiple fields which all have the type `Text`. Fortunately, we can easily get around this by introducing some newtypes which are used for indexing. + +
+ +> newtype Title = Title Text deriving (Eq, Ord, Data, Typeable, SafeCopy) +> newtype Author = Author Text deriving (Eq, Ord, Data, Typeable, SafeCopy) +> newtype Tag = Tag Text deriving (Eq, Ord, Data, Typeable, SafeCopy) + +
+ +We are now ready to create an instance of the `Indexable` class. This is the class that defines the keys for a `Post` so that we can store it in an `IxSet`: + +
+ +> instance Indexable Post where +> empty = ixSet [ ixFun $ \bp -> [ postId bp ] +> , ixFun $ \bp -> [ Title $ title bp ] +> , ixFun $ \bp -> [ Author $ author bp ] +> , ixFun $ \bp -> map Tag (tags bp) +> , ixFun $ (:[]) . date -- point-free, just for variety +> ] +> + +
+ +In the `Indexable Post` instance we create a list of `Ix Post` values by using the `ixFun` helper function: + +
+#ifdef HsColour +> ixFun :: (Ord b, Typeable b) => (a -> [b]) -> Ix a +#endif +
+ +We pass to `ixFun` a key extraction function. For example, in this line: + +
+#ifdef HsColour +> ixFun $ \bp -> [ postId bp ] +#endif +
+ +we extract the `PostId` from a `Post`. Note that we return a list of keys values not just a single key. That is because a single entry might have several keys for a specific type. For example, a `Post` has a list of tags. But, we want to be able to search for posts that match a specific tag. So, we index each tag separately: + +
+#ifdef HsColour +> ixFun $ \bp -> map Tag (tags bp) +#endif +
+ +For the `Title` and `Author` keys we add the newtype wrapper. + +Now that we have a place to store our posts, we can stash them in a record that we will store in acid-state: + +
+ +> data Blog = Blog +> { nextPostId :: PostId +> , posts :: IxSet Post +> } +> +> initialBlogState :: Blog +> initialBlogState = +> Blog { nextPostId = PostId 1 +> , posts = empty +> } + +
+ +`IxSet` does not (currently) provide any auto-increment functionality for indexes, so we have to keep track of that ourselves. That is why we have the `nextPostId` field. Note that in `initialBlogState` the `nextPostId` is initialized to 1 not 0. Sometimes we want to create a `Post` that is not yet in the database, and hence does not have a valid `PostId`. I like to reserve `PostId 0` to mean uninitialized. If I ever see a `PostId 0` stored in the database, I know there is a bug in my code. + +Next we will create some update and query functions for our acid-state database. + +> -- | create a new, empty post and add it to the database +> newPost :: UTCTime -> Update Blog Post +> newPost pubDate = +> do b@Blog{..} <- get +> let post = Post { postId = nextPostId +> , title = Text.empty +> , author = Text.empty +> , body = Text.empty +> , date = pubDate +> , status = Draft +> , tags = [] +> } +> put $ b { nextPostId = succ nextPostId +> , posts = IxSet.insert post posts +> } +> return post + +Nothing in that function should be too surprising. We have to pass in `UTCTime`, because we can not do IO in the update function. Because `PostId` is an instance of `Enum` we can use `succ` to increment it. To add the new post to the `IxSet` we use `IxSet.insert`. + +
+#ifdef HsColour +> insert :: (Typeable a, Ord a, Indexable a) => a -> IxSet a -> IxSet a +#endif +
+ +Next we have a function that replaces a `Post` in the database with a newer version + +> -- | update the post in the database (indexed by PostId) +> updatePost :: Post -> Update Blog () +> updatePost updatedPost = +> do b@Blog{..} <- get +> put $ b { posts = IxSet.updateIx (postId updatedPost) updatedPost posts +> } + +Note that instead of `insert` we use `updateIx`: + +
+#ifdef HsColour +> updateIx :: (Indexable a, Ord a, Typeable a, Typeable key) => key -> a -> IxSet a -> IxSet a +#endif +
+ +The first argument to `updateIx` is a key that maps to the post we want to updated in the database. The key must uniquely identify a single entry in the database. In this case we use our primary key, `PostId`. + +Next we have some query functions. + +> postById :: PostId -> Query Blog (Maybe Post) +> postById pid = +> do Blog{..} <- ask +> return $ getOne $ posts @= pid + +`postById` is used to lookup a specific post by its `PostId`. This is our first example of querying an `IxSet`. He we use the equals query operator: + +
+#ifdef HsColour +> (@=) :: (Typeable key, Ord a, Typeable a, Indexable a) => IxSet a -> key -> IxSet a +#endif +
+ +It takes an `IxSet` and filters it to produce a new `IxSet` which only contains values that match the specified key. In this case, we have specified a primary key, so we expect exactly zero or one values in the resulting `IxSet`. So, we can use `getOne` to turn the result into a simple `Maybe` value: + +
+#ifdef HsColour +> getOne :: Ord a => IxSet a -> Maybe a +#endif +
+ +Here is a query function that gets all the published posts and sorts them in reverse chronological order (aka, newest first) + +> getPublishedPosts :: Query Blog [Post] +> getPublishedPosts = +> do Blog{..} <- ask +> return $ IxSet.toDescList (Proxy :: Proxy UTCTime) $ posts @= Published + +We use the `@=` operator again to select just the posts which have the `Published` status. Since the publication date is a key (`UTCTime`) we can use `toDescList` to return a sorted list: + +
+#ifdef HsColour +> toDescList :: (Typeable k, Typeable a, Indexable a) => Proxy k -> IxSet a -> [a] +#endif +
+ +`toDescList` takes a funny argument `(Proxy :: Proxy UTCTime)`. While the `Post` type itself has an `Ord` instance -- we generally want to order by a specific key, which may have a different ordering. Since our keys are type directed, we need a way to pass a type to 'toDescList' so that it knows which key we want to order by. The `Proxy` type exists for that sole reason: + +
+#ifdef HsColour +> data Proxy a = Proxy +#endif +
+ +It just gives us a place to stick a type signature that `toDescList` and other functions can use. + + + hunk ./Makefile 13 -ACIDSTATE_DEMOS := AcidStateCounter.lhs +ACIDSTATE_DEMOS := AcidStateCounter.lhs IxSet.lhs hunk ./Makefile 32 -%.lhs: %.markdown.lhs +%.lhs : %.markdown.lhs