I recently decided to switch my blog from Jekyll Bootstrap to Hakyll, mostly because I was rewriting all the HTML and CSS anyway and wanted to give Hakyll a try, but also because I wanted a better idea of what was going on with each part of the site.
Getting started with Hakyll is pretty easy: the package comes with a program, hakyll-init, which creates a basic site for you to build on, and there are lots of sites using it that open-source their code. Some of the things I wanted to do took some work to figure out or find, though, so I thought I’d collect them here.
There are a number of ways to add next and previous buttons to the posts. I liked the method I found on Richard Goulter’s blog, but Hakyll has added some new functions since that was written, so we can cut it down a bit now.
postList <- sortRecentFirst =<< getMatches "posts/*"

match "posts/*" $ do
    -- strip date from filename when producing route
    route $ gsubRoute postDateRegex (const "posts/") `composeRoutes`
            setExtension "html"
    compile $ do
        let postLocationContext =
                field "nextPost" (nextPostURL postList) `mappend`
                field "prevPost" (prevPostURL postList) `mappend`
                postCtx
        pandocCompiler
            >>= loadAndApplyTemplate "templates/post.html" postLocationContext
            >>= saveSnapshot "content"
            >>= loadAndApplyTemplate "templates/default.html" postLocationContext
            >>= relativizeUrls
With sortRecentFirst from Hakyll, I now only need to define nextPostURL and prevPostURL, which are almost identical.
findPostUrl :: ([Identifier] -> Identifier -> Maybe Identifier)
            -> [Identifier] -> Item String
            -> Compiler String
findPostUrl p posts post = do
    let id = itemIdentifier post
    case p posts id of
        Just m  -> maybe empty toUrl <$> getRoute m
        Nothing -> empty

prevPostURL :: [Identifier] -> Item String -> Compiler String
prevPostURL = findPostUrl lookupPrev

nextPostURL :: [Identifier] -> Item String -> Compiler String
nextPostURL = findPostUrl lookupNext

lookupPrev :: Eq a => [a] -> a -> Maybe a
lookupPrev ids id = case elemIndex id ids of
    Just i  -> if i >= (length ids - 1) then Nothing else Just $ ids !! (i+1)
    Nothing -> Nothing

lookupNext :: Eq a => [a] -> a -> Maybe a
lookupNext ids id = case elemIndex id ids of
    Just i  -> if i <= 0 then Nothing else Just $ ids !! (i-1)
    Nothing -> Nothing
I also changed lookupNext and lookupPrev to use indices because I managed to get them subtly wrong about five times while setting up the site (in fact, at the time of this writing they are wrong, and will hopefully be fixed for real when I post this).
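To make the ordering concrete, here’s a small self-contained sketch of the intended behavior, using plain strings in place of Hakyll Identifiers (the sample post names are made up). The list is sorted newest-first, so "next" moves toward the front of the list and "previous" toward the back:

```haskell
import Data.List (elemIndex)

-- same logic as above, on any Eq type
lookupPrev, lookupNext :: Eq a => [a] -> a -> Maybe a
lookupPrev ids i = elemIndex i ids >>= \n ->
    if n >= length ids - 1 then Nothing else Just (ids !! (n+1))
lookupNext ids i = elemIndex i ids >>= \n ->
    if n <= 0 then Nothing else Just (ids !! (n-1))

main :: IO ()
main = do
    let posts = ["2015-03-post", "2015-02-post", "2015-01-post"] -- newest first
    print (lookupPrev posts "2015-02-post") -- Just "2015-01-post" (older)
    print (lookupNext posts "2015-02-post") -- Just "2015-03-post" (newer)
    print (lookupNext posts "2015-03-post") -- Nothing: the newest post has no next
```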
GitHub doesn’t let me directly tell the server to issue redirects, but fortunately HTML already has a way to do this. I’ll create a template redirect.html that contains the following lines:
<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <meta http-equiv="refresh" content="0; url=/$postName$" />
  </head>
</html>
So then I just need to get a list of all the pages I want to redirect from, along with their targets, and produce a tiny page for each one with postName replaced. This turns out to be fairly simple once you spend some time looking at the Hakyll docs. For extensibility, I’ll use post metadata to decide which redirect pages to generate; this way I don’t need to artificially separate my old posts from my new posts, and I can create multiple aliases to a page if I want to.
aliasList <- getAliases <$> getAllMetadata "posts/*"
let aliases = map fromFilePath $ snd $ unzip aliasList

create aliases $ do
    route idRoute
    compile $ do
        let aliasCtx = field "postName" (getRealName aliasList) `mappend`
                       defaultContext
        makeItem ""
            >>= loadAndApplyTemplate "templates/redirect.html" aliasCtx
            >>= relativizeUrls
I grab all the metadata for each file in posts/, and convert it into (target, alias) pairs. The list aliases contains only the alias names, which I use to generate the redirect pages, looking up the target page name in aliasList with getRealName. The actual work is done in getAliases, which filters out pages that don’t define the aliases keyword and reformats titles to match my URL scheme.
stripPostDate :: FilePath -> FilePath
stripPostDate = replaceAll postDateRegex (const "posts/")

-- convert a list of (Identifier,Metadata) into a list of (target,alias)
getAliases :: [(Identifier,Metadata)] -> [(FilePath,FilePath)]
getAliases ids =
    let pairs = filter (not . null . snd) $ map expand ids
        paths = map (second (map addIndex) . first idToPath) pairs
    in  concatMap unzipSecond paths
  where
    expand :: (Identifier,Metadata) -> (Identifier,[FilePath])
    expand = second (maybe [] read . M.lookup "aliases")

    idToPath :: Identifier -> FilePath
    idToPath = stripPostDate . toFilePath

    addIndex :: FilePath -> FilePath
    addIndex f = dropWhile (=='/') $ dropExtension f </> "index.html"

-- get the path of the page an alias is pointing to
getRealName :: [(FilePath,FilePath)] -> Item String -> Compiler String
getRealName as i = do
    let id   = toFilePath $ itemIdentifier i
        path = fst $ fromJust $ find ((==id) . snd) as
    return $ replaceExtension path "html"

unzipSecond :: (a,[b]) -> [(a,b)]
unzipSecond (x,ys) = map (\a->(x,a)) ys
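To see how the pieces fit together, here’s a pure sketch of the alias expansion on made-up data, with Data.Map standing in for Hakyll’s Metadata and a hypothetical getAliasPairs in place of the real getAliases (which also rewrites the paths):

```haskell
import qualified Data.Map as M
import Data.Bifunctor (second)

unzipSecond :: (a,[b]) -> [(a,b)]
unzipSecond (x,ys) = map (\y -> (x,y)) ys

-- one post with two aliases becomes two (target, alias) pairs;
-- posts with no "aliases" key are dropped
getAliasPairs :: [(FilePath, M.Map String [FilePath])] -> [(FilePath,FilePath)]
getAliasPairs = concatMap unzipSecond
              . filter (not . null . snd)
              . map (second (M.findWithDefault [] "aliases"))

main :: IO ()
main = print $ getAliasPairs
    [ ("posts/foo.md", M.fromList [("aliases", ["old/foo/index.html", "foo.html"])])
    , ("posts/bar.md", M.empty) ]
```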
Some of this is specific to my site layout and title format, but most of it is general-purpose. If you haven’t seen them before, first and second from Control.Arrow have kind of scary types, but the way most people use them is just to apply a function inside a tuple: first foo (a,b) == (foo a, b) and second bar (a,b) == (a, bar b).
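A quick demonstration of that tuple behavior:

```haskell
import Control.Arrow (first, second)

main :: IO ()
main = do
    print (first (+1) (1 :: Int, "a"))          -- applies (+1) to the fst slot
    print (second length ("k", [1,2,3 :: Int])) -- applies length to the snd slot
```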
So, if you’re not using Jekyll, GitHub wants you to upload the static site directly, but this is a problem because I’ve got all this stuff that isn’t part of the site sitting in the repo. The solution is to keep all the source in a separate branch, which I call hakyll, and then deploy by pushing only the built site in _site/ to master. For this purpose I modified the deploy.sh script used by Jorge Israel Peña for his blog. Hakyll has a deploy command you can set in the site configuration, so I add
siteConfig :: Configuration
siteConfig = defaultConfiguration{ deployCommand = "bash deploy.sh deploy" }
and replace the call to hakyll with hakyllWith siteConfig in main. Then I can deploy the site by calling ./site deploy.
Overall, switching to Hakyll was pretty simple, and I didn’t really have any issues figuring out how to do anything I wanted, so I’d call this a win.
This post doesn’t cover any new functionality; instead, we’ll be revisiting some old code and refactoring it to be less awful. In particular, we’ll be using lenses to make our nested datatypes less annoying to work with. I’ve tried to record every change I made, but it’s possible I missed something; if you have trouble compiling, look at the github changelog to try to find the issue, and also let me know so I can update this article.
First: I am not going to attempt to explain lenses in any detail. This tutorial is a decent introduction to what lenses are, but for our purposes it’s enough to know that lenses look and act a lot like accessors in other languages. Lenses compose “backwards”, so for example (content . height) will access the height element of the content element of a Dimensions.
Second: lenses are a pain to read, partly because they compose backwards, but also because the lens library includes operators for every situation Edward Kmett could think of (over 100 operators, including such gems as the squid operator (<<<>~) and the cartoon invective operator (^@!?)). We’ll try to avoid readability issues by using a really minimal set of lens functions and operators, namely: (^.), which gives the value of the field a lens accesses; (.~), which assigns a value to the field a lens accesses; (.=), which is (.~) but for the state inside a State monad; (&), which is just flip ($) and is used to chain lens operations; (&~), which is (&) inside a State monad; and zoom, which “zooms in” on a lens, so that if we’re going to perform several operations on the same field, we don’t need to keep writing it.
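If you’re curious what’s actually under the hood, here’s a minimal sketch of how a van Laarhoven lens supports both viewing and setting, using only base. The Rect type, the hand-written width lens, and the view/set helpers are made up for illustration; they mimic (^.) and (.~) but are not the real lens library machinery:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const(..))
import Data.Functor.Identity (Identity(..))

-- a lens is a function polymorphic over a Functor
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

data Rect = Rect { _w :: Float, _h :: Float } deriving (Show, Eq)

-- a hand-written lens for the _w field
width :: Lens' Rect Float
width f r = fmap (\w -> r { _w = w }) (f (_w r))

-- view (like ^.) instantiates the Functor to Const
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- set (like .~) instantiates the Functor to Identity
set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (Identity . const a)

main :: IO ()
main = do
    let r = Rect 10 20
    print (view width r)  -- 10.0
    print (set width 5 r) -- Rect {_w = 5.0, _h = 20.0}
```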
The lens library conveniently provides a Template Haskell macro to create lenses for a structure: all we need to do is prefix each field we want to derive a lens for with _, and then call makeLenses ''<type name>. The ability to automatically derive lenses is the main reason we’re using the lens library instead of just writing our own simple lenses; if you don’t want to include all the dependencies lens has, you can also use lens-family, which has all the operators we’ll be using.
For starters, we’ll derive lenses for NTree in Dom.hs.
-- I'm only showing lines that have changed
-- It's not actually mentioned on the page, probably because they thought it
-- was obvious, but you need the TemplateHaskell pragma to use makeLenses
{-# LANGUAGE OverloadedStrings, FlexibleInstances, TemplateHaskell#-}
import Control.Lens
data NTree a = NTree { _root :: a, _children :: [NTree a] }
makeLenses ''NTree
Most of our work is going to be in Layout.hs, which will look a lot nicer by the time we finish.
{-#LANGUAGE BangPatterns, OverloadedStrings, TemplateHaskell#-}
import Data.Function (on)
import Control.Lens hiding (children)
data Rect = Rect { _x      :: Float
                 , _y      :: Float
                 , _width  :: Float
                 , _height :: Float }

data Dimensions = Dimensions { _content :: Rect
                             , _padding :: EdgeSize
                             , _border  :: EdgeSize
                             , _margin  :: EdgeSize }

data EdgeSize = EdgeSize { _left   :: Float
                         , _right  :: Float
                         , _top    :: Float
                         , _bottom :: Float }

makeLenses ''Rect
makeLenses ''Dimensions
makeLenses ''EdgeSize

-- here we use (.~) to pass contBlock with its height value set to 0
layoutTree :: StyledNode -> Dimensions -> Either T.Text LayoutBox
layoutTree root contBlock = buildLayoutTree root >>=
    flip layout (contBlock & content.height .~ 0)
-- only one line has changed in buildLayoutTree
buildLayoutTree :: StyledNode -> Either T.Text LayoutBox
...
    anonify = concatMap mergeInlines . groupBy ((&&) `on` isInline)
...

-- here we use a lens to get rid of the ugly nested pattern match
layout :: LayoutBox -> Dimensions -> Either T.Text LayoutBox
layout l contBlock = case l ^. root._2 of
    BlockNode _    -> layoutBlock contBlock l
    InlineNode _   -> undefined
    AnonymousBlock -> undefined

-- the next few functions have changed significantly, and several parts
-- have been pulled out into their own things
auto = Keyword "auto"
zero = Length 0 Px

calcWidth :: Dimensions -> LayoutBox -> Either T.Text LayoutBox
calcWidth contBlock rt = do
    style <- getStyledElem rt
    vals  <- lookupSideVals rt
    let w         = fromMaybe auto $ value style "width"
        total     = sum $ map toPx (w:vals)
        underflow = contBlock^.content.width - total
        (margins,vals') = splitAt 2 vals
        (w',ml',mr')    = checkUnderflow w underflow $ checkAutoMargins margins w total
        [w'',ml,mr,blw,brw,plf,prt] = map toPx (w':ml':mr':vals')
    -- did you know you can use semicolons in haskell?
    return $ rt &~ zoom (root . _1) (do
        content.width .= w''
        padding.left  .= plf; padding.right .= prt
        border.left   .= blw; border.right  .= brw
        margin.left   .= ml ; margin.right  .= mr)
  where
    checkAutoMargins [x,y] w total
        | w /= auto && total > contBlock^.content.width = (check x, check y)
        | otherwise = (x,y)
      where check a = if a == auto then zero else a

    checkUnderflow w uflow (mlf,mrt) = case (w == auto, mlf == auto, mrt == auto) of
        (False,False,False) -> (w, mlf, Length (toPx mrt + uflow) Px)
        (False,False,True)  -> (w, mlf, Length uflow Px)
        (False,True,False)  -> (w, Length uflow Px, mrt)
        (False,True,True)   -> (w, Length (uflow/2) Px, Length (uflow/2) Px)
        (True,_,_)          ->
            let l = if mlf == auto then zero else mlf
                r = if mrt == auto then zero else mrt
            in if uflow >= 0
                  then (Length uflow Px, l, r)
                  else (zero, l, Length (toPx r + uflow) Px)
-- I pulled out the lookup functions since they took up a lot of space,
-- and gave them nice descriptive names
lookupSideVals :: LayoutBox -> Either T.Text [Value]
lookupSideVals rt = do
    style <- getStyledElem rt
    return $ map (\a -> lookup style a zero)
        [ ["margin-left" , "margin"]
        , ["margin-right" , "margin"]
        , ["border-left-width" , "border-width"]
        , ["border-right-width", "border-width"]
        , ["padding-left" , "padding"]
        , ["padding-right" , "padding"] ]

lookupVertVals :: LayoutBox -> Either T.Text [Float]
lookupVertVals rt = do
    style <- getStyledElem rt
    return $ map (toPx . (\a -> lookup style a zero))
        [ ["margin-top" , "margin"]
        , ["margin-bottom" , "margin"]
        , ["border-top-width" , "border-width"]
        , ["border-bottom-width", "border-width"]
        , ["padding-top" , "padding"]
        , ["padding-bottom" , "padding"] ]
calcPosition :: Dimensions -> LayoutBox -> Either T.Text LayoutBox
calcPosition contBlock rt = do
    [mt,mb,bt,bb,pt,pb] <- lookupVertVals rt
    let d = rt^.root._1
    return $ rt &~ zoom (root . _1) (do
        content.x .= contBlock^.content.x
                   + d^.margin.left
                   + d^.border.left
                   + d^.padding.left
        content.y .= contBlock^.content.y
                   + contBlock^.content.height
                   + pt + bt + mt
        padding.top .= pt; padding.bottom .= pb
        border.top  .= bt; border.bottom  .= bb
        margin.top  .= mt; margin.bottom  .= mb)
layoutChildren :: LayoutBox -> Either T.Text LayoutBox
layoutChildren rt = do
    (dim,cs) <- foldM foo (rt^.root._1, []) $ rt^.children
    return $ rt &~ root._1 .= dim &~ children .= cs
  where
    foo :: (Dimensions,[LayoutBox]) -> LayoutBox -> Either T.Text (Dimensions,[LayoutBox])
    foo (d,acc) c = do
        c' <- layout c d
        return (d & content.height +~ marginBoxHeight (c'^.root._1), acc ++ [c'])
calcHeight :: LayoutBox -> Either T.Text LayoutBox
calcHeight rt = do
    s <- getStyledElem rt
    case value s "height" of
        Just (Length h Px) -> return $ rt & root._1.content.height .~ h
        Nothing            -> return rt

marginBoxHeight :: Dimensions -> Float
marginBoxHeight dim = (marginBox dim)^.height

getStyledElem :: LayoutBox -> Either T.Text StyledNode
getStyledElem rt = case rt^.root._2 of
    BlockNode s    -> Right $ NTree s []
    InlineNode s   -> Right $ NTree s []
    AnonymousBlock -> Left "Error: attempted to access the nonexistent\
                           \ StyleNode of an AnonymousBlock"

expandedBy :: Rect -> EdgeSize -> Rect
expandedBy rec edge = rec &~ do
    x      -= edge^.left
    y      -= edge^.top
    width  += (edge^.left + edge^.right)
    height += (edge^.top + edge^.bottom)

paddingBox :: Dimensions -> Rect
paddingBox d = (d^.content) `expandedBy` (d^.padding)

marginBox :: Dimensions -> Rect
marginBox d = borderBox d `expandedBy` (d^.margin)

borderBox :: Dimensions -> Rect
borderBox d = paddingBox d `expandedBy` (d^.border)
Now that we’re using lenses throughout our layout tree, we’ll need to update Painting.hs and tests.hs as well.
-- Painting.hs
import Control.Lens

paint :: LayoutBox -> Rect -> Canvas
paint root bounds = let dlist  = buildDisplayList root
                        canvas = newCanvas w h
                        w = fromInteger . floor $ bounds^.width
                        h = fromInteger . floor $ bounds^.height
                    in F.foldl' paintItem canvas dlist

buildDisplayList :: LayoutBox -> DisplayList
buildDisplayList = F.foldMap renderLayoutBox

renderBorders :: (Dimensions,BoxType) -> DisplayList
renderBorders (dim,ty) = maybe mempty renderBorders' (getColor ty "border-color")
  where
    renderBorders' color = V.fromList $ map (SolidColor color) [l, r, t, b]
    bbox = borderBox dim
    bdr  = dim^.border
    l = bbox & width .~ bdr^.left
    r = bbox & x +~ bbox^.width - bdr^.right
             & width .~ bdr^.right
    t = bbox & height .~ bdr^.top
    b = bbox & y +~ bbox^.height - bdr^.bottom
             & height .~ bdr^.bottom

paintItem :: Canvas -> DisplayCommand -> Canvas
paintItem cs (SolidColor color rect) = updateChunk cs (x0,x1) (y0,y1) color
  where
    x0 = clampInt 0 (w-1) (rect^.x)
    y0 = clampInt 0 (h-1) (rect^.y)
    x1 = clampInt 0 (w-1) (rect^.x + rect^.width  - 1)
    y1 = clampInt 0 (h-1) (rect^.y + rect^.height - 1)
    w  = asFloat $ wdth cs
    h  = asFloat $ hght cs
    asFloat = fromInteger . toInteger
-- tests.hs
import Control.Lens

contBlock = defaultDim & content.width .~ 800 & content.height .~ 168

paintpng = paintpng' s d
  where
    (Right d) = PS.parseHtml pnghtml
    (Right s) = parseCSS pngcss
    paintpng' s d = do
        let st = styleTree d s
        lyt <- layoutTree st contBlock
        let vec = pixels $ paint lyt (contBlock^.content)
        return $ generateImage (\x y -> c2px $ vec V.! (x + (y * 800))) 800 168
    c2px (Color r g b _) = PixelRGB8 r g b
Finally, update hubert.cabal to add our lens dependency:
executable hubert
  main-is:             Main.hs
  other-modules:       Dom,
                       HTML.Parser,
                       HTML.Parsec,
                       CSS,
                       Style,
                       Layout
  -- other-extensions:
  build-depends:       base >=4.7 && <5,
                       unordered-containers >=0.2 && <0.3,
                       mtl >= 2.2.1,
                       text >= 1.1.0.0,
                       parsec == 3.1.*,
                       lens >= 4.6

Test-Suite hunit-tests
  type:                exitcode-stdio-1.0
  main-is:             tests.hs
  other-modules:       Dom,
                       HTML.Parser,
                       HTML.Parsec,
                       CSS,
                       Style,
                       Layout,
                       Painting
  build-depends:       base >= 4.7 && < 5,
                       unordered-containers >=0.2 && <0.3,
                       mtl >= 2.2.1,
                       text >= 1.1.0.0,
                       HUnit >= 1.2.5.0,
                       parsec == 3.1.*,
                       vector >= 0.10.9.1,
                       JuicyPixels >= 3.1.7.1,
                       lens >= 4.6
  hs-source-dirs:      tests,
                       src
  default-language:    Haskell2010
At this point, everything should compile and run correctly. We’ve used lenses to make the deeply nested updates in Layout.hs somewhat less awful to read, and we should be set until Robinson updates again.
As usual, you can find the source for this post here and the source for Robinson here.
Welcome back. Today we’re going to implement painting of the layout tree to an image (but not to a real window, yet). The actual painting code is fairly simple, but before that we’re going to want to backtrack and modify some older code.
First we’ll add a Foldable instance to our NTree in Dom.hs.
import Data.Monoid ((<>)) -- (<>) is an infix version of mappend
import Data.Foldable

instance Foldable NTree where
    foldMap f (NTree n []) = f n
    foldMap f (NTree n ns) = f n <> foldMap (foldMap f) ns
This will let us easily convert the layout tree into an intermediate representation.
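To see the instance in action, here’s a self-contained sketch with a standalone copy of NTree; once foldMap is defined, functions like toList and sum come for free:

```haskell
import Data.Foldable (toList)

data NTree a = NTree a [NTree a]

-- same instance as above: fold the node, then fold each subtree
instance Foldable NTree where
    foldMap f (NTree n ns) = f n <> foldMap (foldMap f) ns

main :: IO ()
main = do
    let t = NTree 1 [NTree 2 [], NTree 3 [NTree (4 :: Int) []]]
    print (toList t) -- preorder flattening: [1,2,3,4]
    print (sum t)    -- 10
```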
Next we’ll refactor a couple of data structures to match how Robinson is laying things out. We’ll pull the first four fields of the Dimensions struct out into their own Rect struct and write a couple of helpers for it. We’ll also pull the color data out of the Value type, so that later we can add other representations if we want to. (If this next bit seems hard to follow, the changelog has a full diff.)
---------------------- Layout.hs --------------------
data Rect = Rect { x      :: Float
                 , y      :: Float
                 , width  :: Float
                 , height :: Float }

data Dimensions = Dimensions { content :: Rect
                             , padding :: EdgeSize
                             , border  :: EdgeSize
                             , margin  :: EdgeSize }

emptyRect = Rect 0 0 0 0
defaultDim = Dimensions emptyRect emptyEdge emptyEdge emptyEdge

-- Rect and Dimensions helpers
expandedBy :: EdgeSize -> Rect -> Rect
expandedBy edge rec = Rect{ x      = x rec - left edge
                          , y      = y rec - top edge
                          , width  = width rec + left edge + right edge
                          , height = height rec + top edge + bottom edge }

paddingBox :: Dimensions -> Rect
paddingBox d = expandedBy (padding d) $ content d

marginBox :: Dimensions -> Rect
marginBox d = expandedBy (margin d) $ borderBox d

borderBox :: Dimensions -> Rect
borderBox d = expandedBy (border d) $ paddingBox d

layoutTree :: StyledNode -> Dimensions -> Either T.Text LayoutBox
layoutTree root contBlock = buildLayoutTree root >>=
    flip layout contBlock{content = (content contBlock){height=0}}

-- updateDim in calcWidth
updateDim d = let pad = padding d
                  mar = margin d
                  bor = border d
                  rec = content d
              in d{ content = rec{ width = w'' }
                  , padding = pad{ left = plf, right = prt }
                  , border  = bor{ left = blw, right = brw }
                  , margin  = mar{ left = ml,  right = mr } }

-- updateDim in calcPosition
updateDim d [mt,mb,bt,bb,pt,pb] =
    let pad  = padding d
        mar  = margin d
        bor  = border d
        brec = content contBlock
        drec = content d
        x' = x brec
           + left (margin d)
           + left (border d)
           + left (padding d)
        y' = y brec + height brec + pt + bt + mt
    in d{ content = drec{ x = x', y = y' }
        , padding = pad{ top = pt, bottom = pb }
        , border  = bor{ top = bt, bottom = bb }
        , margin  = mar{ top = mt, bottom = mb } }

layoutChildren (NTree (dim,x) cs) = do
    (dim',cs') <- foldM foo (dim,[]) cs
    return $ NTree (dim',x) cs'
  where
    foo (d,acc) c@(NTree (cdim,_) _) = do
        c'@(NTree (cdim',_) _) <- layout c d
        let rec = content d
        return (d{ content =
            rec{height = height rec + marginBoxHeight cdim'}}, acc ++ [c'])

calcHeight :: LayoutBox -> Either T.Text LayoutBox
calcHeight root@(NTree (d,x) y) = do
    s <- getStyledElem root
    let d' = case value s "height" of
                 Just (Length h Px) -> d{content = (content d){height=h}}
                 Nothing            -> d
    return $ NTree (d',x) y

----------------------------- Css.hs ------------------------------
data Value = Keyword T.Text
           | ColorValue Color
           | Length Float Unit
  deriving (Show, Eq)

data Color = Color Word8 Word8 Word8 Word8
  deriving (Show, Eq)

color = do
    char '#'
    cs <- count 3 (count 2 hexDigit)
    let [r,g,b] = map (fst . head . readHex) cs
    return $ ColorValue (Color r g b 255)
There are a couple of other places where you’ll need to replace foo rect with foo (content rect), but the compiler will point them out.
Now we’re ready to implement painting.
Painting will be a two-phase process: first we construct a DisplayList from our layout tree, and then we write all the entries in the DisplayList to a canvas.
{-# LANGUAGE OverloadedStrings #-}
module Painting
    ( Canvas (..)
    , newCanvas
    , paint
    ) where

import Data.Monoid ((<>),mempty)
import Data.Word
import qualified Data.Foldable as F
import qualified Data.Vector as V
import qualified Data.Text as T

import Dom
import Layout
import Style
import CSS (Value(ColorValue), Color(..))

type DisplayList = V.Vector DisplayCommand

data DisplayCommand = SolidColor Color Rect

data Canvas = Canvas { pixels :: V.Vector Color
                     , wdth   :: Word  -- we need to abbreviate these because they conflict
                     , hght   :: Word } -- with Rect.width and Rect.height; GHC 7.10 fixes this

newCanvas :: Word -> Word -> Canvas
newCanvas w h = let white = Color 255 255 255 255 in
    Canvas (V.replicate (fromIntegral (w * h)) white) w h
The DisplayList is an intermediate representation we can use to modify the data we draw, for instance by culling boxes that are entirely outside the screen. I suspect that a Data.Vector will be a good choice for this structure, but I’d like to minimize the amount of code we need to change if we later want to use something like Data.Sequence instead, so we’ll try to use functions from Data.Foldable where possible. Right now we’re only drawing boxes, so our DisplayCommand is just a solid color and a rectangle.
Building a DisplayList is a simple fold:
buildDisplayList :: LayoutBox -> DisplayList
buildDisplayList lbox = F.foldMap renderLayoutBox lbox
Rendering a layout box consists of drawing the background rectangle, and then a rect for each border. If a color isn’t specified for the background or borders, we don’t add a DisplayCommand for that part.
renderLayoutBox :: (Dimensions,BoxType) -> DisplayList
renderLayoutBox box = renderBackgroud box <> renderBorders box

renderBackgroud :: (Dimensions,BoxType) -> DisplayList
renderBackgroud (dim,ty) = maybe mempty
    (return . flip SolidColor (borderBox dim)) (getColor ty "background")

renderBorders :: (Dimensions,BoxType) -> DisplayList
renderBorders (dim,ty) = maybe mempty renderBorders' (getColor ty "border-color")
  where
    renderBorders' color = V.fromList $ map (SolidColor color) [l, r, t, b]
    bbox = borderBox dim
    bdr  = border dim
    l = bbox{ width = left bdr }
    r = bbox{ x = x bbox + width bbox - right bdr
            , width = right bdr }
    t = bbox{ height = top bdr }
    b = bbox{ y = y bbox + height bbox - bottom bdr
            , height = bottom bdr }

getColor :: BoxType -> T.Text -> Maybe Color
getColor (BlockNode style)  name = getColor' style name
getColor (InlineNode style) name = getColor' style name
getColor AnonymousBlock     _    = Nothing

getColor' style name = case value (NTree style []) name of
    Just (ColorValue (Color r g b a)) -> Just (Color r g b a)
    _                                 -> Nothing
We’re not actually implementing any functions to modify the DisplayList right now, so all that’s left is painting. We draw each command on top of the previous one.
paint :: LayoutBox -> Rect -> Canvas
paint root bounds = let dlist  = buildDisplayList root
                        canvas = newCanvas w h
                        w = fromInteger . floor $ width bounds
                        h = fromInteger . floor $ height bounds
                    in F.foldl' paintItem canvas dlist

paintItem :: Canvas -> DisplayCommand -> Canvas
paintItem cs (SolidColor color rect) = updateChunk cs (x0,x1) (y0,y1) color
  where
    x0 = clampInt 0 (w-1) (x rect)
    y0 = clampInt 0 (h-1) (y rect)
    x1 = clampInt 0 (w-1) (x rect + width rect - 1)
    y1 = clampInt 0 (h-1) (y rect + height rect - 1)
    w  = asFloat $ wdth cs
    h  = asFloat $ hght cs
    asFloat = fromInteger . toInteger

-- this probably modifies the pixel vector in-place, if I'm reading the
-- Data.Vector source correctly
updateChunk :: Canvas -> (Integer,Integer) -> (Integer,Integer) -> Color -> Canvas
updateChunk cs (x0,x1) (y0,y1) c = let pxs = V.update (pixels cs) chunk in
    cs{ pixels = pxs }
  where
    chunk   = V.map (\a -> (fromIntegral a, c)) indices
    indices = V.fromList [ y * toInteger (wdth cs) + x | x <- [x0..x1], y <- [y0..y1] ]

clampInt :: Float -> Float -> Float -> Integer
clampInt f c = floor . min c . max f
Note that this code differs slightly from the equivalent Rust code in Robinson: clamping to [0,height] and [0,width] is a runtime error here (you’d try to write one row or column of pixels too far; I’m not sure why Rust allows it), and simply adding the dimensions to the coordinates would also produce boxes that are one pixel too large in both dimensions.
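To make the clamping concrete, here’s a tiny standalone example of clampInt with a hypothetical 800-pixel-wide canvas; coordinates below zero pin to the first column, and coordinates past the edge pin to the last:

```haskell
-- same helper as in Painting.hs: clamp to [f, c], then floor to an index
clampInt :: Float -> Float -> Float -> Integer
clampInt f c = floor . min c . max f

main :: IO ()
main = print (map (clampInt 0 (800 - 1)) [-5, 10.6, 1000])
-- [0,10,799]
```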
That’s all our painting code; we just need to add a quick test to make sure it works. We’ll need a library to load images for this, and sadly there isn’t one included in the Haskell Platform, so we’ll need to install it. Add the following lines to build-depends in your .cabal file:
vector >= 0.10.9.1,
JuicyPixels >= 3.1.7.1
And then in a command prompt run:
cabal update
cabal install --enable-tests --only-dependencies
This will install the JuicyPixels library, which has good support for dealing with images in the most common formats.
We’ll add one test (you’ll need to grab rainbow.png from the github repo):
-- add these at the top of tests.hs
import Data.Maybe
import qualified Data.Vector as V
import Codec.Picture
import Codec.Picture.Types
import Painting
----------------------------- PAINT TESTS ---------------------------
testPaint = TestCase $ do
    testpng <- readPng "tests/rainbow.png"
    either (\_ -> assertFailure "missing png image")
           (compareImage paintpng)
           testpng

compareImage (Left e) _ = assertFailure $ T.unpack e
compareImage (Right i1) (ImageRGB8 i2) = do
    assertEqual "height" (imageHeight i1) (imageHeight i2)
    assertEqual "width"  (imageWidth i1)  (imageWidth i2)
    assertEqual "pixels" (imageData i1)   (imageData i2)

contBlock = defaultDim{content = (content defaultDim){ width=800, height=168 } }

paintpng = paintpng' s d
  where
    (Right d) = PS.parseHtml pnghtml
    (Right s) = parseCSS pngcss
    paintpng' s d = do
        let st = styleTree d s
        lyt <- layoutTree st contBlock
        let vec = pixels $ paint lyt (content contBlock)
        return $ generateImage (\x y -> c2px $ vec V.! (x + (y * 800))) 800 168
    c2px (Color r g b _) = PixelRGB8 r g b
pnghtml = "<div class=\"a\">\
\ <div class=\"b\">\
\ <div class=\"c\">\
\ <div class=\"d\">\
\ <div class=\"e\">\
\ <div class=\"f\">\
\ <div class=\"g\">\
\ </div>\
\ </div>\
\ </div>\
\ </div>\
\ </div>\
\ </div>\
\</div>"
pngcss = "* { display: block; padding: 12px; }\
\.a { background: #ff0000; }\
\.b { background: #ffa500; }\
\.c { background: #ffff00; }\
\.d { background: #008000; }\
\.e { background: #0000ff; }\
\.f { background: #4b0082; }\
\.g { background: #800080; }"
Don’t forget to add testPaint to the test list.
That’s everything for painting. I might put up another post covering some refactorings and additions to the current code before the next Robinson update comes out, or I might work on something completely different for a bit. We’ll see.
As usual, you can find the source for this post here and the source for Robinson here.
Welcome back. Today we’re going to implement boxes and also block layout, since Matt’s post on boxes was mostly about how they’re set up, and I don’t have to repeat that here. To start with, we’ll set up our imports and define some new types.
{-#LANGUAGE BangPatterns, OverloadedStrings#-}
module Layout where
import Prelude hiding (lookup)
import Control.Applicative ((<$>))
import Control.Monad (foldM)
import Data.List (foldl', groupBy)
import Data.Maybe (fromMaybe)
import qualified Data.Text as T
import Dom
import CSS
import Style
data Dimensions = Dimensions { x       :: Float
                             , y       :: Float
                             , width   :: Float
                             , height  :: Float
                             , padding :: EdgeSize
                             , border  :: EdgeSize
                             , margin  :: EdgeSize }

data EdgeSize = EdgeSize { left   :: Float
                         , right  :: Float
                         , top    :: Float
                         , bottom :: Float }

type LayoutBox = NTree (Dimensions,BoxType)

type StyledElement = (NodeType,PropertyMap)

data BoxType = BlockNode StyledElement | InlineNode StyledElement | AnonymousBlock

emptyEdge = EdgeSize 0 0 0 0
defaultDim = Dimensions 0 0 0 0 emptyEdge emptyEdge emptyEdge
We’ll also want to add some code to the style module:
data Display = Inline | Block | DisplayNone
  deriving (Eq)

-- if name exists, return its specified value
value :: StyledNode -> T.Text -> Maybe Value
value (NTree node _) name = HM.lookup name (snd node)

-- look up the display value of a node
display :: StyledNode -> Display
display n = case value n "display" of
    Just (Keyword "block") -> Block
    Just (Keyword "none")  -> DisplayNone
    _                      -> Inline

-- this lookup is different from the robinson one: it supports
-- an arbitrary number of possible keywords.
-- return the specified value of the first property in ks to exist,
-- or def if no properties match
lookup :: StyledNode -> [T.Text] -> Value -> Value
lookup s ks def = maybe def (fromJust . value s) (find (isJust . value s) ks)
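Here’s a pure sketch of that fallback behavior, with Data.Map standing in for the style tree’s property map and lookupDef as a hypothetical stand-in for the real lookup (the property names are just examples):

```haskell
import qualified Data.Map as M
import Data.Maybe (mapMaybe)

-- the first key in ks that exists in the style map wins; otherwise def
lookupDef :: M.Map String v -> [String] -> v -> v
lookupDef m ks def = case mapMaybe (`M.lookup` m) ks of
    (v:_) -> v
    []    -> def

main :: IO ()
main = do
    let style = M.fromList [("margin", "8px")]
    -- "margin-left" is absent, so the shorthand "margin" is used
    print (lookupDef style ["margin-left", "margin"] "0px")
    -- neither padding key is present, so the default applies
    print (lookupDef style ["padding-left", "padding"] "0px")
```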
And we add a single helper function to CSS:
toPx :: Value -> Float
toPx (Length len Px) = len
toPx _ = 0
Now we can start to work on the layout tree. Our first function traverses the style tree and builds a box for each node. Nodes with display="none" (and their children) are filtered from the tree, and adjacent inline boxes that are children of a block node are grouped together beneath anonymous boxes. We also give every node a default initial set of dimensions.
buildLayoutTree :: StyledNode -> Either T.Text LayoutBox
buildLayoutTree root = case display root of
    Block       -> Right $ addDim <$> blt root
    Inline      -> Right $ addDim <$> blt root
    DisplayNone -> Left "error: root node has display:none"
  where
    addDim x = (defaultDim,x)

    blt rt@(NTree nd cs) = NTree n ns
      where
        (!n, !ns) = case display rt of
            Block  -> (BlockNode nd,  anonify ns')
            Inline -> (InlineNode nd, ns')
            -- won't ever hit DisplayNone, it's filtered out

        anonify = concatMap mergeInlines . groupBy (\x y -> isInline x && isInline y)

        mergeInlines x = if isInline $ head x then [NTree AnonymousBlock x] else x

        isInline (NTree InlineNode{} _) = True
        isInline _                      = False

        ns' = map blt $ filter ((/=DisplayNone) . display) cs
I’m a little worried about the space performance of this function. For now I’ve used the BangPatterns pragma, which allows me to mark n and ns as “strict” (really just evaluated to weak head normal form) by prefixing them with an exclamation point. This should hopefully prevent the program from building up a huge set of thunks while traversing the tree (reasoning about strictness, laziness, and space usage is an area I need to improve on).
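As a standalone illustration of the idea (not code from the browser engine), here is the classic strict-accumulator pattern: the bang forces the accumulator to WHNF at each step, so no chain of (+) thunks builds up.

```haskell
{-# LANGUAGE BangPatterns #-}

-- like foldl', but written out with an explicit bang pattern
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs

main :: IO ()
main = print (sumStrict [1..1000000]) -- 500000500000
```

Without the bangs, go builds a million-deep tower of suspended additions before anything is evaluated; with them, it runs in constant space.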
We now add some code to set the dimensions on the layout tree. At the moment we’ll only support block nodes, so anything without display="block" set will give us an error (I won’t be adding a test for this part yet).
-- walk a layout tree, setting the dimensions of each node
layout :: LayoutBox -> Dimensions -> Either T.Text LayoutBox
layout l@(NTree (_,box) _) contBlock = case box of
    BlockNode _    -> layoutBlock contBlock l
    InlineNode _   -> undefined
    AnonymousBlock -> undefined

layoutBlock dim root = calcWidth dim root >>=
                       calcPosition dim   >>=
                       layoutChildren     >>= -- you know what? this might leak
                       calcHeight
layoutBlock starts at the root, computes the width and position of the node, then does the same for all its children. Once the whole tree has been laid out, we walk back up from the leaves, calculating the height of each node. It’s possible that recursing in the middle of the function could cause problems for large pages, although that is also how the Rust implementation is written. This will probably get refactored later.
Calculating the width is fairly ugly, nothing is complicated, but there are a bunch of boundary checks to make.
calcWidth :: Dimensions -> LayoutBox -> Either T.Text LayoutBox
calcWidth contBlock root@(NTree (dim,x) y) = do
style <- getStyledElem root
let
auto = Keyword "auto"
zero = Length 0 Px
w = fromMaybe auto $ value style "width"
vals = map (\a -> lookup style a zero) [
["margin-left" , "margin"]
, ["margin-right" , "margin"]
, ["border-left-width" , "border-width"]
, ["border-right-width", "border-width"]
, ["padding-left" , "padding"]
, ["padding-right" , "padding"] ]
total = sum $ map toPx (w:vals)
underflow = width contBlock - total
([ml'',mr''],vals') = splitAt 2 vals
(w',ml',mr') = checkUnderflow w $ checkAutoMargins (ml'',mr'')
checkAutoMargins (x,y)
| w /= auto && total > width contBlock = (check x,check y)
| otherwise = (x,y)
where check a = if a == auto then zero else a
checkUnderflow w (mlf,mrt) = case (w == auto, mlf == auto, mrt == auto) of
(False,False,False) -> (w , mlf, Length (toPx mrt + underflow) Px)
(False,False,True) -> (w , mlf, Length underflow Px)
(False,True,False) -> (w , Length underflow Px , mrt)
(False,True,True) -> (w , Length (underflow/2) Px, Length (underflow/2) Px)
(True,_,_) ->
let l = if mlf == auto then zero else mlf
r = if mrt == auto then zero else mrt
in if underflow >= 0 then (Length underflow Px,l,r)
else (zero,l,Length (toPx r + underflow) Px)
[w'',ml,mr,blw,brw,plf,prt] = map toPx (w':ml':mr':vals')
updateDim d = let pad = padding d
mar = margin d
bor = border d
in d{ width = w''
, padding = pad{ left = plf, right = prt}
, border = bor{ left = blw, right = brw}
, margin = mar{ left = ml, right = mr} }
return $ NTree (updateDim dim,x) y
getStyledElem :: LayoutBox -> Either T.Text StyledNode
getStyledElem (NTree (_,box) _) = case box of
BlockNode s -> Right $ NTree s []
InlineNode s -> Right $ NTree s []
AnonymousBlock -> Left "Error: attempted to access the nonexistent\
\ StyleNode of an AnonymousBlock"
This will probably be refactored to look a bit nicer, something like
calcWidth :: Dimensions -> LayoutBox -> Either T.Text LayoutBox
calcWidth contBlock root@(NTree (dim,x) y) = checkBounds <$> getStyledElem root
where
checkBounds = updateDim . checkUnderflow . checkAutoMargins . computeVals
The resulting function there won’t be any nicer, but at least you get a quick idea of what it’s doing.
The remaining functions aren’t as bad.
calcPosition :: Dimensions -> LayoutBox -> Either T.Text LayoutBox
calcPosition contBlock root@(NTree (dim,a)b) = do
style <- getStyledElem root
let
zero = Length 0 Px
vals = map (toPx . (\a -> lookup style a zero)) [
["margin-top" , "margin"]
, ["margin-bottom" , "margin"]
, ["border-top-width" , "border-width"]
, ["border-bottom-width", "border-width"]
, ["padding-top" , "padding"]
, ["padding-bottom" , "padding"] ]
updateDim d [mt,mb,bt,bb,pt,pb] =
let pad = padding d
mar = margin d
bor = border d
x' = x contBlock
+ left (margin d)
+ left (border d)
+ left (padding d)
y' = y contBlock + height contBlock + pt + bt + mt
in d{ x = x'
, y = y'
, padding = pad{ top = pt, bottom = pb }
, border = bor{ top = bt, bottom = bb }
, margin = mar{ top = mt, bottom = mb } }
return $ NTree (updateDim dim vals,a) b
-- recursively lay out the children of a node
layoutChildren (NTree (dim,x) cs) = do
(dim',cs') <- foldM foo (dim,[]) cs
return $ NTree (dim',x) cs'
where
foo (d,acc) c@(NTree (cdim,_) _) = do
c' <- layout c d
return (d{height = height d + marginBoxHeight cdim}, acc ++ [c'])
-- compute the height of a box
calcHeight :: LayoutBox -> Either T.Text LayoutBox
calcHeight root@(NTree (d,x)y) = do
s <- getStyledElem root
let d' = case value s "height" of
Just (Length h Px) -> d{height=h}
_ -> d -- "auto" or any other value keeps the height computed by layoutChildren
return $ NTree (d',x) y
marginBoxHeight :: Dimensions -> Float
marginBoxHeight (Dimensions _ _ _ h p b m) = sum [ h, top p, bottom p
, top b, bottom b
, top m, bottom m ]
Worth noting here is that the updateDim
function in both calcWidth
and calcPosition
could be rewritten using lenses. The Lens library pulls in a ton of dependencies though, so for the moment I won’t be using it (I will probably eventually though, so if you’re following along feel free to start using them whenever); but deeply nested data structures like we’re using will usually be easier to work with if you’re willing to learn how to use lenses.
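For a taste of what that would look like without pulling in the lens package, here’s a sketch using hand-rolled van Laarhoven lenses over simplified stand-ins for our Dimensions types (the field and lens names here are made up for illustration):

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity(..))

-- the van Laarhoven encoding: a lens is a functor-polymorphic accessor
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- set a value through a lens
set :: Lens s a -> a -> s -> s
set l v = runIdentity . l (const (Identity v))

-- simplified stand-ins for hubert's Dimensions/EdgeSizes records
data EdgeSizes = EdgeSizes { left, right, top, bottom :: Float } deriving Show
data Dims = Dims { width :: Float, padding :: EdgeSizes } deriving Show

-- hand-written lenses; the lens package would generate these for us
paddingL :: Lens Dims EdgeSizes
paddingL f d = fmap (\p -> d { padding = p }) (f (padding d))

leftL :: Lens EdgeSizes Float
leftL f e = fmap (\v -> e { left = v }) (f (left e))
```

Now `set (paddingL . leftL) 5 d` replaces the nested `d{ padding = (padding d){ left = 5 } }` updates, and the fact that lenses compose with plain (.) is what makes them attractive for deeply nested records.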
It’s also worth pointing out that this isn’t really the ideal Haskell way to construct the tree, which would generally be to build our tree of boxes and a separate tree of dimensions, then zip them together. This would indeed be simpler to write, but unfortunately the spec allows inline boxes to modify their children to avoid overflowing their own bounding box. Since we’re mostly sticking to Robinson’s implementation I want to wait until I see if this behavior is ignored there.
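For reference, the zip-the-trees idea would look something like this sketch (NTree is redefined locally so the snippet stands alone); note that zipWith silently truncates when the shapes differ, which hints at why this breaks down once boxes can alter their children:

```haskell
data NTree a = NTree a [NTree a] deriving (Show, Eq)

-- pair each node of one tree with the corresponding node of another;
-- assumes both trees have the same shape (extra children are dropped)
zipTree :: NTree a -> NTree b -> NTree (a, b)
zipTree (NTree x xs) (NTree y ys) = NTree (x, y) (zipWith zipTree xs ys)
```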
With that said, we’ve now caught up with Matt’s blog posts and I’m not particularly interested in slowing down, so for the next post I will be going back and making previous modules more comprehensive.
As usual, you can find the source for this post here and the source for Robinson here.
Hey guys, since I’m trying to stick to the same content per post as Matt’s blog, this post will be pretty short.
Today we’re going to implement styling of the DOM, wherein we combine a Stylesheet and a DOM tree to create a new DOM, with Rule
s attached to its nodes.
Our imports list is nice and small:
module Style where
import Data.Maybe (mapMaybe)
import Data.List (sortBy,find)
import Data.Function (on)
import qualified Data.HashMap.Strict as HM
import qualified Data.HashSet as HS
import qualified Data.Text as T
import Dom
import CSS
We’ll need some types. First, a map from property names to values:
type PropertyMap = HM.HashMap T.Text Value
Second, the styled tree itself, consisting of a NodeType
and a PropertyMap
for each node.
-- instead of building a tree with references to the DOM, we'll
-- just augment the DOM tree with PropertyMaps
type StyledNode = NTree (NodeType,PropertyMap)
Actually styling the tree is just a map over the elements of our DOM (Later, when we add more CSS features like inheritance, it’ll become a fold instead). We’re using our own custom tree type, so we’ll need to add an instance of Functor
back in Dom.hs
. We’ll also add some accessors while we’re in there.
instance Functor NTree where
fmap f (NTree n ns) = NTree (f n) $ fmap (fmap f) ns
findAttr :: ElementData -> T.Text -> Maybe T.Text
findAttr (ElementData _ m) k = HM.lookup k m
findID :: ElementData -> Maybe T.Text
findID = flip findAttr "id"
classes :: ElementData -> HashSet T.Text
classes = maybe empty (fromList . T.split (==' ')) . flip findAttr "class"
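One caveat with that classes definition: unlike words, T.split (==' ') keeps empty segments, so a class attribute containing doubled spaces would yield empty class names in the set. A quick demonstration:

```haskell
import qualified Data.Text as T

-- the same splitting strategy classes uses on the "class" attribute
splitClasses :: T.Text -> [T.Text]
splitClasses = T.split (== ' ')
```

splitClasses "foo bar" gives ["foo","bar"], but splitClasses "foo  bar" gives ["foo","","bar"]; real class attributes are rarely written that way, but it’s worth knowing about.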
And the styling function:
-- traverse the DOM, attaching PropertyMaps to each Node to
-- create a styled tree
styleTree :: Node -> Stylesheet -> StyledNode
styleTree root stylesheet = fmap style root
where
style e@(Element e') = (e, specifiedValues e' stylesheet)
style t@(Text _) = (t, HM.empty)
specifiedValues
finds all the properties attached to an Element
and collects them into a PropertyMap.
-- Build a map of all the properties attached to an Element
specifiedValues :: ElementData -> Stylesheet -> PropertyMap
specifiedValues e s = HM.fromList $ concatMap expand rules
where
rules = sortBy (compare `on` fst) $ matchingRules e s
expand (_,Rule _ ds) = map (\(Declaration n v) -> (n,v)) ds
type MatchedRule = (Specificity, Rule)
-- get all of the rules from a stylesheet that match the given element
matchingRules :: ElementData -> Stylesheet -> [MatchedRule]
matchingRules e (Stylesheet rules) = mapMaybe (matchRule e) rules
-- find the first rule that matches the given element
matchRule :: ElementData -> Rule -> Maybe MatchedRule
matchRule e r@(Rule selectors _) = do
s <- find (matches e) selectors
return (spec s, r)
-- check if a selector matches an element
matches :: ElementData -> Selector -> Bool
matches e sl@(Simple _ _ _) = matchSimple e sl
That’s all pretty simple stuff, the remaining function to implement, matchSimple
is more annoying. Here’s Matt’s code for this function in Robinson:
fn matches_simple_selector(elem: &ElementData, selector: &SimpleSelector) -> bool {
// Check type selector
if selector.tag_name.iter().any(|name| elem.tag_name != *name) {
return false;
}
// Check ID selector
if selector.id.iter().any(|id| elem.id() != Some(id)) {
return false;
}
// Check class selectors
let elem_classes = elem.classes();
if selector.class.iter().any(|class| !elem_classes.contains(&class.as_slice())) {
return false;
}
// We didn't find any non-matching selector components.
return true;
}
It’s non-obvious from reading this code what exactly constitutes an acceptable match, primarily because iter().any()
silently drops out of the Option
type: it always returns false for None
. It looks like matches_simple_selector
is checking that everything matches, but what it’s actually doing is checking that a simple selector doesn’t have any fields that fail to match; values of None
are skipped instead of counted as match failures. The recommended way to deal with optionals in Rust is to explicitly pattern match against them, which would have made this behavior a little more obvious.
Our Haskell equivalent is a teeny bit ugly
-- matchSimple returns False if any selector field that exists
-- does not match the given element
matchSimple :: ElementData -> Selector -> Bool
matchSimple e@(ElementData nm _) (Simple n i c) =
let x = fmap (==nm) n
y = if i == Nothing then Nothing else Just $ i == (findID e)
z = if not $ null c then all (flip HS.member (classes e)) c else True
in case (x,y,z) of
(Nothing, Nothing, b3) -> b3
(Nothing, Just b2, b3) -> b2 && b3
(Just b1, Nothing, b3) -> b1 && b3
(Just b1, Just b2, b3) -> b1 && b2 && b3
That’s not totally illegible, but it’s not great. We could be more explicit by pattern matching on the function arguments like this:
matchSimple e (Simple Nothing Nothing c) = matchClasses e c
matchSimple e (Simple (Just n) Nothing c) = matchNames e n
&& matchClasses e c
matchSimple e (Simple Nothing (Just i) c) = matchId e i
&& matchClasses e c
matchSimple e (Simple (Just n) (Just i) c) = matchNames e n
&& matchId e i
&& matchClasses e c
matchNames (ElementData nm _) n = n == nm
matchId e i = findID e == Just i
matchClasses e [] = True
matchClasses e c = all (flip HS.member (classes e)) c
I personally don’t really feel like this version is much easier to read, but it does feel a little better.
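A third phrasing leans on maybe True, which directly encodes “a missing field matches”. Sketched here against simplified stand-ins for the Dom types (an association list instead of a HashMap, a list instead of a HashSet) so it runs on its own:

```haskell
import qualified Data.Text as T

-- simplified stand-ins for the Dom types, just for illustration
data ElementData = ElementData T.Text [(T.Text, T.Text)]
data Selector    = Simple (Maybe T.Text) (Maybe T.Text) [T.Text]

findAttr :: ElementData -> T.Text -> Maybe T.Text
findAttr (ElementData _ m) k = lookup k m

findID :: ElementData -> Maybe T.Text
findID e = findAttr e (T.pack "id")

classes :: ElementData -> [T.Text]
classes e = maybe [] (T.split (== ' ')) (findAttr e (T.pack "class"))

-- maybe True encodes "a missing selector field matches anything"
matchSimple :: ElementData -> Selector -> Bool
matchSimple e@(ElementData nm _) (Simple n i c) =
     maybe True (== nm) n
  && maybe True (\i' -> findID e == Just i') i
  && all (`elem` classes e) c
```

The logic is identical to the versions above: each Nothing field contributes True, and each present field must match.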
Finally we’ll add another really simple test.
testStyle = TestCase $ assertEqual "styletree" styletree $ styleTree dom css2
css2 = Stylesheet [ Rule [ Simple (Just "head") Nothing [] ]
[ Declaration "margin" (Keyword "auto")
, Declaration "color" (Color 0 0 0 255) ]
, Rule [ Simple (Just "p") Nothing ["inner"] ]
[ Declaration "padding" (Length 17 Px) ] ]
styletree = NTree (Element (ElementData "html" empt),empt) [head,p1,p2]
where
head = NTree (Element (ElementData "head" empt),rule1) [title]
title = NTree (Element (ElementData "title" empt),empt) [test']
test' = NTree (Text "Test",empt) []
p1 = NTree (Element (ElementData "p" (HM.singleton "class" "inner")),rule2) [hello,span]
hello = NTree (Text "Hello, ",empt) []
span = NTree (Element (ElementData "span" (HM.singleton "id" "name")),empt) [world]
world = NTree (Text "world!",empt) []
p2 = NTree (Element (ElementData "p" (HM.singleton "class" "inner")),rule2) [goodbye]
goodbye = NTree (Text "Goodbye!\n ",empt) []
empt = HM.empty
rule1 = HM.fromList [("margin",Keyword "auto"),("color",Color 0 0 0 255)]
rule2 = HM.singleton "padding" (Length 17 Px)
That covers the entire Styling module; next time we’ll start building the Layout tree as we close in on actually getting something we can render.
I often forget that I have a linter installed (the Haskell Platform comes with Hlint), so I haven’t been linting the code for hubert up until this point. It’s generally a good idea to do so though, so let’s run Hlint real quick.
hlint src --report
Hlint will write all its complaints to your terminal, but it also generates a file report.html
. For me Hlint found 9 things to complain about, let’s go through and fix them.
src\Css.hs:81:13: Warning: Redundant bracket
Found
(sortBy comp) <$> sepBy1 (selector <* spaces) comma
Why not
sortBy comp <$> sepBy1 (selector <* spaces) comma
src\Css.hs:106:8: Warning: Use void
Found
char '*' >> return ()
Why not
Control.Monad.void (char '*')
src\Css.hs:139:44: Warning: Redundant bracket
Found
digit <|> (char '.')
Why not
digit <|> char '.'
src\Css.hs:162:26: Warning: Redundant bracket
Found
(notFollowedBy end) *> p
Why not
notFollowedBy end *> p
src\Dom.hs:36:1: Error: Eta reduce
Found
elem name atts cs = NTree (Element (ElementData name atts)) cs
Why not
elem name atts = NTree (Element (ElementData name atts))
src\Style.hs:47:24: Warning: Use section
Found
(flip HS.member (classes e))
Why not
(`HS.member` (classes e))
src\Html\Parsec.hs:30:20: Warning: Use void
Found
try (string "</") >> return ()
Why not
Control.Monad.void (try (string "</"))
src\Html\Parser.hs:27:1: Error: Eta reduce
Found
runParserS p s = evalState (runExceptT p) s
Why not
runParserS p = evalState (runExceptT p)
src\Html\Parser.hs:62:14: Error: Use unless
Found
if b then return () else throwError s
Why not
Control.Monad.unless b $ throwError s
I’ll make all these changes except for one: I’m not going to change (flip HS.member (classes e))
to (`HS.member` (classes e))
because it doesn’t really seem any better to me (possibly because Sublime refuses to color it). You can go ahead and make the change if you want. I’ll try to remember to lint as I go from now on, so we can avoid these interludes in the future.
As usual, you can find the source for this post here and the source for Robinson here.
Now that we’ve got some tests set up, we’re ready to write a CSS parser. But first, I’d like to take a minute to talk about Control.Applicative
and some of the combinators it contains. Applicative combinators are really useful for building parsers, but tend to look a lot like voodoo when presented without explanation. We’ve already used Parsec
’s <|>
combinator, which tries the parser on its left, and then the one on its right if the first fails. In writing our CSS parser, we’ll also use the following operators:
<$> is an operator synonym for fmap.
*> performs the action on its left, and then the action on its right, ignoring the result of the first action. It is equivalent to >> on monads.
<* performs the action on its left, then the action on its right, and returns the result of the first action. It is equivalent to the monadic expression do{ f <- foo; bar; return f }.
<*> takes a binary function inside an Applicative on its left, and an argument to that function, also inside an Applicative, on the right.
We mostly use *> and <* for parsers where we need to parse something but don’t care about holding on to it, like trailing whitespace, or separators. <$> and <*> are useful for collecting the results of multiple parsers: for some type Foo we can write Foo <$> bar <*> baz <*> quux instead of
a <- bar
b <- baz
c <- quux
return $ Foo a b c
We could also write that as liftA3 Foo bar baz quux.
Everyone has trouble with Applicative
stuff for a while (it doesn’t help that they make it really easy to write illegible code), so don’t feel bad if you don’t really get it right now.
We’re now ready to start on the parser. First, the module definition and imports:
{-# LANGUAGE FlexibleContexts, NoMonomorphismRestriction #-}
module CSS
( Stylesheet(..)
, Rule(..)
, Selector(..)
, Declaration(..)
, Value(..)
, Unit(..)
, parseCSS
, selectors
, declarations
) where
import Prelude hiding (id)
import Data.Word (Word(..), Word8(..))
import Data.List (sortBy)
import Data.Maybe (maybe)
import Numeric (readFloat, readHex)
import Control.Applicative ((<*), (*>), (<$>), (<*>))
import Text.Parsec
import Text.Parsec.Text
import qualified Data.Text as T
We’ll create some simple types to represent our CSS:
data Stylesheet = Stylesheet [Rule]
deriving (Show, Eq)
data Rule = Rule [Selector] [Declaration]
deriving (Show, Eq)
-- only handle simple selectors for now
data Selector = Simple (Maybe T.Text) (Maybe T.Text) [T.Text]
deriving (Show, Eq)
data Declaration = Declaration T.Text Value
deriving (Show, Eq)
data Value = Keyword T.Text
| Color Word8 Word8 Word8 Word8
| Length Float Unit
deriving (Show, Eq)
data Unit = Px --only Px for now
deriving (Show, Eq)
-- an empty selector
nilS = Simple Nothing Nothing []
Note that because we don’t want to move too far away from how Robinson works, we’re actually going to be parsing only CSS2 simple selectors instead of following the more complicated CSS3 spec, for now.
The top level parser looks remarkably familiar
-- parse an entire CSS document into a Stylesheet
parseCSS :: T.Text -> Either ParseError Stylesheet
parseCSS css = case runParser rules nilS "" css of
Left err -> Left err
Right rs -> Right (Stylesheet rs)
rules = spaces >> manyTill (rule <* spaces) eof
A Rule
is just a list of Selector
s, and a list of Declaration
s
rule = Rule <$> selectors <*> declarations
Selectors are in a comma separated list
selectors = (sortBy comp) <$> sepBy1 (selector <* spaces) comma
where comma = char ',' <* spaces
comp a b = spec a `compare` spec b
We sort the parsed selectors by their specificity
type Specificity = (Word,Word,Word)
-- compute the specificity of a Selector
spec :: Selector -> Specificity
spec (Simple name id cls) = (maybeLen id, fromIntegral $ length cls, maybeLen name)
where maybeLen = fromIntegral . maybe 0 T.length
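It’s worth noting that maybe 0 T.length scores a field by the length of its text rather than a 0-or-1 flag, so this isn’t quite CSS specificity — but since tuples compare left to right, any id still outranks any pile of classes. A standalone check (Selector and spec are repeated here so the snippet runs on its own):

```haskell
import qualified Data.Text as T

data Selector = Simple (Maybe T.Text) (Maybe T.Text) [T.Text]
type Specificity = (Word, Word, Word)

spec :: Selector -> Specificity
spec (Simple name id' cls) = (maybeLen id', fromIntegral $ length cls, maybeLen name)
  where maybeLen = fromIntegral . maybe 0 T.length

-- "#answer"  -> (6,0,0): id text length, no classes, no name
-- "div.note" -> (0,1,3): no id, one class, name of length 3
answer, divNote :: Selector
answer  = Simple Nothing (Just (T.pack "answer")) []
divNote = Simple (Just (T.pack "div")) Nothing [T.pack "note"]
```

Tuple comparison means (6,0,0) > (0,1,3), so the id selector sorts after (and therefore overrides) the class selector when the declarations are merged.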
In order to actually build the Selector
, we’ll take advantage of the fact that the Parsec
monad transformer includes a StateT
for user state.
-- manyTill, but the terminal parser is optional
manyUnless p end = many ((notFollowedBy end) *> p)
-- parse a simple selector
selector = do
putState nilS
manyUnless (id <|> cls <|> univ <|> name) eof
getState
-- selector id
id = do
char '#'
i <- identifier
modifyState (\(Simple n _ cs) -> Simple n (Just i) cs)
-- selector class
cls = do
char '.'
c <- identifier
modifyState (\(Simple n i cs) -> Simple n i (cs++[c]))
-- universal selector
univ = char '*' >> return ()
-- selector name
name = do
n' <- validId
n <- identifier
let nm = n' `T.cons` n
modifyState (\(Simple _ i cs) -> Simple (Just nm) i cs)
Declarations are parsed from a semicolon separated list bracketed by curly braces.
declarations = do
char '{'
spaces *> manyTill (declaration <* spaces) (char '}')
declaration = do
n <- identifier
spaces >> char ':' >> spaces
v <- value
spaces >> char ';'
return $ Declaration n v
value = len <|> color <|> keyword
len = Length <$> float <*> unit
-- parse a floating point number
float :: Stream s m Char => ParsecT s u m Float
float = (fst . head . readFloat) <$> many (digit <|> (char '.'))
-- parse the unit type in a Value
-- currently only Px is supported
unit = do
char 'p' <|> char 'P'
char 'x' <|> char 'X'
return Px
color = do
char '#'
cs <- count 3 (count 2 hexDigit)
let [r,g,b] = map (fst . head . readHex) cs
return $ Color r g b 255
keyword = Keyword <$> identifier
identifier = T.pack <$> many validId
validId = alphaNum <|> char '-' <|> char '_'
For some reason Parsec doesn’t have built-in number parsers, so we have to fall back to the read functions from Numeric.
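The Numeric readers have a slightly awkward interface: they return a list of (value, remaining-input) pairs rather than a plain value, hence the fst . head . readFloat pattern above. For example:

```haskell
import Numeric (readFloat, readHex)

-- readFloat/readHex produce [(parsed value, unconsumed input)];
-- fst . head grabs the first successful parse
px :: String -> Float
px = fst . head . readFloat

hexByte :: String -> Int
hexByte = fst . head . readHex
```

Note that head crashes on an empty result, so this pattern only makes sense when the surrounding parser has already guaranteed there are digits to read.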
That’s the whole CSS parser, let’s add a test for it
css = "h1, h2, h3 { margin: auto; color: #cc0000; }\n\
\div.note { margin-bottom: 20px; padding: 10px; }\n\
\#answer { display: none; }"
sheet = Stylesheet [ Rule [ Simple (Just "h1") Nothing []
, Simple (Just "h2") Nothing []
, Simple (Just "h3") Nothing [] ]
[ Declaration "margin" (Keyword "auto")
, Declaration "color" (Color 204 0 0 255) ]
, Rule [ Simple (Just "div") Nothing ["note"] ]
[ Declaration "margin-bottom" (Length 20 Px)
, Declaration "padding" (Length 10 Px) ]
, Rule [ Simple Nothing (Just "answer") [] ]
[ Declaration "display" (Keyword "none") ] ]
testCss = parseTest "for valid css" sheet $ parseCSS css
And if I’ve copied everything here correctly, that test will pass.
That’s all for CSS parsing, next time we’ll work on combining the Dom and Stylesheet into a single Layout.
As usual, you can find the source for this post here and the source for Robinson here.
Welcome back. In part 2 we wrote an html parser two different ways, and decided to implement some tests before going any further. Let’s do that now.
We’ll create a file tests.hs
in a new folder /tests
, to keep our test code separate. We’ll be using the HUnit test framework, which is included in the Haskell Platform. As usual, we’ll start off with the imports:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Prelude hiding (elem)
import Data.Either (either)
import Test.HUnit
import Text.Parsec hiding (parseTest)
import Text.Parsec.Text
import qualified Data.Text as T
import qualified Data.HashMap.Strict as HM
import qualified HTML.Parser as PR
import qualified HTML.Parsec as PS
import Dom
We’ll be running the same tests for both the ParserS
and Parsec
html parsers, but after this we’re pretty much going to ignore the ParserS
one, the Parsec
parser will be much easier to extend if we want to add features.
We’ll start with some simple data to test against:
testText = text "candygram"
testElem = elem "p" (HM.singleton "ham" "doctor") [text "sup"]
-- a small test html page
-- yeah, multi-line strings in haskell kind of suck
html = "<html>\n\
\ <head>\n\
\ <title>Test</title>\n\
\ </head>\n\
\ <p class=\"inner\">\n\
\ Hello, <span id=\"name\">world!</span>\n\
\ </p>\n\
\ <p class=\"inner\">\n\
\ Goodbye!\n\
\ </p>\n\
\</html>"
-- the expected result of parsing the test page
dom = elem "html" HM.empty [head,p1,p2]
where
head = elem "head" HM.empty [title]
title = elem "title" HM.empty [text "Test"]
p1 = elem "p" (HM.singleton "class" "inner") [hello, span]
hello = text "Hello, "
span = elem "span" (HM.singleton "id" "name") [text "world"]
p2 = elem "p" (HM.singleton "class" "inner") [text "Goodbye!"]
For each test, we want to check both that the parser succeeded, and that it returned the correct value, so we’ll define a little helper function for that:
-- generic test: given an expected value and an actual value, check that the actual
-- value is not an error message, then compare it to the expected value
parseTest msg e = TestCase . either (assertFailure . show) (assertEqual msg e)
The first argument to parseTest
is a message that will be associated with that test by the runner, if we don’t get a successful parse the runner will instead show the parser’s error message.
The tests for ParserS
and Parsec
are identical aside from the function used to run the parser, and needing to pack the strings for Parsec
:
--------------------------- PARSER_S TESTS ------------------------------
parsePR p i = PR.runParserS p (PR.Parser i)
htmlPR = parseTest "for valid html" dom $ PR.parseHtml html
textPR = parseTest "for valid text" testText $ parsePR PR.parseText "candygram"
elemPR = parseTest "for valid elem" testElem $
parsePR PR.parseElement "<p ham=\"doctor\">sup</p>"
---------------------------- PARSEC TESTS ------------------------------
htmlPS = parseTest "for valid html" dom $ PS.parseHtml html
textPS = parseTest "for valid text" testText $
parse PS.parseText "" $ T.pack "candygram"
elemPS = parseTest "for valid elem" testElem $
parse PS.parseElement "" $ T.pack "<p ham=\"doctor\">sup</p>"
Finally we’ll group the tests up, and run them:
main = runTestTT tests
tests = TestList [TestLabel "ParserS html" htmlPR,
TestLabel "ParserS text" textPR,
TestLabel "ParserS elem" elemPR,
TestLabel "Parsec html" htmlPS,
TestLabel "Parsec text" textPS,
TestLabel "Parsec elem" elemPS]
You can run these tests in GHCi with the command runTestTT tests
(or just compile tests.hs outright) but if you’re being responsible and using a .cabal file + sandbox like I am, you might want to add a test suite to the .cabal file.
Test-Suite hunit-tests
type: exitcode-stdio-1.0
main-is: tests.hs
other-modules: Dom,
HTML.Parser,
HTML.Parsec
build-depends: base >= 4.7 && < 5,
unordered-containers >=0.2 && <0.3,
mtl >= 2.2.1,
text >= 1.1.0.0,
HUnit >= 1.2.5.0,
parsec == 3.1.*
hs-source-dirs: tests,
src
default-language: Haskell2010
You can now compile your tests with
cabal configure --enable-tests
cabal build
cabal test
However, HUnit does not actually conform to the format expected by exitcode-stdio-1.0
(which is amusing, since it’s supposed to be a backwards compatibility setting) so cabal test
will always claim to have run all tests successfully. The real results will be printed to the logfile located in dist/test.
Test suite hunit-tests: RUNNING...
Cases: 6 Tried: 0 Errors: 0 Failures: 0
### Failure in: 0:ParserS html
for valid html
expected: NTree (Element (ElementData "html" fromList [])) [NTree (Element (ElementData "head" fromList [])) [NTree (Element (ElementData "title" fromList [])) [NTree (Text "Test") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Hello, ") [],NTree (Element (ElementData "span" fromList [("id","name")])) [NTree (Text "world") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Goodbye!") []]]
but got: NTree (Element (ElementData "html" fromList [])) [NTree (Text "\n ") [],NTree (Element (ElementData "head" fromList [])) [NTree (Text "\n ") [],NTree (Element (ElementData "title" fromList [])) [NTree (Text "Test") []],NTree (Text "\n ") []],NTree (Text "\n ") [],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "\n Hello, ") [],NTree (Element (ElementData "span" fromList [("id","name")])) [NTree (Text "world!") []],NTree (Text "\n ") []],NTree (Text "\n ") [],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "\n Goodbye!\n ") []],NTree (Text "\n") []]
Cases: 6 Tried: 1 Errors: 0 Failures: 1
Cases: 6 Tried: 2 Errors: 0 Failures: 1
Cases: 6 Tried: 3 Errors: 0 Failures: 1
### Failure in: 3:Parsec html
for valid html
expected: NTree (Element (ElementData "html" fromList [])) [NTree (Element (ElementData "head" fromList [])) [NTree (Element (ElementData "title" fromList [])) [NTree (Text "Test") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Hello, ") [],NTree (Element (ElementData "span" fromList [("id","name")])) [NTree (Text "world") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Goodbye!") []]]
but got: NTree (Element (ElementData "html" fromList [])) [NTree (Element (ElementData "head" fromList [])) [NTree (Element (ElementData "title" fromList [])) [NTree (Text "Test") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Hello, ") [],NTree (Element (ElementData "span" fromList [("id","name")])) [NTree (Text "world!") []]],NTree (Element (ElementData "p" fromList [("class","inner")])) [NTree (Text "Goodbye!\n ") []]]
Cases: 6 Tried: 4 Errors: 0 Failures: 2
Cases: 6 Tried: 5 Errors: 0 Failures: 2
Cases: 6 Tried: 6 Errors: 0 Failures: 2
Test suite hunit-tests: PASS
Test suite logged to: dist\test\hubert-0.1.0.0-hunit-tests.log
Oh hey, looks like our tests have found a problem: parseHtml
has failed for both parsers. Since our other tests have succeeded, we can guess that the issue is at the top level of the parser, and inspecting the output, we can see that indeed, the ParserS
is parsing whitespace between nodes as text nodes. That should be easy to fix. Since consumeWhitespace = consumeWhile (==' ')
is only consuming spaces, we’ll rewrite it as:
consumeWhitespace :: ParserS T.Text
consumeWhitespace = consumeWhile isSpace
While the Parsec
parser is not having this problem, it actually will exhibit the same behavior for improper html files where there is no root node; because for reasons I can’t remember, parseNodes
and parseChildren
do different things. parseChildren
is the proper implementation though, so we’ll just rename it to parseNodes
and change every call to parseChildren
to call parseNodes
.
We have a second issue, which is affecting both parsers: the parser is returning a text element of "Goodbye!\n "
instead of the expected "Goodbye!"
. This is actually a mistake writing the test, the correct behavior is for the html parser to include the whitespace in the text element, and later algorithms can ignore it or not as they wish. I’ve also accidentally left off the ‘!’ at the end of "world"
.
With those changes, all tests now pass:
Test suite hunit-tests: RUNNING...
Cases: 6 Tried: 0 Errors: 0 Failures: 0
Cases: 6 Tried: 1 Errors: 0 Failures: 0
Cases: 6 Tried: 2 Errors: 0 Failures: 0
Cases: 6 Tried: 3 Errors: 0 Failures: 0
Cases: 6 Tried: 4 Errors: 0 Failures: 0
Cases: 6 Tried: 5 Errors: 0 Failures: 0
Cases: 6 Tried: 6 Errors: 0 Failures: 0
Test suite hunit-tests: PASS
Test suite logged to: dist\test\hubert-0.1.0.0-hunit-tests.log
With that, we can move on to parsing CSS. I had originally planned to cover CSS in this post, but I’d like to avoid huge posts like the previous one so we’ll end this here. Expect the next post fairly soon.
As usual, you can find the source for this post here and the source for Robinson here.
Ok, so in part 1 we built a set of types to use for a really basic DOM. In part 2, we’ll write a simple html parser (actually we’ll write two) that only supports Balanced Tags, Attributes with quoted values, and Text nodes.
First, we’re going to write a straight port of Matt’s Rust parser, pretty much just to see how it goes. Then we’re gonna turn around and write a more Haskell-flavored parser, since writing parsers is something we can do really really well in Haskell. In this way we explore the similarities and differences between languages, as well as the design process that led to today’s modern applicative parsing libraries.
Ok no that’s not true, I wrote the whole first parser at two in the morning before I remembered that Parsec is part of the Haskell Platform, so now you get to experience it too (to be fair, it’s a pretty decent example of using monad transformers, I feel).
First things first, we’re switching from using String
s, which are just lists of Char
s, to Text
s, which are 1) designed to deal with all the different UTF formats and 2) way faster. A nice side effect of this is that we can treat everything like regular Char
s and the compiler will figure it out for us. If you’re following along, go back and update your types in Dom.hs.
First we’ll create a Parser type, in Robinson the Parser consists of a vector and an index, but for our purposes it’ll be better to consume the Text
as we read it. We don’t have to worry about accidentally destroying it, and in theory the compiler should fuse everything together into a single traversal.
-- here's all the imports for the file
{-# LANGUAGE OverloadedStrings #-} -- this lets the compiler turn string literals into Texts
import Data.Char (isAlphaNum)
import Control.Monad (liftM)
import Control.Monad.State.Lazy (StateT(..), evalState, get, put)
import Control.Monad.Except (ExceptT(..), runExceptT, throwError)
import Control.Monad.Identity
import qualified Data.HashMap.Strict as HM
import qualified Data.Text as T
import Dom
data Parser = Parser T.Text
type ParserS = ExceptT T.Text (StateT Parser Identity)
We’ve defined a Parser
as just a wrapper around a Text
, and a ParserS
monad transformer stack where all our parsing code will live. If you’re not familiar with monad transformers you should go read Monad Transformers Step by Step which, despite being a .pdf file, is actually a short introductory tutorial. To read this post though, all you need to know is that ParserS
basically functions as a State a
and an Either T.Text a
at the same time, so we can carry our Parser
along and also toss up an error if something goes wrong.
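To make that concrete, here’s a toy version of the same stack with Int standing in for Parser; get and put talk to the StateT layer, and throwError short-circuits into a Left (the names here are illustrative, not from hubert):

```haskell
import Control.Monad.State.Lazy (StateT, evalState, get, put)
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.Identity (Identity)

-- the same stack shape as ParserS, but over Int instead of Parser
type Demo = ExceptT String (StateT Int Identity)

runDemo :: Demo a -> Int -> Either String a
runDemo p = evalState (runExceptT p)

-- read the state, fail if it's too big, otherwise advance it
bump :: Demo Int
bump = do
  n <- get
  if n >= 3 then throwError "state too large" else put (n + 1)
  return n
```

runDemo bump 0 yields Right 0 (and the state quietly becomes 1), while runDemo bump 5 yields Left "state too large" — state threading and failure in one monad, which is all ParserS needs.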
We’ll also define a few convenience functions for working with the Parser
.
runParserS p s = evalState (runExceptT p) s
nextchr :: Parser -> Char
nextchr (Parser s) = T.head s -- errors if called when string is empty
startsWith :: Parser -> T.Text -> Bool
startsWith (Parser input) s = s `T.isPrefixOf` input
eof :: Parser -> Bool
eof (Parser input) = T.null input
runParserS extracts our results from the monad transformer stack; the other three functions are just the same as the equivalent Rust versions, except that we’re always looking at the head of our Text.
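To see how the two layers of the stack interact, here’s a self-contained miniature of the same setup (the same Parser and ParserS as above, plus a one-character action, so it compiles on its own): get and put thread the Parser through, and throwError aborts with a Left.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.State.Lazy (StateT, evalState, get, put)
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.Identity (Identity)
import qualified Data.Text as T

data Parser = Parser T.Text

type ParserS = ExceptT T.Text (StateT Parser Identity)

runParserS :: ParserS a -> Parser -> Either T.Text a
runParserS p s = evalState (runExceptT p) s

-- take one character off the state, or short circuit with a Left
one :: ParserS Char
one = do
  Parser inp <- get
  case T.uncons inp of
    Nothing       -> throwError "empty input"
    Just (c, cs') -> put (Parser cs') >> return c

ok, err :: Either T.Text Char
ok  = runParserS one (Parser "ab")                  -- Right 'a'
err = runParserS (one >> one >> one) (Parser "ab")  -- Left "empty input"
```

Note that the failure in the third `one` throws away the whole computation’s result, which is exactly the short-circuiting behaviour we want from asserts later on.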
The worker functions
consumeChar :: ParserS Char
consumeChar = do
(Parser inp) <- get
case T.uncons inp of
Nothing -> throwError "ERROR: unexpectedly reached end of file"
Just (c,inp') -> do
put (Parser inp')
return c
consumeWhile :: (Char -> Bool) -> ParserS T.Text
consumeWhile f = do
Parser input <- get
let (s,input') = T.span f input
put $ Parser input'
return s
consumeWhitespace :: ParserS T.Text
consumeWhitespace = consumeWhile (==' ')
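consumeChar and consumeWhile are thin wrappers over T.uncons and T.span; if you haven’t met those before, here’s what they do on their own (plain Data.Text, no parser state involved):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- T.uncons splits off the first character, if there is one
first :: Maybe (Char, T.Text)
first = T.uncons "<p>"              -- Just ('<', "p>")

-- T.span returns the longest matching prefix and the rest in one pass
spanned :: (T.Text, T.Text)
spanned = T.span (== ' ') "   <p>"  -- ("   ", "<p>")
```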
get and put access the ParserS’s state, and throwError will short circuit the computation with a Left value. We lose a bit over Rust here in needing to explicitly handle trying to read an empty Text, where Rust would have simply tossed up an error. Speaking of errors, Robinson doesn’t really have any error handling here, but it is liberally sprinkled with assert!s which, unlike every other language I’ve seen that has them, also run in non-debug code (apparently this is being changed soon). It’s convenient to define our own equivalent, which we’ll call assert as well.
assert :: T.Text -> Bool -> ParserS ()
assert s b = if b then return () else throwError s
Now we can nicely write checks on one line, and we gain a lot of brevity for short functions:
parseTagName :: ParserS T.Text
parseTagName = consumeWhile isAlphaNum
parseNode :: ParserS Node
parseNode = do
p <- get
if nextchr p == '<' then parseElement else parseText
parseText :: ParserS Node
parseText = liftM Dom.text $ consumeWhile (/='<')
Longer functions look fairly similar to their Rust counterparts, albeit a little easier on the eyes. I’m not totally happy with how the asserts look, but it’s not incredibly hard to follow the flow of the function.
parseElement :: ParserS Node
parseElement = do
-- open tag
consumeChar >>= assert "missing < in open tag" . (=='<')
tag <- parseTagName
attrs <- parseAttributes
consumeChar >>= assert "missing > in open tag" . (=='>')
-- contents
children <- parseNodes
--end tag
consumeChar >>= assert "missing < in close tag" . (=='<')
consumeChar >>= assert "missing / in close tag" . (=='/')
parseTagName >>= assert "end tag doesn't match start tag" . (==tag)
consumeChar >>= assert "missing > in close tag" . (=='>')
return $ Dom.elem tag attrs children
parseAttr :: ParserS (T.Text, T.Text)
parseAttr = do
name <- parseTagName
consumeChar >>= assert "missing =" . (=='=')
value <- parseAttrValue
return (name,value)
parseAttrValue :: ParserS T.Text
parseAttrValue = do
open <- consumeChar
assert "invalid open" (open == '\"' || open == '\'')
val <- consumeWhile (/=open)
consumeChar >>= assert "invalid close" . (==open)
return val
parseAttributes :: ParserS AttrMap
parseAttributes = parseAttributes' HM.empty
where
parseAttributes' attrs = do
consumeWhitespace
p <- get
if nextchr p == '>' then return attrs
else do
(name,val) <- parseAttr
parseAttributes' $ HM.insert name val attrs
parseNodes :: ParserS [Node]
parseNodes = parseNodes' []
where
parseNodes' nodes = do
consumeWhitespace
p <- get
if eof p || p `startsWith` "</"
then return nodes
else parseNode >>= parseNodes' . (nodes++) . (:[]) --slow for big DOM
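The comment on the last line of parseNodes' is worth taking seriously: appending a singleton to the end of the accumulator on every step makes the loop quadratic in the number of nodes. The usual fix (not applied in the version above, since our DOMs are tiny) is to cons onto the front and reverse once at the end. A standalone sketch of that pattern:

```haskell
-- build front-to-back by consing, then reverse once: O(n) total
-- instead of O(n^2) from repeated appends
collectEvens :: [Int] -> [Int]
collectEvens = go []
  where
    go acc []     = reverse acc
    go acc (x:xs)
      | even x    = go (x : acc) xs
      | otherwise = go acc xs
```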
Finally, we’ll write the function that actually parses an HTML string.
parseHtml :: T.Text -> Either T.Text Node
parseHtml s = case runParserS parseNodes (Parser s) of
Left err -> Left err
Right nodes -> Right $
if length nodes == 1
then head nodes
else Dom.elem "html" HM.empty nodes
So, that’s the whole parser in only about 135 lines of code, which isn’t bad. Of course, this implementation is fragile, overly verbose, and I suspect that the wrapper types might prevent proper stream fusion on the Text functions (I’m not super familiar with the details of Haskell stream fusion).
Fortunately we have the Parsec library, which will allow us to write a much nicer parser in about half the space (and yes, I know, brevity tends to lead to the unreadable gibberish Haskell is famous for, but trust me). Parsec is not the fastest parser library (that would be attoparsec, because if there is one thing all programmers love, it is puns), but it is the most robust, and it’s really easy to use. Most importantly, attoparsec produces truly confusing error messages, whereas Parsec will actually allow us to add our own if we want (though today I won’t).
Our imports list is slightly shorter:
{-# LANGUAGE OverloadedStrings, NoMonomorphismRestriction #-}
module HTML.Parsec
( parseHtml
) where
import Control.Monad (liftM)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Text
import qualified Data.HashMap.Strict as HM
import Dom
And our top level function is all but identical:
parseHtml :: T.Text -> Either ParseError Node
parseHtml s = case parse parseNodes "" s of
Left err -> Left err
Right nodes -> Right $
if length nodes == 1
then head nodes
else Dom.elem "html" HM.empty nodes
The only difference here is our call to parse instead of runParserS. The underlying monad in Parsec is actually pretty much the same as our ParserS, but with a bit more. It’s also a monad transformer, so although we’re not going to use it today, we could add even more monadic features if we needed to.
When working with Parsec it’s convenient to build the parser from the top down, our first function is simple enough:
parseNodes = manyTill parseNode eof
That’ll just keep parsing Nodes until the parser fails or we hit the end of the file.
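If manyTill is new to you: it runs its first parser repeatedly until the second one succeeds, consuming the terminator. A tiny standalone check (assuming the parsec and text packages):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Parsec
import Text.Parsec.Text ()  -- Stream instance for Text
import qualified Data.Text as T

-- collect characters until the '!' terminator parser succeeds
upToBang :: Either ParseError String
upToBang = parse (manyTill anyChar (char '!')) "" ("stop! now" :: T.Text)
-- Right "stop"
```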
parseNode is easy too, just parse a Text or an Element:
parseNode = parseElement <|> parseText
How do we parse a Text? Just keep taking characters until we hit a ‘<’:
parseText = liftM (Dom.text . T.pack) $ many (noneOf "<")
What about an Element? That’s a bit longer, but still pretty readable.
parseElement = do
-- opening tag
(tag, attrs) <- between (char '<') (char '>') tagData
-- contents
children <- parseChildren
-- closing tag
string $ tag ++ ">" -- "</" is consumed by parseChildren, maybe bad form?
return $ Dom.elem (T.pack tag) attrs children
-- the try combinator won't consume input if it fails, so the next parser will get that input
-- otherwise if string "</" matched '<' but not '/' the next parser would start at '/'
parseChildren = spaces >> manyTill parseChild end
where
end = eof <|> (try (string "</") >> return ())
parseChild = spacesAfter parseNode
tagData = do
t <- tagName
attrs <- attributes
return (t,attrs)
tagName = many1 alphaNum
--this is safe because attribute will fail without consuming on '>'
attributes = liftM HM.fromList $ spaces >> many (spacesAfter attribute)
attribute = do
name <- tagName
char '='
open <- char '\"' <|> char '\''
value <- manyTill anyChar (try $ char open)
return (T.pack name, T.pack value)
-- run parser p and then strip the trailing spaces, returning the result of p.
spacesAfter p = p <* spaces
That’s the whole thing.
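To make the try business concrete, here’s a standalone pair of checks: a cut-down copy of the attribute parser above, and a demonstration that try lets the alternative recover even after string "</" has already matched the '<':

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Parsec
import Text.Parsec.Text ()  -- Stream instance for Text
import qualified Data.Text as T

-- cut-down attribute parser: a name, '=', then a quoted value
attr :: Parsec T.Text () (String, String)
attr = do
  name  <- many1 alphaNum
  _     <- char '='
  open  <- char '"' <|> char '\''
  value <- manyTill anyChar (char open)
  return (name, value)

attrDemo :: Either ParseError (String, String)
attrDemo = parse attr "" ("class=\"post\"" :: T.Text)  -- Right ("class","post")

-- without try, string "</" would consume the '<' on failure,
-- and the second alternative would start at 'p' instead of '<'
openTag :: Parsec T.Text () String
openTag = try (string "</") <|> string "<p"

tryDemo :: Either ParseError String
tryDemo = parse openTag "" ("<p>" :: T.Text)           -- Right "<p"
```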
One thing I like about Parsec is that it’s very easy to write short functions with readable names, and then compose them into larger parsers that are still really easy to read. Moving forward, I’m just going to use Parsec for any other parsing in this project, without messing around with the ParserS (I’ve left it in the repo though).
That’s all for today, I was planning to write the CSS parser next time, but now I’m thinking I might take a break and set up a test framework first. We’ll see where that goes.
The source for this post is here. As per usual, the source for Robinson is here.
Matt Brubeck over at Mozilla has recently started a series of blog posts on constructing a toy html rendering engine in Rust. Now, I’m not particularly interested in Rust, but I am interested in web browsers, so I thought it might be fun to try following along in Haskell instead (obviously, I could also use an imperative language like C++, but that would be less interesting).
Matt’s trying to stick to using only the Rust standard library, but for Haskell that would limit us unnecessarily; instead, I’ll try to stick with libraries shipped with the Haskell Platform (so no lenses, unless things get really awkward). I’ll use GHC 7.8, but the code will probably compile with 7.6.
We’ll name the project Hubert because in Haskell we like to start or end program names with H.
A browser’s DOM is a tree of Nodes representing the structure of an HTML page. Robinson represents the DOM with an intrusive tree: each Node has a NodeType, and a vec of its own children. In Haskell, we generally like to separate structure and data when possible, so instead we’ll build a Tree and fill it with Nodes.
First the tree type:
data NTree a = NTree a [NTree a]
deriving (Show)
That was easy; next up are the Nodes. Matt uses a heavily pared-down DOM: we’ll only have text and element nodes, which we represent with a sum type.
import qualified Data.Text as T
-- data specific to each node type
data NodeType = Text T.Text
| Element ElementData
deriving (Show)
Finally we’ll make a nice type alias for a Node:
type Node = NTree NodeType
All that’s left to do is define our node types. We already have a full definition for Text: it just holds a Text. Element, however, holds an ElementData, which consists of a name and a hashmap of attributes (which we’ll represent with Texts), imaginatively named AttrMap. Efficient maps can be annoying to write in Haskell, so we’ll import a HashMap from unordered-containers.
import qualified Data.HashMap.Strict as HM
type AttrMap = HM.HashMap T.Text T.Text
data ElementData = ElementData T.Text AttrMap
deriving (Show)
Haskell manages to use a lot less space by not bothering to name fields in our types; the trade-off is that we need to write the full constructor name out in our function declarations. If that gets too annoying we can switch to record fields later.
Matt also provides constructor functions for a Node with each NodeType. We don’t strictly need these, but they’ll save us a lot of space, so let’s write them.
text :: T.Text -> Node
text = flip NTree [] . Text
elem :: T.Text -> AttrMap -> [Node] -> Node
elem name atts cs = NTree (Element (ElementData name atts)) cs
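As a quick sanity check, here’s everything from this post assembled into one self-contained snippet, building a node for `<p class="intro">hello</p>` (elem is renamed elem' here only to dodge the Prelude.elem clash outside the Dom module):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.HashMap.Strict as HM

data NTree a = NTree a [NTree a] deriving (Show, Eq)

type AttrMap = HM.HashMap T.Text T.Text

data ElementData = ElementData T.Text AttrMap deriving (Show, Eq)

data NodeType = Text T.Text
              | Element ElementData
              deriving (Show, Eq)

type Node = NTree NodeType

text :: T.Text -> Node
text = flip NTree [] . Text

elem' :: T.Text -> AttrMap -> [Node] -> Node
elem' name atts cs = NTree (Element (ElementData name atts)) cs

-- <p class="intro">hello</p>
para :: Node
para = elem' "p" (HM.singleton "class" "intro") [text "hello"]
```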
We’ll probably write some accessors later but this is enough to get started with. Next time, we’ll write the HTML parser, and actually build a DOM using these types.
The full source for this article can be found here. The full source for Robinson can be found here.
The folks over at Stoic released the free-to-play multiplayer component of their post-Ragnarok tactical RPG epic The Banner Saga last week. I haven’t played a whole lot, for reasons I’ll get into in a bit, but I have played a few games and I’d like to talk about the game itself a bit before moving on to some perhaps more interesting musings on what it tells us about RPGs in the multiplayer space.
The Banner Saga is a tactical RPG in the vein of Tactics Ogre, Final Fantasy Tactics, and Fire Emblem, although it plays very differently than those games. Combat takes place on a grid where units take turns moving and attacking, but unlike the aforementioned games The Banner Saga does away with the classic Speed unit stat; instead players simply take turns moving units. This turns out to impact play in a big way. The game is set after Ragnarok; you control a team of humans and Varl (which I assume to be related to giants in some way), with each race having access to certain classes. A unit is able to progress to an advanced class after defeating at least five enemies and spending 50 “Renown”, which is earned by killing enemies, winning battles, and meeting certain goals. In addition to gaining an ability and improving their attributes, advanced classes can move some of their stats around, allowing the player to attempt to optimize their team for a particular play style. The game and cutscenes are rendered in a beautiful hand-painted style.
Units have two health stats: Armor, which reduces damage taken, and Strength, which acts both as a standard HP bar and as a gauge of the unit’s ability to deal damage. Combat consists of balancing your ability to damage a unit against its capacity to deal damage to you in the short term. Your MP equivalent, “Willpower”, is used for increasing move distance and damage, as well as activating abilities. Willpower does not regenerate unless you spend a full turn resting (do not move or act) or expend a charge from your war-horn (which gains a charge every time you defeat an enemy unit). Units additionally have a stat, “Exertion”, that determines how many points of Willpower they can spend on any individual action.
The combat is tight and responsive – limiting each unit to a single ability both helps prevent decision paralysis and makes it easier to reason about your opponent’s actions. Animations and sound are well chosen and well synced; hitting an enemy unit feels great. Unit abilities are well thought out and varied. Forcing players to take turns is an interesting change to the standard formula.
Combat is SLOW, like really slow. By default each player has what feels like a full minute to make their move (I didn’t actually time this), and in the event of connectivity problems you tend to see each move after at least this much time. Of the games I played, the only one lasting less than half an hour was when my opponent forfeited. Forfeiting, incidentally, is not recommended: fighting until the bitter end will net you some consolation Renown, but if you forfeit you get nothing. It’s possible to reduce the timer to a slightly better thirty seconds, but Stoic have opted to call this “expert mode”, thereby guaranteeing that most players won’t try it before spending several hours playing the extremely slow default mode.
The interface is not entirely intuitive, which is compounded by the fact that some features (such as the horn at the top of the screen, and miss chances) are not explained anywhere in the game except for the optional advanced tutorial video, which I doubt many people watch.
It’s worth noting at this point that nearly all my gripes with the game are UI issues or multiplayer-related; they are unlikely to detract from your enjoyment of the final single-player product. I’d like to spend some time talking about the multiplayer issues though, because they highlight some issues with implementing a multiplayer tactical RPG, which is sort of a long-term goal of mine.
The major lesson to learn from Banner Saga is how hard it is to make a tactical rpg play at anything resembling an acceptable speed in multiplayer matches. I complained a lot about the speed up above, but really Stoic have put an incredible amount of work into making combat as fast as possible. By restricting units to only a single active ability the player has only a small set of possible moves for each unit instead of needing to consider say, 5 damage dealing abilities and 7 status modifying abilities. They probably could have gotten away with dialing expert mode down to 20 second turns, but they’re obviously worried about limiting the ability of new players to get into the game. Having players take turns moving units, and showing the unit order, also improve flow by making it easier to reason about later turns.
There’s a major problem with Stoic’s solution to the speed issue though: it severely limits their ability to add complexity to the game, and complexity is a pretty integral part of tactical RPGs. Players like their characters to feel stronger after 50 hours of play, which is usually achieved by granting access to newer, more powerful abilities and classes. Furthermore, players generally like having many options on a turn; even though it slows gameplay down, it makes the player feel more involved in the battle.
So, what are some other ways we could try to get a fast game without limiting complexity? The simplest method is to create a “normal” tactical rpg but restrict turns to some arbitrary time limit. This is not a good solution – without some streamlining factor the player simply has too many options to play quickly. A more nuanced approach would be to restrict players to a smaller set of abilities at any given turn, but allow them some way to change what abilities they will have in later turns. This can create issues with reasoning about future turns, as units become more mutable the player is less able to think ahead.
We could at this point try to find another solution, but the idea of mutating a unit’s abilities has caught my fancy, and I see a way to fix a second issue, namely that imposing a time limit on turns can make players “feel” rushed even when they are not. Instead of giving up on this solution, let’s say that units have some small set of initial abilities, and then a number of tiers of more advanced abilities not initially available. Each tier has about the same number of abilities, some of which may be direct upgrades of earlier abilities while others may open up new options for the unit.

Now, instead of punishing a player for taking too long on a turn, let’s reward them for acting quickly. Give each unit a stat, we’ll call it “momentum”, that starts at zero. Each time a unit performs an action it gains some momentum, and whenever a unit’s turn is active its momentum slowly decreases. At certain predetermined thresholds of momentum the unit replaces its available abilities with the next higher tier, and when the momentum drops back below a threshold the unit again must use the lower tier of abilities. In this way the player is rewarded for playing quickly, and with some tuning of the combat system we hope to push both players into a positive feedback loop whereby play becomes progressively faster as combat goes on. With this system we can introduce a certain amount of additional complexity over Banner Saga, if we’re careful, because gameplay should be moving too fast for players to be thinking very far ahead, and instead they should be focusing on more immediate payoffs.
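The bookkeeping for this mechanic is simple enough to sketch. All the numbers below are made up for illustration – thresholds at 0/50/100 momentum, a flat gain per action, a flat decay per tick – and none of this comes from an actual game:

```haskell
-- hypothetical tier thresholds; a unit at 60 momentum is in tier 1
tierFor :: Int -> Int
tierFor m = length (takeWhile (<= m) [0, 50, 100]) - 1

-- gain momentum on each action, bleed it while the unit's turn is active
afterAction, afterTick :: Int -> Int
afterAction m = m + 15
afterTick   m = max 0 (m - 5)
```

Tuning those three constants against each other is exactly where the positive feedback loop would be won or lost.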
There are of course a number of problems with this idea. Losing too much momentum in the middle of a match could easily see the player get caught unable to build it back up, but their opponent might not be able to take advantage of this, resulting in a match that drags on. Connection problems are even more of an issue here than they are with Banner Saga – a ten-second pause could cripple a player – and there isn’t a great way for the game to let them recover. The additional complexity we seek to add also becomes an issue for the developer: complexity is a lot of work to manage in any case, but in this system the danger of players restricting themselves to very small subsets of the available choices (classes, abilities, equipment, etc.) becomes exacerbated, forcing additional testing.
Still, it’s worth looking into.