A HAML parser for Haskell
HAML‘s lovely. As I’ve been working with Turbinado, I’ve been having some issues with HSP. HSP is an insanely impressive piece of software, but its error messages can be a bit unclear. So I started playing around a bit with HAML. Got me wondering “How easy would be to write a HAML parser in Haskell?”
So I tried. Here’s a first-pass, to-be-updated [and somewhat incomplete] HAML parser for Haskell. Not all of the features are implemented, but it’s a start. It generates HTML bits suitable for compilation by GHC.
Input
f = content page = #content .left.column %h2 Welcome to our site! %p = print_information %p = print_inline .right.column = render [abba = ding, ding = abba] dinger |
Output
f = (stringToHtml "content") page = ((tag "div"![strAttr "id" "content"]) ((tag "div"![strAttr "class" "left column"]) ((tag "h2") (stringToHtml "Welcome to our site!") ) +++ ((tag "p") (stringToHtml print_information) ) +++ ((tag "p") (stringToHtml print_inline) ) +++ ((tag "div"![strAttr "class" "right column"]) (stringToHtml render) ) +++ ((tag "div"![strAttr "abba" "ding", strAttr "ding" "abba"]) (stringToHtml "dinger") ) ) ) |
le Code
module Main where import Text.ParserCombinators.Parsec import Text.ParserCombinators.Parsec.Language import Text.ParserCombinators.Parsec.Pos import qualified Text.ParserCombinators.Parsec.Token as T import Data.Char import Data.List import Data.Maybe import System.IO.Unsafe main = do s <- getContents case (parse mainParser "stdin" s) of Left err -> putStrLn "Error: " >> print err Right hs -> putStrLn hs -- Try to parse HAML, otherwise re-output raw lines mainParser = do whiteSpace ls <- many1 (hamlCode <|> tilEOL) return $ unlines ls -- -- * HAML lexer -- hamlLexer = T.makeTokenParser emptyDef whiteSpace= T.whiteSpace hamlLexer lexeme = T.lexeme hamlLexer symbol = T.symbol hamlLexer natural = T.natural hamlLexer parens = T.parens hamlLexer semi = T.semi hamlLexer squares = T.squares hamlLexer stringLiteral= T.stringLiteral hamlLexer identifier= T.identifier hamlLexer reserved = T.reserved hamlLexer reservedOp= T.reservedOp hamlLexer commaSep1 = T.commaSep1 hamlLexer -- -- * Main HAML parsers -- -- hamlCode is just many identifiers (e.g. 'func a b c' followed by '=' followed by a hamlBlock -- func a b c = %somehaml hamlCode = try ( do is <- many1 identifier symbol "=" currentPos <- getPosition x <- manyTill1 (lexeme $ hamlBlock) (notSameIndent currentPos) return $ (concat $ intersperse " " is) ++ " = \n" ++ (concat $ (intersperse (indent currentPos ++ "+++\n") $ filter (not . null) $ x)) ) -- A Block may start with some whitespace, then has a valid bit of data hamlBlock = do currentPos <- getPosition bs <- manyTill1 (pTag <|> pText) (notSameIndent currentPos) return $ intercalate (indent currentPos ++ "+++\n") bs pTag = do currentPos <- getPosition try (do t <- lexeme tagParser ts <- (isInline currentPos >> char '/' >> return []) <|> (hamlBlock) return $ intercalate "\n" $ filter (not . null) $ [ (indent currentPos) ++ "((" ++ (if (null ts) then "i" else "") ++ t ++ ")" , if null ts then [] else ts , (indent currentPos) ++ ")\n"] ) pText = lexeme stringParser notSameIndent p = (eof >> return []) <|> (do innerPos <- getPosition case (sourceColumn p) == (sourceColumn innerPos) of True -> pzero False -> return [] ) -- -- * Various little parsers -- tagParser :: CharParser () String tagParser = do t <- optionMaybe tagParser' i <- optionMaybe idParser c <- optionMaybe (many1 classParser) a <- optionMaybe attributesParser if (isJust t || isJust i || isJust c || isJust a) then do return $ "tag \"" ++ (fromMaybe "div" t) ++ "\"" ++ (if not (isJust i || isJust c || isJust a) then "" else concat $ [ "![" , intercalate ", " $ filter (not . null) [ (maybe "" (\i' -> "strAttr \"id\" \"" ++ i' ++ "\"") i) , (maybe "" (\c' -> "strAttr \"class\" \"" ++ (intercalate " " c') ++ "\"") c) , (maybe "" (\kv -> intercalate ", " $ map (\(k,v) -> "strAttr \"" ++ k ++ "\" \"" ++ v ++ "\"") kv) a) ] , "]"] ) else pzero tagParser' :: CharParser () String tagParser' = do char '%' many1 termChar idParser :: CharParser () String idParser = do char '#' many1 termChar classParser :: CharParser () String classParser = do char '.' many1 termChar attributesParser :: CharParser () [(String, String)] attributesParser = squares (commaSep1 attributeParser) attributeParser :: CharParser () (String, String) attributeParser = do k <- identifier symbol "=" cs <- many1 identifier return (k, intercalate " " cs) stringParser :: CharParser () String stringParser = do currentPos <- getPosition modifier <- optionMaybe (char '=' <|> char '-') whiteSpace c <- alphaNum cs<- tilEOL case modifier of Just '-' -> return $ (indent currentPos) ++ "-" ++ c:cs Just '=' -> return $ (indent currentPos) ++ "(stringToHtml " ++ c:cs ++ ")" Nothing -> return $ (indent currentPos) ++ "(stringToHtml \"" ++ c:cs ++ "\")" -- -- * Utility functions -- isInline p = do p2 <- getPosition case (sourceLine p ) == (sourceLine p2) of True -> return [] False -> pzero isSameIndent p1 p2 = (sourceColumn p1) == (sourceColumn p2) tilEOL = manyTill1 (noneOf "\n") eol eol = newline <|> (eof >> return '\n') termChar = satisfy (\c -> (isAlphaNum c) || (c `elem` termPunctuation) ) termPunctuation = "-_" indent p = take (sourceColumn (p) - 1) (repeat ' ') manyTill1 p e = do ms <- manyTill p e case (null ms) of True -> pzero False -> return ms |
Golly, but I wish that I’d cleaned up the code, but there it is in all of its raw, un-thought-through glory…
Nice!
The possibility of having portable templating syntax (Markdown, HAML, etc.) is intriguing. Yet another good thing that can come to Haskell thanks to having support for parsing built into the standard library.
Paul Brown
11 Dec 08 at 8:36 pm
Thanks for the kudos! It was fun to write. Not a full implementation, but enough to serve as a starting point for further work.
Parsec is a pretty sweet library to have built into the standard set of libraries, though I do wish that Parsec had more usage examples. Sometimes the error messages can be a bit terse or misleading, but it’s a pretty nice library.
alson
12 Dec 08 at 12:02 am