Haskell at CentralApp

Ashesh Ambasta & Brecht Serckx

09.12.2019

Introduction

Ashesh

  • Co-founder + CTO
  • Ops
  • Backend programming
  • Design and architecture

Brecht

  • was in your spot a few years ago
  • graduated in 2019 from KUL CS - AI
  • backend engineering intern at CentralApp since September

Agenda

  1. About CentralApp
  2. Architecture
  3. Haskell at CentralApp
  4. Benefits of using Haskell
  5. Example?
  6. Tooling
  7. Starting as a Fresh Graduate
  8. Outro

About CentralApp

What we do

  • We're a SaaS product aimed at SMEs: we centralise + distribute data about the SME
    • online presence: reviews + social/public profiles
    • websites
    • online transactions: booking requests/emails/enquiries

And how we do it

  • We're a small development team running and building a large product
    • 150k LOC: Scala and Haskell
    • N services in production
    • 2 backend devs: Brecht & me
    • millions of requests per day
    • 1000+ unique websites

Architecture

Overview

Diagram


Figure 1: Arch. overview

A bit more detail

  • Service groups: (as in the diagram)
  • Functional programming: Scala + Haskell
  • Decoupled:
    • SOA: services don't talk to each other over a sync. channel
    • Services exchange messages over a MQ (minimal coupling)
    • A service can run without others

Quasar

Death star


Figure 2: Death star

Enter Quasar

  • Quasar is our answer to the death-star
  • Quasar as "service-mesh" and API gateway: flattens out the death-star and removes inter-svc. deps.
    • we'll get to Quasar in more detail

Quasar: Overview


Figure 3: Quasar overview

Haskell at CentralApp

Started with Scala, moved to Haskell: how, why

  • Scala (4-5 years ago):
    • Haskell's tooling story wasn't great
    • we needed a product really fast
    • Scala let us use the JVM libs. in a safe way.
  • Haskell (now):
    • I was the lone dev
    • I was looking to maximise productivity in a large codebase
    • I looked at safe (and pure) languages: Haskell

Haskell pilot project

  • SSL engine (+ nginx)
  • invoked on each website visit, many times a day, so it has to be fast
  • on-the-fly LetsEncrypt SSL issuance, renewal & verification
  • was a massive success: 0 bugs to date.

Move to Haskell

  • New services are written in Haskell
  • Existing systems requiring major changes will be rewritten
  • Service types:
    • API servers
    • Database access
    • Network infrastructure (distributed systems)
    • Image processing

How is Haskell used now

Quasar (more details)

  • Quasar is our API gateway
  • Avoids death-star
    Death-star, in simple terms, means every service can make API calls to other services. This has problems:
    1. Dependency chains
    2. Cycles: broken infra

Pipelines

  • Uses Trees (acyclic) to describe each API endpoint, called pipelines:
    1. each node is an instruction to do one of:
      • proxy this request to some service
      • transform its input
      • pass its input (in ∥) to children
      • add input to accumulator (STM)
    2. input of a node: output of parent node;
      • root node input is the HTTP request
    3. output of a leaf node ↣ root accumulator
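
A minimal sketch of the pipeline idea described above, not Quasar's actual code. All names (`Instr`, `Pipeline`, `runPipeline`, ...) are hypothetical, requests are plain strings, the proxy step only pretends to call a service, and children run sequentially here rather than in parallel to keep the example short:

```haskell
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Data.Char (toUpper)
import Data.Foldable (traverse_)

-- | What a single node does (hypothetical names, not Quasar's real types).
data Instr
  = ProxyTo String               -- ^ stand-in for proxying to a service
  | Transform (String -> String) -- ^ transform the input
  | FanOut                       -- ^ pass the input unchanged to the children
  | Accumulate                   -- ^ add the input to the accumulator

-- | A pipeline is a tree: an instruction plus child pipelines.
data Pipeline = Node Instr [Pipeline]

-- | Append a value to the shared accumulator in one STM transaction.
push :: TVar [String] -> String -> IO ()
push acc x = atomically (modifyTVar' acc (++ [x]))

-- | Run one node: its input is the output of its parent.
runNode :: TVar [String] -> String -> Pipeline -> IO ()
runNode acc input (Node instr children) = case instr of
  Accumulate -> push acc input
  _ -> do
    let output = step instr input
    if null children
      then push acc output                          -- leaf output goes to the accumulator
      else traverse_ (runNode acc output) children  -- sequential in this sketch
  where
    step (ProxyTo svc) i = svc ++ "(" ++ i ++ ")"   -- pretend service response
    step (Transform f) i = f i
    step _ i = i                                    -- FanOut passes the input through

-- | The root input is the (stringly typed) HTTP request.
runPipeline :: Pipeline -> String -> IO [String]
runPipeline p request = do
  acc <- newTVarIO []
  runNode acc request p
  readTVarIO acc

example :: Pipeline
example = Node FanOut
  [ Node (ProxyTo "users") []
  , Node (Transform (map toUpper)) [Node Accumulate []]
  ]
```

Because the pipeline is a tree, a cycle between services cannot even be written down, which is the property the death-star lacks.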

PlSteps (legend)


Figure 4: Quasar PL Legend

An example


Figure 5: Quasar PL Example

Benefits

  • this means services don't need to call one another
  • Haskell's expressive type system and STM let us do this elegantly!

Rest APIs using servant

First, the code

  • Since we deal with APIs all the time, Servant is a great tool in our arsenal:
-- Example servant route:
type LoginAPI = "login" :> Header "Authorization" AuthData :> Post '[JSON] AuthResult
-- Authentication result
data AuthResult = AuthSuccess UserInfo
                | AuthFailed Reason
-- Authentication data
data AuthData = UserToken ByteString

What this all means:

  • HTTP request:
POST /login
Authorization: <AuthData>
  • the response is a value of the Haskell type AuthResult, serialised to JSON
  • And this route will need to be handled by a function that matches the type of the route:
-- servant passes an optional Header to the handler as a Maybe
authUser :: Maybe AuthData -> Handler AuthResult
authUser authData = undefined -- check the token and build an AuthResult (elided)

So what has happened here?

  • We've defined our route at the type level:
    • compiler ensures type of route ≡ type of handler function.
  • Runtime errors → compile time errors

Benefits of using Haskell

  • Compiles to fast binaries ∴ low costs + happy customers
  • Ultimate in safety: statically typed, functional, expressive and pure.
  • best in class STM (Software transactional memory)
  • Clean: function composition is trivial to read
  • Has libraries like Servant & Lens
  • Great type-system: we can effortlessly model things like Quasar pipelines
  • Type inference: GHC is a clever and informed guesser.

The Maybe functor

var int myNullableInt = 10; 
myNullableInt = null; // this is perfectly ok
myNullableInt = 20; // also ok 

// elsewhere in your code (also ok): 
myNullableInt2 = myNullableInt + 1;
  • what happens when myNullableInt is null and you're adding 1 to it?
    • BOOM: NPE (or if you're lucky, null may be treated as 0 depending on the whims of your language)

How can Haskell help us here?

  • null is not a valid value
  • we have Maybe Int instead of an Int that can randomly disappear on us

    data Maybe a = Just a | Nothing 
    
  • and Maybe is a Functor
  • Thus we've avoided NPE

Wait, what's a Functor?

  • as per Category Theory, a Functor is a mapping between categories (out of scope of this talk)
  • tl;dr: it lets us do things like:

    -- say we have a function: 
    func :: a -> b 
    func = something 
    -- and we have a functor f and a functor value: 
    x :: f a 
    -- the functor typeclass in haskell supplies us with a function called fmap: 
    fmap :: Functor f => (a -> b) -> f a -> f b
    -- and now we can do this: 
    fmap func x 
    -- which will give us a 
    value :: f b 
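
To make the signature concrete, here is a small sketch that lifts one function over three functors from base. `describe` and the `over...` names are hypothetical, chosen only for illustration:

```haskell
-- A hypothetical function of type a -> b (here Int -> String).
describe :: Int -> String
describe n = "n=" ++ show n

-- The same function lifted over different functors with fmap.
overMaybe :: Maybe String
overMaybe = fmap describe (Just 3)

overList :: [String]
overList = fmap describe [1, 2]

-- For Either, fmap only touches the Right side; a Left passes through.
overEither :: Either String String
overEither = fmap describe (Left "no value")
```

The function itself never changes; only the container it is applied through does.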
    

Cool, but what about the NPE?

  • Now we can do:

    -- say we have some value that is a Maybe Int 
    x = Just 10 
    -- in our previous example, we wanted to add 1 to the int, safely. 
    -- so we can now do: 
    fmap (+1) x 
    -- this will give us: Just 11 
    -- and what if we had a Nothing? That's fine too: 
    y = Nothing :: Maybe Int 
    fmap (+1) y 
    -- this will give us: Nothing 
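
A short sketch of how this plays out with a lookup that can fail. The `stock` table and the function names are made up for the example; `lookup` and `fromMaybe` are from base:

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical data: stock counts per item.
stock :: [(String, Int)]
stock = [("hats", 10)]

-- lookup returns a Maybe Int, because the item may be missing.
restock :: String -> Maybe Int
restock item = fmap (+ 1) (lookup item stock)

-- To get a plain Int back we have to say what Nothing means.
restockOrZero :: String -> Int
restockOrZero item = fromMaybe 0 (restock item)
```

The compiler will not let a `Maybe Int` be used as an `Int`, so the missing case has to be handled somewhere, explicitly.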
    

Servant and Type-Safe APIs

Why

  • control:
    • type of parameters
    • type of payload
    • type of response body
    • type of response codes

How

  • encode this information at the type level
    => compiler checks for mistakes!
  • with servant library

Example

type SecretSantaAPI
  = "api" :> "secretsanta" :>

    -- PUT request that accepts a `UnmatchedHat` payload and returns `Id`, 
    -- both in JSON format
    (  ReqBody '[JSON] UnmatchedHat :> Put '[JSON] Id

    -- GET request that accepts an `Id` URL parameter and returns a hat 
    -- if found
  :<|> Capture "id" Id :> Get '[JSON] (Maybe AnyHat)

    -- POST request that accepts an `Id` URL parameter and matches its 
    -- hat if it exists
  :<|> Capture "id" Id :> "match" :> Post '[JSON] (Maybe SantaError)
    )

-- | The function that actually serves the API defined above.
secretSantaServer :: Server SecretSantaAPI
secretSantaServer = runServer
    $  putHat
  :<|> getHat
  :<|> matchHatById

GADTs and DataKinds

  • lift runtime errs. to compile time errs.
  • runtime checks: prone to being missed; relies on dev.
  • compile time checks: your compiler enforces rules

A business example:

  • datatype: Category
  • Category.level: level0 or level1
  • function required:

    addCategoryToBusiness :: Bus -> Cat -> Bus
    

    constraint: ∀ c. c.level == level1 (only level1 categories may be added)

Runtime based implementation:

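data Lvl = L1 | L0 deriving Eq -- Eq is needed for the == check below
data Cat = Cat { name :: Text, lvl :: Lvl }
addCategoryToBusiness :: Bus -> Cat -> Bus
addCategoryToBusiness b c
  | lvl c == L1 = undefined -- add c to b (elided)
  | otherwise   = error "only L1 categories can be added" -- fails at runtime
  • still possible to call addCategoryToBusiness with an L0 Cat: the mistake only shows up at runtime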

Compile-time based implementation

  • DataKinds: lift data-types to kinds (kinds ≊ types of types)
  • eventually: lift business requirements to types
  • we lift Lvl to the kinds level

Code:

data CatLvl (lvl :: Lvl) where 
  CatL0 :: Cat -> CatLvl 'L0 
  CatL1 :: CatLvl 'L0 -- parent cat.
        -> Cat -- the L1 cat 
        -> CatLvl 'L1
-- thusly: 
addCategoryToBusiness :: Bus -> CatLvl 'L1 -> Bus 
-- compile err: 
addCategoryToBusiness (undefined :: Bus) (undefined :: CatLvl 'L0) 
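
A compilable sketch of the same idea, under some simplifying assumptions: `Bus` is a hypothetical stand-in that only stores category names, `Cat` uses `String` instead of `Text`, and the example values (`food`, `pizza`, `shop`) are invented:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Lvl = L0 | L1

newtype Cat = Cat { name :: String }

-- Hypothetical stand-in for the real business type.
newtype Bus = Bus { categories :: [String] }

-- The level is part of the type, so it is checked at compile time.
data CatLvl (lvl :: Lvl) where
  CatL0 :: Cat -> CatLvl 'L0
  CatL1 :: CatLvl 'L0 -> Cat -> CatLvl 'L1

-- Only level-1 categories are accepted; no runtime check is needed.
addCategoryToBusiness :: Bus -> CatLvl 'L1 -> Bus
addCategoryToBusiness (Bus cs) (CatL1 _parent c) = Bus (name c : cs)

food :: CatLvl 'L0
food = CatL0 (Cat "food")

pizza :: CatLvl 'L1
pizza = CatL1 food (Cat "pizza")

shop :: Bus
shop = addCategoryToBusiness (Bus []) pizza

-- The next line does not type-check, because food is a CatLvl 'L0:
-- broken = addCategoryToBusiness (Bus []) food
```

Note that `addCategoryToBusiness` only matches on `CatL1`: the type `CatLvl 'L1` rules out `CatL0`, so there is no other case to handle.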

IO monad

Why

  • separate pure functions and code with effects
  • easy tracking of where things can go wrong

How

  • IO monad
  • run as little code in IO as possible
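
One way to read "as little code in IO as possible" is a pure core with a thin IO shell. A minimal sketch, with hypothetical function names:

```haskell
import Data.Char (toUpper)

-- Pure core: no effects, so it can be tested without any setup.
shout :: String -> String
shout = map toUpper

greeting :: String -> String
greeting who = "Hello, " ++ shout who ++ "!"

-- Thin IO shell: the only place that touches the outside world.
greetUser :: IO ()
greetUser = do
  who <- getLine
  putStrLn (greeting who)
```

The types tell you where things can go wrong: `greeting` cannot read input or fail on I/O, `greetUser` can.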

STM

STM monad allows elegant locking and transactions

  • concurrency: compose small STM actions into larger ones that still run atomically
  • locking: all reads/writes in a transaction are atomic, no explicit locks needed
  • resources can be held/released; the locking comes out of the box.
  • Drop to IO:

    atomically :: STM a -> IO a 
    
  • great for maintaining a mutable state
    • Quasar pipelines
    • on-the-fly config. (used across svcs) (no redeployments/env-vars)
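
A small sketch of composing STM actions, using the `stm` package that ships with GHC. The `transfer` example and its names are illustrative and not taken from our codebase:

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, check, modifyTVar', newTVarIO, readTVar, readTVarIO)

-- Move n units between two shared counters inside one transaction.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  balance <- readTVar from
  check (balance >= n)          -- blocks (retries) until enough is available
  modifyTVar' from (subtract n)
  modifyTVar' to (+ n)

-- Two transfers composed into a single atomic action.
demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 10
  b <- newTVarIO 0
  atomically (transfer a b 3 >> transfer a b 4)
  (,) <$> readTVarIO a <*> readTVarIO b
```

No other thread can observe the state between the two transfers: either both happen or neither does.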

ReaderT pattern

Why

  • environment that we want to access or modify at runtime
    • configuration values
    • mutable state
    • manager for open http connections
  • don't want to pass all values as parameters
    => use ReaderT pattern

How

  • Reader monad Reader r a:
    • implicitly pass an immutable value of type r
    • get the value with

      ask :: Reader r r 
      
  • ReaderT monad transformer ReaderT r m a:
    • add Reader capabilities to another monad m

ReaderT pattern

  • ReaderT pattern:
    • transformer stack: ReaderT Env IO
    • Env datatype contains our application values
    • read-only values: standard attribute
    • read-write values: use a TVar (with STM from previous part)

=> Env stores read-only pointer, we modify the value at the pointer
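
A self-contained sketch of the pattern, assuming the `transformers` and `stm` packages that ship with GHC. The `Env` fields and `nextTicket` are hypothetical and much smaller than a real application environment:

```haskell
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVar)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical environment: one read-only value, one mutable value.
data Env = Env
  { envAppName :: String   -- read-only: a plain field
  , envCounter :: TVar Int -- read-write: a TVar, changed through STM
  }

type App = ReaderT Env IO

-- No Env parameter in sight: it is passed implicitly by ReaderT.
nextTicket :: App String
nextTicket = do
  name    <- asks envAppName
  counter <- asks envCounter
  n <- liftIO . atomically $ do
    modifyTVar' counter (+ 1)
    readTVar counter
  pure (name ++ "-" ++ show n)

demoTickets :: IO [String]
demoTickets = do
  env <- Env "santa" <$> newTVarIO 0
  runReaderT (sequence [nextTicket, nextTicket]) env
```

The `Env` value itself never changes; only the contents of the `TVar` it points to do.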

Example: Runtime environment

-- | The memory is simply a map of Ids to Hats
type Memory = Map.Map Id AnyHat

-- | The environment contains a mutable variable to the memory
data Env = -- data, not newtype: a newtype can only have one field
  Env
  { envMemory      :: TVar Memory
  , envName        :: String
  , envHttpManager :: HttpManager
  }

-- | Create an empty environment
emptyEnv :: IO Env
emptyEnv = do
  mem <- newTVarIO Map.empty
  let name = "my app name"
  mgr <- newHttpManager
  return $ Env mem name mgr

Example: ReaderT usage

putHat :: UnmatchedHat -> ReaderT Env IO Id
putHat h = do
  let anyhat = AnyHat h
  tvar <- envMemory <$> ask
  lift . atomically $ stateTVar tvar $ \mem ->
    let n = Map.size mem
    in  (n, Map.insert n anyhat mem)

Example

Secret Santa API server:

Tooling

Stack, cabal

  • Cabal:
    • standard Haskell build tool
    • takes code, tries to solve dependencies, tries to build
    • cabal hell
  • Stack:
    • tries to solve cabal hell
    • uses snapshot
      => set of packages that are known to work together
    • Just Works™

Nix

Nix

  • package manager for Linux/UNIX
  • reproducible packages
  • easy build environments
  • same build environment on different machines
    => just copy the configuration.nix to the other machine and build
  • NixOS: Linux distro built around Nix

Derivations

  • specify how packages are built
  • are 'programmed' in the Nix language
    => pure functional language

IDE/Editors

Editor Integration:

Name                 Editor(s)
Haskell IDE Engine   LSP
Intero (EOL)         Emacs
dante                Emacs
ghci                 CLI
ghcid                CLI

Formatting/linting

  • Lint: hlint
  • Formatters:
    • stylish-haskell
    • brittany
    • ormolu

Starting as a Fresh Graduate

Downsides

  • steep learning curve
  • be prepared to understand far too little of the code you read
    • Haskell extensions
    • custom operators
    • weird type stuff
  • not a lot of job opportunities in Belgium
    => we'd like to change that

Upsides

  • compiler gives you confidence when trying new stuff
  • saves you a lot of debugging
  • you do not want to know how many unit tests we have (hint: far too few)
  • beautiful

Outroduction