GSoC 2017 - Embedding Biodalliance in a Purescript app

Posted on July 27, 2017 by Christian Fischer

The genome/genetics browser I am building makes use of Biodalliance (BD) to visualize data that has a position on the genome. While an end goal is to have a genome browser component written in Purescript, using Biodalliance like this lets us quickly reach something useful. Being compatible with existing BD tracks is a nice bonus.

The code for this project can be found on GitHub: https://github.com/chfi/purescript-genetics-browser

Screenshot of BD component

Screenshot of Cy.js component

Here the Cytoscape graph has been filtered by an event sent from the Biodalliance browser.

Halogen

The browser uses Halogen, a UI library that is reminiscent of React. Web pages are created by composing components, and those components must compose in such a way that their types fit together – a parent component must know what its children can do, for example. Components also each keep their own state, and can send and receive messages from their parent.

Currently, our browser consists of a main container component, a Biodalliance component, and a Cytoscape.js component. The container has the other two as children, and can communicate with them.

This post will explain only some of the very basics of Halogen; for more information, see the Halogen guide.

Main component

The main function of the entire app takes a configuration as argument, and contains the BD and Cytoscape.js (Cy.js) child components, initializing them with the data described in the track configurations (described later in this post). This part also takes care of inter-track communication, routing events between tracks.

Other than the BD and Cy.js components, there is also a basic GUI, with buttons for scrolling the BD view.

Let’s have a look at the Biodalliance component.

BD component

The BD component’s Query, Message and State types are quite simple:

data Query a
  = Scroll Bp a
  | Jump Chr Bp Bp a
  | Initialize (forall eff. HTMLElement -> Eff (bd :: BD | eff) Biodalliance) a

data Message
  = Initialized
  | SendBD Biodalliance

type State = { bd :: Maybe Biodalliance }

In words, this says the component can be scrolled by some number of basepairs, jump to a position on some chromosome, as well as take a callback to create the Biodalliance browser instance. It can send a couple of messages to the parent, and keeps track of the BD browser instance in its state.

The component also has an `eval` function, which evaluates the Query algebra to actions – each data constructor in the query type can be seen as a “command” which the component can understand. The `eval` function, then, describes what the component actually does when it receives such a command, e.g. updating the component state, sending messages, performing AJAX requests, etc.

`eval` has the following type:

eval :: Query ~> H.ComponentDSL State Query Message (Aff eff)

Which is a lot of words to say that the function works in a monad which can read and update the component state, raise other queries, send messages to the parent component, as well as perform actions in the Aff (asynchronous effects) monad.
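To make the idea of a query algebra concrete, here is a minimal sketch that leaves Halogen out entirely: a cut-down query type and a pure interpreter over a simplified state. The `View` record, the `String` and `Int` stand-ins for `Chr` and `Bp`, and the pure `eval` signature are all assumptions made for illustration; the real component works in `H.ComponentDSL` and calls into Biodalliance rather than updating a record.

```purescript
module Example.QueryAlgebra where

import Prelude

import Data.Maybe (Maybe(..))

-- Hypothetical stand-in for the browser instance: just the visible region.
type View = { chr :: String, minPos :: Int, maxPos :: Int }

-- Mirrors the shape of the real State: Nothing until the browser exists.
type State = { bd :: Maybe View }

-- Each constructor is a command the component understands.
data Query
  = Scroll Int
  | Jump String Int Int

-- A pure interpreter with one case per command. Both commands leave the
-- state untouched while it holds Nothing (an assumption for Scroll; the
-- Jump handler in this post behaves that way).
eval :: Query -> State -> State
eval (Scroll n) st =
  st { bd = map (\v -> v { minPos = v.minPos + n, maxPos = v.maxPos + n }) st.bd }
eval (Jump chr lo hi) st =
  st { bd = map (const { chr: chr, minPos: lo, maxPos: hi }) st.bd }

initial :: State
initial = { bd: Just { chr: "chr1", minPos: 0, maxPos: 1000 } }

-- Scroll right by 500 basepairs, then jump to another chromosome.
example :: State
example = eval (Jump "chr11" 3000 4000) (eval (Scroll 500) initial)
-- example.bd is Just { chr: "chr11", minPos: 3000, maxPos: 4000 }
```

The Halogen version has the same structure, except that each case produces effects in the component monad instead of a new record.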

Some of the BD component’s `eval` function follows. First, creating the Biodalliance browser:

Initialize browser next -> do
  -- get the HTML element (created by the Halogen component in the renderer)
  H.getHTMLElementRef (H.RefLabel "bd") >>= case _ of
    Nothing -> pure unit
    Just el -> do
      -- Run the browser constructor & bind the resulting instance
      bd <- liftEff $ browser el
      -- Update the component state with the new BD instance
      H.modify (_ { bd = Just bd })
  pure next

Here `browser` is the function carried by the `Initialize` query: it takes an HTML element and creates a BD browser instance which renders to it. The state is then updated with that instance. As the effects in this `eval` function take place in the Aff monad, we need to use the `liftEff` function to lift the `browser el` action from Eff to Aff.

Next, the part that moves the browser view in response to a Jump query. An action is only taken if the component state actually contains a BD instance – i.e. if the browser has been initialized. If it has, the appropriate function is called on said instance. Again, we lift the function from Eff to Aff with liftEff.

Jump chr minPos maxPos next -> do
  -- Fetch the `bd` value from the component state
  mbd <- H.gets _.bd
  case mbd of
    Nothing -> pure next
    Just bd -> do
      liftEff $ Biodalliance.setLocation bd chr minPos maxPos
      pure next

Events

Biodalliance can produce events, and we also want the BD instance to be able to respond to events from other tracks. The events in this case are specific to the data in the track; for example we may be interested in the user clicking on a node in Cytoscape that has a location (chromosome + basepair) related to it – say we want to scroll BD to the corresponding position.

BD then needs a corresponding event handler, with a type signature like this:

handleLocation :: forall eff
                . Variant (location :: {chr :: Chr, pos :: Bp})
               -> Eff (bd :: BD | eff) Unit

Multiple handlers are combined using purescript-variant’s features, matching on each case by its label:

-- given handlers in a record { location :: Location -> Eff _ Unit
--                            , range :: Range -> Eff _ Unit }
default (pure unit)
  # on (SProxy :: SProxy "location") location
  # on (SProxy :: SProxy "range") range
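As a self-contained illustration of this pattern, the sketch below combines two handlers into one function over a `Variant`. To keep it pure, the handlers return a `String` instead of running in `Eff`. The `Location` and `Range` records (with `String` and `Int` standing in for `Chr` and `Bp`) and the extra `zoom` case are hypothetical. It assumes the 2017-era purescript-variant API with `SProxy` from `Data.Symbol`; newer releases use `Proxy` from `Type.Proxy` instead.

```purescript
module Example.VariantHandlers where

import Prelude

import Data.Symbol (SProxy(..))
import Data.Variant (Variant, default, inj, on)

-- Simplified stand-ins for the event payloads.
type Location = { chr :: String, pos :: Int }
type Range = { chr :: String, minPos :: Int, maxPos :: Int }

-- "zoom" is a hypothetical event that no handler is registered for.
type Event = Variant (location :: Location, range :: Range, zoom :: Number)

onLocation :: Location -> String
onLocation l = "scroll to " <> l.chr <> ":" <> show l.pos

onRange :: Range -> String
onRange r = "show " <> r.chr <> ":" <> show r.minPos <> "-" <> show r.maxPos

-- Each `on` peels one label off the variant; `default` handles the rest.
describe :: Event -> String
describe =
  default "ignored"
    # on (SProxy :: SProxy "location") onLocation
    # on (SProxy :: SProxy "range") onRange

located :: String
located = describe (inj (SProxy :: SProxy "location") { chr: "chr11", pos: 1000 })
-- "scroll to chr11:1000"

zoomed :: String
zoomed = describe (inj (SProxy :: SProxy "zoom") 2.0)
-- "ignored"
```

Because of the `default` case, a track can safely receive events it has no handler for, which is what lets several tracks share a bus.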

Events are passed to the handler using a bus from purescript-aff-bus; the main function of the app takes care of creating the buses and running the handlers on them.

We can also create handlers that, when the BD instance fires an event, push the event to the outgoing event bus. For example, given a way to parse JSON objects (events as sent from BD) to a range on a chromosome, as well as a BD instance and an event bus, we can attach the handler like so:

subscribeBDEvents :: { range :: JObject -> Maybe Range }
                  -> Biodalliance
                  -> BusRW (Variant (range :: Range))
                  -> Eff _ Unit
subscribeBDEvents {range} bd bus =
  Biodalliance.addFeatureListener bd $ \obj -> do
    case range obj of
      Nothing -> pure unit
      Just ran -> do
        _ <- Aff.launchAff $ Bus.write (inj (SProxy :: SProxy "range") ran) bus
        pure unit

There’s an analogous function for subscribing to Cy.js events.

In the future the user will be able to painlessly describe the desired interactions and event flows between tracks, letting the program manage the actual routing details. Currently the routing is hardcoded, though it is simple (if clunky) enough to wire these examples together:

busFromBD <- Bus.make
busFromCy <- Bus.make

io.subscribe $ CR.consumer $ case _ of
  BDInstance bd -> do
    liftEff $ log "attaching BD event handlers"
    _ <- createBDHandler { location: locationHandlerBD } bd busFromCy
    _ <- liftEff $ subscribeBDEvents { range: parseRangeEventBD } bd busFromBD
    pure Nothing
  _ -> pure $ Just unit
io.query $ H.action $ CreateBD mkBd

io.subscribe $ CR.consumer $ case _ of
  CyInstance cy -> do
    liftEff $ log "attaching Cy event handlers"
    _ <- createCyHandler { range: rangeHandlerCy } cy busFromBD
    _ <- liftEff $ subscribeCyEvents { location: parseLocationEventCy } cy busFromCy
    pure Nothing
  _ -> pure $ Just unit
io.query $ H.action $ CreateCy cyElemsUrl

This diagram hopefully illustrates the app hierarchy and data flow:

App hierarchy and data flow

Biodalliance configuration

This is the function that instantiates the BD browser:

initBD :: forall eff.
          Options Biodalliance
       -> RenderWrapper
       -> BrowserConstructor
       -> HTMLElement -> Eff (bd :: BD | eff) Biodalliance
initBD opts = initBDimpl (BDOptions $ options opts)

It takes a set of options used to configure the BD instance, as well as a couple of helper functions (RenderWrapper and BrowserConstructor). It returns a function that takes an HTML element, fills it with the BD browser instance, and returns that instance.

The corresponding FFI implementation is (cut down here):

exports.initBDimpl = function(opts) {
    return function(wrapRenderer) {
        return function(browser) {
            return function(el) {
                return function() {
                    var renderers = {};
                    opts.renderers.forEach(function(r) {
                        renderers[r.name] = wrapRenderer(r.renderer, r.canvasHeight);
                    });

                    var sources = opts.sources;

                    var b = new browser({
                        // ...
                        // ... boring extra configuration
                        // ...

                        injectionPoint: el,
                        externalRenderers: renderers,
                        sources: sources
                    });

                    return b;
                };
            };
        };
    };
};

The details of Biodalliance’s configuration options can be found in the Biodalliance documentation; we only expose a few of them to user customization. Note also that `initBD` doesn’t explicitly return a function from an HTMLElement to a Biodalliance instance – this is automatically the case thanks to currying.

BD Tracks

The configuration system is compatible with existing Biodalliance track configurations. The configurations are read from JSON, validated, then converted back to a Foreign value. An array of these validated track configs is then sent as one of the options when creating the BD instance.

There are a couple of types used to manage the track configurations:

newtype TracksMap = TracksMap (StrMap Json)

data TrackType = BDTrack | CyGraph

getConfigs :: TracksMap -> TrackType -> Either String (Array Json)
getConfigs (TracksMap ts) tt =
  maybe (Left $ "No track config of type: " <> (show tt)) Right $
        ts ^? ix (show tt) <<< _Array

A value of type `TracksMap` corresponds to a map from a TrackType to an array of track configurations of the corresponding type. The underlying representation, `StrMap Json`, is simply a JSON object – we know that it has strings for keys, but each value needs to be parsed from JSON before using it. This parsing is done using a couple of prisms (using purescript-profunctor-lenses), first seeing if there are any configurations of the given track type, then, if there are, parsing those as Arrays.
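For readers unfamiliar with prisms, the same two-step lookup can be written with plain pattern matching. This is a sketch of an equivalent technique, not the code used in the browser: the tiny `Json` type, the use of `Data.Map` with `String` keys instead of `StrMap` and `TrackType`, and the sample track names are all assumptions made so the example stands alone.

```purescript
module Example.Lookup where

import Prelude

import Data.Either (Either(..))
import Data.Map (Map)
import Data.Map as Map
import Data.Maybe (Maybe(..))
import Data.Tuple (Tuple(..))

-- Hypothetical, heavily simplified JSON type.
data Json
  = JString String
  | JArray (Array Json)

-- Step 1: is there an entry under this key? Step 2: is it an array?
-- Failing either step gives the same error, as with the composed prisms.
getConfigs :: Map String Json -> String -> Either String (Array Json)
getConfigs ts key = case Map.lookup key ts of
  Just (JArray confs) -> Right confs
  _ -> Left ("No track config of type: " <> key)

tracks :: Map String Json
tracks = Map.fromFoldable
  [ Tuple "BDTrack" (JArray [ JString "genes", JString "QTL" ])
  , Tuple "CyGraph" (JString "not an array")
  ]

-- getConfigs tracks "BDTrack" is a Right holding the two configs
-- getConfigs tracks "CyGraph" is Left "No track config of type: CyGraph"
-- getConfigs tracks "Other"   is Left "No track config of type: Other"
```

The prism version expresses the same two checks as a single composed optic, which stays short as more layers of JSON structure are added.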

The whole pipeline consists of a handful of functions. We have a very basic function for validating the configuration of a Biodalliance track (Cytoscape has a similar one):

newtype BDTrackConfig = BDTrackConfig Json

validateBDConfig :: Json -> Either String BDTrackConfig
validateBDConfig json = case json ^? _Object <<< ix "name" of
  Nothing -> Left $ "BD track config does not have a name"
  Just _  -> Right $ BDTrackConfig json

At this point we need a function to validate all possible track configurations, and we want to know which ones worked and which ones did not. ValidatedConfigs is a type alias that helps clean things up, together with the foldErrors function:

foldErrors :: forall e r
            . Array (Either e r)
           -> { errors :: Array e
              , results :: Array r
              }
foldErrors = foldr (\c confs@{errors, results} ->
                                case c of Left  e -> confs { errors  = (e : errors)  }
                                          Right r -> confs { results = (r : results) }
                   ) { errors: [], results: [] }

`foldErrors` reduces the given Array of Eithers, putting all Lefts in “errors” and all Rights in “results”. With this we can easily create a function that takes a TracksMap, some TrackType, and a (possibly erroring) function to apply to the configurations of that TrackType, returning a ValidatedConfigs:

type ValidatedConfigs a = { errors :: Array String
                          , results :: Array a
                          }

foldConfig :: forall a
             . TracksMap
            -> TrackType
            -> (Json -> Either String a)
            -> ValidatedConfigs a
foldConfig trackMap trackType f = foldErrors $ map (_ >>= f) $ sequence $ getConfigs trackMap trackType
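The least obvious step in `foldConfig` is `sequence`, which turns the `Either String (Array Json)` returned by `getConfigs` into an `Array (Either String Json)`, so that a failed lookup becomes a single error entry. The sketch below traces this with integers instead of JSON. `validatePositive` and `foldLooked` are hypothetical names, and `foldErrors` is restated with a helper function so the example stands alone.

```purescript
module Example.FoldConfig where

import Prelude

import Data.Array ((:))
import Data.Either (Either(..))
import Data.Foldable (foldr)
import Data.Traversable (sequence)

-- Same behaviour as foldErrors in the post, written with a helper.
foldErrors :: forall e r. Array (Either e r) -> { errors :: Array e, results :: Array r }
foldErrors = foldr step { errors: [], results: [] }
  where
  step (Left e) acc = acc { errors = e : acc.errors }
  step (Right r) acc = acc { results = r : acc.results }

-- Hypothetical validator standing in for validateBDConfig.
validatePositive :: Int -> Either String Int
validatePositive n
  | n > 0 = Right n
  | otherwise = Left ("not positive: " <> show n)

-- foldConfig with the getConfigs result passed in directly.
foldLooked
  :: forall a b
   . Either String (Array a)
  -> (a -> Either String b)
  -> { errors :: Array String, results :: Array b }
foldLooked looked f = foldErrors $ map (_ >>= f) $ sequence looked

-- Lookup succeeded: each element is validated on its own.
found :: { errors :: Array String, results :: Array Int }
found = foldLooked (Right [ 1, 0, 3 ]) validatePositive
-- { errors: ["not positive: 0"], results: [1, 3] }

-- Lookup failed: sequence yields [Left msg], so the message is kept.
missing :: { errors :: Array String, results :: Array Int }
missing = foldLooked (Left "No track config of type: BDTrack") validatePositive
-- { errors: ["No track config of type: BDTrack"], results: [] }
```

Either way the caller gets the same record shape back, so a missing track type and an invalid track are reported through the same `errors` field.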

Given a TracksMap, we can now apply the corresponding validation functions to the various TrackTypes we index the map with:

validateConfigs :: TracksMap
                -> { bdTracks :: ValidatedConfigs BDTrackConfig
                   , cyGraphs :: ValidatedConfigs CyGraphConfig
                   }
validateConfigs tracksMap = { bdTracks
                            , cyGraphs
                            }
  where bdTracks = foldConfig tracksMap BDTrack validateBDConfig
        cyGraphs = foldConfig tracksMap CyGraph validateCyConfig

We then end up with the configurations for the various track types split up, ready to be piped into e.g. the Biodalliance constructor – and the errors can be dealt with as well.