GSoC 2017 - Type-safe events in the Genetics Browser

Posted on August 24, 2017 by Christian Fischer

In the previous post I described how I connected a Biodalliance track with a Cytoscape.js graph, letting the Cy.js graph respond to the user interacting with BD tracks and vice versa. I did this by writing functions that produced and consumed Variants, and by manually writing functions to route the data between the tracks.

This post describes my attempt at a more flexible and reusable solution.

The Goal

The goal is to let users easily define which events a track publishes and which events it subscribes to. Publishing is configured by providing a mapping from raw track events to some other type plus a label; subscribing is configured by providing a callback function that is called with the appropriate event when one is received.

Currently the system consists of two types, TrackSource and TrackSink. The implementation will likely change, especially as I continue to learn the language, but the general concepts and structure should stay as they are. The general idea is that sources map raw events to polymorphic variants which are then consumed by sinks; i.e. we don’t care about what data flows between sources and sinks, as long as the types line up.

TrackSource

Conceptually, a TrackSource is a list of parsers which are run on the raw event data from e.g. Biodalliance. When BD produces an event (for example after the user has clicked a feature in the track), the TrackSource can be applied to produce parsed objects.

The type definition:

data TrackSource input (rOut :: # Type) = TrackSource (List (input -> Maybe (Variant rOut)))

Since the row of possible outputs appears in the type, we can later make sure that we only connect compatible sources and sinks.

Constructing TrackSources

An empty TrackSource simply wraps the empty list:

emptyTrackSource :: forall input.
                    TrackSource input ()
emptyTrackSource = TrackSource mempty

Given a label, a function to parse a raw event, and an existing TrackSource, we can create a new TrackSource that produces another possible output. We can only do so if the given TrackSource doesn’t already produce an event with the label we’re providing; that’s what the `RowLacks` and `RowCons` constraints ensure. The `Union` constraint ensures that the output row of the new TrackSource is a superset of the output row of the given TrackSource.

appendTrackSource :: forall l a b rOut1 r rOut2.
                     Union rOut1 r rOut2
                  => RowLacks l rOut1
                  => RowCons l b rOut1 rOut2
                  => IsSymbol l
                  => SProxy l
                  -> (a -> Maybe b)
                  -> TrackSource a rOut1
                  -> TrackSource a rOut2
appendTrackSource l f (TrackSource h) = TrackSource $ f' : (map <<< map <<< map) expand h
  where f' :: a -> Maybe (Variant rOut2)
        f' a = inj l <$> f a

In words, the code lifts the given parser function into producing Variants, by mapping over the parser result, and then prepends the new function to the existing list of parsers.

`expand` is used to expand the type of the existing list; `h` has type

List (a -> Maybe (Variant rOut1))

but since `f'` produces `Variant rOut2`s, every parser in the list must do the same, so we need `expand` to widen their results. Since the Variant produced is in a Maybe produced by a function in a list, we need to map three levels deep, hence `map <<< map <<< map`.

Using TrackSources

A TrackSource can be run on appropriate input to produce a list of successful parses:

applyTrackSource :: forall a rOut.
                    TrackSource a rOut
                 -> a
                 -> List (Variant rOut)
applyTrackSource (TrackSource h) a = mapMaybe (\f -> f a) h

It applies each parser in the TrackSource to the input and keeps only the successful parses, discarding the Nothings.
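As a minimal sketch of how these pieces might fit together, the following builds a TrackSource with two parsers and runs it on one event. It assumes it lives in the same module as the definitions above and uses the same PureScript 0.11-era libraries as the rest of this post. `RawEvent`, `parseName`, `parsePosition`, `exampleSource`, `describe` and `parsed` are hypothetical names made up for illustration; real Biodalliance events will look different.

```purescript
-- Imports this fragment needs, in addition to those of the code above.
import Prelude
import Data.List (List)
import Data.Maybe (Maybe(..))
import Data.Symbol (SProxy(..))
import Data.Variant (Variant, case_, on)

-- Hypothetical raw event; a stand-in for whatever BD actually emits.
type RawEvent = { name :: Maybe String, position :: Maybe Int }

parseName :: RawEvent -> Maybe String
parseName ev = ev.name

parsePosition :: RawEvent -> Maybe Int
parsePosition ev = ev.position

-- Each append adds one label to the output row.
exampleSource :: TrackSource RawEvent (name :: String, position :: Int)
exampleSource =
  appendTrackSource (SProxy :: SProxy "name") parseName
    $ appendTrackSource (SProxy :: SProxy "position") parsePosition
    $ emptyTrackSource

-- Render a parsed event; `on` handles one label at a time and
-- `case_` closes the match once every label is covered.
describe :: Variant (name :: String, position :: Int) -> String
describe =
  case_
    # on (SProxy :: SProxy "name") (\n -> "name: " <> n)
    # on (SProxy :: SProxy "position") (\p -> "position: " <> show p)

-- Only the name parser succeeds on this event, so the list has one element.
parsed :: List String
parsed =
  map describe
    (applyTrackSource exampleSource { name: Just "rs123", position: Nothing })
```

Appending a second parser under the label "name" to `exampleSource` should be rejected by the compiler, since the `RowLacks` constraint can no longer be satisfied.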

TrackSink

The other end is TrackSink, which can be connected to a track to receive events and perform actions on said track. The type definition:

data TrackSink (rIn :: # Type) (rFun :: # Type) out = TrackSink (Record rFun)

`rIn` corresponds to the possible inputs to the track, `out` is the output type (often some Eff if we actually want to do something effectful on the track), and `rFun` holds the functions that transform the input to the output. Note that `rIn` and `out` are both phantom types – they don’t appear in the actual value, but let us easily make sure we’re sending the correct data to the TrackSink, and that all the functions have the correct output.

Constructing TrackSinks

The empty TrackSink is a wrapper over the empty record, which cannot handle any input:

emptyTrackSink :: forall b.
                  TrackSink () () b
emptyTrackSink = TrackSink {}

Just like with TrackSource, we can extend an existing TrackSink by providing a label and function, and only if the TrackSink doesn’t already handle that label:

appendTrackSink :: forall l a b rIn1 rIn2 rFun1 rFun2.
                   RowLacks l rIn1
                => RowLacks l rFun1
                => RowCons l a rIn1 rIn2
                => RowCons l (a -> b) rFun1 rFun2
                => IsSymbol l
                => SProxy l
                -> (a -> b)
                -> TrackSink rIn1 rFun1 b
                -> TrackSink rIn2 rFun2 b
appendTrackSink l f (TrackSink r) = TrackSink $ insert l f r

The type constraints make sure the new label doesn’t exist in the given TrackSink.

The implementation is not much more complex than that of emptyTrackSink; the interesting things happen in the type signature. First we make sure the label we want to handle isn’t already in either the input row or the function row of the TrackSink. Then, using RowCons, we add the label and input type to the input row, and the label and function type to the function row. The result is a TrackSink with the same output type as the first, and whose input and function rows each have one extra label.

Using TrackSinks

The function `applyTrackSink` takes a TrackSink and a Variant containing something in the sink’s input row, and produces a value of the output type. The implementation is not as nice as the construction (it is basically copied from purescript-variant’s `match`):

applyTrackSink :: forall lt a rIn rFun b.
                  Union lt a rIn
               => TrackSink rIn rFun b
               -> Variant lt
               -> b
applyTrackSink (TrackSink h) v =
  case coerceV v of
    Tuple tag a -> a # unsafeGet tag h
  where coerceV :: forall c. Variant lt -> Tuple String c
        coerceV = unsafeCoerce

The type signature says that the Variant can be any subset of the TrackSink’s input row – since TrackSinks are always created using functions that make sure that the input row and function row line up, there is no risk of applying a TrackSink to a value of an incorrect type.
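To make the sink side concrete, here is a small sketch that builds a TrackSink with two handlers and applies it to one event. As before, it assumes the definitions above are in scope in the same module. `Range`, `describeRange`, `describeName`, `exampleSink` and `message` are hypothetical, and the handlers return plain Strings rather than Eff actions to keep the example pure.

```purescript
-- Imports this fragment needs, in addition to those of the code above.
import Prelude
import Data.Symbol (SProxy(..))
import Data.Variant (Variant, inj)

-- Hypothetical payload type, for illustration only.
type Range = { chr :: String, from :: Int, to :: Int }

describeRange :: Range -> String
describeRange r = "scroll to " <> r.chr <> ":" <> show r.from <> "-" <> show r.to

describeName :: String -> String
describeName n = "highlight " <> n

-- The input row and the function row grow together, one label per append.
exampleSink :: TrackSink (range :: Range, name :: String)
                         (range :: Range -> String, name :: String -> String)
                         String
exampleSink =
  appendTrackSink (SProxy :: SProxy "range") describeRange
    $ appendTrackSink (SProxy :: SProxy "name") describeName
    $ emptyTrackSink

-- The Variant's row only needs to be a subset of the sink's input row.
-- The annotation gives the compiler a concrete row to check against.
message :: String
message =
  applyTrackSink exampleSink
    (inj (SProxy :: SProxy "name") "BRCA2" :: Variant (name :: String))
-- message is "highlight BRCA2"
```

Passing a Variant with a label the sink does not know, say `Variant (score :: Number)`, should fail to compile, because no row can be added to `(score :: Number)` to produce the sink's input row.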

Future

This is a start, but there are plenty of improvements and extensions I want to make. Combining and constructing sources or sinks should be easier; for example, it should be possible to combine two existing TrackSources, rather than append one handler at a time to a single existing TrackSource. This should be possible using the relatively new RowToList support.

There are also two important parts that I haven’t implemented yet. On the user side, it should be easy to configure what events are published by tracks, as well as define interactions when a track receives a given event. Also missing is a central manager or other way to tie sources and sinks together; it must route events produced asynchronously by external (BD or Cy.js, for example) sources, into the appropriate handler functions defined by the set of sinks.
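This is not the manager described above, but the simplest synchronous connection between a single source and a single sink can already be sketched with the functions from this post. `routeEvent` is a hypothetical helper, assumed to live in the same module as the definitions above:

```purescript
-- Imports this fragment needs, in addition to those of the code above.
import Prelude
import Data.List (List)

-- Hypothetical helper: parse a raw event with a source and hand every
-- successful parse to a sink. The Union constraint only holds when every
-- label the source can produce is also accepted by the sink.
routeEvent :: forall input rOut r rIn rFun b.
              Union rOut r rIn
           => TrackSource input rOut
           -> TrackSink rIn rFun b
           -> input
           -> List b
routeEvent source sink ev =
  map (applyTrackSink sink) (applyTrackSource source ev)
```

If `b` is an effect type, the result is a list of actions that still have to be run; deciding when and where to run them would be the job of the manager.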