GSoC - Final update

Posted on August 22, 2016 by Christian Fischer
Tags: gsoc

My work

My work this summer was spread across three repos on GitHub. Links to my commits follow:

Biodalliance

gn_server

GeneNetwork2

During the summer the course of the project shifted somewhat, with a larger focus on rewriting parts of Biodalliance (BD) than was initially intended, but in the end I’m very happy with the results. I basically accomplished what I set out to do, despite spending a considerable amount of time on BD and relatively little on gn_server.

Biodalliance

The Biodalliance renderer rewrite, detailed here, is complete, with extensive passing tests comparing it to the original renderer, meaning it is fit to replace the old renderer. The tests are described in this blog post, though I have since extended them by making sure to test as many types of features and styles as possible. This is done by using a test source I wrote, which can be configured to return any sort of feature to the browser.

Other than the renderer, my work was largely focused on adding sources to support the various data that we want to view in the embedded browser in GeneNetwork2. This consisted of adding support for CSV files, for reading R/QTL data, and writing various stylesheets which are applied to the tracks, making them look like they should. I also added a QTL plot, which uses the lineplot already in BD.

gn_server

On gn_server, the Elixir REST API that works as the main channel of communication between the embedded BD browser and the GN2 backend, I added routing for the various types of data served to BD, static file serving, as well as calculating and serving SNP density data in JSON format.
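As a rough illustration of the SNP density calculation, the idea can be sketched as counting SNP positions in fixed-width bins along a chromosome and serializing the result as JSON. This is a minimal Python sketch, not the gn_server code (which is written in Elixir); the function name, the 1-based position convention, and the `min`/`max`/`score` field names are assumptions made for the example.

```python
import json
from collections import Counter


def snp_density(positions, chrom_length, bin_size):
    """Count SNPs in consecutive fixed-width bins covering one chromosome."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Positions are assumed to be 1-based, so position p falls in
    # bin (p - 1) // bin_size. Out-of-range positions are ignored.
    counts = Counter(
        (p - 1) // bin_size for p in positions if 1 <= p <= chrom_length
    )
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    return [
        {
            "min": i * bin_size + 1,
            "max": min((i + 1) * bin_size, chrom_length),
            "score": counts.get(i, 0),
        }
        for i in range(n_bins)
    ]


# Toy data: four SNPs on a chromosome of length 100, binned in windows of 25.
density = snp_density([5, 12, 18, 95], chrom_length=100, bin_size=25)
# Hypothetical response body; the real JSON layout may differ.
payload = json.dumps({"chr": "1", "features": density})
```

Empty bins are kept with a score of zero so that the client can draw a continuous track without having to fill gaps itself.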

GeneNetwork2

My work on GeneNetwork2 was mainly embedding the Biodalliance browser in it - tying the knot between BD and gn_server in the intended environment. The browser works well and is integrated in the interface that already existed in GN2.

Embedding the browser itself consisted of little more than adding a div to the page in question - most of the work was in making it behave well with the rest of the page, as well as making sure it got the correct data.

The page already had a clickable chart showing a QTL plot, complete with zooming in to show a single chromosome. What I did was add a button to swap between the BD view and the previously existing chart, when at the chromosome level.

Single Chromosome Vector Map


The page BD was to be embedded in consisted of over forty thousand lines of HTML - the vast majority of which were the data used by the charts already on the page. This was changed so that the data is served as separate JSON and CSV files, the latter of which are consumed by BD.
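To make the idea of moving data out of the page concrete, here is a small Python sketch that writes QTL-style rows to CSV text and parses them back into feature-like records, roughly what a CSV-consuming source has to do. The column names (`chr`, `pos`, `lod`) and both helper functions are hypothetical; the actual files served by GN2 and the BD CSV source may be laid out differently.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_features(text):
    """Parse CSV text into records with typed position and score fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"chr": row["chr"], "pos": int(row["pos"]), "score": float(row["lod"])}
        for row in reader
    ]


# Toy data standing in for values that used to be inlined in the HTML.
qtl_rows = [
    {"chr": "1", "pos": 3000000, "lod": 1.25},
    {"chr": "1", "pos": 3500000, "lod": 4.5},
]
csv_text = rows_to_csv(qtl_rows, ["chr", "pos", "lod"])
features = csv_to_features(csv_text)
```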

Below is a screenshot of Biodalliance as currently embedded in GN2. There are some visual bugs still to be ironed out, mainly in the top track, but the functionality is there.

Biodalliance in GeneNetwork2


I also wrote a simple wrapper around BD, to simplify creating/opening the browser and adding sources to it. The goal is to make adding a source to the page embedding BD as simple and painless as possible, ideally requiring little or no knowledge of how BD works. For now, the function that adds a source simply takes a single BD source, as described on the BD homepage.

Future Work

On the BD side, there are a fair number of bugs to stamp out, as well as some UX improvements to be made. Specific to its use in GN2, the R/QTL genotype plot is a bit buggy. There are also some improvements to be made to the zooming function. For example, at the moment the SNP density track (as seen in the figure above) is calculated only once, for the entire chromosome. This means that the resolution is, by computational necessity, very low. Preferably, the track should fetch new data from the server when zooming.
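One way the refetch-on-zoom idea could work is to keep the number of bins roughly constant and derive the bin width from the visible region, so that zooming in yields finer resolution at a similar cost. The Python sketch below illustrates this under stated assumptions: `target_bins`, both function names, and the inclusive 1-based coordinates are hypothetical, and nothing like this exists in BD or gn_server yet.

```python
def bin_size_for_view(view_start, view_end, target_bins=200):
    """Pick a bin width so the visible region is split into about target_bins bins."""
    span = view_end - view_start + 1
    return max(1, -(-span // target_bins))  # ceiling division, at least 1 bp


def density_for_view(positions, view_start, view_end, target_bins=200):
    """Count SNPs per bin, restricted to the visible region (inclusive)."""
    size = bin_size_for_view(view_start, view_end, target_bins)
    n_bins = -(-(view_end - view_start + 1) // size)
    counts = [0] * n_bins
    for p in positions:
        if view_start <= p <= view_end:
            counts[(p - view_start) // size] += 1
    return size, counts


snps = [1, 50, 51, 100, 500]
# Whole region 1..100 in 4 bins: each bin is 25 bp wide.
whole = density_for_view(snps, 1, 100, target_bins=4)
# Zoomed in to 41..60 with the same number of bins: each bin is 5 bp wide.
zoomed = density_for_view(snps, 41, 60, target_bins=4)
```

With a scheme like this, the server does a similar amount of work per request regardless of zoom level, at the price of one extra round trip whenever the view changes.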

As for gn_server, it is still a very early project, and much functionality is still to be added. At the moment, for example, QTL plot data is served to BD from GN2, though gn_server should be serving it. Basically, there’s lots of interesting and exciting work to be done!

Acknowledgements

This has been an incredibly fun and fulfilling summer for me, and I’m very grateful to Google and the Open Bioinformatics Foundation for this opportunity.

I must also give thanks to my mentors, Pjotr Prins and Danny Arends. Thanks to everyone who has given me feedback on the BD and GN2 mailing lists, as well.