S4: a short guide for the perplexed

I recently attended the Bioconductor 2019 conference in New York City, where I was lucky enough to give a workshop on my Bioconductor package plyranges and present some new ideas I’m working on for range-based summarisation and visualisation. After some discussion with both Bioconductor veterans and newcomers, there was general agreement that it was hard to find good resources, or even a beginner’s guide, for learning S4. This blog post is an attempt to rectify that.

Rookie mistakes and how to fix them when making plots of data

In this assignment, the focus was to practice data cleaning. Students suggested questions to build a class survey, to get to know the interests of other class members, and then completed the resulting survey. After cleaning the data, they made a few summary plots of interesting aspects of the data. There are some common mistakes that rookies often make when constructing data plots: packing too much into a single graphic, leaving categorical variables unordered, reversing the norms for response and explanatory variables, conditioning in the wrong order, plotting counts when proportions should be the focus, not normalizing by counts, and using a boxplot for a small sample.
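To make the "counts versus proportions" mistake concrete, here is a minimal sketch in Python. The survey question, group names, and numbers are all made up for illustration and are not taken from the class survey. It shows how raw counts can point the opposite way from proportions when group sizes differ.

```python
from collections import Counter

# Hypothetical survey responses as (year_of_study, owns_pet) pairs.
# These values are invented for illustration only.
responses = (
    [("first", True)] * 12 + [("first", False)] * 28
    + [("third", True)] * 6 + [("third", False)] * 4
)


def pet_owner_summary(rows):
    """Return {group: (count_of_yes, proportion_of_yes)} for each group."""
    totals = Counter(group for group, _ in rows)
    yes = Counter(group for group, owns in rows if owns)
    # Normalize each group's "yes" count by that group's size.
    return {g: (yes[g], yes[g] / totals[g]) for g in totals}


summary = pet_owner_summary(responses)
for group, (count, prop) in summary.items():
    print(f"{group}: {count} pet owners, {prop:.0%} of the group")
```

In this made-up data the first-year group has twice as many pet owners by count (12 versus 6), yet a smaller share of that group owns pets (30% versus 60%). A bar chart of the counts would suggest the opposite of what the proportions show.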