Following on from last week’s post where I introduced our new mini-apps capability, today I’m going to demonstrate how to create more advanced and interactive mini-apps. I will also introduce the ggvis plotting library, which allows us to create interactive plots. I’m going to do that by investigating prescribing data and related costs for three types of medication – statins, diabetes drugs and antidepressants – in London between 2011 and 2013. The result will be an interactive map that will allow us to explore and connect with the data in detail. Have a look at the video, then I’ll walk you through the steps I’ve taken to get there.
This post covers:
- ggvis inside a mini-app
- ggvis plots
Statins are medicines which lower the level of cholesterol in the blood. High levels of ‘bad cholesterol’ can increase the risk of having a heart attack or stroke and of developing cardiovascular disease.
www.bbc.co.uk/news/health-18101554
The National Institute for Health and Care Excellence (NICE) says the scope for offering this treatment should be widened to save more lives. The NHS currently spends about £450 million a year on statins. If the draft recommendations go ahead, this bill will increase substantially, although the drugs have become significantly cheaper over the years. It is not clear precisely how many more people would be eligible for statin therapy under the new recommendations, but NICE says it could be many hundreds of thousands or millions.
www.bbc.co.uk/news/health-26132758
The use of antidepressants rose significantly in England during the financial crisis and subsequent recession, with 12.5m more pills prescribed in 2012 than in 2007, a study has found. Researchers from the Nuffield Trust and the Health Foundation identified a long-term trend of increasing prescription of antidepressants, rising from 15m items in 1998 to 40 million in 2012. But the yearly rate of increase accelerated during the banking crisis and recession to 8.5%, compared to 6.7% before it.
www.theguardian.com/society/2014/may/28/-sp-antidepressant-use-soared-during-recession-uk-study
The report also found that rises in unemployment were associated with significant increases in the number of antidepressants dispensed and that areas with poor housing tended to see significantly higher antidepressant use.
It is estimated that more than one in 17 people in the UK have diabetes. In 2014, 3.2 million people had been diagnosed, and by 2025 this number is estimated to grow to 5 million.
www.diabetes.org.uk/Documents/About%20Us/Statistics/Diabetes-key-stats-guidelines-April2014.pdf
Diabetes, when present in the body over many years, can give rise to all sorts of complications. These include heart disease, kidney disease, retinopathy and neuropathy. Diabetes is the leading cause of amputations.
www.diabetes.co.uk/diabetes-and-amputation.html
The data we are using for this mini-app cover the prescriptions of statins, diabetic drugs and antidepressants in London between 2011 and 2013. The initial dataset is available through HSCIC. The data were aggregated to give, for each year, the number of items prescribed and their cost. The aggregated data were then filtered to include only the three drug types of interest for London.
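As a rough illustration of that aggregation step, here is a minimal sketch using dplyr on a made-up table. The column names (items, act_cost, case) and all the values are assumptions for illustration only, not the actual HSCIC schema or the code used to prepare the dataset.

```r
library(dplyr)

# Made-up raw prescribing rows; column names and values are assumptions
raw <- data.frame(
  year = c(2011, 2011, 2011, 2012, 2012),
  ccg_code = c("A", "A", "B", "A", "B"),
  case = c("statins", "statins", "statins", "statins", "diabetes"),
  items = c(10, 5, 7, 12, 4),
  act_cost = c(20.5, 9.5, 14.0, 25.0, 30.0)
)

# Total items and cost per year, CCG and drug type,
# keeping only the three drug types of interest
summary_toy <- raw %>%
  group_by(year, ccg_code, case) %>%
  summarise(items = sum(items), act_cost = sum(act_cost), .groups = "drop") %>%
  filter(case %in% c("statins", "diabetes", "antidepressants"))

summary_toy
```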
The data are ready to use and loaded in the platform, so let’s build the mini-app.
We can start by designing our UI. We have chosen three widgets: a slider for selecting the year, a drop-down menu for selecting the variable to be plotted and radio buttons for selecting the drug type. In this worked example we will keep the plotting functions empty for now and populate them later. We can divide the space into two columns, a smaller one for the widgets and a larger one for the plot:
ui.r
## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
titlePanel("Prescribing data London"),
fluidRow(
column(width = 3,
wellPanel(
sliderInput("year",
"Select year to visualise: ", 2011, 2013,
value = 2011,
animate = F),
br(),
selectInput("var",
label = "Select variable to visualise",
choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"),
selected = "items_perc"),
br(),
radioButtons("drug",
label="Select the drug to visualise",
choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
selected="statins")
)
),
column(9,
wellPanel()
)
)
)
)
Using an empty shinyServer function this will look like:
We now have a functional UI with all the required widgets, so let’s start adding some functionality to the shinyServer function. We can start with reading in the data. Since the data we will be using won’t change for every session, we don’t have to read the data in a reactive piece of code:
server.r
# Load required libraries
library(dplyr)
library(ggvis)
# Set miniapp option
options(dplyr.length = 1e10)
# Miniapp server function
shinyServer(function(input, output, session) {
## Read the data
df <- xap.read_table("presc_bnf_ccg_summary_demo_ldn")
}
)
It is good practice to divide up the functionality into small modules, so that’s what we’ll do.
The first task is filtering the data according to the user’s selection. Define a reactive function, which we’ll call select_data():
## Filter the data according to the values of the widgets
select_data <- reactive({
# Get the widgets values
yr <- input$year
var <- input$var
drug <- input$drug
# Filter the data
selected_data <- df %>%
filter(year == yr & case == drug) %>%
select(ccg_code, id, year, case, act_cost, nic, ccg13nm, long, lat, order, group, items_perc, act_cost_perc)
# Select variable to be plotted
if (var == "items_perc"){
selected_data <- selected_data %>% mutate(var = items_perc)
}
else if (var == "act_cost_perc"){
selected_data <- selected_data %>% mutate(var = act_cost_perc)
}
# Return the selected data
selected_data
})
Here we filter df according to the selected year and the selected type of drug. Afterwards we add an extra column (named var) with the selected variable to be plotted. This extra column is going to help us create the map plot independently from the selected variable.
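The same idea can be tried outside Shiny. The sketch below is not the mini-app code: it uses a made-up data frame and a hypothetical helper, select_toy(), and replaces the if/else branches with dplyr's .data pronoun to copy whichever column is named in var_name.

```r
library(dplyr)

# Toy stand-in for the prescribing table (values are made up)
toy <- data.frame(
  ccg_code = c("A", "A", "B", "B"),
  year = c(2011, 2012, 2011, 2012),
  case = "statins",
  items_perc = c(1.2, 1.4, 2.1, 2.3),
  act_cost_perc = c(3.5, 3.1, 4.2, 4.0)
)

# Hypothetical helper mirroring select_data() without the reactive wrapper
select_toy <- function(data, yr, drug, var_name) {
  data %>%
    filter(year == yr, case == drug) %>%
    # Copy the chosen column into a generic `var` column
    mutate(var = .data[[var_name]])
}

res <- select_toy(toy, 2011, "statins", "act_cost_perc")
res
```

Whatever the user picks, the downstream code only ever has to refer to the var column.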
Next, create a reactive function, summarise_data(), that summarises our data. This is going to be our basis for the colour palette we will need to colour the map.
## Summarise the data
summarise_data <- reactive({
# Get selected data
selected_data <- select_data()
# Group the data by CCG and calculate avg value of the selected variable
summarised_data <- selected_data %>%
group_by(ccg_code) %>%
summarise(var=mean(var, na.rm = TRUE)) %>%
arrange(var)
# Return
summarised_data
})
Our data are ready to be plotted, but we still need to add colour to our map according to the range of values of var. Essentially we want to automatically divide this range into three equally probable intervals that will represent low, moderate and high values. These three intervals will be coloured green, orange and red respectively. We can do that using the quantile function.
## Get splines to segment and colour data by
get_splines <- reactive({
# Get selected data and summarised data
selected_data <- select_data()
c_palette <- summarise_data()
# Define tercile break points (called splines here) to segment and colour data by
c_splines <- quantile(c_palette$var, probs = seq(0, 1, 1/3), na.rm = TRUE)
# Return splines
c_splines
})
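To see what these break points look like, here is a small base R sketch on made-up values. It shows that probs = seq(0, 1, 1/3) yields tercile boundaries and that cutting on them gives three equally sized groups; the numbers are illustrative only.

```r
# Nine made-up CCG averages
x <- c(2, 4, 6, 8, 10, 12, 14, 16, 18)

# Minimum, two tercile boundaries and maximum
breaks <- quantile(x, probs = seq(0, 1, 1/3), na.rm = TRUE)

# Assign each value to a group; include.lowest keeps the minimum in "low"
groups <- cut(x, breaks, labels = c("low", "moderate", "high"),
              include.lowest = TRUE)

table(groups)
```

Each of the three groups ends up with three of the nine values, which is what "equally probable intervals" means in practice.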
Figuring out the splines to segment the data by is not enough. We also have to segment the data points and assign a colour to each of them, so we are going to label each data point as low, moderate or high. Then we are going to create a gradient scale of colours for each of these intervals. This gives us the colour palette, which we then combine with our initial data.
## Add the colour palette to the selected dataset
add_palette <- reactive({
# Get selected data, summarised data as a basis for the colour palette and the tercile break points (splines)
selected_data <- select_data()
c_palette <- summarise_data()
c_splines <- get_splines()
# Define colour groups
colour_group <- cut(
c_palette$var,
c(0, c_splines[2:3], max(c_palette$var)),
labels=c("low", "moderate", "high")
)
c_palette$col_grp <- colour_group
# Define gradient colours within each of the colour groups
c_palette$col_code <- c(colorRampPalette(c("#c8f1cb", "#4ad254"))(nrow(c_palette%>% filter(col_grp=="low"))),
colorRampPalette(c("#ffdb99","#ffb732"))(nrow(c_palette%>% filter(col_grp=="moderate"))),
colorRampPalette(c("#f69494","#ee2a2a"))(nrow(c_palette%>% filter(col_grp=="high"))))
# Combine the selected data with the colour palette
selected_data <- left_join(selected_data, c_palette)
# Return the resulted data frame
selected_data
})
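The colour assignment above relies on the summarised data being sorted by var, so that each group occupies a contiguous block of rows. As a hedged alternative sketch (not the mini-app's code), the gradient can be built per group with group_by(), which does not depend on row order. The data frame and the ramps list are made up for illustration; the hex colours are the ones used above.

```r
library(dplyr)

# Made-up summarised data: one average value per CCG
palette_toy <- data.frame(
  ccg_code = paste0("C", 1:9),
  var = c(2, 4, 6, 8, 10, 12, 14, 16, 18)
)

breaks <- quantile(palette_toy$var, probs = seq(0, 1, 1/3), na.rm = TRUE)

# Light-to-dark colour pair for each group
ramps <- list(
  low = c("#c8f1cb", "#4ad254"),
  moderate = c("#ffdb99", "#ffb732"),
  high = c("#f69494", "#ee2a2a")
)

palette_toy <- palette_toy %>%
  mutate(col_grp = cut(var, breaks, labels = names(ramps),
                       include.lowest = TRUE)) %>%
  group_by(col_grp) %>%
  arrange(var, .by_group = TRUE) %>%
  # One gradient per group, with as many steps as the group has rows
  mutate(col_code = colorRampPalette(ramps[[as.character(col_grp[1])]])(n())) %>%
  ungroup()

palette_toy
```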
We are almost ready to move onto creating our map, but before that we want to add a dynamic title and some information about the colour scales and what values they represent. We can create a separate function for this task:
## A function that creates the map's title
create_maps_title <- function(maps_data, year = NULL, var = NULL, drug = NULL, c_splines) {
# Set colours
g_col = "#4ad254"
o_col = "#ffb732"
r_col = "#ee2a2a"
# Set names to appear for certain options
if (drug=="antidepressants"){
drug = "anti-depressants"
}
else if (drug=="diabetes"){
drug = "diabetic drugs"
}
# Create title's HTML code
if (var=="items_perc"){
lab1 <- paste0("Percentage of ", drug, " prescribed in ", year, " across populations in London")
title <- paste0("<h3> ",
lab1,
"</h3>",
"<h4>",
"Groups: ",
" <font color='",g_col,"'>0 < ", round(c_splines[2],2),"%</font>",
" <font color='",o_col,"'>", round(c_splines[2],2), " < ", round(c_splines[3],2),"%</font>",
" <font color='",r_col, "'>", round(c_splines[3],2), "% +</font>",
"</h4>"
)
}
else if (var=="act_cost_perc"){
lab1 <- paste0("Average cost (£ per person) of ", drug, " prescribed in ", year, " corrected to CCG population sizes across London")
title <- paste0("<h3> ",
lab1,
"</h3>",
"<h4>",
"Groups: ",
" <font color='",g_col,"'>£0 < ", round(c_splines[2],2),"</font>",
" <font color='",o_col,"'>£", round(c_splines[2],2), " < ", round(c_splines[3],2),"</font>",
" <font color='",r_col, "'>£", round(c_splines[3],2), "+</font>",
"</h4>"
)
}
# Return title
return(title)
}
and assign its result to the output variable through a render function:
## Add the map title to the UI
output$map_title <- renderUI(
HTML(
create_maps_title(add_palette(), input$year, input$var, input$drug, get_splines())
)
)
Finally, we have to add the title in the shinyUI function, in the reserved space:
ui.r
## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
titlePanel("Prescribing data London"),
fluidRow(
column(width = 3,
wellPanel(
sliderInput("year",
"Select year to visualise: ", 2011, 2013,
value = 2011,
animate = F),
br(),
selectInput("var",
label = "Select variable to visualise",
choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"),
selected = "items_perc"),
br(),
radioButtons("drug",
label="Select the drug to visualise",
choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
selected="statins")
)
),
column(9,
wellPanel(
htmlOutput("map_title", inline=F)
)
)
)
)
)
ggvis plotting library
Now we are finally ready to move on to the actual goal of this article: creating the interactive map. Before that we should briefly explore the ggvis interactive plotting library in order to better understand how to build our visualisation.
ggvis basics
ggvis is an R library that is used to create interactive graphics. The underlying logic is similar to ggplot2, although its syntax is different. ggvis works on top of dplyr, which makes the connection between data manipulation and plotting easier.
Every ggvis visualisation uses the function ggvis(). The first argument in this function is the dataset being used, in the format of a data frame, and the rest of the arguments specify how the data will be mapped to the visual properties of ggvis (for example, which field goes to the x-axis).
ggvis(mtcars, x = ~wt, y = ~mpg)
To map a variable (a vector of values) rather than a single value to a visual property, an R formula (the ~ prefix) must be used. The example above, although a ggvis visualisation object, doesn’t include any information about how to plot the data. To give that information we need to specify a layer. ggvis offers many options, such as layer_points(), layer_bars(), layer_lines(), layer_text() etc. These functions take the ggvis visualisation object as their first argument. Further arguments that specify other visual properties can be given here.
vis <- ggvis(mtcars, x = ~wt, y = ~mpg)
layer_points(vis, fill := 'grey')
Other layers include:
Layer | Description |
---|---|
layer_points() | produces a scatter plot |
layer_lines() | produces a line plot, connecting all the given data points |
layer_paths() | produces lines if fill is empty, and polygons if it is set to a value |
layer_bars() | produces a bar plot |
layer_text() | produces text at the specified positions |
These layers, although a subset of the layers available in ggplot2, are enough to create most visualisations.
We already mentioned that ggvis works on top of dplyr, so it makes use of the %>% pipe operator (originally from the magrittr package and re-exported by dplyr). All the ggvis functions take as a first argument a ggvis visualisation object, which has usually been created by another ggvis function. To simplify the syntax we can use the pipe operator to pass the first argument to the next function and avoid having many nested calls:
ggvis(mtcars, x = ~wt, y = ~mpg) %>% layer_points(fill := 'grey')
Another advantage of having this operator available is that we can use dplyr within ggvis calls.
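For instance, a data frame can be filtered and extended with dplyr verbs before it reaches ggvis(). This is only a small sketch on the built-in mtcars data; the wt_kg column is a name made up here for illustration.

```r
library(dplyr)
library(ggvis)

# Keep only 4-cylinder cars and convert weight (in 1000 lbs) to kilograms
# before handing the data frame to ggvis()
vis <- mtcars %>%
  filter(cyl == 4) %>%
  mutate(wt_kg = wt * 453.592) %>%
  ggvis(x = ~wt_kg, y = ~mpg) %>%
  layer_points(fill := "grey")

vis
```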
ggvis
The main reason that ggvis plots are more interactive than ggplot plots is the tooltip functionality. ggvis plots allow us to create tooltips that are triggered by an event, either hovering over an item or clicking it. These tooltips can be extremely useful for displaying extra information that couldn’t fit in the initial plot, or an interpretation of what the plot demonstrates.
Tooltips are added to ggvis plots with the function add_tooltip(). This function takes three arguments:
- A ggvis visualisation.
- A function that takes a single argument x and returns the HTML tooltip to be displayed. The argument x is a one-row data frame created by ggvis that represents the mark that is currently under the mouse. The HTML return value should either be a string that contains some functional HTML code, or NULL in case we don’t want a tooltip to appear.
- The event that triggers the tooltip: "hover", "click" or c("hover", "click").
Function for the example above:
tooltip <- function(x){
tip <- paste0("Wt: ", x$wt, "<br> Mpg: ", x$mpg)
return(tip)
}
ggvis(mtcars, x = ~wt, y = ~mpg) %>%
layer_points(fill := 'grey') %>%
add_tooltip(tooltip, "hover")
The argument x of the tooltip function is a data frame that contains information about the mark under the mouse. This information normally includes the x and y axis values, the render details such as colour or shape if there are any, and any grouping information available. Within mini-apps we often want to display additional information that exists in the initial data about the current data point, but is not included in the x argument. Because of that it is a useful practice to create the tooltip function inside the shinyServer function, so that we have access to the initial data.
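The pattern of giving a tooltip function access to the full dataset can be sketched in base R with a closure. Everything below is illustrative: cars_info is a made-up lookup table and make_tooltip() is a hypothetical helper, not part of ggvis.

```r
# Made-up lookup table standing in for the full dataset
cars_info <- data.frame(
  id = 1:3,
  model = c("Mazda RX4", "Datsun 710", "Valiant"),
  wt = c(2.62, 2.32, 3.46),
  stringsAsFactors = FALSE
)

# Factory returning a tooltip function that closes over `data`
make_tooltip <- function(data) {
  function(x) {
    # No mark under the mouse, or no id passed through: no tooltip
    if (is.null(x) || is.null(x$id)) return(NULL)
    row <- data[data$id == x$id, ]
    if (nrow(row) == 0) return(NULL)
    paste0("Model: ", row$model, "<br>Wt: ", row$wt)
  }
}

tooltip <- make_tooltip(cars_info)

# Simulate the one-row data frame ggvis would pass for the second mark
tip <- tooltip(data.frame(id = 2))
tip
```

In a real plot the id column has to reach x, which in ggvis is usually done by adding key := ~id to the layer.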
ggvis
With a better understanding of the ggvis plotting library, we can proceed to creating our map:
## Create ggvis interactive map
vis <- reactive({
# Get selected data
selected_data <- add_palette()
# Create the ggvis object
selected_data %>%
arrange(order) %>%
group_by(ccg_code) %>%
ggvis() %>%
layer_paths(x = ~long, y = ~lat, fill := ~col_code)
})
Essentially we are creating a map plot by ordering the data points, and grouping them by their CCG code. Using the path layer we are assigning the longitude and latitude variables to the x and y axes respectively, and we are colouring the groups by the colour variable we calculated earlier. Now let’s create a tooltip function with some additional information:
## Tooltip function. x is the ggvis object that is currently triggering the tooltip
map_tooltip <- function(x) {
# Return an empty tooltip if x is empty
if(is.null(x)) return(NULL)
# Retrieve the code of the ccg that triggered the tooltip
selected_ccg_code <- x$ccg_code
# Get selected data
selected_data <- add_palette()
# Filter and summarise the selected data according to the retrieved CCG code
ccg <- selected_data %>%
filter(ccg_code == selected_ccg_code) %>%
group_by(
ccg_code,
ccg13nm
) %>%
summarise(
items = round(mean(items_perc, na.rm=T), 2),
avg_cost = round(mean(act_cost_perc, na.rm=T), 2)
)
ccg <- unique(ccg)
# Create the HTML code for the tooltip
tip <- paste0("CCG: ", ccg$ccg13nm, "<br>",
"Items (%): ", ccg$items, "<br>",
"Avg cost (£): ", ccg$avg_cost, "<br>"
)
# Return tooltip
return(tip)
}
and assign this function to our visualisation:
## Create ggvis interactive map
vis <- reactive({
# Get selected data
selected_data <- add_palette()
# Create the ggvis object
selected_data %>%
arrange(order) %>%
group_by(ccg_code) %>%
ggvis() %>%
layer_paths(x = ~long, y = ~lat, fill := ~col_code) %>%
# Attach the tooltip; the "hover" trigger is assumed here, "click" would also work
add_tooltip(map_tooltip, "hover")
})
To make this visualisation appear in the UI, we need to bind it to the UI:
## Bind interactive map to the UI
vis %>% bind_shiny("map_plot")
This has to be placed in the shinyServer() function, but not inside a reactive component. The reason is that binding a ggvis plot to the UI creates a placeholder and some JavaScript code to generate the plot. When a widget changes, only the updated data will be sent to the UI.
Finally, we need to modify our shinyUI() function to include the map plot:
## ui.r
library(shiny)
library(ggvis)
shinyUI(fluidPage(
titlePanel("Prescribing data London"),
fluidRow(
column(width = 3,
wellPanel(
sliderInput("year",
"Select year to visualise: ", 2011, 2013,
value = 2011,
animate = F),
br(),
selectInput("var",
label = "Select variable to visualise",
choices = list("% Items" = "items_perc", "Average cost (£)" = "act_cost_perc"),
selected = "items_perc"),
br(),
radioButtons("drug",
label="Select the drug to visualise",
choices= list("Diabetes drugs" = "diabetes", "Anti-depressant drugs" = "antidepressants", "Statins" = "statins"),
selected="statins")
)
),
column(9,
wellPanel(
htmlOutput("map_title", inline=F),
ggvisOutput("map_plot")
)
)
)
)
)
After that we can start using our mini-app with the interactive maps. The mini-app widgets in combination with the ggvis map plot provide us with many different choices and a lot of additional information that cannot be shown in a regular plot. So ggvis can help us add even more interactivity to our visualisations, take our analysis one step further and allow the final user to do part of the analysis themselves.
July 23, 2015
Pamela joined Aridhia in 2011, bringing several years' experience in marketing and communications to the company. She has been involved in some of Aridhia’s highest profile projects, including DECIPHER Health and the launch of AnalytiXagility, and is a valued member of the commercial team. Pamela likes to make simple messages out of complicated concepts and works closely with the entire Aridhia team, collaborative partners, products, and perceptions to build relationships, brands and marketing strategies.