Lesson 2: Data visualization

This section helps you create the most popular visualizations - from quick and dirty plots to publication-ready graphs. The text relies heavily on the ggplot2 package for graphics, but other approaches covered as well.

This dataset contains four lists: env, fish, xy, species. Of these, the fish file lists all species that were recorded in the Doubs River. The data can be extracted as:

data(doubs, package = "ade4")
DoubsSpe <- doubs$fish
DoubsEnv <- doubs$env
DoubsSpa <- doubs$xy

Load the required packages

library(vegan) library(labdsv) library(MASS) library(mvpart) library(ggplot2)

In ecological data, it’s common that a site or plot has 0’s value for any organisms. If so, you should remove it out of your dataset, and do not include the site for your analysis. As for the data of doubs, you can detect 0’s value like this:

rowSums(DoubsSpe) == 0
    1     2     3     4     5     6     7     8     9    10    11 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE 
   12    13    14    15    16    17    18    19    20    21    22 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
   23    24    25    26    27    28    29    30 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
#colSums(DoubsSpe) == 0

Because the site no. 8 contains no species, so we remove the row 8 and corresponding abiotic data (site 8), and assign the data new variables:

spe <- DoubsSpe[-8, ]
env <- DoubsEnv[-8, ]

Explore the environment dataset and calculate correlation efficients among variables. See the website for details.

The hclust() for hierarchical clustering order is used to reorder the correlation matrix according to the correlation coefficient. This is useful to identify the hidden pattern in the matrix.

Reordered correlation data:

# Reorder the correlation matrix
cormat <- reorder_cormat(cormat)

The correlation matrix has redundant information. We’ll use the codes below to set half of it to NA.

# Get upper triangle of the correlation matrix
get_upper_tri <- function(cormat){
    cormat[lower.tri(cormat)] <- NA
    return(cormat)
  }
upper_tri <- get_upper_tri(cormat)
#upper_tri

Melt the correlation data, and drop the rows with NA values:

# Melt the correlation matrix and drop NA
library(reshape2) #The package reshape used to melt the correlation matrix
melted_cormat <- melt(upper_tri, na.rm = TRUE) # na.rm-->drop NA
#head(melted_cormat)

Add mort details to plot for improving the figure and perform the codes below.

# Create a heatmap
library(ggplot2)
ggheatmap <- ggplot(melted_cormat, aes(Var2, Var1, fill = value)) +
   geom_tile(color = "white") + 
  # using geom_title() to create heetmap, "color" means the grid line color
 scale_fill_gradient2(low = "blue", high = "red", mid = "white", 
   midpoint = 0, limit = c(-1,1), space = "Lab", 
    name="Pearson\nCorrelation") + 
  # use scale_fill_gradient2 () to creates a color gradient (low-mid-high), here bule notes egative correlations and red notes positive correlations. limit = c(-1,1) as correlation coefficients range from -1 to 1. See legend for details.
  theme_minimal() + # Create a white background with no grid lines and no border
  theme(axis.text.x = element_text(angle = 0, vjust = 1, 
    size = 10, hjust = 1)) + # x axis tick mark labels 
 coord_fixed() # ensures one unit on the x-axis is the same length as one unit on the y-axis

# Print the heatmap
print(ggheatmap)