I first learned about embedding many small subplots into a larger plot, as a way to visualize large datasets, from the ggsubplot package. Embedding subplots is still possible in ggplot2 today with the annotation_custom() function. I demonstrate one approach: making many subplots in a loop and then adding them to the larger plot.
When working with counts, having many zeros does not necessarily indicate zero inflation. I demonstrate this by simulating data from the negative binomial and generalized Poisson distributions. I then show one way to check whether the data have more zeros than the model would lead us to expect.
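The idea can be sketched without any simulation. The snippet below is a minimal illustration in plain Python rather than the R code used in the post. It assumes the common mean/size parameterization of the negative binomial, where the probability of a zero is (theta / (theta + mu))^theta. The function names, the fitted means, the value of theta, and the observed zero count are all hypothetical values chosen for illustration.

```python
import math


def poisson_zero_prob(mu):
    """Probability of a zero count under a Poisson distribution with mean mu."""
    return math.exp(-mu)


def nb_zero_prob(mu, theta):
    """Probability of a zero count under a negative binomial distribution
    with mean mu and size (dispersion) parameter theta."""
    return (theta / (theta + mu)) ** theta


def expected_zeros(fitted_means, theta):
    """Expected number of zeros: the sum of per-observation zero probabilities."""
    return sum(nb_zero_prob(mu, theta) for mu in fitted_means)


# Hypothetical example: 200 observations that all share a fitted mean of 10,
# with strong overdispersion (small theta).
fitted_means = [10.0] * 200
theta = 0.1
observed_zeros = 130  # hypothetical count of zeros in the data

expected = expected_zeros(fitted_means, theta)
ratio = observed_zeros / expected

print(f"Poisson P(0) at mean 10: {poisson_zero_prob(10.0):.6f}")
print(f"Negative binomial P(0) at mean 10: {nb_zero_prob(10.0, theta):.3f}")
print(f"Expected zeros: {expected:.1f}, observed zeros: {observed_zeros}")
print(f"Observed / expected: {ratio:.2f}")
```

A ratio near 1 suggests the model already accounts for the zeros, even when they make up most of the data. A ratio well above 1 would be a reason to look more closely at zero inflation.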
Analyzing non-negative data that include 0 values can be challenging, since the log of 0 is undefined and a direct log transformation isn't possible. I discuss some of the things to consider when deciding on an analysis strategy for such data and then explore how the choice of the constant, c, affects the results when using log(y + c) as the response variable.
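One way to see why the choice of c matters is to look at how far the zeros land from the smallest positive value on the transformed scale. The snippet below is a small sketch in plain Python, not the R code from the post. The data values and the helper names `log_shift` and `zero_gap` are made up for illustration.

```python
import math


def log_shift(y, c):
    """Return log(y + c) for each value in y; c must be positive."""
    if c <= 0:
        raise ValueError("c must be positive so that log(0 + c) is defined")
    return [math.log(v + c) for v in y]


def zero_gap(y, c):
    """Distance on the log(y + c) scale between a zero and the
    smallest positive value in y."""
    smallest_positive = min(v for v in y if v > 0)
    return math.log(smallest_positive + c) - math.log(c)


# Hypothetical response values that include a zero.
y = [0.0, 0.5, 2.0, 10.0, 50.0]

for c in (0.001, 0.1, 1.0):
    transformed = [round(v, 2) for v in log_shift(y, c)]
    print(f"c = {c}: transformed = {transformed}, zero gap = {zero_gap(y, c):.2f}")
```

The smaller c is, the further the zeros are pushed away from the rest of the data after the transformation, which can make them behave like outliers and give them more influence on a fitted model.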
In this post I show an example of how to automate the process of making many exploratory plots in ggplot2 with multiple continuous response and explanatory variables. Looping through both the x and y variables requires nested loops. In the latter section of the post I go over options for saving the resulting plots: together in a single document, separately, or as combined plots created prior to saving.
I currently work as an applied statistician in aviation and aeronautics. In a previous role as a consulting statistician in academia I created and taught R workshops for applied science graduate students who were just getting started in R, where my goal was to make their transition to a programming language as smooth as possible. The workshop materials are available on my website.