Saturday, October 29, 2016

Exploring the ggplot2 Library in R

The ggplot2 library for R includes a range of basic and intricate plotting and graphics capabilities to aid in visualizing data. The library also ships with a number of datasets that help demonstrate and explore those capabilities.


Using the included diamonds dataset, it is possible to explore several of the graph and plot types built into the ggplot2 library. The basic structure is a call to ggplot(), passing it the dataset and the specific variables to plot, followed by the type of graph or plot and any options. The following commands plot histograms, frequency polygons, scatter plots, and statistical smoothing.

Starting with a regular histogram of the number of diamonds by carat weight:
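A minimal sketch of what such a histogram call might look like. The binwidth of 0.1 carat and the object name p_hist are assumptions chosen for illustration, not values taken from the original post.

```r
library(ggplot2)

# Histogram of diamond counts by carat weight.
# binwidth = 0.1 is an assumed value; adjust to taste.
p_hist <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.1)

print(p_hist)
```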



Adding the factor variable "clarity" to the same histogram of carat vs. count, ggplot2 colors each carat bin by its clarity values.
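One way to get this effect is to map clarity to the fill aesthetic, which stacks the clarity levels within each bin. This is a sketch; the binwidth and the name p_clarity are assumptions.

```r
library(ggplot2)

# Same histogram, with each bar split and colored by clarity.
# Mapping clarity to `fill` stacks the levels inside every carat bin.
p_clarity <- ggplot(diamonds, aes(x = carat, fill = clarity)) +
  geom_histogram(binwidth = 0.1)  # assumed binwidth

print(p_clarity)
```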



We can also use clarity as the factor variable in geom_freqpoly() for a slightly different view of the data than the histogram gives:
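A sketch of the frequency polygon version, assuming the same 0.1 carat binwidth as before. Here clarity is mapped to colour rather than fill, since the geom draws lines instead of bars.

```r
library(ggplot2)

# One frequency polygon (line) per clarity level.
p_freq <- ggplot(diamonds, aes(x = carat, colour = clarity)) +
  geom_freqpoly(binwidth = 0.1)  # assumed binwidth

print(p_freq)
```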




For the next graph, we can draw a scatter plot of carat weight vs. price and add a fitted line to it by including a call to stat_smooth():
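A sketch of the scatter plot with a fitted line. The original post does not say which smoothing method was used; method = "lm" (a straight-line fit) and the alpha value are assumptions made here to keep the example simple and free of extra dependencies.

```r
library(ggplot2)

# Scatter plot of price against carat with a fitted line on top.
# alpha = 0.1 thins out the ~54,000 overlapping points (assumed value).
# method = "lm" is an assumption; calling stat_smooth() with no
# arguments lets ggplot2 choose a smoother based on the data size.
p_smooth <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1) +
  stat_smooth(method = "lm", formula = y ~ x)

print(p_smooth)
```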






While reviewing the documentation for the ggplot2 library, the map_data() function caught my eye. Bringing up the Help for map_data within RStudio offers details on the arguments and a small sample R script that plots a set of sample data for fifty states. However, the map rendered by the sample script covers only the 48 contiguous states.
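The 48-state observation can be checked directly. The sketch below is not the Help page's sample script; it assumes the maps package is installed (map_data() needs it for the "state" outlines) and simply draws the outlines and lists the regions present.

```r
library(ggplot2)
library(maps)  # supplies the "state" outlines used by map_data()

# Convert the maps package's state outlines into a data frame.
states <- map_data("state")

# Regions present in the data; Alaska and Hawaii are not among them.
print(sort(unique(states$region)))

# Draw the outlines; `group` keeps each polygon separate.
p_states <- ggplot(states, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "white", colour = "grey40") +
  coord_quickmap()

print(p_states)
```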


This set of libraries also includes the data and code needed to plot individual states and the major cities in those states. After creating a subset of the data for Florida and its cities, it was simple to plot a map of Florida and those major cities.
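A sketch of how the Florida subset might be built, assuming the city data comes from the us.cities dataset in the maps package, where the country.etc column holds the state abbreviation. The original post does not show how its subset was created.

```r
library(maps)

# us.cities lists US cities with population, latitude, and longitude.
data(us.cities)

# Keep only the Florida rows.
fl_cities <- subset(us.cities, country.etc == "FL")

head(fl_cities[, c("name", "pop", "lat", "long")])
```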


The plot is built up in steps: pass the lat and long columns of this fl_cities dataset to ggplot(), add the borders() details for the state and county outlines of Florida, specify that each city should be plotted as a single point, provide a title of “Major Florida Cities”, and finally label the X and Y axes Longitude and Latitude. The result is a “graph” that is really a map of the cities, counties, and state.
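Put together, those steps might look like the sketch below. It assumes fl_cities is a subset of us.cities from the maps package and uses coord_quickmap() to keep the aspect ratio sensible; neither detail is confirmed by the original post.

```r
library(ggplot2)
library(maps)

# Assumed source of fl_cities: Florida rows of maps::us.cities.
data(us.cities)
fl_cities <- subset(us.cities, country.etc == "FL")

p_fl <- ggplot(fl_cities, aes(x = long, y = lat)) +
  borders("state", "florida") +    # state outline
  borders("county", "florida") +   # county outlines
  geom_point() +                   # one point per city
  coord_quickmap() +               # assumed: approximate map aspect ratio
  ggtitle("Major Florida Cities") +
  xlab("Longitude") +
  ylab("Latitude")

print(p_fl)
```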



