How does the function call
    using Gadfly, RDatasets

    df = dataset("ggplot2", "diamonds")
    p = plot(df, x = :Price, color = :Cut, Stat.histogram, Geom.bar)
actually get turned into the following plot?
The rendering pipeline transforms a plot specification into a Compose scene graph that contains a set of guides (e.g. axis ticks, color keys) and one or more layers of geometry (e.g. points, lines). The specification of each layer has
- a data source (e.g. the diamonds data frame df)
- a geometry to represent the layer's data (e.g. point, line, etc.)
- mappings to associate aesthetics of the geometry with elements of the data source (e.g. :color => :Cut)
- layer-wise statistics (optional) to be applied to the layer's data
All layers of a plot share the same
- Coordinates for the geometry (e.g. cartesian, polar, etc.)
- axis Scales (e.g. loglog, semilog, etc.)
- plot-wise Statistics (optional) to be applied to all layers
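One way to picture the split between per-layer and shared elements is as a pair of record types. The sketch below is purely illustrative: LayerSpec and PlotSpec are hypothetical names, not Gadfly's internal types, and the fields hold plain symbols rather than real Gadfly objects.

```julia
# Hypothetical records mirroring the description above; not Gadfly's own types.
struct LayerSpec
    data::Any                       # data source, e.g. a data frame
    geometry::Symbol                # e.g. :bar, :point, :line
    mapping::Dict{Symbol,Symbol}    # aesthetic => column, e.g. :color => :Cut
    statistics::Vector{Symbol}      # optional layer-wise statistics
end

struct PlotSpec
    layers::Vector{LayerSpec}       # one or more layers
    coordinates::Symbol             # shared by all layers, e.g. :cartesian
    scales::Vector{Symbol}          # shared axis scales
    statistics::Vector{Symbol}      # optional plot-wise statistics
end

# The histogram example, written as a single-layer specification.
# `nothing` stands in for the data frame so the sketch needs no packages.
layer = LayerSpec(nothing, :bar, Dict(:x => :Price, :color => :Cut), [:histogram])
spec = PlotSpec([layer], :cartesian, [:x_continuous, :color_discrete], Symbol[])
```

Keeping the shared elements on the plot rather than on each layer reflects the constraint stated above: every layer is drawn in the same coordinate system and against the same scales.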
A full plot specification must describe these shared elements as well as all the layer specifications. In the example above, only the data source, statistics, geometry, and mapping are specified. The missing elements are either inferred from the data (e.g. the categorical values in df[:Cut] imply a discrete color scale) or filled in with defaults (e.g. a continuous x-axis scale). For example, a call to plot with all the elements spelled out looks something like:
    p = plot(layer(df, x = :Price, color = :Cut, Stat.histogram, Geom.bar),
             Scale.x_continuous, Scale.color_discrete,
             Coord.cartesian,
             Guide.xticks, Guide.yticks,
             Guide.xlabel("Price"), Guide.colorkey(title="Cut"))
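The idea of inferring a missing scale from the data can be sketched in a few lines. This is a hypothetical rule of thumb, not Gadfly's actual inference code: infer_scale is an invented name, and the sample values are made up for illustration.

```julia
# Hypothetical rule, assumed for illustration only:
# numeric values suggest a continuous scale, anything else a discrete one.
infer_scale(values::AbstractVector) =
    eltype(values) <: Real ? :continuous : :discrete

cut = ["Ideal", "Premium", "Good"]    # illustrative categorical values
price = [326, 334, 423]               # illustrative numeric values

color_scale = infer_scale(cut)        # strings, so a discrete scale
x_scale = infer_scale(price)          # integers, so a continuous scale
```

A real implementation would also have to handle cases this sketch ignores, such as missing values or numeric columns that are meant to be categorical.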
Once a full plot specification is filled out, the rendering process proceeds as follows:
1. For each layer in the plot, we first map subsets of the data source to the layer's aesthetics. Here, the Price and Cut columns of the diamonds dataset are mapped to the x and color aesthetics, respectively.
2. Scales are applied to the data to obtain plottable aesthetics. Scale.x_continuous keeps the values of df[:Price] unchanged, while Scale.color_discrete_hue maps the unique elements of df[:Cut] (an array of strings) to actual color values.
3. The aesthetics are transformed by layer-wise and plot-wise statistics, in order. Stat.histogram replaces the x field of the aesthetics with bin positions, and sets the y field to the corresponding counts.
4. Using the position aesthetics from all layers, we create a Compose context with a coordinate system that fits the data to screen coordinates. Coord.cartesian creates a Compose context that maps a vertical distance of 3000 counts to about two inches in the rendered plot.
5. Each layer renders its own geometry.
6. Finally, we compute the layout of the guides and render them on top of the plot context.
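The first four steps can be imitated in plain Julia to make the data flow concrete. This is a minimal sketch under stated assumptions, not Gadfly's implementation: the data values are made up, histogram_stat and to_inches are hypothetical helpers, the bin width is chosen by hand, and the coordinate transform is assumed to be linear.

```julia
# Toy stand-in for the data source: made-up values, not the real diamonds data.
data = (Price = [300.0, 450.0, 1200.0, 1250.0, 2900.0],
        Cut   = ["Ideal", "Good", "Ideal", "Premium", "Good"])

# Step 1: map columns of the data source to aesthetics.
mapping = Dict(:x => :Price, :color => :Cut)
aes = Dict(a => getproperty(data, col) for (a, col) in mapping)

# Step 2: apply scales. The continuous x scale leaves the values unchanged;
# the discrete color scale gives each unique level an index into a palette.
levels = unique(aes[:color])
color_index = [findfirst(==(c), levels) for c in aes[:color]]

# Step 3: a histogram statistic replaces x with bin positions
# and sets y to the number of values falling in each bin.
function histogram_stat(x::Vector{Float64}, binwidth::Float64)
    lo = floor(minimum(x) / binwidth) * binwidth
    nbins = floor(Int, (maximum(x) - lo) / binwidth) + 1
    counts = zeros(Int, nbins)
    for v in x
        counts[floor(Int, (v - lo) / binwidth) + 1] += 1
    end
    positions = [lo + (i - 1) * binwidth for i in 1:nbins]
    return positions, counts
end

xs, ys = histogram_stat(aes[:x], 1000.0)

# Step 4: a linear transform from data units to a length on the page,
# e.g. 3000 counts spanning two inches.
to_inches(v, data_max, inches) = v / data_max * inches
half_height = to_inches(1500, 3000, 2.0)
```

Steps 5 and 6 are omitted because they depend on Compose's drawing primitives. The point of the sketch is the ordering: scales run before statistics, and the coordinate system is only fixed once the statistics have produced the final position aesthetics.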