# Statistics

Statistics are functions that take one or more aesthetics as input, operate on those values, and output one or more aesthetics. For example, boxplots are typically drawn with the boxplot statistic (`Stat.boxplot`), which takes the `x` and `y` aesthetics as input and outputs the middle, upper and lower hinge, and upper and lower fence aesthetics.

`Gadfly.Stat.band`

`Gadfly.Stat.bar`

`Gadfly.Stat.binmean`

`Gadfly.Stat.boxplot`

`Gadfly.Stat.candlestick`

`Gadfly.Stat.contour`

`Gadfly.Stat.density`

`Gadfly.Stat.density2d`

`Gadfly.Stat.dodge`

`Gadfly.Stat.ellipse`

`Gadfly.Stat.func`

`Gadfly.Stat.hair`

`Gadfly.Stat.hexbin`

`Gadfly.Stat.histogram`

`Gadfly.Stat.histogram2d`

`Gadfly.Stat.identity`

`Gadfly.Stat.nil`

`Gadfly.Stat.qq`

`Gadfly.Stat.quantile_bars`

`Gadfly.Stat.rectbin`

`Gadfly.Stat.smooth`

`Gadfly.Stat.step`

`Gadfly.Stat.unidistribution`

`Gadfly.Stat.vectorfield`

`Gadfly.Stat.violin`

`Gadfly.Stat._calculate_quantile_bar`

`Gadfly.Stat.x_jitter`

`Gadfly.Stat.xticks`

`Gadfly.Stat.y_jitter`

`Gadfly.Stat.yticks`

`Gadfly.Stat.band` — Type

`Stat.band[(; orientation=:vertical)]`

Transform points in the `xmin`, `xmax`, `ymin` and `ymax` aesthetics into rectangles in the `xmin`, `xmax`, `ymin` and `ymax` aesthetics. Used by `Geom.band`.

`Gadfly.Stat.bar` — Type

`Stat.bar[(; position=:stack, orientation=:vertical)]`

Transform the `x` aesthetic into the `xmin` and `xmax` aesthetics. Used by `Geom.bar`.

`Gadfly.Stat.binmean` — Type

`Stat.binmean[(; n=20)]`

Transform the `x` and `y` aesthetics into `n` bins, each of which contains the mean within that bin.
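As a rough illustration of the binning described above, the following sketch groups points into `n` bins and averages each one. It is not Gadfly's implementation: the helper name `bin_means` is hypothetical, and placing the bin edges at quantiles of `x` (so that bins hold roughly equal numbers of points) is an assumption.

```julia
using Statistics  # standard library, provides quantile

# Hypothetical helper, not part of Gadfly.
# Assumption: bin edges sit at the 1/n, 2/n, ..., n/n quantiles of x.
function bin_means(x, y; n = 20)
    breaks = quantile(x, (1:n) ./ n)          # upper edge of each bin
    sumx, sumy, count = zeros(n), zeros(n), zeros(Int, n)
    for (xi, yi) in zip(x, y)
        # first bin whose upper edge is >= xi (clamped to the last bin)
        b = min(searchsortedfirst(breaks, xi), n)
        sumx[b] += xi
        sumy[b] += yi
        count[b] += 1
    end
    keep = count .> 0                         # drop empty bins
    return sumx[keep] ./ count[keep], sumy[keep] ./ count[keep]
end

xdata = collect(1.0:10.0)
mx, my = bin_means(xdata, 2 .* xdata; n = 2)
```

Each returned pair `(mx[i], my[i])` is the mean position of the points that fell into bin `i`.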

`Gadfly.Stat.boxplot` — Type

`Stat.boxplot[(; method=:tukey)]`

Transform the `x` and `y` aesthetics into the `x`, `middle`, `lower_hinge`, `upper_hinge`, `lower_fence`, `upper_fence` and `outliers` aesthetics. If `method` is `:tukey` then Tukey's rule is used (i.e. fences extend at most 1.5 times the inter-quartile range beyond the hinges). Otherwise, `method` should be a vector of five numbers giving quantiles for lower fence, lower hinge, middle, upper hinge, and upper fence in that order. Used by `Geom.boxplot`.
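To make Tukey's rule concrete, here is a minimal sketch for a single group of `y` values. The helper `tukey_summary` is hypothetical and not Gadfly's code. It assumes the hinges are the quartiles as computed by `Statistics.quantile`, and that each fence is drawn at the most extreme data point lying within 1.5 times the inter-quartile range of its hinge.

```julia
using Statistics  # standard library, provides quantile

# Hypothetical helper, not part of Gadfly: five-number summary plus outliers.
function tukey_summary(ys::AbstractVector{<:Real})
    lower_hinge, middle, upper_hinge = quantile(ys, [0.25, 0.5, 0.75])
    iqr = upper_hinge - lower_hinge
    lo = lower_hinge - 1.5 * iqr              # lower limit for non-outliers
    hi = upper_hinge + 1.5 * iqr              # upper limit for non-outliers
    inside = [y for y in ys if lo <= y <= hi]
    outliers = [y for y in ys if y < lo || y > hi]
    # Assumption: fences sit at the most extreme points inside the limits.
    return (lower_fence = minimum(inside), lower_hinge = lower_hinge,
            middle = middle, upper_hinge = upper_hinge,
            upper_fence = maximum(inside), outliers = outliers)
end

s = tukey_summary([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0])
```

In this example the value `100.0` lies beyond the upper limit, so it is reported in `outliers` and the upper fence stops at `9.0`.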

`Gadfly.Stat.candlestick` — Type

`Stat.candlestick[()]`

`Gadfly.Stat.contour` — Type

`Stat.contour[(; levels=15, samples=150)]`

Transform the 2D function, matrix, or DataFrame in the `z` aesthetic into a set of lines in `x` and `y` showing the iso-level contours. A function requires that either the `x` and `y` or the `xmin`, `xmax`, `ymin` and `ymax` aesthetics also be defined. The latter are interpolated using `samples`. A matrix and DataFrame can optionally input `x` and `y` aesthetics to specify the coordinates of the rows and columns, respectively. In each case `levels` sets the contours to draw: either a vector of contour levels, an integer that specifies the number of contours to draw, or a function which inputs `z` and outputs either a vector or an integer. Used by `Geom.contour`.

`Gadfly.Stat.density` — Type

`Stat.density[(; n=256, bandwidth=-Inf)]`

Estimate the density of `x` at `n` points, and put the result in `x` and `y`. Smoothing is controlled by `bandwidth`. Used by `Geom.density`.

`Gadfly.Stat.density2d` — Type

`Stat.density2d[(; n=(256,256), bandwidth=(-Inf,-Inf), levels=15)]`

Estimate the density of the `x` and `y` aesthetics at `n` points and put the results into the `x`, `y` and `z` aesthetics. Smoothing is controlled by `bandwidth`. Calls `Stat.contour` to compute the `levels`. Used by `Geom.density2d`.

`Gadfly.Stat.dodge` — Type

`Stat.dodge[(; position=:dodge, axis=:x)]`

Transform the points in the `x` and `y` aesthetics into a set of dodged or stacked points in the `x` and `y` aesthetics. `position` is `:dodge` or `:stack`. `axis` is `:x` or `:y`.

`Gadfly.Stat.ellipse` — Type

`Stat.ellipse[(; distribution=MvNormal, levels=[0.95], nsegments=51)]`

Transform the points in the `x` and `y` aesthetics into a set of lines in the `x` and `y` aesthetics. `distribution` specifies a multivariate distribution to use; `levels` the quantiles for which confidence ellipses are calculated; and `nsegments` the number of segments with which to draw each ellipse. Used by `Geom.ellipse`.

`Gadfly.Stat.func` — Type

`Stat.func[(; num_samples=250)]`

Transform the functions or expressions in the `y`, `xmin` and `xmax` aesthetics into points in the `x`, `y` and `group` aesthetics.
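The transformation can be pictured as sampling each function on a grid and tagging the samples with a group index. The sketch below is illustrative only: `sample_functions` is a hypothetical helper, and it assumes evenly spaced samples and a single `xmin` and `xmax` shared by all functions.

```julia
# Hypothetical helper, not part of Gadfly.
# Assumption: every function is sampled on the same evenly spaced grid.
function sample_functions(fs, xmin, xmax; num_samples = 250)
    xs, ys, groups = Float64[], Float64[], Int[]
    for (g, f) in enumerate(fs)
        for x in range(xmin, xmax, length = num_samples)
            push!(xs, x)
            push!(ys, f(x))
            push!(groups, g)      # one group per function keeps the lines separate
        end
    end
    return xs, ys, groups
end

xs, ys, groups = sample_functions([sin, cos], 0.0, 2pi; num_samples = 5)
```

The `group` values are what allow a line geometry to draw each function as its own curve rather than joining them end to end.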

`Gadfly.Stat.hair` — Type

`Stat.hair[(; intercept=0.0, orientation=:vertical)]`

Transform points in the `x` and `y` aesthetics into lines in the `x`, `y`, `xend` and `yend` aesthetics. Used by `Geom.hair`.
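One way to read this transformation is that every point becomes a segment running from the point to a baseline at `intercept`. The following sketch encodes that reading. `hair_segments` is a hypothetical helper, and which end of the segment is placed at the intercept is an assumption.

```julia
# Hypothetical helper, not part of Gadfly.
# Assumption: each segment starts at the data point and ends on the baseline.
function hair_segments(x, y; intercept = 0.0, orientation = :vertical)
    if orientation == :vertical
        # vertical hairs: same x at both ends, y runs to the intercept
        return (x = x, y = y, xend = x, yend = fill(intercept, length(y)))
    else
        # horizontal hairs: same y at both ends, x runs to the intercept
        return (x = x, y = y, xend = fill(intercept, length(x)), yend = y)
    end
end

h = hair_segments([1, 2, 3], [4.0, 5.0, 6.0])
h2 = hair_segments([1, 2, 3], [4.0, 5.0, 6.0]; orientation = :horizontal)
```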

`Gadfly.Stat.hexbin` — Type

`Stat.hexbin[(; xbincount=50, ybincount=50)]`

Bin the points in the `x` and `y` aesthetics into hexagons in the `x`, `y`, `xsize` and `ysize` aesthetics. `xbincount` and `ybincount` manually fix the number of bins.

`Gadfly.Stat.histogram` — Type

```
Stat.histogram[(; bincount=nothing, minbincount=3, maxbincount=150,
                  position=:stack, orientation=:vertical, density=false, limits=NamedTuple())]
```

Transform the `x` aesthetic into the `x`, `y`, `xmin` and `xmax` aesthetics, optionally grouping by `color`. Exchange y for x when `orientation` is `:horizontal`. `bincount` specifies the number of bins to use. If set to `nothing`, an optimization method determines a reasonable value, with `minbincount` and `maxbincount` setting the lower and upper limits. If `density` is `true`, normalize the counts by their total. `limits` is a `NamedTuple` that sets the limits of the histogram `(min= , max= )`: `min` or `max` or both can be set.

`Gadfly.Stat.histogram2d` — Type

```
Stat.histogram2d[(; xbincount=nothing, xminbincount=3, xmaxbincount=150,
                    ybincount=nothing, yminbincount=3, ymaxbincount=150)]
```

Bin the points in the `x` and `y` aesthetics into rectangles in the `xmin`, `xmax`, `ymin`, `ymax` and `color` aesthetics. `xbincount` and `ybincount` manually fix the number of bins. If set to `nothing`, an optimization method determines a reasonable value, with `xminbincount`, `xmaxbincount`, `yminbincount` and `ymaxbincount` setting the lower and upper limits.

`Gadfly.Stat.identity` — Type

`Stat.identity`

`Gadfly.Stat.nil` — Type

`Stat.Nil`

`Gadfly.Stat.qq` — Type

`Stat.qq`

Transform the `x` and `y` aesthetics into quantiles. If each is a numeric vector, their sample quantiles will be compared. If one is a `Distribution`, then its theoretical quantiles will be compared with the sample quantiles of the other. Optionally group using the `color` aesthetic. `Stat.qq` uses the function `qqbuild` from Distributions.jl.

`Gadfly.Stat.quantile_bars` — Type

`Stat.quantile_bars[(; quantiles=[0.025, 0.975], bar_width=0.1, n=256, bandwidth=-Inf)]`

Transform the points in the `x` aesthetic into a set of points in the `x`, `y`, `xend` and `yend` aesthetics. These points can then be drawn via `Geom.segment`. Here, `bandwidth` works independently from the `bandwidth` setting for `Stat.density`.

`Gadfly.Stat.rectbin` — Type

`Stat.rectbin`

Transform the `x` and `y` aesthetics into the `xmin`, `xmax`, `ymin` and `ymax` aesthetics.

`Gadfly.Stat.smooth` — Type

`Stat.smooth[(; method=:loess, smoothing=0.75, levels=[0.95])]`

Transform the `x` and `y` aesthetics into the `x`, `y`, `ymin` and `ymax` aesthetics. `method` can be either `:loess` or `:lm`. `smoothing` controls the degree of smoothing. For `:loess`, this is the span parameter giving the proportion of data used for each local fit, where 0.75 is the default. Larger values use more data (less local context); smaller values use less data (more local context). `levels` is a vector of quantiles at which confidence bands are calculated (currently for `method=:lm` only). For confidence bands, use `Stat.smooth()` with `Geom.ribbon`.

`Gadfly.Stat.step` — Type

`Stat.step[(; direction=:hv)]`

Perform stepwise interpolation between the points in the `x` and `y` aesthetics. If `direction` is `:hv` a horizontal line extends to the right of each point and a vertical line below it; if `:vh` then vertical above and horizontal to the left. More concretely, between `(x[i], y[i])` and `(x[i+1], y[i+1])`, either `(x[i+1], y[i])` or `(x[i], y[i+1])` is inserted, for `:hv` and `:vh`, respectively.
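The insertion rule can be written out directly. The sketch below is a plain Julia rendering of that rule, not Gadfly's implementation; the helper name `step_points` is hypothetical.

```julia
# Hypothetical helper, not part of Gadfly: insert one corner point between
# each pair of consecutive points.
function step_points(x::AbstractVector, y::AbstractVector; direction::Symbol = :hv)
    length(x) == length(y) || throw(ArgumentError("x and y must have equal length"))
    direction in (:hv, :vh) || throw(ArgumentError("direction must be :hv or :vh"))
    xs, ys = eltype(x)[], eltype(y)[]
    for i in eachindex(x)
        push!(xs, x[i])
        push!(ys, y[i])
        i == lastindex(x) && break            # no corner after the last point
        if direction == :hv
            push!(xs, x[i+1])                 # horizontal first, then vertical
            push!(ys, y[i])
        else
            push!(xs, x[i])                   # vertical first, then horizontal
            push!(ys, y[i+1])
        end
    end
    return xs, ys
end

xs, ys = step_points([1, 2, 3], [10, 20, 15])
# xs == [1, 2, 2, 3, 3] and ys == [10, 10, 20, 20, 15]
```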

`Gadfly.Stat.unidistribution` — Type

`Stat.unidistribution(quantiles::Vector{Vector}; n=40)`

Transform a univariate distribution in the `y` aesthetic into a set of points in the `x` and `y` aesthetics. These points can be drawn with `Geom.ribbon` and/or `Geom.line`. `color` and `group` work alternately to specify quantile groups, depending on whether multiple distributions are specified by `group` or `color`. `quantiles` is a set of 2-length vectors, specifying the quantile ranges to be plotted (default is `[[0.0001,0.9999]]`). `n` is the number of points in each quantile group.

`Gadfly.Stat.vectorfield` — Type

`Stat.vectorfield[(; smoothness=1.0, scale=1.0, samples=20)]`

Transform the 2D function or matrix in the `z` aesthetic into a set of lines from `x`, `y` to `xend`, `yend` showing the gradient vectors. A function requires that either the `x` and `y` or the `xmin`, `xmax`, `ymin` and `ymax` aesthetics also be defined. The latter are interpolated using `samples`. A matrix can optionally input `x` and `y` aesthetics to specify the coordinates of the rows and columns, respectively. In each case, `smoothness` can vary from 0 to Inf; and `scale` sets the size of vectors.

`Gadfly.Stat.violin` — Type

`Stat.violin[(n=300)]`

Transform the `x`, `y` and `color` aesthetics.

`Gadfly.Stat._calculate_quantile_bar` — Method

`_calculate_quantile_bar(stat::QuantileBarsStatistic, aes)`

Helper function for `apply_statistic(stat::QuantileBarsStatistic, ...)`.

`Gadfly.Stat.x_jitter` — Method

`Stat.x_jitter[(; range=0.8, seed=0x0af5a1f7)]`

Add a random number to the `x` aesthetic, which is typically categorical, to reduce the likelihood that points overlap. The maximum jitter is `range` times the smallest non-zero difference between two points.
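The scaling rule can be sketched as follows. This is an illustration rather than Gadfly's code: the helper `jitter` is hypothetical, and it assumes the offsets are drawn uniformly and centered on zero, so that the total spread around each value is at most `range` times the smallest non-zero difference.

```julia
using Random  # standard library, provides MersenneTwister

# Hypothetical helper, not part of Gadfly.
function jitter(vals::AbstractVector{<:Real}; range = 0.8, seed = 0x0af5a1f7)
    sorted = sort(unique(vals))
    length(sorted) < 2 && return float.(vals)  # no non-zero difference to scale by
    span = minimum(diff(sorted))               # smallest non-zero difference
    rng = MersenneTwister(seed)                # fixed seed gives reproducible jitter
    # Assumption: uniform offsets in [-range*span/2, range*span/2).
    return vals .+ range .* span .* (rand(rng, length(vals)) .- 0.5)
end

v = [1, 1, 2, 2, 3]
j = jitter(v)
```

Because the seed is fixed, repeated calls with the same input return the same jittered values, which keeps plots reproducible.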

`Gadfly.Stat.xticks` — Method

```
Stat.xticks[(; ticks=:auto, granularity_weight=1/4, simplicity_weight=1/6,
               coverage_weight=1/3, niceness_weight=1/4)]
```

Compute an appealing set of x-ticks that encompass the data by transforming the `x`, `xmin`, `xmax`, `xintercept` and `xend` aesthetics into the `xtick` and `xgrid` aesthetics. `ticks` is a vector of desired values, or `:auto` to indicate they should be computed. The importance of having a reasonable number of ticks is specified with `granularity_weight`; of including zero with `simplicity_weight`; of tightly fitting the span of the data with `coverage_weight`; and of having a nice numbering with `niceness_weight`.

`Gadfly.Stat.y_jitter` — Method

`Stat.y_jitter[(; range=0.8, seed=0x0af5a1f7)]`

Add a random number to the `y` aesthetic, which is typically categorical, to reduce the likelihood that points overlap. The maximum jitter is `range` times the smallest non-zero difference between two points.

`Gadfly.Stat.yticks` — Method

```
Stat.yticks[(; ticks=:auto, granularity_weight=1/4, simplicity_weight=1/6,
               coverage_weight=1/3, niceness_weight=1/4)]
```

Compute an appealing set of y-ticks that encompass the data by transforming the `y`, `ymin`, `ymax`, `yintercept`, `middle`, `lower_hinge`, `upper_hinge`, `lower_fence`, `upper_fence` and `yend` aesthetics into the `ytick` and `ygrid` aesthetics. `ticks` is a vector of desired values, or `:auto` to indicate they should be computed. The importance of having a reasonable number of ticks is specified with `granularity_weight`; of including zero with `simplicity_weight`; of tightly fitting the span of the data with `coverage_weight`; and of having a nice numbering with `niceness_weight`.