Histograms
In this demo, we explore the options for building histograms in Engee.
A histogram is a way of representing tabular data as a bar chart.
The values of an indicator are shown as rectangles whose areas are proportional to those values. Most often, for ease of reading, all rectangles are given the same width, so that their heights alone show the relative size of the displayed quantity.
Let's move on to the implementation. We start by loading the visualization library and then look at the histogram function in its simplest form, applied to a vector of normally distributed random numbers.
using Plots
x = randn(10^3)  # 1000 samples from the standard normal distribution
histogram(x)     # bins are chosen automatically
By default, the number of histogram bins is determined by the Freedman-Diaconis rule.
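To give a feel for what this rule does, here is a minimal sketch that computes the Freedman-Diaconis bin width, h = 2 * IQR / n^(1/3), by hand. The helper name fd_bin_width is hypothetical and is not part of Plots; the plotting library may additionally round the bin edges, so the number of bins it draws can differ from this estimate.

```julia
using Statistics

# Freedman-Diaconis bin width: h = 2 * IQR / n^(1/3).
# fd_bin_width is a hypothetical helper written for this illustration.
function fd_bin_width(v)
    iqr = quantile(v, 0.75) - quantile(v, 0.25)  # interquartile range
    return 2 * iqr / cbrt(length(v))
end

x = randn(10^3)
h = fd_bin_width(x)

# Rough number of bins needed to cover the whole sample with width h.
n_bins = ceil(Int, (maximum(x) - minimum(x)) / h)

println("bin width ≈ ", round(h, digits=3), ", bins ≈ ", n_bins)
```

Because the width depends on the interquartile range rather than on the standard deviation, a few outliers have little effect on it.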
You can also pass a range to the bins parameter to control the number of bins, as well as their minimum and maximum, more precisely. For example, to build 20 bins from -5 to +5, use range(-5, 5, length=21); we have to add 1 to the length because it counts the bin boundaries rather than the bins.
b_range = range(-5, 5, length=21)     # 21 boundaries define 20 bins
plot(b_range, seriestype=:scatter)    # show the boundaries as points
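The relationship between boundaries and bins can be checked without plotting. The sketch below counts the samples that fall into each bin by hand; treating every bin as a half-open interval [lo, hi) is an assumption of this illustration, and the plotting library may handle the last boundary differently.

```julia
b_range = range(-5, 5, length=21)   # 21 boundaries
n_bins = length(b_range) - 1        # number of bins between them
bin_width = step(b_range)           # distance between neighbouring boundaries

x = randn(10^3)

# Count the samples in each half-open interval [lo, hi).
counts = [count(v -> lo <= v < hi, x)
          for (lo, hi) in zip(b_range[1:end-1], b_range[2:end])]

println(n_bins, " bins of width ", bin_width)
println("samples inside the range: ", sum(counts), " of ", length(x))
```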
Normalization
It is often necessary to normalize the histogram in some way. The normalize attribute is used for this. With normalize=:pdf, the total area of the bins is normalized to 1. Since our sample is drawn from a standard normal distribution, we can also plot its probability density function on top of the histogram for comparison.
p(x) = 1/sqrt(2pi) * exp(-x^2/2)  # density of the standard normal distribution
histogram(x, bins=b_range, normalize=:pdf, color=:gray)
plot!(p, lw=3, color=:red)        # overlay the density on the histogram
xlims!(-5, 5)
ylims!(0, 0.4)
normalize can take other values, including:
- :probability, which makes all bin heights sum to 1;
- :density, which makes the area of each bin equal to the number of counts in it.
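The three modes differ only in what the raw counts are divided by. The sketch below reproduces them by hand from manually computed counts. It assumes half-open bins [lo, hi) and normalizes by the number of samples that actually fall inside the bin range; the variable names are chosen for this illustration only.

```julia
x = randn(10^3)
edges = range(-5, 5, length=21)
width = step(edges)

# Raw counts per bin.
counts = [count(v -> lo <= v < hi, x)
          for (lo, hi) in zip(edges[1:end-1], edges[2:end])]
total = sum(counts)

h_pdf         = counts ./ (total * width)  # total area of the bins is 1
h_probability = counts ./ total            # bin heights sum to 1
h_density     = counts ./ width            # area of each bin equals its count

println("area under :pdf          = ", sum(h_pdf .* width))
println("sum of :probability      = ", sum(h_probability))
println("total area of :density   = ", sum(h_density .* width))
```

With equal-width bins, the three histograms have the same shape and differ only in the scale of the Y-axis.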
Weighted histograms
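The next option we consider is weighted histograms, which are built with the weights attribute. In the example below, the values of x are uniformly distributed on the interval (0, 1) and weighted by an exponential function; the resulting histogram is then compared with the normalized weighting function f_exp.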
f_exp(x) = exp(x)/(exp(1)-1)  # exp(x) normalized to unit area on (0, 1)
x = rand(10^4)                # uniform samples on (0, 1)
w = exp.(x)                   # weight of each sample
histogram(x, bins=:scott, weights=w, normalize=:pdf, color=:gray)
plot!(f_exp, lw=3, color=:red)
plot!(legend=:topleft)
xlims!(0, 1.0)
ylims!(0, 1.6)
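To see why the weighted histogram follows f_exp, the same calculation can be done by hand: each bin accumulates the weights of its samples instead of counting them, and the result is normalized to unit area. This is a sketch under the assumption of ten equal half-open bins on (0, 1), not a description of how the plotting library implements weights internally.

```julia
x = rand(10^4)          # uniform samples on [0, 1)
w = exp.(x)             # weight of each sample

edges = range(0, 1, length=11)   # 11 boundaries, 10 bins
width = step(edges)

# Sum of weights (instead of a plain count) in each bin [lo, hi).
wsum = [sum(w[(lo .<= x) .& (x .< hi)])
        for (lo, hi) in zip(edges[1:end-1], edges[2:end])]

# Normalize so that the total area of the bins is 1.
h_pdf = wsum ./ (sum(wsum) * width)

# Compare with the normalized weighting function at the bin centers.
f_exp(t) = exp(t) / (exp(1) - 1)
centers = (edges[1:end-1] .+ edges[2:end]) ./ 2
max_dev = maximum(abs.(h_pdf .- f_exp.(centers)))

println("largest deviation from f_exp: ", round(max_dev, digits=3))
```

The deviation is random and shrinks as the sample grows, since each bin then averages over more weighted points.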
Other options
- Scatter histograms can be built with scatterhist and scatterhist!, where the bars are replaced by points.
- Step histograms can be built with stephist and stephist!, where an outline replaces the bars.
p1 = histogram(x, title="Bar")
p2 = scatterhist(x, title="Scatter")
p3 = stephist(x, title="Step")
plot(p1, p2, p3, layout=(1, 3), legend=false)
Note: the Y-axis of the scatter histogram does not start at 0 by default.
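The variants ending in ! add a series to an existing plot, which is convenient for comparing two samples on the same axes. The following sketch overlays two step histograms. The samples a and b, the shift of 1.5, and the bin range are all arbitrary choices made for this illustration.

```julia
using Plots

a = randn(10^3)           # sample centered at 0
b = randn(10^3) .+ 1.5    # sample shifted to the right (arbitrary shift)

edges = range(-5, 7, length=25)   # shared bins so the outlines are comparable

# Draw the first outline, then add the second one to the same plot.
plt = stephist(a, bins=edges, normalize=:pdf, label="a", lw=2)
stephist!(plt, b, bins=edges, normalize=:pdf, label="b", lw=2)
```

Outlines overlap without hiding each other, which is why the step form is often easier to read than two sets of filled bars.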
Conclusion
In this example, we looked at several ways of building histograms in Engee and saw that the tool is convenient to work with.