Benford's Law
We are considering the implementation of Benford's law in the Julia programming language using the Fibonacci sequence.
Introduction
Benford's law is an empirical law that describes the distribution of the first digits in sets of numerical data from real life. According to this law, the number 1 occurs as the first significantly more often than other numbers (about 30%), and the number 9 is the least frequent (less than 5%). The law has wide application in the field of data audit, fraud detection and economic analysis.
In this example, we are implementing Benford's law in the Julia programming language. We will also create a sequence of Fibonacci numbers, apply the law to it, and then plot and compare the empirical data with the theoretical distribution.
Algorithm implementation
The Benford rule is given by the formula:
where - the first digit is from 1 to 9, and - the probability of this number being the first.
"""
Вычисляет вероятность первой цифры по закону Бенфорда.
"""
P(d) = log10(1 + 1 / d)
benford(numbers) -- calculates the frequencies of the first digits in a set of numbers.
Returns an array of 9 elements, where the ith element indicates which fraction of the numbers starts with the digit i.
function benford(numbers)
# Получает первую цифру числа n (самую правую цифру после обращения)
firstdigit(n) = last(digits(n))
# Инициализируем массив счетчиков для каждой цифры (1–9)
counts = zeros(9)
# Для каждого числа увеличиваем соответствующий счетчик
foreach(n -> counts[firstdigit(n)] += 1, numbers)
# Нормируем частоты, чтобы их сумма была равна 1
counts ./ sum(counts)
end
We define an iterator to generate a sequence of Fibonacci numbers. Base.iterate it is implemented in such a way that it is possible to work with an infinite stream using big.Int to prevent overflow.
# Создаем тип для обозначения итератора Фибоначчи
struct Fibonacci end
Implementation of the iteration protocol for the type Fibonacci
Base.iterate(::Fibonacci, (a, b) = big.((0, 1))) = b, (b, a + b)
Generating the first 1000 numbers from the Fibonacci sequence
sample = Iterators.take(Fibonacci(), 1000)
We calculate the observed frequencies of the first digits and convert them to percentages.
observed = benford(sample) .* 100
We calculate the theoretical values according to Benford's law and convert them to percentages.
expected = P.(1:9) .* 100
Creating a table for convenient data comparison
table = Real[1:9 observed expected]
We plot the bars for the observed data and the lines for the expected ones.
using Plots
plot(
[observed expected];
title = "Benford's Law",
seriestype = [:bar :line],
linewidth = [0 5],
xticks = 1:9,
xlabel = "first digit",
ylabel = "frequency %",
label = ["1000 Fibonacci numbers" "P(d)=log10(1+1/d)"]
)
We output the results to the console in a well-formatted form.
using Printf
println("Benford's Law\nFrequency of first digit\nin 1000 Fibonacci numbers")
println("digit observed expected")
foreach(i -> @printf("%3d%9.2f%%%9.2f%%\n", table[i, :]...), 1:9)
Conclusion
We have reviewed the implementation of Benford's law in the Julia language. We wrote a function to calculate the distribution of the first digits, applied it to a sample of the first 1000 Fibonacci numbers, and compared the data obtained with the theoretical values of the law. Visual graphs were also constructed, demonstrating the coincidence of the observed statistics with the mathematical law. Such methods can be used to verify the naturalness of the data or to identify possible distortions and fraud.
The example was developed using materials from Rosetta Code