Nonparametric tests
Anderson-Darling test
Both one-sample and $k$-sample tests are available.
HypothesisTests.OneSampleADTest
— Type
OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample Anderson-Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
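As a minimal usage sketch, assuming the Distributions package is installed alongside HypothesisTests, the call below tests a sample against a standard normal. The sample values are made up for illustration.

```julia
using HypothesisTests, Distributions

# Made-up sample; any vector of real numbers works here.
x = [0.31, -1.20, 0.85, 1.47, -0.44, 0.02, -0.73, 0.96, 1.88, -0.15]

# Null hypothesis: x was drawn from a standard normal distribution.
t = OneSampleADTest(x, Normal(0, 1))
p = pvalue(t)
println("Anderson-Darling p-value: ", p)
```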
HypothesisTests.KSampleADTest
— Type
KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)
Perform a $k$-sample Anderson-Darling test of the null hypothesis that the data in the vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.
The modified parameter enables a modified test calculation for samples whose observations do not all coincide.
If nsim is equal to 0 (the default), the asymptotic calculation of the p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed value. This proportion is reported as the p-value estimate.
Implements: pvalue
References
- F. W. Scholz and M. A. Stephens, K-Sample Anderson-Darling Tests, Journal of the American Statistical Association, Vol. 82, No. 399 (Sep. 1987), pp. 918-924.
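The two ways of obtaining a p-value can be sketched as follows. The three samples are illustrative numbers only, and the choice of 2000 simulated splits is arbitrary.

```julia
using HypothesisTests

# Illustrative samples; the vectors may have different lengths.
a = [38.7, 41.5, 43.8, 44.5, 45.5, 46.0, 47.7, 58.0]
b = [39.2, 39.3, 39.7, 41.4, 41.8, 42.9, 43.3, 45.8]
c = [34.0, 35.0, 39.0, 40.0, 43.0, 43.1, 44.0, 45.0]

# nsim = 0 (default): asymptotic p-value.
p_asym = pvalue(KSampleADTest(a, b, c))

# nsim > 0: p-value estimated from random splits of the pooled data,
# so the result varies slightly between runs.
p_sim = pvalue(KSampleADTest(a, b, c; nsim = 2000))

println("asymptotic: ", p_asym, "  simulated: ", p_sim)
```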
Binomial test
HypothesisTests.BinomialTest
— Type
BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)
Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.
By default, confidence intervals are computed as Clopper-Pearson intervals. See the confint(::BinomialTest) documentation for a list of supported methods to compute confidence intervals.
Implements: pvalue, confint(::BinomialTest)
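A short usage sketch of both constructors follows; the counts and the Boolean vector are made-up data.

```julia
using HypothesisTests

# Illustrative counts: 26 successes in 78 draws, tested against p = 0.5.
t = BinomialTest(26, 78, 0.5)
p = pvalue(t)
ci = confint(t)   # Clopper-Pearson interval at level 0.95 by default

# The Bool-vector method counts the `true` entries as successes.
t_vec = BinomialTest([true, false, true, true, false, true], 0.5)
p_vec = pvalue(t_vec)

println("p = ", p, ", 95% CI = ", ci)
```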
StatsAPI.confint
— Method
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)
Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:
- :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage of level; it is usually too conservative.
- :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
- :waldcc: Wald interval with a continuity correction that extends the interval by 1/(2n) on both ends.
- :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverage even for small numbers of draws and extreme success probabilities.
- :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
- :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
- :arcsine: Confidence interval computed using the arcsine transformation to make $\operatorname{var}(p)$ independent of the probability $p$.
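To see how the methods listed above differ in practice, the following sketch prints the interval from each one for the same made-up test (7 successes in 20 draws). The variable name ci_methods is only for illustration.

```julia
using HypothesisTests

t = BinomialTest(7, 20)   # illustrative data, default p = 0.5

# The method symbols documented above.
ci_methods = (:clopper_pearson, :wald, :waldcc, :wilson,
              :jeffrey, :agresti_coull, :arcsine)

for m in ci_methods
    lo, hi = confint(t; level = 0.95, method = m)
    println(rpad(string(m), 16), " (", round(lo; digits = 3), ", ",
            round(hi; digits = 3), ")")
end
```

With small n the intervals can differ noticeably, which is the situation in which the choice of method matters most.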
References
- Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.
- Pires, Ana & Amado, Conceição. (2008). Interval Estimators for a Binomial Proportion: Comparison of Twenty Methods. REVSTAT. 6. 10.57805/revstat.v6i2.63.
Fisher exact test
HypothesisTests.FisherExactTest
— Type
FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)
Perform Fisher's exact test of the null hypothesis that the success probabilities $a/c$ and $b/d$ are equal, that is, the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.
See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.
The contingency table is structured as:
- | X1 | X2 |
---|---|---|
Y1 | a | b |
Y2 | c | d |
Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)
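A minimal usage sketch with a made-up 2x2 table, laid out as in the contingency table above:

```julia
using HypothesisTests

# Illustrative table:
#        X1  X2
#   Y1    8   2
#   Y2    1   5
t = FisherExactTest(8, 2, 1, 5)

p  = pvalue(t)                  # two-sided, method = :central by default
ci = confint(t; level = 0.95)   # two-sided central interval

println("p = ", p, ", 95% CI = ", ci)
```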
References
- Fay, M.P., Supplementary material to "Confidence intervals that match Fisher's exact or Blaker's exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.
StatsAPI.confint
— Method
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)
Compute a confidence interval with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).
Note: Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.
References
- Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
- Fay, M.P., Supplementary material to "Confidence intervals that match Fisher's exact or Blaker's exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.
StatsAPI.pvalue
— Method
pvalue(x::FisherExactTest; tail = :both, method = :central)
Compute the p-value for a given Fisher exact test.
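The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_\omega(i)$ with odds ratio $\omega$:
$$
\begin{aligned}
p_\omega^{(\text{left})} &= \sum_{i \le a} f_\omega(i) \\
p_\omega^{(\text{right})} &= \sum_{i \ge a} f_\omega(i)
\end{aligned}
$$
For tail = :both, possible values for method are:
- :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
- :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all tables with the same marginals that are equally or less probable:
$$
p_\omega = \sum_{f_\omega(i) \le f_\omega(a)} f_\omega(i)
$$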
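The relationship between the one-sided and two-sided p-values can be checked with a short sketch. The table is made up, the tail values :left and :right are the usual one-sided options, and capping the doubled value at 1 is an assumption made here so that the comparison stays a valid probability.

```julia
using HypothesisTests

t = FisherExactTest(8, 2, 1, 5)   # illustrative table

p_left    = pvalue(t; tail = :left)
p_right   = pvalue(t; tail = :right)
p_central = pvalue(t; tail = :both, method = :central)
p_minlike = pvalue(t; tail = :both, method = :minlike)

# :central doubles the smaller one-sided p-value (capped at 1 in this sketch).
doubled = min(1, 2 * min(p_left, p_right))

println("left = ", p_left, ", right = ", p_right)
println("central = ", p_central, ", doubled one-sided = ", doubled)
println("minlike = ", p_minlike)
```

Both one-sided sums include the observed table, so p_left + p_right is at least 1.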
References
- Gibbons, J.D., Pratt, J.W., P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
- Fay, M.P., Supplementary material to "Confidence intervals that match Fisher's exact or Blaker's exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.
Kolmogorov-Smirnov test
An exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests are available.
HypothesisTests.ExactOneSampleKSTest
— Type
ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform a one-sample exact Kolmogorov-Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateOneSampleKSTest
— Type
ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)
Perform an asymptotic one-sample Kolmogorov-Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.
Implements: pvalue
HypothesisTests.ApproximateTwoSampleKSTest
— Type
ApproximateTwoSampleKSTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an asymptotic two-sample Kolmogorov-Smirnov test of the null hypothesis that x and y are drawn from the same distribution against the alternative hypothesis that they come from different distributions.
Implements: pvalue
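The three Kolmogorov-Smirnov variants can be used as in the sketch below, which assumes the Distributions package is available. The data are made up and contain no ties.

```julia
using HypothesisTests, Distributions

# Made-up, tie-free samples.
x = [0.12, -0.85, 1.34, 0.47, -1.62, 0.91, -0.23, 2.05, -0.56, 0.68]
y = [1.45, 0.32, 2.71, 1.08, 0.77, 3.14, 1.93, 0.05, 2.26, 1.59]

p_exact  = pvalue(ExactOneSampleKSTest(x, Normal(0, 1)))
p_approx = pvalue(ApproximateOneSampleKSTest(x, Normal(0, 1)))
p_two    = pvalue(ApproximateTwoSampleKSTest(x, y))

println("exact one-sample:      ", p_exact)
println("asymptotic one-sample: ", p_approx)
println("asymptotic two-sample: ", p_two)
```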
Kruskal-Wallis rank sum test
HypothesisTests.KruskalWallisTest
— Type
KruskalWallisTest(groups::AbstractVector{<:Real}...)
Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.
The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.
The p-value is computed using a $\chi^2$ approximation to the distribution of the test statistic $H_c = H/C$:
$$
\begin{aligned}
H &= \frac{12}{n(n+1)} \sum_{g=1}^{k} \frac{R_g^2}{n_g} - 3(n+1) \\
C &= 1 - \frac{1}{n^3 - n} \sum_{t \in \mathcal{T}} (t^3 - t)
\end{aligned}
$$
where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.
Implements: pvalue
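The sketch below runs the test on three made-up, tie-free groups and then recomputes the standard Kruskal-Wallis statistic $H = \frac{12}{n(n+1)} \sum_g R_g^2 / n_g - 3(n+1)$ by hand. Without ties no correction is needed, so this is the whole statistic. The manual part uses only Base Julia; its variable names are illustrative.

```julia
using HypothesisTests

g1 = [2.9, 3.0, 2.5, 2.6, 3.2]
g2 = [3.8, 2.7, 4.0, 2.4]
g3 = [2.8, 3.4, 3.7, 2.2, 2.0]

p = pvalue(KruskalWallisTest(g1, g2, g3))

# Manual computation of H for tie-free data.
pooled = vcat(g1, g2, g3)
ranks  = invperm(sortperm(pooled))        # rank of each pooled observation
n      = length(pooled)
sizes  = [length(g) for g in (g1, g2, g3)]
stops  = cumsum(sizes)
starts = stops .- sizes .+ 1
R      = [sum(ranks[a:b]) for (a, b) in zip(starts, stops)]   # rank sums per group
H      = 12 / (n * (n + 1)) * sum(R .^ 2 ./ sizes) - 3 * (n + 1)

println("rank sums = ", R, ", H = ", round(H; digits = 4), ", p = ", p)
```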
References
- Meyer, J.P., Seaman, M.A., Expanded tables of critical values for the Kruskal-Wallis H statistic. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, April 2006.
Mann-Whitney U test
HypothesisTests.MannWhitneyUTest
— Function
MannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, MannWhitneyUTest performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest or ApproximateMannWhitneyUTest directly.
Implements: pvalue
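The selection rule can be observed with a small sketch. The data are made up, small, and tie-free, so according to the rule above the generic constructor is expected to return an exact test; the approximate variant is then requested explicitly for comparison.

```julia
using HypothesisTests

# Made-up, tie-free samples (10 observations in total).
x = [1.1, 2.3, 3.7, 4.2, 5.9]
y = [2.8, 3.1, 6.4, 7.5, 8.6]

t = MannWhitneyUTest(x, y)
println(typeof(t))            # shows which variant was selected
p_auto = pvalue(t)

# Force the normal approximation regardless of sample size.
p_approx = pvalue(ApproximateMannWhitneyUTest(x, y))

println("auto: ", p_auto, "  approximate: ", p_approx)
```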
HypothesisTests.ExactMannWhitneyUTest
— Type
ExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
When there are no tied ranks, the exact p-value is computed using the pwilcox function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
Implements: pvalue
HypothesisTests.ApproximateMannWhitneyUTest
— Type
ApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.
The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic:
$$
\begin{aligned}
\mu &= \frac{n_x n_y}{2} \\
\sigma^2 &= \frac{n_x n_y}{12} \left( n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)} \right) \\
a &= \sum_{t \in \mathcal{T}} (t^3 - t)
\end{aligned}
$$
where $n_x$ and $n_y$ are the numbers of observations in x and y, and $\mathcal{T}$ is the set of the counts of tied values at each tied position.
Implements: pvalue
Sign test
HypothesisTests.SignTest
— Type
SignTest(x::AbstractVector{T<:Real}, median::Real = 0)
SignTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, median::Real = 0)
Perform a sign test of the null hypothesis that the distribution from which x (or x - y if y is provided) was drawn has median median against the alternative hypothesis that the median is not equal to median.
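Both call forms are sketched below with made-up measurements; the hypothesized median of 10.0 is arbitrary.

```julia
using HypothesisTests

# One-sample form: is the median of x equal to 10.0?
x = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 12.7, 11.8]
p_one = pvalue(SignTest(x, 10.0))

# Paired form: is the median of after - before equal to 0?
before = [9.1, 11.9, 8.2, 10.0, 11.6, 10.4, 10.3, 10.5]
after  = [10.2, 11.5, 9.8, 12.1, 10.9, 11.7, 13.0, 9.5]
p_paired = pvalue(SignTest(after, before))

println("one-sample: ", p_one, "  paired: ", p_paired)
```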
Wald-Wolfowitz independence test
HypothesisTests.WaldWolfowitzTest
— Type
WaldWolfowitzTest(x::AbstractVector{Bool})
WaldWolfowitzTest(x::AbstractVector{<:Real})
Perform the Wald-Wolfowitz (or Runs) test of the null hypothesis that the given data is random, or independently sampled. The data can come as many-valued or two-valued (Boolean). If many-valued, the sample is transformed by labelling each element as above or below the median.
Implements: pvalue
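A brief sketch of both input forms follows. The sequences are made up; the real-valued one alternates between low and high values, so it produces many runs once it is split at the median.

```julia
using HypothesisTests

# Two-valued input is used as given.
flips = [true, true, false, false, true, false, true, true, false, false, false, true]
p_bool = pvalue(WaldWolfowitzTest(flips))

# Many-valued input is first labelled as above or below its median.
series = [0.5, 1.9, 0.3, 2.2, 0.8, 2.5, 0.1, 1.7, 0.6, 2.8, 0.4, 2.0]
p_real = pvalue(WaldWolfowitzTest(series))

println("Boolean input: ", p_bool, "  real-valued input: ", p_real)
```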
Wilcoxon signed rank test
HypothesisTests.SignedRankTest
— Function
SignedRankTest(x::AbstractVector{<:Real})
SignedRankTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
When there are no tied ranks and ≤50 samples, or tied ranks and ≤15 samples, SignedRankTest performs an exact signed rank test. In all other cases, SignedRankTest performs an approximate signed rank test. Behavior may be further controlled by using ExactSignedRankTest or ApproximateSignedRankTest directly.
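A paired-sample usage sketch with made-up data follows. The generic constructor chooses the variant, and the approximate test is also called directly for comparison.

```julia
using HypothesisTests

# Made-up paired measurements.
x = [10.2, 11.5, 9.8, 12.1, 10.9, 11.7, 13.0, 9.5]
y = [9.1, 11.9, 8.2, 10.0, 11.6, 10.4, 10.3, 10.4]

t = SignedRankTest(x, y)      # tests whether x - y has zero median
println(typeof(t))            # shows which variant was selected
p_auto = pvalue(t)

p_approx = pvalue(ApproximateSignedRankTest(x, y))

println("auto: ", p_auto, "  approximate: ", p_approx)
```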
HypothesisTests.ExactSignedRankTest
— Type
ExactSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform an exact Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
When there are no tied ranks, the exact p-value is computed using the psignrank function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.
HypothesisTests.ApproximateSignedRankTest
— Type
ApproximateSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])
Perform an approximate Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.
The p-value is computed using a normal approximation to the distribution of the signed rank statistic:
$$
\begin{aligned}
\mu &= \frac{n(n + 1)}{4} \\
\sigma^2 &= \frac{n(n + 1)(2n + 1)}{24} - \frac{a}{48} \\
a &= \sum_{t \in \mathcal{T}} (t^3 - t)
\end{aligned}
$$
where $\mathcal{T}$ is the set of the counts of tied values at each tied position.
Permutation test
HypothesisTests.ExactPermutationTest
— Function
ExactPermutationTest(x::Vector, y::Vector, f::Function)
Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). All possible permutations are sampled.
HypothesisTests.ApproximatePermutationTest
— Function
ApproximatePermutationTest([rng::AbstractRNG,] x::Vector, y::Vector, f::Function, n::Int)
Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). n of the factorial(length(x)+length(y)) permutations are sampled at random. A random number generator can optionally be passed as the first argument. The default generator is Random.default_rng().
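The following sketch compares the means of two small made-up samples with both variants. It uses mean from the Statistics standard library as f and a seeded MersenneTwister so the approximate result is reproducible; the seed and the 1000 sampled permutations are arbitrary choices. The samples are kept short because the exact test enumerates every permutation of the pooled data (8! = 40320 here).

```julia
using HypothesisTests, Random, Statistics

x = [5.1, 6.3, 5.8, 7.0]
y = [4.2, 4.9, 5.5, 4.4]

# Exact: enumerates all permutations of the 8 pooled observations.
p_exact = pvalue(ExactPermutationTest(x, y, mean))

# Approximate: samples 1000 random permutations with an explicit RNG.
rng = MersenneTwister(42)
p_approx = pvalue(ApproximatePermutationTest(rng, x, y, mean, 1000))

println("exact: ", p_exact, "  approximate: ", p_approx)
```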
Fligner-Killeen test
HypothesisTests.FlignerKilleenTest
— Function
FlignerKilleenTest(groups::AbstractVector{<:Real}...)
Perform the Fligner-Killeen median test of the null hypothesis that the groups have equal variances, a test for homogeneity of variances.
In the comparison cited in the references, this test was among the most robust against departures from normality. It is a $k$-sample simple linear rank method that uses the ranks of the absolute values of the centered samples and weights
$$
a_{N,i} = \Phi^{-1}\left( \frac{1}{2} + \frac{i}{2(N+1)} \right)
$$
The version implemented here uses median centering in each of the samples.
Implements: pvalue
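A minimal usage sketch with three made-up groups of visibly different spread:

```julia
using HypothesisTests

# Made-up groups: similar centers, increasing spread.
g1 = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
g2 = [10.9, 8.7, 11.4, 9.1, 10.5, 9.4]
g3 = [13.2, 6.5, 12.1, 7.9, 14.0, 6.1]

p = pvalue(FlignerKilleenTest(g1, g2, g3))
println("Fligner-Killeen p-value: ", p)
```

A small p-value here would indicate evidence against equal variances across the groups.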
References
- Conover, W. J., Johnson, M. E., Johnson, M. M., A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics, 23, 351–361, 1981.