Title: Tools for Biostatistics, Public Policy, and Law
Description: Statistical tests widely utilized in biostatistics, public policy, and law. Along with the well-known tests for equality of means and variances, randomness, and measures of relative variability, the package contains new robust tests of symmetry, omnibus and directional tests of normality, and their graphical counterparts such as robust QQ plot, robust trend tests for variances, etc. All implemented tests and methods are illustrated by simulations and real-life examples from legal statistics, economics, and biostatistics.
Authors: Joseph L. Gastwirth [aut], Yulia R. Gel [aut, cre], W. L. Wallace Hui [aut], Vyacheslav Lyubchich [aut]
Maintainer: Yulia R. Gel <[email protected]>
License: GPL (>= 2)
Version: 3.6
Built: 2025-02-24 03:16:57 UTC
Source: https://github.com/vlyubchich/lawstat
Bartels (1982) test for randomness that is based
on the ranked version of von Neumann's ratio (RVN).
Users can choose whether to test against two-sided, negative,
or positive correlation. NAs from the data are omitted.
bartels.test(
  y,
  alternative = c("two.sided", "positive.correlated", "negative.correlated")
)
y |
a numeric vector of data values. |
alternative |
a character string specifying the alternative hypothesis; must be one of "two.sided" (default), "positive.correlated", or "negative.correlated". |
A list of class "htest" with the following components:
statistic |
the value of the standardized Bartels statistic. |
parameter |
RVN ratio. |
p.value |
the p-value of the test. |
data.name |
a character string giving the names of the data. |
alternative |
a character string describing the alternative hypothesis. |
Kimihiro Noguchi, Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Bartels R (1982). “The rank version of von Neumann's ratio test for randomness.” Journal of the American Statistical Association, 77(377), 40–46. doi:10.1080/01621459.1982.10477764.
## Simulate 100 observations from an autoregressive model of
## the first order AR(1)
y = arima.sim(n = 100, list(ar = c(0.5)))

## Test y for randomness
bartels.test(y)

## Sample Output
##
## Bartels Test - Two sided
## data: y
## Standardized Bartels Statistic -4.4929, RVN Ratio =
## 1.101, p-value = 7.024e-06
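As a rough illustration of the statistic described above, the following base-R sketch computes the rank von Neumann ratio and a standardized value. The helper name `rvn_sketch` is hypothetical, and the normal approximation with mean 2 and variance 4/n is an assumption based on the large-sample result in Bartels (1982); the package's own `bartels.test` may differ in detail.

```r
# Hypothetical helper (not part of lawstat): rank von Neumann ratio.
rvn_sketch <- function(y) {
  y <- y[!is.na(y)]                 # omit NAs
  n <- length(y)
  r <- rank(y)                      # ranks of the observations
  # Squared successive rank differences over the rank sum of squares
  rvn <- sum(diff(r)^2) / sum((r - mean(r))^2)
  # Assumed large-sample approximation: RVN ~ N(2, 4 / n)
  z <- (rvn - 2) / sqrt(4 / n)
  list(rvn = rvn, z = z, p.two.sided = 2 * pnorm(-abs(z)))
}

# A strictly increasing series is strongly positively correlated,
# so the ratio falls well below 2 and z is negative.
res <- rvn_sketch(1:10)
print(res)
```

Values of the ratio near 2 are consistent with randomness; values well below 2 point to positive serial correlation, and values well above 2 to negative serial correlation.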
Prediction errors of 48-hour ahead MM5 forecasts of surface temperature measured at 96 locations in the US Pacific Northwest on 3-January-2000. The prediction error, or "bias", is the difference between the forecasted and observed surface temperature. (MM5 is the fifth-generation Pennsylvania State University – National Center for Atmospheric Research Mesoscale Model.)
data(bias)
A numeric vector of length 96.
The data were kindly provided by the research group of Professor Clifford Mass in the Department of Atmospheric Sciences at the University of Washington. Detailed information about the Pacific Northwest prediction effort and the associated data archive can be found online at https://a.atmos.uw.edu/mm5rt/info.html and https://atmos.uw.edu/marka/pnw.html, respectively.
Number of black and white candidates (hired or rejected) for eight professions (Gastwirth 1984).
data(blackhire)
An array with 2 rows by 2 columns by 8 levels.
Gastwirth JL (1984). “Statistical methods for analyzing claims of employment discrimination.” ILR Review, 38(1), 75–86. doi:10.1177/001979398403800108.
The Brunner–Munzel test for stochastic equality of two samples,
which is also known as the Generalized Wilcoxon test.
NAs from the data are omitted.
brunner.munzel.test(
  x,
  y,
  alternative = c("two.sided", "greater", "less"),
  alpha = 0.05
)
x |
a numeric vector of data values from sample 1. |
y |
a numeric vector of data values from sample 2. |
alternative |
a character string specifying the alternative hypothesis; must be one of "two.sided" (default), "greater", or "less". |
alpha |
significance level; the default is 0.05, which gives a 95% confidence interval. |
There exist discrepancies with Brunner and Munzel (2000) because there is a typo in the paper. The corrected version is in Neubert and Brunner (2007) (e.g., compare the estimates for the case study on pain scores). The current function follows Neubert and Brunner (2007).
A list of class "htest" with the following components:
statistic |
the Brunner–Munzel test statistic. |
parameter |
the degrees of freedom. |
conf.int |
the confidence interval. |
p.value |
the p-value of the test. |
data.name |
a character string giving the name of the data. |
estimate |
an estimate of the effect size, i.e., P(X < Y) + 0.5 * P(X = Y). |
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao. This function was updated with the help of Dr. Ian Fellows.
Brunner E, Munzel U (2000). “The nonparametric Behrens–Fisher problem: asymptotic theory and a small-sample approximation.” Biometrical Journal, 42(1), 17–25.
Neubert K, Brunner E (2007). “A studentized permutation test for the non-parametric Behrens–Fisher problem.” Computational Statistics & Data Analysis, 51(10), 5192–5204. doi:10.1016/j.csda.2006.05.024.
## Pain score on the third day after surgery for 14 patients under
## the treatment Y and 11 patients under the treatment N
## (see Brunner and Munzel, 2000; Neubert and Brunner, 2007).
Y <- c(1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 1, 1)
N <- c(3, 3, 4, 3, 1, 2, 3, 1, 1, 5, 4)

brunner.munzel.test(Y, N)

## Brunner-Munzel Test
## data: Y and N
## Brunner-Munzel Test Statistic = 3.1375, df = 17.683, p-value = 0.005786
## 95 percent confidence interval:
## 0.5952169 0.9827052
## sample estimates:
## P(X<Y)+.5*P(X=Y)
## 0.788961
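The effect-size estimate reported by the test can be illustrated directly from its definition. The sketch below is base R only; the helper name `relative_effect` is hypothetical and is not the package's implementation. It covers just the estimate, not the test statistic or confidence interval.

```r
# Hypothetical helper (not part of lawstat): estimate of
# P(X < Y) + 0.5 * P(X = Y) from all pairwise comparisons.
relative_effect <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  # Each pair contributes 1 if x < y, one half if tied, 0 otherwise
  mean(outer(x, y, "<") + 0.5 * outer(x, y, "=="))
}

# Pain score data from the example above
Y <- c(1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 1, 1)
N <- c(3, 3, 4, 3, 1, 2, 3, 1, 1, 5, 4)
p_hat <- relative_effect(Y, N)
print(p_hat)
```

For these data the pairwise count is 121.5 out of 154 comparisons, which agrees with the sample estimate of 0.788961 shown in the example output. A value of 0.5 corresponds to stochastic equality of the two samples.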
Measure of relative inequality (or relative variation) of the data.
The coefficient of dispersion (CD) is the ratio of the mean absolute deviation from
the median (MAAD) to the median of the data. NAs from the data are omitted.
See Gastwirth (1988) and Bonett and Seier (2006).
cd(x)
x |
a numeric vector of data values. |
The coefficient of dispersion.
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Bonett DG, Seier E (2006). “Confidence interval for a coefficient of dispersion in nonnormal distributions.” Biometrical Journal, 48(1), 144–148. doi:10.1002/bimj.200410148.
Gastwirth JL (1988). Statistical Reasoning in Law and Public Policy: Statistical Concepts and Issues of Fairness, volume 1. Academic Press, San Diego, CA.
## The Baker v. Carr Case: one-person-one-vote decision.
## Measure of Relative Inequality of Population data in 33 districts
## of the Tennessee Legislature in 1900 and 1972. See
## popdata (see Gastwirth, 1988).
data(popdata)
cd(popdata[,"pop1900"])
cd(popdata[,"pop1972"])
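The definition given above (mean absolute deviation from the median, divided by the median) can be written out in a few lines of base R. The helper name `cd_sketch` is hypothetical; this is a sketch of the stated definition, not a copy of the package code.

```r
# Hypothetical helper (not part of lawstat): coefficient of dispersion.
cd_sketch <- function(x) {
  x <- x[!is.na(x)]            # omit NAs
  m <- median(x)
  mean(abs(x - m)) / m         # MAAD divided by the median
}

x <- c(1, 2, 3, 4, 10)
# Median is 3; absolute deviations are 2, 1, 0, 1, 7 with mean 2.2
cd_value <- cd_sketch(x)
print(cd_value)
```

Because the measure divides by the median, it is only meaningful for data whose median is positive, such as population counts.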
The Cochran–Mantel–Haenszel (CMH) procedure tests homogeneity of population proportions after taking into account other factors. This procedure is widely used in legal cases, for example, on equal employment and discrimination, and in biological and pharmaceutical studies.
cmh.test(x)
x |
a numeric 2 × 2 × k array of data values. |
The test is based on the CMH procedure discussed by Gastwirth (1984). The data should be input as an array of 2 rows × 2 columns × k levels.
The output includes the Mantel–Haenszel estimate, the pooled odds ratio, and the odds ratio between the rows and columns at each level. The chi-square test of significance tests whether there is an interaction or association between rows and columns.
The null hypothesis is that the pooled odds ratio is equal to 1, i.e., there is no interaction between rows and columns. For more details see Gastwirth (1984).
The cmh.test can be viewed as a subset of mantelhaen.test, in the sense that cmh.test is for a 2 by 2 by k table without continuity correction, whereas mantelhaen.test allows for a larger table, and for a 2 by 2 by k table, it has an option of performing continuity correction. However, in view of Gastwirth (1984), continuity correction is not recommended as it tends to overestimate the p-value.
A list of class "htest" containing the following components:
MH.ESTIMATE |
the value of the Cochran–Mantel–Haenszel estimate. |
OR |
pooled odds ratio of the data. |
ORK |
vector of odds ratios, one for each level. |
cmh |
the test statistic. |
df |
degrees of freedom. |
p.value |
the p-value of the test. |
method |
type of the performed test. |
data.name |
a character string giving the name of the data. |
Min Qin, Wallace W. Hui, Yulia R. Gel, Joseph L. Gastwirth
Gastwirth JL (1984). “Statistical methods for analyzing claims of employment discrimination.” ILR Review, 38(1), 75–86. doi:10.1177/001979398403800108.
## Sample Salary Data
data(blackhire)
cmh.test(blackhire)
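To make the quantities above concrete, here is a minimal base-R sketch of the Mantel–Haenszel estimate and the chi-square statistic without continuity correction for a 2 × 2 × k array. The helper name `cmh_sketch` and the small example table are hypothetical; the calculation uses the textbook formulas and is checked against `stats::mantelhaen.test(correct = FALSE)`, not against the package's own code.

```r
# Hypothetical helper (not part of lawstat): CMH quantities for a 2 x 2 x k array.
cmh_sketch <- function(x) {
  stopifnot(length(dim(x)) == 3, all(dim(x)[1:2] == 2))
  a  <- x[1, 1, ]; b <- x[1, 2, ]
  cc <- x[2, 1, ]; d <- x[2, 2, ]
  n  <- a + b + cc + d                       # total count in each level
  # Mantel-Haenszel common odds ratio estimate
  mh <- sum(a * d / n) / sum(b * cc / n)
  # Expected value and variance of the (1, 1) cell under the null
  expected <- (a + b) * (a + cc) / n
  variance <- (a + b) * (cc + d) * (a + cc) * (b + d) / (n^2 * (n - 1))
  stat <- sum(a - expected)^2 / sum(variance)  # no continuity correction
  list(mh.estimate = mh,
       or.by.level = (a * d) / (b * cc),
       statistic   = stat,
       p.value     = pchisq(stat, df = 1, lower.tail = FALSE))
}

# Made-up counts for three levels (illustration only)
tab <- array(c(10, 5, 3, 12,  8, 6, 4, 9,  7, 3, 2, 11), dim = c(2, 2, 3))
res <- cmh_sketch(tab)
ref <- mantelhaen.test(tab, correct = FALSE)
print(res)
```

Comparing `res$statistic` with `ref$statistic` is a quick way to confirm that the uncorrected statistic is the one being computed.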
Number of senators and representatives and population size in 23 districts in the United States of America in 1963 (Gastwirth 1972).
data(data1963)
A data frame with 23 observations on the following 3 variables:
pop1963
population in 1963;
sen1963
number of senators in the district in 1963;
rep1963
number of representatives in the district in 1963.
Gastwirth (1972).
Gastwirth JL (1972). “The estimation of the Lorenz curve and Gini index.” The Review of Economics and Statistics, 54(3), 306–316.
Gini index for measuring relative inequality (or relative variation) of the data
(Gini 1912). NAs from the data are omitted.
gini.index(x)
x |
the input data. |
See also Gastwirth (1988).
A list with the following components:
statistic |
the Gini index. |
parameter |
the mean difference of the set of numbers. |
data.name |
a character string giving the name of the data. |
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Gastwirth JL (1988). Statistical Reasoning in Law and Public Policy: Statistical Concepts and Issues of Fairness, volume 1. Academic Press, San Diego, CA.
Gini C (1912). “Variabilita e mutabilita.” Reprinted in Memorie di Metodologica Statistica (Ed. Pizetti E. and Salvemini, T.), 1955, Rome: Libreria Eredi Virgilio Veschi. English translation in Metron, 2005, 63(1): 3–38.
## The Baker v. Carr Case: one-person-one-vote decision.
## Measure of Relative Inequality of Population data in 33 districts
## of the Tennessee Legislature in 1900 and 1972. See
## popdata (see Gastwirth (1988)).
data(popdata)
gini.index(popdata[,"pop1900"])
gini.index(popdata[,"pop1972"])
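The relationship between the Gini index and the mean difference can be sketched in base R as follows. The helper name `gini_sketch` is hypothetical, and the divisor n(n - 1) for the mean difference is an assumption; some definitions divide by n^2 instead, and the package's `gini.index` may use either.

```r
# Hypothetical helper (not part of lawstat): Gini index via the mean difference.
gini_sketch <- function(x) {
  x <- x[!is.na(x)]                           # omit NAs
  n <- length(x)
  # Mean absolute difference over all ordered pairs i != j
  # (assumed divisor n * (n - 1))
  mean.diff <- sum(abs(outer(x, x, "-"))) / (n * (n - 1))
  list(gini = mean.diff / (2 * mean(x)), mean.difference = mean.diff)
}

g <- gini_sketch(c(1, 2, 3, 4))
print(g)
```

With perfectly equal data the mean difference, and hence the index, is zero; larger values indicate greater relative inequality.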
Computes the average absolute deviation from the sample median,
which is a consistent robust estimate of the population standard deviation
for normally distributed data (Gastwirth 1982).
NAs from the data are omitted.
j.maad(x)
x |
a numeric vector of data values. |
Robust standard deviation.
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Gastwirth JL (1982). “Statistical properties of a measure of tax assessment uniformity.” Journal of Statistical Planning and Inference, 6(1), 1–12. doi:10.1016/0378-3758(82)90050-7.
cd, gini.index, rqq, rjb.test, sj.test
## Sample 100 observations from the standard normal distribution
x = rnorm(100)
j.maad(x)
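For normal data the expected absolute deviation from the center equals sigma * sqrt(2 / pi), so the average absolute deviation has to be rescaled by sqrt(pi / 2) to estimate sigma. The base-R sketch below shows that rescaling; the helper name `j_sketch` is hypothetical, and whether `j.maad` applies exactly this factor is an assumption here.

```r
# Hypothetical helper (not part of lawstat): scaled average absolute
# deviation from the median.
j_sketch <- function(x) {
  x <- x[!is.na(x)]                          # omit NAs
  # sqrt(pi / 2) makes the estimate consistent for sigma under normality
  sqrt(pi / 2) * mean(abs(x - median(x)))
}

set.seed(1)
x <- rnorm(1e5, mean = 3, sd = 2)
j_value <- j_sketch(x)                       # should be close to 2
print(j_value)
```

Because the estimate is built on the median and absolute deviations, a few extreme observations affect it far less than they affect the usual sample standard deviation.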
Goodness-of-fit test statistics A2 (Anderson–Darling), W2 (Cramer–von Mises), U2 (Watson), D (Kolmogorov–Smirnov), and V (Kuiper).
By default, NAs are omitted. For the tables of critical values, see
Stephens (1986) and
Puig and Stephens (2000).
laplace.test(y)
y |
a numeric vector of data values. |
The function originally used the plaplace function from the R package VGAM (Yee 2019); however, to resolve dependencies between packages, the plaplace function was copied entirely to the current package under the name VGAM_plaplace.
A list with the following numeric components:
A2 |
the Anderson–Darling statistic. |
W2 |
the Cramer–von Mises statistic. |
U2 |
the Watson statistic. |
D |
the Kolmogorov–Smirnov statistic. |
V |
the Kuiper statistic. |
Kimihiro Noguchi, Yulia R. Gel
Puig P, Stephens MA (2000). “Tests of fit for the Laplace distribution, with applications.” Technometrics, 42(4), 417–424. doi:10.1080/00401706.2000.10485715.
Stephens MA (1986). “Tests for the Uniform Distribution.” In D'Agostino RB, Stephens MA (eds.), Goodness-of-fit Techniques, volume 68 of Statistics, textbooks and monographs, chapter 8. Marcel Dekker, New York.
Yee T (2019). VGAM: Vector Generalized Linear and Additive Models. R package version 1.1-2, https://CRAN.R-project.org/package=VGAM.
## Differences in flood levels example taken from Puig and Stephens (2000)
y <- c(1.96,1.97,3.60,3.80,4.79,5.66,5.76,5.78,6.27,6.30,6.76,7.65,7.84,7.99,8.51,9.18,
       10.13,10.24,10.25,10.43,11.45,11.48,11.75,11.81,12.33,12.78,13.06,13.29,13.98,14.18,
       14.40,16.22,17.06)
laplace.test(y)$D
## [1] 0.9177726

## The critical value at the 0.05 significance level is approximately 0.906.
## Thus, the null hypothesis should be rejected at the 0.05 level.
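As a hedged illustration of how the D and V statistics can be formed, the sketch below fits a Laplace distribution by the sample median and the mean absolute deviation from it, then compares the fitted distribution function with the empirical one. Both helper names are hypothetical, the parameter estimates and the sqrt(n) scaling are assumptions following the usual convention for tabulated critical values, and the result is not claimed to match `laplace.test` digit for digit.

```r
# Hypothetical helper: Laplace distribution function (stand-in for plaplace).
plaplace_sketch <- function(q, location, scale) {
  z <- (q - location) / scale
  ifelse(z < 0, 0.5 * exp(z), 1 - 0.5 * exp(-z))
}

# Hypothetical helper: Kolmogorov-Smirnov (D) and Kuiper (V) statistics.
laplace_ks_sketch <- function(y) {
  y <- sort(y[!is.na(y)])                   # omit NAs and order the sample
  n <- length(y)
  a <- median(y)                            # assumed location estimate
  b <- mean(abs(y - a))                     # assumed scale estimate
  z <- plaplace_sketch(y, a, b)             # fitted CDF at the order statistics
  d.plus  <- max(seq_len(n) / n - z)        # EDF above the fitted CDF
  d.minus <- max(z - (seq_len(n) - 1) / n)  # EDF below the fitted CDF
  # Assumed scaling by sqrt(n)
  list(D = sqrt(n) * max(d.plus, d.minus),
       V = sqrt(n) * (d.plus + d.minus))
}

set.seed(3)
out <- laplace_ks_sketch(runif(50, 0, 10))
print(out)
```

D measures the largest one-sided gap between the two distribution functions, while V adds the largest gaps in both directions, so V is never smaller than D.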
Tests equality of the population variances.
levene.test(
  y,
  group,
  location = c("median", "mean", "trim.mean"),
  trim.alpha = 0.25,
  bootstrap = FALSE,
  num.bootstrap = 1000,
  kruskal.test = FALSE,
  correction.method = c("none", "correction.factor", "zero.removal", "zero.correction")
)
y |
a numeric vector of data values. |
group |
factor of the data. |
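location |
the default option is "median"; the other options are "mean" and "trim.mean". |
trim.alpha |
the fraction (0 to 0.5) of observations to be trimmed from each end of the data before the trimmed mean is computed; the default is 0.25. |
bootstrap |
a logical value identifying whether to implement bootstrap. The default is FALSE. |
num.bootstrap |
number of bootstrap samples to be drawn when the bootstrap option is TRUE; the default is 1000. |
kruskal.test |
logical value indicating whether to use the Kruskal–Wallis statistic. The default option is FALSE. |
correction.method |
procedures to make the test more robust; the default option is "none"; the other options are "correction.factor", "zero.removal", and "zero.correction". |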
The test statistic is based on
the classical Levene's procedure (using the group means),
the modified Brown–Forsythe Levene-type procedure (using the group medians),
or the modified Levene-type procedure (using the group trimmed means).
More robust versions of the test using the correction factor or structural zero
removal method are also available. Two options for calculating critical values,
namely, approximated and bootstrapped, are available.
By default, NAs are omitted from the data.
A list of class "htest" with the following components:
statistic |
the value of the test statistic. |
p.value |
the p-value of the test. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
non.bootstrap.p.value |
the p-value of the test without the bootstrap method. |
Instead of the ANOVA statistic suggested by Levene, the Kruskal–Wallis ANOVA may also be applied using this function (see the parameter kruskal.test).
Modified from a response posted by Brian Ripley to the R-help e-mail list.
Kimihiro Noguchi, W. Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Brown MB, Forsythe AB (1974). “Robust tests for the equality of variances.” Journal of the American Statistical Association, 69(346), 364–367. doi:10.1080/01621459.1974.10482955.
Hines WGS, Hines RJO (2000). “Increased power with modified forms of the Levene (Med) test for heterogeneity of variance.” Biometrics, 56(2), 451–454. doi:10.1111/j.0006-341X.2000.00451.x.
Keyes TK, Levy MS (1997). “Analysis of Levene's test under design imbalance.” Journal of Educational and Behavioral Statistics, 22(2), 227–236. doi:10.3102/10769986022002227.
Levene H (1960). “Robust Tests for Equality of Variances.” In Olkin I, others (eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Palo Alto, CA.
Lim T, Loh W (1996). “A comparison of tests of equality of variances.” Computational Statistics & Data Analysis, 22(3), 287–301. doi:10.1016/0167-9473(95)00054-2.
Noguchi K, Gel YR (2010). “Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives.” Journal of Nonparametric Statistics, 22(7), 897–913. doi:10.1080/10485251003698505.
O'Brien RG (1978). “Robust techniques for testing heterogeneity of variance effects in factorial designs.” Psychometrika, 43(3), 327–342. doi:10.1007/BF02293643.
neuhauser.hothorn.test, lnested.test, ltrend.test, mma.test, robust.mmm.test
data(pot)
levene.test(pot[,"obs"], pot[,"type"],
            location = "median", correction.method = "zero.correction")

## Bootstrap version of the test. The calculation may take up to a few minutes
## depending on the number of bootstrap samples.
levene.test(pot[,"obs"], pot[,"type"],
            location = "median", correction.method = "zero.correction",
            bootstrap = TRUE, num.bootstrap = 500)
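The basic Levene-type procedure described above (absolute deviations from a group center, followed by a one-way ANOVA) can be sketched in base R. The helper name `levene_sketch` and the simulated data are hypothetical; the sketch covers only the plain median and mean versions and omits the trimmed mean, the bootstrap, the Kruskal–Wallis option, and the correction methods that the package offers.

```r
# Hypothetical helper (not part of lawstat): basic Levene-type test.
levene_sketch <- function(y, group, center = median) {
  keep  <- !is.na(y) & !is.na(group)         # omit NAs
  y     <- y[keep]
  group <- factor(group[keep])
  # Absolute deviation of each observation from its group center
  dev <- abs(y - ave(y, group, FUN = center))
  # One-way ANOVA on the absolute deviations
  fit <- anova(lm(dev ~ group))
  list(statistic = fit[["F value"]][1], p.value = fit[["Pr(>F)"]][1])
}

# Simulated data: the third group has a much larger spread
set.seed(42)
g <- rep(c("a", "b", "c"), each = 30)
y <- c(rnorm(30, sd = 1), rnorm(30, sd = 1), rnorm(30, sd = 5))
res <- levene_sketch(y, g)
print(res)
```

Passing `center = mean` gives the classical Levene version, while the default `center = median` corresponds to the Brown–Forsythe modification.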
The test statistic is based on the finite intersection approach.
lnested.test(
  y,
  group,
  location = c("median", "mean", "trim.mean"),
  tail = c("right", "left", "both"),
  trim.alpha = 0.25,
  bootstrap = FALSE,
  num.bootstrap = 1000,
  correction.method = c("none", "correction.factor", "zero.removal", "zero.correction"),
  correlation.method = c("pearson", "kendall", "spearman")
)
y |
a numeric vector of data values. |
group |
factor of the data. |
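location |
the default option is "median"; the other options are "mean" and "trim.mean". |
tail |
the default option is "right"; the other options are "left" and "both". |
trim.alpha |
the fraction (0 to 0.5) of observations to be trimmed from each end of the data before the trimmed mean is computed; the default is 0.25. |
bootstrap |
a logical value identifying whether to implement bootstrap. The default is FALSE. |
num.bootstrap |
number of bootstrap samples to be drawn when the bootstrap option is TRUE; the default is 1000. |
correction.method |
procedures to make the test more robust; the default option is "none"; the other options are "correction.factor", "zero.removal", and "zero.correction". |
correlation.method |
measures of correlation; the default option is "pearson"; the other options are "kendall" and "spearman". |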
The test statistic is based on
the classical Levene's procedure (using the group means),
the modified Brown–Forsythe Levene-type procedure (using the group medians),
or the modified Levene-type procedure (using the group trimmed means).
More robust versions of the test using the correction factor or structural zero
removal method are also available. Two options for calculating critical values,
namely, approximated and bootstrapped, are available.
By default, NAs are omitted from the data.
A list with the following elements:
T |
the statistic and p-value of the test. |
F |
the statistic and p-value of the test. |
N |
the statistic and p-value of the test. |
L |
the statistic and p-value of the test. |
Each of the list elements is a list of class "htest" with the following elements:
statistic |
the value of the test statistic expressed in terms of correlation (Pearson, Kendall, or Spearman). |
p.value |
the p-value of the test. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
non.bootstrap.statistic |
the statistic of the test without the bootstrap method. |
non.bootstrap.p.value |
the p-value of the test without the bootstrap method. |
Kimihiro Noguchi, W. Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Brown MB, Forsythe AB (1974). “Robust tests for the equality of variances.” Journal of the American Statistical Association, 69(346), 364–367. doi:10.1080/01621459.1974.10482955.
Hines WGS, Hines RJO (2000). “Increased power with modified forms of the Levene (Med) test for heterogeneity of variance.” Biometrics, 56(2), 451–454. doi:10.1111/j.0006-341X.2000.00451.x.
Keyes TK, Levy MS (1997). “Analysis of Levene's test under design imbalance.” Journal of Educational and Behavioral Statistics, 22(2), 227–236. doi:10.3102/10769986022002227.
Levene H (1960). “Robust Tests for Equality of Variances.” In Olkin I, others (eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Palo Alto, CA.
Lim T, Loh W (1996). “A comparison of tests of equality of variances.” Computational Statistics & Data Analysis, 22(3), 287–301. doi:10.1016/0167-9473(95)00054-2.
Noguchi K, Gel YR (2010). “Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives.” Journal of Nonparametric Statistics, 22(7), 897–913. doi:10.1080/10485251003698505.
O'Brien RG (1978). “Robust techniques for testing heterogeneity of variance effects in factorial designs.” Psychometrika, 43(3), 327–342. doi:10.1007/BF02293643.
levene.test, ltrend.test, mma.test, neuhauser.hothorn.test, robust.mmm.test
data(pot)
lnested.test(pot[,"obs"], pot[, "type"], location = "median", tail = "left",
             correction.method = "zero.correction")$N
lnested.test(pot[, "obs"], pot[, "type"], location = "median", tail = "left",
             correction.method = "zero.correction",
             bootstrap = TRUE, num.bootstrap = 500)$N
Plots the Lorenz curve, a graphical representation of the cumulative distribution function. The user can choose between the Lorenz curve with single (default) or multiple weighting of data, for example, to account for single or multiple legislature representatives (Gastwirth 1972).
lorenz.curve(
  data,
  weight = NULL,
  mul = FALSE,
  plot.it = TRUE,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = c(0, 1),
  ylim = c(0, 1),
  ...
)
data |
input data. If the argument is an array, a matrix, a data.frame, or a list with two or more columns, then the first column is treated as a data vector and the second column as a weight vector. A separate weight vector is then ignored and not required. If the argument is a single-column vector, then the user must enter a separate single-column weight vector. |
weight |
one-column vector containing factors of single or multiple weights. Ignored if the weights are included in the data argument. |
mul |
logical value indicating whether the Lorenz curve with multiple weights is to be plotted. Default is FALSE. |
plot.it |
logical value indicating whether the Lorenz curve should be plotted. Default is TRUE. |
main |
title of Lorenz curve. Only required if user wants to override the default value. |
xlab |
label of x-axis. Only required if user wants to override the default value. |
ylab |
label of y-axis. Only required if user wants to override the default value. |
xlim |
plotting range of x-axis. Only required if user wants to override the default value. |
ylim |
plotting range of y-axis. Only required if user wants to override the default value. |
... |
other graphical parameters to be passed to the plot function. |
The input data should be a data frame with 2 columns. The first column is treated as a data vector and the second column as a weight vector. Alternatively, data and weights can be entered as separate one-column vectors.
A Lorenz curve plot with the x-axis being the cumulative fraction of the data argument and the y-axis being the cumulative fraction of the weight argument. In the legend to the plot, the following values are reported:
RMD |
relative mean deviation of the input data. |
GI |
the Gini index of the input data. |
L(1/2) |
median of the cumulative fraction sum of the data. |
Man Jin, Wallace W. Hui, Yulia R. Gel, Joseph L. Gastwirth
Gastwirth JL (1972). “The estimation of the Lorenz curve and Gini index.” The Review of Economics and Statistics, 54(3), 306–316.
## Data on: number of senators (second column) and
## representatives (third column) relative to population size (first column) in 1963
## First column is treated as the data argument.
data(data1963)

## Single weight Lorenz Curve using number of senators as weight argument.
lorenz.curve(data1963)

## Multiple weight Lorenz Curve using number of senators as weight argument.
lorenz.curve(data1963, mul = TRUE)

## Multiple weight Lorenz Curve using number of representatives
## as weight argument.
lorenz.curve(data1963[, "pop1963"], data1963[, "rep1963"], mul = TRUE)
Test for a linear trend in variances.
ltrend.test(
  y,
  group,
  score = NULL,
  location = c("median", "mean", "trim.mean"),
  tail = c("right", "left", "both"),
  trim.alpha = 0.25,
  bootstrap = FALSE,
  num.bootstrap = 1000,
  correction.method = c("none", "correction.factor", "zero.removal", "zero.correction"),
  correlation.method = c("pearson", "kendall", "spearman")
)
y |
a numeric vector of data values. |
group |
factor of the data. |
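score |
weights to be used in testing an increasing/decreasing trend in group variances; the default is NULL. |
location |
the default option is "median"; the other options are "mean" and "trim.mean". |
tail |
the default option is "right"; the other options are "left" and "both". |
trim.alpha |
the fraction (0 to 0.5) of observations to be trimmed from each end of the data before the trimmed mean is computed; the default is 0.25. |
bootstrap |
a logical value identifying whether to implement bootstrap. The default is FALSE. |
num.bootstrap |
number of bootstrap samples to be drawn when the bootstrap option is TRUE; the default is 1000. |
correction.method |
procedures to make the test more robust; the default option is "none"; the other options are "correction.factor", "zero.removal", and "zero.correction". |
correlation.method |
measures of correlation; the default option is "pearson"; the other options are "kendall" and "spearman". |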
The test statistic is based on
the classical Levene's procedure (using the group means),
the modified Brown–Forsythe Levene-type procedure (using the group medians),
or the modified Levene-type procedure (using the group trimmed means).
More robust versions of the test using the correction factor or structural zero
removal method are also available. Two options for calculating critical values,
namely, approximated and bootstrapped, are available.
By default, NAs are omitted from the data.
A list of class "htest" containing the following components:
statistic |
the value of the test statistic expressed in terms of correlation (Pearson, Kendall, or Spearman). |
p.value |
the p-value of the test. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
t.statistic |
the value of the test statistic from Student's t-test. |
non.bootstrap.p.value |
the p-value of the test without the bootstrap method. |
log.p.value |
the log of the p-value of the test. |
log.q.value |
the log of one minus the p-value of the test. |
Kimihiro Noguchi, W. Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Brown MB, Forsythe AB (1974). “Robust tests for the equality of variances.” Journal of the American Statistical Association, 69(346), 364–367. doi:10.1080/01621459.1974.10482955.
Hines WGS, Hines RJO (2000). “Increased power with modified forms of the Levene (Med) test for heterogeneity of variance.” Biometrics, 56(2), 451–454. doi:10.1111/j.0006-341X.2000.00451.x.
Keyes TK, Levy MS (1997). “Analysis of Levene's test under design imbalance.” Journal of Educational and Behavioral Statistics, 22(2), 227–236. doi:10.3102/10769986022002227.
Levene H (1960). “Robust Tests for Equality of Variances.” In Olkin I, others (eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Palo Alto, CA.
Lim T, Loh W (1996). “A comparison of tests of equality of variances.” Computational Statistics & Data Analysis, 22(3), 287–301. doi:10.1016/0167-9473(95)00054-2.
Noguchi K, Gel YR (2010). “Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives.” Journal of Nonparametric Statistics, 22(7), 897–913. doi:10.1080/10485251003698505.
O'Brien RG (1978). “Robust techniques for testing heterogeneity of variance effects in factorial designs.” Psychometrika, 43(3), 327–342. doi:10.1007/BF02293643.
neuhauser.hothorn.test, levene.test, lnested.test, mma.test, robust.mmm.test
data(pot)
ltrend.test(pot[, "obs"], pot[, "type"], location = "median", tail = "left",
            correction.method = "zero.correction")

## Bootstrap version of the test. The calculation may take up to a few minutes
## depending on the number of bootstrap samples.
ltrend.test(pot[, "obs"], pot[, "type"], location = "median", tail = "left",
            correction.method = "zero.correction",
            bootstrap = TRUE, num.bootstrap = 500)
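One way to picture a Levene-type trend test is as a correlation between the absolute deviations from the group medians and a numeric score for each group. The base-R sketch below follows that idea; the helper name `ltrend_sketch`, the equally spaced default scores, and the use of `cor.test` are all assumptions made for illustration, and the mapping between the package's `tail` argument and `cor.test`'s `alternative` is not asserted here.

```r
# Hypothetical helper (not part of lawstat): correlation-based trend check.
ltrend_sketch <- function(y, group,
                          alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  keep  <- !is.na(y) & !is.na(group)         # omit NAs
  y     <- y[keep]
  group <- factor(group[keep])
  # Absolute deviations from the group medians
  dev <- abs(y - ave(y, group, FUN = median))
  # Assumed scores: 1, 2, ..., k in the order of the factor levels
  score <- as.numeric(group)
  cor.test(dev, score, method = "pearson", alternative = alternative)
}

# Simulated data: the standard deviation grows with the group index
set.seed(7)
g <- rep(1:4, each = 25)
y <- rnorm(100, sd = g)
res <- ltrend_sketch(y, g)
print(res$estimate)
print(res$p.value)
```

A positive correlation suggests that the spread increases with the group score, and a negative one that it decreases.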
Data contains 16 observations of dioxin levels for counties in the Upper Peninsula of Michigan.
data(michigan)
A numeric vector of length 16.
The Environmental Protection Agency (EPA) of the State of Michigan.
Test for a monotonic trend in variances for normal samples. The test statistic
is based on a combination of the finite intersection approach and the classical
(variance ratio) test (Mudholkar et al. 1993).
By default, NAs are omitted.
mma.test(y, group, tail = c("right", "left", "both"))
y |
a numeric vector of data values. |
group |
factor of the data. |
tail |
the default option is |
A list with the following components:
T |
the statistic and |
F |
the statistic and |
N |
the statistic and |
L |
the statistic and |
Each of the list elements is a list of class "htest"
with the following elements:
statistic |
the value of the test statistic. |
p.value |
the p-value. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
Kimihiro Noguchi, Yulia R. Gel
Mudholkar GS, McDermott MP, Aumont J (1993). “Testing homogeneity of ordered variances.” Metrika, 40(1), 271–281. doi:10.1007/BF02613691.
neuhauser.hothorn.test, levene.test, lnested.test, ltrend.test, robust.mmm.test
data(pot)
mma.test(pot[, "obs"], pot[, "type"], tail = "left")$N
The test statistic suggested by Neuhauser and Hothorn (2000).
neuhauser.hothorn.test(
  y,
  group,
  location = c("median", "mean", "trim.mean"),
  tail = c("right", "left", "both"),
  trim.alpha = 0.25,
  bootstrap = FALSE,
  num.bootstrap = 1000,
  correction.method = c("none", "correction.factor", "zero.removal", "zero.correction")
)
y |
a numeric vector of data values. |
group |
factor of the data. |
location |
the default option is |
tail |
the default option is |
trim.alpha |
the fraction (0 to 0.5) of observations to be trimmed from
each end of |
bootstrap |
a logical value identifying whether to implement bootstrap.
The default is |
num.bootstrap |
number of bootstrap samples to be drawn when the |
correction.method |
procedures to make the test more robust;
the default option is |
The test statistic is based on
the classical Levene's procedure (using the group means),
the modified Brown–Forsythe Levene-type procedure (using the group medians),
or the modified Levene-type procedure (using the group trimmed means).
More robust versions of the test using the correction factor or structural zero
removal method are also available. Two options for calculating critical values,
namely, approximated and bootstrapped, are available.
By default, NAs are omitted from the data.
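The Levene-type transformation that the procedures above share can be sketched in base R. This illustrates only the transformation step (absolute deviations from a group centre), not the trend statistic itself; `levene_transform` and the toy data are hypothetical and are not part of the package.

```r
# Levene-type transformation: absolute deviations from a group centre.
# A sketch only; `levene_transform` is a hypothetical helper, not a lawstat function.
levene_transform <- function(y, group, location = c("median", "mean")) {
  location <- match.arg(location)
  keep <- !is.na(y) & !is.na(group)        # drop missing values first
  y <- y[keep]
  group <- factor(group[keep])
  centre <- ave(y, group, FUN = if (location == "median") median else mean)
  data.frame(group = group, z = abs(y - centre))
}

# Toy data (made up): the spread increases from group "a" to group "c".
y <- c(10, 11, 9, 10, 8, 12, 14, 6, 2, 18, 25, -5)
group <- rep(c("a", "b", "c"), each = 4)
lev <- levene_transform(y, group)

# Group means of the transformed values grow with the within-group spread.
group_means <- tapply(lev$z, lev$group, mean)
print(group_means)
```

A trend test on the transformed values then asks whether these group-level deviations increase or decrease monotonically across the ordered groups.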
A list of class "htest"
with the following components:
statistic |
the value of the test statistic. |
p.value |
the p-value. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
non.bootstrap.p.value |
the |
Kimihiro Noguchi, Yulia R. Gel
Brown MB, Forsythe AB (1974).
“Robust tests for the equality of variances.”
Journal of the American Statistical Association, 69(346), 364–367.
doi:10.1080/01621459.1974.10482955.
Hines WGS, Hines RJO (2000).
“Increased power with modified forms of the Levene (Med) test for heterogeneity of variance.”
Biometrics, 56(2), 451–454.
doi:10.1111/j.0006-341X.2000.00451.x.
Keyes TK, Levy MS (1997).
“Analysis of Levene's test under design imbalance.”
Journal of Educational and Behavioral Statistics, 22(2), 227–236.
doi:10.3102/10769986022002227.
Levene H (1960).
“Robust Tests for Equality of Variances.”
In Olkin I, others (eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling.
Stanford University Press, Palo Alto, CA.
Lim T, Loh W (1996).
“A comparison of tests of equality of variances.”
Computational Statistics & Data Analysis, 22(3), 287–301.
doi:10.1016/0167-9473(95)00054-2.
Neuhauser M, Hothorn LA (2000).
“Parametric location-scale and scale trend tests based on Levene's transformation.”
Computational Statistics & Data Analysis, 33(2), 189–200.
doi:10.1016/S0167-9473(99)00051-1.
Noguchi K, Gel YR (2010).
“Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives.”
Journal of Nonparametric Statistics, 22(7), 897–913.
doi:10.1080/10485251003698505.
O'Brien RG (1978).
“Robust techniques for testing heterogeneity of variance effects in factorial designs.”
Psychometrika, 43(3), 327–342.
doi:10.1007/BF02293643.
levene.test, lnested.test, ltrend.test, mma.test, robust.mmm.test
data(pot)
neuhauser.hothorn.test(pot[, "obs"], pot[, "type"], location = "median",
                       tail = "left", correction.method = "zero.correction")

## Bootstrap version of the test. The calculation may take up to a few minutes
## depending on the number of bootstrap samples.
neuhauser.hothorn.test(pot[, "obs"], pot[, "type"], location = "median",
                       tail = "left", correction.method = "zero.correction",
                       bootstrap = TRUE, num.bootstrap = 500)
Compute the four parameters, alpha (tail heaviness), beta (asymmetry), delta (scale), and mu (location), of the normal inverse Gaussian (NIG) distribution from four moments: mean, variance, kurtosis, and skewness.
nig.parameter(
  mean = mean,
  variance = variance,
  kurtosis = kurtosis,
  skewness = skewness
)
mean |
mean of the NIG distribution. |
variance |
variance of the NIG distribution. |
kurtosis |
excess kurtosis of the NIG distribution. |
skewness |
skewness of the NIG distribution. |
The parameters are generated with three conditions:
1) ;
2)
, and
3)
.
See Atkinson (1982),
Barndorff-Nielsen and Blaesild (1983), and
Noguchi and Gel (2010).
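For orientation, the forward mapping from NIG parameters to moments can be sketched in base R using the standard moment formulas of the NIG distribution. The helper `nig_moments` is hypothetical (it is not a package function) and the parameter values are arbitrary; nig.parameter solves the inverse problem.

```r
# Moments of a NIG(alpha, beta, delta, mu) distribution (standard formulas).
# `nig_moments` is a hypothetical helper for illustration, not part of lawstat.
nig_moments <- function(alpha, beta, delta, mu) {
  stopifnot(alpha > abs(beta), delta > 0)
  gamma <- sqrt(alpha^2 - beta^2)
  list(
    mean     = mu + delta * beta / gamma,
    variance = delta * alpha^2 / gamma^3,
    skewness = 3 * beta / (alpha * sqrt(delta * gamma)),
    kurtosis = 3 * (1 + 4 * beta^2 / alpha^2) / (delta * gamma)  # excess kurtosis
  )
}

m <- nig_moments(alpha = 2, beta = 1, delta = 1.5, mu = 0)
str(m)

# These formulas imply 3 * kurtosis > 5 * skewness^2 for any valid parameters,
# so not every (kurtosis, skewness) pair can be matched by a NIG distribution.
3 * m$kurtosis > 5 * m$skewness^2
```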
A list with the following numeric components:
alpha |
tail-heaviness parameter of the NIG distribution. |
beta |
asymmetry parameter of the NIG distribution. |
delta |
scale parameter of the NIG distribution. |
mu |
location parameter of the NIG distribution. |
Kimihiro Noguchi, Yulia R. Gel
Atkinson AC (1982).
“The simulation of generalized inverse Gaussian and hyperbolic random variables.”
SIAM Journal on Scientific and Statistical Computing, 3(4), 502–515.
doi:10.1137/0903033.
Barndorff-Nielsen OE, Blaesild P (1983).
“Hyperbolic distributions.”
In Johnson NL, Kotz S, Read CB (eds.), Encyclopedia of Statistical Sciences, 700–707.
John Wiley & Sons Ltd, New York.
Noguchi K, Gel YR (2010).
“Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives.”
Journal of Nonparametric Statistics, 22(7), 897–913.
doi:10.1080/10485251003698505.
library(fBasics)
test <- nig.parameter(0, 2, 5, 1)
random <- rnig(1000000, alpha = test$alpha, beta = test$beta,
               mu = test$mu, delta = test$delta)
mean(random)
var(random)
kurtosis(random)
skewness(random)
The Baker v. Carr Case: one-person-one-vote decision. Measure of Relative Inequality of Population data in 33 districts of the Tennessee Legislature in 1900, 1960, and 1972 (Gastwirth 1988).
data(popdata)
A data frame with 33 observations on the following 3 numeric variables:
pop1900
population data in 1900
pop1960
population data in 1960
pop1972
population data in 1972
Gastwirth (1988).
Gastwirth JL (1988). Statistical Reasoning in Law and Public Policy: Statistical Concepts and Issues of Fairness, volume 1. Academic Press, San Diego, CA.
The apertures of the chupa pots from three Philippine locations: Dalupa (ApDl), Dangtalan (ApDg), and Paradijon (ApP).
data(pot)
A data frame with 343 observations of 2 variables: obs (integer values of observed apertures) and locations (factor with 3 levels).
Archaeologists are concerned with the effect that increasing economic activity had on older civilizations. Economic growth and its related economic specialization led to the "standardization hypothesis", i.e., increased production of an item would lead to its becoming more uniform. Kvamme et al. (1996) focused on earthenware, chupa-pots from three Philippine communities that differ in the way they organize ceramic production. In Dangtalan, pottery is primarily made for household use; in Dalupa there is a non-market barter economy where potters exchange their works. In the village of Paradijon, near the provincial capital, full-time pottery specialists sell their output to shopkeepers for sale to the general public.
The data are kindly provided by Professor Kvamme (Kvamme et al. 1996).
Kvamme KL, Stark MT, Longacre WA (1996). “Alternative procedures for assessing standardization in ceramic assemblages.” American Antiquity, 61(1), 116–126. doi:10.2307/282306.
The robust and classical Jarque–Bera tests of normality.
rjb.test(
  x,
  option = c("RJB", "JB"),
  crit.values = c("chisq.approximation", "empirical"),
  N = 0
)
x |
a numeric vector of data values. |
option |
the choice of whether to perform the robust test, |
crit.values |
a character string specifying how the critical values should be obtained: approximated by the Chi-square distribution (default) or empirically. |
N |
number of Monte Carlo simulations for the empirical critical values. |
The test is based on a joint statistic using skewness and kurtosis coefficients. The Robust Jarque–Bera (RJB) is the robust version of the Jarque–Bera (JB) test of normality. The RJB (default option) utilizes the robust standard deviation (specifically, the Average Absolute Deviation from the Median; MAAD) to estimate sample kurtosis and skewness. For more details, see Gel and Gastwirth (2008). Users can also choose to perform the classical Jarque–Bera test (Jarque and Bera 1980).
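The two statistics can be sketched by hand in base R. This is an illustration under assumptions, not the package's code: `jb_stats` is a hypothetical helper, and the RJB constants 6 and 64 and the scale J = sqrt(pi/2) * mean|x - median| follow the form given in Gel and Gastwirth (2008) as recalled here, so check them against the paper before relying on them.

```r
# Classical and robust Jarque-Bera statistics computed by hand (sketch).
# `jb_stats` is a hypothetical helper, not a lawstat function.
jb_stats <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2); m3 <- mean(d^3); m4 <- mean(d^4)

  # Classical JB: moments standardized by the second central moment
  jb <- n / 6 * (m3 / m2^1.5)^2 + n / 24 * (m4 / m2^2 - 3)^2

  # Robust version: standardize by J, a MAAD-based estimate of the sd
  # (constants 6 and 64 are an assumption, see the lead-in)
  J <- sqrt(pi / 2) * mean(abs(x - median(x)))
  rjb <- n / 6 * (m3 / J^3)^2 + n / 64 * (m4 / J^4 - 3)^2

  c(JB = jb, RJB = rjb,
    p.JB = pchisq(jb, df = 2, lower.tail = FALSE),
    p.RJB = pchisq(rjb, df = 2, lower.tail = FALSE))
}

set.seed(1)
res <- jb_stats(rnorm(200))
print(res)
```

Both p-values use the chi-square approximation with 2 degrees of freedom, which corresponds to the default crit.values option described above.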
A list of class "htest"
with the following components:
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom. |
p.value |
the p-value. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
Modified from jarque.bera.test (tseries package).
W. Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Gel YR, Gastwirth JL (2008).
“A robust modification of the Jarque–Bera test of normality.”
Economics Letters, 99(1), 30–32.
doi:10.1016/j.econlet.2007.05.022.
Jarque CM, Bera AK (1980).
“Efficient tests for normality, homoscedasticity and serial independence of regression residuals.”
Economics Letters, 6(3), 255–259.
doi:10.1016/0165-1765(80)90024-5.
sj.test, rqq, jarque.bera.test
## Normally distributed data
x = rnorm(100)
rjb.test(x)

## Using zuni data
data(zuni)
rjb.test(zuni[, "Revenue"])
Robust test for the Laplace distribution. Two options for calculating critical values, namely, approximated with Chi-square distribution and empirical, are available.
rlm.test(x, crit.values = c("chisq.approximation", "empirical"), N = 0)
x |
a numeric vector of data values. |
crit.values |
a character string specifying how the critical values should be obtained: approximated by the Chi-square distribution (default) or empirically. |
N |
number of Monte Carlo simulations for the empirical critical values. |
The test is based on a joint statistic using skewness and kurtosis coefficients. In particular, RLM uses the Average Absolute Deviation from the Median (MAAD), a robust estimate of standard deviation. See Gel (2010).
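To see why MAAD can serve as a robust scale estimate for Laplace data, the following base R sketch simulates Laplace(0, 1) observations as the difference of two independent Exp(1) draws and compares the classical standard deviation with a MAAD-based one. The sqrt(2) scaling is a property of the Laplace distribution; whether rlm.test uses exactly this constant internally is not stated above and is an assumption here.

```r
set.seed(42)
n <- 1e5
x <- rexp(n) - rexp(n)   # difference of two independent Exp(1) draws: Laplace(0, 1)

maad <- mean(abs(x - median(x)))   # average absolute deviation from the median
sd_robust <- sqrt(2) * maad        # for Laplace data, sd = sqrt(2) * E|X - median|

# Both estimates should be close to the true sd, sqrt(2)
c(sd = sd(x), sd_robust = sd_robust)
```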
A list of class "htest"
with the following components:
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom. |
p.value |
the p-value. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
Kimihiro Noguchi, W. Wallace Hui, Yulia R. Gel
Gel YR (2010). “Test of fit for a Laplace distribution against heavier tailed alternatives.” Computational Statistics & Data Analysis, 54(4), 958–965. doi:10.1016/j.csda.2009.10.008.
sj.test, rjb.test, rqq, jarque.bera.test
## Laplace distributed data
x = rexp(100) - rexp(100)
rlm.test(x)
A test for a monotonic trend in variances (Mudholkar et al. 1995).
The test statistic is based on
a combination of the finite intersection approach and the two-sample t-test
using Miller's transformation. By default, NAs are omitted.
robust.mmm.test(y, group, tail = c("right", "left", "both"))
y |
a numeric vector of data values. |
group |
factor of the data. |
tail |
the default option is |
A list with the following elements:
T |
the statistic and |
F |
the statistic and |
N |
the statistic and |
L |
the statistic and |
Each of the list elements is a list of class "htest"
with the following elements:
statistic |
the value of the test statistic. |
p.value |
the p-value. |
method |
type of test performed. |
data.name |
a character string giving the name of the data. |
Kimihiro Noguchi, Yulia R. Gel
Mudholkar GS, McDermott MP, Mudholkar A (1995). “Robust finite-intersection tests for homogeneity of ordered variances.” Journal of Statistical Planning and Inference, 43(1-2), 185–195. doi:10.1016/0378-3758(94)00018-Q.
neuhauser.hothorn.test, levene.test, lnested.test, ltrend.test, mma.test
data(pot)
robust.mmm.test(pot[, "obs"], pot[, "type"], tail = "left")$N
Produce robust quantile-quantile (RQQ) and classical quantile-quantile (QQ)
plots for graphical assessment of normality and optionally add a line, a QQ line,
to the produced plot. The QQ line may be chosen to be a 45-degree line or to pass
through the first and third quartiles of the data.
NAs from the data are omitted.
rqq(
  y,
  plot.it = TRUE,
  square.it = TRUE,
  scale = c("MAD", "J", "classical"),
  location = c("median", "mean"),
  line.it = FALSE,
  line.type = c("45 degrees", "QQ"),
  col.line = 1,
  lwd = 1,
  outliers = FALSE,
  alpha = 0.05,
  ...
)
y |
the input data. |
plot.it |
logical. Should the result be plotted? |
square.it |
logical. Should the plot scales be square? The default is |
scale |
the choice of a scale estimator, i.e., the classical or robust estimate of the standard deviation. |
location |
the choice of a location estimator, i.e., the mean or median. |
line.it |
logical. Should the line be plotted? No line is the default. |
line.type |
If |
col.line |
the color of the line (if plotted). |
lwd |
the line width (if plotted). |
outliers |
logical. Should the outliers be listed in the output? |
alpha |
significance level of outliers. If |
... |
other parameters passed to the |
An RQQ plot is a modified QQ plot where data are robustly standardized
by the median and robust measure of spread (rather than mean and classical
standard deviation as in the basic QQ plots) and then are plotted against the
expected standard normal order statistics
(Gel et al. 2005; Weisberg 2005).
Under normality, the plot of the standardized
observations should follow the 45-degree line, or QQ line. Both the median and the robust
standard deviation are much less sensitive to outliers than the mean and the
classical standard deviation, and are therefore preferable in many practical
situations for graphically assessing deviations from normality (if any). We choose
the median and MAD as robust measures of location and spread for our RQQ plots since
this standardization typically provides clearer graphical diagnostics of normality.
In particular, deviations from the QQ line are usually more noticeable in RQQ plots
in the case of outliers and heavy tails. Users can also choose to plot the
45-degree line or the 1st-3rd quartile line (see the argument line.type).
No line is the default.
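The robust standardization described above can be sketched in base R. The helper `rqq_coords` is hypothetical and only mimics the idea (median and MAD standardization, plotted against expected standard normal quantiles); it is not the package's implementation, and the contaminated sample is made up.

```r
# Coordinates for a robust QQ plot (sketch; `rqq_coords` is a hypothetical helper).
rqq_coords <- function(y) {
  y <- y[!is.na(y)]
  z <- (y - median(y)) / mad(y)         # mad() is scaled to be consistent for the normal sd
  list(x = qnorm(ppoints(length(z))),   # expected standard normal quantiles
       y = sort(z))                     # robustly standardized order statistics
}

set.seed(7)
y <- c(rnorm(95), rnorm(5, sd = 10))    # mostly normal data with a few wild values
pts <- rqq_coords(y)

if (interactive()) {
  plot(pts$x, pts$y,
       xlab = "Theoretical quantiles", ylab = "Robustly standardized data")
  abline(0, 1)                          # the 45-degree line
}
```

Because the median and MAD are barely moved by the few extreme values, the bulk of the points stays near the 45-degree line while the outliers stand out in the tails.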
A list with the following numeric components:
x |
the x-coordinates of the points that were/would be plotted. |
y |
the original data vector, i.e., the corresponding y-coordinates,
including |
W. Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Gel Y, Miao W, Gastwirth JL (2005).
“The importance of checking the assumptions underlying statistical analysis: graphical methods for assessing normality.”
Jurimetrics, 46, 3.
Weisberg S (2005).
Applied Linear Regression, 3rd edition.
John Wiley & Sons, Hoboken, NJ.
rjb.test, sj.test, qqnorm, qqplot, qqline
## Simulate 100 observations from standard normal distribution:
y = rnorm(100)
rqq(y)

## Using Michigan data
data(michigan)
rqq(michigan)
Performs the runs test for randomness (Mendenhall and Reinmuth 1982).
Users can choose whether to plot the
correlation graph or not, and whether to test against two-sided, negative,
or positive correlation. NAs from the data are omitted.
runs.test(
  y,
  plot.it = FALSE,
  alternative = c("two.sided", "positive.correlated", "negative.correlated")
)
y |
a numeric vector of data values. |
plot.it |
logical. If |
alternative |
a character string specifying the alternative hypothesis,
must be one of |
On the graph, observations that are less than the sample median are represented by red letters "A", and observations that are greater than or equal to the sample median are represented by blue letters "B".
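The A/B labelling above is the basis of the runs count. A minimal base R sketch of a two-sided runs test with the usual normal approximation is given below; `runs_sketch` is a hypothetical helper and may differ in detail from the package's runs.test (for example, in how the one-sided alternatives are handled).

```r
# Runs test about the median with a normal approximation (sketch).
# `runs_sketch` is a hypothetical helper, not the lawstat implementation.
runs_sketch <- function(y) {
  y <- y[!is.na(y)]
  above <- y >= median(y)                   # "B" if >= median, "A" otherwise
  n1 <- sum(above); n2 <- sum(!above); n <- n1 + n2
  runs <- 1 + sum(above[-1] != above[-n])   # a new run starts at every label change
  mu <- 1 + 2 * n1 * n2 / n                 # expected number of runs under randomness
  sigma2 <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  z <- (runs - mu) / sqrt(sigma2)
  list(runs = runs, statistic = z, p.value = 2 * pnorm(-abs(z)))
}

# A strictly increasing sequence has only two runs (AAAA BBBB).
r <- runs_sketch(1:8)
r$runs
r$statistic
```

Too few runs, as in this toy sequence, point towards positive serial correlation; too many runs point towards negative serial correlation.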
A list of class "htest"
with the following components:
statistic |
the value of the standardized runs statistic. |
p.value |
the p-value. |
data.name |
a character string giving the names of the data. |
alternative |
a character string describing the alternative hypothesis. |
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Mendenhall W, Reinmuth JE (1982). Statistics for Management and Economics, 4th edition. Duxbury, Boston, MA.
## Simulate 100 observations from an autoregressive model
## of the first order (AR(1))
y = arima.sim(n = 100, list(ar = c(0.5)))

## Test y for randomness
runs.test(y)
Perform the robust directed test of normality, which is based on the ratio of the
classical standard deviation to the robust standard deviation
(Average Absolute Deviation from the Median, MAAD) of the sample data.
See Gel et al. (2007).
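The ratio at the heart of the test can be sketched in base R. The helper `sj_ratio` is hypothetical; it computes only the unstandardized ratio of the classical standard deviation to a MAAD-based estimate, with the sqrt(pi/2) factor making the latter consistent under normality. The standardization and critical values used by sj.test are not reproduced here.

```r
# Ratio of the classical sd to the MAAD-based robust sd (sketch).
# `sj_ratio` is a hypothetical helper, not a lawstat function.
sj_ratio <- function(x) {
  x <- x[!is.na(x)]
  J <- sqrt(pi / 2) * mean(abs(x - median(x)))  # robust sd estimate, consistent under normality
  sd(x) / J
}

set.seed(3)
ratios <- c(normal = sj_ratio(rnorm(5000)),
            heavy_tailed = sj_ratio(rt(5000, df = 3)))
ratios
```

Under normality the ratio is close to 1, while heavy tails inflate the classical standard deviation more than the robust one and push the ratio above 1, which is why the test is directed against heavy-tailed alternatives.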
sj.test(x, crit.values = c("t.approximation", "empirical"), N = 0)
x |
a numeric vector of data values. |
crit.values |
a character string specifying how the critical values should be
obtained, i.e., approximated by the t-distribution (default) or empirically. |
N |
number of Monte Carlo simulations for the empirical critical values. |
A list of class "htest"
with the following components:
statistic |
the standardized test statistic. |
p.value |
the p-value. |
parameter |
the ratio of the classical standard deviation to the robust standard deviation. |
data.name |
a character string giving the name of the data. |
Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth, Weiwen Miao
Gel YR, Miao W, Gastwirth JL (2007). “Robust directed tests of normality against heavy-tailed alternatives.” Computational Statistics & Data Analysis, 51(5), 2734–2746. doi:10.1016/j.csda.2006.08.022.
rqq, rjb.test, jarque.bera.test
data(bias)
sj.test(bias)
Perform a test for symmetry about an unknown median. Users can choose among the
Cabilio–Masaro test (Cabilio and Masaro 1996),
the Mira test (Mira 1999),
or the MGG test (Miao et al. 2006);
and between using the asymptotic distribution of the respective statistic or
a distribution from the m-out-of-n bootstrap
(Lyubchich et al. 2016).
In addition to general distributional asymmetry, the function allows testing
for negative or positive skewness (see the argument side).
NAs from the data are omitted.
symmetry.test(
  x,
  option = c("MGG", "CM", "M"),
  side = c("both", "left", "right"),
  boot = TRUE,
  B = 1000,
  q = 8/9
)
x |
data to be tested for symmetry. |
option |
test statistic to be applied. The options include statistic by Miao et al. (2006) (default), Cabilio and Masaro (1996), and Mira (1999). |
side |
choice from the three possible alternative hypotheses:
general distribution asymmetry ( |
boot |
logical value indicates whether |
B |
number of bootstrap replications to perform (default is 1000). |
q |
scalar from 0 to 1 to define a set of possible |
If the bootstrap option is used (boot = TRUE), a bootstrap
distribution is obtained for each candidate subsample size m. Then, a heuristic
method (Bickel et al. 1997; Bickel and Sakov 2008)
is used to choose the optimal m. Specifically, we use the Wasserstein metric
(Ruschendorf 2001) to calculate distances between different
bootstrap distributions and select the m that corresponds to the minimal distance.
See Lyubchich et al. (2016) for more details.
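The selection heuristic can be sketched in base R as follows. This is a simplified illustration, not the package's procedure: `choose_m` and `cm_stat` are hypothetical helpers, the candidate sizes ceiling(n * q^j) and the cut-off at 10 are assumptions in the spirit of Bickel and Sakov (2008), and the statistic is bootstrapped as is, without any adjustment that the actual test may apply.

```r
# Pick the subsample size m by comparing bootstrap distributions for
# neighbouring candidate sizes (sketch; hypothetical helper).
choose_m <- function(x, stat, B = 500, q = 8 / 9) {
  n <- length(x)
  # Candidate sizes m_j = ceiling(n * q^j), kept while they stay reasonably large
  m <- unique(ceiling(n * q^(0:20)))
  m <- m[m >= 10]
  # Sorted bootstrap replicates of the statistic, one column per candidate m
  boot <- sapply(m, function(mj) {
    sort(replicate(B, stat(sample(x, mj, replace = TRUE))))
  })
  # 1-Wasserstein distance between two equal-size empirical distributions:
  # mean absolute difference of their order statistics
  d <- colMeans(abs(boot[, -1, drop = FALSE] - boot[, -ncol(boot), drop = FALSE]))
  list(m = m[which.min(d)], candidates = m, distances = d)
}

# Cabilio-Masaro-type statistic: sqrt(m) * (mean - median) / sd
cm_stat <- function(z) sqrt(length(z)) * (mean(z) - median(z)) / sd(z)

set.seed(11)
res <- choose_m(rexp(200), cm_stat)
res$m
```

The chosen size is the one whose bootstrap distribution changes least when moving to the next smaller candidate, which is the stability idea behind the minimal-distance rule described above.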
A list of class "htest"
with the following components:
method |
name of the method. |
data.name |
name of the data. |
statistic |
value of the test statistic. |
p.value |
the p-value. |
alternative |
alternative hypothesis. |
estimate |
bootstrap optimal |
Joseph L. Gastwirth, Yulia R. Gel, Wallace Hui, Vyacheslav Lyubchich, Weiwen Miao, Xingyu Wang (in alphabetical order)
Bickel PJ, Gotze F, van Zwet WR (1997).
“Resampling fewer than n observations: gains, losses, and remedies for losses.”
Statistica Sinica, 7, 1–31.
Bickel PJ, Sakov A (2008).
“On the choice of m in the m out of n bootstrap and confidence bounds for extrema.”
Statistica Sinica, 18(3), 967–985.
Cabilio P, Masaro J (1996).
“A simple test of symmetry about an unknown median.”
Canadian Journal of Statistics, 24(3), 349–361.
doi:10.2307/3315744.
Lyubchich V, Wang X, Heyes A, Gel YR (2016).
“A distribution-free m-out-of-n bootstrap approach to testing symmetry about an unknown median.”
Computational Statistics & Data Analysis, 104, 1–9.
doi:10.1016/j.csda.2016.05.004.
Miao W, Gel YR, Gastwirth JL (2006).
“A new test of symmetry about an unknown median.”
In Hsiung A, Zhang C, Ying Z (eds.), Random Walk, Sequential Analysis and Related Topics – A Festschrift in Honor of Yuan-Shih Chow, 199–214.
World Scientific Publisher, Singapore.
doi:10.1142/9789812772558_0013.
Mira A (1999).
“Distribution-free test for symmetry based on Bonferroni's measure.”
Journal of Applied Statistics, 26(8), 959–972.
doi:10.1080/02664769921963.
Ruschendorf L (2001).
“Wasserstein metric.”
In Hazewinkel M (ed.), Encyclopaedia of Mathematics.
Springer, Berlin.
data(zuni)  # run ?zuni to see the data description
symmetry.test(zuni[, "Revenue"], boot = FALSE)
Number of students and available revenue per student in each school district in New Mexico.
data(zuni)
A data frame with 89 observations on 3 variables: District, Revenue, and Mem (number of students).
The Zuni data come from a law case "The Zuni Public School District No. 89, Gallup-McKinley County Public School District No. 1, Petitioners v. United States Department of Education" concerning whether the revenue per pupil satisfied the standard for "equal" expenditures per pupil in the state. This classification determines whether most of the federal money given to the state under the law goes to the state or to the local school districts.
Gastwirth (2006).
Gastwirth JL (2006). “A 60 million dollar statistical issue arising in the interpretation and calculation of a measure of relative disparity: Zuni Public School District 89 v. US Department of Education.” Law, Probability and Risk, 5(1), 33–61. doi:10.1093/lpr/mgl019.