## The use of random-model tolerance intervals in environmental monitoring and regulation

### Journal of Agricultural, Biological, and Environmental Statistics 7: 74-94, March 01, 2002

When appropriate data from regional reference locations are available, tolerance-interval bounds can be computed to provide criteria or limits distinguishing reference from nonreference conditions. If the limits are to be applied to locations and times beyond the original data, the data should include temporal and spatial variation and the tolerance interval calculations should utilize a random crossed or nested ANOVA statistical design. Two computational methods for such designs are discussed and evaluated with simulations. Both methods are shown to perform well, and the adverse effect of using an improper design model is demonstrated. Three real-world applications are shown, where tolerance intervals are used to (1) establish a reference threshold for a benthic community pollution index, (2) set criteria for chemicals in sediments, and (3) establish background thresholds for survival rates in sediment bioassay tests. Some practical considerations in the use of the tolerance intervals are discussed.

## Partially linear varying coefficient models with missing at random responses

### Annals of the Institute of Statistical Mathematics 65: 721-762, August 01, 2013

This paper considers partially linear varying coefficient models when the response variable is missing at random. The paper uses imputation techniques to develop an omnibus specification test. The test is based on a simple modification of a Cramér-von Mises functional that overcomes the curse of dimensionality often associated with the standard Cramér-von Mises functional. The paper also considers estimation of the mean functional under the missing at random assumption. The proposed estimator lies between a fully nonparametric estimator and a parametric one and can be used, for example, to obtain a novel estimator for the average treatment effect parameter. Monte Carlo simulations show that the proposed estimator and test statistic have good finite sample properties. An empirical application illustrates the applicability of the results of the paper.

## Reducing bias in parameter estimates from stepwise regression in proportional hazards regression with right-censored data

### Lifetime Data Analysis 14: 65-85, March 01, 2008

When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute values of the parameter estimates for selected variables, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.

## Testing for one-sided alternatives in nonparametric censored regression

### TEST 21: 498-518, September 01, 2012

Assume that we have two populations $(X_1, Y_1)$ and $(X_2, Y_2)$ satisfying two general nonparametric regression models $Y_j = m_j(X_j) + \varepsilon_j$, $j = 1, 2$, where $m_j(\cdot)$ is a smooth location function, $\varepsilon_j$ has zero location and the response $Y_j$ is possibly right-censored. In this paper, we propose to test the null hypothesis $H_0 : m_1 = m_2$ versus the one-sided alternative $H_1 : m_1 < m_2$. We introduce two test statistics for which we obtain the asymptotic normality under the null and the alternative hypotheses. Although the tests are based on nonparametric techniques, they can detect any local alternative converging to the null hypothesis at the parametric rate $n^{-1/2}$. The practical performance of a bootstrap version of the tests is investigated in a simulation study. An application to a data set about unemployment duration times is also included.

## Empirical process approach to some two-sample problems based on ranked set samples

### Annals of the Institute of Statistical Mathematics 59: 757-787, December 01, 2007

We study the asymptotic properties of both the horizontal and vertical shift functions based on independent ranked set samples drawn from continuous distributions. Several tests derived from these shift processes are developed. We show that by using balanced ranked set samples with bigger set sizes, one can decrease the width of the confidence band and hence increase the power of these tests. These theoretical findings are validated through small-scale simulation studies. An application of the proposed techniques to a cancer mortality data set is also provided.
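For readers unfamiliar with the sampling scheme, the following is a minimal sketch of how a balanced ranked set sample can be drawn. It is not taken from the paper: the function name `balanced_ranked_set_sample`, the standard normal population, and the assumption of perfect ranking (ranking by the true values) are all illustrative choices.

```python
import random


def balanced_ranked_set_sample(draw, set_size, cycles, rng):
    """Draw a balanced ranked set sample as a list of (rank, value) pairs.

    draw(rng) returns one unit from the population (hypothetical callback).
    Ranking is assumed perfect: each set is ordered by its true values.
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set, rank it, and measure only one unit from it.
            ranked_set = sorted(draw(rng) for _ in range(set_size))
            sample.append((rank + 1, ranked_set[rank]))
    return sample


rng = random.Random(0)
rss = balanced_ranked_set_sample(
    lambda r: r.gauss(0.0, 1.0), set_size=3, cycles=4, rng=rng
)
print(len(rss))  # set_size * cycles measured units
```

Each rank is measured the same number of times (once per cycle), which is what makes the design balanced. A larger `set_size` means more units are ranked per measured unit.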

## Coverage plots for assessing the variability of estimated contours of a density

### Statistics and Computing 6: 325-336, December 01, 1996

Methods for assessing the variability of an estimated contour of a density are discussed. A new method called the coverage plot is proposed. Techniques including sectioning and bootstrap techniques are compared for a particular problem which arises in Monte Carlo simulation approaches to estimating the spatial distribution of risk in the operation of weapons firing ranges. It is found that, for computational reasons, the sectioning procedure outperforms the bootstrap for this problem. The roles of bias and sample size are also seen in the examples shown.

## A novel method for constructing ensemble classifiers

### Statistics and Computing 19: 317-327, September 01, 2009

This paper presents a novel ensemble classifier generation method by integrating the ideas of bootstrap aggregation and Principal Component Analysis (PCA). To create each individual member of an ensemble classifier, PCA is applied to every out-of-bag sample and the computed coefficients of all principal components are stored, and then the principal components calculated on the corresponding bootstrap sample are taken as additional elements of the original feature set. A classifier is trained with the bootstrap sample and some features randomly selected from the new feature set. The final ensemble classifier is constructed by majority voting of the trained base classifiers. The results obtained by empirical experiments and statistical tests demonstrate that the proposed method performs better than or as well as several other ensemble methods on some benchmark data sets publicly available from the UCI repository. Furthermore, the diversity-accuracy patterns of the ensemble classifiers are investigated by kappa-error diagrams.

## Bootstrap bias corrections for ensemble methods

### Statistics and Computing: 1-10, November 30, 2016

This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning. We demonstrate empirically that the proposed bootstrap bias correction can lead to substantial improvements in both bias and predictive accuracy. In the context of ensembles of trees, we show that this correction can be approximated at only double the cost of training the original ensemble. Our method is shown to improve test set accuracy over random forests by up to 70% on example problems from the UCI repository.
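The general idea of a residual-bootstrap bias correction can be sketched as follows. This is a generic illustration, not the paper's procedure: a one-dimensional k-nearest-neighbour smoother stands in for the tree ensemble, the residuals are centred before resampling, and all function names and parameters are assumptions made for the example. The paper's cheaper approximation for ensembles of trees is not reproduced here.

```python
import random


def knn_predict(train_x, train_y, query_x, k):
    """Predict by averaging the responses of the k nearest training points."""
    preds = []
    for q in query_x:
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - q))[:k]
        preds.append(sum(train_y[i] for i in nearest) / k)
    return preds


def residual_bootstrap_bias_correction(x, y, k, n_boot, rng):
    """Return (fitted, corrected) predictions at the training points."""
    fitted = knn_predict(x, y, x, k)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    mean_res = sum(residuals) / len(residuals)
    residuals = [r - mean_res for r in residuals]  # centring is an assumption
    boot_mean = [0.0] * len(x)
    for _ in range(n_boot):
        # Build a bootstrap response from the fit plus resampled residuals.
        y_star = [fi + rng.choice(residuals) for fi in fitted]
        refit = knn_predict(x, y_star, x, k)
        boot_mean = [m + r / n_boot for m, r in zip(boot_mean, refit)]
    # Estimated bias: how far the refits drift from the fit they were built on.
    bias = [m - fi for m, fi in zip(boot_mean, fitted)]
    corrected = [fi - b for fi, b in zip(fitted, bias)]
    return fitted, corrected


rng = random.Random(0)
x = [i / 39 for i in range(40)]
y = [4.0 * xi * xi + rng.gauss(0.0, 0.1) for xi in x]
fitted, corrected = residual_bootstrap_bias_correction(x, y, k=7, n_boot=50, rng=rng)
```

The corrected prediction equals twice the original fit minus the average bootstrap refit, which is the usual form of a bootstrap bias correction.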

## Parametric bootstrap tests for continuous and discrete distributions

### Metrika 67: 63-81, January 01, 2008

Statistical procedures based on the estimated empirical process are well known for testing goodness of fit to parametric distribution families. These methods usually are not distribution free, so that the asymptotic critical values of test statistics depend on unknown parameters. This difficulty may be overcome by the utilization of parametric bootstrap procedures. The aim of this paper is to prove a weak approximation theorem for the bootstrapped estimated empirical process under very general conditions, which allow both the most important continuous and *discrete* distribution families, along with most parameter estimation methods. The emphasis is on families of discrete distributions, and simulation results for families of negative binomial distributions are also presented.
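A parametric bootstrap goodness-of-fit test of this kind can be sketched as follows. The example is illustrative only and makes several assumptions not stated in the abstract: the family is the geometric distribution (a negative binomial with a single success), the parameter is estimated by maximum likelihood, and the test statistic is a Kolmogorov-Smirnov-type supremum distance. All function names are hypothetical.

```python
import math
import random


def sample_geometric(p, rng):
    """Number of failures before the first success, by inverse transform."""
    if p >= 1.0:
        return 0
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return int(math.log(u) / math.log(1.0 - p))


def fit_geometric(data):
    """Maximum likelihood estimate of the success probability."""
    return 1.0 / (1.0 + sum(data) / len(data))


def ks_statistic(data, p):
    """Largest gap between the empirical CDF and the fitted geometric CDF."""
    counts = {}
    for value in data:
        counts[value] = counts.get(value, 0) + 1
    stat, cumulative = 0.0, 0
    for k in range(max(data) + 1):
        cumulative += counts.get(k, 0)
        model_cdf = 1.0 - (1.0 - p) ** (k + 1)
        stat = max(stat, abs(cumulative / len(data) - model_cdf))
    return stat


def parametric_bootstrap_pvalue(data, n_boot, rng):
    """Bootstrap p-value: re-estimate the parameter on every simulated sample."""
    p_hat = fit_geometric(data)
    observed = ks_statistic(data, p_hat)
    exceed = 0
    for _ in range(n_boot):
        resample = [sample_geometric(p_hat, rng) for _ in data]
        if ks_statistic(resample, fit_geometric(resample)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


rng = random.Random(0)
data = [sample_geometric(0.3, rng) for _ in range(200)]
print(parametric_bootstrap_pvalue(data, n_boot=199, rng=rng))
```

Re-estimating the parameter inside the loop is the essential step: it makes the bootstrap distribution mimic the estimated empirical process rather than the process with a known parameter, which is why the resulting critical values do not require knowing the true parameter.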

## Benchmarked estimates in small areas using linear mixed models with restrictions

### TEST 18: 342-364, August 01, 2009

Linear mixed models have been frequently used to provide estimates in small areas. However, when aggregating small areas within the same region, the sum of these small area estimates does not generally match the estimate obtained using an appropriate estimator for the larger region. Benchmarking the model-dependent estimates to those obtained at a certain level of aggregation is therefore needed. In this paper, we propose a small area estimator based on a linear mixed effects model with restrictions to guarantee the concordance between the aggregations of small area estimates and those reported by statistical agencies for larger domains using a synthetic estimator. The mean squared prediction error of the restricted estimator is also derived and its performance is evaluated through a simulation study. The procedure is applied to the 2002 Business Survey of the Basque Country, Spain.