## Eight-run two-level factorial designs under dependence

### Metrika (1998-12-01) 48: 127-139

### Abstract.

Dependent observations commonly arise in factorial experiments. Apart from main-effects two-level designs formed by the Cheng & Steinberg reverse foldover algorithm, which are known to be very efficient designs under dependence using the *D*-criterion, little is known about other designs, models and criteria, and the range of possible behaviour. In this paper, we investigate in detail 8-run two-level designs.
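To make the *D*-criterion under dependence concrete, the sketch below evaluates det(X' V⁻¹ X) for the full 2³ factorial (8 runs) with a main-effects model. The AR(1) error correlation, the run order and the function names are illustrative assumptions, not details taken from the paper.

```python
from itertools import product


def ar1_precision(n, rho):
    """Closed-form inverse of the AR(1) correlation matrix V[i][j] = rho**|i-j|."""
    scale = 1.0 / (1.0 - rho * rho)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + rho * rho)
        if i + 1 < n:
            P[i][i + 1] = P[i + 1][i] = -scale * rho
    return P


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d


def d_criterion(runs, rho):
    """det(X' V^{-1} X) for intercept plus main effects, runs taken in the given order."""
    X = [[1.0] + [float(v) for v in run] for run in runs]
    n, p = len(X), len(X[0])
    P = ar1_precision(n, rho)  # assumed AR(1) dependence between consecutive runs
    PX = [[sum(P[i][k] * X[k][j] for k in range(n)) for j in range(p)] for i in range(n)]
    M = [[sum(X[k][i] * PX[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    return det(M)


standard_order = list(product((-1, 1), repeat=3))  # full 2^3 factorial, 8 runs
for rho in (0.0, 0.5):
    print(rho, d_criterion(standard_order, rho))
```

With independent errors (rho = 0) every run order gives the same value, 8⁴ = 4096. Under dependence the value changes with the run order, which is why the ordering of the runs becomes part of the design problem.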

## Model-Selection Uncertainty with Examples

### Model Selection and Inference (1998-01-01): 118-158

The understanding of model-selection uncertainty requires that one consider the process that generates the sample data we observe. For a given field, laboratory, or computer simulation study, data are observed on some process or system. If a second, independent, data set could be observed on the same process or system under nearly identical conditions, the new data set would differ somewhat from the first. Clearly, both data sets would contain information about the process, but the information would likely be slightly different, by chance. An obvious goal of data analysis is to make an inference about the process based on the data observed. A less obvious goal of data analysis is to make inferences about the process that are not overly specific with respect to the (single) data set observed. That is, we would like our inferences to be robust, with respect to the particular data set observed, in such a way that we tend to avoid problems associated with over-fitting (overinterpreting) the limited data we have. Thus, we would like some ability to make inferences about the process as if a large number of other data sets were also available. The interpretation of a confidence interval is similar; i.e., in repeated samples from the process, 95% of the data sets will generate a confidence interval that includes the true parameter value. This idea extends to the idea of generating a confidence (sub) set of the models considered such that with high relative frequency, over samples, that set of models contains the actual K-L best model of the set of models considered, while being as small a subset as possible (analogous to short confidence intervals).
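The repeated-sampling reading of a confidence interval described above can be checked with a small simulation. This is a minimal sketch: the normal data, the known standard deviation and all parameter values are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=10.0, sigma=2.0, n=25, reps=4000, level=0.95, seed=0):
    """Fraction of simulated data sets whose interval contains the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    half_width = z * sigma / n ** 0.5          # known-sigma interval (assumption)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = fmean(sample)
        hits += (m - half_width <= true_mean <= m + half_width)
    return hits / reps


print(coverage())  # close to the nominal 0.95
```

A confidence set of models extends the same idea: over repeated data sets, the set should contain the K-L best model with high relative frequency.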

## Analyzing Geostatistical Data

### S+SpatialStats (1998-01-01): 67-109

This chapter introduces functions available in *S-Plus* and *S+SpatialStats* for analyzing geostatistical data. Geostatistical data, also termed random field data, consist of measurements taken at fixed locations. For a complete description of geostatistical data see chapter 1. Specifically, this chapter discusses methods related to variogram analysis and kriging. Variogram estimation and kriging were originally introduced as geostatistical methods for use in mining applications. In recent years, these methods have been applied to many disciplines including meteorology, forestry, agriculture, cartography, climatology, and fisheries.

In this chapter you will learn about the following topics:

- Estimating Variograms (section 4.1).
- Fitting Theoretical Variogram Models (section 4.2).
- Performing Ordinary and Universal Kriging (section 4.3).
- Simulating Geostatistical Data (section 4.4).
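As a rough illustration of the first topic, the sketch below computes the classical (method-of-moments) empirical variogram in plain Python. It is not the *S+SpatialStats* interface; the function name, arguments and binning rule are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def empirical_variogram(coords, values, bin_width, max_dist):
    """Classical estimator: gamma(h) = sum (z_i - z_j)^2 / (2 * N(h)) per distance bin.

    Returns a list of (mean pair distance, gamma, number of pairs), one per bin.
    """
    # bin index -> [sum of distances, sum of squared differences, pair count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d > max_dist:
                continue
            acc = sums[int(d // bin_width)]
            acc[0] += d
            acc[1] += (values[i] - values[j]) ** 2
            acc[2] += 1
    return [(a[0] / a[2], a[1] / (2 * a[2]), a[2]) for _, a in sorted(sums.items())]


coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 2.0, 4.0]
print(empirical_variogram(coords, values, bin_width=1.0, max_dist=5.0))
# → [(1.0, 1.25, 2), (2.0, 4.5, 1)]
```

A theoretical variogram model would then be fitted to these binned estimates before kriging.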

## A method for classifying unaligned biological sequences

### Data Science, Classification, and Related Methods (1998-01-01)

### Summary

The importance of classifying protein sequences in molecular biology needs no emphasis. Biologists currently use various classification methods (Landès et al. 1992), but most of them require the sequences to be prealigned, and thus of equal length, using one of the several multiple alignment algorithms available, so that sequences can be compared site by site. Two LLA-based approaches for classifying prealigned sequences have already been proposed (Lerman et al. (1994a)), and their results compared favourably with most currently used methods. The first approach used the “preordonnance” coding and the second the idea of “significant windows”. The authors also suggested new directions of research leading to a clustering method free from this rather strong constraint. The present paper reports recent developments in our research: a new method that, thanks to the “significant windows” approach, gets round the sequence comparison problem that arises with unaligned sequences.

## Sampling Theory (Stichprobentheorie)

### Repetitorium Statistik (1998-01-01): 252-268

### Summary

Sampling theory is the branch of inferential statistics that provides the theoretical foundations and the mathematical-statistical methods for selecting a particular set of statistical units from a population in order to draw conclusions from the part about the whole.

## Sequential estimation of normal mean under asymmetric loss function with a shrinkage stopping rule

### Metrika (1998-09-01) 48: 53-59

### Abstract.

The problem of estimating a normal mean with unknown variance is considered under an asymmetric loss function such that the associated risk is bounded from above by a known quantity. In the absence of a fixed sample size rule, a sequential stopping rule and two sequential estimators of the mean are proposed and second-order asymptotic expansions of their risk functions are derived. It is demonstrated that the sample mean becomes asymptotically inadmissible, being dominated by a shrinkage-type estimator. Also a shrinkage factor is incorporated in the stopping rule and similar inadmissibility results are established.

## Efficiency Evaluation of Skilled Nursing Facilities

### Journal of Medical Systems (1998-08-01) 22: 211-224

*This study employs Data Envelopment Analysis (DEA) to determine the technical efficiency of skilled nursing facilities in the United States, using a 10% national sample of 324 skilled nursing facilities stratified by ownership and size cluster groupings. Results show that nonprofit and for-profit firms operate using significantly different modes of production, thus allowing the best of the for-profits to achieve a level of technical efficiency .86 times higher than the most efficient nonprofit homes. The best larger nursing homes are .89 times more efficient than the best smaller facilities, also indicating a difference in production goals and technologies. A rationale for these differences is sought through an analysis of DEA-generated slacks and a logistic regression. Controlling for size and ownership in the DEA, a higher percentage of Medicare patients leads to lower efficiency, while higher occupancy and a greater percentage of Medicaid patients lead to greater efficiency. Regional characteristics do not affect efficiency. It is concluded that reimbursement policies should account for differences in organizational goals created by size and ownership differentials. The great variations in efficiency demonstrate tremendous potential for cost savings through imitation of efficient firms.*

## Exploring Textual Data

### Exploring Textual Data (1998-01-01): 4

## Dependence Between Order Statistics in Samples from Finite Population and its Application to Ranked Set Sampling

### Annals of the Institute of Statistical Mathematics (1998-03-01) 50: 49-70

Let $X_1, X_2, \ldots, X_m, Y_1, Y_2, \ldots, Y_n$ be a simple random sample without replacement from a finite population and let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$ and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ be the order statistics of $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$, respectively. It is shown that the joint distribution of $X_{(i)}$ and $X_{(j)}$ is positively likelihood ratio dependent and $Y_{(j)}$ is negatively regression dependent on $X_{(i)}$. Using these results, it is shown that when samples are drawn without replacement from a finite population, the relative precision of the ranked set sampling estimator of the population mean, relative to the simple random sample estimator with the same number of units quantified, is bounded below by 1.
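The lower bound of 1 on relative precision can be illustrated by simulation. The sketch below assumes the usual balanced ranked set sampling scheme with perfect ranking (m sets of m units, measuring the i-th smallest unit of the i-th set); the population, set size and function names are illustrative assumptions and do not come from the paper.

```python
import random
from statistics import fmean, pvariance


def rss_mean(population, m, rng):
    """Balanced ranked set sample mean: m units quantified out of m*m drawn."""
    drawn = rng.sample(population, m * m)  # distinct units, no replacement
    sets = [sorted(drawn[i * m:(i + 1) * m]) for i in range(m)]
    return fmean([sets[i][i] for i in range(m)])  # i-th smallest of the i-th set


def srs_mean(population, m, rng):
    """Simple random sample mean with the same number of units quantified."""
    return fmean(rng.sample(population, m))


def relative_precision(population, m=3, reps=20000, seed=0):
    """Estimated Var(SRS mean) / Var(RSS mean) over repeated draws."""
    rng = random.Random(seed)
    rss = [rss_mean(population, m, rng) for _ in range(reps)]
    srs = [srs_mean(population, m, rng) for _ in range(reps)]
    return pvariance(srs) / pvariance(rss)


population = list(range(1, 61))  # hypothetical finite population of 60 units
print(relative_precision(population))  # an estimate above 1 is expected
```

Both estimators are unbiased for the population mean here, so comparing their variances is the same as comparing their mean squared errors.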