# Admissible statistical tests for measurement

The application of various statistical tests to the different categories of measurement scales is discussed below.

**Statistical Tests for Nominal scale:** since the symbols or labels attached to the categories are arbitrary and can be interchanged without altering the essential information contained in the scale, the only descriptive statistics that can be used are those that are unaffected by such an interchange: the crude mode, proportions and frequencies. Nominal scale data can, however, be used for testing hypotheses about the distribution of events among the classes. The chi-square test, the contingency coefficient, and certain other tests based on the binomial expansion can be used for this purpose.
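The nominal-scale statistics named above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a substitute for a statistics package; the function names, the labels and the 2 x 2 table of counts are hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def mode_and_proportions(labels):
    """Return the modal category and the proportion of each category."""
    counts = Counter(labels)
    total = len(labels)
    mode = counts.most_common(1)[0][0]
    proportions = {category: n / total for category, n in counts.items()}
    return mode, proportions


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + N))."""
    chi2 = chi_square_statistic(table)
    n = sum(sum(row) for row in table)
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical data: category labels and a 2 x 2 table of frequencies.
labels = ["red", "blue", "red", "green"]
table = [[30, 10], [20, 40]]

print(mode_and_proportions(labels))
print(chi_square_statistic(table))
print(contingency_coefficient(table))
```

Swapping the two rows of `table`, which amounts to interchanging the category labels, leaves the chi-square statistic unchanged. This is the invariance that makes these statistics admissible for nominal data.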

**Statistical Tests for Ordinal scale:** the median is the most appropriate measure of central tendency for scores on an ordinal scale, and the quartile deviation is the corresponding measure of dispersion. A number of non-parametric tests are available for testing hypotheses with ordinal scores: the runs test, sign test, median test, Mann-Whitney U test, etc. These tests are often referred to as ‘order statistics’ or ‘ranking statistics’. Interrelations can be computed from the rankings of two sets of observations on the same group of individuals; Spearman’s rank-difference coefficient or Kendall’s rank correlation coefficient is appropriate in such situations.
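As a rough illustration of the ordinal-scale measures mentioned above, the sketch below computes the median, the quartile deviation and Spearman's rank-difference coefficient with the Python standard library. The helper names and the scores are hypothetical, and the rank formula used here assumes there are no tied observations.

```python
import statistics


def quartile_deviation(scores):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return (q3 - q1) / 2


def ranks(values):
    """Rank each value from 1 (smallest); this sketch does not handle ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied observations are not handled in this sketch")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores given by two judges to the same five individuals.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

print(statistics.median(judge_a))
print(quartile_deviation(judge_a))
print(spearman_rho(judge_a, judge_b))

# Any order-preserving change of the scores leaves the coefficient unchanged.
print(spearman_rho([v ** 3 for v in judge_a], judge_b))
```

Because only the ranks enter the calculation, replacing the scores by any strictly increasing function of them gives the same coefficient, which is why rank-based measures suit ordinal data.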

In applying tests to measurements on an ordinal scale, we assume that the observations are drawn from a distribution that is essentially continuous. Such an assumption is also made for all parametric tests. A continuous variate is one that is not restricted to isolated values. Within any given interval (for example, between two class limits), any number of values can be inserted. With an increase in the number of observations, more and more of these values are likely to be represented.

It will suffice, at this point, to remind readers that the crudeness of our measuring devices very often obscures the underlying continuity that may exist. The classification of respondents to an attitude statement into the categories strongly agree, agree, neutral, disagree and strongly disagree essentially presumes the presence of a continuum. If a variable is truly continuous and the instrument for measuring the property in question is sensitive enough, the probability of obtaining a tied observation is extremely small.

**Statistical Tests for Interval Scale:** the interval scale preserves both the ordering of objects and the relative differences between them, even though the numbers associated with the positions of the objects may be changed according to a regular system. A set of observations is on an interval scale if the information it carries is preserved under a linear transformation, that is, under any equation Y = a + bX, where ‘a’ is any real constant and ‘b’ is a positive constant.
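The effect of the transformation Y = a + bX can be checked numerically. The sketch below uses the Celsius to Fahrenheit conversion (a = 32, b = 1.8) purely as an illustrative example that is not taken from the text; the function names and readings are hypothetical.

```python
def linear_transform(x, a, b):
    """Apply Y = a + b * X; b must be positive to preserve order."""
    if b <= 0:
        raise ValueError("b must be a positive constant")
    return a + b * x


def ratio_of_differences(values):
    """Ratio of the second interval to the first for three readings."""
    return (values[2] - values[1]) / (values[1] - values[0])


# Hypothetical temperature readings in degrees Celsius.
celsius = [10.0, 20.0, 40.0]
fahrenheit = [linear_transform(c, a=32.0, b=1.8) for c in celsius]

# Relative differences survive the transformation ...
print(ratio_of_differences(celsius), ratio_of_differences(fahrenheit))

# ... but ratios of the values themselves do not.
print(celsius[1] / celsius[0], fahrenheit[1] / fahrenheit[0])
```

The ratio of intervals is the same in both units, whereas the ratio of two readings changes, so statements such as "twice as hot" carry no meaning on an interval scale.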

All the common descriptive statistics, such as the arithmetic mean, median, standard deviation and product-moment correlation, are applicable to data that follow an interval scale. Parametric tests of statistical significance such as the Z, t and F tests are also applicable to interval scale data.
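One way to see why such tests are admissible for interval data is that their test statistics do not change under a linear transformation with positive slope. The sketch below shows this for a one-sample t statistic; the function name, the data and the hypothesised mean are assumptions made for the illustration.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """t = (sample mean - mu0) / (sample standard deviation / sqrt(n))."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error


# Hypothetical interval-scale observations and hypothesised mean.
scores = [2.0, 4.0, 6.0, 8.0]
mu0 = 3.0

# The same observations after the transformation Y = 32 + 1.8 * X.
rescaled = [32.0 + 1.8 * x for x in scores]
rescaled_mu0 = 32.0 + 1.8 * mu0

print(one_sample_t(scores, mu0))
print(one_sample_t(rescaled, rescaled_mu0))
```

Both calls give the same value of t, so the conclusion of the test does not depend on which admissible unit and origin were used to record the data.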

**Statistical Tests for Ratio scale:** since the values on a ratio scale are real numbers with a true zero (and no upper limit) and only the unit of measurement is arbitrary, the ratios between any two values and between any two intervals are preserved even if all the values are multiplied by a positive constant. Any statistical test, parametric or non-parametric, is usable when a ratio scale is used. Statistical tools such as the geometric mean and the coefficient of variation, which require knowledge of true scores, can be used with observations that are on a ratio scale.
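A short numerical check may make the role of the true zero clearer. The sketch below, written with the Python standard library, changes the unit of some hypothetical weights from kilograms to grams (multiplication by a positive constant) and then shifts the origin (which is not admissible on a ratio scale); the data and names are assumptions for the example.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the arithmetic mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical weights in kilograms, then in grams (unit change only).
kilograms = [2.0, 4.0, 8.0]
grams = [1000.0 * w for w in kilograms]

# Shifting the origin is not an admissible change on a ratio scale.
shifted = [w + 10.0 for w in kilograms]

# Ratios between values and the coefficient of variation are unit-free.
print(kilograms[2] / kilograms[0], grams[2] / grams[0])
print(coefficient_of_variation(kilograms), coefficient_of_variation(grams))

# The geometric mean simply changes unit along with the data.
print(statistics.geometric_mean(kilograms), statistics.geometric_mean(grams))

# After a shift of origin the coefficient of variation is different.
print(coefficient_of_variation(shifted))
```

The coefficient of variation is the same in kilograms and in grams but changes once a constant is added, which is why it is meaningful only when the zero point is not arbitrary.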

**Criteria for judging the measuring instruments**

A measuring instrument, too, must satisfy certain criteria. The most important criteria for evaluating a measurement tool are described below.

i. Unidimensionality: this means the scale should measure one characteristic at a time, e.g., the ruler should measure length, not temperature.

ii. Linearity: this means that a scale should follow the straight-line model. Some scoring system should be devised, preferably one based on interchangeable units. On a ruler an inch is an inch whether it lies at one end of the ruler or at the other, but in attitude scales such interchangeability cannot be ensured. In such cases, ranking is preferable.

iii. Validity: this refers to the ability of a scale to measure what it is supposed to measure.

iv. Reliability: this is an attribute of consistency. A scale should give consistent results.

v. Accuracy and Precision: a tool should give an accurate and precise measure of what we want to measure.

vi. Simplicity: a scale should be as simple as possible; an elaborate, complicated, and over-refined scale may become unduly cumbersome, costly or even useless.

vii. Practicability: this is concerned with a wide range of factors such as cost effectiveness, convenience and interpretability. Some trade-off is usually needed between an ‘ideal’ tool and one that the budget can afford. The benefit to be derived should be commensurate with the cost incurred.