6 Data Types by Levels of Measurement

6.1 Why Levels of Measurement Matter

The level of measurement of a variable determines which statistics, visualisations, and models are legitimate. A variable that looks numeric is not automatically suitable for arithmetic: PIN codes, customer IDs, and the numeric codes of satisfaction ratings are stored as numbers but behave nothing like revenue in rupees. Applying the wrong statistic produces results that look plausible, pass most sanity checks, and can mislead the decisions built on them.

From measurement theory to analytics

The theory of measurement levels was formalised by the psychologist S. S. Stevens in a 1946 paper in Science. Stevens’ four levels (nominal, ordinal, interval, ratio) remain the working taxonomy in statistics, business analytics, and the social sciences. Every statistical method carries an implicit assumption about the level of its inputs; understanding that assumption is the first step to using the method correctly.

The practical consequence

Level of measurement is not a philosophical footnote. It chooses the statistic (mean versus median versus mode), the chart (bar versus histogram), and the model (linear regression versus logistic versus ordinal). The same data, classified at different levels, will answer different questions.

6.2 Stevens’ Four Levels

Stevens’ 1946 classification arranges measurement into four nested levels. Each level admits the operations of the level below plus one new operation. Moving up the hierarchy adds mathematical structure and expands the set of permissible statistics.

Level	Equality	Order	Equal intervals	True zero	Indian business example
Nominal	Yes	No	No	No	State code (TN, KA, MH)
Ordinal	Yes	Yes	No	No	CSAT rating (Low, Medium, High)
Interval	Yes	Yes	Yes	No	Temperature in °C at warehouse
Ratio	Yes	Yes	Yes	Yes	Monthly revenue in ₹

6.3 Nominal Scale

The nominal scale classifies observations into categories with no inherent order. The only valid operation is checking whether two values are equal.

Typical nominal variables

State of residence (TN, KA, MH, DL), product category (electronics, apparel, grocery), payment method (UPI, credit card, net banking, cash-on-delivery), account status (active, dormant, closed), blood group, gender. The categories are mutually exclusive and collectively exhaustive; the numeric codes sometimes assigned to them (1, 2, 3) are arbitrary labels, not measurements.

Permissible operations

Frequency counts, mode, chi-square tests of association, proportions. Reporting a mean over numeric codes is meaningless: the “average payment method” has no interpretation. In R, nominal variables are represented using factor() without the ordered argument.
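The nominal operations listed above can be sketched in base R. This is a minimal illustration on made-up data: the vectors `payment` and `region` and the helper name `modal_payment` are assumptions introduced here, not part of any real dataset.

```r
# Hypothetical payment-method and region data (values invented for illustration).
payment <- factor(c("UPI", "Credit card", "UPI", "Cash-on-delivery",
                    "UPI", "Net banking", "Credit card", "UPI"))
region  <- factor(c("TN", "KA", "TN", "MH", "KA", "TN", "MH", "KA"))

# Frequency counts and proportions are the basic nominal summaries.
counts <- table(payment)
props  <- prop.table(counts)

# Base R has no built-in statistical mode; take the most frequent level.
modal_payment <- names(counts)[which.max(counts)]

# Chi-square test of association between two nominal variables.
# With a sample this small the approximation is poor (R warns about it);
# the call is shown only to illustrate the syntax.
assoc <- suppressWarnings(chisq.test(table(payment, region)))

print(counts)
print(round(props, 2))
print(modal_payment)        # "UPI"
print(is.ordered(payment))  # FALSE: an unordered factor
print(assoc$p.value)
```

Because `payment` is an unordered factor, R treats its levels as labels only; the alphabetical order in which the levels are stored carries no meaning.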

6.4 Ordinal Scale

The ordinal scale adds order to the nominal scale. Values can be ranked, but the distance between consecutive ranks is not guaranteed to be equal.

Typical ordinal variables

Five-point Likert items (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree), credit-risk bands (AAA, AA, A, BBB, …), NPS buckets (Detractor, Passive, Promoter), customer tiers (Bronze, Silver, Gold, Platinum), education level (undergraduate, postgraduate, doctorate). The gap between “Agree” and “Strongly Agree” is not necessarily the same as the gap between “Neutral” and “Agree”.

Permissible operations

Median, mode, percentiles, rank-based tests (Mann-Whitney, Kruskal-Wallis, Spearman correlation). Means are technically not valid, although in practice a mean of Likert items is often reported and treated as approximately interval. In R, ordinal variables are represented using factor(..., ordered = TRUE).
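A short sketch of the ordinal operations, assuming made-up Likert responses from two hypothetical branches (`branch_a`, `branch_b`). It uses `quantile()` with `type = 1`, which returns an observed category and is therefore defined for ordered factors.

```r
likert_levels <- c("Strongly Disagree", "Disagree", "Neutral",
                   "Agree", "Strongly Agree")

# Hypothetical responses from two branches (invented data).
branch_a <- factor(c("Agree", "Neutral", "Strongly Agree", "Agree", "Disagree"),
                   levels = likert_levels, ordered = TRUE)
branch_b <- factor(c("Disagree", "Neutral", "Strongly Disagree", "Neutral", "Agree"),
                   levels = likert_levels, ordered = TRUE)

# Order comparisons are defined for ordered factors.
print(branch_a[1] > branch_a[2])   # Agree versus Neutral: TRUE

# Median category: type = 1 selects an actual observed level,
# so no averaging of category codes is involved.
median_a <- quantile(branch_a, probs = 0.5, type = 1)
print(median_a)

# Rank-based comparison of the two branches on the integer ranks.
# Ties rule out an exact p-value, hence exact = FALSE.
rank_test <- wilcox.test(as.integer(branch_a), as.integer(branch_b),
                         exact = FALSE)
print(rank_test$p.value)
```

Converting to integer ranks for `wilcox.test()` is safe here because the test uses only the ordering of the values, not the distances between them.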

6.5 Interval Scale

The interval scale adds equal spacing between adjacent values, so differences are meaningful. It lacks a true zero, which means ratios are not meaningful.

Typical interval variables

Temperature in Celsius or Fahrenheit, calendar year (2025 CE), IQ score, credit score (for example the 300 to 900 CIBIL range, when treated as interval). The difference between 20°C and 25°C equals the difference between 30°C and 35°C, but 20°C is not “twice as hot” as 10°C because 0°C is a convention, not an absence of temperature.

Permissible operations

Mean, standard deviation, Pearson correlation, t-tests, linear regression. Ratios and coefficients of variation are not valid because the zero point is arbitrary. In R, interval variables are stored as numeric.
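The claim that differences are meaningful but ratios are not can be checked numerically by expressing the same readings on scales with different zero points. The temperatures below are invented, and `cv` is a small helper defined here for illustration.

```r
# Hypothetical warehouse temperatures in degrees Celsius.
temp_c <- c(10, 20, 25, 30, 35)
temp_k <- temp_c + 273.15        # the same readings in kelvin
temp_f <- temp_c * 9 / 5 + 32    # the same readings in Fahrenheit

# Differences survive a shift of the zero point (Celsius versus kelvin).
diff_c <- temp_c[2] - temp_c[1]
diff_k <- temp_k[2] - temp_k[1]

# Ratios do not: the same two readings give a different ratio on each scale.
ratio_c <- temp_c[2] / temp_c[1]   # 2
ratio_f <- temp_f[2] / temp_f[1]   # 68 / 50 = 1.36
ratio_k <- temp_k[2] / temp_k[1]

# The coefficient of variation also depends on where zero is placed.
cv <- function(x) sd(x) / mean(x)

print(c(diff_c = diff_c, diff_k = diff_k))
print(c(ratio_c = ratio_c, ratio_f = ratio_f, ratio_k = ratio_k))
print(c(cv_celsius = cv(temp_c), cv_kelvin = cv(temp_k)))
```

The standard deviation is the same in Celsius and kelvin, but the mean is not, so the coefficient of variation changes with the choice of scale.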

6.6 Ratio Scale

The ratio scale adds a true zero that represents the absence of the quantity. All arithmetic operations are meaningful, including ratios.

Typical ratio variables

Monthly revenue in ₹, units sold, web-session duration in seconds, customer age in years, number of app sessions per week, distance travelled, inventory count. If a store’s sales doubled from ₹5 lakh to ₹10 lakh, the ratio is meaningful because ₹0 denotes no sales.

Permissible operations

All arithmetic, geometric mean, harmonic mean, coefficient of variation, log transforms (for strictly positive values), and any inferential test or regression model whose other assumptions hold. In R, ratio variables are stored as numeric (or integer when discrete). The distinction between interval and ratio seldom affects routine analytics, but matters for metrics like growth rates and elasticities that depend on ratios.
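Growth rates are a typical ratio-only metric. The sketch below, on invented monthly revenue figures, computes month-on-month growth factors and their geometric mean, and compares the result with the closed-form compound growth rate.

```r
# Hypothetical monthly revenue in lakh rupees (invented figures).
revenue <- c(5, 6, 7.5, 9, 10)

# Month-on-month growth factors are ratios, so they need a true zero.
growth <- revenue[-1] / revenue[-length(revenue)]

# Geometric mean of the growth factors: the average compound growth factor.
geo_mean_growth <- exp(mean(log(growth)))

# Equivalent closed form: (last / first)^(1 / number of periods).
closed_form <- (revenue[length(revenue)] / revenue[1])^(1 / length(growth))

# Coefficient of variation is meaningful because zero revenue means no revenue.
cv_revenue <- sd(revenue) / mean(revenue)

print(round(growth, 3))
print(geo_mean_growth)
print(closed_form)
print(cv_revenue)
```

Revenue doubles over four periods, so both calculations give the fourth root of 2, roughly 1.19, or about 19% compound growth per month.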

6.7 The Hierarchy of Measurement Levels

Each level includes the operations of the level below and adds one more. A variable measured at the ratio level can always be degraded to a lower level by discarding information (for example, binning revenue into “Low/Medium/High”), but the reverse is never possible.

flowchart LR
  A[Nominal<br/>equality only] --> B[Ordinal<br/>adds order]
  B --> C[Interval<br/>adds equal spacing]
  C --> D[Ratio<br/>adds true zero]
  classDef default fill:#004466,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;

Information loss is one-way

Always collect at the highest level the measurement process supports and bin later if needed. A rating captured as an integer from 0 to 10 can be analysed as continuous, ordinal, or nominal; a rating captured only as “Good/Neutral/Bad” cannot be recovered.
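Degrading a variable is a one-line operation in R, which is part of why it is safe to postpone. The sketch below bins invented revenue figures with `cut()`; the cut points (500 and 2000) are arbitrary assumptions chosen for illustration.

```r
# Hypothetical revenue in thousand rupees (ratio level).
revenue <- c(120, 480, 950, 1500, 2300, 4100)

# Ratio -> ordinal: bin into ordered bands.
# right = FALSE gives the intervals [0, 500), [500, 2000), [2000, Inf).
band <- cut(revenue,
            breaks = c(0, 500, 2000, Inf),
            labels = c("Low", "Medium", "High"),
            right = FALSE,
            ordered_result = TRUE)

# Ordinal -> nominal: drop the ordering and keep only the labels.
band_nominal <- factor(band, ordered = FALSE)

print(band)
print(table(band_nominal))

# Six distinct revenue values have collapsed into three bands;
# nothing in `band` allows the original figures to be reconstructed.
print(c(distinct_values = length(unique(revenue)), bands = nlevels(band)))
```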

6.8 Discrete and Continuous Variables

The distinction between discrete and continuous variables is orthogonal to Stevens’ scheme and cuts across it. Discrete variables take isolated values (typically integers); continuous variables can take any value within an interval.

Typical discrete variables

Count of orders placed, number of defects per batch, number of employees in a branch, number of logins per day. Discrete variables are usually modelled with count-data distributions (Poisson, negative binomial).

Typical continuous variables

Revenue, time, temperature, distance, weight. Continuous variables are usually modelled with continuous distributions (normal, log-normal, gamma). The distinction matters at the modelling stage because the wrong distributional assumption can produce biased confidence intervals and invalid tests.
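As a rough sketch of how the distributional choice appears in code, the example below fits an intercept-only Poisson model to invented defect counts and estimates log-normal parameters for invented revenue figures. All data and object names are assumptions made for illustration.

```r
# Discrete: hypothetical defect counts per batch, stored as integers.
defects <- c(0L, 2L, 1L, 3L, 0L, 1L, 4L, 1L)

# Intercept-only Poisson model; the fitted rate equals the sample mean.
fit_pois <- glm(defects ~ 1, family = poisson(link = "log"))
rate_hat <- exp(coef(fit_pois))[["(Intercept)"]]

# Continuous: hypothetical revenue figures, stored as doubles.
revenue <- c(12.5, 48.2, 30.1, 75.9, 22.4)

# Log-normal parameters estimated from the logged values.
meanlog_hat <- mean(log(revenue))
sdlog_hat   <- sd(log(revenue))

print(c(rate_hat = rate_hat, sample_mean = mean(defects)))
print(c(meanlog = meanlog_hat, sdlog = sdlog_hat))
print(c(defects = typeof(defects), revenue = typeof(revenue)))
```

Storing counts as integer and measurements as double is a convention rather than a requirement, but it makes the intended distinction visible in the data itself.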

6.9 Qualitative and Quantitative Variables

An older and coarser classification splits variables into qualitative (categorical) and quantitative (numerical). Qualitative variables correspond roughly to nominal and ordinal; quantitative variables correspond to interval and ratio. The split is convenient in everyday language but too blunt for analytics, where the distinction between ordinal and interval, or between interval and ratio, often decides which technique is appropriate.

6.10 Permissible Statistics by Level

Level	Central tendency	Dispersion	Comparison	Example inferential test
Nominal	Mode	-	Frequency, proportion	Chi-square
Ordinal	Median, mode	Range, IQR	Rank	Mann-Whitney, Kruskal-Wallis
Interval	Mean, median, mode	SD, variance	Difference	t-test, ANOVA
Ratio	All of the above plus geometric and harmonic mean	SD, CV	Difference and ratio	All of the above plus log-linear models
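The table can be turned into a small guard function that picks summaries from the object type. `summarise_by_level()` is a hypothetical helper written for this illustration, not a base R function, and it cannot tell interval from ratio because both are stored as numeric.

```r
# Hypothetical helper: choose summaries permitted by the encoded level.
summarise_by_level <- function(x) {
  if (is.ordered(x)) {
    list(level  = "ordinal",
         median = as.character(quantile(x, probs = 0.5, type = 1)),
         mode   = names(which.max(table(x))))
  } else if (is.factor(x)) {
    list(level = "nominal",
         mode  = names(which.max(table(x))))
  } else if (is.numeric(x)) {
    # Interval and ratio are indistinguishable from the type alone.
    list(level = "interval or ratio",
         mean  = mean(x),
         sd    = sd(x))
  } else {
    stop("unsupported type: ", class(x)[1])
  }
}

# Invented example data.
state <- factor(c("TN", "KA", "TN", "MH"))
tier  <- factor(c("Gold", "Silver", "Silver", "Bronze", "Silver"),
                levels = c("Bronze", "Silver", "Gold"), ordered = TRUE)
sales <- c(5, 10, 7.5, 12)

res_state <- summarise_by_level(state)
res_tier  <- summarise_by_level(tier)
res_sales <- summarise_by_level(sales)

str(res_state)
str(res_tier)
str(res_sales)
```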

6.11 Representing Levels in R

The level of measurement should be encoded in the object type so that downstream functions treat it correctly.


Calling mean() on a factor does not raise an error: it returns NA with a warning that the argument is not numeric or logical. Calling sum() on a factor, ordered or not, stops with an error. Either way, the object type is itself a form of documentation that protects later analysis from invalid operations.
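A minimal sketch of one way to encode each level, using invented values and variable names. It also shows what base R does when arithmetic is attempted on a factor: `mean()` returns `NA` with a warning, and `sum()` stops with an error.

```r
# Nominal: unordered factor.
state <- factor(c("TN", "KA", "MH", "TN"))

# Ordinal: ordered factor with explicitly stated level order.
csat <- factor(c("Low", "High", "Medium", "High"),
               levels = c("Low", "Medium", "High"), ordered = TRUE)

# Interval and ratio: numeric (double); discrete ratio: integer.
temp_c  <- c(24.5, 27.0, 31.2, 29.8)
revenue <- c(5.0, 7.5, 10.0, 6.2)
units   <- c(12L, 30L, 45L, 18L)

# Ordered factors support comparison and max(); unordered ones do not.
print(csat[2] > csat[1])   # High versus Low: TRUE
print(max(csat))

# mean() on a factor gives NA plus a warning (suppressed here).
mean_state <- suppressWarnings(mean(state))

# sum() on a factor is an error; capture the message instead of stopping.
sum_csat <- tryCatch(sum(csat), error = function(e) conditionMessage(e))

print(mean_state)
print(sum_csat)
str(list(state = state, csat = csat, temp_c = temp_c,
         revenue = revenue, units = units))
```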

6.12 Common Mistakes

Four frequent errors

1. Averaging a single Likert item as if it were interval. An “average satisfaction of 3.4” on a 1-5 scale treats the ordinal gaps as equal when they are not guaranteed to be. Report the median, the distribution, or a validated composite score.
2. Computing ratios on interval data. Saying 30°C is “twice as hot” as 15°C is meaningless; rework the question on the Kelvin scale if a ratio is really needed.
3. Coding a category as an integer and fitting linear regression on it. Using 1 = Bronze, 2 = Silver, 3 = Gold, 4 = Platinum and treating the code as continuous imposes arbitrary equal spacing. Use dummy variables when the category is a predictor, or ordinal regression when it is the outcome.
4. Leaving ID columns as numeric. Customer ID, PIN code, and account number look numeric but are nominal. Convert them to character or factor so they are never fed into arithmetic or into a regression as a predictor.
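Two of these fixes, recoding ID columns and replacing integer tier codes with dummy variables, can be sketched as follows. The data frame `customers` and all of its values are invented for illustration.

```r
# Hypothetical customer table with IDs and tier codes stored as numbers.
customers <- data.frame(
  customer_id = c(100234L, 100235L, 100236L, 100237L),
  pin_code    = c(600001L, 560001L, 400001L, 110001L),
  tier_code   = c(1L, 3L, 2L, 4L),
  spend       = c(1200, 5400, 2600, 9100)
)

# IDs and PIN codes are nominal labels: store them as character.
customers$customer_id <- as.character(customers$customer_id)
customers$pin_code    <- as.character(customers$pin_code)

# Replace the integer tier code with a labelled factor.
customers$tier <- factor(customers$tier_code,
                         levels = 1:4,
                         labels = c("Bronze", "Silver", "Gold", "Platinum"))

# Dummy (treatment-coded) columns for use as a regression predictor;
# Bronze is the reference level and gets no column of its own.
dummies <- model.matrix(~ tier, data = customers)

str(customers)
print(dummies)
```

Once the ID columns are character, an accidental `mean(customers$pin_code)` returns `NA` with a warning instead of a plausible-looking number.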

6.13 Implications for Modelling

Level of measurement selects the model family

The outcome’s level of measurement chooses the model. A binary nominal outcome calls for logistic regression. A multi-category nominal outcome calls for multinomial logistic regression. An ordinal outcome (satisfaction band, credit tier) calls for ordinal regression (proportional-odds or similar). A discrete count outcome (number of claims, number of logins) calls for Poisson or negative binomial regression. A continuous ratio outcome (revenue, duration) calls for linear regression, possibly after a log transform. Selecting the right family is not a stylistic choice; it affects every inference drawn from the model.
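The mapping from outcome level to model family can be sketched with base R on simulated data. Everything below (the predictor `x`, the outcomes, and the coefficients used to simulate them) is an assumption made for illustration. Multinomial and ordinal outcomes need additional packages, for example `nnet::multinom()` and `MASS::polr()`, and are left out of this sketch.

```r
set.seed(42)
n <- 200
x <- rnorm(n)   # a single simulated predictor

# Simulated outcomes at three different levels of measurement.
churned <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x))  # binary nominal
logins  <- rpois(n, lambda = exp(0.3 + 0.5 * x))               # discrete count
revenue <- exp(2 + 0.4 * x + rnorm(n, sd = 0.3))               # continuous ratio

# Binary outcome: logistic regression.
fit_logit <- glm(churned ~ x, family = binomial)

# Count outcome: Poisson regression.
fit_pois <- glm(logins ~ x, family = poisson)

# Positive continuous outcome: linear regression on the log scale.
fit_lm <- lm(log(revenue) ~ x)

print(coef(fit_logit))
print(coef(fit_pois))
print(coef(fit_lm))
```

The three calls differ only in the `family` argument or the transformation of the outcome, which is how the level of measurement enters the model specification.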

Summary

Concept	Description
Levels
Why Levels Matter	Measurement level controls which methods are valid
Nominal Scale	Unordered categorical labels (gender, region, brand)
Ordinal Scale	Ordered categories whose intervals are not guaranteed equal (Likert, rank)
Interval Scale	Equal intervals with no true zero (Celsius, calendar year)
Ratio Scale	Equal intervals with a true zero (height, weight, income)
What You Can Do
Permissible Operations	Each level allows specific arithmetic and statistical operations
Choice of Statistic	Mode, median, mean — pick by the level that admits them
Choice of Test	Chi-square, ranks, t-tests — match to measurement level