## A simple implementation of Page's *L* test

Posted: **2023-02-16** · Last updated: **2023-12-02**

Page's *L* test is a little-known nonparametric test for a monotonic trend across ordered treatments. It was originally described in this paper (shorter version here).

There is already this R implementation, but it behaves somewhat oddly and emits warnings without explaining how to address them. Here is an alternative implementation. Let `X` be a matrix of outcomes from a within-subjects experiment in wide format. That is, each row represents one experimental subject and each column one treatment, with the columns arranged in the order of the hypothesized increasing trend.
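If the data arrive in long format (one row per subject and treatment), they have to be reshaped first. Here is a minimal sketch, assuming a hypothetical data frame `long` with columns `subject`, `treatment`, and `y` (all names and values invented for illustration) and exactly one observation per subject and treatment:

```r
# Hypothetical long-format data: 3 subjects, 3 ordered treatments.
long <- data.frame(
  subject   = rep(1:3, each = 3),
  treatment = rep(c("low", "mid", "high"), times = 3),
  y         = c(1.2, 2.5, 3.1,
                0.7, 0.9, 2.2,
                1.5, 1.1, 2.8)
)

# The factor levels determine the column order, so list them in the
# order of the hypothesized increasing trend.
long$treatment <- factor(long$treatment, levels = c("low", "mid", "high"))

# One row per subject, one column per treatment. `mean` is only a
# placeholder aggregator; it assumes a single observation per cell.
X <- tapply(long$y, list(long$subject, long$treatment), mean)
print(X)
```

The column order matters because the test weights the column rank sums by 1, 2, …, *n*.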

In the following, I will use the example from Page's paper. My code (licensed under CC0) for Page's *L* test automatically converts the data in `X` to ranks within each subject:

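    X <- matrix(c(2, 1, 3, 4,
                  1, 3, 4, 2,
                  1, 3, 2, 4,
                  1, 4, 2, 3,
                  3, 1, 2, 4,
                  1, 2, 4, 3), ncol = 4, byrow = TRUE)

    page.l <- function(X) {
      n <- ncol(X)  # number of treatments
      m <- nrow(X)  # number of subjects
      # rank the outcomes within each subject (row)
      X_rank <- t(apply(X, 1, rank))
      # weight the column rank sums by 1, ..., n
      L <- sum(colSums(X_rank) * seq_len(n))
      # chi-squared approximation with 1 degree of freedom
      stat <- ((12*L - 3*m*n*(n+1)^2)^2) / (m*n^2*(n^2-1)*(n+1))
      p <- 1 - pchisq(stat, 1)
      list(L = L, stat = stat, p.value = p)
    }

    print(page.l(X))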

The result is:

    $L
    [1] 168

    $stat
    [1] 6.48

    $p.value
    [1] 0.0109095
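These numbers can be checked by hand. The sketch below spells out the arithmetic for Page's example; the intermediate names `R`, `num`, and `den` are introduced here for illustration only. Because the example data are already ranks, the column sums of `X` are the rank sums.

```r
X <- matrix(c(2, 1, 3, 4,
              1, 3, 4, 2,
              1, 3, 2, 4,
              1, 4, 2, 3,
              3, 1, 2, 4,
              1, 2, 4, 3), ncol = 4, byrow = TRUE)
m <- nrow(X)  # 6 subjects
n <- ncol(X)  # 4 treatments

R <- colSums(X)                            # rank sums: 9 14 17 20
L <- sum(R * seq_len(n))                   # 9*1 + 14*2 + 17*3 + 20*4 = 168
num <- (12 * L - 3 * m * n * (n + 1)^2)^2  # (2016 - 1800)^2 = 46656
den <- m * n^2 * (n^2 - 1) * (n + 1)       # 6 * 16 * 15 * 5 = 7200
stat <- num / den                          # 46656 / 7200 = 6.48

print(c(L = L, stat = stat))
```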

The *p*-value is calculated from the χ² approximation (with one degree of freedom) described in Page's original paper. The `p.value` returned by the code above reflects the **two-tailed test**; as Page wrote, if “a one-sided test is desired, __as will almost always be the case__, the probability discovered [from the χ² distribution] should be __halved__” (emphasis in original). Halving is only appropriate when the observed trend points in the hypothesized direction, as it does here: *L* = 168 exceeds its null expectation of 150. The exact critical value for his example (one-tailed test, *m* = 6, *n* = 4) is 167 at *α* = 0.01, so the observed *L* is significant at that level. The approximation agrees: the code above gives a one-sided *p* ≈ 0.0109 / 2 ≈ 0.0055.
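Instead of halving by hand, one could compute the one-sided *p*-value directly from the signed square root of the χ² statistic, which is approximately standard normal under the null. The following is a sketch of that idea, not part of the implementation above: `page.l.one.sided` and `z` are names I made up, and the columns of `X` are assumed to be in the hypothesized increasing order.

```r
# Hypothetical helper: one-sided Page's L test via the normal approximation.
page.l.one.sided <- function(X) {
  n <- ncol(X)
  m <- nrow(X)
  X_rank <- t(apply(X, 1, rank))
  L <- sum(colSums(X_rank) * seq_len(n))
  # Signed square root of the chi-squared statistic; positive values
  # indicate a trend in the hypothesized direction.
  z <- (12 * L - 3 * m * n * (n + 1)^2) / sqrt(m * n^2 * (n^2 - 1) * (n + 1))
  list(L = L, z = z, p.value = pnorm(z, lower.tail = FALSE))
}

X <- matrix(c(2, 1, 3, 4,
              1, 3, 4, 2,
              1, 3, 2, 4,
              1, 4, 2, 3,
              3, 1, 2, 4,
              1, 2, 4, 3), ncol = 4, byrow = TRUE)
res <- page.l.one.sided(X)
print(res)
```

When `z` is positive, this equals half the two-tailed *p*-value; when the trend runs against the hypothesis, it correctly returns a value above 0.5 instead.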

Perhaps I will one day reconstruct the computer programs that Page used to calculate the exact critical values. They must have been very time-consuming to run on the computers available in the early 1960s.
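On modern hardware, the exact null distribution is cheap to obtain for small designs. The sketch below is my own approach, not a reconstruction of Page's programs. It assumes no ties, so that within each subject all *n*! rank orders are equally likely under the null, and it builds the distribution of *L* by convolving the per-subject contributions across *m* independent subjects. The names `perms` and `exact.page.pmf` are hypothetical.

```r
# All permutations of a vector, one per row (only sensible for small n).
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

# Exact null distribution of L for m subjects and n treatments (no ties).
exact.page.pmf <- function(m, n) {
  # Contribution of one subject to L for each of the n! rank orders
  s <- as.vector(perms(seq_len(n)) %*% seq_len(n))
  one <- table(s) / length(s)
  vals <- as.numeric(names(one))
  probs <- as.numeric(one)
  v <- vals
  p <- probs
  # Add one independent subject at a time (discrete convolution)
  for (i in seq_len(m - 1)) {
    sums <- as.vector(outer(v, vals, "+"))
    pr <- as.vector(outer(p, probs))
    agg <- tapply(pr, sums, sum)
    v <- as.numeric(names(agg))
    p <- as.numeric(agg)
  }
  data.frame(L = v, prob = p)
}

pmf <- exact.page.pmf(m = 6, n = 4)
upper.tail <- rev(cumsum(rev(pmf$prob)))   # P(L >= l) for each support point
p.exact <- upper.tail[pmf$L == 168]        # exact one-sided p for L = 168
crit.01 <- min(pmf$L[upper.tail <= 0.01])  # smallest L significant at 1%
print(c(p.exact = p.exact, crit.01 = crit.01))
```

The resulting critical value can be compared against the tabulated one, and the exact *p*-value against the χ² approximation.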

### The size of Page's *L* test

In my opinion, Page's *L* test typically performs very well: it is well-powered, as is often (though not entirely correctly) expected of nonparametric tests. But, as is also well known, one could construct a test with power 1 by always rejecting *H*₀. Let us therefore investigate the (empirical) size of Page's *L* test, that is, how often it rejects a true null hypothesis. We will simulate data consistent with the null hypothesis of no trend and see how the *p*-values are distributed.

I simulate the case of 100 subjects and 5 treatments. Outcomes are independent N(0,1) draws, so there is no trend. Feel free to change these parameters and report back. Here is the code:

    library(pbapply) # for pbreplicate

    set.seed(622213478)

    gen.data <- function () {
      matrix(rnorm(5*100, 0, 1), ncol = 5)
    }

    p.sim <- pbreplicate(10000, page.l(gen.data())$p.value)

    ks.test(p.sim, punif) # rejects at 5% level

    mean(p.sim < 0.01) # fine
    mean(p.sim < 0.05) # fine
    mean(p.sim < 0.1) # fine

    summary(p.sim) # also fine

    plot(ecdf(p.sim)) # looks fine, too
    abline(0, 1, col = "red")

The Kolmogorov-Smirnov test rejects the null hypothesis that the *p*-values are uniformly distributed on [0,1]. (Under *H*₀, *p*-values should be U(0,1).) Visual inspection and `summary()` suggest, however, that this is likely due to slightly irregular behavior at high values of *α*. Overall, Page's *L* test seems to be well sized: at the usual values of *α*, the share of (false) rejections is almost exactly the nominal level. At unusual values of *α*, it may get a little wobbly!
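To judge whether the empirical rejection rates are "fine", it helps to attach Monte Carlo uncertainty to them. The following sketch repeats the simulation with base R's `replicate()` and fewer replications (2000, to keep it quick; the seed is arbitrary) and reports exact binomial confidence intervals for the rejection rate at each *α*. The object name `size` is mine.

```r
page.l <- function(X) {
  n <- ncol(X)
  m <- nrow(X)
  X_rank <- t(apply(X, 1, rank))
  L <- sum(colSums(X_rank) * seq_len(n))
  stat <- ((12*L - 3*m*n*(n+1)^2)^2) / (m*n^2*(n^2-1)*(n+1))
  list(L = L, stat = stat, p.value = 1 - pchisq(stat, 1))
}

set.seed(1)  # arbitrary seed, not the one used in the post
p.sim <- replicate(2000, page.l(matrix(rnorm(5 * 100), ncol = 5))$p.value)

alphas <- c(0.01, 0.05, 0.1)
size <- t(sapply(alphas, function(a) {
  k <- sum(p.sim < a)  # number of false rejections at level a
  ci <- binom.test(k, length(p.sim), p = a)$conf.int
  c(alpha = a, rate = k / length(p.sim), lower = ci[1], upper = ci[2])
}))
print(size)
```

If the nominal *α* lies inside the interval, the simulation gives no evidence against correct size at that level.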

I suspect that Page's *L* test is slightly conservative if `X` has many ties, but this requires further investigation.
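A first rough check of this conjecture could look like the sketch below. It compares the rejection rate at *α* = 0.05 for continuous data against heavily tied data, here binary outcomes, which are an extreme case chosen by me for illustration. The presumable mechanism is that `rank()` assigns mid-ranks to ties, which shrinks the variance of *L*, while the denominator of the statistic assumes untied ranks. The helper names `sim.rate`, `gen.cont`, and `gen.ties` are hypothetical.

```r
page.l <- function(X) {
  n <- ncol(X)
  m <- nrow(X)
  X_rank <- t(apply(X, 1, rank))  # ties receive mid-ranks by default
  L <- sum(colSums(X_rank) * seq_len(n))
  stat <- ((12*L - 3*m*n*(n+1)^2)^2) / (m*n^2*(n^2-1)*(n+1))
  list(L = L, stat = stat, p.value = 1 - pchisq(stat, 1))
}

# Share of simulated null data sets that are rejected at level alpha.
sim.rate <- function(gen, reps = 2000, alpha = 0.05) {
  mean(replicate(reps, page.l(gen())$p.value) < alpha)
}

# No ties (continuous outcomes) versus many ties (binary outcomes).
gen.cont <- function() matrix(rnorm(5 * 100), ncol = 5)
gen.ties <- function() matrix(sample(0:1, 5 * 100, replace = TRUE), ncol = 5)

set.seed(1)  # arbitrary seed
rates <- c(continuous = sim.rate(gen.cont), binary = sim.rate(gen.ties))
print(rates)
```

A rejection rate clearly below 0.05 for the tied data would support the conjecture. Milder tie patterns, such as rounded continuous data, should be tried as well before drawing conclusions.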