\documentclass{article} \usepackage{indentfirst} \usepackage{apalike} \usepackage{amsmath} \usepackage{url} \newcommand{\set}[1]{\{\, #1 \,\}} \DeclareMathOperator{\pr}{Pr} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\logit}{logit} \newcommand{\BinomialDis}{\text{Bin}} \newcommand{\NegativeBinomialDis}{\text{NegBin}} \newcommand{\PoissonDis}{\text{Pois}} \RequirePackage{amsthm} \newtheorem{theorem}{Theorem} % \VignetteIndexEntry{UMP Package Design Document} \begin{document} \title{Design of the \texttt{ump} Package} \author{Charles J. Geyer \and Glen D. Meeden} \maketitle \section{Introduction} \subsection{The UMP and UMPU Problems} The following is mostly taken from the first draft of the fuzzy confidence intervals and $P$-values paper. \subsubsection{UMP} Lehmann (1959, pp.~68--69) says for a one-parameter model with likelihood ratio monotone in the statistic $T(X)$ there exists a UMP test having null hypothesis $H_0 = \set{ \vartheta : \vartheta \le \theta }$, alternative hypothesis $H_1 = \set{ \vartheta : \vartheta > \theta }$, significance level $\alpha$, and critical function $\phi$ defined by \begin{equation} \label{eq:ump-crit} \phi(x, \alpha, \theta) = \begin{cases} 1, & T(x) > C \\ \gamma, & T(x) = C \\ 0, & T(x) < C \end{cases} \end{equation} where the constants $\gamma$ and $C$ are determined by $$ E_\theta\{\phi(X, \alpha, \theta)\} = \alpha. $$ The description of the analogous lower-tailed test is the same except that all inequalities are reversed. The constant $C$ is clearly any $(1 - \alpha)$-th quantile of the distribution of $T(X)$ for the parameter value $\theta$. If $C$ is not an atom of this distribution, then the test is effectively not randomized and the value of $\gamma$ is irrelevant. Otherwise \begin{equation} \label{eq:gamma-ump} \gamma = \frac{\alpha - \pr_\theta\{T(X) > C\}}{\pr_\theta\{T(X) = C\}}. \end{equation} \subsubsection{UMPU} \label{sec:umpu} Lehmann (1959, pp.~126--127) says for a one-parameter exponential family model with canonical statistic $T(X)$ and canonical parameter $\theta$ there exists a UMPU test having null hypothesis $H_0 = \set{ \vartheta : \vartheta = \theta }$, alternative hypothesis $H_1 = \set{ \vartheta : \vartheta \neq \theta }$, significance level $\alpha$, and critical function $\phi$ defined by \begin{equation} \label{eq:umpu-crit} \phi(x, \alpha, \theta) = \begin{cases} 1, & T(x) < C_1 \\ \gamma_1, & T(x) = C_1 \\ 0, & C_1 < T(x) < C_2 \\ \gamma_2, & T(x) = C_2 \\ 1, & C_2 < T(x) \end{cases} \end{equation} where $C_1 \le C_2$ and the constants $\gamma_1$, $\gamma_2$, $C_1$, and $C_2$ are determined by \begin{subequations} \begin{align} E_\theta\{\phi(X, \alpha, \theta)\} & = \alpha \label{eq:umpu-a} \\ E_\theta\{T(X) \phi(X, \alpha, \theta)\} & = \alpha E_\theta\{T(X)\} \label{eq:umpu-b} \end{align} \end{subequations} If $C_1 = C_2 = C$ in \eqref{eq:umpu-crit} then $\gamma_1 = \gamma_2 = \gamma$ also. This occurs only in a very special case. 
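Before analyzing this special case, it may help to record how the critical function is evaluated once the constants are known. The following minimal R sketch (ours, not part of the package, and the function name is made up) just evaluates \eqref{eq:umpu-crit} for given constants $C_1 \le C_2$, $\gamma_1$, and $\gamma_2$; determining those constants is what the rest of this section is about.
\begin{verbatim}
## minimal sketch, not package code: evaluate the UMPU critical function
## phi(x) given the constants c1 <= c2 and randomization probabilities
## gamma1, gamma2 (gamma1 == gamma2 when c1 == c2)
phi.umpu <- function(x, c1, c2, gamma1, gamma2) {
    ifelse(x < c1, 1,
        ifelse(x == c1, gamma1,
            ifelse(x < c2, 0,
                ifelse(x == c2, gamma2, 1))))
}
\end{verbatim}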
Define
\begin{subequations}
\begin{align}
p & = \pr_\theta\{T(X) = C\} \label{eq:p} \\
\mu & = E_\theta\{T(X)\} \label{eq:mu}
\end{align}
\end{subequations}
Then in order to satisfy \eqref{eq:umpu-a} and \eqref{eq:umpu-b} we must have
\begin{align*}
1 - (1 - \gamma) p & = \alpha \\
\mu - C (1 - \gamma) p & = \alpha \mu
\end{align*}
which solved for $\gamma$ and $C$ gives
\begin{subequations}
\begin{align}
\gamma & = 1 - \frac{1 - \alpha}{p} \label{eq:umpu-spec-a} \\
C & = \mu \label{eq:umpu-spec-b}
\end{align}
\end{subequations}
Thus this special case occurs only when $\mu$ is an atom of the distribution of $T(X)$ for the parameter value $\theta$, and then only for very large significance levels: $\alpha > 1 - p$. Hence this special case is of no practical importance, although it is of some computational importance to get every case right, with no weird bogus results or crashes in unusual special cases.
Returning to the general case, assume for a second that we have particular $C_1$ and $C_2$ that work for some $x$, $\alpha$, and $\theta$ (we will see how to determine $C_1$ and $C_2$ presently). With $\mu$ still defined by \eqref{eq:mu} and with the definitions
\begin{subequations}
\begin{align}
p_i & = \pr_\theta\{ T(X) = C_i \}, \qquad i = 1, 2 \label{eq:pi} \\
p_{1 2} & = \pr_\theta\{ C_1 < T(X) < C_2 \} \label{eq:p12} \\
m_{1 2} & = E_\theta\{ T(X) I_{(C_1, C_2)}[T(X)] \} \label{eq:m12}
\end{align}
\end{subequations}
\eqref{eq:umpu-a} and \eqref{eq:umpu-b} become
\begin{subequations}
\begin{align}
1 - (1 - \gamma_1) p_1 - (1 - \gamma_2) p_2 - p_{1 2} & = \alpha \label{eq:umpu-a-general} \\
\mu - C_1 (1 - \gamma_1) p_1 - C_2 (1 - \gamma_2) p_2 - m_{1 2} & = \alpha \mu \label{eq:umpu-b-general}
\end{align}
\end{subequations}
which solved for $\gamma_1$ and $\gamma_2$ give
\begin{subequations}
\begin{align}
\gamma_1 & = 1 - \frac{(1 - \alpha) (C_2 - \mu) + m_{1 2} - C_2 p_{1 2}}{p_1 (C_2 - C_1)} \label{eq:gamma-1} \\
\gamma_2 & = 1 - \frac{(1 - \alpha) (\mu - C_1) - m_{1 2} + C_1 p_{1 2}}{p_2 (C_2 - C_1)} \label{eq:gamma-2}
\end{align}
\end{subequations}
Note that \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are linear in $\alpha$. They are valid over the range of $\alpha$ (if any) such that both equations give values between zero and one.
Now we turn to the determination of $C_1$ and $C_2$. We present an algorithm \label{pg:algo} that determines $\phi(x, \alpha, \theta)$ for any discrete one-parameter exponential family with canonical statistic $T(X)$ for all values of $x$ and $\alpha$ for one fixed value of $\theta$.
\begin{enumerate}
\item Start with $\alpha = 1$.
\begin{enumerate}
\item If $\mu$ given by \eqref{eq:mu} is an atom, then $\phi(x, \alpha, \theta)$ is given by \eqref{eq:umpu-crit} with $C_1 = C_2 = \mu$ and $\gamma_1 = \gamma_2 = \gamma$ given by \eqref{eq:p}, \eqref{eq:umpu-spec-a}, and \eqref{eq:umpu-spec-b} over the range of $\alpha$ such that \eqref{eq:umpu-spec-a} is between zero and one.
\item If $\mu$ given by \eqref{eq:mu} is not an atom, then choose $C_1$ and $C_2$ to be adjacent atoms such that $C_1 < \mu < C_2$ and $\phi(x, \alpha, \theta)$ is given by \eqref{eq:umpu-crit} with $\gamma_1$ and $\gamma_2$ given by \eqref{eq:pi}, \eqref{eq:p12}, \eqref{eq:m12}, \eqref{eq:gamma-1}, and \eqref{eq:gamma-2} over the range of $\alpha$ such that both \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are between zero and one. [Because $C_1$ and $C_2$ are adjacent atoms, $p_{1 2} = m_{1 2} = 0$, and $\alpha = 1$ gives $\gamma_1 = \gamma_2 = 1$, a valid solution.]
\end{enumerate}
\item Start with the smallest $\alpha$ for which $\phi(x, \alpha, \theta)$ was determined in step 1 or a previous iteration of step 2. At this point, either $\gamma_1$ or $\gamma_2$ is zero (or both are).
\begin{enumerate}
\item If $\gamma_1$ is zero, then decrease $C_1$ to the adjacent lower atom and set $\gamma_1 = 1$ [which does not change the value of $\phi(x, \alpha, \theta)$ for any $x$].
\item If $\gamma_2$ is zero, then increase $C_2$ to the adjacent higher atom and set $\gamma_2 = 1$ [which does not change the value of $\phi(x, \alpha, \theta)$ for any $x$].
\item Now $\phi(x, \alpha, \theta)$ is given by \eqref{eq:umpu-crit} with $\gamma_1$ and $\gamma_2$ given by \eqref{eq:pi}, \eqref{eq:p12}, \eqref{eq:m12}, \eqref{eq:gamma-1}, and \eqref{eq:gamma-2} over the range of $\alpha$ such that both \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are between zero and one [because of steps (a) and (b), both $\gamma_i$ are now greater than zero, so $\alpha$ can be decreased].
\end{enumerate}
\item Repeat step 2 until the whole range $0 \le \alpha \le 1$ is covered.
\end{enumerate}
This algorithm is certainly unwieldy, but it does make clear that $\alpha \mapsto \phi(x, \alpha, \theta)$ is (1) continuous, (2) piecewise linear, (3) nondecreasing, and (4) onto $[0, 1]$. Hence it is the distribution function of a continuous random variable (an abstract randomized $P$-value). Clearly, it is differentiable on each linear piece and the derivative is piecewise constant (a step function).
\subsection{Endpoint Behavior} \label{sec:endpoint}
The UMPU test is not well defined when the null hypothesis is on the boundary of the parameter space. But equations \eqref{eq:umpu-crit}, \eqref{eq:umpu-a}, and \eqref{eq:umpu-b} still make sense and define a test. Since the probability and the expectation in those equations are continuous in $\theta$ this also characterizes the behavior as $\theta$ converges to a boundary point (which we need to know to calculate fuzzy confidence intervals, which involve all $\theta$ in the parameter space).
\begin{theorem} \label{th:umpu-end}
Suppose the setup for a UMPU test described in the first paragraph of Section~\ref{sec:umpu}. In addition suppose that $T(X)$ has a topologically discrete distribution (concentrated on a countable set of atoms and the atoms topologically isolated) not concentrated at one point. If the support $S$ of $T(X)$ has a lower bound and $L$ is the two-point set consisting of the two smallest atoms of the support, then
\begin{equation} \label{eq:umpu-end}
1 - \phi(x, \alpha, \theta) \to (1 - \alpha) I_L(x), \qquad \text{as $\theta \to - \infty$}.
\end{equation}
Similarly, if the support has an upper bound and $L$ consists of the two largest atoms, then \eqref{eq:umpu-end} holds with $- \infty$ replaced by $+ \infty$.
\end{theorem}
\begin{proof}
We do the case where the support has a lower bound. The upper bound case is entirely analogous. The densities in the family of distributions of $T(X)$ have the form
\begin{equation} \label{eq:expo-dens}
f_\theta(s) = \frac{1}{c(\theta)} e^{s \theta} \lambda(s), \qquad s \in S,
\end{equation}
where $\lambda$ is some strictly positive function, and
\begin{equation} \label{eq:expo-norm}
c(\theta) = \sum_{s \in S} e^{s \theta} \lambda(s).
\end{equation}
The natural parameter space of the family is the set $\Theta$ of $\theta$ such that \eqref{eq:expo-norm} is finite. Because $S$ is bounded below, if $c(\psi) < \infty$, then $c(\theta) < \infty$ for all $\theta < \psi$.
Thus $\Theta$ is either the whole real line or a semi-infinite interval extending to $- \infty$. For every $\theta$ in the interior of $\Theta$, the distribution with density $f_\theta$ has a moment generating function $M_\theta$ defined by $$ M_\theta(t) = E_\theta\{e^{t T(X)}\} = \frac{c(\theta + t)}{c(\theta)}, $$ and hence this distribution has moments of all orders, the mean and variance being given by derivatives at zero of the cumulant generating function $\log M_\theta$ \begin{align*} \mu(\theta) & = E_\theta\{T(X)\} = \frac{d}{d \theta} \log c(\theta) \\ \sigma^2(\theta) & = \var_\theta\{T(X)\} = \frac{d^2}{d \theta^2} \log c(\theta) \end{align*} Because of our assumption that $T(X)$ is not concentrated at one point, $\sigma^2(\theta) = d \mu(\theta) / d \theta$ can never be zero. Hence $\mu$ is a strictly increasing continuous function that maps the interior of $\Theta$ to some open interval of the real line. Write $L = \{ s_0, s_1 \}$ with $s_0 < s_1$. Since $$ \frac{f_\theta(s)}{f_\theta(s_0)} = e^{(s - s_0) \theta} \frac{\lambda(s)}{\lambda(s_0)} $$ the distribution clearly converges to the distribution concentrated at $s_0$ as $\theta \to - \infty$. Now $$ \frac{\mu(\theta) - s_0}{f_\theta(s_1)} = \sum_{s \in S \setminus \{s_0\}} (s - s_0) e^{(s - s_1) \theta} \frac{\lambda(s)}{\lambda(s_1)} $$ goes to $s_1 - s_0$ by monotone convergence as $\theta \to - \infty$. Hence $\mu(\theta) \to s_0$ as $\theta \to - \infty$. Thus $\mu$ is a diffeomorphism from the interior of $\Theta$ to some open interval of the real line, the lower endpoint of which is $s_0$. Now $$ \frac{\pr_\theta\{T(X) > s_1\}}{f_\theta(s_1)} = \sum_{s \in S \setminus L} e^{(s - s_1) \theta} \frac{\lambda(s)}{\lambda(s_1)} $$ goes to zero by monotone convergence as $\theta \to - \infty$. And these facts together imply \begin{equation} \label{eq:expo-gen} \begin{split} \pr_\mu\{T(X) = s_0\} & = 1 - \frac{\mu - s_0}{s_1 - s_0} + o(\mu - s_0) \\ \pr_\mu\{T(X) = s_1\} & = \frac{\mu - s_0}{s_1 - s_0} + o(\mu - s_0) \\ \pr_\mu\{T(X) > s_1\} & = o(\mu - s_0) \end{split} \end{equation} where $\mu = \mu(\theta)$ is the mean value parameter. Now we claim that for small enough values of $\theta$ or $\mu$ we have $C_1 = s_0$ and $C_2 = s_1$ and the UMPU test is given by equations (3.10a) and (3.10b) in the paper with $p_1$ and $p_2$ given by (3.8a) in the paper and $p_{1 2} = m_{1 2} = 0$. Let's check. These equations give now \begin{align*} \gamma_1 & = 1 - \frac{(1 - \alpha) (s_1 - \mu)}{s_1 - \mu + o(\mu - s_0)} \\ \gamma_2 & = 1 - \frac{(1 - \alpha) (\mu - s_0)}{\mu - s_0 + o(\mu - s_0)} \end{align*} and clearly both converge to $\alpha$ as $\mu \to s_0$ hence both are between zero and one for small enough $\theta$ or $\mu$ and hence define the UMPU test. \end{proof} This explains the behavior of the fuzzy confidence intervals for the binomial distribution for the two $x$ values nearest each boundary in Figure~2 of the paper. As $\theta \to 0$, the fuzzy confidence interval $1 - \phi(x, \alpha, \theta)$ converges to $1 - \alpha$ for $x = 0$ or $x = 1$ and converges to zero for all other $x$. And as $\theta \to 1$, the fuzzy confidence interval converges to $1 - \alpha$ for $x = n - 1$ or $x = n$ and converges to zero for all other $x$. \subsection{Models With Nuisance Parameters} UMP and UMPU theory extends to multiparameter exponential families when the parameter of interest $\theta$ is one of the canonical parameters (Lehmann, TSH, 1st ed.,\ pp.\ 134--136). 
Suppose the family has densities of the form $$ \frac{1}{c(\theta, \boldsymbol{\eta})} \exp\left( \theta T(x) + \sum_{i = 1}^k \eta_i U_i(x) \right) $$ with respect to some measure on the sample space. Then the situation is exactly the same as described above except that the reference distribution of the test is the conditional distribution of $T(X)$ given $\mathbf{U}(X)$, which (a standard fact about exponential families) depends only on $\theta$ and not on the nuisance parameter $\boldsymbol{\eta}$. \subsubsection{UMP Tests With Nuisance Parameters} Now there exists a UMP test having null hypothesis $H_0 = \set{ \vartheta : \vartheta \le \theta }$, alternative hypothesis $H_1 = \set{ \vartheta : \vartheta > \theta }$, and significance level $\alpha$, and its critical function $\phi$ is defined by \begin{equation} \label{eq:crit-lower} \begin{split} \phi(x, \alpha, \theta) = \begin{cases} 1, & T(x) > C[\mathbf{U}(x)] \\ \gamma[\mathbf{U}(x)], & T(x) = C[\mathbf{U}(x)] \\ 0, & T(x) < C[\mathbf{U}(x)] \end{cases} \end{split} \end{equation} where the functions $\gamma$ and $C$ are determined by \begin{equation} \label{eq:cond-lower} E_\theta\{\phi(X, \alpha, \theta) \mid \mathbf{U}(X) \} = \alpha. \end{equation} Everything is exactly the same as for the one-parameter case except for the conditioning on $\mathbf{U}(x)$. The only point of the discussion is that the test is UMP whether considered conditionally or unconditionally. As before, the UMP upper-tailed test is obtained by reversing all the inequalities above. \subsubsection{UMPU Tests With Nuisance Parameters} Now there exists a UMPU test having null hypothesis $H_0 = \set{ \vartheta : \vartheta = \theta }$, alternative hypothesis $H_1 = \set{ \vartheta : \vartheta \neq \theta }$, and significance level $\alpha$, and its critical function $\phi$ is defined by \begin{equation} \label{eq:crit-two} \begin{split} \phi(x, \alpha, \theta) = \begin{cases} 1, & T(x) < C_1[\mathbf{U}(x)] \\ \gamma_1[\mathbf{U}(x)], & T(x) = C_1[\mathbf{U}(x)] \\ 0, & C_1[\mathbf{U}(x)] < T(x) < C_2[\mathbf{U}(x)] \\ \gamma_2[\mathbf{U}(x)], & T(x) = C_2[\mathbf{U}(x)] \\ 1, & C_2[\mathbf{U}(x)] < T(x) \end{cases} \end{split} \end{equation} where the functions $\gamma_1$, $\gamma_2$, $C_1$, and $C_2$ are determined by \begin{subequations} \begin{align} E_\theta\{\phi(X, \alpha, \theta) \mid \mathbf{U}(X) \} & = \alpha \label{eq:cond-two-a} \\ E_\theta\{T(X) \phi(X, \alpha, \theta) \mid \mathbf{U}(X) \} & = \alpha E_\theta\{T(X) \mid \mathbf{U}(X) \} \label{eq:cond-two-b} \end{align} \end{subequations} Again, the point is that the test is UMPU whether considered conditionally or unconditionally. % As we saw in Section~\ref{sec:special}, it is possible that $C_1 = C_2$ % in which case $\gamma_1$ and $\gamma_2$ are not separately determinable, % only their sum. Analysis of that situation here would just repeat that % section word for word except for references to conditioning on $\mathbf{U}(x)$. \section{Models} \subsection{Binomial} Let $X \sim \BinomialDis(n, p)$ with $0 < p < 1$. All of the quantities in \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are easily calculated (in R) except possibly $m_{1 2}$. 
Actually, as Lehmann points out (TSH, 1st ed.,\ pp.~128--129), this quantity is also easy to calculate
$$
\sum_{x = C_1 + 1}^{C_2 - 1} x \binom{n}{x} p^x (1 - p)^{n - x} = n p \sum_{x = C_1 + 1}^{C_2 - 1} \binom{n - 1}{x - 1} p^{x - 1} (1 - p)^{(n - 1) - (x - 1)}
$$
and the sum on the right is just a binomial probability for the $\BinomialDis(n - 1, p)$ distribution, that is, the right side can be calculated in R (with the obvious definitions of the variables) by
\begin{verbatim}
n * p * (pbinom(c2 - 2, n - 1, p) - pbinom(c1 - 1, n - 1, p))
\end{verbatim}
assuming \texttt{pbinom(c1 - 1, n - 1, p)} does not crash and produces zero when \texttt{c1} is zero (which it does in recent versions I can check).
\subsection{Poisson}
Let $X \sim \PoissonDis(\mu)$ with $0 < \mu$. Again, all of the quantities in \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are easily calculated except possibly $m_{1 2}$. Does it work like the binomial case? Yes!
$$
\sum_{x = C_1 + 1}^{C_2 - 1} x \frac{\mu^x}{x !} e^{- \mu} = \mu \sum_{x = C_1 + 1}^{C_2 - 1} \frac{\mu^{x - 1}}{(x - 1)!} e^{- \mu}
$$
and the sum on the right is just another Poisson probability, that is, the right side can be calculated in R (with the obvious definitions of the variables) by
\begin{verbatim}
mu * (ppois(c2 - 2, mu) - ppois(c1 - 1, mu))
\end{verbatim}
\subsection{Negative Binomial} \label{sec:negbin}
Let $X \sim \NegativeBinomialDis(r, p)$ with $0 < p < 1$. Like R we consider the sample space to start at zero rather than $r$. This also allows for non-integer $r$. The densities of the family have the form
$$
f(x) = \frac{\Gamma(x+r)}{\Gamma(r) x!} p^r (1-p)^x
$$
Note that if we are to have an exponential family $r$ cannot be an unknown parameter! The only unknown parameter is $p$. Again, all of the quantities in \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are easily calculated except possibly $m_{1 2}$. Does it work like the binomial case? Yes!
$$
\sum_{x = C_1 + 1}^{C_2 - 1} x \frac{\Gamma(x+r)}{\Gamma(r) x!} p^r (1-p)^x = \frac{r (1 - p)}{p} \sum_{x = C_1 + 1}^{C_2 - 1} \frac{\Gamma(x - 1 + r + 1)}{\Gamma(r + 1) (x - 1)!} p^{r + 1} (1-p)^{x - 1}
$$
and the sum on the right is just another negative binomial probability, for the $\NegativeBinomialDis(r + 1, p)$ distribution, that is, the right side can be calculated in R (with the obvious definitions of the variables) by
\begin{verbatim}
r * (1 - p) / p * (pnbinom(c2 - 2, r + 1, p) - pnbinom(c1 - 1, r + 1, p))
\end{verbatim}
\subsection{Two Independent Poisson Random Variables}
Let $X_i \sim \PoissonDis(\mu_i)$ with $0 < \mu_i$, for $i = 1$, $2$ be independent random variables. We wish to compare the means $\mu_1$ and $\mu_2$. We cannot just test or produce fuzzy confidence intervals for a function pulled out of the air, such as $\mu_1 - \mu_2$. The parameter we test must be canonical. The canonical statistics of this exponential family are $X_1$ and $X_2$ and the corresponding canonical parameters are $\psi_i = \log(\mu_i)$. Linear functions of canonical parameters are again canonical so we can test or produce fuzzy confidence intervals for $\psi_1 - \psi_2 = \log(\mu_1 / \mu_2)$.
Introduce new parameters
\begin{align*}
\psi_1 & = \eta + \theta \\
\psi_2 & = \eta
\end{align*}
Then
$$
X_1 \psi_1 + X_2 \psi_2 = X_1 \theta + (X_1 + X_2) \eta = T(\mathbf{X}) \theta + U(\mathbf{X}) \eta
$$
where
\begin{equation} \label{eq:t-and-u}
\begin{split}
T(\mathbf{X}) & = X_1 \\
U(\mathbf{X}) & = X_1 + X_2
\end{split}
\end{equation}
It is a standard result that the conditional distribution of $T(\mathbf{X})$ given $U(\mathbf{X})$ is
$$
X_1 \mid X_1 + X_2 \sim \BinomialDis\left(X_1 + X_2, \frac{\mu_1}{\mu_1 + \mu_2}\right)
$$
So the theory says we do the UMP or UMPU test based on this distribution with $\mu_1 / (\mu_1 + \mu_2)$ as the parameter of interest (Lehmann, TSH, 1st ed., pp.~140--142, gives further details).
\subsection{Two Independent Binomial Proportions} \label{sec:two-binom-indep}
Let $X_i \sim \BinomialDis(n_i, p_i)$ with $0 < p_i < 1$, for $i = 1$, $2$ be independent random variables. We wish to compare the proportions $p_1$ and $p_2$. We cannot just test or produce fuzzy confidence intervals for a function pulled out of the air, such as $p_1 - p_2$. The parameter we test must be canonical. The canonical statistics of this exponential family are $X_1$ and $X_2$ and the corresponding canonical parameters are $\psi_i = \logit(p_i)$. Linear functions of canonical parameters are again canonical so we can test or produce fuzzy confidence intervals for $\psi_1 - \psi_2$. As in the Poisson case we see that we can base the test on the conditional distribution of $T(\mathbf{X})$ given $U(\mathbf{X})$, where these variables are defined by \eqref{eq:t-and-u}. This distribution is known although it is not ``nice'' and is not a ``brand name'' distribution. It is (Lehmann, TSH, 1st ed., pp.~142--143) the exponential family generated by the hypergeometric distribution.
\begin{equation} \label{eq:hypergeo}
\pr\{ X_1 = t \mid X_1 + X_2 = u \} = \frac{1}{c(\rho)} \binom{n_1}{t} \binom{n_2}{u - t} \rho^t, \qquad t = 0, \ldots, u,
\end{equation}
where
$$
\rho = \frac{p_1 (1 - p_2)}{(1 - p_1) p_2}
$$
and
$$
c(\rho) = \sum_{t = 0}^u \binom{n_1}{t} \binom{n_2}{u - t} \rho^t.
$$
So the theory says we do the UMP or UMPU test based on this distribution with $\rho$ as the parameter of interest. This distribution is very hairy. For starters, the actual range of the random variable $X_1 \mid X_1 + X_2$ is not 0 to $X_1 + X_2$ as \eqref{eq:hypergeo} seems to indicate. The binomial coefficients can evaluate to zero. We will leave the complexity of this model and move on to other models. In principle it is just a one-parameter exponential family and hence ``nice'' in certain respects. In practice, there is no readily available software to calculate distribution and density functions, so this model requires more effort in computer implementation.
\subsection{Two Independent Negative Binomial Variables}
Let $X_i \sim \NegativeBinomialDis(r_i, p_i)$ with $0 < r_i$ and $0 < p_i < 1$, for $i = 1$, $2$ be independent random variables. As in Section~\ref{sec:negbin} we are using the convention that the sample space starts at zero. We wish to compare the proportions $p_1$ and $p_2$. We cannot just test or produce fuzzy confidence intervals for a function pulled out of the air, such as $p_1 - p_2$. The parameter we test must be canonical. The canonical statistics of this exponential family are $X_1$ and $X_2$ and the corresponding canonical parameters are $\psi_i = \log(1 - p_i)$.
Linear functions of canonical parameters are again canonical so we can test or produce fuzzy confidence intervals for $\psi_1 - \psi_2$. As in the Poisson case we see that we can base the test on the conditional distribution of $T(\mathbf{X})$ given $U(\mathbf{X})$, where these variables are defined by \eqref{eq:t-and-u}. This distribution is not fully explained in Lehmann, although the $r_1 = r_2 = 1$ case is the subject of a homework problem in the second edition. Let's see what happens. The joint distribution of the $X$'s is
\begin{align}
f(x_1, x_2) & = \prod_{i = 1}^2 \frac{\Gamma(x_i + r_i)}{\Gamma(r_i) x_i!} p_i^{r_i} (1 - p_i)^{x_i} \nonumber \\
& = \exp(x_1 \psi_1 + x_2 \psi_2) \prod_{i = 1}^2 \frac{\Gamma(x_i + r_i)}{\Gamma(r_i) x_i!} p_i^{r_i} \nonumber \\
& = \exp(t \theta + u \eta) \frac{\Gamma(t + r_1)}{\Gamma(r_1) t!} \frac{\Gamma(u - t + r_2)}{\Gamma(r_2) (u - t)!} p_1^{r_1} p_2^{r_2} \label{eq:unnormal}
\end{align}
where
\begin{align*}
t & = x_1 \\
u & = x_1 + x_2 \\
\theta & = \psi_1 - \psi_2 \\
\eta & = \psi_2
\end{align*}
It matters not that we have not specified $p_1$ and $p_2$ as a function of $\theta$ and $\eta$. We want to consider the conditional distribution anyway. Thought of as a function of $t$ for fixed $u$, and dropping all factors that do not involve $t$ (they are absorbed into the normalizing constant), we get
\begin{equation} \label{eq:neg-hyper-expo}
f_\theta(t \mid u) = \frac{1}{c(\theta)} \exp(t \theta) \frac{\Gamma(t + r_1) \Gamma(u - t + r_2)}{t! (u - t)!}, \qquad t = 0, \ldots, u,
\end{equation}
where
$$
c(\theta) = \sum_{t = 0}^u \exp(t \theta) \frac{\Gamma(t + r_1) \Gamma(u - t + r_2)}{t! (u - t)!}.
$$
Hmmmm. Also not a brand name family and not the exponential family generated by the hypergeometric distribution we saw in the case of two independent binomial proportions. Still just another one-parameter exponential family. No big deal. Actually, Mathematica says this has some sort of relation to hypergeometric functions, for example
\begin{verbatim}
In[4]:= Sum[ rho^t Gamma[t + r] Gamma[u - t + s] /
             (Gamma[t + 1] Gamma[u - t + 1]), {t, 0, u} ]

        Gamma[r] Gamma[s + u] Hypergeometric2F1[r, -u, 1 - s - u, rho]
Out[4]= --------------------------------------------------------------
                                 Gamma[1 + u]
\end{verbatim}
So Mathematica knows how to calculate the normalizing function and it involves a ``hypergeometric $\vphantom{F}_2 F_1$'' function. Maybe we should call this the exponential family generated by the negative hypergeometric distribution. According to
\begin{verbatim}
http://planetmath.org/encyclopedia/NegativeHypergeometricDistribution.html
\end{verbatim}
there is a \emph{negative hypergeometric distribution} having density
$$
f(x) = \frac{ \binom{x+b-1}{x} \binom{W+B-b-x}{W-x} }{ \binom{W+B}{W} }, \qquad x = 0, 1, \ldots, W,
$$
where $W$, $B$, and $b$ are positive integers. The web page explains that this is the distribution of the number of ``special items'' $X$ (out of the $W$ special items in the population) that are drawn before the $b$-th non-special item when drawing without replacement from a population containing $W + B$ items altogether. Comparing with \eqref{eq:neg-hyper-expo} we see that the base measure of the family does indeed have this form when $r_1$ and $r_2$ are integers.
\subsection{Testing Independence in a Two-by-two Table} \label{sec:exact}
This is the UMP/UMPU competitor for Fisher's exact test. The data consist of a matrix $X_{i j}$, $i = 1$, 2, $j = 1$, 2, that has a multinomial distribution with sample size $n$ and cell probability matrix $p_{i j}$, $i = 1$, 2, $j = 1$, 2. This is also called a two-by-two contingency table.
The canonical statistics are the $X_{i j}$, but the canonical parameters are not uniquely defined in terms of the $p_{i j}$ because the model is really only three dimensional, not four, because the $X_{i j}$ sum to $n$. As is well known, this is a three-dimensional exponential family, the canonical statistics being any three of the four $X_{i j}$, the fourth being determined from the other three by the requirement that the $X_{i j}$ sum to $n$. In this problem (Lehmann, TSH, 1st ed., pp.~143--146) the marginals are the statistics for the nuisance parameters, and we can consider any other statistic linearly independent of the marginals and the sum of all cells as the statistic of interest. Lehmann chooses
\begin{equation} \label{eq:exact}
\begin{split}
T(\mathbf{X}) & = X_{1 1} \\
U_1(\mathbf{X}) & = X_{1 1} + X_{1 2} \\
U_2(\mathbf{X}) & = X_{1 1} + X_{2 1}
\end{split}
\end{equation}
The conditional distribution of $T(\mathbf{X})$ given the marginals $U_1(\mathbf{X})$ and $U_2(\mathbf{X})$ is well known. It is the hypergeometric distribution involved in Fisher's exact test under the null hypothesis of independence, and under general null hypotheses it is the exponential family generated by the hypergeometric that we encountered in Section~\ref{sec:two-binom-indep}.
\begin{multline} \label{eq:hyper-fish}
\pr\{X_{1 1} = t \mid X_{1 1} + X_{1 2} = u_1, X_{1 1} + X_{2 1} = u_2\} \\
= \frac{1}{c(\rho)} \binom{u_1}{t} \binom{n - u_1}{u_2 - t} \rho^t, \qquad t = 0, \ldots, n
\end{multline}
where
$$
\rho = \frac{p_{1 1} p_{2 2}}{p_{1 2} p_{2 1}}
$$
and
$$
c(\rho) = \sum_{t = 0}^n \binom{u_1}{t} \binom{n - u_1}{u_2 - t} \rho^t.
$$
As in Section~\ref{sec:two-binom-indep} we note that this distribution given by \eqref{eq:hyper-fish} often does not have the full range 0 to $n$ because the binomial coefficients may be zero.
\subsection{Different Answers to the Same Question in a Poll}
This section and the next give the UMP/UMPU/fuzzy competitors to the analysis of correlated binomial proportions in Wild and Seber (pp.~343--350). The first considers different answers to the same question on a poll. This is a multinomial problem. Say the categories of interest have counts $X_1$ and $X_2$; then we know
$$
X_1 \mid X_1 + X_2 \sim \BinomialDis\left(X_1 + X_2, \frac{p_1}{p_1 + p_2}\right)
$$
and so the UMP/UMPU/fuzzy procedures are based on this distribution.
\subsection{Answers to Different Questions in a Poll}
Here again, like in Section~\ref{sec:exact}, we have a two-by-two table with data $X_{i j}$ and cell probabilities $p_{i j}$ but the question of interest is different. Now we are interested in just the opposite question, whether the marginals differ. This in a sense (a rather vague sense) interchanges the role of interest and nuisance parameters: what were interest parameters in Section~\ref{sec:exact} are now nuisance parameters and vice versa. Ordinarily, this would be nonsense. There is exactly one interest parameter. The rest (in this case two) must be nuisance parameters. So, strictly speaking, they cannot be interchanged. But a two-by-two table has a redundant canonical statistic: there are four $X_{i j}$ but they sum to $n$ so only three are linearly independent. So if we add the redundant statistic to the statistics corresponding to parameters of interest we get two sets of two that can be interchanged. It is clear that \eqref{eq:exact} could have been written with subscripts 1 and 2 interchanged, which would make $X_{2 2}$ the statistic of interest.
This tells us that here we should condition on $X_{1 1}$ and $X_{2 2}$, leaving either $X_{1 2}$ or $X_{2 1}$ as the statistic of interest. Thus in this case the UMP/UMPU/fuzzy procedure is based on the distribution
$$
X_{1 2} \mid X_{1 1}, X_{2 2} \sim \BinomialDis\left(n - X_{1 1} - X_{2 2}, \frac{p_{1 2}}{p_{1 2} + p_{2 1}} \right)
$$
And in hindsight we see that we have invented the conditional, exact, uniformly most whatever (UMW) competitor of McNemar's test (Lindgren, \emph{Statistical Theory}, pp.~381--383).
\section{Algorithms}
\subsection{Version 0.1}
After a great struggle, a very simple algorithm was decided on for calculating $\phi(x, \alpha, \theta)$. Given $\alpha$ and $\theta$, calculate the appropriate $c_1$, $c_2$, $\gamma_1$, and $\gamma_2$ as follows. First handle the special cases where $\alpha$ is zero or one and $\theta$ is on the boundary of the parameter space, using Theorem~\ref{th:umpu-end} above and the obvious fact that $\phi$ is identically equal to one when $\alpha = 1$ and identically equal to zero when $\alpha = 0$. In the general case, with $0 < \alpha < 1$ and $\theta$ not on the boundary, choose some $c_1$ and $c_2$ such that $c_1 \le E_\theta\{T(X)\} \le c_2$. We pick the $c_1$ and $c_2$ that give, with randomization, an equal tailed test, in the hope that this is close to correct. Then we go into an infinite loop that does the following.
\begin{itemize}
\item Calculate $\gamma_1$ and $\gamma_2$ using the current guesses for $c_1$ and $c_2$ and equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} above. If the results satisfy $0 \le \gamma_1 \le 1$ and $0 \le \gamma_2 \le 1$, then we are done and stop the loop.
\item Otherwise, we change $c_1$ or $c_2$, the one corresponding to the $\gamma_i$ that violates the constraints worst. If this $\gamma_i$ is negative, we move the $c_i$ out by one (i.~e.\ decrease $c_1$ or increase $c_2$) and if this $\gamma_i$ is greater than one, we move the $c_i$ in by one.
\end{itemize}
Actually we don't do an infinite loop, because we have no theorem that says this algorithm converges, so we have a maximum iteration count (default 10) and just give up when it is reached. In the examples we have done, there has been no need to increase the iteration count. See \path{ump/src/umpubinom.c} for an example of this algorithm. See \path{ump/tests/umpub.R} for the tests it passed.
\subsection{Version 0.3}
An attempt to implement the density of abstract randomized $P$-values and test the implementation shows that equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} are no good. They exhibit catastrophic cancellation for small $\alpha$.
So we back up to \eqref{eq:pi}, \eqref{eq:p12}, and \eqref{eq:m12} and keep \eqref{eq:pi} but ignore the other two, replacing them with
\begin{subequations}
\begin{align}
P_1 & = \pr_\theta\{ T(X) < C_1 \} \label{eq:P1} \\
P_2 & = \pr_\theta\{ T(X) > C_2 \} \label{eq:P2} \\
M_1 & = E_\theta\{ T(X) I_{(- \infty, C_1)}[T(X)] \} \label{eq:M1} \\
M_2 & = E_\theta\{ T(X) I_{(C_2, \infty)}[T(X)] \} \label{eq:M2}
\end{align}
\end{subequations}
Then we replace \eqref{eq:umpu-a-general} and \eqref{eq:umpu-b-general} by
\begin{subequations}
\begin{align}
P_1 + \gamma_1 p_1 + \gamma_2 p_2 + P_2 & = \alpha \label{eq:umpu-a-general-too} \\
M_1 + \gamma_1 C_1 p_1 + \gamma_2 C_2 p_2 + M_2 & = \alpha \mu \label{eq:umpu-b-general-too}
\end{align}
\end{subequations}
which solved for $\gamma_1$ and $\gamma_2$ give
\begin{subequations}
\begin{align}
\gamma_1 & = \frac{\alpha (C_2 - \mu) + (M_1 - C_2 P_1) + (M_2 - C_2 P_2)} {p_1 (C_2 - C_1)} \label{eq:gamma-1-too} \\
\gamma_2 & = \frac{\alpha (\mu - C_1) - (M_1 - C_1 P_1) - (M_2 - C_1 P_2)} {p_2 (C_2 - C_1)} \label{eq:gamma-2-too}
\end{align}
\end{subequations}
%
% Solve[ { P1 + gamma1 p1 + gamma2 p2 + P2 == alpha,
%   M1 + gamma1 C1 p1 + gamma2 C2 p2 + M2 == mu alpha }, { gamma1, gamma2 } ]
%
% gamma1 = (alpha (C2 - mu) + (M1 - C2 P1) + (M2 - C2 P2)) / (p1 (C2 - C1))
% gamma2 = (alpha (mu - C1) - (M1 - C1 P1) - (M2 - C1 P2)) / (p2 (C2 - C1))
%
% P1 + gamma1 p1 + gamma2 p2 + P2
% M1 + gamma1 C1 p1 + gamma2 C2 p2 + M2
%
% Solve[ 1 == (alpha (C2 - mu) + (M1 - C2 P1) + (M2 - C2 P2)) / (p1 (C2 - C1)),
%   alpha ]
% Solve[ 1 == (alpha (mu - C1) - (M1 - C1 P1) - (M2 - C1 P2)) / (p2 (C2 - C1)),
%   alpha ]
% Solve[ 0 == (alpha (C2 - mu) + (M1 - C2 P1) + (M2 - C2 P2)) / (p1 (C2 - C1)),
%   alpha ]
% Solve[ 0 == (alpha (mu - C1) - (M1 - C1 P1) - (M2 - C1 P2)) / (p2 (C2 - C1)),
%   alpha ]
%
This seems to work better, but honesty compels us to admit that this formula also is subject to catastrophic cancellation. Perusal of the source code reveals several ad hoc bits of code that deal with special cases in which the code without the adhockery fails due to catastrophic cancellation or other problems with the inexactitude of floating point arithmetic. It is fair to say that our code is far from an elegant and provably correct solution to this problem. We think the algorithm presented on page~\pageref{pg:algo} would actually be better than the one we used in all respects except that it takes time proportional to the sample size, which was deemed unacceptable (perhaps wrongly).
\bibliographystyle{apalike}
\begin{thebibliography}{}
\bibitem[Lehmann, 1959]{lehmann}
\textsc{Lehmann, E.~L.} (1959).
\newblock \emph{Testing Statistical Hypotheses}.
\newblock Wiley, New York.
\newblock (2nd ed., Wiley, 1986; Springer, 1997).
\end{thebibliography}
\section{Computing}
\subsection{Nice Models}
In this section we take another shot at ``nice models'' by which we mean those for which R has functions for calculating most things about the distribution. Of the models mentioned above, the nice models are the binomial, Poisson, and negative binomial.
The exponential families generated by the hypergeometric and negative hypergeometric distributions are not ``nice.'' Let $F_\theta$ and $F_\theta^{-1}$ denote the cumulative distribution and quantile functions as implemented in R (for example, \texttt{pbinom} and \texttt{qbinom}), so that
$$
F_\theta^{-1}(p) = \inf \set{ x : F_\theta(x) \ge p }
$$
Define also an analogous ``$M$'' function for the mean
$$
M_\theta(x) = \sum_{y = 0}^x y f_\theta(y)
$$
and its inverse
$$
M_\theta^{-1}(s) = \inf \set{ x : M_\theta(x) \ge s }
$$
As discussed in the sections on the models, these ``$M$'' functions are implemented in R as follows
\begin{verbatim}
mbinom <- function(x, n, p) n * p * pbinom(x - 1, n - 1, p)
mpois <- function(x, mu) mu * ppois(x - 1, mu)
mnbinom <- function(x, r, p) r * (1 - p) / p * pnbinom(x - 1, r + 1, p)
\end{verbatim}
The corresponding inverse functions are
\begin{verbatim}
nbinom <- function(s, n, p) 1 + qbinom(s / (n * p), n - 1, p)
npois <- function(s, mu) 1 + qpois(s / mu, mu)
nnbinom <- function(s, r, p) 1 + qnbinom(s * p / ((1 - p) * r), r + 1, p)
\end{verbatim}
Then we can rewrite the key equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} as
\begin{align*}
1 - \gamma_1 & = \frac{(1 - \alpha) (C_2 - \mu) + m_{1 2} - C_2 p_{1 2}} {f_\theta(C_1) (C_2 - C_1)} \\
1 - \gamma_2 & = \frac{(1 - \alpha) (\mu - C_1) - m_{1 2} + C_1 p_{1 2}} {f_\theta(C_2) (C_2 - C_1)}
\end{align*}
where
\begin{align*}
p_{1 2} & = F_\theta(C_2 - 1) - F_\theta(C_1) \\
m_{1 2} & = M_\theta(C_2 - 1) - M_\theta(C_1)
\end{align*}
In order to have the gammas in range we need to satisfy the inequalities
\begin{align*}
0 & \le \frac{(1 - \alpha) (C_2 - \mu) + m_{1 2} - C_2 p_{1 2}} {f_\theta(C_1) (C_2 - C_1)} \le 1 \\
0 & \le \frac{(1 - \alpha) (\mu - C_1) - m_{1 2} + C_1 p_{1 2}} {f_\theta(C_2) (C_2 - C_1)} \le 1
\end{align*}
or
\begin{gather*}
0 \le (1 - \alpha) (C_2 - \mu) + m_{1 2} - C_2 p_{1 2} \le f_\theta(C_1) (C_2 - C_1) \\
0 \le (1 - \alpha) (\mu - C_1) - m_{1 2} + C_1 p_{1 2} \le f_\theta(C_2) (C_2 - C_1)
\end{gather*}
Adding these inequalities to eliminate $m_{1 2}$ gives
$$
0 \le (1 - \alpha - p_{1 2}) (C_2 - C_1) \le \bigl[ f_\theta(C_1) + f_\theta(C_2) \bigr] (C_2 - C_1)
$$
or
$$
0 \le 1 - \alpha - F_\theta(C_2 - 1) + F_\theta(C_1) \le f_\theta(C_1) + f_\theta(C_2)
$$
or
\pagebreak[3]
\begin{subequations}
\begin{equation} \label{eq:fred-a}
F_\theta(C_2 - 1) - F_\theta(C_1) \le 1 - \alpha \le F_\theta(C_2) - F_\theta(C_1 - 1)
\end{equation}
Similarly, eliminating $p_{1 2}$ gives
\begin{equation} \label{eq:fred-b}
M_\theta(C_2 - 1) - M_\theta(C_1) \le (1 - \alpha) \mu \le M_\theta(C_2) - M_\theta(C_1 - 1)
\end{equation}
\end{subequations}
Note that in hindsight \eqref{eq:fred-a} and \eqref{eq:fred-b} are obvious. In order to satisfy \eqref{eq:fred-a} and \eqref{eq:fred-b} we must have
$$
F_\theta^{-1}\bigl\{ F_\theta(C_1 - 1) + 1 - \alpha \bigr\} \le C_2 \le 1 + F_\theta^{-1}\bigl\{ F_\theta(C_1) + 1 - \alpha + \epsilon \bigr\}, \qquad \forall \epsilon > 0
$$
and
$$
M_\theta^{-1}\bigl\{ M_\theta(C_1 - 1) + (1 - \alpha) \mu \bigr\} \le C_2 \le 1 + M_\theta^{-1}\bigl\{ M_\theta(C_1) + (1 - \alpha) \mu + \epsilon \bigr\}, \qquad \forall \epsilon > 0
$$
the $\forall \epsilon > 0$ being an unavoidable hassle caused by the interaction of discreteness and the right continuity convention. It's too bad the R ``q'' functions don't come with a flag for either right or left continuity.
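As a quick sanity check (ours, not something in the package), the ``$M$'' helpers displayed above can be compared with the defining sum $M_\theta(x) = \sum_{y = 0}^x y f_\theta(y)$ at a few arbitrary argument values; the helper \texttt{msum} below is made up for this purpose.
\begin{verbatim}
## rough sanity check, not package code: compare the "m" helpers above
## with the defining sum  M(x) = sum_{y = 0}^x y f(y)
msum <- function(x, dens, ...) sum((0:x) * dens(0:x, ...))
all.equal(mbinom(7, 20, 0.3), msum(7, dbinom, size = 20, prob = 0.3))
all.equal(mpois(4, 2.5), msum(4, dpois, lambda = 2.5))
all.equal(mnbinom(6, 3.5, 0.4), msum(6, dnbinom, size = 3.5, prob = 0.4))
\end{verbatim}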
Basically, the last two displays are just \eqref{eq:fred-a} and \eqref{eq:fred-b} ``inverted,'' which is what we need to use in the computer, but we will stick with the simpler \eqref{eq:fred-a} and \eqref{eq:fred-b} for analysis. Now suppose we are interested in one fixed $x$ and $\theta$. Suppose, for concreteness, that $x < \mu$, so that if $\phi(x, \alpha, \theta)$ is strictly between zero and one we must have $x = C_1$. Then our inequalities can be used to determine the range of $\alpha$ and $C_2$ for which we get solutions. For starters, we must have $\mu < C_2 \le x_{\text{max}}$, where $x_{\text{max}}$ is the upper bound of the support (perhaps $\infty$). And this implies from \eqref{eq:fred-a} that
$$
F_\theta(C_1 - 1) \le \alpha \le 1 - F_\theta(\mu) + F_\theta(C_1)
$$
and from \eqref{eq:fred-b} that
$$
\frac{M_\theta(C_1 - 1)}{\mu} \le \alpha \le 1 - \frac{M_\theta(\mu) - M_\theta(C_1)}{\mu}
$$
\section{New Computing}
In this section (November, 2002) we look at the analog of following the curve $\theta \mapsto \phi(x, \alpha, \theta)$ to get the membership function of a fuzzy confidence interval, the same way we followed $\alpha \mapsto \phi(x, \alpha, \theta)$ to get the cumulative distribution function of a fuzzy $P$-value. The latter is piecewise linear. The former isn't. But how hard is it? We know how to get started from the ``endpoint'' analysis (Section~\ref{sec:endpoint}). The membership function of the fuzzy confidence interval goes to $1 - \alpha$ or to 0 as $\theta$ goes to its upper or lower limit, depending on the value of $x$. (This means we are still in ``nice'' models, one-parameter exponential families with canonical statistic taking values on a range of consecutive integers). So we can start at the edge of the parameter space and go from there. What do equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} do \emph{considered as functions of $\theta$ for fixed $\alpha$}? Let's start with the case of $\theta$ near the boundary. Suppose the sample space of the canonical statistic is $\{ 0, 1, \ldots \}$ (so we are in the situation analyzed in Section~\ref{sec:endpoint}, but we want to know more than just the $o(\mu)$ behavior used there). For small enough $\theta$ we do have $C_1 = 0$ and $C_2 = 1$ so $p_{1 2} = m_{1 2} = 0$ and equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} become
\begin{subequations}
\begin{align}
1 - \gamma_1 & = \frac{(1 - \alpha) (1 - E_\theta\{T(X)\})}{\pr_\theta\{T(X) = 0\}} \label{eq:gamma-1-mbogo} \\
1 - \gamma_2 & = \frac{(1 - \alpha) E_\theta\{T(X)\}}{\pr_\theta\{T(X) = 1\}} \label{eq:gamma-2-mbogo}
\end{align}
\end{subequations}
For the $\BinomialDis(n, p)$ distribution these become
\begin{subequations}
\begin{align}
1 - \gamma_1 & = \frac{(1 - \alpha) (1 - n p)}{(1 - p)^n} \label{eq:gamma-1-mbogo-bino} \\
1 - \gamma_2 & = \frac{(1 - \alpha) n p}{n p (1 - p)^{n - 1}} \nonumber \\
& = \frac{(1 - \alpha)}{(1 - p)^{n - 1}} \label{eq:gamma-2-mbogo-bino}
\end{align}
\end{subequations}
So we see these equations, though nonlinear, are not horribly intractable. They are good for the $\theta$ for which their values stay between zero and one. For \eqref{eq:gamma-2-mbogo-bino} this is
$$
0 \le p \le 1 - (1 - \alpha)^{1 / (n - 1)}.
$$
At the upper end of this range, where $\gamma_2 = 0$, we can make the change $C_2 = 2$ and $\gamma_2 = 1$ without affecting the critical function.
With either of the $C$'s changed we get the more general equations \eqref{eq:gamma-1} and \eqref{eq:gamma-2} now in the form
\begin{subequations}
\begin{align}
1 - \gamma_1 & = \frac{(1 - \alpha) [C_2 - \mu(\theta)] - E_\theta\{I_{1 2}(X) [C_2 - X]\}} {f_\theta(C_1) (C_2 - C_1)} \label{eq:gamma-1-muggs} \\
1 - \gamma_2 & = \frac{(1 - \alpha) [\mu(\theta) - C_1] - E_\theta\{I_{1 2}(X) [X - C_1]\}} {f_\theta(C_2) (C_2 - C_1)} \label{eq:gamma-2-muggs}
\end{align}
\end{subequations}
where
$$
I_{1 2}(x) = \begin{cases} 1, & C_1 < x < C_2 \\ 0, & \text{otherwise} \end{cases}
$$
It follows from the general properties of exponential families that if $\theta$ is the canonical parameter
$$
\frac{\partial}{\partial \theta} f_\theta(x) = [x - \mu(\theta)] f_\theta(x)
$$
$$
\frac{\partial}{\partial \theta} \mu(\theta) = \sigma^2(\theta)
$$
Hence
\begin{align*}
\frac{\partial}{\partial \theta} (1 - \gamma_1) & = \frac{1}{f_\theta(C_1) (C_2 - C_1)} \times \biggl( - (1 - \alpha) \sigma^2(\theta) \\
& \qquad - E_\theta\{I_{1 2}(X) [C_2 - X] [X - \mu(\theta)]\} \\
& \qquad + (1 - \alpha) [C_2 - \mu(\theta)] [\mu(\theta) - C_1] \\
& \qquad - E_\theta\{I_{1 2}(X) [C_2 - X] [\mu(\theta) - C_1]\} \biggr) \\
& = \frac{1}{f_\theta(C_1) (C_2 - C_1)} \times \biggl( (1 - \alpha) \bigl\{ [C_2 - \mu(\theta)] [\mu(\theta) - C_1] - \sigma^2(\theta) \bigr\} \\
& \qquad - E_\theta\{I_{1 2}(X) [C_2 - X] [X - C_1]\} \biggr)
\end{align*}
Turning now to the UMP one-tailed test: in considering the distribution of the fuzzy $P$-value, we look at $\phi(x, \alpha, \theta)$ as a function of $\alpha$ for fixed $x$ and $\theta$, hence at \eqref{eq:gamma-ump} in the same way. Now $T(x)$ will be a $(1 - \alpha)$-th quantile if
$$
\pr_\theta\{ T(X) > T(x) \} \le \alpha \le \pr_\theta\{ T(X) \ge T(x) \}.
$$
Hence
\begin{align*}
\phi(x, \alpha, \theta) & =
\begin{cases}
0, & \alpha \le \pr_\theta\{ T(X) > T(x) \} \\
1, & \alpha \ge \pr_\theta\{ T(X) \ge T(x) \} \\
\frac{\alpha - \pr_\theta\{T(X) > T(x)\}}{\pr_\theta\{T(X) = T(x)\}}, & \text{otherwise}
\end{cases}
\end{align*}
Since this is linear where not zero or one, it is the distribution function of a uniform random variable
\begin{equation} \label{eq:fpv-ump-dist}
P \sim \text{Uniform}\bigl( \pr_\theta\{ T(X) > T(x) \}, \pr_\theta\{ T(X) \ge T(x) \} \bigr),
\end{equation}
that is, the fuzzy $P$-value is uniformly distributed over the interval of values one would figure it should be spread over. Intuitively obvious, but we had to check to be sure. Please don't feel put upon by our belaboring the obvious, because the UMPU two-tailed case is not quite so obvious, and understanding the UMP one-tailed case helps.
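To make the last display concrete, here is a minimal R sketch (ours, not part of the package; the function name and arguments are made up) of $\alpha \mapsto \phi(x, \alpha, \theta)$ for the UMP one-tailed test in the binomial case. It just evaluates the distribution function of the uniform distribution in \eqref{eq:fpv-ump-dist}.
\begin{verbatim}
## minimal sketch, not package code: the UMP (upper-tailed) critical
## function phi(x, alpha, theta) for X ~ Bin(n, p), as a function of alpha.
## It is the CDF of Uniform(Pr{X > x}, Pr{X >= x}) evaluated at alpha.
phi.ump <- function(x, alpha, n, p) {
    p.gt <- pbinom(x, n, p, lower.tail = FALSE)      # Pr{X > x}
    p.ge <- pbinom(x - 1, n, p, lower.tail = FALSE)  # Pr{X >= x}
    pmin(pmax((alpha - p.gt) / (p.ge - p.gt), 0), 1)
}
## e.g., phi.ump(8, seq(0, 1, 0.01), 10, 0.5) traces out the same curve
## as punif(alpha, min = Pr{X > 8}, max = Pr{X >= 8})
\end{verbatim}

\end{document}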