---
title: "Parameter Estimation of the Ideal Magnitude Distribution"
date: "`r Sys.Date()`"
output:
    rmarkdown::html_vignette:
        toc: true
        fig_width: 6
        fig_height: 4
vignette: >
  %\VignetteIndexEntry{Parameter Estimation of the Ideal Magnitude Distribution}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
library(vismeteor)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

The density of an ideal magnitude distribution is

$$
    {\displaystyle f(m) = \frac{\mathrm{d}p}{\mathrm{d}m} = \frac{3}{2} \, \log(r) \sqrt{\frac{r^{3 \, \psi + 2 \, m}}{(r^\psi + r^m)^5}}}
$$
where $m$ is the meteor magnitude, $r = 10^{0.4} \approx 2.51189 \dots$ is a constant and
$\psi$ is the only parameter of this magnitude distribution.

In visual meteor observation, it is common to estimate meteor magnitudes in integer values.
Hence, this distribution is discrete and has the density

$$
   {\displaystyle P[M = m] \sim g(m) \, \int_{m-0.5}^{m+0.5} f(m) \, \, \mathrm{d}m} \, \mathrm{,}
$$
where $g(m)$ is the perception probability function. This distribution is thus a product of the
perception probabilities and the actual ideal distribution of the meteor magnitudes.

Here we demonstrate a method for an unbiased estimation of $\psi$.

First, we obtain some magnitude observations from the example data set,
which also includes the limiting magnitude.

```{r, echo=TRUE, results='hide'}
observations <- with(PER_2015_magn$observations, {
    idx <- !is.na(lim.magn) & sl.start > 135.81 & sl.end < 135.87
    data.frame(
        magn.id = magn.id[idx],
        lim.magn = lim.magn[idx]
    )
})
head(observations, 5) # Example values
```

```{r, echo=FALSE, results='asis'}
knitr::kable(head(observations, 5))
```

Next, the observed meteor magnitudes are matched with the corresponding observations.
This is necessary as we need the limiting magnitudes of the observations to determine
the parameter.

Using

```{r, echo=TRUE, results='hide'}
magnitudes <- with(new.env(), {
  magnitudes <- merge(
      observations,
      as.data.frame(PER_2015_magn$magnitudes),
      by = 'magn.id'
  )
  magnitudes$magn <- as.integer(as.character(magnitudes$magn))
  magnitudes
})
head(magnitudes[magnitudes$Freq>0,], 5) # Example values
```

we obtain a data frame with the absolute observed frequencies `Freq` for each
observation of a magnitude class:

```{r, echo=FALSE, results='asis'}
knitr::kable(head(magnitudes[magnitudes$Freq>0,], 5))
```

This data frame contains a total of `r sum(magnitudes$Freq)` meteors.
This is a sufficiently large number to estimate the parameter.

## Maximum Likelihood Method

The maximum likelihood method can be used to estimate the parameter in an unbiased manner.
For this, the function `dvmideal()` is needed, which returns the probability density of
the observable meteor magnitudes when the parameter and the limiting magnitudes are known.

The following algorithm estimates the parameter by maximizing the likelihood with
the `optim()` function. The function `ll` returns the negative log-likelihood,
as `optim()` identifies a minimum. The expression
`subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5`
ensures that meteors fainter than the limiting magnitude are not used if they exist.

```{r, echo=TRUE, results='hide'}
# maximum likelihood estimation (MLE) of psi
result <- with(subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5), {
    # log likelihood function
    ll <- function(psi) -sum(Freq * dvmideal(magn, lim.magn, psi, log=TRUE))
    psi.start <- 5.0 # starting value
    psi.lower <- 0.0 # lowest expected value
    psi.upper <- 10.0 # highest expected value
    # find minimum
    optim(psi.start, ll, method='Brent', lower=psi.lower, upper=psi.upper, hessian=TRUE)
})
```

This gives the expected value and the variance of the parameter:

```{r, echo=TRUE}
psi.mean <- result$par # mean of psi
print(psi.mean)
psi.var <- 1/result$hessian[1][1] # variance of r
print(psi.var)
```

## Residual Analysis

So far, we have operated under the assumption that the real distribution of meteor magnitudes
is exponential and that the perception probabilities are accurate.
We now use the Chi-Square goodness-of-fit test to check whether the observed frequencies match
the expected frequencies. Then, using the estimated parameter, we retrieve the relative
frequencies `p` for each observation and add them to the data frame `magnitudes`:

```{r, echo=TRUE, results='asis'}
magnitudes$p <- with(magnitudes, dvmideal(m = magn, lm = lim.magn, psi.mean))
```

We must also consider the probabilities for the magnitude class with the brightest meteors.

```{r, echo=TRUE, results='hide'}
magn.min <- min(magnitudes$magn)
```

The smallest magnitude class `magn.min` is `r magn.min`. In calculating the probabilities,
we assume that the magnitude class `r magn.min` contains meteors that are either brighter
or equally bright as `r magn.min` and thus use the function `pvmideal()` to determine
their probability.

```{r, echo=TRUE, results='asis'}
idx <- magnitudes$magn == magn.min
magnitudes$p[idx] <- with(
    magnitudes[idx,],
    pvmideal(m = magn + 1L, lm = lim.magn, psi.mean, lower.tail = TRUE)
)
```

This ensures that the probability of observing a meteor of any given magnitude is 100%.
This is known as the normalization condition. Accordingly, the Chi-Square goodness-of-fit test
will fail if this condition is not met.

We now create the contingency table `magnitutes.observed` for the observed meteor magnitudes
and its margin table.

```{r, echo=TRUE}
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
magnitutes.observed.mt <- margin.table(magnitutes.observed, margin = 2) 
print(magnitutes.observed.mt)
```

Next, we check which magnitude classes need to be aggegated so that each contains
at least 10 meteors, allowing us to perform a Chi-Square goodness-of-fit test.

The last output shows that meteors of magnitude class `0` or brighter must be combined into
a magnitude class `0-`. Meteors with a brightness less than `4` are grouped here in the
magnitude class `4+`, and a new contingency table magnitudes.observed is created:

```{r, echo=TRUE}
magnitudes$magn[magnitudes$magn <= 0] <- '0-'
magnitudes$magn[magnitudes$magn >= 4] <- '4+'
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
print(margin.table(magnitutes.observed, margin = 2))
```

We now need the corresponding expected relative frequencies

```{r, echo=TRUE}
magnitutes.expected <- xtabs(p ~ magn.id + magn, data = magnitudes)
magnitutes.expected <- magnitutes.expected/nrow(magnitutes.expected)
print(sum(magnitudes$Freq) * margin.table(magnitutes.expected, margin = 2))
```

and then carry out the Chi-Square goodness-of-fit test:

```{r, echo=TRUE, results='asis'}
chisq.test.result <- chisq.test(
    x = margin.table(magnitutes.observed, margin = 2),
    p = margin.table(magnitutes.expected, margin = 2)
)
```

As a result, we obtain the p-value:

```{r, echo=TRUE}
print(chisq.test.result$p.value)
```

If we set the level of significance at 5 percent, then it is clear that the p-value with
`r chisq.test.result$p.value` is greater than 0.05. Thus, under the assumption that the
magnitude distribution follows an ideal meteor magnitude distribution and assuming that
the perception probabilities are correct (i.e., error-free or precisely known),
the assumptions cannot be rejected. However, the converse is not true; the assumptions
may not necessarily be correct. The total count of meteors here is too small for such
a conclusion.

To verify the p-value, we also graphically represent the Pearson residuals:

```{r, fig.show='hold'}
chisq.test.residuals <- with(new.env(), {
    chisq.test.residuals <- residuals(chisq.test.result)
    v <- as.vector(chisq.test.residuals)
    names(v) <- rownames(chisq.test.residuals)
    v
})

plot(
    chisq.test.residuals,
    main="Residuals of the chi-square goodness-of-fit test",
    xlab="m",
    ylab="Residuals",
    ylim=c(-3, 3),
    xaxt = "n"
)
abline(h=0.0, lwd=2)
axis(1, at = seq_along(chisq.test.residuals), labels = names(chisq.test.residuals))
```