Type: | Package |
Title: | Parametric Mixture Models for Uncertainty Estimation of Fatalities in UCDP Conflict Data |
Version: | 0.5.2 |
Description: | Provides functions for estimating uncertainty in the number of fatalities in the Uppsala Conflict Data Program (UCDP) data. The package implements a parametric reported-value inflated Gumbel mixture distribution that accounts for this uncertainty. The model is based on a survey of UCDP coders about how they view the uncertainty of fatality counts for UCDP events. The package provides functions for making random draws of fatalities from the mixture distribution, as well as for estimating percentiles, quantiles, means, and other statistics of the distribution. Full details on the survey and estimation procedure can be found in Vesco et al. (2024). |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | dplyr, mistr, rlang, tibble |
Depends: | R (≥ 2.10) |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2024-07-01 08:45:16 UTC; david |
Author: | David Randahl [cre, aut] |
Maintainer: | David Randahl <david.randahl@pcr.uu.se> |
Repository: | CRAN |
Date/Publication: | 2024-07-02 06:30:02 UTC |
uncertainUCDP: Parametric Mixture Models for Uncertainty Estimation of Fatalities in UCDP Conflict Data
Description
Provides functions for estimating uncertainty in the number of fatalities in the Uppsala Conflict Data Program (UCDP) data. The package implements a parametric reported-value inflated Gumbel mixture distribution that accounts for this uncertainty. The model is based on a survey of UCDP coders about how they view the uncertainty of fatality counts for UCDP events. The package provides functions for making random draws of fatalities from the mixture distribution, as well as for estimating percentiles, quantiles, means, and other statistics of the distribution. Full details on the survey and estimation procedure can be found in Vesco et al. (2024).
Author(s)
Maintainer: David Randahl david.randahl@pcr.uu.se
Mean, median, and quantiles of the parametric uncertainty distributions for UCDP events
Description
Mean, median, and quantiles of the parametric uncertainty distributions for UCDP events. The parametric uncertainty distributions are based on the reported-value inflated Gumbel mixture distribution. The median and quantile functions are shortcuts for the quncertainUCDP function.
Usage
mean_uncertainUCDP(fatalities, tov = c("sb", "ns", "os", "any"))
median_uncertainUCDP(fatalities, tov = c("sb", "ns", "os", "any"))
quantiles_unceartainUCDP(probs, fatalities, tov = c("sb", "ns", "os", "any"))
Arguments
fatalities |
A vector of non-negative integers representing the number of fatalities of the UCDP events. Non-integer values are allowed but should be considered experimental. |
tov |
A character string representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any". The options are: * "sb" for state-based violence * "ns" for non-state violence * "os" for one-sided violence * "any" for parameters estimated across all types of violence. The "any" option is somewhat experimental and should be used with caution; it may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category. |
probs |
A numeric vector of probabilities with values in [0,1]. The quantiles to calculate. |
Value
A numeric vector of the same length as the input vector of fatalities, containing the mean, median, or quantiles of the parametric uncertainty distribution for each UCDP event.
Examples
data(ucdpged)
# Calculate the mean for an arbitrary UCDP event
mean_uncertainUCDP(fatalities = 100, tov = 'sb')
# Calculate the mean for the first event in the UCDP GED sample
mean_uncertainUCDP(ucdpged$best[1], tov = ucdpged$type_of_violence[1])
# Calculate the median for an arbitrary UCDP event
median_uncertainUCDP(fatalities = 100, tov = 'sb')
# Calculate the median for the first event in the UCDP GED sample
median_uncertainUCDP(ucdpged$best[1], tov = ucdpged$type_of_violence[1])
# Calculate the 90th percentile for an arbitrary UCDP event
quantiles_unceartainUCDP(probs = 0.9, fatalities = 100, tov = 'sb')
# Calculate the 90th percentile for the first event in the UCDP GED sample
quantiles_unceartainUCDP(probs = 0.9, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
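To illustrate how a median or quantile can be read off a mixture of a Gumbel distribution and a point mass at the reported value, the base R sketch below inverts such a mixture's distribution function by hand. It is not the package's implementation. The helper names (`pgumbel_sketch`, `qgumbel_sketch`, `qmix_sketch`) and the parameter values are hypothetical, and the sketch assumes the maximum-type Gumbel parameterization, that `w` (with `w < 1`) is the weight on the point mass, and that no truncation or rounding is applied.

```r
# Gumbel (maximum) distribution and quantile functions in base R.
pgumbel_sketch <- function(q, loc, scale) exp(-exp(-(q - loc) / scale))
qgumbel_sketch <- function(p, loc, scale) loc - scale * log(-log(p))

# Quantile function of the assumed mixture: a point mass of weight w at
# `reported` plus a Gumbel component of weight 1 - w.
qmix_sketch <- function(p, reported, loc, scale, w) {
  # Probability mass strictly below the reported value.
  below <- (1 - w) * pgumbel_sketch(reported, loc, scale)
  # Probabilities that fall on the jump of size w at the reported value.
  in_jump <- p >= below & p <= below + w
  # Probability handed to the Gumbel quantile function: rescaled below the
  # jump, shifted by w above it, and clamped to [0, 1].
  p_gumbel <- ifelse(p < below, p, p - w) / (1 - w)
  p_gumbel <- pmin(pmax(p_gumbel, 0), 1)
  ifelse(in_jump, reported, qgumbel_sketch(p_gumbel, loc, scale))
}

# Hypothetical parameters, chosen only for illustration.
qmix_sketch(c(0.05, 0.3, 0.5, 0.9), reported = 100, loc = 110, scale = 25, w = 0.3)
```

Under these assumptions every probability that lands on the jump maps to the reported value itself, which is why a median or quantile of such a mixture can coincide exactly with the reported number of fatalities.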
Parametric uncertainty distributions for UCDP events
Description
Density, distribution, quantile and random number generation functions for the parametric reported-value inflated Gumbel mixture distribution for UCDP events. The functions estimate the parameters of the distribution based on the number of fatalities and the type of violence of the UCDP event.
Usage
runcertainUCDP(n, fatalities, tov = c("sb", "ns", "os", "any"))
puncertainUCDP(q, fatalities, tov = c("sb", "ns", "os", "any"))
duncertainUCDP(x, fatalities, tov = c("sb", "ns", "os", "any"))
quncertainUCDP(p, fatalities, tov = c("sb", "ns", "os", "any"))
Arguments
n |
Number of random values to generate |
fatalities |
A vector of non-negative integers representing the number of fatalities of the UCDP events. Non-integer values are allowed but should be considered experimental. |
tov |
A character string representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any". The options are: * "sb" for state-based violence * "ns" for non-state violence * "os" for one-sided violence * "any" for parameters estimated across all types of violence. The "any" option is somewhat experimental and should be used with caution; it may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category. |
x , q |
Vector of quantiles |
p |
Vector of probabilities |
Details
The reported-value inflated Gumbel mixture distribution is a parametric distribution for modeling the uncertainty in the number of fatalities of UCDP events. It is a mixture of a Gumbel distribution and a point mass at the reported number of fatalities. The location, scale, and weight parameters of the distribution are estimated by a set of regression models from the number of fatalities and the type of violence of the UCDP event.
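As a rough illustration of the mixture described above, the following base R sketch combines a point mass at the reported value with a Gumbel component. It is not the package's implementation. The parameter values are hypothetical, the helper names (`pgumbel_sketch`, `rgumbel_sketch`, `pmix_sketch`, `rmix_sketch`) are invented here, and the sketch assumes the maximum-type Gumbel parameterization, that `w` is the weight on the point mass, and that no truncation or rounding is applied.

```r
# Hypothetical parameters, chosen only for illustration (not package estimates).
reported <- 100   # reported number of fatalities
loc      <- 110   # Gumbel location (assumed value)
scale    <- 25    # Gumbel scale (assumed value)
w        <- 0.3   # assumed weight of the point mass at the reported value

# Gumbel (maximum) distribution function and inverse-transform sampler.
pgumbel_sketch <- function(q, loc, scale) exp(-exp(-(q - loc) / scale))
rgumbel_sketch <- function(n, loc, scale) loc - scale * log(-log(runif(n)))

# Mixture distribution function: a jump of size w at `reported` plus the
# Gumbel component weighted by 1 - w.
pmix_sketch <- function(q, reported, loc, scale, w) {
  w * (q >= reported) + (1 - w) * pgumbel_sketch(q, loc, scale)
}

# Mixture sampler: each draw equals the reported value with probability w,
# otherwise it is a Gumbel draw.
rmix_sketch <- function(n, reported, loc, scale, w) {
  from_point_mass <- runif(n) < w
  ifelse(from_point_mass, reported, rgumbel_sketch(n, loc, scale))
}

set.seed(1)
draws <- rmix_sketch(10000, reported, loc, scale, w)
mean(draws == reported)                     # share of draws at the reported value
mean(draws <= 150)                          # empirical distribution function at 150
pmix_sketch(150, reported, loc, scale, w)   # analytic distribution function at 150
```

The sketch only shows the shape of such a mixture for fixed parameters; the package functions derive the parameters from the number of fatalities and the type of violence.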
Value
* duncertainUCDP gives the density function
* puncertainUCDP gives the distribution function
* quncertainUCDP gives the quantile function
* runcertainUCDP generates random values as a vector of length n
Examples
data(ucdpged)
# Generate 10 random values for an arbitrary UCDP event
runcertainUCDP(n = 10, fatalities = 100, tov = 'sb')
# Generate 10 random values for the first event in the GED sample
runcertainUCDP(n = 10, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
# Obtaining the probability that an arbitrary UCDP event has at most 150 fatalities
puncertainUCDP(q = 150, fatalities = 100, tov = 'ns')
# Obtaining the probability that the first event in the GED sample has at most 5 fatalities
puncertainUCDP(q = 5, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
# Obtaining the 90th percentile for an arbitrary UCDP event and one-sided violence
quncertainUCDP(p = 0.9, fatalities = 100, tov = 'os')
# Obtaining the 90th percentile for the first event in the GED sample
quncertainUCDP(p = 0.9, fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
# Obtaining the density for an arbitrary UCDP event and state-based violence
duncertainUCDP(x = seq(from = 0, to = 500), fatalities = 100, tov = 'sb')
# Obtaining the density for the first event in the GED sample
duncertainUCDP(x = seq(0, 50), fatalities = ucdpged$best[1], tov = ucdpged$type_of_violence[1])
UCDP Georeferenced Event Dataset (GED) sample
Description
A sample of the UCDP Georeferenced Event Dataset (GED) from the 2023 data release. The data contains information about the date, location, and type of conflict events. The data is a sample of the full dataset, which can be downloaded from the UCDP website <https://ucdp.uu.se/downloads/>.
Usage
ucdpged
Format
A tibble with 1000 rows and 49 columns.
Source
<https://ucdp.uu.se/downloads/>
Parameter extraction for uncertainUCDP-functions
Description
Extracts the parameters of the reported-value inflated Gumbel mixture distribution for UCDP events. Primarily intended for internal use by the uncertainUCDP functions, but it can also be used to extract the distribution parameters manually.
Usage
uncertainUCDP_parameters(fatalities, tov)
Arguments
fatalities |
A vector of non-negative integers representing the number of fatalities of the UCDP event. Non-integer values are allowed but should be considered experimental. |
tov |
A character string or integer value representing the type of violence of the UCDP event. Must be one of "sb", "ns", "os", or "any", or their numeric equivalents. The options are: * "sb" or 1 for state-based violence * "ns" or 2 for non-state violence * "os" or 3 for one-sided violence * "any" or 4 for parameters estimated across all types of violence. The "any" option is somewhat experimental and should be used with caution; it may be useful when the type of violence is unknown or when the user wants to combine all types of violence into a single category. |
Value
A list with three elements: loc, scale, and w. loc and scale are the location and scale parameters of the Gumbel distribution, respectively. w is the weight parameter for the reported-value inflation.
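As a sketch of how a parameter list of this shape might be used manually, the base R example below computes the mean of a Gumbel and point-mass mixture from `loc`, `scale`, and `w`. The list here is a hypothetical stand-in with made-up values, not output from the package. The calculation assumes the maximum-type Gumbel parameterization, that `w` is the weight on the point mass at the reported value, and that no truncation or rounding is applied, so it may differ from what `mean_uncertainUCDP` returns.

```r
# Stand-in for the list returned by uncertainUCDP_parameters(); the values are
# hypothetical. With the package installed, one could instead use
# pars <- uncertainUCDP::uncertainUCDP_parameters(fatalities = 100, tov = "sb")
pars <- list(loc = 110, scale = 25, w = 0.3)
reported <- 100   # reported number of fatalities for the event

# Mean of a Gumbel (maximum) distribution: loc + scale * Euler-Mascheroni constant.
euler_gamma <- -digamma(1)
gumbel_mean <- pars$loc + pars$scale * euler_gamma

# Mean of the mixture under the stated assumptions: weight w on the reported
# value and weight 1 - w on the Gumbel component.
mixture_mean <- pars$w * reported + (1 - pars$w) * gumbel_mean
mixture_mean
```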