Title: Bayesian Discharge Rating Curves
Version: 1.1.0
Maintainer: Solvi Rognvaldsson <solviro@gmail.com>
Description: Fits a discharge rating curve based on the power-law and the generalized power-law from data on paired stage and discharge measurements in a given river using a Bayesian hierarchical model as described in Hrafnkelsson et al. (2020) <doi:10.48550/arXiv.2010.04769>.
Depends: R (≥ 3.5.0)
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 7.2.1
Imports: ggplot2, grid, gridExtra, rlang, scales
Suggests: testthat, knitr, rmarkdown, covr, vdiffr
VignetteBuilder: knitr
BugReports: https://github.com/sor16/bdrc/issues
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2023-03-19 08:53:15 UTC; solviro
Author: Birgir Hrafnkelsson ORCID iD [aut, cph], Solvi Rognvaldsson ORCID iD [aut, cre], Axel Orn Jansson [aut], Rafael Vias [aut]
Repository: CRAN
Date/Publication: 2023-03-19 17:10:03 UTC

Autoplot method for discharge rating curves

Description

Visualize discharge rating curve model objects

Usage

## S3 method for class 'plm0'
autoplot(
  object,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'plm'
autoplot(
  object,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'gplm0'
autoplot(
  object,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'gplm'
autoplot(
  object,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

Arguments

object

an object of class "plm0","plm","gplm0" or "gplm".

...

other plotting parameters (not used in this function)

type

a character denoting what type of plot should be drawn. Defaults to "rating_curve". Possible types are

  • "rating_curve" to plot the rating curve.

  • "rating_curve_mean" to plot the posterior mean of the rating curve.

  • "f" to plot the power-law exponent.

  • "beta" to plot the random effect in the power-law exponent.

  • "sigma_eps" to plot the standard deviation on the data level.

  • "residuals" to plot the log residuals.

  • "trace" to plot trace plots of parameters given in param.

  • "histogram" to plot histograms of parameters given in param.

param

a character vector with the parameters to plot. Defaults to NULL and is only used if type is "trace" or "histogram". Allowed values are the parameters given in the model summary of x as well as "hyperparameters" or "latent_parameters" for specific groups of parameters.

transformed

a logical value indicating whether the quantity should be plotted on a transformed scale used during the Bayesian inference. Defaults to FALSE.

title

a character denoting the title of the plot

xlim

numeric vector of length 2, denoting the limits on the x axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

ylim

numeric vector of length 2, denoting the limits on the y axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

Value

returns an object of class "ggplot2".

Functions

See Also

plm0, plm, gplm0 and gplm for fitting a discharge rating curve and summary.plm0, summary.plm, summary.gplm0 and summary.gplm for summaries. It is also useful to look at spread_draws and gather_draws to work directly with the MCMC samples.

Examples


library(ggplot2)
data(krokfors)
set.seed(1)
plm0.fit <- plm0(Q~W,krokfors,num_cores=2)
autoplot(plm0.fit)
autoplot(plm0.fit,transformed=TRUE)
autoplot(plm0.fit,type='histogram',param='c')
autoplot(plm0.fit,type='histogram',param='c',transformed=TRUE)
autoplot(plm0.fit,type='histogram',param='hyperparameters')
autoplot(plm0.fit,type='histogram',param='latent_parameters')
autoplot(plm0.fit,type='residuals')
autoplot(plm0.fit,type='f')
autoplot(plm0.fit,type='sigma_eps')


Autoplot method for discharge rating curve tournament

Description

Compare the four discharge rating curves from the tournament object in different ways

Usage

## S3 method for class 'tournament'
autoplot(object, ..., type = "deviance")

Arguments

object

an object of class "tournament"

...

other plotting parameters (not used in this function)

type

a character denoting what type of plot should be drawn. Possible types are

  • "deviance" to plot the deviance of the four models.

Value

returns an object of class "ggplot2".

See Also

tournament to run a discharge rating curve tournament and summary.tournament for summaries.

Examples


library(ggplot2)
data(krokfors)
set.seed(1)
t_obj <- tournament(formula=Q~W,data=krokfors,num_cores=2)
autoplot(t_obj)


Compare two models using a specified model selection criteria

Description

evaluate_game uses WAIC, DIC or the posterior probabilities of the models, calculated with Bayes factor, to determine whether one model is more appropriate than the other model given the data at hand.

Usage

evaluate_game(m, method, winning_criteria)

Arguments

m

a list of two model objects fit on the same dataset. The allowed model objects are "gplm", "gplm0", "plm" and "plm0"

method

a string specifying the method used to estimate the predictive performance of the models. The allowed methods are "WAIC", "DIC" and "Posterior_probability".

winning_criteria

a numerical value which sets the threshold which the first model in the list must exceed for it to be declared the more appropriate model. This value defaults to 2.2 for methods "WAIC" and "DIC", but defaults to 0.75 for method "Posterior_probability".

Value

A data.frame with the summary of the results of the game

References

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

See Also

tournament


Gather MCMC chain draws to data.frame on a long format

Description

Useful to convert MCMC chain draws of particular parameters or output from the model object to a long format for further data wrangling

Usage

gather_draws(mod, ..., transformed = F)

Arguments

mod

an object of class "plm0","plm","gplm0" or "gplm".

...

any number of character vectors containing valid names of parameters in the model or "rating_curve" and "rating_curve_mean". Also accepts "latent_parameters" and "hyperparameters".

transformed

boolean value determining whether the parameter is to be represented on the transformed scale used for sampling in the MCMC chain or the original scale. Defaults to FALSE.

Value

Data frame with columns chain iter param value

References

B. Hrafnkelsson, H. Sigurdarson, S.M. Gardarsson, 2020, Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling. arXiv preprint 2010.04769

See Also

plm0, plm, gplm0, gplm for further information on parameters

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)
hyp_samples <- gather_draws(plm0.fit,'hyperparameters')
head(hyp_samples)
rating_curve_samples <- gather_draws(plm0.fit,'rating_curve','rating_curve_mean')
head(rating_curve_samples)


Report for a discharge rating curve or tournament

Description

Save a pdf file with a report of a discharge rating curve object or tournament.

Usage

get_report(x, path = NULL, type = 1, ...)

## S3 method for class 'plm0'
get_report(x, path = NULL, type = 1, ...)

## S3 method for class 'plm'
get_report(x, path = NULL, type = 1, ...)

## S3 method for class 'gplm0'
get_report(x, path = NULL, type = 1, ...)

## S3 method for class 'gplm'
get_report(x, path = NULL, type = 1, ...)

## S3 method for class 'tournament'
get_report(x, path = NULL, type = 1, ...)

Arguments

x

an object of class "tournament", "plm0", "plm", "gplm0" or "gplm".

path

file path to which the pdf file of the report is saved. If NULL, the current working directory is used.

type

an integer denoting what type of report is to be produced. Defaults to type 1. Only type 1 is permissible for an object of class "plm0", "plm", "gplm0" or "gplm". Possible types are

  • 1 - produces a report displaying the results of the model (winning model if a tournament provided). The first page contains a panel of four plots and a summary of the posterior distributions of the parameters. On the second page a tabular prediction of discharge on an equally spaced grid of stages is displayed. This prediction table can span multiple pages.

  • 2 - produces a ten page report and is only permissible for objects of class "tournament". The first four pages contain a panel of four plots and a summary of the posterior distributions of the parameters for each of the four models in the tournament, the fifth page shows model comparison plots and tables, the sixth page convergence diagnostics plots, and the final four pages shows the histograms of the parameters in each of the four models.

...

further arguments passed to other methods (currently unused).

Details

This function can only be used in an interactive R session as it asks permission from the user to write to their file system.

Value

No return value, called for side effects

Methods (by class)

See Also

get_report for generating and saving a report.

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)

## Not run: 
get_report(plm0.fit)

## End(Not run)

Report pages for a discharge rating curve or tournament

Description

Get a list of the pages of a report on a discharge rating curve model or tournament

Usage

get_report_pages(x, type = 1, ...)

## S3 method for class 'plm0'
get_report_pages(x, type = 1, ...)

## S3 method for class 'plm'
get_report_pages(x, type = 1, ...)

## S3 method for class 'gplm0'
get_report_pages(x, type = 1, ...)

## S3 method for class 'gplm'
get_report_pages(x, type = 1, ...)

## S3 method for class 'tournament'
get_report_pages(x, type = 1, ...)

Arguments

x

an object of class "tournament", "plm0", "plm", "gplm0" or "gplm".

type

an integer denoting what type of report is to be produced. Defaults to type 1. Possible types are

  • 1 - produces a report displaying the results of the model (winning model if a tournament provided). The first page contains a panel of four plots and a summary of the posterior distributions of the parameters. On the second page a tabular prediction of discharge on an equally spaced grid of stages is displayed. This prediction table can span multiple pages.

  • 2 - produces a ten page report and is only permissible for objects of class "tournament". The first four pages contain a panel of four plots and a summary of the posterior distributions of the parameters for each of the four models in the tournament, the fifth page shows model comparison plots and tables, the sixth page convergence diagnostics plots, and the final four pages shows the histograms of the parameters in each of the four models.

...

further arguments passed to other methods (currently unused).

Value

A list of objects of type "grob" that correspond to the pages in a rating curve report.

Methods (by class)

See Also

tournament for running a tournament,summary.tournament for summaries and get_report for generating and saving a report of a tournament object.

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)
plm0_pages <- get_report_pages(plm0.fit)


Generalized power-law model with variance that varies with stage.

Description

gplm is used to fit a discharge rating curve for paired measurements of stage and discharge using a generalized power-law model with variance that varies with stage as described in Hrafnkelsson et al. (2022). See "Details" for a more elaborate description of the model.

Usage

gplm(
  formula,
  data,
  c_param = NULL,
  h_max = NULL,
  parallel = TRUE,
  num_cores = NULL,
  forcepoint = rep(FALSE, nrow(data))
)

Arguments

formula

an object of class "formula", with discharge column name as response and stage column name as a covariate, i.e. of the form y~x where y is discharge in m^3/s and x is stage in m (it is very important that the data is in the correct units).

data

data.frame containing the variables specified in formula.

c_param

stage for which there is zero discharge. If NULL, it is treated as unknown in the model and inferred from the data.

h_max

maximum stage to which the rating curve should extrapolate to. If NULL, the maximum stage value in the data is selected as an upper bound.

parallel

logical value indicating whether to run the MCMC in parallel or not. Defaults to TRUE.

num_cores

integer between 1 and 4 (number of MCMC chains) indicating how many cores to use. Only used if parallel=TRUE. If NULL, the number of cores available on the device is detected automatically.

forcepoint

logical vector of the same length as the number of rows in data. If an element at index i is TRUE it indicates that the rating curve should be forced through the i-th measurement. Use with care, as this will strongly influence the resulting rating curve.

Details

The generalized power-law model is of the form

Q=a(h-c)^{f(h)}

where Q is discharge, h is stage, a and c are unknown constants and f is a function of h, referred to as the generalized power-law exponent.

The generalized power-law model is here inferred by using a Bayesian hierarchical model. The function f is modeled at the latent level as a fixed constant b plus a continuous stochastic process, \beta(h), which is assumed to be twice differentiable. The model is on a logarithmic scale

\log(Q_i) = \log(a) + (b + \beta(h_i)) \log(h_i - c) + \varepsilon_i, i = 1,...,n

where \varepsilon_i follows a normal distribution with mean zero and variance \sigma_\varepsilon(h_i)^2 that varies with stage. The stochastic process \beta(h) is assumed a priori to be a Gaussian process governed by a Matern covariance function with smoothness parameter \nu = 2.5. The error variance, \sigma_\varepsilon^2(h), of the log-discharge data is modeled as an exponential of a B-spline curve, that is, a linear combination of six B-spline basis functions that are defined over the range of the stage observations. An efficient posterior simulation is achieved by sampling from the joint posterior density of the hyperparameters of the model, and then sampling from the density of the latent parameters conditional on the hyperparameters.

Bayesian inference is based on the posterior density and summary statistics such as the posterior mean and 95% posterior intervals are based on the posterior density. Analytical formulas for these summary statistics are intractable in most cases and thus they are computed by generating samples from the posterior density using a Markov chain Monte Carlo simulation.

Value

gplm returns an object of class "gplm". An object of class "gplm" is a list containing the following components:

rating_curve

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior predictive distribution of the rating curve.

rating_curve_mean

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the mean of the rating curve.

param_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of latent- and hyperparameters. Additionally contains columns with r_hat and the effective number of samples for each parameter as defined in Gelman et al. (2013).

f_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of f(h).

beta_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of \beta(h).

sigma_eps_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of \sigma_\varepsilon(h).

Deviance_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the deviance.

rating_curve_posterior

a matrix containing the full thinned posterior samples of the posterior predictive distribution of the rating curve excluding burn-in samples.

rating_curve_mean_posterior

a matrix containing the full thinned posterior samples of the posterior distribution of the mean of the rating curve excluding burn-in samples.

a_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of a excluding burn-in samples.

b_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of b excluding burn-in samples.

c_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of c excluding burn-in samples.

sigma_beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_\beta excluding burn-in samples.

phi_beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \phi_\beta excluding burn-in samples.

sigma_eta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_\eta excluding burn-in samples.

eta_1_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_1 excluding burn-in samples.

eta_2_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_2 excluding burn-in samples.

eta_3_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_3 excluding burn-in samples.

eta_4_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_4 excluding burn-in samples.

eta_5_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_5 excluding burn-in samples.

eta_6_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_6 excluding burn-in samples.

f_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of f(h) excluding burn-in samples.

beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \beta(h) excluding burn-in samples.

sigma_eps_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_\varepsilon(h) excluding burn-in samples.

Deviance_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of the deviance excluding burn-in samples.

D_hat

deviance at the median value of the parameters.

effective_num_param_DIC

effective number of parameters, which is calculated as median(Deviance_posterior) minus D_hat.

DIC

Deviance Information Criterion for the model, calculated as D_hat plus 2*effective_num_parameters_DIC.

lppd

log pointwise predictive probability of the observed data under the model

effective_num_param_WAIC

effective number of parameters, which is calculated by summing up the posterior variance of the log predictive density for each data point.

WAIC

Watanabe-Akaike information criterion for the model, defined as -2*( lppd - effective_num_param_WAIC ).

autocorrelation

a data frame with the autocorrelation of each parameter for different lags.

acceptance_rate

proportion of accepted samples in the thinned MCMC chain (excluding burn-in).

formula

object of type "formula" provided by the user.

data

data provided by the user, ordered by stage.

run_info

information about the input arguments and the specific parameters used in the MCMC chain.

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

Spiegelhalter, D., Best, N., Carlin, B., Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4), 583–639.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594.

See Also

summary.gplm for summaries, predict.gplm for prediction and plot.gplm for plots. spread_draws and gather_draws are also useful to aid further visualization of the full posterior distributions.

Examples


data(norn)
set.seed(1)
gplm.fit <- gplm(formula=Q~W,data=norn,num_cores=2)
summary(gplm.fit)


Generalized power-law model with a constant variance

Description

gplm0 is used to fit a discharge rating curve for paired measurements of stage and discharge using a generalized power-law model with a constant variance as described in Hrafnkelsson et al. (2022). See "Details" for a more elaborate description of the model.

Usage

gplm0(
  formula,
  data,
  c_param = NULL,
  h_max = NULL,
  parallel = TRUE,
  num_cores = NULL,
  forcepoint = rep(FALSE, nrow(data))
)

Arguments

formula

an object of class "formula", with discharge column name as response and stage column name as a covariate, i.e. of the form y~x where y is discharge in m^3/s and x is stage in m (it is very important that the data is in the correct units).

data

data.frame containing the variables specified in formula.

c_param

stage for which there is zero discharge. If NULL, it is treated as unknown in the model and inferred from the data.

h_max

maximum stage to which the rating curve should extrapolate to. If NULL, the maximum stage value in the data is selected as an upper bound.

parallel

logical value indicating whether to run the MCMC in parallel or not. Defaults to TRUE.

num_cores

integer between 1 and 4 (number of MCMC chains) indicating how many cores to use. Only used if parallel=TRUE. If NULL, the number of cores available on the device is detected automatically.

forcepoint

logical vector of the same length as the number of rows in data. If an element at index i is TRUE it indicates that the rating curve should be forced through the i-th measurement. Use with care, as this will strongly influence the resulting rating curve.

Details

The generalized power-law model is of the form

Q=a(h-c)^{f(h)}

where Q is discharge, h is stage, a and c are unknown constants and f is a function of h referred to as the generalized power-law exponent.

The generalized power-law model is here inferred by using a Bayesian hierarchical model. The function f is modeled at the latent level as a fixed constant $b$ plus a continuous stochastic process,\beta(h), which is assumed to be twice differentiable. The model is on a logarithmic scale

\log(Q_i) = \log(a) + (b + \beta(h_i)) \log(h_i - c) + \varepsilon, i = 1,...,n

where \varepsilon follows a normal distribution with mean zero and variance \sigma_\varepsilon^2, independent of stage. The stochastic process \beta(h) is assumed a priori to be a Gaussian process governed by a Matern covariance function with smoothness parameter \nu = 2.5. An efficient posterior simulation is achieved by sampling from the joint posterior density of the hyperparameters of the model, and then sampling from the density of the latent parameters conditional on the hyperparameters.

Bayesian inference is based on the posterior density and summary statistics such as the posterior mean and 95% posterior intervals are based on the posterior density. Analytical formulas for these summary statistics are intractable in most cases and thus they are computed by generating samples from the posterior density using a Markov chain Monte Carlo simulation.

Value

gplm0 returns an object of class "gplm0". An object of class "gplm0" is a list containing the following components:

rating_curve

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior predictive distribution of the rating curve.

rating_curve_mean

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the mean of the rating curve.

param_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of latent- and hyperparameters. Additionally contains columns with r_hat and the effective number of samples for each parameter as defined in Gelman et al. (2013).

beta_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of \beta.

Deviance_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the deviance.

rating_curve_posterior

a matrix containing the full thinned posterior samples of the posterior predictive distribution of the rating curve (excluding burn-in).

rating_curve_mean_posterior

a matrix containing the full thinned posterior samples of the posterior distribution of the mean of the rating curve (excluding burn-in).

a_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of a.

b_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of b.

c_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of c.

sigma_eps_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_{\varepsilon}.

sigma_beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_{\beta}.

phi_beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \phi_{\beta}.

sigma_eta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_{\eta}.

beta_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \beta.

Deviance_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of the deviance excluding burn-in samples.

D_hat

deviance at the median value of the parameters.

effective_num_param_DIC

effective number of parameters, which is calculated as median(Deviance_posterior) minus D_hat.

DIC

Deviance Information Criterion for the model, calculated as D_hat plus 2*effective_num_parameters_DIC.

lppd

log pointwise predictive probability of the observed data under the model

effective_num_param_WAIC

effective number of parameters, which is calculated by summing up the posterior variance of the log predictive density for each data point.

WAIC

Watanabe-Akaike information criterion for the model, defined as -2*( lppd - effective_num_param_WAIC ).

autocorrelation

a data frame with the autocorrelation of each parameter for different lags.

acceptance_rate

proportion of accepted samples in the thinned MCMC chain (excluding burn-in).

formula

object of type "formula" provided by the user.

data

data provided by the user, ordered by stage.

run_info

information about the input arguments and the specific parameters used in the MCMC chain.

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

Spiegelhalter, D., Best, N., Carlin, B., Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4), 583–639.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594.

See Also

summary.gplm0 for summaries, predict.gplm0 for prediction. It is also useful to look at spread_draws and plot.gplm0 to help visualize the full posterior distributions.

Examples


data(krokfors)
set.seed(1)
gplm0.fit <- gplm0(formula=Q~W,data=krokfors,num_cores=2)
summary(gplm0.fit)


Jokulsa a Dal gauging station in Iceland

Description

Data on discharge and stage from Jokulsa a Dal gauging station in Iceland

Usage

jokdal

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Icelandic Meteorological Office, Landsvirkjun - the National Power Company of Iceland, and the Icelandic Road and Coastal Administration.


Jokulsa a Fjollum gauging station in Iceland

Description

Data on discharge and stage from Jokulsa a Fjollum gauging station in Iceland

Usage

jokfjoll

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Icelandic Meteorological Office, Landsvirkjun - the National Power Company of Iceland, and the Icelandic Road and Coastal Administration.


Kallstorp gauging station in Sweden

Description

Data on discharge and stage from Kallstorp gauging station in Sweden

Usage

kallstorp

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Krokfors gauging station in Sweden

Description

Data on discharge and stage from Krokfors gauging station in Sweden.

Usage

krokfors

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Melby gauging station in Sweden

Description

Data on discharge and stage from Melby gauging station in Sweden

Usage

melby

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Nordura gauging station in Iceland

Description

Data on discharge and stage from Nordura gauging station in Iceland

Usage

nordura

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Icelandic Meteorological Office, Landsvirkjun - the National Power Company of Iceland, and the Icelandic Road and Coastal Administration.


Norn gauging station in Sweden

Description

Data on discharge and stage from Norn gauging station in Sweden.

Usage

norn

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Power-law model with variance that varies with stage.

Description

plm is used to fit a discharge rating curve for paired measurements of stage and discharge using a power-law model with variance that varies with stage as described in Hrafnkelsson et al. (2022). See "Details" for a more elaborate description of the model.

Usage

plm(
  formula,
  data,
  c_param = NULL,
  h_max = NULL,
  parallel = TRUE,
  num_cores = NULL,
  forcepoint = rep(FALSE, nrow(data))
)

Arguments

formula

an object of class "formula", with discharge column name as response and stage column name as a covariate, i.e. of the form y~x where y is discharge in m^3/s and x is stage in m (it is very important that the data is in the correct units).

data

data.frame containing the variables specified in formula.

c_param

stage for which there is zero discharge. If NULL, it is treated as unknown in the model and inferred from the data.

h_max

maximum stage to which the rating curve should extrapolate to. If NULL, the maximum stage value in the data is selected as an upper bound.

parallel

logical value indicating whether to run the MCMC in parallel or not. Defaults to TRUE.

num_cores

integer between 1 and 4 (number of MCMC chains) indicating how many cores to use. Only used if parallel=TRUE. If NULL, the number of cores available on the device is detected automatically.

forcepoint

logical vector of the same length as the number of rows in data. If an element at index i is TRUE it indicates that the rating curve should be forced through the i-th measurement. Use with care, as this will strongly influence the resulting rating curve.

Details

The power-law model, which is commonly used in hydraulic practice, is of the form

Q=a(h-c)^{b}

where Q is discharge, h is stage and a, b and c are unknown constants.

The power-law model is here inferred by using a Bayesian hierarchical model. The model is on a logarithmic scale

\log(Q_i) = \log(a) + b \log(h_i - c) + \varepsilon_i, i = 1,...,n

where \varepsilon_i follows a normal distribution with mean zero and variance \sigma_\varepsilon(h_i)^2 that varies with stage. The error variance, \sigma_\varepsilon^2(h), of the log-discharge data is modeled as an exponential of a B-spline curve, that is, a linear combination of six B-spline basis functions that are defined over the range of the stage observations. An efficient posterior simulation is achieved by sampling from the joint posterior density of the hyperparameters of the model, and then sampling from the density of the latent parameters conditional on the hyperparameters.

Bayesian inference is based on the posterior density and summary statistics such as the posterior mean and 95% posterior intervals are based on the posterior density. Analytical formulas for these summary statistics are intractable in most cases and thus they are computed by generating samples from the posterior density using a Markov chain Monte Carlo simulation.

Value

plm returns an object of class "plm". An object of class "plm" is a list containing the following components:

rating_curve

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior predictive distribution of the rating curve.

rating_curve_mean

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the mean of the rating curve. Additionally contains columns with r_hat and the effective number of samples for each parameter as defined in Gelman et al. (2013).

param_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of latent- and hyperparameters.

sigma_eps_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior of \sigma_{\varepsilon}.

Deviance_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the deviance.

rating_curve_posterior

a matrix containing the full thinned posterior samples of the posterior predictive distribution of the rating curve (excluding burn-in).

rating_curve_mean_posterior

a matrix containing the full thinned posterior samples of the posterior distribution of the mean of the rating curve (excluding burn-in).

a_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of a.

b_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of b.

c_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of c.

sigma_eps_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_{\varepsilon}.

eta_1_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_1.

eta_2_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_2.

eta_3_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_3.

eta_4_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_4.

eta_5_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_5.

eta_6_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \eta_6.

Deviance_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of the deviance excluding burn-in samples.

D_hat

deviance at the median value of the parameters.

effective_num_param_DIC

effective number of parameters, which is calculated as median(Deviance_posterior) minus D_hat.

DIC

Deviance Information Criterion for the model, calculated as D_hat plus 2*effective_num_parameters_DIC.

lppd

log pointwise predictive probability of the observed data under the model

effective_num_param_WAIC

effective number of parameters, which is calculated by summing up the posterior variance of the log predictive density for each data point.

WAIC

Watanabe-Akaike information criterion for the model, defined as -2*( lppd - effective_num_param_WAIC ).

autocorrelation

a data frame with the autocorrelation of each parameter for different lags.

acceptance_rate

proportion of accepted samples in the thinned MCMC chain (excluding burn-in).

formula

object of type "formula" provided by the user.

data

data provided by the user, ordered by stage.

run_info

information about the input arguments and the specific parameters used in the MCMC chain.

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

Spiegelhalter, D., Best, N., Carlin, B., Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4), 583–639.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594.

See Also

summary.plm for summaries, predict.plm for prediction. It is also useful to look at spread_draws and plot.plm to help visualize the full posterior distributions.

Examples


data(spanga)
set.seed(1)
plm.fit <- plm(formula=Q~W,data=spanga,num_cores=2)
summary(plm.fit)


Power-law model with a constant variance

Description

plm0 is used to fit a discharge rating curve for paired measurements of stage and discharge using a power-law model with a constant variance as described in Hrafnkelsson et al. (2022). See "Details" for a more elaborate description of the model.

Usage

plm0(
  formula,
  data,
  c_param = NULL,
  h_max = NULL,
  parallel = TRUE,
  num_cores = NULL,
  forcepoint = rep(FALSE, nrow(data))
)

Arguments

formula

an object of class "formula", with discharge column name as response and stage column name as a covariate, i.e. of the form y~x where y is discharge in m^3/s and x is stage in m (it is very important that the data is in the correct units).

data

data.frame containing the variables specified in formula.

c_param

stage for which there is zero discharge. If NULL, it is treated as unknown in the model and inferred from the data.

h_max

maximum stage to which the rating curve should extrapolate to. If NULL, the maximum stage value in the data is selected as an upper bound.

parallel

logical value indicating whether to run the MCMC in parallel or not. Defaults to TRUE.

num_cores

integer between 1 and 4 (number of MCMC chains) indicating how many cores to use. Only used if parallel=TRUE. If NULL, the number of cores available on the device is detected automatically.

forcepoint

logical vector of the same length as the number of rows in data. If an element at index i is TRUE it indicates that the rating curve should be forced through the i-th measurement. Use with care, as this will strongly influence the resulting rating curve.

Details

The power-law model, which is commonly used in hydraulic practice, is of the form

Q=a(h-c)^{b}

where Q is discharge, h is stage and a, b and c are unknown constants.

The power-law model is here inferred by using a Bayesian hierarchical model. The model is on a logarithmic scale

\log(Q_i) = \log(a) + b \log(h_i - c) + \varepsilon, i = 1,...,n

where \varepsilon follows a normal distribution with mean zero and variance \sigma_\varepsilon^2, independent of stage. An efficient posterior simulation is achieved by sampling from the joint posterior density of the hyperparameters of the model, and then sampling from the density of the latent parameters conditional on the hyperparameters.

Bayesian inference is based on the posterior density and summary statistics such as the posterior mean and 95% posterior intervals are based on the posterior density. Analytical formulas for these summary statistics are intractable in most cases and thus they are computed by generating samples from the posterior density using a Markov chain Monte Carlo simulation.

Value

plm0 returns an object of class "plm0". An object of class "plm0" is a list containing the following components:

rating_curve

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior predictive distribution of the rating curve.

rating_curve_mean

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the mean of the rating curve.

param_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of latent- and hyperparameters. Additionally contains columns with r_hat and the effective number of samples for each parameter as defined in Gelman et al. (2013).

Deviance_summary

a data frame with 2.5%, 50% and 97.5% percentiles of the posterior distribution of the deviance.

rating_curve_posterior

a matrix containing the full thinned posterior samples of the posterior predictive distribution of the rating curve (excluding burn-in).

rating_curve_mean_posterior

a matrix containing the full thinned posterior samples of the posterior distribution of the mean of the rating curve (excluding burn-in).

a_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of a.

b_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of b.

c_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of c.

sigma_eps_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of \sigma_{\varepsilon}.

Deviance_posterior

a numeric vector containing the full thinned posterior samples of the posterior distribution of the deviance excluding burn-in samples.

D_hat

deviance at the median value of the parameters

effective_num_param_DIC

effective number of parameters, which is calculated as median(Deviance_posterior) minus D_hat.

DIC

Deviance Information Criterion for the model, calculated as D_hat plus 2*effective_num_parameters_DIC.

lppd

log pointwise predictive probability of the observed data under the model

effective_num_param_WAIC

effective number of parameters, which is calculated by summing up the posterior variance of the log predictive density for each data point.

WAIC

Watanabe-Akaike information criterion for the model, defined as -2*( lppd - effective_num_param_WAIC ).

autocorrelation

a data frame with the autocorrelation of each parameter for different lags.

acceptance_rate

proportion of accepted samples in the thinned MCMC chain (excluding burn-in).

formula

object of type "formula" provided by the user.

data

data provided by the user, ordered by stage.

run_info

information about the input arguments and the specific parameters used in the MCMC chain.

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

Spiegelhalter, D., Best, N., Carlin, B., Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4), 583–639.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594.

See Also

summary.plm0 for summaries, predict.plm0 for prediction. It is also useful to look at spread_draws and plot.plm0 to help visualize the full posterior distributions.

Examples


data(skogsliden)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=skogsliden,num_cores=2)
summary(plm0.fit)


Plot method for discharge rating curves

Description

Visualize discharge rating curve model objects

Usage

## S3 method for class 'plm0'
plot(
  x,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'plm'
plot(
  x,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'gplm0'
plot(
  x,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

## S3 method for class 'gplm'
plot(
  x,
  ...,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL
)

Arguments

x

object of class "plm0", "plm", "gplm0" or "gplm".

...

other plotting parameters (not used in this function)

type

a character denoting what type of plot should be drawn. Defaults to "rating_curve". Possible types are

  • "rating_curve" to plot the rating curve.

  • "rating_curve_mean" to plot the posterior mean of the rating curve.

  • "f" to plot the power-law exponent.

  • "beta" to plot the random effect in the power-law exponent.

  • "sigma_eps" to plot the standard deviation on the data level.

  • "residuals" to plot the log residuals.

  • "trace" to plot trace plots of parameters given in param.

  • "histogram" to plot histograms of parameters given in param.

  • "panel" to plot a 2x2 panel of plots: "rating curve", "residuals", "f" and "sigma_eps"

param

a character vector with the parameters to plot. Defaults to NULL and is only used if type is "trace" or "histogram". Allowed values are the parameters given in the model summary of x as well as "hyperparameters" or "latent_parameters" for specific groups of parameters.

transformed

a logical value indicating whether the quantity should be plotted on a transformed scale used during the Bayesian inference. Defaults to FALSE.

title

a character denoting the title of the plot

xlim

numeric vector of length 2, denoting the limits on the x axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

ylim

numeric vector of length 2, denoting the limits on the y axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

Value

No return value, called for side effects.

Functions

See Also

plm0, plm, gplm0 and gplm for fitting a discharge rating curve and summary.plm0, summary.plm, summary.gplm0 and summary.gplm for summaries. It is also useful to look at spread_draws and gather_draws to work directly with the MCMC samples.

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)

plot(plm0.fit)
plot(plm0.fit,transformed=TRUE)
plot(plm0.fit,type='histogram',param='c')
plot(plm0.fit,type='histogram',param='c',transformed=TRUE)
plot(plm0.fit,type='histogram',param='hyperparameters')
plot(plm0.fit,type='histogram',param='latent_parameters')
plot(plm0.fit,type='residuals')
plot(plm0.fit,type='f')
plot(plm0.fit,type='sigma_eps')


Plot method for discharge rating curve tournament

Description

Compare the four models from the tournament object in multiple ways

Usage

## S3 method for class 'tournament'
plot(x, ..., type = "tournament_results", transformed = FALSE)

Arguments

x

an object of class "tournament"

...

other plotting parameters (not used in this function)

type

a character denoting what type of plot should be drawn. Possible types are

  • "deviance" to plot the deviance of the four models.

  • "rating_curve" to plot the rating curve.

  • "rating_curve_mean" to plot the posterior mean of the rating curve.

  • "f" to plot the power-law exponent.

  • "sigma_eps" to plot the standard deviation on the data level.

  • "residuals" to plot the log residuals.

  • "tournament_results" to plot tournament results visually, game for game.

transformed

a logical value indicating whether the quantity should be plotted on a transformed scale used during the Bayesian inference. Defaults to FALSE.

Value

No return value, called for side effects

See Also

tournament to run a discharge rating curve tournament and summary.tournament for summaries.

Examples


data(krokfors)
set.seed(1)
t_obj <- tournament(formula=Q~W,data=krokfors,num_cores=2)
plot(t_obj)
plot(t_obj,transformed=TRUE)
plot(t_obj,type='deviance')
plot(t_obj,type='f')
plot(t_obj,type='sigma_eps')
plot(t_obj,type='residuals')
plot(t_obj,type='tournament_results')


Plot bdrc model objects

Description

Visualize results from model objects in bdrc, plm0, plm, gplm0,gplm

Usage

plot_fun(
  x,
  type = "rating_curve",
  param = NULL,
  transformed = FALSE,
  title = NULL,
  xlim = NULL,
  ylim = NULL,
  ...
)

Arguments

x

an object of class "plm0","plm","gplm0" or "gplm".

type

a character denoting what type of plot should be drawn. Defaults to "rating_curve". Possible types are

  • "rating_curve" to plot the rating curve.

  • "rating_curve_mean" to plot the posterior mean of the rating curve.

  • "f" to plot the power-law exponent.

  • "beta" to plot the random effect in the power-law exponent.

  • "sigma_eps" to plot the standard deviation on the data level.

  • "residuals" to plot the log residuals.

  • "trace" to plot trace plots of parameters given in param.

  • "histogram" to plot histograms of parameters given in param.

  • "panel" to plot a 2x2 panel of plots: "rating curve", "residuals", "f" and "sigma_eps"

param

a character vector with the parameters to plot. Defaults to NULL and is only used if type is "trace" or "histogram". Allowed values are the parameters given in the model summary of x as well as "hyperparameters" or "latent_parameters" for specific groups of parameters.

transformed

a logical value indicating whether the quantity should be plotted on a transformed scale used during the Bayesian inference. Defaults to FALSE.

title

a character denoting the title of the plot. Defaults to NULL, i.e. no title.

xlim

numeric vector of length 2, denoting the limits on the x axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

ylim

numeric vector of length 2, denoting the limits on the y axis of the plot. Applicable for types "rating_curve","rating_curve_mean","f","beta","sigma_eps","residuals".

Value

returns an object of class ggplot2.


Predict method for discharge rating curves

Description

Predict the discharge for given stage values based on a discharge rating curve model object.

Usage

## S3 method for class 'plm0'
predict(object, ..., newdata = NULL, wide = FALSE)

## S3 method for class 'plm'
predict(object, ..., newdata = NULL, wide = FALSE)

## S3 method for class 'gplm0'
predict(object, ..., newdata = NULL, wide = FALSE)

## S3 method for class 'gplm'
predict(object, ..., newdata = NULL, wide = FALSE)

Arguments

object

an object of class "plm0", "plm", "gplm0" or "gplm".

...

not used in this function

newdata

a numeric vector of stage values for which to predict. If omitted, the stage values in the data are used.

wide

a logical value denoting whether to produce a wide prediction output.If TRUE, then the output is a table with median prediction values for an equally spaced grid of stages with 1 cm increments, each row containing predictions in a decimeter range of stages.

Value

an object of class "data.frame" with four columns, h (stage), lower (2.5% posterior predictive quantile), median (50% posterior predictive quantile), upper (97.5% posterior predictive quantile). If wide=TRUE, a matrix as described above (see wide parameter) is returned.

Functions

See Also

plm0, plm, gplm0 and gplm for fitting a discharge rating curve and summary.plm0, summary.plm, summary.gplm0 and summary.gplm for summaries. It is also useful to look at plot.plm0, plot.plm, plot.gplm0 and plot.gplm to help visualize all aspects of the fitted discharge rating curve. Additionally, spread_draws and spread_draws help working directly with the MCMC samples.

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,h_max=10,num_cores=2)
#predict rating curve on a equally 10 cm spaced grid from 9 to 10 meters
predict(plm0.fit,newdata=seq(9,10,by=0.1))


Print method for discharge rating curves

Description

Print a discharge rating curve model object

Usage

## S3 method for class 'plm0'
print(x, ...)

## S3 method for class 'plm'
print(x, ...)

## S3 method for class 'gplm0'
print(x, ...)

## S3 method for class 'gplm'
print(x, ...)

Arguments

x

an object of class "plm0", "plm", "gplm0" or "gplm".

...

not used in this function

Functions

See Also

plm0, plm, gplm0, gplm for fitting a discharge rating curve and summary.plm0, summary.plm, summary.gplm0 and summary.gplm for summaries. It is also useful to look at plot.plm0, plot.plm, plot.gplm0 and plot.gplm to help visualize all aspects of the fitted discharge rating curve. Additionally, spread_draws and spread_draws help working directly with the MCMC samples.


Print method for discharge rating curve tournament

Description

Print the results of a tournament of discharge rating curve model comparisons

Usage

## S3 method for class 'tournament'
print(x, ...)

Arguments

x

an object of class "tournament"

...

not used in this function

See Also

tournament to run a discharge rating curve tournament, summary.tournament for summaries and plot.tournament for visualizing the mode comparison.


Skjalfandafljot gauging station in Iceland

Description

Data on discharge and stage from Skjalfandafljot gauging station in Iceland

Usage

skjalf

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Icelandic Meteorological Office, Landsvirkjun - the National Power Company of Iceland, and the Icelandic Road and Coastal Administration.


Skogsliden gauging station in Sweden

Description

Data on discharge and stage from Skogsliden gauging station in Sweden

Usage

skogsliden

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Spanga gauging station in Sweden

Description

Data on discharge and stage from Spanga gauging station in Sweden.

Usage

spanga

Format

A data frame with columns:

W

Measurements of water stage in meters

Q

Measurements of water discharge in cubic meters per second

Source

Swedish Meteorological and Hydrological Institute.


Spread MCMC chain draws to data.frame on a wide format

Description

Useful to convert MCMC chain draws of particular parameters or output from the model object to a wide format for further data wrangling

Usage

spread_draws(mod, ..., transformed = FALSE)

Arguments

mod

an object of class "plm0","plm","gplm0" or "gplm".

...

any number of character vectors containing valid names of parameters in the model or "rating_curve" and "rating_curve_mean". Also accepts "latent_parameters" and "hyperparameters".

transformed

boolean value determining whether the output is to be represented on the transformed scale used for sampling in the MCMC chain or the original scale. Defaults to FALSE.

Value

Data frame with columns chain iter param value

References

B. Hrafnkelsson, H. Sigurdarson, S.M. Gardarsson, 2020, Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling. arXiv preprint 2010.04769

See Also

plm0, plm, gplm0, gplm for further information on parameters

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)
hyp_samples <- spread_draws(plm0.fit,'hyperparameters')
head(hyp_samples)
rating_curve_samples <- spread_draws(plm0.fit,'rating_curve','rating_curve_mean')
head(rating_curve_samples)


Summary method for discharge rating curves

Description

Summarize a discharge rating curve model object

Usage

## S3 method for class 'plm0'
summary(object, ...)

## S3 method for class 'plm'
summary(object, ...)

## S3 method for class 'gplm0'
summary(object, ...)

## S3 method for class 'gplm'
summary(object, ...)

Arguments

object

an object of class "plm0", "plm", "gplm0" or "gplm".

...

Not used for this function

Functions

See Also

plm0, plm, gplm0 and gplm for fitting a discharge rating curve. It is also useful to look at plot.plm0, plot.plm, plot.gplm0 and plot.gplm to help visualize all aspects of the fitted discharge rating curve. Additionally, spread_draws and spread_draws help working directly with the MCMC samples.

Examples


data(krokfors)
set.seed(1)
plm0.fit <- plm0(formula=Q~W,data=krokfors,num_cores=2)
summary(plm0.fit)


Summary method for a discharge rating curve tournament

Description

Print the summary of a tournament of model comparisons

Usage

## S3 method for class 'tournament'
summary(object, ...)

Arguments

object

an object of class "tournament"

...

not used in this function

See Also

tournament to run a discharge rating curve tournament and plot.tournament for visualizing the mode comparison

Examples


data(krokfors)
set.seed(1)
t_obj <- tournament(Q~W,krokfors,num_cores=2)
summary(t_obj)


Custom bdrc theme

Description

Custom bdrc theme

Usage

theme_bdrc(..., scaling = 1)

Arguments

...

not used in this function

scaling

a numerical value which can be used to scale up or down the size of the text and titles of a plot that uses theme_bdrc. Defaults to 1.

Value

returns a theme object for the package


Tournament - Model comparison

Description

tournament compares four rating curve models of different complexities and determines the model that provides the best fit of the data at hand.

Usage

tournament(
  formula = NULL,
  data = NULL,
  model_list = NULL,
  method = "WAIC",
  winning_criteria = NULL,
  ...
)

Arguments

formula

an object of class "formula", with discharge column name as response and stage column name as a covariate.

data

data.frame containing the variables specified in formula.

model_list

list of exactly four model objects of types "plm0","plm","gplm0" and "gplm" to be used in the tournament. Note that all of the model objects are required to be run with the same data and same c_param.

method

a string specifying the method used to estimate the predictive performance of the models. The allowed methods are "WAIC", "DIC" and "Posterior_probability".

winning_criteria

a numerical value which sets a threshold the more complex model in each model comparison must exceed to be deemed the more appropriate model. See the Details section.

...

optional arguments passed to the model functions.

Details

Tournament is a model comparison method that uses WAIC to estimate the predictive performance of the four models and select the most appropriate model given the data. The first round of model comparisons sets up two games between model types, "gplm" vs. "gplm0" and "plm" vs. "plm0". The two comparisons are conducted such that if the WAIC of the more complex model ("gplm" and "plm", respectively) is smaller than the WAIC of the simpler models ("gplm0" and "plm0", respectively) by an input argument called the winning_criteria (default value = 2.2), then it is chosen as the more appropriate model. If not, the simpler model is chosen. The more appropriate models move on to the second round and are compared in the same way. The winner of the second round is chosen as the overall tournament winner and deemed the most appropriate model given the data.

The default method "WAIC", or the Widely Applicable Information Criterion (see Watanabe (2010)), is used to estimate the predictive performance of the models. This method is a fully Bayesian method that uses the full set of posterior draws to estimate of the expected log pointwise predictive density.

Method "DIC", or Deviance Information Criterion (see Spiegelhalter (2002)), is similar to the "WAIC" but instead of using the full set of posterior draws to compute the estimate of the expected log pointwise predictive density, it uses a point estimate of the posterior distribution.

Method "Posterior_probability" uses the posterior probabilities of the models, calculated with Bayes factor (see Jeffreys (1961) and Kass and Raftery (1995)), to compare the models, where all the models are assumed a priori to be equally likely. This method is not chosen as the default method because the Bayes factor calculations can be quite unstable.

When methods "WAIC" or "DIC" are used, the winning_criteria should be a real number. The winning criteria is a threshold value which the more complex model in each model comparison must exceed for it to be declared the more appropriate model. Setting the winning criteria slightly above 0 (default value = 2.2 for both "WAIC" and "DIC") gives the less complex model in each comparison a slight advantage. When method "Posterior_probability" is used, the winning criteria should be a real value between 0 and 1 (default value = 0.75). This sets the threshold value for which the posterior probability of the more complex model, given the data, in each model comparison must exceed for it to be declared the more appropriate model. In all three cases, the default value is selected so as to give the less complex models a slight advantage, and should give more or less consistent results when applying the tournament to real world data.

Value

An object of type "tournament" with the following elements

contestants

model objects of types "plm0","plm","gplm0" and "gplm" being compared.

winner

model object of the tournament winner.

summary

a data frame with information on results of the different games in the tournament.

info

specifics about the tournament; the overall winner; the method used; and the winning criteria.

References

Hrafnkelsson, B., Sigurdarson, H., and Gardarsson, S. M. (2022). Generalization of the power-law rating curve using hydrodynamic theory and Bayesian hierarchical modeling, Environmetrics, 33(2):e2711.

Jeffreys, H. (1961). Theory of Probability, Third Edition. Oxford University Press.

Kass, R., and A. Raftery, A. (1995). Bayes Factors. Journal of the American Statistical Association, 90, 773-795.

Spiegelhalter, D., Best, N., Carlin, B., Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4), 583–639.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594.

See Also

plm0 plm, gplm0,gplm summary.tournament and plot.tournament

Examples


data(krokfors)
set.seed(1)
t_obj <- tournament(formula=Q~W,data=krokfors,num_cores=2)
t_obj
summary(t_obj)