Type: | Package |
Version: | 0.1.1 |
Date: | 2017-03-01 today |
Title: | Sampling from Conditional C- and D-Vine Copulas |
Description: | Provides tools for sampling from a conditional copula density decomposed via Pair-Copula Constructions as C- or D- vine. Here, the vines which can be used for such a sampling are those which sample as first the conditioning variables (when following the sampling algorithms shown in Aas et al. (2009) <doi:10.1016/j.insmatheco.2007.02.001>). The used sampling algorithm is presented and discussed in Bevacqua et al. (2017) <doi:10.5194/hess-2016-652>, and it is a modified version of that from Aas et al. (2009) <doi:10.1016/j.insmatheco.2007.02.001>. A function is available to select the best vine (based on information criteria) among those which allow for such a conditional sampling. The package includes a function to compare scatterplot matrices and pair-dependencies of two multivariate datasets. |
Author: | Emanuele Bevacqua [aut, cre] |
Maintainer: | Emanuele Bevacqua <emanuele.bevacqua@uni-graz.at> |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | combinat, graphics, stats, VineCopula |
RoxygenNote: | 5.0.1 |
Suggests: | testthat |
Depends: | R (≥ 2.10) |
NeedsCompilation: | no |
Packaged: | 2017-07-28 14:25:59 UTC; emanuele |
Repository: | CRAN |
Date/Publication: | 2017-07-28 17:20:10 UTC |
Selection of a C- or D- vine copula model for conditional sampling
Description
This function fits either a C- or a D- vine model to a d-dimensional dataset of uniform variables.
The fit of the pair-copula families is performed sequentially through the function RVineCopSelect
of
the package VineCopula
. The vine structure is selected among a group of C- and a D- vines which satisfy the requirement
discussed in Bevacqua et al. (2017). This group is composed by all C- and D- vines from which the conditioning variables
would be sampled as first when following the algorithms from Aas et al. (2009). Alternatively, if the
vine matrix describing the vine structure is given to the function, the fit of the pair-copulas is directly performed skipping the vine structure
selection procedure.
Usage
CDVineCondFit(data, Nx, treecrit = "AIC", type = "CVine-DVine",
selectioncrit = "AIC", familyset = NA, indeptest = FALSE,
level = 0.05, se = FALSE, rotations = TRUE, method = "mle",
Matrix = FALSE)
Arguments
data |
An |
Nx |
Number of conditioning variables. |
treecrit |
Character indicating the criteria used to select the vine. All possible vines are fitted trough
the function |
type |
Type of vine to be fitted: |
selectioncrit |
Character indicating the criterion for pair-copula selection.
Possible choices are |
familyset |
"Integer vector of pair-copula families to select from. The vector has to include at least one
pair-copula family that allows for positive and one that allows for negative
dependence. Not listed copula families might be included to better handle
limit cases. If |
indeptest |
Logical; whether a hypothesis test for the independence of
|
level |
numeric; significance level of the independence test (default:
|
se |
Logical; whether standard errors are estimated (default: |
rotations |
logical; if |
method |
indicates the estimation method: either maximum
likelihood estimation ( |
Matrix |
|
Value
An RVineMatrix
object describing the selected copula model
(for further details about RVineMatrix
objects see the documentation file of the VineCopula
package).
The selected families are stored
in $family
, and the sequentially estimated parameters in $par
and $par2
.
The fit of the model is performed via the function RVineCopSelect
of the package VineCopula
.
"The object RVineMatrix
includes the following information about the fit:
se, se2
standard errors for the parameter estimates (if
se = TRUE
; note that these are only approximate since they do not account for the sequential nature of the estimation,nobs
number of observations,
logLik, pair.logLik
log likelihood (overall and pairwise)
AIC, pair.AIC
Aikaike's Informaton Criterion (overall and pairwise),
BIC, pair.BIC
Bayesian's Informaton Criterion (overall and pairwise),
emptau
matrix of empirical values of Kendall's tau,
p.value.indeptest
matrix of p-values of the independence test.
Note
For a comprehensive summary of the vine copula model, use
summary(object)
; to see all its contents, use str(object)
". (VineCopula Documentation, version 2.1.1, pp. 103)
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017. [link] [link]
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009. [link]
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1. [link]
See Also
Examples
# Example 1
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
## Not run:
# Select and fit a C- vine copula model, requiring that the
RVM <- CDVineCondFit(data,Nx=3,treecrit="BIC",type="CVine",selectioncrit="AIC")
summary(RVM)
RVM$Matrix
## End(Not run)
# Example 2
# Read data
data(dataset)
data <- dataset$data[1:80,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Define a VineMatrix which can be used for conditional sampling
ListVines <- CDVineCondListMatrices(data,Nx=3)
Matrix=ListVines$DVine[[1]]
Matrix
## Not run:
# Fit copula families for the defined vine:
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
summary(RVM)
RVM$Matrix
RVM$family
# check
identical(RVM$Matrix,Matrix)
# Fit copula families for the defined vine, given a group of families to select from:
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix,familyset=c(1,2,3,14))
summary(RVM)
RVM$Matrix
RVM$family
# Try to fit copula families for a vine which is not among those
# that allow for conditional sampling:
Matrix
Matrix[which(Matrix==4)]=40
Matrix[which(Matrix==2)]=20
Matrix[which(Matrix==40)]=2
Matrix[which(Matrix==20)]=4
Matrix
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
RVM
## End(Not run)
List of the possible C- and D- vines allowing for conditional simulation
Description
Provides a list of the C- and D- vines which allow for conditional
sampling, under the condition discussed in the descriprion of CDVineCondFit
.
Usage
CDVineCondListMatrices(data, Nx, type = "CVine-DVine")
Arguments
data |
An |
Nx |
Number of conditioning variables. |
type |
Type of vine to be considered: |
Value
Listes of matrices describing C- ($CVine
) and D- ($DVine
) Vines.
Each matrix corresponds to a vine, according to the same notation used for RVineMatrix
objects (for further details about RVineMatrix
objects see the documentation file of the VineCopula
package).
The index i
in the matrix corresponds to the variable in the i-th column of data
.
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017. [link] [link]
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009. [link]
See Also
Examples
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# List possible D-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"DVine")
ListVines$DVine
# List possible C-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"CVine")
ListVines$CVine
# List possible C- and D-Vines:
ListVines <- CDVineCondListMatrices(data,Nx=3,"CVine-DVine")
ListVines
Ranking of C- and D- vines allowing for conditional simulation
Description
Provides a ranking of the C- and D- vines which allow for conditional
sampling, under the condition discussed in the descriprion of CDVineCondFit
.
Usage
CDVineCondRank(data, Nx, treecrit = "AIC", selectioncrit = "AIC",
familyset = NA, type = "CVine-DVine", indeptest = FALSE, level = 0.05,
se = FALSE, rotations = TRUE, method = "mle")
Arguments
data |
An |
Nx |
Number of conditioning variables. |
treecrit |
Character indicating the criteria used to select the vine. All possible vines are fitted trough
the function |
selectioncrit |
Character indicating the criterion for pair-copula selection.
Possible choices are |
familyset |
"Integer vector of pair-copula families to select from. The vector has to include at least one
pair-copula family that allows for positive and one that allows for negative
dependence. Not listed copula families might be included to better handle
limit cases. If |
type |
Type of vine to be fitted: |
indeptest |
"Logical; whether a hypothesis test for the independence of
|
level |
numeric; significance level of the independence test (default:
|
se |
Logical; whether standard errors are estimated (default: |
rotations |
logical; if |
method |
indicates the estimation method: either maximum
likelihood estimation ( |
Value
table
A table with the ranking of the vines, with vine index
i
, values of the selectedtreecrit
and vinetype
(1 for "CVine" and 2 for D-Vine).vines
A list where the element
[[i]]
is anRVineMatrix
object corresponding to thei
-th vine in the ranking shown intable
. EachRVineMatrix
object containes the selected families ($family
) as well as sequentially estimated parameters stored in$par
and$par2
. Details aboutRVineMatrix
objects are given in the documentation file of theVineCopula
package). The fit of each model is performed via the functionRVineCopSelect
of the packageVineCopula
. "The object is augmented by the following information about the fit:se, se2
standard errors for the parameter estimates (if
se = TRUE
; note that these are only approximate since they do not account for the sequential nature of the estimationnobs
number of observations
logLik, pair.logLik
log likelihood (overall and pairwise)
AIC, pair.AIC
Aikaike's Informaton Criterion (overall and pairwise)
BIC, pair.BIC
Bayesian's Informaton Criterion (overall and pairwise)
emptau
matrix of empirical values of Kendall's tau
p.value.indeptest
matrix of p-values of the independence test.
Note
For a comprehensive summary of the vine copula model, use
summary(object)
; to see all its contents, use str(object)
." (VineCopula Documentation, version 2.1.1, pp. 103)
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017. [link] [link]
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009. [link]
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1. [link]
See Also
Examples
# Read data
data(dataset)
data <- dataset$data[1:100,1:5]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Rank the possible D-Vines according to the AIC
## Not run:
Ranking <- CDVineCondRank(data,Nx=3,"AIC",type="DVine")
Ranking$table
# tree AIC type
# 1 1 -292.8720 2
# 2 2 -290.2941 2
# 3 3 -288.5719 2
# 4 4 -288.2496 2
# 5 5 -287.8006 2
# 6 6 -285.8503 2
# 7 7 -282.2867 2
# 8 8 -278.9371 2
# 9 9 -275.8339 2
# 10 10 -272.9459 2
# 11 11 -271.1526 2
# 12 12 -270.5269 2
Ranking$vines[[1]]$AIC
# [1] -292.8720
summary(Ranking$vines[[1]])
## End(Not run)
Simulation from a conditional C- or D-vine
Description
Simulates from a d-dimensional conditional C- or D-vine of the variables (Y,X), given the fixed conditioning variables X. The algorithm works for vines satysfying the requirements discussed in Bevacqua et al. (2017). The algorthm implemented here is a modified version of those form Aas et al. (2009) and is shown in Bevacqua et al. (2017).
Usage
CDVineCondSim(RVM, Condition, N)
Arguments
RVM |
An |
Condition |
A |
N |
Number of data to be simulated. By default N is taken from |
Value
A N x d
matrix of the simulated variables from the given C- or D-vine copula model. In the first columns there are
the simulated conditioned variables, and in the last columns the conditioning variables Condition
.
For more details about the exact order of the variables in the columns see the examples. The
function is built to work easily in combination with CDVineCondFit
.
Author(s)
Emanuele Bevacqua
References
Bevacqua, E., Maraun, D., Hobaek Haff, I., Widmann, M., and Vrac, M.: Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy), Hydrol. Earth Syst. Sci., 21, 2701-2723, https://doi.org/10.5194/hess-21-2701-2017, 2017. [link] [link]
Aas, K., Czado, C., Frigessi, A. and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44(2), 182-198, <doi:10.1016/j.insmatheco.2007.02.001>, 2009. [link]
Ulf Schepsmeier, Jakob Stoeber, Eike Christian Brechmann, Benedikt Graeler, Thomas Nagler and Tobias Erhardt (2017). VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.1. [link]
See Also
Examples
# Example 1: conditional sampling from a C-Vine
# Read data
data(dataset)
data <- dataset$data[1:400,1:4]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4")
## Not run:
# Select a vine and fit the copula families, specifying that there are 2 conditioning variables
RVM <- CDVineCondFit(data,Nx=2,type="CVine")
# Set the values of the conditioning variables as those used for the calibration.
# Order them with respect to RVM$Matrix, considering that is a C-Vine
d=dim(RVM$Matrix)[1]
cond1 <- data[,RVM$Matrix[(d+1)-1,(d+1)-1]]
cond2 <- data[,RVM$Matrix[(d+1)-2,(d+1)-2]]
condition <- cbind(cond1,cond2)
# Simulate the variables
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
# Example 2: conditional sampling from a D-Vine
# Read data
data(dataset)
data <- dataset$data[1:100,1:4]
# Define the variables Y and X. X are the conditioning variables,
# which have to be positioned in the last columns of the data.frame
colnames(data) <- c("Y1","Y2","X3","X4")
# Select a vine and fit the copula families, specifying that there are 2 conditioning variables
RVM <- CDVineCondFit(data,Nx=2,type="DVine")
summary(RVM) #It is a D-Vine.
# Set the values of the conditioning variables as those used for the calibration.
# Order them with respect to RVM$Matrix, considering that is a D-Vine.
cond1 <- data[,RVM$Matrix[1,1]]
cond2 <- data[,RVM$Matrix[2,2]]
condition <- cbind(cond1,cond2)
# Simulate the variables
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
# Example 3
# Read data
data(dataset)
data <- dataset$data[1:100,1:2]
colnames(data) <- c("Y1","X2")
# Fit copula
require(VineCopula)
BiCop <- BiCopSelect(data$Y1,data$X2)
BiCop
# Fix conditioning variable to low values and simulate
condition <- data$X2/10
Sim <- CDVineCondSim(BiCop,condition)
# Plot the simulated variables over the observed
Sim <- data.frame(Sim)
overplot(Sim,data)
## End(Not run)
Random dataset from a given vine copula model
Description
A random dataset simulated from a given 5-dimensional vine copula model.
Usage
dataset
Format
$data
An
1000 x 5
data set (formatdata.frame
) with the uniform variables(U1,U2,U3,U4,U5)
.$vine
RVineMatrix
object defyining the vine copula model from where$data
was sampled.
Author(s)
Emanuele Bevacqua
Examples
# Load data
data(dataset)
# Extract data
data <- dataset$data
plot(data)
# Extract the RVineMatrix object from where the dataset was randomly sampled
vine <- dataset$vine
vine$Matrix
vine$family
vine$par
vine$par2
summary(vine)
overplot
Description
This function overlays the scatterplot matrices of two multivariate datsets. Moreover, it shows the dependencies among all the pairs for both datsets.
Usage
overplot(data1, data2, col1 = "black", col2 = "grey", xlim = NA,
ylim = NA, labels = NA, method = "pearson", cex.cor = 1,
cex.labels = 1, cor.signif = 2, cex.axis = 1, pch1 = 1, pch2 = 1)
Arguments
data1 , data2 |
Two |
col1 , col2 |
Colors used for |
xlim , ylim |
Two bidimensional vectors indicating the limits of x and y axes for all the scatterplots. If not given, they are authomatically computed for each of the scatterplots. |
labels |
A character vector with the variable names to be printed (if not given, the names of |
method |
Character indicating the dependence types to be computed between the pairs. Possibilites: "kendall", "spearman" and "pearson" (default) |
cex.cor |
Number: character dimension of the printed dependencies. Default |
cex.labels |
Number: character dimension of the printed variable names. Default |
cor.signif |
Number: number of significant numbers of the printed dependencies. Default |
cex.axis |
Number: dimension of the axis numeric values. Default cex.axis=1. |
pch1 , pch2 |
Paramter to specify the symbols to use when plotting points of |
Value
A matrix of overlaying scatterplots of the multivariate datsets data1
and data2
, with
the dependencies of the pairs.
Author(s)
Emanuele Bevacqua
Examples
# Example 1
# Read and prepare the data for the plot
data(dataset)
data1 <- dataset$data[1:300,]
data2 <- dataset$data[301:600,]
overplot(data1,data2,xlim=c(0,1),ylim=c(0,1),method="kendall")
## Not run:
# Example 2
# Read and prepare the data for the plot
data(dataset)
data <- dataset$data[1:200,1:5]
colnames(data) <- c("Y1","Y2","X3","X4","X5")
# Fit copula families for the defined vine:
ListVines <- CDVineCondListMatrices(data,Nx=3)
Matrix=ListVines$CVine[[1]]
RVM <- CDVineCondFit(data,Nx=3,Matrix=Matrix)
# Simulate data:
d=dim(RVM$Matrix)[1]
cond1 <- data[,RVM$Matrix[(d+1)-1,(d+1)-1]]
cond2 <- data[,RVM$Matrix[(d+1)-2,(d+1)-2]]
cond3 <- data[,RVM$Matrix[(d+1)-3,(d+1)-3]]
condition <- cbind(cond1,cond2,cond3)
Sim <- CDVineCondSim(RVM,condition)
# Plot the simulated variables Sim over the observed
Sim <- data.frame(Sim)
overplot(data[,1:2],Sim[,1:2],xlim=c(0,1),ylim=c(0,1),method="spearman")
overplot(data,Sim,xlim=c(0,1),ylim=c(0,1),method="spearman")
## End(Not run)