% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/o_estimation2.R
\name{fitkienerX}
\alias{fitkienerX}
\alias{paramkienerX}
\alias{paramkienerX7}
\alias{paramkienerX5}
\title{Estimation and Regression Functions for Kiener Distributions}
\usage{
fitkienerX(X, algo = c("r", "reg", "e", "estim"), ord = 7, maxk = 10,
  mink = 1.53, maxe = 0.5, probak = pprobs2, dgts = NULL,
  exfitk = NULL, dimnames = FALSE, ncores = 1)

paramkienerX(X, algo = c("r", "reg", "e", "estim"), ord = 7, maxk = 10,
  mink = 1.53, maxe = 0.5, dgts = 3, parnames = TRUE,
  dimnames = FALSE, ncores = 1)

paramkienerX7(X, dgts = 3, parnames = TRUE, dimnames = FALSE,
  ncores = 1)

paramkienerX5(X, dgts = 3, parnames = TRUE, dimnames = FALSE,
  ncores = 1)
}
\arguments{
\item{X}{numeric. Vector, matrix, array or list of quantiles.}

\item{algo}{character. The algorithm used: \code{"r"} or \code{"reg"}
for regression (default) and \code{"e"} or \code{"estim"}
for quantile estimation.}

\item{ord}{integer. Option for probability selection and treatment.}

\item{maxk}{numeric. The maximum value of tail parameter \code{k}.}

\item{mink}{numeric. The minimum value of tail parameter \code{k}.}

\item{maxe}{numeric. The maximum value of absolute tail parameter \code{|e|}.}

\item{probak}{numeric. Ordered vector of probabilities.}

\item{dgts}{integer. The rounding of output parameters.}

\item{exfitk}{character. A vector of parameter names to subset the output.}

\item{dimnames}{logical. Display dimnames.}

\item{ncores}{integer. The number of cores for parallel processing of arrays.}

\item{parnames}{logical. Display parameter names.}
}
\value{
\code{paramkienerX}: a vector (or a matrix) of parameter estimates 
\code{c(m, g, a, k, w, d, e)}.

\code{fitkienerX}: a vector (or a matrix) made of several parts:
\itemize{
  \item{ \code{ret} : the return over the period calculated with \code{sum(x)}. 
         Log-returns are therefore assumed. } 
  \item{ \code{m, g, a, k, w, d, e} : the parameter estimates. } 
  \item{ \code{m1, sd, sk, ke} : the mean, standard deviation, 
         skewness and excess kurtosis computed from the parameter estimates. } 
  \item{ \code{m1x, sdx, skx, kex} : the mean, standard deviation,  
         skewness and excess kurtosis computed from the dataset. } 
  \item{ \code{lh} : the length of the dataset over the period. } 
  \item{ \code{q.} : quantile estimated with the parameter estimates. }
  \item{ \code{VaR.} : Value-at-Risk, positive in most cases. } 
  \item{ \code{c.} : corrective tail coefficient = (q - m) / (q_logistic_function - m). }
  \item{ \code{ltm.} : left tail mean (signed ES on the left tail, usually negative). } 
  \item{ \code{rtm.} : right tail mean (signed ES on the right tail, usually positive). }
  \item{ \code{dtmq.} : (p<=0.5 left, p>0.5 right) tail mean minus quantile. }
  \item{ \code{ES.} : expected shortfall, positive in most cases. }
  \item{ \code{h.} : corrective ES  = (ES - m) / (ES_logistic_function - m). }
  \item{ \code{desv.} : ES - VaR, usually positive. } 
  \item{ \code{l.} : quantile estimated by the tangent logistic function. }
  \item{ \code{dl.} : quantile - quantile_logistic_function. }
  \item{ \code{g.} : quantile estimated by the Laplace-Gauss function. } 
  \item{ \code{dg.} : quantile - quantile_Laplace_Gauss_function. }
}

IMPORTANT: if you need to subset \code{fitk}, always subset it by parameter name 
and never by rank number, as new items may be added in the future and ranks may change. 
Use for instance \code{\link{exfit0}}, ..., \code{\link{exfit7}}.
}
\description{
Several functions to estimate the parameters of asymmetric Kiener distributions 
and display the results in a numeric vector or in a matrix. 
Algorithm \code{"reg"} (the default) uses a nonlinear regression model; 
it is slow but accurate. Algorithm \code{"estim"} uses only 5 to 11 quantiles; 
it is very fast but less accurate.
}
\details{
The FatTailsR package currently uses two different algorithms to estimate the 
parameters of Kiener distributions K1, K2, K3 and K4.
\itemize{
  \item{Functions \code{fitkienerX(algo = "reg")}, \code{paramkienerX(algo = "reg")} 
     and \code{\link{regkienerLX}} use an unweighted  
     nonlinear regression from \code{logit(p)} to \code{X} over the whole dataset.  
     Depending on the size of the dataset, calculation can be slow but is usually
     accurate and describes very well the last 1-10 points in the tails 
     (except if there is a huge outlier). }
  \item{Functions \code{fitkienerX(algo = "estim")}, \code{paramkienerX(algo = "estim")}, 
     \code{paramkienerX5} and \code{paramkienerX7} estimate the parameters with 
     just 5 to 11 quantiles, 5 being the minimum. For averaging purposes, 
     11 quantiles are proposed (see below). Computation is almost instantaneous 
     and reasonably accurate. This is the recommended method for intensive computation.}
  }

A typical input is a numeric vector or a matrix that describes the returns of a stock. 
A matrix must be in the format DS with DATES as rownames, STOCKS as colnames and 
(log-)returns as the content of the matrix. 
An array must be in the format DSL with DATES as rownames, STOCKS as colnames, 
LAGS in the third dimension and (log-)returns as the content of the array. 
A list can be a list of numeric vectors but not a list of matrices, 
a list of data.frames or a list of arrays.

Conversion from a (possible) time series format to a sorted numeric vector 
is done automatically and without any check of the initial format. 
The empirical probabilities of the points in the sorted dataset are calculated 
with the function \code{\link{ppoints}} whose parameter \code{a} has been set to 
\code{a = 0} as large datasets are very common in finance. 
The lowest acceptable size of a dataset is not clear at this moment. A minimum 
of 11 points has been set in the \code{"reg"} algorithm and a minimum of 15 points 
has been set in the \code{"estim"} algorithm. This might change in the future. 
If possible, use at least 21 points. 
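
As an illustration only (the sample values below are hypothetical and this is 
not the package's internal code), the empirical probabilities and their logit 
can be reproduced with base R:
\preformatted{
x  <- sort(c(-0.034, -0.001, 0.005, 0.012, 0.021))  # hypothetical sorted returns
n  <- length(x)
p  <- ppoints(n, a = 0)         # plotting positions with a = 0
all.equal(p, (1:n) / (n + 1))   # a = 0 reduces to i / (n + 1)
lp <- qlogis(p)                 # logit(p) = log(p / (1 - p))
}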

Parameter \code{algo} controls the algorithm used. The default is \code{"reg"}.

When \code{algo = "reg"} (or \code{algo = "r"}), a nonlinear regression is performed 
with \code{\link[minpack.lm]{nlsLM}} from the logit of the empirical probabilities 
\code{logit(p)} over the quantiles X with the function \code{\link{qlkiener4}}. 
The maximum value of the tail parameter \code{k} is controlled by \code{maxk}.
An upper value \code{maxk = 10} is appropriate for datasets
of low and medium size, up to about 20,000 to 50,000 points. For larger datasets, the
upper limit can be extended up to \code{maxk = 20}. When this limit is reached, 
the shape of the distribution is very similar to the logistic distribution 
(at least when \code{e = 0}) and the use of this distribution should be considered. 
Remember that a value \code{k < 2} describes a distribution with no stable variance and 
a value \code{k < 1} describes a distribution with no stable mean.

When \code{algo = "estim"} (or \code{algo = "e"}),
5 to 11 quantiles are used to estimate the parameters. 
The minimum is 5 quantiles: the median x.50, two quantiles at medium distance 
from the median, usually x.25 and x.75, and two quantiles located close to the extremes 
of the dataset, for instance x.01 and x.99 if the dataset \code{X} has more 
than 100 points, x.0001 and x.9999 if the dataset \code{X} has more than 
10,000 points, and so on for larger datasets. 
These quantiles are extracted with function \code{\link{fiveprobs}}. 
Small datasets must contain at least 15 different points. 

With the idea of averaging the results (but without any guarantee of better 
estimates), calculation has been extended to 11 probabilities  
extracted from \code{X} with the function \code{\link{elevenprobs}} where    
p1, p2 and p3 are the most extreme probabilities of the dataset \code{X}  
with values finishing either by \code{.x01} or \code{.x025} or \code{.x05}:
\itemize{
  \item{\code{p11 = c(p1, p2, p3, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p3, 1-p2, 1-p1)}}
}

Selection of subsets among these 11 probabilities is controlled with the option 
\code{ord} which can take 12 different values.  
For instance, the default \code{ord = 7} computes the  parameters at probabilities 
\code{c(p1, 0.25, 0.50, 0.75, 1-p1)} and \code{c(p2, 0.25, 0.50, 0.75, 1-p2)}.
Parameters \code{d} and \code{k} are averaged first and the results of these 
averages are used to compute the other parameters \code{g, a, w, e}. 
Small datasets should consider \code{ord = 5} and 
large datasets can consider \code{ord = 12}. 
The 12 possible values of \code{ord} are: 
\enumerate{
  \item{ \code{c(p1, 0.35, 0.50, 0.65, 1-p1)}}
  \item{ \code{c(p2, 0.35, 0.50, 0.65, 1-p2)}}
  \item{ \code{c(p1, p2, 0.35, 0.50, 0.65, 1-p2, 1-p1)}}
  \item{ \code{c(p1, p2, p3, 0.35, 0.50, 0.65, 1-p3, 1-p2, 1-p1)}}
  \item{ \code{c(p1, 0.25, 0.50, 0.75, 1-p1)}}
  \item{ \code{c(p2, 0.25, 0.50, 0.75, 1-p2)}}
  \item{ \code{c(p1, p2, 0.25, 0.50, 0.75, 1-p2, 1-p1)}}
  \item{ \code{c(p1, p2, p3, 0.25, 0.50, 0.75, 1-p3, 1-p2, 1-p1)}}
  \item{ \code{c(p1, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p1)}}
  \item{ \code{c(p2, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p2)}}
  \item{ \code{c(p1, p2, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p2, 1-p1)}}
  \item{ \code{c(p1, p2, p3, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p3, 1-p2, 1-p1)}}
}
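
As a sketch of how these subsets relate to the 11 probabilities (the values 
of \code{p1}, \code{p2} and \code{p3} below are hypothetical; in the package 
they depend on the size of \code{X}), the set used by \code{ord = 7} can be 
rebuilt by indexing \code{p11}:
\preformatted{
p1 <- 0.01; p2 <- 0.025; p3 <- 0.05   # hypothetical extreme probabilities
p11  <- c(p1, p2, p3, 0.25, 0.35, 0.50, 0.65, 0.75, 1-p3, 1-p2, 1-p1)
ord7 <- p11[c(1, 2, 4, 6, 8, 10, 11)] # p1, p2, 0.25, 0.50, 0.75, 1-p2, 1-p1
ord7
}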

\code{paramkienerX5} is a simplified version of \code{paramkienerX} with  
predefined values \code{algo = "estim"}, \code{ord = 5}, \code{maxk = 10} 
and direct access to internal subfunctions. 
It uses the following probabilities:
\itemize{
  \item{ \code{p5 = c(p1, 0.25, 0.50, 0.75, 1-p1)} }
}

\code{paramkienerX7} is a simplified version of \code{paramkienerX} with 
predefined values \code{algo = "estim"}, \code{ord = 7}, \code{maxk = 10} 
and direct access to internal subfunctions.
It uses the following probabilities:
\itemize{
  \item{ \code{p7 = c(p1, p2, 0.25, 0.50, 0.75, 1-p2, 1-p1)} }
}

The quantiles corresponding to the above probabilities are then extracted 
with the function \code{\link{quantile}} whose parameter \code{type} 
has been set to \code{type = 6} as it returns the closest values 
to the true quantiles (according to our experience) for all \code{k > 1.9}. 
(Note: when \code{k < 1.5}, algorithm \code{algo = "reg"} returns better  
results). 
Both probabilities and quantiles are then transferred to \code{\link{estimkiener11}} 
for calculation.
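
For illustration, the quantile extraction step can be sketched with base R 
alone. The sample and the probability \code{p1 = 0.01} below are hypothetical 
stand-ins, not values chosen by the package:
\preformatted{
set.seed(1)
X  <- rlogis(500)                        # hypothetical sample standing in for returns
p5 <- c(0.01, 0.25, 0.50, 0.75, 0.99)    # five probabilities, assuming p1 = 0.01
q5 <- quantile(X, probs = p5, type = 6, names = FALSE)
q5
}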
 
\code{probak} controls the probabilities at which the model is tested with the parameter 
estimates. \code{fitkienerX} and \code{\link{regkienerLX}} share the same subroutines.
The default for \code{fitkienerX} and \code{regkienerLX} is 
\code{pprobs2 = c(0.01, 0.025, 0.05, 0.95, 0.975, 0.99)} as those values 
are usual in finance. Other sets of values are provided at \code{\link{pprobs0}}.

Rounding the results is useful to display nice results, especially 
in a matrix or in a data.frame. \code{dgts = 13} is recommended 
as \code{a}, \code{k}, \code{w} are usually significant at 1 digit.
\itemize{
  \item{ \code{dgts = NULL} does not perform any rounding. }
  \item{ \code{dgts = 0 to 9} rounds all parameters at the same level. }
  \item{ \code{dgts = 10 to 27} rounds the parameters at various levels for nice display.  
         See \code{\link{roundcoefk}} for the details. (Note: the
         rounding \code{10 to 27} currently works with \code{paramkienerX}, \code{paramkienerX5},  
         \code{paramkienerX7} but not yet with \code{fitkienerX}). }
} 

Extracting the most useful parameters from the (quite long) vector/matrix 
\code{fitk} is controlled by parameter \code{exfitk} that calls user-defined or
predefined parameter subsets like \code{\link{exfit0}}, ..., \code{\link{exfit7}}.
IMPORTANT: never subset \code{fitk} by rank number as new items may be added 
in the future and rank may vary.

Calculation of vectors, matrices and lists is not parallelized. Parallelization 
of code for arrays was introduced in version 1.5-0 and improved in version 1.5-1. 
\code{ncores} controls the number of cores allowed to the process (through 
\code{\link[parallel]{parApply}} which runs on Unix-like systems and Windows and requires
about 2 seconds to start). \code{ncores = 1} means no parallelization. 
\code{ncores = 0} is the recommended option. It uses the maximum number of cores 
available on the computer, as detected by \code{\link[parallel]{detectCores}},  
minus 1 core, which gives the best performance in most cases. 
Although appealing, this automatic selection can sometimes be dangerous. For instance, 
the instruction \code{f(X, ncores_max) - f(X, ncores_max)}, a nice way to compute 
an array of 0, will request \code{2 * ncores_max} cores and crash R. \code{ncores = 2,..,99} 
sets the number of cores manually. If the requested value is larger than the maximum 
number of cores, this value is automatically reduced (with a warning) to this maximum.
Hence, this latter option can provide one core more than option \code{ncores = 0}.
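
As a rough sketch of the rule described above (an assumption about the 
behaviour, not the package's internal code), the number of cores selected by 
\code{ncores = 0} can be approximated with the \pkg{parallel} package:
\preformatted{
library(parallel)
nmax   <- detectCores()                      # may be NA on some platforms
ncores <- max(1L, nmax - 1L, na.rm = TRUE)   # all cores but one, at least 1
ncores
}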

NOTE: \code{fitkienerLX}, \code{regkienerX}, \code{estimkiener(X,5,7)} were   
introduced in v1.2-0 and replaced in v1.4-1 by \code{fitkienerX} and 
\code{paramkiener(X,5,7)} to accommodate vectors, matrices, arrays and lists. 
We apologize to early users who need to rewrite their code.
}
\examples{
    

require(minpack.lm)
require(timeSeries)

### Load the datasets and choose j in 1:16
DS     <- getDSdata()
j      <- 5

### and run this block
probak <- c(0.01, 0.05, 0.95, 0.99)
X      <- DS[[j]] ; names(DS)[j]
elevenprobs(X)
fitkienerX(X, algo = "reg", dgts = 3, probak = probak)
fitkienerX(X, algo = "estim", ord = 5, probak = probak, dgts = 3)
paramkienerX(X)
paramkienerX5(X)

### Compare the 12 values of ord with algo = "estim" (rows 1 to 12) and algo = "reg" (row 13)
compare <- function(ord, X) { paramkienerX(X, algo = "estim", ord = ord, dgts = 13) }
rbind(t(sapply( 1:12, compare, X)), paramkienerX(X, algo = "reg", dgts = 13))

### Analyze DS in one step
t(sapply(DS, paramkienerX, algo = "reg", dgts = 13))
t(sapply(DS, paramkienerX, algo = "estim", dgts = 13))
paramkienerX(DS, algo = "reg", dgts = 13)
paramkienerX(DS, algo = "estim", dgts = 13)
system.time(fitk_rDS <- fitkienerX(DS, algo = "r", probak = pprobs2, dgts = 3))
system.time(fitk_eDS <- fitkienerX(DS, algo = "e", probak = pprobs2, dgts = 3))
fitk_rDS
fitk_eDS

### Subset rDS and eDS with exfit0,..,exfit7
fitk_rDS[,exfit4]
fitk_eDS[,exfit7]
fitkienerX(DS, algo = "e", probak = pprobs2, dgts = 3, exfitk = exfit7)
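
### Illustration on a hypothetical named vector (not package output):
### subsetting by name stays valid when a new item is inserted
v  <- c(m = 0.1, g = 1.2, a = 3.4, k = 3.6)
v2 <- c(ret = 0.5, v)                        # a new item shifts every rank by one
identical(v[c("m", "k")], v2[c("m", "k")])   # by name: same elements
identical(v[c(1, 4)], v2[c(1, 4)])           # by rank: different elements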

### Array (new example introduced in v1.5-1)
### Increase the number of cores and crash R.
## Not run:
arr <- array(rkiener1(3000), c(4,3,250))
paramkienerX7(arr, ncores = 2)
## paramkienerX7(arr, ncores = 2) - paramkienerX(arr, ncores = 2)
## End(Not run)

### End


}
\references{
P. Kiener, Fat tail analysis and package FatTailsR, 
9th R/Rmetrics Workshop and Summer School, Zurich, 27 June 2015. 
\url{http://www.inmodelia.com/exemples/2015-0627-Rmetrics-Kiener-en.pdf}
}
\seealso{
\code{\link{regkienerLX}}, \code{\link{estimkiener11}}, 
           \code{\link{roundcoefk}}, \code{\link{exfit6}}.
}
