Help for package Weighted.Desc.Stat

Type:

Package

Title:

Weighted Descriptive Statistics

Version:

1.0

Date:

2016-02-26

Author:

Abbas Parchami (Department of Statistics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran)

Maintainer:

Abbas Parchami <parchami@uk.ac.ir>

Description:

Weighted descriptive statistics is the discipline of quantitatively describing the main features of real-valued fuzzy data, which usually come from a fuzzy population. This package summarizes this special kind of fuzzy data numerically or graphically. Numerical summaries of one or several sets of real-valued fuzzy data are provided by weighted statistics such as the mean, variance, covariance and correlation coefficient. Graphical summaries are provided by the weighted histogram and the weighted scatter plot.

License:

LGPL (≥ 3)

NeedsCompilation:

Packaged:

2016-02-29 16:50:57 UTC; Admin

Repository:

CRAN

Date/Publication:

2016-02-29 23:55:53

Weighted Descriptive Statistics

Description

Weighted Descriptive Statistics is an open source (LGPL 3) package for R which provides descriptive statistical methods for weighted data. Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data or weighted data. Weighted descriptive statistics is the discipline of quantitatively describing the main features of real-valued fuzzy data, which usually come from a fuzzy population.

Details

The weighted descriptive statistics provide a concise summary of a set of real data x=(x_1, x_2, \cdots , x_n) on the basis of the weight vector \mu=(\mu_1, \mu_2, \cdots , \mu_n). With the Weighted.Desc.Stat package, one can summarize real-valued fuzzy data numerically or graphically. Numerical summaries are computed by the weighted statistics in this package (such as the mean, variance, covariance and correlation coefficient), which summarize and interpret properties of one or several sets of real-valued fuzzy data (real-valued fuzzy samples). Graphical summaries can be drawn with the weighted histogram and the weighted scatter plot.
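As a small illustration of the notation above, the following sketch checks that when every weight \mu_i equals 1, the weighted mean reduces to the classical mean and the weighted variance to the classical variance with divisor n. The helpers wmean and wvar are hypothetical stand-ins written from the formulas in this manual, so the snippet runs without the package installed.

```r
# Hypothetical stand-ins written from the formulas in this manual
# (assumed names, not the package's own functions).
wmean <- function(x, mu) sum(mu * x) / sum(mu)
wvar  <- function(x, mu) sum(mu * (x - wmean(x, mu))^2) / sum(mu)

x  <- c(2, 4, 4, 5, 7)
mu <- rep(1, length(x))   # crisp sample: every membership degree is 1

wmean(x, mu)              # same value as mean(x)
wvar(x, mu)               # same value as mean((x - mean(x))^2)
```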

Author(s)

Abbas Parchami

Maintainer: Abbas Parchami <parchami@uk.ac.ir>

Examples

## Weighted statistics for one variable (property):
x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.mean(x, mu)
w.sd(x, mu)
w.var(x, mu)
w.ad(x, mu)
w.cv(x, mu)
w.skewness(x, mu)
w.kurtosis(x, mu)

## Weighted covariance, weighted correlation coefficient and weighted scatter
## plot for two variables (properties):
n=50
x =  rnorm(n,0,1)
y =  rnorm(n,0,1)
mu =  runif(n,0,1)
w.cov(x, y, mu)
w.r(x, y, mu)
w.plot(x, y, mu, 0.3, lwd=2)

## Weighted histogram for one variable (property):
n = 5000
x = rnorm(n,17,1)
x[x<14 | x>20] = NA
range(x, na.rm=TRUE)
mu = runif(n,0,1)
bre = seq(from=14,to=20,len=18)
cu = seq(from=0,to=1,len=10)
w.hist(x, mu, breaks=bre, cuts=cu, ylim=c(0,n/7), lwd = 2)

weighted absolute deviation

Description

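The weighted absolute deviation of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is

ad = \frac{\sum_{i=1}^{n} \mu_i |x_i - \bar{x}|}{\sum_{i=1}^{n} \mu_i},

where the weighted mean is

\bar{x} = \frac{\sum_{i=1}^{n} \mu_i x_i}{\sum_{i=1}^{n} \mu_i}.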

Usage

w.ad(x, mu)

Arguments

x

A numeric vector of data whose weighted absolute deviation is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted absolute deviation of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.ad(x, mu)

## The function is currently defined as
function(x, mu)  sum( mu*abs(x- w.mean(x,mu) ) ) / sum(mu)
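The definition above can be traced step by step. The sketch below recomputes the weighted absolute deviation by hand from the example data and then checks that an observation with weight 0 has no influence on the result. The variable names are illustrative only.

```r
x  <- 1:10
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

xbar <- sum(mu * x) / sum(mu)      # weighted mean
dev  <- abs(x - xbar)              # absolute deviations from the weighted mean
ad   <- sum(mu * dev) / sum(mu)    # weighted average of the deviations
ad

# Dropping the zero-weight observation (x[9]) leaves the value unchanged
keep  <- mu > 0
ad_nz <- sum(mu[keep] * abs(x[keep] - xbar)) / sum(mu[keep])
all.equal(ad, ad_nz)
```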

weighted covariance

Description

Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data. The w.cov function computes the weighted covariance between two data vectors x_1, \cdots , x_n and y_1, \cdots , y_n, based on the real-valued fuzzy data \{ (x_1, \mu_1), \cdots , (x_n, \mu_n) \} and \{ (y_1, \mu_1), \cdots , (y_n, \mu_n) \}, which share the weight vector \mu, i.e.

s_{xy} = \frac{1}{\sum_{i=1}^{n} \mu_i} \sum_{i=1}^{n} \mu_i (x_i-\bar{x})( y_i -\bar{y}).

Usage

w.cov(x, y, mu)

Arguments

x, y

Two numeric vectors of data whose weighted covariance is to be computed.

mu

A vector of weights. Its length must equal the length of x and y, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted covariance between x and y with weights mu.

Warning

The length of x, y and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
y <- c(10:1)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.cov(x, y, mu)

## The function is currently defined as
function(x, y, mu)  (sum(mu*x*y)/sum(mu)) - (w.mean(x,mu) * w.mean(y,mu))
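The formula for s_{xy} in the Description is written with centered products, while the definition above uses the shortcut "weighted mean of x*y minus the product of the weighted means". The two agree algebraically; the following sketch verifies this numerically on the example data, using illustrative variable names.

```r
x  <- 1:10
y  <- 10:1
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

xbar <- sum(mu * x) / sum(mu)
ybar <- sum(mu * y) / sum(mu)

# Definitional form: weighted average of centered products
cov_def   <- sum(mu * (x - xbar) * (y - ybar)) / sum(mu)
# Shortcut form used in the function definition
cov_short <- sum(mu * x * y) / sum(mu) - xbar * ybar

all.equal(cov_def, cov_short)
```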

weighted coefficient of variation

Description

Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data. The w.cv function computes the weighted coefficient of variation of x_1, \cdots , x_n based on the real-valued fuzzy data \{ (x_1, \mu_1), \cdots , (x_n, \mu_n) \}. In other words, the weighted coefficient of variation is equal to the weighted standard deviation divided by the weighted mean.

Usage

w.cv(x, mu)

Arguments

x

A numeric vector of data whose weighted coefficient of variation is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted coefficient of variation of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.cv(x, mu)

## The function is currently defined as
function(x, mu)  w.sd(x,mu) / w.mean(x,mu)
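Because the weighted standard deviation and the weighted mean are both multiplied by the same factor when the data are rescaled by a positive constant, their ratio does not change. The sketch below checks this with hypothetical helpers (wmean, wsd, wcv) written from the formulas in this manual rather than taken from the package.

```r
# Hypothetical helpers (assumed names) following the manual's formulas
wmean <- function(x, mu) sum(mu * x) / sum(mu)
wsd   <- function(x, mu) sqrt(sum(mu * (x - wmean(x, mu))^2) / sum(mu))
wcv   <- function(x, mu) wsd(x, mu) / wmean(x, mu)

x  <- 1:10
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

cv_original <- wcv(x, mu)
cv_rescaled <- wcv(3 * x, mu)     # same data measured in a different unit
all.equal(cv_original, cv_rescaled)
```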

weighted histogram

Description

Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data. This function draws the weighted histogram of a data vector with a given weight vector. The weighted histogram consists of several classical histograms drawn on one two-dimensional surface. Each classical histogram is drawn only for the elements of the real-valued fuzzy data set whose weights are greater than or equal to a cut point in (0,1].

Usage

w.hist(x, mu, breaks, cuts, ylim = NULL, freq = NULL, lwd = NULL)

Arguments

x

A numeric vector of data for which the weighted histogram is to be drawn.

mu

A vector of weights of the real-valued fuzzy data. Its length must equal the length of x, and each element must lie in the interval [0,1].

breaks

a vector giving the breakpoints between the weighted histogram cells.

cuts

a vector giving the cut points that determine the classical histograms in the weighted histogram.

freq

logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified).

ylim

numeric vector of length 2 giving the y limits for the plot.

lwd

The line width, a positive number, defaulting to 1. The interpretation is device-specific, and some devices do not implement line widths less than one.

Details

The arguments of the weighted histogram are analogous to those of the usual histogram, which are detailed in the function "hist" from the "graphics" package.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Department of Statistics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran

Examples

n = 5000
x = rnorm(n,17,1)
x[x<14 | x>20] = NA
range(x, na.rm=TRUE)
mu = runif(n,0,1)
bre = seq(from=14,to=20,len=18)
cu = seq(from=0,to=1,len=10)
w.hist(x, mu, breaks=bre, cuts=cu, ylim=c(0,n/7), lwd = 2)

## The function is currently defined as
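function(x, mu, breaks, cuts, ylim = NULL, freq = NULL, lwd = NULL)
{
  # Gray levels from gray10 to gray100, one per interval between cuts
  Gray = paste("gray", round(seq(from=10, to=100, len=length(cuts)-1)), sep="")
  # Base histogram of all the data
  hist(x, col=Gray[1], xlim=range(breaks), ylim=ylim, breaks=breaks, freq=freq, lwd=lwd)
  i=2
  while(i<=length(cuts))
  {
    # Set to NA every observation whose weight is below cuts[i]
    # (values of x equal to 0 are also set to NA by this test)
    X=x
    X[(X*(mu>=cuts[i]))==0]=NA
    hist(X, col=Gray[i], xlim=range(breaks), ylim=ylim, breaks=breaks, freq=freq, lwd=lwd, add=TRUE)
    i=i+1
  }
}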
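To see what each overlaid histogram contains, the following sketch counts how many observations survive each cut point, using the same test mu >= cut as the definition above. It draws nothing; the seed, sample size and number of cuts are arbitrary choices made for illustration.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n    <- 200
x    <- rnorm(n, 17, 1)
mu   <- runif(n, 0, 1)
cuts <- seq(from = 0, to = 1, length.out = 5)

# Number of observations whose weight is at least each cut point
kept <- sapply(cuts, function(cut) sum(mu >= cut))
data.frame(cut = cuts, observations = kept)
```

The counts can only decrease as the cut point grows, which is why the darker, higher-cut histograms sit inside the lighter ones.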

weighted coefficient of kurtosis

Description

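The weighted coefficient of kurtosis of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is

k = \frac{\frac{1}{\sum_{i=1}^{n} \mu_i} \sum_{i=1}^{n} \mu_i \left[ x_i - \bar{x} \right]^4}{s^4} - 3,

where \bar{x} is the weighted mean and s is the weighted standard deviation.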

Usage

w.kurtosis(x, mu)

Arguments

x

A numeric vector of data whose weighted coefficient of kurtosis is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted coefficient of kurtosis of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.kurtosis(x, mu)

## The function is currently defined as
function(x, mu)  (( sum( mu*(x-w.mean(x,mu))^4 ) / sum(mu) ) / w.sd(x,mu)^4)-3

weighted mean

Description

The weighted mean of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is

\bar{x} = \frac{\sum_{i=1}^{n} \mu_i x_i}{\sum_{i=1}^{n} \mu_i}.

Usage

w.mean(x, mu)

Arguments

x

A numeric vector of data whose weighted mean is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted mean of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.mean(x, mu)

## The function is currently defined as
function(x, mu)  sum(mu*x) / sum(mu)
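The arithmetic behind the example can be followed by hand. The sketch below computes the numerator and denominator of the weighted mean separately and compares the result with weighted.mean from base R's stats package, which is used here only as an independent cross-check.

```r
x  <- 1:10
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

num <- sum(mu * x)    # 0.9*1 + 0.7*2 + ... + 0.1*10 = 17.7
den <- sum(mu)        # 4.7
num / den             # about 3.766

# Cross-check with base R
all.equal(num / den, weighted.mean(x, mu))
```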

weighted scatter plot

Description

Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data. The weighted scatter plot, or weighted scatterplot, is a diagram that displays the values of two real-valued fuzzy data sets (from two variables of fuzzy populations) together with their weight vector. The data are displayed as a collection of circles: the center of each circle is positioned on the horizontal axis by the value of one variable and on the vertical axis by the value of the other variable. The radius of each circle is equal (or proportional) to the weight of the corresponding two-dimensional element (the center of the circle).

Usage

w.plot(x, y, mu, coef.radii, xlim = NULL, ylim = NULL, lwd = NULL, add = NULL, ...)

Arguments

x, y

Two numeric vectors of data for which the weighted scatter plot is to be drawn.

mu

A vector of weights. Its length must equal the length of x and y, and each element must lie in the interval [0,1].

coef.radii

a positive number giving the coefficient of the radii of the circles, i.e. radius[i] = mu[i] * coef.radii.

xlim

numeric vector of length 2 giving the x limits for the plot. Unused if add = TRUE.

ylim

numeric vector of length 2 giving the y limits for the plot. Unused if add = TRUE.

lwd

The line width, a positive number, defaulting to 1. The interpretation is device-specific, and some devices do not implement line widths less than one.

add

if add is TRUE, the new plot is added to an existing plot, otherwise a new plot is created.

...

Arguments to be passed to methods, such as graphical parameters.

Warning

The length of x, y and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Department of Statistics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran

Examples

n=50
x =  rnorm(n,0,1)
y =  rchisq(n,3)
mu =  runif(n,0,1)
w.plot(x, y, mu, 0.3, lwd=3)

## The function is currently defined as
function(x, y, mu, coef.radii, xlim = NULL, ylim = NULL, lwd = NULL, add = NULL, ...)
	{
		symbols(x, y, mu * coef.radii, inches = FALSE, xlim=xlim, ylim=ylim, lwd=lwd)
	}
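The relation radius[i] = mu[i] * coef.radii can be made explicit by computing the radii first and passing them to symbols from the graphics package, which is the call used in the definition above. The following is a minimal sketch with an arbitrary seed and sample size; the variable radii is introduced here for illustration.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n  <- 20
x  <- rnorm(n, 0, 1)
y  <- rnorm(n, 0, 1)
mu <- runif(n, 0, 1)

coef.radii <- 0.3
radii <- mu * coef.radii         # one radius per point, in axis units

# inches = FALSE keeps the radii in the units of the x axis
symbols(x, y, circles = radii, inches = FALSE, lwd = 2)
```

Since every weight lies in [0,1], no circle has a radius larger than coef.radii.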

weighted Pearson's correlation coefficient

Description

Assume that x=(x_1, x_2, \cdots , x_n) is the observed value of a random sample from a fuzzy population. In a classical random sample, the degree of membership of x_i in the random sample is equal to 1, for 1 \leq i \leq n. For a fuzzy population, we denote the degree of membership of x_i in the fuzzy population (or in the observed value of the random sample) by \mu_i, which is a real number in [0,1]. In such situations it is more appropriate to write the observed value of the random sample as \{ (x_1, \mu_1), (x_2, \mu_2), \cdots , (x_n, \mu_n) \}, which we call real-valued fuzzy data. The w.r function computes the weighted Pearson's correlation coefficient between two data vectors x_1, \cdots , x_n and y_1, \cdots , y_n, based on the real-valued fuzzy data \{ (x_1, \mu_1), \cdots , (x_n, \mu_n) \} and \{ (y_1, \mu_1), \cdots , (y_n, \mu_n) \}, by the formula r = \frac{s_{xy}}{s_x s_y}, where s_{xy} is the weighted covariance and s_x, s_y are the weighted standard deviations.

Usage

w.r(x, y, mu)

Arguments

x, y

Two numeric vectors of data whose weighted Pearson's correlation coefficient is to be computed.

mu

A vector of weights. Its length must equal the length of x and y, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted correlation coefficient between x and y with weights mu.

Warning

The length of x, y and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
y <- c(2, 7, 0.8, -1, 3, 4, 8, 13, 0, 12)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.r(x, y, mu)

## The function is currently defined as
function(x, y, mu)  w.cov(x,y,mu) / (w.sd(x,mu) * w.sd(y,mu))
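As a sanity check on the formula r = s_{xy} / (s_x s_y), a perfectly decreasing linear relation should give a weighted correlation of -1 whatever the weights are. The sketch below verifies this with hypothetical helpers (wmean, wsd, wcov, wr) written from the definitions in this manual, not taken from the package.

```r
# Hypothetical helpers (assumed names) following the manual's definitions
wmean <- function(x, mu) sum(mu * x) / sum(mu)
wsd   <- function(x, mu) sqrt(sum(mu * x * x) / sum(mu) - wmean(x, mu)^2)
wcov  <- function(x, y, mu) sum(mu * x * y) / sum(mu) - wmean(x, mu) * wmean(y, mu)
wr    <- function(x, y, mu) wcov(x, y, mu) / (wsd(x, mu) * wsd(y, mu))

x  <- 1:10
y  <- 11 - x                     # exact linear relation with negative slope
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

r <- wr(x, y, mu)
all.equal(r, -1)                 # equal to -1 up to floating-point error
```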

weighted standard deviation

Description

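The weighted standard deviation of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is the square root of the weighted variance s^2:

s = \sqrt{s^2}.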

Usage

w.sd(x, mu)

Arguments

x

A numeric vector of data whose weighted standard deviation is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted standard deviation of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.sd(x, mu)

## The function is currently defined as
function(x, mu)  ( (sum(mu*x*x)/sum(mu)) - w.mean(x,mu)^2 )^.5

weighted coefficient of skewness

Description

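The weighted coefficient of skewness of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is

g = \frac{\frac{1}{\sum_{i=1}^{n} \mu_i} \sum_{i=1}^{n} \mu_i \left[ x_i - \bar{x} \right]^3}{s^3},

where \bar{x} is the weighted mean and s is the weighted standard deviation.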

Usage

w.skewness(x, mu)

Arguments

x

A numeric vector of data whose weighted coefficient of skewness is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted coefficient of skewness of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.skewness(x, mu)

## The function is currently defined as
function(x, mu)  ( sum( mu*(x-w.mean(x,mu))^3 ) / sum(mu) ) / w.sd(x,mu)^3
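A quick way to check the skewness formula is to apply it to data that are symmetric around their weighted mean with symmetric weights: the third central moment cancels and the coefficient is zero. The sketch below does this by hand. The data and weights are made up for illustration.

```r
x  <- -2:2
mu <- c(0.2, 0.5, 1.0, 0.5, 0.2)   # weights symmetric around x = 0

xbar <- sum(mu * x) / sum(mu)                  # weighted mean
s    <- sqrt(sum(mu * (x - xbar)^2) / sum(mu)) # weighted standard deviation
m3   <- sum(mu * (x - xbar)^3) / sum(mu)       # weighted third central moment
g    <- m3 / s^3

abs(g) < 1e-12                     # zero up to floating-point error
```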

weighted variance

Description

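The weighted variance of x_1, \cdots , x_n with weights \mu_1, \cdots , \mu_n is

s^2 = \frac{1}{\sum_{i=1}^{n} \mu_i} \sum_{i=1}^{n} \mu_i \left[ x_i - \bar{x} \right]^2 ,

where \bar{x} is the weighted mean.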

Usage

w.var(x, mu)

Arguments

x

A numeric vector of data whose weighted variance is to be computed.

mu

A vector of weights. Its length must equal the length of x, and each element must lie in the interval [0,1].

Value

A numeric vector of length one: the weighted variance of x with weights mu.

Warning

The length of x and mu must be equal. Also, each element of mu must be in the interval [0,1].

Author(s)

Abbas Parchami

Examples

x <- c(1:10)
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)
w.var(x, mu)

## The function is currently defined as
function(x, mu)  (sum(mu*x*x)/sum(mu)) - w.mean(x,mu)^2
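The Description defines s^2 through squared deviations from the weighted mean, whereas the definition above uses the shortcut "weighted mean of x^2 minus the squared weighted mean". The sketch below confirms numerically, on the example data, that the two forms agree. The variable names are illustrative.

```r
x  <- 1:10
mu <- c(0.9, 0.7, 0.8, 0.7, 0.6, 0.4, 0.2, 0.3, 0.0, 0.1)

xbar <- sum(mu * x) / sum(mu)

# Definitional form: weighted average of squared deviations
var_def   <- sum(mu * (x - xbar)^2) / sum(mu)
# Shortcut form used in the function definition
var_short <- sum(mu * x * x) / sum(mu) - xbar^2

all.equal(var_def, var_short)
sqrt(var_def)                      # the weighted standard deviation s
```

The definitional form is usually the numerically safer of the two when the mean is large relative to the spread, because the shortcut subtracts two nearly equal quantities.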