Title: | A Nonparametric Doubly Robust Test for Continuous Treatment Effect |
Version: | 0.1 |
Description: | Implement the statistical test proposed in Weng et al. (2021) to test whether the average treatment effect curve is constant and whether a discrete covariate is a significant effect modifier. |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.1.2 |
Imports: | KernSmooth, stats, SuperLearner |
NeedsCompilation: | no |
Packaged: | 2021-09-26 17:32:15 UTC; wgw |
Author: | Guangwei Weng [aut, cre] |
Maintainer: | Guangwei Weng <wengx076@umn.edu> |
Repository: | CRAN |
Date/Publication: | 2021-09-28 09:10:06 UTC |
The function for performing tests of average treatment effects with user specified nuisance functions
Description
This is the function for testing average treatment effects with user specified nuisance functions.
Usage
drdrtest(
y,
a,
l,
arange,
pifunc,
mufunc,
h = NULL,
b = 1000,
dist = "TwoPoint",
pi.low = 0.01,
a.grid.size = 401
)
Arguments
y |
A vector containing the outcomes for each observation |
a |
A vector containing the treatment levels (dosage) for each observation |
l |
A data.frame containing the observations of covariates |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
pifunc |
A user specifid function or wapper that takes treatment a as the first argument and covariates l as the second argument and return propensit scores |
mufunc |
A user specifid function or wapper that takes treatment a as the first argument and covariates l as the second argument and return outcome regression values |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
pi.low |
Lower bound to truncate propensity scores |
a.grid.size |
size of equally spaced grid points over |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- loc.fit:
A list containg evalution points of average treatment effect and the corresponding values
- bandwidth:
Bandwidth used in kernel regression
Examples
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1))+triangle(a-2.5,delta,height)+a*(-0.1*l[,1]+0.1*l[,3])
return(mu)
}
triangle <- function(a,delta,height){
y <- exp(-a^2/((delta/2)^2))*height
return(y)
}
set.seed(2000)
n <- 500
d <- 4
sigma <- 0.05
delta <- 1
height <- 0
arange<-c(0.01,4.99)
l <- matrix(rnorm(n*d),ncol=d)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0.2))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd=sigma)
y <- mu+residual.list
## We use the oracal propensity score and outcome regression for illustration
pifunc <- function(a,l){
l <- as.matrix(l)
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0.2))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
return(dbeta(a/5,shape1=lambda,shape2 = 1-lambda)/5)
}
mufunc <- function(a,l){
l <- as.matrix(l)
return(mu.mod(a,l,delta,height))
}
out <- drdrtest(y,a,data.frame(l),arange,pifunc,mufunc)
The base function for performing tests of average treatment effects
Description
This is the base function for testing average treatment effects. Users can use specify the nuisance function values by themselves.
Usage
drdrtest.base(
y,
a,
pi,
varpi,
mu,
ma,
arange,
h = NULL,
b = 1000,
dist = "TwoPoint",
a.grid.size = 401
)
Arguments
y |
A vector containing the outcomes for each observation |
a |
A vector containing the treatment levels (dosage) for each observation |
pi |
A vector containing the propensity scores for each observation |
varpi |
A vector containing the mean propensity scores for each observation |
mu |
A vector containing the outcome regression function values for each observation |
ma |
A vector containing the mean outcome regression fucntion values for each observation |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
a.grid.size |
size of equally spaced grid points over |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- loc.fit:
A list containg evalution points of average treatment effect and the corresponding values
- bandwidth:
Bandwidth used in kernel regression
Examples
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1))+triangle(a-2.5,delta,height)+a*(-0.1*l[,1]+0.1*l[,3])
return(mu)
}
triangle <- function(a,delta,height){
y <- exp(-a^2/((delta/2)^2))*height
return(y)
}
set.seed(2000)
n <- 500
d <- 4
sigma <- 0.5
delta <- 1
height <- 0
arange<-c(0.01,4.99)
l <- matrix(rnorm(n*d),ncol=d)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0.2))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd=sigma)
y <- mu+residual.list
## We use the oracal propensity score and outcome regression for illustration
pilist <- dbeta(a/5, shape1=lambda, shape2 = 1-lambda)/5
varpilist <- colMeans(matrix(dbeta(rep(a,each=n)/5,
shape1=rep(lambda,n),
shape2 = 1-rep(lambda,n))/5, nrow=n))
mulist <- mu
malist <-colMeans(matrix(mu.mod(rep(a,each=n),l[rep(1:n,n),],delta,height),nrow=n))
out <- drdrtest.base(y,a,pilist,varpilist,mulist,malist,arange)
The function for performing tests of average treatment effects with SuperLearner
Description
This is the function for testing average treatment effects with user specified nuisance functions.
Usage
drdrtest.superlearner(
y,
a,
l,
arange,
pi.sl.lib = c("SL.earth", "SL.glm", "SL.gam", "SL.glmnet"),
mu.sl.lib = c("SL.earth", "SL.glm", "SL.gam", "SL.glmnet"),
mu.family = "gaussian",
h = NULL,
b = 1000,
dist = "TwoPoint",
a.grid.size = 401,
pi.low = 0.01,
pi.var.low = 0.01
)
Arguments
y |
A vector containing the outcomes for each observation |
a |
A vector containing the treatment levels (dosage) for each observation |
l |
A data.frame containing the observations of covariates |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
pi.sl.lib |
Models will be used by SuperLearner to estiamte propensity scores |
mu.sl.lib |
Models will be used by SuperLearner to estiamte outcome regression function |
mu.family |
Type of response. Currently only support "gaussian" and "binomial" |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
a.grid.size |
size of equally spaced grid points over |
pi.low |
Lower bound to truncate propensity scores |
pi.var.low |
Lower bound to truncate conditional variance of treament (used in propensity score estimation). |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- loc.fit:
A list containg evalution points of average treatment effect and the corresponding values
- bandwidth:
Bandwidth used in kernel regression
Examples
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1))+triangle(a-2.5,delta,height)+a*(-0.1*l[,1]+0.1*l[,3])
return(mu)
}
triangle <- function(a,delta,height){
y <- exp(-a^2/((delta/2)^2))*height
return(y)
}
set.seed(2000)
n <- 500
d <- 4
sigma <- 0.05
delta <- 1
height <- 0
arange<-c(0.01,4.99)
l <- matrix(rnorm(n*d),ncol=d)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0.2))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd=sigma)
y <- mu+residual.list
out <- drdrtest.superlearner(y,a,l,arange,pi.sl.lib=c("SL.glm"),mu.sl.lib=c("SL.glm"))
The base function for testing a effect modifier with user specified nuisance functions
Description
This is the function for testing whether a discrete covariate is an effect modifier with user specified nuisance functions
Usage
drdrtest_em(
y,
a,
l,
class_label,
arange,
pifunc,
mufunc,
h = NULL,
b = 1000,
dist = "TwoPoint",
pi.low = 0.01,
a.grid.size = 401
)
Arguments
y |
A vector containing the outcomes for each observation |
a |
A vector containing the treatment levels (dosage) for each observation |
l |
A data.frame containing the observations of covariates |
class_label |
A vector containing the class label (label for the effect modifier) for each observation. |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
pifunc |
A user specifid function or wapper that takes treatment a as the first argument and covariates l as the second argument and return propensit scores |
mufunc |
A user specifid function or wapper that takes treatment a as the first argument and covariates l as the second argument and return outcome regression values |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
pi.low |
Lower bound to truncate propensity scores |
a.grid.size |
size of equally spaced grid points over |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- bandwidth:
Bandwidth used in kernel regression
Examples
d <- 4
n <- 200
sigma <- 0.5
delta <- 1
height <-1
arange <- c(0,5)
triangle <- function(a,height){
y <- exp(-a^2/((1/2)^2))*height
return(y)
}
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1*delta))+
triangle(a-2.5,height)+a*(-0.1*l[,1]+0.1*delta*l[,4])
return(mu)
}
l <- matrix(rnorm(n*d),ncol=d)
l[,4] <- ifelse(l[,4]>0,1,0)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd =sigma)
y <- mu+residual.list
class_label <- l[,4]
pifunc <- function(a,l){
l <- as.matrix(l)
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
return(pmin(dbeta(a/5,shape=lambda,shape2=1-lambda)/5,100))
}
mufunc <- function(a,l){
return(mu.mod(a,as.matrix(l),delta,height))
}
out <- drdrtest_em(y,a,l,class_label,arange,pifunc,mufunc)
The base function for testing effect modifiers
Description
This is the base function for testing whether a discrete covariate is an effect modifier.
Usage
drdrtest_em.base(
ylist,
alist,
pilist,
varpilist,
mulist,
malist,
arange,
h = NULL,
b = 1000,
dist = "TwoPoint",
a.grid.size = 401
)
Arguments
ylist |
A list containing vectors of outcomes for each class |
alist |
A list containing vectors of treatment levels (dosage) for each class |
pilist |
A list containing vectors of propensity scores for each class |
varpilist |
A list containing vectors of mean propensity scores for each class |
mulist |
A list containing vectors of outcome regression function values for each class |
malist |
A list containing vectors of mean outcome regression values for each class |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
a.grid.size |
size of equally spaced grid points over |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- bandwidth:
Bandwidth used in kernel regression
Examples
d <- 4
n <- 200
sigma <- 0.5
delta <- 1
height <-1
arange <- c(0,5)
triangle <- function(a,height){
y <- exp(-a^2/((1/2)^2))*height
return(y)
}
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1*delta))+
triangle(a-2.5,height)+a*(-0.1*l[,1]+0.1*delta*l[,4])
return(mu)
}
l <- matrix(rnorm(n*d),ncol=d)
l[,4] <- ifelse(l[,4]>0,1,0)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd =sigma)
y <- mu+residual.list
class_label <- l[,4]
ylist <- split(y,class_label)
alist <- split(a,class_label)
pilist <- split(pmin(dbeta(a/5,shape1=lambda,shape2=1-lambda)/5,100),class_label)
mulist <- split(mu,class_label)
varpilist <- list()
malist <- list()
for(c in c(0,1)){
ac <- a[class_label==c]
lc <- l[class_label==c,]
logit.lambdac <- as.numeric(lc[rep(1:nrow(lc),nrow(lc)),]%*%c(0.1,0.1,-0.1,0))
lambdac <- exp(logit.lambdac)/(1+exp(logit.lambdac))
varpic <- colMeans(matrix(pmin(dbeta(rep(ac,each=length(ac))/5,
shape1=lambdac,
shape2 = 1-lambdac)/5,100),nrow=length(ac)))
mac <- colMeans(matrix(mu.mod(rep(ac,each=length(ac)),
lc[rep(1:nrow(lc),nrow(lc)),],
delta,height),
nrow=length(ac)))
varpilist[[as.character(c)]]<-varpic
malist[[as.character(c)]] <- mac
}
out <- drdrtest_em.base(ylist,alist,pilist,varpilist,mulist,malist,arange)
The function for testing a effect modifier with SuperLearner
Description
This is the function for testing whether a discrete covariate is an effect modifier with SuperLearner
Usage
drdrtest_em.superlearner(
y,
a,
l,
class_label,
arange,
pi.sl.lib = c("SL.earth", "SL.glm", "SL.gam", "SL.glmnet"),
mu.sl.lib = c("SL.earth", "SL.glm", "SL.gam", "SL.glmnet"),
mu.family = "gaussian",
h = NULL,
b = 1000,
dist = "TwoPoint",
pi.low = 0.01,
pi.var.low = 0.01,
a.grid.size = 401
)
Arguments
y |
A vector containing the outcomes for each observation |
a |
A vector containing the treatment levels (dosage) for each observation |
l |
A data.frame containing the observations of covariates |
class_label |
A vector containing the class label (label for the effect modifier) for each observation. |
arange |
A vector of length 2 giving the lower bound and upper bound of treatment levels |
pi.sl.lib |
Models will be used by SuperLearner to estiamte propensity scores |
mu.sl.lib |
Models will be used by SuperLearner to estiamte outcome regression function |
mu.family |
Type of response. Currently only support "gaussian" and "binomial" |
h |
bandwidth to be used in kernel regression. If not specified, will by default use "rule of thumb" bandwidth selector |
b |
number of Bootstrap samples to be generated |
dist |
distibution used to generate residuals for Bootstrap samples. Currently only have two options, "TwoPoint" and "Rademachar" |
pi.low |
Lower bound to truncate propensity scores |
pi.var.low |
Lower bound to truncate conditional variance of treament (used in propensity score estimation). |
a.grid.size |
size of equally spaced grid points over |
Value
A list containing
- p.value:
P value of the test result
- test.stat:
Value of the observed test statistic
- Bootstrap.samples:
A vector containing test statistic values from Bootstrap samples
- bandwidth:
Bandwidth used in kernel regression
Examples
d <- 4
n <- 200
sigma <- 0.5
delta <- 1
height <-1
arange <- c(0,5)
triangle <- function(a,height){
y <- exp(-a^2/((1/2)^2))*height
return(y)
}
mu.mod<-function(a,l,delta,height){
mu <- as.numeric(l%*%c(0.2,0.2,0.3,-0.1*delta))+
triangle(a-2.5,height)+a*(-0.1*l[,1]+0.1*delta*l[,4])
return(mu)
}
l <- matrix(rnorm(n*d),ncol=d)
l[,4] <- ifelse(l[,4]>0,1,0)
colnames(l) <- paste("l",1:4,sep="")
logit.lambda <- as.numeric(l%*%c(0.1,0.1,-0.1,0))
lambda <- exp(logit.lambda)/(1+exp(logit.lambda))
a <- rbeta(n, shape1 = lambda, shape2 =1-lambda)*5
mu <- mu.mod(a,l,delta,height)
residual.list <- rnorm(n,mean=0,sd =sigma)
y <- mu+residual.list
class_label <- l[,4]
out <- drdrtest_em.superlearner(y,a,l,l[,4],arange,pi.sl.lib=c("SL.glm"),mu.sl.lib=c("SL.glm"))