Type: | Package |
Title: | Matching Methods for Causal Inference with Time-Series Cross-Sectional Data |
Version: | 3.1.1 |
Date: | 2025-06-04 |
Description: | Implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2023) http://web.mit.edu/insong/www/pdf/tscs.pdf propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption often made in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching and refinement are done, treatment effects can be estimated with standard errors. The package also offers diagnostics for researchers to assess the quality of their results. |
License: | GPL (≥ 3) |
Imports: | Rcpp (≥ 0.12.5), data.table, ggplot2, CBPS, stats, graphics, MASS, Matrix, doParallel, foreach, methods |
Depends: | R (≥ 2.14.0) |
LinkingTo: | RcppArmadillo, Rcpp, RcppEigen |
Encoding: | UTF-8 |
LazyData: | true |
BugReports: | https://github.com/insongkim/PanelMatch/issues |
RoxygenNote: | 7.3.1 |
Suggests: | knitr, rmarkdown, testthat (≥ 2.1.0) |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2025-06-04 05:49:30 UTC; adamrauh |
Author: | In Song Kim [aut, cre], Adam Rauh [aut], Erik Wang [aut], Kosuke Imai [aut] |
Maintainer: | In Song Kim <insong@mit.edu> |
Repository: | CRAN |
Date/Publication: | 2025-06-04 14:40:02 UTC |
Matching Methods for Causal Inference with Time-Series Cross-Sectional Data
Description
Implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2023) propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption often made in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching is done, both short-term and long-term average treatment effects for the treated observations can be estimated with standard errors. The package also offers a variety of diagnostic and visualization functions to assess the credibility of results.
Author(s)
In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>
Maintainer: In Song Kim insong@mit.edu
References
Imai, Kosuke, In Song Kim and Erik Wang. (2023)
See Also
Useful links:
Report bugs at https://github.com/insongkim/PanelMatch/issues
Visualize the treatment distribution across units and time in a panel data set
Description
Visualize the treatment distribution across units and time in a panel data set
Usage
DisplayTreatment(
panel.data,
color.of.treated = "red",
color.of.untreated = "blue",
title = "Treatment Distribution \n Across Units and Time",
xlab = "Time",
ylab = "Unit",
x.size = NULL,
y.size = NULL,
legend.position = "none",
x.angle = NULL,
y.angle = NULL,
legend.labels = c("not treated", "treated"),
decreasing = FALSE,
matched.set = NULL,
show.set.only = FALSE,
hide.x.tick.label = FALSE,
hide.y.tick.label = FALSE,
gradient.weights = FALSE,
dense.plot = FALSE
)
Arguments
panel.data |
|
color.of.treated |
Color of the treated observations provided as a character string (this includes hex values). Default is red. |
color.of.untreated |
Color of the untreated observations provided as a character string (this includes hex values). Default is blue. |
title |
Title of the plot provided as character string |
xlab |
Character label of the x-axis |
ylab |
Character label of the y-axis |
x.size |
Numeric size of the text for xlab or x axis tick labels. Assign x.size = NULL to use built in ggplot2 method of determining label size. When the length of the time period is long, consider setting to NULL and adjusting size and ratio of the plot. |
y.size |
Numeric size of the text for ylab or y axis tick labels. Assign y.size = NULL to use built in ggplot2 method of determining label size. When the number of units is large, consider setting to NULL and adjusting size and ratio of the plot. |
legend.position |
Position of the legend. Provide this according to ggplot2 standards. |
x.angle |
Angle (in degrees) of the tick labels for x-axis |
y.angle |
Angle (in degrees) of the tick labels for y-axis |
legend.labels |
Character vector of length two describing the labels of the legend to be shown in the plot. ggplot2 standards are used. |
decreasing |
Logical. Determines if display order should be increasing or decreasing by the amount of treatment received. Default is FALSE. |
matched.set |
(optional) a |
show.set.only |
(optional) logical. If TRUE, only the treated unit and control units contained in the provided |
hide.x.tick.label |
logical. If TRUE, x axis tick labels are not shown. Default is FALSE. |
hide.y.tick.label |
logical. If TRUE, y axis tick labels are not shown. Default is FALSE. |
gradient.weights |
(optional) logical. If TRUE, the "darkness"/shade of units in the provided |
dense.plot |
Logical. If TRUE, lines between tiles are removed from the resulting plot. This is useful for producing more readable plots when the number of units and/or time periods is very high. |
Value
DisplayTreatment returns a treatment variation plot (generated via ggplot2 geom_tile() or geom_raster()), which visualizes the variation of treatment across units and time. The results can be customized using ggplot2 syntax.
Author(s)
In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>
Examples
dem.panel <- PanelData(panel.data = dem,
                       unit.id = "wbcode2",
                       time.id = "year",
                       treatment = "dem",
                       outcome = "y")
DisplayTreatment(panel.data = dem.panel,
                 legend.position = "none",
                 xlab = "year", ylab = "Country Code")
Pre-process and balance panel data
Description
Pre-process and balance panel data
Usage
PanelData(panel.data, unit.id, time.id, treatment, outcome)
Arguments
panel.data |
A |
unit.id |
A character string indicating the name of the unit identifier in the data. The unit identifier must be an integer. |
time.id |
A character string indicating the name of the time variable in the data. |
treatment |
A character string indicating the name of the treatment variable. The treatment must be a binary indicator variable (integer with 0 for the control group and 1 for the treatment group). |
outcome |
A character string identifying the outcome variable |
Value
PanelData() returns an object of class PanelData. This takes the form of a data.frame object with the following properties and attributes. First, the data has been balanced, so each unit appears the same number of times in the resulting PanelData object, with NAs filling out missing data. Second, the data has been sorted to appear in order for each unit. These properties are noted in the "is.balanced" and "is.sorted" attributes, respectively. Next, the PanelData object has the following attributes: "unit.id", "time.id", "treatment", and "outcome", reflecting the variables provided in the specification. If the function attempts to automatically convert time data to be consecutive integers, the mapping between the original time data and the "new" converted time data is provided as a data.frame object and stored as the "time.data.map" attribute.
Examples
d <- PanelData(panel.data = dem,
               unit.id = "wbcode2",
               time.id = "year",
               treatment = "dem",
               outcome = "y")
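The balancing and sorting described under Value can be illustrated without the package. The following is a minimal base R sketch on a hypothetical toy data set (the names toy, grid, and balanced are illustrative); it shows the general idea of filling missing unit-period rows with NA and is not the internal implementation of PanelData().

```r
# Toy unbalanced panel (hypothetical data): unit 2 has no row for 2001.
toy <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L),
  year = c(2000L, 2001L, 2002L, 2002L, 2000L),
  d    = c(0L, 1L, 1L, 0L, 0L),
  y    = c(1.0, 1.5, 2.0, 0.7, 0.5)
)

# Build every unit-by-time combination, then left-join the observed rows.
grid <- expand.grid(id = unique(toy$id), year = sort(unique(toy$year)))
balanced <- merge(grid, toy, by = c("id", "year"), all.x = TRUE)

# Sort by unit, then time, so each unit's rows appear in order.
balanced <- balanced[order(balanced$id, balanced$year), ]
rownames(balanced) <- NULL

# Every unit now appears once per period; the gap is filled with NA.
table(balanced$id)
balanced[balanced$id == 2L, ]
```

After this step each unit contributes the same number of rows, which is the property recorded by the "is.balanced" attribute in a PanelData object.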
Estimate a causal quantity of interest
Description
Estimate a causal quantity of interest, including the average treatment effect for treated or control units (att and atc, respectively), the average effect of treatment reversal on reversed units (art), or average treatment effect (ate), as specified in PanelMatch(). This is done by estimating the counterfactual outcomes for each treated unit using matched sets. Users will provide matched sets that were obtained by the PanelMatch function and obtain point estimates and standard errors.
Usage
PanelEstimate(
sets,
panel.data,
number.iterations = 1000,
df.adjustment = FALSE,
confidence.level = 0.95,
moderator = NULL,
se.method = "bootstrap",
pooled = FALSE,
include.placebo.test = FALSE,
parallel = FALSE,
num.cores = 1
)
Arguments
sets |
A |
panel.data |
The same time series cross sectional data set provided to the |
number.iterations |
If using bootstrapping for calculating standard errors, this is the number of bootstrap iterations. Provide as integer. If |
df.adjustment |
A logical value indicating whether or not a degree-of-freedom adjustment should be performed for the standard error calculation. The default is FALSE. |
confidence.level |
A numerical value specifying the confidence level and range of interval estimates for statistical inference. The default is .95. |
moderator |
The name of a moderating variable, provided as a character string. If a moderating variable is provided, the returned object will be a list of |
se.method |
Method used for calculating standard errors, provided as a character string. Users must choose between "bootstrap", "conditional", and "unconditional" methods. Default is "bootstrap". "bootstrap" uses a block bootstrapping procedure to calculate standard errors. The conditional method calculates the variance of the estimator, assuming independence across units but not across time. The unconditional method also calculates the variance of the estimator analytically, but makes no such assumptions about independence across units. When the quantity of interest is "att", "atc", or "art", all methods are available. Only "bootstrap" is available for the ate. If |
pooled |
Logical. If TRUE, estimates and standard errors are returned for treatment effects pooled across the entire lead window. Only available for |
include.placebo.test |
Logical. If TRUE, a placebo test is run and returned in the results. The placebo test uses the same specifications for calculating standard errors as the main results. That is, standard errors are calculated according to the user provided |
parallel |
Logical. If TRUE and |
num.cores |
Integer. Specifies the number of cores to use for parallelization. If |
Value
PanelEstimate returns a list of class PanelEstimate containing the following components:
estimates |
the point estimates of the quantity of interest for the lead periods specified |
se.method |
The method used to calculate standard errors. This is the same as the argument provided to the function. |
bootstrapped.estimates |
the bootstrapped point estimate values, when applicable |
bootstrap.iterations |
the number of iterations used in bootstrapping, when applicable |
method |
refinement method used to create the matched sets from which the estimates were calculated |
lag |
See PanelMatch() argument |
lead |
The lead window sequence for which |
confidence.level |
the confidence level |
qoi |
the quantity of interest |
matched.sets |
the refined matched sets used to produce the estimations |
standard.error |
the standard error(s) of the point estimates |
pooled |
Logical indicating whether or not estimates were calculated for individual lead periods or pooled. |
placebo.test |
if |
Author(s)
In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>
References
Imai, Kosuke, In Song Kim, and Erik Wang (2023)
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.match",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
PE.results <- PanelEstimate(sets = PM.results,
                            panel.data = dem.sub.panel,
                            se.method = "unconditional")
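The "bootstrap" option for se.method is described above as a block bootstrap. As a rough illustration of that general idea only, the base R sketch below resamples whole units (keeping all of a unit's rows together) and recomputes a simple statistic. The data, the statistic (a plain mean), and the helper one_draw() are hypothetical; the package's actual procedure may differ.

```r
set.seed(42)

# Hypothetical panel: 6 units observed for 3 periods each.
panel <- data.frame(id = rep(1:6, each = 3), y = rnorm(18))
units <- unique(panel$id)

# One bootstrap draw: sample units with replacement and keep every row
# belonging to each sampled unit, so within-unit dependence is preserved.
one_draw <- function() {
  drawn <- sample(units, length(units), replace = TRUE)
  resampled <- do.call(rbind, lapply(drawn, function(u) panel[panel$id == u, ]))
  mean(resampled$y)
}

boot.est <- replicate(200, one_draw())

# Bootstrap standard error and a percentile interval.
se.boot <- sd(boot.est)
ci.boot <- quantile(boot.est, probs = c(0.025, 0.975))
```

Resampling at the unit level rather than the row level is what makes this a block bootstrap: observations from the same unit are never separated.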
Create and refine sets of matched treated and control observations
Description
PanelMatch identifies treated observations and a matched set for each treated observation. Specifically, for a given treated unit, the matched set consists of control observations that have an identical treatment history up to a number of lag time periods. The matched set can then be further refined using the matching or weighting techniques described below.
Usage
PanelMatch(
panel.data,
lag,
refinement.method,
qoi,
size.match = 10,
match.missing = TRUE,
covs.formula = NULL,
lead = 0,
verbose = FALSE,
exact.match.variables = NULL,
forbid.treatment.reversal = FALSE,
matching = TRUE,
listwise.delete = FALSE,
use.diagonal.variance.matrix = FALSE,
restrict.control.period = NULL,
placebo.test = FALSE
)
Arguments
panel.data |
A |
lag |
An integer value indicating the length of treatment history periods to be matched on |
refinement.method |
A character string specifying the matching or weighting method to be used for refining the matched sets. The user can choose "mahalanobis", "ps.match", "CBPS.match", "ps.weight", "CBPS.weight", "ps.msm.weight", "CBPS.msm.weight", or "none". The first three methods will use the |
qoi |
quantity of interest, provided as a string: |
size.match |
An integer dictating the number of permitted closest control units in a matched set after refinement.
This argument only affects results when using a matching method ("mahalanobis" or any of the refinement methods that end in ".match").
This argument is not needed and will have no impact if included when a weighting method is specified (any |
match.missing |
Logical variable indicating whether or not units should be matched on the patterns of missingness in their treatment histories. Default is TRUE. When FALSE, neither treated nor control units are allowed to have missing treatment data in the lag window. |
covs.formula |
One sided formula object indicating which variables should be used for matching and refinement.
Argument is not needed if |
lead |
integer sequence specifying the lead window, for which qoi point estimates (and standard errors) will ultimately be produced. Default is 0 (which corresponds to contemporaneous treatment effect). |
verbose |
option to include more information about the |
exact.match.variables |
character vector giving the names of variables to be exactly matched on. These should be time invariant variables. Exact matching for time varying covariates is not currently supported. |
forbid.treatment.reversal |
Logical. For the ATT, it indicates whether or not it is permissible for treatment to reverse in the specified lead window. This is defined analogously for the ART. It is not valid for the ATC or ATE. When set to TRUE, only matched sets for treated units where treatment is applied continuously in the lead window are included in the results. Default is FALSE. |
matching |
logical indicating whether or not any matching on treatment history should be performed. This is primarily used for diagnostic purposes, and most users will never need to set this to FALSE. Default is TRUE. |
listwise.delete |
TRUE/FALSE indicating whether or not missing data should be handled using listwise deletion or the package's default missing data handling procedures. Default is FALSE. |
use.diagonal.variance.matrix |
TRUE/FALSE indicating whether or not a regular covariance matrix should be used in mahalanobis distance calculations during refinement,
or if a diagonal matrix with only covariate variances should be used instead.
In many cases, setting this to TRUE can lead to better covariate balance, especially when there is
high correlation between variables. Default is FALSE. This argument is only necessary when
|
restrict.control.period |
(optional) integer specifying the number of pre-treatment periods in which treated units and potentially matched control units should have non-missing data and be in the control state. For instance, specifying 4 would mean that the treatment history cannot contain any missing data or treatment from t-4 to t. |
placebo.test |
Logical TRUE/FALSE. Indicates whether or not you want to be able to run a placebo test. This adds requirements on the data: specifically, no unit included in the matching/refinement process can have missing outcome data over the lag window. Additionally, you should not use the outcome variable in refinement when |
Value
PanelMatch() returns an object of class PanelMatch. This is a list that contains a few specific elements: First, a matched.set object(s) that has the same name as the provided qoi if the qoi is "att", "art", or "atc". If qoi = "ate" then two matched.set objects will be attached, named "att" and "atc". Please consult the documentation for matched_set() to read more about the structure and usage of matched.set objects. The PanelMatch object also has some additional attributes that track metadata about the specification, like the names of the unit and time identifier variables.
Author(s)
Adam Rauh <amrauh@umich.edu>, In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, and Kosuke Imai <imai@harvard.edu>
References
Imai, Kosuke, In Song Kim, and Erik Wang (2023)
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.match",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
# include lagged variables
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.weight",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
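The first step described above, finding control observations with an identical treatment history over the lag window, can be sketched in base R. The toy panel and the helper functions history() and match_on_history() below are hypothetical and exist only for illustration. The sketch assumes a balanced panel with no missing data and ignores refinement, so it does not reproduce what PanelMatch() does internally.

```r
# Hypothetical balanced panel: 4 units, 6 periods, binary treatment d.
panel <- data.frame(
  id = rep(1:4, each = 6),
  t  = rep(1:6, times = 4),
  d  = c(0, 0, 0, 0, 1, 1,   # unit 1: treated from t = 5
         0, 0, 0, 0, 0, 0,   # unit 2: never treated
         0, 1, 1, 0, 0, 0,   # unit 3: different history
         0, 0, 0, 0, 0, 1)   # unit 4: treated from t = 6
)

# Treatment history of one unit over the lag window t - lag, ..., t - 1.
history <- function(data, unit, time, lag) {
  rows <- data$id == unit & data$t %in% (time - lag):(time - 1)
  data$d[rows][order(data$t[rows])]
}

# Units that are untreated at `time` and share the treated unit's history.
match_on_history <- function(data, treated.id, time, lag) {
  target <- history(data, treated.id, time, lag)
  others <- setdiff(unique(data$id), treated.id)
  keep <- vapply(others, function(u) {
    untreated.now <- data$d[data$id == u & data$t == time] == 0
    untreated.now && identical(history(data, u, time, lag), target)
  }, logical(1))
  others[keep]
}

# Matched set for unit 1, which switches into treatment at t = 5.
matched <- match_on_history(panel, treated.id = 1, time = 5, lag = 4)
```

In this toy example unit 3 is excluded because its treatment history over the lag window differs from that of the treated unit.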
Subset PanelBalance objects
Description
Subset PanelBalance objects
Usage
## S3 method for class 'PanelBalance'
x[i, ...]
Arguments
x |
|
i |
numeric. Specifies which element to extract. Substantively, it specifies which |
... |
Not used |
Value
Returns balance information for the specified PanelMatch configuration. Note that results are still returned as a PanelBalance object. In order to return a list, use the [[ operator.
Examples
# add a random covariate for illustration
dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis",
                     panel.data = dem.panel, match.missing = TRUE,
                     covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                     size.match = 5, qoi = "att")
# create multiple configurations to compare
pm2 <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "ps.match",
                  panel.data = dem.panel, match.missing = TRUE,
                  covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                  size.match = 5, qoi = "att")
pb <- get_covariate_balance(pm.obj, pm2,
                            include.unrefined = TRUE,
                            panel.data = dem.panel,
                            covariates = c("tradewb", "rdata"))
bal.maha <- pb[1]
bal.ps <- pb[2]
Subset matched.set object
Description
Subsets matched.set objects while preserving attributes.
Usage
## S3 method for class 'matched.set'
x[i, j = NULL, drop = NULL]
Arguments
x |
|
i |
numeric. specifies the index of which element to extract. |
j |
NULL |
drop |
NULL |
build_maha_mats Builds the matrices that we will then use to calculate the mahalanobis distances for each matched set
Description
build_maha_mats Builds the matrices that we will then use to calculate the mahalanobis distances for each matched set
Usage
build_maha_mats(idx, ordered_expanded_data)
Arguments
idx |
List of vectors specifying which observations should be extracted |
ordered_expanded_data |
data.frame of prepared/parsed input data |
Value
List of parsed distance matrices, with elements corresponding to each matched set
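For readers unfamiliar with the distance being computed, the following base R sketch calculates Mahalanobis distances between one treated observation and its candidate controls using stats::mahalanobis(), with an option that mimics the idea behind use.diagonal.variance.matrix. The data, the helper maha_dist(), and the choice to estimate the covariance matrix from these six rows are assumptions made for illustration, not a description of the package internals.

```r
set.seed(1)

# Hypothetical covariates: row 1 is the treated observation, rows 2-6 are controls.
X <- cbind(x1 = rnorm(6), x2 = rnorm(6))

# Distance of each control from the treated observation.
maha_dist <- function(X, diagonal = FALSE) {
  S <- cov(X)
  if (diagonal) S <- diag(diag(S))  # keep variances, drop covariances
  # stats::mahalanobis() returns squared distances, so take the square root.
  sqrt(mahalanobis(X[-1, , drop = FALSE], center = X[1, ], cov = S))
}

d.full <- maha_dist(X)
d.diag <- maha_dist(X, diagonal = TRUE)

# Indices (among the controls) of the three closest units,
# analogous in spirit to keeping size.match controls.
closest <- order(d.full)[1:3]
```

With the diagonal option each covariate is scaled only by its own variance, so correlation between covariates no longer affects the distance.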
build_ps_data
Description
build_ps_data
Usage
build_ps_data(idxlist, data, lag)
Arguments
idxlist |
|
data |
data.frame object with the data |
lag |
see PanelMatch() documentation |
Value
Returns a list of length equal to the number of matched sets. Each item is a data frame and each data frame contains information at time = t + 0 for each treated unit and their corresponding controls.
calculate_estimates
Description
Mid-level function that helps with estimation process. Calls lower level helper functions
Usage
calculate_estimates(
qoi.in,
data.in,
lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
att.sets,
atc.sets,
placebo.test = FALSE,
lag,
se.method,
pooled = FALSE,
parallel = FALSE,
num.cores = 1
)
Arguments
qoi.in |
String specifying qoi |
data.in |
data.frame object with the data |
lead |
integer specifying lead window |
number.iterations |
integer. specifies number of bootstrap iterations |
att.treated.unit.ids |
Integer vector specifying the treated units for the att or art |
atc.treated.unit.ids |
Integer vector specifying the "treated" units under the atc definition |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable |
confidence.level |
double. specifies confidence level for confidence interval |
att.sets |
matched.set object specifying the att or art sets |
atc.sets |
matched.set object specifying the atc sets |
lag |
integer vector specifying size of the lag. |
se.method |
string specifying which method should be used for standard error calculation |
pooled |
bool. specifies whether or not estimates should be calculated for each lead period, or pooled across all lead periods |
parallel |
bool. Specifies whether or not parallelization should be used |
num.cores |
Integer. specifies how many cores to use for parallelization |
Value
Returns PanelEstimate object.
calculate_placebo_estimates
Description
Handles the procedures for calculating point estimates and standard errors for the placebo test. Code is structured very similarly to the calculate_estimates() code, but with appropriate modifications for the placebo test. See that function for description of arguments. Bootstrap SEs are available for any specification. Conditional, unconditional standard errors only available for att, art, atc.
Usage
calculate_placebo_estimates(
qoi.in,
data.in,
lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
att.sets,
atc.sets,
placebo.test = FALSE,
lag,
placebo.lead,
se.method = "bootstrap",
parallel = FALSE,
num.cores = 1
)
Value
Returns a PanelEstimate object
calculate_point_estimates Helper function that calculates the point estimates for the specified QOI
Description
calculate_point_estimates Helper function that calculates the point estimates for the specified QOI
Usage
calculate_point_estimates(
qoi.in,
data.in,
lead,
outcome.variable,
pooled = FALSE
)
Arguments
qoi.in |
string specifying the QOI |
data.in |
data.frame providing the processed/parsed data to be used for calculations |
lead |
see PanelMatch() documentation |
outcome.variable |
string specifying the outcome variable |
pooled |
Logical. See PanelEstimate() documentation. |
Value
A named vector of point estimates
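The point estimates follow the difference-in-differences logic of Imai, Kim, and Wang (2023): for each matched set, the change in the treated unit's outcome from t-1 to t+F is compared with the weighted average change among its matched controls, and these contrasts are then averaged over matched sets. The base R sketch below works through a single matched set with hypothetical numbers; did_one_set() is an illustrative helper, not a package function.

```r
# Hypothetical outcomes for one matched set: one treated unit and three controls,
# observed at t - 1 (pre-treatment) and at t + F.
y.pre  <- c(treated = 10, c1 = 9,  c2 = 11, c3 = 10)
y.post <- c(treated = 14, c1 = 10, c2 = 12, c3 = 12)

# Refinement weights for the controls; they sum to one (equal weights here).
w <- c(c1 = 1/3, c2 = 1/3, c3 = 1/3)

did_one_set <- function(y.pre, y.post, w) {
  change <- y.post - y.pre
  treated.change <- change[["treated"]]
  control.change <- sum(w * change[names(w)])
  treated.change - control.change
}

# Contribution of this matched set to the estimate for lead F.
est <- did_one_set(y.pre, y.post, w)
```

Averaging such contributions across all matched sets gives the estimate for a given lead; repeating the calculation for each lead yields a vector with one element per lead period.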
check_time_data
Description
Time data should be consecutive integers. When it is not, the function tries to convert it as well as it can, or throws an error. If the function does not fail, it returns the data as a data.frame object, processed or unprocessed as appropriate.
Usage
check_time_data(data, time.id)
Arguments
data |
data.frame object. |
time.id |
string specifying the time id variable. |
Details
enforces the requirements for time data, with some reasonable defaults
Value
data.frame object with the data. If function throws error, nothing is returned.
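One plausible way to turn irregular time values into consecutive integers, while keeping a record of the mapping, is sketched below in base R. The conversion rules that check_time_data() actually applies are not spelled out here, so treat this as an assumption-laden illustration; the names df, time.map, and year.int are hypothetical.

```r
# Hypothetical time variable observed every five years.
df <- data.frame(id   = c(1, 1, 1, 2, 2, 2),
                 year = c(1990, 1995, 2000, 1990, 1995, 2000))

# Map each distinct time value to its rank: 1, 2, 3, ...
original <- sort(unique(df$year))
time.map <- data.frame(original = original, converted = seq_along(original))

# Replace each time value by its position in the sorted set of distinct values.
df$year.int <- match(df$year, original)
```

Keeping a table such as time.map makes it possible to translate results back to the original time scale, which is the role of the "time.data.map" attribute mentioned in the PanelData() documentation.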
clean_leads Function to check the lead windows in treated and control units for missing outcome data. If data is missing, remove those units from matched sets.
Description
clean_leads Function to check the lead windows in treated and control units for missing outcome data. If data is missing, remove those units from matched sets.
Usage
clean_leads(matched_sets, ordered.data, max.lead, t.var, id.var, outcome.var)
Arguments
matched_sets |
matched.set object containing pre-filtered matched sets |
ordered.data |
data.frame object to be checked for missing data. This should have been passed through data preparation functions already. |
max.lead |
Integer specifying the biggest value of the lead window. |
t.var |
string specifying the time id variable |
id.var |
string specifying the unit id variable |
outcome.var |
string specifying the outcome variable. |
Value
a cleaned/filtered matched.set object
Produce confidence intervals for PanelEstimate objects
Description
Produce confidence intervals for PanelEstimate objects
Usage
## S3 method for class 'PanelEstimate'
confint(object, parm = NULL, level = NULL, ..., bias.corrected = FALSE)
Arguments
object |
|
parm |
Not used. |
level |
Confidence level to be used for confidence interval calculations. Must be numeric between 0 and 1. If NULL, confidence level from |
... |
not used |
bias.corrected |
logical indicating whether or not bias corrected estimates should be provided. Default is FALSE. This argument only applies for standard errors calculated with the bootstrap. |
Value
Matrix with two columns and 'length(lead)' rows. Contains the lower and upper boundaries of the confidence interval for each time period's point estimate.
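To make the shape of the returned matrix concrete, the base R sketch below builds a two-column interval matrix from hypothetical point estimates and standard errors using a normal approximation. The numbers, the row names, and the column labels are illustrative assumptions; for bootstrap standard errors the method may rely on quantiles of the bootstrapped estimates instead.

```r
# Hypothetical point estimates and standard errors for lead = 0:2.
est <- c("t+0" = 0.50, "t+1" = 0.80, "t+2" = 1.10)
se  <- c(0.20, 0.25, 0.30)
level <- 0.95

# Two-sided normal critical value for the requested confidence level.
z <- qnorm(1 - (1 - level) / 2)

# One row per lead period, lower bound first, upper bound second.
ci <- cbind(est - z * se, est + z * se)
colnames(ci) <- c("2.5%", "97.5%")
ci
```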
Country-year level democratization data
Description
A dataset containing the democracy indicator for 184 countries from 1960 to 2010
Format
A data.frame containing 9384 rows and 5 variables
Details
wbcode2. World Bank country ID. Integer.
year. Year (1960–2010). Integer.
dem. Binary indicator of democracy as defined in Acemoglu et al. (2019).
y. Log of GDP per capita in 2000 constant dollars (multiplied by 100). Numeric.
tradewb. Exports plus imports as a share of GDP from World Bank. Numeric.
Source
Acemoglu, Daron, Suresh Naidu, Pascual Restrepo, and James A. Robinson (2019). “Democracy does cause growth.” Journal of Political Economy.
Get distances. See distances.matched.set method.
Description
Get distances. See distances.matched.set method.
Usage
distances(object)
Arguments
object |
|
Extract the distances of matched control units
Description
Extract the distances of matched control units
Usage
## S3 method for class 'matched.set'
distances(object)
Arguments
object |
a matched.set object |
Value
A named list of named vectors. Each element corresponds to a matched set and will be a named vector, where the names of each element will identify a matched control unit and its distance from the treated observation within a particular matched set. These correspond to the "distances" attribute, which is calculated and included when the verbose option is set to TRUE in PanelMatch.
Examples
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.panel, lag = 4,
                         refinement.method = "mahalanobis",
                         verbose = TRUE,
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
r1 <- extract(PM.results, qoi = "att")
lt <- distances(r1)
enforce_lead_restrictions check treatment and control units for treatment reversion in the lead window. Treated units must stay treated and control units must stay in control (according to the specified qoi)
Description
enforce_lead_restrictions check treatment and control units for treatment reversion in the lead window. Treated units must stay treated and control units must stay in control (according to the specified qoi)
Usage
enforce_lead_restrictions(
matched_sets,
ordered.data,
max.lead,
t.var,
id.var,
treatment.var
)
Arguments
matched_sets |
matched.set object |
ordered.data |
parsed data as data.frame object |
max.lead |
The largest lead value (e.g. the biggest F) |
t.var |
string specifying the time variable |
id.var |
string specifying the unit id variable |
treatment.var |
string specifying the treatment variable. |
Value
matched.set object with the matched sets that meet the conditions
equality_four Small helper function implementing estimation function from Imai, Kim, and Wang (2023)
Description
equality_four Small helper function implementing estimation function from Imai, Kim, and Wang (2023)
Usage
equality_four(x, y, z)
Value
Returns numeric vector of results.
equality_four_placebo
Description
Small helper function implementing estimation function from Imai, Kim, and Wang (2023)
Usage
equality_four_placebo(x, y, z)
Value
Returns numeric vector of results.
Extract QOI estimates. See documentation for 'estimates.PanelEstimate()'.
Description
Extract QOI estimates. See documentation for 'estimates.PanelEstimate()'.
Usage
estimates(object, ...)
Arguments
object |
PanelEstimate object |
... |
other arguments. Not used. |
Extract QOI estimates
Description
This is a method for extracting point estimates for the QOI from PanelEstimate objects. This function is analogous to the 'coef()' method used elsewhere.
Usage
## S3 method for class 'PanelEstimate'
estimates(object, ...)
Arguments
object |
|
... |
not used |
Value
Named vector with the QOI point estimates and the time periods to which they correspond
expand_treated_ts Builds a list that contains all times in a lag window that correspond to a particular treated unit. This is structured as a list of vectors. Each vector is lag + 1 units long. The overall list will be the same length as the number of matched sets
Description
expand_treated_ts Builds a list that contains all times in a lag window that correspond to a particular treated unit. This is structured as a list of vectors. Each vector is lag + 1 units long. The overall list will be the same length as the number of matched sets
Usage
expand_treated_ts(lag, treated.ts)
Arguments
lag |
lag value |
treated.ts |
times of treated observations |
Value
list. Contains all times in a lag window that correspond to a particular treated unit
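The structure described above (one vector of lag + 1 time points per treated observation) can be reproduced with a short base R sketch. The function expand_ts() is a hypothetical stand-in written for illustration; it assumes the treated times are plain integers and is not the package's own code.

```r
# For each treated time t, return the window t - lag, ..., t.
expand_ts <- function(lag, treated.ts) {
  lapply(treated.ts, function(t) (t - lag):t)
}

# Two hypothetical treated observations, at 1995 and 2003, with lag = 4.
windows <- expand_ts(lag = 4, treated.ts = c(1995, 2003))

# Each window has lag + 1 elements; the list has one entry per treated observation.
lengths(windows)
```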
Extract matched.set objects from PanelMatch results
Description
Extract matched.set objects from PanelMatch results
Usage
extract(pm.object, qoi)
Arguments
pm.object |
|
qoi |
character, specifying the qoi. Valid inputs include "att", "atc", "art", and NULL. If NULL, function extracts att, art, or atc results if possible. Otherwise, throws an error if ate is specified. |
Extract matched.set objects from PanelMatch results
Description
Extract matched.set objects from PanelMatch results
Usage
## S3 method for class 'PanelMatch'
extract(pm.object, qoi = NULL)
Arguments
pm.object |
|
qoi |
character, specifying the qoi. Valid inputs include "att", "atc", "art", and NULL. If NULL, function extracts att, art, or atc results if possible. Otherwise, throws an error if ate is specified. |
Value
a matched.set object
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel,
                         lag = 4,
                         refinement.method = "mahalanobis",
                         match.missing = TRUE,
                         covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                         size.match = 5, qoi = "att",
                         lead = 0:4, forbid.treatment.reversal = FALSE)
extract(PM.results, qoi = "att")
extract(PM.results) # valid since att is specified
extract_differences This function calculates the differences from t-1 to t for treated and control units in the treatment variable. While this functionality is somewhat trivial for the current implementation of the package, it will be needed for a continuous treatment version of the package.
Description
extract_differences This function calculates the differences from t-1 to t for treated and control units in the treatment variable. While this functionality is somewhat trivial for the current implementation of the package, it will be needed for a continuous treatment version of the package.
Usage
extract_differences(indexed.data, matched.set, treatment.variable, qoi)
Arguments
indexed.data |
data that has been indexed. Rows have been named with a unique identifier. |
matched.set |
matched.set object |
treatment.variable |
string specifying treatment variable |
qoi |
string specifying QOI |
Value
matched.set object, with differences extracted as described previously for each matched set.
findBinaryTreated
Description
findBinaryTreated is used to identify t,id pairs of units for which a matched set might exist.
More precisely, it finds units for which at time t, the specified treatment has been applied, but at time t - 1, the treatment has not.
Usage
findBinaryTreated(
dmat,
qoi.in,
treatedvar,
time.var,
unit.var,
hasbeensorted = FALSE
)
Arguments
dmat |
Data frame or matrix containing data used to identify potential treated units. Must be specified in such a way that a combination of time and id variables will correspond to a unique row. Must also contain at least a binary treatment variable column as well. |
treatedvar |
Character string that identifies the name of the column in |
time.var |
Character string that identifies the name of the column in |
unit.var |
Character string that identifies the name of the column in |
hasbeensorted |
variable that only has internal usage for optimization purposes. There should be no need for a user to toggle this |
Value
findBinaryTreated returns a subset of the data in the dmat data frame, containing only treated units for which a matched set might exist
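The rule described above (treatment applied at time t but not at time t - 1) can be sketched in base R as follows. The data and the column names id, time, and treat are made up for illustration, and the sketch assumes consecutive time periods with no gaps. It is not the package's implementation.

```r
# Illustrative data: unit id, time, and a binary treatment indicator
dmat <- data.frame(
  id    = rep(1:3, each = 4),
  time  = rep(1:4, times = 3),
  treat = c(0, 0, 1, 1,   # unit 1 switches on at time 3
            0, 0, 0, 0,   # unit 2 is never treated
            1, 1, 0, 1)   # unit 3 switches on again at time 4
)

# Sort by unit and time so that the previous row is the previous period
dmat <- dmat[order(dmat$id, dmat$time), ]

# Treatment status in the previous period within each unit (NA for the first period)
prev.treat <- ave(dmat$treat, dmat$id, FUN = function(x) c(NA, head(x, -1)))

# Keep rows where treatment is on at t but was off at t - 1
candidates <- dmat[!is.na(prev.treat) & dmat$treat == 1 & prev.treat == 0, ]
candidates[, c("id", "time")]
```

Unit 3 is treated in its first observed period, but that row is not flagged because its status at t - 1 is unknown.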
find_ps
Description
find_ps
Usage
find_ps(sets, fitted.model)
Arguments
sets |
matched sets |
fitted.model |
Result of a fitted (CB) PS model call |
Value
Returns a list of data frames with propensity score weights for each unit in a matched set. Each element in the list is a data frame which corresponds to a matched set of 1 treatment and all matched control units
Return the refinement formula used in a PanelMatch specification
Description
Return the refinement formula used in a PanelMatch specification
Usage
## S3 method for class 'PanelMatch'
formula(x, ...)
Arguments
x |
A PanelMatch Object |
... |
not used |
Value
One sided formula object containing the variables/specification used in refinement. This corresponds to what was provided to the covs.formula argument.
get.matchedsets
Description
get.matchedsets is used to identify matched sets for a given unit with a specified i, t.
Usage
get.matchedsets(
t,
id,
data,
L,
t.column,
id.column,
treatedvar,
hasbeensorted = FALSE,
match.on.missingness = TRUE,
matching = TRUE,
qoi.in,
restrict.control.period = NULL
)
Arguments
t |
integer vector specifying the times of treated units for which matched sets should be found. This vector should be the same length as the following |
id |
integer vector specifying the unit ids of treated units for which matched sets should be found. note that both |
data |
data frame containing the data to be used for finding matched sets. |
L |
An integer value indicating the length of treatment history to be matched |
t.column |
Character string that identifies the name of the column in |
id.column |
Character string that identifies the name of the column in |
treatedvar |
Character string that identifies the name of the column in |
hasbeensorted |
variable that only has internal usage for optimization purposes. There should be no need for a user to toggle this |
match.on.missingness |
TRUE/FALSE indicating whether or not the user wants to "match on missingness." That is, should units with NAs in their treatment history windows be matched with control units that have NA's in corresponding places? |
matching |
logical indicating whether or not the treatment history should be used for matching. This should almost always be set to TRUE, except for specific situations where the user is interested in particular diagnostic questions. |
Value
get.matchedsets returns a "matched set" object, which primarily contains a named list of vectors. Each vector is a "matched set" containing the unit ids included in a matched set. The list names will indicate an i,t pair (formatted as "<i variable>.<t variable>") to which the vector/matched set corresponds.
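To make the shape of the return value concrete, here is a simplified base R sketch of exact matching on treatment history. It assumes a binary treatment stored as a units-by-times matrix with no missing values, and it takes control units to be those that share the treated unit's history over the L periods before t and are untreated at t. The helper match_history_sketch is hypothetical and far simpler than the package's own routine.

```r
# Illustrative panel: treatment status, rows = units, columns = times
treat <- rbind(
  "1" = c(0, 0, 0, 1),
  "2" = c(0, 0, 0, 0),
  "3" = c(0, 1, 0, 0),
  "4" = c(0, 0, 0, 0)
)
colnames(treat) <- 1:4

# Hypothetical helper: controls for treated unit `id` at time `t` with lag `L`
match_history_sketch <- function(treat, id, t, L) {
  window <- as.character((t - L):(t - 1))
  target <- treat[as.character(id), window]
  others <- setdiff(rownames(treat), as.character(id))
  keep <- vapply(others, function(u) {
    # same treatment history in the lag window and untreated at time t
    all(treat[u, window] == target) && treat[u, as.character(t)] == 0
  }, logical(1))
  others[keep]
}

# Named list of matched sets, with names formatted as "<id>.<t>"
msets <- list("1.4" = match_history_sketch(treat, id = 1, t = 4, L = 3))
msets
```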
getDits returns a vector of Dit values, as defined in the paper. They should be in the same order as the data frame containing the original problem data.
Description
getDits returns a vector of Dit values, as defined in the paper. They should be in the same order as the data frame containing the original problem data.
Usage
getDits(matched_sets, data)
Arguments
matched_sets |
matched.set object |
data |
data.frame object |
Value
vector of Dits, as described in Imai et al. (2023)
getWits returns a vector of Wits, as defined in the paper (equation 25 or equation 23). They should be in the same order as the data frame containing the original problem data. The pts, pcs, and getWits functions act for a specific lead. So, for instance, if our lead window is 0,1,2,3,4, these functions must be called once for each of those values: for 0, then for 1, and so on.
Description
getWits returns a vector of Wits, as defined in the paper (equation 25 or equation 23). They should be in the same order as the data frame containing the original problem data. The pts, pcs, and getWits functions act for a specific lead. So, for instance, if our lead window is 0,1,2,3,4, these functions must be called once for each of those values: for 0, then for 1, and so on.
Usage
getWits(matched_sets, lead, data, estimation.method = "bootstrap")
Arguments
matched_sets |
matched.set object |
lead |
integer providing a specific lead value |
data |
data.frame object |
estimation.method |
method of estimation for calculating standard errors. |
Value
data.table of Wits, as described above
Calculate covariate balance measures for refined and unrefined matched sets
Description
Calculate covariate balance for user specified covariates across matched sets. Balance is assessed by taking the average of the difference between the values of the specified covariates for the treated unit(s) and the weighted average of the control units across all matched sets. Results are standardized and are expressed in standard deviations. Balance is calculated for each period in the specified lag window.
Usage
get_covariate_balance(..., panel.data, covariates, include.unrefined = TRUE)
Arguments
... |
one or more PanelMatch objects |
panel.data |
|
covariates |
a character vector, specifying the names of the covariates for which the user is interested in calculating balance. |
include.unrefined |
logical. Indicates whether or not covariate balance measures for unrefined matched sets should be included. If TRUE, the function will return covariate balance results for the PanelMatch configurations provided, as well as a set of balance results that assume all matched controls have equal weight (i.e., the matched sets are unrefined). These results are included in addition to whatever PanelMatch configurations are specified to the function. Note that if you provide a PanelMatch object where no refinement is applied (that is, where |
Value
A list of matrices, or a list of lists (if the QOI is ATE). The matrices contain the calculated covariate balance levels for each specified covariate for each period. Each element in the list (whether that be a matrix or a sublist) corresponds to a PanelMatch configuration specified to the function. Results are returned in the order they were provided. Unrefined results are stored as a parallel list object in an attribute called "unrefined.balance.results".
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
# add some additional data to data set for demonstration purposes
dem.sub$rdata <- runif(nrow(dem.sub))
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.match",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb + rdata,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
get_covariate_balance(PM.results, panel.data = dem.sub.panel,
                      covariates = c("tradewb", "rdata"))
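The balance calculation described in this entry can be sketched by hand for a single covariate and a single pre-treatment period. The numbers below are made up, and the choice to standardize by the standard deviation of the covariate across treated observations is an assumption of this sketch rather than a statement about the package's code.

```r
# Illustrative inputs for one covariate at one pre-treatment period.
# Each matched set: the treated unit's value, the controls' values, and control weights.
sets <- list(
  list(treated = 2.0, controls = c(1.0, 3.0, 2.5), weights = c(0.5, 0.5, 0.0)),
  list(treated = 4.0, controls = c(3.0, 5.0),      weights = c(0.25, 0.75))
)

# Difference between the treated value and the weighted control average, per set
diffs <- vapply(sets, function(s) {
  s$treated - sum(s$weights * s$controls)
}, numeric(1))

# Assumption: standardize by the SD of the covariate across treated observations
treated.sd <- sd(vapply(sets, function(s) s$treated, numeric(1)))

# Average standardized difference across matched sets
balance <- mean(diffs / treated.sd)
balance
```

Values close to zero indicate that the weighted controls resemble the treated units on this covariate in this period.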
Calculate matched set level treatment effects
Description
Calculate the size of treatment effects for each matched set.
Usage
get_set_treatment_effects(pm.obj, panel.data, lead)
Arguments
pm.obj |
an object of class |
panel.data |
|
lead |
integer (or integer vector) indicating the time period(s) in the future for which the treatment effect size will be calculated. Calculations will be made for the period t + lead, where t is the time of treatment. If more than one lead value is provided, then calculations will be performed for each value. |
Value
a list equal in length to the number of lead periods specified to the lead argument. Each element in the list is a vector of the matched set level effect estimates.
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.match",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
set.effects <- get_set_treatment_effects(pm.obj = PM.results,
                                         panel.data = dem.sub.panel, lead = 0)
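For intuition, a set-level effect can be thought of as a difference-in-differences within one matched set: the treated unit's change in outcome from t - 1 to t + lead, minus the weighted average change among its matched controls. The sketch below uses made-up numbers and a hypothetical helper, did_sketch, and assumes this difference-in-differences form; it does not call the package.

```r
# Illustrative outcomes for one matched set, treated at time t
treated.y  <- c(pre = 10, post = 14)            # outcome at t - 1 and at t + lead
controls.y <- rbind(c(pre = 9,  post = 10),
                    c(pre = 11, post = 13))
w <- c(0.5, 0.5)                                # refinement weights, summing to 1

# Hypothetical helper: treated change minus weighted control change
did_sketch <- function(treated.y, controls.y, w) {
  treated.change <- treated.y[["post"]] - treated.y[["pre"]]
  control.change <- controls.y[, "post"] - controls.y[, "pre"]
  treated.change - sum(w * control.change)
}

set.effect <- did_sketch(treated.y, controls.y, w)
set.effect
```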
Extract just the unrefined covariate balance results, if they exist
Description
Extract just the unrefined covariate balance results, if they exist
Usage
get_unrefined_balance(pb.object)
Arguments
pb.object |
|
Extract unrefined covariate balance results, if they exist
Description
Extract unrefined covariate balance results, if they exist
Usage
## S3 method for class 'PanelBalance'
get_unrefined_balance(pb.object)
Arguments
pb.object |
|
Value
A PanelBalance object, with just the unrefined balance results
Examples
dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
# create multiple configurations to compare
pm2 <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "ps.match",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
pb <- get_covariate_balance(pm.obj, pm2,
include.unrefined = TRUE,
panel.data = dem.panel,
covariates = c("tradewb", "rdata"))
get_unrefined_balance(pb)
handle_bootstrap
Description
Helper function for calculating bootstrapped estimates for the QOI. This version is not parallelized.
Usage
handle_bootstrap(
qoi.in,
data.in,
lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
lag,
pooled
)
Arguments
qoi.in |
String specifying qoi |
data.in |
data.frame object with the data |
number.iterations |
integer. Specifies number of bootstrap iterations |
att.treated.unit.ids |
Integer vector specifying the treated units for the att or art |
atc.treated.unit.ids |
Integer vector specifying the "treated" units under the atc definition |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable |
confidence.level |
double. specifies confidence level for confidence interval |
lag |
integer vector specifying size of the lag. |
pooled |
logical. Specifies whether or not to calculate point estimates for each specified lead value, or a single pooled estimate. |
Value
Returns a matrix of bootstrapped QOI estimate values.
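The general idea of bootstrapping at the unit level can be sketched in a few lines of base R. This is a generic illustration with made-up per-unit values and a simple mean as the estimate; it reuses the argument names number.iterations and confidence.level only for readability and does not reproduce the helper's actual resampling scheme or return format.

```r
set.seed(1)
# Illustrative per-unit contributions to an estimate (hypothetical numbers)
unit.values <- c(a = 0.4, b = -0.1, c = 0.7, d = 0.2, e = 0.3)

number.iterations <- 500
confidence.level <- 0.95

# Resample units with replacement and recompute the estimate each time
boot.estimates <- replicate(number.iterations, {
  resampled <- sample(unit.values, size = length(unit.values), replace = TRUE)
  mean(resampled)
})

# Percentile interval at the requested confidence level
alpha <- 1 - confidence.level
ci <- quantile(boot.estimates, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

Resampling whole units, rather than individual rows, keeps each unit's observations together across time.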
handle_bootstrap_parallel
Description
Helper function for calculating bootstrapped estimates for the QOI. This version is parallelized.
Usage
handle_bootstrap_parallel(
qoi.in,
data.in,
lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
lag,
pooled,
num.cores = 1
)
Arguments
qoi.in |
String specifying qoi |
data.in |
data.frame object with the data |
number.iterations |
integer. Specifies number of bootstrap iterations |
att.treated.unit.ids |
Integer vector specifying the treated units for the att or art |
atc.treated.unit.ids |
Integer vector specifying the "treated" units under the atc definition |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable |
confidence.level |
double. specifies confidence level for confidence interval |
lag |
integer vector specifying size of the lag. |
pooled |
logical. Specifies whether or not to calculate point estimates for each specified lead value, or a single pooled estimate. |
num.cores |
number of cores to be used for parallelization |
Value
Returns a matrix of bootstrapped QOI estimate values.
handle_bootstrap_placebo
Description
Helper function for calculating bootstrapped estimates for the placebo test. This version is not parallelized.
Usage
handle_bootstrap_placebo(
qoi.in,
data.in,
placebo.lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
lag
)
Arguments
qoi.in |
String specifying qoi |
data.in |
data.frame object with the data |
number.iterations |
integer. specifies number of bootstrap iterations |
att.treated.unit.ids |
Integer vector specifying the treated units for the att or art |
atc.treated.unit.ids |
Integer vector specifying the "treated" units under the atc definition |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable |
confidence.level |
double. specifies confidence level for confidence interval |
lag |
integer vector specifying size of the lag. |
Value
Returns a matrix of bootstrapped QOI estimate values.
handle_bootstrap_placebo_parallel
Description
Helper function for calculating bootstrapped estimates for the placebo test. This version is parallelized.
Usage
handle_bootstrap_placebo_parallel(
qoi.in,
data.in,
placebo.lead,
number.iterations,
att.treated.unit.ids,
atc.treated.unit.ids,
outcome.variable,
unit.id.variable,
confidence.level,
lag,
num.cores = 1
)
Arguments
qoi.in |
String specifying qoi |
data.in |
data.frame object with the data |
number.iterations |
integer. Specifies number of bootstrap iterations |
att.treated.unit.ids |
Integer vector specifying the treated units for the att or art |
atc.treated.unit.ids |
Integer vector specifying the "treated" units under the atc definition |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable |
confidence.level |
double. specifies confidence level for confidence interval |
lag |
integer vector specifying size of the lag. |
num.cores |
number of cores to be used for parallelization |
Value
Returns a matrix of bootstrapped QOI estimate values.
handle_conditional_se Calculates conditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.
Description
handle_conditional_se Calculates conditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.
Usage
handle_conditional_se(
qoi.in,
data.in,
lead,
outcome.variable,
unit.id.variable
)
Arguments
qoi.in |
string specifying the QOI |
data.in |
data.frame specifying the data |
lead |
See PanelMatch() documentation |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable. |
Value
Named vector with standard error estimates
handle_mahalanobis_calculations Returns a matched.set object with weights for control units, along with some other metadata
Description
handle_mahalanobis_calculations Returns a matched.set object with weights for control units, along with some other metadata
Usage
handle_mahalanobis_calculations(
mahal.nested.list,
msets,
max.size,
verbose,
use.diagonal.covmat
)
Arguments
mahal.nested.list |
Output from build_maha_mats function |
msets |
matched.set object – list containing the treated observations and matched controls |
max.size |
maximum number of control units that will receive non-zero weights within a matched set |
verbose |
Logical. See PanelMatch() documentation |
use.diagonal.covmat |
Logical. See PanelMatch() documentation |
Value
matched.set object with weights for control units, along with some other metadata
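A stripped-down version of distance-based refinement can be sketched with stats::mahalanobis. The covariate values are made up, only a single time period is considered, and the use of the controls' covariance matrix and of equal weights for the retained controls are assumptions of this sketch, not a description of the package's internals.

```r
# Illustrative covariates: first row is the treated unit, the rest are matched controls
X <- rbind(
  treated = c(1.0, 2.0),
  c1      = c(1.1, 2.1),
  c2      = c(3.0, 0.5),
  c3      = c(0.8, 1.9),
  c4      = c(2.0, 3.5)
)
max.size <- 2

controls <- X[-1, , drop = FALSE]

# Squared Mahalanobis distance of each control from the treated unit,
# using the covariance of the controls (an assumption for this sketch)
d2 <- stats::mahalanobis(controls, center = X["treated", ], cov = stats::cov(controls))

# Keep the max.size closest controls and weight them equally; the rest get zero
weights <- setNames(numeric(nrow(controls)), rownames(controls))
closest <- names(sort(d2))[seq_len(min(max.size, length(d2)))]
weights[closest] <- 1 / length(closest)
weights
```

At most max.size controls receive non-zero weight, which mirrors the role of the max.size argument described above.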
handle_missing_data
Description
Uses col.index to determine which columns to scan for missing data. Earlier in the code, the columns are rearranged so that columns 1-4 are bookkeeping (unit id, time id, treated variable, unlagged outcome variable) and all remaining columns, produced by the parse_and_prep function, are used in the calculations, so col.index should usually be 5:ncol(data). In practice, this function looks over the specified columns of the "data" data frame for missing data and creates indicator columns describing the missingness of those variables: 1 for missing data, 0 for present.
Usage
handle_missing_data(data, col.index)
Arguments
data |
data.frame object. |
col.index |
numeric vector specifying which columns to inspect |
Details
Tags missing data
Value
data.frame object with the data and the missingness indicators described above.
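The tagging step can be illustrated with a small base R sketch. The data are made up, and the "_NA" suffix used to name the indicator columns is a hypothetical naming scheme chosen for this example.

```r
# Illustrative data: four bookkeeping columns followed by two covariates
data <- data.frame(
  id = c(1, 1, 2, 2), time = c(1, 2, 1, 2),
  treat = c(0, 1, 0, 0), y = c(1.2, 1.5, 0.9, NA),
  x1 = c(0.3, NA, 0.1, 0.4), x2 = c(NA, 2, 3, 4)
)
col.index <- 5:ncol(data)

# 1 where the value is missing, 0 where it is present
indicators <- as.data.frame(lapply(data[col.index], function(v) as.integer(is.na(v))))
names(indicators) <- paste0(names(data)[col.index], "_NA")  # hypothetical naming scheme

tagged <- cbind(data, indicators)
tagged
```

The missing outcome value in column y is not tagged here because column 4 lies outside col.index.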
handle_moderating_variable
Description
Handles moderating variable calculations. In practice, this just involves slicing the data up according to the moderator, calling PanelEstimate(), and putting everything back together. This function creates the sets of objects on which PanelEstimate() will be called. It identifies the set of valid values the moderating variable can take on.
Usage
handle_moderating_variable(
ordered.data,
att.sets,
atc.sets,
PM.object,
moderator,
unit.id,
time.id,
qoi.in
)
Arguments
ordered.data |
data.frame |
att.sets |
matched.set object for the ATT or ART |
atc.sets |
matched.set object for the ATC |
PM.object |
PanelMatch object |
moderator |
string specifying the name of the moderating variable |
unit.id |
string specifying the unit id variable |
time.id |
string specifying the time id variable |
qoi.in |
string specifying the QOI |
Value
Character vector of valid moderating variable values
handle_ps_match Returns a matched.set object with weights for control units, along with some other metadata
Description
handle_ps_match Returns a matched.set object with weights for control units, along with some other metadata
Usage
handle_ps_match(just.ps.sets, msets, refinement.method, verbose, max.set.size)
Arguments
just.ps.sets |
Output from find_ps() function |
msets |
matched.set object – list containing the treated observations and matched controls |
verbose |
Logical. See PanelMatch() documentation |
max.set.size |
maximum number of control units that will receive non-zero weights within a matched set |
Value
matched.set object with weights for control units, along with some other metadata
handle_ps_weighted
Description
handle_ps_weighted
Usage
handle_ps_weighted(just.ps.sets, msets, refinement.method)
Arguments
just.ps.sets |
results of find_ps() |
msets |
list of matched sets of treated and control observations |
refinement.method |
string specifying the refinement method |
Value
matched.set object with treated and matched control observations, with weights as determined by the specification
handle_unconditional_se Calculates unconditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.
Description
handle_unconditional_se Calculates unconditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.
Usage
handle_unconditional_se(
qoi.in,
data.in,
lead,
outcome.variable,
unit.id.variable
)
Arguments
qoi.in |
string specifying the QOI |
data.in |
data.frame specifying the data |
lead |
See PanelMatch() documentation |
outcome.variable |
string specifying the name of the outcome variable |
unit.id.variable |
string specifying the name of the unit id variable. |
Value
Named vector with standard error estimates
identifyDirectionalChanges Identifies changes in treatment variable for treated and control observations
Description
identifyDirectionalChanges Identifies changes in treatment variable for treated and control observations
Usage
identifyDirectionalChanges(
msets,
ordered.data,
id.var,
time.var,
treatment.var,
qoi
)
Arguments
msets |
|
ordered.data |
|
id.var |
|
time.var |
|
treatment.var |
|
qoi |
Value
matched.set object with changes in the treatment variable for treated and control observations identified.
lwd_refinement master function that performs refinement with listwise deletion = TRUE
Description
lwd_refinement master function that performs refinement with listwise deletion = TRUE
Usage
lwd_refinement(
msets,
global.data,
treated.ts,
treated.ids,
lag,
time.id,
unit.id,
lead,
refinement.method,
treatment,
size.match,
match.missing,
covs.formula,
verbose,
outcome.var,
e.sets,
use.diag.covmat
)
Arguments
msets |
|
global.data |
data.frame. Must be a fully prepped/parsed data set that is internally balanced. It will likely contain many NAs. |
treated.ts |
vector of the times of treatment for treated observations |
treated.ids |
vector of unit identifiers of treated observations |
lag |
|
time.id |
string specifying |
unit.id |
|
lead |
vector of lead values |
refinement.method |
string specifying refinement method |
treatment |
string specifying treatment variable |
size.match |
maximum number of units to give non-zero weight to when using matching refinement method |
match.missing |
logical. indicates whether or not to allow the package to match units on missingness in treatment history |
covs.formula |
see PanelMatch documentation for descriptions |
verbose |
see PanelMatch documentation for descriptions |
outcome.var |
string specifying outcome variable |
e.sets |
empty sets (treated observations with no matched controls) |
use.diag.covmat |
see PanelMatch documentation for descriptions |
Value
matched.set object with refined matched sets.
lwd_units helper function that actually subsets sets down to contain units with complete data
Description
lwd_units helper function that actually subsets sets down to contain units with complete data
Usage
lwd_units(full.local.data, unit.id)
Arguments
full.local.data |
data.frame containing the data to be used in set-level refinement, but containing missing data |
unit.id |
Value
data.frame with the missing data removed to be used for set-level refinement.
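Listwise deletion at the unit level can be sketched as follows. The data are made up, and the sketch assumes that a unit is dropped entirely if any of its rows contains a missing value, which is one reading of "units with complete data".

```r
# Illustrative set-level data with one missing covariate value for unit 2
full.local.data <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  x1 = c(0.5, 0.6, NA, 0.2, 0.9, 1.0),
  x2 = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
)
unit.id <- "id"

# Units with at least one incomplete row
incomplete.rows <- !stats::complete.cases(full.local.data)
incomplete.units <- unique(full.local.data[incomplete.rows, unit.id])

# Drop every row belonging to those units
complete.data <- full.local.data[!full.local.data[[unit.id]] %in% incomplete.units, ,
                                 drop = FALSE]
complete.data
```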
A constructor for the matched.set class.
Description
Users should never need to use this function by itself. See below for more about matched.set objects.
Usage
matched_set(matchedsets, id, t, L, t.var, id.var, treatment.var)
Arguments
matchedsets |
a list of treated units and matched control units. Each element in the list should be a vector of control unit ids. |
id |
A vector containing the ids of treated units |
t |
A vector containing the times of treatment for treated units. |
L |
integer specifying the length of the lag window used in matching |
t.var |
string specifying the time variable |
id.var |
string specifying the unit id variable |
treatment.var |
string specifying the treatment variable. The constructor function returns a |
Value
matched.set objects have additional attributes. These reflect the specified parameters when using the PanelMatch function:
lag |
an integer value indicating the length of treatment history to be used for matching. Treated and control units are matched based on whether or not they have exactly matching treatment histories in the lag window. |
t.var |
time variable name, represented as a character/string |
id.var |
unit id variable name, represented as a character/string |
treatment.var |
treatment variable name, represented as a character/string |
class |
class of the object: should always be "matched.set" |
refinement.method |
method used to refine and/or weight the control units in each set. |
covs.formula |
One sided formula indicating which variables should be used for matching and refinement |
match.missing |
Logical variable indicating whether or not units should be matched on the patterns of missingness in their treatment histories |
max.match.size |
Maximum size of the matched sets after refinement. This argument only affects results when using a matching method |
Author(s)
Adam Rauh <amrauh@umich.edu>, In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, and Kosuke Imai <imai@harvard.edu>
merge_formula
Description
Simple helper function for merging formula objects
Usage
merge_formula(form1, form2)
Arguments
form1 |
formula object |
form2 |
formula object |
Value
Returns a formula object, which is the concatenation of two provided formula objects.
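One way to concatenate two one-sided formulas in base R is to collect their term labels and rebuild a formula with stats::reformulate. The helper merge_formula_sketch below is a hypothetical illustration of that approach and may differ from how the package does it.

```r
form1 <- ~ tradewb + I(lag(y, 1:4))
form2 <- ~ rdata

# Hypothetical helper: combine the right-hand-side terms of two one-sided formulas
merge_formula_sketch <- function(form1, form2) {
  rhs <- c(attr(stats::terms(form1), "term.labels"),
           attr(stats::terms(form2), "term.labels"))
  # unique() avoids repeating a term that appears in both formulas
  stats::reformulate(unique(rhs))
}

merged <- merge_formula_sketch(form1, form2)
merged
```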
parse_and_prep
Description
accepts formula object and data, creates the data used for refinement
Usage
parse_and_prep(formula, data)
Arguments
formula |
formula object specifying how to construct the data used for refinement. This is likely to be some variation of the covs.formula argument. |
data |
data.frame object to be used to create the data needed for refinement. data has unit, time, treatment columns in that order, followed by everything else |
Value
data.frame object with the data prepared for refinement. Data will have unit, time, treatment columns in that order, followed by everything else.
Prepare Control Units pcs and pts create data frames with the time/id combinations that need to be found, so that they can be easily looked up in the data frame via a hash table. The data frame also contains information about the weight of that unit at particular times, so we use the hash table to look up where to put this data so that we can easily assign the appropriate weights in the original data frame containing the problem data. pcs does this for all control units in a matched set. pts does this for all treated units.
Description
Prepare Control Units pcs and pts create data frames with the time/id combinations that need to be found, so that they can be easily looked up in the data frame via a hash table. The data frame also contains information about the weight of that unit at particular times, so we use the hash table to look up where to put this data so that we can easily assign the appropriate weights in the original data frame containing the problem data. pcs does this for all control units in a matched set. pts does this for all treated units.
Usage
pcs(sets, lead.in)
Arguments
sets |
object describing the matched sets |
lead.in |
integer describing a particular lead value. |
Value
data.frame object with time-id combinations
perform_refinement Performs refinement of matched sets, ultimately returning sets of treated observations and controls with weights. This function mostly acts as an intermediary between PanelMatch and lower level functions that do the dirty work of refinement. The function takes a lot of the same arguments as PanelMatch()
Description
perform_refinement Performs refinement of matched sets, ultimately returning sets of treated observations and controls with weights. This function mostly acts as an intermediary between PanelMatch and lower level functions that do the dirty work of refinement. The function takes a lot of the same arguments as PanelMatch()
Usage
perform_refinement(
lag,
time.id,
unit.id,
treatment,
refinement.method,
size.match,
ordered.data,
match.missing,
covs.formula,
verbose,
lead,
outcome.var = NULL,
forbid.treatment.reversal = FALSE,
qoi = "",
matching = TRUE,
exact.matching.variables = NULL,
listwise.deletion,
use.diag.covmat = FALSE,
placebo.test = FALSE,
restrict.control.period = NULL
)
Arguments
lag |
See PanelMatch() documentation. |
time.id |
See PanelMatch() documentation. |
unit.id |
See PanelMatch() documentation. |
treatment |
See PanelMatch() documentation. |
refinement.method |
See PanelMatch() documentation. |
size.match |
See PanelMatch() documentation. |
ordered.data |
data.frame that has been balanced and ordered by time-unit. |
match.missing |
See PanelMatch() documentation. |
covs.formula |
See PanelMatch() documentation. |
verbose |
See PanelMatch() documentation. |
lead |
See PanelMatch() documentation. |
outcome.var |
See PanelMatch() documentation. |
forbid.treatment.reversal |
See PanelMatch() documentation. |
qoi |
See PanelMatch() documentation. |
matching |
See PanelMatch() documentation. |
exact.matching.variables |
|
listwise.deletion |
See PanelMatch() documentation. |
use.diag.covmat |
See PanelMatch() documentation for use.diagonal.covariance.matrix argument. |
placebo.test |
See PanelMatch() documentation. |
restrict.control.period |
See PanelMatch() documentation. |
Value
returns a matched.set object containing the refined matched sets
perunitSum This is a low level function that is used to calculate a value associated with each unit. This value is a weighted summation of the dependent variable, based on the Wit values discussed in Imai et al. (2023)
Description
perunitSum This is a low level function that is used to calculate a value associated with each unit. This value is a weighted summation of the dependent variable, based on the Wit values discussed in Imai et al. (2023)
Usage
perunitSum(udf, lead.in, dependent.in, qoi_in)
Arguments
udf |
data.frame |
lead.in |
integer. A particular lead value |
dependent.in |
string specifying the dependent variable name |
qoi_in |
string specifying the QOI |
Value
Named vector containing the per-unit sums.
perunitSum_Dit Similar to perunitSum, this is a low level helper function for calculating specific values defined in Imai et al. (2023). This focuses on Dit rather than Wit
Description
perunitSum_Dit Similar to perunitSum, this is a low level helper function for calculating specific values defined in Imai et al. (2023). This focuses on Dit rather than Wit
Usage
perunitSum_Dit(udf, qoi_in)
Arguments
udf |
data.frame |
qoi_in |
string specifying the QOI |
Value
Named vector containing the per-unit sums.
Conduct a placebo test
Description
Calculate the results of a placebo test, looking at the change in outcome at time = t-1, compared to other pre-treatment periods in the lag window.
Usage
placebo_test(
pm.obj,
panel.data,
lag.in = NULL,
number.iterations = 1000,
confidence.level = 0.95,
plot = FALSE,
se.method = "bootstrap",
parallel = FALSE,
num.cores = 1,
...
)
Arguments
pm.obj |
an object of class |
panel.data |
|
lag.in |
integer indicating the earliest time period(s) in the future for which the placebo test change in outcome will be calculated. Calculations will be made over the period t - max(lag) to t-2, where t is the time of treatment. The results are similar to those returned by |
number.iterations |
integer specifying the number of bootstrap iterations. This argument only has an effect if standard errors are calculated with the bootstrap. |
confidence.level |
confidence level for the calculated standard error intervals. Should be specified as a numeric between 0 and 1. |
plot |
logical indicating whether or not a plot should be generated, or just return the raw data from the calculations |
se.method |
character string describing the type of standard error to be used. Valid inputs include "bootstrap", "conditional" and "unconditional". When the QOI is ATE, only bootstrap can be used. See the documentation of this argument in |
parallel |
Logical. If TRUE and |
num.cores |
Integer. Specifies the number of cores to use for parallelization. If |
... |
extra arguments to be passed to |
Value
list with 2 or 3 elements: "estimate", which contains the point estimates for the test, "standard.errors" which has the standard errors for each period and optionally "bootstrapped.estimates", containing the bootstrapped point estimates for the test for each specified lag window period.
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE, placebo.test = TRUE)
placebo_test(PM.results, panel.data = dem.sub.panel, se.method = "unconditional", plot = FALSE)
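To make the quantity being tested more concrete, here is a minimal base R sketch for a single treated observation. It differences each earlier pre-treatment outcome against t-1, then subtracts the weighted change among matched controls. All values, the weights `w`, and the sign convention are assumptions made for illustration; the package averages over all treated observations and its internal computation may differ.

```r
# Toy outcomes for one treated unit and three matched controls.
# Columns are pre-treatment periods t-4, ..., t-1 (all values are made up).
periods <- c("t-4", "t-3", "t-2", "t-1")
treated <- c(1.0, 1.2, 1.1, 1.3)
names(treated) <- periods
controls <- rbind(c(0.9, 1.0, 1.1, 1.2),
                  c(1.1, 1.1, 1.0, 1.4),
                  c(1.0, 1.3, 1.2, 1.2))
colnames(controls) <- periods
w <- c(0.5, 0.25, 0.25)  # hypothetical refinement weights, summing to 1

# For each period t-4, ..., t-2, difference the outcome against t-1,
# then subtract the weighted control change from the treated change.
placebo.periods <- c("t-4", "t-3", "t-2")
treated.change <- treated[placebo.periods] - treated["t-1"]
control.change <- as.vector(w %*% (controls[, placebo.periods] - controls[, "t-1"]))
placebo.est <- treated.change - control.change
placebo.est
```

Values close to zero in every period are what one would hope to see if treated and matched control units followed similar pre-treatment trends.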
Plot covariate balance results
Description
Plot covariate balance results
Create figures displaying covariate balance results for one or more PanelMatch
configurations. Users can customize these visualizations.
Usage
## S3 method for class 'PanelBalance'
plot(
x,
...,
type = "panel",
reference.line = TRUE,
legend = TRUE,
ylab = NULL,
include.treatment.period = TRUE,
include.unrefined.panel = TRUE,
legend.position = "topleft",
main = NULL,
main.unrefined = NULL
)
Arguments
x |
|
... |
additional parameters to be passed to |
type |
character specifying which type of plot to produce. Can be either "panel" or "scatter". When "panel," covariate balance results for covariates are shown over the lag period. When "scatter," the figure has the following characteristics. Each point on the plot represents a specific covariate at a particular time period in the lag window from t-L to t-1. The horizontal axis represents the covariate balance for this particular variable and time period before refinement is applied, while the vertical axis represents the post-refinement balance value. |
reference.line |
logical. Include a reference line at y = 0? Only applicable to the panel plot. |
legend |
logical. Describes whether or not to include a legend. |
ylab |
character. Y-axis label. |
include.treatment.period |
Logical. Describes whether or not the treatment period should be included on the panel plot. Default is TRUE. |
include.unrefined.panel |
logical indicating whether or not unrefined balance plots should be returned for panel plot. Only applicable to panel plot. Default is TRUE. |
legend.position |
character. Describes where the legend should be placed on the figure. Uses base R syntax. |
main |
character. Either a single title to be used for all plots or a character vector providing a name for each figure, which should be the same length as the number of 'PanelMatch' configurations in the 'PanelBalance' object. By default, main is set to NULL and figures are titled the same as the 'PanelMatch' objects the figures are based on. |
main.unrefined |
character. This argument is the same as main, but applies to the set of figures corresponding to the unrefined covariate balance results. This is only used when applicable – otherwise it has no effect. |
Value
Returns a set of base R plots, depending on the specification of "panel" or "scatter" above. When type = "panel" and include.unrefined.panel = TRUE, two sets of plots are returned. The first set shows covariate balance levels for the specified PanelMatch configurations. The second set shows covariate balance levels for the same PanelMatch configurations, but with all control units receiving equal weight (i.e., balance levels prior to refinement). If include.unrefined.panel = FALSE, only the first set of figures is returned. Both sets of figures are returned in the same order as the PanelMatch configurations passed to get_covariate_balance() to build the PanelBalance object. When type = "scatter", the visualization described above is produced, with all configurations shown on the same plot with different symbols.
Examples
dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
# create multiple configurations to compare
pm2 <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "ps.match",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
pb <- get_covariate_balance(pm.obj, pm2,
include.unrefined = TRUE,
panel.data = dem.panel,
covariates = c("tradewb", "rdata"))
plot(pb, type = "panel", include.unrefined.panel = TRUE)
plot(pb, type = "scatter")
# only show refined balance figures
plot(pb, type = "panel", include.unrefined.panel = FALSE)
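The layout of the "scatter" type can be mimicked with base R alone. The sketch below uses made-up balance values (the matrices `unrefined` and `refined` are hypothetical, not package output) and places one point per covariate-period, with pre-refinement balance on the horizontal axis and post-refinement balance on the vertical axis. The two reference lines are additions for this illustration and are not claimed to match the package's figure.

```r
# Hypothetical balance values: rows are periods t-4..t-1, columns are covariates.
unrefined <- matrix(c(0.40, 0.35, 0.30, 0.28,
                      -0.20, -0.25, -0.22, -0.18),
                    nrow = 4,
                    dimnames = list(paste0("t-", 4:1), c("tradewb", "rdata")))
refined <- unrefined * 0.4  # pretend refinement shrinks every imbalance

# One point per covariate-period: x = before refinement, y = after.
plot(as.vector(unrefined), as.vector(refined),
     xlab = "Balance before refinement",
     ylab = "Balance after refinement",
     pch = rep(1:2, each = nrow(unrefined)))  # one symbol per covariate
abline(h = 0, lty = 2)         # perfect balance after refinement
abline(a = 0, b = 1, lty = 3)  # no change from refinement
```

Points lying between the horizontal line and the 45-degree line indicate covariate-periods where refinement reduced imbalance.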
Create basic plots of PanelData objects
Description
Create basic plots of PanelData objects
Usage
## S3 method for class 'PanelData'
plot(x, ..., plotting.variable = NA)
Arguments
x |
|
... |
Not used |
plotting.variable |
character string specifying which variable to plot in the resulting figure. The values of this variable will be used to fill in cells on the resulting heatmap. Defaults to whatever is specified as the treatment variable. |
Value
Returns a ggplot2 object created by geom_tile(). The basic figure shows units along the y-axis and time along the x-axis. The figure takes the form of a heatmap. The value of the plotting.variable argument is used to fill in the color of the cells.
Examples
dem$rdata <- rnorm(nrow(dem))
d <- PanelData(dem, "wbcode2", "year", "dem", "y")
plot(d)
plot(d, plotting.variable = "rdata")
Plot point estimates and standard errors from a PanelEstimate calculation.
Description
The plot.PanelEstimate method takes an object returned by the PanelEstimate function and plots the calculated point estimates and standard errors over the specified lead time period. The only mandatory argument is an object of the PanelEstimate class.
Usage
## S3 method for class 'PanelEstimate'
plot(
x,
ylab = "Estimated Effect of Treatment",
xlab = "Time",
main = "Estimated Effects of Treatment Over Time",
ylim = NULL,
pch = NULL,
cex = NULL,
confidence.level = NULL,
bias.corrected = FALSE,
...
)
Arguments
x |
a |
ylab |
default is "Estimated Effect of Treatment". This is the same argument as the standard argument for |
xlab |
default is "Time". This is the same argument as the standard argument for |
main |
default is "Estimated Effects of Treatment Over Time". This is the same argument as the standard argument for |
ylim |
default is NULL. This is the same argument as the standard argument for |
pch |
default is NULL. This is the same argument as the standard argument for |
cex |
default is NULL. This is the same argument as the standard argument for |
confidence.level |
Confidence level to be used for confidence interval calculations. Must be numeric between 0 and 1. If NULL, confidence level from |
bias.corrected |
logical indicating whether or not bias corrected estimates should be plotted. Default is FALSE. This argument only applies for standard errors calculated with the bootstrap. |
... |
Additional optional arguments to be passed to |
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
PE.results <- PanelEstimate(sets = PM.results,
panel.data = dem.sub.panel,
se.method = "unconditional")
plot(PE.results)
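The bias.corrected argument refers to a bootstrap-based adjustment of the point estimates. As a hedged illustration, the base R sketch below applies the textbook bootstrap bias correction to simulated numbers. `point.est` and `boot.ests` are hypothetical, and whether the package uses exactly this formula is an assumption here.

```r
set.seed(1)
point.est <- 0.8                                # hypothetical point estimate
boot.ests <- rnorm(1000, mean = 0.9, sd = 0.2)  # hypothetical bootstrap draws

# Estimated bias is the gap between the bootstrap mean and the point estimate;
# subtracting it gives 2 * estimate - mean(bootstrap estimates).
bias <- mean(boot.ests) - point.est
bias.corrected.est <- point.est - bias
bias.corrected.est
```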
Plot the distribution of the sizes of matched sets.
Description
A plot method for creating a histogram of the distribution of the sizes of matched sets. This method accepts all standard optional hist arguments via the ... argument. By default, empty matched sets (treated units that could not be matched with any control units) are noted as a vertical bar at x = 0 and not included in the regular histogram. See the include.empty.sets argument for more information about this. If the quantity of interest is ATE, a plot will be returned for the matched sets associated with the att and the atc.
Usage
## S3 method for class 'PanelMatch'
plot(
x,
...,
border = NA,
col = "grey",
ylab = "Frequency of Size",
xlab = "Matched Set Size",
lwd = NULL,
main = "Distribution of Matched Set Sizes",
freq = TRUE,
include.empty.sets = FALSE
)
Arguments
x |
a |
... |
optional arguments to be passed to |
border |
default is NA. This is the same argument as the standard argument for |
col |
default is "grey". This is the same argument as the standard argument for |
ylab |
default is "Frequency of Size". This is the same argument as the standard argument for |
xlab |
default is "Matched Set Size". This is the same argument as the standard argument for |
lwd |
default is NULL. This is the same argument as the standard argument for |
main |
default is "Distribution of Matched Set Sizes". This is the same argument as the standard argument for |
freq |
default is TRUE. See |
include.empty.sets |
logical value indicating whether or not empty sets should be included in the histogram. default is FALSE. If FALSE, then empty sets will be noted as a separate vertical bar at x = 0. If TRUE, empty sets will be included as normal sets. |
Examples
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel,
lag = 4,
refinement.method = "mahalanobis",
match.missing = TRUE,
covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att",
lead = 0:4, forbid.treatment.reversal = FALSE)
plot(PM.results)
plot(PM.results, include.empty.sets = TRUE)
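The default treatment of empty sets can be sketched in base R. The vector `set.sizes` below is made up, and the drawing choices (bin breaks, bar width) are assumptions for illustration rather than the package's own code.

```r
# Hypothetical matched set sizes; zeros are treated observations with no controls.
set.sizes <- c(0, 0, 3, 5, 5, 4, 5, 2, 5, 0, 1, 5)

nonempty <- set.sizes[set.sizes > 0]
n.empty  <- sum(set.sizes == 0)

# Histogram of non-empty sets only, leaving room at x = 0 for the empty-set bar.
hist(nonempty, breaks = seq(0, max(nonempty), by = 1),
     xlim = c(0, max(nonempty)), ylim = c(0, max(n.empty, table(nonempty))),
     col = "grey", border = NA,
     xlab = "Matched Set Size", ylab = "Frequency of Size",
     main = "Distribution of Matched Set Sizes")
# Mark the number of empty sets as a separate vertical bar at x = 0.
lines(x = c(0, 0), y = c(0, n.empty), lwd = 4)
```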
Plot the distribution of control unit weights
Description
The method creates a heatmap with the following characteristics. The heatmap grid is m x n, where m is the number of treated observations (as identified by i,t pairs) and n is the number of units. Treated observations form the rows, and every unit in the data set forms a column. The figure then shows the calculated weights or distances (as specified) for each control unit within the matched set identified by the row. Weights/distances that are missing or zero are not considered in the shading scheme and are both treated as NA for all practical purposes. Note that not all refinement methods will return a distance. Those that do also require verbose = TRUE in the PanelMatch specification.
For example, say (2, 5) is a treated observation and units 1, 4, and 8 are matched as controls. Row i will represent (2, 5) in the matrix, M. The columns indexed by w, x, and y correspond to units 1, 4, and 8. M[i, w], M[i, x], and M[i, y] then contain the weights or pairwise distances of units 1, 4, and 8 within that matched set.
Usage
## S3 method for class 'matched.set'
plot(
x,
...,
panel.data,
type = "weights",
include.missing = TRUE,
low.color = "blue",
mid.color = "white",
high.color = "red",
missing.color = "grey50"
)
Arguments
x |
a |
... |
Not used |
panel.data |
a |
type |
character indicating whether weights or distances should be plotted |
include.missing |
logical. When TRUE, all units appear as columns, including those that are never included in any matched sets. When FALSE, only units that appear in at least one matched set are included. |
low.color |
option passed to |
mid.color |
option passed to |
high.color |
option passed to |
missing.color |
option passed to |
Value
Returns a ggplot2::geom_tile() object producing a plot in alignment with the description above.
Examples
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
mso <- extract(PM.results)
plot(mso, panel.data = dem.panel)
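The matrix M from the Description can be built by hand for the (2, 5) example. In this base R sketch the unit ids, the weights, and the row label "2.5" are all hypothetical; the code only illustrates the grid layout, not the package's plotting internals.

```r
units <- 1:10   # hypothetical unit ids; these form the columns of the grid
# One matched set, for treated observation (unit 2, time 5), with control weights.
matched <- list("2.5" = c("1" = 0.5, "4" = 0.3, "8" = 0.2))

# Rows are treated observations, columns are all units; unmatched cells stay NA.
M <- matrix(NA_real_, nrow = length(matched), ncol = length(units),
            dimnames = list(names(matched), as.character(units)))
for (obs in names(matched)) {
  M[obs, names(matched[[obs]])] <- matched[[obs]]
}
M
```

Only the columns for units 1, 4, and 8 are filled in; every other cell stays NA, which corresponds to the cells drawn in the missing color.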
Helper function for plotting the distribution of matched set sizes
Description
Helper function for plotting the distribution of matched set sizes
Usage
plot_matched_set(
x,
border = NA,
col = "grey",
ylab = "Frequency of Size",
xlab = "Matched Set Size",
lwd = NULL,
main = "Distribution of Matched Set Sizes",
freq = TRUE,
include.empty.sets = FALSE,
...
)
Arguments
x |
a |
border |
default is NA. This is the same argument as the standard argument for |
col |
default is "grey". This is the same argument as the standard argument for |
ylab |
default is "Frequency of Size". This is the same argument as the standard argument for |
xlab |
default is "Matched Set Size". This is the same argument as the standard argument for |
lwd |
default is NULL. This is the same argument as the standard argument for |
main |
default is "Distribution of Matched Set Sizes". This is the same argument as the standard argument for |
freq |
default is TRUE. See |
include.empty.sets |
logical value indicating whether or not empty sets should be included in the histogram. default is FALSE. If FALSE, then empty sets will be noted as a separate vertical bar at x = 0. If TRUE, empty sets will be included as normal sets. |
... |
optional arguments to be passed to |
prepare_data The calculation of point estimates and standard errors first requires the calculation of a variety of different weights, parameters, and indicator variables. This function prepares the data within PanelEstimate() such that the estimates can be calculated easily. In practical terms, the function calls the lower level helpers to calculate W_its and D_its as described in Imai et al. (2023) and merges those results together with the original data to facilitate calculations.
Description
prepare_data The calculation of point estimates and standard errors first requires the calculation of a variety of different weights, parameters, and indicator variables. This function prepares the data within PanelEstimate() such that the estimates can be calculated easily. In practical terms, the function calls the lower level helpers to calculate W_its and D_its as described in Imai et al. (2023) and merges those results together with the original data to facilitate calculations.
Usage
prepare_data(
data.in,
lead,
sets.att = NULL,
sets.atc = NULL,
qoi.in,
dependent.variable
)
Arguments
data.in |
data.frame: the data to be used in the analysis |
lead |
See PanelMatch() documentation |
sets.att |
matched.set object containing ATT or ART matched sets. |
sets.atc |
matched.set object containing ATC matched sets. |
qoi.in |
See PanelMatch() documentation |
dependent.variable |
string specifying the outcome/dependent variable. |
Value
data.frame with the results of the lower level calculations
Print basic information about PanelBalance objects
Description
This function prints out covariate balance information for all of the PanelMatch configurations specified within a PanelBalance object. Specifically it prints out the name of the PanelMatch object(s), and covariate balance measures over the specified time period after refinement. If no refinement was applied, then these unrefined results will be shown.
Usage
## S3 method for class 'PanelBalance'
print(x, ...)
Arguments
x |
|
... |
Not used |
Value
Nothing
Examples
dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
pb <- get_covariate_balance(pm.obj,
include.unrefined = TRUE,
panel.data = dem.panel,
covariates = c("tradewb", "rdata"))
print(pb)
Print PanelData objects and basic metadata
Description
Print PanelData objects and basic metadata
Usage
## S3 method for class 'PanelData'
print(x, ..., n = 5, verbose = FALSE)
Arguments
x |
|
... |
additional arguments to be passed to |
n |
Integer. Number of rows to print by default for previewing data. Default is 5. |
verbose |
Logical. Print the entire data frame, rather than just a preview. Default is FALSE. |
Value
Returns nothing, but prints the PanelData object. This is a data frame that has been balanced, sorted, and tagged with important metadata to facilitate the use of other functions.
Examples
d <- PanelData(dem, "wbcode2", "year", "dem", "y")
print(d)
Print point estimates and standard errors
Description
Print point estimates and standard errors
Usage
## S3 method for class 'PanelEstimate'
print(x, ...)
Arguments
x |
|
... |
Not used |
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
PE.results <- PanelEstimate(sets = PM.results,
panel.data = dem.sub.panel,
se.method = "unconditional")
print(PE.results)
Print PanelMatch objects.
Description
Print PanelMatch objects.
Usage
## S3 method for class 'PanelMatch'
print(x, ..., verbose = FALSE, n = 5, show.all = FALSE)
Arguments
x |
a |
... |
additional arguments to be passed to |
verbose |
logical indicating whether or not underlying data should be printed in expanded/raw list form. The verbose form is not recommended unless the data set is small. Default is FALSE |
n |
Integer. Number of matched sets to display information about as a preview. Default is 5. |
show.all |
Logical. By default ('show.all = FALSE'), the print method only shows a small preview of the sizes of matched sets. When set to TRUE, a full summary description of matched set sizes is shown. |
Examples
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel,
lag = 4,
refinement.method = "mahalanobis",
match.missing = TRUE,
covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att",
lead = 0:4, forbid.treatment.reversal = FALSE)
print(PM.results)
Print matched.set objects.
Description
Print matched.set objects.
Usage
## S3 method for class 'matched.set'
print(x, ..., verbose = FALSE, n = 5, show.all = FALSE)
Arguments
x |
a |
... |
Not used. additional arguments to be passed to |
verbose |
logical indicating whether or not output should be printed in expanded/raw list form. The verbose form is not recommended unless the data set is small. Default is FALSE, which prints an overview of matched set sizes. |
n |
Integer. Number of matched sets to display information about as a preview. Default is 5. |
show.all |
Logical. By default ('show.all = FALSE'), the print method only shows a small preview of the sizes of matched sets. When set to TRUE, a full summary description of matched set sizes is shown. |
Value
Returns nothing, but prints information about matched sets: treated observation IDs, the time of treatment, and the size of matched sets.
Examples
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
# create subset of data for simplicity
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
print(extract(PM.results, qoi = "att"))
set_lwd_refinement Performs the set-level operations for refinement with listwise deletion. See documentation for lwd_refinement for descriptions of most parameters.
Description
set_lwd_refinement Performs the set-level operations for refinement with listwise deletion. See documentation for lwd_refinement for descriptions of most parameters.
Usage
set_lwd_refinement(
mset,
local.data,
time,
id,
lag,
refinement.method,
lead,
verbose,
size.match,
unit.id,
time.id,
covs.formula,
match.missing,
treatment,
use.diag.covmat
)
Arguments
mset |
individual matched set |
local.data |
data.frame containing the data relevant for set level refinement |
time |
time of treated observation |
id |
id of treated observation |
Value
an individual matched set
Summarize covariate balance over time
Description
Summarize covariate balance over time
Usage
## S3 method for class 'PanelBalance'
summary(
object,
qoi = NULL,
include.unrefined = TRUE,
unrefined.only = FALSE,
...
)
Arguments
object |
|
qoi |
Character specifying which QOI information to extract and summarize. Valid values include "att", "art", or "atc". |
include.unrefined |
logical. Indicates whether or not unrefined balance results should be included in the summary. |
unrefined.only |
logical. Indicates whether or not only unrefined balance results should be included in the summary. |
... |
Not used |
Value
Returns a list of matrices with covariate balance results calculated. Each element in the list corresponds to a PanelMatch configuration given to get_covariate_balance(), and the elements are returned in order. These elements should also have names that correspond to the names of the PanelMatch variables provided to the function. Note that if a configuration has qoi = "ate", the corresponding element in the returned list will also be a list, containing balance results corresponding to the ATT and ATC. Otherwise, each element in the returned list will be a matrix. Each matrix entry corresponds to balance results for a particular covariate in a particular period. When unrefined balance results are included, users will see additional columns with "_unrefined" appended to covariate names. These correspond to the unrefined balance results for a particular covariate-period. If 'unrefined.only = TRUE', then the names of the elements will have "_unrefined" appended to them.
Examples
dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis",
panel.data = dem.panel, match.missing = TRUE,
covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att")
pb <- get_covariate_balance(pm.obj,
include.unrefined = TRUE,
panel.data = dem.panel,
covariates = c("tradewb", "rdata"))
summary(pb)
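When unrefined results are included, the "_unrefined" suffix makes it easy to separate the two kinds of columns. The base R sketch below works on a made-up matrix `bal` shaped like one element of the returned list. The row labels and values are hypothetical, and only the column suffix is taken from the description above.

```r
# Hypothetical balance matrix shaped like one element of the summary list.
bal <- matrix(c(0.10, 0.08, 0.05, 0.30, 0.28, 0.25),
              nrow = 3,
              dimnames = list(paste0("t_", 3:1),
                              c("tradewb", "tradewb_unrefined")))

# Split columns by the "_unrefined" suffix and compare them side by side.
is.unrefined   <- grepl("_unrefined$", colnames(bal))
refined.cols   <- bal[, !is.unrefined, drop = FALSE]
unrefined.cols <- bal[, is.unrefined, drop = FALSE]

# TRUE where refinement brought the balance measure closer to zero.
abs(refined.cols) < abs(unrefined.cols)
```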
Summarize information about variable names, and unit, time, and treatment data in a PanelData object
Description
Summarize information about variable names, and unit, time, and treatment data in a PanelData object
Usage
## S3 method for class 'PanelData'
summary(object, ...)
Arguments
object |
|
... |
Not used |
Value
Returns a data.frame object, with columns "quantity" and "value." Within the data frame the following information is returned: the name of the unit id variable, the name of the time id variable, the name of the treatment variable, the name of the outcome variable, the number of unique units found in the data, the number of unique time periods found in the data, and the percentage of treated periods that are missing treatment data.
Examples
d <- PanelData(dem, "wbcode2", "year", "dem", "y")
summary(d)
Get summaries of PanelEstimate objects and calculations
Description
summary.PanelEstimate takes an object returned by PanelEstimate and returns a summary table of point estimates and confidence intervals.
Usage
## S3 method for class 'PanelEstimate'
summary(
object,
confidence.level = NULL,
verbose = FALSE,
bias.corrected = FALSE,
...
)
Arguments
object |
A |
confidence.level |
Confidence level to be used for confidence interval calculations. Must be numeric between 0 and 1. If NULL, confidence level from |
verbose |
logical indicating whether or not output should be printed in an expanded form. Default is FALSE |
bias.corrected |
logical indicating whether or not bias corrected estimates should be provided. Default is FALSE. This argument only applies for standard errors calculated with the bootstrap. |
... |
optional additional arguments. Currently, no additional arguments are supported. |
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
PE.results <- PanelEstimate(sets = PM.results,
panel.data = dem.sub.panel,
se.method = "unconditional")
summary(PE.results)
summary(PE.results, confidence.level = .9)
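To see how confidence.level changes interval width, here is a small base R sketch using a normal approximation. The helper `normal_ci()` and the numbers are hypothetical. The normal approximation itself is an assumption made for illustration: the package may construct intervals differently, and bootstrap intervals are commonly taken from quantiles of the bootstrapped estimates instead.

```r
est <- c("t+0" = -0.5, "t+1" = 0.2)  # hypothetical point estimates
se  <- c("t+0" = 0.4,  "t+1" = 0.5)  # hypothetical standard errors

# Hypothetical helper: symmetric normal-approximation interval.
normal_ci <- function(est, se, confidence.level = 0.95) {
  # Two-sided critical value, e.g. about 1.96 for a 95% interval.
  z <- qnorm(1 - (1 - confidence.level) / 2)
  cbind(estimate = est, lower = est - z * se, upper = est + z * se)
}

ci95 <- normal_ci(est, se, confidence.level = 0.95)
ci90 <- normal_ci(est, se, confidence.level = 0.90)  # narrower interval
ci95
ci90
```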
Summarize information about a PanelMatch object and the matched sets contained within it.
Description
A method for viewing summary data about the sizes of matched sets, the number of treated units, and the number of empty matched sets. If the quantity of interest is ate, then a summary will be provided for the matched sets associated with the att and the atc.
Usage
## S3 method for class 'PanelMatch'
summary(object, ...)
Arguments
object |
a |
... |
Not used |
Value
A list of data frame(s) containing information about matched sets associated with the specified qoi. If the qoi is "att", "art", or "atc", then the returned list contains one data frame and the element is named for the specified qoi. If the qoi is "ate", then a list of two elements is returned, with one data frame corresponding to the "att" and the other to the "atc". The data frame contains summary information about the sizes of matched sets, along with information about the number of treated observations and the number of empty sets. Specifically, it contains the minimum, 1st quartile, median, mean, 3rd quartile, and maximum matched set size. It also contains the number of treated units total and the number of empty matched sets.
Examples
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel,
lag = 4,
refinement.method = "mahalanobis",
match.missing = TRUE,
covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att",
lead = 0:4, forbid.treatment.reversal = FALSE)
summary(PM.results)
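The statistics listed in the Value section can be reproduced for any vector of matched set sizes. The base R sketch below uses a made-up vector `set.sizes` and hypothetical element names. Whether the package includes empty sets when computing the quantiles is an assumption here.

```r
# Hypothetical matched set sizes; zeros are empty matched sets.
set.sizes <- c(0, 0, 3, 5, 5, 4, 5, 2, 5, 0, 1, 5)

size.summary <- c(
  Min       = min(set.sizes),
  Q1        = unname(quantile(set.sizes, 0.25)),
  Median    = median(set.sizes),
  Mean      = mean(set.sizes),
  Q3        = unname(quantile(set.sizes, 0.75)),
  Max       = max(set.sizes),
  n.treated = length(set.sizes),       # number of treated observations
  n.empty   = sum(set.sizes == 0)      # treated observations with no controls
)
size.summary
```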
Summarize information about a matched.set object and the matched sets contained within it.
Description
A method for viewing summary data about the sizes of matched sets and metadata about how they were created. This method accepts all standard summary arguments.
Usage
## S3 method for class 'matched.set'
summary(object, ..., verbose = TRUE)
Arguments
object |
a |
... |
Optional additional arguments to be passed to the |
verbose |
Logical value specifying whether or not a longer, more verbose summary should be calculated and returned. Default is TRUE. |
Value
list object with either 5 or 1 element(s), depending on whether verbose is set to TRUE.
overview |
A |
set.size.summary |
a |
number.of.treated.units |
The number of unit, time pairs that are considered to be "treated" units |
num.units.empty.set |
The number of units treated at a particular time that were not able to be matched to any control units |
lag |
The size of the lag window used for matching on treatment history. This affects which treated and control units are matched. |
Examples
# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(lag = 4, refinement.method = "ps.match",
panel.data = dem.sub.panel, match.missing = TRUE,
covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
size.match = 5, qoi = "att",
lead = 0:4, forbid.treatment.reversal = FALSE)
summary(extract(PM.results, qoi = "att"))
Get weights of matched control units. See the weights.matched.set method.
Description
Get weights of matched control units. See the weights.matched.set method.
Usage
weights(object)
Arguments
object |
|
Extract the weights of matched control units
Description
Extract the weights of matched control units
Usage
## S3 method for class 'matched.set'
weights(object)
Arguments
object |
matched.set object, extracted using the |
Value
list of named vectors. Each list element corresponds to a particular treated observation and contains the matched control units, along with their weights. These correspond to the "weights" attribute, which are calculated in the PanelMatch refinement process.
Examples
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.panel, lag = 4,
refinement.method = "ps.match",
match.missing = TRUE,
covs.formula = ~ tradewb,
size.match = 5, qoi = "att",
lead = 0:4,
forbid.treatment.reversal = FALSE)
r1 <- extract(PM.results, qoi = "att")
lt <- weights(r1)
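A list of named vectors is straightforward to inspect or flatten with base R. The sketch below works on a hand-made stand-in, `toy.weights`, rather than real package output. The element names and weight values are hypothetical, chosen only to mirror the structure described in the Value section.

```r
# Hypothetical stand-in for the list returned by weights():
# one named vector of control-unit weights per treated observation.
toy.weights <- list(
  "4.1992" = c("3" = 0.2, "13" = 0.2, "16" = 0.2, "19" = 0.2, "28" = 0.2),
  "6.1973" = c("3" = 0.5, "35" = 0.5)
)

# Total weight within each matched set (1 for both sets in this toy example).
sapply(toy.weights, sum)

# Flatten to a long data frame: one row per treated observation / control pair.
long <- data.frame(
  treated.obs  = rep(names(toy.weights), lengths(toy.weights)),
  control.unit = unlist(lapply(toy.weights, names), use.names = FALSE),
  weight       = unlist(toy.weights, use.names = FALSE)
)
long
```

The long format is convenient for merging weights back onto panel data or for tabulating how often a given control unit is reused across matched sets.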