Help for package PanelMatch

Type:

Package

Title:

Matching Methods for Causal Inference with Time-Series Cross-Sectional Data

Version:

3.1.1

Date:

2025-06-04

Description:

Implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2023) http://web.mit.edu/insong/www/pdf/tscs.pdf propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption often made in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching and refinement are done, treatment effects can be estimated with standard errors. The package also offers diagnostics for researchers to assess the quality of their results.

License:

GPL (≥ 3)

Imports:

Rcpp (≥ 0.12.5), data.table, ggplot2, CBPS, stats, graphics, MASS, Matrix, doParallel, foreach, methods

Depends:

R (≥ 2.14.0)

LinkingTo:

RcppArmadillo, Rcpp, RcppEigen

Encoding:

UTF-8

LazyData:

true

BugReports:

https://github.com/insongkim/PanelMatch/issues

RoxygenNote:

7.3.1

Suggests:

knitr, rmarkdown, testthat (≥ 2.1.0)

VignetteBuilder:

knitr

NeedsCompilation:

yes

Packaged:

2025-06-04 05:49:30 UTC; adamrauh

Author:

In Song Kim [aut, cre], Adam Rauh [aut], Erik Wang [aut], Kosuke Imai [aut]

Maintainer:

In Song Kim <insong@mit.edu>

Repository:

CRAN

Date/Publication:

2025-06-04 14:40:02 UTC

Matching Methods for Causal Inference with Time-Series Cross-Sectional Data

Description

Implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2023) propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption often made in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching is done, both short-term and long-term average treatment effects for the treated observations can be estimated with standard errors. The package also offers a variety of diagnostic and visualization functions to assess the credibility of results.

Author(s)

In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>

Maintainer: In Song Kim <insong@mit.edu>

References

Imai, Kosuke, In Song Kim and Erik Wang. (2023)

Visualize the treatment distribution across units and time in a panel data set

Description

Visualize the treatment distribution across units and time in a panel data set

Usage

DisplayTreatment(
  panel.data,
  color.of.treated = "red",
  color.of.untreated = "blue",
  title = "Treatment Distribution \n Across Units and Time",
  xlab = "Time",
  ylab = "Unit",
  x.size = NULL,
  y.size = NULL,
  legend.position = "none",
  x.angle = NULL,
  y.angle = NULL,
  legend.labels = c("not treated", "treated"),
  decreasing = FALSE,
  matched.set = NULL,
  show.set.only = FALSE,
  hide.x.tick.label = FALSE,
  hide.y.tick.label = FALSE,
  gradient.weights = FALSE,
  dense.plot = FALSE
)

Arguments

panel.data

PanelData object

color.of.treated

Color of the treated observations provided as a character string (this includes hex values). Default is red.

color.of.untreated

Color of the untreated observations provided as a character string (this includes hex values). Default is blue.

title

Title of the plot provided as character string

xlab

Character label of the x-axis

ylab

Character label of the y-axis

x.size

Numeric size of the text for xlab or x axis tick labels. Assign x.size = NULL to use built in ggplot2 method of determining label size. When the length of the time period is long, consider setting to NULL and adjusting size and ratio of the plot.

y.size

Numeric size of the text for ylab or y axis tick labels. Assign y.size = NULL to use built in ggplot2 method of determining label size. When the number of units is large, consider setting to NULL and adjusting size and ratio of the plot.

legend.position

Position of the legend. Provide this according to ggplot2 standards.

x.angle

Angle (in degrees) of the tick labels for x-axis

y.angle

Angle (in degrees) of the tick labels for y-axis

legend.labels

Character vector of length two describing the labels of the legend to be shown in the plot. ggplot2 standards are used.

decreasing

Logical. Determines if display order should be increasing or decreasing by the amount of treatment received. Default is decreasing = FALSE.

matched.set

(optional) a matched.set object containing a single treated unit and a set of matched controls. If provided, this set will be highlighted on the resulting plot.

show.set.only

(optional) logical. If TRUE, only the treated unit and control units contained in the provided matched.set object will be shown on the plot. Default is FALSE. If no matched.set is provided, then this argument will have no effect.

hide.x.tick.label

logical. If TRUE, x axis tick labels are not shown. Default is FALSE.

hide.y.tick.label

logical. If TRUE, y axis tick labels are not shown. Default is FALSE.

gradient.weights

(optional) logical. If TRUE, the "darkness"/shade of units in the provided matched.set object will be displayed according to their weight. Control units with higher weights will appear darker on the resulting plot. Control units with lower weights will appear lighter. This argument has no effect unless a matched.set is provided.

dense.plot

Logical. If TRUE, lines between tiles are removed on the resulting plot. This is useful for producing more readable plots when the number of units and/or time periods is very high.

Value

DisplayTreatment returns a treatment variation plot (generated via ggplot2 geom_tile() or geom_raster()), which visualizes the variation of treatment across units and time. The results can be customized using ggplot2 syntax.

Author(s)

In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>

Examples

dem.panel <- PanelData(panel.data = dem, 
              unit.id = "wbcode2", 
              time.id = "year", 
              treatment = "dem", 
              outcome = "y")
DisplayTreatment(panel.data = dem.panel,
                 legend.position = "none",
                 xlab = "year", ylab = "Country Code")
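Because the returned object is a ggplot2 plot, further layers and theme settings can be added with the usual `+` syntax. The following is a minimal sketch under that assumption; the title text and theme choices are purely illustrative.

```r
library(PanelMatch)
library(ggplot2)

dem.panel <- PanelData(panel.data = dem,
                       unit.id = "wbcode2",
                       time.id = "year",
                       treatment = "dem",
                       outcome = "y")

# Base plot: hide the crowded unit labels and drop tile borders.
p <- DisplayTreatment(panel.data = dem.panel,
                      xlab = "Year", ylab = "Country Code",
                      legend.position = "bottom",
                      hide.y.tick.label = TRUE,
                      dense.plot = TRUE)

# Customize with ordinary ggplot2 syntax (illustrative choices).
p2 <- p +
  ggtitle("Democracy across countries, 1960-2010") +
  theme(plot.title = element_text(hjust = 0.5))
```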

Pre-process and balance panel data

Description

Pre-process and balance panel data

Usage

PanelData(panel.data, unit.id, time.id, treatment, outcome)

Arguments

panel.data

A data.frame object containing time series cross sectional data. Time data should be sequential integers that increase by 1. Unit identifiers must be integers. Treatment data must be binary. If time data is non-integer, the package will attempt to sensibly convert it by converting the data to factor, then to integer. If a conversion is performed, a mapping will be returned as an attribute called "time.data.map".

unit.id

A character string indicating the name of unit identifier in the data. This data must be integer.

time.id

A character string indicating the name of the time variable in the data.

treatment

A character string indicating the name of the treatment variable. The treatment must be a binary indicator variable (integer with 0 for the control group and 1 for the treatment group).

outcome

A character string identifying the outcome variable

Value

PanelData() returns an object of class PanelData. This takes the form of a data.frame object with the following properties and attributes. First, the data has been balanced, so each unit appears the same number of times in the resulting PanelData object, with NAs filling out missing data. Second, the data has been sorted to appear in order for each unit. These properties are noted in the "is.balanced" and "is.sorted" attributes, respectively. Next, the PanelData object has the following attributes: "unit.id", "time.id", "treatment", and "outcome", reflecting the variables provided in the specification. If the function attempts to automatically convert time data to be consecutive integers, the mapping between the original time data and the "new" converted time data is provided as a data.frame object and stored as the "time.data.map" attribute.

Examples

d <- PanelData(panel.data = dem, 
               unit.id = "wbcode2", 
               time.id = "year", 
               treatment = "dem", 
               outcome = "y")
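The time conversion and balancing described above can be illustrated with a base R sketch on a toy data set. This is not the package's internal code: the data frame `toy` and the objects `time.map`, `grid`, and `balanced` are hypothetical names used only to show the idea (factor-then-integer conversion, then one row per unit and period with NAs for gaps).

```r
# Toy unbalanced panel: unit 2 has no row for period "b".
toy <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L),
  period = c("a", "b", "c", "a", "c"),
  treat  = c(0L, 0L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Non-integer time data: convert to factor, then to integer.
toy$time <- as.integer(as.factor(toy$period))

# Mapping between original and converted time values.
time.map <- unique(toy[, c("period", "time")])

# Balance: every unit gets a row for every period; gaps become NA.
grid <- expand.grid(id = unique(toy$id), time = sort(unique(toy$time)))
balanced <- merge(grid, toy[, c("id", "time", "treat")],
                  by = c("id", "time"), all.x = TRUE)

# Sort so that rows appear in time order within each unit.
balanced <- balanced[order(balanced$id, balanced$time), ]
```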

Estimate a causal quantity of interest

Description

Estimate a causal quantity of interest, including the average treatment effect for treated or control units (att and atc, respectively), the average effect of treatment reversal on reversed units (art), or average treatment effect (ate), as specified in PanelMatch(). This is done by estimating the counterfactual outcomes for each treated unit using matched sets. Users will provide matched sets that were obtained by the PanelMatch function and obtain point estimates and standard errors.

Usage

PanelEstimate(
  sets,
  panel.data,
  number.iterations = 1000,
  df.adjustment = FALSE,
  confidence.level = 0.95,
  moderator = NULL,
  se.method = "bootstrap",
  pooled = FALSE,
  include.placebo.test = FALSE,
  parallel = FALSE,
  num.cores = 1
)

Arguments

sets

A PanelMatch object obtained via the PanelMatch() function.

panel.data

The same time series cross sectional data set provided to the PanelMatch() function used to produce the matched sets. This should be a PanelData object.

number.iterations

If using bootstrapping for calculating standard errors, this is the number of bootstrap iterations. Provide as integer. If se.method is not equal to "bootstrap", this argument has no effect.

df.adjustment

A logical value indicating whether or not a degree-of-freedom adjustment should be performed for the standard error calculation. The default is FALSE. This parameter is only available for the bootstrap method of standard error calculation.

confidence.level

A numerical value specifying the confidence level and range of interval estimates for statistical inference. The default is .95.

moderator

The name of a moderating variable, provided as a character string. If a moderating variable is provided, the returned object will be a list of PanelEstimate objects. The names of the list will reflect the different values of the moderating variable. More specifically, the moderating variable values will be converted to syntactically proper names using make.names().

se.method

Method used for calculating standard errors, provided as a character string. Users must choose between "bootstrap", "conditional", and "unconditional" methods. Default is "bootstrap". "bootstrap" uses a block bootstrapping procedure to calculate standard errors. The conditional method calculates the variance of the estimator, assuming independence across units but not across time. The unconditional method also calculates the variance of the estimator analytically, but makes no such assumptions about independence across units. When the quantity of interest is "att", "atc", or "art", all methods are available. Only "bootstrap" is available for the ate. If pooled argument is TRUE, then only bootstrap is available.

pooled

Logical. If TRUE, estimates and standard errors are returned for treatment effects pooled across the entire lead window. Only available for se.method = "bootstrap".

include.placebo.test

Logical. If TRUE, a placebo test is run and returned in the results. The placebo test uses the same specifications for calculating standard errors as the main results. That is, standard errors are calculated according to the user provided se.method and confidence.level arguments (and, if applicable, parallelization specifications).

parallel

Logical. If TRUE and se.method = "bootstrap", the bootstrap procedure will be parallelized. Default is FALSE. If se.method is not set to "bootstrap", this option does nothing.

num.cores

Integer. Specifies the number of cores to use for parallelization. If se.method = "bootstrap" and parallel = TRUE, then this option will take effect. Otherwise, it will do nothing.
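For the moderator argument, the effect of make.names() on list names can be previewed in base R. The moderator values below are made up for illustration; how the package prepares the values before calling make.names() is not shown here.

```r
# Hypothetical moderator values.
moderator.values <- c("1", "high income", "low-income", "OECD")

# Syntactically valid names, as base R produces them.
list.names <- make.names(moderator.values)
print(list.names)
```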

Value

PanelEstimate returns a list of class PanelEstimate containing the following components:

estimates

the point estimates of the quantity of interest for the lead periods specified

se.method

The method used to calculate standard errors. This is the same as the argument provided to the function.

bootstrapped.estimates

the bootstrapped point estimate values, when applicable

bootstrap.iterations

the number of iterations used in bootstrapping, when applicable

method

refinement method used to create the matched sets from which the estimates were calculated

lag

See PanelMatch() argument lag for more information.

lead

The lead window sequence for which PanelEstimate() is producing point estimates and standard errors.

confidence.level

the confidence level

qoi

the quantity of interest

matched.sets

the refined matched sets used to produce the estimations

standard.error

the standard error(s) of the point estimates

pooled

Logical indicating whether or not estimates were calculated for individual lead periods or pooled.

placebo.test

if include.placebo.test = TRUE, a placebo test is conducted using placebo_test() and returned as a list. See documentation for placebo_test() for more about each individual item.

Author(s)

In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, Adam Rauh <amrauh@umich.edu>, and Kosuke Imai <imai@harvard.edu>

References

Imai, Kosuke, In Song Kim, and Erik Wang (2023)

Examples

# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4, 
                         refinement.method = "ps.match", 
                         match.missing = TRUE, 
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4, 
                         forbid.treatment.reversal = FALSE)
PE.results <- PanelEstimate(sets = PM.results, 
               panel.data = dem.sub.panel, 
               se.method = "unconditional")
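The components listed under Value can be read from the returned list by name. The sketch below repeats the example specification and assumes only the documented component names `estimates` and `standard.error`.

```r
library(PanelMatch)

# Subset of the data for simplicity.
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")

PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "ps.match",
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)

# Analytical standard errors avoid the cost of bootstrapping.
PE.results <- PanelEstimate(sets = PM.results,
                            panel.data = dem.sub.panel,
                            se.method = "unconditional")

# One point estimate and one standard error per lead period.
point.estimates <- PE.results$estimates
standard.errors <- PE.results$standard.error
```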

Create and refine sets of matched treated and control observations

Description

PanelMatch identifies treated observations and a matched set for each treated observation. Specifically, for a given treated unit, the matched set consists of control observations that have an identical treatment history up to a number of lag time periods. The matched set can then be further refined using matching or weighting techniques, described below.
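The idea of matching on treatment history can be illustrated with a toy base R sketch. It is not the package's implementation: the matrix `D`, the helper `history()`, and the other names are hypothetical, and the rule shown (identical treatment history over the lag window, untreated in the treatment period) is a simplified reading of the description above for the att.

```r
# Rows are units, columns are time periods 1..6; entries are treatment status.
D <- rbind(
  u1 = c(0, 0, 0, 0, 1, 1),
  u2 = c(0, 0, 0, 0, 0, 0),
  u3 = c(0, 1, 0, 0, 0, 0),
  u4 = c(0, 0, 0, 0, 0, 1),
  u5 = c(1, 1, 1, 1, 1, 1)
)

lag <- 3
treated.unit <- "u1"
t0 <- 5  # u1 switches from control to treatment in period 5

# Treatment history over the lag window t0 - lag, ..., t0 - 1.
history <- function(unit) D[unit, (t0 - lag):(t0 - 1)]

# A control qualifies if it is untreated at t0 and shares the same history.
others <- setdiff(rownames(D), treated.unit)
is.match <- vapply(others, function(u) {
  D[u, t0] == 0 && all(history(u) == history(treated.unit))
}, logical(1))

matched.set <- others[is.match]
print(matched.set)
```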

Usage

PanelMatch(
  panel.data,
  lag,
  refinement.method,
  qoi,
  size.match = 10,
  match.missing = TRUE,
  covs.formula = NULL,
  lead = 0,
  verbose = FALSE,
  exact.match.variables = NULL,
  forbid.treatment.reversal = FALSE,
  matching = TRUE,
  listwise.delete = FALSE,
  use.diagonal.variance.matrix = FALSE,
  restrict.control.period = NULL,
  placebo.test = FALSE
)

Arguments

panel.data

A PanelData object containing time series cross sectional data. Time data must be sequential integers that increase by 1. Unit identifiers must be integers. Treatment data must be binary.

lag

An integer value indicating the length of treatment history periods to be matched on

refinement.method

A character string specifying the matching or weighting method to be used for refining the matched sets. The user can choose "mahalanobis", "ps.match", "CBPS.match", "ps.weight", "CBPS.weight", "ps.msm.weight", "CBPS.msm.weight", or "none". The first three methods will use the size.match argument to create sets of at most size.match closest control units. Choosing "none" will assign equal weights to all control units in each matched set. The MSM methods refer to marginal structural models. See Imai, Kim, and Wang (2023) for a more in-depth discussion of MSMs.

qoi

Quantity of interest, provided as a string: "att" (average treatment effect on treated units), "atc" (average treatment effect on the control units), "art" (average effect of treatment reversal for units that experience treatment reversal), or "ate" (average treatment effect).

size.match

An integer dictating the number of permitted closest control units in a matched set after refinement. This argument only affects results when using a matching method ("mahalanobis" or any of the refinement methods that end in ".match"). This argument is not needed and will have no impact if included when a weighting method is specified (any refinement.method that includes "weight" in the name).

match.missing

Logical variable indicating whether or not units should be matched on the patterns of missingness in their treatment histories. Default is TRUE. When FALSE, neither treated nor control units are allowed to have missing treatment data in the lag window.

covs.formula

One-sided formula object indicating which variables should be used for matching and refinement. Argument is not needed if refinement.method is set to "none". If the user wants to include lagged variables, this can be done using a function, "lag()", which takes two unnamed, positional arguments. The first is the name of the variable which you wish to lag. The second is the lag window, specified as an integer sequence in increasing order. For instance, I(lag(x, 1:4)) will then add new columns to the data for variable "x" for time t-1, t-2, t-3, and t-4 internally and use them for defining/measuring similarity between units. Other transformations using the I() function, such as I(x^2), are also permitted. The variables specified in this formula are used to define the similarity/distances between units.

lead

integer sequence specifying the lead window, for which qoi point estimates (and standard errors) will ultimately be produced. Default is 0 (which corresponds to contemporaneous treatment effect).

verbose

Logical. If TRUE, includes more information about the matched.set object calculations, such as the distances used to create the refined sets and weights. Default is FALSE.

exact.match.variables

character vector giving the names of variables to be exactly matched on. These should be time invariant variables. Exact matching for time varying covariates is not currently supported.

forbid.treatment.reversal

Logical. For the ATT, it indicates whether or not it is permissible for treatment to reverse in the specified lead window. This is defined analogously for the ART. It is not valid for the ATC or ATE. When set to TRUE, only matched sets for treated units where treatment is applied continuously in the lead window are included in the results. Default is FALSE.

matching

logical indicating whether or not any matching on treatment history should be performed. This is primarily used for diagnostic purposes, and most users will never need to set this to FALSE. Default is TRUE.

listwise.delete

TRUE/FALSE indicating whether or not missing data should be handled using listwise deletion or the package's default missing data handling procedures. Default is FALSE.

use.diagonal.variance.matrix

TRUE/FALSE indicating whether or not a regular covariance matrix should be used in Mahalanobis distance calculations during refinement, or if a diagonal matrix with only covariate variances should be used instead. In many cases, setting this to TRUE can lead to better covariate balance, especially when there is high correlation between variables. Default is FALSE. This argument is only relevant when refinement.method = "mahalanobis" and will have no impact otherwise.

restrict.control.period

(optional) integer specifying the number of pre-treatment periods in which treated units and potentially matched control units should be non-missing and in the control state. For instance, specifying 4 would mean that the treatment history cannot contain any missing data or treatment from t-4 to t.

placebo.test

Logical TRUE/FALSE. Indicates whether or not you want to be able to run a placebo test. This adds requirements on the data: specifically, no unit included in the matching/refinement process can have missing outcome data over the lag window. Additionally, you should not use the outcome variable in refinement when placebo.test = TRUE.
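The difference between a regular covariance matrix and a diagonal variance matrix (see use.diagonal.variance.matrix) can be sketched with stats::mahalanobis() on simulated data. This only illustrates the distance itself; how the package builds covariates across the lag window and combines distances is not reproduced here, and all object names are hypothetical.

```r
set.seed(1)

# Two highly correlated covariates for 50 observations.
X <- cbind(x1 = rnorm(50), x2 = rnorm(50))
X[, "x2"] <- X[, "x1"] + 0.3 * X[, "x2"]

treated  <- X[1, ]    # covariates of the treated observation
controls <- X[-1, ]   # covariates of candidate controls

S.full <- stats::cov(X)         # regular covariance matrix
S.diag <- diag(diag(S.full))    # variances only, zero covariances

# stats::mahalanobis() returns squared distances, so take the square root.
d.full <- sqrt(stats::mahalanobis(controls, center = treated, cov = S.full))
d.diag <- sqrt(stats::mahalanobis(controls, center = treated, cov = S.diag))

# Indices of the five closest controls under each choice (cf. size.match = 5).
closest.full <- order(d.full)[1:5]
closest.diag <- order(d.diag)[1:5]
```

With strongly correlated covariates the two rankings can differ, which is why comparing balance under both settings may be worthwhile.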

Value

PanelMatch() returns an object of class PanelMatch. This is a list that contains a few specific elements: First, a matched.set object(s) that has the same name as the provided qoi if the qoi is "att", "art", or "atc". If qoi = "ate" then two matched.set objects will be attached, named "att" and "atc." Please consult the documentation for matched_set() to read more about the structure and usage of matched.set objects. The PanelMatch object also has some additional attributes that track metadata about the specification, like the names of the unit and time identifier variables.

Author(s)

Adam Rauh <amrauh@umich.edu>, In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, and Kosuke Imai <imai@harvard.edu>

References

Imai, Kosuke, In Song Kim, and Erik Wang (2023)

Examples

# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4, 
                         refinement.method = "ps.match", 
                         match.missing = TRUE, 
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4, 
                         forbid.treatment.reversal = FALSE)
# include lagged variables
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4, 
                         refinement.method = "ps.weight", 
                         match.missing = TRUE, 
                         covs.formula = ~ tradewb + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                         size.match = 5, qoi = "att",
                         lead = 0:4, 
                         forbid.treatment.reversal = FALSE)
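As noted under Value, a specification with qoi = "ate" attaches two matched.set objects named "att" and "atc". A minimal sketch, assuming the returned list can be inspected with names(), uses refinement.method = "none" so that no covariate formula is needed:

```r
library(PanelMatch)

# Subset of the data for simplicity.
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")

# Equal weights for all controls in each matched set.
PM.ate <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                     refinement.method = "none",
                     match.missing = TRUE,
                     qoi = "ate",
                     lead = 0:2)

# Expect one element for the att sets and one for the atc sets.
set.names <- names(PM.ate)
```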

Subset PanelBalance objects

Description

Subset PanelBalance objects

Usage

## S3 method for class 'PanelBalance'
x[i, ...]

Arguments

x

PanelBalance object

i

numeric. Specifies which element to extract. Substantively, it specifies which PanelMatch configuration data to extract.

...

Not used

Value

Returns balance information for the specified PanelMatch configuration. Note that results are still returned as a PanelBalance object. In order to return a list, use the [[ operator.

Examples

dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis", 
                     panel.data = dem.panel, match.missing = TRUE,
                     covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)), 
                     size.match = 5, qoi = "att")

# create multiple configurations to compare
pm2 <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "ps.match", 
                  panel.data = dem.panel, match.missing = TRUE,
                  covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)), 
                  size.match = 5, qoi = "att")

pb <- get_covariate_balance(pm.obj, pm2,
                            include.unrefined = TRUE,
                            panel.data = dem.panel, 
                            covariates = c("tradewb", "rdata"))
bal.maha <- pb[1]
bal.ps <- pb[2]

Subset matched.set object

Description

Subsets matched.set objects while preserving attributes.

Usage

## S3 method for class 'matched.set'
x[i, j = NULL, drop = NULL]

Arguments

x

matched.set object

i

numeric. specifies the index of which element to extract.

j

NULL

drop

NULL
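What "preserving attributes" means for a subsetting method can be shown with a generic base R sketch. The class `toy.set` and the constructor `new_toy_set()` are hypothetical stand-ins and do not reproduce the structure of real matched.set objects.

```r
# Hypothetical list-based class carrying a "lag" attribute.
new_toy_set <- function(x, lag) structure(x, lag = lag, class = "toy.set")

# Subsetting method: default list subsetting drops custom attributes,
# so restore them and the class afterwards.
`[.toy.set` <- function(x, i, j = NULL, drop = NULL) {
  out <- unclass(x)[i]
  attr(out, "lag") <- attr(x, "lag")
  class(out) <- "toy.set"
  out
}

s <- new_toy_set(list(a = 1:3, b = 4:5, c = 6L), lag = 4)
sub <- s[1:2]      # still a toy.set, "lag" attribute kept
first <- s[[1]]    # [[ returns the underlying element
```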

build_maha_mats Builds the matrices that we will then use to calculate the mahalanobis distances for each matched set

Description

build_maha_mats Builds the matrices that we will then use to calculate the mahalanobis distances for each matched set

Usage

build_maha_mats(idx, ordered_expanded_data)

Arguments

idx

List of vectors specifying which observations should be extracted

ordered_expanded_data

data.frame of prepared/parsed input data

Value

List of parsed distance matrices, with elements corresponding to each matched set

build_ps_data

Description

build_ps_data

Usage

build_ps_data(idxlist, data, lag)

Arguments

idxlist

data

data.frame object with the data

lag

see PanelMatch() documentation

Value

Returns a list of length equal to the number of matched sets. Each item is a data frame and each data frame contains information at time = t + 0 for each treated unit and their corresponding controls.

calculate_estimates

Description

Mid-level function that helps with estimation process. Calls lower level helper functions

Usage

calculate_estimates(
  qoi.in,
  data.in,
  lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  att.sets,
  atc.sets,
  placebo.test = FALSE,
  lag,
  se.method,
  pooled = FALSE,
  parallel = FALSE,
  num.cores = 1
)

Arguments

qoi.in

String specifying qoi

data.in

data.frame object with the data

lead

integer specifying lead window

number.iterations

integer. specifies number of bootstrap iterations

att.treated.unit.ids

Integer vector specifying the treated units for the att or art

atc.treated.unit.ids

Integer vector specifying the "treated" units under the atc definition

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable

confidence.level

double. specifies confidence level for confidence interval

att.sets

matched.set object specifying the att or art sets

atc.sets

matched.set object specifying the atc sets

lag

integer vector specifying size of the lag.

se.method

string specifying which method should be used for standard error calculation

pooled

bool. specifies whether or not estimates should be calculated for each lead period, or pooled across all lead periods

parallel

bool. Specifies whether or not parallelization should be used

num.cores

Integer. specifies how many cores to use for parallelization

Value

Returns PanelEstimate object.

calculate_placebo_estimates

Description

Handles the procedures for calculating point estimates and standard errors for the placebo test. Code is structured very similarly to the calculate_estimates() code, but with appropriate modifications for the placebo test. See that function for description of arguments. Bootstrap SEs are available for any specification. Conditional, unconditional standard errors only available for att, art, atc.

Usage

calculate_placebo_estimates(
  qoi.in,
  data.in,
  lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  att.sets,
  atc.sets,
  placebo.test = FALSE,
  lag,
  placebo.lead,
  se.method = "bootstrap",
  parallel = FALSE,
  num.cores = 1
)

Value

Returns a PanelEstimate object

calculate_point_estimates Helper function that calculates the point estimates for the specified QOI

Description

calculate_point_estimates Helper function that calculates the point estimates for the specified QOI

Usage

calculate_point_estimates(
  qoi.in,
  data.in,
  lead,
  outcome.variable,
  pooled = FALSE
)

Arguments

qoi.in

string specifying the QOI

data.in

data.frame providing the processed/parsed data to be used for calculations

lead

see PanelMatch() documentation

outcome.variable

string specifying the outcome variable

pooled

Logical. See PanelEstimate() documentation.

Value

A named vector of point estimates

check_time_data

Description

Time data should be consecutive integers. When it is not, the function tries to convert it as best it can, or throws an error. If the function does not fail, it returns the data as a data.frame object, either processed or not, as appropriate.

Usage

check_time_data(data, time.id)

Arguments

data

data.frame object.

time.id

string specifying the time id variable.

Details

enforces the requirements for time data, with some reasonable defaults

Value

data.frame object with the data. If function throws error, nothing is returned.

clean_leads Function to check the lead windows in treated and control units for missing outcome data. If data is missing, remove those units from matched sets.

Description

clean_leads Function to check the lead windows in treated and control units for missing outcome data. If data is missing, remove those units from matched sets.

Usage

clean_leads(matched_sets, ordered.data, max.lead, t.var, id.var, outcome.var)

Arguments

matched_sets

matched.set object containing pre-filtered matched sets

ordered.data

data.frame object to be checked for missing data. This should have been passed through data preparation functions already.

max.lead

Integer specifying the biggest value of the lead window.

t.var

string specifying the time id variable

id.var

string specifying the unit id variable

outcome.var

string specifying the outcome variable.

Value

a cleaned/filtered matched.set object

Produce confidence intervals for PanelEstimate objects

Description

Produce confidence intervals for PanelEstimate objects

Usage

## S3 method for class 'PanelEstimate'
confint(object, parm = NULL, level = NULL, ..., bias.corrected = FALSE)

Arguments

object

PanelEstimate results

parm

Not used.

level

Confidence level to be used for confidence interval calculations. Must be numeric between 0 and 1. If NULL, confidence level from PanelEstimate() specification is used.

...

not used

bias.corrected

logical indicating whether or not bias corrected estimates should be provided. Default is FALSE. This argument only applies for standard errors calculated with the bootstrap.

Value

Matrix with two columns and length(lead) rows. Contains the upper and lower boundaries of the confidence interval for each time period's point estimate.
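A short usage sketch follows, relying on the signature shown under Usage. The specification is illustrative, and analytical standard errors are used only to keep the run time short.

```r
library(PanelMatch)

# Subset of the data for simplicity.
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")

PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4,
                         refinement.method = "mahalanobis",
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:2)
PE.results <- PanelEstimate(sets = PM.results,
                            panel.data = dem.sub.panel,
                            se.method = "unconditional")

# level = NULL falls back to the confidence level used in PanelEstimate().
ci.default <- confint(PE.results)

# Request a 90% interval instead.
ci.90 <- confint(PE.results, level = 0.90)
```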

Country-year level democratization data

Description

A dataset containing the democracy indicator for 184 countries from 1960 to 2010

Format

A data.frame containing 9384 rows and 5 variables

Details

wbcode2. World Bank country ID. Integer.
year. Year (1960–2010). Integer.
dem. Binary indicator of democracy as defined in Acemoglu et al. (2019).
y. Log of GDP per capita in 2000 constant dollars (multiplied by 100). Numeric.
tradewb. Exports plus imports as a share of GDP from World Bank. Numeric.

Source

Acemoglu, Daron, Suresh Naidu, Pascual Restrepo, and James A Robinson. “Democracy does cause growth.” Journal of Political Economy.

Get distances. See distances.matched.set method.

Description

Get distances. See distances.matched.set method.

Usage

distances(object)

Arguments

object

matched.set object

Extract the distances of matched control units

Description

Extract the distances of matched control units

Usage

## S3 method for class 'matched.set'
distances(object)

Arguments

object

a matched.set object

Value

A named list of named vectors. Each element corresponds to a matched set and will be a named vector, where the names of each element will identify a matched control unit and its distance from the treated observation within a particular matched set. These correspond to the "distances" attribute, which are calculated and included when the verbose option is set to TRUE in PanelMatch.

Examples

dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.panel, lag = 4,
                         refinement.method = "mahalanobis",
                         verbose = TRUE,
                         match.missing = TRUE,
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4,
                         forbid.treatment.reversal = FALSE)
r1 <- extract(PM.results, qoi = "att")
lt <- distances(r1)

enforce_lead_restrictions: checks treated and control units for treatment reversion in the lead window. Treated units must stay treated and control units must stay in control (according to the specified qoi).

Description

enforce_lead_restrictions: checks treated and control units for treatment reversion in the lead window. Treated units must stay treated and control units must stay in control (according to the specified qoi).

Usage

enforce_lead_restrictions(
  matched_sets,
  ordered.data,
  max.lead,
  t.var,
  id.var,
  treatment.var
)

Arguments

matched_sets

matched.set object

ordered.data

parsed data as data.frame object

max.lead

The largest lead value (i.e., the largest F)

t.var

string specifying the time variable

id.var

string specifying the unit id variable

treatment.var

string specifying the treatment variable.

Value

matched.set object with the matched sets that meet the conditions
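
The lead-window check described above can be illustrated with a minimal base R sketch. It assumes the ATT case (treated units must stay at 1, controls at 0) and uses a toy data frame; the helper lead_path and the variable names are hypothetical and are not part of the package. How the package itself handles a unit that fails the check is not shown here.

```r
# Toy panel: three units observed over five periods (illustrative values only).
toy <- data.frame(id = rep(1:3, each = 5),
                  time = rep(1:5, times = 3),
                  treat = c(0, 0, 1, 1, 1,
                            0, 0, 0, 0, 0,
                            0, 0, 0, 1, 0))

# Hypothetical helper: treatment path of one unit from t through t + max.lead.
lead_path <- function(unit, t, max.lead) {
  toy$treat[toy$id == unit & toy$time %in% t:(t + max.lead)]
}

t0 <- 3        # time of treatment for unit 1
max.lead <- 2  # largest lead value

# The treated unit must remain treated throughout the lead window.
treated.ok <- all(lead_path(1, t0, max.lead) == 1)

# Each candidate control must remain untreated throughout the lead window.
controls <- c(2, 3)
control.ok <- vapply(controls,
                     function(u) all(lead_path(u, t0, max.lead) == 0),
                     logical(1))
kept.controls <- controls[control.ok]
```

In this toy example unit 3 receives treatment at time 4, inside the lead window, so only unit 2 passes the check.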

equality_four: small helper function implementing the estimation function from Imai, Kim, and Wang (2023)

Description

equality_four: small helper function implementing the estimation function from Imai, Kim, and Wang (2023)

Usage

equality_four(x, y, z)

Value

Returns numeric vector of results.

equality_four_placebo

Description

Small helper function implementing estimation function from Imai, Kim, and Wang (2023)

Usage

equality_four_placebo(x, y, z)

Value

Returns numeric vector of results.

Extract QOI estimates. See the documentation for 'estimates.PanelEstimate()'.

Description

Extract QOI estimates. See the documentation for 'estimates.PanelEstimate()'.

Usage

estimates(object, ...)

Arguments

object

PanelEstimate object

...

other arguments. Not used.

Extract QOI estimates

Description

This is a method for extracting point estimates for the QOI from PanelEstimate objects. This function is analogous to the 'coef()' method used elsewhere.

Usage

## S3 method for class 'PanelEstimate'
estimates(object, ...)

Arguments

object

PanelEstimate object

...

not used

Value

Named vector with the QOI point estimates and the time periods to which they correspond

expand_treated_ts: builds a list that contains all times in the lag window of each treated observation. This is structured as a list of vectors. Each vector has length lag + 1. The overall list has the same length as the number of matched sets.

Description

expand_treated_ts: builds a list that contains all times in the lag window of each treated observation. This is structured as a list of vectors. Each vector has length lag + 1. The overall list has the same length as the number of matched sets.

Usage

expand_treated_ts(lag, treated.ts)

Arguments

lag

lag value

treated.ts

times of treated observations

Value

list. Contains all times in a lag window that correspond to a particular treated unit
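
As an illustration of the structure described above, here is a minimal sketch in base R. The function expand_ts_sketch is a hypothetical stand-in, not the package's internal code, and it assumes that the lag window for a treatment at time t runs from t - lag through t.

```r
# Hypothetical stand-in: one vector of lag + 1 time points per treated observation.
expand_ts_sketch <- function(lag, treated.ts) {
  lapply(treated.ts, function(t) (t - lag):t)
}

# Two treated observations, treated in 1992 and 2001, with lag = 4.
windows <- expand_ts_sketch(lag = 4, treated.ts = c(1992, 2001))
```

Each element of windows has lag + 1 = 5 entries, and the list has one element per treated observation.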

Extract matched.set objects from PanelMatch results

Description

Extract matched.set objects from PanelMatch results

Usage

extract(pm.object, qoi)

Arguments

pm.object

PanelMatch object

qoi

character, specifying the qoi. Valid inputs include "att", "atc", "art", and NULL. If NULL, the function extracts the att, art, or atc results, whichever is available. An error is thrown if the qoi is ate.

Extract matched.set objects from PanelMatch results

Description

Extract matched.set objects from PanelMatch results

Usage

## S3 method for class 'PanelMatch'
extract(pm.object, qoi = NULL)

Arguments

pm.object

PanelMatch object

qoi

character, specifying the qoi. Valid inputs include "att", "atc", "art", and NULL. If NULL, the function extracts the att, art, or atc results, whichever is available. An error is thrown if the qoi is ate.

Value

a matched.set object

Examples

# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel,
                         lag = 4, 
                         refinement.method = "mahalanobis",
                         match.missing = TRUE,
                         covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                         size.match = 5, qoi = "att",
                         lead = 0:4, forbid.treatment.reversal = FALSE)
extract(PM.results, qoi = "att")
extract(PM.results) # valid since att is specified

extract_differences: calculates the differences from t-1 to 1 for treated and control units in the treatment variable. While this functionality is somewhat trivial in the current implementation of the package, it will be needed for the continuous treatment version of the package.

Description

extract_differences: calculates the differences from t-1 to 1 for treated and control units in the treatment variable. While this functionality is somewhat trivial in the current implementation of the package, it will be needed for the continuous treatment version of the package.

Usage

extract_differences(indexed.data, matched.set, treatment.variable, qoi)

Arguments

indexed.data

data that has been indexed. Rows have been named with a unique identifier.

matched.set

matched.set object

treatment.variable

string specifying treatment variable

qoi

string specifying QOI

Value

matched.set object, with differences extracted as described previously for each matched set.

findBinaryTreated

Description

findBinaryTreated is used to identify t,id pairs of units for which a matched set might exist. More precisely, it finds units for which at time t, the specified treatment has been applied, but at time t - 1, the treatment has not.

Usage

findBinaryTreated(
  dmat,
  qoi.in,
  treatedvar,
  time.var,
  unit.var,
  hasbeensorted = FALSE
)

Arguments

dmat

Data frame or matrix containing data used to identify potential treated units. Must be specified in such a way that a combination of time and id variables will correspond to a unique row. Must also contain at least a binary treatment variable column as well.

treatedvar

Character string that identifies the name of the column in dmat that provides information about the binary treatment variable

time.var

Character string that identifies the name of the column in dmat that contains the time variable. These data must be integers that increase by one.

unit.var

Character string that identifies the name of the column in dmat that contains the variable used as a unit id. These data must be integers.

hasbeensorted

variable that only has internal usage for optimization purposes. There should be no need for a user to toggle this

Value

findBinaryTreated returns a subset of the data in the dmat data frame, containing only treated units for which a matched set might exist
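
The rule described above (treated at time t, untreated at time t - 1) can be sketched in base R as follows. This is an illustration under the assumption of consecutive integer time periods and the ATT definition of treatment; the toy data and variable names are hypothetical, and this is not the package's implementation.

```r
# Toy panel: three units, four consecutive periods.
toy <- data.frame(id = rep(1:3, each = 4),
                  time = rep(1:4, times = 3),
                  treat = c(0, 0, 1, 1,
                            0, 0, 0, 0,
                            0, 1, 0, 1))

# Sort by unit and time so that lagging within unit is well defined.
toy <- toy[order(toy$id, toy$time), ]

# Previous-period treatment within each unit (NA for a unit's first period).
prev <- ave(toy$treat, toy$id, FUN = function(x) c(NA, head(x, -1)))

# Keep rows where treatment switches from 0 at t - 1 to 1 at t.
newly.treated <- toy[!is.na(prev) & prev == 0 & toy$treat == 1, ]
```

Here unit 1 is newly treated at time 3 and unit 3 at times 2 and 4; unit 1 at time 4 is excluded because it was already treated at time 3.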

find_ps

Description

find_ps

Usage

find_ps(sets, fitted.model)

Arguments

sets

matched sets

fitted.model

Result of a fitted (CB) PS model call

Value

Returns a list of data frames with propensity score weights for each unit in a matched set. Each element in the list is a data frame which corresponds to a matched set of 1 treatment and all matched control units

Return the refinement formula used in a PanelMatch specification

Description

Return the refinement formula used in a PanelMatch specification

Usage

## S3 method for class 'PanelMatch'
formula(x, ...)

Arguments

x

A PanelMatch Object

...

not used

Value

One sided formula object containing the variables/specification used in refinement. This corresponds to what was provided to the covs.formula argument.

get.matchedsets

Description

get.matchedsets is used to identify matched sets for a given unit with a specified i, t.

Usage

get.matchedsets(
  t,
  id,
  data,
  L,
  t.column,
  id.column,
  treatedvar,
  hasbeensorted = FALSE,
  match.on.missingness = TRUE,
  matching = TRUE,
  qoi.in,
  restrict.control.period = NULL
)

Arguments

t

integer vector specifying the times of treated units for which matched sets should be found. This vector should be the same length as the following id parameter – the entries at corresponding indices in each vector should form the t,id pair of a specified treatment unit.

id

integer vector specifying the unit ids of treated units for which matched sets should be found. Note that both t and id can be of length 1.

data

data frame containing the data to be used for finding matched sets.

L

An integer value indicating the length of treatment history to be matched

t.column

Character string that identifies the name of the column in data that contains the time variable. Each specified entry in t should appear somewhere in this column. These data must be integers that increase by one.

id.column

Character string that identifies the name of the column in data that contains the unit id variable. Each specified entry in id should appear somewhere in this column. These data must be integers.

treatedvar

Character string that identifies the name of the column in data that contains data about the binary treatment variable.

hasbeensorted

variable that only has internal usage for optimization purposes. There should be no need for a user to toggle this

match.on.missingness

TRUE/FALSE indicating whether or not the user wants to "match on missingness." That is, should units with NAs in their treatment history windows be matched with control units that have NA's in corresponding places?

matching

logical indicating whether or not the treatment history should be used for matching. This should almost always be set to TRUE, except for specific situations where the user is interested in particular diagnostic questions.

Value

get.matchedsets returns a "matched set" object, which primarily contains a named list of vectors. Each vector is a "matched set" containing the unit ids included in a matched set. The list names will indicate an i,t pair (formatted as "<i variable>.<t variable>") to which the vector/matched set corresponds.
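
To make the idea of exact matching on treatment history concrete, here is a minimal base R sketch. It assumes the ATT case, in which a control unit must share the treated unit's treatment history over the L periods before t and must be untreated at t. The helpers history and match_sketch are hypothetical and ignore missing data, so they do not reproduce the package's behavior in full.

```r
# Toy panel: four units, five periods.
toy <- data.frame(id = rep(1:4, each = 5),
                  time = rep(1:5, times = 4),
                  treat = c(0, 0, 0, 1, 1,
                            0, 0, 0, 0, 0,
                            0, 1, 0, 0, 0,
                            0, 0, 0, 0, 1))

# Hypothetical helper: treatment history of a unit over periods t - L, ..., t - 1.
history <- function(data, unit, t, L) {
  rows <- data[data$id == unit & data$time %in% (t - L):(t - 1), ]
  rows$treat[order(rows$time)]
}

# Hypothetical helper: ids of units with the same history that are untreated at t.
match_sketch <- function(data, unit, t, L) {
  target <- history(data, unit, t, L)
  candidates <- setdiff(unique(data$id), unit)
  keep <- vapply(candidates, function(u) {
    now <- data$treat[data$id == u & data$time == t]
    identical(history(data, u, t, L), target) && length(now) == 1 && now == 0
  }, logical(1))
  candidates[keep]
}

# Name the set "<id>.<t>", following the naming convention described above.
sets <- list("1.4" = match_sketch(toy, unit = 1, t = 4, L = 3))
```

Unit 1 is treated at time 4 after three untreated periods. Units 2 and 4 share that history and are untreated at time 4, while unit 3 is excluded because it was treated at time 2.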

getDits: returns a vector of Dit values, as defined in the paper. They should be in the same order as the data frame containing the original problem data.

Description

getDits: returns a vector of Dit values, as defined in the paper. They should be in the same order as the data frame containing the original problem data.

Usage

getDits(matched_sets, data)

Arguments

matched_sets

matched.set object

data

data.frame object

Value

vector of Dits, as described in Imai et al. (2023)

getWits: returns a vector of Wits, as defined in the paper (equation 25 or equation 23). They should be in the same order as the data frame containing the original problem data. The pts, pcs, and getWits functions act on a specific lead. For instance, if the lead window is 0,1,2,3,4, these functions must be called once for each of those values: first for 0, then for 1, and so on.

Description

getWits: returns a vector of Wits, as defined in the paper (equation 25 or equation 23). They should be in the same order as the data frame containing the original problem data. The pts, pcs, and getWits functions act on a specific lead. For instance, if the lead window is 0,1,2,3,4, these functions must be called once for each of those values: first for 0, then for 1, and so on.

Usage

getWits(matched_sets, lead, data, estimation.method = "bootstrap")

Arguments

matched_sets

matched.set object

lead

integer providing a specific lead value

data

data.frame object

estimation.method

method of estimation for calculating standard errors.

Value

data.table of Wits, as described above

Calculate covariate balance measures for refined and unrefined matched sets

Description

Calculate covariate balance for user specified covariates across matched sets. Balance is assessed by taking the average of the difference between the values of the specified covariates for the treated unit(s) and the weighted average of the control units across all matched sets. Results are standardized and are expressed in standard deviations. Balance is calculated for each period in the specified lag window.

Usage

get_covariate_balance(..., panel.data, covariates, include.unrefined = TRUE)

Arguments

...

one or more PanelMatch objects

panel.data

PanelData object

covariates

a character vector, specifying the names of the covariates for which the user is interested in calculating balance.

include.unrefined

logical. Indicates whether or not covariate balance measures for unrefined matched sets should be included. If TRUE, the function will return covariate balance results for the PanelMatch configurations provided, as well as a set of balance results that assume all matched controls have equal weight (i.e., the matched sets are unrefined). These results are included in addition to whatever PanelMatch configurations are specified to the function. Note that if you provide a PanelMatch object where no refinement is applied (that is, where refinement.method = "none") and set this option to TRUE, then both sets of covariate balance results will be identical. If FALSE, then only balance calculations for the provided PanelMatch specifications are performed and returned.

Value

A list of matrices, or a list of lists (if the QOI is ATE). The matrices contain the calculated covariate balance levels for each specified covariate for each period. Each element in the list (whether that be a matrix or a sublist) corresponds to a PanelMatch configuration specified to the function. Results are returned in the order they were provided. Unrefined results are stored as a parallel list object in an attribute called "unrefined.balance.results".

Examples

# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
# add some additional data to data set for demonstration purposes
dem.sub$rdata <- runif(nrow(dem.sub))
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4, 
                         refinement.method = "ps.match", 
                         match.missing = TRUE, 
                         covs.formula = ~ tradewb + rdata,
                         size.match = 5, qoi = "att",
                         lead = 0:4, 
                         forbid.treatment.reversal = FALSE)
get_covariate_balance(PM.results, panel.data = dem.sub.panel, covariates = c("tradewb", "rdata"))
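
The balance measure described for get_covariate_balance can be sketched by hand for a single covariate in a single period. The numbers below are invented for illustration, and the choice of standardizer (the standard deviation of the covariate among treated observations) is an assumption of this sketch rather than something stated in this help page.

```r
# Covariate value for the treated unit in each of three matched sets.
treated.x <- c(1.0, 2.0, 3.0)

# Covariate values and refinement weights for the controls in each matched set.
control.x <- list(c(0.5, 1.5), c(2.5, 1.0, 2.0), c(3.5))
control.w <- list(c(0.5, 0.5), c(0.2, 0.3, 0.5), c(1))

# Within each set: treated value minus the weighted average of the controls.
diffs <- mapply(function(x.t, x.c, w) x.t - sum(w * x.c),
                treated.x, control.x, control.w)

# Average across sets, expressed in standard deviation units (assumed standardizer).
balance <- mean(diffs) / sd(treated.x)
```

Values of balance close to zero indicate that, on average, the weighted controls resemble the treated units on this covariate.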

Calculate matched set level treatment effects

Description

Calculate the size of treatment effects for each matched set.

Usage

get_set_treatment_effects(pm.obj, panel.data, lead)

Arguments

pm.obj

an object of class PanelMatch

panel.data

PanelData object with the time series cross sectional data used for matching, refinement, and estimation

lead

integer (or integer vector) indicating the time period(s) in the future for which the treatment effect size will be calculated. Calculations will be made for the period t + lead, where t is the time of treatment. If more than one lead value is provided, then calculations will be performed for each value.

Value

a list equal in length to the number of lead periods specified to the lead argument. Each element in the list is a vector of the matched set level effect estimates.

Examples

# create subset of data for simplicity
dem.sub <- dem[dem[, "wbcode2"] <= 100, ]
dem.sub.panel <- PanelData(dem.sub, "wbcode2", "year", "dem", "y")
PM.results <- PanelMatch(panel.data = dem.sub.panel, lag = 4, 
                         refinement.method = "ps.match", 
                         match.missing = TRUE, 
                         covs.formula = ~ tradewb,
                         size.match = 5, qoi = "att",
                         lead = 0:4, 
                         forbid.treatment.reversal = FALSE)
set.effects <- get_set_treatment_effects(pm.obj = PM.results, 
                panel.data = dem.sub.panel, lead = 0)
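
For intuition about what a matched-set-level effect is, the following base R sketch computes one by hand. It assumes the difference-in-differences form from Imai, Kim, and Wang (2023) for the ATT: the treated unit's outcome change from t - 1 to t + lead, minus the weighted average of the same change among its matched controls. The outcome values and weights are invented, and this is not the package's internal code.

```r
# Outcomes for one treated unit ("1") and two matched controls ("2", "3").
y <- matrix(c(10, 11, 15,
              10, 10, 11,
              20, 21, 22),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("1", "2", "3"), c("t-1", "t", "t+1")))

# Refinement weights for the control units (assumed to sum to one).
weights <- c("2" = 0.5, "3" = 0.5)

# Effect at lead = 1: compare outcome changes from t - 1 to t + 1.
lead.col <- "t+1"
change <- y[, lead.col] - y[, "t-1"]
effect <- change["1"] - sum(weights * change[names(weights)])
```

The treated unit's outcome rises by 5, the weighted controls rise by 1.5 on average, so the set-level effect in this toy example is 3.5.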

Extract just the unrefined covariate balance results, if they exist

Description

Extract just the unrefined covariate balance results, if they exist

Usage

get_unrefined_balance(pb.object)

Arguments

pb.object

PanelBalance object

Extract unrefined covariate balance results, if they exist

Description

Extract unrefined covariate balance results, if they exist

Usage

## S3 method for class 'PanelBalance'
get_unrefined_balance(pb.object)

Arguments

pb.object

PanelBalance object

Value

A PanelBalance object, with just the unrefined balance results

Examples

dem$rdata <- runif(nrow(dem))
dem.panel <- PanelData(dem, "wbcode2", "year", "dem", "y")
pm.obj <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "mahalanobis", 
                     panel.data = dem.panel, match.missing = TRUE,
                     covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)), 
                     size.match = 5, qoi = "att")

# create multiple configurations to compare
pm2 <- PanelMatch(lead = 0:3, lag = 4, refinement.method = "ps.match", 
                  panel.data = dem.panel, match.missing = TRUE,
                  covs.formula = ~ tradewb + rdata + I(lag(tradewb, 1:4)) + I(lag(y, 1:4)), 
                  size.match = 5, qoi = "att")

pb <- get_covariate_balance(pm.obj, pm2,
                            include.unrefined = TRUE,
                            panel.data = dem.panel, 
                            covariates = c("tradewb", "rdata"))
get_unrefined_balance(pb)

handle_bootstrap

Description

Helper function for calculating bootstrapped estimates for the QOI. This version is not parallelized.

Usage

handle_bootstrap(
  qoi.in,
  data.in,
  lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  lag,
  pooled
)

Arguments

qoi.in

String specifying qoi

data.in

data.frame object with the data

number.iterations

integer. Specifies number of bootstrap iterations

att.treated.unit.ids

Integer vector specifying the treated units for the att or art

atc.treated.unit.ids

Integer vector specifying the "treated" units under the atc definition

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable

confidence.level

double. specifies confidence level for confidence interval

lag

integer vector specifying size of the lag.

pooled

logical. Specifies whether or not to calculate point estimates for each specified lead value, or a single pooled estimate.

Value

Returns a matrix of bootstrapped QOI estimate values.
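
The general idea of a bootstrap for a QOI can be sketched with a deliberately simplified percentile bootstrap in base R. This is an assumption-laden illustration: it resamples a handful of invented unit-level effects and averages them, whereas the package's helper works on the full data and weights. It only shows how number.iterations and confidence.level enter such a calculation.

```r
set.seed(1)

# Invented unit-level effects standing in for the real estimation inputs.
unit.effects <- c(3.5, 1.2, -0.4, 2.8, 0.9, 1.7)

number.iterations <- 500
confidence.level <- 0.95

# Resample units with replacement and recompute the estimate each time.
boot.est <- replicate(number.iterations,
                      mean(sample(unit.effects, replace = TRUE)))

# Percentile confidence interval at the requested level.
alpha <- 1 - confidence.level
ci <- quantile(boot.est, probs = c(alpha / 2, 1 - alpha / 2))
```

The vector boot.est plays the role of one column of bootstrapped estimates; with several lead values there would be one such column per lead.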

handle_bootstrap_parallel

Description

Helper function for calculating bootstrapped estimates for the QOI. This version is parallelized.

Usage

handle_bootstrap_parallel(
  qoi.in,
  data.in,
  lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  lag,
  pooled,
  num.cores = 1
)

Arguments

qoi.in

String specifying qoi

data.in

data.frame object with the data

number.iterations

integer. Specifies number of bootstrap iterations

att.treated.unit.ids

Integer vector specifying the treated units for the att or art

atc.treated.unit.ids

Integer vector specifying the "treated" units under the atc definition

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable

confidence.level

double. specifies confidence level for confidence interval

lag

integer vector specifying size of the lag.

pooled

logical. Specifies whether or not to calculate point estimates for each specified lead value, or a single pooled estimate.

num.cores

number of cores to be used for parallelization

Value

Returns a matrix of bootstrapped QOI estimate values.

handle_bootstrap_placebo

Description

Helper function for calculating bootstrapped estimates for the placebo test. This version is not parallelized.

Usage

handle_bootstrap_placebo(
  qoi.in,
  data.in,
  placebo.lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  lag
)

Arguments

qoi.in

String specifying qoi

data.in

data.frame object with the data

number.iterations

integer. specifies number of bootstrap iterations

att.treated.unit.ids

Integer vector specifying the treated units for the att or art

atc.treated.unit.ids

Integer vector specifying the "treated" units under the atc definition

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable

confidence.level

double. specifies confidence level for confidence interval

lag

integer vector specifying size of the lag.

Value

Returns a matrix of bootstrapped QOI estimate values.

handle_bootstrap_placebo_parallel

Description

Helper function for calculating bootstrapped estimates for the placebo test. This version is parallelized.

Usage

handle_bootstrap_placebo_parallel(
  qoi.in,
  data.in,
  placebo.lead,
  number.iterations,
  att.treated.unit.ids,
  atc.treated.unit.ids,
  outcome.variable,
  unit.id.variable,
  confidence.level,
  lag,
  num.cores = 1
)

Arguments

qoi.in

String specifying qoi

data.in

data.frame object with the data

number.iterations

integer. Specifies number of bootstrap iterations

att.treated.unit.ids

Integer vector specifying the treated units for the att or art

atc.treated.unit.ids

Integer vector specifying the "treated" units under the atc definition

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable

confidence.level

double. specifies confidence level for confidence interval

lag

integer vector specifying size of the lag.

num.cores

number of cores to be used for parallelization

Value

Returns a matrix of bootstrapped QOI estimate values.

handle_conditional_se: calculates conditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.

Description

handle_conditional_se: calculates conditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.

Usage

handle_conditional_se(
  qoi.in,
  data.in,
  lead,
  outcome.variable,
  unit.id.variable
)

Arguments

qoi.in

string specifying the QOI

data.in

data.frame specifying the data

lead

See PanelMatch() documentation

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable.

Value

Named vector with standard error estimates

handle_mahalanobis_calculations: returns a matched.set object with weights for control units, along with some other metadata

Description

handle_mahalanobis_calculations: returns a matched.set object with weights for control units, along with some other metadata

Usage

handle_mahalanobis_calculations(
  mahal.nested.list,
  msets,
  max.size,
  verbose,
  use.diagonal.covmat
)

Arguments

mahal.nested.list

Output from build_maha_mats function

msets

matched.set object – list containing the treated observations and matched controls

max.size

maximum number of control units that will receive non-zero weights within a matched set

verbose

Logical. See PanelMatch() documentation

use.diagonal.covmat

Logical. See PanelMatch() documentation

Value

matched.set object with weights for control units, along with some other metadata

handle_missing_data

Description

Uses col.index to determine which columns to scan for missing data. Earlier in the code, the columns are rearranged so that columns 1-4 are bookkeeping (unit id, time id, treated variable, unlagged outcome variable) and all remaining columns are used in the calculations after going through the parse_and_prep function, so col.index should usually be 5:ncol(data). In practice, this function checks the specified columns of the "data" data frame for missing data, then creates indicator columns recording the missingness of those variables: 1 for missing, 0 for present.

Usage

handle_missing_data(data, col.index)

Arguments

data

data.frame object.

col.index

numeric vector specifying which columns to inspect

Details

Tags missing data

Value

data.frame object with the data and the missingness indicators described above.
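
The tagging step described above can be sketched in base R as follows. The toy data, the "_NA" suffix for the indicator columns, and the variable names are assumptions made for this illustration; the package may name and place its indicator columns differently.

```r
# Toy data: four bookkeeping columns followed by two covariates.
toy <- data.frame(id = c(1, 1, 2, 2),
                  time = c(1, 2, 1, 2),
                  treat = c(0, 1, 0, 0),
                  y = c(5, 6, 7, 8),
                  x1 = c(0.3, NA, 0.8, 0.1),
                  x2 = c(NA, NA, 2, 4))

# Scan everything after the four bookkeeping columns.
col.index <- 5:ncol(toy)

# One indicator per scanned column: 1 where the value is missing, 0 otherwise.
flags <- as.data.frame(lapply(toy[col.index],
                              function(v) as.integer(is.na(v))))
names(flags) <- paste0(names(toy)[col.index], "_NA")  # hypothetical naming

tagged <- cbind(toy, flags)
```

The result keeps the original columns and appends x1_NA and x2_NA, which can then be used to match on patterns of missingness.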

handle_moderating_variable

Description

Handles moderating variable calculations. In practice, this involves slicing the data up according to the moderator, calling PanelEstimate(), and putting everything back together. This function creates the sets of objects on which PanelEstimate() will be called. It identifies the set of valid values the moderating variable can take on.

Usage

handle_moderating_variable(
  ordered.data,
  att.sets,
  atc.sets,
  PM.object,
  moderator,
  unit.id,
  time.id,
  qoi.in
)

Arguments

ordered.data

data.frame

att.sets

matched.set object for the ATT or ART

atc.sets

matched.set object for the ATC

PM.object

PanelMatch object

moderator

string specifying the name of the moderating variable

unit.id

string specifying the unit id variable

time.id

string specifying the time id variable

qoi.in

string specifying the QOI

Value

Character vector of valid moderating variable values

handle_ps_match: returns a matched.set object with weights for control units, along with some other metadata

Description

handle_ps_match: returns a matched.set object with weights for control units, along with some other metadata

Usage

handle_ps_match(just.ps.sets, msets, refinement.method, verbose, max.set.size)

Arguments

just.ps.sets

Output from find_ps() function

msets

matched.set object – list containing the treated observations and matched controls

verbose

Logical. See PanelMatch() documentation

max.set.size

maximum number of control units that will receive non-zero weights within a matched set

Value

matched.set object with weights for control units, along with some other metadata

handle_ps_weighted

Description

handle_ps_weighted

Usage

handle_ps_weighted(just.ps.sets, msets, refinement.method)

Arguments

just.ps.sets

results of find_ps()

msets

list of matched sets of treated and control observations

refinement.method

string specifying the refinement method

Value

matched.set object with treated and matched control observations, with weights as determined by the specification

handle_unconditional_se: calculates unconditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.

Description

handle_unconditional_se: calculates unconditional standard errors analytically, as defined in Imai et al. (2023). See PanelEstimate() for a more complete description of the standard error types.

Usage

handle_unconditional_se(
  qoi.in,
  data.in,
  lead,
  outcome.variable,
  unit.id.variable
)

Arguments

qoi.in

string specifying the QOI

data.in

data.frame specifying the data

lead

See PanelMatch() documentation

outcome.variable

string specifying the name of the outcome variable

unit.id.variable

string specifying the name of the unit id variable.

Value

Named vector with standard error estimates

identifyDirectionalChanges: identifies changes in the treatment variable for treated and control observations

Description

identifyDirectionalChanges: identifies changes in the treatment variable for treated and control observations

Usage

identifyDirectionalChanges(
  msets,
  ordered.data,
  id.var,
  time.var,
  treatment.var,
  qoi
)

Arguments

msets

ordered.data

id.var

time.var

treatment.var

qoi

Value

matched.set object with changes in the treatment variable for treated and control observations identified.

lwd_refinement: master function that performs refinement with listwise deletion = TRUE

Description

lwd_refinement: master function that performs refinement with listwise deletion = TRUE

Usage

lwd_refinement(
  msets,
  global.data,
  treated.ts,
  treated.ids,
  lag,
  time.id,
  unit.id,
  lead,
  refinement.method,
  treatment,
  size.match,
  match.missing,
  covs.formula,
  verbose,
  outcome.var,
  e.sets,
  use.diag.covmat
)

Arguments

msets

global.data

data.frame. Must be a fully prepped and parsed data set that is internally balanced; it will likely contain many NAs.

treated.ts

vector of the times of treatment for treated observations

treated.ids

vector of unit identifiers of treated observations

lag

time.id

string specifying

unit.id

lead

vector of lead values

refinement.method

string specifying refinement method

treatment

string specifying treatment variable

size.match

maximum number of units to give non-zero weight to when using matching refinement method

match.missing

logical. indicates whether or not to allow the package to match units on missingness in treatment history

covs.formula

see PanelMatch documentation for descriptions

verbose

see PanelMatch documentation for descriptions

outcome.var

string specifying outcome variable

e.sets

empty sets (treated observations with no matched controls)

use.diag.covmat

see PanelMatch documentation for descriptions

Value

matched.set object with refined matched sets.

lwd_units: helper function that subsets sets down to contain only units with complete data

Description

lwd_units: helper function that subsets sets down to contain only units with complete data

Usage

lwd_units(full.local.data, unit.id)

Arguments

full.local.data

data.frame containing the data to be used in set-level refinement, but containing missing data

unit.id

Value

data.frame with the missing data removed to be used for set-level refinement.
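
A unit-level version of listwise deletion can be sketched in base R as below. The sketch assumes that a unit is dropped entirely if any of its rows contains a missing value; whether the package applies exactly this rule is not stated in this help page, and the toy data and names are hypothetical.

```r
# Toy set-level data: unit 2 has a missing covariate value in one period.
local.data <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                         time = c(1, 2, 1, 2, 1, 2),
                         x = c(0.2, 0.4, NA, 0.1, 0.9, 0.7))

# Units with at least one incomplete row.
incomplete.ids <- unique(local.data$id[!complete.cases(local.data)])

# Keep only rows belonging to units with fully observed data.
complete.only <- local.data[!local.data$id %in% incomplete.ids, ]
```

After this step only units 1 and 3 remain, so refinement for the set would proceed without unit 2.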

A constructor for the matched.set class.

Description

Users should never need to use this function by itself. See below for more about matched.set objects.

Usage

matched_set(matchedsets, id, t, L, t.var, id.var, treatment.var)

Arguments

matchedsets

a list of treated units and matched control units. Each element in the list should be a vector of control unit ids.

id

A vector containing the ids of treated units

t

A vector containing the times of treatment for treated units.

L

integer specifying the length of the lag window used in matching

t.var

string specifying the time variable

id.var

string specifying the unit id variable

treatment.var

string specifying the treatment variable.

The constructor function returns a matched.set object. matched.set objects are a modified list. Each element in the list is a vector of ids corresponding to the control unit ids in a matched set. Additionally, these vectors might have additional attributes – "weights". These correspond to the weights assigned to each control unit, as determined by the specified refinement method. Each element in the list also has a name, which corresponds to the unit id of the treated unit and time of treatment, concatenated together and separated by a period. matched.set objects also have a number of methods defined: summary, plot, and `[`. matched.set objects can be modified manually as long as these conventions (and conventions about other attributes) are maintained. It is important to note that matched.set objects are distinct from PanelMatch objects. matched.set objects are often contained within PanelMatch objects.

Value

matched.set objects have additional attributes. These reflect the specified parameters when using the PanelMatch function:

lag

an integer value indicating the length of treatment history to be used for matching. Treated and control units are matched based on whether or not they have exactly matching treatment histories in the lag window.

t.var

time variable name, represented as a character/string

id.var

unit id variable name, represented as a character/string

treatment.var

treatment variable name, represented as a character/string

class

class of the object: should always be "matched.set"

refinement.method

method used to refine and/or weight the control units in each set.

covs.formula

One sided formula indicating which variables should be used for matching and refinement

match.missing

Logical variable indicating whether or not units should be matched on the patterns of missingness in their treatment histories

max.match.size

Maximum size of the matched sets after refinement. This argument only affects results when using a matching method

Author(s)

Adam Rauh <amrauh@umich.edu>, In Song Kim <insong@mit.edu>, Erik Wang <haixiao@Princeton.edu>, and Kosuke Imai <imai@harvard.edu>

merge_formula

Description

Simple helper function for merging formula objects

Usage

merge_formula(form1, form2)

Arguments

form1

formula object

form2

formula object

Value

Returns a formula object, which is the concatenation of two provided formula objects.
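As a rough illustration of what concatenating two formulas can look like in base R, the sketch below appends the right-hand side of a second formula to the first. The function name combine_formulas_sketch is hypothetical, and this is not necessarily how merge_formula is implemented; it assumes the terms of both formulas can simply be joined with "+".

```r
# Hypothetical helper: joins the right-hand sides of two formulas with "+".
combine_formulas_sketch <- function(form1, form2) {
  # The right-hand side is the last element of a formula object.
  rhs1 <- paste(deparse(form1[[length(form1)]]), collapse = " ")
  rhs2 <- paste(deparse(form2[[length(form2)]]), collapse = " ")
  # Keep the left-hand side of form1 if it is a two-sided formula.
  lhs <- if (length(form1) == 3L) {
    paste(deparse(form1[[2]]), collapse = " ")
  } else {
    ""
  }
  stats::as.formula(paste(lhs, "~", rhs1, "+", rhs2),
                    env = environment(form1))
}

merged <- combine_formulas_sketch(y ~ x1, ~ x2 + z)
merged_vars <- all.vars(merged)
```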

parse_and_prep

Description

Accepts a formula object and data, and creates the data used for refinement.

Usage

parse_and_prep(formula, data)

Arguments

formula

formula object specifying how to construct the data used for refinement. This is likely to be some variation of the covs.formula argument.

data

data.frame object to be used to create the data needed for refinement. The data is expected to have unit, time, and treatment columns in that order, followed by all other columns.

Value

data.frame object with the data prepared for refinement. Data will have unit, time, treatment columns in that order, followed by everything else.
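The column layout described above (unit, time, treatment, then everything else) can be sketched with base R tools. The helper prep_sketch and the toy data are hypothetical, and using stats::model.matrix to expand the formula is an assumption made for illustration, not a statement about the internals of parse_and_prep.

```r
# Hypothetical helper: expands a one-sided formula into covariate columns
# and places them after the unit, time, and treatment columns.
prep_sketch <- function(formula, data) {
  covs <- stats::model.matrix(formula, data = data)
  # Drop the intercept column, which carries no covariate information.
  covs <- covs[, colnames(covs) != "(Intercept)", drop = FALSE]
  cbind(data[, 1:3], as.data.frame(covs))
}

# Toy panel with no missing values (invented for illustration).
toy <- data.frame(
  unit  = c(1, 1, 2, 2),
  time  = c(1, 2, 1, 2),
  treat = c(0, 1, 0, 0),
  x     = c(0.5, 1.5, 2.5, 3.5),
  y     = c(10, 11, 12, 13)
)

prepared <- prep_sketch(~ x + I(x^2), toy)
```

Here the outcome y is not part of the formula, so it does not appear in the prepared data; only the three identifying columns and the covariates named in the formula are kept.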

Prepare Control Units

Description

pcs and pts create data frames containing the time/id combinations that need to be found, so that they can be looked up in the data frame via a hash table. The data frames also contain information about the weight of each unit at particular times; the hash table is used to locate where these weights belong, so that they can be assigned in the original data frame containing the problem data. pcs does this for all control units in a matched set; pts does this for all treated units.

Usage

pcs(sets, lead.in)

Arguments

sets

object describing the matched sets

lead.in

integer describing a particular lead value.

Value

data.frame object with time-id combinations
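The lookup idea described above can be sketched in base R as follows. The helper control_rows_sketch, the toy matched sets, and the toy panel are all hypothetical; in particular, pairing each control unit with the time "treatment time plus lead" is an assumption made for illustration, and match() on pasted keys stands in for the hash table.

```r
# Toy matched sets: names are "<treated id>.<treatment time>",
# values are control unit ids (invented for illustration).
sets <- list("1.5" = c(2, 3), "4.6" = c(3))

# Hypothetical helper: one row per control unit, at treatment time + lead.
control_rows_sketch <- function(sets, lead) {
  # The treatment time is the part of each name after the last period.
  t_treat <- as.integer(sub(".*\\.", "", names(sets)))
  n_controls <- lengths(sets)
  data.frame(
    id   = unlist(sets, use.names = FALSE),
    time = rep(t_treat + lead, times = n_controls),
    set  = rep(names(sets), times = n_controls)
  )
}

rows <- control_rows_sketch(sets, lead = 1)

# Locate each id/time combination in a toy balanced panel via a string key.
panel <- expand.grid(id = 1:4, time = 1:8)
idx <- match(paste(rows$id, rows$time, sep = "."),
             paste(panel$id, panel$time, sep = "."))
```

The resulting idx gives the row positions in the panel where weights for those control observations could be written.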

perform_refinement

Description

Performs refinement of matched sets, ultimately returning sets of treated observations and controls with weights. This function mostly acts as an intermediary between PanelMatch and the lower-level functions that carry out the refinement. It takes many of the same arguments as PanelMatch().