Repository: ebenmichael/augsynth
Branch: master
Commit: 65c5a6f34f4e
Files: 83
Total size: 373.9 KB

Directory structure:
gitextract_wwvwxodd/

├── .Rbuildignore
├── .gitignore
├── .travis.yml
├── DESCRIPTION
├── LICENSE
├── NAMESPACE
├── R/
│   ├── augsynth.R
│   ├── augsynth_pre.R
│   ├── cv.R
│   ├── data.R
│   ├── eligible_donors.R
│   ├── fit_synth.R
│   ├── format.R
│   ├── globalVariables.R
│   ├── highdim.R
│   ├── inference.R
│   ├── multi_outcomes.R
│   ├── multi_synth_qp.R
│   ├── multisynth_class.R
│   ├── outcome_models.R
│   ├── outcome_multi.R
│   ├── ridge.R
│   ├── ridge_lambda.R
│   └── time_regression_multi.R
├── README.md
├── data/
│   └── kansas.rda
├── data-raw/
│   ├── clean_kansas.R
│   └── kansas_longer2.dta
├── man/
│   ├── augsynth-package.Rd
│   ├── augsynth.Rd
│   ├── augsynth_multiout.Rd
│   ├── check_data_stag.Rd
│   ├── conformal_inf.Rd
│   ├── conformal_inf_linear.Rd
│   ├── conformal_inf_multiout.Rd
│   ├── get_nona_donors.Rd
│   ├── jackknife_se_single.Rd
│   ├── kansas.Rd
│   ├── make_V_matrix.Rd
│   ├── multisynth.Rd
│   ├── plot.augsynth.Rd
│   ├── plot.augsynth_multiout.Rd
│   ├── plot.multisynth.Rd
│   ├── plot.summary.augsynth.Rd
│   ├── plot.summary.augsynth_multiout.Rd
│   ├── plot.summary.multisynth.Rd
│   ├── predict.augsynth.Rd
│   ├── predict.augsynth_multiout.Rd
│   ├── predict.multisynth.Rd
│   ├── print.augsynth.Rd
│   ├── print.augsynth_multiout.Rd
│   ├── print.multisynth.Rd
│   ├── print.summary.augsynth.Rd
│   ├── print.summary.augsynth_multiout.Rd
│   ├── print.summary.multisynth.Rd
│   ├── rdirichlet_b.Rd
│   ├── rmultinom_b.Rd
│   ├── rwild_b.Rd
│   ├── single_augsynth.Rd
│   ├── summary.augsynth.Rd
│   ├── summary.augsynth_multiout.Rd
│   ├── summary.multisynth.Rd
│   ├── time_jackknife_plus.Rd
│   └── time_jackknife_plus_multiout.Rd
├── pkg.Rproj
├── tests/
│   ├── testthat/
│   │   ├── test_augsynth_pre.R
│   │   ├── test_format.R
│   │   ├── test_general.R
│   │   ├── test_lambda.R
│   │   ├── test_load_data.R
│   │   ├── test_multiple_outcomes.R
│   │   ├── test_multisynth.R
│   │   ├── test_multisynth_covariates.R
│   │   ├── test_outcome_models.R
│   │   ├── test_time_cohort.R
│   │   └── test_unbalanced_multisynth.R
│   └── testthat.R
└── vignettes/
    ├── .gitignore
    ├── multi-outcomes-vignette.Rmd
    ├── multisynth-vignette.Rmd
    ├── multisynth-vignette.md
    ├── singlesynth-vignette.Rmd
    └── singlesynth-vignette.md

================================================
FILE CONTENTS
================================================

================================================
FILE: .Rbuildignore
================================================
^data-raw$
^Meta$
^doc$
^\.travis\.yml$
^pkg.Rproj$
figure$
cache$


================================================
FILE: .gitignore
================================================
Meta
doc
inst/doc

## Files
# Emacs autosave files
*~
\#*#
# Don't put data in the repo
*.csv
*.feather

# R stuff
*.Rout
*.Rhistory
*.RData 
*.Rapp.history

# Mac stuff
*.DS_store


# C++ stuff
*.o
*.so
*.dll

test.R

*-vignette.pdf

================================================
FILE: .travis.yml
================================================
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r

language: r
r:
    - 3.5.1  
sudo: false
cache: packages
warnings_are_errors: false
r_binary_packages:
    - dplyr
    - magrittr
    - ggplot2
    - glmnet
    - plyr
    - kableExtra

================================================
FILE: DESCRIPTION
================================================
Package: augsynth
Title: The Augmented Synthetic Control Method
Version: 0.2.0
Authors@R: person("Eli", "Ben-Michael", email = "ebenmichael@berkeley.edu", role = c("aut", "cre"))
Description: A package implementing the Augmented Synthetic Controls Method.
Depends:
    R (>= 3.5.0)
Imports:
    dplyr,
    tidyr,
    magrittr,
    ggplot2,
    MASS,
    LiblineaR,
    Formula,
    Matrix,
    osqp,
    rlang,
    purrr,
    FNN
Remotes:
    susanathey/MCPanel
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.2.3
Suggests:
    testthat,
    CausalImpact,
    keras,
    gsynth,
    knitr,
    rmarkdown,
    softImpute,
    MCPanel,
    glmnet,
    randomForest,
    kableExtra,
    ggrepel
VignetteBuilder: knitr


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Elijahu Ben-Michael

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: NAMESPACE
================================================
# Generated by roxygen2: do not edit by hand

S3method(plot,augsynth)
S3method(plot,augsynth_multiout)
S3method(plot,multisynth)
S3method(plot,summary.augsynth)
S3method(plot,summary.augsynth_multiout)
S3method(plot,summary.multisynth)
S3method(predict,augsynth)
S3method(predict,augsynth_multiout)
S3method(predict,multisynth)
S3method(print,augsynth)
S3method(print,augsynth_multiout)
S3method(print,multisynth)
S3method(print,summary.augsynth)
S3method(print,summary.augsynth_multiout)
S3method(print,summary.multisynth)
S3method(summary,augsynth)
S3method(summary,augsynth_multiout)
S3method(summary,multisynth)
export(augsynth)
export(augsynth_multiout)
export(multisynth)
export(rdirichlet_b)
export(rmultinom_b)
export(rwild_b)
export(single_augsynth)
import(dplyr)
import(tidyr)
importFrom(ggplot2,aes)
importFrom(graphics,plot)
importFrom(magrittr,"%>%")
importFrom(purrr,reduce)
importFrom(stats,coef)
importFrom(stats,delete.response)
importFrom(stats,formula)
importFrom(stats,lm)
importFrom(stats,model.frame)
importFrom(stats,model.matrix)
importFrom(stats,na.omit)
importFrom(stats,poly)
importFrom(stats,predict)
importFrom(stats,sd)
importFrom(stats,terms)
importFrom(stats,update)
importFrom(utils,capture.output)


================================================
FILE: R/augsynth.R
================================================
################################################################################
## Main functions for single-period treatment augmented synthetic controls Method
################################################################################


#' Fit Augmented SCM
#' 
#' @param form outcome ~ treatment | auxillary covariates
#' @param unit Name of unit column
#' @param time Name of time column
#' @param t_int Time of intervention
#' @param data Panel data as dataframe
#' @param progfunc What function to use to impute control outcomes
#'                 ridge=Ridge regression (allows for standard errors),
#'                 none=No outcome model,
#'                 en=Elastic Net, RF=Random Forest, GSYN=gSynth,
#'                 mcp=MCPanel, 
#'                 cits=Comparitive Interuppted Time Series
#'                 causalimpact=Bayesian structural time series with CausalImpact
#' @param scm Whether the SCM weighting function is used
#' @param fixedeff Whether to include a unit fixed effect, default F 
#' @param cov_agg Covariate aggregation functions, if NULL then use mean with NAs omitted
#' @param ... optional arguments for outcome model
#'
#' @return augsynth object that contains:
#'         \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#'          \item{"data"}{Panel data as matrices}
#'         }
#' @export
single_augsynth <- function(form, unit, time, t_int, data,
                     progfunc = "ridge",
                     scm=T,
                     fixedeff = FALSE,
                     cov_agg=NULL, ...) {
    call_name <- match.call()

    form <- Formula::Formula(form)
    unit <- enquo(unit)
    time <- enquo(time)

    ## format data
    outcome <- terms(formula(form, rhs=1))[[2]]
    trt <- terms(formula(form, rhs=1))[[3]]

    wide <- format_data(outcome, trt, unit, time, t_int, data)
    synth_data <- do.call(format_synth, wide)
    
    treated_units <- data %>% filter(!!trt == 1) %>% distinct(!!unit) %>% pull(!!unit)
    control_units <- data %>% filter(!(!!unit %in% treated_units)) %>% 
                        distinct(!!unit) %>% arrange(!!unit) %>% pull(!!unit)
        ## add covariates
    if(length(form)[2] == 2) {
        Z <- extract_covariates(form, unit, time, t_int, data, cov_agg)
    } else {
        Z <- NULL
    }
    
    # fit augmented SCM
    augsynth <- fit_augsynth_internal(wide, synth_data, Z, progfunc, 
                                      scm, fixedeff, ...)
    
    # add some extra data
    augsynth$data$time <- data %>% distinct(!!time) %>%
                                   arrange(!!time) %>% pull(!!time)
    augsynth$call <- call_name
    augsynth$t_int <- t_int 
    
    augsynth$weights <- matrix(augsynth$weights)
    rownames(augsynth$weights) <- control_units

    return(augsynth)
}


#' Internal function to fit augmented SCM
#' @param wide Data formatted from format_data
#' @param synth_data Data formatted from foramt_synth
#' @param Z Matrix of auxiliary covariates
#' @param progfunc outcome model to use
#' @param scm Whether to fit SCM
#' @param fixedeff Whether to de-mean synth
#' @param V V matrix for Synth, default NULL
#' @param ... Extra args for outcome model
#' 
#' @noRd
#' 
fit_augsynth_internal <- function(wide, synth_data, Z, progfunc,
                                  scm, fixedeff, V = NULL, ...) {

    n <- nrow(wide$X)
    t0 <- ncol(wide$X)
    ttot <- t0 + ncol(wide$y)
    if(fixedeff) {
        demeaned <- demean_data(wide, synth_data)
        fit_wide <- demeaned$wide
        fit_synth_data <- demeaned$synth_data
        mhat <- demeaned$mhat
    } else {
        fit_wide <- wide
        fit_synth_data <- synth_data
        mhat <- matrix(0, n, ttot)
    }
    if (is.null(progfunc)) {
        progfunc = "none"
    }
    progfunc = tolower(progfunc)
    ## fit augsynth
    if(progfunc == "ridge") {
        # Ridge ASCM
        augsynth <- do.call(fit_ridgeaug_formatted,
                            list(wide_data = fit_wide,
                                 synth_data = fit_synth_data,
                                 Z = Z, V = V, scm = scm, ...))
    } else if(progfunc == "none") {
        ## Just SCM
        augsynth <- do.call(fit_ridgeaug_formatted,
                        c(list(wide_data = fit_wide, 
                               synth_data = fit_synth_data,
                               Z = Z, ridge = F, scm = T, V = V, ...)))
    } else {
        ## Other outcome models
        progfuncs = c("ridge", "none", "en", "rf", "gsyn", "mcp",
                      "cits", "causalimpact", "seq2seq")
        if (progfunc %in% progfuncs) {
            augsynth <- fit_augsyn(fit_wide, fit_synth_data, 
                                   progfunc, scm, ...)
        } else {
            stop("progfunc must be one of 'EN', 'RF', 'GSYN', 'MCP', 'CITS', 'CausalImpact', 'seq2seq', 'None'")
        }
        
    }

    augsynth$mhat <- mhat + cbind(matrix(0, nrow = n, ncol = t0), 
                                  augsynth$mhat)
    augsynth$data <- wide
    augsynth$data$Z <- Z
    augsynth$data$synth_data <- synth_data
    augsynth$progfunc <- progfunc
    augsynth$scm <- scm
    augsynth$fixedeff <- fixedeff
    augsynth$extra_args <- list(...)
    if(progfunc == "ridge") {
        augsynth$extra_args$lambda <- augsynth$lambda
    } else if(progfunc == "gsyn") {
        augsynth$extra_args$r <- ncol(augsynth$params$factor)
        augsynth$extra_args$CV <- 0
    }
    ##format output
    class(augsynth) <- "augsynth"
    return(augsynth)
}

#' Get prediction of ATT or average outcome under control
#' @param object augsynth object
#' @param att If TRUE, return the ATT, if FALSE, return imputed counterfactual
#' @param ... Optional arguments
#'
#' @return Vector of predicted post-treatment control averages
#' @export
predict.augsynth <- function(object, att = F, ...) {
    # if ("att" %in% names(list(...))) {
    #     att <- list(...)$att
    # } else {
    #     att <- F
    # }
    augsynth <- object
    
    X <- augsynth$data$X
    y <- augsynth$data$y
    comb <- cbind(X, y)
    trt <- augsynth$data$trt
    mhat <- augsynth$mhat
    
    m1 <- colMeans(mhat[trt==1,,drop=F])

    resid <- (comb[trt==0,,drop=F] - mhat[trt==0,drop=F])

    y0 <- m1 + t(resid) %*% augsynth$weights
    if(att) {
        return(colMeans(comb[trt == 1,, drop = F]) - c(y0))
    } else {
        rnames <- rownames(y0)
        y0_vec <- c(y0)
        names(y0_vec) <- rnames
        return(y0_vec)
    }
}


#' Print function for augsynth
#' @param x augsynth object
#' @param ... Optional arguments
#' @export
print.augsynth <- function(x, ...) {
    augsynth <- x
    
    ## straight from lm
    cat("\nCall:\n", paste(deparse(augsynth$call), sep="\n", collapse="\n"), "\n\n", sep="")

    ## print att estimates
    tint <- ncol(augsynth$data$X)
    ttotal <- tint + ncol(augsynth$data$y)
    att_post <- predict(augsynth, att = T)[(tint + 1):ttotal]

    cat(paste("Average ATT Estimate: ",
              format(round(mean(att_post),3), nsmall = 3), "\n\n", sep=""))
}


#' Plot function for augsynth
#' @importFrom graphics plot
#' 
#' @param x Augsynth object to be plotted
#' @param inf Boolean, whether to get confidence intervals around the point estimates
#' @param cv If True, plot cross validation MSE against hyper-parameter, otherwise plot effects
#' @param ... Optional arguments
#' @export
plot.augsynth <- function(x, inf = T, cv = F, ...) {
    # if ("se" %in% names(list(...))) {
    #     se <- list(...)$se
    # } else {
    #     se <- T
    # }

    augsynth <- x
    
    if (cv == T) {
        errors = data.frame(lambdas = augsynth$lambdas,
                            errors = augsynth$lambda_errors,
                            errors_se = augsynth$lambda_errors_se)
        p <- ggplot2::ggplot(errors, ggplot2::aes(x = lambdas, y = errors)) +
              ggplot2::geom_point(size = 2) + 
              ggplot2::geom_errorbar(
                ggplot2::aes(ymin = errors,
                             ymax = errors + errors_se),
                width=0.2, size = 0.5) 
        p <- p + ggplot2::labs(title = bquote("Cross Validation MSE over " ~ lambda),
                              x = expression(lambda), y = "Cross Validation MSE", 
                              parse = TRUE)
        p <- p + ggplot2::scale_x_log10()
        
        # find minimum and min + 1se lambda to plot
        min_lambda <- choose_lambda(augsynth$lambdas,
                                   augsynth$lambda_errors,
                                   augsynth$lambda_errors_se,
                                   F)
        min_1se_lambda <- choose_lambda(augsynth$lambdas,
                                       augsynth$lambda_errors,
                                       augsynth$lambda_errors_se,
                                       T)
        min_lambda_index <- which(augsynth$lambdas == min_lambda)
        min_1se_lambda_index <- which(augsynth$lambdas == min_1se_lambda)

        p <- p + ggplot2::geom_point(
            ggplot2::aes(x = min_lambda, 
                         y = augsynth$lambda_errors[min_lambda_index]),
            color = "gold")
        p + ggplot2::geom_point(
              ggplot2::aes(x = min_1se_lambda,
                           y = augsynth$lambda_errors[min_1se_lambda_index]),
              color = "gold") +
            ggplot2::theme_bw()
    } else {
        plot(summary(augsynth, ...), inf = inf)
    }
}


#' Summary function for augsynth
#' @param object augsynth object
#' @param inf Boolean, whether to get confidence intervals around the point estimates
#' @param inf_type Type of inference algorithm. Options are
#'         \itemize{
#'          \item{"conformal"}{Conformal inference (default)}
#'          \item{"jackknife+"}{Jackknife+ algorithm over time periods}
#'          \item{"jackknife"}{Jackknife over units}
#'         }
#' @param linear_effect Boolean, whether to invert the conformal inference hypothesis test to get confidence intervals for a linear-in-time treatment effect: intercept + slope * time 
#' @param ... Optional arguments for inference, for more details for each `inf_type` see
#'         \itemize{
#'          \item{"conformal"}{`conformal_inf`}
#'          \item{"jackknife+"}{`time_jackknife_plus`}
#'          \item{"jackknife"}{`jackknife_se_single`}
#'         }
#' @export
summary.augsynth <- function(object, inf = T, inf_type = "conformal",
                             linear_effect = F,
                             ...) {
    augsynth <- object
    summ <- list()

    t0 <- ncol(augsynth$data$X)
    t_final <- t0 + ncol(augsynth$data$y)

    if(inf) {
        if(inf_type == "jackknife") {
            att_se <- jackknife_se_single(augsynth)
        } else if(inf_type == "jackknife+") {
            att_se <- time_jackknife_plus(augsynth, ...)
        } else if(inf_type == "conformal") {
          att_se <- conformal_inf(augsynth, ...)
          # get CIs for linear treatment effects
          if(linear_effect) {
            att_linear <- conformal_inf_linear(augsynth, ...)
          }
        } else {
            stop(paste(inf_type, "is not a valid choice of 'inf_type'"))
        }

        att <- data.frame(Time = augsynth$data$time,
                          Estimate = att_se$att[1:t_final])
        if(inf_type == "jackknife") {
          att$Std.Error <- att_se$se[1:t_final]
          att_avg_se <- att_se$se[t_final + 1]
        } else {
          att_avg_se <- NA
        }
        att_avg <- att_se$att[t_final + 1]
        if(inf_type %in% c("jackknife+", "nonpar_bs", "t_dist", "conformal")) {
            att$lower_bound <- att_se$lb[1:t_final]
            att$upper_bound <- att_se$ub[1:t_final]
        }
        if(inf_type == "conformal") {
          att$p_val <- att_se$p_val[1:t_final]
        }

    } else {
        t0 <- ncol(augsynth$data$X)
        t_final <- t0 + ncol(augsynth$data$y)
        att_est <- predict(augsynth, att = T)
        att <- data.frame(Time = augsynth$data$time,
                          Estimate = att_est)
        att$Std.Error <- NA
        att_avg <- mean(att_est[(t0 + 1):t_final])
        att_avg_se <- NA
    }

    summ$att <- att

    if(inf) {
      if(inf_type %in% c("jackknife+")) {
        summ$average_att <- data.frame(Value = "Average Post-Treatment Effect",
                              Estimate = att_avg, Std.Error = att_avg_se)
        summ$average_att$lower_bound <- att_se$lb[t_final + 1]
        summ$average_att$upper_bound <- att_se$ub[t_final + 1]
        summ$alpha <-  att_se$alpha
      }
      if(inf_type == "conformal") {
        # summ$average_att$p_val <- att_se$p_val[t_final + 1]
        # summ$average_att$lower_bound <- att_se$lb[t_final + 1]
        # summ$average_att$upper_bound <- att_se$ub[t_final + 1]
        # summ$alpha <-  att_se$alpha
        if(linear_effect) {
          summ$average_att <- data.frame(
                                Value = c("Average Post-Treatment Effect",
                                          "Treatment Effect Intercept",
                                          "Treatment Effect Slope"),
                                Estimate = c(att_avg, att_linear$est_int,
                                             att_linear$est_slope),
                                Std.Error = c(att_avg_se, NA, NA),
                                p_val = c(att_se$p_val[t_final + 1], NA, NA),
                                lower_bound = c(att_se$lb[t_final + 1],
                                            att_linear$ci_int[1],
                                            att_linear$ci_slope[1]),
                                upper_bound =  c(att_se$ub[t_final + 1],
                                            att_linear$ci_int[2],
                                            att_linear$ci_slope[2])
          )
        } else {
          summ$average_att <- data.frame(
                                Value = c("Average Post-Treatment Effect"),
                                Estimate = att_avg,
                                Std.Error = att_avg_se,
                                p_val = att_se$p_val[t_final + 1],
                                lower_bound = att_se$lb[t_final + 1],
                                upper_bound =  att_se$ub[t_final + 1]
          )

        }
        summ$alpha <-  att_se$alpha
      }
    } else {
              summ$average_att <- data.frame(Value = "Average Post-Treatment Effect",
                              Estimate = att_avg, Std.Error = att_avg_se)
    }
    summ$t_int <- augsynth$t_int
    summ$call <- augsynth$call
    summ$l2_imbalance <- augsynth$l2_imbalance
    summ$scaled_l2_imbalance <- augsynth$scaled_l2_imbalance
    if(!is.null(augsynth$covariate_l2_imbalance)) {
      summ$covariate_l2_imbalance <- augsynth$covariate_l2_imbalance
      summ$scaled_covariate_l2_imbalance <- augsynth$scaled_covariate_l2_imbalance
    }
    ## get estimated bias

    if(tolower(augsynth$progfunc) == "ridge") {
        mhat <- augsynth$ridge_mhat
        w <- augsynth$synw
    } else {
        mhat <- augsynth$mhat
        w <- augsynth$weights
    }
    trt <- augsynth$data$trt
    m1 <- colMeans(mhat[trt==1,,drop=F])

    if(tolower(augsynth$progfunc) == "none" | (!augsynth$scm)) {
        summ$bias_est <- NA
    } else {
      summ$bias_est <- m1 - t(mhat[trt==0,,drop=F]) %*% w
    }
    
    
    summ$inf_type <- if(inf) inf_type else "None"
    class(summ) <- "summary.augsynth"
    return(summ)
}

#' Print function for summary function for augsynth
#' @param x summary object
#' @param ... Optional arguments
#' @export
print.summary.augsynth <- function(x, ...) {
    summ <- x
    
    ## straight from lm
    cat("\nCall:\n", paste(deparse(summ$call), sep="\n", collapse="\n"), "\n\n", sep="")

    t_final <- nrow(summ$att)

    ## distinction between pre and post treatment
    att_est <- summ$att$Estimate
    t_total <- length(att_est)
    t_int <- summ$att %>% filter(Time <= summ$t_int) %>% nrow()
    
    att_pre <- att_est[1:(t_int-1)]
    att_post <- att_est[t_int:t_total]


    out_msg <- ""


    # print out average post treatment estimate
    att_post <- summ$average_att$Estimate[1]
    se_est <- summ$att$Std.Error
    if(summ$inf_type == "jackknife") {
      se_avg <- summ$average_att$Std.Error

      out_msg <- paste("Average ATT Estimate (Jackknife Std. Error): ",
                        format(round(att_post,3), nsmall=3), 
                        "  (",
                        format(round(se_avg,3)), ")\n")
      inf_type <- "Jackknife over units"
    } else if(summ$inf_type == "conformal") {
      p_val <- summ$average_att$p_val[1]
      out_msg <- paste("Average ATT Estimate (p Value for Joint Null): ",
                        format(att_post, digits = 3), 
                        "  (",
                        format(p_val, digits = 2), ")\n")
      inf_type <- "Conformal inference"
      if("Treatment Effect Slope" %in% summ$average_att$Value) {
        lowers <- summ$average_att$lower_bound[2:3]
        uppers <- summ$average_att$upper_bound[2:3]
        out_msg_line2 <- paste0("Confidence intervals for linear-in-time treatment effects (Intercept + Slope * Time)\n",
        "\tIntercept: [", format(lowers[1], digits = 3), ",",
        format(uppers[1], digits = 3), "]\n",
        "\tSlope: [", format(lowers[2], digits = 3), ",",
        format(uppers[2], digits = 3), "]\n")
        out_msg <- paste0(out_msg, out_msg_line2)
      }
    } else if(summ$inf_type == "jackknife+") {
      out_msg <- paste("Average ATT Estimate: ",
                        format(round(att_post,3), nsmall=3), "\n")
      inf_type <- "Jackknife+ over time periods"
    } else {
      out_msg <- paste("Average ATT Estimate: ",
                        format(round(att_post,3), nsmall=3), "\n")
      inf_type <- "None"
    }


    out_msg <- paste(out_msg, 
              "L2 Imbalance: ",
              format(round(summ$l2_imbalance,3), nsmall=3), "\n",
              "Percent improvement from uniform weights: ",
              format(round(1 - summ$scaled_l2_imbalance,3)*100), "%\n\n",
              sep="")
  if(!is.null(summ$covariate_l2_imbalance)) {

    out_msg <- paste(out_msg,
                     "Covariate L2 Imbalance: ",
                     format(round(summ$covariate_l2_imbalance,3), 
                                  nsmall=3),
                    "\n",
                     "Percent improvement from uniform weights: ",
                     format(round(1 - summ$scaled_covariate_l2_imbalance,3)*100), 
                     "%\n\n",
                     sep="")

  }
  out_msg <- paste(out_msg, 
              "Avg Estimated Bias: ",
              format(round(mean(summ$bias_est), 3),nsmall=3), "\n\n",
              "Inference type: ",
              inf_type,
              "\n\n",
              sep="")
  cat(out_msg)

    if(summ$inf_type == "jackknife") {
      out_att <- summ$att[t_int:t_final,] %>% 
              select(Time, Estimate, Std.Error)
    } else if(summ$inf_type == "conformal") {
      out_att <- summ$att[t_int:t_final,] %>% 
              select(Time, Estimate, lower_bound, upper_bound, p_val)
      names(out_att) <- c("Time", "Estimate", 
                          paste0((1 - summ$alpha) * 100, "% CI Lower Bound"),
                          paste0((1 - summ$alpha) * 100, "% CI Upper Bound"),
                          paste0("p Value"))
    } else if(summ$inf_type == "jackknife+") {
      out_att <- summ$att[t_int:t_final,] %>% 
              select(Time, Estimate, lower_bound, upper_bound)
      names(out_att) <- c("Time", "Estimate", 
                          paste0((1 - summ$alpha) * 100, "% CI Lower Bound"),
                          paste0((1 - summ$alpha) * 100, "% CI Upper Bound"))
    } else {
      out_att <- summ$att[t_int:t_final,] %>% 
              select(Time, Estimate)
    }
    out_att %>%
      mutate_at(vars(-Time), ~ round(., 3)) %>%
      print(row.names = F)

    
}

#' Plot function for summary function for augsynth
#' @param x Summary object
#' @param inf Boolean, whether to plot confidence intervals
#' @param ... Optional arguments
#' @export
plot.summary.augsynth <- function(x, inf = T, ...) {
    summ <- x
    # if ("inf" %in% names(list(...))) {
    #     inf <- list(...)$inf
    # } else {
    #     inf <- T
    # }
    
    p <- summ$att %>%
        ggplot2::ggplot(ggplot2::aes(x=Time, y=Estimate))
    if(inf) {
        if(all(is.na(summ$att$lower_bound))) {
            p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin=Estimate-2*Std.Error,
                        ymax=Estimate+2*Std.Error),
                    alpha=0.2)
        } else {
            p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin=lower_bound,
                        ymax=upper_bound),
                    alpha=0.2)
        }

    }
    p + ggplot2::geom_line() +
        ggplot2::geom_vline(xintercept=summ$t_int, lty=2) +
        ggplot2::geom_hline(yintercept=0, lty=2) + 
        ggplot2::theme_bw()

}


#' augsynth
#' 
#' @description A package implementing the Augmented Synthetic Controls Method
#' @docType package
#' @name augsynth-package
#' @importFrom magrittr "%>%"
#' @importFrom purrr reduce
#' @import dplyr
#' @import tidyr
#' @importFrom stats terms
#' @importFrom stats formula
#' @importFrom stats update 
#' @importFrom stats delete.response 
#' @importFrom stats model.matrix 
#' @importFrom stats model.frame 
#' @importFrom stats na.omit
NULL


================================================
FILE: R/augsynth_pre.R
================================================
################################################################################
## Main function for the augmented synthetic controls Method
################################################################################


#' Fit Augmented SCM
#' @param form outcome ~ treatment | auxillary covariates
#' @param unit Name of unit column
#' @param time Name of time column
#' @param data Panel data as dataframe
#' @param t_int Time of intervention (used for single-period treatment only)
#' @param ... Optional arguments
#' \itemize{
#'   \item Single period augsynth with/without multiple outcomes
#'     \itemize{
#'       \item{"progfunc"}{What function to use to impute control outcomes: Ridge=Ridge regression (allows for standard errors), None=No outcome model, EN=Elastic Net, RF=Random Forest, GSYN=gSynth, MCP=MCPanel, CITS=CITS, CausalImpact=Bayesian structural time series with CausalImpact, seq2seq=Sequence to sequence learning with feedforward nets}
#'       \item{"scm"}{Whether the SCM weighting function is used}
#'       \item{"fixedeff"}{Whether to include a unit fixed effect, default F }
#'       \item{"cov_agg"}{Covariate aggregation functions, if NULL then use mean with NAs omitted}
#'     }
#'   \item Multi period (staggered) augsynth
#'    \itemize{
#'          \item{"relative"}{Whether to compute balance by relative time}
#'          \item{"n_leads"}{How long past treatment effects should be estimated for}
#'          \item{"n_lags"}{Number of pre-treatment periods to balance, default is to balance all periods}
#'          \item{"alpha"}{Fraction of balance for individual balance}
#'          \item{"lambda"}{Regularization hyperparameter, default = 0}
#'          \item{"force"}{Include "none", "unit", "time", "two-way" fixed effects. Default: "two-way"}
#'          \item{"n_factors"}{Number of factors for interactive fixed effects, default does CV}
#'         }
#' }
#' 
#' @return augsynth object that contains:
#'         \itemize{
#'          \item{"weights"}{weights}
#'          \item{"data"}{Panel data as matrices}
#'         }
#' @export
#' 
augsynth <- function(form, unit, time, data, t_int=NULL, ...) {

  call_name <- match.call()

  form <- Formula::Formula(form)
  unit_quosure <- enquo(unit)
  time_quosure <- enquo(time)
  

  ## format data
  outcome <- terms(formula(form, rhs=1))[[2]]
  trt <- terms(formula(form, rhs=1))[[3]]

  # check for multiple outcomes
  multi_outcome <- length(outcome) != 1

  ## get first treatment times
  trt_time <- data %>%
      group_by(!!unit_quosure) %>%
      filter(!all(!!trt == 0)) %>%
      summarise(trt_time = min((!!time_quosure)[(!!trt) == 1])) %>%
      mutate(trt_time = replace_na(as.numeric(trt_time), Inf))

  num_trt_years <- sum(is.finite(unique(trt_time$trt_time)))

  if(multi_outcome & num_trt_years > 1) {
    stop("augsynth is not currently implemented for more than one outcome and more than one treated unit")
  } else if(num_trt_years > 1) {
    message("More than one treatment time found. Running multisynth.")
    if("progfunc" %in% names(list(...))) {
      warning("`progfunc` is not an argument for multisynth, so it is ignored")
    }
    return(multisynth(form, !!enquo(unit), !!enquo(time), data, ...)) 
  } else {
    if (is.null(t_int)) {
      t_int <- trt_time %>% filter(is.finite(trt_time)) %>%
        summarise(t_int = max(trt_time)) %>% pull(t_int)
    }
    if(!multi_outcome) {
      message("One outcome and one treatment time found. Running single_augsynth.")
      return(single_augsynth(form, !!enquo(unit), !!enquo(time), t_int,
                             data = data, ...))
    } else {
      message("Multiple outcomes and one treatment time found. Running augsynth_multiout.")
      return(augsynth_multiout(form, !!enquo(unit), !!enquo(time), t_int,
                               data = data, ...))
    }
  }
}


================================================
FILE: R/cv.R
================================================
drop_time_t <- function(wide_data, Z, t_drop) {
  new_wide_data <- list()
  new_wide_data$trt <- wide_data$trt
  
  if (is.list(wide_data$X)) {
    # TODO
  } else {
    new_wide_data$X <- wide_data$X[, -t_drop, drop = F]
    new_wide_data$y <- cbind(wide_data$X[, t_drop, drop = F], 
                             wide_data$y)
    
    X0 <- new_wide_data$X[new_wide_data$trt == 0,, drop = F]
    x1 <- matrix(colMeans(new_wide_data$X[new_wide_data$trt == 1,, drop = F]),
                 ncol=1)
    y0 <- new_wide_data$y[new_wide_data$trt == 0,, drop = F]
    y1 <- colMeans(new_wide_data$y[new_wide_data$trt == 1,, drop = F])
    
    new_synth_data <- list()
    new_synth_data$Z0 <- t(X0)
    new_synth_data$X0 <- t(X0)
    new_synth_data$Z1 <- x1
    new_synth_data$X1 <- x1
    
    return(list(wide_data = new_wide_data,
                synth_data = new_synth_data,
                Z = Z)) 
  }
}

drop_time_and_refit <- function(wide_data, Z, t_drop, progfunc, scm, fixedeff, ...) {
  new_data <- drop_time_t(wide_data, Z, t_drop)
  new_ascm <- do.call(fit_augsynth_internal,
                      c(list(wide = new_data$wide,
                             synth_data = new_data$synth_data,
                             Z = new_data$Z,
                             progfunc = progfunc,
                             scm = scm,
                             fixedeff = fixedeff, ...)))
  return(new_ascm)
}

cv_internal <- function(wide_data, Z, progfunc, scm, fixedeff, lambdas, holdout_periods, ...) {
  X <- wide_data$X
  lambda_error_vals <- vapply(lambdas, function(lambda){
    errors <- apply(holdout_periods, 1, function(t_drop){
      new_ascm <- drop_time_and_refit(wide_data, Z, t_drop, progfunc, scm, fixedeff, lambda = lambda, ...)
      err <- sum((predict(new_ascm, att = T)[(ncol(X)-length(t_drop)+1):ncol(X)])^2)
      err
    })
    lambda_error <- mean(errors)
    lambda_error_se <- sd(errors) / sqrt(length(errors))
    c(lambda_error, lambda_error_se)
  }, numeric(2))
  return(list(lambda_errors = lambda_error_vals[1,], lambda_errors_se = lambda_error_vals[2,]))
}

cv_ridge <- function(wide_data, synth_data, Z, progfunc, scm, fixedeff, how = 'time', holdout_length = 1, lambdas = NULL, 
               lambda_min_ratio = 1e-8, n_lambda = 20, lambda_max = NULL, min_1se = T, V = NULL, ...) {
  X <- wide_data$X
  trt <- wide_data$trt
  if (is.null(lambdas)) {
    if(is.null(lambda_max)) {
      X_cent <- apply(X, 2, function(x) x - mean(x[trt==0]))
      X_c <- X_cent[trt==0,,drop=FALSE]
      t0 <- ncol(X_c)
      
      if(is.null(V)) {
        V <- diag(rep(1, t0))
      } else if(is.vector(V)) {
        V <- diag(V)
      } else if(ncol(V) == 1 & nrow(V) == t0) {
        V <- diag(c(V))
      } else if(ncol(V) == t0 & nrow(V) == 1) {
        V <- diag(c(V))
      } else if(nrow(V) == t0) {
      } else {
        stop("`V` must be a vector with t0 elements or a t0xt0 matrix")
      }
      X_c <- X_c %*% V

      if(!is.null(Z)) {
        Z_cent <- apply(Z, 2, function(x) x - mean(x[trt==0]))
        Z_c <- Z_cent[trt==0,,drop=FALSE]
        Xc_hat <- Z_c %*% solve(t(Z_c) %*% Z_c) %*% t(Z_c) %*% X_c
        res_c <- X_c - Xc_hat
        X_c <- res_c
      }
      lambda_max <- svd(X_c)$d[1] ^ 2
    }
    lambdas <- create_lambda_list(lambda_max, lambda_min_ratio, n_lambda)
  }
  
  if (how == 'time') {
    period_starts <- 1:(ncol(X) - holdout_length)
    if (holdout_length == 1) {
      holdout_periods <- matrix(period_starts, nrow = length(period_starts), ncol = 1)
    } else {
      holdout_periods <- t(vapply(period_starts, function(t) t:(t+holdout_length-1), numeric(holdout_length)))
    }
    results <- cv_internal(wide_data, Z, progfunc, scm, fixedeff, lambdas, holdout_periods, ...)
    lambda <- choose_lambda(lambdas, results$lambda_errors, results$lambda_errors_se, min_1se)
    return(list(lambda = lambda, lambdas = lambdas, lambda_errors = results$lambda_errors, lambda_errors_se = results$lambda_errors_se))
  }
}

================================================
FILE: R/data.R
================================================
#' Economic indicators for US states from 1990-2016
#' 
#' 
#' @format A dataframe with 5250 rows and 32 variables:
#' \describe{
#'  \item{fips}{FIPS code for each state}
#'  \item{year}{Year of measurement}
#'  \item{qtr}{Quarter (1-4) of measurement}
#'  \item{state}{Name of State}
#'  \item{gdp}{Gross State Product (millions of $) Values before 2005 are linearly interpolated between years}
#'  \item{revenuepop}{State and local revenue per capita}
#'  \item{rev_state_total}{State total general revenue (millions of $)}
#'  \item{rev_local_total}{Local total general revenue (millions of $)}
#'  \item{popestimate}{Population estimate}
#'  \item{qtrly_estabs_count}{Count of establishments for a given quarter}
#'  \item{month1_emplvl, month2_emplvl, month3_emplvl}{ Employment level for first, second, and third months of a given quarter}
#'  \item{total_qtrly_wages}{Total wages for a givne quarter}
#'  \item{taxable_qtrly_wage}{Taxable wages for a given quarter}
#'  \item{avg_wkly_wage}{Average weekly wage for a given quarter}
#'  \item{year_qtr}{Year and quarter combined into one continuous variable}
#'  \item{treated}{Whether the state passed tax cuts before the given year and quareter}
#'  \item{lngdpcapita}{Natural log of GDP per capita}
#'  \item{emplvlcapita}{Average employment level per capita}
#'  \item{Xcapita}{Per capita value of X}
#'  \item{abb}{State abbreviation}
#' }
"kansas"

================================================
FILE: R/eligible_donors.R
================================================
##############################################################################
## Code to get eligible donor units based on covariates
##############################################################################

get_donors <- function(X, y, trt, Z, time_cohort, n_lags,
                       n_leads, how = "knn", 
                       exact_covariates = NULL, ...) {

  # first get eligible donors by treatment time
  donors <- get_eligible_donors(trt, time_cohort, n_leads)

  # get donors with no NA values
  nona_donors <- get_nona_donors(X, y, trt, n_lags, n_leads, time_cohort)

  donors <- lapply(1:length(donors),
                     function(j) donors[[j]] & nona_donors[[j]])

  # if Z isn't NULL, futher restrict the donors by matching
  if(!is.null(Z)) {
    if(ncol(Z) != 0) {
      donors <- get_matched_donors(trt, Z, donors, how, exact_covariates, ...)
    }
  }

  return(donors)
}


get_eligible_donors <- function(trt, time_cohort, n_leads) {

    # get treatment times
    if(time_cohort) {
        grps <- unique(trt[is.finite(trt)])
    } else {
        grps <- trt[is.finite(trt)]
    }

    J <- length(grps)

    # only allow weights on donors treated after n_leads
    donors <- lapply(1:J, function(j) trt > n_leads + grps[j])

    return(donors)
}

#' Get donors that don't have missing outcomes where treated units have outcomes
get_nona_donors <- function(X, y, trt, n_lags, n_leads, time_cohort) {

  n <- length(trt)
  # find na treatment times
  fulldat <- cbind(X, y)
  is_na <- is.na(fulldat[is.finite(trt), , drop = F])
  # aggregate by time cohort
  if(time_cohort) {
    grps <- unique(trt[is.finite(trt)])
    # if doing a time cohort, convert the boolean mask
    finite_trt <- trt[is.finite(trt)]
    is_na <- t(sapply(grps,
                    function(tj) apply(is_na[finite_trt == tj, , drop = F],
                                       2, all)))
  } else {
      grps <- trt[is.finite(trt)]
  }
  not_na <- !is.na(fulldat)
  J <- length(grps)
  lapply(1:J,
             function(j) {
               idxs <- max(grps[j] - n_lags + 1, 1):min(grps[j] + n_leads,
                                                        ncol(fulldat))
               isna_j <- is_na[j, idxs]
               apply(not_na[, idxs, drop = F][, !isna_j, drop = F], 1, all)
        }) -> donors

  return(donors)
}

get_matched_donors <- function(trt, Z, donors, how, exact_covariates = NULL, k = NULL, ...) {

  J <- sum(is.finite(trt))
  trt_idx <- which(is.finite(trt))
  if(is.null(exact_covariates)) {
    if(how == "exact") {
      return(
        lapply(1:J,
            function(j) donors[[j]] & apply(t(Z) == Z[trt_idx[j], ], 2, all)
        )
      )
    } else if(how == "knn") {
        return(get_knn_donors(trt, Z, donors, k))
    } else {
      stop("Option for exact matching must be in ('exact', 'knn')")
    }
  } else {
        if(how == "exact") {
      return(
        lapply(1:J,
            function(j) donors[[j]] & apply(t(Z) == Z[trt_idx[j], 
                                                   exact_covariates], 2, all)
        )
      )
    } else if(how == "knn") {
        donors <- lapply(1:J,
            function(j) { donors[[j]] &
              apply(t(Z[, exact_covariates, drop = F]) == 
                Z[trt_idx[j],exact_covariates], 2, all)
            }
              )
        approx_covs <- which(!colnames(Z) %in% exact_covariates)
        if(length(approx_covs != 0)) {
          return(get_knn_donors(trt, Z[, approx_covs, drop = F], donors, k))
        } else {
          return(donors)
        }
        
    } else {
      stop("Option for exact matching must be in ('exact', 'knn')")
    }
  }

}

get_knn_donors <- function(trt, Z, donors, k) {

  if(is.null(k)) {
    stop("Number of neighbors for knn not selected, please choose k.")
  }
  # knn matching within time cohort
  trt_idxs <- which(is.finite(trt))
  lapply(1:length(trt_idxs), 
        function(j) {
          idx <- trt_idxs[j]
          # idxs for treated units treated at time tj
          Z_tj <- Z[idx, , drop = F]

          # get donors for treated cohort
          donors_tj <- donors[[j]]
          Z_donors_tj <- Z[donors_tj, , drop = F]
          # check that k is less than the number of donors
          # if not, warn and set k to be the number of donors - 1
          if(k >= nrow(Z_donors_tj)) {
            warning(paste("Number of potential donor units is less than",
                          "the number of required matches,",
                          "returning all donors as matches"))
            return(donors_tj)
          }
          # do knn matching
          nn <- FNN::get.knnx(data = Z_donors_tj, query = Z_tj, k = k)
          # keep track of which indices these are
          donors_j <- logical(length(donors_tj))
          true_idx <- which(donors_tj)[nn$nn.index[1, ]]
          donors_j[true_idx] <- TRUE
          return(donors_j)
         }) -> matches
  names(matches) <- trt_idxs
  return(matches)
}

================================================
FILE: R/fit_synth.R
================================================
#######################################################
# Helper scripts to fit synthetic controls to simulations
#######################################################

#' Make a V matrix from a vector (or null)
make_V_matrix <- function(t0, V) {
  if(is.null(V)) {
        V <- diag(rep(1, t0))
    } else if(is.vector(V)) {
        if(length(V) != t0) {
          stop(paste("`V` must be a vector with", t0, "elements or a", t0, 
                     "x", t0, "matrix"))
        }
        V <- diag(V)
    } else if(ncol(V) == 1 & nrow(V) == t0) {
        V <- diag(c(V))
    } else if(ncol(V) == t0 & nrow(V) == 1) {
        V <- diag(c(V))
    } else if(nrow(V) == t0) {
    } else {
        stop(paste("`V` must be a vector with", t0, "elements or a", t0, 
                     "x", t0, "matrix"))
    }

  return(V)
}

#' Fit synthetic controls on outcomes after formatting data
#' @param synth_data Panel data in format of Synth::dataprep
#' @param V Matrix to scale the obejctive by
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Synth weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#' }
fit_synth_formatted <- function(synth_data, V = NULL) {


    t0 <- dim(synth_data$Z0)[1]
    ## if no  is supplied, set equal to 1

    V <- make_V_matrix(t0, V)

    weights <- synth_qp(synth_data$X1, t(synth_data$X0), V)
    l2_imbalance <- sqrt(sum((synth_data$Z0 %*% weights - synth_data$Z1)^2))

    ## primal objective value scaled by least squares difference for mean
    uni_w <- matrix(1/ncol(synth_data$Z0), nrow=ncol(synth_data$Z0), ncol=1)
    unif_l2_imbalance <- sqrt(sum((synth_data$Z0 %*% uni_w - synth_data$Z1)^2))
    scaled_l2_imbalance <- l2_imbalance / unif_l2_imbalance

    return(list(weights=weights,
                l2_imbalance=l2_imbalance,
                scaled_l2_imbalance=scaled_l2_imbalance))
}

#' Solve the synth QP directly
#' @param X1 Target vector
#' @param X0 Matrix of control outcomes
#' @param V Scaling matrix
#' @noRd
synth_qp <- function(X1, X0, V) {
    
    Pmat <- X0 %*% V %*% t(X0)
    qvec <- - t(X1) %*% V %*% t(X0)

    n0 <- nrow(X0)
    A <- rbind(rep(1, n0), diag(n0))
    l <- c(1, numeric(n0))
    u <- c(1, rep(1, n0))

    settings = osqp::osqpSettings(verbose = FALSE,
                                  eps_rel = 1e-8,
                                  eps_abs = 1e-8)
    sol <- osqp::solve_osqp(P = Pmat, q = qvec,
                            A = A, l = l, u = u, 
                            pars = settings)

    return(sol$x)
}


================================================
FILE: R/format.R
================================================
################################################################################
## Scripts to format panel data into matrices
################################################################################

#' Format "long" panel data into "wide" program evaluation matrices
#' @param outcome Name of outcome column
#' @param trt Name of treatment column
#' @param unit Name of unit column
#' @param time Name of time column
#' @param t_int Time of intervention
#' @param data Panel data as dataframe
#' @noRd
#' @return \itemize{
#'          \item{"X"}{Matrix of pre-treatment outcomes}
#'          \item{"trt"}{Vector of treatment assignments}
#'          \item{"y"}{Matrix of post-treatment outcomes}
#'         }
format_data <- function(outcome, trt, unit, time, t_int, data) {

    ## pre treatment outcomes
    X <- data %>%
        filter(!!time < t_int) %>%
        select(!!unit, !!time, !!outcome) %>%
        spread(!!time, !!outcome) %>%
        select(-!!unit) %>%
        as.matrix()
    

    ## post treatment outcomes
    y <- data %>%
        filter(!!time >= t_int) %>%
        select(!!unit, !!time, !!outcome) %>%
        spread(!!time, !!outcome) %>%
        select(-!!unit) %>%
        as.matrix()


    ## treatment status
    trt <- data %>%
        select(!!unit, !!trt) %>%
        group_by(!!unit) %>%
        summarise(trt = max(!!trt)) %>%
        ungroup() %>%
        pull(trt)

    return(list(X=X, trt=trt, y=y))
}


#' Format "long" panel data into "wide" program evaluation matrices
#' @param outcomes Vectors of names of outcome columns
#' @param trt Name of treatment column
#' @param unit Name of unit column
#' @param time Name of time column
#' @param t_int Time of intervention
#' @param data Panel data as dataframe
#' @noRd
#' @return \itemize{
#'          \item{"X"}{List of matrices of pre-treatment outcomes}
#'          \item{"trt"}{Vector of treatment assignments}
#'          \item{"y"}{List of matrices of post-treatment outcomes}
#'         }
format_data_multi <- function(outcomes, trt, unit, time, t_int, data) {


    lapply(outcomes, 
        function(outcome) format_data(outcome, trt, unit, 
                                     time, t_int, data)
          ) -> formats

    # X <- simplify2array(lapply(formats, function(x) x$X))
    # y <- simplify2array(lapply(formats, function(x) x$y))
    # X <- lapply(formats, function(x) t(na.omit(t(x$X))))
    X <- lapply(formats, `[[`, "X")
    y <- lapply(formats, function(x) t(na.omit(t(x$y))))
    trt <- formats[[1]]$trt
    return(list(X = X, trt = trt, y = y))
}


#' Format "long" panel data into "wide" program evaluation matrices with staggered adoption
#' @param outcome Name of outcome column
#' @param trt Name of treatment column
#' @param unit Name of unit column
#' @param time Name of time column
#' @param data Panel data as dataframe
#' @noRd
#' @return \itemize{
#'          \item{"X"}{Matrix of pre-treatment outcomes}
#'          \item{"trt"}{Vector of treatment assignments}
#'          \item{"y"}{Matrix of post-treatment outcomes}
#'         }
format_data_stag <- function(outcome, trt, unit, time, data) {

    # arrange data by time first
    data <- data %>% arrange(!!time)
      
    ## get first treatment times
    trt_time <- data %>%
        group_by(!!unit) %>%
        summarise(trt_time=(!!time)[(!!trt) == 1][1]) %>%
        mutate(trt_time=replace_na(as.numeric(trt_time), Inf))
    

    t_int <- trt_time %>% filter(is.finite(trt_time)) %>%
        summarise(t_int=max(trt_time)) %>% pull(t_int)

    ## ## boolean mask of available data for treatment groups
    ## mask <- data %>% inner_join(trt_time %>%
    ##                             filter(is.finite(trt_time))) %>%
    ##     filter(!!time < t_int) %>%
    ##     mutate(trt=1-!!trt) %>%
    ##     select(!!unit, !!time, trt_time, trt) %>%
    ##     spread(!!time, trt) %>% 
    ##     group_by(trt_time) %>% 
    ##     summarise_all(list(max)) %>%
    ##     arrange(trt_time) %>% 
    ##     select(-trt_time, -!!unit) %>%
    ##     as.matrix()

    ## boolean mask of available data for treatment groups
    mask <- data %>% inner_join(trt_time %>%
                                filter(is.finite(trt_time)),
                                by = rlang::as_name(unit)) %>%
        filter(!!time < t_int) %>%
        mutate(trt=1-!!trt) %>%
        select(!!unit, !!time, trt_time, trt) %>%
        spread(!!time, trt) %>% 
        ## arrange(!!unit) %>% 
        select(-trt_time, -!!unit) %>%
        as.matrix()
    
    # outcomes as a matrix
    Xy <- data %>%
        select(!!unit, !!time, !!outcome) %>%
        spread(!!time, !!outcome) %>%
        select(-!!unit) %>%
        as.matrix()

    pre_times <- data %>% filter(!!time < t_int) %>%
        distinct(!!time) %>% pull(!!time)
    post_times <- data %>% filter(!!time >= t_int) %>%
        distinct(!!time) %>% pull(!!time)
    X <- Xy[, as.character(pre_times), drop = F]
    y <- Xy[, as.character(post_times), drop = F]

    if(nrow(X) != nrow(y)) {
      stop("There are not the same number of units after the last unit is treated as before the last unit is treated")
    }

    t_vec <- data %>% pull(!!time) %>% unique() %>% sort()
    trt <- sapply(trt_time$trt_time, function(x) which(t_vec == x)-1) %>%
        as.numeric() %>%
        replace_na(Inf)
   

    units <- data %>%
        filter(!!time < t_int) %>%
        select(!!unit, !!time, !!outcome) %>%
        spread(!!time, !!outcome) %>%
        pull(!!unit)

    
    return(list(X=X,
                trt=trt,
                y=y,
                mask=mask,
                time = t_vec,
                units=units))
}


#' Format program eval matrices into synth form
#'
#' @param X Matrix of pre-treatment outcomes
#' @param trt Vector of treatment assignments
#' @param y Matrix of post-treatment outcomes
#' @noRd
#' @return List with data formatted as Synth::dataprep
format_synth <- function(X, trt, y) {


    synth_data <- list()

    ## pre-treatment values as covariates
    synth_data$Z0 <- t(X[trt==0,,drop=F])

    ## average treated units together
    synth_data$Z1 <- as.matrix((colMeans(X[trt==1,,drop=F])), ncol=1)

    ## combine everything together also
    synth_data$Y0plot <- t(cbind(X[trt==0,,drop=F], y[trt==0,,drop=F]))
    synth_data$Y1plot <- as.matrix(colMeans(
        cbind(X[trt==1,,drop=F], y[trt==1,,drop=F])), ncol=1)


    ## predictors are pre-period outcomes
    synth_data$X0 <- synth_data$Z0
    synth_data$X1 <- synth_data$Z1

    return(synth_data)
    
}

#' Remove unit means 
#' @param wide_data X, y, trt
#' @param synth_data List with data formatted as Synth::dataprep
#' @noRd
demean_data <- function(wide_data, synth_data) {

    # pre treatment means
    means <- rowMeans(wide_data$X)

    new_wide_data <- list()
    new_X <- wide_data$X - means
    trt <- wide_data$trt

    new_wide_data$X <- new_X
    new_wide_data$y <- wide_data$y - means
    new_wide_data$trt <- trt

    new_synth_data <- list()
    new_synth_data$X0 <- t(new_X[trt == 0,, drop = FALSE])
    new_synth_data$Z0 <- new_synth_data$X0
    new_synth_data$X1 <- as.matrix((colMeans(new_X[trt==1,,drop = F])), 
                                   ncol = 1)
    new_synth_data$Z1 <- new_synth_data$X1


    # estimate post-treatment as pre-treatment means
    mhat <- replicate(ncol(wide_data$X) + ncol(wide_data$y), means)

    return(list(wide = new_wide_data,
                synth_data = new_synth_data,
                mhat = mhat))
}

#' Helper function to extract covariate matrix from data
#' @param form Formula as outcome ~ treatment | covariates
#' @param unit Name of unit column
#' @param time Name of time column
#' @param t_int Time of intervention
#' @param data Panel data as dataframe
#' @param cov_agg Covariate aggregation function
#' @noRd
extract_covariates <- function(form, unit, time, t_int, data, cov_agg) {

    ## if no aggregation functions, use the mean (omitting NAs)
    if(is.null(cov_agg)) {
        cov_agg <- c(function(x) mean(x, na.rm=T))
    }

    cov_form <- update(formula(delete.response(terms(form, rhs=2, data=data))),
                        ~. - 1) ## ensure that there is no intercept

    ## pull out relevant covariates and aggregate
    pre_data <- data %>% 
        filter(!! (time) < t_int)

    model.matrix(cov_form,
                    model.frame(cov_form, pre_data,
                                na.action=NULL) ) %>%
        data.frame() %>%
        mutate(unit=pull(pre_data, !!unit)) %>%
        group_by(unit) %>%
        summarise_all(cov_agg) -> Z

    # recombine with any missing units and convert to matrix
    data %>% distinct(!!unit) %>%
      rename(unit = !!unit) %>%
      left_join(Z, by = "unit") %>%
      arrange(unit) %>%
      select(-unit) %>%
      as.matrix() -> Z
    
    if(nrow(distinct(data, !!unit))  != nrow(Z)) {
      stop("Some units missing all covariate data")
    }

    # check if any covariates have no variation
    Zsds <- apply(Z, 2, sd)

    if(any(Zsds == 0)) {
      zero_covs <- paste(colnames(Z)[Zsds == 0], collapse = ", ")
      stop(paste("The following covariates have no variation across units:",
                 zero_covs))
    }
    return(Z)
}

#' Check that we can actually run multisynth on the data
#' @param wide Output of format_data_stag
#' @param fixedeff Whether to include a unit fixed effect
#' @param n_leads How long past treatment effects should be estimated for, default is number of post treatment periods for last treated unit
#' @param n_lags Number of pre-treatment periods to balance, default is to balance all periods
check_data_stag <- function(wide, fixedeff, n_leads, n_lags) {

  # If there are less than 5 pre-treatment outcomes, give a warning
  less_5 <- wide$units[wide$trt < 5]
  warn_msg <- ""
  if(length(less_5) != 0) {
    warn_msg <- paste0(
      warn_msg,
      "The following units have less than 5 pre-treatment outcomes: (",
      paste(less_5, collapse = ","),
      "). Be cautious!"
    )
  }

  # check if there are any always treated units
  always_trt <- wide$units[wide$trt == 0]

  # If including a fixed effect, check that there is more than one pretreatment
  # outcome for each unit
  n1 <- wide$units[wide$trt == 1]

  err_msg <- ""
  if(length(always_trt) != 0) {
    err_msg <- paste0(
      err_msg,
      "The following units are always treated and should be removed: (",
      paste(always_trt, collapse = ","),
      ")\n")
  }

  if(length(n1) != 0 & fixedeff) {
    if(nchar(err_msg) > 0) {
      err_msg <- paste0(err_msg, "  Also: ")
    }
    err_msg <- paste0(
      err_msg,
      "You are including a fixed effect with `fixedeff = T`, but the ",
      "following units only have one pre-treatment outcome: (",
      paste(n1, collapse = ","),
      "). Either remove these units or set `fixedeff = F`.\n"
    )
  }
  # check if there are never treated units
  if(max(wide$trt) < ncol(wide$X) + ncol(wide$y)) {
    if(nchar(err_msg) > 0) {
      err_msg <- paste0(err_msg, "  Also: ")
    }
    err_msg <- paste0(
      err_msg,
      "All units are eventually treated. The last treatment time is ",
      wide$time[max(wide$trt)],
      ". To run multisynth, remove all periods after this time. ",
      "Units treated at this time will be considered 'never treated' in the ",
      "narrowed sample.\n"
    )
  }

  if(nchar(warn_msg) > 0) {
    warning(warn_msg)
  }
  if(nchar(err_msg) > 0) {
    stop(err_msg)
  }

}

================================================
FILE: R/globalVariables.R
================================================
utils::globalVariables(c("time", "val", "post", "weight", ".", "Time",
                         "Estimate", "Std.Error", "Level", "last_time",
                         "is_avg", "label", "Outcome", "unit", "obs",
                         "lambdas", "errors_se",
                         "upper_bound", "lower_bound"))

================================================
FILE: R/highdim.R
================================================
################################################################################
## Methods to use flexible outcome models
################################################################################

##### Augmented SCM with general outcome models

#' Use zero weights, do nothing but output everything in the right way
#' @param synth_data Panel data in format of Synth::dataprep
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Synth weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#' }
fit_zero_weights <- function(synth_data) {
    
    ## Imbalance is uniform weights imbalance
    uni_w <- matrix(1/ncol(synth_data$Z0), nrow=ncol(synth_data$Z0), ncol=1)
    unif_l2_imbalance <- sqrt(sum((synth_data$Z0 %*% uni_w - synth_data$Z1)^2))
    scaled_l2_imbalance <- 1
    
    return(list(weights=rep(0, ncol(synth_data$Z0)),
                l2_imbalance=unif_l2_imbalance,
                scaled_l2_imbalance=scaled_l2_imbalance))
}


#' Fit E[Y(0)|X] and for each post-period and balance pre-period
#'
#' @param wide_data Output of `format_ipw`
#' @param synth_data Output of `synth_data`
#' @param fit_progscore Function to fit prognostic score
#' @param fit_weights Function to fit synth weights
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#' }
fit_augsyn_formatted <- function(wide_data, synth_data,
                                fit_progscore, fit_weights, ...) {


    X <- wide_data$X
    y <- wide_data$y
    trt <- wide_data$trt
    
    ## fit prognostic scores
    fitout <- do.call(fit_progscore,
                          list(X=X, y=y, trt=trt, ...))
    
    ## fit synth
    syn <- fit_weights(synth_data)

    syn$params <- fitout$params

    syn$mhat <- fitout$y0hat
    
    return(syn)
}


#' Fit outcome model and balance pre-period
#' @param wide_data Output of `format_ipw`
#' @param synth_data Output of `synth_data`
#' @param progfunc What function to use to impute control outcomes
#'                 EN=Elastic Net, RF=Random Forest, GSYN=gSynth,
#'                 Comp=softImpute, MCP=MCPanel, CITS=CITS
#'                 CausalImpact=Bayesian structural time series with CausalImpact
#'                 seq2seq=Sequence to sequence learning with feedforward nets
#' @param scm Whether the SCM weighting function is used
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#' }
fit_augsyn <- function(wide_data, synth_data,
                       progfunc=c("EN", "RF", "GSYN", "MCP","CITS", "CausalImpact", "seq2seq"),
                       scm=T, ...) {
    ## prognostic score and weight functions to use
    progfunc = tolower(progfunc)
    if(progfunc == "en") {
        progf <- fit_prog_reg
    } else if(progfunc == "rf") {
        progf <- fit_prog_rf
    } else if(progfunc == "gsyn"){
        progf <- fit_prog_gsynth
    } else if(progfunc == "mcp"){
        progf <- fit_prog_mcpanel
    } else if(progfunc == "cits") {
        progf <- fit_prog_cits
    } else if(progfunc == "causalimpact") {
        progf <- fit_prog_causalimpact
    } else if(progfunc == "seq2seq"){
        progf <- fit_prog_seq2seq
    } else {
        stop("progfunc must be one of 'EN', 'RF', 'GSYN', 'MCP', 'CITS', 'CausalImpact', 'seq2seq'")
    }

    if(scm) {
        weightf <- fit_synth_formatted
    } else {
        ## still fit synth even if none
        ## TODO: This is a dumb wasteful hack
        weightf <- fit_zero_weights
    }
    return(fit_augsyn_formatted(wide_data, synth_data,
                                progf, weightf, ...))
}


### Combine synth and gsynth by balancing pre-period residuals
#' Fit outcome model and balance residuals
#'
#' @param wide_data Output of `format_data`
#' @param synth_data Output of `format_synth`
#' @param fit_progscore Function to fit prognostic score
#' @param fit_weights Function to fit synth weights
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#' }
fit_residaug_formatted <- function(wide_data, synth_data,
                                  fit_progscore, fit_weights, ...) {


    X <- wide_data$X
    y <- wide_data$y
    trt <- wide_data$trt

    ## fit prognostic scores
    fitout <- do.call(fit_progscore, list(X=X, y=y, trt=trt, ...))

    
    y0hat <- fitout$y0hat

    ## get residuals
    ctrl_resids <- fitout$params$ctrl_resids
    trt_resids <- fitout$params$trt_resids
    
    ## replace outcomes with pre-period residuals
    t0 <- dim(X)[2]

    synth_data$Z0 <- ctrl_resids[1:t0, ]
    synth_data$Z1 <- as.matrix(trt_resids[1:t0])
    
    ## fit synth weights
    syn <- fit_weights(synth_data)

    syn$params <- fitout$params    

    ## return predicted values for treatment and control
    syn$mhat <- y0hat
    
    return(syn)
}
#' Fit outcome model and balance residuals
#'
#' @param wide_data Output of `format_data`
#' @param synth_data Output of `format_synth`
#' @param progfunc What function to use to impute control outcomes
#'                 GSYN=gSynth, MCP=MCPanel,
#'                 CITS=Comparative interrupted time series
#'                 CausalImpact=Bayesian structural time series with CausalImpact
#' @param weightfunc What function to use to fit weights
#'                   SCM=Vanilla Synthetic Controls
#'                   NONE=No reweighting, just outcome model
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#' }
fit_residaug <- function(wide_data, synth_data,
                        progfunc=c("GSYN", "MCP", "CITS", "CausalImpact"),
                        weightfunc=c("SC","ENT", "SVD", "NONE"), ...) {

    ## prognostic score and weight functions to use
    if(progfunc == "GSYN"){
        progf <- fit_prog_gsynth
    } else if(progfunc == "MCP"){
        progf <- fit_prog_mcpanel
    } else if(progfunc == "CITS") {
        progf <- fit_prog_cits
    } else if(progfunc == "CausalImpact") {
        progf <- fit_prog_causalimpact
    } else {
        stop("progfunc must be one of 'GSYN', 'MCP', 'CITS', 'CausalImpact'")
    }

    
    ## weight function to use
    if(weightfunc == "SCM") {
        weightf <- fit_synth_formatted
    } else if(weightfunc == "NONE") {
        ## still fit synth even if none
        ## TODO: This is a dumb wasteful hack
        weightf <- fit_synth_formatted
    } else {
        stop("weightfunc must be one of 'SCM', 'NONE'")
    }

    return(fit_residaug_formatted(wide_data, synth_data,
                                  progf, weightf, ...))
}


================================================
FILE: R/inference.R
================================================
################################################################################
## Code for inference
################################################################################

#' Jackknife+ algorithm over time
#' @param ascm Fitted `augsynth` object
#' @param alpha Confidence level
#' @param conservative Whether to use the conservative jackknife+ procedure
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
#'          \item{"se"}{Standard error, always NA but returned for compatibility}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
time_jackknife_plus <- function(ascm, alpha = 0.05, conservative = F) {
    wide_data <- ascm$data
    synth_data <- ascm$data$synth_data
    n <- nrow(wide_data$X)
    n_c <- dim(synth_data$Z0)[2]
    Z <- wide_data$Z

    t0 <- dim(synth_data$Z0)[1]
    tpost <- ncol(wide_data$y)
    t_final <- dim(synth_data$Y0plot)[1]

    jack_ests <- lapply(1:t0, 
        function(tdrop) {
            # drop unit i
            new_data <- drop_time_t(wide_data, Z, tdrop)
            # refit
            new_ascm <- do.call(fit_augsynth_internal,
                    c(list(wide = new_data$wide,
                            synth_data = new_data$synth_data,
                            Z = new_data$Z,
                            progfunc = ascm$progfunc,
                            scm = ascm$scm,
                            fixedeff = ascm$fixedeff),
                        ascm$extra_args))
            # get ATT estimates and held out error for time t
            # t0 is prediction for held out time
            est <- predict(new_ascm, att = F)[(t0 +1):t_final]
            est <- c(est, mean(est))
            err <- c(colMeans(wide_data$X[wide_data$trt == 1,
                                         tdrop,
                                         drop = F]) -
                    predict(new_ascm, att = F)[t0])
            list(err, rbind(est + abs(err), est - abs(err), est + err, est))
        })
    # get errors and jackknife distribution
    held_out_errs <- vapply(jack_ests, `[[`, numeric(1), 1)
    jack_dist <- vapply(jack_ests, `[[`,
                        matrix(0, nrow = 4, ncol = tpost + 1), 2)

    out <- list()
    att <- predict(ascm, att = T)
    out$att <- c(att, 
                 mean(att[(t0 + 1):t_final]))
    # held out ATT
    out$heldout_att <- c(held_out_errs, 
                          att[(t0 + 1):t_final], 
                          mean(att[(t0 + 1):t_final]))

    # out$se <- rep(NA, 10 + tpost)
    if(conservative) {
        qerr <- stats::quantile(abs(held_out_errs), 1 - alpha)
        out$lb <- c(rep(NA, t0), apply(jack_dist[4,,], 1, min) - qerr)
        out$ub <- c(rep(NA, t0), apply(jack_dist[4,,], 1, max) + qerr)
    } else {
        out$lb <- c(rep(NA, t0), apply(jack_dist[2,,], 1, stats::quantile, alpha / 2))
        out$ub <- c(rep(NA, t0), apply(jack_dist[1,,], 1, stats::quantile, 1 - alpha / 2))
    }
    # shift back to ATT scale
    y1 <- predict(ascm, att = F) + att
    y1 <-  c(y1, mean(y1[(t0 + 1):t_final]))
    shifted_lb <- y1 - out$ub
    shifted_ub <- y1 - out$lb
    out$lb <- shifted_lb
    out$ub <- shifted_ub
    out$alpha <- alpha


    return(out)
}

#' Drop time period from pre-treatment data
#' @param wide_data (X, y, trt)
#' @param Z Covariates matrix
#' @param t_drop Time to drop
#' @noRd
drop_time_t <- function(wide_data, Z, t_drop) {

        new_wide_data <- list()
        new_wide_data$trt <- wide_data$trt
        new_wide_data$X <- wide_data$X[, -t_drop, drop = F]
        new_wide_data$y <- cbind(wide_data$X[, t_drop, drop = F], 
                                 wide_data$y)

        X0 <- new_wide_data$X[new_wide_data$trt == 0,, drop = F]
        x1 <- matrix(colMeans(new_wide_data$X[new_wide_data$trt == 1,,
                                              drop = F]),
                     ncol=1)
        y0 <- new_wide_data$y[new_wide_data$trt == 0,, drop = F]
        y1 <- colMeans(new_wide_data$y[new_wide_data$trt == 1,, drop = F])

        new_synth_data <- list()
        new_synth_data$Z0 <- t(X0)
        new_synth_data$X0 <- t(X0)
        new_synth_data$Z1 <- x1
        new_synth_data$X1 <- x1

        return(list(wide_data = new_wide_data,
                    synth_data = new_synth_data,
                    Z = Z)) 
}

#' Conformal inference procedure to compute p-values and point-wise confidence intervals
#' @param ascm Fitted `augsynth` object
#' @param alpha Confidence level
#' @param stat_func Function to compute test statistic
#' @param type Either "iid" for iid permutations or "block" for moving block permutations; default is "block"
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param grid_size Number of grid points to use when inverting the hypothesis test
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
#'          \item{"se"}{Standard error, always NA but returned for compatibility}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"p_val"}{p-value for test of no post-treatment effect}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
conformal_inf <- function(ascm, alpha = 0.05, 
                          stat_func = NULL, type = "iid",
                          q = 1, ns = 1000, grid_size = 50) {
  wide_data <- ascm$data
  synth_data <- ascm$data$synth_data
  n <- nrow(wide_data$X)
  n_c <- dim(synth_data$Z0)[2]
  Z <- wide_data$Z

  t0 <- dim(synth_data$Z0)[1]
  tpost <- ncol(wide_data$y)
  t_final <- dim(synth_data$Y0plot)[1]

  # grid of nulls
  att <- predict(ascm, att = T)
  post_att <- att[(t0 +1):t_final]
  post_sd <- sqrt(mean(post_att ^ 2))
  # iterate over post-treatment periods to get pointwise CIs
  vapply(1:tpost,
         function(j) {
          # fit using t0 + j as a pre-treatment period and get reisduals
          new_wide_data <- wide_data
          new_wide_data$X <- cbind(wide_data$X, wide_data$y[, j, drop = TRUE])
          if(tpost > 1) {
            new_wide_data$y <- wide_data$y[, -j, drop = FALSE]
          } else {
            # set the post period has to be *something*
            new_wide_data$y <- matrix(1, nrow = n, ncol = 1)
          }


          # make a grid around the estimated ATT
          grid <- seq(att[t0 + j] - 2 * post_sd, att[t0 + j] + 2 * post_sd,
                      length.out = grid_size)
          compute_permute_ci(new_wide_data, ascm, grid, 1, alpha, type,
                             q, ns, stat_func)
         },
         numeric(3)) -> cis

  # test a null post-treatment effect
  new_wide_data <- wide_data
  new_wide_data$X <- cbind(wide_data$X, wide_data$y)
  new_wide_data$y <- matrix(1, nrow = n, ncol = 1)
  null_p <- compute_permute_pval(new_wide_data, ascm, 0, ncol(wide_data$y), 
                                 type, q, ns, stat_func)
  out <- list()
  att <- predict(ascm, att = T)
  out$att <- c(att, mean(att[(t0 + 1):t_final]))
  # out$se <- rep(NA, t_final)
  # out$sigma <- NA
  out$lb <- c(rep(NA, t0), cis[1, ], NA)
  out$ub <- c(rep(NA, t0), cis[2, ], NA)
  out$p_val <- c(rep(NA, t0), cis[3, ], null_p)
  out$alpha <- alpha
  return(out)
}


#' Conformal inference procedure to compute a confidence interval for a linear in time effect
#' @param ascm Fitted `augsynth` object
#' @param alpha Confidence level
#' @param stat_func Function to compute test statistic
#' @param type Either "iid" for iid permutations or "block" for moving block permutations; default is "iid"
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param grid_size Number of grid points to use when inverting the hypothesis test
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
#'          \item{"se"}{Standard error, always NA but returned for compatibility}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"p_val"}{p-value for test of no post-treatment effect}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
conformal_inf_linear <- function(ascm, alpha = 0.05, 
                          stat_func = NULL, type = "iid",
                          q = 1, ns = 1000, grid_size = 50) {
  wide_data <- ascm$data
  synth_data <- ascm$data$synth_data
  n <- nrow(wide_data$X)
  n_c <- dim(synth_data$Z0)[2]
  Z <- wide_data$Z

  t0 <- dim(synth_data$Z0)[1]
  tpost <- ncol(wide_data$y)
  t_final <- dim(synth_data$Y0plot)[1]

  # grid of nulls
  att <- predict(ascm, att = T)
  post_att <- att[(t0 +1):t_final]
  post_second <- sqrt(mean(post_att^2))

  # grid for slope
  # use ols to get pilot estimate
  ts <- 1:tpost
  lm_out <- summary(lm(post_att ~ ts))$coefficients
  # grid for intercept
  grid_int <- seq(lm_out[1,1] - 2 * post_second,
                  lm_out[1,1] + 2 * post_second,
                  length.out = grid_size)
  if(tpost == 2) {
    warning(paste0("There are 2 post-treatment time periods, so a linear model has a perfect fit. A confidence interval for the slope may not be reasonable here."))
    grid_slope <- seq(lm_out[2,1] - abs(lm_out[2,1]),
                  lm_out[2,1] + abs(lm_out[2,1]),
                  length.out = grid_size)
  } else if(tpost <= 1) {
    stop("There is only one post-treatment time period, so an intercept and a slope cannot be computed.")
  } else {
    grid_slope <- seq(lm_out[2,1] - 4 * lm_out[2,2] * sqrt(tpost),
                  lm_out[2,1] + 4 * lm_out[2,2] * sqrt(tpost),
                  length.out = grid_size)
  }

  # test a null post-treatment effect
  new_wide_data <- wide_data
  new_wide_data$X <- cbind(wide_data$X, wide_data$y)
  new_wide_data$y <- matrix(1, nrow = n, ncol = 1)
  null_p <- compute_permute_pval(new_wide_data, ascm, 0, ncol(wide_data$y), 
                                 type, q, ns, stat_func)

  # confidence interval for linear in time treatment effects
  cis <- compute_permute_ci_linear(new_wide_data, ascm, grid_int, grid_slope,
                                   ncol(wide_data$y), alpha, type, q, ns, stat_func)

  return(cis)
}


#' Compute conformal test statistics
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param h0 Null hypothesis to test
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return List that contains:
#'         \itemize{
#'          \item{"resids"}{Residuals after enforcing the null}
#'          \item{"test_stats"}{Permutation distribution of test statistics}
#'          \item{"stat_func"}{Test statistic function}
#'         }
#' @noRd
compute_permute_test_stats <- function(wide_data, ascm, h0,
                                       post_length, type,
                                       q, ns, stat_func) {
  # format data
  new_wide_data <- wide_data
  t0 <- ncol(wide_data$X) - post_length
  tpost <- t0 + post_length
  # adjust outcomes for null
  new_wide_data$X[wide_data$trt == 1,(t0 + 1):tpost ] <- new_wide_data$X[wide_data$trt == 1,(t0 + 1):tpost] - h0
  X0 <- new_wide_data$X[new_wide_data$trt == 0,, drop = F]
  x1 <- matrix(colMeans(new_wide_data$X[new_wide_data$trt == 1,, drop = F]),
              ncol=1)

  new_synth_data <- list()
  new_synth_data$Z0 <- t(X0)
  new_synth_data$X0 <- t(X0)
  new_synth_data$Z1 <- x1
  new_synth_data$X1 <- x1

  # fit synth with adjusted data and get residuals
  new_ascm <- do.call(fit_augsynth_internal,
                    c(list(wide = new_wide_data,
                            synth_data = new_synth_data,
                            Z = wide_data$Z,
                            progfunc = ascm$progfunc,
                            scm = ascm$scm,
                            fixedeff = ascm$fixedeff),
                        ascm$extra_args))
  resids <- predict(new_ascm, att = T)[1:tpost]
  # permute residuals and compute test statistic
  if(is.null(stat_func)) {
    stat_func <- function(x) (sum(abs(x) ^ q)  / sqrt(length(x))) ^ (1 / q)
  }
  if(type == "iid") {
    test_stats <- sapply(1:ns, 
                        function(x) {
                          reorder <- sample(resids)
                          stat_func(reorder[(t0 + 1):tpost])
                        })
  } else {
    ## increment time by one step and wrap
    test_stats <- sapply(1:tpost,
                        function(j) {
                          reorder <- resids[(0:tpost -1 + j) %% tpost + 1]
                          stat_func(reorder[(t0 + 1):tpost])
                        })
  }
  
  return(list(resids = resids,
              test_stats = test_stats,
              stat_func = stat_func))
}


#' Compute conformal p-value
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param h0 Null hypothesis to test
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return Computed p-value
#' @noRd
compute_permute_pval <- function(wide_data, ascm, h0,
                                 post_length, type,
                                 q, ns, stat_func) {
  t0 <- ncol(wide_data$X) - post_length
  tpost <- t0 + post_length
  out <- compute_permute_test_stats(wide_data, ascm, h0,
                                    post_length, type, q, ns, stat_func)
  mean(out$stat_func(out$resids[(t0 + 1):tpost]) <= out$test_stats)
}

#' Compute conformal p-value
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param grid Set of null hypothesis to test for inversion
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return (lower bound of interval, upper bound of interval, p-value for null of 0 effect)
#' @noRd
compute_permute_ci <- function(wide_data, ascm, grid,
                               post_length, alpha, type,
                               q, ns, stat_func) {
  # make sure 0 is in the grid
  grid <- c(grid, 0)
  ps <-sapply(grid, 
              function(x) {
                compute_permute_pval(wide_data, ascm, x, 
                                     post_length, type, q, ns, stat_func)
              })
  c(min(grid[ps >= alpha]), max(grid[ps >= alpha]), ps[grid == 0])
}


#' Compute conformal confidence interval for a linear model for effects
#' int + slope * time
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param grid_int Set of null hypothesis values for the intercept
#' @param grid_slope Set of null hypothesis values for the slope
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return (lower bound of interval, upper bound of interval, p-value for null of 0 effect)
#' @noRd
compute_permute_ci_linear <- function(wide_data, ascm, grid_int, grid_slope,
                               post_length, alpha, type,
                               q, ns, stat_func) {
  # make sure 0 is in both grids
  # grid_int <- c(grid_int, 0)
  # grid_slope <- c(grid_slope, 0)
  # make the combined grid
  grid_comb <- expand.grid(grid_int, grid_slope)
  grid_comb$p_val <-apply(grid_comb, 1,
              function(x) {
                compute_permute_pval(wide_data, ascm, x[1] + x[2] * (1:post_length), 
                                     post_length, type, q, ns, stat_func)
              })
  ci_int <- c(min(grid_comb[grid_comb$p_val >= alpha, 1]),
              max(grid_comb[grid_comb$p_val >= alpha, 1]))
  ci_slope <- c(min(grid_comb[grid_comb$p_val >= alpha, 2]),
                   max(grid_comb[grid_comb$p_val >= alpha, 2]))
  int_slope_est <- as.numeric(grid_comb[which.max(grid_comb$p_val), 1:2])
  return(list(est_int = int_slope_est[1], ci_int = ci_int,
              est_slope = int_slope_est[2], ci_slope = ci_slope))
}


#' Jackknife+ algorithm over time
#' @param ascm Fitted `augsynth` object
#' @param alpha Confidence level
#' @param conservative Whether to use the conservative jackknife+ procedure
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
#'          \item{"se"}{Standard error, always NA but returned for compatibility}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
time_jackknife_plus_multiout <- function(ascm_multi, alpha = 0.05, conservative = F) {
    wide_data <- ascm_multi$data
    data_list <- ascm_multi$data_list

    n <- nrow(wide_data$X)
    k <- length(data_list$X)


    t0 <- min(sapply(data_list$X, ncol))
    tpost <- max(sapply(data_list$y, ncol))
    t_final <- t0 + tpost
    Z <- wide_data$Z

    jack_ests <- lapply(1:t0, 
        function(tdrop) {
            # drop unit i
            new_data_list <- drop_time_t_multiout(data_list, Z, tdrop)
            # refit
            new_ascm <- do.call(fit_augsynth_multiout_internal,
                    c(list(wide_list = new_data_list,
                            combine_method = ascm_multi$combine_method,
                            Z = data_list$Z,
                            progfunc = ascm_multi$progfunc,
                            scm = ascm_multi$scm,
                            fixedeff = ascm_multi$fixedeff,
                            outcomes_str = ascm_multi$outcomes),
                        ascm_multi$extra_args))
            # get ATT estimates and held out error for time t
            # t0 is prediction for held out time
            est <- predict(new_ascm, att = F)[(t0 +1):t_final, , drop = F]
            est <- rbind(est, colMeans(est))
            # err <- c(colMeans(wide_data$X[wide_data$trt == 1,
            #                              tdrop,
            #                              drop = F]) -
            #         predict(new_ascm, att = F)[t0])
            err <- c(predict(new_ascm, att = T)[t0, , drop = F])
            list(err, t(t(est) + abs(err)), t(t(est) - abs(err)), t(t(est) + err), est)
        })
    # get errors and jackknife distribution
    held_out_errs <- matrix(vapply(jack_ests, `[[`, numeric(k), 1), nrow = k)
    jack_dist_high <- vapply(jack_ests, `[[`,
                        matrix(0, nrow = tpost + 1, ncol = k), 2)
    jack_dist_low <- vapply(jack_ests, `[[`,
                        matrix(0, nrow = tpost + 1, ncol = k), 3)
    jack_dist_cons <- vapply(jack_ests, `[[`,
                        matrix(0, nrow = tpost + 1, ncol = k), 4)

    out <- list()
    att <- predict(ascm_multi, att = T)
    out$att <- rbind(att, 
                      colMeans(att[(t0 + 1):t_final, , drop = F]))
    # held out ATT

    out$heldout_att <- rbind(t(held_out_errs), 
                              att[(t0 + 1):t_final, , drop = F], 
                              colMeans(att[(t0 + 1):t_final, , drop = F]))
    if(conservative) {
        qerr <- apply(abs(held_out_errs), 1, 
                      stats::quantile, 1 - alpha, type = 1)
        out$lb <- rbind(matrix(NA, nrow = t0, ncol = k),
                        t(t(apply(jack_dist_cons, 1:2, min)) - qerr))
        out$ub <- rbind(matrix(NA, nrow = t0, ncol = k),
                        t(t(apply(jack_dist_cons, 1:2, max)) + qerr))

    } else {
        out$lb <- rbind(matrix(NA, nrow = t0, ncol = k),
                        apply(jack_dist_low, 1:2,
                              stats::quantile, alpha, type = 1))
        out$ub <- rbind(matrix(NA, nrow = t0, ncol = k), 
                        apply(jack_dist_high, 1:2, 
                              stats::quantile, 1 - alpha, type = 1))
    }
    # shift back to ATT scale
    y1 <- predict(ascm_multi, att = F) + att
    y1 <-  rbind(y1, colMeans(y1[(t0 + 1):t_final, , drop = F]))
    shifted_lb <- y1 - out$ub
    shifted_ub <- y1 - out$lb
    out$lb <- shifted_lb
    out$ub <- shifted_ub
    out$alpha <- alpha


    return(out)
}

#' Drop time period from pre-treatment data
#' @param wide_data (X, y, trt)
#' @param Z Covariates matrix
#' @param t_drop Time to drop
#' @noRd
drop_time_t_multiout <- function(data_list, Z, t_drop) {

        new_data_list <- list()
        new_data_list$trt <- data_list$trt
        new_data_list$X <- lapply(data_list$X,
                                  function(x) x[, -t_drop, drop = F])
        new_data_list$y <- lapply(1:length(data_list$y),
                                  function(k) {
                                    cbind(data_list$X[[k]][, t_drop, drop = F], 
                                          data_list$y[[k]])
                                  })
        return(new_data_list)
}


#' Conformal inference procedure to compute p-values and point-wise confidence intervals
#' @param ascm Fitted `augsynth` object
#' @param alpha Confidence level
#' @param stat_func Function to compute test statistic
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param grid_size Number of grid points to use when inverting the hypothesis test (default is 1, so only to test joint null)
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
#'          \item{"se"}{Standard error, always NA but returned for compatibility}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"p_val"}{p-value for test of no post-treatment effect}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
conformal_inf_multiout <- function(ascm_multi, alpha = 0.05, 
                                    stat_func = NULL, type = "iid",
                                    q = 1, ns = 1000, grid_size = 1,
                                    lin_h0 = NULL) {
  wide_data <- ascm_multi$data
  data_list <- ascm_multi$data_list

  n <- nrow(wide_data$X)
  k <- length(data_list$X)


  t0 <- min(sapply(data_list$X, ncol))
  tpost <- max(sapply(data_list$y, ncol))
  t_final <- t0 + tpost

  # grid of nulls
  att <- predict(ascm_multi, att = T)
  post_att <- att[(t0 +1):t_final,, drop = F]
  post_sd <- apply(post_att, 2, function(x) sqrt(mean(x ^ 2, na.rm = T)))
  # iterate over post-treatment periods to get pointwise CIs
  
  vapply(1:tpost,
         function(j) {
          # fit using t0 + j as a pre-treatment period and get residuals
          new_data_list <- data_list
          new_data_list$X <- lapply(1:k,
              function(i) {
                Xi <- cbind(data_list$X[[i]], data_list$y[[i]][, j, drop = TRUE])
                colnames(Xi) <- c(colnames(data_list$X[[i]]),
                                  colnames(data_list$y[[i]])[j])
                Xi
          })
          
          
          if(tpost > 1) {
            new_data_list$y <- lapply(1:k,
              function(i) {
                data_list$y[[i]][, -j, drop = FALSE]
            })
          } else {
            # set the post period has to be *something*
            new_data_list$y <- lapply(1:k,
              function(i) {
                x <- matrix(1, nrow = n, ncol = 1)
                colnames(x) <- max(as.numeric(colnames(data_list$y[[i]]))) + 1
                x
            })
          }


          # make a grid around the estimated ATT
          if(is.null(lin_h0)) {
            grid <- lapply(1:k, 
            function(i) {
              seq(att[t0 + j, i] - 2 * post_sd[i], att[t0 + j, i] + 2 * post_sd[i],
                    length.out = grid_size)
            })
          } else {
            grid <- seq(min(att[t0 + j, ]) - 2 * max(post_sd),
                max(att[t0 + j, ]) + 2 * max(post_sd),
                length.out = grid_size)
          }
          if(grid_size > 1) {
            compute_permute_ci_multiout(new_data_list, ascm_multi, grid, 1,
                                    alpha, type, q, ns, lin_h0, stat_func)
          } else {
            rbind(matrix(0, ncol = k, nrow = 2),
              compute_permute_pval_multiout(new_data_list, ascm_multi, numeric(k),
                                          1, type, q, ns, stat_func))
          }
          
         },
         matrix(0, ncol = k, nrow=3)) -> cis
  # # test a null post-treatment effect

  new_data_list <- data_list
  new_data_list$X <- lapply(1:k,
      function(i) {
        Xi <- cbind(data_list$X[[i]], data_list$y[[i]])
        colnames(Xi) <- c(colnames(data_list$X[[i]]),
                          colnames(data_list$y[[i]]))
        Xi
      })
  # set post treatment to be *something*
  new_data_list$y <- lapply(1:k,
      function(i) {
        data_list$y[[i]][, 1, drop = FALSE]
    })
  null_p <- compute_permute_pval_multiout(new_data_list, ascm_multi,
                                          numeric(k), 
                                          tpost, type, q, ns, stat_func)
  if(is.null(lin_h0)) {
    grid <- lapply(1:k, 
            function(i) {
              seq(min(att[(t0 + 1):tpost, i]) - 4 * post_sd[i],
                  max(att[(t0 + 1):tpost, i]) + 4 * post_sd[i],
                    length.out = grid_size)
            })
  } else {
    grid <- seq(min(att[t0 + 1, ]) - 3 * max(post_sd),
                max(att[t0 + 1, ]) + 3 * max(post_sd),
                length.out = grid_size)
  }
  null_ci <- compute_permute_ci_multiout(new_data_list, ascm_multi, grid,
                                          tpost, alpha, type, q, ns,
                                          lin_h0, stat_func)
  out <- list()
  att <- predict(ascm_multi, att = T)
  out$att <- rbind(att, apply(att[(t0 + 1):t_final, , drop = F], 2, mean))
  out$lb <- rbind(matrix(NA, nrow = t0, ncol = k),
                  t(matrix(cis[1, ,], nrow = k)),
                  # rep(NA, k)
                  null_ci[1,]
                  )
  colnames(out$lb) <- ascm_multi$outcomes
  out$ub <- rbind(matrix(NA, nrow = t0, ncol = k),
                  t(matrix(cis[2, ,], nrow = k)),
                  # rep(NA, k)
                  null_ci[2,]
                  )
  colnames(out$ub) <- ascm_multi$outcomes
  out$p_val <- rbind(matrix(NA, nrow = t0, ncol = k),
                  t(matrix(cis[3, ,], nrow = k)),
                  # rep(null_p, k)
                  null_ci[3,])
  colnames(out$p_val) <- ascm_multi$outcomes
  out$alpha <- alpha
  return(out)
}


#' Compute conformal test statistics
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param h0 Null hypothesis to test
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return List that contains:
#'         \itemize{
#'          \item{"resids"}{Residuals after enforcing the null}
#'          \item{"test_stats"}{Permutation distribution of test statistics}
#'          \item{"stat_func"}{Test statistic function}
#'         }
#' @noRd
compute_permute_test_stats_multiout <- function(data_list, ascm_multi, h0,
                                              post_length, type,
                                              q, ns, stat_func) {
  # format data
  new_data_list <- data_list
  t0 <- ncol(data_list$X[[1]]) - post_length
  tpost <- t0 + post_length
  k <- length(data_list$X)
  # adjust outcomes for null
  for(i in 1:k) {
    new_data_list$X[[k]][data_list$trt == 1,(t0 + 1):tpost ] <- new_data_list$X[[k]][data_list$trt == 1,(t0 + 1):tpost] - h0[i]
  }
  # fit synth with adjusted data and get residuals
  new_ascm <- do.call(fit_augsynth_multiout_internal,
                    c(list(wide_list = new_data_list,
                            combine_method = ascm_multi$combine_method,
                            Z = data_list$Z,
                            progfunc = ascm_multi$progfunc,
                            scm = ascm_multi$scm,
                            fixedeff = ascm_multi$fixedeff,
                            outcomes_str = ascm_multi$outcomes),
                        ascm_multi$extra_args))

  resids <- predict(new_ascm, att = T)[1:tpost, , drop = F]

  # permute residuals and compute test statistic
  if(is.null(stat_func)) {
    stat_func <- function(x) {
      x <- na.omit(x)
      (sum(abs(x) ^ q)  / sqrt(length(x))) ^ (1 / q)
    }
  }
  if(type == "iid") {
    test_stats <- sapply(1:ns, 
                        function(x) {
                          idxs <- sample(1:nrow(resids))
                          reorder <- resids[idxs, , drop = F]
                          apply(reorder[(t0 + 1):tpost, ,drop = F], 2, stat_func)
                        })
  } else {
    ## increment time by one step and wrap
    test_stats <- sapply(0:(tpost - 1),
                        function(j) {
                          reorder <- resids[(0:(tpost -1) + j) %% tpost + 1, ,drop = F]
                          if(!all(dim(reorder) == dim(resids))) {
                            stop("Error in block resampling")
                          }
                          apply(reorder[(t0 + 1):tpost, , drop = F], 2, stat_func)
                        })
  }
  
  return(list(resids = resids,
              test_stats = matrix(test_stats, nrow = k),
              stat_func = stat_func))
}


#' Compute conformal p-value
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param h0 Null hypothesis to test
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return Computed p-value
#' @noRd
compute_permute_pval_multiout <- function(data_list, ascm_multi, h0,
                                        post_length, type,
                                        q, ns, stat_func) {
  t0 <- ncol(data_list$X[[1]]) - post_length
  tpost <- t0 + post_length

  out <- compute_permute_test_stats_multiout(data_list, ascm_multi, h0,
                                          post_length, type, q, ns, stat_func)
  k <- length(data_list$X)

  comb_stat <- mean(apply(out$resids[(t0 + 1):tpost, , drop = F], 2, out$stat_func), na.rm = TRUE)
  comb_test_stats <- apply(out$test_stats, 2, mean, na.rm = TRUE)
  # if(h0 == 0) {
  #   hist(comb_test_stats)
  #   abline(v = comb_stat)
  #   print(mean(comb_stat <= comb_test_stats))
  #   print(1 - mean(comb_stat > comb_test_stats))
  # }
  1 - mean(comb_stat > comb_test_stats)
}

#' Compute conformal p-value
#' @param wide_data List containing pre- and post-treatment outcomes and outcome vector
#' @param ascm Fitted `augsynth` object
#' @param grid Set of null hypothesis to test for inversion
#' @param post_length Number of post-treatment periods
#' @param type Either "iid" for iid permutations or "block" for moving block permutations
#' @param q The norm for the test static `((sum(x ^ q))) ^ (1/q)`
#' @param ns Number of resamples for "iid" permutations
#' @param stat_func Function to compute test statistic
#' 
#' @return (lower bound of interval, upper bound of interval, p-value for null of 0 effect)
#' @noRd
compute_permute_ci_multiout <- function(data_list, ascm_multi, grid,
                                      post_length, alpha, type,
                                      q, ns, lin_h0 = NULL, stat_func) {
  # make sure 0 is in the grid
  if(is.null(lin_h0)) {
    grid <- lapply(grid, function(x) c(x, 0))
    k <- length(grid)
    # get all combinations of grid
    grid <- expand.grid(grid)
    grid_low <- NULL
  } else {
    k <- length(lin_h0)
    # keep track of low dimensional grid
    grid_low <- c(grid, 0)
    # transform into high dimensional grid with linear hypothesis
    grid <- sapply(lin_h0, function(x) x * grid_low)
  }
  ps <- apply(grid, 1,
              function(x) {
                compute_permute_pval_multiout(data_list, ascm_multi, x, 
                                     post_length, type, q, ns, stat_func)
              })
  sapply(1:k, 
    function(i) c(min(grid[ps >= alpha, i]), 
                  max(grid[ps >= alpha, i]), 
                  ps[apply(grid == 0, 1, all)]))
}


#' Drop unit i from data
#' @param wide_data (X, y, trt)
#' @param Z Covariates matrix
#' @param i Unit to drop
#' @noRd
drop_unit_i <- function(wide_data, Z, i) {

        new_wide_data <- list()
        new_wide_data$trt <- wide_data$trt[-i]
        new_wide_data$X <- wide_data$X[-i,, drop = F]
        new_wide_data$y <- wide_data$y[-i,, drop = F]

        X0 <- new_wide_data$X[new_wide_data$trt == 0,, drop = F]
        x1 <- matrix(colMeans(new_wide_data$X[new_wide_data$trt == 1,, drop = F]),
                     ncol=1)
        y0 <- new_wide_data$y[new_wide_data$trt == 0,, drop = F]
        y1 <- colMeans(new_wide_data$y[new_wide_data$trt == 1,, drop = F])

        new_synth_data <- list()
        new_synth_data$Z0 <- t(X0)
        new_synth_data$X0 <- t(X0)
        new_synth_data$Z1 <- x1
        new_synth_data$X1 <- x1
        new_Z <- if(!is.null(Z)) Z[-i, , drop = F] else NULL

        return(list(wide_data = new_wide_data,
                    synth_data = new_synth_data,
                    Z = new_Z))
}

#' Drop unit i from data
#' @param wide_list (X, y, trt)
#' @param Z Covariates matrix
#' @param i Unit to drop
#' @noRd
drop_unit_i_multiout <- function(wide_list, Z, i) {

        new_wide_data <- list()
        new_wide_data$trt <- wide_list$trt[-i]
        new_wide_data$X <- lapply(wide_list$X, function(x) x[-i,, drop = F])
        new_wide_data$y <- lapply(wide_list$y, function(x) x[-i,, drop = F])
        new_Z <- if(!is.null(Z)) Z[-i, , drop = F] else NULL

        return(list(wide_list = new_wide_data,
                    Z = new_Z))
}


#' Estimate standard errors for single ASCM with the jackknife
#' Do this for ridge-augmented synth
#' @param ascm Fitted augsynth object
#' 
#' @return List that contains:
#'         \itemize{
#'          \item{"att"}{Vector of ATT estimates}
#'          \item{"se"}{Standard error estimate}
#'          \item{"lb"}{Lower bound of 1 - alpha confidence interval}
#'          \item{"ub"}{Upper bound of 1 - alpha confidence interval}
#'          \item{"alpha"}{Level of confidence interval}
#'         }
jackknife_se_single <- function(ascm) {

    wide_data <- ascm$data
    synth_data <- ascm$data$synth_data
    n <- nrow(wide_data$X)
    n_c <- dim(synth_data$Z0)[2]
    Z <- wide_data$Z

    t0 <- dim(synth_data$Z0)[1]
    tpost <- ncol(wide_data$y)
    t_final <- dim(synth_data$Y0plot)[1]
    errs <- matrix(0, n_c, t_final - t0)


    # only drop out control units with non-zero weights
    nnz_weights <- numeric(n)
    nnz_weights[wide_data$trt == 0] <- round(ascm$weights, 3) != 0
    # if more than one unit is treated, include them in the jackknife
    if(sum(wide_data$trt) > 1) {
      nnz_weights[wide_data$trt == 1] <- 1
    }

    trt_idxs <- (1:n)[as.logical(nnz_weights)]


    # jackknife estimates
    ests <- vapply(trt_idxs,
                   function(i) {
                       # drop unit i
                       new_data <- drop_unit_i(wide_data, Z, i)
                       # refit
                       new_ascm <- do.call(fit_augsynth_internal,
                                c(list(wide = new_data$wide,
                                       synth_data = new_data$synth_data,
                                       Z = new_data$Z,
                                       progfunc = ascm$progfunc,
                                       scm = ascm$scm,
                                       fixedeff = ascm$fixedeff),
                                  ascm$extra_args))
                       # get ATT estimates
                       est <- predict(new_ascm, att = T)[(t0 + 1):t_final]
                       c(est, mean(est))
                   },
                   numeric(tpost + 1))
    # convert to matrix
    ests <- matrix(ests, nrow = tpost + 1, ncol = length(trt_idxs))
    ## standard errors
    se2 <- apply(ests, 1,
                 function(x) (n - 1) / n * sum((x - mean(x, na.rm = T)) ^ 2))
    se <- sqrt(se2)

    out <- list()
    att <- predict(ascm, att = T)
    out$att <- c(att, mean(att[(t0 + 1):t_final]))

    out$se <- c(rep(NA, t0), se)
    # out$sigma <- NA
    return(out)
}


#' Compute standard errors using the jackknife
#' @param multisynth fitted multisynth object
#' @param relative Whether to compute effects according to relative time
#' @noRd
jackknife_se_multi <- function(multisynth, relative=NULL, alpha = 0.05, att_weight = NULL) {
    ## get info from the multisynth object
    if(is.null(relative)) {
        relative <- multisynth$relative
    }
    n_leads <- multisynth$n_leads
    n <- nrow(multisynth$data$X)
    att <- predict(multisynth, att=T, att_weight = att_weight)
    outddim <- nrow(att)

    J <- length(multisynth$grps)

    ## drop each unit and estimate overall treatment effect
    jack_est <- vapply(1:n,
                       function(i) {
                           msyn_i <- drop_unit_i_multi(multisynth, i)
                           pred <- predict(msyn_i[[1]], relative=relative, att=T, att_weight = att_weight)
                           if(nrow(pred) < outddim) {
                               pred <- rbind(
                                   pred[1:(nrow(pred)-1), ],
                                   matrix(NA, nrow=outddim-nrow(pred), ncol=ncol(pred)),
                                   pred[nrow(pred), ]
                               )
                           }
                           if(length(msyn_i[[2]]) != 0) {
                               out <- matrix(NA, nrow=nrow(pred), ncol=(J+1))
                               out[,-(msyn_i[[2]]+1)] <- pred
                           } else {
                               out <- pred
                           }
                           out
                       },
                       matrix(0, nrow=outddim,ncol=(J+1)))

    se2 <- apply(jack_est, c(1,2),
                function(x) (n-1) / n * sum((x - mean(x,na.rm=T))^2, na.rm=T))
    lower_bound <- att - qnorm(1 - alpha / 2) * sqrt(se2)
    upper_bound <- att + qnorm(1 - alpha / 2) * sqrt(se2)
    return(list(att = att, se = sqrt(se2),
                lower_bound = lower_bound, upper_bound = upper_bound))

}

#' Helper function to drop unit i and refit
#' @param msyn multisynth_object
#' @param i Unit to drop
#' @noRd
drop_unit_i_multi <- function(msyn, i) {

    n <- nrow(msyn$data$X)
    time_cohort <- msyn$time_cohort
    which_t <- (1:n)[is.finite(msyn$data$trt)]

    not_miss_j <- which_t %in% setdiff(which_t, i)

    # drop unit i from data
    drop_i <- msyn$data
    drop_i$X <- msyn$data$X[-i, , drop = F]
    drop_i$y <- msyn$data$y[-i, , drop = F]
    drop_i$trt <- msyn$data$trt[-i]
    drop_i$mask <- msyn$data$mask[not_miss_j,, drop = F]

    if(!is.null(msyn$data$Z)) {
      drop_i$Z <- msyn$data$Z[-i, , drop = F]
    } else {
      drop_i$Z <- NULL
    }

    long_df <- msyn$long_df
    unit <- colnames(long_df)[1]
    # make alphabetical, because the ith unit is the index in alphabetical ordering
    long_df <- long_df[order(long_df[, unit, drop = TRUE]),]
    ith_unit <- unique(long_df[,unit, drop = TRUE])[i]
    long_df <- long_df[long_df[,unit, drop = TRUE] != ith_unit,]

    # re-fit everything
    args_list <- list(wide = drop_i, relative = msyn$relative,
                      n_leads = msyn$n_leads, n_lags = msyn$n_lags,
                      nu = msyn$nu, lambda = msyn$lambda,
                      V = msyn$V,
                      force = msyn$force, n_factors = msyn$n_factors,
                      scm = msyn$scm, time_w = msyn$time_w,
                      lambda_t = msyn$lambda_t,
                      fit_resids = msyn$fit_resids,
                      time_cohort = msyn$time_cohort, long_df = long_df,
                      how_match = msyn$how_match)
    msyn_i <- do.call(multisynth_formatted, c(args_list, msyn$extra_pars))

    # check for dropped treated units/time periods
    if(time_cohort) {
        dropped <- which(!msyn$grps %in% msyn_i$grps)
    } else {
        dropped <- which(!not_miss_j)
    }
    return(list(msyn_i,
                dropped))
}


#' Estimate standard errors for multi outcome ascm with jackknife
#' @param ascm Fitted augsynth object
#' @noRd
jackknife_se_multiout <- function(ascm) {

    wide_data <- ascm$data
    wide_list <- ascm$data_list
    n <- nrow(wide_data$X)
    Z <- wide_data$Z


    # only drop out control units with non-zero weights
    nnz_weights <- numeric(n)
    nnz_weights[wide_data$trt == 0] <- round(ascm$weights, 3) != 0

    trt_idxs <- (1:n)[as.logical(nnz_weights)]

    # jackknife estimates
    ests <- lapply(trt_idxs,
                   function(i) {
                       # drop unit i
                       new_data <- drop_unit_i_multiout(wide_list, Z, i)
                       # refit
                       new_ascm <- do.call(fit_augsynth_multiout_internal,
                                c(list(wide = new_data$wide,
                                       combine_method = ascm$combine_method,
                                       Z = new_data$Z,
                                       progfunc = ascm$progfunc,
                                       scm = ascm$scm,
                                       fixedeff = ascm$fixedeff,
                                       outcomes_str = ascm$outcomes),
                                  ascm$extra_args))
                        new_ascm$outcomes <- ascm$outcomes
                        new_ascm$data_list <- ascm$data_list
                        new_ascm$data$time <- ascm$data$time
                       # get ATT estimates
                       est <- predict(new_ascm, att = T)
                       est <- est[as.numeric(rownames(est)) >= ascm$t_int,, drop = F]
                       rbind(est, colMeans(est, na.rm = T))
                   })
    ests <- simplify2array(ests)
    ## standard errors
    se2 <- apply(ests, c(1, 2),
                 function(x) (n - 1) / n * sum((x - mean(x, na.rm = T)) ^ 2))
    se <- sqrt(se2)
    out <- list()
    att <- predict(ascm, att = T)
    att_post <- colMeans(att[as.numeric(rownames(att)) >= ascm$t_int,, drop = F],
                         na.rm = T)
    out$att <- rbind(att, att_post)
    t0 <- sum(as.numeric(rownames(att)) < ascm$t_int)
    out$se <- rbind(matrix(NA, t0, ncol(se)), se)
    out$sigma <- NA
    return(out)
}


#' Compute the weighted bootstrap distribution
#' @param multisynth fitted multisynth object
#' @param rweight Function to draw random weights as a function of n (e.g rweight(n))
#' @param relative Whether to compute effects according to relative time
#' @noRd
weighted_bootstrap_multi <- function(multisynth,
                                    rweight = rwild_b,
                                    n_boot = 1000,
                                    alpha = 0.05,
                                    att_weight = NULL,
                                    relative=NULL) {
  ## get info from the multisynth object
  if(is.null(relative)) {
      relative <- multisynth$relative
  }

  n <- nrow(multisynth$data$X)
  att <- predict(multisynth, att=T, att_weight = att_weight)
  outddim <- nrow(att)
  n1 <- sum(is.finite(multisynth$data$trt))
  J <- length(multisynth$grps)


  # draw random weights to get bootstrap distribution
  bs_est <- vapply(1:n_boot,
                      function(i) {
                        Z <- rweight(n)# / sqrt(n1)

                        predict(multisynth, att=T, att_weight = att_weight, bs_weight = Z) - sum(Z) / n1 * att
                      },
                      matrix(0, nrow=outddim,ncol=(J+1)))

  se2 <- apply(bs_est, c(1,2),
              function(x) mean((x - mean(x))^2, na.rm=T))
  bias <- apply(bs_est, c(1,2),
              function(x) mean(x, na.rm=T))
  upper_bound <- att - apply(bs_est, c(1,2),
              function(x) quantile(x, alpha / 2, na.rm = T))
  
  lower_bound <- att - apply(bs_est, c(1,2),
              function(x) quantile(x, 1 - alpha / 2, na.rm = T))

  return(list(att = att,
              bias = bias,
              se = sqrt(se2),
              upper_bound = upper_bound,
              lower_bound = lower_bound))

}

#' Bayesian bootstrap
#' @param n Number of units
#' @export
rdirichlet_b <- function(n) {
  Z <- as.numeric(rgamma(n, 1, 1))
  return(Z / sum(Z) * n)
}

#' Non-parametric bootstrap
#' @param n Number of units
#' @export
rmultinom_b <- function(n) as.numeric(rmultinom(1, n, rep(1 / n, n)))

#' Wild bootstrap (Mammen 1993)
#' @param n Number of units
#' @export
rwild_b <- function(n) {
  sample(c(-(sqrt(5) - 1) / 2, (sqrt(5) + 1) / 2 ), n,
         replace = TRUE,
         prob = c((sqrt(5) + 1)/ (2 * sqrt(5)), (sqrt(5) - 1) / (2 * sqrt(5))))
}

================================================
FILE: R/multi_outcomes.R
================================================
#' Fit Augmented SCM with multiple outcomes
#' @param form outcome ~ treatment | auxillary covariates
#' @param unit Name of unit column
#' @param time Name of time column
#' @param t_int Time of intervention
#' @param data Panel data as dataframe
#' @param progfunc What function to use to impute control outcomes
#'                 Ridge=Ridge regression (allows for standard errors),
#'                 None=No outcome model,
#' @param scm Whether the SCM weighting function is used
#' @param fixedeff Whether to include a unit fixed effect, default F 
#' @param cov_agg Covariate aggregation functions, if NULL then use mean with NAs omitted
#' @param combine_method How to combine outcomes: `concat` concatenates outcomes and `avg` averages them, default: 'avg'
#' @param ... optional arguments for outcome model
#'
#' @return augsynth object that contains:
#'         \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate}
#'          \item{"data"}{Panel data as matrices}
#'         }
#' @export
augsynth_multiout <- function(form, unit, time, t_int, data,
                              progfunc=c("Ridge", "None"),
                              scm=T,
                              fixedeff = FALSE,
                              cov_agg=NULL,
                              combine_method = "avg",
                              ...) {
    call_name <- match.call()

    form <- Formula::Formula(form)
    unit <- enquo(unit)
    time <- enquo(time)
    
    ## format data
    outcome <- terms(formula(form, rhs=1))[[2]]
    trt <- terms(formula(form, rhs=1))[[3]]

    outcomes_str <- all.vars(outcome)
    outcomes <- sapply(outcomes_str, quo)
    # get outcomes as a list
    wide_list <- format_data_multi(outcomes, trt, unit, time, t_int, data)
    

    ## add covariates
    if(length(form)[2] == 2) {
        cov_form <- paste(deparse(terms(formula(form, rhs = 2))[[3]]), collapse = "")
        new_form <- as.formula(paste("~", cov_form))
        Z <- extract_covariates(new_form, unit, time, t_int, data, cov_agg)
    } else {
        Z <- NULL
    }

    # only allow ridge augmentation
    if(! tolower(progfunc) %in% c("none", "ridge")) {
      stop(paste(progfunc, "is not a valid augmentation function with multiple outcomes. Only `none` or `ridge` are allowable options for `prog_func`"))
    }

    # fit augmented SCM
    augsynth <- fit_augsynth_multiout_internal(wide_list, combine_method, Z,
                                               progfunc, scm,
                                               fixedeff, outcomes_str, ...)

    # add some extra data
    augsynth$data$time <- data %>% distinct(!!time) %>% pull(!!time)
    augsynth$call <- call_name
    augsynth$t_int <- t_int 
    augsynth$combine_method <- combine_method

    treated_units <- data %>% filter(!!trt == 1) %>% distinct(!!unit) %>% pull(!!unit)
    control_units <- data %>% filter(!(!!unit %in% treated_units)) %>% 
                        distinct(!!unit) %>% pull(!!unit)
    augsynth$weights <- matrix(augsynth$weights)
    rownames(augsynth$weights) <- control_units

    return(augsynth)
}


#' Internal function to fit augmented SCM with multiple outcomes
#' @param wide_list List of matrices for each outcome formatted from format_data
#' @param combine_method How to combine outcomes
#' @param Z Matrix of auxiliary covariates
#' @param progfunc outcome model to use
#' @param scm Whether to fit SCM
#' @param fixedeff Whether to de-mean synth
#' @param ... Extra args for outcome model
#' @noRd
fit_augsynth_multiout_internal <- function(wide_list, combine_method, Z,
                                           progfunc, scm, fixedeff, 
                                           outcomes_str, ...) {


    # combine into a matrix for fitting and balancing
    out <- combine_outcomes(wide_list, combine_method, fixedeff, ...)
    wide_bal <- out$wide_bal
    mhat <- out$mhat
    V <- out$V
    synth_data <- do.call(format_synth, wide_bal)

    # set Y1 and Y0plot to be raw concatenated outcomes
    X <- do.call(cbind, wide_list$X)
    y <- do.call(cbind, wide_list$y)
    trt <- wide_list$trt
    synth_data$Y0plot <- t(cbind(X, y)[trt == 0,, drop = F])
    synth_data$Y1plot <- colMeans(cbind(X, y)[trt == 1,, drop = F])


    augsynth <- fit_augsynth_internal(wide_bal, synth_data, Z, progfunc, 
                                      scm, fixedeff = F, V = V, ...)

    # potentially add back in fixed effects
    augsynth$mhat <- mhat# + augsynth$mhat

    augsynth$data <- list(X = X, trt = trt, y = y, Z = Z)
    augsynth$data_list <- wide_list
    augsynth$outcomes <- outcomes_str
    # change fixedeff flag to match input (rather than fixedeff = F in fit_augsynth_internal)
    augsynth$fixedeff <- fixedeff
    ##format output
    class(augsynth) <- c("augsynth_multiout", "augsynth")
    return(augsynth)
}

#' Helper function to combine multiple outcomes into a single balance matrix
#' @param wide_list List of lists of pre/post treatment data for each outcome
#' @param combine_method How to combine outcomes
#' @param fixedeff Whether to take out unit fixed effects or not
#' @param nu Weighting between concatenated and averaged objectives
#' @param ... Extra arguments for combination
#' @noRd
#' @return \itemize{
#'          \item{"X"}{Matrix of combined pre-treatment outcomes}
#'          \item{"trt"}{Vector of treatment assignments}
#'          \item{"y"}{Matrix of combined post-treatment outcomes}
#'         }
combine_outcomes <- function(wide_list, combine_method, fixedeff,
                             nu = NULL, ...) {
  n_outs <- length(wide_list$X)
  n_units <- Map(nrow, wide_list$X) %>% Reduce(max, .)
  # take out unit fixed effects
  demean_j <- function(j) {
    means <- rowMeans(wide_list$X[[j]], na.rm = TRUE)

    new_wide_data <- list()
    new_X <- wide_list$X[[j]] - means
    new_y <- wide_list$y[[j]] - means

    new_wide_data$X <- new_X
    new_wide_data$y <- new_y
    new_wide_data$mhat_pre <- replicate(ncol(wide_list$X[[j]]),
                                        means)
    new_wide_data$mhat_post <- replicate(ncol(wide_list$y[[j]]),
                                        means)
    return(new_wide_data)
  }
  if(fixedeff) {
    new_wide_list <- lapply(1:n_outs, demean_j)
    wide_list$X <- lapply(new_wide_list, function(x) x$X)
    wide_list$y <- lapply(new_wide_list, function(x) x$y)
    mhat_pre <- lapply(new_wide_list, function(x) x$mhat_pre)
    mhat_post <- lapply(new_wide_list, function(x) x$mhat_post)
  } else {
    mhat_pre <- lapply(
      1:n_outs,
      function(j) matrix(0, nrow = n_units, ncol = ncol(wide_list$X[[j]])))
    mhat_post <- lapply(
      1:n_outs,
      function(j) matrix(0, nrow = n_units, ncol = ncol(wide_list$y[[j]])))
  }

    # combine outcomes
    if(combine_method == "concat") {
      # center X and scale by overall variance for outcome
      # X <- lapply(wide_list$X, function(x) t(t(x) - colMeans(x)) / sd(x))
      wide_bal <- list(X = do.call(cbind, lapply(wide_list$X, function(x) t(na.omit(t(x))))),
                        y = do.call(cbind, lapply(wide_list$y, function(x) t(na.omit(t(x))))),
                        trt = wide_list$trt)

      # V matrix scales by inverse variance for outcome and number of periods
      V <- do.call(c, 
          lapply(wide_list$X, 
            function(x) rep(1 / (sqrt(nrow(na.omit(t(x)))) * 
                    sd(x[wide_list$trt == 0, , drop = F], na.rm=T)),
                    nrow(na.omit(t(x))))))

    # } else if(combine_method == "svd") {
    #     wide_bal <- list(X = do.call(cbind, wide_list$X),
    #                      y = do.call(cbind, wide_list$y),
    #                      trt = wide_list$trt)

    #     # first get the standard deviations of the outcomes to put on the same scale
    #     sds <- do.call(c, 
    #         lapply(wide_list$X, 
    #             function(x) rep((sqrt(ncol(x)) * sd(x, na.rm=T)), ncol(x))))

    #     # do an SVD on centered and scaled outcomes
    #     X0 <- wide_bal$X[wide_bal$trt == 0, , drop = FALSE]
    #     X0 <- t((t(X0) - colMeans(X0)) / sds)
    #     k <- if(is.null(k)) ncol(X0) else k
    #     V <- diag(1 / sds) %*% svd(X0)$v[, 1:k, drop = FALSE]
  } else if(combine_method == "avg") {
      # average pre-treatment outcomes, dividing by standard deviation and removing missing values
      X_avg <- rowMeans(simplify2array(lapply(wide_list$X,
                                  function(x) (x - mean(x[wide_list$trt == 0,], na.rm = TRUE)) / sd(x[wide_list$trt == 0,], na.rm = TRUE))), dims = 2, na.rm = TRUE)
      # remove any time periods with NAs
      X_avg <- t(na.omit(t(X_avg)))
    wide_bal <- list(X = X_avg,
        y = rowMeans(simplify2array(wide_list$y), dims = 2, na.rm = TRUE),
        trt = wide_list$trt)

    V <- diag(ncol(wide_bal$X))

  } else if(combine_method == "avg_concat") {
      # average pre-treatment outcomes, dividing by standard deviation and removing missing values
      # standardize the outcomes
      X_list_std<- lapply(wide_list$X,function(x) (x - mean(x[wide_list$trt == 0,], na.rm = TRUE)) / sd(x[wide_list$trt == 0,], na.rm = TRUE))


      X_avg <- rowMeans(simplify2array(X_list_std), dims = 2, na.rm = TRUE)
      # remove any time periods with NAs
      X_avg <- t(na.omit(t(X_avg)))

      X_concat <- do.call(cbind, lapply(X_list_std, function(x) t(na.omit(t(x)))))

      # V matrix assigns weight nu to the averaged objective and (1 - nu) to the concatenated objective
      # V <- c(rep(sqrt(nu), ncol(X_avg)),
      #         sqrt(1 - nu) / sqrt(n_outs) * do.call(c, 
      #             lapply(wide_list$X,
      #               function(x) rep(1 / (sqrt(nrow(na.omit(t(x)))) * 
      #                       sd(x[wide_list$trt == 0, , drop = F], na.rm=T)),
      #                       nrow(na.omit(t(x))))))
      # )
      V <- c(rep(sqrt(nu), ncol(X_avg)), rep(sqrt(1 - nu) / sqrt(n_outs), ncol(X_concat)))
      wide_bal <- list(
        X = cbind(X_avg, X_concat),
        y = do.call(cbind, lapply(wide_list$y, function(x) t(na.omit(t(x))))),
        trt = wide_list$trt
      )
    } else {
        stop(paste("combine_method should be one of ('avg', 'concat', 'avg_concat'),", 
            combine_method, " is not a valid combining option"))
    }

  mhat_pre <- do.call(cbind, mhat_pre)
  mhat_post <- do.call(cbind, mhat_post)
  mhat <- cbind(mhat_pre, mhat_post)

  return(list(wide_bal = wide_bal, mhat = mhat, V = V))

}

#' Get prediction of ATT or average outcome under control
#' @param object augsynth_multiout object
#' @param ... Optional arguments, including \itemize{\item{"att"}{Whether to return the ATT or average outcome under control}}
#'
#' @return Vector of predicted post-treatment control averages
#' @export
predict.augsynth_multiout <- function(object, ...) {
    if ("att" %in% names(list(...))) {
        att <- list(...)$att
    } else {
        att <- F
    }

    # call augsynth predict
    pred <- NextMethod()

    # separate out by outcome
    n_outs <- length(object$data_list$X)
    max_t <- max(sapply(1:n_outs, 
      function(k) ncol(object$data_list$X[[k]]) + ncol(object$data_list$y[[k]])))
    pred_reshape <- matrix(NA, ncol = n_outs, 
                               nrow = max_t)
    colnames <- lapply(1:n_outs, 
      function(k) colnames(cbind(object$data_list$X[[k]], 
                                 object$data_list$y[[k]])))
    rownames(pred_reshape) <- colnames[[which.max(sapply(colnames, length))]]
    colnames(pred_reshape) <- object$outcomes
    # get outcome names for predictions

    pre_outs <- do.call(c, 
                        sapply(1:n_outs, 
                               function(j) {
                                   rep(object$outcomes[j],
                                       ncol(object$data_list$X[[j]]))
                               }, simplify = FALSE))
    
    post_outs <- do.call(c,
                         sapply(1:n_outs, 
                                function(j) {
                                    rep(object$outcomes[j],
                                        ncol(object$data_list$y[[j]]))
                               }, simplify = FALSE))
    # print(pred)
    # print(cbind(names(pred), c(pre_outs, post_outs)))
    
    pred_reshape[cbind(names(pred), c(pre_outs, post_outs))] <- pred
    return(pred_reshape)
}


#' Print function for augsynth
#' @param x augsynth_multiout object
#' @param ... Optional arguments
#' @export
print.augsynth_multiout <- function(x, ...) {
    ## straight from lm
    cat("\nCall:\n", paste(deparse(x$call), sep="\n", collapse="\n"), "\n\n", sep="")

    ## print att estimates
    att <- predict(x, att = T)
    att_post <- data.frame(
        colMeans(att[as.numeric(rownames(att)) >= x$t_int,, drop = F]))
    names(att_post) <- c("")
    cat("Average ATT Estimate:\n")
    print(att_post)
    cat("\n\n")
}

#' Summary function for augsynth
#' @param object augsynth_multiout object
#' @param inf whether or not to perform inference
#' @param inf_typ Type of inference, default is "conformal"
#' @param grid_size Grid to compute prediction intervals over, default is 1 and only p-values are computed
#' @param ... Optional arguments, including \itemize{\item{"se"}{Whether to plot standard error}}
#' @export
summary.augsynth_multiout <- function(object, inf = T, inf_type = "conformal", grid_size = 1, ...) {
    

    summ <- list()

    if(inf) {
        if(inf_type == "conformal") {
          if(grid_size > 1) {
            cat(paste0("A grid size of ", grid_size, " will require ",
                           grid_size, "^", length(object$outcomes),
                           " = ", grid_size ^ length(object$outcomes),
                           " evaluations. This could take a while..."))
          }
          att_se <- conformal_inf_multiout(object, grid_size = grid_size, ...)
        } else {
          stop("Only conformal inference is supported for multiple outcomes")
        }
        # if(inf_type == "jackknife") {
        #     att_se <- jackknife_se_multiout(object)
        # } else if(inf_type == "jackknife+") {
        #   att_se <- time_jackknife_plus_multiout(object, ...)
        # } else if(inf_type == "conformal") {
        #   att_se <- conformal_inf_multiout(object, ...)
        # } else {
        #     stop(paste(inf_type, "is not a valid choice of 'inf_type'"))
        # }

        t_final <- nrow(att_se$att)

        att_df <- data.frame(att_se$att[1:(t_final - 1),, drop=F])
        names(att_df) <- object$outcomes
        att_df$Time <- object$data$time
        att_df <- att_df %>% gather(Outcome, Estimate, -Time)

        # if(inf_type == "jackknife") {
        #   se_df <- data.frame(att_se$se[1:(t_final - 1),, drop=F])
        #   names(se_df) <- object$outcomes
        #   se_df$Time <- object$data$time
        #   se_df <- se_df %>% gather(Outcome, Std.Error, -Time)

        #   att <- inner_join(att_df, se_df, by = c("Time", "Outcome"))
        # } else if(inf_type %in% c("conformal", "jackknife+")) {
          
        lb_df <- data.frame(att_se$lb[1:(t_final - 1),, drop=F])
        names(lb_df) <- object$outcomes
        lb_df$Time <- object$data$time
        lb_df <- lb_df %>% gather(Outcome, lower_bound, -Time)

        ub_df <- data.frame(att_se$ub[1:(t_final - 1),, drop=F])
        names(ub_df) <- object$outcomes
        ub_df$Time <- object$data$time
        ub_df <- ub_df %>% gather(Outcome, upper_bound, -Time)

        att <- inner_join(att_df, lb_df, by = c("Time", "Outcome")) %>%
            inner_join(ub_df, by = c("Time", "Outcome")) 
          # if(inf_type == "conformal") {

          pval_df <- data.frame(att_se$p_val[1:(t_final - 1),, drop=F])
          names(pval_df) <- object$outcomes
          pval_df$Time <- object$data$time
          pval_df <- pval_df %>% gather(Outcome, p_val, -Time)
          att <- inner_join(att, pval_df, by = c("Time", "Outcome")) 
          # }
        # }
        if(grid_size == 1) {
          att <- att %>% mutate(lower_bound = NA, upper_bound = NA)
        }

        att_avg <- data.frame(att_se$att[t_final,, drop = F])
        names(att_avg) <- object$outcomes
        att_avg <- gather(att_avg, Outcome, Estimate)

        # if(inf_type == "jackknife") {
        #   att_avg_se <- data.frame(att_se$se[t_final,, drop = F])
        #   names(att_avg_se) <- object$outcomes
        #   att_avg_se <- gather(att_avg_se, Outcome, Std.Error)
        #   average_att <- inner_join(att_avg, att_avg_se, by="Outcome")
        # } else if(inf_type %in% c("conformal", "jackknife+")){
        att_avg_lb <- data.frame(att_se$lb[t_final,, drop = F])
        names(att_avg_lb) <- object$outcomes
        att_avg_lb <- gather(att_avg_lb, Outcome, lower_bound)

        att_avg_ub <- data.frame(att_se$ub[t_final,, drop = F])
        names(att_avg_ub) <- object$outcomes
        att_avg_ub <- gather(att_avg_ub, Outcome, upper_bound)
        

        average_att <- inner_join(att_avg, att_avg_lb, by="Outcome") %>%
            inner_join(att_avg_ub, by = "Outcome")
          
          # if(inf_type == "conformal") {
        att_avg_pval <- data.frame(att_se$p_val[t_final,, drop = F])
        names(att_avg_pval) <- object$outcomes
        att_avg_pval <- gather(att_avg_pval, Outcome, p_val)

        average_att <- inner_join(average_att, att_avg_pval, by = "Outcome")

        if(grid_size == 1) {
          average_att <- average_att %>% mutate(lower_bound = NA, upper_bound = NA)
        }
          # }
        # } else {
        #   average_att <- gather(att_avg, Outcome, Estimate)
        # }
        

    } else {
        att_est <- predict(object, att = T)
        att_df <- data.frame(att_est)
        names(att_df) <- object$outcomes
        att_df$Time <- object$data$time
        att <- att_df %>% gather(Outcome, Estimate, -Time)
        att$Std.Error <- NA
        t_int <- min(sapply(object$data_list$X, ncol))
        att_avg <- data.frame(t(colMeans(
            att_est[t_int:nrow(att_est),, drop = F])))
        print(att_avg)
        names(att_avg) <- object$outcomes
        average_att <- gather(att_avg, Outcome, Estimate)
        average_att$Std.Error <- NA
    }

      # get average of all outcomes
    sds <- data.frame(Outcome = object$outcomes,
                      sdo = sapply(object$data_list$X,
                                    function(x)  sd(x[object$data_list$trt == 0,], na.rm = TRUE)))

    att %>%
      inner_join(sds, by = "Outcome") %>%
      mutate(Estimate = Estimate / sdo) %>%
      group_by(Time) %>%
      summarise(Estimate = mean(Estimate, na.rm = TRUE)) %>%
      mutate(Outcome = "Average") %>%
      bind_rows(att, .) -> att

    summ$att <- att
    summ$average_att <- average_att
    summ$t_int <- object$t_int
    summ$call <- object$call
    summ$l2_imbalance <- object$l2_imbalance
    summ$scaled_l2_imbalance <- object$scaled_l2_imbalance
    summ$inf_type <- inf_type
    ## get estimated bias

    if(object$progfunc == "Ridge") {
        mhat <- object$ridge_mhat
        w <- object$synw
    } else {
        mhat <- object$mhat
        w <- object$weights
    }
    trt <- object$data$trt
    m1 <- colMeans(mhat[trt==1,,drop=F])

    summ$bias_est <- m1 - t(mhat[trt==0,,drop=F]) %*% w
    if(object$progfunc == "None" | (!object$scm)) {
        summ$bias_est <- NA
    }

    
    class(summ) <- "summary.augsynth_multiout"
    return(summ)
}


#' Print function for summary function for augsynth
#' @param x summary.augsynth_multiout object
#' @param ... Optional arguments
#' @export
print.summary.augsynth_multiout <- function(x, ...) {
    ## straight from lm
    cat("\nCall:\n", paste(deparse(x$call), sep="\n", collapse="\n"), "\n\n", sep="")
    
    att_est <- x$att$Estimate
    ## get pre-treatment fit by outcome
    imbal <- x$att %>% 
        filter(Time < x$t_int) %>%
        group_by(Outcome) %>%
        summarise(Pre.RMSE = sqrt(mean(Estimate ^ 2, na.rm = TRUE)))

    cat(paste("Overall L2 Imbalance (Scaled):",
              format(round(x$l2_imbalance,3), nsmall=3), "  (",
              format(round(x$scaled_l2_imbalance,3), nsmall=3), ")\n\n",
            #   "Avg Estimated Bias: ",
            #   format(round(mean(summ$bias_est), 3),nsmall=3), "\n\n",
              sep=""))
    cat("Average ATT Estimate:\n")
    print(inner_join(x$average_att, imbal, by = "Outcome"))
    cat("\n\n")
}


#' Plot function for summary function for augsynth
#' @importFrom graphics plot
#' @param x summary.augsynth_multiout object
#' @param inf Boolean, whether to plot uncertainty intervals, default TRUE
#' @param plt_avg Boolean, whether to plot the average of the outcomes, default FALSE
#' @param ... Optional arguments for summary function
#' 
#' @export
plot.augsynth_multiout  <- function(x, inf = T, plt_avg = F, ...) {
  plot(summary(x, ...), inf =  inf, plt_avg = plt_avg)
}

#' Plot function for summary function for augsynth
#' @param x summary.augsynth_multiout object
#' @param inf Boolean, whether to plot uncertainty intervals, default TRUE
#' @param plt_avg Boolean, whether to plot the average of the outcomes, default FALSE
#' 
#' @export
plot.summary.augsynth_multiout <- function(x, inf = F, plt_avg = F, ...) {
    if(plt_avg) {
      p <- x$att %>%
        ggplot2::ggplot(ggplot2::aes(x=Time, y=Estimate))
    } else {
      p <- x$att %>%
        filter(Outcome != "Average") %>% 
        ggplot2::ggplot(ggplot2::aes(x=Time, y=Estimate))
    }
    if(inf) {
      if(x$inf_type == "jackknife") {
        p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin=Estimate-2*Std.Error,
                        ymax=Estimate+2*Std.Error),
                    alpha=0.2, data = . %>% filter(Outcome != "Average"))
      } else if(x$inf_type %in% c("conformal", "jackknife+")) {
        p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin=lower_bound,
                        ymax=upper_bound),
                    alpha=0.2, data =  . %>% filter(Outcome != "Average"))
      }

    }
    p + ggplot2::geom_line() +
        ggplot2::geom_vline(xintercept=x$t_int, lty=2) +
        ggplot2::geom_hline(yintercept=0, lty=2) +
        ggplot2::facet_wrap(~ Outcome, scales = "free_y") +
        ggplot2::theme_bw()

}

================================================
FILE: R/multi_synth_qp.R
================================================
################################################################################
## Solve the multisynth problem as a QP
################################################################################


#' Internal function to fit synth with staggered adoption with a QP solver
#' @param X Matrix of pre-final intervention outcomes, or list of such matrices after transformations
#' @param trt Vector of treatment levels/times
#' @param mask Matrix with indicators for observed pre-intervention times for each treatment group
#' @param n_leads Number of time periods after treatment to impute control values.
#'            For units treated at time T_j, all units treated after T_j + n_leads
#'            will be used as control values. If larger than the number of periods,
#'            only never never treated units (pure controls) will be used as comparison units
#' @param n_lags Number of pre-treatment periods to balance, default is to balance all periods
#' @param relative Whether to re-index time according to treatment date, default T
#' @param nu Hyper-parameter that controls trade-off between overall and individual balance.
#'              Larger values of nu place more emphasis on individual balance.
#'              Balance measure is
#'                nu ||global|| + (1-nu) ||individual||
#'              Default: 0
#' @param lambda Regularization hyper-parameter. Default, 0
#' @param time_cohort Whether to average synthetic controls into time cohorts
#' @param norm_pool Normalizing value for pooled objective, default: number of treated units squared
#' @param norm_sep Normalizing value for separate objective, default: number of treated units
#' @param verbose Whether to print logs for osqp
#' @param eps_rel Relative error tolerance for osqp
#' @param eps_abs Absolute error tolerance for osqp
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Matrix of unit weights}
#'          \item{"imbalance"}{Matrix of overall and group specific imbalance}
#'          \item{"global_l2"}{Imbalance overall}
#'          \item{"ind_l2"}{Matrix of imbalance for each group}
#'         }
multisynth_qp <- function(X, trt, mask, Z = NULL, n_leads=NULL, n_lags=NULL,
                          relative=T, nu=0, lambda=0, V = NULL, time_cohort = FALSE,
                          donors = NULL, norm_pool = NULL, norm_sep = NULL,
                          verbose = FALSE, 
                          eps_rel=1e-4, eps_abs=1e-4) {

    # if Z has no columns then set it to NULL
    if(!is.null(Z)) {
      if(ncol(Z) == 0) {
        Z <- NULL
      }
    }

    n <- if(typeof(X) == "list") dim(X[[1]])[1] else dim(X)[1]
    d <- if(typeof(X) == "list") dim(X[[1]])[2] else dim(X)[2]

    if(is.null(n_leads)) {
        n_leads <- d+1
    } else if(n_leads > d) {
        n_leads <- d+1
    }
    if(is.null(n_lags)) {
        n_lags <- d
    } else if(n_lags > d) {
        n_lags <- d
    }
    V <- make_V_matrix(n_lags, V)
    ## treatment times
    if(time_cohort) {
        grps <- unique(trt[is.finite(trt)])
        which_t <- lapply(grps, function(tj) (1:n)[trt == tj])
        # if doing a time cohort, convert the boolean mask
        mask <- unique(mask)
    } else {
        grps <- trt[is.finite(trt)]
        which_t <- (1:n)[is.finite(trt)]
    }


    J <- length(grps)
    if(is.null(norm_sep)) {
      norm_sep <- 1#J
    }
    if(is.null(norm_pool)) {
      norm_pool <- 1#J ^ 2
    }
    n1 <- sapply(1:J, function(j) length(which_t[[j]]))

    # if no specific donors passed in,
    # then all donors treated after n_lags are eligible
    if(is.null(donors)) {
      donors <- get_eligible_donors(trt, time_cohort, n_leads)
    }

    ## handle X differently if it is a list
    if(typeof(X) == "list") {
        x_t <- lapply(1:J, function(j) colSums(X[[j]][which_t[[j]], mask[j,]==1, drop=F]))
        
        # Xc contains pre-treatment data for valid donor units
        Xc <- lapply(1:nrow(mask),
                 function(j) X[[j]][donors[[j]], mask[j,]==1, drop=F])
        
        # std dev of outcomes for first treatment time
        sdx <- sd(X[[1]][is.finite(trt)])
    } else {
        x_t <- lapply(1:J, function(j) colSums(X[which_t[[j]], mask[j,]==1, drop=F]))        
        
        # Xc contains pre-treatment data for valid donor units
        Xc <- lapply(1:nrow(mask),
                 function(j) X[donors[[j]], mask[j,]==1, drop=F])

        # std dev of outcomes
        sdx <- sd(X[is.finite(trt)])
    }

    # get covariates for donors
    if(!is.null(Z)) {
      # scale covariates to have same variance as pure control outcomes
      Z_scale <- sdx * apply(Z, 2, 
        function(z) (z - mean(z[!is.finite(trt)])) / sd(z[!is.finite(trt)]))

      z_t <- lapply(1:J, function(j) colSums(Z_scale[which_t[[j]], , drop = F]))
      Zc <- lapply(1:J, function(j) Z_scale[donors[[j]], , drop = F])
    } else {
      z_t <- lapply(1:J, function(j) c(0))
      Zc <- lapply(1:J, function(j) Matrix::Matrix(0,
                                                   nrow = sum(donors[[j]]),
                                                   ncol = 1))
    }
    dz <- ncol(Zc[[1]])

    # replace NA values with zero
    x_t <- lapply(x_t, function(xtk) tidyr::replace_na(xtk, 0))
    Xc <- lapply(Xc, function(xck) apply(xck, 2, tidyr::replace_na, 0))
    ## make matrices for QP
    n0s <- sapply(Xc, nrow)
    if(any(n0s == 0)) {
      stop("Some treated units have no possible donor units!")
    }
    n0 <- sum(n0s)

    const_mats <- make_constraint_mats(trt, grps, n_leads, n_lags, Xc, Zc, d, n1)
    Amat <- const_mats$Amat
    lvec <- const_mats$lvec
    uvec <- const_mats$uvec

    ## quadratic balance measures

    qvec <- make_qvec(Xc, x_t, z_t, nu, n_lags, d, V, norm_pool, norm_sep)

    Pmat <- make_Pmat(Xc, x_t, dz, nu, n_lags, lambda, d, V, norm_pool, norm_sep)

    ## Optimize
    settings <- do.call(osqp::osqpSettings, 
                        c(list(verbose = verbose, 
                               eps_rel = eps_rel, 
                               eps_abs = eps_abs)))

    out <- osqp::solve_osqp(Pmat, qvec, Amat, lvec, uvec, pars = settings)

    ## get weights
    total_ctrls <- n0 * J
    weights <- matrix(out$x[1:total_ctrls], nrow = n0)
    nj0 <- as.numeric(lapply(Xc, nrow))
    nj0cumsum <- c(0, cumsum(nj0))
    imbalance <- vapply(1:J,
                        function(j) {
                            dj <- length(x_t[[j]])
                            ndim <- min(dj, n_lags)
                            c(numeric(d-ndim),
                            x_t[[j]][(dj-ndim+1):dj] -
                                t(Xc[[j]][,(dj-ndim+1):dj, drop = F]) %*% 
                                out$x[(nj0cumsum[j] + 1):nj0cumsum[j + 1]])
                        },
                        numeric(d))
    avg_imbal <- rowMeans(t(t(imbalance)))

    Vsq <- t(V) %*% V
    global_l2 <- c(sqrt(t(avg_imbal[(d - n_lags + 1):d]) %*% Vsq %*%
                          avg_imbal[(d - n_lags + 1):d])) / sqrt(d)
    avg_l2 <- mean(apply(imbalance, 2,
                  function(x) c(sqrt(t(x[(d - n_lags + 1):d]) %*% Vsq %*%
                                x[(d - n_lags + 1):d]))))
    ind_l2 <- sqrt(mean(
      apply(imbalance, 2,
      function(x) c(x[(d - n_lags + 1):d] %*% Vsq %*% x[(d - n_lags + 1):d]) /
          sum(x[(d - n_lags + 1):d] != 0))))
    # pad weights with zeros for treated units and divide by number of treated units
    vapply(1:J,
           function(j) {
             weightj <-  numeric(n)
             weightj[donors[[j]]] <- out$x[(nj0cumsum[j] + 1):nj0cumsum[j + 1]]
             weightj
           },
           numeric(n)) -> weights

    weights <- t(t(weights) / n1)
    # manually enforce non-negativity constraint
    # (osqp solver only enforces constraint up to a tolerance)
    weights <- pmax(weights, 0)

    output <- list(weights = weights,
                   imbalance = cbind(avg_imbal, imbalance),
                   global_l2 = global_l2,
                   ind_l2 = ind_l2,
                   avg_l2 = avg_l2,
                   V = V)

    if(!is.null(Z)) {
      # imbalance in auxiliary covariates
      z_t <- sapply(1:J, function(j) colMeans(Z[which_t[[j]], , drop = F]))
      imbal_z <- z_t - t(Z) %*% weights
      avg_imbal_z <- rowSums(t(t(imbal_z) * n1)) / sum(n1)
      global_l2_z <- sqrt(sum(avg_imbal_z ^ 2))
      ind_l2_z <- sum(apply(imbal_z, 2, function(x) sqrt(sum(x ^ 2))))
      imbal_z <- cbind(avg_imbal_z, imbal_z)
      rownames(imbal_z) <- colnames(Z)

      output$imbalance_aux <- imbal_z
      output$global_l2_aux <- global_l2_z
      output$ind_l2_aux <- ind_l2_z
    }
    

    return(output)
}


#' Create constraint matrices for multisynth QP
#' @param trt Vector of treatment levels/times
#' @param grps Treatment times
#' @param n_leads Number of time periods after treatment to impute control values.
#' @param n_lags Number of pre-treatment periods to balance
#' @param Xc List of outcomes for possible comparison units
#' @param d Max number of lagged outcomes
#' @param n1 Vector of number of treated units per cohort
#' @noRd
#' @return 
#'         \itemize{
#'          \item{"Amat"}{Linear constraint matrix}
#'          \item{"lvec"}{Lower bounds for linear constraints}
#'          \item{"uvec"}{Upper bounds for linear constraints}
#'         }
make_constraint_mats <- function(trt, grps, n_leads, n_lags, Xc, Zc, d, n1) {

    J <- length(grps)
    idxs0  <- trt  > n_leads + min(grps)

    n0 <- sum(idxs0)

    ## sum to n1 constraint
    A1 <- do.call(Matrix::bdiag, lapply(1:(J), function(x) rep(1, n0)))
    A1 <- Matrix::bdiag(lapply(1:J, function(j) rep(1, nrow(Xc[[j]]))))
    
    Amat <- as.matrix(Matrix::t(A1))
    Amat <- Matrix::rbind2(Matrix::t(A1), Matrix::Diagonal(nrow(A1)))

    dz <- ncol(Zc[[1]])
    # constraints for transformed weights
    A_trans1 <- do.call(Matrix::bdiag,
                       lapply(1:J,
                        function(j)  {
                            dj <- ncol(Xc[[j]])
                            ndim <- min(dj, n_lags)
                            max_dim <- min(d, n_lags)
                            mat <- Xc[[j]][, (dj - ndim + 1):dj, drop = F]
                            n0 <- nrow(mat)
                            zero_mat <- Matrix::Matrix(0, n0, max_dim - ndim)
                            Matrix::t(cbind(zero_mat, mat))
                       }))

    # sum of total number of pre-periods
    sum_tj <- min(d, n_lags) * J
    A_trans2 <- - Matrix::Diagonal(sum_tj)
    A_trans <- Matrix::cbind2(
      Matrix::cbind2(A_trans1, A_trans2),
      Matrix::Matrix(0, nrow = nrow(A_trans1), ncol = dz * J))

    # constraints for transformed weights on auxiliary covariates
    A_transz <- Matrix::t(Matrix::bdiag(Zc))
    A_transz <- Matrix::cbind2(
      Matrix::cbind2(A_transz, 
                     Matrix::Matrix(0, nrow = nrow(A_transz), ncol = sum_tj)),
      -Matrix::Diagonal(dz * J))

    # add in zero columns for transformated weights
    Amat <- Matrix::cbind2(Amat, 
                           Matrix::Matrix(0,
                                          nrow = nrow(Amat),
                                          ncol = sum_tj + dz * J))
    Amat <- Matrix::rbind2(Matrix::rbind2(Amat, A_trans), A_transz)

    lvec <- c(n1, # sum to n1 constraint
              numeric(nrow(A1)), # lower bound by zero
              numeric(sum_tj), # constrain transformed weights
              numeric(dz * J) # constrain transformed weights
             ) 
    
    uvec <- c(n1, #sum to n1 constraint
              rep(Inf, nrow(A1)),
              numeric(sum_tj), # constrain transformed weights
              numeric(dz * J) # constrain transformed weights
              )


    return(list(Amat = Amat, lvec = lvec, uvec = uvec))
}

#' Make the vector in the QP
#' @param Xc List of outcomes for possible comparison units
#' @param x_t List of outcomes for treated units
#' @param nu Hyperparameter between global and individual balance
#' @param n_lags Number of lags to balance
#' @param d Largest number of pre-intervention time periods
#' @param V Scaling matrix
#' @param norm_pool Normalizing value for pooled objective
#' @param norm_sep Normalizing value for separate objective
#' @noRd
make_qvec <- function(Xc, x_t, z_t, nu, n_lags, d, V, norm_pool, norm_sep) {

    J <- length(x_t)
    Vsq <- t(V) %*% V
    qvec <- lapply(1:J,
                   function(j) {
                       dj <- length(x_t[[j]])
                       ndim <- min(dj, n_lags)
                       max_dim <- min(d, n_lags)
                       vec <- x_t[[j]][(dj - ndim + 1):dj] / ndim
                       Vsq %*% c(numeric(max_dim - ndim), vec)
                   })

    avg_target_vec <- lapply(x_t,
                            function(xtk) {
                                dk <- length(xtk)
                                ndim <- min(dk, n_lags)
                                max_dim <- min(d, n_lags)
                                c(numeric(max_dim - ndim), 
                                    xtk[(dk - ndim + 1):dk])
                            }) %>% reduce(`+`) %*% Vsq
    qvec_avg <- rep(avg_target_vec, J)
    # qvec <- - (nu * qvec_avg / n_lags + (1 - nu) * reduce(qvec, c))
    # qvec <- - (nu * qvec_avg / (J ^ 2 * n_lags) +
    #           (1 - nu) * reduce(qvec, c) / J)
    qvec <- - (nu * qvec_avg / (norm_pool * n_lags * J ^ 2) +
               (1 - nu) * reduce(qvec, c) / (norm_sep * J))

    qvec_avg_z <- z_t %>% reduce(`+`)
    qvec_avg_z <- rep(qvec_avg_z, J)
    # qvec_z <- - (nu * qvec_avg_z + (1 - nu) * reduce(z_t, c)) / length(z_t[[1]])
    # qvec_z <- - (nu * qvec_avg_z / J ^2 +
    #              (1 - nu) * reduce(z_t, c) / J) / length(z_t[[1]])
    qvec_z <- - (nu * qvec_avg_z / (norm_pool * J ^ 2) +
                 (1 - nu) * reduce(z_t, c) / (norm_sep * J)) / length(z_t[[1]])

    total_ctrls <- lapply(Xc, nrow) %>% reduce(`+`)
    return(c(numeric(total_ctrls), qvec, qvec_z))
}


#' Make the matrix in the QP
#' @param Xc List of outcomes for possible comparison units
#' @param x_t List of outcomes for treated units
#' @param nu Hyperparameter between global and individual balance
#' @param n_lags Number of lags to balance
#' @param lambda Regularization hyperparameter
#' @param d Largest number of pre-intervention time periods
#' @param V Scaling matrix
#' @param norm_pool Normalizing value for pooled objective
#' @param norm_sep Normalizing value for separate objective
#' @noRd
make_Pmat <- function(Xc, x_t, dz, nu, n_lags, lambda, d, V,
                      norm_pool, norm_sep) {

    J <- length(x_t)

    Vsq <- t(V) %*% V
    ndims <- vapply(1:J,
                    function(j) min(length(x_t[[j]]), n_lags),
                    numeric(1))
    max_dim <- min(d, n_lags)
    total_dim <- sum(ndims)
    total_dim <- max_dim * J
    V1 <- Matrix::bdiag(lapply(ndims, 
                        function(ndim) Matrix::Diagonal(max_dim, 1 / ndim)))
    V1 <- Matrix::bdiag(lapply(ndims, function(ndim) Vsq / ndim))
    tile_sparse <- function(j) {
        kronecker(Matrix::Matrix(1, nrow = j, ncol = j), Vsq)
    }
    tile_sparse_cov <- function(d, j) {
        kronecker(Matrix::Matrix(1, nrow = j, ncol = j),
                  Matrix::Diagonal(d))
    }
    V2 <- tile_sparse(J) / n_lags
    # Pmat <- nu * V2 + (1 - nu) * V1
    # Pmat <- nu * V2 / J ^ 2 + (1 - nu) * V1 / J
    Pmat <- nu * V2 / (norm_pool * J ^ 2) + (1 - nu) * V1 / (norm_sep * J)
    V1_z <- Matrix::Diagonal(dz * J, 1 / dz)
    V2_z <- tile_sparse_cov(dz, J) / dz
    # Pmat_z <- nu * V2_z + (1 - nu) * V1_z
    # Pmat_z <- nu * V2_z / J ^ 2 + (1 - nu) * V1_z / J
    Pmat_z <- nu * V2_z / (norm_pool * J ^ 2) + (1 - nu) * V1_z / (norm_sep * J)
    # combine
    total_ctrls <- lapply(Xc, nrow) %>% reduce(`+`)
    Pmat <- Matrix::bdiag(Matrix::Matrix(0, nrow = total_ctrls,
                                         ncol = total_ctrls),
                          Pmat, Pmat_z)
    I0 <- Matrix::bdiag(Matrix::Diagonal(total_ctrls),
                        Matrix::Matrix(0, nrow = total_dim + dz * J,
                                          ncol = total_dim + dz * J))
    return(Pmat + lambda * I0)

}

================================================
FILE: R/multisynth_class.R
================================================
################################################################################
## Fitting, plotting, summarizing staggered synth
################################################################################

#' Fit staggered synth
#' @param form outcome ~ treatment | weighting covariates | approximate matching covaraites | exact matching covariates
#' \itemize{
#'    \item{outcome}{Name of the outcome of interest}
#'    \item{treatment}{Name of the treatment assignment variable}
#'    \item{weighting covariates}{Auxiliary covariates to weight on}
#'    \item{approximate matching covariates}{Auxiliary covariates to approximately match one before weighting}
#'    \item{exact matching covariates}{Auxiliary covariates to exactly match on before weighting}
#' }
#' If covariates are time-varying, their average value before the first unit is treated will be used. This can be changed by supplying a custom aggregation function to cov_agg.
#' @param unit Name of unit column
#' @param time Name of time column
#' @param data Panel data as dataframe
#' @param n_leads How long past treatment effects should be estimated for, default is number of post treatment periods for last treated unit
#' @param n_lags Number of pre-treatment periods to balance, default is to balance all periods
#' @param nu Fraction of balance for individual balance
#' @param lambda Regularization hyperparameter, default = 0
#' @param V Scaling matrix for synth optimization, default NULL is identity
#' @param fixedeff Whether to include a unit fixed effect, default TRUE
#' @param n_factors Number of factors for interactive fixed effects, setting to NULL fits with CV, default is 0
#' @param scm Whether to fit scm weights
#' @param time_cohort Whether to average synthetic controls into time cohorts, default FALSE
#' @param cov_agg Covariate aggregation function
#' @param eps_abs Absolute error tolerance for osqp
#' @param eps_rel Relative error tolerance for osqp
#' @param verbose Whether to print logs for osqp
#' @param ... Extra arguments
#' 
#' @return multisynth object that contains:
#'         \itemize{
#'          \item{"weights"}{weights matrix where each column is a set of weights for a treated unit}
#'          \item{"data"}{Panel data as matrices}
#'          \item{"imbalance"}{Matrix of treatment minus synthetic control for pre-treatment time periods, each column corresponds to a treated unit}
#'          \item{"global_l2"}{L2 imbalance for the pooled synthetic control}
#'          \item{"scaled_global_l2"}{L2 imbalance for the pooled synthetic control, scaled by the imbalance for unitform weights}
#'          \item{"ind_l2"}{Average L2 imbalance for the individual synthetic controls}
#'          \item{"scaled_ind_l2"}{Average L2 imbalance for the individual synthetic controls, scaled by the imbalance for unitform weights}
#'         \item{"n_leads", "n_lags"}{Number of post treatment outcomes (leads) and pre-treatment outcomes (lags) to include in the analysis}
#'          \item{"nu"}{Fraction of balance for individual balance}
#'          \item{"lambda"}{Regularization hyperparameter}
#'          \item{"scm"}{Whether to fit scm weights}
#'          \item{"grps"}{Time periods for treated units}
#'          \item{"y0hat"}{Pilot estimates of control outcomes}
#'          \item{"residuals"}{Difference between the observed outcomes and the pilot estimates}
#'          \item{"n_factors"}{Number of factors for interactive fixed effects}
#'         }
#' @export
multisynth <- function(form, unit, time, data,
                       n_leads=NULL, n_lags=NULL,
                       nu=NULL, lambda=0, V = NULL,
                       fixedeff = TRUE,
                       n_factors=0,
                       scm=T,
                       time_cohort = F,
                       how_match = "knn",
                       cov_agg = NULL,
                       eps_abs = 1e-4,
                       eps_rel = 1e-4,
                       verbose = FALSE, ...) {
    call_name <- match.call()

    form <- Formula::Formula(form)
    unit <- enquo(unit)
    time <- enquo(time)

    ## format data
    outcome <- terms(formula(form, rhs=1))[[2]]
    trt <- terms(formula(form, rhs=1))[[3]]
    wide <- format_data_stag(outcome, trt, unit, time, data)

    check_data_stag(wide, fixedeff, n_leads, n_lags)

    force <- if(fixedeff) 3 else 2

    # get covariates
    if(length(form)[2] == 2) {
      Z <- extract_covariates(form, unit, time, wide$time[min(wide$trt) + 1],
                              data, cov_agg)
    } else if(length(form)[2] == 3) {
      app_form <- Formula::Formula(formula(form, rhs = 1:2))
      Z_weight <- extract_covariates(app_form, unit, time,
                                  wide$time[min(wide$trt) + 1],
                                  data, cov_agg)
      exact_form <- Formula::Formula(formula(form, rhs = c(1,3)))
      Z_match<- extract_covariates(exact_form, unit, time,
                                  wide$time[min(wide$trt) + 1],
                                  data, cov_agg)
      Z <- cbind(Z_weight, Z_match)
      wide$match_covariates <- colnames(Z_match)
    } else if(length(form)[2] == 4) {
      if(time_cohort) {
        stop("Aggregating by time cohort and matching on covariates are not ",
             "implemented together. If matching then you cannot aggregate ",
             "by time cohort.")
      }
      weight_form <- Formula::Formula(formula(form, rhs = c(1,2)))
      Z_weight <- extract_covariates(weight_form, unit, time,
                                      wide$time[min(wide$trt) + 1],
                                      data, cov_agg)
      app_form <- Formula::Formula(formula(form, rhs = c(1,3)))
      Z_app <- extract_covariates(app_form, unit, time,
                                  wide$time[min(wide$trt) + 1],
                                  data, cov_agg)
      exact_form <- Formula::Formula(formula(form, rhs = c(1,4)))
      Z_exact <- extract_covariates(exact_form, unit, time,
                                  wide$time[min(wide$trt) + 1],
                                  data, cov_agg)
      Z <- cbind(Z_weight, Z_app, Z_exact)
      wide$exact_covariates <- colnames(Z_exact)
      wide$match_covariates <- c(colnames(Z_app), wide$exact_covariates)
    } else {
        Z <- NULL
    }
    wide$Z <- Z

    # if n_leads is NULL set it to be the largest possible number of leads
    # for the last treated unit
    if(is.null(n_leads)) {
        n_leads <- ncol(wide$y)
    } else if(n_leads > max(apply(1-wide$mask, 1, sum, na.rm = T)) +
                                                              ncol(wide$y)) {
        n_leads <- max(apply(1-wide$mask, 1, sum, na.rm = T)) + ncol(wide$y)
    }

    ## if n_lags is NULL set it to the largest number of pre-treatment periods
    if(is.null(n_lags)) {
        n_lags <- ncol(wide$X)
    } else if(n_lags > ncol(wide$X)) {
        n_lags <- ncol(wide$X)
    }

    long_df <- data[c(quo_name(unit), quo_name(time), as.character(trt), as.character(outcome))]

    msynth <- multisynth_formatted(wide = wide, relative = T,
                                n_leads = n_leads, n_lags = n_lags,
                                nu = nu, lambda = lambda, V = V,
                                force = force, n_factors = n_factors,
                                scm = scm, time_cohort = time_cohort,
                                time_w = F, lambda_t = 0,
                                fit_resids = TRUE, eps_abs = eps_abs,
                                eps_rel = eps_rel, verbose = verbose, long_df = long_df, 
                                how_match = how_match, ...)
    
   
    units <- data %>% arrange(!!unit) %>% distinct(!!unit) %>% pull(!!unit)
    rownames(msynth$weights) <- units


    if(scm) {
        ## Get imbalance for uniform weights on raw data

        ## TODO: Get rid of this stupid hack of just fitting the weights again with big lambda
        unif <- multisynth_qp(X=wide$X, ## X=residuals[,1:ncol(wide$X)],
                            trt=wide$trt,
                            mask=wide$mask,
                            Z = Z[, ! colnames(Z) %in% wide$match_covariates,
                                  drop = F],
                            n_leads=n_leads,
                            n_lags=n_lags,
                            relative=T,
                            nu=0, lambda=1e10,
                            V = V,
                            time_cohort = time_cohort,
                            donors = msynth$donors,
                            eps_rel = eps_rel, 
                            eps_abs = eps_abs,
                            verbose = verbose)
        ## scaled global balance
        ## msynth$scaled_global_l2 <- msynth$global_l2  / sqrt(sum(unif$imbalance[,1]^2))
        msynth$scaled_global_l2 <- msynth$global_l2  / unif$global_l2

        ## balance for individual estimates
        ## msynth$scaled_ind_l2 <- msynth$ind_l2  / sqrt(sum(unif$imbalance[,-1]^2))
        msynth$scaled_ind_l2 <- msynth$ind_l2  / unif$ind_l2
    }

    msynth$call <- call_name

    return(msynth)

}


#' Internal funciton to fit staggered synth with formatted data
#' @param wide List containing data elements
#' @param relative Whether to compute balance by relative time
#' @param n_leads How long past treatment effects should be estimated for
#' @param n_lags Number of pre-treatment periods to balance, default is to balance all periods
#' @param nu Fraction of balance for individual balance
#' @param lambda Regularization hyperparameter, default = 0
#' @param V Scaling matrix for synth optimization, default NULL is identity
#' @param force c(0,1,2,3) what type of fixed effects to include
#' @param n_factors Number of factors for interactive fixed effects, default does CV
#' @param scm Whether to fit scm weights
#' @param time_cohort Whether to average synthetic controls into time cohorts
#' @param time_w Whether to fit time weights
#' @param lambda_t Regularization for time regression
#' @param fit_resids Whether to fit SCM on the residuals or not
#' @param eps_abs Absolute error tolerance for osqp
#' @param eps_rel Relative error tolerance for osqp
#' @param verbose Whether to print logs for osqp
#' @param long_df A long dataframe with 4 columns in the order unit, time, trt, outcome
#' @param ... Extra arguments
#' @noRd
#' @return multisynth object
multisynth_formatted <- function(wide, relative=T, n_leads, n_lags,
                       nu, lambda, V,
                       force,
                       n_factors,
                       scm, time_cohort, 
                       time_w, lambda_t,
                       fit_resids,
                       eps_abs, eps_rel,
                       verbose, long_df, 
                       how_match, ...) {
    ## average together treatment groups
    ## grps <- unique(wide$trt) %>% sort()
    if(time_cohort) {
        grps <- unique(wide$trt[is.finite(wide$trt)])
    } else {
        grps <- wide$trt[is.finite(wide$trt)]
    }
    J <- length(grps)

    ## fit outcome models
    if(time_w) {
        # Autoregressive model
        out <- fit_time_reg(cbind(wide$X, wide$y), wide$trt,
                            n_leads, lambda_t, ...)
        y0hat <- out$y0hat
        residuals <- out$residuals
        params <- out$time_weights
    } else if(is.null(n_factors)) {
        out <- tryCatch({
            fit_gsynth_multi(long_df, cbind(wide$X, wide$y), wide$trt, force=force)
        }, error = function(error_condition) {
            stop("Cannot run CV because there are too few pre-treatment periods.")
        })

        y0hat <- out$y0hat
        params <- out$params
        n_factors <- ncol(params$factor)
        ## get residuals from outcome model
        residuals <- cbind(wide$X, wide$y) - y0hat
        
    } else if (n_factors != 0) {
        ## if number of factors is provided don't do CV
        out <- fit_gsynth_multi(long_df, cbind(wide$X, wide$y), wide$trt,
                                r=n_factors, CV=0, force=force)
        y0hat <- out$y0hat
        params <- out$params        
        
        ## get residuals from outcome model
        residuals <- cbind(wide$X, wide$y) - y0hat
    } else if(force == 0 & n_factors == 0) {
        # if no fixed effects or factors, just take out 
        # control averages at each time point
        # time fixed effects from pure controls
        pure_ctrl <- cbind(wide$X, wide$y)[!is.finite(wide$trt), , drop = F]
        y0hat <- matrix(colMeans(pure_ctrl, na.rm = TRUE),
                          nrow = nrow(wide$X), ncol = ncol(pure_ctrl), 
                          byrow = T)
        residuals <- cbind(wide$X, wide$y) - y0hat
        params <- NULL
    } else {
        ## take out pre-treatment averages
        fullmask <- cbind(wide$mask, matrix(0, nrow=nrow(wide$mask),
                                            ncol=ncol(wide$y)))
        out <- fit_feff(cbind(wide$X, wide$y), wide$trt, fullmask, force, time_cohort)
        y0hat <- out$y0hat
        residuals <- out$residuals
        params <- NULL
    }

    ## balance the residuals
    if(fit_resids) {
        if(time_w) {
            # fit scm on residuals after taking out unit fixed effects
            fullmask <- cbind(wide$mask, matrix(0, nrow=nrow(wide$mask),
                                            ncol=ncol(wide$y)))
            out <- fit_feff(cbind(wide$X, wide$y), wide$trt, fullmask, force, time_cohort)
            bal_mat <- lapply(out$residuals, function(x) x[,1:ncol(wide$X)])
        } else if(typeof(residuals) == "list") {
            bal_mat <- lapply(residuals, function(x) x[,1:ncol(wide$X)])
        } else {
            bal_mat <- residuals[,1:ncol(wide$X)]
        }
    } else {
        # if not balancing residuals, then take out control averages
        # for each time
        ctrl_avg <- matrix(colMeans(wide$X[!is.finite(wide$trt), , drop = F]),
                          nrow = nrow(wide$X), ncol = ncol(wide$X), byrow = T)
        bal_mat <- wide$X - ctrl_avg
        bal_mat <- wide$X
    }
    

    if(scm) {

        # get eligible set of donor units based on covariates
        donors <- get_donors(wide$X, wide$y, wide$trt,
                             wide$Z[, colnames(wide$Z) %in% 
                                      wide$match_covariates, drop = F],
                             time_cohort, n_lags, n_leads, how = how_match,
                             exact_covariates = wide$exact_covariates, ...)
        # run separate synth for scaling
        sep_fit <- multisynth_qp(X=bal_mat,
                                    trt=wide$trt,
                                    mask=wide$mask,
                                    Z = wide$Z[, !colnames(wide$Z) %in%
                                                  wide$match_covariates,
                                                  drop = F],
                                    n_leads=n_leads,
                                    n_lags=n_lags,
                                    relative=relative,
                                    nu=0, lambda=lambda,
                                    V = V,
                                    time_cohort = time_cohort,
                                    donors = donors,
                                    eps_rel = eps_rel,
                                    eps_abs = eps_abs,
                                    verbose = verbose)
        # if no nu value is provided, use default based on
        # global and individual imbalance for separate synth
        if(is.null(nu)) {
            # select nu by triangle inequality ratio
            glbl <- sep_fit$global_l2 * sqrt(nrow(sep_fit$imbalance))
            ind <- sep_fit$avg_l2
            nu <- glbl / ind

        }

        msynth <- multisynth_qp(X=bal_mat,
                                trt=wide$trt,
                                mask=wide$mask,
                                Z = wide$Z[, !colnames(wide$Z) %in%
                                                  wide$match_covariates,
                                                  drop = F],
                                n_leads=n_leads,
                                n_lags=n_lags,
                                relative=relative,
                                nu=nu, lambda=lambda,
                                V = V,
                                time_cohort = time_cohort,
                                donors = donors,
                                norm_pool = sep_fit$global_l2 ^ 2,
                                norm_sep = sep_fit$ind_l2 ^ 2,
                                eps_rel = eps_rel,
                                eps_abs = eps_abs,
                                verbose = verbose)
    } else {
        msynth <- list(weights = matrix(0, nrow = nrow(wide$X), ncol = J),
                       imbalance=NA,
                       global_l2=NA,
                       ind_l2=NA)
    }

    ## put in data and hyperparams
    msynth$data <- wide
    msynth$relative <- relative
    msynth$n_leads <- n_leads
    msynth$n_lags <- n_lags
    msynth$nu <- nu
    msynth$lambda <- lambda
    msynth$scm <- scm
    msynth$time_cohort <- time_cohort


    msynth$grps <- grps
    msynth$y0hat <- y0hat
    msynth$residuals <- residuals

    msynth$n_factors <- n_factors
    msynth$force <- force


    ## outcome model parameters
    msynth$params <- params

    # more arguments
    msynth$scm <- scm
    msynth$time_w <- time_w
    msynth$lambda_t <- lambda_t
    msynth$fit_resids <- fit_resids
    msynth$extra_pars <- c(list(eps_abs = eps_abs, 
                                eps_rel = eps_rel, 
                                verbose = verbose), 
                           list(...))
    msynth$long_df <- long_df

    msynth$how_match <- how_match
    msynth$donors <- donors
    ##format output
    class(msynth) <- "multisynth"
    return(msynth)
}


#' Get prediction of average outcome under control or ATT
#' @param object Fit multisynth object
#' @param att If TRUE, return the ATT, if FALSE, return imputed counterfactual
#' @param att_weight Weights to place on individual units/cohorts when averaging
#' @param bs_weight Weight to perturb units by for weighted bootstrap
#' @param ... Optional arguments
#'
#' @return Matrix of predicted post-treatment control outcomes for each treated unit
#' @export
predict.multisynth <- function(object, att = F, att_weight = NULL, bs_weight = NULL, ...) {

    multisynth <- object
    relative <- T
    
    time_cohort <- multisynth$time_cohort
    if(is.null(relative)) {
        relative <- multisynth$relative
    }
    n_leads <- multisynth$n_leads
    d <- ncol(multisynth$data$X)
    n <- nrow(multisynth$data$X)
    fulldat <- cbind(multisynth$data$X, multisynth$data$y)
    ttot <- ncol(fulldat)
    grps <- multisynth$grps
    J <- length(grps)

    if(is.null(bs_weight)) {
      # bs_weight <- rep(1 / sqrt(sum(is.finite(multisynth$data$trt))), n)
      bs_weight <- rep(1, n)
    }

    if(time_cohort) {
        which_t <- lapply(grps, 
                          function(tj) (1:n)[multisynth$data$trt == tj])
        mask <- unique(multisynth$data$mask)
    } else {
        which_t <- (1:n)[is.finite(multisynth$data$trt)]
        mask <- multisynth$data$mask
    }
    

    n1 <- sapply(1:J, function(j) length(which_t[[j]]))

    fullmask <- cbind(mask, matrix(0, nrow = J, ncol = (ttot - d)))
    

    ## estimate the post-treatment values to get att estimates
    mu1hat <- vapply(1:J,
                     function(j) colMeans((bs_weight * fulldat)[which_t[[j]],
                                                              , drop=FALSE]),
                     numeric(ttot))


    ## get average outcome model estimates and reweight residuals
    if(typeof(multisynth$y0hat) == "list") {
        mu0hat <- vapply(1:J,
                        function(j) {
                            y0hat <- colMeans(
                              (bs_weight * multisynth$y0hat[[j]])[which_t[[j]],
                                                                , drop=FALSE])
                            weightsj <- multisynth$weights[,j] * bs_weight
                            resj <- multisynth$residuals[[j]][weightsj != 0,, drop = F]
                            y0hat + t(resj) %*% weightsj[weightsj != 0]
                        }
                       , numeric(ttot)
                        )
    } else {
        mu0hat <- vapply(1:J,
                        function(j) {
                            y0hat <- colMeans(
                              (bs_weight * multisynth$y0hat)[which_t[[j]],
                                                              , drop=FALSE])
                            weightsj <- multisynth$weights[, j] * bs_weight
                            resj <- multisynth$residuals[weightsj != 0,, drop = F]
                            y0hat + t(resj) %*% weightsj[weightsj != 0]
                        }
                       , numeric(ttot)
                        )
    }

    tauhat <- mu1hat - mu0hat

    if(is.null(att_weight)) {
      att_weight <- rep(1, J)
    }
    ## re-index time if relative to treatment
    if(relative) {
        total_len <- min(d + n_leads, ttot + d - min(grps)) ## total length of predictions
        mu0hat <- vapply(1:J,
                         function(j) {
                             vec <- c(rep(NA, d-grps[j]),
                                      mu0hat[1:grps[j],j],
                                      mu0hat[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j])
                             ## last row is post-treatment average
                             c(vec,
                               rep(NA, total_len - length(vec)),
                               mean(mu0hat[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j]))
                               
                         },
                         numeric(total_len +1
                                 ))
        
        tauhat <- vapply(1:J,
                         function(j) {
                             vec <- c(rep(NA, d-grps[j]),
                                      tauhat[1:grps[j],j],
                                      tauhat[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j])
                             ## last row is post-treatment average
                             c(vec,
                               rep(NA, total_len - length(vec)),
                               mean(tauhat[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j]))
                         },
                         numeric(total_len +1
                                 ))
        # re-index unit weights if they change over time
        if(is.null(dim(att_weight))) {
          if(J == 1) {
            att_weight <- matrix(replicate(total_len + 1, att_weight), ncol = 1)
          } else {
            att_weight <- t(replicate(total_len + 1, att_weight))
          }
        }
        att_weight_new <- vapply(1:J,
                        function(j) {
                            vec <- c(rep(NA, d-grps[j]),
                                    att_weight[1:grps[j],j],
                                    att_weight[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j])
                            ## last row is post-treatment average
                            c(vec,
                              rep(NA, total_len - length(vec)),
                              mean(att_weight[(grps[j]+1):(min(grps[j] + n_leads, ttot)), j]))
                              
                        },
                        numeric(total_len +1
                                ))
          
        ## get the overall average estimate
        avg <- apply(mu0hat, 1, function(z) sum(n1 * z, na.rm=T) / sum(n1 * !is.na(z)))
        avg <- sapply(1:nrow(mu0hat),
        function(k)  {
          sum(n1 * mu0hat[k,] * att_weight_new[k,], na.rm=T) /
            sum(n1 * (!is.na(mu0hat[k,])) * att_weight_new[k, ], na.rm = T)
        })
        mu0hat <- cbind(avg, mu0hat)

        avg <- apply(tauhat, 1, function(z) sum(n1 * z, na.rm=T) / sum(n1 * !is.na(z)))
        avg <- sapply(1:nrow(mu0hat),
        function(k)  {
          sum(n1 * tauhat[k,] * att_weight_new[k,], na.rm=T) /
            sum(n1 * (!is.na(tauhat[k,])) * att_weight_new[k, ], na.rm = T)
        })
        tauhat <- cbind(avg, tauhat)
        
    } else {

        ## remove all estimates for t > T_j + n_leads
        vapply(1:J,
               function(j) c(mu0hat[1:min(grps[j]+n_leads, ttot),j],
                             rep(NA, max(0, ttot-(grps[j] + n_leads)))),
               numeric(ttot)) -> mu0hat

        vapply(1:J,
               function(j) c(tauhat[1:min(grps[j]+n_leads, ttot),j],
                             rep(NA, max(0, ttot-(grps[j] + n_leads)))),
               numeric(ttot)) -> tauhat

        
        ## only average currently treated units
        avg1 <- rowSums(t(fullmask) *  mu0hat * n1) /
                rowSums(t(fullmask) *  n1)
        avg2 <- rowSums(t(1-fullmask) *  mu0hat * n1) /
            rowSums(t(1-fullmask) *  n1)
        avg <- replace_na(avg1, 0) * apply(fullmask, 2, min) +
            replace_na(avg2,0) * apply(1-fullmask, 2, max)
        cbind(avg, mu0hat) -> mu0hat

        ## only average currently treated units
        avg1 <- rowSums(t(fullmask) *  tauhat * n1) /
            rowSums(t(fullmask) *  n1)
        avg2 <- rowSums(t(1-fullmask) *  tauhat * n1) /
            rowSums(t(1-fullmask) *  n1)
        avg <- replace_na(avg1, 0) * apply(fullmask, 2, min) +
            replace_na(avg2,0) * apply(1 - fullmask, 2, max)
        cbind(avg, tauhat) -> tauhat
    }
    

    if(att) {
        return(tauhat)
    } else {
        return(mu0hat)
    }
}


#' Print function for multisynth
#' @param x multisynth object
#' @param ... Optional arguments
#' @export
print.multisynth <- function(x, att_weight = NULL, ...) {
    multisynth <- x
    
    ## straight from lm
    cat("\nCall:\n", paste(deparse(multisynth$call), 
        sep="\n", collapse="\n"), "\n\n", sep="")

    # print att estimates
    att_post <- predict(multisynth, att=T, att_weight = att_weight)[,1]
    att_post <- att_post[length(att_post)]

    cat(paste("Average ATT Estimate: ",
              format(round(mean(att_post),3), nsmall = 3), "\n\n", sep=""))
}


#' Plot function for multisynth
#' @importFrom graphics plot
#' @param x Augsynth object to be plotted
#' @param inf_type Type of inference to perform:
#'  \itemize{
#'    \item{bootstrap}{Wild bootstrap, the default option}
#'    \item{jackknife}{Jackknife}
#' }
#' @param inf Whether to compute and plot confidence intervals
#' @param levels Which units/groups to plot, default is every group
#' @param label Whether to label the individual levels
#' @param weights Whether to plot the weights, default = FALSE
#' @param ... Optional arguments
#' @export
plot.multisynth <- function(x, inf_type = "bootstrap", inf = T,
                            levels = NULL, label = T, 
                            weights = FALSE, ...) {

    if(weights) {
      ever_trt <- x$data$units[is.finite(x$data$trt)]
      never_trt <- x$data$units[!is.finite(x$data$trt)]
      weights <- data.frame(x$weights)

      colnames(weights) <- ever_trt
      weights$unit <- factor(rownames(weights),
                             levels = sort(rownames(weights), decreasing = TRUE))

      # plotting the weights
      weights %>%
        tidyr::pivot_longer(-unit, names_to = "trt_unit", values_to = "weight") %>%
        ggplot2::ggplot(aes(x = trt_unit, y = unit, fill = round(weight, 3))) + 
        ggplot2::geom_tile(color = "white", size=.5) +
        ggplot2::scale_fill_gradient(low = "white", high = "black", limits=c(-0.01,1.01)) +
        ggplot2::guides(fill = "none") +
        ggplot2::xlab("Treated Unit") +
        ggplot2::ylab("Donor Unit") +
        ggplot2::theme_bw() + 
        ggplot2::theme(axis.ticks.x = ggplot2::element_blank(),
              axis.ticks.y = ggplot2::element_blank())
    }
    else {
      plot(summary(x, inf_type = inf_type, ...),
         inf = inf, levels = levels, label = label)
    }
}


#' Summary function for multisynth
#' @param object multisynth object
#' @param inf_type Type of inference to perform:
#'  \itemize{
#'    \item{bootstrap}{Wild bootstrap, the default option}
#'    \item{jackknife}{Jackknife}
#' }
#' @param ... Optional arguments
#' 
#' @return summary.multisynth object that contains:
#'         \itemize{
#'          \item{"att"}{Dataframe with ATT estimates, standard errors for each treated unit}
#'          \item{"global_l2"}{L2 imbalance for the pooled synthetic control}
#'          \item{"scaled_global_l2"}{L2 imbalance for the pooled synthetic control, scaled by the imbalance for unitform weights}
#'          \item{"ind_l2"}{Average L2 imbalance for the individual synthetic controls}
#'          \item{"scaled_ind_l2"}{Average L2 imbalance for the individual synthetic controls, scaled by the imbalance for unitform weights}
#'         \item{"n_leads", "n_lags"}{Number of post treatment outcomes (leads) and pre-treatment outcomes (lags) to include in the analysis}
#'         }
#' @export
summary.multisynth <- function(object, inf_type = "bootstrap", att_weight = NULL, ...) {

    multisynth <- object
    
    relative <- T

    n_leads <- multisynth$n_leads
    d <- ncol(multisynth$data$X)
    n <- nrow(multisynth$data$X)
    ttot <- d + ncol(multisynth$data$y)

    trt <- multisynth$data$trt
    time_cohort <- multisynth$time_cohort
    if(time_cohort) {
        grps <- unique(trt[is.finite(trt)])
        which_t <- lapply(grps, function(tj) (1:n)[trt == tj])
    } else {
        grps <- trt[is.finite(trt)]
        which_t <- (1:n)[is.finite(trt)]
    }
    
    # grps <- unique(multisynth$data$trt) %>% sort()
    J <- length(grps)
    
    # which_t <- (1:n)[is.finite(multisynth$data$trt)]
    times <- multisynth$data$time
    
    summ <- list()
    ## post treatment estimate for each group and overall
    # att <- predict(multisynth, relative, att=T)
    
    if(inf_type == "jackknife") {
        attse <- jackknife_se_multi(multisynth, relative, att_weight = att_weight, ...)
    } else if(inf_type == "bootstrap") {
        if(object$force == 2) {
          warning("Wild bootstrap without including a unit fixed effect ",
                  "is likely to be very conservative!")
        }
        attse <- weighted_bootstrap_multi(multisynth, att_weight = att_weight, ...)
    } else {
        att <- predict(multisynth, relative, att=T, att_weight = att_weight)
        attse <- list(att = att,
                      se = matrix(NA, nrow(att), ncol(att)),
                      upper_bound = matrix(NA, nrow(att), ncol(att)),
                      lower_bound = matrix(NA, nrow(att), ncol(att)))
    }
    

    if(relative) {
        att <- data.frame(cbind(c(-(d-1):min(n_leads, ttot-min(grps)), NA),
                                attse$att))
        if(time_cohort) {
            col_names <- c("Time", "Average", 
                            as.character(times[grps + 1]))
        } else {
            col_names <- c("Time", "Average", 
                            as.character(multisynth$data$units[which_t]))
        }
        names(att) <- col_names
        att %>% gather(Level, Estimate, -Time) %>%
            rename("Time"=Time) %>%
            mutate(Time=Time-1) -> att

        se <- data.frame(cbind(c(-(d-1):min(n_leads, ttot-min(grps)), NA),
                               attse$se))                            
        names(se) <- col_names
        se %>% gather(Level, Std.Error, -Time) %>%
            rename("Time"=Time) %>%
            mutate(Time=Time-1) -> se
        lower_bound <- data.frame(cbind(c(-(d-1):min(n_leads, ttot-min(grps)), NA),
                                  attse$lower_bound))
        names(lower_bound) <- col_names
        lower_bound %>% gather(Level, lower_bound, -Time) %>%
          rename("Time"=Time) %>%
          mutate(Time=Time-1) -> lower_bound

        upper_bound <- data.frame(cbind(c(-(d-1):min(n_leads, ttot-min(grps)), NA),
                                        attse$upper_bound))
        names(upper_bound) <- col_names
        upper_bound %>% gather(Level, upper_bound, -Time) %>%
          rename("Time"=Time) %>%
            mutate(Time=Time-1) -> upper_bound

    } else {
        att <- data.frame(cbind(times, attse$att))
        names(att) <- c("Time", "Average", times[grps[1:J]])        
        att %>% gather(Level, Estimate, -Time) -> att

        se <- data.frame(cbind(times, attse$se))
        names(se) <- c("Time", "Average", times[grps[1:J]])        
        se %>% gather(Level, Std.Error, -Time) -> se

    }

    summ$att <- inner_join(att, se, by = c("Time", "Level")) %>%
      inner_join(lower_bound, by = c("Time", "Level")) %>%
        inner_join(upper_bound, by = c("Time", "Level"))

    summ$relative <- relative
    summ$grps <- grps
    summ$call <- multisynth$call
    summ$global_l2 <- multisynth$global_l2
    summ$scaled_global_l2 <- multisynth$scaled_global_l2

    summ$ind_l2 <- multisynth$ind_l2
    summ$scaled_ind_l2 <- multisynth$scaled_ind_l2

    summ$n_leads <- multisynth$n_leads
    summ$n_lags <- multisynth$n_lags

    class(summ) <- "summary.multisynth"
    return(summ)
}

#' Print function for summary function for multisynth
#' @param x summary object
#' @param level Which unit/group to print results for, default is the overall average
#' @param ... Optional arguments
#' @export
print.summary.multisynth <- function(x, level = "Average", ...) {

    summ <- x
    
    ## straight from lm
    cat("\nCall:\n", paste(deparse(summ$call), sep="\n", collapse="\n"), "\n\n", sep="")

    first_lvl <- summ$att %>% filter(Level != "Average") %>% pull(Level) %>% min()
    
    ## get ATT estimates for treatment level, post treatment
    if(summ$relative) {
        summ$att %>%
            filter(Time >= 0, Level==level) %>%
            rename("Time Since Treatment"=Time) -> att_est
    } else if(level == "average") {
        summ$att %>% filter(Time > first_lvl, Level=="Average") -> att_est
    } else {
        summ$att %>% filter(Time > level, Level==level) -> att_est
    }

    cat(paste("Average ATT Estimate (Std. Error): ",
              summ$att %>%
                  filter(Level == level, is.na(Time)) %>%
                  pull(Estimate) %>%
                  round(3) %>% format(nsmall=3),
              "  (",
              summ$att %>%
                  filter(Level == level, is.na(Time)) %>%
                  pull(Std.Error) %>%
                  round(3) %>% format(nsmall=3),
              ")\n\n", sep=""))
    
    cat(paste("Global L2 Imbalance: ",
              format(round(summ$global_l2,3), nsmall=3), "\n",
              "Scaled Global L2 Imbalance: ",
              format(round(summ$scaled_global_l2,3), nsmall=3), "\n",
              "Percent improvement from uniform global weights: ", 
              format(round(1-summ$scaled_global_l2,3)*100), "\n\n",
              "Individual L2 Imbalance: ",
              format(round(summ$ind_l2,3), nsmall=3), "\n",
              "Scaled Individual L2 Imbalance: ", 
              format(round(summ$scaled_ind_l2,3), nsmall=3), "\n",
              "Percent improvement from uniform individual weights: ", 
              format(round(1-summ$scaled_ind_l2,3)*100), "\t",
              "\n\n",
              sep=""))


    print(att_est, row.names=F)

}

#' Plot function for summary function for multisynth
#' @importFrom ggplot2 aes
#' 
#' @param x summary object
#' @param inf Whether to plot confidence intervals
#' @param levels Which units/groups to plot, default is every group
#' @param label Whether to label the individual levels
#' @param weights Whether to plot the weights, default = FALSE
#' @param ... Optional arguments
#' @export
plot.summary.multisynth <- function(x, inf = T, levels = NULL, label = T,
                                    weights = FALSE, ...) {


    if(weights) {
        ever_trt <- x$data$units[is.finite(x$data$trt)]
        never_trt <- x$data$units[!is.finite(x$data$trt)]
        weights <- data.frame(x$weights)

        colnames(weights) <- ever_trt
        weights$unit <- factor(rownames(weights),
                              levels = sort(rownames(weights), decreasing = TRUE))

        # plotting the weights
        weights %>%
          tidyr::pivot_longer(-unit, names_to = "trt_unit", values_to = "weight") %>%
          ggplot2::ggplot(aes(x = trt_unit, y = unit, fill = round(weight, 3))) + 
          ggplot2::geom_tile(color = "white", size=.5) +
          ggplot2::scale_fill_gradient(low = "white", high = "black", limits=c(-0.01,1.01)) +
          ggplot2::guides(fill = "none") +
          ggplot2::xlab("Treated Unit") +
          ggplot2::ylab("Donor Unit") +
          ggplot2::theme_bw() + 
          ggplot2::theme(axis.ticks.x = ggplot2::element_blank(),
                axis.ticks.y = ggplot2::element_blank())
    }

    summ <- x
    
    ## get the last time period for each level
    summ$att %>%
        filter(!is.na(Estimate),
               Time >= -summ$n_lags,
               Time <= summ$n_leads) %>%
        group_by(Level) %>%
        summarise(last_time = max(Time)) -> last_times

    if(is.null(levels)) levels <- unique(summ$att$Level)

    summ$att %>% inner_join(last_times) %>%
        filter(Level %in% levels) %>%
        mutate(label = ifelse(Time == last_time, Level, NA),
               is_avg = ifelse(("Average" %in% levels) * (Level == "Average"),
                               "A", "B")) %>%
        ggplot2::ggplot(ggplot2::aes(x = Time, y = Estimate,
                                     group = Level,
                                     color = is_avg,
                                     alpha = is_avg)) +
            ggplot2::geom_line(size = 1) +
            ggplot2::geom_point(size = 1) -> p
            
        if(label) {
          p <- p + ggrepel::geom_label_repel(ggplot2::aes(label = label),
                                      nudge_x = 1, na.rm = T)
        } 
        p <- p + ggplot2::geom_hline(yintercept = 0, lty = 2)

    if(summ$relative) {
        p <- p + ggplot2::geom_vline(xintercept = 0, lty = 2) +
            ggplot2::xlab("Time Relative to Treatment")
    } else {
        p <- p + ggplot2::geom_vline(aes(xintercept = as.numeric(Level)),
                                     lty = 2, alpha = 0.5,
                                     summ$att %>% filter(Level != "Average"))
    }

    ## add ses
    if(inf) {
        max_time <- max(summ$att$Time, na.rm = T)
        if(max_time == 0) {
          error_plt <- ggplot2::geom_errorbar
          clr <- "black"
          alph <- 1
        } else {
          error_plt <- ggplot2::geom_ribbon
          clr <- NA
          alph <- 0.2
        }
        if("Average" %in% levels) {
            p <- p + error_plt(
                ggplot2::aes(ymin=lower_bound,
                             ymax=upper_bound),
                alpha = alph, color=clr,
                data = summ$att %>% 
                        filter(Level == "Average",
                               Time >= 0))
            
        } else {
            p <- p + error_plt(
                ggplot2::aes(ymin=lower_bound,
                             ymax=upper_bound),
                             data = . %>% filter(Time >= 0),
                alpha = alph, color = clr)
        }
    }

    p <- p + ggplot2::scale_alpha_manual(values=c(1, 0.5)) +
        ggplot2::scale_color_manual(values=c("#333333", "#818181")) +
        ggplot2::guides(alpha=F, color=F) + 
        ggplot2::theme_bw()
    return(p)

}


================================================
FILE: R/outcome_models.R
================================================
################################################################################
## Code to fit various outcome models
################################################################################

#' Use a separate regularized regression for each post period
#' to fit E[Y(0)|X]
#' @importFrom stats poly
#' @importFrom stats coef
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param alpha Mixing between L1 and L2, default: 1 (LASSO)
#' @param lambda Regularization hyperparameter, if null then CV
#' @param poly_order Order of polynomial to fit, default 1
#' @param type How to fit outcome model(s)
#'             \itemize{
#'              \item{sep }{Separate outcome models}
#'              \item{avg }{Average responses into 1 outcome}
#'              \item{multi }{Use multi response regression in glmnet}}
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_prog_reg <- function(X, y, trt, alpha=1, lambda=NULL,
                         poly_order=1, type="sep", ...) {
    if(!requireNamespace("glmnet", quietly = TRUE)) {
        stop("In order to fit an elastic net outcome model, you must install the glmnet package.")
    }
    
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when using elastic net: ", paste(names(extra_params), collapse = ", "))
    }
    
    X <- matrix(poly(matrix(X),degree=poly_order), nrow=dim(X)[1])

    ## helper function to fit regression with CV
    outfit <- function(x, y) {
        if(is.null(lambda)) {
            lam <- glmnet::cv.glmnet(x, y, alpha=alpha, grouped=FALSE)$lambda.min
        } else {
            lam <- lambda
        }
        fit <- glmnet::glmnet(x, y, alpha=alpha,
                              lambda=lam)
        
        return(as.matrix(coef(fit)))
    }

    if(type=="avg") {
        ## if fitting the average post period value, stack post periods together
        stacky <- c(y)
        stackx <- do.call(rbind,
                          lapply(1:dim(y)[2],
                                 function(x) X))
        stacktrt <- rep(trt, dim(y)[2])
        regweights <- outfit(stackx[stacktrt==0,],
                             stacky[stacktrt==0])
    } else if(type=="sep"){
        ## fit separate regressions for each post period
        regweights <- apply(as.matrix(y), 2,
                            function(yt) outfit(X[trt==0,],
                                                yt[trt==0]))
    } else {
        ## fit multi response regression
        lam <- glmnet::cv.glmnet(X, y, family="mgaussian",
                                 alpha=alpha, grouped=FALSE)$lambda.min
        fit <- glmnet::glmnet(X, y, family="mgaussian",
                              alpha=alpha,
                              lambda=lam)
        regweights <- as.matrix(do.call(cbind, coef(fit)))
    }


    ## Get predicted values
    y0hat <- cbind(rep(1, dim(X)[1]),
                   X) %*% regweights

    return(list(y0hat = y0hat,
                params  = regweights))
}


#' Use a separate random forest regression for each post period
#' to fit E[Y(0)|X]
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param avg Predict the average post-treatment outcome
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_prog_rf <- function(X, y, trt, avg=FALSE, ...) {

    if(!requireNamespace("randomForest", quietly = TRUE)) {
        stop("In order to fit a random forest outcome model, you must install the randomForest package.")
    }
    
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when using random forest: ", paste(names(extra_params), collapse = ", "))
    }

    
    ## helper function to fit RF
    outfit <- function(x, y) {
            fit <- randomForest::randomForest(x, y)
            return(fit)
    }


    if(avg | dim(y)[2] == 1) {
        ## if fitting the average post period value, stack post periods together
        stacky <- c(y)
        stackx <- do.call(rbind,
                          lapply(1:dim(y)[2],
                                 function(x) X))
        stacktrt <- rep(trt, dim(y)[2])
        fit <- outfit(stackx[stacktrt==0,],
                      stacky[stacktrt==0])

        ## predict outcome
        y0hat <- matrix(predict(fit, X), ncol=1)

        
        ## keep feature importances
        imports <- randomForest::importance(fit)

        
    } else {
        ## fit separate regressions for each post period
        fits <- apply(as.matrix(y), 2,
                      function(yt) outfit(X[trt==0,],
                                          yt[trt==0]))
        
        ## predict outcome
        y0hat <- lapply(fits, function(fit) as.matrix(predict(fit,X))) %>%
            bind_rows() %>%
            as.matrix()

        
        ## keep feature importances
        imports <- lapply(fits, function(fit) randomForest::importance(fit)) %>%
            bind_rows() %>%
            as.matrix()

    }


    return(list(y0hat=y0hat,
                params=imports))
    
}


#' Use gsynth to fit factor model for E[Y(0)|X]
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param r Number of factors to use (or start with if CV==1)
#' @param r.end Max number of factors to consider if CV==1
#' @param force Fixed effects (0=none, 1=unit, 2=time, 3=two-way)
#' @param CV Whether to do CV (0=no CV, 1=yes CV)
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_prog_gsynth <- function(X, y, trt, r=0, r.end=5, force=3, CV=1, ...) {
    if(!requireNamespace("gsynth", quietly = TRUE)) {
        stop("In order to fit generalized synthetic controls, you must install the gsynth package.")
    }
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when using gSynth: ", paste(names(extra_params), collapse = ", "))
    }
    
    df_x = data.frame(X, check.names=FALSE)
    df_x$unit = rownames(df_x)
    df_x$trt = rep(0, nrow(df_x))
    df_x <- df_x %>% select(unit, trt, everything())
    long_df_x = gather(df_x, time, obs, -c(unit,trt))

    df_y = data.frame(y, check.names=FALSE)
    df_y$unit = rownames(df_y)
    df_y$trt = trt
    df_y <- df_y %>% select(unit, trt, everything())
    long_df_y = gather(df_y, time, obs, -c(unit,trt))
    long_df = rbind(long_df_x, long_df_y)

    transform(long_df, time = as.numeric(time))
    transform(long_df, unit = as.numeric(unit))
    gsyn <- gsynth::gsynth(data = long_df, Y = "obs", D = "trt", 
                           index = c("unit", "time"), force = force, CV = CV, r = r)

    t0 <- dim(X)[2]
    t_final <- t0 + dim(y)[2]
    n <- dim(X)[1]
    ## get predicted outcomes
    y0hat <- matrix(0, nrow=n, ncol=(t_final-t0))
    y0hat[trt==0,]  <- t(gsyn$Y.co[(t0+1):t_final,,drop=FALSE] -
                             gsyn$est.co$residuals[(t0+1):t_final,,drop=FALSE])

    y0hat[trt==1,] <- gsyn$Y.ct[(t0+1):t_final,]

    ## add treated prediction for whole pre-period
    gsyn$est.co$Y.ct <- gsyn$Y.ct

    ## control and treated residuals
    gsyn$est.co$ctrl_resids <- gsyn$est.co$residuals
    gsyn$est.co$trt_resids <- colMeans(cbind(X[trt==1,,drop=FALSE],
                                             y[trt==1,,drop=FALSE])) -
        rowMeans(gsyn$est.co$Y.ct)
    return(list(y0hat=y0hat,
                params=gsyn$est.co))
}


#' Use Athey (2017) matrix completion panel data code
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param unit_fixed Whether to estimate unit fixed effects
#' @param time_fixed Whether to estimate time fixed effects
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_prog_mcpanel <- function(X, y, trt, unit_fixed=1, time_fixed=1, ...) {


    if(!requireNamespace("MCPanel", quietly = TRUE)) {
        stop("In order to fit matrix completion, you must install the MCPanel package.")
    }
    
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when using MCPanel: ", paste(names(extra_params), collapse = ", "))
    }
    
    ## create matrix and missingness matrix

    t0 <- dim(X)[2]
    t_final <- t0 + dim(y)[2]
    n <- dim(X)[1]    
    
    fullmat <- cbind(X, y)
    maskmat <- matrix(1, nrow=nrow(fullmat), ncol=ncol(fullmat))
    maskmat[trt==1, (t0+1):t_final] <- 0

    ## estimate matrix
    mcp <- MCPanel::mcnnm_cv(fullmat, maskmat,
                             to_estimate_u=unit_fixed, to_estimate_v=time_fixed)
    
    ## impute matrix
    imp_mat <- mcp$L +
        sweep(matrix(0, nrow=nrow(fullmat), ncol=ncol(fullmat)), 1, mcp$u, "+") + # unit fixed
        sweep(matrix(0, nrow=nrow(fullmat), ncol=ncol(fullmat)), 2, mcp$v, "+") # time fixed
    
    
    trtmat <- matrix(0, ncol=n, nrow=t_final)
    trtmat[t0:t_final, trt == 1] <- 1

    ## get predicted outcomes
    y0hat <- imp_mat[,(t0+1):t_final,drop=FALSE]
    params <- mcp

    params$trt_resids <- colMeans(cbind(X[trt==1,,drop=FALSE],
                                        y[trt==1,,drop=FALSE])) -
        rowMeans(imp_mat[trt==1,,drop=FALSE])

    params$ctrl_resids <- t(cbind(X[trt==0,,drop=FALSE],
                                y[trt==0,,drop=FALSE]) - imp_mat[trt==0,,drop=FALSE])
    params$Y.ct <- t(imp_mat[trt==1,,drop=FALSE])
    return(list(y0hat=y0hat,
                params=params))
    
}


#' Fit a Comparitive interupted time series
#' to fit E[Y(0)|X]
#' @importFrom stats lm
#' @importFrom stats predict
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param poly_order Order of time trend polynomial to fit, default 1
#' @param weights Weights to use in WLS, default is no weights
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_prog_cits <- function(X, y, trt, poly_order=1, weights=NULL, ...) {

    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when using CITS: ", paste(names(extra_params), collapse = ", "))
    }
    
    ## combine back into a panel structure
    ids <- 1:nrow(X)
    t0 <- dim(X)[2]
    t_final <- t0 + dim(y)[2]
    n <- nrow(X)


    if(is.null(weights)) {
        weights <- rep(1, n)
    }
    
    pnl1 <- data.frame(X)
    colnames(pnl1) <- 1:(t0)

    pnl1 <- pnl1 %>% mutate(trt=trt, post=0, id=ids, weight=weights) %>%
        gather(time, val, -trt, -post, -id, -weight) %>%
        mutate(time=as.numeric(time))

    pnl2 <- data.frame(y)
    colnames(pnl2) <- (t0+1):t_final
    pnl2 <- pnl2 %>% mutate(trt=trt, post=1, id=ids, weight=weights) %>%
        gather(time, val, -trt, -post, -id, -weight) %>%
        mutate(time=as.numeric(time))
    
    
    pnl <- bind_rows(pnl1, pnl2)
    
    ## fit regression
    if(poly_order == "fixed") {
        fit <- pnl %>%
            filter(!((post==1) & (trt==1))) %>% ## filter out post-period treated outcomes
            lm(val ~  as.factor(id) + as.factor(time),
              .,
              weights = .$weight 
              )
    } else if(poly_order > 0) {
        fit <- pnl %>%
            filter(!((post==1) & (trt==1))) %>% ## filter out post-period treated outcomes
        lm(val ~ poly(time, poly_order) + post + trt + poly(time * trt, poly_order),
              ., 
              weights = .$weight
              )
    } else {

        fit <- pnl %>%
            filter(!((post==1) & (trt==1))) %>% ## filter out post-period treated outcomes
            lm(val ~  post + trt,
              .,
              weights = .$weight 
              )
    }

    
    ## get predicted post-period outcomes
    
    y0hat <- matrix(0, nrow=n, ncol=(t_final-t0))
    y0hat[trt==0,]  <- matrix(predict(fit,
                                      pnl %>% filter(post==1 & trt==0)),
                              ncol=ncol(y))

    y0hat[trt==1,] <- matrix(predict(fit,
                                     pnl %>% filter(post==1 & trt==1)),
                             ncol=ncol(y))


    params <- list()

    
    ## add treated prediction for whole pre-period
    params$Y.ct <- matrix(predict(fit,
                                  pnl %>% filter(trt==1),
                                  ncol=(ncol(X) + ncol(y))))

    ## and control prediction
    ctrl_pred <- matrix(predict(fit,
                                pnl %>% filter(trt==0)),
                                ncol=(ncol(X) + ncol(y)))

    ## control and treated residuals
    params$ctrl_resids <- t(cbind(X[trt==0,,drop=FALSE],
                                y[trt==0,,drop=FALSE])) - 
        t(ctrl_pred)
    params$trt_resids <- colMeans(cbind(X[trt==1,,drop=FALSE],
                                            y[trt==1,,drop=FALSE])) -
        rowMeans(params$Y.ct)
    
    return(list(y0hat=y0hat,
                params=params))
    
}


#' Fit a bayesian structural time series
#' to fit E[Y(0)|X]
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Model parameters}}
fit_prog_causalimpact <- function(X, y, trt, ...) {


    if(!requireNamespace("CausalImpact", quietly = TRUE)) {
        stop("In order to fit bayesian structural time series, you must install the CausalImpact package.")
    }
    
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters using Bayesian structural time series with CausalImpact: ", paste(names(extra_params), collapse = ", "))
    }

    ## structure data accordingly
    ids <- 1:nrow(X)
    t0 <- dim(X)[2]
    t_final <- t0 + dim(y)[2]
    n <- nrow(X)

    comb <- cbind(X, y)

    imp_dat <- t(rbind(colMeans(comb[trt==1,,drop=F]), comb[trt==0,,drop=F]))

    
    ## get predicted post-period outcomes
    ## TODO: is this the way to use CausalImpact??
    ci_func <- function(i) {
        ## fit causal impact using controls
        CausalImpact::CausalImpact(t(rbind(comb[i,], comb[-i,][trt[-i]==0,])),
                                   pre.period=c(1, t0), post.period=c(t0+1, t_final)
                                   )$series$point.pred
        
    }

    y0hat <- t(sapply(1:n, ci_func))

    params <- list()

    
    ## add treated prediction for whole pre-period
    params$Y.ct <- t(y0hat[trt==1,,drop=F])

    ## and control prediction
    ctrl_pred <- y0hat[trt==0,,drop=F]

    ## control and treated residuals
    params$ctrl_resids <- t(cbind(X[trt==0,,drop=FALSE],
                                y[trt==0,,drop=FALSE])) - 
        t(ctrl_pred)
    
    params$trt_resids <- colMeans(cbind(X[trt==1,,drop=FALSE],
                                            y[trt==1,,drop=FALSE])) -
        rowMeans(params$Y.ct)
    return(list(y0hat=y0hat[,(t0+1):t_final, drop=F],
                params=params))
    
}


#' Fit a seq2seq model with a feedforward net
#' to fit E[Y(0)|X]
#'
#' @param X Matrix of covariates/lagged outcomes
#' @param y Matrix of post-period outcomes
#' @param trt Vector of treatment indicator
#' @param layers List of (n_hidden_units, activation function) pairs to define layers
#' @param epochs Number of epochs for training
#' @param patience Number of epochs to wait before early stopping
#' @param val_split Proportion of control units to use for validation
#' @param verbose Whether to print training progress
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Model parameters}}
fit_prog_seq2seq <- function(X, y, trt,
                             layers=list(c(50, "relu"), c(5, "relu")),
                             epochs=500,
                             patience=5,
                             val_split=0.2,
                             verbose=F, ...) {

    if(!requireNamespace("keras", quietly = TRUE)) {
        stop("In order to fit a neural network, you must install the keras package.")
    }
    
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters when building sequence to sequence learning with feedforward nets: ", paste(names(extra_params), collapse = ", "))
    }
    
    ## structure data accordingly
    ids <- 1:nrow(X)
    t0 <- dim(X)[2]
    t_final <- t0 + dim(y)[2]
    n <- nrow(X)


    Xctrl <- X[trt==0,,drop=F]
    yctrl <- y[trt==0,,drop=F]

    ## create first layer
    model <- keras::keras_model_sequential() %>%
        keras::layer_dense(units = layers[[1]][1], activation = layers[[1]][2],
                    input_shape = ncol(Xctrl))

    ## add layers
    for(layer in layers[-1]) {
        model %>% keras::layer_dense(units = layer[1], activation = layer[2])
    }

    ## output lyaer
    model %>% keras::layer_dense(units=ncol(yctrl))

    ## compile
    model %>% keras::compile(optimizer="rmsprop", loss="mse", metrics=c("mae")) 

    ## fit model
    learn <- model %>%
        keras::fit(x=Xctrl, y=yctrl,
            epochs=epochs,
            batch_size=nrow(Xctrl),
            validation_split=val_split,
            callbacks=list(keras::callback_early_stopping(patience=patience)),
            verbose=verbose)

    ## predict for everything
    y0hat <- model %>% predict(X)
    params=list(model=model, learn=learn)
    
    return(list(y0hat=y0hat,
                params=params))
}


================================================
FILE: R/outcome_multi.R
================================================
################################################################################
## Fitting outcome models for multiple treatment groups
################################################################################


#' Use gsynth to fit factor model with 
#' @importFrom utils capture.output
#' @param long_df A long dataframe with 4 columns in the order unit, time, trt, outcome
#' @param X Matrix of outcomes
#' @param trt Vector of treatment status for each unit
#' @param r Number of factors to use (or start with if CV==1)
#' @param r.end Max number of factors to consider if CV==1
#' @param force Fixed effects (0=none, 1=unit, 2=time, 3=two-way)
#' @param CV Whether to do CV (0=no CV, 1=yes CV)
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_gsynth_multi <- function(long_df, X, trt, r=0, force=3, CV=1) {
    if(!requireNamespace("gsynth", quietly = TRUE)) {
        stop("In order to fit generalized synthetic controls, you must install the gsynth package.")
    }
    ttot <- ncol(X)
    n <- nrow(X)

    labels <- colnames(long_df)
    gsyn <- gsynth::gsynth(data = long_df, Y = labels[4], D = labels[3], index = c(labels[1], labels[2]), force = force, CV = CV, r=r)
    
    y0hat <- matrix(0, nrow=n, ncol=ttot)
    y0hat[!is.finite(trt),]  <- t(gsyn$Y.co - gsyn$est.co$residuals)
    
    y0hat[is.finite(trt),] <- t(gsyn$Y.ct)
    
    ## add treated prediction for whole pre-period
    gsyn$est.co$Y.ct <- gsyn$Y.ct
    return(list(y0hat=y0hat,
                params=gsyn$est.co))
}


#' Get fixed effects from pre-treatment data for each level
#'
#' @param X Matrix of outcomes
#' @param trt Vector of treatment status for each unit
#' @param mask Matrix of treatment statuses
#' @param force Fixed effects: 1="unit", 2="time", 3="two-way"
#' @param time_cohort Boolean indicating whether to use time cohorts
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{Predicted outcome under control}
#'           \item{params }{Regression parameters}}
fit_feff <- function(X, trt, mask, force, time_cohort) {

    ttot <- ncol(X)
    n <- nrow(X)
    # grps <- trt[is.finite(trt)]
    # iterate over treatment cohorts
    grps <- unique(trt[is.finite(trt)])
    J <- length(grps)
    which_t <- (1:n)[is.finite(trt)]

    if(force %in% c(2,3)) {
        ## compute time fixed effects from pure controls
        time_eff <- matrix(colMeans(X[!is.finite(trt),, drop = F],
                            na.rm = TRUE),
                            nrow=nrow(X),
                            ncol=ncol(X),
                            byrow=T)
    } else {
      time_eff <- matrix(0, nrow = nrow(X), ncol = ncol(X))
    }
    residuals <- X - time_eff
    y0hat <- time_eff
    if(force %in% c(1,3)) {

        ## compute unit fixed effects from pre-intervention outcomes
        unit_eff <- lapply(grps, 
                            function(tj) matrix(
                                            rowMeans(residuals[, 1:tj, drop = F],
                                                     na.rm = TRUE),
                                            nrow=nrow(X), ncol=ncol(X)))
        residuals <- lapply(1:J, function(j) residuals -
                                                unit_eff[[j]])
        y0hat <- unit_eff
    }

    if(force == 3) {
        y0hat <- lapply(unit_eff, function(ufj) time_eff + ufj)
    }
    
    # go from treatment cohorts to individuals
    if(force %in% c(1,3) & !time_cohort) {
      names(residuals) <- as.character(grps)
      residuals <- residuals[as.character(trt[is.finite(trt)])]
      names(y0hat) <- as.character(grps)
      y0hat <- y0hat[as.character(trt[is.finite(trt)])]
    }
    
    return(list(y0hat = y0hat,
                residuals = residuals))
    
}


================================================
FILE: R/ridge.R
================================================
################################################################################
## Ridge-augmented SCM
################################################################################

#' Ridge augmented weights (possibly with covariates)
#'
#' @param wide_data Output of `format_data`
#' @param synth_data Output of `format_synth`
#' @param Z Matrix of covariates, default is  NULL
#' @param lambda Ridge hyper-parameter, if NULL use CV
#' @param ridge Include ridge or not
#' @param scm Include SCM or not
#' @param lambda_min_ratio Ratio of the smallest to largest lambda when tuning lambda values
#' @param n_lambda Number of lambdas to consider between the smallest and largest lambda value
#' @param lambda_max Initial (largest) lambda, if NULL sets it to be (1+norm(X_1-X_c))^2
#' @param holdout_length Length of conseuctive holdout period for when tuning lambdas 
#' @param min_1se If TRUE, chooses the maximum lambda within 1 standard error of the lambda that minimizes the CV error, if FALSE chooses the optimal lambda; default TRUE
#' @param V V matrix for synth, default NULL
#' @param residualize Whether to residualize auxiliary covariates or balance directly, default TRUE
#' @param ... optional arguments for outcome model
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
#'          \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
#'          \item{"mhat"}{Outcome model estimate (zero in this case)}
#'          \item{"lambda"}{Value of the ridge hyperparameter}
#'          \item{"ridge_mhat"}{The ridge regression predictions (for estimating the bias)}
#'          \item{"synw"}{The synth weights(for estimating the bias)}
#'          \item{"lambdas"}{List of lambda values evaluated to tune ridge regression}
#'          \item{"lambda_errors"}{"The MSE associated with each lambda term in lambdas."}
#'          \item{"lambda_errors_se"}{"The SE of the MSE associated with each lambda term in lambdas."}
#' }
fit_ridgeaug_formatted <- function(wide_data, synth_data,
                                   Z=NULL, lambda=NULL, ridge=T, scm=T,
                                   lambda_min_ratio = 1e-8, n_lambda = 20,
                                   lambda_max = NULL,
                                   holdout_length = 1, min_1se = T,
                                   V = NULL,
                                   residualize = FALSE, ...) {
    extra_params = list(...)
    if (length(extra_params) > 0) {
        warning("Unused parameters in using ridge augmented weights: ", paste(names(extra_params), collapse = ", "))
    }

    X <- wide_data$X
    y <- wide_data$y
    trt <- wide_data$trt

    lambda_errors <- NULL
    lambda_errors_se <- NULL
    lambdas <- NULL

    ## center outcomes
    X_cent <- apply(X, 2, function(x) x - mean(x[trt==0]))
    X_c <- X_cent[trt==0,,drop=FALSE]
    X_1 <- matrix(colMeans(X_cent[trt==1,,drop=FALSE]), nrow=1)
    y_cent <- apply(y, 2, function(x) x - mean(x[trt==0]))
    y_c <- y_cent[trt==0,,drop=FALSE]

    t0 <- ncol(X_c)

    V <- make_V_matrix(t0, V)

    # apply V matrix transformation
    X_c <- X_c %*% V
    X_1 <- X_1 %*% V

    new_synth_data <- synth_data


    ## if there are auxiliary covariates, use them
    if(!is.null(Z)) {
        ## center covariates
        Z_cent <- apply(Z, 2, function(x) x - mean(x[trt==0]))
        Z_c <- Z_cent[trt==0,,drop=FALSE]
        Z_1 <- matrix(colMeans(Z_cent[trt==1,,drop=FALSE]), nrow=1)

        if(residualize) {
          ## regress out covariates
          Xc_hat <- Z_c %*% solve(t(Z_c) %*% Z_c) %*% t(Z_c) %*% X_c
          X1_hat <- Z_1 %*% solve(t(Z_c) %*% Z_c) %*% t(Z_c) %*% X_c

          # take residuals
          res_t <- X_1  - X1_hat
          res_c <- X_c - Xc_hat

          X_c <- res_c
          X_1 <- res_t

          X_cent[trt == 0,] <- res_c
          X_cent[trt == 1,] <- res_t


          new_synth_data$Z1 <- t(res_t)
          new_synth_data$X1 <- t(res_t)
          new_synth_data$Z0 <- t(res_c)
          new_synth_data$X0 <- t(res_c)
        } else {
            # standardize covariates to be on the same scale as the outcomes
            sdz <-  apply(Z_c, 2, sd)
            sdx <- sd(X_c)
            Z_c <- sdx * t(t(Z_c) / sdz)
            Z_1 <- sdx * Z_1 / sdz

          # concatenate
          X_c <- cbind(X_c, Z_c)
          X_1 <- cbind(X_1, Z_1)
          new_synth_data$Z1 <- t(X_1)
          new_synth_data$X1 <- t(X_1)
          new_synth_data$Z0 <- t(X_c)
          new_synth_data$X0 <- t(X_c)
          V <- diag(ncol(X_c))
        }
    } else {
        new_synth_data$Z1 <- t(X_1)
        new_synth_data$X1 <- t(X_1)
        new_synth_data$Z0 <- t(X_c)
        new_synth_data$X0 <- t(X_c)
    }
    out <- fit_ridgeaug_inner(X_c, X_1, trt, new_synth_data,
                               lambda, ridge, scm,
                               lambda_min_ratio, n_lambda,
                               lambda_max,
                               holdout_length, min_1se)

    weights <- out$weights
    synw <- out$synw
    lambda <- out$lambda
    lambdas <- out$lambdas
    lambda_errors <- out$lambda_errors
    lambda_errors_se <- out$lambda_errors_se

    # add back in covariate weights
    if(!is.null(Z)) {
        if(residualize) {
          no_cov_weights <- weights
          ridge_w <- t(t(Z_1) - t(Z_c) %*% weights) %*% 
                      solve(t(Z_c) %*% Z_c) %*% t(Z_c)
          weights <- weights + t(ridge_w)
        } else {
          no_cov_weights <- NULL
        }
    }

    l2_imbalance <- sqrt(sum((synth_data$X0 %*% weights - synth_data$X1)^2))

    ## primal objective value scaled by least squares difference for mean
    uni_w <- matrix(1/ncol(synth_data$X0), nrow=ncol(synth_data$X0), ncol=1)
    unif_l2_imbalance <- sqrt(sum((synth_data$X0 %*% uni_w - synth_data$X1)^2))
    scaled_l2_imabalance <- l2_imbalance / unif_l2_imbalance


    ## no outcome model
    mhat <- matrix(0, nrow=nrow(y), ncol=ncol(y))
    ridge_mhat <- mhat
    if(!is.null(Z)) {
      if(residualize) {
        ridge_mhat <- ridge_mhat + Z_cent %*% solve(t(Z_c) %*% Z_c) %*%
                        t(Z_c) %*% y_c

        ## regress out covariates for outcomes
        yc_hat <- ridge_mhat[trt == 0,, drop = F]
        # take residuals of outcomes
        y_c <- y_c - yc_hat
      } else {
        X_cent <- cbind(X_cent, Z_cent)
      }
    }

    if(ridge) {
        ridge_mhat <- ridge_mhat + X_cent %*% solve(t(X_c) %*% X_c +
                        lambda * diag(ncol(X_c))) %*%
                        t(X_c) %*% y_c
    }

    output <- list(weights = weights,
                l2_imbalance = l2_imbalance,
                scaled_l2_imbalance = scaled_l2_imabalance,
                mhat = mhat,
                lambda = lambda,
                ridge_mhat = ridge_mhat,
                synw = synw,
                lambdas = lambdas,
                lambda_errors = lambda_errors,
                lambda_errors_se = lambda_errors_se)

    if(!is.null(Z)) {
        output$no_cov_weights <- no_cov_weights

        z_l2_imbalance <- sqrt(sum((t(Z_c) %*% weights - t(Z_1))^2))
        z_unif_l2_imbalance <- sqrt(sum((t(Z_c) %*% uni_w - t(Z_1))^2))
        z_scaled_l2_imbalance <- z_l2_imbalance / z_unif_l2_imbalance

        output$covariate_l2_imbalance <- z_l2_imbalance
        output$scaled_covariate_l2_imbalance <- z_scaled_l2_imbalance

    }
    return(output)
}

#' Helper function to fit ridge ASCM
#' @param X_c Matrix of control lagged outcomes
#' @param X_1 Vector of treated leagged outcomes
#' @param trt Vector of treatment indicators
#' @param synth_data Output of `format_synth`
#' @param lambda Ridge hyper-parameter, if NULL use CV
#' @param ridge Include ridge or not
#' @param scm Include SCM or not
#' @param lambda_min_ratio Ratio of the smallest to largest lambda when tuning lambda values
#' @param n_lambda Number of lambdas to consider between the smallest and largest lambda value
#' @param lambda_max Initial (largest) lambda, if NULL sets it to be (1+norm(X_1-X_c))^2
#' @param holdout_length Length of conseuctive holdout period for when tuning lambdas 
#' @param min_1se If TRUE, chooses the maximum lambda within 1 standard error of the lambda that minimizes the CV error, if FALSE chooses the optimal lambda; default TRUE
#' @noRd
#' @return \itemize{
#'          \item{"weights"}{Ridge ASCM weights}
#'          \item{"lambda"}{Value of the ridge hyperparameter}
#'          \item{"synw"}{The synth weights(for estimating the bias)}
#'          \item{"lambdas"}{List of lambda values evaluated to tune ridge regression}
#'          \item{"lambda_errors"}{"The MSE associated with each lambda term in lambdas."}
#'          \item{"lambda_errors_se"}{"The SE of the MSE associated with each lambda term in lambdas."}
#' }
fit_ridgeaug_inner <- function(X_c, X_1, trt, synth_data,
                               lambda, ridge, scm,
                               lambda_min_ratio, n_lambda,
                               lambda_max,
                               holdout_length, min_1se) {
    lambda_errors <- NULL
    lambda_errors_se <- NULL
    lambdas <- NULL

    ## if SCM fit scm
    if(scm) {
        syn <- fit_synth_formatted(synth_data)$weights
    } else {
        ## else use uniform weights
        syn <- rep(1 / sum(trt == 0), sum(trt == 0))
    }
    if(ridge) {
        if(is.null(lambda)) {
            cv_out <- cv_lambda(X_c, X_1, synth_data, trt, holdout_length, scm,
                      lambda_max, lambda_min_ratio, n_lambda, min_1se)

            lambda <- cv_out$lambda
            lambda_errors <- cv_out$lambda_errors
            lambda_errors_se <- cv_out$lambda_errors_se
            lambdas <- cv_out$lambdas
        }
        # get ridge weights
        ridge_w <- t(t(X_1) - t(X_c) %*% syn) %*%
                    solve(t(X_c) %*% X_c  + lambda * diag(ncol(X_c))) %*% t(X_c)
    } else {
        ridge_w <- matrix(0, ncol = sum(trt == 0), nrow=1)
    }
    ## combine weights
    weights <- syn + t(ridge_w)

    return(list(weights = weights,
                synw = syn,
                lambda = lambda,
                lambdas = lambdas,
                lambda_errors = lambda_errors,
                lambda_errors_se = lambda_errors_se))
}


#' Choose max lambda as largest eigenvalue of control X
#' @param X_c matrix of control lagged outcomes
#' @noRd
#' @return max lambda
get_lambda_max <- function(X_c) {
    svd(X_c)$d[1] ^ 2
}
#' Create list of lambdas
#' @param lambda_min_ratio Ratio of the smallest to largest lambda when tuning lambda values
#' @param n_lambda Number of lambdas to consider between the smallest and largest lambda value
#' @param lambda_max Initial (largest) lambda, if NULL sets it to be (1+norm(X_1-X_c))^2
#' @noRd
#' @return List of lambdas
create_lambda_list <- function(lambda_max, lambda_min_ratio, n_lambda) {
    scaler <- (lambda_min_ratio) ^ (1/n_lambda)
    lambdas <- lambda_max * (scaler ^ (seq(0:n_lambda) - 1))
    return(lambdas)
}

#' Choose either the lambda that minimizes CV MSE or largest lambda within 1 se of min
#' @param lambdas list of lambdas
#' @param lambda_errors The MSE associated with each lambda term in lambdas.
#' @param lambda_errors_se The SE of the MSE associated with each lambda
#' @param min_1se If TRUE, chooses the maximum lambda within 1 standard error of the lambda that minimizes the CV error, if FALSE chooses the optimal lambda; default TRUE
#' @noRd
#' @return optimal lambda
choose_lambda <- function(lambdas, lambda_errors, lambda_errors_se, min_1se) {
    # lambda with smallest error
    min_idx <- which.min(lambda_errors)
    min_error <- lambda_errors[min_idx]
    min_se <- lambda_errors_se[min_idx]
    lambda_min <- lambdas[min_idx]
    # max lambda with error within one se of min
    lambda_1se <- max(lambdas[lambda_errors <= min_error + min_se])
    return(if(min_1se) lambda_1se else lambda_min)
}

#' Choose best lambda with CV
#' @param X_c Matrix of control lagged outcomes
#' @param X_1 Vector of treated leagged outcomes
#' @param synth_data Output of `format_synth`
#' @param trt Vector of treatment indicators
#' @param holdout_length Length of conseuctive holdout period for when tuning lambdas 
#' @param scm Include SCM or not
#' @param lambda_max Initial (largest) lambda, if NULL sets it to be (1+norm(X_1-X_c))^2
#' @param lambda_min_ratio Ratio of the smallest to largest lambda when tuning lambda values
#' @param n_lambda Number of lambdas to consider between the smallest and largest lambda value
#' @param min_1se If TRUE, chooses the maximum lambda within 1 standard error of the lambda 
#' @noRd
#' @return \itemize{
#'          \item{"lambda"}{Value of the ridge hyperparameter}
#'          \item{"lambdas"}{List of lambda values evaluated to tune ridge regression}
#'          \item{"lambda_errors"}{"The MSE associated with each lambda term in lambdas."}
#'          \item{"lambda_errors_se"}{"The SE of the MSE associated with each lambda term}
#' }
cv_lambda <- function(X_c, X_1, synth_data, trt, holdout_length, scm,
                      lambda_max, lambda_min_ratio, n_lambda, min_1se) {
    if(is.null(lambda_max)) {
        lambda_max <- get_lambda_max(X_c) 
    }

    lambdas <- create_lambda_list(lambda_max, lambda_min_ratio, n_lambda)
    
    lambda_out <- get_lambda_errors(lambdas, X_c, X_1,
                                        synth_data, trt,
                                        holdout_length, scm)
    lambda_errors <- lambda_out$lambda_errors
    lambda_errors_se <- lambda_out$lambda_errors_se

    lambda <- choose_lambda(lambdas, lambda_errors, lambda_errors_se, min_1se)

    return(list(lambda = lambda, lambda_errors = lambda_errors,
                lambda_errors_se = lambda_errors_se, lambdas = lambdas))
}


================================================
FILE: R/ridge_lambda.R
================================================
################################################################################
## Function to calculate error on different lambda values if using Ridge Augmented SCM
################################################################################

#' Get Lambda Errors
#' @importFrom stats sd
#'
#' @param lambdas Vector of lambda values to compute errors for
#' @param X_c Matrix of control group pre-treatment outcomes
#' @param X_t Matrix of treatment group pre-treatment outcomes
#' @param synth_data Output of `format_synth`
#' @param trt Boolean vector of treatment assignments
#' @param holdout_length Length of conseuctive holdout period for when tuning lambdas
#' @param scm Include SCM or not
#' @noRd
#' @return List of lambda errors for each corresponding lambda in the lambdas parameter.
get_lambda_errors <- function(lambdas, X_c, X_t, synth_data, trt, holdout_length=1, scm=T) {
  # vector that stores the sum MSE across all CV sets
  errors <- matrix(0, nrow = ncol(X_c) - holdout_length, ncol = length(lambdas))
  lambda_errors = numeric(length(lambdas)) 
  lambda_errors_se = numeric(length(lambdas)) 

  for (i in 1:(ncol(X_c) - holdout_length)) {
    X_0 <- X_c[,-(i:(i + holdout_length - 1))]
    X_1 <- matrix(X_t[-(i:(i + holdout_length - 1))])
    X_0v <- X_c[,i:(i + holdout_length - 1)]
    X_1v <- matrix(X_t[i:(i + holdout_length - 1)], ncol = 1)
    new_synth_data <- synth_data
    new_synth_data$Z1 <- X_1
    new_synth_data$X1 <- X_1
    new_synth_data$Z0 <- t(X_0)
    new_synth_data$X0 <- t(X_0)

    if(scm) {
      syn <- fit_synth_formatted(new_synth_data)$weights
    } else {
      syn <- rep(1/sum(trt==0), sum(trt==0))
    }

    for (j in 1:length(lambdas)) {
      ridge_weights <- t(X_1 - t(X_0) %*% syn) %*% solve(t(X_0) %*% X_0 + lambdas[j] * diag(ncol(X_0))) %*% t(X_0)
      aug_weights <- syn + t(ridge_weights)
      error <- X_1v - t(X_0v) %*% aug_weights
      # take sum of errors across the holdout time periods
      error <- sum(error ^ 2)
      errors[i, j] <-  error
      # lambda_errors[j] <- lambda_errors[j] + error
    }
  }
  lambda_errors <- apply(errors, 2, mean)
  lambda_errors_se <- apply(errors, 2, function(x) sd(x) / sqrt(length(x)))
  return(list(lambda_errors = lambda_errors, 
              lambda_errors_se = lambda_errors_se))
}

================================================
FILE: R/time_regression_multi.R
================================================
##############################################################################
## Outcome regression with multiple treated units
##############################################################################

#' Fit a time regression
#' @param X Matrix of outcomes
#' @param trt Vector of treatment status for each unit
#' @param n_leads How long past treatment effects should be estimated for
#' @param reg_param Regularization hyperparameter
#' @param lowlim Lower bound for coefs
#' @param uplim upper bound for coefs
#' @param ... Extra optimization hyperparameters
#' @noRd
#' @return \itemize{
#'           \item{y0hat }{List of predicted outcome under control}
#'           \item{residuals }{List of residuals}
#'           \item{params }{Regression parameters}}
fit_time_reg <- function(X, trt, n_leads, reg_param, lowlim = 0, uplim = 1, ...) {

    grps <- trt[is.finite(trt)]
    J <- length(grps)
    tmax <- max(trt[is.finite(trt)])

    # fit QP
    reg_weights <- fit_time_reg_qp_(X, trt, n_leads, lowlim, uplim, reg_param, ...)

    # get predicted outcomes (repeated as a matrix) and residuals
    y0hat <- lapply(1:J, 
        function(j) {
            # compute time fixed effects from pure controls
            time_eff <- matrix(colMeans(X[!is.finite(trt),]),
                              nrow=nrow(X), ncol=ncol(X),
                              byrow=T)
            Xj <- X - time_eff
            zero_mat <- matrix(0, nrow = nrow(X), ncol = (tmax - grps[j]))
            Xj <- cbind(zero_mat, Xj[, 1:grps[j], drop = F])
            # take out pure control means
            y0hatj <- Xj %*% reg_weights[,j, drop = F]
            matrix(y0hatj, nrow=nrow(X), ncol=ncol(X)) + time_eff
        })

    residuals <- lapply(1:J, function(j) X - y0hat[[j]])

    return(list(y0hat = y0hat,
                residuals = residuals,
                time_weights = reg_weights))
}


#' Fit a time regression
#' @param X Matrix of outcomes
#' @param trt Vector of treatment status for each unit
#' @param n_leads How long past treatment effects should be estimated for
#' @param reg_param Regularization hyperparameter
#' @param lowlim Lower bound for coefs
#' @param uplim upper bound for coefs
#' @param ... Extra optimization hyperparameters
#' @noRd
#' @return reg_weights Fitted regression weights
fit_time_reg_qp_ <- function(X, trt, n_leads, lowlim, uplim, reg_param, ...) {

    grps <- trt[is.finite(trt)]
    J <- length(grps)
    ttot <- ncol(X)
    max_trt <- max(grps)

    # get data in the right form
    data_mats <- collect_data(X, trt, n_leads)

    # create constraint matrices
    constraints <- make_constraints(J, grps, lowlim, uplim)

    # get components of QP
    Qmat <- get_Qmat(data_mats$pre_mats)
    pvec <- get_pvec(data_mats$pre_mats, data_mats$post_vecs)

    I0 <- get_regularization_matrix(J, max_trt, reg_param)
    # add in regularization
    # I0 <- Matrix::bdiag(reg_param1 * Matrix::Diagonal(max_trt), 
    #                     reg_param2 * Matrix::Diagonal(J * max_trt))
    

    Qmat <- Qmat + I0

    # fit QP
    settings <- osqp::osqpSettings(verbose = FALSE, ...)
    out <- osqp::solve_osqp(Qmat, pvec, constraints$Amat, 
                            constraints$lvec, constraints$uvec, 
                            pars=settings)

    # collect as matrix
    # reg_weights <- matrix(out$x, ncol = J + 1)
    reg_weights <- matrix(out$x, ncol = J)
    # pooled <- reg_weights[,1]
    # add in common component
    # reg_weights <- reg_weights[, 1] + reg_weights[, -1]
    # reverse to calendar time
    # reg_weights <- reg_weights[nrow(reg_weights):1, ]
    return(reg_weights)
}


#' Organize data for the QP
#' @param X Matrix of outcomes
#' @param trt Vector of treatment status for each unit
#' @param n_leads How long past treatment effects should be estimated for
#' @noRd
collect_data <- function(X, trt, n_leads) {

    grps <- trt[is.finite(trt)]
    J <- length(grps)
    ttot <- ncol(X)
    max_trt <- max(grps)


    # sapply(1:ncol(X), 
    #        function(tj) {
    #            mean(X[trt >= tj])
    #        }) -> ctrl_means

    # X <- t(t(X) - ctrl_means)

    # get pre-treatment matrices
    lapply(grps, function(tj) {
        # donor unit pre tj outcomes
        idxs <- trt > tj + n_leads
        pre_mat <- cbind(#1, 
                         matrix(0, nrow = nrow(X), ncol = (max_trt - tj)),
                         X[, 1:tj, drop = F])
        # subtract out pure control means
        pre_mat <- t(t(pre_mat) - colMeans(pre_mat[!is.finite(trt),,drop = F]))
        # restrict to units that won't be treated w/in n_leads
        pre_mat[idxs,,drop = F]
    }) -> pre_mats


    # get post treatment averages
    lapply(grps, 
           function(tj) {
               # avg of donor units post tj outcomes
               idxs <- trt > tj + n_leads
               donors <- rowMeans(X[, (tj + 1):(tj + n_leads), drop = F])
               # subtract out pure control means
               donors <- donors - mean(donors[!is.finite(trt)])
               # restrict to units that won't be treated w/in n_leads
               donors[idxs]
           }) -> post_vecs
    return(list(pre_mats = pre_mats, post_vecs = post_vecs))
}


get_Qmat <- function(pre_mats) {

    #### matrix in QP
    cov_mats <- lapply(pre_mats, function(x) t(x) %*% x)
    # unit specific covariance matrices
    Qmat <- Matrix::bdiag(cov_mats)
    return(Qmat)
}


get_Qmat_pool <- function(pre_mats) {

    #### matrix in QP
    cov_mats <- lapply(pre_mats, function(x) t(x) %*% x)
    pooled_cov <- Reduce(`+`, cov_mats)
    cov_mats_bind <-do.call(rbind, cov_mats)
    # unit specific covariance matrices
    Qmat <- Matrix::bdiag(cov_mats)
    # pooling terms
    Qmat <- rbind(t(cov_mats_bind), Qmat)
    Qmat <- cbind(rbind(pooled_cov, cov_mats_bind), Qmat)

    return(Qmat)
}


get_pvec <- function(pre_mats, post_vecs) {
    # vector in QP
    lapply(1:length(pre_mats), function(j) {
        t(pre_mats[[j]]) %*% post_vecs[[j]]
    }) -> pvec_list

    pvec <- do.call(c, pvec_list)
    
    return(-2 * pvec)
}


get_pvec_pool <- function(pre_mats, post_vecs) {
    # vector in QP
    lapply(1:length(pre_mats), function(j) {
        t(pre_mats[[j]]) %*% post_vecs[[j]]
    }) -> pvec_list

    pvec_pool <- Reduce(`+`, pvec_list)
    pvec <- do.call(c, pvec_list)
    pvec <- c(pvec_pool, pvec)

    return(-2 * pvec)
}


make_constraints <- function(J, grps, lowlim, uplim) {

    tmax <- max(grps)
    # sum to 1 constraints
    A1 <- Matrix::t(Matrix::bdiag(lapply(1:J, function(j) c(0, rep(1, tmax)))))
    A1 <- Matrix::t(Matrix::bdiag(lapply(1:J, function(j) rep(1, tmax))))
    l1 <- rep(1, J)
    # l1 <- rep(-Inf, J)
    u1 <- rep(1, J)
    # u1 <- rep(Inf, J)

    # upper lower limits
    diag_w_intercept <- Matrix::bdiag(list(0, Matrix::Diagonal(tmax)))[-1, ]
    A2 <- Matrix::bdiag(lapply(1:J, function(j) diag_w_intercept))
    A2 <- Matrix::Diagonal(J * tmax)
    # make sure that only weighting times that exist
    l2 <- sapply(1:J, function(j) {
        c(rep(0, tmax - grps[j]), rep(lowlim, grps[j]))
    })
    
    u2 <- sapply(1:J, function(j) {
        c(rep(0, tmax - grps[j]), rep(uplim, grps[j]))
    })

    # combine

    Amat <- rbind(A1, A2)
    lvec <- c(l1, l2)
    uvec <- c(u1, u2)
    return(list(Amat = Amat, lvec = lvec, uvec = uvec))
}

make_constraints_pool <- function(J, grps, lowlim, uplim) {

    tmax <- max(grps)
    # sum to 1 constraints
    A1 <- cbind(0,
                matrix(1, ncol = tmax, nrow = J),
                Matrix::t(Matrix::bdiag(lapply(1:J, function(j) c(0, rep(1, tmax))))))
    
    l1 <- rep(1, J)
    # l1 <- rep(-Inf, J)
    u1 <- rep(1, J)
    # u1 <- rep(Inf, J)

    # upper lower limits
    diag_w_intercept <- Matrix::bdiag(list(0, Matrix::Diagonal(tmax)))[-1, ]
    pool_A2 <- do.call(rbind, lapply(1:J, function(j) diag_w_intercept))
    A2 <- Matrix::bdiag(lapply(1:J, function(j) diag_w_intercept))
    A2 <- cbind(pool_A2, A2)

    # restrict global intercept to 0
    A2 <- rbind(c(1, numeric(ncol(A2) - 1)), A2)
    
    # make sure that only weighting times that exist
    l2 <- sapply(1:J, function(j) {
        c(rep(0, tmax - grps[j]), rep(lowlim, grps[j]))
    })
    l2 <- c(0, l2)
    
    u2 <- sapply(1:J, function(j) {
        c(rep(0, tmax - grps[j]), rep(uplim, grps[j]))
    })
    u2 <- c(0, u2)
    # l2 <- rep(lowlim, J * tmax)
    # u2 <- rep(uplim, J  * tmax)

    # combine

    Amat <- rbind(A1, A2)
    lvec <- c(l1, l2)
    uvec <- c(u1, u2)
    return(list(Amat = Amat, lvec = lvec, uvec = uvec))
}

get_regularization_matrix <- function(J, max_trt, reg_param) {

    single_reg_mat <- Matrix::bdiag(list(0, Matrix::Diagonal(max_trt)))
    I0 <- Matrix::bdiag(lapply(1:J,function(j) single_reg_mat))
    I0 <- reg_param * Matrix::Diagonal(J * max_trt)
    return(reg_param * I0)
}

get_regularization_matrix_pool <- function(J, max_trt, reg_param1, reg_param2) {

    single_reg_mat <- Matrix::bdiag(list(0, Matrix::Diagonal(max_trt)))
        grp_reg_mats <- Matrix::bdiag(lapply(1:J,function(j) single_reg_mat))
        I0 <- Matrix::bdiag(reg_param1 * single_reg_mat,
                            reg_param2 * grp_reg_mats)

    return(I0)
}

================================================
FILE: README.md
================================================
# augsynth: Augmented Synthetic Control Method
[![Build Status](https://travis-ci.org/ebenmichael/augsynth.svg?branch=master)](https://travis-ci.org/ebenmichael/augsynth) [![Project Status: Active  The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)


## Overview
This package implements the Augmented Synthetic Control Method (ASCM).

For a more detailed description of the main functionality check out:
- [the vignette for simultaneous adoption](https://github.com/ebenmichael/augsynth/blob/master/vignettes/singlesynth-vignette.md)
- [the vignette for staggered adoption](https://github.com/ebenmichael/augsynth/blob/master/vignettes/multisynth-vignette.md)

## Installation
To install this package, first ensure that `devtools` is installed with

```
install.packages("devtools")
```

then install the package from GitHub with

```
devtools::install_github("ebenmichael/augsynth")
```

## Basic usage
To get started, use a panel dataset with an `outcome` measure, a `treatment` indicator, a `unit` indicator, a `time` variable, and an intervention time `t_int`. Then run


```
asyn <- augsynth(outcome ~ trt, unit, time, t_int, data)
```


================================================
FILE: data-raw/clean_kansas.R
================================================
library(haven)
library(tidyverse)

kansas <- read_dta("kansas_longer2.dta")
state_abb <- read_csv("us-state-ansi-fips.csv") %>%
                rename(fips = st, abb = stusps) %>%
                mutate(fips = as.numeric(fips)) %>%
                select(fips, abb)
kansas <- kansas %>%
        rename(fips=Fips) %>%
        filter(year >= 1990,
               !is.na(fips), # filter out all of US
               fips != 11, # filter out DC
            #    year_qtr >= 2005 | year_qtr == round(year_qtr)
               ) %>%
        # interpolate GDP
        mutate(year_qtr = year + qtr / 4 - 0.25, # combine year and quarter
               fips = as.integer(fips), # state id
               treated = 1 * (fips == 20) * (year_qtr >= 2012.25),
               gdp = ifelse((qtr == 1) | (year >= 2005), gdp, NA),
               popestimate = ifelse((qtr == 1), popestimate, NA)) %>%
        # interpolate GDP and population
        group_by(fips) %>%
        arrange(year_qtr) %>%
        mutate(gdp = approx(year_qtr, gdp, year_qtr)$y,
               popestimate = approx(year_qtr, popestimate, year_qtr)$y) %>%
        ungroup() %>% arrange(fips, year_qtr) %>%
        mutate(gdpcapita = gdp / popestimate * 1e6,
               lngdp = log(gdp),
               lngdpcapita = log(gdpcapita),
               revstatecapita = rev_state_total / popestimate * 1e6,
               revlocalcapita = rev_local_total / popestimate * 1e6,
               emplvl1capita = month1_emplvl / popestimate,
               emplvl2capita = month2_emplvl / popestimate,
               emplvl3capita = month3_emplvl / popestimate,
               emplvlcapita = (month1_emplvl + month2_emplvl + month3_emplvl) / (3 * popestimate),
               totalwagescapita = total_qtrly_wages / popestimate,
               taxwagescapita = taxable_qtrly_wages / popestimate,
               avgwklywagecapita = avg_wkly_wage,
               estabscapita = qtrly_estabs_count / popestimate) %>%
        filter(year_qtr <= 2016) %>%
        inner_join(state_abb)

for (name in colnames(kansas)) {
        attributes(kansas[[name]])$label = NULL 
}


================================================
FILE: man/augsynth-package.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\docType{package}
\name{augsynth-package}
\alias{augsynth-package}
\title{augsynth}
\description{
A package implementing the Augmented Synthetic Controls Method
}


================================================
FILE: man/augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth_pre.R
\name{augsynth}
\alias{augsynth}
\title{Fit Augmented SCM}
\usage{
augsynth(form, unit, time, data, t_int = NULL, ...)
}
\arguments{
\item{form}{outcome ~ treatment | auxillary covariates}

\item{unit}{Name of unit column}

\item{time}{Name of time column}

\item{data}{Panel data as dataframe}

\item{t_int}{Time of intervention (used for single-period treatment only)}

\item{...}{Optional arguments
\itemize{
  \item Single period augsynth with/without multiple outcomes
    \itemize{
      \item{"progfunc"}{What function to use to impute control outcomes: Ridge=Ridge regression (allows for standard errors), None=No outcome model, EN=Elastic Net, RF=Random Forest, GSYN=gSynth, MCP=MCPanel, CITS=CITS, CausalImpact=Bayesian structural time series with CausalImpact, seq2seq=Sequence to sequence learning with feedforward nets}
      \item{"scm"}{Whether the SCM weighting function is used}
      \item{"fixedeff"}{Whether to include a unit fixed effect, default F }
      \item{"cov_agg"}{Covariate aggregation functions, if NULL then use mean with NAs omitted}
    }
  \item Multi period (staggered) augsynth
   \itemize{
         \item{"relative"}{Whether to compute balance by relative time}
         \item{"n_leads"}{How long past treatment effects should be estimated for}
         \item{"n_lags"}{Number of pre-treatment periods to balance, default is to balance all periods}
         \item{"alpha"}{Fraction of balance for individual balance}
         \item{"lambda"}{Regularization hyperparameter, default = 0}
         \item{"force"}{Include "none", "unit", "time", "two-way" fixed effects. Default: "two-way"}
         \item{"n_factors"}{Number of factors for interactive fixed effects, default does CV}
        }
}}
}
\value{
augsynth object that contains:
        \itemize{
         \item{"weights"}{weights}
         \item{"data"}{Panel data as matrices}
        }
}
\description{
Fit Augmented SCM
}


================================================
FILE: man/augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{augsynth_multiout}
\alias{augsynth_multiout}
\title{Fit Augmented SCM with multiple outcomes}
\usage{
augsynth_multiout(
  form,
  unit,
  time,
  t_int,
  data,
  progfunc = c("Ridge", "None"),
  scm = T,
  fixedeff = FALSE,
  cov_agg = NULL,
  combine_method = "avg",
  ...
)
}
\arguments{
\item{form}{outcome ~ treatment | auxillary covariates}

\item{unit}{Name of unit column}

\item{time}{Name of time column}

\item{t_int}{Time of intervention}

\item{data}{Panel data as dataframe}

\item{progfunc}{What function to use to impute control outcomes
Ridge=Ridge regression (allows for standard errors),
None=No outcome model,}

\item{scm}{Whether the SCM weighting function is used}

\item{fixedeff}{Whether to include a unit fixed effect, default F}

\item{cov_agg}{Covariate aggregation functions, if NULL then use mean with NAs omitted}

\item{combine_method}{How to combine outcomes: `concat` concatenates outcomes and `avg` averages them, default: 'avg'}

\item{...}{optional arguments for outcome model}
}
\value{
augsynth object that contains:
        \itemize{
         \item{"weights"}{Ridge ASCM weights}
         \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
         \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
         \item{"mhat"}{Outcome model estimate}
         \item{"data"}{Panel data as matrices}
        }
}
\description{
Fit Augmented SCM with multiple outcomes
}


================================================
FILE: man/check_data_stag.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/format.R
\name{check_data_stag}
\alias{check_data_stag}
\title{Check that we can actually run multisynth on the data}
\usage{
check_data_stag(wide, fixedeff, n_leads, n_lags)
}
\arguments{
\item{wide}{Output of format_data_stag}

\item{fixedeff}{Whether to include a unit fixed effect}

\item{n_leads}{How long past treatment effects should be estimated for, default is number of post treatment periods for last treated unit}

\item{n_lags}{Number of pre-treatment periods to balance, default is to balance all periods}
}
\description{
Check that we can actually run multisynth on the data
}


================================================
FILE: man/conformal_inf.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{conformal_inf}
\alias{conformal_inf}
\title{Conformal inference procedure to compute p-values and point-wise confidence intervals}
\usage{
conformal_inf(
  ascm,
  alpha = 0.05,
  stat_func = NULL,
  type = "iid",
  q = 1,
  ns = 1000,
  grid_size = 50
)
}
\arguments{
\item{ascm}{Fitted `augsynth` object}

\item{alpha}{Confidence level}

\item{stat_func}{Function to compute test statistic}

\item{type}{Either "iid" for iid permutations or "block" for moving block permutations; default is "block"}

\item{q}{The norm for the test static `((sum(x ^ q))) ^ (1/q)`}

\item{ns}{Number of resamples for "iid" permutations}

\item{grid_size}{Number of grid points to use when inverting the hypothesis test}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
         \item{"se"}{Standard error, always NA but returned for compatibility}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"p_val"}{p-value for test of no post-treatment effect}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Conformal inference procedure to compute p-values and point-wise confidence intervals
}


================================================
FILE: man/conformal_inf_linear.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{conformal_inf_linear}
\alias{conformal_inf_linear}
\title{Conformal inference procedure to compute a confidence interval for a linear in time effect}
\usage{
conformal_inf_linear(
  ascm,
  alpha = 0.05,
  stat_func = NULL,
  type = "iid",
  q = 1,
  ns = 1000,
  grid_size = 50
)
}
\arguments{
\item{ascm}{Fitted `augsynth` object}

\item{alpha}{Confidence level}

\item{stat_func}{Function to compute test statistic}

\item{type}{Either "iid" for iid permutations or "block" for moving block permutations; default is "iid"}

\item{q}{The norm for the test static `((sum(x ^ q))) ^ (1/q)`}

\item{ns}{Number of resamples for "iid" permutations}

\item{grid_size}{Number of grid points to use when inverting the hypothesis test}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
         \item{"se"}{Standard error, always NA but returned for compatibility}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"p_val"}{p-value for test of no post-treatment effect}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Conformal inference procedure to compute a confidence interval for a linear in time effect
}


================================================
FILE: man/conformal_inf_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{conformal_inf_multiout}
\alias{conformal_inf_multiout}
\title{Conformal inference procedure to compute p-values and point-wise confidence intervals}
\usage{
conformal_inf_multiout(
  ascm_multi,
  alpha = 0.05,
  stat_func = NULL,
  type = "iid",
  q = 1,
  ns = 1000,
  grid_size = 1,
  lin_h0 = NULL
)
}
\arguments{
\item{alpha}{Confidence level}

\item{stat_func}{Function to compute test statistic}

\item{type}{Either "iid" for iid permutations or "block" for moving block permutations}

\item{q}{The norm for the test static `((sum(x ^ q))) ^ (1/q)`}

\item{ns}{Number of resamples for "iid" permutations}

\item{grid_size}{Number of grid points to use when inverting the hypothesis test (default is 1, so only to test joint null)}

\item{ascm}{Fitted `augsynth` object}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
         \item{"se"}{Standard error, always NA but returned for compatibility}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"p_val"}{p-value for test of no post-treatment effect}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Conformal inference procedure to compute p-values and point-wise confidence intervals
}


================================================
FILE: man/get_nona_donors.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/eligible_donors.R
\name{get_nona_donors}
\alias{get_nona_donors}
\title{Get donors that don't have missing outcomes where treated units have outcomes}
\usage{
get_nona_donors(X, y, trt, n_lags, n_leads, time_cohort)
}
\description{
Get donors that don't have missing outcomes where treated units have outcomes
}


================================================
FILE: man/jackknife_se_single.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{jackknife_se_single}
\alias{jackknife_se_single}
\title{Estimate standard errors for single ASCM with the jackknife
Do this for ridge-augmented synth}
\usage{
jackknife_se_single(ascm)
}
\arguments{
\item{ascm}{Fitted augsynth object}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"se"}{Standard error estimate}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Estimate standard errors for single ASCM with the jackknife
Do this for ridge-augmented synth
}


================================================
FILE: man/kansas.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/data.R
\docType{data}
\name{kansas}
\alias{kansas}
\title{Economic indicators for US states from 1990-2016}
\format{
A dataframe with 5250 rows and 32 variables:
\describe{
 \item{fips}{FIPS code for each state}
 \item{year}{Year of measurement}
 \item{qtr}{Quarter (1-4) of measurement}
 \item{state}{Name of State}
 \item{gdp}{Gross State Product (millions of $) Values before 2005 are linearly interpolated between years}
 \item{revenuepop}{State and local revenue per capita}
 \item{rev_state_total}{State total general revenue (millions of $)}
 \item{rev_local_total}{Local total general revenue (millions of $)}
 \item{popestimate}{Population estimate}
 \item{qtrly_estabs_count}{Count of establishments for a given quarter}
 \item{month1_emplvl, month2_emplvl, month3_emplvl}{ Employment level for first, second, and third months of a given quarter}
 \item{total_qtrly_wages}{Total wages for a givne quarter}
 \item{taxable_qtrly_wage}{Taxable wages for a given quarter}
 \item{avg_wkly_wage}{Average weekly wage for a given quarter}
 \item{year_qtr}{Year and quarter combined into one continuous variable}
 \item{treated}{Whether the state passed tax cuts before the given year and quareter}
 \item{lngdpcapita}{Natural log of GDP per capita}
 \item{emplvlcapita}{Average employment level per capita}
 \item{Xcapita}{Per capita value of X}
 \item{abb}{State abbreviation}
}
}
\usage{
kansas
}
\description{
Economic indicators for US states from 1990-2016
}
\keyword{datasets}


================================================
FILE: man/make_V_matrix.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fit_synth.R
\name{make_V_matrix}
\alias{make_V_matrix}
\title{Make a V matrix from a vector (or null)}
\usage{
make_V_matrix(t0, V)
}
\description{
Make a V matrix from a vector (or null)
}


================================================
FILE: man/multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{multisynth}
\alias{multisynth}
\title{Fit staggered synth}
\usage{
multisynth(
  form,
  unit,
  time,
  data,
  n_leads = NULL,
  n_lags = NULL,
  nu = NULL,
  lambda = 0,
  V = NULL,
  fixedeff = TRUE,
  n_factors = 0,
  scm = T,
  time_cohort = F,
  how_match = "knn",
  cov_agg = NULL,
  eps_abs = 1e-04,
  eps_rel = 1e-04,
  verbose = FALSE,
  ...
)
}
\arguments{
\item{form}{outcome ~ treatment | weighting covariates | approximate matching covaraites | exact matching covariates
\itemize{
   \item{outcome}{Name of the outcome of interest}
   \item{treatment}{Name of the treatment assignment variable}
   \item{weighting covariates}{Auxiliary covariates to weight on}
   \item{approximate matching covariates}{Auxiliary covariates to approximately match one before weighting}
   \item{exact matching covariates}{Auxiliary covariates to exactly match on before weighting}
}
If covariates are time-varying, their average value before the first unit is treated will be used. This can be changed by supplying a custom aggregation function to cov_agg.}

\item{unit}{Name of unit column}

\item{time}{Name of time column}

\item{data}{Panel data as dataframe}

\item{n_leads}{How long past treatment effects should be estimated for, default is number of post treatment periods for last treated unit}

\item{n_lags}{Number of pre-treatment periods to balance, default is to balance all periods}

\item{nu}{Fraction of balance for individual balance}

\item{lambda}{Regularization hyperparameter, default = 0}

\item{V}{Scaling matrix for synth optimization, default NULL is identity}

\item{fixedeff}{Whether to include a unit fixed effect, default TRUE}

\item{n_factors}{Number of factors for interactive fixed effects, setting to NULL fits with CV, default is 0}

\item{scm}{Whether to fit scm weights}

\item{time_cohort}{Whether to average synthetic controls into time cohorts, default FALSE}

\item{cov_agg}{Covariate aggregation function}

\item{eps_abs}{Absolute error tolerance for osqp}

\item{eps_rel}{Relative error tolerance for osqp}

\item{verbose}{Whether to print logs for osqp}

\item{...}{Extra arguments}
}
\value{
multisynth object that contains:
        \itemize{
         \item{"weights"}{weights matrix where each column is a set of weights for a treated unit}
         \item{"data"}{Panel data as matrices}
         \item{"imbalance"}{Matrix of treatment minus synthetic control for pre-treatment time periods, each column corresponds to a treated unit}
         \item{"global_l2"}{L2 imbalance for the pooled synthetic control}
         \item{"scaled_global_l2"}{L2 imbalance for the pooled synthetic control, scaled by the imbalance for unitform weights}
         \item{"ind_l2"}{Average L2 imbalance for the individual synthetic controls}
         \item{"scaled_ind_l2"}{Average L2 imbalance for the individual synthetic controls, scaled by the imbalance for unitform weights}
        \item{"n_leads", "n_lags"}{Number of post treatment outcomes (leads) and pre-treatment outcomes (lags) to include in the analysis}
         \item{"nu"}{Fraction of balance for individual balance}
         \item{"lambda"}{Regularization hyperparameter}
         \item{"scm"}{Whether to fit scm weights}
         \item{"grps"}{Time periods for treated units}
         \item{"y0hat"}{Pilot estimates of control outcomes}
         \item{"residuals"}{Difference between the observed outcomes and the pilot estimates}
         \item{"n_factors"}{Number of factors for interactive fixed effects}
        }
}
\description{
Fit staggered synth
}


================================================
FILE: man/plot.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{plot.augsynth}
\alias{plot.augsynth}
\title{Plot function for augsynth}
\usage{
\method{plot}{augsynth}(x, inf = T, cv = F, ...)
}
\arguments{
\item{x}{Augsynth object to be plotted}

\item{inf}{Boolean, whether to get confidence intervals around the point estimates}

\item{cv}{If True, plot cross validation MSE against hyper-parameter, otherwise plot effects}

\item{...}{Optional arguments}
}
\description{
Plot function for augsynth
}


================================================
FILE: man/plot.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{plot.augsynth_multiout}
\alias{plot.augsynth_multiout}
\title{Plot function for summary function for augsynth}
\usage{
\method{plot}{augsynth_multiout}(x, inf = T, plt_avg = F, ...)
}
\arguments{
\item{x}{summary.augsynth_multiout object}

\item{inf}{Boolean, whether to plot uncertainty intervals, default TRUE}

\item{plt_avg}{Boolean, whether to plot the average of the outcomes, default FALSE}

\item{...}{Optional arguments for summary function}
}
\description{
Plot function for summary function for augsynth
}


================================================
FILE: man/plot.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{plot.multisynth}
\alias{plot.multisynth}
\title{Plot function for multisynth}
\usage{
\method{plot}{multisynth}(
  x,
  inf_type = "bootstrap",
  inf = T,
  levels = NULL,
  label = T,
  weights = FALSE,
  ...
)
}
\arguments{
\item{x}{Augsynth object to be plotted}

\item{inf_type}{Type of inference to perform:
 \itemize{
   \item{bootstrap}{Wild bootstrap, the default option}
   \item{jackknife}{Jackknife}
}}

\item{inf}{Whether to compute and plot confidence intervals}

\item{levels}{Which units/groups to plot, default is every group}

\item{label}{Whether to label the individual levels}

\item{weights}{Whether to plot the weights, default = FALSE}

\item{...}{Optional arguments}
}
\description{
Plot function for multisynth
}


================================================
FILE: man/plot.summary.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{plot.summary.augsynth}
\alias{plot.summary.augsynth}
\title{Plot function for summary function for augsynth}
\usage{
\method{plot}{summary.augsynth}(x, inf = T, ...)
}
\arguments{
\item{x}{Summary object}

\item{inf}{Boolean, whether to plot confidence intervals}

\item{...}{Optional arguments}
}
\description{
Plot function for summary function for augsynth
}


================================================
FILE: man/plot.summary.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{plot.summary.augsynth_multiout}
\alias{plot.summary.augsynth_multiout}
\title{Plot function for summary function for augsynth}
\usage{
\method{plot}{summary.augsynth_multiout}(x, inf = F, plt_avg = F, ...)
}
\arguments{
\item{x}{summary.augsynth_multiout object}

\item{inf}{Boolean, whether to plot uncertainty intervals, default TRUE}

\item{plt_avg}{Boolean, whether to plot the average of the outcomes, default FALSE}
}
\description{
Plot function for summary function for augsynth
}


================================================
FILE: man/plot.summary.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{plot.summary.multisynth}
\alias{plot.summary.multisynth}
\title{Plot function for summary function for multisynth}
\usage{
\method{plot}{summary.multisynth}(x, inf = T, levels = NULL, label = T, weights = FALSE, ...)
}
\arguments{
\item{x}{summary object}

\item{inf}{Whether to plot confidence intervals}

\item{levels}{Which units/groups to plot, default is every group}

\item{label}{Whether to label the individual levels}

\item{weights}{Whether to plot the weights, default = FALSE}

\item{...}{Optional arguments}
}
\description{
Plot function for summary function for multisynth
}


================================================
FILE: man/predict.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{predict.augsynth}
\alias{predict.augsynth}
\title{Get prediction of ATT or average outcome under control}
\usage{
\method{predict}{augsynth}(object, att = F, ...)
}
\arguments{
\item{object}{augsynth object}

\item{att}{If TRUE, return the ATT, if FALSE, return imputed counterfactual}

\item{...}{Optional arguments}
}
\value{
Vector of predicted post-treatment control averages
}
\description{
Get prediction of ATT or average outcome under control
}


================================================
FILE: man/predict.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{predict.augsynth_multiout}
\alias{predict.augsynth_multiout}
\title{Get prediction of ATT or average outcome under control}
\usage{
\method{predict}{augsynth_multiout}(object, ...)
}
\arguments{
\item{object}{augsynth_multiout object}

\item{...}{Optional arguments, including \itemize{\item{"att"}{Whether to return the ATT or average outcome under control}}}
}
\value{
Vector of predicted post-treatment control averages
}
\description{
Get prediction of ATT or average outcome under control
}


================================================
FILE: man/predict.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{predict.multisynth}
\alias{predict.multisynth}
\title{Get prediction of average outcome under control or ATT}
\usage{
\method{predict}{multisynth}(object, att = F, att_weight = NULL, bs_weight = NULL, ...)
}
\arguments{
\item{object}{Fit multisynth object}

\item{att}{If TRUE, return the ATT, if FALSE, return imputed counterfactual}

\item{att_weight}{Weights to place on individual units/cohorts when averaging}

\item{bs_weight}{Weight to perturb units by for weighted bootstrap}

\item{...}{Optional arguments}
}
\value{
Matrix of predicted post-treatment control outcomes for each treated unit
}
\description{
Get prediction of average outcome under control or ATT
}


================================================
FILE: man/print.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{print.augsynth}
\alias{print.augsynth}
\title{Print function for augsynth}
\usage{
\method{print}{augsynth}(x, ...)
}
\arguments{
\item{x}{augsynth object}

\item{...}{Optional arguments}
}
\description{
Print function for augsynth
}


================================================
FILE: man/print.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{print.augsynth_multiout}
\alias{print.augsynth_multiout}
\title{Print function for augsynth}
\usage{
\method{print}{augsynth_multiout}(x, ...)
}
\arguments{
\item{x}{augsynth_multiout object}

\item{...}{Optional arguments}
}
\description{
Print function for augsynth
}


================================================
FILE: man/print.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{print.multisynth}
\alias{print.multisynth}
\title{Print function for multisynth}
\usage{
\method{print}{multisynth}(x, att_weight = NULL, ...)
}
\arguments{
\item{x}{multisynth object}

\item{...}{Optional arguments}
}
\description{
Print function for multisynth
}


================================================
FILE: man/print.summary.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{print.summary.augsynth}
\alias{print.summary.augsynth}
\title{Print function for summary function for augsynth}
\usage{
\method{print}{summary.augsynth}(x, ...)
}
\arguments{
\item{x}{summary object}

\item{...}{Optional arguments}
}
\description{
Print function for summary function for augsynth
}


================================================
FILE: man/print.summary.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{print.summary.augsynth_multiout}
\alias{print.summary.augsynth_multiout}
\title{Print function for summary function for augsynth}
\usage{
\method{print}{summary.augsynth_multiout}(x, ...)
}
\arguments{
\item{x}{summary.augsynth_multiout object}

\item{...}{Optional arguments}
}
\description{
Print function for summary function for augsynth
}


================================================
FILE: man/print.summary.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{print.summary.multisynth}
\alias{print.summary.multisynth}
\title{Print function for summary function for multisynth}
\usage{
\method{print}{summary.multisynth}(x, level = "Average", ...)
}
\arguments{
\item{x}{summary object}

\item{level}{Which unit/group to print results for, default is the overall average}

\item{...}{Optional arguments}
}
\description{
Print function for summary function for multisynth
}


================================================
FILE: man/rdirichlet_b.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{rdirichlet_b}
\alias{rdirichlet_b}
\title{Bayesian bootstrap}
\usage{
rdirichlet_b(n)
}
\arguments{
\item{n}{Number of units}
}
\description{
Bayesian bootstrap
}


================================================
FILE: man/rmultinom_b.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{rmultinom_b}
\alias{rmultinom_b}
\title{Non-parametric bootstrap}
\usage{
rmultinom_b(n)
}
\arguments{
\item{n}{Number of units}
}
\description{
Non-parametric bootstrap
}


================================================
FILE: man/rwild_b.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{rwild_b}
\alias{rwild_b}
\title{Wild bootstrap (Mammen 1993)}
\usage{
rwild_b(n)
}
\arguments{
\item{n}{Number of units}
}
\description{
Wild bootstrap (Mammen 1993)
}


================================================
FILE: man/single_augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{single_augsynth}
\alias{single_augsynth}
\title{Fit Augmented SCM}
\usage{
single_augsynth(
  form,
  unit,
  time,
  t_int,
  data,
  progfunc = "ridge",
  scm = T,
  fixedeff = FALSE,
  cov_agg = NULL,
  ...
)
}
\arguments{
\item{form}{outcome ~ treatment | auxillary covariates}

\item{unit}{Name of unit column}

\item{time}{Name of time column}

\item{t_int}{Time of intervention}

\item{data}{Panel data as dataframe}

\item{progfunc}{What function to use to impute control outcomes
ridge=Ridge regression (allows for standard errors),
none=No outcome model,
en=Elastic Net, RF=Random Forest, GSYN=gSynth,
mcp=MCPanel, 
cits=Comparitive Interuppted Time Series
causalimpact=Bayesian structural time series with CausalImpact}

\item{scm}{Whether the SCM weighting function is used}

\item{fixedeff}{Whether to include a unit fixed effect, default F}

\item{cov_agg}{Covariate aggregation functions, if NULL then use mean with NAs omitted}

\item{...}{optional arguments for outcome model}
}
\value{
augsynth object that contains:
        \itemize{
         \item{"weights"}{Ridge ASCM weights}
         \item{"l2_imbalance"}{Imbalance in pre-period outcomes, measured by the L2 norm}
         \item{"scaled_l2_imbalance"}{L2 imbalance scaled by L2 imbalance of uniform weights}
         \item{"mhat"}{Outcome model estimate}
         \item{"data"}{Panel data as matrices}
        }
}
\description{
Fit Augmented SCM
}


================================================
FILE: man/summary.augsynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/augsynth.R
\name{summary.augsynth}
\alias{summary.augsynth}
\title{Summary function for augsynth}
\usage{
\method{summary}{augsynth}(object, inf = T, inf_type = "conformal", linear_effect = F, ...)
}
\arguments{
\item{object}{augsynth object}

\item{inf}{Boolean, whether to get confidence intervals around the point estimates}

\item{inf_type}{Type of inference algorithm. Options are
\itemize{
 \item{"conformal"}{Conformal inference (default)}
 \item{"jackknife+"}{Jackknife+ algorithm over time periods}
 \item{"jackknife"}{Jackknife over units}
}}

\item{linear_effect}{Boolean, whether to invert the conformal inference hypothesis test to get confidence intervals for a linear-in-time treatment effect: intercept + slope * time}

\item{...}{Optional arguments for inference, for more details for each `inf_type` see
\itemize{
 \item{"conformal"}{`conformal_inf`}
 \item{"jackknife+"}{`time_jackknife_plus`}
 \item{"jackknife"}{`jackknife_se_single`}
}}
}
\description{
Summary function for augsynth
}


================================================
FILE: man/summary.augsynth_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multi_outcomes.R
\name{summary.augsynth_multiout}
\alias{summary.augsynth_multiout}
\title{Summary function for augsynth}
\usage{
\method{summary}{augsynth_multiout}(object, inf = T, inf_type = "conformal", grid_size = 1, ...)
}
\arguments{
\item{object}{augsynth_multiout object}

\item{inf}{whether or not to perform inference}

\item{grid_size}{Grid to compute prediction intervals over, default is 1 and only p-values are computed}

\item{...}{Optional arguments, including \itemize{\item{"se"}{Whether to plot standard error}}}

\item{inf_typ}{Type of inference, default is "conformal"}
}
\description{
Summary function for augsynth
}


================================================
FILE: man/summary.multisynth.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multisynth_class.R
\name{summary.multisynth}
\alias{summary.multisynth}
\title{Summary function for multisynth}
\usage{
\method{summary}{multisynth}(object, inf_type = "bootstrap", att_weight = NULL, ...)
}
\arguments{
\item{object}{multisynth object}

\item{inf_type}{Type of inference to perform:
 \itemize{
   \item{bootstrap}{Wild bootstrap, the default option}
   \item{jackknife}{Jackknife}
}}

\item{...}{Optional arguments}
}
\value{
summary.multisynth object that contains:
        \itemize{
         \item{"att"}{Dataframe with ATT estimates, standard errors for each treated unit}
         \item{"global_l2"}{L2 imbalance for the pooled synthetic control}
         \item{"scaled_global_l2"}{L2 imbalance for the pooled synthetic control, scaled by the imbalance for unitform weights}
         \item{"ind_l2"}{Average L2 imbalance for the individual synthetic controls}
         \item{"scaled_ind_l2"}{Average L2 imbalance for the individual synthetic controls, scaled by the imbalance for unitform weights}
        \item{"n_leads", "n_lags"}{Number of post treatment outcomes (leads) and pre-treatment outcomes (lags) to include in the analysis}
        }
}
\description{
Summary function for multisynth
}


================================================
FILE: man/time_jackknife_plus.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{time_jackknife_plus}
\alias{time_jackknife_plus}
\title{Jackknife+ algorithm over time}
\usage{
time_jackknife_plus(ascm, alpha = 0.05, conservative = F)
}
\arguments{
\item{ascm}{Fitted `augsynth` object}

\item{alpha}{Confidence level}

\item{conservative}{Whether to use the conservative jackknife+ procedure}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
         \item{"se"}{Standard error, always NA but returned for compatibility}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Jackknife+ algorithm over time
}


================================================
FILE: man/time_jackknife_plus_multiout.Rd
================================================
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inference.R
\name{time_jackknife_plus_multiout}
\alias{time_jackknife_plus_multiout}
\title{Jackknife+ algorithm over time}
\usage{
time_jackknife_plus_multiout(ascm_multi, alpha = 0.05, conservative = F)
}
\arguments{
\item{alpha}{Confidence level}

\item{conservative}{Whether to use the conservative jackknife+ procedure}

\item{ascm}{Fitted `augsynth` object}
}
\value{
List that contains:
        \itemize{
         \item{"att"}{Vector of ATT estimates}
         \item{"heldout_att"}{Vector of ATT estimates with the time period held out}
         \item{"se"}{Standard error, always NA but returned for compatibility}
         \item{"lb"}{Lower bound of 1 - alpha confidence interval}
         \item{"ub"}{Upper bound of 1 - alpha confidence interval}
         \item{"alpha"}{Level of confidence interval}
        }
}
\description{
Jackknife+ algorithm over time
}


================================================
FILE: pkg.Rproj
================================================
Version: 1.0

RestoreWorkspace: No
SaveWorkspace: No
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
Encoding: UTF-8

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes

BuildType: Package
PackageUseDevtools: Yes
PackageInstallArgs: --no-multiarch --with-keep.source
PackageRoxygenize: rd,collate,namespace


================================================
FILE: tests/testthat/test_augsynth_pre.R
================================================
context("Testing that top level API runs the right functions")

library(Synth)


test_that("augsynth runs single_synth when there is a single treated unit", {

  data(basque)
  basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              regionno != 17 ~0,
                                              regionno == 17 ~ 1)) %>%
      filter(regionno != 1)

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque,
                  progfunc = "None", scm = T, t_int = 1975)

  syn_single <- single_augsynth(gdpcap ~ trt, regionno, year, basque,
                                progfunc = "None", scm = T, t_int = 1975)

  expect_equal(syn$weights, syn_single$weights)
})


test_that("augsynth finds the treated time when is a single treated unit", {

  data(basque)
  basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              regionno != 17 ~0,
                                              regionno == 17 ~ 1)) %>%
      filter(regionno != 1)

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque,
                  progfunc = "None", scm = T, t_int = 1975)

  syn2 <- augsynth(gdpcap ~ trt, regionno, year, basque,
                  progfunc = "None", scm = T)

  expect_equal(syn$weights, syn2$weights, tolerance = 1e-6)

  # should work with out of order time as well
  syn_rev <- augsynth(gdpcap ~ trt, regionno, year,
                      basque %>% arrange(desc(year)),
                      progfunc = "None", scm = T)
  expect_equal(syn$weights, syn_rev$weights)
  expect_equal(predict(syn), predict(syn_rev))
})


test_that("augsynth runs single_synth when there is simultaneous adoption", {

  data(basque)
  basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              !regionno %in% c(16, 17) ~0,
                                              regionno %in% c(16, 17) ~ 1)) %>%
      filter(regionno != 1)

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque,
                  progfunc = "None", scm = T, t_int = 1975)

  syn_single <- single_augsynth(gdpcap ~ trt, regionno, year, basque,
                                progfunc = "None", scm = T, t_int = 1975)

  expect_equal(syn$weights, syn_single$weights)
})


test_that("augsynth runs multisynth when there is staggered adoption", {

  data(basque)
  basque <- basque %>% mutate(trt = case_when((regionno == 17) & (year == 1975) ~ 1,
                                              (regionno == 16) & (year == 1980) ~ 1,
                                              TRUE ~ 0)) %>%
      filter(regionno != 1)

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, scm = T)

  syn_multi <- multisynth(gdpcap ~ trt, regionno, year, basque)

  expect_equal(syn$weights, syn_multi$weights,  tolerance = 1e-5)
})


test_that("augsynth with a single treated unit doesn't depend on unit order", {

  data(kansas)


  syn <- augsynth(lngdpcapita ~ treated | log(revstatecapita), abb,
                  year_qtr, kansas, progfunc = "None", scm = T)

  syn2 <- augsynth(lngdpcapita ~ treated | log(revstatecapita), fips, year_qtr,
                   kansas %>% arrange(desc(fips)), progfunc = "None", scm = T)


  expect_equal(predict(syn), predict(syn2))


  asyn <- augsynth(lngdpcapita ~ treated | log(revstatecapita), fips,
                  year_qtr, kansas, progfunc = "ridge", scm = T)

  asyn2 <- augsynth(lngdpcapita ~ treated | log(revstatecapita), fips, year_qtr,
                   kansas %>% arrange(desc(fips)), progfunc = "ridge", scm = T)

  expect_equal(c(asyn$weights), c(asyn2$weights))
  expect_equal(predict(asyn), predict(asyn2))


})


test_that("augsynth runs single_synth with progfunc = 'ridge' when there is a single treated unit and no progfunc is specified", {

  data(basque)
  basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              regionno != 17 ~0,
                                              regionno == 17 ~ 1)) %>%
      filter(regionno != 1)

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, scm = T)

  syn_single <- single_augsynth(gdpcap ~ trt, regionno, year, basque,
                                progfunc = "ridge", scm = T, t_int = 1975)

  expect_equal(syn$weights, syn_single$weights)
})


================================================
FILE: tests/testthat/test_format.R
================================================
context("Test data formatting")

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
    filter(regionno != 1)
                            
test_that("format_data creates matrices with the right dimensions", {
    
    dat <- format_data(quo(gdpcap), quo(trt), quo(regionno), quo(year),1975, basque)

    test_dim <- function(obj, d) {
        expect_equivalent(dim(obj), d)
        }

    test_dim(dat$X, c(17, 20))
    expect_equivalent(length(dat$trt), 17)
    test_dim(dat$y, c(17, 23))
}
)


test_that("format_synth creates matrices with the right dimensions", {
    
    dat <- format_data(quo(gdpcap), quo(trt), quo(regionno), quo(year),1975, basque)
    syn_dat <- format_synth(dat$X, dat$trt, dat$y)
    test_dim <- function(obj, d) {
        expect_equivalent(dim(obj), d)
        }

    test_dim(syn_dat$Z0, c(20, 16))
    test_dim(syn_dat$Z1, c(20, 1))

    test_dim(syn_dat$Y0plot, c(43, 16))
    test_dim(syn_dat$Y1plot, c(43, 1))

    expect_equivalent(syn_dat$Z1, syn_dat$X1)
    expect_equivalent(syn_dat$Z0, syn_dat$X0)
}
)


test_that("multisynth throws errors when there aren't enough pre-treatment times",
  {
    basque2 <- basque %>%
      mutate(trt = case_when(
        regionno == 16 ~ 1,
        year >= 1975 & regionno == 17 ~ 1,
        TRUE ~ 0)
        ) %>%
      filter(regionno != 1)

  # error from always treated unit
  expect_warning(
    expect_error(
      multisynth(gdpcap ~ trt, regionno, year, basque2)
    )
  )

  basque2 <- basque %>%
      mutate(trt = case_when(
        regionno == 16 & year >= 1956 ~ 1,
        year >= 1975 & regionno == 17 ~ 1,
        TRUE ~ 0)
        ) %>%
      filter(regionno != 1)

  # error from one pre-treatment outcome and fixedeff = T
  expect_warning(
    expect_error(multisynth(gdpcap ~ trt, regionno, year, basque2))
  )

  # no error from one pre-treatment outcome and fixedeff = F, just warning
  expect_warning(multisynth(gdpcap ~ trt, regionno, year, basque2, fixedeff = F))

  })


  test_that("formatting for staggered adoption doesn't care about order of time in data",
  {
    basque2 <- basque %>%
      # slice(sample(1:n())) %>%
      mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                              (regionno == 16) & (year >= 1980) ~ 1,
                                TRUE ~ 0))

      dat <- format_data_stag(quo(gdpcap), quo(trt), quo(regionno), quo(year), basque2)

      # true treatment times
      true_trt <- c(1975, 1980) - min(basque$year)

      expect_equal(true_trt, sort(dat$trt[is.finite(dat$trt)]))

    basque2 <- basque %>%
        slice(sample(1:n())) %>%
        mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                                (regionno == 16) & (year >= 1980) ~ 1,
                                  TRUE ~ 0))

    dat <- format_data_stag(quo(gdpcap), quo(trt), quo(regionno), quo(year), basque2)

    expect_equal(true_trt, sort(dat$trt[is.finite(dat$trt)]))

  })


test_that("augsynth exits with error message if there are no never treated units",
  {
  basque2 <- basque %>%
    # slice(sample(1:n())) %>%
    mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                            (year >= 1997) ~ 1,
                              TRUE ~ 0))
  expect_error(augsynth(gdpcap ~ trt, regionno, year, basque2), "1996")

  })

================================================
FILE: tests/testthat/test_general.R
================================================
context("Generally testing the workflow for augsynth")


library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
    filter(regionno != 1)


test_that("SCM gives the right answer", {

    syn <- augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="None", scm=T, t_int=1975)
    ## average att estimate is as expected
    expect_equal(-.3686, mean(summary(syn, inf = F)$att$Estimate), tolerance=1e-4)


    ## level of balance is as expected
    expect_equal(.377, syn$l2_imbalance, tolerance=1e-3)

}
)

test_that("SCM finds the correct t_int and gives the right answer", {

    syn1 <- augsynth(gdpcap ~ trt, regionno, year, basque,
                     progfunc="None", scm=T)
    syn2 <- augsynth(gdpcap ~ trt, regionno, year, basque,
                     progfunc = "None", scm = T, t_int = 1975)
    ## average att estimate is as expected
    expect_equal(mean(summary(syn1, inf = F)$att$Estimate), 
                 mean(summary(syn2, inf = F)$att$Estimate), tolerance=1e-4)
    
    ## level of balance is as expected
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-3)
    
}
)


test_that("Ridge ASCM gives the right answer", {

    asyn <- augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="Ridge",
                     scm=T, lambda=8)

    ## average att estimate is as expected
    expect_equal(-.3696, mean(summary(asyn, inf = F)$att$Estimate), tolerance=1e-3)

    ## level of balance is as expected
    expect_equal(.373, asyn$l2_imbalance, tolerance=1e-3)

}
)


test_that("SCM after residualizing covariates gives the right answer", {

  covsyn_resid <- augsynth(gdpcap ~ trt | invest + popdens,
                      regionno, year, basque,
                      progfunc = "None", scm = T,
                      residualize = T)

  ## average att estimate is as expected
  expect_equal(-.1443,
                mean(summary(covsyn_resid, inf = F)$att$Estimate),
                tolerance = 1e-3)

  ## level of balance is as expected
  expect_equal(.3720, covsyn_resid$l2_imbalance, tolerance=1e-3)

  # perfect auxiliary covariate balance
  expect_equal(0, covsyn_resid$covariate_l2_imbalance, tolerance=1e-3)

}
)

test_that("Ridge ASCM with covariates jointly gives the right answer", {

    covsyn_noresid <- augsynth(gdpcap ~ trt | invest + popdens,
                       regionno, year, basque,
                       progfunc = "None", scm = T,
                       residualize = F)

    ## average att estimate is as expected
    expect_equal(-.3345,
                 mean(summary(covsyn_noresid, inf = F)$att$Estimate),
                 tolerance = 1e-3)

    ## level of balance is as expected
    expect_equal(0.659, covsyn_noresid$l2_imbalance, tolerance=1e-3)

    # covariate balance is as expected
    expect_equal(0.884, covsyn_noresid$covariate_l2_imbalance, tolerance=1e-3)


}
)


test_that("Ridge ASCM after residualizing covariates gives the right answer", {

    covascm_resid <- augsynth(gdpcap ~ trt | invest + popdens,
                       regionno, year, basque,
                       progfunc = "Ridge", scm = T,
                       lambda = 1,
                       residualize = T)

    ## average att estimate is as expected
    expect_equal(-.123,
                 mean(summary(covascm_resid, inf = F)$att$Estimate),
                 tolerance = 1e-3)

    ## level of balance is as expected
    expect_equal(.347, covascm_resid$l2_imbalance, tolerance=1e-3)

    # perfect auxiliary covariate balance
    expect_equal(0, covascm_resid$covariate_l2_imbalance, tolerance=1e-3)

}
)

test_that("Ridge ASCM with covariates jointly gives the right answer", {

    covascm_noresid <- augsynth(gdpcap ~ trt | invest + popdens,
                       regionno, year, basque,
                       progfunc = "Ridge", scm = T,
                       lambda = 1,
                       residualize = F)

    ## average att estimate is as expected
    expect_equal(-.267,
                 mean(summary(covascm_noresid, inf = F)$att$Estimate),
                 tolerance = 1e-3)

    ## level of balance is as expected
    expect_equal(0.419, covascm_noresid$l2_imbalance, tolerance=1e-3)

    # covariate balance is as expected
    expect_equal(0.084, covascm_noresid$covariate_l2_imbalance, tolerance=1e-3)

}
)


test_that("Warning given when inputting an unused argument", {

    expect_warning(
      augsynth(gdpcap ~ trt| invest + popdens, regionno, year, basque, 
               progfunc="Ridge", scm=T, lambda=8, t_int = 1975, 
               bad_param = "Unused input parameter"),
    )
})


================================================
FILE: tests/testthat/test_lambda.R
================================================
context("Testing lambda tuning if ridge is true.")


library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
  filter(regionno != 1)


test_that("Lambda sequence is generated correctly", {
  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="Ridge", scm=T)
  lambdas <- syn$lambdas
  expect_equivalent(lambdas[length(lambdas)] / lambdas[1], 1e-8)
  expect_equivalent(lambdas[2] / lambdas[1], lambdas[3] / lambdas[2])
})

test_that("Smallest lambda is chosen", {
  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, 
                  progfunc="Ridge", scm=T, 
                  min_1se = F)
  expect_equivalent(syn$lambda, syn$lambdas[which.min(syn$lambda_errors)])
})


test_that("Largest lambda within 1 SE of minimum is chosen", {
  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, 
                  progfunc="Ridge", scm=T, 
                  min_1se = T)
  min_idx <- which.min(syn$lambda_errors)
  min_1se <- max(syn$lambdas[syn$lambda_errors <= 
                              syn$lambda_errors[min_idx] + 
                              syn$lambda_errors_se[min_idx]])
  expect_equivalent(syn$lambda, min_1se)
})

test_that("Max lambda is in list of returned lambas (optional parameters are going through)", {
  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, 
                  progfunc="Ridge", scm=T, 
                  lambda_max = 100)
  lambdas <- syn$lambdas
  expect_equivalent(lambdas[1], 100)
})

================================================
FILE: tests/testthat/test_load_data.R
================================================
context("Testing that we can load packaged data")

test_that("kansas data loads", {
    expect_error(data(kansas), NA)
})

================================================
FILE: tests/testthat/test_multiple_outcomes.R
================================================
context("Generally testing the workflow for synth with multiple outcomes")

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1),
                            gdpcap_sq = gdpcap ^ 2) %>%
    filter(regionno != 1)

test_that("augsynth and augsynth_multiout are the same without augmentation", {

    syn1 <- augsynth_multiout(gdpcap + gdpcap_sq  ~ trt, regionno, year, 1975, basque,
                    progfunc="None", scm=T)
    syn2 <- augsynth(gdpcap + gdpcap_sq  ~ trt, regionno, year, basque,
                    progfunc="None", scm=T)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), c(predict(syn2, att = F)), tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)        
})

test_that("augsynth and augsynth_multiout are the same with ridge augmentation", {

    syn1 <- augsynth_multiout(gdpcap + gdpcap_sq  ~ trt, regionno, year, 1975, basque,
                    progfunc="Ridge", scm=T, lambda = 10)
    syn2 <- augsynth(gdpcap + gdpcap_sq  ~ trt, regionno, year, basque,
                    progfunc="Ridge", scm=T, lambda = 10)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), c(predict(syn2, att = F)), tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)        
})

test_that("augsynth and augsynth_multiout are the same with fixed effects augmentation", {

    syn1 <- augsynth_multiout(gdpcap + gdpcap_sq  ~ trt, regionno, year, 1975, basque,
                    progfunc="None", scm=T, fixedeff = T)
    syn2 <- augsynth(gdpcap + gdpcap_sq  ~ trt, regionno, year, basque,
                    progfunc="None", scm=T, fixedeff = T)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), c(predict(syn2, att = F)), tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)
})

test_that("single_augsynth and augsynth_multiout are the same for one outcome", {
    syn1 <- augsynth_multiout(gdpcap  ~ trt, regionno, year, 1975, basque,
                    progfunc="None", scm=T, combine_method = "concat")
    syn2 <- augsynth(gdpcap  ~ trt, regionno, year, basque,
                    progfunc="None", scm=T)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), unname(predict(syn2, att = F)), tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)            
})

test_that("single_augsynth and augsynth_multiout are the same for one outcome with ridge augmentation",{
    syn1 <- augsynth_multiout(gdpcap  ~ trt, regionno, year, 1975, basque,
                    progfunc="Ridge", scm=T, combine_method = "concat")
    syn2 <- augsynth(gdpcap  ~ trt, regionno, year, basque,
                    progfunc="Ridge", scm=T)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), unname(predict(syn2, att = F)),
                 tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)            
})

test_that("single_augsynth and augsynth_multiout are the same for one outcome with fixed effect augmentation", {
    syn1 <- augsynth_multiout(gdpcap  ~ trt, regionno, year, 1975, basque,
                    progfunc="None", scm=T, fixedeff = T, combine_method = "concat")
    syn2 <- augsynth(gdpcap  ~ trt, regionno, year, basque,
                    progfunc="None", scm=T, fixedeff = T)

    # weights are the same
    expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)

    # estimates are the same
    expect_equal(c(predict(syn1, att=F)), unname(predict(syn2, att = F)), tolerance=5e-5)


    ## level of balance is same
    expect_equal(syn1$l2_imbalance, syn2$l2_imbalance, tolerance=1e-5)            
})


test_that("Averaging outcomes with augsynth_multiout gives correct results without fixed effects", {

  sds <- basque %>% filter(trt == 0, year < 1975) %>%
          summarise(across(c(gdpcap, gdpcap_sq), sd)) %>%
          rename(gdpcap_sd = gdpcap, gdpcap_sq_sd = gdpcap_sq)

  basque %>%
    bind_cols(sds) %>%
    mutate(avg = gdpcap / gdpcap_sd + gdpcap_sq / gdpcap_sq_sd,
           avg2 = gdpcap + gdpcap_sq) -> bas_avg
  

  syn1 <- augsynth_multiout(gdpcap + gdpcap_sq ~ trt, regionno, year, 1975, basque,
                            progfunc="None", scm=T, fixedeff = F, combine_method = "avg")
  syn2 <- augsynth(avg  ~ trt, regionno, year, bas_avg,
                    progfunc="None", scm=T, fixedeff = F)

  # weights are the same
  expect_equal(c(syn1$weights), c(syn2$weights), tolerance=3e-4)
})

test_that("Averaging outcomes with augsynth_multiout gives correct results with fixed effects", {

  sds <- basque %>% filter(trt == 0, year < 1975) %>%
          summarise(across(c(gdpcap, gdpcap_sq), sd)) %>%
          rename(gdpcap_sd = gdpcap, gdpcap_sq_sd = gdpcap_sq)

  basque %>%
    bind_cols(sds) %>%
    mutate(avg = gdpcap / gdpcap_sd + gdpcap_sq / gdpcap_sq_sd,
           avg2 = gdpcap + gdpcap_sq) -> bas_avg
  

  syn1 <- augsynth_multiout(gdpcap + gdpcap_sq ~ trt, regionno, year, 1975, basque,
                            progfunc="None", scm=T, fixedeff = T,
                            combine_method = "avg")
  syn2 <- augsynth(avg  ~ trt, regionno, year, bas_avg,
                    progfunc="None", scm=T, fixedeff = T)

  # weights are the same
  expect_equal(c(syn1$weights), c(syn2$weights), tolerance=1e-3)
})


test_that("Concatenating outcomes with augsynth_multiout gives correct results without fixed effects", {


  sds <- basque %>% filter(trt == 0, year < 1975) %>%
          summarise(across(c(gdpcap, gdpcap_sq), sd)) %>%
          rename(gdpcap_sd = gdpcap, gdpcap_sq_sd = gdpcap_sq)

  basque %>%
    bind_cols(sds) %>%
    mutate(gdpcap = gdpcap / gdpcap_sd, gdpcap_sq = gdpcap_sq / gdpcap_sq_sd) %>%
    select(gdpcap, gdpcap_sq, trt, year, regionno) %>%
    pivot_longer(-c(regionno, year, trt)) %>%
    mutate(year = ifelse(name == "gdpcap", year, year - 0.5)) -> bas_cat  

  syn1 <- augsynth_multiout(gdpcap + gdpcap_sq ~ trt, regionno, year, 1975, basque,
                            progfunc="None", scm=T, fixedeff = F,
                            combine_method = "concat")
  syn2 <- augsynth(value  ~ trt, regionno, year, bas_cat,
                    progfunc="None", scm=T, fixedeff = F)

  # weights are the same
  expect_equal(c(syn1$weights), c(syn2$weights), tolerance=5e-4)
})


================================================
FILE: tests/testthat/test_multisynth.R
================================================
context("Generally testing the workflow for multisynth")

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
    filter(regionno != 1)


test_that("augsynth and multisynth give the same answer for a single treated unit and no augmentation", {

    syn <- single_augsynth(gdpcap ~ trt, regionno, year, 1975, basque,
                    progfunc="None", scm=T, fixedeff = F)
    msyn <- multisynth(gdpcap ~ trt, regionno, year, basque, nu = 0,
                       fixedeff = F, scm=T, eps_rel=1e-5, eps_abs=1e-5)
    
    # weights are the same-ish
    expect_equal(c(syn$weights), c(msyn$weights[-16]), tolerance=3e-4)

    # estimates are the same-ish
    pred_msyn <- predict(msyn, att=F)[,1]
    pred_msyn <- pred_msyn[-length(pred_msyn)]
    expect_equal(unname(predict(syn, att=F)), pred_msyn, tolerance=5e-5)


    ## level of balance is same-ish expected
    expect_equal(syn$l2_imbalance, msyn$avg_l2, tolerance=1e-5)

}
)


test_that("Pooling doesn't matter for a single treated unit", {

    nopool <- multisynth(gdpcap ~ trt, regionno, year, basque, nu = 0,
                         scm=T, eps_rel=1e-5, eps_abs=1e-5)
    allpool <- multisynth(gdpcap ~ trt, regionno, year, basque, nu = 1,
                          scm=T, eps_rel=1e-5, eps_abs=1e-5)

    # weights are the same
    expect_equal(nopool$weights, allpool$weights)

    # estimates are the same
    expect_equal(predict(nopool), predict(allpool))


    ## level of balance is same-ish expected
    expect_equal(allpool$ind_l2, nopool$ind_l2)

}
)


test_that("Separate synth is the same as fitting separate synths", {


    basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                                !regionno %in% c(16, 17) ~ 0,
                                                regionno %in% c(16, 17) ~ 1)) %>%
        filter(regionno != 1)


    basque2  %>% filter(regionno != 16) %>% 
        single_augsynth(gdpcap ~ trt, regionno, year, 1975, .,
                    progfunc="None", scm=T) -> scm17
    basque2  %>% filter(regionno != 17) %>% 
        single_augsynth(gdpcap ~ trt, regionno, year, 1975, .,
                    progfunc="None", scm=T) -> scm16
    
    msyn <- multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                       scm=T, eps_rel=1e-5, eps_abs=1e-5, fixedeff = F)
    
    # weights are the same-ish
    sscm_weights <- unname(c(scm17$weights))
    mscm_weights <- unname(c(msyn$weights[-c(15, 16), 2]))
    expect_equal(sscm_weights, mscm_weights, tolerance=3e-2)
    expect_equal(rownames(scm17$weights), rownames(as.matrix(msyn$weights[-c(15, 16), 2])))
    # expect_equal(c(scm16$weights), c(msyn$weights[-c(15, 16), 1]), tolerance=3e-2)
    
    # estimates are the same-ish
    pred_msyn <- predict(msyn, att=F)
    pred_msyn <- pred_msyn[-nrow(pred_msyn), ]
    expect_equal(unname(predict(scm17, att=F)), pred_msyn[, 3], tolerance=5e-3)
    expect_equal(unname(predict(scm16, att=F)), pred_msyn[, 2], tolerance=5e-3)
}
)

test_that("Limiting number of lags works", {


    basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                                !regionno %in% c(16, 17) ~ 0,
                                                regionno %in% c(16, 17) ~ 1)) %>%
        filter(regionno != 1)

    expect_error(
      multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                 scm=T, eps_rel=1e-5, eps_abs=1e-5, n_lags =3),
      NA
    )
}
)

test_that("L2 imbalance computed correctly", {

  basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              !regionno %in% c(16, 17) ~ 0,
                                              regionno %in% c(16, 17) ~ 1)) %>%
      filter(regionno != 1)

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque2,
                scm=T, eps_rel=1e-5, eps_abs=1e-5)

  glbl <- sqrt(mean(msyn$imbalance[,1]^2))
  ind <- sqrt(mean(
    apply(msyn$imbalance[, -1], 2,
          function(x) sum(x ^ 2) / sum(x != 0))))
  avg_ind <- mean(apply(msyn$imbalance[,-1, drop = F], 2,
              function(x) sqrt(sum(x ^ 2))))
  expect_equal(glbl, msyn$global_l2)
  expect_equal(avg_ind, msyn$avg_l2)
  expect_equal(ind, msyn$ind_l2)
})

test_that("V matrix is equivalent to hard thresholding", {


  basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                              !regionno %in% c(16, 17) ~ 0,
                                              regionno %in% c(16, 17) ~ 1)) %>%
      filter(regionno != 1)

  V <- c(numeric(10), rep(1,5))
  msyn1 <- multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                scm=T, eps_rel=1e-8, eps_abs=1e-8, n_lags = 15, V = V)

  msyn2 <- multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                scm=T, eps_rel=1e-8, eps_abs=1e-8, n_lags = 5)

  expect_equal(msyn1$weights, msyn2$weights, tolerance = 1e-5)
  expect_equal(msyn1$global_l2, msyn2$global_l2, tolerance = 1e-5)
  expect_equal(msyn1$avg_l2, msyn2$avg_l2, tolerance = 1e-5)
}
)

test_that("V matrix is the same for single and multi synth", {

  V <- exp(seq(log(1e-3), log(1), length.out = 20))

  syn <- augsynth(gdpcap ~ trt, regionno, year, basque, progfunc = "none",
                scm=T, V = V)

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque,
                scm=T, eps_rel=1e-8, eps_abs=1e-8, V = V,
                fixed = F, nu = 0)

  expect_equal(as.numeric(syn$weights), as.numeric(msyn$weights[-16, ]), tolerance = 1e-3)
}
)


test_that("multisynth doesn't depend on unit order", {

    basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                                !regionno %in% c(16, 17) ~ 0,
                                                regionno %in% c(16, 17) ~ 1)) %>%
        filter(regionno != 1)

    msyn <- multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                       fixedeff = F, scm=T, eps_rel=1e-5, eps_abs=1e-5)

    msyn2 <- multisynth(gdpcap ~ trt, regionno, year,
                       basque2 %>% arrange(desc(regionno)), nu = 0,
                       fixedeff = F, scm=T, eps_rel=1e-5, eps_abs=1e-5)

    
    # weights are the same
    expect_equal(c(msyn$weights), c(msyn2$weights))

    # estimates are the same
    expect_equal(predict(msyn), predict(msyn2))

}
)


test_that("multisynth doesn't depend on time order", {

    basque2 <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                                !regionno %in% c(16, 17) ~ 0,
                                                regionno %in% c(16, 17) ~ 1)) %>%
        filter(regionno != 1)

    msyn <- multisynth(gdpcap ~ trt, regionno, year, basque2, nu = 0,
                       fixedeff = F, scm=T, eps_rel=1e-5, eps_abs=1e-5)

    msyn2 <- multisynth(gdpcap ~ trt, regionno, year,
                       basque2 %>% arrange(desc(year)), nu = 0,
                       fixedeff = F, scm=T, eps_rel=1e-5, eps_abs=1e-5)

    
    # weights are the same
    expect_equal(c(msyn$weights), c(msyn2$weights))

    # estimates are the same
    expect_equal(predict(msyn), predict(msyn2))

}
)


================================================
FILE: tests/testthat/test_multisynth_covariates.R
================================================
context("Testing multisynth with covariates")
set.seed(1011)

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                                              (regionno == 16) & (year >= 1980) ~ 1,
                                              TRUE ~ 0)) %>%
      filter(regionno != 1)


regions <- basque %>% distinct(regionno) %>% pull(regionno)

test_that("Getting eligible donor units by exact matching works", {

  # binary variable to split on
  fake_bin <- sample(c(0, 1), length(regions), replace = T)
  basque %>%
    inner_join(
      data.frame(regionno = regions, Z = fake_bin) %>%
        mutate(Z = case_when(regionno == 17 ~ 0,
                             regionno == 16 ~ 1,
                             TRUE ~ Z)
              ),
               by = "regionno") -> basque2

  msyn <- multisynth(gdpcap ~ trt | 0 | 0| Z, regionno, year, basque2, nu = 0,
                     scm = T)

  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_bin == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 2]), 1, tolerance = 1e-3)


  # again with fixed effect
  msyn <- multisynth(gdpcap ~ trt | 0 | 0 | Z, regionno, year, basque2, nu = 0,
                     scm = T, fixedeff = T)
  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_bin == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 2]), 1, tolerance = 1e-3)
})

test_that("Getting eligible donor units by exact matching works with factors", {

  # binary variable to split on
  fake_fac <- sample(c(0, 1, 3), length(regions), replace = T)
  basque %>%
    inner_join(
      data.frame(regionno = regions, Z = fake_fac) %>%
        mutate(
          Z = case_when(regionno == 17 ~ 0,
                             regionno == 16 ~ 1,
                             TRUE ~ Z),
               Z = as.factor(Z)
              ),
               by = "regionno") -> basque2

  msyn <- multisynth(gdpcap ~ trt | 0 | 0 | Z, regionno, year, basque2, nu = 0,
                     scm = T)

  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_fac == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 3, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 0, 2]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 3, 2]), 0, tolerance = 1e-4)


  # again with fixed effect
  msyn <- multisynth(gdpcap ~ trt | 0 | 0 |Z, regionno, year, basque2, nu = 0,
                     scm = T, fixedeff = T, how_match = "exact")
  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_fac == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 3, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 0, 2]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_fac == 3, 2]), 0, tolerance = 1e-4)
})

test_that("K-NN finds the right number of neighbors", {

  # variables to match on
  Z <- matrix(rnorm(length(regions) * 3), ncol = 3)
  basque %>%
    inner_join(
      data.frame(regionno = regions,
                 Z1 = Z[, 1], Z2 = Z[, 2], Z3 = Z[, 3]),
      by = "regionno") -> basque2
  
  dat <- format_data_stag(quo(gdpcap), quo(trt), quo(regionno),
                          quo(year), basque2)
  k <- 3
  donors <- get_eligible_donors(dat$trt, F, 100)
  knn_donors <- get_knn_donors(dat$trt, Z, donors, k)
  expect_true(all(sapply(knn_donors, sum) == k))

  k <- 20
  expect_warning(get_knn_donors(dat$trt, Z, donors, k))
})

test_that("Getting eligible donor units by knn matching works", {

  # variables to match on
  Z <- matrix(rnorm(length(regions) * 3), ncol = 3)
  basque %>%
    inner_join(
      data.frame(regionno = regions,
                 Z1 = Z[, 1], Z2 = Z[, 2], Z3 = Z[, 3]),
      by = "regionno") -> basque2

  # error if no k is supplied
  expect_error(multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3, regionno, 
                          year, basque2,
                          scm = T, how_match = "knn"),
              "Number of neighbors for knn not selected, please choose k.")

  k <- 5
  msyn <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3, regionno, year, 
                     basque2, scm = T, how_match = "knn", k = k)

  # check that at most k units recieve non-0 weight
  expect_lte(sum(msyn$weights[, 1] != 0), k)
  expect_lte(sum(msyn$weights[, 2] != 0), k)

  
  # again with fixed effect
  msyn <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3, regionno, year,
                       basque2, scm = T, fixedeff = T, how_match = "knn", k = k)
  # check that all but k units recieve exactly 0 weight
  expect_lte(sum(msyn$weights[, 1] != 0), k)
  expect_lte(sum(msyn$weights[, 2] != 0), k) 

  # without synth weights, weights are uniform
  k <- 2
  unimatch <- multisynth(gdpcap ~ trt| 0 | Z1 + Z2 + Z3, regionno, year,
                     basque2, scm = T, how_match = "knn", k = k, lambda = 1e10)

  expect_equal(unimatch$weights[unimatch$weights != 0 ], rep(1 / k, 2 * k))

  # matching with more neighbors is worse
  unimatch2 <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3, regionno, year, basque2,
                     scm = T, how_match = "knn", k = 2.5 * k, lambda = 1e10)

  trtZ <- Z[regions %in% c(16, 17),]
  imbal1 <- sqrt(sum(sapply(1:2, 
                function(i) sum(unimatch$weights[,i] * (trtZ[i,] - Z) ^ 2 ))))
  imbal2 <- sqrt(sum(sapply(1:2, 
                function(i) sum(unimatch2$weights[,i] * (trtZ[i,] - Z) ^ 2 ))))

  expect_lt(imbal1, imbal2)

})


test_that("Getting eligible donor units by exact and knn matching works", {

  # binary variable to split on
  fake_bin <- sample(c(0, 1), length(regions), replace = T)

  # variables to match on
  Z <- matrix(rnorm(length(regions) * 3), ncol = 3)
  basque %>%
    inner_join(
      data.frame(regionno = regions,
                 Z1 = Z[, 1], Z2 = Z[, 2], Z3 = Z[, 3],
                 Z_bin = fake_bin) %>%
        mutate(Z_bin = case_when(regionno == 17 ~ 0,
                             regionno == 16 ~ 1,
                             TRUE ~ Z_bin)),
      by = "regionno") -> basque2

  # error if no k is supplied
  expect_error(multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3 | Z_bin, regionno, 
                          year, basque2,
                          scm = T, how_match = "knn"),
              "Number of neighbors for knn not selected, please choose k.")

  k <- 3
  msyn <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3 | Z_bin, regionno, year, 
                     basque2, scm = T, how_match = "knn", k = k)
  
  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_bin == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 2]), 1, tolerance = 1e-4)
  
  # check that at most k units recieve non-0 weight
  expect_lte(sum(msyn$weights[, 1] != 0), k)
  expect_lte(sum(msyn$weights[, 2] != 0), k)
  
  # again with fixed effect
    msyn <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3 | Z_bin, regionno, year,
                       basque2, scm = T, fixedeff = T, how_match = "knn", k = k)
  # check that at most k units recieve non-0 weight
  expect_lte(sum(msyn$weights[, 1] != 0), k)
  expect_lte(sum(msyn$weights[, 2] != 0), k)

  # check that there is actually no weight on donors with different Z
  expect_equal(sum(msyn$weights[fake_bin == 1, 1]), 1, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 1]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 1, 2]), 0, tolerance = 1e-4)
  expect_equal(sum(msyn$weights[fake_bin == 0, 2]), 1, tolerance = 1e-4) 

  k <- 3
  # without synth weights, weights are uniform
  unimatch <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3 | Z_bin, regionno,
                     year, basque2, scm = T, how_match = "knn", k = k, lambda = 1e10)

  expect_equal(unimatch$weights[unimatch$weights != 0 ], rep(1 / k, 2 * k))

  # matching without exact gives better matches
  unimatch2 <- multisynth(gdpcap ~ trt | 0 | Z1 + Z2 + Z3, regionno, year,
                    basque2, scm = T, how_match = "knn", k = k, lambda = 1e10)

  trtZ <- Z[regions %in% c(16, 17),]
  imbal1 <- sqrt(sum(sapply(1:2, 
                function(i) sum(unimatch$weights[,i] * (trtZ[i,] - Z) ^ 2 ))))
  imbal2 <- sqrt(sum(sapply(1:2, 
                function(i) sum(unimatch2$weights[,i] * (trtZ[i,] - Z) ^ 2 ))))

  expect_lt(imbal2, imbal1)
})


test_that("An error is thrown if trying to match with time cohorts or the formula is wrong", {

  # binary variable to split on
  fake_bin <- sample(c(0, 1), length(regions), replace = T)

  # variables to match on
  Z <- matrix(rnorm(length(regions) * 3), ncol = 3)
  basque %>%
    inner_join(
      data.frame(regionno = regions,
                 Z1 = Z[, 1], Z2 = Z[, 2], Z3 = Z[, 3],
                 Z_bin = fake_bin) %>%
        mutate(Z_bin = case_when(regionno == 17 ~ 0,
                             regionno == 16 ~ 1,
                             TRUE ~ Z_bin)),
      by = "regionno") %>% 
    mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                            (regionno == 16) & (year >= 1975) ~ 1,
                                              TRUE ~ 0)) %>%
      filter(regionno != 1)-> basque2

  expect_error(multisynth(gdpcap ~ trt | Z1 + Z2, regionno, year, basque2,
                     time_cohort = T), NA)

  expect_error(multisynth(gdpcap ~ trt | Z1 + Z2 | 0 | Z_bin,
                          regionno, year, basque2, time_cohort = T))
})


test_that("multisynth with covariates doesn't depend on unit or time order ", {

  data <- read.csv("https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/WGWMAV/3UHTLP", sep="\t")
  data %>%
    filter(!State %in% c("DC", "WI"),
           year >= 1959, year <= 1997) %>%
    mutate(YearCBrequired = ifelse(is.na(YearCBrequired), 
                                   Inf, YearCBrequired),
           cbr = 1 * (year >= YearCBrequired)) -> analysis_df

  data %>%
  select(State, year, agr, pnwht, purban, perinc, studteachratio) %>%
  group_by(State) %>%
  summarise(perinc_1959 = perinc[year == 1959],
            studteachratio_1959 = studteachratio[year == 1959]) %>% 
  # filter to lower 48 where we have data
  filter(!State %in% c("AK", "HI"))  -> cov_data

  analysis_df %>%
    inner_join(cov_data, by = "State") -> analysis_df_covs

  msyn <- multisynth(lnppexpend ~ cbr | perinc_1959 + studteachratio_1959,
                            State, year, analysis_df_covs)

  msyn_rev_unit <- multisynth(lnppexpend ~ cbr | perinc_1959 + studteachratio_1959,
                            State, year,
                            analysis_df_covs %>% arrange(desc(State)))

  msyn_rev_time <- multisynth(lnppexpend ~ cbr | perinc_1959 + studteachratio_1959,
                            State, year,
                            analysis_df_covs %>% arrange(desc(year)))

  expect_equal(predict(msyn), predict(msyn_rev_time))
  expect_equal(predict(msyn), predict(msyn_rev_unit))

})

================================================
FILE: tests/testthat/test_outcome_models.R
================================================
context("Testing that augmenting synth with different models loads and runs")


library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
    filter(regionno != 1)


test_that("Augmenting synth with glmnet runs", {

    if(!requireNamespace("glmnet", quietly = TRUE)) {
        ## should fail because glmnet isn't installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="EN", scm=T),
                     "you must install the glmnet package")

        ## install glmnet
        install.packages("glmnet", repos = "http://cran.us.r-project.org")
    }

    ## should run because glmnet is installed
    expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="EN", scm=T),
                 NA)    
}
)


test_that("Augmenting synth with random forest runs", {

    if(!requireNamespace("randomForest", quietly = TRUE)) {
        ## should fail because randomForest isn't installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="RF", scm=T),
                     "you must install the randomForest package")

        ## install randomForest
        install.packages("randomForest", repos = "http://cran.us.r-project.org")
    }

    ## should run because randomForest is installed
    expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="RF", scm=T),
                 NA)    
}
)


test_that("Augmenting synth with gsynth runs and produces the correct result", {

    if(!requireNamespace("gsynth", quietly = TRUE)) {
        ## should fail because gsynth isn't installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, 
                              progfunc="GSYN", scm=T),
                     "you must install the gsynth package")

        ## install gsynth
        install.packages("gsynth", repos = "http://cran.us.r-project.org")
    }

    ## should run because gsynth is installed
    expect_error(
      augsynth(gdpcap ~ trt, regionno, year, basque, 
                                progfunc = "GSYN", scm = T, CV = 0, r = 4),
      NA)
    asyn_gsyn <- augsynth(gdpcap ~ trt, regionno, year, basque,
                          progfunc = "GSYN", scm = F, CV = 0, r = 4)
    expect_equal(summary(asyn_gsyn, inf = F)$average_att$Estimate, 
                 -0.1444637, tolerance=1e-4) 
}
)


test_that("Augmenting synth with MCPanel runs", {

    if(!requireNamespace("MCPanel", quietly = TRUE)) {
        ## should fail because MCPanel isn't installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="MCP", scm=T),
                     "you must install the MCPanel package")
    } else {
        ## should run because MCPanel is installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="MCP", scm=T),
                     NA)    
    }

    
}
)


test_that("Augmenting synth with CausalImpact runs", {

    if(!requireNamespace("CausalImpact", quietly = TRUE)) {
        ## should fail because CausalImpact isn't installed
        expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="CausalImpact", scm=T),
                     "you must install the CausalImpact package")

        ## install CausalImpact
        install.packages("CausalImpact", repos = "http://cran.us.r-project.org")
    }

    ## should run because CausalImpact is installed
    expect_error(augsynth(gdpcap ~ trt, regionno, year, basque, progfunc="CausalImpact", scm=T),
                 NA)    
}
)


================================================
FILE: tests/testthat/test_time_cohort.R
================================================
context("Test time cohort vs unit level analysis")

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when(year < 1975 ~ 0,
                                            regionno != 17 ~0,
                                            regionno == 17 ~ 1)) %>%
    filter(regionno != 1)


test_that("multisynth at the unit level and time cohort level give the same answer for a single treated unit and no augmentation", {

    msyn_unit <- multisynth(gdpcap ~ trt, regionno, year, basque, nu = 0,
                            time_cohort = F, scm = T,
                            eps_rel = 1e-5, eps_abs = 1e-5)
    msyn_time <- multisynth(gdpcap ~ trt, regionno, year, basque, nu = 0,
                            time_cohort = T, scm = T,
                            eps_rel = 1e-5, eps_abs = 1e-5)
    # weights are the same-ish
    expect_equal(c(msyn_unit$weights), c(msyn_time$weights), tolerance=3e-2)

    # estimates are the same-ish
    expect_equal(c(predict(msyn_unit, att=F)),
                 c(predict(msyn_time, att=F)),
                 tolerance=5e-3)


    ## level of balance is same-ish expected
    expect_equal(msyn_unit$ind_l2, msyn_time$ind_l2, tolerance=1e-3)

}
)


test_that("multisynth at the time cohort level runs", {

    expect_error(msyn_time <- multisynth(gdpcap ~ trt, regionno, year, basque,
                            time_cohort = T, scm = T),
                 NA)
}
)

================================================
FILE: tests/testthat/test_unbalanced_multisynth.R
================================================
context("Test multisynth for unbalanced panels")

set.seed(1011)

library(Synth)
data(basque)
basque <- basque %>% mutate(trt = case_when((regionno == 17) & (year >= 1975) ~ 1,
                                              (regionno == 16) & (year >= 1980) ~ 1,
                                              TRUE ~ 0)) %>%
      filter(regionno != 1)
regions <- basque %>% distinct(regionno) %>% pull(regionno)


test_that("Data formatting creates NAs correctly", {

  # drop a time period for unit 17
  basque %>%
    filter(regionno != 17 | year != 1970) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt),
                                  quo(regionno), quo(year), basque_mis)

  expect_true(is.na(dat_format$X[regions == 17, "1970"]))
})


test_that("Non-NA donors are chosen correctly with missing pre-treatment", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17, 18) | year != 1970) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt),
                                  quo(regionno), quo(year), basque_mis)
  n_lags <- ncol(dat_format$X)
  n_leads <- ncol(dat_format$y)
  donors <- get_nona_donors(dat_format$X, dat_format$y, dat_format$trt,
                            n_lags, n_leads, F)

  expect_true(!all(donors[[1]][regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[1]][!regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[2]]))
})

test_that("Non-NA donors are chosen correctly with missing post-treatment", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17, 18) | !year %in% c(1990)) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt),
                                  quo(regionno), quo(year), basque_mis)
  n_lags <- ncol(dat_format$X)
  n_leads <- ncol(dat_format$y)
  donors <- get_nona_donors(dat_format$X, dat_format$y, dat_format$trt,
                           n_lags, n_leads, F)

  expect_true(!all(donors[[1]][regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[1]][!regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[2]]))
})


test_that("Non-NA donors are chosen correctly with missing pre- and post-treatment", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17, 18) | !year %in% c(1970, 1990)) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt),
                                  quo(regionno), quo(year), basque_mis)
  n_lags <- ncol(dat_format$X)
  n_leads <- ncol(dat_format$y)
  donors <- get_nona_donors(dat_format$X, dat_format$y, dat_format$trt,
                           n_lags, n_leads, F)

  expect_true(!all(donors[[1]][regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[1]][!regions %in% c(15, 17, 18) ]))
  expect_true(all(donors[[2]]))
})


test_that("Non-NA donors are chosen correctly with missing pre- and post-treatment and not considering all leads and lags", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17, 18) | !year %in% c(1970, 1990)) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt),
                                  quo(regionno), quo(year), basque_mis)
  n_lags <- ncol(dat_format$X)
  n_leads <- ncol(dat_format$y)
  donors <- get_nona_donors(dat_format$X, dat_format$y, dat_format$trt,
                           5, 5, F)

  expect_true(all(donors[[1]]))
  expect_true(all(donors[[2]]))
})

test_that("Separate synth with missing treated unit time drops the time", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(17) | year != 1970) -> basque_mis

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis,
                     fixedeff = F,
                     nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  msyn2 <- multisynth(gdpcap ~ trt, regionno, year,
                      basque %>% filter(year != 1970),
                      fixedeff = F,
                      nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)


  expect_equal(msyn$weights[,2], msyn2$weights[,2], tolerance = 1e-6)
})


test_that("Separate synth with missing control unit time drops control unit", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(18) | year != 1970) -> basque_mis
  
  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  msyn2 <- multisynth(gdpcap ~ trt, regionno, year, 
                      basque %>% filter(regionno != 18),
                      nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_equal(msyn$weights[-17,2], msyn2$weights[,2], tolerance = 1e-6)
})


test_that("Separate synth with missing control unit only in post-treatment period drops control unit", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(18) | year < 1980) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt), quo(regionno), quo(year), basque_mis)

  expect_true(nrow(dat_format$X) == nrow(dat_format$y))
  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  msyn2 <- multisynth(gdpcap ~ trt, regionno, year, 
                      basque %>% filter(regionno != 18),
                      nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_equal(msyn$weights[-17,2], msyn2$weights[,2], tolerance = 1e-6)
})

test_that("Separate synth with missing control unit only in pre-treatment period drops control unit", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(18) | year >= 1980) -> basque_mis

  dat_format <- format_data_stag(quo(gdpcap), quo(trt), quo(regionno), quo(year), basque_mis)

  expect_true(nrow(dat_format$X) == nrow(dat_format$y))

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  msyn2 <- multisynth(gdpcap ~ trt, regionno, year, 
                      basque %>% filter(regionno != 18),
                      nu = 0, scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_equal(msyn$weights[-17,2], msyn2$weights[,2], tolerance = 1e-6)
})


test_that("Multisynth with unbalanced panels runs", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17) | year != 1970) -> basque_mis

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_error(summary(msyn), NA)
})


test_that("Multisynth with unbalanced panels runs with missing post-treatment", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15, 17) | year != 1990) -> basque_mis

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_error(summary(msyn), NA)
})


test_that("Multisynth with unbalanced panels runs", {

  # drop a time period for unit 17
  basque %>%
    filter(!regionno %in% c(15) | year != 1985) -> basque_mis

  msyn <- multisynth(gdpcap ~ trt, regionno, year, basque_mis, 
                     scm=T, eps_rel=1e-8, eps_abs=1e-8)

  expect_error(summary(msyn), NA)
})

================================================
FILE: tests/testthat.R
================================================
library(testthat)
library(augsynth)

test_check("augsynth")


================================================
FILE: vignettes/.gitignore
================================================
*.html
*.R


================================================
FILE: vignettes/multi-outcomes-vignette.Rmd
================================================
---
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multi Outcomes AugSynth Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
  )
library(kableExtra)
```

# `augsynth`: Estimating multiple outcome effects

### The data
To demonstrate `augsynth` with multiple outcomes, we'll use data on the impact of personal income tax cuts in Kansas that comes with the `AugSynth` package. Our interest is in estimating the effect of income tax cuts on gross state product (GSP) per capita, wages, establishment counts, and other macroeconomic indicators.

```{r load_data, results="hide", warning=F, message=F}
library(magrittr)
library(dplyr)
library(augsynth)
data(kansas)
```

The `kansas` dataset contains the GSP per capita (the outcome measure) `lngdpcapita` for all 50 states from the first quarter of 1990 to the first quarter of 2016.

To run `augsynth`, we need to include a treatment status column that indicates which region was treated and at what time. The table in `kansas` contains the column `treated` to denote this. In the original study, the second quarter of 2012 was the implementation of the tax cut in Kansas.

```{r treated_units}
kansas %>% select(year, qtr, year_qtr, state, treated, gdp, lngdpcapita) %>% filter(state == "Kansas" & year_qtr >= 2012 & year_qtr < 2013) 
```


### Using the Synthetic Controls Method
We will begin by running the synthetic controls method on GDP per capita, wages, and the number of establishments. To run the vanilla synthetic controls method using `augsynth`, set `progfunc` to `None` and `scm` to `TRUE`.

#### Single outcomes
First, we will examine each outcome variable separately, beginning with log GDP per capita `lngdpcapita`.

```{r lngdpcapita_syn}
syn_lngdpcapita <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas, progfunc="None", scm=T)
summary(syn_lngdpcapita)
```

```{r lngdpcapita_syn_plot}
plot(syn_lngdpcapita)
```

Next we will examine the log total wages per capita. Since this column doesn't already exist in the dataframe, we will create a `lntotalwagescapita` column.
```{r lntotalwagescapita_syn}
kansas$lntotalwagescapita <- log(kansas$totalwagescapita)
syn_lntotalwagescapita <- augsynth(lntotalwagescapita ~ treated, fips, year_qtr, kansas, progfunc="None", scm=T)
summary(syn_lntotalwagescapita)
```

```{r lntotalwagescapita_syn_plot}
plot(syn_lntotalwagescapita)
```

Lastly, we will examine the number of establishments per capita, `estabscapita`.
```{r estabscapita_syn}
syn_estabscapita <- augsynth(estabscapita ~ treated, fips, year_qtr, kansas, progfunc="None", scm=T)
summary(syn_estabscapita)
```

```{r single_estabscapita_syn_plot}
plot(syn_estabscapita)
```

#### Multiple outcomes
Now we will combine our outcome variables into one study. To add more outcome variables, we add them to the LHS of the formula. 
```{r multi_outcome_syn}
syn_multi <- augsynth(lngdpcapita + lntotalwagescapita + estabscapita ~ treated, fips, year_qtr, kansas, progfunc="None", scm=T)
summary(syn_multi)
```

```{r multi_outcome_syn_plot}
plot(syn_multi)
```


### Using the Augmented Synthetic Controls Method
We will now repeat the study using the Augmented Synthetic Controls Method with ridge regression. In ASCM, we first fit the SCM weights, then combine it with a ridge regression, thus setting `progfunc="Ridge", scm=T`.

```{r lngdpcapita_asyn}
asyn_lngdpcapita <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas, progfunc="Ridge", scm=T)
summary(asyn_lngdpcapita)
```

```{r lngdpcapita_asyn_plot}
plot(asyn_lngdpcapita)
```

```{r lntotalwagescapita_asyn}
asyn_lntotalwagescapita <- augsynth(lntotalwagescapita ~ treated, fips, year_qtr, kansas, progfunc="Ridge", scm=T)
summary(asyn_lntotalwagescapita)
```

```{r lntotalwagescapita_asyn_plot}
plot(asyn_lntotalwagescapita)
```

```{r estabscapita_asyn}
asyn_estabscapita <- augsynth(estabscapita ~ treated, fips, year_qtr, kansas, progfunc="Ridge", scm=T)
summary(asyn_estabscapita)
```

```{r single_estabscapita_asyn_plot}
plot(asyn_estabscapita)
```

#### Multiple outcomes
Now we will combine our outcome variables into one study. To add more outcome variables, we add them to the LHS of the formula. 
```{r multi_outcome_asyn}
asyn_multi <- augsynth(lngdpcapita + lntotalwagescapita + estabscapita ~ treated, 
                       fips, year_qtr, kansas, progfunc="Ridge", scm=T, lambda = 1e-4)
summary(asyn_multi)
```

```{r multi_outcome_asyn_plot}
plot(asyn_multi)
```


================================================
FILE: vignettes/multisynth-vignette.Rmd
================================================
---
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{MultiSynth Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
  )
library(kableExtra)
```


# `augsynth`: Estimating treatment effects with staggered adoption

### The data

To show the features of the `multisynth` function we will use data on the effects of states implementing mandatory collective bargaining agreements for public sector unions [(Paglayan, 2018)](https://onlinelibrary.wiley.com/doi/full/10.1111/ajps.12388)

```{r results="hide", warning=F, message=F}
library(magrittr)
library(dplyr)
library(augsynth)
```

```{r }
data <- read.csv("https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/WGWMAV/3UHTLP", sep="\t")
```

The dataset contains several important variables that we'll use:

- `year`, `State`: The state and year of the measurement
- `YearCBrequired`: The year that the state adopted mandatory collective bargaining
- `lnppexpend`: Log per pupil expenditures in constant 2010 $

```{r echo = F}
data %>% 
    filter(year == 1960) %>% 
    select(year, State, YearCBrequired, lnppexpend) %>%
    head() %>%
    kable() %>%
    kable_styling(bootstrap_options =c("hover", "responsive"))
```

To run `multisynth`, we need to include a treatment status column that indicates which state is treated in a given year, we call this `cbr` below. We also restrict to the years 1959-1997 where we have yearly measurements of expenditures and drop Washington D.C. and Wisconsin from the analysis.

```{r }
data %>%
    filter(!State %in% c("DC", "WI"),
           year >= 1959, year <= 1997) %>%
    mutate(YearCBrequired = ifelse(is.na(YearCBrequired), 
                                   Inf, YearCBrequired),
           cbr = 1 * (year >= YearCBrequired)) -> analysis_df
```

## Partially pooled SCM with an intercept

To fit partially pooled synthetic controls, we need to give `multisynth` a formula of the form `outcome ~ treatment`, point it to the unit and time variables, and choose the level of partial pooling `nu`. Setting `nu = 0` fits a separate synthetic control for each treated unit and setting `nu = 1` fits fully pooled synthetic controls. If we don't set `nu`, `multisynth` will choose a heuristic value based on how well separate synthetic controls balance the overall average.
By default, `multisynth` includes an intercept shift along with the weights; we can exclude the intercept shift by setting `fixedeff = F`.
We can also set the number of pre-treatment time periods (lags) that we want to balance with the `n_lags` argument and the number of post-treatment time periods (leads) that we want to estimate with the `n_leads` argument. By default `multisynth` sets `n_lags` and `n_leads` to the number of pre-treatment and post-treatment periods for the last treated unit, respectively.

```{r }
# with a choice of nu
ppool_syn <- multisynth(lnppexpend ~ cbr, State, year, 
                        nu = 0.5, analysis_df)
# with default nu
ppool_syn <- multisynth(lnppexpend ~ cbr, State, year, 
                        analysis_df)

print(ppool_syn$nu)

ppool_syn
```

Using the `summary` function, we'll compute the treatment effects and standard errors and confidence intervals for all treated units as well as the average via the wild bootstrap. (This takes a bit of time so we'll store the output) We can also change the significant level associated with the confidence intervals by setting the `alpha` argument, by default `alpha = 0.05`.

```{r}
ppool_syn_summ <- summary(ppool_syn)
```

We can then report the level of global and individual balance as well as estimates for the average.

```{r }
ppool_syn_summ
```

`ppool_syn_summ$att` is a dataframe that contains all of the point estimates, standard errors, and lower/upper confidence limits. `Time = NA` denotes the effect averaged across the post treatment periods.

```{r echo = F}
ppool_syn_summ$att %>%
  filter(Time >= 0) %>%
  head() %>%
  kable() %>%
  kable_styling(bootstrap_options =c("hover", "responsive"))
```

We can also visually display both the pre-treatment balance and the estimated treatment effects.

```{r ppool_syn_plot, fig.width=8, fig.height=4.5, fig.align="center", warning=F, message=F}
plot(ppool_syn_summ)
```

And again we can hone in on the average effects.

```{r ppool_syn_plot_avg, fig.width=8, fig.height=4.5, fig.align="center", warning=F, message=F}
plot(ppool_syn_summ, levels = "Average")
```


### Collapsing into time cohorts

We can also collapse treated units with the same treatment time into _time cohorts_, and find one synthetic control per time cohort by setting `time_cohort = TRUE`. When the number of distinct treatment times is much smaller than the number of treated units, this will run significantly faster.

```{r }
# with default nu
ppool_syn_time <- multisynth(lnppexpend ~ cbr, State, year,
                        analysis_df, time_cohort = TRUE)

print(ppool_syn_time$nu)

ppool_syn_time
```

We can then compute effects for the overall average as well as for each treatment time cohort, rather than individual units.

```{r}
ppool_syn_time_summ <- summary(ppool_syn_time)
ppool_syn_time_summ
```

```{r echo = F}
ppool_syn_time_summ$att %>%
  filter(Time >= 0) %>%
  head() %>%
  kable() %>%
  kable_styling(bootstrap_options =c("hover", "responsive"))
```

Again we can plot the effects.

```{r ppool_syn_time_plot, fig.width=8, fig.height=4.5, fig.align="center", warning=F, message=F}
plot(ppool_syn_time_summ)
```


### Including auxiliary covariates

We can also include an additional set of covariates to balance along with the pre-treatment outcomes. First, let's create a data frame with the values of some covariates in a few different years:

```{r cov_data}

data %>%
  select(State, year, agr, pnwht, purban, perinc, studteachratio) %>%
  group_by(State) %>%
  summarise(perinc_1959 = perinc[year == 1959],
            studteachratio_1959 = studteachratio[year == 1959]) %>% 
  # filter to lower 48 where we have data
  filter(!State %in% c("AK", "HI"))  -> cov_data

analysis_df %>%
  inner_join(cov_data, by = "State") -> analysis_df_covs

```

To include auxiliary covariates, we can add them in to the formula after `|`. This will balance the auxiliary covariates along with the pre-treatment outcomes simultanouesly. If the covariates vary during the pre-treatment periods, `multisynth` will use the average pre-treatment value. We can change this behavior by including our own custom aggregation function via the `cov_agg` argument.
```{r cov_syn}
# with default nu
ppool_syn_cov <- multisynth(lnppexpend ~ cbr | perinc_1959 + studteachratio_1959,
                            State, year, analysis_df_covs)

print(ppool_syn_cov$nu)

ppool_syn_cov
```

Again we can compute effects, along with their standard errors and confidence intervals, and plot.
```{r}
ppool_syn_cov_summ <- summary(ppool_syn_cov)
ppool_syn_cov_summ
```

```{r echo = F}
ppool_syn_cov_summ$att %>%
  filter(Time >= 0) %>%
  head() %>%
  kable() %>%
  kable_styling(bootstrap_options =c("hover", "responsive"))
```

Again we can plot the effects.
```{r ppool_syn_cov_plot, fig.width=8, fig.height=4.5, fig.align="center", warning=F, message=F}
plot(ppool_syn_cov_summ, levels = "Average")
```

================================================
FILE: vignettes/multisynth-vignette.md
================================================
---
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{MultiSynth Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


# `augsynth`: Estimating treatment effects with staggered adoption

### The data

To show the features of the `multisynth` function we will use data on the effects of states implementing mandatory collective bargaining agreements for public sector unions [(Paglayan, 2018)](https://onlinelibrary.wiley.com/doi/full/10.1111/ajps.12388)


```r
library(magrittr)
library(dplyr)
library(augsynth)
```


```r
data <- read.csv("https://dataverse.harvard.edu/api/access/datafile/:persistentId?persistentId=doi:10.7910/DVN/WGWMAV/3UHTLP", sep="\t")
```

The dataset contains several important variables that we'll use:

- `year`, `State`: The state and year of the measurement
- `YearCBrequired`: The year that the state adopted mandatory collective bargaining
- `lnppexpend`: Log per pupil expenditures in constant 2010 $

<table class="table table-hover table-responsive" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> year </th>
   <th style="text-align:left;"> State </th>
   <th style="text-align:right;"> YearCBrequired </th>
   <th style="text-align:right;"> lnppexpend </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> AK </td>
   <td style="text-align:right;"> 1970 </td>
   <td style="text-align:right;"> 8.325518 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> AL </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> 7.396177 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> AR </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> 7.385373 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> AZ </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> 7.947127 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> CA </td>
   <td style="text-align:right;"> 1976 </td>
   <td style="text-align:right;"> 8.185162 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1960 </td>
   <td style="text-align:left;"> CO </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> 7.952833 </td>
  </tr>
</tbody>
</table>


To run `multisynth`, we need to include a treatment status column that indicates which state is treated in a given year, we call this `cbr` below. We also restrict to the years 1959-1997 where we have yearly measurements of expenditures and drop Washington D.C. and Wisconsin from the analysis.


```r
data %>%
    filter(!State %in% c("DC", "WI"),
           year >= 1959, year <= 1997) %>%
    mutate(YearCBrequired = ifelse(is.na(YearCBrequired), 
                                   Inf, YearCBrequired),
           cbr = 1 * (year >= YearCBrequired)) -> analysis_df
```

## Partially pooled SCM with an intercept

To fit partially pooled synthetic controls, we need to give `multisynth` a formula of the form `outcome ~ treatment`, point it to the unit and time variables, and choose the level of partial pooling `nu`. Setting `nu = 0` fits a separate synthetic control for each treated unit and setting `nu = 1` fits fully pooled synthetic controls. If we don't set `nu`, `multisynth` will choose a heuristic value based on how well separate synthetic controls balance the overall average.
By default, `multisynth` includes an intercept shift along with the weights; we can exclude the intercept shift by setting `fixedeff = F`.
We can also set the number of pre-treatment time periods (lags) that we want to balance with the `n_lags` argument and the number of post-treatment time periods (leads) that we want to estimate with the `n_leads` argument. By default `multisynth` sets `n_lags` and `n_leads` to the number of pre-treatment and post-treatment periods for the last treated unit, respectively.


```r
# with a choice of nu
ppool_syn <- multisynth(lnppexpend ~ cbr, State, year, 
                        nu = 0.5, analysis_df)
# with default nu
ppool_syn <- multisynth(lnppexpend ~ cbr, State, year, 
                        analysis_df)

print(ppool_syn$nu)
#> [1] 0.2606793

ppool_syn
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr, unit = State, time = year, 
#>     data = analysis_df)
#> 
#> Average ATT Estimate: -0.011
```

Using the `summary` function, we'll compute the treatment effects and standard errors and confidence intervals for all treated units as well as the average via the wild bootstrap. (This takes a bit of time so we'll store the output) We can also change the significant level associated with the confidence intervals by setting the `alpha` argument, by default `alpha = 0.05`.


```r
ppool_syn_summ <- summary(ppool_syn)
```

We can then report the level of global and individual balance as well as estimates for the average.


```r
ppool_syn_summ
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr, unit = State, time = year, 
#>     data = analysis_df)
#> 
#> Average ATT Estimate (Std. Error): -0.011  (0.022)
#> 
#> Global L2 Imbalance: 0.003
#> Scaled Global L2 Imbalance: 0.019
#> Percent improvement from uniform global weights: 98.1
#> 
#> Individual L2 Imbalance: 0.028
#> Scaled Individual L2 Imbalance: 0.096
#> Percent improvement from uniform individual weights: 90.4	
#> 
#>  Time Since Treatment   Level     Estimate  Std.Error lower_bound upper_bound
#>                     0 Average -0.004281754 0.02231379 -0.04888183  0.03786032
#>                     1 Average -0.010856856 0.02099299 -0.05423609  0.02939147
#>                     2 Average  0.004378813 0.02268842 -0.04268354  0.04896627
#>                     3 Average  0.001155346 0.02388535 -0.04846624  0.04464696
#>                     4 Average -0.009305005 0.02529949 -0.06207289  0.03822153
#>                     5 Average -0.016942988 0.02447144 -0.06935946  0.02695179
#>                     6 Average -0.018505173 0.02507329 -0.07297111  0.02755436
#>                     7 Average -0.003866657 0.02817460 -0.06047905  0.05013422
#>                     8 Average -0.015835730 0.03141197 -0.08179055  0.04231137
#>                     9 Average -0.031751350 0.02962989 -0.09168791  0.02202697
#>                    10 Average -0.017839047 0.03314017 -0.08835499  0.04070061
```

`ppool_syn_summ$att` is a dataframe that contains all of the point estimates, standard errors, and lower/upper confidence limits. `Time = NA` denotes the effect averaged across the post treatment periods.

<table class="table table-hover table-responsive" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> Time </th>
   <th style="text-align:left;"> Level </th>
   <th style="text-align:right;"> Estimate </th>
   <th style="text-align:right;"> Std.Error </th>
   <th style="text-align:right;"> lower_bound </th>
   <th style="text-align:right;"> upper_bound </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0042818 </td>
   <td style="text-align:right;"> 0.0223138 </td>
   <td style="text-align:right;"> -0.0488818 </td>
   <td style="text-align:right;"> 0.0378603 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0108569 </td>
   <td style="text-align:right;"> 0.0209930 </td>
   <td style="text-align:right;"> -0.0542361 </td>
   <td style="text-align:right;"> 0.0293915 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> 0.0043788 </td>
   <td style="text-align:right;"> 0.0226884 </td>
   <td style="text-align:right;"> -0.0426835 </td>
   <td style="text-align:right;"> 0.0489663 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> 0.0011553 </td>
   <td style="text-align:right;"> 0.0238853 </td>
   <td style="text-align:right;"> -0.0484662 </td>
   <td style="text-align:right;"> 0.0446470 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0093050 </td>
   <td style="text-align:right;"> 0.0252995 </td>
   <td style="text-align:right;"> -0.0620729 </td>
   <td style="text-align:right;"> 0.0382215 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0169430 </td>
   <td style="text-align:right;"> 0.0244714 </td>
   <td style="text-align:right;"> -0.0693595 </td>
   <td style="text-align:right;"> 0.0269518 </td>
  </tr>
</tbody>
</table>


We can also visually display both the pre-treatment balance and the estimated treatment effects.


```r
plot(ppool_syn_summ)
```

<div class="figure" style="text-align: center">
<img src="figure/ppool_syn_plot-1.png" alt="plot of chunk ppool_syn_plot"  />
<p class="caption">plot of chunk ppool_syn_plot</p>
</div>

And again we can hone in on the average effects.


```r
plot(ppool_syn_summ, levels = "Average")
```

<div class="figure" style="text-align: center">
<img src="figure/ppool_syn_plot_avg-1.png" alt="plot of chunk ppool_syn_plot_avg"  />
<p class="caption">plot of chunk ppool_syn_plot_avg</p>
</div>


### Collapsing into time cohorts

We can also collapse treated units with the same treatment time into _time cohorts_, and find one synthetic control per time cohort by setting `time_cohort = TRUE`. When the number of distinct treatment times is much smaller than the number of treated units, this will run significantly faster.


```r
# with default nu
ppool_syn_time <- multisynth(lnppexpend ~ cbr, State, year,
                        analysis_df, time_cohort = TRUE)

print(ppool_syn_time$nu)
#> [1] 0.3939013

ppool_syn_time
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr, unit = State, time = year, 
#>     data = analysis_df, time_cohort = TRUE)
#> 
#> Average ATT Estimate: -0.018
```

We can then compute effects for the overall average as well as for each treatment time cohort, rather than individual units.


```r
ppool_syn_time_summ <- summary(ppool_syn_time)
ppool_syn_time_summ
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr, unit = State, time = year, 
#>     data = analysis_df, time_cohort = TRUE)
#> 
#> Average ATT Estimate (Std. Error): -0.018  (0.024)
#> 
#> Global L2 Imbalance: 0.005
#> Scaled Global L2 Imbalance: 0.018
#> Percent improvement from uniform global weights: 98.2
#> 
#> Individual L2 Imbalance: 0.038
#> Scaled Individual L2 Imbalance: 0.057
#> Percent improvement from uniform individual weights: 94.3	
#> 
#>  Time Since Treatment   Level      Estimate  Std.Error lower_bound upper_bound
#>                     0 Average -0.0007756959 0.02443902 -0.04849731  0.04410082
#>                     1 Average -0.0160616979 0.02455148 -0.06120905  0.03042719
#>                     2 Average -0.0028471499 0.02521902 -0.05189710  0.04841170
#>                     3 Average -0.0026721191 0.02742973 -0.05634460  0.05048728
#>                     4 Average -0.0181312843 0.02798461 -0.07148111  0.03468573
#>                     5 Average -0.0284898474 0.02644653 -0.07724091  0.02368573
#>                     6 Average -0.0228343778 0.02673115 -0.07456646  0.02837584
#>                     7 Average -0.0140789250 0.03200335 -0.07574649  0.04580312
#>                     8 Average -0.0245472682 0.03276526 -0.08792999  0.03819451
#>                     9 Average -0.0476922268 0.03221486 -0.11080383  0.01490279
#>                    10 Average -0.0216121159 0.03235770 -0.08391841  0.03853317
```

<table class="table table-hover table-responsive" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> Time </th>
   <th style="text-align:left;"> Level </th>
   <th style="text-align:right;"> Estimate </th>
   <th style="text-align:right;"> Std.Error </th>
   <th style="text-align:right;"> lower_bound </th>
   <th style="text-align:right;"> upper_bound </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0007757 </td>
   <td style="text-align:right;"> 0.0244390 </td>
   <td style="text-align:right;"> -0.0484973 </td>
   <td style="text-align:right;"> 0.0441008 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0160617 </td>
   <td style="text-align:right;"> 0.0245515 </td>
   <td style="text-align:right;"> -0.0612091 </td>
   <td style="text-align:right;"> 0.0304272 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0028471 </td>
   <td style="text-align:right;"> 0.0252190 </td>
   <td style="text-align:right;"> -0.0518971 </td>
   <td style="text-align:right;"> 0.0484117 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0026721 </td>
   <td style="text-align:right;"> 0.0274297 </td>
   <td style="text-align:right;"> -0.0563446 </td>
   <td style="text-align:right;"> 0.0504873 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0181313 </td>
   <td style="text-align:right;"> 0.0279846 </td>
   <td style="text-align:right;"> -0.0714811 </td>
   <td style="text-align:right;"> 0.0346857 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0284898 </td>
   <td style="text-align:right;"> 0.0264465 </td>
   <td style="text-align:right;"> -0.0772409 </td>
   <td style="text-align:right;"> 0.0236857 </td>
  </tr>
</tbody>
</table>


Again we can plot the effects.


```r
plot(ppool_syn_time_summ)
```

<div class="figure" style="text-align: center">
<img src="figure/ppool_syn_time_plot-1.png" alt="plot of chunk ppool_syn_time_plot"  />
<p class="caption">plot of chunk ppool_syn_time_plot</p>
</div>


### Including auxiliary covariates

We can also include an additional set of covariates to balance along with the pre-treatment outcomes. First, let's create a data frame with the values of some covariates in a few different years:


```r

data %>%
  select(State, year, agr, pnwht, purban, perinc, studteachratio) %>%
  group_by(State) %>%
  summarise(perinc_1959 = perinc[year == 1959],
            studteachratio_1959 = studteachratio[year == 1959]) %>% 
  # filter to lower 48 where we have data
  filter(!State %in% c("AK", "HI"))  -> cov_data

analysis_df %>%
  inner_join(cov_data, by = "State") -> analysis_df_covs
```

To include auxiliary covariates, we can add them in to the formula after `|`. This will balance the auxiliary covariates along with the pre-treatment outcomes simultanouesly. If the covariates vary during the pre-treatment periods, `multisynth` will use the average pre-treatment value. We can change this behavior by including our own custom aggregation function via the `cov_agg` argument.

```r
# with default nu
ppool_syn_cov <- multisynth(lnppexpend ~ cbr | perinc_1959 + studteachratio_1959,
                            State, year, analysis_df_covs)

print(ppool_syn_cov$nu)
#> [1] 0.2242633

ppool_syn_cov
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr | perinc_1959 + studteachratio_1959, 
#>     unit = State, time = year, data = analysis_df_covs)
#> 
#> Average ATT Estimate: -0.019
```

Again we can compute effects, along with their standard errors and confidence intervals, and plot.

```r
ppool_syn_cov_summ <- summary(ppool_syn_cov)
ppool_syn_cov_summ
#> 
#> Call:
#> multisynth(form = lnppexpend ~ cbr | perinc_1959 + studteachratio_1959, 
#>     unit = State, time = year, data = analysis_df_covs)
#> 
#> Average ATT Estimate (Std. Error): -0.019  (0.016)
#> 
#> Global L2 Imbalance: 0.004
#> Scaled Global L2 Imbalance: 0.030
#> Percent improvement from uniform global weights: 97
#> 
#> Individual L2 Imbalance: 0.043
#> Scaled Individual L2 Imbalance: 0.155
#> Percent improvement from uniform individual weights: 84.5	
#> 
#>  Time Since Treatment   Level      Estimate  Std.Error lower_bound upper_bound
#>                     0 Average -0.0002624529 0.02142663 -0.04534283 0.039477273
#>                     1 Average -0.0156461424 0.01955742 -0.05138329 0.021858933
#>                     2 Average  0.0069387257 0.01979857 -0.03246108 0.046934990
#>                     3 Average -0.0106105517 0.02094953 -0.05241864 0.032678554
#>                     4 Average -0.0194238312 0.02027608 -0.06026658 0.019006295
#>                     5 Average -0.0209126517 0.02065713 -0.06053277 0.018478402
#>                     6 Average -0.0212525401 0.02011174 -0.06076619 0.018093027
#>                     7 Average -0.0276107046 0.02122581 -0.07010144 0.014753050
#>                     8 Average -0.0278450111 0.02282095 -0.07360570 0.017305636
#>                     9 Average -0.0354977043 0.02341366 -0.07998872 0.009126067
#>                    10 Average -0.0341083505 0.02709654 -0.08591161 0.017937928
```

<table class="table table-hover table-responsive" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> Time </th>
   <th style="text-align:left;"> Level </th>
   <th style="text-align:right;"> Estimate </th>
   <th style="text-align:right;"> Std.Error </th>
   <th style="text-align:right;"> lower_bound </th>
   <th style="text-align:right;"> upper_bound </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0002625 </td>
   <td style="text-align:right;"> 0.0214266 </td>
   <td style="text-align:right;"> -0.0453428 </td>
   <td style="text-align:right;"> 0.0394773 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0156461 </td>
   <td style="text-align:right;"> 0.0195574 </td>
   <td style="text-align:right;"> -0.0513833 </td>
   <td style="text-align:right;"> 0.0218589 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> 0.0069387 </td>
   <td style="text-align:right;"> 0.0197986 </td>
   <td style="text-align:right;"> -0.0324611 </td>
   <td style="text-align:right;"> 0.0469350 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0106106 </td>
   <td style="text-align:right;"> 0.0209495 </td>
   <td style="text-align:right;"> -0.0524186 </td>
   <td style="text-align:right;"> 0.0326786 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0194238 </td>
   <td style="text-align:right;"> 0.0202761 </td>
   <td style="text-align:right;"> -0.0602666 </td>
   <td style="text-align:right;"> 0.0190063 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> Average </td>
   <td style="text-align:right;"> -0.0209127 </td>
   <td style="text-align:right;"> 0.0206571 </td>
   <td style="text-align:right;"> -0.0605328 </td>
   <td style="text-align:right;"> 0.0184784 </td>
  </tr>
</tbody>
</table>


Again we can plot the effects.

```r
plot(ppool_syn_cov_summ, levels = "Average")
```

<div class="figure" style="text-align: center">
<img src="figure/ppool_syn_cov_plot-1.png" alt="plot of chunk ppool_syn_cov_plot"  />
<p class="caption">plot of chunk ppool_syn_cov_plot</p>
</div>


================================================
FILE: vignettes/singlesynth-vignette.Rmd
================================================
---
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Single Outcome AugSynth Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
  )
library(kableExtra)
```

# `augsynth`: The Augmented Synthetic Control Method


## Installation

You can install `augsynth` from github using `devtools`.

```{r install, results="hide", message=F, eval=F}
## Install devtools if noy already installed
install.packages("devtools", repos='http://cran.us.r-project.org')
## Install augsynth from github
devtools::install_github("ebenmichael/augsynth")
```

## Example: Effects of the 2012 Kansas Tax Cuts 

### The data
To show the usage and features of `augsynth`, we'll use data on the impact of personal income tax cuts in Kansas that comes with the `AugSynth` package. Our interest is in estimating the effect of income tax cuts on gross state product (GSP) per capita.

```{r load_data, results="hide", warning=F, message=F}
library(magrittr)
library(dplyr)
library(augsynth)
data(kansas)
```

The `kansas` dataset contains the GSP per capita (the outcome measure) `lngdpcapita` for all 50 states from the first quarter of 1990 to the first quarter of 2016.

To run `augsynth`, we need to include a treatment status column that indicates which region was treated and at what time. The table in `kansas` contains the column `treated` to denote this. In the original study, the second quarter of 2012 was the implementation of the tax cut in Kansas.

```{r treated_units}
kansas %>% 
  select(year, qtr, year_qtr, state, treated, gdp, lngdpcapita) %>% 
  filter(state == "Kansas" & year_qtr >= 2012 & year_qtr < 2013) 
```


### Synth
Now to find a synthetic control using the entire series of pre-intervention outcomes (and no auxiliary covariates), we can use `augsynth`. To do so we just need to give `augsynth` a formula like `outcome ~ treatment`, tell it what the unit and time variables are, optionally provide when intervention took place (the code will automatically determine this if `t_int` is not provided), and specify that we don't want to fit an outcome model

```{r fit_synth, message=F, warning=F}
library(augsynth)
syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "None", scm = T)
```

We can then look at the ATT estimates for each post-intervention time period and overall. 
We'll also see the quality of the synthetic control fit measured by the L2 distance between Kansas and its synthetic control, and the percent improvement over uniform weights.
By default, we'll also see pointwise confidence intervals using a [conformal inference procedure](https://arxiv.org/abs/1712.09089).

```{r summ_syn}
summary(syn)
```


The default test statistic is the sum of the absolute treatment efects `function(x) sum(abs(x))`. We can change the test statistic via the `stat_func` argument. For instance, if we want to perform a one-way test against postive effects, we can set the test stastic to be the negative sum `function(x) -sum(x)`:
```{r summ_syn_neg}
summary(syn, stat_func = function(x) -sum(x))
```
Or if we want to priotize testing the average post-treatment effect, we can set it to be the absolute sum:
```{r summ_syn_sum}
summary(syn, stat_func = function(x) abs(sum(x)))
```


It's easier to see this information visually. Below we plot the difference between Kansas and it's synthetic control. Before the tax cuts (to the left of the dashed line) we expect these to be close, and after the tax cuts we measure the effect (with point-wise confidence intervals).

```{r fig_syn, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(syn)
```

We can also compute point-wise confidence intervals using the [Jackknife+ procedure](https://arxiv.org/abs/1905.02928) by changing the `inf_type` argument, although this requires additional assumptions.

```{r fig_syn_plus, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(syn, inf_type = "jackknife+")
```


### Augmenting synth with an outcome model
In this example the pre-intervention synthetic control fit has an L2 imbalance of 0.083, about 20% of the imbalance between Kansas and the average of the other states. We can reduce this by _augmenting_ synth with ridge regression. To do this we change `progfunc` to `"Ridge"`. We can also choose the ridge hyper-parameter by setting `lambda`, while not specifying `lambda` will determine one through cross validation:
```{r fit_asynth, message=F, warning=F}
asyn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "Ridge", scm = T)
```

We can plot the cross-validation MSE when dropping pre-treatment time periods by setting `cv = T` in the `plot` function:

```{r fig_asyn_cv, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(asyn, cv = T)
```

By default, the CV procedure chooses the maximal value of `lambda` with MSE within one standard deviation of the minimal MSE. To instead choose the `lambda` that minizes the cross validation MSE, set `min_1se = FALSE`.


We can look at the summary and plot the results. Now in the summary output we see an estimate of the overall bias of synth; we measure this with the average amount that augmentation changes the synth estimate. Notice that the estimates become somewhat larger in magnitude, and the standard errors are tighter.
```{r summ_asyn}
summary(asyn)
```

```{r fig_asyn, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(asyn)
```

There are also several auxiliary covariates. We can include these in the augmentation by fitting an outcome model using the auxiliary covariates. To do this we simply add the covariates into the formula after `|`. By default this will create time invariant covariates by averaging the auxiliary covariates over the pre-intervention period, dropping `NA` values. We can use a custom aggregation function by setting the `cov_agg` argument. Then the lagged outcomes and the auxiliary covariates are jointly balanced by SCM and the ridge outcome model includes both.

```{r fit_covsynth, message=F, warning=F}
covsyn <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T)

```

Again we can look at the summary and plot the results.
```{r summ_cvsyn}
summary(covsyn)
```

```{r fig_covsyn, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(covsyn)
```

Now we can additionally fit ridge ASCM on the residuals, look at the summary, and plot the results.
```{r fit_covsynth_aug, message=F, warning=F}

covsyn_resid <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T, lambda = asyn$lambda,
                   residualize = T)
```

```{r summ_cvsyn_resid}
summary(covsyn_resid)
```


```{r fig_covsyn_resid, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(covsyn_resid)
```


Finally, we can augment synth with many different outcome models. The simplest outcome model is a unit fixed effect model, which we can include by setting `fixedeff = T`.
```{r fit_desyn, message=F, warning=F}

desyn <- augsynth(lngdpcapita ~ treated,
                   fips, year_qtr, kansas,
                   progfunc = "none", scm = T,
                   fixedeff = T)
```


```{r summ_desyn}
summary(desyn)
```


```{r fig_desyn, fig.width=8, fig.height=4.5, echo=T, fig.align="center"}
plot(desyn)
```

We can incorproate other outcome models by changing the `progfunc`.
Several outcome models are available, including, fitting the factor model directly with `gsynth`, general elastic net regression, bayesian structural time series estimation with `CausalImpact`, and matrix completion with `MCPanel`. For each outcome model you can supply an optional set of parameters, see documentation for details.


================================================
FILE: vignettes/singlesynth-vignette.md
================================================
---
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Single Outcome AugSynth Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


# `augsynth`: The Augmented Synthetic Control Method


## Installation

You can install `augsynth` from github using `devtools`.


```r
## Install devtools if noy already installed
install.packages("devtools", repos='http://cran.us.r-project.org')
## Install augsynth from github
devtools::install_github("ebenmichael/augsynth")
```

## Example: Effects of the 2012 Kansas Tax Cuts 

### The data
To show the usage and features of `augsynth`, we'll use data on the impact of personal income tax cuts in Kansas that comes with the `AugSynth` package. Our interest is in estimating the effect of income tax cuts on gross state product (GSP) per capita.


```r
library(magrittr)
library(dplyr)
library(augsynth)
data(kansas)
```

The `kansas` dataset contains the GSP per capita (the outcome measure) `lngdpcapita` for all 50 states from the first quarter of 1990 to the first quarter of 2016.

To run `augsynth`, we need to include a treatment status column that indicates which region was treated and at what time. The table in `kansas` contains the column `treated` to denote this. In the original study, the second quarter of 2012 was the implementation of the tax cut in Kansas.


```r
kansas %>% 
  select(year, qtr, year_qtr, state, treated, gdp, lngdpcapita) %>% 
  filter(state == "Kansas" & year_qtr >= 2012 & year_qtr < 2013) 
#> # A tibble: 4 x 7
#>    year   qtr year_qtr state  treated    gdp lngdpcapita
#>   <dbl> <dbl>    <dbl> <chr>    <dbl>  <dbl>       <dbl>
#> 1  2012     1    2012  Kansas       0 143844        10.8
#> 2  2012     2    2012. Kansas       1 141518        10.8
#> 3  2012     3    2012. Kansas       1 138890        10.8
#> 4  2012     4    2013. Kansas       1 139603        10.8
```


### Synth
Now to find a synthetic control using the entire series of pre-intervention outcomes (and no auxiliary covariates), we can use `augsynth`. To do so we just need to give `augsynth` a formula like `outcome ~ treatment`, tell it what the unit and time variables are, optionally provide when intervention took place (the code will automatically determine this if `t_int` is not provided), and specify that we don't want to fit an outcome model


```r
library(augsynth)
syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "None", scm = T)
```

We can then look at the ATT estimates for each post-intervention time period and overall. 
We'll also see the quality of the synthetic control fit measured by the L2 distance between Kansas and its synthetic control, and the percent improvement over uniform weights.
By default, we'll also see pointwise confidence intervals using a [conformal inference procedure](https://arxiv.org/abs/1712.09089).


```r
summary(syn)
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "None", scm = ..2)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.029   ( 0.328 )
#> L2 Imbalance: 0.083
#> Percent improvement from uniform weights: 79.5%
#> 
#> Avg Estimated Bias: NA
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.018             -0.045              0.006   0.111
#>  2012.50   -0.041             -0.070             -0.015   0.022
#>  2012.75   -0.033             -0.062             -0.007   0.044
#>  2013.00   -0.019             -0.046              0.005   0.111
#>  2013.25   -0.029             -0.053             -0.005   0.044
#>  2013.50   -0.046             -0.073             -0.022   0.022
#>  2013.75   -0.032             -0.056             -0.010   0.022
#>  2014.00   -0.045             -0.074             -0.018   0.022
#>  2014.25   -0.043             -0.074             -0.014   0.022
#>  2014.50   -0.029             -0.061              0.000   0.044
#>  2014.75   -0.018             -0.053              0.011   0.144
#>  2015.00   -0.029             -0.066              0.005   0.078
#>  2015.25   -0.019             -0.051              0.010   0.122
#>  2015.50   -0.022             -0.056              0.007   0.111
#>  2015.75   -0.019             -0.055              0.013   0.189
#>  2016.00   -0.028             -0.067              0.008   0.100
```


The default test statistic is the sum of the absolute treatment efects `function(x) sum(abs(x))`. We can change the test statistic via the `stat_func` argument. For instance, if we want to perform a one-way test against postive effects, we can set the test stastic to be the negative sum `function(x) -sum(x)`:

```r
summary(syn, stat_func = function(x) -sum(x))
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "None", scm = ..2)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.029   ( 0.159 )
#> L2 Imbalance: 0.083
#> Percent improvement from uniform weights: 79.5%
#> 
#> Avg Estimated Bias: NA
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.018             -0.080              0.006   0.067
#>  2012.50   -0.041             -0.103             -0.015   0.022
#>  2012.75   -0.033             -0.095             -0.007   0.033
#>  2013.00   -0.019             -0.081              0.005   0.067
#>  2013.25   -0.029             -0.091             -0.005   0.033
#>  2013.50   -0.046             -0.108             -0.022   0.022
#>  2013.75   -0.032             -0.094             -0.010   0.022
#>  2014.00   -0.045             -0.107             -0.021   0.022
#>  2014.25   -0.043             -0.105             -0.014   0.022
#>  2014.50   -0.029             -0.091              0.000   0.033
#>  2014.75   -0.018             -0.080              0.011   0.078
#>  2015.00   -0.029             -0.091              0.005   0.056
#>  2015.25   -0.019             -0.081              0.007   0.078
#>  2015.50   -0.022             -0.084              0.007   0.067
#>  2015.75   -0.019             -0.081              0.013   0.111
#>  2016.00   -0.028             -0.090              0.008   0.067
```
Or if we want to priotize testing the average post-treatment effect, we can set it to be the absolute sum:

```r
summary(syn, stat_func = function(x) abs(sum(x)))
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "None", scm = ..2)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.029   ( 0.302 )
#> L2 Imbalance: 0.083
#> Percent improvement from uniform weights: 79.5%
#> 
#> Avg Estimated Bias: NA
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.018             -0.045              0.006   0.111
#>  2012.50   -0.041             -0.070             -0.015   0.022
#>  2012.75   -0.033             -0.062             -0.007   0.044
#>  2013.00   -0.019             -0.046              0.005   0.111
#>  2013.25   -0.029             -0.053             -0.005   0.044
#>  2013.50   -0.046             -0.073             -0.022   0.022
#>  2013.75   -0.032             -0.056             -0.010   0.022
#>  2014.00   -0.045             -0.074             -0.018   0.022
#>  2014.25   -0.043             -0.074             -0.014   0.022
#>  2014.50   -0.029             -0.061              0.000   0.044
#>  2014.75   -0.018             -0.053              0.011   0.144
#>  2015.00   -0.029             -0.066              0.005   0.078
#>  2015.25   -0.019             -0.051              0.010   0.122
#>  2015.50   -0.022             -0.056              0.007   0.111
#>  2015.75   -0.019             -0.055              0.013   0.189
#>  2016.00   -0.028             -0.067              0.008   0.100
```


It's easier to see this information visually. Below we plot the difference between Kansas and it's synthetic control. Before the tax cuts (to the left of the dashed line) we expect these to be close, and after the tax cuts we measure the effect (with point-wise confidence intervals).


```r
plot(syn)
```

<img src="figure/fig_syn-1.png" title="plot of chunk fig_syn" alt="plot of chunk fig_syn" style="display: block; margin: auto;" />

We can also compute point-wise confidence intervals using the [Jackknife+ procedure](https://arxiv.org/abs/1905.02928) by changing the `inf_type` argument, although this requires additional assumptions.


```r
plot(syn, inf_type = "jackknife+")
```

<img src="figure/fig_syn_plus-1.png" title="plot of chunk fig_syn_plus" alt="plot of chunk fig_syn_plus" style="display: block; margin: auto;" />


### Augmenting synth with an outcome model
In this example the pre-intervention synthetic control fit has an L2 imbalance of 0.083, about 20% of the imbalance between Kansas and the average of the other states. We can reduce this by _augmenting_ synth with ridge regression. To do this we change `progfunc` to `"Ridge"`. We can also choose the ridge hyper-parameter by setting `lambda`, while not specifying `lambda` will determine one through cross validation:

```r
asyn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "Ridge", scm = T)
```

We can plot the cross-validation MSE when dropping pre-treatment time periods by setting `cv = T` in the `plot` function:


```r
plot(asyn, cv = T)
```

<img src="figure/fig_asyn_cv-1.png" title="plot of chunk fig_asyn_cv" alt="plot of chunk fig_asyn_cv" style="display: block; margin: auto;" />

By default, the CV procedure chooses the maximal value of `lambda` with MSE within one standard deviation of the minimal MSE. To instead choose the `lambda` that minizes the cross validation MSE, set `min_1se = FALSE`.


We can look at the summary and plot the results. Now in the summary output we see an estimate of the overall bias of synth; we measure this with the average amount that augmentation changes the synth estimate. Notice that the estimates become somewhat larger in magnitude, and the standard errors are tighter.

```r
summary(asyn)
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "Ridge", scm = ..2)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.040   ( 0.057 )
#> L2 Imbalance: 0.062
#> Percent improvement from uniform weights: 84.7%
#> 
#> Avg Estimated Bias: 0.011
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.022             -0.044              0.003   0.056
#>  2012.50   -0.047             -0.076             -0.018   0.022
#>  2012.75   -0.043             -0.071             -0.010   0.022
#>  2013.00   -0.030             -0.055             -0.004   0.033
#>  2013.25   -0.041             -0.067             -0.012   0.022
#>  2013.50   -0.059             -0.088             -0.030   0.022
#>  2013.75   -0.045             -0.073             -0.019   0.022
#>  2014.00   -0.058             -0.090             -0.026   0.022
#>  2014.25   -0.055             -0.091             -0.020   0.022
#>  2014.50   -0.041             -0.080             -0.006   0.033
#>  2014.75   -0.029             -0.068              0.006   0.056
#>  2015.00   -0.040             -0.082              0.000   0.056
#>  2015.25   -0.030             -0.066              0.002   0.056
#>  2015.50   -0.033             -0.072              0.003   0.056
#>  2015.75   -0.029             -0.071              0.010   0.056
#>  2016.00   -0.038             -0.087              0.004   0.056
```


```r
plot(asyn)
```

<img src="figure/fig_asyn-1.png" title="plot of chunk fig_asyn" alt="plot of chunk fig_asyn" style="display: block; margin: auto;" />

There are also several auxiliary covariates. We can include these in the augmentation by fitting an outcome model using the auxiliary covariates. To do this we simply add the covariates into the formula after `|`. By default this will create time invariant covariates by averaging the auxiliary covariates over the pre-intervention period, dropping `NA` values. We can use a custom aggregation function by setting the `cov_agg` argument. Then the lagged outcomes and the auxiliary covariates are jointly balanced by SCM and the ridge outcome model includes both.


```r
covsyn <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T)
```

Again we can look at the summary and plot the results.

```r
summary(covsyn)
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "ridge", scm = ..2)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.061   ( 0.11 )
#> L2 Imbalance: 0.054
#> Percent improvement from uniform weights: 86.6%
#> 
#> Covariate L2 Imbalance: 0.005
#> Percent improvement from uniform weights: 97.7%
#> 
#> Avg Estimated Bias: 0.027
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.021             -0.044              0.002   0.067
#>  2012.50   -0.047             -0.076             -0.014   0.033
#>  2012.75   -0.050             -0.083             -0.007   0.033
#>  2013.00   -0.045             -0.074             -0.012   0.033
#>  2013.25   -0.055             -0.088             -0.022   0.022
#>  2013.50   -0.071             -0.105             -0.033   0.022
#>  2013.75   -0.058             -0.091             -0.025   0.022
#>  2014.00   -0.081             -0.119             -0.037   0.022
#>  2014.25   -0.078             -0.121             -0.034   0.022
#>  2014.50   -0.065             -0.114             -0.021   0.033
#>  2014.75   -0.057             -0.110             -0.008   0.044
#>  2015.00   -0.075             -0.124             -0.022   0.033
#>  2015.25   -0.063             -0.106             -0.014   0.033
#>  2015.50   -0.067             -0.106             -0.019   0.022
#>  2015.75   -0.063             -0.101             -0.009   0.022
#>  2016.00   -0.078             -0.122             -0.019   0.022
```


```r
plot(covsyn)
```

<img src="figure/fig_covsyn-1.png" title="plot of chunk fig_covsyn" alt="plot of chunk fig_covsyn" style="display: block; margin: auto;" />

Now we can additionally fit ridge ASCM on the residuals, look at the summary, and plot the results.

```r

covsyn_resid <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T, lambda = asyn$lambda,
                   residualize = T)
```


```r
summary(covsyn_resid)
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "ridge", scm = ..2, 
#>     lambda = ..3, residualize = ..4)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.055   ( 0.288 )
#> L2 Imbalance: 0.067
#> Percent improvement from uniform weights: 83.4%
#> 
#> Covariate L2 Imbalance: 0.000
#> Percent improvement from uniform weights: 100%
#> 
#> Avg Estimated Bias: 0.006
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.025             -0.046             -0.005   0.044
#>  2012.50   -0.051             -0.076             -0.026   0.011
#>  2012.75   -0.045             -0.070             -0.020   0.011
#>  2013.00   -0.044             -0.069             -0.019   0.011
#>  2013.25   -0.051             -0.077             -0.026   0.011
#>  2013.50   -0.069             -0.094             -0.044   0.011
#>  2013.75   -0.051             -0.077             -0.026   0.011
#>  2014.00   -0.069             -0.095             -0.040   0.011
#>  2014.25   -0.067             -0.097             -0.037   0.011
#>  2014.50   -0.053             -0.083             -0.024   0.011
#>  2014.75   -0.045             -0.075             -0.015   0.022
#>  2015.00   -0.064             -0.093             -0.034   0.011
#>  2015.25   -0.051             -0.076             -0.026   0.011
#>  2015.50   -0.059             -0.089             -0.034   0.011
#>  2015.75   -0.058             -0.087             -0.028   0.011
#>  2016.00   -0.074             -0.103             -0.044   0.011
```


```r
plot(covsyn_resid)
```

<img src="figure/fig_covsyn_resid-1.png" title="plot of chunk fig_covsyn_resid" alt="plot of chunk fig_covsyn_resid" style="display: block; margin: auto;" />


Finally, we can augment synth with many different outcome models. The simplest outcome model is a unit fixed effect model, which we can include by setting `fixedeff = T`.

```r

desyn <- augsynth(lngdpcapita ~ treated,
                   fips, year_qtr, kansas,
                   progfunc = "none", scm = T,
                   fixedeff = T)
```


```r
summary(desyn)
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "none", scm = ..2, 
#>     fixedeff = ..3)
#> 
#> Average ATT Estimate (p Value for Joint Null):  -0.034   ( 0.319 )
#> L2 Imbalance: 0.082
#> Percent improvement from uniform weights: 55.1%
#> 
#> Avg Estimated Bias: NA
#> 
#> Inference type: Conformal inference
#> 
#>     Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>  2012.25   -0.022             -0.046              0.006   0.078
#>  2012.50   -0.046             -0.070             -0.013   0.022
#>  2012.75   -0.038             -0.062             -0.005   0.044
#>  2013.00   -0.024             -0.048              0.003   0.078
#>  2013.25   -0.033             -0.057             -0.006   0.044
#>  2013.50   -0.050             -0.074             -0.023   0.022
#>  2013.75   -0.035             -0.056             -0.010   0.022
#>  2014.00   -0.049             -0.073             -0.019   0.022
#>  2014.25   -0.047             -0.071             -0.014   0.022
#>  2014.50   -0.033             -0.057              0.000   0.056
#>  2014.75   -0.023             -0.047              0.010   0.122
#>  2015.00   -0.034             -0.061              0.004   0.078
#>  2015.25   -0.023             -0.047              0.007   0.100
#>  2015.50   -0.026             -0.053              0.007   0.100
#>  2015.75   -0.023             -0.050              0.012   0.144
#>  2016.00   -0.033             -0.066              0.008   0.089
```


```r
plot(desyn)
```

<img src="figure/fig_desyn-1.png" title="plot of chunk fig_desyn" alt="plot of chunk fig_desyn" style="display: block; margin: auto;" />

We can incorproate other outcome models by changing the `progfunc`.
Several outcome models are available, including, fitting the factor model directly with `gsynth`, general elastic net regression, bayesian structural time series estimation with `CausalImpact`, and matrix completion with `MCPanel`. For each outcome model you can supply an optional set of parameters, see documentation for details.