Title: | GCC Estimation of the Multilevel Factor Model |
---|---|
Description: | Provides methods for model selection, estimation, bootstrap inference, and simulation for the multilevel factor model, based on the principal component estimation and generalised canonical correlation approach. Details can be found in "Generalised Canonical Correlation Estimation of the Multilevel Factor Model." Lin and Shin (2023) <doi:10.2139/ssrn.4295429>. |
Authors: | Rui Lin [aut, cre], Yongcheol Shin [aut] |
Maintainer: | Rui Lin <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.1 |
Built: | 2024-11-23 03:01:45 UTC |
Source: | https://github.com/cran/GCCfactor |
This function computes the asymptotic confidence intervals
for the local loadings of the j-th individual in block i.
See Lin and Shin (2023) for details.
AsymCI_local_loading(object, i, j, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
i |
An integer indicating the i-th block. |
j |
An integer indicating the j-th individual. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_local_loading_11 <- AsymCI_local_loading(est_multi, i = 1, j = 1)
Evaluate the Bartlett kernel function: $k(x) = 1 - |x|$ if $|x| \le 1$,
and $k(x) = 0$ otherwise.
Bartlett(x)
x |
A single numeric. |
A single numeric between 0 and 1.
Bartlett(0.5)
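The kernel definition above can be sketched in a few lines of base R. The helper name bartlett_kernel is hypothetical and is not the package's own Bartlett() source; unlike the documented function, which takes a single numeric, this sketch is vectorised.

```r
# Hypothetical helper illustrating the Bartlett (triangular) kernel.
bartlett_kernel <- function(x) {
  stopifnot(is.numeric(x))
  # 1 - |x| inside [-1, 1], zero outside.
  pmax(1 - abs(x), 0)
}

# Evaluate at a few points on either side of the support boundary.
bartlett_kernel(c(-2, -0.5, 0, 0.5, 1))
```

The value always lies between 0 and 1, which agrees with the documented return value.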
This function employs a bootstrap procedure to obtain confidence intervals
for the global component of the j-th individual in block i at time t.
See Lin and Shin (2023) for details.
BS_global_comp(object, i, j, t, BB = 599, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
i |
An integer indicating the i-th block. |
j |
An integer indicating the j-th individual. |
t |
An integer specifying the time point at which the CI is constructed. |
BB |
An integer indicating the number of bootstrap repetitions. 599 by default. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_gcomp_111 <- BS_global_comp(est_multi, i = 1, j = 1, t = 1)
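To illustrate how the arguments BB and alpha typically translate into an upper and lower band, here is a generic percentile bootstrap interval for a sample mean in base R. This is only a sketch under simple i.i.d. resampling; the procedure used by the BS_* functions follows Lin and Shin (2023) and may differ in how resamples are drawn and how the band is formed. The toy data and object names are assumptions.

```r
set.seed(123)
x <- rnorm(200, mean = 1)   # toy sample standing in for real data
BB <- 599                   # number of bootstrap repetitions
alpha <- 0.05               # significance level

# Recompute the statistic on BB resamples drawn with replacement.
boot_stats <- replicate(BB, mean(sample(x, replace = TRUE)))

# Percentile interval: the alpha/2 and 1 - alpha/2 empirical quantiles.
band <- matrix(quantile(boot_stats, c(alpha / 2, 1 - alpha / 2)),
               nrow = 1, dimnames = list("mean", c("lower", "upper")))
band
```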
This function employs a bootstrap procedure to obtain confidence intervals
for the global factors at time t.
BS_global_factor(object, t, BB = 599, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
t |
An integer specifying the time point at which the CI is constructed. |
BB |
An integer indicating the number of bootstrap repetitions. 599 by default. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_global_mid <- BS_global_factor(est_multi, t = est_multi$T / 2)
This function employs a bootstrap procedure to obtain confidence intervals
for the global factor loadings of the j-th individual in block i.
See Lin and Shin (2023) for details.
BS_global_loading(object, i, j, BB = 599, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
i |
An integer indicating the i-th block. |
j |
An integer indicating the j-th individual. |
BB |
An integer indicating the number of bootstrap repetitions. 599 by default. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_gamma_11 <- BS_global_loading(est_multi, i = 1, j = 1)
This function employs a bootstrap procedure to obtain confidence intervals
for the local component of the j-th individual in block i at time t.
See Lin and Shin (2023) for details.
BS_local_comp(object, i, j, t, BB = 599, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
i |
An integer indicating the i-th block. |
j |
An integer indicating the j-th individual. |
t |
An integer specifying the time point at which the CI is constructed. |
BB |
An integer indicating the number of bootstrap repetitions. 599 by default. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_fcomp_111 <- BS_local_comp(est_multi, i = 1, j = 1, t = 1)
This function employs a bootstrap procedure to obtain confidence intervals
for the local factors in block i at time t.
See Lin and Shin (2023) for details.
BS_local_factor(object, i, t, BB = 599, alpha = 0.05)
object |
An S3 object of class 'multi_result' created by multilevel(). |
i |
An integer indicating the i-th block. |
t |
An integer specifying the time point at which the CI is constructed. |
BB |
An integer indicating the number of bootstrap repetitions. 599 by default. |
alpha |
The significance level, a single numeric between 0 and 1. 0.05 by default. |
A matrix containing the upper and lower band.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
bs_local_factor_11 <- BS_local_factor(est_multi, i = 1, t = 1)
This is an internal function which checks the validity of the data and
provides a list of matrices of length R (the number of blocks) for estimation.
check_data(data, depvar_header = NULL, i_header = NULL, j_header = NULL,
           t_header = NULL)
data |
Either a data.frame or a list of data matrices of length R. |
depvar_header |
A character string specifying the header of the dependent variable. See Details. |
i_header |
A character string specifying the header of the block identifier. See Details. |
j_header |
A character string specifying the header of the individual identifier. See Details. |
t_header |
A character string specifying the header of the time identifier. See Details. |
See Details of GCC().
A list of data matrices of length R.
panel <- UKhouse # load the data
Y_list <- check_data(panel, depvar_header = "dlPrice", i_header = "Region",
                     j_header = "LPA_Type", t_header = "Date")
Select an optimal bandwidth parameter and apply the dependent wild bootstrap with Bartlett kernel to obtain the resampled time series.
dwBS(y)
y |
A matrix of time series to be resampled. |
A matrix of resampled time series.
Shao, X., 2010. The dependent wild bootstrap. Journal of the American Statistical Association, 105(489), pp.218-235.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
G_star <- dwBS(est_multi$G)
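The idea of the dependent wild bootstrap of Shao (2010) can be sketched in base R as follows: deviations from the sample mean are multiplied by auxiliary weights whose covariance is given by the Bartlett kernel at scaled lags. The function name dwb_sketch is hypothetical, the bandwidth l is supplied by hand rather than selected automatically, and applying the same weights to every column is a choice made here; none of this is taken from the package source.

```r
# Hypothetical sketch of a dependent wild bootstrap draw with Bartlett weights.
dwb_sketch <- function(y, l) {
  y <- as.matrix(y)
  n <- nrow(y)
  # Covariance of the auxiliary weights: Bartlett kernel at lag / l.
  Sigma <- 1 - abs(outer(seq_len(n), seq_len(n), "-")) / l
  Sigma[Sigma < 0] <- 0
  # Symmetric square root via eigendecomposition, then W ~ N(0, Sigma).
  eig <- eigen(Sigma, symmetric = TRUE)
  root <- eig$vectors %*% (sqrt(pmax(eig$values, 0)) * t(eig$vectors))
  W <- drop(root %*% rnorm(n))
  # Perturb deviations from the column means; W[t] multiplies row t.
  y_bar <- colMeans(y)
  centred <- sweep(y, 2, y_bar)
  sweep(centred * W, 2, y_bar, "+")
}

set.seed(42)
y <- matrix(rnorm(60), nrow = 30, ncol = 2)  # toy 30 x 2 series
y_star <- dwb_sketch(y, l = 4)               # l = 4 is an arbitrary bandwidth
dim(y_star)
```

Because neighbouring weights are positively correlated, the resampled series retains some of the serial dependence of the original one.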
This is one of the main functions of the package. It applies the
generalised canonical correlation estimation to obtain the global factors
and, when not explicitly provided, the number of global factors r0.
Typically, this function is intended for internal use. However, users can
call GCC() instead of multilevel() if they only need to estimate the
number of global factors.
GCC(data, standarise = TRUE, r_max = 10, r0 = NULL, ri = NULL,
    depvar_header = NULL, i_header = NULL, j_header = NULL, t_header = NULL)
data |
Either a data.frame or a list of data matrices of length R. |
standarise |
A logical indicating whether the data is standardised before estimation or not. See Details. |
r_max |
An integer indicating the maximum number of factors allowed. See Details. |
r0 |
An integer of the number of global factors. See Details. |
ri |
An array of length R containing the number of local factors for each block. See Details. |
depvar_header |
A character string specifying the header of the dependent variable. See Details. |
i_header |
A character string specifying the header of the block identifier. See Details. |
j_header |
A character string specifying the header of the individual identifier. See Details. |
t_header |
A character string specifying the header of the time identifier. See Details. |
The user-supplied data.frame should contain at least four columns, namely the
dependent variable, the block identifier (i), the individual identifier (j),
and time (t). The user needs to supply their corresponding
headers in the data.frame to the function using the parameters "depvar_header",
"i_header", "j_header", and "t_header", respectively. If the data is supplied
as a list, these arguments will not be used.
If either r0 = NULL or ri = NULL, both of them will be estimated. In that case, "r_max" must be supplied. If both "r0" and "ri" are supplied, "r_max" is not needed and will be ignored.
If standarise = TRUE, each time series will be standardised so it has zero mean and unit variance. It is recommended to standardise the data before estimation.
See Lin and Shin (2023) for more details.
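As a rough illustration of what standardising each time series means, the base R function scale() centres every column and divides it by its sample standard deviation. This is a sketch on toy data; whether the package uses the same (n - 1) denominator as scale() is an assumption.

```r
set.seed(1)
# Toy block: 100 time periods (rows) by 5 individuals (columns).
Y <- matrix(rnorm(100 * 5, mean = 3, sd = 2), nrow = 100)

# Centre each column and rescale it to unit sample variance.
Y_std <- scale(Y, center = TRUE, scale = TRUE)

round(colMeans(Y_std), 10)  # column means after standardisation
apply(Y_std, 2, sd)         # column standard deviations after standardisation
```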
A list containing the estimated number of global factors (r0),
the global factors (G), and the other elements that are
used in multilevel().
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
Y_list <- panel2list(panel, depvar_header = "dlPrice", i_header = "Region",
                     j_header = "LPA_Type", t_header = "Date")
est_GCC <- GCC(Y_list, r_max = 10)
r0_hat <- est_GCC$r0 # number of global factors
G_hat <- est_GCC$G # global factors
Automatic bandwidth selection of Andrews (1991) using Bartlett kernel.
get_bw(y)
y |
A matrix of time series. |
A numeric.
Andrews, D.W., 1991. Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica: Journal of the Econometric Society, pp.817-858.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
lT_G <- get_bw(est_multi$G)
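For a single series, the Andrews (1991) plug-in bandwidth for the Bartlett kernel based on an AR(1) approximation can be sketched as below. The function name andrews_bw_sketch is hypothetical and the sketch is univariate; how get_bw() aggregates across the columns of a matrix is not shown in this documentation and is not reproduced here.

```r
# Hypothetical univariate sketch of the Andrews (1991) Bartlett bandwidth.
andrews_bw_sketch <- function(y) {
  y <- as.numeric(y)
  y <- y - mean(y)
  n <- length(y)
  # AR(1) coefficient by least squares on the demeaned series.
  rho <- sum(y[-1] * y[-n]) / sum(y[-n]^2)
  # Plug-in constant alpha(1) implied by the AR(1) approximation.
  alpha1 <- 4 * rho^2 / ((1 - rho)^2 * (1 + rho)^2)
  # Bartlett-kernel rate: 1.1447 * (alpha(1) * n)^(1/3).
  1.1447 * (alpha1 * n)^(1 / 3)
}

set.seed(7)
y <- as.numeric(arima.sim(list(ar = 0.5), n = 200))  # toy AR(1) series
bw <- andrews_bw_sketch(y)
bw
```

Stronger serial correlation (a larger AR(1) coefficient in absolute value) produces a larger bandwidth.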
This function performs model selection for the (2D) approximate factor model and returns the estimated number of factors.
infocrit(Y, method, r_max = 10)
Y |
A $T \times N$ data matrix. |
method |
A character string indicating which criterion to use. |
r_max |
An integer indicating the maximum number of factors allowed. 10 by default. |
"method" can be one of the following: "ICp2" and "BIC3" by Bai and Ng (2002), "ER" by Ahn and Horenstein (2013), "ED" by Onatski (2010).
The estimated number of factors.
Bai, J. and Ng, S., 2002. Determining the number of factors in approximate factor models. Econometrica, 70(1), pp.191-221.
Ahn, S.C. and Horenstein, A.R., 2013. Eigenvalue ratio test for the number of factors. Econometrica, 81(3), pp.1203-1227.
Onatski, A., 2010. Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics, 92(4), pp.1004-1016.
# simulate data
T <- 100
N <- 50
r <- 2
F <- matrix(stats::rnorm(T * r, 0, 1), nrow = T)
Lambda <- matrix(stats::rnorm(N * r, 0, 1), nrow = N)
err <- matrix(stats::rnorm(T * N, 0, 1), nrow = T)
Y <- F %*% t(Lambda) + err
# estimation
r_hat <- infocrit(Y, "BIC3", r_max = 10)
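To give a feel for how one of these criteria works, the eigenvalue ratio ("ER") idea of Ahn and Horenstein (2013) picks the number of factors at which the ratio of adjacent eigenvalues is largest. The sketch below is a simplified base R version with a hypothetical name, er_sketch; it omits the mock eigenvalue that allows zero factors and is not the package's infocrit() implementation.

```r
# Hypothetical sketch of the eigenvalue ratio estimator.
er_sketch <- function(Y, r_max = 10) {
  # Eigenvalues of Y'Y / (NT), returned in descending order.
  ev <- eigen(crossprod(Y) / (nrow(Y) * ncol(Y)), symmetric = TRUE,
              only.values = TRUE)$values
  # Ratio of the k-th to the (k + 1)-th eigenvalue for k = 1, ..., r_max.
  ratios <- ev[seq_len(r_max)] / ev[seq_len(r_max) + 1]
  which.max(ratios)
}

set.seed(2024)
n_t <- 100; n_n <- 50; r <- 2
Fmat <- matrix(rnorm(n_t * r), nrow = n_t)   # toy factors
Lam <- matrix(rnorm(n_n * r), nrow = n_n)    # toy loadings
Y <- Fmat %*% t(Lam) + matrix(rnorm(n_t * n_n), nrow = n_t)
r_hat <- er_sketch(Y, r_max = 10)
r_hat
```

The sketch requires r_max + 1 to be no larger than the number of columns of Y.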
This is one of the main functions of this package which performs full estimation of the multilevel factor model.
multilevel(data, ic = "BIC3", standarise = TRUE, r_max = 10, r0 = NULL,
           ri = NULL, depvar_header = NULL, i_header = NULL, j_header = NULL,
           t_header = NULL)
data |
Either a data.frame or a list of data matrices of length R. |
ic |
A character string of selection criteria to use for estimation of the numbers of local factors. See Details. |
standarise |
A logical indicating whether the data is standardised before estimation or not. See Details. |
r_max |
An integer indicating the maximum number of factors allowed. See Details. |
r0 |
An integer of the number of global factors. See Details. |
ri |
An array of length R containing the number of local factors for each block. See Details. |
depvar_header |
A character string specifying the header of the dependent variable. See Details. |
i_header |
A character string specifying the header of the block identifier. See Details. |
j_header |
A character string specifying the header of the individual identifier. See Details. |
t_header |
A character string specifying the header of the time identifier. See Details. |
The user-supplied data.frame should contain at least four columns, namely the
dependent variable, the block identifier (i), the individual identifier (j),
and time (t). The user needs to supply their corresponding
headers in the data.frame to the function using the parameters "depvar_header",
"i_header", "j_header", and "t_header", respectively. If the data is supplied
as a list, these arguments will not be used.
If either r0 = NULL or ri = NULL, both of them will be estimated. In that case, "r_max" must be supplied. If both "r0" and "ri" are supplied, "r_max" is not needed and will be ignored.
If standarise = TRUE, each time series will be standardised so it has zero mean and unit variance. It is recommended to standardise the data before estimation.
See Lin and Shin (2023) for more details.
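For orientation, the multilevel factor model can be written along the following lines. The notation is an assumption chosen here to match the element names G, Gamma, F, Lambda and Resid used by this package; Lin and Shin (2023) give the exact specification and assumptions.

```latex
y_{ijt} = \gamma_{ij}' G_t + \lambda_{ij}' F_{it} + e_{ijt},
\qquad i = 1, \dots, R, \quad j = 1, \dots, N_i, \quad t = 1, \dots, T,
```

Here $G_t$ collects the $r_0$ global factors shared by all blocks, $F_{it}$ collects the $r_i$ local factors specific to block $i$, $\gamma_{ij}' G_t$ is the global component, $\lambda_{ij}' F_{it}$ is the local component, and $e_{ijt}$ is the idiosyncratic error.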
The return value is an S3 object of class "multi_result". It contains a list of the following items:
G = A matrix of the estimated global factors.
Gamma = A list of length R containing the estimated global loading matrices for each block.
F = A list of length R containing the estimated local factors for each block.
Lambda = A list of length R containing the estimated local loading matrices for each block.
N = The total number of cross-sections in the panel.
Ni = An array of length R containing the number of cross-sections in each block.
r0 = The number of global factors. Unchanged if pre-specified.
ri = An array of length R containing the number of local factors for each block. Unchanged if pre-specified.
d = An array of length R containing the maximum total number of factors allowed for each block.
The elements are identically equal to r_max if either r0 or ri is supplied as NULL.
Resid = A list of length R containing the residual matrices for each block.
delta2 = An array of the mock and the largest squared singular values.
ic = Selection criteria used for estimating the numbers of local factors.
block_names = An array of block names.
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.
panel <- UKhouse # load the data
# use data.frame
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
# or one can use a list of data matrices
Y_list <- panel2list(panel, depvar_header = "dlPrice", i_header = "Region",
                     j_header = "LPA_Type", t_header = "Date")
est_multi <- multilevel(Y_list, ic = "BIC3", standarise = TRUE, r_max = 5)
This function converts the data.frame to a list of data matrices and finds the dimensions of the multilevel panel.
panel2list(panel, depvar_header = NULL, i_header = NULL, j_header = NULL,
           t_header = NULL)
panel |
The user-supplied data frame for the multilevel panel data. See Details. |
depvar_header |
A character string specifying the header of the dependent variable. See Details. |
i_header |
A character string specifying the header of the block identifier. See Details. |
j_header |
A character string specifying the header of the individual identifier. See Details. |
t_header |
A character string specifying the header of the time identifier. See Details. |
See the details of GCC().
A list containing the data matrices of the blocks. Each of them
has dimension $T \times N_i$.
panel <- UKhouse # load the data
# panel$Region identifies different blocks i=1,...,R.
# panel$LPA_Type identifies different individuals j=1,...,N_i.
Y_list <- panel2list(panel, depvar_header = "dlPrice", i_header = "Region",
                     j_header = "LPA_Type", t_header = "Date")
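The conversion from a long data.frame to one matrix per block can be sketched in base R as below. The function panel_to_list_sketch and the toy data are hypothetical, the sketch assumes a balanced panel, and it does not reproduce the validity checks or dimension bookkeeping of panel2list().

```r
# Toy long-format panel: 4 dates, 3 individuals, 2 blocks.
toy <- expand.grid(Date = 1:4, LPA_Type = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
toy$Region <- ifelse(toy$LPA_Type == "c", "South", "North")
set.seed(1)
toy$dlPrice <- rnorm(nrow(toy))

# Hypothetical helper: one (time x individual) matrix per block.
panel_to_list_sketch <- function(df, depvar, i_col, j_col, t_col) {
  blocks <- split(df, df[[i_col]])
  lapply(blocks, function(b) {
    # Sort by individual, then time, so each column is one series.
    b <- b[order(b[[j_col]], b[[t_col]]), ]
    times <- sort(unique(b[[t_col]]))
    units <- unique(b[[j_col]])
    matrix(b[[depvar]], nrow = length(times), ncol = length(units),
           dimnames = list(as.character(times), as.character(units)))
  })
}

Y_list_toy <- panel_to_list_sketch(toy, "dlPrice", "Region", "LPA_Type", "Date")
sapply(Y_list_toy, dim)
```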
Perform PC estimation of the (2D) approximate factor model:
$$y_{it} = \lambda_i' F_t + e_{it},$$
or in matrix notation:
$$Y = F \Lambda' + e.$$
The factors $F$ are estimated as $\sqrt{T}$ times the $r$ eigenvectors of
the matrix $YY'$ corresponding to the $r$ largest eigenvalues in descending
order, and the loading matrix is estimated by $\hat{\Lambda} = T^{-1} Y' \hat{F}$.
See e.g. Bai and Ng (2002).
PC(Y, r)
Y |
A $T \times N$ data matrix. |
r |
An integer giving the number of factors. |
A list containing the factors and factor loadings:
factor = a matrix of the estimated factors.
loading = a matrix of the estimated factor loadings.
Bai, J. and Ng, S., 2002. Determining the number of factors in approximate factor models. Econometrica, 70(1), pp.191-221.
# simulate data
T <- 100
N <- 50
r <- 2
F <- matrix(stats::rnorm(T * r, 0, 1), nrow = T)
Lambda <- matrix(stats::rnorm(N * r, 0, 1), nrow = N)
err <- matrix(stats::rnorm(T * N, 0, 1), nrow = T)
Y <- F %*% t(Lambda) + err
# estimation
est_PC <- PC(Y, r)
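The principal component estimator in the style of Bai and Ng (2002) can be written in a few lines of base R: take the leading eigenvectors of the T x T cross-product matrix, scale them by the square root of T, and project the data on them. The function pc_sketch is a hypothetical illustration under that normalisation and is not the package's PC() source.

```r
# Hypothetical sketch of PC estimation for a T x N data matrix.
pc_sketch <- function(Y, r) {
  n_t <- nrow(Y)
  # Eigenvectors of YY' ordered by decreasing eigenvalue.
  eig <- eigen(tcrossprod(Y), symmetric = TRUE)
  # Factors: sqrt(T) times the r leading eigenvectors, so F'F / T = I.
  F_hat <- sqrt(n_t) * eig$vectors[, seq_len(r), drop = FALSE]
  # Loadings: Y'F / T.
  Lambda_hat <- crossprod(Y, F_hat) / n_t
  list(factor = F_hat, loading = Lambda_hat)
}

set.seed(1)
n_t <- 100; n_n <- 50; r <- 2
F0 <- matrix(rnorm(n_t * r), nrow = n_t)       # toy factors
Lambda0 <- matrix(rnorm(n_n * r), nrow = n_n)  # toy loadings
Y <- F0 %*% t(Lambda0) + matrix(rnorm(n_t * n_n), nrow = n_t)
est <- pc_sketch(Y, r)
dim(est$factor)
dim(est$loading)
```

The estimated factors are identified only up to rotation, so they should be compared with the true factors through the fitted common component rather than column by column.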
Print the relative importance ratios
## S3 method for class 'multi_result'
summary(object, ...)
object |
An S3 object of class 'multi_result' created by multilevel(). |
... |
Additional arguments. |
A matrix containing the summary of the model.
panel <- UKhouse # load the data
est_multi <- multilevel(panel, ic = "BIC3", standarise = TRUE, r_max = 5,
                        depvar_header = "dlPrice", i_header = "Region",
                        j_header = "LPA_Type", t_header = "Date")
summary(est_multi)
A data.frame containing the quarterly (mean) house prices of four different types of properties (detached, semi-detached, terraced and flats/maisonettes) for 331 local planning authorities (LPA) over the period 1996Q1 to 2021Q2. See also Lin and Shin (2023).
UKhouse
## 'UKhouse'
Each LPA belongs to one of the ten regions: North East (NE), North West (NW),
Yorkshire and the Humber (YH), East Midlands (EM), West Midlands (WM),
East of England (EE), London (LD), South East (SE), South West (SW) and Wales (WA).
The real house price growth of the j-th LPA-type pair in region i is obtained
by deflating the nominal house price by the CPI and log-differencing it.
After removing the series with missing observations, the dataset forms a balanced panel.
Columns in the dataset:
"Date" Time variable.
"Region" Name of region which the LPA belongs to.
"LPA" Name of the LPA.
"Type" Name of the house type.
"LPA_Type" Name of the LPA-type pair.
Office for National Statistics (ONS), ONS website, statistical bulletin, House price statistics for small areas in England and Wales: year ending June 2021
Lin, R. and Shin, Y., 2022. Generalised Canonical Correlation Estimation of the Multilevel Factor Model. Available at SSRN 4295429.