Essays on mixed-frequency and causality analysis in the frequency domain

  1. Guollo Taufemback, Cleiton
Supervised by:
  1. Carlos Velasco Gómez (Supervisor)

Defending university: Universidad Carlos III de Madrid

Defense date: 14 September 2018

Committee:
  1. Uwe Hassler (Chair)
  2. Josu Arteche González (Secretary)
  3. Jesús Gonzalo Muñoz (Member)

Type: Thesis

Abstract

In this thesis, we present three studies on mixed-frequency datasets and/or causality. In a thorough review of mixed-frequency data (MFD) techniques, Foroni and Marcellino (2013) state that the predominant approach to dealing with MFD is still to discard intermediate data, also called temporal aggregation, despite advances in the literature, e.g., Mariano and Murasawa (2010); Angelini et al. (2006); Banbura and Modugno (2014); Ghysels et al. (2004, 2007); Bai et al. (2013). Here, we take temporal aggregation to cover both systematic sampling and non-overlapping aggregation. The first procedure skips the observations of a stock variable x(t) that do not coincide in time with the available observations of y(t); the second sums the values of a flow variable accumulated up to the release date. See Granger and Siklos (1995) and Hassler (2011) for an extended discussion of the definition of temporal aggregation. In Chapter 1, using semiparametric and nonparametric linear models, we aim to establish that the subsampling of the dependent variable, y(t), may affect the consistency of least squares linear estimators. The key point of our analysis is to show the impact of aliasing when the model allows for frequency-dependent coefficients (FDC). Note that, with a single coefficient common to all frequencies, the loss of information affects only the efficiency of the estimator, not its consistency. Assuming that the data-generating process (DGP) has coefficients that vary across frequencies, the two main results of this chapter are: (1) regressing a low-frequency series on a downsampled exogenous series results in inconsistent estimators or in the inability to recover all coefficients; (2) a newly proposed method, based on band spectrum regression, can consistently recover the distinct frequency coefficients. Geweke (1982), Hosoya (1991), and more recently Breitung and Candelon (2006) have proposed methods based on VAR models to measure pointwise causality in the frequency domain.
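The two temporal aggregation schemes described for Chapter 1 can be illustrated with a minimal sketch. The function names, the aggregation ratio k = 3 (for example, monthly to quarterly) and the toy series are assumptions made for illustration only; they are not taken from the thesis. The last lines show the aliasing effect in its simplest form: a cycle of period 3 observed every third period is indistinguishable from a constant.

```python
import math

def systematic_sample(x, k):
    """Stock variable: keep only the last observation of each block of k."""
    return x[k - 1::k]

def aggregate_non_overlapping(x, k):
    """Flow variable: sum each complete, non-overlapping block of k observations."""
    n_blocks = len(x) // k
    return [sum(x[i * k:(i + 1) * k]) for i in range(n_blocks)]

# Hypothetical high-frequency series: 12 observations, values 1..12.
high_freq = list(range(1, 13))
sampled = systematic_sample(high_freq, 3)             # [3, 6, 9, 12]
aggregated = aggregate_non_overlapping(high_freq, 3)  # [6, 15, 24, 33]

# Aliasing: a cosine with period 3, sampled every 3 periods, looks constant.
cycle = [math.cos(2 * math.pi * t / 3) for t in range(1, 13)]
aliased = systematic_sample(cycle, 3)  # every value is (numerically) 1.0
```

Both schemes discard the within-period variation of the high-frequency series, which is the information loss that the abstract links to efficiency and, under frequency-dependent coefficients, to consistency.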
In Chapter 2, we present a nonparametric alternative to these methods that is not only model-free but also robust to a wide range of dependence structures in the series. Our method consists of two steps: the first is performed globally in the time domain and the second locally in the frequency domain. In the first step, we project both the endogenous and the exogenous variables, y(t) and x(t−1), on past values of the endogenous variable, filtering out any feedback causality between them. We assume that the number of lags increases with the number of observations, but slowly. In the second step, we locally regress the filtered series frequency by frequency, thus allowing the relationship between the variables to differ at each frequency. Our test statistic is flexible in the sense that it can be used to infer causality between any two series, wx(λ)->wy(λ); conditional causality, wx(λ)->wy(λ)|wz(λ); and multivariate causality, {wx1(λ);...; wxp(λ)}->wy(λ)|wz(λ). Under the null, the test statistics for the first and second hypotheses converge, at every frequency, to a χ2(1) distribution, and the statistic for the third hypothesis converges to a χ2(p) distribution. Furthermore, whereas parametric models rely on information criteria, such as AIC and BIC, to determine the underlying model, nonparametric models rely on the selection of appropriate bandwidth values. We suggest ‘leave-one-out’ cross-validation as a data-driven bandwidth selection method. In addition to its statistical novelty, our method has some attractive features: it is not tied to a specific kind of model, nor does it require a heteroskedasticity and autocorrelation consistent variance estimator for series with conditional variance. Furthermore, compared with the Breitung and Candelon (2006) test, our nonparametric method shows higher overall power and good size performance.
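The idea of leave-one-out cross-validation for bandwidth selection can be sketched as follows. This is a generic illustration, not the estimator or the criterion used in the thesis: the Nadaraya-Watson smoother, the Gaussian kernel, the candidate grid, the simulated data and all function names are assumptions. Here xs plays the role of a grid of frequencies and ys of a noisy quantity to be smoothed across them.

```python
import math
import random

def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 (Gaussian kernel, bandwidth h).

    The observation with index `skip` is left out of the fit.
    """
    num = den = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = math.exp(-0.5 * ((x0 - xi) / h) ** 2)
        num += w * yi
        den += w
    return num / den if den > 0 else float("nan")

def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        pred = nw_estimate(xs[i], xs, ys, h, skip=i)
        if math.isnan(pred):
            return math.inf  # bandwidth too small to form a prediction
        total += (ys[i] - pred) ** 2
    return total / len(xs)

def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))

# Hypothetical data: a smooth curve over [0, 2*pi) observed with noise.
rng = random.Random(0)
xs = [2 * math.pi * i / 60 for i in range(60)]
ys = [math.sin(x) + 0.2 * rng.gauss(0, 1) for x in xs]

candidates = [0.1, 0.3, 1.0, 5.0]
best_h = select_bandwidth(xs, ys, candidates)
```

Each observation is predicted from all the others, so a bandwidth that oversmooths (large h) or chases the noise (small h) is penalized by its out-of-sample error, without any model-based information criterion.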
Finally, in Chapter 3, we propose two nonparametric causality testing approaches based on a two-sided unrestricted distributed-lag model, i.e., a distributed-lag model with an infinite number of leads and lags. The first method is based on least squares (LS) and the second on the Hannan-Inefficient (HI) estimator; see Hannan (1963). In both cases, using the lag coefficients, we can infer the absence of causality from past values of x(t) to y(s;t), and using the lead coefficients, from past values of y(s;t) to x(t), as in Sims (1972). We also investigate causality at a specific lead or lag. For coefficient estimation purposes, we assume that the number of leads and lags, M, increases with n, but slowly. The so-called ‘Hannan-Inefficient’ estimator of Hannan (1963) was given its name by Sims (1973), in contrast to the fully efficient estimator also proposed in Hannan (1963). Sims argues that in the univariate case this procedure is asymptotically equivalent to GLS. According to Sims (1973), the advantage of the HI estimator over LS regression shows up whenever a large number of lags must be estimated. Furthermore, seasonal adjustments are easy to handle, and the estimator automatically accounts for serial correlation in the residuals. See Amemiya and Fuller (1967), Hannan (1967), and Wahba (1969) for further studies of the HI estimator. In summary, we propose two novel nonparametric causality tests for mixed-frequency datasets. Under a two-sided unrestricted distributed-lag model, we propose a least squares approach and a Hannan-Inefficient estimator approach. We assume that the low-frequency (LF) and high-frequency variables are generated at the same frequency, but that some observations of the LF variable are systematically missing. Causality tests based on mixed-frequency VARs, Ghysels et al. (2016) and Gotz et al. (2016), assume a mismatch in the generation process, which implies that models for regular datasets cannot be employed.
Furthermore, in contrast to their methods, our proposed test does not suffer from parameter proliferation.
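The regression structure behind the least squares approach of Chapter 3 can be sketched as below. This is only an illustration under stated assumptions, not the thesis's estimator or test statistic: the truncation at M = 2, the simulated data (y depends only on the first lag of x), the observation pattern (y observed every third period), and all function names are hypothetical, and the HI estimator is not implemented. The regression is y(t) = sum over j = -M..M of b_j x(t-j) + u(t), where j > 0 indexes lags and j < 0 indexes leads of x.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * beta[c] for c in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta

def two_sided_ls(y, x, m_lags, observed_every=1):
    """LS fit of y[t] on x[t-j], j = -M..M, using only periods where y is observed.

    Returns a dict {j: b_j}; j > 0 are lags of x, j < 0 are leads of x.
    """
    js = list(range(-m_lags, m_lags + 1))
    ts = [t for t in range(m_lags, len(x) - m_lags) if t % observed_every == 0]
    rows = [[x[t - j] for j in js] for t in ts]
    targets = [y[t] for t in ts]
    k = len(js)
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yt for r, yt in zip(rows, targets)) for p in range(k)]
    return dict(zip(js, solve_linear_system(xtx, xty)))

# Hypothetical DGP: x causes y through its first lag only; y does not cause x.
rng = random.Random(0)
n = 3000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0] + [0.8 * x[t - 1] + 0.3 * rng.gauss(0, 1) for t in range(1, n)]

# y is treated as observed only every third period (systematically missing otherwise).
coef = two_sided_ls(y, x, m_lags=2, observed_every=3)
```

In this simulated example the estimated coefficient at lag 1 should be close to 0.8 while the lead coefficients should be close to zero, which is the pattern one would read, following Sims (1972), as causality from x to y and no causality from y to x. Note that the number of regressors is 2M + 1 regardless of how many low-frequency observations are missing.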