## Abstract

The identification of effective connectivity from time-series data such as electroencephalogram (EEG) or time-resolved functional magnetic resonance imaging (fMRI) recordings is an important problem in brain imaging. One commonly used approach to inferring effective connectivity is based on vector autoregressive models and the concept of Granger causality. However, this probabilistic concept of causality can lead to spurious causalities in the presence of latent variables. Recently, graphical models have been used to discuss problems of causal inference for multivariate data. In this paper, we extend these concepts to the case of time-series and present a graphical approach for discussing Granger-causal relationships among multiple time-series. In particular, we propose a new graphical representation that characterizes spurious causality and can therefore be used to investigate it. The method is demonstrated with concurrent EEG and fMRI recordings which are used to investigate the interrelations between the alpha rhythm in the EEG and blood oxygenation level dependent (BOLD) responses in the fMRI. The results confirm previous findings on the location of the source of the EEG alpha rhythm.

## 1. Introduction

Recent studies in functional brain imaging have focused on the investigation of interactions between brain areas that are activated during certain tasks. Of particular interest is the determination of directed interactions and, hence, of the *effective connectivity*, which is defined as the influence one region exerts over another (Büchel & Friston 2000).

Several approaches have been suggested for modelling and estimating effective connectivity: structural equation modelling (Büchel & Friston 1997; McIntosh & Gonzales-Lima 1997), nonlinear dynamic causal models (Friston *et al*. 2003) and vector autoregressive modelling (Goebel *et al*. 2003; Harrison *et al*. 2003). The latter approach explicitly makes use of the temporal information in time-resolved fMRI measurements and is closely related to the concept of Granger causality (Goebel *et al*. 2003). This probabilistic concept of causality introduced by Granger (1969) is based on the idea that causes always precede their effects and can be formulated in terms of predictability.

In general, inferences about effective connectivity rely on the assumption that the underlying dependence structure of the system is correctly specified by the statistical model employed for the analysis. In the case of Granger causality and vector autoregressive modelling, it is well known that omission of relevant factors can lead to temporal correlations between the observed components that are falsely detected as causal relationships. Such so-called spurious causalities (Hsiao 1982) are particularly a problem in brain imaging applications where typically only a small number of the potential factors are included in an analysis. This raises the questions of whether and how effective connectivity of neural systems can be determined by fitting vector autoregressive models.

In this paper, we address this question using a new graphical approach for the investigation of dependence structures in multivariate dynamic systems. This approach is motivated by recent work on causal inference (Pearl 2000; Lauritzen 2001) that is based on graphical models. The idea of graphical modelling is to represent possible dependencies among the variables of a multivariate distribution in a graph. Such graphs can be easily visualized and allow an intuitive understanding of complex dependence structures. For an introduction to graphical models we refer to the monographs by Whittaker (1990), Cox & Wermuth (1996), Lauritzen (1996) and Edwards (2000).

In time-series analysis, there exist various approaches for defining graphical models; a partial overview can be found in Dahlhaus & Eichler (2003). Here, we focus on the approach by Eichler (2001, 2002), who linked the notions of graphical modelling and Granger causality to describe the dynamic dependencies in vector autoregressions. In §2, we review the concept of path diagrams associated with vector autoregressions and multivariate Granger causality. Furthermore, we introduce a similar graphical representation of the bivariate Granger-causal relationships among multiple time-series. In §3, we discuss the properties of such multivariate and bivariate path diagrams and their relation to each other. The results suggest using a combination of both representations for the description of dependencies in multiple time-series. These general path diagrams and the implications for causal inference from vector autoregressions are discussed in §4, and an application for simultaneous recordings from EEG and time-resolved fMRI is provided in §5.

## 2. Granger causality and autoregressive modelling

### (a) Multivariate Granger causality

The concept of Granger causality is a fundamental tool for the empirical investigation of dynamic interactions in multivariate time-series. This probabilistic concept of causality is based on the common sense conception that causes always precede their effects. Thus, an event taking place in the future cannot cause another event in the past or present. This temporal ordering implies that the past and present values of a series *X* that influences another series *Y* should help to predict future values of this latter series *Y*. Furthermore, the improvement in the prediction of future values of *Y* should persist after any other relevant information for the prediction has been exploited. Suppose that the vector time-series *Z* comprises all variables that might affect the dependence between *X* and *Y*, such as confounding variables. Then we say that a series *X* Granger causes another series *Y* with respect to the information given by the series (*X*, *Y*, *Z*) if the value of *Y*(*t*+1) can be better predicted by using the entire information available at time *t* than by using the same information apart from the past and present values of *X*. Here, ‘better’ means a smaller variance of forecast error.

Because of the temporal ordering, it is clear that Granger causality can only capture functional relationships for which cause and effect are sufficiently separated in time. To describe causal dependencies between variables at the same time point, Granger (1969) proposed the notion of ‘instantaneous causality’. In general, it is not possible to attribute a unique direction to such ‘instantaneous causalities’ and we therefore will only speak of contemporaneous dependencies.

In practice, the use of Granger causality has mostly been restricted to the investigation of linear relationships. This notion of linear Granger causality is closely related to the important class of vector autoregressive processes. More precisely, let *X*_{V} be a zero-mean, weakly stationary vector time-series with components *X*_{v}, *v*∈*V*={1,…,*d*} and assume that *X*_{V} has an autoregressive representation of the form

$$X_V(t)=\sum_{u=1}^{\infty}\Phi(u)\,X_V(t-u)+\epsilon_V(t), \tag{2.1}$$

where the matrices *Φ*(*u*) are square summable and *ϵ*_{V}(*t*) is a white noise process with mean zero and non-singular covariance matrix *Σ*. Then, for any *b*∈*V*, the linear prediction of *X*_{b}(*t*+1) on the past and present values of *X*_{V} is given by

$$\hat{X}_b(t+1)=\sum_{u=1}^{\infty}\sum_{a\in V}\Phi_{ba}(u)\,X_a(t+1-u).$$

It follows that a component *X*_{a} is Granger-non-causal for *X*_{b} with respect to the complete series *X*_{V} if, and only if, *Φ*_{ba}(*u*)=0 for all lags *u*. Furthermore, if *Σ*_{ab}=0, the variables *X*_{a}(*t*+1) and *X*_{b}(*t*+1) are uncorrelated after removing the linear effects of *X*_{V}(*s*), *s*≤*t*, and we say that *X*_{a} and *X*_{b} are contemporaneously uncorrelated with respect to *X*_{V}.

From the definition of Granger causality in terms of the autoregressive parameters, it is clear that the notion of Granger causality depends on the multivariate time-series *X*_{V} available for the analysis. If, for example, we consider only a subset *X*_{s}, *s*∈*S*, of the series in *X*_{V}, the vector time-series *X*_{S} again has an autoregressive representation

$$X_S(t)=\sum_{u=1}^{\infty}\tilde{\Phi}(u)\,X_S(t-u)+\tilde{\epsilon}_S(t)$$

for some white noise series $\tilde{\epsilon}_S(t)$ with covariance matrix $\tilde{\Sigma}$, but the parameters $\tilde{\Phi}(u)$ and $\tilde{\Sigma}$ will generally be different from those in (2.1). For distinct *a*, *b*∈*S*, we say that *X*_{a} does not Granger-cause *X*_{b} with respect to *X*_{S} if $\tilde{\Phi}_{ba}(u)=0$ for all lags *u*. To illustrate this dependence on the analysed process *X*_{V}, we consider the following trivariate system. Let

$$\begin{aligned}
X_1(t)&=\alpha\,X_2(t-1)+\epsilon_1(t),\\
X_2(t)&=\beta\,X_3(t-1)+\epsilon_2(t),\\
X_3(t)&=\epsilon_3(t),
\end{aligned} \tag{2.2}$$

where *ϵ*_{v}(*t*), *v*=1, 2, 3 are independent and identically normally distributed with mean zero and variance *σ*^{2}. Then *X*_{3} Granger-causes *X*_{2}, which in turn Granger-causes *X*_{1}, whereas *X*_{3} is Granger-non-causal for *X*_{1} (all with respect to the full trivariate series *X*_{{1,2,3}}). On the other hand, in a bivariate autoregressive representation of *X*_{1} and *X*_{3}, we have

$$\begin{aligned}
X_1(t)&=\alpha\beta\,X_3(t-2)+\tilde{\epsilon}_1(t),\\
X_3(t)&=\epsilon_3(t),
\end{aligned}$$

with $\tilde{\epsilon}_1(t)=\epsilon_1(t)+\alpha\,\epsilon_2(t-1)$. This representation implies that *X*_{3} Granger-causes *X*_{1} with respect to the bivariate series *X*_{{1,3}}.
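The dependence on the information set can also be checked numerically. The following is a minimal sketch, not the estimation procedure of the paper: it simulates a chain in which *X*_{3} drives *X*_{2} and *X*_{2} drives *X*_{1}, and fits the equation for *X*_{1} by ordinary least squares with and without the past of *X*_{2}. The unit lags, the coefficient values `alpha = 0.8` and `beta = 0.5`, the lag order 2 and all function names are assumptions made for illustration.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients and residual variance of y on the regressors in rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)]
    return coef, sum(e * e for e in resid) / len(y)

def simulate_chain(n, alpha, beta, seed=0):
    """Assumed chain: X3 is white noise, X2(t) uses X3(t-1), X1(t) uses X2(t-1)."""
    rng = random.Random(seed)
    x1, x2, x3 = [0.0], [0.0], [rng.gauss(0, 1)]
    for t in range(1, n):
        x3.append(rng.gauss(0, 1))
        x2.append(beta * x3[t - 1] + rng.gauss(0, 1))
        x1.append(alpha * x2[t - 1] + rng.gauss(0, 1))
    return x1, x2, x3

def fit_x1(x1, x2, x3, use_x2, lags=2):
    """Regress X1(t) on `lags` past values of X1 and X3, plus X2 if use_x2 is set.
    Column order: X1 lags, then X3 lags, then (optionally) X2 lags."""
    series = [x1, x3] + ([x2] if use_x2 else [])
    rows, y = [], []
    for t in range(lags, len(x1)):
        rows.append([s[t - u] for s in series for u in range(1, lags + 1)])
        y.append(x1[t])
    return ols(rows, y)

alpha, beta = 0.8, 0.5
x1, x2, x3 = simulate_chain(4000, alpha, beta)
coef_bi, var_bi = fit_x1(x1, x2, x3, use_x2=False)   # information set {X1, X3}
coef_tri, var_tri = fit_x1(x1, x2, x3, use_x2=True)  # information set {X1, X2, X3}
print("X3(t-2) coefficient, bivariate: ", round(coef_bi[3], 3))
print("X3(t-2) coefficient, trivariate:", round(coef_tri[3], 3))
```

In the bivariate fit the coefficient of *X*_{3}(*t*−2) should be close to the product `alpha * beta`, while in the trivariate fit it should be close to zero and the forecast error variance should drop, in line with the discussion above.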

### (b) Path diagrams

The Granger-causal relationships in a vector autoregressive process *X*_{V} can be represented graphically by a path diagram in which vertices *v* correspond to the components *X*_{v}, while directed edges (→) between the vertices indicate Granger-causal influences. For a complete graphical description of the dependence structure of *X*_{V}, we also include undirected edges (---) to depict contemporaneous correlations between the corresponding components. This leads to the following definition of path diagrams associated with vector autoregressions (cf. Eichler 2002).

*Let X*_{V} *be a weakly stationary time-series with the autoregressive representation* *(2.1)*. *Then the path diagram associated with X*_{V} *is a graph G*=(*V*, *E*) *with vertex set V and edge set E such that for a*, *b*∈*V with a*≠*b*

(i) *a*→*b*∉*E* ⇔ *Φ*_{ba}(*u*)=0 for all *u*=1, 2,…;

(ii) *a*---*b*∉*E* ⇔ *Σ*_{ab}=0.

In other words, the path diagram *G* contains a directed edge *a*→*b* if, and only if, *X*_{a} Granger-causes *X*_{b} with respect to the full series *X*_{V}. Similarly, an undirected edge *a*---*b* is present in the path diagram if, and only if, *X*_{a} and *X*_{b} are contemporaneously correlated with respect to *X*_{V}.
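The definition translates directly into a small routine. The sketch below assumes a finite-order autoregression, so that the coefficients are given as a list of matrices `phis` for lags 1,…,*p*; the function name, the tolerance `tol` and the example coefficients are hypothetical.

```python
def path_diagram(phis, sigma, tol=1e-12):
    """Build the path diagram of a finite-order vector autoregression.

    phis  : list of d x d coefficient matrices Phi(1), ..., Phi(p), where
            phis[u][b][a] is the effect of X_a(t-u-1) on X_b(t)
    sigma : d x d covariance matrix of the noise
    Vertices are labelled 1..d. Returns (directed, undirected) edge sets.
    """
    d = len(sigma)
    # a -> b is present iff Phi_ba(u) is non-zero for at least one lag u
    directed = {(a + 1, b + 1)
                for a in range(d) for b in range(d)
                if a != b and any(abs(phi[b][a]) > tol for phi in phis)}
    # a --- b is present iff Sigma_ab is non-zero
    undirected = {frozenset((a + 1, b + 1))
                  for a in range(d) for b in range(a + 1, d)
                  if abs(sigma[a][b]) > tol}
    return directed, undirected

# Hypothetical first-order chain: X1 depends on X2(t-1), X2 on X3(t-1).
phi1 = [[0.0, 0.8, 0.0],
        [0.0, 0.0, 0.5],
        [0.0, 0.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
directed, undirected = path_diagram([phi1], identity)
print(sorted(directed))  # [(2, 1), (3, 2)]
print(undirected)        # set()
```

With estimated rather than known parameters, the exact zero test would have to be replaced by a statistical test; the tolerance here only guards against rounding.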

As an example, we consider the trivariate vector autoregression in equation (2.2). The associated path diagram is depicted in figure 1. Here the directed path from 3 to 1 via 2 reflects the fact that *X*_{3} Granger-causes *X*_{1} if *X*_{2} is not included in the analysis. A more complicated path diagram is obtained for the following trivariate system:

[Equation (2.3) is not recoverable from the source text.]

with *Σ*_{13}=*Σ*_{23}=0. The associated path diagram in figure 2 provides a concise summary of the interactions and shows for example a feedback relation between *X*_{2} and *X*_{3}. Note that the dependence of *X*_{3} on past values of itself is not depicted in the diagram. It follows from the results in §3 that inclusion of such self-loops *v*→*v* would not change the properties of the graph and we therefore omit them for the sake of simplicity.

### (c) Bivariate Granger causality

Much of the literature on Granger causality has been concerned with the analysis of relationships between two time-series (or two vector time-series) and, as a consequence, relationships among multiple time-series are still quite frequently investigated using bivariate Granger causality; examples involving EEG or fMRI signals can be found in Kamiński *et al*. (2001), Goebel *et al*. (2003) and Hesse *et al*. (2003).

For a better understanding of this approach and its relation to a full multivariate analysis based on multivariate Granger causality, we now introduce a graphical representation of the bivariate connectivity structure. Suppose that *X*_{V} is a weakly stationary process of the form (2.1). Then for *a*, *b*∈*V* the bivariate subprocess *X*_{{a,b}} is again a weakly stationary process and has an autoregressive representation:

$$X_{\{a,b\}}(t)=\sum_{u=1}^{\infty}\tilde{\Phi}(u)\,X_{\{a,b\}}(t-u)+\tilde{\epsilon}_{\{a,b\}}(t), \tag{2.4}$$

where $\tilde{\epsilon}_{\{a,b\}}(t)$ is a white noise process with covariance matrix $\tilde{\Sigma}$. From this representation, it follows that *X*_{a} is bivariately Granger-non-causal for *X*_{b} if, and only if, the coefficients $\tilde{\Phi}_{ba}(u)$ are zero for all lags *u*. This leads to the following definition of bivariate path diagrams that visualize the bivariate connectivities of *X*_{V}.

*Let X*_{V} *be a weakly stationary time-series of the form* *(2.1)*. *Then the bivariate path diagram associated with X*_{V} *is a graph G*=(*V*, *E*) *with vertex set V and edge set E such that for all a*, *b*∈*V with a*≠*b*

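(i) *a*⤍*b*∉*E* ⇔ $\tilde{\Phi}_{ba}(u)=0$ for all *u*=1, 2,…,

(ii) *a*---*b*∉*E* ⇔ $\tilde{\Sigma}_{ab}=0$,

*where* $\tilde{\Phi}_{ba}(u)$, *u*=1, 2,… *and* $\tilde{\Sigma}_{ab}$ *are the parameters in the autoregressive representation* (2.4) *of the bivariate subprocess X*_{{a,b}}.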

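As an example, we consider again the trivariate system in (2.2), writing *α* and *β* for the coefficients of the lagged terms in its first and second equation. As noted before, the bivariate autoregressive representation of *X*_{1} and *X*_{3} is given by

$$X_1(t)=\alpha\beta\,X_3(t-2)+\tilde{\epsilon}_1(t),\qquad X_3(t)=\epsilon_3(t),$$

with $\tilde{\epsilon}_1(t)=\epsilon_1(t)+\alpha\,\epsilon_2(t-1)$, which implies that *X*_{3} Granger-causes *X*_{1}. Furthermore, it can be shown that the bivariate representation of the subprocess *X*_{{1,2}} is given by

$$X_1(t)=\alpha\,X_2(t-1)+\epsilon_1(t),\qquad X_2(t)=\tilde{\epsilon}_2(t),$$

with $\tilde{\epsilon}_2(t)=\epsilon_2(t)+\beta\,\epsilon_3(t-1)$. Hence, *X*_{2} Granger-causes *X*_{1} in a bivariate analysis. Finally, the autoregressive representation of *X*_{{2,3}} is determined by the second and the third equation in (2.2) and, thus, *X*_{3} Granger-causes *X*_{2}. The bivariate Granger-causal relationships between the variables can be summarized by the path diagram in figure 3*b*.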

We note that in general the relation between the bivariate representations (2.4) and the multivariate representation (2.1) is more complicated than in this example and an analytic derivation of the bivariate representations would be very difficult to obtain.

## 3. Markov properties

The edges in the path diagrams discussed in this paper represent pairwise Granger-causal relationships with respect to either the complete process in the case of multivariate path diagrams or bivariate subprocesses in the case of path diagrams depicting the bivariate connectivity structure. The results in this section show that both types of path diagrams more generally provide sufficient conditions for Granger-causal relationships with respect to subseries *X*_{S} for any subset *S* of *V*. The proofs of the results are technical and therefore omitted; for multivariate path diagrams the proofs can be found in Eichler (2002), for bivariate path diagrams they can be obtained from the author upon request.

### (a) Multivariate path diagrams

We first review a path-oriented concept of separating subsets of vertices in a mixed graph that has been used to represent the Markov properties of linear structural equation systems (Spirtes *et al*. 1998; Koster 1999). Following Richardson (2003), we will call this notion of separation in mixed graphs *m*-separation.

Let *G*=(*V*, *E*) be a mixed graph with directed edges → and undirected edges ---. A *path π* between two vertices *a* and *b* in *G* is a sequence *π*=〈*e*_{1},…,*e*_{n}〉 of edges *e*_{i}∈*E* such that *e*_{i} is an edge between *v*_{i−1} and *v*_{i} for some sequence of vertices *v*_{0}=*a*, *v*_{1},…,*v*_{n}=*b*. We say that *a* and *b* are the endpoints of the path, while *v*_{1},…,*v*_{n−1} are the intermediate vertices on the path. Note that the vertices *v*_{i} in the sequence do not need to be distinct and, therefore, that paths may be self-intersecting.

An intermediate vertex *c* on a path *π* is said to be a collider on the path if the edges preceding and succeeding *c* on the path both have an arrowhead or a dashed tail at *c*, that is, →*c*←, →*c*---, ---*c*←, ---*c*---; otherwise the vertex *c* is said to be a non-collider on the path. A path *π* between vertices *a* and *b* is said to be *m*-connecting given a set *C* if

(i) every non-collider on the path is not in *C*, and

(ii) every collider on the path is in *C*,

otherwise we say the path is *m*-blocked given *C*. If all paths between *a* and *b* are *m*-blocked given *C*, then *a* and *b* are said to be *m*-separated given *C*. Similarly, sets *A* and *B* are said to be *m*-separated in *G* given *C* if for every pair *a*∈*A* and *b*∈*B*, *a* and *b* are *m*-separated given *C*.
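For a single given path, the two conditions can be checked mechanically. The sketch below uses a hypothetical encoding: a path is a list of triples `(u, kind, v)` written in the direction of travel, where `kind` is `'->'`, `'<-'` or `'--'`. It tests one path only; establishing *m*-separation requires that every path between the two vertices be blocked.

```python
def is_collider(prev_kind, next_kind):
    """Collider test at the vertex shared by two consecutive edges.

    Edge kinds are written in the direction of travel along the path:
    '->' (arrowhead ahead), '<-' (arrowhead behind) or '--' (undirected).
    """
    mark_in = prev_kind in ('->', '--')   # arrowhead or dashed end arriving at c
    mark_out = next_kind in ('<-', '--')  # arrowhead or dashed end leaving c
    return mark_in and mark_out

def m_connecting(path, given):
    """True if the path is m-connecting given the vertex set `given`."""
    for (_, k1, c), (c2, k2, _) in zip(path, path[1:]):
        if c != c2:
            raise ValueError("consecutive edges must share a vertex")
        # colliders must lie in `given`, non-colliders must lie outside it
        if is_collider(k1, k2) != (c in given):
            return False
    return True

# Path 3 -> 2 <- 4 -> 1: vertex 2 is a collider, vertex 4 a non-collider.
path = [(3, '->', 2), (2, '<-', 4), (4, '->', 1)]
print(m_connecting(path, set()))   # False
print(m_connecting(path, {2}))     # True
print(m_connecting(path, {2, 4}))  # False
```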

As an example consider the graph in figure 4. In this graph, vertices 1 and 4 are *m*-separated given *S*={3}. To show this, we examine all paths between the two vertices. First, we consider the path 4→3→1. Since 3 is a non-collider on this path, the path is *m*-blocked given {3}. Second, we note that every path that passes through vertex 2 contains this vertex as a collider. Two examples of such paths are given in figure 4*b*,*c*. Since 2 lies outside *S*={3}, these paths are *m*-blocked given *S*. The only path between 1 and 4 that does not pass through 2 is the path 4→3→1, which is also *m*-blocked given *S*. Hence, no path exists that is *m*-connecting given *S* and consequently 1 and 4 are *m*-separated given *S*.

Now suppose that *X*_{V} is a weakly stationary time-series with autoregressive representation (2.1) and let *G* be its associated multivariate path diagram. Furthermore, let *A*, *B* and *C* be disjoint subsets of *V*. Then, if *A* and *B* are *m*-separated given *C*, it can be shown (Eichler 2002, theorem 3.1) that *X*_{A}(*t*) and *X*_{B}(*t*+*u*) are uncorrelated at all lags *u* after removing the linear effects of the complete series *X*_{C}. For example, in the path diagram in figure 3*a* associated with the trivariate process in (2.2), vertices 1 and 3 are *m*-separated given *S*={2}. This implies that the components *X*_{1} and *X*_{3} are uncorrelated conditionally on *X*_{2} but not in a bivariate analysis.

On the other hand, we have shown in §2*a* that in a bivariate analysis *X*_{3} Granger-causes *X*_{1} whereas *X*_{1} is Granger-non-causal for *X*_{3} (as indicated by the bivariate path diagram in figure 3*b*). Obviously, the notion of *m*-separation is too strong for the derivation of Granger-non-causal relations from multivariate path diagrams: the definition of *m*-separation requires that all paths between vertices 1 and 3 are *m*-blocked whereas the path 3→2→1 is intuitively interpreted as a causal link from *X*_{3} to *X*_{1}. Consequently, the path should not be considered when discussing Granger causality from *X*_{1} to *X*_{3}.

The example suggests the following definition. A path *π* between vertices *a* and *b* is said to be *b*-pointing if it has an arrowhead at the endpoint *b*. More generally, a path *π* between *A* and *B* is said to be *B*-pointing if it is *b*-pointing for some *b*∈*B*. In order to establish Granger non-causality from *X*_{A} to *X*_{B}, it suffices to only consider all *B*-pointing paths between *A* and *B*.

Theorem 3.1. *Suppose X*_{V} *is a weakly stationary time-series with autoregressive representation* *(2.1)* *and let G*=(*V*, *E*) *be the path diagram associated with X*_{V}. *Furthermore*, *let A*, *B and C be disjoint subsets of V*. *If every B-pointing path between A and B is m-blocked given B*∪*C*, *then X*_{A} *is Granger-non-causal for X*_{B} *with respect to X*_{A∪B∪C}.

Similarly, a graphical condition for contemporaneous correlation can be obtained. Intuitively, two variables *X*_{a} and *X*_{b} are contemporaneously uncorrelated with respect to *X*_{S} if they are contemporaneously uncorrelated with respect to *X*_{V} and the variables are not jointly affected by past values of *X*_{V\S}. For a precise formulation of the conditions, we need the following definition. A path *π* between vertices *a* and *b* is said to be bi-pointing if it has an arrowhead at both endpoints *a* and *b*. Then the sufficient condition for the absence of contemporaneous correlation can be stated as follows.

Theorem 3.2. *Suppose X*_{V} *is a weakly stationary time-series with autoregressive representation* *(2.1)* *and let G*=(*V*, *E*) *be the path diagram associated with X*_{V}. *Furthermore*, *let A*, *B and C be disjoint subsets of V*. *If*

(i) *a*---*b*∉*E for all a*∈*A and b*∈*B*, *and*

(ii) *every bi-pointing path between A and B is m-blocked given A*∪*B*∪*C*,

*then X*_{A} *and X*_{B} *are contemporaneously uncorrelated with respect to X*_{A∪B∪C}.

As an example, consider the four-dimensional vector autoregressive process *X*_{V} with components

[Equation (3.1) is not recoverable from the source text.]

where *ϵ*_{i}(*t*), *i*=1,…,4 are uncorrelated white noise processes with mean zero and variance one. The path diagram associated with *X*_{V} is shown in figure 5*a*. In this graph, vertices 1 and 3 are connected by the path 3→2←4→1. On this path, vertex 2 is a collider, whereas vertex 4 is a non-collider. Thus, the path is *m*-connecting given any set that contains vertex 2 but not vertex 4, and *m*-blocked given any set that does not contain vertex 2. It follows from theorem 3.1 that *X*_{3} is Granger-non-causal for *X*_{1} in a bivariate analysis, whereas this cannot be concluded for a trivariate analysis including *X*_{2}. Furthermore, vertices 1 and 2 are connected by the bi-pointing path 1←4→2, which is *m*-blocked only given vertex 4. Therefore, theorems 3.1 and 3.2 do not rule out that *X*_{1} and *X*_{2} Granger-cause each other and, additionally, are contemporaneously correlated, regardless of whether *X*_{3} is included in the analysis or not. The Granger-causal relationships, with respect to *X*_{{1,2,3}}, that can be inferred from the path diagram in figure 5*a* can be summarized by the graph in figure 5*b*.
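Because paths may be self-intersecting, checking the conditions of theorems 3.1 and 3.2 by hand quickly becomes tedious. One possible way to automate the check is a breadth-first search over pairs (vertex, mark of the edge by which it was reached), since whether a vertex may be passed depends only on the marks of the two edges meeting there. The sketch below is an illustration under assumptions: the edge encoding (`'->'` solid directed, `'~>'` dashed directed, `'--'` undirected), the function names, and the edge list for figure 5*a*, which is inferred from the paths named in the text.

```python
from collections import deque

def half_edges(edges):
    """Map each vertex to a list of (mark here, neighbour, mark at neighbour)."""
    ends = {'->': ('tail', 'head'), '~>': ('dash', 'head'), '--': ('dash', 'dash')}
    adj = {}
    for a, kind, b in edges:
        ma, mb = ends[kind]
        adj.setdefault(a, []).append((ma, b, mb))
        adj.setdefault(b, []).append((mb, a, ma))
    return adj

def connecting_path_exists(edges, a, b, given, head_at_b=True, head_at_a=False):
    """Search for an m-connecting path from a to b given the set `given`.

    head_at_b restricts the search to b-pointing paths; setting head_at_a
    as well restricts it to bi-pointing paths.
    """
    adj = half_edges(edges)
    start = [(w, mw) for ma, w, mw in adj.get(a, []) if ma == 'head' or not head_at_a]
    seen, queue = set(start), deque(start)
    while queue:
        v, m_in = queue.popleft()
        if v == b and (m_in == 'head' or not head_at_b):
            return True
        for m_out, w, m_w in adj.get(v, []):
            collider = m_in != 'tail' and m_out != 'tail'
            # pass v only if (collider and in `given`) or (non-collider and outside)
            if collider == (v in given) and (w, m_w) not in seen:
                seen.add((w, m_w))
                queue.append((w, m_w))
    return False

def noncausal_implied(edges, A, B, C):
    """Graphical condition of theorem 3.1 for X_A not Granger-causing X_B."""
    given = set(B) | set(C)
    return not any(connecting_path_exists(edges, a, b, given) for a in A for b in B)

def uncorrelated_implied(edges, A, B, C):
    """Graphical conditions (i) and (ii) of theorem 3.2."""
    given = set(A) | set(B) | set(C)
    linked = any(k == '--' and ((x in A and y in B) or (x in B and y in A))
                 for x, k, y in edges)
    bi = any(connecting_path_exists(edges, a, b, given, True, True)
             for a in A for b in B)
    return not linked and not bi

# Assumed edge list for figure 5a, inferred from the path 3 -> 2 <- 4 -> 1.
fig5a = [(3, '->', 2), (4, '->', 2), (4, '->', 1)]
print(noncausal_implied(fig5a, {3}, {1}, set()))     # True: bivariate analysis
print(noncausal_implied(fig5a, {3}, {1}, {2}))       # False: X2 included
print(uncorrelated_implied(fig5a, {1}, {2}, set()))  # False: path 1 <- 4 -> 2
```

A return value of `False` only means that the graph does not imply the relation; as in the theorems, the conditions are sufficient, not necessary.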

More generally, if a mixed graph *G* encodes certain Granger-non-causal relations of a process *X*_{V}, we say that *X*_{V} satisfies a Markov property with respect to the graph *G*.

*We say that a weakly stationary time-series X*_{V} *satisfies the global Granger-causal Markov property with respect to a mixed graph G if the following conditions hold for all disjoint subsets A*, *B and C of V*:

(i) *If in the graph G every B-pointing path between A and B is m-blocked given B*∪*C*, *then X*_{A} *is Granger-non-causal for X*_{B} *with respect to X*_{A∪B∪C}.

(ii) *If the sets A and B are not connected by an undirected edge* (---) *in graph G and every bi-pointing path between A and B is m-blocked given A*∪*B*∪*C*, *then X*_{A} *and X*_{B} *are contemporaneously uncorrelated with respect to X*_{A∪B∪C}.

With this definition, theorems 3.1 and 3.2 state that a weakly stationary time-series *X*_{V} with the autoregressive representation (2.1) satisfies the global Granger-causal Markov property, with respect to its multivariate path diagram *G*.

For the four-dimensional time-series in (3.1), we have shown above that the Granger-causal relationships with respect to the trivariate subseries *X*_{{1,2,3}} are encoded by the graph in figure 5*b*. It follows from theorems 3.1 and 3.2 that the trivariate subprocess *X*_{{1,2,3}} satisfies the global Granger-causal Markov property with respect to the graph in figure 5*b*. On the other hand, it can be shown that *X*_{{1,2,3}} has an explicit autoregressive representation.

[The autoregressive representation of *X*_{{1,2,3}} and the definitions of its coefficients and noise terms are not recoverable from the source text.]

Simple calculations show that the noise series in this representation is indeed a white noise process with uncorrelated components. Thus, the path diagram associated with the trivariate process *X*_{{1,2,3}} is given by the graph in figure 5*c*. Note that this graph is a subgraph of the graph in figure 5*b*, which has been derived from the path diagram of the complete series *X*_{V}: theorems 3.1 and 3.2 provide only sufficient, not necessary, conditions for Granger non-causality with respect to subprocesses.

### (b) Bivariate path diagrams

Next, we discuss the properties of the bivariate path diagrams introduced in §2*c*. Recall that these path diagrams may have two kinds of edges, namely dashed directed edges ⤍ and undirected edges ---. We note that the choice of dashed directed edges to represent bivariate Granger-causal relationships allows the application of the concept of *m*-separation without further modifications. More precisely, let *G* be a mixed graph with directed edges ⤍ and undirected edges --- and let *π* be a path in *G*. Then the intermediate vertices on *π* can be characterized as colliders and non-colliders as in §3*a*, that is, an intermediate vertex *c* on the path *π* is said to be a collider if the edges preceding and succeeding *c* on the path both have an arrowhead or a dashed tail at *c*. Since *G* contains only edges of the form ⤍ or ---, it follows that all paths *π* in *G* are pure-collider paths, that is, all intermediate vertices are colliders.
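The pure-collider property can be confirmed by enumerating the marks that the permitted edge types show at a vertex. In this small sketch the labels `'~>'`, `'--'` and `'->'` and the mark names are hypothetical stand-ins for ⤍, --- and →.

```python
from itertools import product

# Marks an edge shows at its two ends, as (source end, target end).
END_MARKS = {
    '->': ('tail', 'head'),  # solid directed edge (multivariate diagrams)
    '~>': ('dash', 'head'),  # dashed directed edge (bivariate diagrams)
    '--': ('dash', 'dash'),  # undirected edge
}

def only_colliders(kinds):
    """True if any two edges of the given kinds always meet in a collider."""
    marks = {m for k in kinds for m in END_MARKS[k]}
    # a vertex is a collider iff neither adjacent edge has a solid tail there
    return all(m_in != 'tail' and m_out != 'tail'
               for m_in, m_out in product(marks, repeat=2))

print(only_colliders(['~>', '--']))        # True
print(only_colliders(['->', '~>', '--']))  # False
```

As soon as solid directed edges are admitted, non-colliders become possible, which is why the distinction between → and ⤍ matters once both edge types appear in one graph.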

In §3*a*, we have shown that the concepts of *m*-separation and of pointing paths can be used to derive Granger-non-causal relations with respect to subprocesses from multivariate path diagrams. The same is also true for bivariate path diagrams. More precisely, we have the following result.

Theorem 3.4. *Let X*_{V} *be a weakly stationary time-series with autoregressive representation* *(2.1)* *and let G be the bivariate path diagram of X*_{V}. *Then X*_{V} *satisfies the global Granger-causal Markov property with respect to G*.

As an example, consider the four-dimensional process in (3.1) and suppose that variable *X*_{4} has not been observed. Simple calculations show that *X*_{3} is bivariately Granger-causal for *X*_{2}, which, in turn, bivariately Granger-causes *X*_{1}. Replacing *X*_{4}(*t*−2) with *ϵ*_{4}(*t*−2) in the equation for *X*_{1}(*t*) in (3.1), we find on the other hand that *X*_{1}(*t*) depends only on *ϵ*_{1}(*t*) and *ϵ*_{4}(*t*−2) while *X*_{3}(*t*)=*ϵ*_{3}(*t*). Since the white noise processes *ϵ*_{i}(*t*) are assumed to be uncorrelated, it follows that *X*_{1} and *X*_{3} are completely unrelated. In particular, this implies that *X*_{3} does not bivariately Granger-cause *X*_{1}.

The corresponding bivariate path diagram is shown in figure 5*d*. In this path diagram, the absence of the edge 3⤍1 implies that *X*_{3} is bivariately Granger-non-causal for *X*_{1}. On the other hand, vertices 1 and 3 are connected by the 1-pointing path 3⤍2⤍1, which is *m*-connecting given {1, 2}. By theorem 3.4, the diagram therefore does not exclude that, in a trivariate analysis based on *X*_{{1,2,3}}, the series *X*_{3} Granger-causes *X*_{1}. Similarly, the 2-pointing path 1⤌2⤌3⤍2 indicates that *X*_{1} may Granger-cause *X*_{2} with respect to *X*_{{1,2,3}}. Since this path is also bi-pointing, the diagram also indicates that *X*_{1} and *X*_{2} may be contemporaneously correlated with respect to *X*_{{1,2,3}}. Altogether, we find that the bivariate path diagram in figure 5*d* implies all relations that are encoded by the graph in figure 5*b* and additionally that *X*_{3} is bivariately Granger-non-causal for *X*_{1}.

### (c) Comparison of bivariate and multivariate Granger causality

The notion of Granger causality is based on the idea that a correlation between two variables that cannot be explained otherwise must be a causal influence; the temporal ordering then determines the direction of the causal link. This approach requires that all relevant information is included in the analysis. Given data from a multivariate time-series *X*_{V}, it therefore seems plausible to discuss Granger causality with respect to the full multivariate process *X*_{V}.

As a first example, we consider again the trivariate system in (2.2). Here, the multivariate path diagram (figure 3*a*) describes the effective connectivity among the components correctly, whereas it is not clear from the bivariate path diagram (figure 3*b*) whether *X*_{3} has a direct influence on *X*_{1} or affects *X*_{1} only indirectly via *X*_{2}. The situation becomes worse for the following trivariate system

[Equation (3.2) is not recoverable from the source text.]

where the error series are again assumed to be uncorrelated. The corresponding path diagrams are depicted in figure 6. Here, a bivariate analysis suggests a causal link from *X*_{3} to *X*_{1}, although the observed correlation between *X*_{1} and *X*_{3} is only due to confounding by *X*_{2}. A trivariate analysis correctly shows no direct connections between *X*_{1} and *X*_{3}. This phenomenon, in which an observed Granger-causal relation between two variables vanishes after variables are added to the information set, is called spurious causality of type II (Hsiao 1982).
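The mechanism behind this type of spurious causality can be worked out for a simple confounded system. The equations below are an assumption chosen for illustration and are not claimed to coincide with (3.2): *X*_{2} is white noise and drives *X*_{3} at lag 1 and *X*_{1} at lag 2, with coefficients *α* and *β* and mutually uncorrelated noise series of common variance *σ*^{2}.

$$\begin{aligned}
X_1(t)&=\alpha\,X_2(t-2)+\epsilon_1(t),\qquad X_2(t)=\epsilon_2(t),\qquad X_3(t)=\beta\,X_2(t-1)+\epsilon_3(t),\\[4pt]
\operatorname{cov}\bigl(X_1(t),X_3(t-1)\bigr)&=\alpha\beta\,\operatorname{var}\bigl(\epsilon_2(t-2)\bigr)=\alpha\beta\,\sigma^2,\\[4pt]
\hat{X}_1(t)\,\big|\,\{X_1,X_3\}&=\frac{\alpha\beta}{1+\beta^2}\,X_3(t-1),\qquad
\text{error variance}=(1+\alpha^2)\,\sigma^2-\frac{\alpha^2\beta^2}{1+\beta^2}\,\sigma^2,\\[4pt]
\hat{X}_1(t)\,\big|\,\{X_1,X_2,X_3\}&=\alpha\,X_2(t-2),\qquad
\text{error variance}=\sigma^2 .
\end{aligned}$$

In this assumed system, *X*_{3}(*t*−1) is the only past value of (*X*_{1}, *X*_{3}) that is correlated with *X*_{1}(*t*), and it is uncorrelated with the remaining past values, so the bivariate predictor has the single term shown. Whenever *αβ*≠0, the past of *X*_{3} therefore reduces the bivariate forecast error variance below var(*X*_{1}(*t*))=(1+*α*^{2})*σ*^{2}, although *X*_{3} has no influence on *X*_{1}. Once *X*_{2} is included, the coefficient of *X*_{3} is zero.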

A serious problem that arises in practice is the omission of relevant variables from the analysis (e.g. because the variables could not be measured). As an example, we consider again the situation described in §3*b*, where we have examined the connectivity structure of the trivariate subprocess *X*_{{1,2,3}} of the four-dimensional system (3.1). Here, the multivariate path diagram (figure 5*c*) indicates the presence of a direct causal link from *X*_{3} to *X*_{1} although this Granger-causal influence vanishes in a bivariate analysis. In this situation, the bivariate path diagram (figure 5*d*) clearly provides a better graphical description of the effective connectivity among the variables than the multivariate path diagram. This phenomenon that a Granger-causal relationship vanishes after the information set has been reduced is called spurious causality of type I (Hsiao 1982).

More generally, consider a system in which all relationships between the observed variables are due to confounding by latent variables. It can be shown that two observed components are unrelated whenever they are not confounded by the same latent variables. Equivalently, two observed variables are dependent only if they are affected by a common confounder. Thus, the effective connectivity of such a system can be best investigated by bivariate analyses.

## 4. Latent variables and spurious causality

The discussion in §3 has shown that multivariate path diagrams are best suited for the representation of structures that do not involve confounding by latent variables, while the connectivity of a system where all relationships between the observed variables are induced by confounding is best described by a bivariate path diagram. In practice, however, causal structures may be a combination of both situations with only a part of the Granger-causal relationships being due to confounding by latent variables. In such cases neither graphical representation would provide a correct description of the dependencies among the observed variables. As an example, consider the following four-dimensional system:

[Equation (3.3) is not recoverable from the source text.]

where *ϵ*_{i}, *i*=1,…,4 and *Z* are independent white noise series. Simple calculations show that the multivariate and the bivariate path diagrams are given by the graphs in figure 7*a*,*b*, respectively. Neither graph depicts the effective connectivity of the system correctly. Moreover, both graphs describe different aspects of the dependence structure. From the multivariate path diagram, we learn that *X*_{3} is Granger-non-causal for *X*_{1} with respect to *X*_{{1,2,3}}, while the same Granger-non-causal relation cannot be derived from the bivariate path diagram. On the other hand, the bivariate path diagram implies that *X*_{4} is bivariately Granger-non-causal for *X*_{2}, which is not encoded in the multivariate path diagram.

### (a) General path diagrams

We now propose a new graphical representation that is based on the global Granger-causal Markov property and combines elements of both multivariate and bivariate representations. More precisely, we consider graphs that may contain three types of edges, namely →, ⤍, and ---. For the previous example, such a graph is shown in figure 7*c*. Unlike the graphs in figure 7*a*,*b*, this graph encodes both that *X*_{4} is bivariately Granger-non-causal for *X*_{2} and that *X*_{3} is Granger-non-causal for *X*_{1} with respect to *X*_{{1,2,3}}.

Suppose that observations from a multivariate time-series *X*_{S} that is part of a larger system *X*_{V} are available. A necessary condition for a graph *G*=(*S*, *E*) to serve as a model for the effective connectivity of *X*_{S} is that it does not imply any independence relations that do not hold for *X*_{S}.

*A mixed graph G is consistent with X*_{S} *if X*_{S} *satisfies the global Granger-causal Markov property with respect to G*.

Obviously, this condition is not sufficient for the identification of effective connectivity since the Markov property trivially holds for a saturated graph with all possible edges included. So how can we determine a graph that depicts the dependencies between the observed time-series as closely as possible? We first note that the type of a directed edge, → or ⤍, cannot be determined from the pairwise Granger-non-causal relations between the two corresponding variables alone. For instance, suppose that *X*_{{a,b}} is a bivariate process such that *X*_{a} Granger-causes *X*_{b} but not *vice versa*. From this information, it is impossible to decide between the two graphs *a*→*b* and *a*⤍*b* since both encode the same Granger-non-causal relations.

On the other hand, consider again the four-dimensional process (3.1) and suppose, as in §3*b*, that only the subprocess *X*_{S}=*X*_{{1,2,3}} has been observed. The multivariate path diagram *G* associated with *X*_{S} is depicted in figure 8*a*. As noted in §3*b*, the variable *X*_{3} does not Granger-cause *X*_{1} in a bivariate analysis. While this Granger-non-causal relation cannot be derived from graph *a* in figure 8, it is implied by graphs *c* and *d*. This suggests that the latter two diagrams might be better representations of the effective connectivity. Thus, the type of the directed edge from 2 to 1 is determined by the bivariate relationship between components *X*_{3} and *X*_{1}.

This reasoning can be formalized in terms of so-called minimal consistent graphs, which have been introduced by Pearl (2000) who addressed the problem of inferring causal effects from multivariate distributions satisfying certain conditional independence relations. Note that, in contrast to the graphs described here, the causal structures discussed by Pearl need to be directed acyclic graphs.

For a mixed graph *G*=(*S*, *E*), let ℐ(*G*) be the set of all independence relations between the variables in *X*_{S} that are encoded in the graph according to the global Granger-causal Markov property. Then for two mixed graphs *G* and *G*′, we write *G*≼*G*′ if ℐ(*G*′)⊆ℐ(*G*), that is, the graph *G* encodes the same or more independence relations than the graph *G*′. Furthermore, if *G*≼*G*′ and *G*′≼*G* then ℐ(*G*′)=ℐ(*G*) and we say that the two graphs are Markov equivalent (with respect to the global Granger-causal Markov property). In this case, we will write *G*∼*G*′.

Now if *G* and *G*′ are two consistent graphs such that *G*≼*G*′, we prefer *G* as a graphical representation of the effective connectivity of *X*_{S} since it encodes more information about the dependencies between the components of *X*_{S}. Thus, a graphical representation *G* is optimal if it is consistent with *X*_{S} and cannot be further reduced without violating consistency.

*A graph G is minimal in the class of all graphs consistent with X*_{S} *if G is consistent and for any other consistent graph G*′ *we have G*′∼*G whenever G*′≼*G*.

In other words, any graph *G*′ that implies further Granger-non-causal relations additional to those implied by a minimal consistent graph *G* must be inconsistent with *X*_{S}.
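The ordering of graphs by their encoded relations and the resulting notion of minimality can be sketched in a few lines. In this illustration each graph is represented only by the set of relations it encodes; the relation labels, graph names and function names are hypothetical placeholders and do not reproduce the actual relation sets of any figure. Deriving the encoded relations from a graph would require the global Granger-causal Markov property, which is not implemented here.

```python
def is_consistent(encoded, true_relations):
    """A graph is consistent if every relation it encodes holds for the process."""
    return encoded <= true_relations


def precedes(g, g_prime):
    """True if g encodes the same or more relations than g_prime."""
    return g_prime <= g


def minimal_consistent(graphs, true_relations):
    """Return the names of all minimal consistent graphs.

    graphs maps a graph name to the set of relations the graph encodes.
    A consistent graph is minimal if every consistent graph that encodes
    at least the same relations is Markov equivalent to it.
    """
    consistent = {
        name: rel for name, rel in graphs.items()
        if is_consistent(rel, true_relations)
    }
    minimal = []
    for name, rel in consistent.items():
        if all(precedes(rel, other)
               for other in consistent.values() if precedes(other, rel)):
            minimal.append(name)
    return sorted(minimal)


# Hypothetical example: three relations hold for the observed process.
true_relations = {"r1", "r2", "r3"}
graphs = {
    "Ga": {"r1"},
    "Gb": {"r1"},                     # Markov equivalent to Ga
    "Gc": {"r1", "r2"},               # encodes more than Ga and Gb
    "Gd": {"r2", "r3"},               # not comparable with Gc
    "Ge": {"r1", "r2", "r_false"},    # encodes a relation that does not hold
}
print(minimal_consistent(graphs, true_relations))
```

In this toy setting `Gc` and `Gd` are both minimal without being Markov equivalent, which mirrors the situation in which no single optimal representation exists.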

We illustrate these concepts by application to the system in (3.1). As before we assume that the component *X*_{4} has not been observed. Figure 8*a* displays the corresponding multivariate path diagram *G*_{{1,2,3}} associated with the trivariate subprocess *X*_{{1,2,3}}, which by theorems 3.1 and 3.2 is consistent with *X*_{{1,2,3}}. For the derivation of a minimal consistent graph, we first note that the edge 2→1 in *G*_{{1,2,3}} can be replaced by the edge 2⤍1 without changing the set ℐ(*G*_{{1,2,3}}) of encoded relations. Thus, the resulting graph *G*′ in figure 8*b* is Markov equivalent to *G*_{{1,2,3}}.

Next, we consider the graph *G* in figure 8*c*, which is a subgraph of *G*′ and hence satisfies *G*≼*G*′. In addition to the relations encoded by *G*′, this graph *G* correctly implies that *X*_{3} is bivariately Granger-non-causal for *X*_{1} since the only 1-pointing path from 3 to 1 is *m*-blocked given the empty set. Since it does not imply any further additional relations among the components, it is also consistent with *X*_{{1,2,3}}. The graph resembles the bivariate path diagram of *X*_{{1,2,3}} (figure 8*d*), which by theorem 3.4 is also consistent with *X*_{{1,2,3}}. Neither graph can be further reduced without violating an existing Granger causality from *X*_{3} to *X*_{2} or from *X*_{2} to *X*_{1}; therefore, both are minimally consistent with *X*_{{1,2,3}}. However, we note that the two graphs are not Markov equivalent. For example, the bi-pointing path 1⤌2←3→2 in *G* is *m*-connecting given the empty set and hence indicates that *X*_{1} bivariately Granger-causes *X*_{2}. In contrast, the corresponding bi-pointing path 1⤌2⤌3⤍2 in the bivariate path diagram is *m*-connecting given {3} and, hence, indicates that *X*_{1} Granger-causes *X*_{2} with respect to *X*_{{1,2,3}}. In this case, there does not exist an optimal graphical representation that encodes all relations that hold for *X*_{{1,2,3}}.

Finally, we note that the graphs in figure 8*e* both falsely imply that *X*_{3} is Granger-non-causal for *X*_{1} in a trivariate analysis and hence are inconsistent with the process *X*_{{1,2,3}}.

### (b) Identification of minimal consistent graphs

In practice, the determination of minimal consistent graphs must be data-based. Recall that if *G*′ is consistent with the process *X*_{S} and *G* is another graph such that *G*≼*G*′ but *G* and *G*′ are not Markov equivalent, then *G* implies some Granger-non-causal (or contemporaneous non-correlation) relation that is not encoded in *G*′. Thus, we can decide between the two graphs *G* and *G*′ by testing for this relationship. More precisely, suppose that the graph *G* implies, in addition to the relations in ℐ(*G*′), that *X*_{a_{0}} is Granger-non-causal for *X*_{b_{0}} with respect to *X*_{A} for some *A*⊆*S* and *a*_{0}, *b*_{0}∈*A*. Since *G*′ is consistent with *X*_{S}, the subseries *X*_{A} has an autoregressive representation

$$X_A(t)=\sum_{u=1}^{p}\Psi(u)\,X_A(t-u)+\eta(t),$$

where *η*(*t*) is a white noise series with covariance matrix *Ω* and the parameters *Ψ*(*u*), *u*=1,…,*p* and *Ω* are constrained with respect to the graph *G*′, that is,

(i) *Ψ*_{ba}(*u*)=0 for *u*=1,…,*p* whenever *G*′ implies that *X*_{a} is Granger-non-causal for *X*_{b} with respect to *X*_{A}; and

(ii) *Ω*_{ab}=0 whenever *G*′ implies that *X*_{a} and *X*_{b} are contemporaneously uncorrelated with respect to *X*_{A}.

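Now, if the graph *G* is also consistent with *X*_{S}, we have additionally that *X*_{a_{0}} is Granger-non-causal for *X*_{b_{0}} with respect to *X*_{A} and hence *Ψ*_{b_{0}a_{0}}(*u*)=0 for all lags *u*=1,…,*p*. Thus, we can determine whether *G* is consistent with *X*_{S} by testing the null hypothesis

$$H_0:\ \Psi_{b_0a_0}(u)=0\quad\text{for all } u=1,\ldots,p$$

against the alternative

$$H_1:\ \Psi_{b_0a_0}(u)\neq 0\quad\text{for some } u\in\{1,\ldots,p\}.$$

Similarly, if *G* additionally implies that *X*_{a_{0}} and *X*_{b_{0}} are contemporaneously uncorrelated with respect to *X*_{A}, we can test for consistency of *G* by considering the null hypothesis *H*_{0}: *Ω*_{a_{0}b_{0}}=0.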

The test is based on fitting a VAR(*p*) model to the subseries *X*_{A} under the constraints (i) and (ii). These constraints define a path diagram *G*_{A}=(*A*, *E*_{A}) for *X*_{A} with *a*→*b*∉*E*_{A} if *G*′ implies that *X*_{a} is Granger-non-causal for *X*_{b} with respect to *X*_{A} and *a*---*b*∉*E*_{A} if *G*′ implies that *X*_{a} and *X*_{b} are contemporaneously uncorrelated with respect to *X*_{A}. Therefore, the restricted model can be fitted by using the iterative procedure for fitting graphical vector autoregressions described in the Appendix. Noting that for large sample sizes *T* the estimates $\hat\Psi_{b_0a_0}=(\hat\Psi_{b_0a_0}(1),\ldots,\hat\Psi_{b_0a_0}(p))'$ are approximately normally distributed with variance *V*/*T*, we obtain, as a test statistic for the above test problem,

$$S_{b_0a_0}=T\,\hat\Psi_{b_0a_0}'\,\hat V^{-1}\,\hat\Psi_{b_0a_0},$$

where $\hat V$ is an estimate for *V*. Under the null hypothesis *H*_{0}, the test statistic *S*_{b_{0}a_{0}} is asymptotically *χ*^{2}-distributed with *p* degrees of freedom.

As an example, we consider again the system (3.1) which we discussed in the previous section. Recall that we have found that the graph *G*′ in figure 8*b* is Markov equivalent to the multivariate path diagram associated with *X*_{{1,2,3}} and, thus, is consistent with *X*_{{1,2,3}}. Furthermore, we have seen that the graph *G* in figure 8*c* satisfies *G*≼*G*′. In order to test whether *G* is consistent with the data, we note that *G*′ implies that *X*_{1} is bivariately Granger-non-causal for, and bivariately contemporaneously uncorrelated with, *X*_{3}, whereas *G* entails that *X*_{1} and *X*_{3} are completely unrelated. Thus, the graph *G*′ leads to the following restricted bivariate VAR(*p*) model for *X*_{{1,3}}:

$$\begin{aligned}X_1(t)&=\sum_{u=1}^{p}\big[\Psi_{11}(u)X_1(t-u)+\Psi_{13}(u)X_3(t-u)\big]+\eta_1(t),\\ X_3(t)&=\sum_{u=1}^{p}\Psi_{33}(u)X_3(t-u)+\eta_3(t),\end{aligned}$$

where the noise series *η*_{1} and *η*_{3} satisfy *Ω*_{13}=cov(*η*_{1}(*t*), *η*_{3}(*t*))=0. For this example, we obtain as the null hypothesis *H*_{0}: *Ψ*_{13}(*u*)=0 for *u*=1,…,*p*.

Finally, we note that if two graphs *G* and *G*′ are Markov equivalent we cannot decide between the two graphical representations.

### (c) Causal effects and spurious causality

The minimal graphs consistent with a process *X*_{S} describe all causal models that can be used for an explanation of the observed dependencies among the variables in *X*_{S}. Therefore, if a directed edge *a*→*b* is present in all minimal consistent graphs, the Granger-causal relationship from *X*_{a} to *X*_{b} might be in part, but not completely, attributed to a common explanatory latent variable. This implies the existence of a causal influence of *X*_{a} on *X*_{b}.

*X*_{a} *has a causal effect on X*_{b} *if there exists a directed path from a to b in every minimal causality graph consistent with X*_{S}.

Conversely, if *X*_{a} Granger-causes *X*_{b} with respect to some subprocess *X*_{S′} of *X*_{S} (including the case *S*′=*S*), but there does not exist a directed path from *a* to *b* in any minimal graph consistent with *X*_{S}, then this Granger-causal relationship between *X*_{a} and *X*_{b} must be spurious.

*Suppose that X*_{a} *Granger-causes X*_{b} *with respect to X*_{S}. *Then we say that X*_{a} *spuriously causes X*_{b} *with respect to X*_{S} *if no directed path from a to b exists in any minimal graph consistent with X*_{S}.
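Once the minimal consistent graphs are known, the two definitions above reduce to a reachability check. The sketch below is an illustration under stated assumptions: each graph is given as a set of pairs `(u, v)` for the solid directed edges *u*→*v* only, dashed edges ⤍ are assumed not to contribute to directed paths, and the example graphs and the function names are hypothetical.

```python
def has_directed_path(edges, a, b):
    """Depth-first search from a to b along solid directed edges (u, v)."""
    stack, seen = [a], {a}
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                stack.append(v)
    return False


def classify(a, b, minimal_graphs):
    """Classify the relation from a to b over all minimal consistent graphs.

    Returns "causal effect" if a directed path exists in every graph,
    "spurious" if it exists in none (this presupposes that X_a
    Granger-causes X_b), and "undetermined" otherwise.
    """
    if not minimal_graphs:
        raise ValueError("at least one minimal consistent graph is required")
    found = [has_directed_path(g, a, b) for g in minimal_graphs]
    if all(found):
        return "causal effect"
    if not any(found):
        return "spurious"
    return "undetermined"


# Hypothetical pair of minimal graphs: in the second one the edges
# 1 -> 2 and 1 -> 4 have been replaced by dashed edges and are omitted.
g1 = {(1, 2), (1, 4), (4, 3)}
g2 = {(4, 3)}
print(classify(4, 3, [g1, g2]))
print(classify(1, 2, [g1, g2]))
print(classify(2, 3, [g1, g2]))
```

The "undetermined" outcome corresponds to edges whose type can be changed between → and ⤍ without altering the encoded relations.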

The concepts presented in this section can be used to evaluate effective connectivity. We note that it may happen that the direct and indirect effects of one variable on another cancel out such that the total effect vanishes. Under such circumstances, the effective connectivity differs from the structural connectivity of the system. Inference about the structural connectivity of a system is only possible under the assumption that the complete system *X*_{V} (including latent variables) is stable (Pearl 2000) or faithful (Spirtes *et al*. 2001) with respect to its associated path diagram *G*_{V}, that is, a Granger-non-causal relation holds for *X*_{V} if, and only if, it can be derived from *G*_{V} (and the same for contemporaneous correlation). We note that it is not possible to test whether a system satisfies this assumption.

## 5. Applications

### (a) Simulated example

For an illustration of the application of the concepts developed in this paper, we first consider simulated data (length=512) from a four-dimensional system, where the series *ϵ*_{i}(*t*), *i*=1,…,4 and *L*_{i}(*t*), *i*=1, 2 are independent white noise with mean zero and variance one. For the simulation, we have set *α*_{1}=⋯=*α*_{6}=0.6.

In this system, *L*_{1} and *L*_{2} represent latent variables that were not included in the analysis. For the remaining four variables *X*_{1},…,*X*_{4} we have estimated the partial directed correlations (PDC) *π*_{ab}(*u*) (cf. Appendix) with respect to the complete available information (*X*_{{1,…,4}}) as well as with respect to bivariate subprocesses. The estimates are shown in figure 9 and suggest that the series can be well approximated by a VAR model of order 2. Moreover, the results of the multivariate analysis show five PDC that are significantly different from zero, which leads to the multivariate path diagram in figure 9*b*. For a quantitative analysis, we have also performed tests for Granger causality and contemporaneous non-correlation (table 1), which corroborate the results obtained from the inspection of the PDC. Similarly, the bivariate path diagram in figure 9*c* has been obtained from the results of the bivariate analyses. Comparing the two graphs, we note that all directed edges that are present in the multivariate path diagram are also included in the bivariate path diagram. This means that the analysis of all bivariate subseries does not detect any additional relations among the components and thus does not indicate any spurious causalities.

We now examine whether any of the observed relationships between the variables are due to confounding by latent variables. We first consider the graphs in figure 10*a*. Here the left graph is Markov equivalent to the multivariate path diagram. Removing the edge 1→3 we obtain the right graph *G*_{a} which, in addition to the independencies encoded by the left graph, implies that *X*_{1} is Granger-non-causal for *X*_{3} with respect to *X*_{{1,2,3}}. To test whether *G*_{a} is consistent with the data, we follow the procedure described in the previous section and fit a VAR(2) model to the subseries *X*_{{1,2,3}} under the constraints implied by the left graph, that is, for *u*=1, 2 and . In this model, the null hypothesis that the subgraph *G*_{a} is consistent with the data can be formulated as *H*_{0}: *Ψ*_{31}(*u*)=0 for *u*=1, 2. The corresponding test clearly rejects the null hypothesis (table 2) and, thus, we conclude that graph *G*_{a} is not consistent with the data.

Alternatively, examination of the PDC between *X*_{1} and *X*_{3} with respect to *X*_{{1,2,3}} (figure 11*a*) still shows a significant positive influence of *X*_{1} on *X*_{3}, which implies that *X*_{1} Granger-causes *X*_{3} with respect to the subseries *X*_{{1,2,3}}. Following the same steps as above with component *X*_{4} substituted for *X*_{2} (figure 10*b*), we find that the null hypothesis that *X*_{1} is Granger-non-causal for *X*_{3} with respect to *X*_{{1,3,4}} cannot be rejected (table 2). This is also indicated by the PDC between *X*_{1} and *X*_{3} with respect to *X*_{{1,3,4}}, which vanishes completely (figure 11*b*). Consequently, the graph *G*_{b} is consistent with *X*_{{1,2,3,4}}. Further examination shows the graph is also minimally consistent with *X*_{{1,2,3,4}}.

Finally, we consider the right graph in figure 10*c* which encodes that *X*_{1} is bivariately Granger-non-causal for *X*_{3}. Since this contradicts the results of the bivariate analysis of *X*_{{1,3}} it follows that this graph is inconsistent with *X*_{{1,2,3,4}}.

From definitions 4.3 and 4.4, we conclude that the causality from *X*_{2} to *X*_{3} is spurious whereas the association between *X*_{4} and *X*_{3} is due to a causal effect of *X*_{4} on *X*_{3}. In contrast, neither a causal effect nor spurious causality can be established for the directed edges from 1 to 2 and from 1 to 4, since each of them may be replaced by a dashed directed edge ⤍ without changing the Markov properties of the graph.

### (b) Application to fMRI data

In this section, we apply the methods discussed in this paper to a seven-dimensional time-series obtained from concurrent recordings from EEG and fMRI. The EEG was sampled at 200 Hz from an array of 16 bipolar pairs, with an additional channel for the EEG and scan trigger. The fMRI series were measured with a time resolution of 2.5 s at six slice planes (TR=4 mm, skip 1 mm), with the second most inferior slice oriented through the anterior commissure–posterior commissure (AC–PC) line. The data and their acquisition are described in detail in Goldman *et al*. (2002). Informed consent was obtained from the volunteers based on a protocol previously approved by the UCLA Office for the Protection of Research Subjects.

For the analysis, the time-varying spectrum of the EEG has been decomposed by parallel factor (PARAFAC) analysis into trilinear components (called atoms), each being the product of spatial, spectral and temporal factors. The PARAFAC analysis extracted three significant atoms characterized by their spectral signature. Only the temporal factor of the alpha atom corresponding to a frequency range of 8–12 Hz was included in the effective connectivity analysis.

The fMRI data are time-series of length *T*=108 for six regions in the brain whose activation was correlated with the EEG alpha atom: visual cortex, thalamus, left and right insulae and left and right somatosensory areas. The time-series for each region was obtained by averaging the time-series of all voxels in that region. For details about preprocessing the data, we refer to Martínez-Montes *et al*. (2004). The use of PARAFAC analysis in analysing three-dimensional EEG data (space, time and frequency) is described in Miwakeichi *et al*. (2004).

For the determination of the Granger-causal relationships among the seven series, a vector autoregressive model of order *p*=1 was fitted to the data. The model order has been determined by minimization of Akaike's information criterion (AIC) (Lütkepohl 1993),

$$\mathrm{AIC}(p)=\log\det\hat\Sigma(p)+\frac{2r}{T},$$

where $\hat\Sigma(p)$ is the estimate for the covariance matrix of *ϵ*(*t*) and *r* is the number of parameters for the model. From the parameter estimates, the corresponding PDC with respect to the complete series have been computed (figure 12, below diagonal). Similarly, estimates for the bivariate PDC have been obtained by fitting vector autoregressive models of order *p*=2, where the order has been chosen so that the AIC is minimized for most bivariate analyses (figure 12, above diagonal).
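Order selection by AIC can be illustrated with a simplified univariate sketch. The assumptions are labeled here rather than taken from the paper: the series is one-dimensional, the autoregressive fits use the Levinson-Durbin recursion on sample autocovariances (a Yule-Walker fit, not the least-squares fit of the multivariate model), and the criterion is taken as log innovation variance plus 2*p*/*T*. All function names are hypothetical.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def innovation_variances(gamma):
    """Levinson-Durbin recursion.

    Returns the innovation variance of the AR(p) fit for
    p = 0, ..., len(gamma) - 1.
    """
    variances = [gamma[0]]
    phi = []  # coefficients of the current AR(p-1) fit
    for p in range(1, len(gamma)):
        acc = gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))
        k = acc / variances[-1]  # partial autocorrelation at lag p
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        variances.append(variances[-1] * (1.0 - k * k))
    return variances


def select_order(x, max_order):
    """Return (best order, list of AIC values for orders 0..max_order)."""
    T = len(x)
    variances = innovation_variances(autocovariances(x, max_order))
    aic = [math.log(variances[p]) + 2.0 * p / T for p in range(max_order + 1)]
    best = min(range(max_order + 1), key=aic.__getitem__)
    return best, aic


def simulate_ar2(n, seed=0):
    """Simulate x(t) = 0.5 x(t-1) - 0.5 x(t-2) + e(t) with unit-variance noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(0.5 * x[-1] - 0.5 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]
```

For a series generated by `simulate_ar2`, the criterion drops sharply up to order 2 and changes little afterwards, so the selected order is at least 2. With series as short as *T*=108 the penalty term 2*r*/*T* carries considerably more weight.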

For the identification of the multivariate path diagram, we have performed a series of tests for pairwise Granger non-causality and contemporaneous non-correlation; the results of the tests are given in table 3 and the resulting multivariate path diagram is depicted in figure 13. Similarly, the bivariate path diagram could be obtained by testing for bivariate Granger non-causality and bivariate contemporaneous non-correlation. Due to the large number of edges in this graph, it would be difficult to draw any inferences from the bivariate path diagram and we therefore omit it. The results of the tests are also given in table 3.

The multivariate path diagram most notably shows that the EEG alpha atom has a direct influence on the thalamus and visual cortex only, while the BOLD responses in the other regions are only influenced indirectly. This indicates that the brain regions involved in the generation of the ‘EEG alpha rhythm’ are primarily located in the thalamus and visual cortex and corroborates previous findings by Goldman *et al*. (2002) and Martínez-Montes *et al*. (2004), who identified the parieto-occipital cortex and thalamus as generators of the ‘alpha brain rhythm’. Comparing these findings with the results of the bivariate analysis in table 3*b*,*c*, we find that although the EEG alpha atom shows an influence on all cortical brain regions, the positive correlation between the EEG alpha atom and the BOLD response in the thalamus is no longer significant. This suggests that the Granger causality from EEG alpha atom to thalamus is spurious.

For a more detailed analysis of the effective connectivity, we first note that in neither the multivariate nor the bivariate analyses do the EEG alpha atom, the visual cortex and the thalamus seem to be affected by the remaining four brain regions. We therefore may restrict our analysis to these three components. Figure 14*a*,*b* displays the corresponding multivariate and bivariate path diagrams *G*^{(m)} and *G*^{(b)}, respectively; the former has been obtained following the same steps as above (order *p*=2), while the latter has been directly derived from the bivariate results in table 3. Here, the multivariate path diagram *G*^{(m)} implies that the thalamus and the visual cortex neither Granger-cause the EEG alpha atom nor are they contemporaneously correlated with the EEG component, while the bivariate path diagram *G*^{(b)} additionally encodes that, first, the EEG alpha atom does not bivariately Granger-cause the thalamus and second, the visual cortex and thalamus are bivariately contemporaneously uncorrelated. Thus, we have *G*^{(b)}≼*G*^{(m)}. Furthermore, since, according to theorem 3.4, the bivariate path diagram is consistent with the data and already encodes all empirically determined relations among the variables, we conclude that the bivariate path diagram *G*^{(b)} is minimally consistent with the data. Replacing the edge TH⤍VC with TH→VC, we obtain another graph *G*′ which is Markov equivalent to *G*^{(b)} and, hence, is also minimally consistent with the data (figure 14*c*). Simple considerations show that no further minimally consistent graph exists.

The minimally consistent graphs show that the correlation between EEG alpha atom and thalamic BOLD responses observed in a multivariate analysis can be attributed to the indirect link EEG⤍VC⤍TH mediated by the visual cortex. This is in line with the results by Martínez-Montes *et al*. (2004), who identified the visual cortex as the source of the EEG alpha rhythm. However, note that, contrary to the explanation provided by Martínez-Montes *et al*. (2004), we find that EEG and visual cortex are negatively correlated while the PDCs that correspond to the link from the visual cortex to the thalamus are positive. Since the intermediate vertex is a collider, the combined correlation is negated resulting in a positive correlation for the pathway. This effect of changing sign can also be observed in figure 8, where the pathway 1⤍2⤍3 leads to an overall negative correlation (see also Eichler *et al*. 2003).

Finally, we note that the contemporaneous correlation between the thalamus and the visual cortex in a multivariate analysis is due to the pathway TH⤌VC⤌EEG⤍VC.

The analysis suggests a causal influence of the EEG on the visual cortex. This ‘causal ordering’ is most likely due to the fact that brain activity, which is instantaneously reflected in the EEG measurements, takes much longer to be reflected in the fMRI. Thus, the analysis allows no conclusions about the causal ordering of the EEG and the fMRI component. Furthermore, we note that it is not possible to determine whether the link between the EEG alpha atom and the visual cortex is due to direct confounding since the Markov properties of the graph would not be changed if the directed edge between the EEG alpha atom and the visual cortex were replaced by a dashed directed edge.

A serious limitation of effective connectivity analysis from time-resolved fMRI data in general is the small sample size; in the example above only 108 measurements were available. Consequently, only strong correlations among the components show up significantly in an analysis. This particularly affects the identification of indirect pathways as needed for the distinction between direct and spurious causality. For instance, in our discussion of the four-dimensional system (3.1) in §3*a*, we have derived the autoregressive representation of the trivariate subseries *X*_{{1,2,3}}. Here, the coefficient *Φ*_{13}(2), which corresponds to the *m*-connecting pathway 3→2←4→1, was of the form *αβγ*/(1+*β*^{2}). Since all coefficients were smaller than one, the Granger-causal effect of *X*_{3} on *X*_{1} is much smaller than the other links between the components. In an analysis with small sample size such correlations can easily become insignificant and thus hamper the identification of the effective connectivity of the system.

## 6. Conclusion

In this paper, a new graphical approach for the analysis of causal relationships in multivariate time-series has been presented. In particular, this approach allows the comparison of multivariate and bivariate Granger causality, both of which are frequently used in brain imaging to resolve functional relationships between brain areas from EEG or fMRI time-series data. We have shown that both concepts may provide useful information that cannot be obtained by the other.

Our discussion has also shown that the effective connectivity of systems that may be affected by unmeasured latent variables cannot be resolved by multivariate and bivariate analyses alone, but only by examination of Granger-non-causal relations with respect to all subseries. To this end, we have generalized the idea of multivariate and bivariate path diagrams and introduced mixed graphs that can be used to visualize the effective connectivity of such systems.

Currently, the identification of such graphical representations is based on a multi-step procedure where each step requires the fitting of a new autoregressive model. As a consequence, it is impossible to compare two graphical representations of the effective connectivity and test between them. Furthermore, the statistical errors in different steps may lead to contradictory results. Therefore, future research aims at the development of new graphical time-series models with dependencies that are constrained by general path diagrams.

## Acknowledgments

The author wishes to thank Robin Goldman and Mark Cohen, who conducted the EEG–fMRI experiments discussed in §5, and Pedro Valdéz-Sosa and Eduardo Martínez-Montes for providing the data and many helpful comments.

## Appendix A Statistical inference

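For the analysis of empirical data, VAR(*p*) models can be fitted using least-squares estimation. For observations *X*_{V}(1),…,*X*_{V}(*T*) from a *d*-dimensional multiple time-series *X*_{V}, let $\hat R_p=(\hat R_p(u,v))_{u,v=1,\ldots,p}$ be the *pd*×*pd* matrix composed of submatrices

$$\hat R_p(u,v)=\frac{1}{T-p}\sum_{t=p+1}^{T}X_V(t-u)\,X_V(t-v)'.$$

Similarly, we set $\hat r_p=(\hat R_p(0,1),\ldots,\hat R_p(0,p))$. Then the least-squares estimates of the autoregressive coefficients are given by

$$\hat\Phi(u)=\sum_{v=1}^{p}\hat R_p(0,v)\,(\hat R_p^{-1})(v,u)$$

for *u*=1,…,*p*, while the covariance matrix *Σ* is estimated by

$$\hat\Sigma=\frac{1}{T}\sum_{t=p+1}^{T}\hat\epsilon(t)\,\hat\epsilon(t)',$$

where $\hat\epsilon(t)=X_V(t)-\sum_{u=1}^{p}\hat\Phi(u)X_V(t-u)$ are the least-squares residuals. The estimates are asymptotically normally distributed; for details we refer to Lütkepohl (1993).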
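As a small numerical illustration of least-squares estimation, the following sketch fits a VAR(1) model to a bivariate zero-mean series by solving the normal equations, that is, the coefficient matrix is the lag-one cross-product matrix multiplied by the inverse of the lag-zero cross-product matrix of the past values. The restriction to two dimensions and one lag, the simulated system and the function names are assumptions made to keep the example short.

```python
import random


def fit_var1(x, y):
    """Least-squares VAR(1) fit for a bivariate zero-mean series.

    Returns the 2x2 matrix phi, where phi[a][b] is the coefficient of
    component b at time t-1 in the equation for component a at time t.
    """
    series = (x, y)
    T = len(x)
    # r11[i][j] = sum_t X_i(t-1) X_j(t-1), r01[i][j] = sum_t X_i(t) X_j(t-1)
    r11 = [[sum(series[i][t - 1] * series[j][t - 1] for t in range(1, T))
            for j in range(2)] for i in range(2)]
    r01 = [[sum(series[i][t] * series[j][t - 1] for t in range(1, T))
            for j in range(2)] for i in range(2)]
    det = r11[0][0] * r11[1][1] - r11[0][1] * r11[1][0]
    inv = [[r11[1][1] / det, -r11[0][1] / det],
           [-r11[1][0] / det, r11[0][0] / det]]
    # phi = r01 * inverse(r11)
    return [[sum(r01[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def simulate_var1(n, seed=0):
    """Simulate x(t) = 0.5 x(t-1) + e1(t) and y(t) = 0.4 x(t-1) + e2(t)."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_new = 0.5 * x[-1] + rng.gauss(0.0, 1.0)
        y_new = 0.4 * x[-1] + rng.gauss(0.0, 1.0)
        x.append(x_new)
        y.append(y_new)
    return x, y
```

In the simulated system the first component Granger-causes the second but not vice versa, so for long series `phi[1][0]` is close to 0.4 while `phi[0][1]` is close to zero.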

The coefficients *Φ*_{ab}(*u*) depend, like any regression coefficient, on the unit of measurement of *X*_{a} and *X*_{b} and, thus, are not suited for comparisons of the strength of causal relationships between different pairs of variables. Therefore, Eichler (2005) proposes partial directed correlations as a measure of the strength of causal effects. For *u*>0 (*u*=0) the partial directed correlation *π*_{ab}(*u*) is defined as the correlation between *X*_{a}(*t*) and *X*_{b}(*t*−*u*) after removing the linear effects of while for *u*<0 we have . It is further shown that for *u*>0, estimates for the partial directed correlations *π*_{ab}(*u*) can be obtained from the parameter estimates of a VAR(*p*) model by rescaling the coefficients *Φ*_{ab}(*u*)wherewith . For *u*=0, we obviously haveFor large sample length *T*, the partial directed correlations are approximately normally distributed with mean *π*_{ab}(*u*) and variance 1/*T*.
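The asymptotic variance 1/*T* gives a simple rule of thumb for judging whether an estimated partial directed correlation differs from zero: under *π*_{ab}(*u*)=0 the estimate lies approximately within ±*z*/√*T*, where *z* is the standard normal quantile. The sketch below computes this bound; the function name and the default 5% level are assumptions for illustration.

```python
from statistics import NormalDist


def pdc_threshold(T, level=0.05):
    """Two-sided critical value for a PDC estimate under pi_ab(u) = 0.

    Uses the normal approximation with variance 1/T.
    """
    z = NormalDist().inv_cdf(1.0 - level / 2.0)
    return z / T ** 0.5


# Sample lengths used in the two applications of section 5.
print(round(pdc_threshold(512), 4))
print(round(pdc_threshold(108), 4))
```

For the simulated data (*T*=512) the bound is about 0.087, whereas for the fMRI series (*T*=108) it is about 0.189, which illustrates why only strong dependencies are detectable in short series.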

Tests for Granger-causal relationships among the variables can be derived from the asymptotic distribution of the parameters of the VAR(*p*) model. More precisely, let $\hat V_{ab}(u,v)$ be the estimate for the asymptotic covariance between $\hat\Phi_{ab}(u)$ and $\hat\Phi_{ab}(v)$ and let $\hat V_{ab}$ be the corresponding *p*×*p* matrix. Then the existence of a Granger-causal effect of *X*_{b} on *X*_{a} can be tested by evaluation of the test statistic

$$S_{ab}=T\sum_{u,v=1}^{p}\hat\Phi_{ab}(u)\,(\hat V_{ab}^{-1})(u,v)\,\hat\Phi_{ab}(v).$$

Under the null hypothesis that *X*_{b} is Granger-non-causal for *X*_{a} with respect to *X*_{V}, the test statistic *S*_{ab} is asymptotically *χ*^{2}-distributed with *p* degrees of freedom.
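A test statistic of this kind is a quadratic form in the estimated lag coefficients. The sketch below assumes the common Wald form *S*=*T*·*φ*′*V*^{−1}*φ*, where *φ* collects the *p* estimated coefficients and *V* is the estimated asymptotic covariance matrix of √*T* times the estimates; this scaling convention, the helper `solve` and the function names are assumptions, and the inputs in the example are made-up numbers rather than estimates from data.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def wald_statistic(phi, V, T):
    """Return T * phi' V^{-1} phi for lag coefficients phi(1), ..., phi(p)."""
    w = solve(V, phi)  # w = V^{-1} phi without forming the inverse
    return T * sum(a * b for a, b in zip(phi, w))


# Made-up example with p = 2 lags and T = 100 observations.
S = wald_statistic([0.1, 0.2], [[1.0, 0.0], [0.0, 1.0]], 100)
print(S)
```

The statistic is then compared with a quantile of the *χ*^{2} distribution with *p* degrees of freedom; at the 5% level the critical values are 3.841 for *p*=1 and 5.991 for *p*=2, so the value of about 5 obtained in the made-up example would not lead to rejection.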

More generally, inference on causal structures should be based on fitting graphical vector autoregressive models. For given graph *G* and order *p*, the parameters of the VAR(*p*, *G*) model can be computed iteratively from the following two steps. First, for the given estimate , the estimates are determined as the solution of the linear equation system:for *u*=1,…,*p* and *a*, *b*∈*V* such that *b*→*a* in *G* under the constraints that *Φ*_{ab}(*u*)=0 whenever the directed edge *b*→*a* is absent in the graph *G*. Second, the estimate is obtained by solving the nonlinear equation system,for *a*, *b*∈*V* such that *b*---*a* in *G*, where is the unconstrained estimate for the covariance matrix of the residuals . The second step corresponds to fitting a covariance model to the residuals , which is determined by the zero constraints on the covariance matrix *Σ*. An iterative algorithm for fitting such covariance models has been introduced by Drton & Richardson (2004). Since the solutions for both sets of equations are not independent, an iteration of the two steps is needed to obtain a joint solution. The fitting of graphical vector autoregressive models will be described in more detail in a forthcoming paper.

## Footnotes

One contribution of 21 to a Theme Issue ‘Multimodal neuroimaging of brain connectivity’.

- © 2005 The Royal Society