## Abstract

‘Repeated’ measurements for a trait and individual, taken along some continuous scale such as time, can be thought of as representing points on a curve, where both means and covariances along the trajectory can change, gradually and continually. Such traits are commonly referred to as ‘function-valued’ (FV) traits. This review shows that standard quantitative genetic concepts extend readily to FV traits, with individual statistics, such as estimated breeding values and selection response, replaced by corresponding curves, modelled by respective functions. Covariance functions are introduced as the FV equivalent to matrices of covariances.

Considering the class of functions represented by a regression on the continuous covariable, FV traits can be analysed within the linear mixed model framework commonly employed in quantitative genetics, giving rise to the so-called random regression model. Estimation of covariance functions, either indirectly from estimated covariances or directly from the data using restricted maximum likelihood or Bayesian analysis, is considered. It is shown that direct estimation of the leading principal components of covariance functions is feasible and advantageous. Extensions to multi-dimensional analyses are discussed.

## 1. Introduction

Quantitative genetics has typically been concerned with traits that, in essence, are points, i.e. measured at a specific location or time. Clearly, however, there are traits which are not only recorded repeatedly per individual, but are also considered to be changing, gradually, up or down, and continually, between measurements.

The most common of these are longitudinal data, i.e. records taken at successive times or ages, such as weights or body size measurements, and we will concentrate on this case in the following. Alternatively, these could be records taken along some spatial scale, or any other continuous covariable, referred to as ‘control variable’ in the following. For instance, we might want to relate changes in backfat thickness of farm animals to weight at recording, or assess the effect of ambient temperature on growth rates of caterpillars (Kingsolver & Gomulkiewicz 2003). In this case, our traits of interest are complete curves or trajectories rather than individual points, and we want to quantify genetic values and their dispersion structure among records for the complete range of the control variable.

Today, such traits are commonly referred to as function-valued (FV) traits, where FV emphasizes that the corresponding curves are described by a mathematical function (rather than implying some notion of functionality), and we will adopt this nomenclature. Earlier, Kirkpatrick *et al*. (1990) called these ‘infinite-dimensional’ traits, as they could, hypothetically, be measured infinitely many times along the continuous scale of interest. This paper reviews the role and analysis of FV traits in quantitative genetics.

## 2. Genetics and functional data analysis

Functional data analysis (FDA) is the branch of statistics concerned with the analysis of FV traits, albeit generally at a phenotypic level. This extends to more than one control variable, e.g. for two spatial coordinates, we might deal with images. Loosely speaking, FDA attempts to characterize patterns in variation of curves and extract common features. This might include horizontal and vertical shifts, i.e. variation in phase and amplitude of curves, and an investigation of the associated eigenfunctions or ‘eigenfaces’. A brief introduction to FDA is given by Ramsay & Silverman (2001), and details can be found in their books (Ramsay & Silverman 1997, 2002). Applications span a wide range of fields, e.g. biology, economics and psychology. For example, Rice & Silverman (1991) considered the analysis of human gait patterns, Spitzner *et al*. (2003) examined tactile perception and Ramsay (2000) presented an analysis of variation in handwriting.

In quantitative genetics, models and methods for the analysis of data that represent curves have drawn insights and stimuli from both evolutionary biology and animal breeding (Hill 1998).

### (a) Covariance functions

Kirkpatrick & Heckman (1989) and Kirkpatrick *et al*. (1990) introduced the infinite-dimensional model for traits measured repeatedly per individual, and proposed modelling the genetic covariances of trajectories through covariance functions.

Let the continuous scale along which repeated observations are taken be time *t*. As the name implies, a covariance function (CF) then gives the covariance between two measurements taken at times *t*_{j} and *t*_{j′} as a function of these times. CFs for FV traits can be interpreted as the equivalent to variance components for standard, non-FV traits. While CFs conceptually involve infinitely many parameters, Kirkpatrick & Heckman (1989) showed that good estimates can be obtained by approximating the CFs as the weighted sum of a relatively small number of basis functions, such as orthogonal polynomials. This yields reduced rank and smoothed estimates of the corresponding covariance matrices. Kirkpatrick *et al*. (1994) illustrated the scope for estimation of genetic CFs, for records taken at individual test days during the course of lactation of dairy cows.

CFs based on orthogonal polynomials or similar functions do not involve any assumptions about the shape of the trajectory or the dispersion structure other than those implicit in the choice of the order of approximation and, to a lesser extent, of the basis functions. In contrast, CFs commonly considered in other areas of statistics are often parametric and restrictive in the structure they permit, and are sometimes difficult to justify biologically. However, they generally involve only a small number of parameters.

For the parametric approach, a CF is usually decomposed into a variance function, describing changes in variance with *t*, and a correlation function. Commonly used correlation functions are stationary, i.e. correlations for records taken at a given lag, *l* = *t*_{j′} − *t*_{j}, are the same for all *t*_{j}. A well-known example is the auto-correlation structure with a single parameter *ρ*, which involves an exponential decay in correlation, *ρ*^{|*l*|} = e^{−*θ*|*l*|}, with *θ* = −log(*ρ*). Other functions with one or two parameters exist (e.g. Pletcher & Geyer 1999). Additional parameters can add more flexibility. For instance, a Box–Cox power transformation of *t* accommodates some non-stationarity (Núñez-Antón & Zimmerman 2000). Pletcher & Geyer (1999) presented an application of such models for the estimation of genetic CFs, referring to them as ‘character process’ models; see also Jaffrézic & Pletcher (2000) for a review.
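As an illustration of such a parametric CF — a minimal sketch, with the variance function and parameter values purely hypothetical — a stationary auto-correlation model can be combined with a variance function as follows:

```python
import numpy as np

def exp_corr(times, rho):
    """Stationary auto-correlation: corr(t_j, t_j') = rho**|t_j - t_j'|,
    equivalently exp(-theta*|lag|) with theta = -log(rho)."""
    lag = np.abs(np.subtract.outer(times, times))
    return rho ** lag

def parametric_cf(times, rho, var_fn):
    """Parametric covariance function: variance function x correlation."""
    sd = np.sqrt(var_fn(times))
    return np.outer(sd, sd) * exp_corr(times, rho)

times = np.array([1.0, 2.0, 4.0])
# Hypothetical variance function: variance increasing linearly with t.
C = parametric_cf(times, rho=0.8, var_fn=lambda t: 1.0 + 0.5 * t)
```

Because the correlation part is stationary, correlations depend only on the lag: the correlation between times 1 and 2 is 0.8, and between times 1 and 4 it is 0.8³, wherever those lags fall on the scale.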

### (b) Random regression models

In analyses of data from livestock improvement schemes, measurements along curves have often been treated simply as repeated measurement of the same trait, potentially applying some correction for time or age at recording. In addition, large ranges of observed times have been subdivided into intervals representing individual, correlated traits. A typical example is the weights of cattle, for which we usually distinguish between birth, weaning, yearling and later weights. A major drawback of these approaches is that they do not fully use the information provided by the times of recording, and thus do not account for any constraints they might impose on the covariance structure.

Alternatively, curves of interest, in particular growth and lactation curves, have been fitted at a phenotypic level, and the parameters of the curve have subsequently been analysed as new traits. Drawbacks of this approach are that estimates of curves may be influenced by systematic environmental effects not taken into account, and that constraints are imposed on the families of curves.

These considerations have motivated the random regression (RR) approach. Estimates of breeding values of livestock, i.e. of the additive genetic values of animals, are usually obtained from mixed model analyses. Fitting animals' additive genetic effects as random effects, analyses involve a linear regression of observations on indicator variables, which have values of unity or zero for animals that do or do not have a record, respectively, for the trait considered. This framework extends readily to other and multiple covariables (Henderson 1982), such as functions of *t*. This allows any trajectories that can be described in a regression equation to be modelled directly within the mixed model analysis. While the covariables are usually nonlinear functions of *t*, such as polynomials or splines, the model is linear in the regression coefficients to be estimated. Fitting sets of RR coefficients for each individual and random factor, e.g. additive genetic and permanent environmental effects, then yields estimates of the corresponding trajectories. This is the so-called RR model, also referred to as the random coefficient model in other areas of statistics.

Initial applications of the RR model were in genetic evaluation of dairy cows, using records from individual test days to model the lactation curve (Schaeffer & Dekkers 1994; Jamrozik *et al*. 1997). Over the last decade, the RR model has become a standard for analyses of repeated records from animal breeding schemes, such as test day, growth and feed intake data. A recent review of applications has been given by Schaeffer (2004).

Treating sets of regression coefficients as random effects implies a matrix of covariances among RR coefficients. In turn, these covariances determine the covariance structure among all records along the trajectory (Jamrozik & Schaeffer 1997), i.e. define a CF. In other words, fitting a RR model and estimating CFs approximated by a limited number of terms are equivalent, assuming the corresponding functions of *t* are fitted (Meyer & Hill 1997; Meyer 1998*b*). This can be exploited to estimate CFs efficiently (Meyer 1998*a*).
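This equivalence is easy to verify numerically. The sketch below — with a hypothetical coefficient matrix K and a quadratic Legendre basis — recovers the covariances among all records on a trajectory from the covariances among RR coefficients, as Sigma = Phi K Phi′:

```python
import numpy as np

# Hypothetical covariance matrix among k = 3 random-regression coefficients.
K = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.2],
              [0.1, 0.2, 0.4]])

# Ages of recording, mapped to [-1, 1], the domain of Legendre polynomials.
ages = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
x = 2.0 * (ages - ages.min()) / (ages.max() - ages.min()) - 1.0

# Phi holds the basis functions evaluated at each age (degrees 0..2).
Phi = np.polynomial.legendre.legvander(x, 2)

# Covariance function evaluated on the grid of ages: Sigma = Phi K Phi'.
Sigma = Phi @ K @ Phi.T
```

Every covariance between two ages derives from the same small K, so a rank-3 K yields a smoothed, reduced-rank 5×5 covariance matrix among the records.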

## 3. Modelling function-valued traits

Classical quantitative genetics assumes phenotypic observations are determined by the individuals' additive genetic effects as well as so-called permanent environmental effects, which may encompass any non-additive genetic effects. In addition, there may be systematic environmental effects, and a temporary environmental effect, specific to each record. The same assumptions are invoked for FV traits, but with most effects replaced by corresponding curves.

### (a) The ‘functional’ linear model

Let *y*_{ij} denote the *j*th record for individual *i* taken at time *t*_{j}. Assume the trait considered is continuous and has a multivariate normal distribution. A ‘functional’ model is then

*y*_{ij} = *M*(*t*_{j}) + *F*_{ij} + *g*_{i}(*t*_{j}) + *p*_{i}(*t*_{j}) + *ϵ*_{ij},  (3.1)

where *M*(*t*_{j}) represents any systematic, fixed effects affecting *y*_{ij} which change with *t*. This can simply be a mean trajectory, or represent sets of fixed curves nested within some classification. For instance, we might want to allow for different mean growth curves of male and female animals. *F*_{ij} denotes any other fixed effects, and *ϵ*_{ij} is the temporary environmental effect. Additive genetic and permanent environmental effects are given by random functions *g*_{i}(.) and *p*_{i}(.) evaluated at *t*_{j}, with the subscript *i* (omitted in the following) emphasizing that these are curves specific to individual *i*. The additive genetic and permanent environmental covariance functions, *G*(*t*_{j}, *t*_{j′}) and *P*(*t*_{j}, *t*_{j′}), respectively, describe the covariances between effects at different ages.

In principle, *M*(.), *g*(.) and *p*(.) can be chosen among many types of continuous functions, including, for instance, exponential functions such as Gompertz's or Brody's curves, which are commonly fitted to model growth curves, and different families of function can be used to model different trajectories in equation (3.1). However, we will restrict our discussion to functions that can be expressed as regression equations, i.e. which are linear in the coefficients of the curve. This assumes that *g*(.) and *p*(.) can be expressed as weighted sums of a set of so-called basis functions of *t*, e.g. polynomials.

Let *ϕ*_{r}(*t*_{j}) denote the *r*th basis function, evaluated for *t*_{j}, and *α*_{ir} and *γ*_{ir} be the regression coefficients for additive genetic and permanent environmental effects for the *i*th individual. Assume we fit *k*_{α} and *k*_{γ} regression coefficients, respectively. Our model then becomes

*y*_{ij} = *M*(*t*_{j}) + *F*_{ij} + Σ_{r=0}^{*k*_{α}−1} *α*_{ir}*ϕ*_{r}(*t*_{j}) + Σ_{r=0}^{*k*_{γ}−1} *γ*_{ir}*ϕ*_{r}(*t*_{j}) + *ϵ*_{ij},  (3.2)

where the first sum represents the additive genetic, *g*(.), and the second sum the permanent environmental, *p*(.), component of the phenotype. Omitting details, *M*(.), of course, can be expanded in the same fashion.

Theoretically, there are infinitely many terms in the expansion of *g*(.) or *p*(.). Hence, equation (3.2) represents a truncated, reduced-rank approximation. In practice, orders of fit *k*_{α} and *k*_{γ} are often substantially smaller than the number of distinct values of *t* observed.

### (b) Basis functions

A common choice for the basis functions are orthogonal polynomials, as advocated by Kirkpatrick & Heckman (1989). In particular, the Legendre polynomials used by Kirkpatrick *et al*. (1990) have been used extensively in RR analyses. While polynomials are flexible and have proven capable of modelling a range of covariance structures, they are also frequently associated with numerical problems, especially for high orders of fit. ‘Runge's phenomenon’ describes the observation that the error of a polynomial approximation to a curve can increase with the order of the polynomial fitted, with errors predominantly due to oscillations at the extremes (de Boor 2001). RR analyses fitting higher degree polynomials have frequently encountered erratic estimates of variance components at the highest ages, in particular for longitudinal data with few records at the extremes.

An alternative to high degree polynomials are ‘piecewise polynomials’, i.e. curves constructed from pieces of low degree polynomials which are joined smoothly at selected points, so-called splines. A particular type of spline is the B-spline (de Boor 2001). Rice & Wu (2001) suggested the use of the basis functions of a B-spline as covariables to model random curves in mixed model analyses. Although B-splines are not automatically orthogonal, they have good numerical properties that make them attractive (Ruppert *et al*. 2003). While cubic smoothing splines, following the lead of White *et al*. (1999), have been used in a number of studies (e.g. White *et al*. 1999; Huisman *et al*. 2002; DeGroot *et al*. 2003), use of B-splines in RR analyses has so far been limited (Torres & Quaas 2001; Meyer 2005).
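A B-spline basis for use as covariables is straightforward to set up. The sketch below (using `scipy.interpolate.BSpline` with an illustrative knot vector; the knot positions are purely hypothetical) evaluates each basis function by giving it a unit coefficient:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at the points x.

    `knots` is an 'open' knot vector (boundary knots repeated degree+1
    times); returns an array of shape (len(x), n_basis)."""
    n_basis = len(knots) - degree - 1
    basis = np.empty((len(x), n_basis))
    for r in range(n_basis):
        coeff = np.zeros(n_basis)
        coeff[r] = 1.0          # pick out the r-th basis function
        basis[:, r] = BSpline(knots, coeff, degree)(x)
    return basis

# Cubic splines on [0, 10] with a single interior knot at t = 5.
knots = np.array([0., 0., 0., 0., 5., 10., 10., 10., 10.])
t = np.linspace(0.5, 9.5, 10)
B = bspline_basis(t, knots)
```

Each basis function is non-zero over at most degree+1 knot intervals, which keeps design matrices sparse; within the domain the basis functions are non-negative and sum to one (partition of unity).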

### (c) Covariance structure

Let **α**_{i}, of length *k*_{α}, and **γ**_{i}, of length *k*_{γ}, denote the vectors of RR coefficients for individual *i*. Assuming a multivariate normal distribution for the data,

**α**_{i} ∼ *N*(**0**, **K**_{α})  and  **γ**_{i} ∼ *N*(**0**, **K**_{γ}),  (3.3)

where **K**_{α} and **K**_{γ} are the matrices of covariances between RR coefficients, with elements *k*^{α}_{rs} and *k*^{γ}_{rs}, respectively.

Temporary environmental effects are assumed to be independently distributed, but variances are allowed to change with *t*. This is modelled through a variance function, i.e. Var(*ϵ*_{ij}) = *e*(*t*_{j}). In the simplest case, *e*(.) could be a step function. Alternatively, polynomials or spline functions of *t* can be used to model heteroscedasticity of residual variances. Often, logarithmic values of variances are modelled in this way to account for ‘scale effects’, i.e. dependencies between means and variances.

The covariances between RR coefficients determine the covariance between two records for animal *i*, taken at times *t*_{j} and *t*_{j′},

Cov(*y*_{ij}, *y*_{ij′}) = Σ_{r=0}^{*k*_{α}−1} Σ_{s=0}^{*k*_{α}−1} *ϕ*_{r}(*t*_{j})*ϕ*_{s}(*t*_{j′})*k*^{α}_{rs} + Σ_{r=0}^{*k*_{γ}−1} Σ_{s=0}^{*k*_{γ}−1} *ϕ*_{r}(*t*_{j})*ϕ*_{s}(*t*_{j′})*k*^{γ}_{rs} + *δ*_{jj′}*e*(*t*_{j}),  (3.4)

with *δ*_{jj′}=1 for *j*=*j*′, and zero otherwise. The first part on the right-hand side of equation (3.4) is the additive genetic covariance function *G*(*t*_{j}, *t*_{j′}), evaluated for *t*_{j} and *t*_{j′}, and the second part is its permanent environmental counterpart, *P*(*t*_{j}, *t*_{j′}).

### (d) Matrix formulation

For *N* animals with *n*_{i} records each, the model is

**y** = **Xb** + **Zα** + **Wγ** + **ϵ**,  (3.5)

with **y** and **ϵ** the vectors of observations and residuals, respectively, **α** and **γ** the vectors of genetic and permanent environmental RR coefficients, and **b** the vector of fixed effects fitted. Expectations of these vectors are *E*[**α**]=**0**, *E*[**γ**]=**0**, *E*[**ϵ**]=**0** and *E*[**y**]=**Xb**. **X**, **Z** and **W** are the design or incidence matrices. For effects modelled as trajectories, the design matrices contain the basis functions evaluated for the time pertaining to each record, *ϕ*_{r}(*t*_{j}), i.e. are considerably denser than their counterparts in ‘standard’ analyses, where each row of **Z** or **W** would only have a single non-zero element of unity. The model accommodates any pattern of values of *t*_{j}, i.e. individuals may have different numbers of records and records may be taken at arbitrary times.

Generally, analyses include genetic effects for animals which are parents only, i.e. do not have any records. These are included in **α**, so that **α** has length *N*_{A}*k*_{α} with *N*_{A}≥*N* the total number of animals, but corresponding rows of **Z** are zero.

Random vectors **α**, **γ** and **ϵ** are defined to be uncorrelated, with covariance matrices

Var(**α**) = **A**⊗**K**_{α},  Var(**γ**) = **I**_{N}⊗**K**_{γ}  and  Var(**ϵ**) = **R**,

where **A** is the numerator relationship matrix between animals, **I**_{N} an identity matrix of size *N*, and ‘⊗’ denotes the direct matrix product. Diagonal elements of **A** are unity, augmented by the inbreeding coefficient for individuals, and off-diagonal elements give the degree of relationship, e.g. one-half for parents and their offspring or one-fourth for half-sibs. This gives variance of **y**

Var(**y**) = **V** = **Z**(**A**⊗**K**_{α})**Z**′ + **W**(**I**_{N}⊗**K**_{γ})**W**′ + **R**.  (3.6)

## 4. Estimation of breeding values

The central tasks of quantitative genetic analyses, estimation of breeding values and estimation of covariance components and genetic parameters, generally invoke the linear mixed model. The so-called mixed model equations (MME) for the model given by equation (3.5) are

[**X**′**R**^{−1}**X**, **X**′**R**^{−1}**Z**, **X**′**R**^{−1}**W**; **Z**′**R**^{−1}**X**, **Z**′**R**^{−1}**Z** + **A**^{−1}⊗**K**_{α}^{−1}, **Z**′**R**^{−1}**W**; **W**′**R**^{−1}**X**, **W**′**R**^{−1}**Z**, **W**′**R**^{−1}**W** + **I**_{N}⊗**K**_{γ}^{−1}] [**b̂**; **α̂**; **γ̂**] = [**X**′**R**^{−1}**y**; **Z**′**R**^{−1}**y**; **W**′**R**^{−1}**y**].  (4.1)

Assuming variances are known, solutions to equation (4.1) are generalized least-squares (LSQ) estimates of the fixed effects **b**, and best linear unbiased predictors of the random effects **α** and **γ** (Henderson 1973). For national genetic evaluation schemes, especially multi-trait analyses for dairy or beef cattle, the MME can comprise millions of equations. Hence, equation (4.1) is generally solved iteratively. Specialized strategies suitable for large-scale RR test day models have been discussed by Lidauer *et al*. (1999), Gengler *et al*. (2000) and Jamrozik & Schaeffer (2000).

The MME (equation (4.1)) play a central role in statistical analyses in quantitative genetics, and in particular of animal breeding data. Standard generalized LSQ analyses would require inverting the variance matrix **V**, but only the inverses **R**^{−1} and **A**^{−1} are needed in the MME. While **R** has size equal to the number of records, assuming zero environmental covariances between records on different animals, it is generally block-diagonal for animals, if not, as in our case, diagonal. **R** is thus readily invertible, and the data part of the MME can be accumulated considering records for one animal at a time. **A** has size proportional to the number of animals in the analysis (including parents without records), but **A**^{−1} can be set up directly and easily from a list of pedigree records (Henderson 1976; Quaas 1976).
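Henderson's rules make this construction concrete. The sketch below implements the simple version that ignores inbreeding (a toy example, not the full algorithm with inbreeding coefficients), for a pedigree coded with 0 for unknown parents:

```python
import numpy as np

def a_inverse(pedigree):
    """Henderson's rules for the inverse numerator relationship matrix.

    pedigree: list of (animal, sire, dam) with 1-based ids and 0 for an
    unknown parent; animals are coded so that parents precede offspring.
    Ignores inbreeding (the simple, non-inbred version of the rules)."""
    n = len(pedigree)
    Ainv = np.zeros((n, n))
    for animal, sire, dam in pedigree:
        parents = [p for p in (sire, dam) if p != 0]
        # Mendelian sampling variance d = 1, 3/4 or 1/2 for 0, 1 or 2
        # known parents; the contributions below use alpha = 1/d.
        alpha = 1.0 / (1.0 - 0.25 * len(parents))
        i = animal - 1
        Ainv[i, i] += alpha
        for p in parents:
            Ainv[i, p - 1] -= alpha / 2.0
            Ainv[p - 1, i] -= alpha / 2.0
        for p in parents:
            for q in parents:
                Ainv[p - 1, q - 1] += alpha / 4.0
    return Ainv

# Two unrelated parents (1, 2) and their offspring (3).
ped = [(1, 0, 0), (2, 0, 0), (3, 1, 2)]
Ainv = a_inverse(ped)
```

For this toy pedigree, inverting the result recovers the familiar relationship matrix with off-diagonals of one-half between parent and offspring.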

Estimated regression coefficients *α̂*_{ir} for animal *i* define its genetic merit along the complete trajectory considered. In some cases, we may be interested in estimated breeding values for target times *t*_{k}. These are obtained simply by evaluating the regression equation

*ĝ*(*t*_{k}) = Σ_{r=0}^{*k*_{α}−1} *α̂*_{ir}*ϕ*_{r}(*t*_{k}).  (4.2)

Other functions of the estimated curve may be of interest. For example, integrals of estimated lactation curves provide an estimate of total lactation genetic merit for dairy cows. For growth curves, first and second derivatives give velocity and acceleration, respectively, and turning points of estimated curves may help distinguish between early and late maturing animals.

Sampling covariances among estimated regression coefficients are given by the inverse of the coefficient matrix in equation (4.1). For most genetic evaluation schemes, however, this matrix is too large to be inverted. Approximation procedures for the inverse and ‘accuracies’ of estimated breeding values for RR analyses have been described by Jamrozik *et al*. (2000) and Tier & Meyer (2004).

## 5. Estimation of covariance functions: I. ‘Indirectly’

Estimates of genetic and environmental CFs can be obtained either directly from the data (see §6), or indirectly. Indirect estimation is, in essence, a two-step procedure. It requires a matrix of covariances for a set of points along the trajectory, to be estimated in a first step. Once such estimates of covariances are available, estimates of CFs for different choices of basis functions and orders of fit can be obtained and compared quickly.

However, in estimating CFs from covariance matrices, usually only one source of variation is considered. This ignores sampling correlations, for example, between genetic and residual covariances. Hence, results tend to differ somewhat from those obtained in corresponding analyses estimating all CFs simultaneously from the data (e.g. van der Werf *et al*. 1998; Kettunen *et al*. 2000), as the latter provide greater scope for a repartitioning of variation to or from other random effects when order(s) of fit are varied.

Let **S** denote the matrix of covariance estimates for a grid of *s* values *t*_{j}, and **Φ**_{S}, of size *s*×*k*, the matrix of basis functions evaluated for the *t*_{j}, with *k*≤*s* the order of fit for the CF. We then assume

**S** = **Φ**_{S}**K****Φ**′_{S} + **O**,  (5.1)

with **K** the matrix of coefficients of the CF, and **O** a matrix of residuals.

### (a) Least-squares

Kirkpatrick *et al*. (1990) described a weighted LSQ procedure to estimate the coefficients of a genetic CF from known genetic covariances. Kirkpatrick *et al*. (1994) proposed a modified scheme for a phenotypic CF, which allowed temporary environmental variances to be separated from permanent environmental variances. Other studies have employed simple LSQ estimation (e.g. van der Werf *et al*. 1998; Veerkamp & Goddard 1998).

Define **X**_{S} as the *s*(*s*+1)/2 × *k*(*k*+1)/2 matrix relating the *jl*th element of **S** to the *uv*th element of **K**, with elements given by products of the basis functions, *ϕ*_{u}(*t*_{j})*ϕ*_{v}(*t*_{l}) + (1−*δ*_{uv})*ϕ*_{v}(*t*_{j})*ϕ*_{u}(*t*_{l}), and *δ*_{uv} as defined above. Weighted LSQ estimates of the coefficients of the CF are then obtained as

vech(**K̂**) = (**X**′_{S}**Ω**^{−1}**X**_{S})^{−1}**X**′_{S}**Ω**^{−1} vech(**S**),  (5.2)

where ‘vech’ is the operator that stacks the columns of the lower triangle of a symmetric matrix into a vector (e.g. Harville 1997), and **Ω** is a matrix of weights.
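A closely related unweighted fit — minimizing the Frobenius norm of the residual matrix **O** rather than the vech form with explicit weights — has the closed 'sandwich' form K̂ = (Φ′Φ)⁻¹Φ′SΦ(Φ′Φ)⁻¹, which the sketch below uses with hypothetical values:

```python
import numpy as np

def fit_cf_ls(S, Phi):
    """Unweighted least-squares fit of S ~ Phi K Phi' (eq. (5.1) with
    the residual matrix O minimized in the Frobenius norm)."""
    PtP_inv = np.linalg.inv(Phi.T @ Phi)
    return PtP_inv @ Phi.T @ S @ Phi @ PtP_inv

# Grid of s = 4 ages and a k = 2 (linear) Legendre fit.
ages = np.array([1.0, 2.0, 3.0, 4.0])
x = 2.0 * (ages - 1.0) / 3.0 - 1.0
Phi = np.polynomial.legendre.legvander(x, 1)

# An 'estimated' covariance matrix for the grid (hypothetical values,
# constructed here so that the fit can be checked exactly).
K_true = np.array([[1.0, 0.3], [0.3, 0.5]])
S = Phi @ K_true @ Phi.T
K_hat = fit_cf_ls(S, Phi)
```

When S is exactly of the assumed form and Φ has full column rank, the fit recovers the coefficient matrix without error; with noisy estimates of S, the weighting matrix **Ω** of equation (5.2) becomes important.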

### (b) Maximum likelihood

A drawback of the LSQ approach is that it does not guarantee the estimated CFs to be positive semi-definite, i.e. to yield estimates within the parameter space (Kirkpatrick *et al*. 1990). This can be overcome in a maximum likelihood framework, as proposed by Mäntysaari (1999).

In essence, this involves treating the estimated covariances (**S**) as if they were cross-products arising from data, and equating them to their ‘expected’ values, shown above (equation (5.1)), comprising sums of products of coefficients of the CF and basis functions evaluated at the grid points. Let **F** = *E*[**S**]. The log likelihood (log ℒ) to be maximized is then (Thompson 1976)

log ℒ = −(*d*/2)[log|**F**| + tr(**F**^{−1}**S**)] + const,  (5.3)

where *d* are the degrees of freedom associated with **S**. Thompson (1976) outlined an iterative procedure to maximize equation (5.3). Let **θ** denote the vector of parameters to be estimated, and **θ̂**^{(q)} the estimate from the *q*th iterate,

**θ̂**^{(q+1)} = **θ̂**^{(q)} − **B**(**θ̂**^{(q)})^{−1}**d**(**θ̂**^{(q)}),  (5.4)

where **B**(**θ̂**^{(q)}) is the matrix of expected values of second derivatives and **d**(**θ̂**^{(q)}) the vector of the data part of the first derivatives of log ℒ with respect to the elements of **θ**, both evaluated for **F** as given by **θ̂**^{(q)}. Alternatively, the expectation–maximization approach, used to estimate permanent environmental CFs and temporary environmental variances from matrices of residual covariances (Mäntysaari 1999; Emmerling *et al*. 2002; Koivula *et al*. 2004), may be useful to estimate genetic CFs where we impose a structure on **K**, such as considering the largest eigenvalues only (see §7).
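The likelihood itself is cheap to evaluate. A minimal sketch of a Wishart-type log likelihood of this form, up to a constant and with hypothetical values:

```python
import numpy as np

def wishart_logl(S, F, d):
    """Log likelihood of an estimated covariance matrix S with expected
    value F and d degrees of freedom, up to a constant (cf. eq. (5.3))."""
    sign, logdet = np.linalg.slogdet(F)
    assert sign > 0, "F must be positive definite"
    return -0.5 * d * (logdet + np.trace(np.linalg.solve(F, S)))

S = np.array([[2.0, 0.4], [0.4, 1.0]])   # 'estimated' covariances
```

As expected, a model whose expected value reproduces S exactly attains a higher likelihood than one that does not, which is what the iterative scheme exploits.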

### (c) Constructed CF

Estimation of CFs from covariance matrices has also been found useful for cases where we might have reliable estimates for some variance components, but have to rely on literature values or assumed relationships (such as the auto-correlation function described above) for estimates of some correlations. Misztal *et al*. (2000) outlined an approach to construct matrices of covariances **S** for this scenario, proposing linear interpolation or variance functions of standard shapes to derive missing components, and to obtain LSQ estimates of CFs. An application for CFs for growth of beef cattle has been presented by Legarra *et al*. (2004).

## 6. Estimation of covariance functions: II. ‘Directly’

Alternatively, CFs can be estimated in a single step, directly from the data. As shown above, the coefficients of CFs are equal to the covariances between RR coefficients in a functional, linear mixed model. Hence, standard mixed-model-based variance component procedures can be used to estimate CFs directly from the data.

### (a) Restricted maximum likelihood

Early applications of restricted maximum likelihood (REML) used a simple reparametrization of standard, multivariate models to estimate CFs (Meyer & Hill 1997). This required MME of size proportional to the number of distinct values of *t* to be manipulated, and thus did not accommodate data from livestock improvement schemes involving records at ‘all ages’. Hence, today the coefficients of CFs are predominantly estimated as the covariances among RR coefficients.

REML estimation maximizes the part of the likelihood independent of fixed effects. Assuming unstructured covariance matrices, i.e. imposing no restrictions on **K**_{α} and **K**_{γ} other than that they should be positive semi-definite and symmetric, the vector of parameters to be estimated, **θ**, comprises the *k*_{α}(*k*_{α}+1)/2 unique elements of **K**_{α}, the *k*_{γ}(*k*_{γ}+1)/2 elements of **K**_{γ}, and *k*_{ϵ} parameters modelling variances due to temporary environmental effects. The REML log likelihood (log ℒ) is then given by

−2 log ℒ = log|**V**| + log|**X**′**V**^{−1}**X**| + **y**′**Py** + const
         = *k*_{α} log|**A**| + *N*_{A} log|**K**_{α}| + *N* log|**K**_{γ}| + log|**R**| + log|**C**| + **y**′**Py** + const,  (6.1)

where **C** is the coefficient matrix in the MME (see equation (4.1)), and **P** a projection matrix, **P** = **V**^{−1} − **V**^{−1}**X**(**X**′**V**^{−1}**X**)^{−1}**X**′**V**^{−1}, so that **Py** = **V**^{−1}(**y** − **Xb̂**). The first line of equation (6.1) is the standard form of log ℒ (e.g. Harville 1977), expressed in terms of the phenotypic variance matrix **V**. The formulation of the second line, however, is computationally advantageous.

**K**_{α} and **K**_{γ} are small, of size *k*_{α}×*k*_{α} and *k*_{γ}×*k*_{γ}, respectively, and their determinants are readily evaluated. Log|**A**| arises as a by-product when setting up **A**^{−1} from a list of pedigree information. The last two terms, **y**′**Py** and log|**C**|, can be computed simultaneously through a Cholesky factorization of **C** augmented by the vector of right-hand sides in equation (4.1), **r**, and its transpose, and the residual sum of squares (Graser *et al*. 1987),

**M** = [**C**, **r**; **r**′, **y**′**R**^{−1}**y**].  (6.2)

For **M** = **LL**′, of size *M*×*M*, and *l*_{ii} the *i*th diagonal element of **L**,

**y**′**Py** = *l*^{2}_{MM}  and  log|**C**| = 2 Σ_{i=1}^{*M*−1} log *l*_{ii}.  (6.3)

REML estimates of **K**_{α} and **K**_{γ} have to be positive semi-definite, and estimates of temporary environmental variances cannot be negative (Harville 1977). Hence, maximization of log ℒ, as given in equation (6.1), represents a constrained optimization problem. A reparametrization can be used to remove constraints on the parameter space. For instance, instead of estimating the unique elements of a covariance matrix **K**, we can estimate the elements of its Cholesky factor, taking logarithmic values of the diagonals (Meyer & Smith 1996; Pinheiro & Bates 1996). This not only allows use of unconstrained maximization procedures, but can also improve rates of convergence in an iterative estimation scheme (Groeneveld 1994).

The Newton–Raphson algorithm, with various modifications, is frequently applied in maximum likelihood estimation (Jennrich & Sampson 1976). It is an iterative scheme, which requires first and second derivatives of log ℒ. For Fisher's ‘method of scoring’, the latter are replaced by their expected values, but, in most quantitative genetic applications, both are difficult to compute. Hence, today the so-called ‘average information’ (AI) algorithm (Gilmour *et al*. 1995), which approximates second derivatives with the average of observed and expected values, is generally preferred. These averages are equivalent to second derivatives of the ‘data part’ of log ℒ, **y**′**Py**. They are relatively easy to calculate, while yielding convergence rates similar to those using second derivatives or their expectations; see Hofer (1998) for a review.

Algorithms for multivariate analyses via AI REML are readily adapted to the estimation of covariances among RR coefficients. Jensen *et al*. (1997) employ sparse matrix inversion of **C** to obtain first derivatives of log ℒ. An alternative, relying on automatic differentiation of the Cholesky factor of **M** (see equation (6.2); Smith 1995), is outlined by Meyer (1998*a*). This requires specification of the first derivatives of **M** with respect to the covariance components to be estimated, which have a simple form. An extension, accommodating variance functions to model changes in temporary environmental variances with *t* and parametric correlation structures for covariances among permanent environmental effects, is given by Meyer (2001*b*). Other authors have described expectation–maximization type algorithms, and their extensions to improve convergence, for RR models (Gengler *et al*. 1999; Foulley & van Dyk 2000; Foulley *et al*. 2000).
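The Graser *et al*. (1987) device of equations (6.2) and (6.3) is a linear-algebra identity that is easy to verify: factor the augmented matrix once, then read off **y**′**Py** as the squared last diagonal of the Cholesky factor and log|**C**| from the remaining diagonals. A sketch with a hypothetical coefficient matrix and right-hand side:

```python
import numpy as np

# A small symmetric positive-definite 'coefficient matrix' C, right-hand
# side r, and weighted sum of squares c = y'R^{-1}y (hypothetical values).
C = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
r = np.array([1.0, 2.0, 0.5])
c = 10.0

# Augmented matrix M of eq. (6.2) and its Cholesky factor L.
M = np.block([[C, r[:, None]], [r[None, :], np.array([[c]])]])
L = np.linalg.cholesky(M)

# Eq. (6.3): the squared last diagonal of L gives the residual sum of
# squares term, the other diagonals give log|C|.
yPy = L[-1, -1] ** 2
logdetC = 2.0 * np.log(np.diag(L)[:-1]).sum()
```

The last pivot equals c − r′C⁻¹r, i.e. the sum of squares after absorbing all effects, so both likelihood terms come from a single factorization.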

Most software packages available for mixed model analyses allow RR models to be fitted, and the corresponding covariance matrices to be estimated.

### (b) Bayesian analysis

Even for comparatively low orders of fit, RR models involve substantial numbers of covariance components to be estimated. Typically, large datasets are required to estimate so many parameters with sufficient accuracy. However, high computational requirements limit the size of datasets and models that can be analysed using REML.

Bayesian estimation has become a standard for quantitative genetic analyses (Sorensen & Gianola 2002). In particular, schemes sampling from fully conditional posterior distributions of the parameters of interest are widely used. These are computationally easy to implement. While computing times required to sample long or multiple Markov chains may be large, memory requirements are considerably smaller than for corresponding REML analyses. This facilitates analyses of much larger datasets. In addition, as estimates of complete posterior distributions are obtained, sampling properties of estimates or functions thereof are readily examined.

RR analyses using a Gibbs sampler were initially applied to the analysis of test day records of dairy cows. Full details are given by Jamrozik & Schaeffer (1997) and Rekaya *et al*. (1999). Early models fitted a simple step function for temporary environmental variances. Variance functions (e.g. Schnyder *et al*. 2001) or change-point techniques (Lopez-Romero *et al*. 2004) were used subsequently, but required Metropolis–Hastings steps (Sorensen & Gianola 2002). Jamrozik (2004) discusses implementation issues of Markov chain Monte Carlo methods for RR analyses.

## 7. Eigenfunctions and beyond

CFs can involve a large number of parameters. Hence, estimation can be computationally demanding, and estimates can be subject to large sampling variances. This has motivated the application of principal component methods to the analysis of FV traits (Kirkpatrick & Meyer 2004).

An eigenvalue decomposition of a covariance matrix **K** gives the matrix as the product of the matrix of eigenvectors **E** and the diagonal matrix of eigenvalues **Λ**,

**K**=**EΛE**′.(7.1)

**E** is orthonormal, i.e. **EE**′=**I**. The columns of **E**, or eigenvectors, define linear transformations of the original variables to new variables, which are uncorrelated and have variances as given by the corresponding eigenvalues. These variables are commonly referred to as ‘principal components’ (PC). Eigenvectors and -values are usually ordered in descending order of the latter, and the *i*th PC explains maximum variation given PCs 1,…,*i*−1 (e.g. Jolliffe 1986).
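For illustration, the decomposition (7.1) and the properties of the resulting PCs can be verified numerically; the covariance matrix below is an arbitrary example:

```python
import numpy as np

# An arbitrary 3x3 covariance matrix K (illustrative values only)
K = np.array([[4.0, 2.0, 0.5],
              [2.0, 3.0, 1.0],
              [0.5, 1.0, 2.0]])

lam, E = np.linalg.eigh(K)          # eigenvalues in ascending order
order = np.argsort(lam)[::-1]       # re-order: descending eigenvalues
lam, E = lam[order], E[:, order]

# K = E Lambda E' (equation (7.1)) and E is orthonormal: E E' = I
assert np.allclose(E @ np.diag(lam) @ E.T, K)
assert np.allclose(E @ E.T, np.eye(3))

# PCs of correlated data x are z = E'x: uncorrelated, variances = eigenvalues
rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(3), K, size=100_000)
z = x @ E
print(np.round(np.cov(z.T), 1))     # approximately diag(lam)
```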

The FV equivalents of eigenvectors are eigenfunctions, for curves, or eigenfaces, for images. Eigenfunctions are useful in visualizing and analysing patterns of variation of FV traits. Like CFs, eigenfunctions theoretically have infinite dimension, but are, in practice, approximated as the weighted sum of a limited number of basis functions.

If the basis functions are orthogonal, estimates of the eigenfunctions of a CF can be obtained directly from the eigenvectors of the coefficient matrix of the CF (Kirkpatrick & Heckman 1989). Let **K** of size *k*×*k* be the coefficient matrix of a CF. The estimates of eigenfunctions are then

*κ*_{i}(*t*)=∑_{r=1}^{k}*e*_{ir}*ϕ*_{r}(*t*),(7.2)

where *e*_{ir} are the elements of the *i*th eigenvector of **K** and *ϕ*_{r}(*t*) are the basis functions. Evaluating equation (7.2) for all values of *t* yields a curve, the eigenfunction. Similarly, the eigenvalues of **K** provide estimates of the eigenvalues of the CF. For non-orthogonal basis functions, we need to adjust for the basis (James *et al*. 2000). Alternatively, we can extract the eigenfunctions and -values numerically, by evaluating the CF for a fine grid of values of *t*, and calculating the eigendecomposition of the resulting covariance matrix (Rice & Wu 2001).
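Both routes, equation (7.2) with an orthonormal basis and the grid-based numerical decomposition, can be sketched as follows; the normalized Legendre basis and the coefficient matrix are illustrative choices:

```python
import numpy as np
from numpy.polynomial import legendre

def basis(t, k):
    """Normalized Legendre polynomials: orthonormal on [-1, 1]."""
    t = np.asarray(t)
    Phi = np.empty((t.size, k))
    for r in range(k):
        coef = np.zeros(r + 1); coef[r] = 1.0
        Phi[:, r] = np.sqrt((2 * r + 1) / 2) * legendre.legval(t, coef)
    return Phi

k = 3
K = np.array([[3.0, 1.0, 0.2],      # illustrative CF coefficient matrix
              [1.0, 2.0, 0.4],
              [0.2, 0.4, 0.5]])
lam, E = np.linalg.eigh(K)
lam, E = lam[::-1], E[:, ::-1]      # descending order

# Route 1 (equation (7.2)): i-th eigenfunction is sum_r e_ir * phi_r(t)
n = 500
dt = 2.0 / n
t = -1.0 + dt * (np.arange(n) + 0.5)   # midpoint grid on [-1, 1]
Phi = basis(t, k)
kappa1 = Phi @ E[:, 0]                 # leading eigenfunction on the grid

# Route 2 (Rice & Wu): evaluate the CF on the grid and decompose numerically
G = Phi @ K @ Phi.T                    # CF values K(t_i, t_j)
lam_g, V = np.linalg.eigh(G)
lam_g, V = lam_g[::-1], V[:, ::-1]

# Grid eigenvalues, scaled by the grid spacing, recover the CF eigenvalues
print(np.round(lam_g[:k] * dt, 3), np.round(lam, 3))
# Leading grid eigenvector is proportional to the evaluated eigenfunction
cos = abs(V[:, 0] @ kappa1) / (np.linalg.norm(V[:, 0]) * np.linalg.norm(kappa1))
print(round(cos, 4))                   # close to 1
```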

Eigenfunctions are one of the main characteristics of CFs examined in FDA. As discussed below (§8), eigenfunctions of genetic CFs provide an insight into the expected transformation of trajectories when subject to selection.

### (a) Truncated expansion

Principal component analysis is widely employed as a dimension-reduction technique. Considering only the first *m* PCs with the largest eigenvalues, we reduce the number of variables, while still capturing a maximum of variation. If the eigenvalues for PCs *m*+1,…,*k* are small, loss of information can be negligible. The same argument applies for FV traits. Moreover, we can choose to use the eigenfunctions as basis functions, and truncate our expansion by considering the most variable PCs only (Kirkpatrick & Meyer 2004). This is the Karhunen–Loève expansion favoured in FDA (Ramsay & Silverman 1997). As it is based on PCs of the CF, it is the expansion which, for a given number of terms, minimizes the residual.
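The effect of truncation is easily quantified: a rank-*m* approximation built from the leading eigenvalues and -vectors loses exactly the variance carried by the trailing eigenvalues. A small sketch, with a made-up coefficient matrix whose first eigenvalue dominates:

```python
import numpy as np

# Coefficient matrix with one dominant eigenvalue (illustrative values)
K = np.array([[1360.8,  30.0,   5.0],
              [  30.0,  24.5,   2.0],
              [   5.0,   2.0,   1.5]])
lam, E = np.linalg.eigh(K)
lam, E = lam[::-1], E[:, ::-1]                 # descending eigenvalues

results = []
for m in (1, 2, 3):
    K_m = (E[:, :m] * lam[:m]) @ E[:, :m].T    # rank-m approximation
    explained = lam[:m].sum() / lam.sum()      # fraction of variance retained
    err = np.linalg.norm(K - K_m)              # Frobenius norm of the residual
    results.append((m, explained, err))
    print(m, round(explained, 4), round(err, 2))
```

With a dominant leading eigenvalue, the first PC already accounts for well over 95% of the variation, which is why truncation can be so effective.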

Let *κ*_{αr}(*t*) and *κ*_{γr}(*t*) denote the *r*th eigenfunctions of the additive genetic and permanent environmental CFs, 𝒦_{α} and 𝒦_{γ}, respectively. The functional, linear model equation (3.2) then becomes

*y*_{i}(*t*)=*μ*(*t*)+∑_{r=1}^{m_{α}}*α**_{ir}*κ*_{αr}(*t*)+∑_{r=1}^{m_{γ}}*γ**_{ir}*κ*_{γr}(*t*)+*ϵ*_{i}(*t*),(7.3)

with *α**_{ir} and *γ**_{ir} the *r*th additive genetic value and permanent environmental effect for animal *i*. The corresponding CFs are directly given by the eigenfunctions and -values,

𝒦_{α}(*s*, *t*)=∑_{r=1}^{m_{α}}*λ*_{αr}*κ*_{αr}(*s*)*κ*_{αr}(*t*) and 𝒦_{γ}(*s*, *t*)=∑_{r=1}^{m_{γ}}*λ*_{γr}*κ*_{γr}(*s*)*κ*_{γr}(*t*).(7.4)

As emphasized by Kirkpatrick & Meyer (2004), there are two levels of truncation. First, we approximate the, in principle infinite-dimensional, CF and eigenfunctions by the first *m* eigenfunctions only. Second, we estimate each eigenfunction using a limited number of terms, *k*≥*m*, as shown in equation (7.2).

### (b) Estimating eigenfunctions directly

Genetic PCs or eigenfunctions have usually been estimated by first obtaining full rank estimates of the genetic covariance matrix, and then carrying out an eigenvalue decomposition of the matrix. Alternatively, PCs have been determined at the phenotypic level, and genetic parameters of the new variables have been estimated (e.g. Chase *et al*. 2002). A better approach would be to estimate the genetic eigenfunctions directly, and, at the same time, to restrict estimation to the most important eigenfunctions (Kirkpatrick & Meyer 2004). Fortunately, this can readily be done within the mixed-model framework described above. In essence, it only requires a simple reparametrization of the functional linear model given by equation (7.3).

Let **Z***=**Z**(**I**⊗**E**_{α}), **W***=**W**(**I**⊗**E**_{γ}), **α***=(**I**⊗**E**_{α}′)**α** and **γ***=(**I**⊗**E**_{γ}′)**γ**, where **E**_{α} and **E**_{γ} are the matrices of eigenvectors of **K**_{α} and **K**_{γ}, respectively. These matrices, **E**_{α} and **E**_{γ}, are orthonormal, so at full rank

**y**=**Xβ**+**Z*** **α***+**W*** **γ***+**ϵ**(7.5)

is an equivalent model to equation (3.5). The MME for equation (7.5) are of the same form as given in equation (4.1), but with **Z**, **W**, **α** and **γ** replaced by **Z***, **W***, **α*** and **γ***, respectively. Furthermore, we have diagonal matrices of eigenvalues **Λ**_{α} and **Λ**_{γ}, instead of **K**_{α} and **K**_{γ}.

Elements of the vector **α***, *α**_{ir}, are the additive genetic effects for the eigenfunctions. This implies that estimates of the *α**_{ir} can be used to select animals for desired transformations in the *r*th eigenfunction.

If we truncate **E**_{α} to the first *m*_{α} columns only, **α*** has only *m*_{α} elements for each animal, corresponding to the genetic effects for the leading eigenfunctions. Moreover, the number of equations in the MME can be greatly reduced, as the number of genetic effects fitted is now proportional to *m*_{α} rather than *k*_{α}. Analogously, we can restrict the order of fit for permanent environmental effects to *m*_{γ}<*k*_{γ}.

### (c) Reduced rank covariance functions

Considering the leading eigenfunctions only, the corresponding CFs (equation (7.4)) have reduced rank, *m*_{α}<*k*_{α} and *m*_{γ}<*k*_{γ}. We can estimate the CFs, i.e. the coefficients of the eigenfunctions and the corresponding eigenvalues, based on the reparametrized mixed model equation (7.5) in the same way as described in §6 for the general RR model. Only REML estimation is considered here.

For a trajectory modelled by *k* RR coefficients, the full-rank coefficient matrix has *k*(*k*+1)/2 covariances among RR to be estimated. If we fit the first *m* eigenfunctions only, the number of parameters describing the CF is reduced to *m*(2*k*−*m*+1)/2. While there are *km* elements in the first *m* eigenfunctions and *m* eigenvalues, the number of parameters is less, as **E** is orthonormal. To accommodate these constraints easily, Kirkpatrick & Meyer (2004) suggested replacing the matrices **E** above with **Q**=**EΛ**^{1/2}, i.e. to scale the eigenfunctions by the square root of the corresponding eigenvalues. As shown in Meyer & Kirkpatrick (2005), the *r*th column of **Q** then has *k*+1−*r* ‘free’ elements, while the remaining *r*−1 elements are given by the orthogonality constraints on **Q**, and can be determined as solutions to a linear system of equations.
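A quick numerical check of this parametrization, using an arbitrary positive definite matrix; the parameter count *m*(2*k*−*m*+1)/2 is printed alongside:

```python
import numpy as np

k = 4
rng = np.random.default_rng(2)
A = rng.normal(size=(k, k))
K = A @ A.T + k * np.eye(k)          # an arbitrary positive definite matrix

lam, E = np.linalg.eigh(K)
lam, E = lam[::-1], E[:, ::-1]
Q = E * np.sqrt(lam)                 # Q = E Lambda^{1/2}: columns scaled by sqrt(lam_r)

# At full rank, Q Q' recovers K
assert np.allclose(Q @ Q.T, K)
# Eigenvalues are recovered as squared column norms of Q
assert np.allclose((Q ** 2).sum(axis=0), lam)

# Truncating Q to its first m columns gives a reduced-rank coefficient matrix,
# described by m(2k - m + 1)/2 free parameters rather than k(k + 1)/2
for m in range(1, k + 1):
    print(m, m * (2 * k - m + 1) // 2)
```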

This parametrization puts all parameters to be estimated into the design matrices, now **Z**(**I**⊗**Q**_{α}) and **W**(**I**⊗**Q**_{γ}). As the scaled eigenfunctions **α*** and **γ*** have covariance matrices **A**⊗**I** and **I**, the (log) likelihood to be maximized is reduced to

log ℒ=−½(const+log|**R**|+log|**C***|+**y**′**Py**),(7.6)

where **C*** is the coefficient matrix in the correspondingly reparametrized MME (see Meyer & Kirkpatrick 2005 for full details).

Strategies outlined above (§6) to evaluate and maximize log ℒ are directly applicable to the estimation of eigenfunctions. An AI–REML algorithm is described in Meyer & Kirkpatrick (2005). With the number of random effects fitted and the size of the MME to be handled proportional to *m*_{α} and *m*_{γ} rather than *k*_{α} and *k*_{γ}, computational requirements can be reduced dramatically compared with full-rank analyses.

Parametrizing to the ‘free’ elements of the matrices **Q**, maximization of log ℒ represents an unconstrained optimization problem. Estimates of the corresponding eigenvalues can be obtained by calculating the vector norms of the columns of **Q**. However, with **V** highly nonlinear in the parameters to be estimated, convergence of standard algorithms for this parametrization has been found to be slow (James *et al*. 2000; Meyer & Kirkpatrick 2005). This can be alleviated by a rotation of the parameter space (Smith *et al*. 2001).

Let **Q***=**QT**, where **T** defines an orthogonal transformation. Since **TT**′=**I**, this does not affect the likelihood. A suitable choice might be **T**=**E**′, so that **Q***=**EΛ**^{1/2}**E**′, which is the matrix square root of **K**. Alternatively, consider the Cholesky factorization **K**=**L**_{K}**L**_{K}′, and the singular value decomposition **L**_{K}=**U**_{1}**DU**_{2}. The left singular vectors of **L**_{K} are equal to the eigenvectors of **K**, and the corresponding singular values are the square roots of the eigenvalues (Harville 1997, p. 232), i.e. **Q**=**U**_{1}**D**. Hence, choosing the matrix of right singular vectors as our transformation, **T**=**U**_{2}, the rotated matrix of parameters to be estimated is **Q***=**L**_{K}. This implies that we can maximize log ℒ with respect to the non-zero elements of the first *m* columns of the Cholesky factor of **K**, to obtain REML estimates of the first *m* eigenfunctions of **K**. This parametrization has been found to have good convergence properties (Groeneveld 1994; Hofer 1998; Thompson & Mäntysaari 1999).
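The relationship between the Cholesky factor, the singular value decomposition and the eigendecomposition exploited here is easily verified numerically (again with an arbitrary positive definite matrix; the sign fix is needed only because eigenvectors are defined up to sign):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
K = A @ A.T + 4 * np.eye(4)          # arbitrary positive definite matrix

L = np.linalg.cholesky(K)            # K = L L'
U1, d, U2t = np.linalg.svd(L)        # L = U1 diag(d) U2t

lam, E = np.linalg.eigh(K)
lam, E = lam[::-1], E[:, ::-1]       # descending eigenvalues

# Singular values of L are the square roots of the eigenvalues of K ...
assert np.allclose(d, np.sqrt(lam))
# ... and its left singular vectors are the eigenvectors of K (up to sign)
assert np.allclose(np.abs(U1.T @ E), np.eye(4), atol=1e-8)

# Rotating Q = E Lambda^{1/2} by the right singular vectors maps it
# on to the Cholesky factor; S aligns the arbitrary eigenvector signs
Q = E * np.sqrt(lam)
S = np.diag(np.sign(np.diag(U1.T @ E)))
print(np.allclose((Q @ S) @ U2t, L))   # True
```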

### (d) Sampling properties

As well as reducing computational requirements, estimation of only the most important eigenfunctions can yield estimates with smaller errors. This is illustrated with a small simulation study.

Matrices of mean crossproducts for 500 unrelated sires, with four progeny each and three or five records per progeny, were sampled from a Wishart distribution. Genetic effects were assumed to have the CF considered by Kirkpatrick *et al*. (1990), i.e. with *k*_{α}=3 and eigenvalues *λ*_{α1}=1360.8, *λ*_{α2}=24.5 and *λ*_{α3}=1.5. For simplicity, records were assumed to be taken at the same time for all animals and equally spaced. Permanent environmental effects were considered absent, and temporary environmental variances were assumed to be homogeneous with *σ*^{2}=625. Maximum likelihood estimates were obtained as described above (§5), considering increasing numbers of eigenfunctions, i.e. *m*_{α}=1, 2 and 3. Accuracy of estimates was measured as the relative error, 100(*θ̂*−*θ*)/*θ* (in %), for *θ*=*σ*^{2}, *λ*_{1}, *λ*_{2} and *λ*_{3}. The error in the estimated genetic CF was evaluated by numerical integration. Errors in estimated eigenfunctions were measured as the angle between ‘true’ and estimated eigenvectors (**e**_{r}) of **K**_{α}, i.e. arccos(|**e**_{r}′**ê**_{r}|/(|**e**_{r}||**ê**_{r}|)) (in °), where |.| is the norm of a vector; see Kirkpatrick & Meyer (2004) for details.
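The angle criterion can be computed directly from its definition; the vectors below are arbitrary examples:

```python
import numpy as np

def angle_deg(e_true, e_hat):
    """Angle between 'true' and estimated eigenvectors, in degrees.
    The absolute value makes the measure invariant to the arbitrary
    sign of an eigenvector."""
    c = abs(e_true @ e_hat) / (np.linalg.norm(e_true) * np.linalg.norm(e_hat))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

e = np.array([1.0, 0.0, 0.0])
print(angle_deg(e, np.array([1.0, 0.1, 0.0])))  # small angle, < 6 degrees
print(angle_deg(e, -e))                          # 0: sign is ignored
```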

Results are summarized in table 1. When estimating the first eigenfunction only, estimates of the residual variance tended to be inflated. Similarly, estimates of eigenvalue *λ*_{α1} were somewhat lower than the population value, though deviations were well within the range of sampling errors. The first eigenfunction, *κ*_{α1}, was estimated accurately in all analyses, with mean deviations from the true direction of less than 1°. While the error in the estimated genetic CF was largest for *m*_{α}=1, differences from the analyses fitting more components were small. Results indicated that there was little advantage in fitting the second eigenfunction, and that there was nothing to be gained by estimating the third.

## 8. Selection on curves

An important component of quantitative genetics is the selection of animals with desired characteristics, and the prediction of response to selection. Standard methods to quantify selection and predict response, developed for traits which are points, extend naturally to FV traits.

### (a) Quantifying selection

Two related measures of the strength of selection on quantitative traits are in use. The first is the selection differential function, *s*(*t*), the difference between the mean of the selected individuals, *μ*_{s}(*t*), and that of the population as a whole, *μ*(*t*),

*s*(*t*)=*μ*_{s}(*t*)−*μ*(*t*).(8.1)

If only differentials at individual points are known, we may estimate *s*(*t*) by interpolation. With artificial selection, the selection differential is controlled by the breeder, while in studies of natural populations it needs to be estimated, for instance, from data on survival of individuals and their trait values.

The second measure is the selection gradient function, *ψ*(.). It is a more informative measure in a number of situations, and has a simple interpretation. The value of the selection gradient, *ψ*(*t*), reflects the intensity of selection acting to increase or decrease the population's mean at point *t*. In contrast, the selection differential, *s*(*t*), also includes the effects of selection acting at other points.

It is not always possible to observe the selection differential directly. For example, we may be interested in a trait that changes with age, and individuals die at different ages. We cannot observe the trait values at a given age for individuals that die at a younger age, so we cannot estimate the selection differential at that age. We can, however, still estimate *ψ*(*t*). The selection gradient is equal to the partial regression of relative fitness on to the trait value (Lande & Arnold 1983). We therefore estimate *ψ*(*t*) at each of several ages, then link these points into a continuous function by interpolation (Kirkpatrick & Lofsvold 1992). A similar situation appears in studies of reaction norms, where an individual's phenotype depends on the environment. If each individual can only be observed in a single environment, we cannot observe the selection differential function *s*(.) directly. However, we can estimate *ψ*(*t*) within each of several environments *t* by regressing relative fitness on to the trait value, and then interpolate, taking into account the frequency with which the environments are encountered (Gomulkiewicz & Kirkpatrick 1992).
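A sketch of this regress-and-interpolate approach, with simulated trait values at three ages, illustrative ‘true’ gradients and normally distributed noise as assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, ages = 5000, np.array([1.0, 2.0, 3.0])

# Simulated trait values at three ages, and fitness generated from known
# 'true' selection gradients (illustrative values only)
z = rng.multivariate_normal(np.zeros(3), np.eye(3) + 0.5, size=n)
psi_true = np.array([0.3, 0.0, -0.2])
w = 1.0 + z @ psi_true + rng.normal(0.0, 0.5, n)
w_rel = w / w.mean()                     # relative fitness

# Partial regression of relative fitness on the traits (with intercept)
X = np.column_stack([np.ones(n), z])
beta, *_ = np.linalg.lstsq(X, w_rel, rcond=None)
psi_hat = beta[1:]                       # estimates of psi at the three ages
print(np.round(psi_hat, 2))              # close to [0.3, 0.0, -0.2]

# Link the pointwise estimates into a continuous function by interpolation
psi = lambda t: np.interp(t, ages, psi_hat)
print(float(psi(1.5)))
```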

Another use of the selection gradient is in predicting evolutionary trajectories, based on assumptions about how a trait affects fitness. Given a fitness function and the distribution of the trait, we can calculate the mean fitness, *W̄*, in a population. When relative fitnesses for all individuals are constant over time, the selection gradient is equal to the gradient in mean fitness as a function of the trait mean (Gomulkiewicz & Beder 1996),

*ψ*(*t*)=∂ ln *W̄*/∂*μ*(*t*).(8.2)

Intuitively, the strength of directional selection acting at each age is equal to the rate at which changing the mean for the trait at that age would increase the population's mean fitness.

### (b) Response to selection

Given a population's additive genetic covariance function and either the selection differential or the selection gradient, we can calculate the population's mean function in the next generation (Kirkpatrick & Heckman 1989; Beder & Gomulkiewicz 1998): the change in the mean trajectory is

Δ*μ̄*(*t*)=∫𝒦_{α}(*t*, *x*)*ψ*(*x*)d*x*.(8.3)

If an estimate of *s*(.) is available, the selection gradient function can be obtained by evaluating both *s*(*t*) and the phenotypic covariance function at points (*t*, *x*) on a fine grid, to form **s** and **V**, respectively. An estimate of *ψ*(*t*) is then obtained by interpolating the vector **p**=**V**^{−1}**s**.

As outlined above (§7), it is advantageous to formulate the functional linear model in terms of its genetic and environmental PCs. Likewise, response to selection can be described by the genetic PCs. We can rewrite the selection gradient function as

*ψ*(*t*)=∑_{r=1}^{k}*w*_{r}*ϕ*_{r}(*t*),(8.4)

where *ϕ*_{r}(*t*) are the same basis functions as used to estimate the PCs, and **w**={*w*_{r}} are a set of regression coefficients. These can be estimated, for instance, using LSQ. For orthonormal basis functions, such as Legendre polynomials, the predicted response to selection for *m*_{α} genetic PCs, estimated by corresponding eigenvectors **e**_{αi}={*e*_{αir}}, is then

Δ*μ̄*(*t*)=∑_{i=1}^{m_{α}}(**e**_{αi}′**w**)*λ*_{αi}*κ*_{αi}(*t*).(8.5)

The form of the right-hand side of equation (8.5) is convenient, as it does not involve integrals. Furthermore, it shows how selection response is determined by the interactions between the selection gradient and the genetic PCs. The first product, **e**_{αi}′**w**, in vector form, measures how strongly selection ‘loads’ on to the respective PCs, i.e. to what extent the deformation of the growth trajectory favoured by selection coincides with that mode of genetic variation. The second product, *λ*_{αi}, represents the amount of genetic variation available for the *i*th direction.

### (c) Selection strategies

When selecting for several, correlated traits, selection indices are used to combine estimates of genetic merit for individual traits, and to identify animals which maximize the overall response to selection. Index weights are derived as a function of the phenotypic covariance matrix among the selection criteria, and the genetic covariances with the selection objective. Again, this extends readily to FV traits (Kirkpatrick & Bataillon 1999; Kirkpatrick 2002). Togashi & Lin (2003) demonstrated equivalence of selection based on an index of estimated breeding values for part-lactations and selection based on estimates of RR coefficients, to improve persistency of lactation in dairy cows.

Genetic changes in the mean trajectory due to selection can be quantified using the PC approach described above (§7). Genetic PCs or eigenfunctions that have large eigenvalues represent the deformations in the shape of the mean trajectory for which the population has large amounts of genetic variation, i.e. they show the kind of changes readily achieved by selection (Kirkpatrick *et al*. 1990). Further, these PCs indicate the genetic changes likely to occur as correlated responses to selection. For example, the leading PC for growth of animals is typically positive for all ages. This implies that an increase or decrease of the mean at all ages is relatively easy to produce. Conversely, selection to increase (or decrease) mean size at any one age will result in a corresponding change in size for all other ages as a correlated response to selection. PCs that change sign indicate trade-offs: selection that increases the mean at one age will decrease the mean at other ages. Optimal breeding strategies for FV traits can readily be designed by deriving selection index weights for individual PCs (van der Werf 2002).

## 9. Extensions

For simplicity, the model considered so far was univariate and comprised random effects of the animal only. Extensions to more complicated scenarios, involving additional random effects, multiple traits or more than one control variable, are conceptually straightforward. Implementations, however, have been hampered by computational and data structure problems and require further research.

Models including additional random effects have been used in the analysis of data measured on pigs where litter effects were to be modelled (Schnyder *et al*. 2001; Huisman *et al*. 2002), and growth data from beef cattle (e.g. Albuquerque & Meyer 2001; Meyer 2001*a*; Nobre *et al*. 2003), where both additive genetic and permanent environmental effects of the dam needed to be taken into account.

### (a) Multivariate analyses

In some cases, we may have records on several FV traits. Commonly, these have the same control variable *t*. Examples are morphological characters that change with age (Gomulkiewicz & Kirkpatrick 1992), or feed intake and live weight of animals (e.g. Veerkamp & Thompson 1999). Each trait is then modelled by a set of RR regression coefficients, and we need to estimate covariance matrices among RR coefficients of size equal to the total number of coefficients across all traits. Even for small numbers of traits and low orders of fit, this can result in substantial numbers of parameters to be estimated. Reduced rank estimation, as described in (§7), then becomes all the more appealing. The coefficients of the CFs for individual traits are then given by the diagonal blocks of the estimated covariance matrices, while the off-diagonal blocks determine the cross-covariance functions between traits (Meyer & Hill 1997).

This encompasses the scenario where we have a mixture of FV traits and traits that have a single record only. At the genetic level, we can simply treat the latter as ‘curves’, which are described by a single RR coefficient. At the environmental level, we cannot separate temporary from permanent environmental effects for the traits with single records. Hence we need to parametrize the mixed model, so that only the sum of environmental effects and their covariances are required. In a REML framework of estimation, this is readily done by fitting an equivalent model that incorporates permanent environmental effects in the residual; see Meyer (2001*b*) for details. For Bayesian analyses via Gibbs sampling, constraints on parameters have been used for this purpose (Schnyder *et al*. 2002).

### (b) Multiple control variables

In other cases, we may have more than one control variable, for instance, time or age and an environmental variable, such as temperature, season or level of production in a herd. Alternatively, we may have spatial data, described by longitude and latitude. Unless we can assume the different control variables to act independently, we need to model the resulting surface by regressing on bivariate basis functions. The CFs are then functions of four variables. More control variables are conceptually possible, but are intrinsically difficult and have not been considered. Again, the number of parameters to be estimated increases rapidly with increasing dimension. Moreover, computational difficulties and incidence of numerical problems are likely to be exacerbated. As above, direct estimation of the leading eigenfunctions only may make analyses more tractable.

### (c) Reaction norms

A particular application of the FV approach involves the so-called reaction norms, i.e. the curves that describe how a phenotype is expressed as a function of an environmental variable. These have long been considered in evolutionary biology, where both standard (Via & Lande 1985) and FV traits have been studied (Gomulkiewicz & Kirkpatrick 1992; Kirkpatrick 1993). Only recently, however, have reaction norms attracted interest in modelling of data from livestock improvement programmes, in particular, in relation to genotype×environment interactions (de Jong & Bijma 2002). This concept assumes that a given genotype has a different phenotype in different environments, i.e. that it has phenotypic ‘plasticity’ (Falconer 1952). Considering non-FV traits, importance of and models for variation in environmental variability linked to genetic effects have recently been examined by Hill (2002) and Hill & Zhang (2004).

Reaction norms are readily modelled in an RR framework, if a suitable environmental variable can be identified. Corresponding RR analyses have recently been carried out for data from livestock improvement programmes (e.g. Kolmodin *et al*. 2002; Calus & Veerkamp 2003). Optimization of selection programmes considering environmental sensitivity described by a reaction norm has been considered by Kirkpatrick & Bataillon (1999), de Jong & Bijma (2002), Kolmodin *et al*. (2003) and Kolmodin & Bijma (2004).

## 10. Conclusions

‘I believe the random regression and covariance function methods will completely replace the use of specific time point correlations’ (Hill 1998, p. 34).

Random regression models provide a powerful and—with hindsight—obvious framework for quantitative genetic analyses of data that represent curves. In essence, they are straightforward extensions of the linear models invoked for traits that are points. Hence, standard quantitative genetic theory applies, requiring only minor extensions to accommodate the ‘infinite-dimensional’ nature of function-valued traits.

Moreover, models that are linear in the regression coefficients are readily accommodated in the linear mixed model, which is a standard for quantitative genetic analyses, especially for data from livestock improvement programmes. Consequently, regular methods for multivariate estimation in the mixed model can be used, i.e. best linear unbiased prediction for the estimation of breeding values and REML or Bayesian procedures for the estimation of covariance functions, again only needing minor modifications.

Of particular interest is the truncated expansion of trajectories, based on the leading eigenfunctions of the corresponding covariance function. For highly correlated traits, the first few eigenfunctions are likely to explain the bulk of variation and thus give a sufficiently accurate approximation of the trajectory. This can result in very parsimonious models. As shown, this representation can be accommodated in mixed-model-based analyses, and is thus likely to see increasing use in the analysis of function-valued traits in future.

## Acknowledgments

This work was supported by grants DEB-9973221 and EF-0328594 from the National Science Foundation (NSF) and grant NER/A/S/2002/00857 from the Natural Environment Research Council (NERC; M.K.).

## Footnotes

↵† A joint venture of NSW Department of Primary Industries and the University of New England.

One contribution of 16 to a Theme Issue ‘Population genetics, quantitative genetics and animal improvement: papers in honour of William (Bill) Hill’.

- © 2005 The Royal Society