## Abstract

This article presents conceptual and empirical foundations for new parsimonious simulation models that are being used to assess future food and environmental security of farm populations. The conceptual framework integrates key features of the biophysical and economic processes on which the farming systems are based. The approach represents a methodological advance by coupling important behavioural processes, for example, self-selection in adaptive responses to technological and environmental change, with aggregate processes, such as changes in market supply and demand conditions or in environmental conditions such as climate. Suitable biophysical and economic data are a critical limiting factor in modelling these complex systems, particularly for the characterization of out-of-sample counterfactuals in *ex ante* analyses. Parsimonious, population-based simulation methods are described that exploit available observational, experimental, modelled and expert data. The analysis makes use of a new scenario design concept called representative agricultural pathways. A case study illustrates how these methods can be used to assess food and environmental security. The concluding section addresses generalizations of parametric forms and linkages of regional models to global models.

## 1. Introduction

Despite the successes of the Green Revolution in the twentieth century, and substantial investments in some rural areas, many of the world's rural households will continue to depend on small-scale, semi-subsistence agricultural systems in the foreseeable future. The viability of these systems, and the well-being of households that depend on them, face growing threats from increasing population density, resource degradation and climate change [1]. Although typically small in scale, these farming systems are highly diverse and complex, often consisting of subsistence crops (often intercropped in various ways), livestock, cash crops and, in some regions, aquaculture. In parts of the world where large-scale commercial agriculture predominates, the long-term sustainability of ‘industrial’ agricultural systems and their vulnerability to environmental changes are also being questioned. These concerns are raising interest in the use of more diversified systems that involve practices, such as crop rotations, intercropping and integration of crops with livestock. While advances have been made in understanding these systems, their biophysical and economic heterogeneity and complexity continue to pose great challenges to researchers striving to improve their performance and predict their responses to environmental, economic, technological, social and institutional change.

In this article, we present a novel conceptual and empirical framework for conducting simulation experiments that can be used to support informed science and policy decision-making, and illustrate the approach with a study of soil nutrient management in Kenya. The modelling approach described here was inspired by research indicating the importance of site-specific interactions between biophysical and economic processes [2,3]. These theoretical considerations led to the development of a complex modelling system (the trade-off analysis (TOA) model) that linked site-specific process-based crop models, econometric models and process-based environmental models to simulate the behaviour of agricultural systems and assess trade-offs among key outcomes [4–6]. However, as with many such models, the TOA model's data requirements and complexity limited the scope of its use to a relatively small number of case studies. Also, like other detailed farming system models, the original TOA model did not account for the interaction between farm-level behaviour and aggregate processes, such as how farm management changes affect market equilibrium (ME) prices or off-farm impacts on water quality.

A relatively simple but powerful insight led to the development of a new ‘parsimonious’ model called TOA-multi-dimensional (TOA-MD): the complex econometric models used in the TOA approach provided the basis to construct spatial distributions of economic and environmental outcomes; but once these distributions were constructed from simulations at a representative sample of sites in a population, the model structure was relatively simple. Thus, it was recognized that if these outcome distributions could be quantified directly from field observations, or approximated using experimental, modelled or expert data, they could be used to construct population-based models with a relatively simple generic structure. The first version of the TOA-MD model was an adoption model developed to simulate ecosystem service supply curves for crop-based systems [7]. The next step was to extend the approach in two dimensions to make it a more general impact assessment tool. First, the model was designed to represent a generic farm household system comprising crops, livestock, aquaculture and non-agricultural activities. Second, the adoption model was linked to distributions of quantifiable economic, environmental or social outcomes associated with the production system [8]. The relatively simple generic structure of the TOA-MD model has led to its application in many types of systems around the world. The recent versions of the model that are available on the World Wide Web have been downloaded by hundreds of researchers since 2010 and are in use at numerous national and international research institutions [9].

The population-based approach to agricultural systems modelling represents a fundamental departure from the conventional analysis that uses averaged or aggregated data in ‘representative farm’ models and agricultural market models. Besides representing the essential biophysical and economic heterogeneity of these systems, the population-based approach provides a new way to simulate economic, environmental and social impacts of changes in technology and in economic or environmental conditions, taking into account the self-selection behaviour emphasized in the econometric policy evaluation literature [10] and in recent contributions to the technology adoption literature [8,11]. In this article, we show how the population-based model can be linked to aggregate processes, for example, changes in market supply and demand conditions or in environmental conditions such as climate [12], and we show how the approach can be implemented with parsimonious empirical models following [7,8,13]. We argue that model parsimony is particularly important when analysing changes in technology, economic or environmental conditions ‘out of sample’, i.e. when extrapolating from situations where systems can be observed to situations where they cannot be observed (for example, in climate change impact assessment), so that conventional parameter estimation and calibration methods cannot be used. These parsimonious simulation models can be parametrized with the various observational, experimental, modelled and expert data that may be available, including information from new scenario methods being developed by the global impact assessment community [14–17]. The concluding section discusses possible generalizations of the population-based approach, including linkages between regional and global models.

## 2. Assessing food–environment synergies and trade-offs: nutrient management in Kenya

We motivate and illustrate our approach with an analysis of a critical challenge to agricultural progress in sub-Saharan Africa: declining agricultural productivity and persistence of high poverty levels, and the search for policy interventions that will achieve the win–win outcome of reversing both of these adverse trends [18]. Agriculture is the most important sector of the Kenyan economy, representing about 30% of its gross domestic product [19]. Most agriculture is semi-subsistence where intercropping, small farm size (less than 2.5 ha), high rates of crop failure (more than 50% during dry years) and the lack of an established capital market are typical [20,21]. In many regions of Kenya, rapid population growth and limited access to land have led to extremely small farm sizes that are neither economically nor environmentally sustainable.

This case study focuses on the Machakos region located southeast of Nairobi, comprising a hilly, semi-arid area of approximately 14 000 km² and an altitude range between 340 and 1710 m above sea level. The predominant system is semi-subsistence mixed crop-livestock agriculture with maize as the main staple crop. Other main crops in the Machakos region include pigeon pea, sorghum, beans, horticultural crops and fruit trees. Farms that produce milk tend to have much higher incomes than those that depend mostly on subsistence crops. In the data used here, the headcount poverty rate (the percentage of the population below a $1/person/day poverty line) for the farms without dairy is over 80% in contrast to 40% for the farms with substantial dairy production. In the subsistence group, about 30% of cropped area is planted with maize, compared with less than 20% for the farms with dairy or irrigated vegetable production. Maize is an important subsistence crop and a cash crop for larger farms. Despite several efforts of the government and research programmes to increase maize yields, average yields in Kenya are low, at around 1 tonne of grain per hectare, far below the potential, which contributes to frequent food deficits in Machakos and other regions. This hilly area suffered high rates of soil erosion in the mid-twentieth century, but government-sponsored terracing programmes have reduced erosion rates substantially [22]. Nevertheless, there still appear to be high rates of soil nutrient and organic matter losses [23], one of the major constraints on increasing productivity. Researchers argue that to reverse the declining trends in *per capita* food production, soil fertility management must be improved [18,24].

Despite research showing that fertilizer could be a profitable option to increase yields and income, fertilizer use in sub-Saharan Africa is low, and it is even lower in semi-arid areas. According to the United Nations Development Program [25], the average consumption of fertilizer in 1998 was 13.8 kg of N–P–K nutrients per hectare of arable and permanently cropped land and the situation has not improved much since then. Low fertilizer use has been attributed to high prices caused by high transport costs and import tariffs, high levels of risk associated with low and highly variable rainfall patterns, inefficient input distribution and availability, financial constraints and difficulty of farmers in assessing returns to fertilizer [26]. Marenya & Barrett [18] show that low rates of fertilizer use in Kenya are also associated with low soil fertility owing to severe nutrient depletion and resulting low fertilizer response. Antle *et al*. [27] show that this type of situation can be a low-level equilibrium ‘trap’ in the sense that, once soils are depleted beyond a threshold level, it may not be economically viable for poor farmers to invest enough in restoring soil productivity to achieve a permanent higher productivity level, even if it is technically feasible.

### (a) Scenario definition

Various policy interventions to deal with poverty and land degradation have been discussed in Kenya and other African countries. In this study, we base our analysis on a policy intervention scenario described by Valdivia *et al*. [12]. This *fertilizer-use scenario* consists of reducing import tariffs and increasing investment in extension information and market infrastructure, with the aim of increasing fertilizer availability and use while reducing the farm-gate cost of fertilizer, and is consistent with policies being pursued by the Government of Kenya [28]. This scenario assumes that these interventions reduce the mean fertilizer price by 50%, and induce all farmers to increase fertilizer use, as determined by the fertilizer demand component of the model described below.

### (b) Trade-off curves

The foundation of our analytical approach is to construct simulation models that quantify the inter-relationships among key sustainability indicators defined for a farm household population. Here, we define two sustainability indicators of interest: an economic indicator defined as the per cent of the population *above* the income-based poverty line (defined here as 100 minus the headcount poverty rate with the poverty line set at $1 per day), and the rate of *gains* in soil nutrients during the growing season. We interpret negative balances as indicative of a system whose productivity cannot be maintained if the negative balance persists.
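
The economic indicator defined above can be written as a short function. The following is a minimal sketch, assuming that income is expressed in dollars per person per day and that each household counts equally; the function name and the sample incomes are hypothetical and are not taken from the study data.

```python
def percent_above_poverty_line(incomes_per_person_per_day, poverty_line=1.0):
    """Return 100 minus the headcount poverty rate, in per cent."""
    incomes = list(incomes_per_person_per_day)
    if not incomes:
        raise ValueError("at least one household income is required")
    # Headcount poverty rate: share of households strictly below the line.
    n_poor = sum(1 for income in incomes if income < poverty_line)
    headcount_poverty_rate = 100.0 * n_poor / len(incomes)
    return 100.0 - headcount_poverty_rate


# Hypothetical incomes in dollars per person per day (not data from the study).
sample_incomes = [0.4, 0.7, 0.9, 1.2, 2.5]
share_above = percent_above_poverty_line(sample_incomes)
```

In this illustrative sample, three of the five households fall below the line, so the indicator takes the value 40.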

Following much of the impact assessment literature [4,29–31], we use the inter-relationships between these indicators, which may involve either negative (trade-offs) or positive (synergies or win–win) outcomes, to assess the effects of the fertilizer-use scenario. We refer to the relationships among sustainability indicators, holding constant specified factors, as *trade-off curves*. Two fundamental processes drive these trade-offs and synergies and lead to two different types of trade-off curves. First, for a population of farmers using a given production system, changes in prices, policies and other economic variables induce behavioural responses, leading to *price-based trade-off curves*. Maize is considered to be a key crop by rural households and policymakers, so we construct price-based trade-off curves between these two indicators by varying the mean of the maize price distribution. Second, given prices and other economic factors, farmers may make changes in their production system by adopting new technological components, such as improved seeds, fertilizers and associated management practices, giving rise to *adoption-based trade-off curves*. As we discuss later, the characteristics of the farmers' production systems and the biophysical and economic environment in which they operate determine the properties of these trade-off curves.

### (c) Modelling approach and results

This study builds on a project on sustainable nutrient management and uses the same input data [23]. Here, we extend the original analysis by combining two models. First, the TOA-MD model [8,9] (discussed in detail later) simulates farmers' choice between two production systems, the current system in use and a system with improved nutrient management practices involving increased fertilizer use, for given prices and costs of production. Second, a market equilibrium model called TOA-ME—which links site-specific process-based crop simulation models, econometric models of farm output supply and input demand, and a nutrient balance model—simulates the price-based trade-off curves and identifies the points of ME along those curves [12]. The TOA-ME results are used to obtain equilibrium prices corresponding to the two technology scenarios. Then the TOA-MD model is calibrated to simulate the fertilizer-use scenario and generate the adoption-based trade-off curves at those prices. The data on which the analysis is based are available at http://tradeoffs.oregonstate.edu.

As described below, the TOA-MD model is based on the assumption that farmers choose the practice that provides the highest economic value. The prediction of an adoption rate is based on the distribution of the *difference* between economic values of the two systems, defined as the *opportunity cost* of changing systems. Figure 1 shows the cumulative distribution of opportunity cost for the fertilizer-use scenario, at the base (observed) prices and at the ME prices. We interpret these cumulative distributions as *adoption curves* because the point where this curve crosses the horizontal axis indicates the proportion of farms that expect higher returns from system 2, and thus is the predicted adoption rate of system 2. Figure 1 shows that at base prices the adoption rate is 63% and it also shows that when the ME effect of adoption on maize prices is taken into account, the rate falls to 55%.
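
The adoption-rate calculation described above can be sketched as the cumulative distribution of opportunity cost evaluated at zero. This is a minimal sketch that assumes, purely for illustration, a normally distributed opportunity cost; the function name and all parameter values are hypothetical and are not calibrated to the 63% and 55% rates reported in figure 1.

```python
from statistics import NormalDist


def adoption_rate(mean_opportunity_cost, sd_opportunity_cost, threshold=0.0):
    """Share of farms with opportunity cost below the adoption threshold.

    Assumes, for illustration only, that opportunity cost is normally
    distributed across the farm population.
    """
    omega = NormalDist(mu=mean_opportunity_cost, sigma=sd_opportunity_cost)
    return omega.cdf(threshold)


# Hypothetical parameters (currency units per farm), not taken from the study.
rate_base = adoption_rate(mean_opportunity_cost=-30.0, sd_opportunity_cost=100.0)
# A lower maize price after adoption shrinks the gain from system 2, which
# raises the mean opportunity cost of switching and lowers predicted adoption.
rate_equilibrium = adoption_rate(mean_opportunity_cost=-10.0, sd_opportunity_cost=100.0)
```

The comparison of `rate_base` and `rate_equilibrium` mirrors the qualitative pattern in figure 1: a market-equilibrium price response that raises mean opportunity cost lowers the predicted adoption rate.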

Figure 2 illustrates the relationship between the adoption rate and an economic indicator, the per cent of the farm households with incomes above the poverty line (the case corresponding to the ME maize price is shown). As figures 1 and 2 show, the economic value of the system is maximized at the predicted adoption rate. Intuitively, the model shows that if farmers choose the system that provides them with the highest economic return, then the proportion of households above the poverty line will be maximized in the entire population: it increases from about 22% when all farms use the current practice with relatively poor nutrient management, to about 30% when 55% of the farms adopt the improved practices. Figure 2 also shows the value of this indicator for the sub-populations of farms that adopt or do not adopt the improved nutrient management. As we discuss in the following section and in appendix A, the correlations between opportunity cost and other outcomes determine how those outcomes change with adoption, and thus are related to the slopes of the adopter and non-adopter curves. Figure 2 shows that the correlation between opportunity cost and mean economic returns is negative for adopters and positive for non-adopters. Thus, as the adoption rate increases, mean expected returns among adopters decline, and mean returns among non-adopters increase. At the predicted 55% adoption rate, about 33% of adopters are above the poverty line, but only about 26% of non-adopters are.

As we explain further below, in addition to the levels of outcomes associated with adoption and non-adoption, it is possible to simulate the counterfactuals for the adopters and non-adopters, and thus estimate the various *treatment effects* (i.e. changes in outcomes associated with adoption) discussed in the econometric policy evaluation literature [10]. Table 1 presents the values of the indicators, counterfactuals and treatment effects for the subsistence farms at equilibrium prices and the predicted adoption rate. The counterfactual for the adopter group shows that without increased fertilizer use, 9.8% were above the poverty line, whereas 27.4% are above poverty with it, thus almost tripling the number of farms above poverty in this group. Table 1 also provides useful information about the non-adopters: while there would be larger positive impacts on soil nutrient gains for the non-adopters, they would be substantially worse off economically if they were to adopt the practice. These remaining non-adopters would require additional economic inducements to adopt the practice.
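
The logic of counterfactuals and treatment effects under self-selection can be illustrated with a small simulation. The sketch below is not the TOA-MD implementation; it assumes hypothetical normally distributed returns for the two systems, and the function name, parameter values and the label `atu` (average treatment effect on the untreated, i.e. the non-adopters) are introduced here for illustration only.

```python
import random


def simulate_treatment_effects(n_farms=10_000, seed=0):
    """Simulate adoption with self-selection and return treatment effects."""
    rng = random.Random(seed)
    effects_adopters, effects_non_adopters = [], []
    for _ in range(n_farms):
        # Hypothetical returns: a shared farm effect makes v1 and v2 correlated.
        farm_effect = rng.gauss(0.0, 1.0)
        v1 = 10.0 + farm_effect + rng.gauss(0.0, 1.0)  # current system
        v2 = 10.5 + farm_effect + rng.gauss(0.0, 1.5)  # improved system
        omega = v1 - v2  # opportunity cost of switching to system 2
        # Each farm's outcome under the other system is its counterfactual,
        # so v2 - v1 is the farm-level effect of adoption.
        if omega < 0.0:
            effects_adopters.append(v2 - v1)
        else:
            effects_non_adopters.append(v2 - v1)
    rate = len(effects_adopters) / n_farms
    att = sum(effects_adopters) / len(effects_adopters)
    atu = sum(effects_non_adopters) / len(effects_non_adopters)
    ate = rate * att + (1.0 - rate) * atu
    return {"adoption_rate": rate, "att": att, "atu": atu, "ate": ate}


effects = simulate_treatment_effects()
```

Because farms self-select, the effect among adopters is positive by construction, while non-adopters would lose economically if they switched. This matches the qualitative pattern described for table 1, where the remaining non-adopters would need additional inducements to adopt.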

Figure 3 presents the linkage between adoption-based and price-based trade-off curves for the environmental indicator (soil nutrient gains) and the economic indicator (per cent of population above the poverty line). The price-based trade-off curves were generated by the TOA-ME model as described in [12]. The adoption-based trade-off curves were generated by the TOA-MD model, for the base maize price (from 1 to 1″) and the ME maize price (from 2 to 2″). Holding prices constant, the adoption analysis shows that there are substantial economic benefits but minimal environmental effects (e.g. the movement from point 1 to 1′ or from 2 to 2′), whereas when the ME effect is included, the movement from point 1 to 2′ shows smaller gains in the economic dimension but larger gains in the environmental dimension. Thus, the linkage of the population-based analysis to the ME analysis provides potentially important additional information about the ultimate economic and environmental consequences of this fertilizer-use scenario.

### (d) Representative agricultural pathways and socio-economic scenarios

History shows that the well-being of rural populations is highly dependent on what happens in the rural non-agricultural economy and in the larger national and international economies. Farm households' incomes come from agricultural and non-agricultural sources, including off-farm employment in the rural sector and remittances from family members working outside the rural sector. Farm income is determined by prices affected by local, national and international demand and supply; farm size depends on demand-pull factors, such as employment opportunities in the non-agricultural sector and technology that enables farm consolidation and mechanization. Institutional factors, including policies to support investments in infrastructure and education and supportive policies that lower costs of imported inputs and technologies, all play important roles. Some of these factors, for example international prices, can be simulated under plausible future scenarios using global agricultural economic models. But many of these factors are not provided by such models. The role of representative agricultural pathways (RAPs) is to provide a consistent narrative together with quantitative information about the economic, technological, social and institutional context in which agricultural development occurs. Using these pathways, researchers can formulate specific model scenarios that are consistent with these defined development pathways [17].

An example of a positive development pathway for the Machakos region of Kenya was provided by Hochman & Zilberman [2] and Valdivia *et al*. [12]. This pathway involves changes in rural development that lead to doubling the average farm size from 3 to 6 ha and reducing the average household size by 25% from about eight to six persons, consistent with received economic growth theory and the historical experience in other countries that have experienced economic growth. At the base prices, these changes shift the base point 1 in figure 3 from coordinates of (−32, 24) to (−27, 49), thus doubling the number of households that are above the poverty line and reducing the rate of nutrient depletion by about 15%. The introduction of the improved nutrient management practices under these improved socio-economic conditions then results in an adoption rate of about 64%, similar to the adoption rate under the base scenario (figure 1); however, the combination of the improved rural development conditions plus the nutrient management scenario lifts the per cent of households above poverty to over 60% and reduces the nutrient loss rate slightly to around 26%. The obvious—but often overlooked—point that this example illustrates is that the socio-economic environment in which the farm households operate can have a very substantial impact on their well-being, often more important than the possible technological improvements that could be made in their farm operation. Valdivia *et al*. [21] and Claessens *et al.* [32] also find that the future socio-economic conditions of farm households may have a greater impact on their well-being than possible effects of climate change, at least in the near term (to 2030) when climate change impacts may be relatively small.

## 3. Conceptual foundations: linking population- and market-based models

This section presents the conceptual framework that is the foundation for the simulation analysis presented earlier, using a heuristic and graphical exposition; technical details are provided in appendix A and in cited publications. As we noted in §2, a fundamental feature of agricultural household populations is their heterogeneity, in both biophysical and socio-economic dimensions. Our approach is based on the design of simulation models that represent a heterogeneous population of farms, with biophysical and economic-behavioural processes operating at both farm and population scales. While site-specific biophysical processes and farm-level behaviour are the foundation of these complex systems, impact assessment for public policy decision-making is primarily concerned with measures of impacts for populations. Accordingly, our conceptual framework integrates economic and environmental heterogeneity with aggregate economic and environmental processes for analysis of trade-offs and synergies between food and environmental security at the population level. We begin with the analysis of a heterogeneous farm population, holding biophysical and economic drivers constant (e.g. soil and climate, prices of outputs and inputs, technology), and then discuss how this population-based model is affected by changes in biophysical or economic drivers, e.g. through biophysical feedbacks or interactions between farm households and markets.

### (a) Outcome distributions, system choice and impact assessment

Following the modern econometric policy evaluation literature, our approach is based on the concept of *outcome distributions:* a heterogeneous population using a particular production system comprising a set of inter-related crop and livestock activities (call this system 1) is characterized by an associated joint distribution of economic, environmental and social outcomes. These outcome distributions arise from the complex interactions of biophysical, economic and social processes at the field, farm, household and population levels.

In a typical technology adoption analysis, an alternative system (call this system 2) becomes available to the population using system 1. If the entire population were to switch to this alternative system, the result would be a different outcome distribution. In most cases, some farms choose to continue to use system 1 (i.e. are non-adopters of system 2), and some use system 2 (i.e. are adopters of system 2), in which case the overall population is characterized by a mixture of the outcome distributions of the two systems. This conceptual framework can be used to design simulation experiments that mimic what would be observed when a population is ‘treated’ in this way (i.e. offered the option of using a new system). However, as recognized in the biomedical and economic policy evaluation literature, there is a critical difference between controlled physical experiments and interventions that involve people. As the fertilizer-use scenario example presented above suggests, in most experiments involving people, the potential subjects choose whether or not to be subjected to the ‘treatment’; that is, they self-select into treatment. In the analysis of new technologies, or in the analysis of adaptation to changes in environmental, economic or policy conditions, farmers and other economic agents can be expected to make purposeful choices between alternatives. The biostatistics, econometrics and related literature show that quantitative analysis of the outcomes of such purposeful choices must, therefore, take into account the statistical inter-relationships between people's choices and the associated outcomes.

Farms are ordered according to an index *ω* such that for the adoption threshold *a*, *ω* ≥ *a* for those farms using system 1 and *ω* < *a* for those using system 2. Expected economic outcomes associated with each system are defined as *v*(1) and *v*(2), and we let *ω* = *v*(1) − *v*(2), i.e. we order farms according to *ω*, which is interpreted as the *opportunity cost* of changing from system 1 to system 2. As introduced in §2, we assume farms choose the system that maximizes expected returns, thus *a* = 0. Alternatively, the adoption threshold *a* may be non-zero to represent incentives to encourage or discourage adoption, as in payments for ecosystem services [7,33,34]. The opportunity cost can be represented as a present value over a relevant time horizon if the choice between systems involves important fixed costs, and it can also incorporate costs associated with risk or transaction costs. Opportunity cost *ω* is spatially distributed across the landscape as a function of prices and other exogenous variables represented by *p*. The proportion of farms using system 2, referred to as the *adoption rate of system 2*, is given by the cumulative distribution function of *ω* evaluated at *a* (i.e. the proportion of farms with *ω* < *a*) and is denoted *r*(*p*,2,*a*); the share of farms using system 1 is *r*(*p*,1,*a*) ≡ 1 − *r*(*p*,2,*a*). To simplify the presentation, we abstract from the dynamics of the adoption process, but recognize that in reality such changes play out over time.
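
In symbols, the adoption rate defined in this paragraph can be written as follows. The density notation φ(ω | p) is introduced here for exposition and assumes that the distribution of opportunity cost is continuous; it is not notation taken from the cited publications.

```latex
r(p,2,a) = \Pr(\omega < a \mid p) = \int_{-\infty}^{a} \phi(\omega \mid p)\, d\omega ,
\qquad
r(p,1,a) \equiv 1 - r(p,2,a)
```

With *a* = 0 and *ω* = *v*(1) − *v*(2), the adoption rate is simply the share of farms for which *v*(2) exceeds *v*(1).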

This model represents a system as a set of production activities and management practices that have certain components in common (e.g. an improved crop variety), but all farms need not be using those components in the same manner. In terms of the economic decisions being simulated, the only meaningful difference between systems 1 and 2 is that each produces different economic outcomes, giving rise to a non-degenerate ordering *ω*. This feature of the model is important because a wide array of management practices are typically applied to particular technologies, for example, improved seed varieties. Another important feature of this model is that the economic behaviour of farmers is likely to result in a level of adoption between 0 and 100% owing to the heterogeneity in the conditions in which farms operate (soils, climate, prices, location, etc.). A substantial literature on technology adoption argues that incomplete adoption of ‘new’ or ‘improved’ technologies is caused by constraints on adoption, such as risk aversion and access to information (for a review, see [35]); yet a growing body of evidence also points to the fundamental importance of appropriately measured, site-specific biophysical and economic heterogeneity in explaining technology use (for further critical discussion, see [11,36]). In the spirit of parsimony, we begin with a model based on expected returns and interpret it as predicting economic feasibility, while acknowledging that other behavioural elements can be introduced as needed and as feasible with available data.

We consider the case of two outcomes associated with each system *h* = 1,2, one economic outcome *v*(*h*) and one environmental outcome *z*(*h*). The adoption variable *ω* and the outcomes *k* = *v*, *z* are influenced by many of the same factors, and thus are jointly distributed. Antle [8] derives several key results for this model. First, the sub-population using each system is characterized by the joint outcome distribution between *ω* and *k* = *v*, *z*, truncated according to *ω* ≥ *a* for system 1 and *ω* < *a* for system 2. It is important to recognize that these joint distributions embed the underlying biophysical and economic processes that link inputs and outputs of the farm production system and farm household, and these relationships are embedded in the structure of the other relationships that are based on these joint outcome distributions. When some farmers choose to use system 1 and others choose system 2, this behaviour generates outcome distributions conditional on the adoption threshold *a*. This ‘selection’ or choice behaviour links the adoption process to outcome distributions conditional on adoption, and thus provides the basis for the construction of impact indicators based on these distributions. Second, Antle [8] defines a class of indicators that can be constructed using outcome distributions and shows that they exhibit properties similar to the class of indicators developed by Foster *et al*. [37] that are widely used to measure poverty. Measures of vulnerability to exogenous environmental changes, for example, climate change, also can be represented using these indicators. One desirable property of these indicators is that the value for the whole population is a weighted average of the values for the non-adopters and adopters, with the weights given by *r*(*p*,1,*a*) and *r*(*p*,2,*a*) (see appendix A).
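
For the simplest indicator, the population mean of an outcome, the weighted-average property stated above can be written out explicitly. The symbol for the population mean on the left-hand side is notation introduced here for illustration; the weights and conditioning events follow the definitions in the text.

```latex
\bar{k}(p,a) = r(p,1,a)\, E\big[\,k(1) \mid \omega \ge a\,\big]
             + r(p,2,a)\, E\big[\,k(2) \mid \omega < a\,\big],
\qquad k = v, z
```

The two conditional expectations are the means of the truncated outcome distributions for non-adopters and adopters, respectively.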

Figure 4 illustrates this model's properties, for the case where system 1 is interpreted as the system in use initially by all farms, with system 2 then becoming available for adoption (as in the improved nutrient management example presented in §2). The indicator variable is defined as the mean of the economic outcome *v*(*h*), under the assumption that *ω* = *v*(1) − *v*(2) and farms choose the system that maximizes *v* (implying that the adoption threshold is *a* = 0). The right-hand side of the figure shows the construction of the truncated outcome distributions for farms using each system, by combining the adoption rate and the joint distributions between *v* and *ω* (these distributions are represented as ellipsoids of equal density). The left-hand side of the figure shows the relationship between the adoption rate for system 2 and the mean indicator for the users of each system, and for the overall population. In this model, it can be shown that the mean economic indicator for the population (in this case, the mean of *v*, *μ_{v}*) is maximized at the adoption rate of system 2, *r*(*p*,2,0).

A key feature of this model is the relationship between *ω* and the outcome variables. This relationship is embodied in the correlations between *ω*, *v* and *z*, and is represented in figure 4 as the angle of the ellipsoid axis of the joint distribution of each system. This correlation is translated into the relationships between the adoption rate and the indicators shown in the left-hand side of figure 4. Antle [8] shows that the slopes of the indicator functions for the adopter and non-adopter sub-populations (see the left-hand side of figure 4) are related to the correlations between *ω* and the outcome variable. Note that the correlation between *ω* and economic returns for adopters in figure 2 is negative as in figure 4, but the correlation for non-adopters in figure 2 is positive, whereas it is shown to be negative in figure 4; either sign is possible. When the adoption variable *ω* is uncorrelated with the outcome variable, the indicator curves for the adopter and non-adopter sub-populations are horizontal lines (i.e. are independent of the adoption rate), and the overall population indicator is linear in the adoption rate (a straight line connecting the means *μ*_{v}(*p*,1) and *μ*_{v}(*p*,2)).

This conceptual framework is a generalization of some statistical models developed for analysis of observational experiments (e.g. [10]). In figure 4, for example, the quantity referred to in this literature as the ‘average treatment effect’ is equal to the difference between the mean values of the two systems, i.e. *μ*_{v}(*p*,2) – *μ*_{v}(*p*,1); similarly, the ‘average treatment effect on the treated’ (i.e. the change in the indicator for the adopting sub-population) and other treatment effects can be constructed by calculating the appropriate counterfactuals for adopter and non-adopter sub-populations (as elaborated in appendix A). The indicator variables also can be defined as any functions of the relevant outcome distributions, including indicators based on thresholds, for example, the poverty rate, or an environmental risk defined as the likelihood that an environmental indicator exceeds a critical threshold.
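
The treatment-effect quantities and threshold-based indicators described here can be illustrated with a small simulation. The following Python sketch is not the authors' implementation: it draws a hypothetical population of farms with jointly normal returns *v*(1) and *v*(2), applies the adoption rule *ω* = *v*(1) – *v*(2) < *a*, and computes the average treatment effect, the average treatment effect on the treated and a poverty rate. All parameter values, the poverty line and the function names are assumptions made for illustration only.

```python
import random
from statistics import mean


def simulate_farms(n, mu1, mu2, sd1, sd2, rho, seed=0):
    """Draw n hypothetical farms with jointly normal returns (v1, v2), correlation rho."""
    rng = random.Random(seed)
    farms = []
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        farms.append((mu1 + sd1 * e1, mu2 + sd2 * e2))
    return farms


def treatment_effects(farms, a=0.0):
    """Return (ATE, ATT, adoption rate) under the rule: adopt system 2 if v1 - v2 < a."""
    adopters = [(v1, v2) for v1, v2 in farms if v1 - v2 < a]
    ate = mean(v2 - v1 for v1, v2 in farms)  # difference of population means
    att = mean(v2 - v1 for v1, v2 in adopters) if adopters else float("nan")
    return ate, att, len(adopters) / len(farms)


def poverty_rate(farms, poverty_line, a=0.0):
    """Share of farms whose return from the system actually in use is below the line."""
    realized = [v2 if v1 - v2 < a else v1 for v1, v2 in farms]
    return sum(v < poverty_line for v in realized) / len(realized)


# Illustrative (assumed) parameter values, not estimates from any case study.
farms = simulate_farms(5000, mu1=100.0, mu2=110.0, sd1=40.0, sd2=45.0, rho=0.7)
ate, att, rate = treatment_effects(farms)
print(f"adoption rate = {rate:.3f}, ATE = {ate:.2f}, ATT = {att:.2f}")
print(f"poverty rate without system 2 = {poverty_rate(farms, 60.0, float('-inf')):.3f}")
print(f"poverty rate with adoption at a = 0 = {poverty_rate(farms, 60.0):.3f}")
```

Because adopters self-select on *ω*, the effect on the treated is at least as large as the population-average effect in this sketch, which is the selection property the truncated distributions in figure 4 are meant to capture.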

Following the structure of figure 4, indicators based on economic and environmental outcomes (and more generally, for any quantifiable outcome of interest, e.g. nutritional outcomes) can be constructed. As the adoption threshold *a* is varied, the adoption rate of each system varies, and the resulting combination of indicators produces what we defined in §2 as the *adoption-based trade-off curve A*(*p*) (see appendix A for a technical definition). These adoption-based curves are constructed by varying the adoption threshold *a* holding the price vector *p* fixed (e.g. see curve *A*(*p*_{0}) in figure 5, for the price *p*_{0}). Thus, there is a family of such curves associated with the set of feasible prices *Ψ*.
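
As a rough illustration of how an adoption-based trade-off curve can be traced, the Python sketch below varies the threshold *a* over a simulated population while holding everything else fixed. The population, the way the environmental outcome is tied to returns, and all names and numbers are hypothetical assumptions; the sketch uses sample means as the indicators and is not the TOA-MD implementation.

```python
import random


def draw_population(n, seed=1):
    """Hypothetical farms with returns v1, v2 and an environmental outcome z1, z2."""
    rng = random.Random(seed)
    farms = []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)  # shared farm-level heterogeneity
        v1 = 100.0 + 30.0 * common + 20.0 * rng.gauss(0.0, 1.0)
        v2 = 105.0 + 25.0 * common + 30.0 * rng.gauss(0.0, 1.0)
        z1 = 50.0 + 0.10 * v1 + 5.0 * rng.gauss(0.0, 1.0)  # assumed link between z and v
        z2 = 40.0 + 0.05 * v2 + 5.0 * rng.gauss(0.0, 1.0)
        farms.append((v1, v2, z1, z2))
    return farms


def indicators(farms, a):
    """Adoption rate of system 2 and population means of v and z at threshold a."""
    n = len(farms)
    use2 = [(v1 - v2) < a for v1, v2, _, _ in farms]  # omega < a means system 2
    r2 = sum(use2) / n
    mean_v = sum(v2 if u else v1 for (v1, v2, _, _), u in zip(farms, use2)) / n
    mean_z = sum(z2 if u else z1 for (_, _, z1, z2), u in zip(farms, use2)) / n
    return r2, mean_v, mean_z


def tradeoff_curve(farms, thresholds):
    """One point (adoption rate, mean v, mean z) per adoption threshold."""
    return [indicators(farms, a) for a in thresholds]


farms = draw_population(4000)
thresholds = [float("-inf")] + [float(a) for a in range(-100, 101, 10)] + [float("inf")]
curve = tradeoff_curve(farms, thresholds)
for r2, mean_v, mean_z in curve:
    print(f"{r2:6.3f} {mean_v:8.2f} {mean_z:8.2f}")
```

Plotting mean *z* against mean *v* for these points gives one curve for one set of prices; repeating the exercise with returns recomputed at other prices would give the family of curves referred to in the text.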

### (b) Scaling up: linking heterogeneous populations to aggregate processes

Thus far the population of farms chooses between two systems, while taking prices and other exogenous factors (‘drivers’ or ‘boundary conditions’) as given. Now we consider how a farm population—which may be a mixture of farms, some using system 1 and some using system 2—is impacted when prices or other factors change. These changes may be caused by factors exogenous to the population or may be induced endogenously when system 2 is introduced. In economic terms, the regional, national or global economies may cause changes in prices faced by farmers in a region; moreover, when new systems are introduced that change quantities produced locally, there may be market impacts through changes in local food supply and demand. Similarly, changes in land management practices induced by adoption of system 2 may cause changes in environmental conditions, impacting all farms in the population, e.g. a change in water quality caused by a collective increase in nutrient or pesticide use.

The linkage of the population to a market is represented in figure 5. This figure shows, in the northeast quadrant, an adoption-based trade-off curve *A*(*p*_{0}) defined for a particular price vector *p*_{0} ∈ *Ψ*. This adoption-based trade-off curve connects two other curves we define as *price-based trade-off curves*. These latter curves are defined formally as the combination of indicator values associated with a particular system, or a combination of systems, in use as prices are varied, and are denoted as *T*(*Ψ*, *h*), where *h* = 0,1,2 indexes the system in use (see appendix A for a formal definition). *T*(*Ψ*, 1) corresponds to the case where all farms are using system 1 and *T*(*Ψ*, 2) corresponds to the case where all farms are using system 2. We use *T*(*Ψ*, 0) to denote the case where both systems 1 and 2 are available and the adoption threshold is *a* = 0, so that the population comprises some farms using system 1 and some farms using system 2. Paralleling these definitions, we denote the corresponding population-level indicators associated with system *h* as *I*_{k}(*p*, *h*) for *k* = *v*, *z*. As the figure shows, the two types of trade-off curves are closely related: the adoption-based trade-off curve defined for a particular price vector *p*_{0} connects the point *S* on *T*(*Ψ*, 1) to the corresponding point on *T*(*Ψ*, 2); similarly, at a lower price *p*_{1} the adoption-based trade-off curve shifts to *A*(*p*_{1}) and connects the points on *T*(*Ψ*, 1) and *T*(*Ψ*, 2) corresponding to that price. If farms choose between systems 1 and 2 to maximize *I*_{v}, the economic indicator, with the adoption threshold *a* = 0, then the aggregate outcome in the population will be at the maximum value of *I*_{v} along *A*(*p*_{0}), at point *S*′; at the lower price *p*_{1}, the maximum occurs along *A*(*p*_{1}) at *S*″.

The northwest quadrant of figure 5 shows the relationship between the economic indicator *I*_{v} and the aggregate quantity *Q*, as the adoption threshold *a* is varied holding prices constant. Thus, the curves designated as *I*_{v}(*p*) attain a maximum at the aggregate quantity *Q* that corresponds to the maximum value of *I*_{v} along *A*(*p*). For this example, we let *p* represent the price of the output *Q*, and the market supply curve defined as *Q*(*p*, *h*) is shown in the southwest quadrant, where *h* = 0,1,2 is defined as above. Note that *Q*(*p*, 1) is the market supply function when only system 1 is in use; *Q*(*p*, 0) is the market supply function when system 2 is introduced, the adoption threshold is *a* = 0, and thus 100 *r*(*p*, 2, 0) per cent of farms adopt system 2. Figure 5 illustrates two important cases: the first case is where the region is small relative to the larger regional or global market for the product *Q*, hence the price of *Q* remains at its initial level *p*_{0} (the region faces a horizontal demand curve for *Q* because changes in its aggregate output do not affect the market price *p*). The second is the case where the region faces a down-sloping demand for *Q* (curve *D*(*p*) in figure 5). In this case, when system 2 is introduced and adopted by some farms, the supply curve shifts outward, and the ME price declines from *p*_{0} to *p*_{1}. Similarly, changes in external conditions that impact the market price of *Q* will be translated into shifts in the adoption-based trade-off curve. The linkage from price-based trade-off curves to ME is discussed further in [12].

The southeast quadrant of figure 5 shows the relationship between an environmental indicator *I*_{z}(*p*, *h*) and the price *p*. Prices influence the environmental outcomes through the joint outcome distributions described above: changes in prices affect opportunity cost *ω* and thus affect the choice between systems and environmental outcomes. In figure 5, the indicators *I*_{z}(*p*, 1) and *I*_{z}(*p*, 2) represent the relationship for the cases where only system 1 or only system 2 is in use. *I*_{z}(*p*_{0}, 0) shows this relationship when both systems 1 and 2 are in use.

We conclude by noting that changes in external drivers of the system, for example climate, will generally impact the productivity of the production systems 1 and 2, as well as the relationship between those systems and environmental outcomes, and thus will shift all of the relationships represented in figures 4 and 5. By quantifying these shifts, this framework can be used for the analysis of climate change impacts, and it can also be used to evaluate how systems can be modified to facilitate adaptation to climate change and other environmental changes [9,21].

## 4. Empirical implementation

In this section, we discuss how the conceptual framework presented in the previous section can be translated into parsimonious simulation models. As we noted at the outset, a major challenge in assessing food and environmental security implications of agricultural systems is the empirical characterization of these systems and their essential features, taking into account key system characteristics, while using the various types of data that are available.

### (a) A parsimonious model for multi-dimensional impact assessment: TOA-MD

The analysis of technology adoption presented in §2 was implemented using the TOA-MD model (see [4,9]). In this model, the following definitions are used for the outcome distributions, with economic, environmental and social outcomes indexed by *k*, systems indexed by *h* = 1,2, and *k*(*h*) denoting outcome *k* for system *h*: *μ*_{k}(*h*) ≡ mean of *k*(*h*); *σ*_{k}^{2}(*h*) ≡ variance of *k*(*h*); *ρ*_{k} ≡ correlation between outcomes *k*(1) and *k*(2); *κ*_{k}(*h*) ≡ correlation between outcomes *v*(*h*) and *k*(*h*); and *θ*_{k}(*h*) ≡ correlation between outcome *k*(*h*) and *ω*. Three correlations play a role in the model: *ρ*_{k} represents between-system correlations of a given outcome *k*; *κ*_{k}(*h*) represents within-system correlations between economic returns *v* and outcome *k*; and *θ*_{k}(*h*) is the correlation between outcome *k*(*h*) and opportunity cost. As in §2, we define the economic outcome *v*(*h*) for each system, so that the population of farm households can be ordered according to *ω* = *v*(1) – *v*(2), and we assume that *ω* is normally distributed in the population with mean and variance

$$\mu_{\omega} = \mu_{v}(1) - \mu_{v}(2) \tag{4.1}$$

and

$$\sigma_{\omega}^{2} = \sigma_{v}^{2}(1) + \sigma_{v}^{2}(2) - 2\rho_{v}\,\sigma_{v}(1)\,\sigma_{v}(2). \tag{4.2}$$

Normality is not an essential assumption, but it is analytically convenient and appropriate for a parsimonious model because the normal distribution is itself parsimonious. Antle & Valdivia [7] and Antle *et al.* [34] present validation of this model for the analysis of ecosystem services supply by comparing it to more elaborate models. To use the results presented in §2, we also assume that the environmental or social outcomes are normally distributed. Normality is a particularly useful assumption for the parametrization of the truncated distributions discussed in §3, both for its parsimony and for the well-known, tractable properties of the moments of truncated normal distributions. If outcome distributions are non-normal, stratification of a population, e.g. by farm size or system type, or using methods to identify sub-populations, for example, estimation of finite mixture models, can be used.

Using the above definitions, the correlation between *k*(*h*) and *ω* = *v*(1) – *v*(2) is

(4.3)

The means and variances of the marginal distributions of the outcome variables, for the sub-populations of farms using system *h*, with adoption threshold *a*, are defined as *μ*_{k}(*h*, *a*) and *σ*_{k}^{2}(*h*, *a*), respectively. These statistics can be constructed using the above quantities and standard results from the literature on truncated bivariate normal distributions.

Thus, the TOA-MD model involves five parameters needed to construct the distribution of *ω*: the means and variances of *v*(1) and *v*(2) and their correlation. For each non-economic outcome variable, there are seven additional parameters: a mean and a variance for each system, and the three correlations defined above equation (4.1). With *m* non-economic indicators, the total number of parameters is equal to 5 + 7*m*. This relatively small number of parameters makes this model easy to interpret and well suited for out-of-sample analysis.
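
To make the role of the five economic parameters concrete, the following Python sketch computes the adoption rate and the truncated mean returns under the normality assumption. It is a minimal illustration, not the TOA-MD software: the function names and numerical values are hypothetical, the correlations of *v*(1) and *v*(2) with *ω* are derived directly from the definition *ω* = *v*(1) – *v*(2), and the conditional means use the standard result for a bivariate normal distribution truncated on one variable. It assumes a finite threshold *a* and a strictly positive variance of *ω*.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def omega_moments(mu1, mu2, sd1, sd2, rho):
    """Mean and standard deviation of omega = v(1) - v(2)."""
    mu_w = mu1 - mu2
    var_w = sd1 ** 2 + sd2 ** 2 - 2.0 * rho * sd1 * sd2
    return mu_w, var_w ** 0.5


def adoption_rate(mu1, mu2, sd1, sd2, rho, a=0.0):
    """Share of farms with omega < a, i.e. the share using system 2."""
    mu_w, sd_w = omega_moments(mu1, mu2, sd1, sd2, rho)
    return STD_NORMAL.cdf((a - mu_w) / sd_w)


def mean_returns(mu1, mu2, sd1, sd2, rho, a=0.0):
    """Mean of v for system 1 users, system 2 users and the whole population."""
    mu_w, sd_w = omega_moments(mu1, mu2, sd1, sd2, rho)
    alpha = (a - mu_w) / sd_w
    r2 = STD_NORMAL.cdf(alpha)
    r1 = 1.0 - r2
    pdf = STD_NORMAL.pdf(alpha)
    theta1 = (sd1 - rho * sd2) / sd_w  # correlation between v(1) and omega
    theta2 = (rho * sd1 - sd2) / sd_w  # correlation between v(2) and omega
    m1 = mu1 + sd1 * theta1 * pdf / r1  # E[v(1) | omega >= a]
    m2 = mu2 - sd2 * theta2 * pdf / r2  # E[v(2) | omega < a]
    return m1, m2, r1 * m1 + r2 * m2


# Illustrative (assumed) parameter values.
params = dict(mu1=100.0, mu2=110.0, sd1=40.0, sd2=45.0, rho=0.7)
print(f"adoption rate at a = 0: {adoption_rate(**params):.3f}")
for a in (-20.0, 0.0, 20.0):
    m1, m2, pop = mean_returns(a=a, **params)
    print(f"a = {a:6.1f}: system 1 users {m1:7.2f}, system 2 users {m2:7.2f}, population {pop:7.2f}")
```

With these inputs the population mean is largest at *a* = 0, in line with the property illustrated in figure 4. Extending the sketch to a non-economic outcome would require the additional means, variances and correlations listed above.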

### (b) Crop and livestock simulation models

Crop and livestock productivity play a central role in the analysis of agricultural systems, and thus crop and livestock simulation models play a key role in constructing the counterfactuals for impact assessment. Note that for linkage to the TOA-MD model, we need to estimate yield distributions for the base system (system 1) and an alternative or counterfactual system (system 2) [38]. In some within-sample assessments, systems 1 and 2 are observed in the region, and available site-specific data allow for the assessment of statistics on the distribution of the crop and livestock production. But in many within-sample and all out-of-sample analyses, site-specific data on one or both systems are lacking. Ideally, crop and livestock simulation models can be used to assess the productivity of these alternative systems. However, in practice, researchers face a number of issues in the use of these models for impact assessment. Here, we focus on crop growth simulation models, which are the most widely used models. There are relatively few livestock models, but similar issues apply [39].

Crop growth simulation models are available only for certain major crops and incorporate a limited number of processes [40]. Models are lacking for important cash crops, such as coffee and tea, and for commonly applied intercrops, such as maize and beans. In addition, many models can deal with weather conditions, water constraints and some nutrient constraints but not with other important factors, notably pests and diseases. These models also do not incorporate any factors related to the farmer's management capability. As a result, we often see that the simulated nutrient-limited or potential yields are higher than the actual yields obtained by farmers. This difference, the so-called yield gap, is particularly relevant in smallholder farming systems where soil fertility and disease management are suboptimal (from a plant perspective).

The application of the crop growth simulation models requires detailed data on soil and climatic conditions in the region. When such data are lacking, a model may provide accurate predictions of relative yield differences between systems, but is not likely to be a reliable predictor of actual yield levels. Applying a model for representative conditions and management in the population and comparing the results to observed yield levels allows an assessment of the yield gap [41]. Under the assumption that the yield gap is constant across sites, we can simulate yields under alternative management and correct the simulated yields for the yield gap. The spatial variability of yield can be estimated from cross-sectional survey data; assessing the standard deviation in yield using a simulation model requires data on the spatial variability of soils, climate and management in the region. However, the standard deviation of simulated yields under nutrient- or water-limited conditions is typically smaller than the standard deviation in actual yield. Thus, although the crop growth simulation models are a powerful tool, and often the only tool for the assessment of yield distributions out-of-sample, their use requires substantial skill and adequate data. Whether more parsimonious crop models could be developed that require fewer data inputs and still be sufficiently accurate for out-of-sample counterfactual analysis would be a worthwhile topic for additional research. It is important to note that statistical yield models can also be used to project yield response to weather, and at some spatial scales may perform better than process-based models [42].
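
The yield-gap correction described in this paragraph can be written in a few lines. The Python sketch below treats the gap as a constant multiplicative factor, which is one possible reading of "constant across sites" and is an assumption here (an additive gap would be an equally simple alternative). The yield values and function names are hypothetical.

```python
from statistics import mean


def yield_gap_factor(observed_base, simulated_base):
    """Ratio of mean observed yield to mean simulated yield for the base system."""
    return mean(observed_base) / mean(simulated_base)


def correct_simulated_yields(simulated_alternative, gap_factor):
    """Scale simulated yields of the alternative system by the base-system gap factor."""
    return [y * gap_factor for y in simulated_alternative]


# Hypothetical yields in t/ha for three sites.
observed_base = [1.5, 2.0, 2.5]          # survey data for system 1
simulated_base = [3.0, 4.0, 5.0]         # crop model output for system 1
simulated_alternative = [3.6, 4.8, 6.0]  # crop model output for system 2

factor = yield_gap_factor(observed_base, simulated_base)
corrected = correct_simulated_yields(simulated_alternative, factor)
print(f"gap factor = {factor:.2f}; corrected system 2 yields = {corrected}")
```

A single scaling factor also rescales the spread of the simulated yields, so it does not by itself address the point made in the text that simulated standard deviations tend to be smaller than observed ones.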

### (c) Environmental data and models

Even when systems can be observed, environmental models are often necessary to assess environmental outcomes because measurement across the landscape is prohibitively expensive; in out-of-sample analysis the use of models is essential. In contrast to the case of crop growth simulation models, parsimonious models for environmental processes have been developed and are applied in a wide range of cases dealing with soil erosion [43], soil fertility [44] and carbon [45]. These parsimonious models have been found to be useful in a wide range of applications, even when more complex mechanistic simulation models are available [46]. This is because the data needed to calibrate and apply the more complex models are often not available. The key question that remains is whether the parsimonious models, with their simplified structure and process representation, provide better insight into the distribution of environmental outcomes than the complex simulation models, with their problems of data availability.

For the application in TOA-MD, means and variances of environmental outcome distributions are required, as well as correlations between the environmental outcomes and expected returns (see equation (4.3)). Until recently, the importance of these outcome distribution parameters, and in particular the correlations with economic outcomes, had not been recognized, and consequently they have not been measured. Therefore, further research using field measurements and available simulation models is needed to estimate these correlations. As these parameters are estimated across many systems and ecoregions, it will be possible to define plausible ranges for these parameters, similar to the way that ranges of plausible soil carbon rates have been established and used for analysis of agricultural greenhouse gas mitigation [36].

### (d) Model parametrization for out-of-sample impact assessment

We consider now the parametrization of the TOA-MD model for analysis out-of-sample. This is sometimes referred to as the problem of *ex ante* evaluation, and in the treatment effects literature relates to the problem of ‘external validity’ in the sense that we may view this problem as being solved by extrapolating from one setting to another. [10, p. 4791] refer to ‘…forecasting the impacts of interventions (constructing counterfactual states associated with interventions) never historically experienced to various environments…’. If we interpret the word ‘forecasting’ to mean attempting to predict what will actually happen, then we can distinguish forecasting from scenario analysis, which explores plausible possible future states of the world without attempting to predict their likelihood of occurring, as in the analysis presented in §2 using RAPs. We contrast such scenario analysis with the conventional problem of within-sample *ex post* impact assessment that has been addressed extensively in the econometrics and impact assessment literature, i.e. the case wherein the consequences of adoption can be directly observed, so that given such observations the empirical problem is to quantify the relevant counterfactual [47]. In this discussion, we assume that data are available so that estimates of the means, variances and covariances of system 1 components can be made for the population of interest using standard statistical methods.

In the context of economic policy analysis, the mid-twentieth century economist Jacob Marschak recognized that in order to use statistical models to carry out policy analysis, it is necessary to identify policy-invariant *combinations* of parameters, with the relevant combination of parameters defined relative to the policy question being asked [10]. Complex systems typically embed parameters from a hierarchy of levels: for example, a process-based crop simulation model may have a fundamental genetic parameter for a crop, but this genetic parameter itself depends on underlying physical and chemical processes. Similarly, an economic model of a market depends on the underlying behavioural parameters of households and firms. Depending on the context, various combinations of such parameters may be invariant to the exogenous changes of interest in an impact assessment. For example, in the TOA-MD model outlined above, there are five basic economic parameters, and with each non-economic indicator there are seven parameters. These TOA-MD model parameters depend on underlying parameters of the processes generating the outcome distributions as well as on the defined exogenous variables of the systems, such as the climate and soils, the prices of crops produced and their costs of production, and the social and institutional setting. In an analysis of technology adoption that takes prices as given, we can derive the adoption-based trade-off curve *A*(*p*), which is a function of the exogenous parameter vector *p* that includes market prices of products the farm sells; in a market-level analysis, however, those price components of *p* become endogenous to the analysis.

Once the fundamental parameters for a particular analysis are identified, we need to estimate the parameters for the unobservable components of the analysis, i.e. the parameters of system 2 and the covariances between systems. The use of parsimonious models based on outcome distributions is helpful for this type of analysis because the parameters of outcome distributions (means, variances and correlations) can be easily interpreted, and the relatively small number of parameters makes it relatively easy to achieve logical consistency among parameters.

#### (i) Extrapolation methods

One way that model parameters can be generated for out-of-sample analysis is to assume that there is an underlying stable relationship between some observable covariates and the parameters of interest. Then if model parameters have been estimated under a sufficiently rich set of conditions, it may be feasible to use statistical methods to construct a ‘meta-model’ for extrapolation. For example, environmental economists estimate the value of non-market goods in one environment and then attempt to extrapolate them to other settings [48]. Economists also use statistical meta-models of process-based models to simulate the impacts of changes in policy [49]. Another approach that has been used in the climate change literature is to use spatial or temporal analogues to estimate potential impacts of climate change, e.g. to use large cross-sectional data bases to estimate statistical models that are then extrapolated into the future to predict impacts of climate change on crop yields [42,50]. These methods require large amounts of data, and also usually depend on strong and untested assumptions that parameters are stable across space or time.

#### (ii) Using process-based models

As discussed above, process-based crop or livestock growth models and environmental process models can be used to construct out-of-sample counterfactuals. This approach was pioneered several decades ago [51,52] and continues to be used and improved upon [53]. In principle, one could argue that process-based models might predict out-of-sample better than statistical models, because their structure and parameters are invariant to the changes taking place in the physical environment. However, as discussed above, process-based models also have substantial limitations. A number of other methodological issues need to be addressed when these models are used with economic models, for example, how to use point-based models to represent aggregate outcomes, how to use them with gridded spatial input data and how to account for management variability. Another important issue is whether spatially referenced soil and climate data needed for input into crop simulation models are available at the same sites as economic data so that simulated crop yields can be linked to economic data [38].

#### (iii) Economic engineering and expert data

Often system 2 is a relatively simple modification of system 1, and thus the observed properties of system 1 can provide a close approximation to system 2 when supplemented with relevant information about the modified part of the system. For example, when considering the introduction of a new crop variety, the performance of the system may be very similar except for a change in the mean yield and changes in input use, for example, the average fertilization rate. Often expert data can be used to estimate parameters such as population means.

#### (iv) Minimum-data methods

Antle & Valdivia [7] discuss several methods that can be used to parametrize the TOA-MD model when, for example, statistically representative data are not available or only aggregated data are available. In this latter case, the aggregate data can be used to estimate population means, but supplemental data are needed to estimate spatial heterogeneity. For example, under appropriate assumptions, crop yields or other physical data can be used to estimate spatial variation in farmers' expected returns [33].

#### (v) Pathway and scenario methods

When non-marginal changes are being considered, for example, in climate change impact assessment, it may be appropriate to use new pathway and scenario concepts being developed by integrated assessment researchers [14–16], as illustrated in the Machakos case study presented above with RAPs [17]. Claessens *et al.* [32] present an application of the TOA-MD model to climate impact assessment using RAPs.

## 5. Conclusion

Assessing the future food and environmental security of farm populations poses great challenges at regional and global scales. In this paper, we present conceptual and empirical foundations for a new, parsimonious approach to the development of simulation models that can be used to assess future food and environmental security of farm populations. Our conceptual framework integrates key features of the biophysical and economic-behavioural processes on which these farming systems are based and links them with aggregate processes, for example, changes in market supply and demand conditions, or environmental conditions, for example, climate. Both biophysical and economic data are a critical limiting factor in modelling these complex systems, particularly for the important out-of-sample counterfactuals that must be dealt with in the assessment of important challenges, for example, climate change. We propose the use of parsimonious, population-based simulation methods as a way to make progress in meeting these challenges. One virtue of this approach is that it provides a way for researchers to take advantage of the various observational, experimental, modelled and expert data that may be available, including information from new scenario design concepts, for example, RAPs.

Despite these desirable features, the population-based approach presented here demands further testing, validation and generalization. This type of model has been validated against a more complex system model [34] and has been used successfully to predict adoption and related economic and nutritional outcomes out-of-sample [54]. Nevertheless, several limitations need to be addressed in future research. First, in implementing the simulation of outcome distributions, the current version of the TOA-MD model is based on the assumption of bivariate normality. Yet, distributions of economic returns at the farm level are typically right-skewed, and it is likely that other important outcome distributions are substantially non-normal [54]. Thus, additional research is needed to quantify the importance of possible biases introduced by the assumption of normality as well as methods that could overcome this assumption. Such methods could entail the use of statistical mixture models to appropriately stratify populations or the use of non-parametric methods to characterize and simulate outcome distributions. The second limitation of the simulation approach is that parameter uncertainty is typically evaluated using sensitivity analysis, e.g. as in [8]. Alternatively, to the extent that distributions can be assessed for model parameters, confidence intervals for the simulation outcomes could be derived using Monte Carlo or bootstrapping methods. Finally, it is important to emphasize that while population-based models are useful for policy and research priority setting, they are not useful as decision support tools for individuals. Other kinds of tools are needed to provide management recommendations at the level of an individual farm operation.
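
The Monte Carlo alternative to sensitivity analysis mentioned above can be sketched as follows. This Python example is illustrative only: it treats two of the economic parameters as uncertain, draws them from assumed distributions, recomputes the predicted adoption rate at *a* = 0 for each draw under the normality assumption, and reports an empirical 95 per cent interval. The distributions, values and function names are hypothetical.

```python
import random
from statistics import NormalDist


def adoption_rate(mu1, mu2, sd1, sd2, rho):
    """Predicted share of farms with omega = v(1) - v(2) < 0 under normality."""
    sd_w = (sd1 ** 2 + sd2 ** 2 - 2.0 * rho * sd1 * sd2) ** 0.5
    return NormalDist().cdf((0.0 - (mu1 - mu2)) / sd_w)


def monte_carlo_interval(n_draws, seed=3):
    """Empirical 95 per cent interval for the adoption rate under parameter uncertainty."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_draws):
        mu2 = rng.gauss(110.0, 5.0)  # assumed uncertainty in mean returns of system 2
        rho = rng.uniform(0.5, 0.9)  # assumed range for the between-system correlation
        rates.append(adoption_rate(100.0, mu2, 40.0, 45.0, rho))
    rates.sort()
    lower = rates[int(0.025 * (n_draws - 1))]
    upper = rates[int(0.975 * (n_draws - 1))]
    return lower, upper


lower, upper = monte_carlo_interval(2000)
print(f"95 per cent interval for the adoption rate: [{lower:.3f}, {upper:.3f}]")
```

A bootstrap version would replace the assumed parameter distributions with parameters re-estimated from resampled survey data.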

A major finding of the approach and example we have presented here is the importance of combining the adoption (or selection) behaviour of heterogeneous farm populations with the behaviour-modifying effects of aggregate processes, for example, market interactions. No doubt, similar conclusions can be drawn with respect to important environmental processes. By necessity, environmental and economic models operating on the global scale cannot effectively represent this essential heterogeneity. One possible solution we see is to couple models similar to TOA-MD to regional or global-scale models that capture important aggregate interactions. Aggregate agricultural market models can be interpreted as providing population mean outcomes for important variables, such as crop and livestock production, for agro-ecologically defined regions; however, they do not provide information about the spatial heterogeneity within those regions. Some economic models use crop growth simulation models on a gridded basis to predict changes in crop production. Using some of the ‘minimum-data’ methods we have developed for parsimonious economic models, it may be possible to couple global model outputs with a model similar to TOA-MD to assess consistently the aggregate and distributional implications of climate change and also to incorporate more realistic adaptation scenarios. We foresee these types of enhanced modelling capabilities as the next generation of integrated modelling that will improve the quality and relevance of information that integrated assessment modelling can provide to decision-makers at global and regional scales.

## Funding statement

This research was supported in part by the Agricultural Model Inter-comparison and Improvement Project and by the Regional Approaches to Climate Change in the Pacific Northwest, award no. 2011-68002-30191 from USDA National Institute for Food and Agriculture.

## Appendix A

Following §3, farms are ordered according to an index *ω* such that for the adoption threshold *a*, *ω* ≥ *a* for those farms using system 1 and *ω* < *a* for those using system 2, for *ω* = *v*(1) – *v*(2), where *v*(*h*) is the economic outcome for system *h.* The variable *ω* is spatially distributed according to the density *φ*(*ω*|*p*), which is generally a function of prices and other exogenous variables represented by *p*. The proportion of farms using system 2, referred to as the *adoption rate of system 2*, is given by the cumulative distribution function
A1and the share of farms using system 1 is *r*(*p*, 1, *a*) ≡ 1 – *r*(*p*,2,*a*). We also consider other environmental or social outcomes *z*(*h*). Several relevant results can be derived using this framework [8]. First, the sub-population using each system is characterized by a the joint outcome distribution between *ω* and *k* = *v*,*z*, truncated according to *ω* ≥ *a* for system 1 and *ω* < *a* for system 2, denoted here as *ϕ*(*ω*,*k*| *p*,*h,a*). Second, the joint distribution of *ω* and *k* = *v,z* in a population using both systems is a mixture of the distributions for each system with mixing proportions *r*(*p*,*h*,*a*). Third, integrating *ϕ*(*ω*,*k*|*p*,*h,a*) over the interval *ω* ≥ *a* for system 1 and over *ω* < *a* for system 2 gives the marginal outcome distributions *χ*(*k*|*p*,*h*,*a*) for outcome *k*, conditional on the adoption threshold *a*. These results link the adoption process to the marginal outcome distributions conditional on adoption, and thus provide the basis for the construction of impact indicators based on these outcome distributions. Antle [8] defines the indicators

$$I_k(p,h,a,\tau) \equiv \int_{\tau} \iota(k)\, \chi(k \mid p,h,a)\, dk, \tag{A 2}$$

where *ι*(*k*) is a function of *k*, and *τ* defines the range of the variable considered for a threshold effect. The outcome distribution in the entire population is a mixture of the outcome distributions conditional on adoption:

$$\chi(k \mid p,a) = r(p,1,a)\,\chi(k \mid p,1,a) + r(p,2,a)\,\chi(k \mid p,2,a). \tag{A 3}$$

Combining (A 2) and (A 3), it follows that the impact indicator for the entire population is

$$I_k(p,a,\tau) = r(p,1,a)\, I_k(p,1,a,\tau) + r(p,2,a)\, I_k(p,2,a,\tau). \tag{A 4}$$

Note that in the text and figures 4 and 5, we define the indicators such that $I_k(p,1) \equiv I_k(p,-\infty,\tau)$, $I_k(p,2) \equiv I_k(p,+\infty,\tau)$ and $I_k(p,0) \equiv I_k(p,0,\tau)$.
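The adoption rate in (A 1) and the population-level mixture of indicators can be illustrated numerically. The sketch below assumes, purely for illustration, that *ω* is normally distributed; the function names and the numbers are hypothetical and are not taken from the article.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def adoption_rates(a, mu_omega, sigma_omega):
    """Return (r1, r2), the shares of farms using systems 1 and 2.

    Assumption: omega ~ N(mu_omega, sigma_omega^2). Farms with omega < a
    use system 2, so r2 is the CDF of omega evaluated at a, as in (A 1).
    """
    r2 = normal_cdf(a, mu_omega, sigma_omega)
    return 1.0 - r2, r2


def population_indicator(a, mu_omega, sigma_omega, i1, i2):
    """Mix system-specific indicators with the adoption rates.

    i1 and i2 are the indicator values for the sub-populations using
    systems 1 and 2 at threshold a; they are supplied by the caller.
    """
    r1, r2 = adoption_rates(a, mu_omega, sigma_omega)
    return r1 * i1 + r2 * i2


# Hypothetical example: omega has mean 50 and standard deviation 100.
r1, r2 = adoption_rates(a=0.0, mu_omega=50.0, sigma_omega=100.0)
```

With these illustrative values, roughly 31 per cent of farms adopt system 2 at *a* = 0. Setting *a* to `float("-inf")` or `float("inf")` reproduces the limiting cases in which all farms use system 1 or system 2.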

As the adoption threshold *a* is varied, the adoption rate of each system varies, and the resulting combination of indicators can be graphed as what we call an *adoption-based trade-off curve*:

$$A(p) \equiv \bigl\{ \bigl(I_v(p,a,\tau_v),\, I_z(p,a,\tau_z)\bigr) : -\infty < a < +\infty \bigr\}, \quad p \in \Psi, \tag{A 5}$$

where *Ψ* is the set of feasible prices. Note that *A*(*p*) depends on the threshold parameters $\tau_v$ and $\tau_z$, which we suppress henceforth to simplify notation. These adoption-based trade-off curves are defined by varying the adoption threshold, holding fixed the price vector *p*; thus, there is a family of such curves associated with the set of feasible prices *Ψ*. *Price-based trade-off curves* are defined as the combination of indicator values associated with a particular system, or a combination of systems, in use as prices are varied:

$$T(p,a) \equiv \bigl\{ \bigl(I_v(p,a),\, I_z(p,a)\bigr) : p \in \Psi \bigr\}, \tag{A 6}$$

generated by varying *p* over the set of prices *Ψ*, where $I_k(p,a,\tau_k) = r(p,1,a)\, I_k(p,1,a,\tau_k) + r(p,2,a)\, I_k(p,2,a,\tau_k)$ is the indicator for outcome variable *k* = *v*, *z* in the population with adoption threshold *a* (again we suppress the threshold parameters to simplify notation). In §2, we focus specifically on three special cases of these price-based trade-off curves. In the case where *a* = −∞, all farms use system 1, in which case we obtain the price-based trade-off curve *T*(*p*, 1); and for *a* = +∞, all farms use system 2, giving the price-based trade-off curve *T*(*p*, 2). For the case in which farms choose between systems 1 and 2 to maximize the economic value of the system in use, i.e. *a* = 0, we obtain the trade-off curve defined as *T*(*p*, 0).
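To make the adoption-based trade-off curve concrete, the following sketch traces one out by brute force for a simulated farm population at fixed prices. Everything in it is an assumption made for illustration: the outcome distributions, the parameter values and the function names are hypothetical, and mean indicators are used in place of threshold indicators.

```python
import random


def simulate_farms(n, seed=0):
    """Draw a hypothetical population of farms as (v1, v2, z1, z2) tuples."""
    rng = random.Random(seed)
    farms = []
    for _ in range(n):
        v1 = rng.gauss(500.0, 150.0)  # economic outcome under system 1
        v2 = rng.gauss(550.0, 200.0)  # economic outcome under system 2
        z1 = rng.gauss(10.0, 3.0)     # environmental outcome under system 1
        z2 = rng.gauss(7.0, 3.0)      # environmental outcome under system 2
        farms.append((v1, v2, z1, z2))
    return farms


def mean_indicators(farms, a):
    """Population means (I_v, I_z) when farms with omega >= a use system 1."""
    total_v = 0.0
    total_z = 0.0
    for v1, v2, z1, z2 in farms:
        omega = v1 - v2
        if omega >= a:
            total_v += v1
            total_z += z1
        else:
            total_v += v2
            total_z += z2
    n = len(farms)
    return total_v / n, total_z / n


def adoption_tradeoff_curve(farms, thresholds):
    """One (I_v, I_z) point per adoption threshold, prices held fixed."""
    return [mean_indicators(farms, a) for a in thresholds]


farms = simulate_farms(2000)
thresholds = [float("-inf"), -400.0, -200.0, -100.0, 0.0,
              100.0, 200.0, 400.0, float("inf")]
curve = adoption_tradeoff_curve(farms, thresholds)
```

The first and last points of `curve` correspond to all farms using system 1 and system 2, respectively. The point at *a* = 0 has the highest mean economic outcome on the curve, because each farm then uses whichever system gives it the larger *v*. Repeating the exercise with farms drawn under different prices would trace out the family of curves indexed by *p*.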

The expected output from system *h* with adoption threshold *a* is

$$\bar{q}(p,h,a) \equiv \int q\, \chi(q \mid p,h,a)\, dq, \tag{A 7}$$

where *p* represents the price of the output *q*. Expected output from all farms is given by the aggregate or market supply curve

$$Q(p,a) \equiv N \left[\, r(p,1,a)\, \bar{q}(p,1,a) + r(p,2,a)\, \bar{q}(p,2,a) \,\right], \tag{A 8}$$

where *N* is the number of farms in the population. Note that *Q*(*p*,−∞) is the market supply function when only system 1 is in use; *Q*(*p*,0) is the market supply function when system 2 is introduced, the adoption threshold is *a* = 0, and thus 100*r*(*p*,2,0) per cent of farms adopt system 2.

The northwest quadrant of figure 5 shows the relationship between the economic indicator $I_v$ and the aggregate quantity *Q* of an output produced in the market in which the population participates. Define the net returns expected by an individual farm given output price *p* as *ι*(*q*) = *pq* – *c*(*q*), where *c*(*q*) is the cost of production, so that $I_k(p,a,\tau)$ is the population mean net returns defined for adoption threshold *a* (see (A 2)). As the adoption threshold is varied over its range from −∞ to +∞, we obtain the curve $I_v(p)$ shown in the northwest quadrant of figure 5, which is the correspondence between $I_k(p,a,\tau)$ and *Q*(*p*, *a*) at price *p*.
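The correspondence between mean net returns and market supply can be sketched in the same spirit. The example below is a hypothetical illustration, not the article's implementation: farm outputs and costs are simulated, the price is held fixed, and aggregate supply is built as the number of farms times the adoption-weighted mean output of each system.

```python
import random


def simulate_farms(n, price, seed=1):
    """Hypothetical farms with output q_h and net returns v_h = p*q_h - c_h."""
    rng = random.Random(seed)
    farms = []
    for _ in range(n):
        q1 = max(0.0, rng.gauss(2.0, 0.5))   # output under system 1
        q2 = max(0.0, rng.gauss(3.0, 0.8))   # output under system 2
        c1 = rng.gauss(150.0, 40.0)          # production cost, system 1
        c2 = rng.gauss(300.0, 80.0)          # production cost, system 2
        farms.append({"q": (q1, q2),
                      "v": (price * q1 - c1, price * q2 - c2)})
    return farms


def supply_and_returns(farms, a):
    """Return (Q, I_v) at adoption threshold a.

    Index 0 stands for system 1 and index 1 for system 2. Farms with
    omega = v(1) - v(2) >= a use system 1.
    """
    users = {0: [], 1: []}
    for farm in farms:
        omega = farm["v"][0] - farm["v"][1]
        users[0 if omega >= a else 1].append(farm)

    n = len(farms)
    total_output = 0.0
    mean_returns = 0.0
    for h, group in users.items():
        if not group:
            continue
        share = len(group) / n                               # adoption rate
        q_bar = sum(f["q"][h] for f in group) / len(group)   # mean output
        v_bar = sum(f["v"][h] for f in group) / len(group)   # mean returns
        total_output += n * share * q_bar
        mean_returns += share * v_bar
    return total_output, mean_returns


farms = simulate_farms(1000, price=100.0)
points = [supply_and_returns(farms, a)
          for a in (float("-inf"), -100.0, 0.0, 100.0, float("inf"))]
```

Each element of `points` pairs an aggregate quantity with a mean net return for one threshold, so plotting them gives a simulated counterpart to the curve in the northwest quadrant of figure 5.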

**(a) Calculating treatment effects**

In the ‘treatment effect’ literature, the analysis is based on the change in an outcome caused by treatment in the treated and untreated sub-populations. Counterfactuals corresponding to the indicators defined in (A 2) can be constructed as an extension of the results presented in [8]. Here, we present results for mean indicators under normality; extension to threshold indicators follows the same logic. We also suppress the parameter *p*, which is held constant, to simplify the presentation.

Under normality, the conditional mean of outcome *k*, given *ω*, is given by (A 9), where $\mu_k(h)$ is the unconditional mean of *k* for system *h*, and other parameters are defined in equations (4.1)–(4.3) and related text. For a standard normal density $\phi^{*}$, the inverse Mills' ratio for the truncated distribution of *ω* associated with each system is given by (A 10), and the means of the truncated distributions of *ω* for each system are given by (A 11). Taking the expectation of (A 9) with respect to the truncated distribution of *ω*, and using (A 10) and (A 11), gives the means of the truncated outcome distributions for systems *h* = 1,2 (A 12). For system 1, the counterfactual mean (A 13) is constructed by taking the expectation with respect to the distribution of *k* for system 2 over the interval (*a*, +∞). The counterfactual mean for system 2 (A 14) is constructed similarly. Using these results and the standard definitions of treatment effects [10], we obtain the average treatment effects on the treated (TT) and untreated (TU) for outcome *k* at adoption threshold *a*.

Treatment effects on the treated and untreated groups can be derived for threshold indicators as well (for example, the poverty rates given in table 1). The average treatment effect (ATE) for the entire population is the difference in the unconditional means for systems 1 and 2, $\mathrm{ATE}_k = \mu_k(2) - \mu_k(1)$. It also follows that the ‘local average treatment effect’, i.e. the ATE taken over some range of adoption rates, can be similarly constructed. Finally, observe that any ‘policy-relevant treatment effect’ can be constructed from these results, that is, the treatment effect for a policy-defined value of the adoption threshold *a* or a policy-constrained range of the adoption rate. For example, in an analysis of a subsidy or tax on the use of system 2, the adoption of system 2 will deviate from the point *a* = 0 to a point where *a* equals the subsidy or tax, and treatment effects can be constructed at the adoption rate for system 2 at that value of *a*.
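As a minimal sketch, the results described above can be written out with standard bivariate-normal formulas. The notation is an assumption and may differ from that of equations (4.1)–(4.3): $\mu_\omega$ and $\sigma_\omega$ denote the mean and standard deviation of *ω*, $\sigma_k(h)$ the standard deviation of outcome *k* under system *h*, $\rho_k(h)$ the correlation between *k*(*h*) and *ω*, $\Phi^{*}$ the standard normal distribution function, and the superscript *c* a counterfactual mean.

```latex
\begin{align*}
E[k(h) \mid \omega] &= \mu_k(h) + \rho_k(h)\,\sigma_k(h)\,\frac{\omega - \mu_\omega}{\sigma_\omega},
  \qquad a^{*} = \frac{a - \mu_\omega}{\sigma_\omega}, \\
\lambda(1,a) &= \frac{\phi^{*}(a^{*})}{1 - \Phi^{*}(a^{*})},
  \qquad \lambda(2,a) = -\frac{\phi^{*}(a^{*})}{\Phi^{*}(a^{*})}, \\
E[\omega \mid \omega \ge a] &= \mu_\omega + \sigma_\omega\,\lambda(1,a),
  \qquad E[\omega \mid \omega < a] = \mu_\omega + \sigma_\omega\,\lambda(2,a), \\
\mu_k(h,a) &= \mu_k(h) + \rho_k(h)\,\sigma_k(h)\,\lambda(h,a), \qquad h = 1,2, \\
\mu_k^{c}(1,a) &= \mu_k(2) + \rho_k(2)\,\sigma_k(2)\,\lambda(1,a), \\
\mu_k^{c}(2,a) &= \mu_k(1) + \rho_k(1)\,\sigma_k(1)\,\lambda(2,a), \\
\mathrm{TT}_k(a) &= \mu_k(2,a) - \mu_k^{c}(2,a),
  \qquad \mathrm{TU}_k(a) = \mu_k^{c}(1,a) - \mu_k(1,a).
\end{align*}
```

Under these assumptions the adoption rates are $r(2,a) = \Phi^{*}(a^{*})$ and $r(1,a) = 1 - \Phi^{*}(a^{*})$, so that $r(1,a)\,\lambda(1,a) + r(2,a)\,\lambda(2,a) = 0$. Weighting the two effects by the adoption rates therefore recovers the population effect, $r(2,a)\,\mathrm{TT}_k(a) + r(1,a)\,\mathrm{TU}_k(a) = \mu_k(2) - \mu_k(1)$, which gives a quick consistency check on any numerical implementation.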

## Footnotes

One contribution of 16 to a Discussion Meeting Issue ‘Achieving food and environmental security: new approaches to close the gap’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.