Protected areas remain a cornerstone for global conservation. However, their effectiveness at halting biodiversity decline is not fully understood. Studies of protected area benefits have largely focused on measuring their impact on halting deforestation and have neglected to measure the impacts of protected areas on other threats. Evaluations that measure the impact of protected area management require more complex evaluation designs and datasets. This is the case across realms (terrestrial, freshwater, marine), but measuring the impact of protected area management in freshwater systems may be even more difficult owing to the high level of connectivity and potential for threat propagation within systems (e.g. downstream flow of pollution). We review the potential barriers to conducting impact evaluation for protected area management in freshwater systems. We contrast the barriers identified for freshwater systems to terrestrial systems and discuss potential measurable outcomes and confounders associated with protected area management across the two realms. We identify key research gaps in conducting impact evaluation in freshwater systems that relate to three of their major characteristics: variability, connectivity and time lags in outcomes. Lastly, we use Kakadu National Park world heritage area, the largest national park in Australia, as a case study to illustrate the challenges of measuring impacts of protected area management programmes for environmental outcomes in freshwater systems.
Protected areas are the primary strategy implemented to halt the global decline in biodiversity, as evidenced by global commitments such as the Convention on Biological Diversity Aichi targets to protect 17% of terrestrial and inland water and 10% of marine areas by 2020 . Protected areas contribute to biodiversity conservation by removing extraction pressures from an area (e.g. deforestation in terrestrial biomes; fishing in marine biomes) and by supporting management of threats within protected areas (e.g. control of invasive plants and animals). Progress towards the Convention on Biological Diversity targets has resulted in a steady increase in the numbers of both terrestrial and marine protected areas [2–4]. Despite this increase, current indices show that biodiversity continues to decline while human pressures increase [2,5], and there is a lack of evidence that protected areas are a meaningful mechanism to protect biodiversity . Recent literature on terrestrial protected areas has largely focused on measuring their impact on halting deforestation [7–9], yet there remain critical gaps in applying impact evaluation methods to other ecosystems and other interventions such as altered management of established protected areas.
The primary way in which the impacts of protected area management have been evaluated is based on the framework of management effectiveness (over 8000 assessments globally ). The information needed to assess management effectiveness is lacking for most protected areas, although recent assessments have concluded that few had sound management [10–12]. The main reasons for this finding are lack of financial resources (especially in developing countries), deficiencies in management capacity (e.g. lack of skilled staff), or a poor understanding of how to address threats [13,14]. Management effectiveness evaluation includes assessing three main components: (i) design issues relating to the management of both individual sites and protected area systems, (ii) appropriateness of management systems and processes and (iii) delivery of stated protected area objectives . This review focuses on the third component, assessing the delivery of management objectives, particularly evaluation methods to estimate more rigorously the success or failure of management. The standard approach to measuring the impact of protected area management is performance measurement, which monitors changes in biophysical and socio-economic indicators over time [10,11]. While this provides important information about changes within protected areas, it fails to provide a true measure of the impact of management as it does not account for changes that might have occurred in the absence of management.
Impact evaluation approaches involve randomized experimental trials or quasi-experimental approaches that use appropriate statistical tools to account for biases to evaluate the effects of an intervention [7,16]. The use of matching methods to compare ‘treated’ units with ‘control’ units that are very similar in baseline environmental and socio-economic characteristics is becoming more common in conservation impact evaluation . The majority of conservation impact evaluation studies have investigated the effects of protection on rates of deforestation using readily available satellite imagery products for forest cover [8,9,17], or threatened species management policies (e.g. the US Endangered Species Act or the Australian Environment Protection and Biodiversity Conservation Act [18,19]). A major research gap is how to apply these evaluation methods to a wider range of ecosystems and conservation strategies.
A fundamental requirement of impact evaluation is the identification and clear definition of the counterfactual : what would have happened in the absence of intervention? Addressing counterfactuals in evaluating conservation interventions requires (i) identification of ecological and socio-economic factors that covary with interventions (confounders), (ii) collection of data on confounders, (iii) construction or identification of control groups (units that are not subjected to the intervention) and (iv) collection of indicator data pre- and post-intervention for treated and control groups to quantify responses to the intervention [7,20]. Relevant data on confounders and outcomes of interventions can be very difficult to identify and collect for conservation interventions.
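Step (iv) can be illustrated with a simple before-after-control-impact (difference-in-differences) calculation. The sketch below is a minimal illustration rather than a method taken from the studies cited here: the function name and the indicator values are hypothetical, and the estimate is only meaningful under the assumption that treated and control units would have followed parallel trends in the absence of the intervention.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change in treated units minus change in controls."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical indicator values (e.g. percentage cover of an invasive plant).
treated_pre = [10.0, 12.0, 14.0]
treated_post = [6.0, 8.0, 10.0]
control_pre = [11.0, 13.0, 15.0]
control_post = [15.0, 17.0, 19.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)  # negative values mean the indicator fell relative to controls
```

The change observed in the control group stands in for the counterfactual trend, which is why the quality of the control units matters more than the arithmetic itself.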
The difficulty in obtaining relevant data may be one reason that most rigorous studies of impact evaluation for environmental policy to date have focused on terrestrial systems and used readily observable data from satellite imagery (e.g. forest cover) to measure intervention impact at multiple timesteps. By contrast, studies that have evaluated the benefits of protected areas for freshwater species (reviewed in ) have typically used outcome measures such as fish abundance, which cannot be estimated remotely, requiring field sampling and limiting the spatial extent of studies. Furthermore, the distinctive connectivity of freshwater and marine systems (i.e. mediated by the presence and movement of water) poses unique challenges compared with terrestrial systems. This requires alternative approaches for measuring and controlling for confounders; for example, upstream land and water uses might affect both the placement and outcomes of downstream freshwater management, thus confounders both within the realm of interest and connected realms should be accounted for . Difficulties such as a lack of appropriate assessment methods, indicators and data, and complex feedbacks from connectivity that are difficult to map and quantify, could explain the lag in impact evaluation studies for environmental policy in marine and freshwater ecosystems.
Relatively few studies have evaluated the benefits of protected areas in maintaining freshwater species, and the evidence they have provided is mixed (reviewed in ). Examples of studies finding positive impacts of freshwater-protected areas include Baird & Flaherty , who reported that village-managed fish conservation zones enhanced fish stocks in the Mekong River, Lao. Similarly, Cucherousset et al.  found that eels were more abundant and larger in protected portions of a French wetland than in fished areas, and Sanyanga et al.  reported that the mean body size of commercial fish species was larger in protected than in fished areas of Lake Kariba, Zimbabwe. Studies finding no benefits of freshwater-protected areas include Mancini et al. , who detected no differences in an aquatic macroinvertebrate index between stream reaches inside and outside of Italian protected areas. Similarly, Srinoparatwatana & Hyndes  found no consistent differences in abundance or biomass of wetland fish species between a protected area and an adjacent fished area in Thailand, and Chessman  concluded that protected status had little overall effect on fish assemblages in the Murray–Darling basin, Australia. Importantly, these studies involved comparisons of locations inside versus outside protected areas, but did not assess performance over time in response to protected area establishment and management [10,11]; nor did they attempt to elucidate causal relationships with experimental or quasi-experimental designs incorporating counterfactual thinking .
The aim of this review is to contribute to progress in impact evaluation for conservation of freshwater systems. We focus on freshwater systems because of their recognized importance in conservation [28,29], the inherent complexities they present for impact evaluation of environmental policy, and the relative paucity of guidance on how to conduct evaluations for these systems. We focus our review specifically on protected area management because it is recognized as an important conservation intervention for freshwater , and the impacts of protected area management actions have rarely been evaluated.
We first provide a general review of the barriers to impact evaluation for freshwater systems. We then contrast the potential confounding factors across terrestrial and freshwater systems, discuss the types of data available to account for confounding factors, and highlight the specific constraints and opportunities associated with impact evaluation in freshwater systems. Lastly, we use Kakadu National Park World Heritage area to demonstrate how an impact evaluation study might be designed for measuring the impact of protected area management in the river–floodplain systems of northern Australia .
2. Challenges and opportunities for impact evaluation in freshwater systems
Ferraro  listed a number of barriers for evaluating the impact of environmental programmes such as protected area designation and management. Given that most evaluations of protected area impacts have focused on terrestrial ecosystems, it is important to consider the extent to which these potential barriers apply to freshwater ecosystems and to the potential challenges and opportunities for impact evaluation in freshwater ecosystems. Most barriers are similar across realms, such as limited resources (small operations budgets) or data (e.g. infrequent sampling, insufficient baselines) for undertaking evaluations. Three potential barriers—relating to high levels of natural variability and spatio-temporal connectivity—may be particularly difficult to account for in freshwater systems. For these three barriers, we highlight possible confounding factors—extraneous variables that correlate with both the selection and outcome of treatment—that should be accounted for. We discuss each of the barriers individually with a special focus on freshwater impact evaluation.
(a) High rates of natural variability
Evaluating the impacts of protected areas on freshwater ecosystems can be difficult owing to the high rates of natural variability that can affect the outcomes of management interventions and the ability to detect and monitor responses to them. While natural variability in environmental conditions such as spatio-temporal variability of rainfall or temperature applies across all realms, this can be particularly dramatic in freshwater systems. For example, even if low rainfall years result in less ground cover in some terrestrial vegetation communities, the vegetation community is unlikely to disappear completely. In contrast, a freshwater system may disappear entirely owing to low rainfall, resulting in much more radical (albeit temporary) structural changes. This spatio-temporal environmental variability has repercussions for both measuring outcomes and selecting appropriate control units for comparison with treatment units. For example, hydrological disturbances such as floods and droughts cause profound changes in lotic ecosystems and have been shown to affect the outcomes of river restoration interventions [31,32]. Natural variation in rainfall patterns, including short-term variability in localized rain events, and catchment topography can mean that geographically proximate river systems have very different hydrological disturbance histories and recovery trajectories . Thus, approaches for selecting control sites that share similarities in climate and catchment characteristics  with treated sites may not adequately account for hydrological disturbance histories.
To account for natural variability among locations, evaluations of protected areas will require comparable control and treatment sites with an explicit focus on ensuring that they have similar hydrological histories. This can be assisted with data products such as regional classifications of freshwater systems  or hydrological data collected for other purposes such as measuring water availability. Matching broad hydrological regimes of catchments is important when selecting control units, but may not account for discrete events such as intense storms that affect individual catchments. Selecting a sufficient sample size of treated and matched control units can address this type of variation. However, if the number of treatment units is small, hydrological data can inform selection of measurement periods and interpretation of differences between treated and control trajectories.
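The selection of hydrologically comparable control units described above can be sketched as nearest-neighbour matching on standardized covariates. This is an illustrative example only: the covariates (rainfall, flood days, slope), the unit names and all values are hypothetical, unit names are assumed to be unique, and matching is done with replacement, so one control may be paired with several treated units.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each covariate column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def match_controls(treated, controls):
    """Return, for each treated unit, the name of the nearest control unit."""
    names = list(treated) + list(controls)
    scaled = dict(zip(names, standardize([*treated.values(), *controls.values()])))
    matches = {}
    for t in treated:
        # Euclidean distance in standardized covariate space
        matches[t] = min(controls, key=lambda c: math.dist(scaled[t], scaled[c]))
    return matches

# Hypothetical covariates: (mean annual rainfall in mm, flood days per year, mean slope in %)
treated = {"T1": (1400.0, 120.0, 0.5), "T2": (900.0, 40.0, 3.0)}
controls = {
    "C1": (1350.0, 115.0, 0.6),
    "C2": (950.0, 45.0, 2.8),
    "C3": (600.0, 10.0, 6.0),
}

matches = match_controls(treated, controls)
print(matches)
```

Standardizing first prevents a covariate measured on a large scale (here rainfall in millimetres) from dominating the distance. Discrete events such as individual storms are not captured by long-term averages like these, which is the limitation noted in the text.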
(b) Spatial scale of ecological processes and spill-over effects
Rivers and wetlands rarely have their entire catchments protected. In fact, rivers are often used as the boundary lines for protected areas, creating mixed land use on either side of the river . One of the few exceptions in Australia is Kakadu National Park (see §4 and figure 1), which was established at a scale to protect the entire catchment of a large floodplain river. Furthermore, protected areas are rarely declared with specific objectives for freshwater ecosystems. Developing freshwater management plans or objectives tends to occur retrospectively . Where freshwater systems are specifically protected, the protection is often linked with the imposition of a direct threat elsewhere; examples are protecting the upper reaches of dam catchments to maintain water quality and as compensation for the loss of forests under reservoirs [36,37]. Because protected areas are rarely designed with freshwater systems as a central focus, they often fail to capture longitudinal (headwaters to sea), lateral (river to riparian and floodplain) and vertical (surface water to groundwater) connectivity fundamental to the functioning of freshwater ecosystems . Insufficient recognition is given to connectivity for surface water systems, but even less consideration is given to the dependence of freshwater ecosystems on groundwater when setting protected area boundaries. Yet, in some regions, groundwater connections may be far more significant than catchment boundaries .
Larger-scale connectivity initiatives, such as connecting reserve systems with extensive corridors, have been undertaken in terrestrial systems to support the movement of fauna in response to threats (e.g. loss of vegetation and climate change) . Although the concepts are well founded [29,41], and tools are available to explicitly incorporate freshwater connectivity into systematic conservation prioritizations [42,43], we know of no examples of similar, extensive initiatives that have been implemented in freshwater systems. These design issues, which are particularly relevant for freshwater ecosystems (failure to protect connected systems and designation without relevant environmental objectives defined and incorporated in the design), may exacerbate difficulties associated with capturing ecological processes and mitigating threats.
Owing to the highly connected nature of freshwater ecosystems (longitudinal, lateral, vertical and temporal ), spill-over effects (interactions across the boundaries of protected areas) are likely to bias the evaluation of local impacts unless these connections are understood and controlled for. Spill-over effects may be positive or negative and may occur across treatment boundaries (cross-boundary) and ecosystems (cross-realm) . Negative spill-over effects include threat propagation, such as invasive fish dispersing upstream or pollutants drifting downstream into protected areas. For example, despite being designed to protect the entire catchment for the South Alligator River, Kakadu experiences cross-boundary threats; threats to Kakadu's freshwater floodplains, such as feral buffalo and aquatic invasive weeds, move across the landscape and enter the park from neighbouring properties (figure 1). Kakadu also experiences cross-realm threats; for example, sea-level rise represents a potential cross-realm threat from the marine realm to the freshwater realm. Examples of positive spill-over effects include protected areas acting as a refuge or source for populations in unprotected sites and propagation of benefits from protected areas such as improved downstream water quality from vegetation management for erosion control. Many of the actions to mitigate threats to terrestrial systems have an indirect benefit for freshwater ecosystems, highlighting the potential for positive cross-realm spill-over . Abraham & Kelkar , for example, found that terrestrial protected areas benefited freshwater species (protected sites had higher total and endemic fish diversity than unprotected sites), demonstrating cross-realm benefits.
Terrestrial evaluations often assume that neighbouring forest units are independent and hence divide protected areas into standard units of evaluation (e.g. 1 km² grids ); this assumption of independence does not hold true in freshwater systems where neighbouring streams are often connected longitudinally, laterally and vertically. One potential approach to controlling for connectivity is to select units of evaluation that encompass all relevant connections, such as floodplains, catchments or subcatchments. While selecting connected systems as units of evaluation may help to account for connectivity influences, it may reduce the available sample size and increase the difficulty of selecting appropriate control units.
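The trade-off between capturing connections and retaining sample size can be made concrete by grouping connected stream reaches into units of evaluation. The following sketch uses a union-find structure to do this; the reach labels and links are hypothetical, and a real application would derive the links from a mapped river network rather than a hand-written list.

```python
def connected_units(reaches, links):
    """Group reaches into units so that any two linked reaches share a unit."""
    parent = {r: r for r in reaches}

    def find(r):
        # Follow parent pointers to the representative reach of the unit.
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path compression
            r = parent[r]
        return r

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two units

    units = {}
    for r in reaches:
        units.setdefault(find(r), set()).add(r)
    return sorted(units.values(), key=lambda u: sorted(u))

# Hypothetical reaches; each link marks a hydrological connection.
reaches = ["A1", "A2", "A3", "B1", "B2", "C1"]
links = [("A1", "A2"), ("A2", "A3"), ("B1", "B2")]

units = connected_units(reaches, links)
print(len(reaches), "reaches collapse to", len(units), "units")
```

In this toy case, six reaches that a grid-based design might treat as independent become three units once connections are respected, which illustrates how quickly the available sample size can shrink.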
(c) Time lags
Closely linked to the consideration of how spatial connectivity might bias evaluations in freshwater ecosystems is the issue of appropriate temporal scales for measurement of baselines and outcomes, particularly with respect to the influence of time lags. For example, water entering groundwater aquifers may take decades before becoming part of the surface water system. Consequently, activities that affect water quantity or quality (e.g. regional aquifer drawdown) may not be detected until well after they have occurred. This is compounded by the fact that understanding of groundwater–surface water interactions is increasingly recognized as a key knowledge gap in the management of freshwater ecosystems . In addition, surface water connections may also result in time lags when long distances and/or cross-realm connections are involved . For example, the effects of managing catchment vegetation to reduce erosion and improve water quality may take many years to manifest in aquatic ecosystems .
For evaluations of protection or management actions to be valid, they need to account for the potentially long delays (of decadal scales) in ecological responses of freshwater ecosystems, requiring an understanding of the interactions among the catchment, surface water and groundwater systems. Such knowledge, particularly for surface water–groundwater interactions, is likely to be absent in most systems and will require further data acquisition and modelling of systems to understand and account for time lags. Acquiring data over relevant periods requires a commitment to implementing long-term environmental monitoring to detect ecological responses, as well as overcoming constraints imposed by available budget, human resources and technical impediments. Remote sensing technologies offer one affordable means of sampling over large spatial scales and long time frames. Advances allow for the application of this technology to some aquatic ecological assets, such as benthic habitat . Another opportunity for collecting data over appropriate timescales is the use of existing data on water quality and quantity that are routinely collected for other purposes (owing to the utility value of water resources, e.g. gauged discharge data and water quality data). These data can be used to assess alterations in flow regime and river health related to treatment outcomes, but can also be used for selection of appropriate control units for comparison with treatment units to account for confounding factors.
3. Identifying and controlling for confounding factors in an evaluation of protected areas on freshwater systems
As discussed above, variability, connectivity and time lags present particular difficulties for impact evaluations in freshwater. Evaluation design will thus need to address these difficulties through approaches such as controlling for connectivity by defining sampling units to capture relevant connections. We explore these barriers for both terrestrial and freshwater systems by elucidating and contrasting the types of bias and possible data and approaches available to account for them.
The first step in evaluating any environmental programme, whether it be with experimental, quasi-experimental or other evaluation approaches, is formulating a theory of change related to the intervention, such as protected area management, and the expected outcome(s) . This will include stating causal hypotheses, identifying assumptions, considering alternative explanations for outcomes and controlling for potential confounders.
We start by describing a theory of change related to environmental protection and associated management, and discuss possible confounding factors and selection bias (figure 2; for in-depth discussion, see electronic supplementary material). Establishing a protected area might have two associated mechanisms for changing a system: removing or avoiding exposure to a threat, or managing and mitigating the impacts of a threat. Removal or avoidance of a threat is primarily associated with the identification and actual declaration of the protected area, whereas the second (management or mitigation) is associated with the environmental management actions implemented after establishment. Typical protected area management actions that manage or mitigate the impacts to freshwater ecosystems include riparian zone management (e.g. fencing and revegetation), removal of introduced plants and animals, and management of pollution inputs (e.g. nutrients, toxicants) . The first outcome (removal or avoidance of a threat) is the focus of most evaluations to date in which, for example, the effect of protection on deforestation rates is measured using the outcome of forest cover through time [8,9]. Analogously, in freshwater systems, this could be measured with changes in hydrology or water quality or with surrogate measures of potential sediment load such as riparian forest cover (figure 2a,d). The second outcome (mitigation of threats) requires understanding which threats are present and have been addressed through active management; if multiple actions are implemented, this may require accounting for their interactions in the theory of change (figure 2b,e).
The ecological and socio-economic factors that covary with the placement of protection and the implementation of management are likely to be correlated, but controlling for biases may also require additional information for evaluating management impacts (see the electronic supplementary material). For example, the placement of protection is often biased to less-productive lands, so surrogate measures of land productivity such as water availability, soil and slope can be used to account for these factors . In the case of management, areas that are more conducive to establishment of a particular threat, such as invasive plants, or experience higher rates of exposure to the threat are more likely to have the threat present and are thus more likely to trigger management of the threat [50,51]. Possible confounders might relate to land productivity (e.g. soil and slope) and exposure to risks, such as proximity to areas already affected by invasive species, transport corridors or historical land use. Thus, similar confounders (i.e. those that relate to land productivity) are relevant to both protection and management evaluations.
Readily available data exist for candidate predictors of the confounders associated with terrestrial ecosystems, and these predictors are also relevant for freshwater systems (see the electronic supplementary material). However, as discussed above, other sources of bias in evaluation, such as spatio-temporal variation in hydrological regime and connectivity, are more important in freshwater than terrestrial ecosystems, and will require additional data and approaches to be developed to account for them [43,52]. Addressing these potential sources of bias will start with developing causal hypotheses that account for connectivity and how it might relate to the spatial and temporal dynamics of outcomes. Including connectivity issues in the theory of change can then inform the selection of sample units that account for relevant aspects of connectivity (e.g. longitudinal flow of upstream threats in a catchment; figure 2f). Additional aspects of connectivity (exposure to cross-realm and cross-boundary threats) may be difficult to account for solely with biologically meaningful sample units. For example, using catchments as sample units will not account for cross-realm threats from the marine system (e.g. sea-level rise). Instead, these biases may be accounted for when selecting control units with methods such as matching . One limitation of using catchments as sample units is that it can reduce the number of available control units that are sufficiently similar to treatment units to warrant consideration. This is exacerbated by the physical and ecological heterogeneity of catchments, which might require matching of subcatchments for some purposes. Ultimately, assessing the effectiveness of protected areas relies on careful matching of control sites. This is more challenging in the design of studies for freshwater ecosystems because of the larger number of factors that must be matched.
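Whether matched control sites are in fact similar to treated sites can be checked covariate by covariate. One widely used diagnostic, offered here as an illustration rather than as the approach of any study cited, is the standardized mean difference; the covariate and its values below are hypothetical.

```python
import math
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical covariate: distance (km) from each unit to the nearest known infestation.
treated_distance = [2.0, 4.0, 6.0]
control_distance = [1.0, 3.0, 5.0]

smd = standardized_mean_difference(treated_distance, control_distance)
print(round(smd, 3))
```

Values near zero indicate that the two groups are similar on that covariate. Each additional factor that must be matched, such as hydrological regime or exposure to cross-boundary threats, adds another balance check that the small pool of candidate catchments has to satisfy.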
4. The case of Kakadu National Park: measuring the benefits of protected area management on an aquatic invasive plant species
The tropical coastal floodplains (figure 1) of the ‘top end’ region of Australia's Northern Territory (NT) are internationally recognized for their ecological, socio-economic and cultural values. In the wet season, high rainfall (1300–1500 mm per year) results in high river flows and extensive floodplain inundation. The dry season is characterized by reduced river flows and floodplain drying . Significant components of this river–floodplain system occur within Kakadu National Park. Kakadu's freshwater values are threatened by invasive weeds such as Mimosa pigra and olive hymenachne and feral animals such as buffalos and pigs (figure 1). The management of these threats can be challenging owing to the connectivity of the floodplains; for example, water-borne seeds of invasive weeds are dispersed widely (both downstream and laterally across floodplains), causing long-distance dispersal that can be difficult to detect and manage. Given Kakadu's status as a world heritage area and its national significance in the Australian National Reserve System, it has received significant investment of public resources for management. Despite this significant investment, the impact of management of the park has never been systematically evaluated.
One management outcome, which is often cited as being the consequence of protection and a legislated, rapid, well-resourced and ongoing management response, is the effective management of the leguminous shrub Mimosa pigra, a noxious weed occurring in many parts of northern Australia (figure 3) . The rapid management response to mimosa, followed by long-term and consistent management, has resulted in the successful containment of a very high-risk weed within the park, a management success that has not been replicated in other comparable floodplains within the NT.
Given the noted success of managing mimosa within the park and the availability of distribution data, we restrict our discussion to mimosa but note that a similar approach can be applied to the management of other invasive species. We highlight the aspects of evaluation design discussed above, including sample units, selection of control units and controlling for confounders.
(a) Theory of change
In the 1970s, significant uranium deposits were discovered in the Alligator Rivers region, and in 1975, a formal proposal to develop the deposits was submitted. This resulted in a formal inquiry to address potentially conflicting national and regional issues of uranium mining, including conservation and Aboriginal land ownership. The recommendations of the Inquiry included declaration of Kakadu National Park in stages and continued mining outside of the declared areas . Declaration of the portion of Kakadu containing the floodplains occurred in two stages between 1979 and 1984. The declaration of Kakadu was particularly significant, because the park boundaries were located to ensure that they encompassed almost all of the catchment of the South Alligator River, the entire catchment of the smaller West Alligator, and significant parts of the catchments of the Wildman River and the East Alligator River (figure 3).
Following Park declaration, extractive land uses and other activities inconsistent with national park status were restricted, and a park management plan was established to address existing threats. Therefore, Kakadu floodplains differ from neighbouring floodplains in that they have not been exposed to extractive land uses, in particular grazing and mining, since 1984 and they have been managed in line with the park management plan. The aim of Kakadu National Park's management plan as it relates to invasive species is to protect Park values by strategic management of weeds, prevention of invasion by new species and increased understanding of weed management among Park residents, neighbours and visitors. The Park is bound by Commonwealth and Territory legislation (Environment Protection and Biodiversity Conservation (EPBC) Act 1999 and Weeds Management Act 2001 (NT)) to manage listed species in accordance with the management plans. In the case of mimosa, Kakadu is within the declared eradication zone. As such, the goal of the Kakadu mimosa management programme is to eradicate this species within the park and to stop its reinvasion from surrounding properties. The inputs to this programme include a dedicated team of four staff and associated resources (e.g. airboat, spray equipment, herbicide) requiring an ongoing annual financial commitment of approximately $500 000 . The programme's activities are built around a ‘search and eradicate’ response that includes systematic survey of the floodplains for any new infestations and multiple visits each year to existing management plots for monitoring and treatment . The outputs of monitoring are the number of plots visited each year and the number of plants within these plots treated (by manual removal or herbicide). The outcome of the programme is the removal of mimosa from the floodplains (measured by percentage cover). Final outcomes would be the increased extent and abundance of floodplain biodiversity (e.g. measured by native vegetation extent and significant biodiversity such as the size of the breeding population of magpie geese) [58–60].
We hypothesize that the percentage cover of mimosa on Kakadu's floodplains is significantly lower than it would be in the absence of active protected area management. Possible confounding factors include land suitability for establishment of this weed, historical land use and threat exposure, such as proximity to source populations of mimosa.
(b) Selecting a control
The counterfactual for the impact of the mimosa management programme in Kakadu is the outcome expected in the absence of protected area management. Given that floodplains are connected units, and in line with our discussions above, we suggest using each floodplain as a sample unit. By using whole floodplains as sample units, we account for most of their lateral and longitudinal connectivity (noting that floodplains are sometimes connected to neighbouring floodplains in the wet season and so may not be entirely isolated units).
In order to select an appropriate ‘control’ floodplain to serve as a measure of this counterfactual, both the invasion history and management history must be known. In the case of Kakadu, we can consider the potential pool of floodplains in northern Australia for their shared biophysical and land-use characteristics, and any other confounding characteristics, to determine whether one of the floodplains might serve as an appropriate counterfactual (see the electronic supplementary material for list of confounding factors to consider). The two closest floodplains, those of the Mary and East Alligator/Murgenella Rivers, are exposed to similar climatic conditions and have similar environmental characteristics, which suggests that they would be appropriate ‘control’ units (figure 3). The hydrology of floodplains is an important aspect of habitat suitability for both native and invasive plant species. While mapping of inundation patterns for Kakadu shows fine-scale variation in inundation frequency and hydroperiod within floodplains, the mean inundation period and hydroperiod across floodplains is similar (table 1). This suggests that at the sample unit level (floodplains) there is no significant hydrological difference among floodplains within Kakadu, and this is likely to hold true for neighbouring floodplains. These shared ecological and biophysical characteristics also suggest that the floodplains may have similar suitability for mimosa.
The ecological condition and land use prior to declaration of Kakadu are also broadly similar across these floodplains. At the time of declaration, the floodplains were primarily used for indigenous traditional use (food hunting and harvesting, cultural purposes), commercial hunting of non-domesticated buffalo for their hides and horns, and pastoralism of domesticated cattle. The characteristics are therefore similar for floodplains within and neighbouring Kakadu.
Records indicate that the invasion history of Kakadu and neighbouring floodplain regions is also similar, with time of first invasion occurring around the same period. The first large infestation of mimosa in the NT was discovered in 1952 in the Adelaide River, 100 km south of Darwin and approximately 100 km from what is now the border of Kakadu. In 1981, reported records of mimosa indicate that there were ‘plants scattered over approximately 4000 ha’ in the Adelaide River catchment and that spread from the Adelaide River had occurred eastward through to Arnhem Land. While time of invasion is similar across the floodplains, the percentage of floodplains infested at the time of park declaration (1984) differs [62,63] (table 1). This may in part be owing to differences such as distance to the original infestation or habitat suitability. Initial environmental conditions will influence programme outcomes; this is particularly true for changes in invasive species populations that exhibit exponential growth rates. Therefore, we select the East Alligator/Murgenella floodplain, whose initial cover is closest to that observed in Kakadu, as a control for our difference-in-difference estimator below. However, the Mary River floodplain provides useful information for bounding our understanding of the potential protected area management impacts. We therefore retain the Mary River floodplain data for inclusion in our partial identification approach.
(c) Evaluation approach
We aimed to estimate the average treatment effect on the treated (ATT): the difference between the expected change in percentage cover of mimosa on Kakadu's floodplains under protected area management and the counterfactual change in mimosa cover without protected area management. We start our analysis by providing an estimate of the plausible ranges of the treatment effect making no or limited assumptions. To do this, we use a partial identification approach. We then use a difference-in-difference (also known as before-after-control-impact) estimator to narrow our ATT estimate based on the additional assumption that the expected trend in mimosa cover of the control unit is equal to the expected trend in mimosa cover of the Kakadu units in the absence of the protected area management programme.
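The estimand and the identifying assumption described above can be written compactly. The notation below is our own shorthand rather than notation taken from the cited methods: $\Delta Y(1)$ and $\Delta Y(0)$ denote the 1984 to 2014 change in percentage cover with and without protected area management, $D = 1$ marks the treated (Kakadu) floodplains and $D = 0$ the control floodplain.

```latex
\begin{align}
\mathrm{ATT} &= E[\Delta Y(1) \mid D = 1] - E[\Delta Y(0) \mid D = 1], \\
E[\Delta Y(0) \mid D = 1] &= E[\Delta Y(0) \mid D = 0]
  \quad \text{(parallel trends assumption)}, \\
\widehat{\mathrm{ATT}}_{\mathrm{DiD}} &=
  \left(\bar{Y}_{D=1,\,2014} - \bar{Y}_{D=1,\,1984}\right)
  - \left(\bar{Y}_{D=0,\,2014} - \bar{Y}_{D=0,\,1984}\right).
\end{align}
```

The first term of the ATT is observed; the second is the unobserved counterfactual, which partial identification bounds and difference-in-difference replaces with the control trend.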
For both approaches, we used distribution records and available references to estimate the initial (1984) and current (2014) percentage cover of mimosa on floodplains [55,65]. We chose 2014 as the measurement period to capture 30 years of protected area management. This period is sufficient to account for time lags. For example, mimosa has a long-lived seed bank and therefore ongoing management is required to ensure that seedlings do not emerge at treated sites. The current cover of mimosa on wetlands and floodplains in northern Australia, using floodplains as the unit of measurement, ranges from 0% to 67% (figure 3). Given the reported doubling rate of 1.4 years and widespread reported presence in 1981, we assume that equilibrium populations are currently established and that this range reflects the true range of possible percentage cover. The current cover on the Kakadu floodplains varies from 0.031% to 0.059%, and the two possible control floodplains have higher infestation levels (2.790–9.914%; figure 3 and table 1). For the ATT estimates below, we consider average infestation across the Kakadu floodplains.
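As a rough plausibility check on the equilibrium assumption, the following Python sketch asks how long an unmanaged infestation would take to grow from the initial Kakadu cover reported in table 1 (0.011%) to the maximum cover observed in northern Australia (67%) at the reported 1.4-year doubling time. It assumes unconstrained exponential growth with no density dependence, which is a simplification of ours and not a model used in this study; the function name `years_to_reach` is hypothetical.

```python
import math


def years_to_reach(initial_cover, target_cover, doubling_time_years):
    """Years needed to grow from initial_cover to target_cover (both in %)
    under unconstrained doubling. Illustrative assumption only."""
    if initial_cover <= 0 or target_cover <= initial_cover:
        raise ValueError("need 0 < initial_cover < target_cover")
    doublings = math.log2(target_cover / initial_cover)
    return doublings * doubling_time_years


# Values taken from the text: 0.011% initial cover, 67% maximum cover,
# 1.4-year doubling time.
years = years_to_reach(0.011, 67.0, 1.4)
print(f"Years to reach maximum cover: {years:.1f}")
```

Under these simplifying assumptions the time required is well inside the 30-year measurement period, which is consistent with treating present-day cover on unmanaged floodplains as close to equilibrium.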
We start with a partial identification approach to provide information about plausible ranges of the ATT given limited assumptions. First, we identify the minimum and maximum possible floodplain cover to inform the possible range of impact. The expected maximum mimosa cover in Kakadu in the absence of protected area management can be no greater than 67%, the maximum percentage cover observed in northern Australia. Therefore, the potential outcome for Kakadu in the absence of management can be no greater than a 66.99% increase in mimosa (67% minus initial cover of 0.011%; table 1). Similarly, the expected potential change in mimosa cover in the absence of protected area management can be no smaller than −0.011%, which is the decline in coverage if the initial mimosa cover (0.011%) died back naturally to 0%. If we take the difference in observed change in cover (0.032%) and these potential changes, this implies, based only on observed data, that the ATT is within [−66.96%, 0.043%]; the programme at best avoided a 66.96% increase in mimosa cover and at worst created 0.043% mimosa cover. We next assume a monotone treatment effect; the treatment effect on the treated cannot be negative and therefore management cannot have created any new infestations. This narrows the bound on the ATT to [−66.96%, 0]. Lastly, we consider the Mary River floodplain as a possible worst-case bound. The Mary River floodplain is similar to the Kakadu floodplains in environmental characteristics, but initial cover was much larger. It therefore represents a plausible worst-case scenario as we expect the Mary River and Kakadu floodplains to have similar extents of suitable habitat based on environmental characteristics. However, assuming exponential growth rates, the Mary would have had a larger growth trajectory in cover over the treatment period. This revises our expected maximum coverage in the absence of protected area management to 9.91%, and the expected increase in mimosa in the absence of management would therefore be 9.9% (table 1). This narrows our bound on the ATT to [−9.87%, 0].
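The bounding arithmetic in this paragraph can be reproduced with a short Python sketch. It uses only the values quoted in the text (initial cover 0.011%, observed change 0.032%, maximum observed cover 67%, Mary River cover 9.914%); the function and variable names are our own and purely illustrative.

```python
def att_bounds(observed_change, cf_change_min, cf_change_max):
    """Bounds on the ATT, where ATT = observed change - counterfactual change.
    The largest counterfactual change gives the lower bound and vice versa."""
    return (observed_change - cf_change_max, observed_change - cf_change_min)


INITIAL_COVER = 0.011    # % cover in Kakadu at declaration (table 1)
OBSERVED_CHANGE = 0.032  # % change in Kakadu cover over the treatment period
MAX_COVER = 67.0         # maximum % cover observed in northern Australia
MARY_COVER = 9.914       # current % cover on the Mary River floodplain

# Step 1: no assumptions. Counterfactual cover lies anywhere in [0, 67]%.
no_assumption = att_bounds(
    OBSERVED_CHANGE, 0.0 - INITIAL_COVER, MAX_COVER - INITIAL_COVER
)

# Step 2: monotone treatment effect. Management cannot create infestation,
# so the upper bound is capped at zero.
monotone = (no_assumption[0], min(no_assumption[1], 0.0))

# Step 3: Mary River cover as the worst-case counterfactual cover.
mary_lower, _ = att_bounds(
    OBSERVED_CHANGE, 0.0 - INITIAL_COVER, MARY_COVER - INITIAL_COVER
)
mary_bound = (mary_lower, 0.0)

for label, (lo, hi) in [
    ("no assumptions", no_assumption),
    ("monotone effect", monotone),
    ("Mary River worst case", mary_bound),
]:
    print(f"{label}: [{lo:.2f}%, {hi:.3f}%]")
```

Each step adds one assumption and shrinks the interval, which makes explicit how much of the final bound rests on data and how much on the assumptions.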
Next, we use a difference-in-difference estimator to estimate the treatment effect using the East Alligator/Murgenella (figure 3 and table 1) as a counterfactual. The difference-in-difference method attributes any differences in trends between the treatment and control groups to the intervention. If factors are present that affect the difference in trends between the two groups, the estimator will be biased. We chose the East Alligator/Murgenella floodplains for the counterfactual, as the initial cover is closest to the observed initial cover in Kakadu. The difference-in-difference estimate of the treatment effect is −2.704%; the outcome of Kakadu protected area management is the avoidance of a 2.704% increase in mimosa floodplain cover (table 1). This is in contrast to the observed 0.032% increase of mimosa cover over the 30-year treatment period. Therefore, a programme evaluation taking into account only observations within the park would have underestimated the impact of the programme, interpreting the 0.032% increase in cover as a negative impact. This contrasts with our positive estimate: the management programme prevented a 2.704% increase in cover, which is approximately 58 km² of avoided infestation across Kakadu. It is worth noting that the current distribution data have not been collected uniformly; thus, this estimate could be improved with systematically mapped distribution data.
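The difference-in-difference calculation can be sketched in Python as follows. The Kakadu values come from the text (initial cover 0.011%, observed increase 0.032%). The control floodplain's current cover (2.790%) is the lower end of the range reported for the two candidate controls; its initial cover (0.054%) is not quoted in the text and is back-calculated here from the reported −2.704% estimate, so it should be treated as an assumption and checked against table 1. The function name is hypothetical.

```python
def difference_in_difference(treated_before, treated_after,
                             control_before, control_after):
    """Change in the treated unit minus change in the control unit."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Kakadu floodplains (average % cover), from the text.
KAKADU_1984 = 0.011
KAKADU_2014 = 0.011 + 0.032

# East Alligator/Murgenella floodplain (% cover).
CONTROL_2014 = 2.790
CONTROL_1984 = 0.054  # ASSUMPTION: back-calculated, not quoted in the text

att_did = difference_in_difference(
    KAKADU_1984, KAKADU_2014, CONTROL_1984, CONTROL_2014
)
print(f"Difference-in-difference ATT: {att_did:.3f}%")
```

Looking only at the treated unit would report the 0.032% increase; subtracting the control trend is what turns that into an avoided increase.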
This estimate relies on the strong assumption that the East Alligator/Murgenella floodplain serves as an appropriate counterfactual. This would mean that there are no unobservable differences in the control and treated units, and that trend lines of cover change in the control and treated units would be the same in the absence of protected area management. Given the small number of treated units (floodplains within Kakadu) and a larger donor pool of possible control floodplains across northern Australia, one approach for improving upon this assumption would be to use a synthetic control design in which a combination of floodplain characteristics may provide a better comparison than any single floodplain alone [7,66]. In other words, there may not be another Kakadu (i.e. a floodplain that shares all of the same characteristics as those within Kakadu) within the donor pool of possible control units, but a synthetic Kakadu might be constructed through a weighted combination of multiple floodplains. Our case study of protected area management emphasizes a situation that is common in environmental management interventions: there are a small number of unique treated units with a potentially large set of controls. Synthetic control design is one way of strengthening the support for the strong assumptions required for estimator approaches such as difference-in-difference.
Our case study demonstrates the use of key recommendations including sample unit and measurement period selection and use of hydrological data to control for confounding factors from our above discussion. We used floodplains as our sample units to account for longitudinal and lateral connectivity. We controlled for possible confounding factors relating to habitat suitability with hydrological measurements. Lastly, we selected our measurement period based on our knowledge of possible time lags in the system.
5. Recommendations for future evaluations of freshwater-protected areas
Our review has highlighted both the challenges and opportunities for designing and implementing impact evaluations to measure the benefits of protected area management of freshwater systems. Evaluating the performance of policy and management responses should be a standard part of protected area planning and funding. Key recommendations from our review include:
— appropriate spatial units for evaluation are needed in order to account for connectivity; examples of appropriate units are floodplains, catchments or full river reaches;
— build upon similarities in the types of confounding factors in both terrestrial and freshwater systems when matching treatment and control units;
— take advantage of existing long-term hydrological and water quality data to account for variability and connectivity and ensure appropriate matching of treatment and control units; and
— select baselines and outcome measurements to cover appropriate timescales that capture natural variability and time lags in ecological responses of freshwater systems.
While the challenges to freshwater impact evaluation discussed here are not insurmountable, addressing them in most study systems will likely require further data and an understanding of hydroecology in order to account for connectivity and time lags. Developing this understanding of dynamic and complex systems may require significant investments. Thus, while there is promise in applying quasi-experimental designs similar to those applied for terrestrial protected area evaluations, these may not be feasible or appropriate in many freshwater systems. Where there are limited resources to undertake quasi-experimental impact evaluations, protected areas of significance, such as Kakadu National Park, may be good candidates given that findings can inform both the evaluated management practices and the design and location of other protected area management programmes based on lessons learned. If the resources required to undertake impact evaluation are not warranted, such as in smaller protected areas or those with few threats, other evaluation approaches such as performance measurement informed by complex theories of change may be appropriate [7,67]. These approaches should also aim to address the challenges reviewed here, such as controlling for connectivity of freshwater systems or selecting baselines based on appropriate timescales for capturing natural variability and time lags. While freshwater ecosystems present some unique challenges, there are many opportunities to undertake rigorous impact evaluations, particularly for key interventions like protected catchments, such as Kakadu National Park.
All authors contributed to the conception, design and writing of this manuscript.
We declare we have no competing interests.
This work was supported by funding from the National Environmental Research Programme Northern Australia Hub.
We thank Professor Paul Ferraro and the participants of the Australian Research Council Centre of Excellence for Environmental Decisions funded Causal Inference workshop for discussions that informed the case study.
One contribution of 16 to a theme issue ‘Measuring the difference made by protected areas: methods, applications and implications for policy and practice’.
- Accepted August 18, 2015.
- © 2015 The Author(s)