Problematica are taxa that defy robust phylogenetic placement. Traditionally the term was restricted to fossil forms, but it is clear that extant taxa may be just as difficult to place, whether using morphological or molecular (nucleotide, gene or genomic) markers for phylogeny reconstruction. We discuss the kinds and causes of Problematica within the Metazoa, as well as criteria for their recognition and possible solutions. The inclusive set of Problematica changes depending upon the nature and quality of (homologous) data available, the methods of phylogeny reconstruction and the sister taxa inferred by their placement or displacement. We address Problematica in the context of pre-cladistic phylogenetics, numerical morphological cladistics and molecular phylogenetics, and focus on general biological and methodological implications of Problematica, rather than presenting a review of individual taxa. Rather than excluding Problematica from phylogeny reconstruction, as has often been preferred, we conclude that the study of Problematica is crucial for both the resolution of metazoan phylogeny and the proper inference of body plan evolution.
1. Progress and remaining controversy
In the proceedings of a previous international symposium held in London two decades ago, Barnes (1985) summed up progress in metazoan evolution in six ‘phylogenetic assumptions’, only one of which was strictly phylogenetic (besides monophyly of Metazoa): protostomes and deuterostomes ‘continue to be recognized as major lines of animal evolution’. From that starting point, progress in the field can only be described as remarkable (Halanych 2004). However, it is difficult to present an unambiguous consensus view of high-level metazoan phylogeny. First, metazoan phylogenetics is thriving, with several large-scale projects underway in both Europe and the USA promoting an ongoing flux of ideas. Preliminary results of new studies presented during the current symposium provide a taste of possible new insights, such as a potential refutation of the long-standing dichotomy between protostomes and deuterostomes. Second, workers may favour particular kinds of phylogenetic evidence, particular modes of phylogenetic analysis, or both (Jenner & Scholtz 2005). The conservative consensus that forms the basis of our discussion (figure 1) will probably become obsolete in the near future, and it holds the middle ground between recent more extreme views that either report only ‘a few minor controversies’ left to clean up (Eernisse & Peterson 2004, p. 204) or depict a virtually unresolved polytomy (Philippe & Telford 2006).
In this paper, we examine progress in high-level animal phylogenetics by focusing on the most challenging cases: the Problematica. We use the term to label extant or fossil taxa that defy robust, unambiguous phylogenetic placement. Agreeing with Conway Morris (1991, p. 23), we regard Problematica principally as ‘a problem for biologists, not biology’. We review the methodological and biological causes of Problematica in the context of high-level metazoan phylogeny and provide possible strategies for dealing with Problematica. We discuss fossil and extant Problematica from the perspectives of morphological and molecular phylogenetics. A summary of attempts to grapple with Problematica provides valuable insights into the relative abilities of different kinds of data and phylogenetic methods to deal with some of the most challenging problems in all of systematics.
Space limitations dictate three important caveats. First, we address individual Problematica only by way of selected examples. Second, all factors that can influence the outcome of a phylogenetic analysis are relevant for a discussion of Problematica. Evidently, we cannot do justice to all relevant data and ideas; however, the general issues that we address apply to all known Problematica in the Metazoa. Third, in line with our own area of least ignorance, we focus on extant Problematica.
2. Problematica: causes and recognition criteria
Problematica confront phylogeneticists with all the problems that can beset phylogenetic analysis. Problematica arise when we lack unambiguous phylogenetic signal that can relate them to other taxa. In many cases, this is simply the result of not (yet) having enough knowledge of a taxon. This is certainly the case for many fossil Problematica with unfavourable preservation, and it is likewise a factor for many extant high-level metazoan Problematica, for which our knowledge of non-phenotypic characters is still rudimentary. Biological reasons for the absence of sufficient phylogenetic signal can be grouped into three categories.
Not enough phylogenetic signal has evolved,
phylogenetic signal is lost through extinction, and
phylogenetic signal is lost or obscured by the evolution of non-phylogenetic signal.
In the first case, if lineage splitting events succeed each other rapidly, there may not be enough time for the evolution of distinctive features that can be used to group descendant species. Although the length of the fuse of the Cambrian explosion is still debated, this has long been considered a distinct possibility for the divergence of the animal phyla.
Extinction may exacerbate the problem of inferring clades on the basis of homoplasy, or erase phylogenetic signal altogether if the organisms are not discovered. For example, reconstruction of the panarthropod stem group revealed that the subventral mouth shared by extant arthropods and onychophorans has evolved convergently (Eriksson & Budd 2000). As is well known, fossils can contribute important phylogenetic signal (Cobbett et al. 2007), and in view of the considerable differences between the body plans of extant phyla, extinction must have removed substantial amounts of morphological phylogenetic signal that can only be retrieved by the study of fossils.
The third category groups several causes related to evolutionary change that can erode or obscure phylogenetic signal with the same effects for phylogenetic analysis as extinction of taxa, even when all relevant taxa are included in the analysis. This is especially important when inferring phylogenies with short stems and long terminal branches (Rokas & Carroll 2006), of which the metazoan phylogeny is a prime example. First, if newly evolved lineages have not yet evolved complete intrinsic isolating mechanisms, extensive introgressive hybridization may occur, even between morphologically distinct species (Wiens et al. 2006). Although extensive gene exchange between morphologically distinct species is considered rare (Coyne & Orr 2004), it could scramble any original phylogenetic signal (Clarke et al. 1996; Chan & Levin 2005). Causes in this category also relate to the power of natural selection or shared constraints to produce convergent evolution and parallelisms (non-random non-phylogenetic signal) that may lead to the false inference of monophyletic taxa. This can be an important problem for both morphological and molecular phylogenetic analyses (Waegele & Mayer 2007). The category also captures problems caused by the evolution of random noise (undirected homoplasy), such as may arise purely by chance, as with substitutional saturation of sequences.
These causes can affect phylogenetic analyses of both fossil and extant taxa, at any taxonomic level and independent of the type of evidence used. Difficulties generally become greater with increasing age of the divergence events we attempt to reconstruct, and all causes mentioned above have probably confounded attempts to place particular Problematica in the tree of the Metazoa. In the following sections, we pay more detailed attention to specific causes that are of relevance for certain Problematica.
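Substitutional saturation, mentioned above as a source of random noise, can be illustrated numerically. The sketch below is only an illustration under an assumption that is not part of our discussion: it uses the Jukes-Cantor (JC69) model, in which all nucleotide substitutions are equally likely. The function names are hypothetical. Under this model, the observed proportion of differing sites levels off at 0.75 as the true number of substitutions per site grows, so that ancient divergences become increasingly hard to tell apart.

```python
import math

def expected_p_distance(d):
    """Expected proportion of differing sites after d substitutions per site (JC69)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc_corrected_distance(p):
    """Invert the relation above; undefined once p reaches the 0.75 plateau."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    # The observed difference grows ever more slowly as the true distance increases.
    for d in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"true distance {d:>4}: observed difference {expected_p_distance(d):.3f}")
```

Once the observed difference approaches the plateau, small sampling fluctuations translate into very large uncertainty in the corrected distance, which is one way of seeing how signal from deep divergences is eroded.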
Several criteria can be used to recognize Problematica.
Number of alternative sister group hypotheses,
phylogenetic spread and hierarchical range of alternative sister group hypotheses,
controversial homology assessments,
absence of phylogenetically informative characters, and
exclusion from phylogenetic analyses based on explicit datasets.
The first two criteria are the most straightforward for recognizing Problematica, although they can only be applied after phylogenetic analyses. Classic Problematica, such as Chaetognatha, Ectoprocta, and Pogonophora, have long exhibited both a large number of alternative sister group hypotheses and a large phylogenetic spread among these alternatives (covering both Protostomia and Deuterostomia). The phylogenetic spread of alternative hypotheses is positively related to the hierarchical depth across which the alternatives may be distributed. For example, the placement of Pentastomida is problematic only within the Panarthropoda, with a position either within Crustacea or in the arthropod stem group as the two main contending hypotheses (Waloszek et al. 2005). In contrast, vetulicolians are problematic on a much larger scale, across a wide phylogenetic spread (Bilateria) and a large hierarchical depth (ranging from being attributed to a separate ‘phylum-level’ clade to belonging to a subtaxon of Tunicata; Aldridge et al. 2007). However, the problematic status of other taxa emerges only after careful study of potential phylogenetic evidence. Tardigrada, for example, are generally considered to be close relatives of the onychophorans and arthropods, together comprising Panarthropoda. In contrast to the latter two taxa, which are often grouped together to the exclusion of tardigrades, tardigrades lack an ostiate heart and nephridia. However, whether these absences are secondary losses due to miniaturization or primary absences is unclear. Consequently, correct determination of the phylogenetic significance of these characters depends on whether tardigrades are primitively small bodied or secondarily miniaturized. Our current understanding does not clearly favour either hypothesis, leaving panarthropod relationships unresolved. This example also shows that the distinction between Problematica and non-Problematica is not sharp.
Other taxa are problematic owing to a lack of informative characters or to insufficient study of them. Myxozoa, for example, are probably derived cnidarians (Jiménez-Guri et al. 2007) that share so few characters with their closest non-parasitic relatives that most textbooks did not even include them in the Metazoa until very recently. Lack of detailed knowledge may also cause Problematica to be excluded from phylogenetic discussions. Taxa such as Jennaria pulchra, the lobatocerebrids, Xenoturbella bocki (until recently), Buddenbrockia and the myxozoans, as well as myzostomids and pentastomids, are frequently excluded from morphological phylogenetic analyses. This is not because their phylogenetic position is so well understood.
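The first two recognition criteria listed above can be made concrete with a small tally. The sketch below is a hypothetical illustration rather than a survey of the literature: each hypothesis is recorded as a pair of proposed sister group and the higher-level clade in which that sister group sits. The Pentastomida entries follow the two contending positions mentioned above; "Taxon X" and its entries are invented placeholders, as are all function and variable names.

```python
def summarize_hypotheses(hypotheses):
    """Return (number of alternative sister groups, number of higher clades spanned)."""
    sister_groups = {sister for sister, _ in hypotheses}
    higher_clades = {clade for _, clade in hypotheses}
    return len(sister_groups), len(higher_clades)

# Two contending positions, both inside Panarthropoda (narrow spread).
PENTASTOMIDA = [
    ("Crustacea", "Panarthropoda"),
    ("arthropod stem group", "Panarthropoda"),
]

# Invented placeholder taxon whose proposed relatives span two major clades.
TAXON_X = [
    ("sister group 1", "Protostomia"),
    ("sister group 2", "Protostomia"),
    ("sister group 3", "Deuterostomia"),
]

if __name__ == "__main__":
    for name, hyps in (("Pentastomida", PENTASTOMIDA), ("Taxon X", TAXON_X)):
        n_alternatives, spread = summarize_hypotheses(hyps)
        print(f"{name}: {n_alternatives} alternatives across {spread} higher clade(s)")
```

Such a tally says nothing about the quality of the evidence behind each hypothesis, which is where the remaining criteria come in.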
3. Fossil Problematica
(a) The vagaries of preservation, typological thinking and model choice
All the difficulties that beset phylogenetic analyses of extant taxa also play a role in the systematization of fossils. With fossils, however, several additional factors can cause problems, of which we think three are of particular importance. First, preservational artefacts can lead to formidable problems of interpretation. Although the majority of fossils can be related to extant body plans without much difficulty, ‘unusual objects do occur in rocks’ (Yochelson 1991, p. 288). Problematica are particularly common from the fossil record of the Late Neoproterozoic and the earliest Phanerozoic (approx. 575–500 Myr ago) and it is especially these forms that may provide unique clues to the origin and diversification of early animal body plans. Yet, many important taxa found in this time interval defy unambiguous interpretation owing to the limits of preservation, and taphonomic changes of the organism and surrounding sediment. This is clearly illustrated in recent debates over the putative Precambrian animal Vernanimalcula (a coelomate bilaterian?), the Cambrian animal Odontogriphus (segmented?) and the oldest putative metazoan eggs and embryos (animals or bacteria?) and in the continuing debate about the Ediacaran biota (Dzik 2003; Fedonkin 2003; Bengtson & Budd 2004; Chen et al. 2004; Narbonne 2005; Butterfield 2006; Caron et al. 2006; Bailey et al. 2007; Donoghue 2007).
Budd & Jensen (2000) nominated typological thinking as another factor that may hinder the phylogenetic systematization of fossils, especially in the context of extant taxa. Through a misguided emphasis on differences, fossils have automatically been labelled Problematica if their body plan does not exactly conform to that of a living phylum (see also Briggs et al. 1992). Such reasoning is incompatible with proper phylogenetic logic, but it is nevertheless prevalent (Jenner 2006).
A third factor that inescapably affects thinking about fossil Problematica is that fossils can only be interpreted in the light of our knowledge of living species. Consequently, disagreements about the phylogenetic placement of fossil Problematica frequently hinge upon the use of different living species as models, as illustrated by the vetulicolians (Aldridge et al. 2007).
(b) Solving fossil Problematica: stem groups, new fossils and new techniques
Yochelson (1991, p. 289) remarked that he could only offer ‘a few platitudes’ about how ‘to do’ fossil Problematica. We hope the following suggestions are helpful. In essence, fossils should be treated like any other taxon. Attempts to systematize fossils will lead to the establishment of stem groups (Budd & Jensen 2000; Conway Morris 2000; Budd 2001, 2003). Although differences between fossils and extant taxa should not be ignored, they should not be interpreted typologically as evidence against affinities (Budd & Jensen 2000; Jenner 2006). Putative stem group taxa are expected to exhibit some but not all of the diagnostic characters of crown groups, and by creating paraphyletic series of stem taxa, we can illustrate the orderly sequential evolution of body plans. This may not be easy, of course. If crucial information is not preserved, a fossil may not be reliably placed. In such cases, unless new fossils are found or new techniques reveal new information, ambiguity will endure.
The main reason why fossil Problematica occur frequently in the Late Neoproterozoic and Early Phanerozoic is extinction. These fossils document the early evolution of animal body plans. The older the fossils are, the more they are expected to fall outside the limits of extant body plans (Budd 2003; Valentine 2004). Unless body plan evolution takes large leaps, failure to systematize fossil Problematica is chiefly the result of not (yet) knowing related taxa that can bridge their morphology with those of the crown group. Hence, most progress is made with fossil Problematica when new specimens are found. Better-preserved fossils and forms with novel character combinations address the problems of taxon and character matrix completeness, allowing unknowns to be substituted with characters. However, this approach relies on much fieldwork and a great deal of luck.
Palaeontological and analytical techniques are constantly being developed that present ways of discerning new characters, or of better resolving existing ones, and of handling existing data. For example, the three-dimensional reconstruction of fossil forms from thin serial sections has achieved remarkable levels of resolution, thanks to refinements in microscopy and computer rendering. This has provided valuable phylogenetic information for a host of taxa, ranging across the Bilateria (Sutton et al. 2001a,b, 2005a–c; Thomson et al. 2003). X-ray tomographic microscopy and Raman spectroscopy combined with confocal laser scanning microscopy have also yielded images and insights into the biomolecular nature of fossils with unrivalled resolution (Schopf & Kudryavtsev 2005; Donoghue et al. 2006; Chen et al. 2007).
Other advances will come from improvements in methods of phylogeny reconstruction. Model-based methods of analysis have proven their worth with molecular data, particularly in dealing with long-branch problems in phylogenetic reconstruction. Such methods, although still in their infancy, are now available for the analysis of morphological and fossil data as well (Lewis 2001). This promises the chance to include incomplete taxa, such as fossil Problematica, with morphological and even molecular data from extant taxa using maximum likelihood or Bayesian techniques (Wiens et al. 2005b), while at the same time parsimony-based methods are refined to be able to deal efficiently with large amounts of diverse phylogenetic evidence (Wheeler et al. 2006).
4. Extant Problematica
(a) An apparent paradox: weak molecular signal and large amounts of morphological evolution
It is not surprising that Problematica are encountered when metazoan phylogeny is analysed on the basis of extant taxa. First, any comparison between two extant species belonging to different phyla has to bridge in the order of one billion years of independent evolution. This is ample time to erase signs of ancestry, either through extensive modification or loss of characters, and for convergent evolution to obscure phylogenetic signal. It may thus be unsurprising that sessile taxa (ectoprocts, brachiopods, phoronids), very small (possibly miniaturized) taxa (tardigrades, placozoans, Lobatocerebrum) and parasitic taxa (pentastomids, myxozoans) have been particularly prominent Problematica. Another consequence is that molecular phylogenies of the Metazoa bear the typical signature of short stems and long terminal branches, providing ample opportunity for long branch attraction (Waegele & Mayer 2007). This has been a problem for the placement of several taxa, ranging from myxozoans to acoels (Philippe et al. 2007). Second, the major metazoan lineages may have radiated very rapidly, potentially allowing for very little phylogenetic signal to evolve. Although it remains disputed whether lack of resolution is a convincing signature of closely spaced cladogenetic events (Giribet 2002; Rokas et al. 2005; Rokas & Carroll 2006; Baurain et al. 2007; Whitfield & Lockhart 2007), if current molecular clock estimates of metazoan divergence times are approximately accurate (Peterson et al. 2004, 2005), the fact remains that the major metazoan lineages diverged over a time span that is significantly shorter than the subsequent independent history of modern phyla (including their stem groups). The appearance in the fossil record of a variety of crown phyla with their distinctive body plans as early as the Cambrian (Budd 2003; Valentine 2004) implies that important morphological traces of ancestry were probably already erased early in metazoan history.
Intriguingly, the relative branch lengths of morphological metazoan phylogenies seemingly contradict the absence of sufficient phylogenetic signal. These typically show a much smaller discrepancy between the length of stems and terminal branches, or even the opposite pattern of relatively longer stems and shorter tips (Zrzavy et al. 1998, 2001; Nielsen 2001; Peterson & Eernisse 2001; Brusca & Brusca 2003; Zrzavy 2003). Large amounts of body plan evolution are commonly inferred along almost all stems. This raises interesting issues about the relationship between genetic and phenotypic evolution which we cannot address here. What is pertinent here is the large amount of body plan evolution inferred across a relatively small number of speciation events. For example, depending on the precise topology of the tree, possibly just six or seven nodes separate the body plan of the last common ancestor shared by (at least some) sponges and the remaining animals, and the last common ancestor of the chordates! Unless half a dozen speciation events are really all that are required to evolve from a sponge grade organization to that of a protochordate, we must be missing something. That something is fossils.
Recent studies of the fossil record have yielded important insights that may help explain why extant Problematica are to be expected. First, Wagner (Wagner 2000, 2001b; Wagner et al. 2006) drew the important conclusion that during evolutionary history taxa tend to exhaust their character state spaces. This means that as clades age, homoplasies increase in frequency. Not surprisingly, homoplasies are common between the major lineages of animals (Valentine 2004). Our estimates of homoplasy based on morphological phylogenetic studies are probably underestimates, given a widespread problem of character coding (Jenner 2004b).
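The link between clade age and homoplasy can be illustrated with a deliberately simple toy model. This is not the method used in the studies cited above; it is a sketch under stated assumptions: a clade has a fixed number of possible derived character states, every change picks one of them uniformly at random, and a change counts as homoplastic if that state has already arisen earlier. Function names are hypothetical.

```python
def expected_distinct_states(n_changes, n_states):
    """Expected number of different derived states reached after n_changes."""
    return n_states * (1.0 - (1.0 - 1.0 / n_states) ** n_changes)

def expected_homoplasy_fraction(n_changes, n_states):
    """Expected share of changes that re-derive a state that arose before."""
    if n_changes == 0:
        return 0.0
    return 1.0 - expected_distinct_states(n_changes, n_states) / n_changes

if __name__ == "__main__":
    # With 50 available states, repeats dominate once many changes have accumulated.
    for n in (1, 10, 50, 200, 1000):
        print(f"{n:>5} changes: homoplasy fraction {expected_homoplasy_fraction(n, 50):.2f}")
```

In this toy model, the homoplastic share of changes rises steadily as changes accumulate in a finite state space, which mirrors the qualitative expectation that older clades carry more homoplasy.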
Distressingly, Wagner (2001b) noted that the inclusion of fossils into a phylogenetic analysis of extant species could reveal a significant amount of previously hidden character change along branches subtending extant taxa. This positive correlation between the amount of character change that is discovered and the number of taxa included is well known to molecular systematists as the node density effect. However, its effect on morphological phylogenetics and the inference of body plan evolution has barely been acknowledged (Jenner & Wills 2007). Hence, the inclusion of even incomplete fossil taxa has the potential to reveal that synapomorphies of extant taxa may in fact be homoplasies or symplesiomorphies, and their inclusion can improve the accuracy of the phylogenetic relationships inferred between living taxa (Wiens et al. 2005a). The reconstruction of stem groups is crucial for a complete picture of body plan evolution, and there is ample evidence that phylogenetic inferences based on extant taxa can be misled; for arthropod examples see Eriksson & Budd (2000) and Budd (2001). The amount of character evolution that is missed by a focus on extant taxa is increasingly illustrated by studies that show that rates of morphological character change may be the highest in the early history of a clade, which may go hand in hand both with the general early establishment of morphological disparity in the history of large clades and indications that morphological transformations had larger step sizes early in a clade's history (Valentine 2004; Ruta et al. 2006; Erwin 2007). In combination, these insights suggest that by focusing on living taxa only we are missing a lot of character evolution, the recognition of which is crucial to prevent clades being based on homoplasies or symplesiomorphies.
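The node density effect described above can be shown with a minimal example. The sketch assumes a single binary character whose state is known at successive points along one lineage (an invented history, not real data), and assumes that each sampled taxon simply reveals the lineage's state at the point where it branches off. The number of changes we can detect is then the number of adjacent sampled points that differ. All names are hypothetical.

```python
def observed_changes(history, sampled_points):
    """Minimum number of changes implied by the states seen at the sampled points."""
    states = [history[i] for i in sorted(sampled_points)]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)

# Invented state history of one binary character along a single long branch.
HISTORY = [0, 1, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    print(observed_changes(HISTORY, [0, 6]))        # endpoints only
    print(observed_changes(HISTORY, [0, 1, 4, 6]))  # two intermediate taxa added
    print(observed_changes(HISTORY, range(7)))      # every point sampled
```

With only the two endpoints sampled, the reversals along the branch are invisible; each intermediate taxon, such as a stem group fossil, can expose changes that were previously hidden.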
(b) From the unequal eye to morphological cladistics
To see all things with equal eye is not within our power: humans, and especially human narrators, always look upon the world with an unequal eye (O'Hara 1992, p. 140).

Before computers came to assist phylogenetic analysis, Problematica were an inescapable by-product of phylogenetic inference. Without the help of a computer it is impossible to achieve a balanced and unbiased evaluation of large amounts of comparative data for more than a few taxa. Emphasis on different aspects of available evidence as well as the lack of a uniform phylogenetic methodology fostered disagreement between workers. Consequently, from the beginning of our discipline a researcher's central insights were not uncommonly labelled another's ‘fata morgana’ (Hubrecht 1887, p. 641), and the coordinating theme of one school of zoological thought would deserve to be ‘dead and buried’ in the opinion of proponents of another (Hyman 1959, p. 750).
The widespread adoption of cladistic reasoning in the second half of the twentieth century increased the promise of reaching a general consensus on metazoan phylogeny. Yet, without the help of computers progress was once again foiled as the huge amounts of conflicting evidence allowed many mutually exclusive conclusions. The computer-assisted morphological cladistic analyses of metazoan phylogeny published over the last decade greatly advanced the objectivity, explicitness and testability of phylogenetic hypotheses. In this period, the field progressed significantly beyond the traditional textbook trees (Adoutte et al. 2000), but perhaps the most important insight of this era of fruitful debate was discovering exactly how problematic many taxa and clades were. As reviewed elsewhere (Jenner 2004a,b), differences in the construction of data matrices, including different strategies of character selection, character coding and scoring, and taxon selection resulted in many incompatible phylogenies. Taxa such as Chaetognatha and Ectoprocta behave like phylogenetic renegades, residing in as many different clades as there are studies, and although other aspects of the phylogenetic backbone seemed more secure (monophyly of Protostomia, Spiralia), total agreement between analyses is absent. Evidently, the phylogenetic signal residing in morphology needs to be supplemented with molecular evidence.
(c) Old Problematica solved and new Problematica revealed with molecules
A new phylogenetic synthesis for the Metazoa (Halanych 2004; figure 1) has emerged largely on the basis of molecular evidence. The backbone of this phylogeny is based on rDNA sequences (18S and 28S), and despite challenges (Rogozin et al. 2007) its major aspects are confirmed by increasingly sophisticated phylogenomic analyses based on larger amounts of data, and employing improved model-based analytical methods (Philippe et al. 2005; Baurain et al. 2007; Irimia et al. 2007). However, since for most taxa phylogenomic data are not yet available, workers still rely heavily on sequence data from a few popular loci. Nevertheless, molecular evidence has provisionally solved a number of controversies. For example, Echiura, Pogonophora and Vestimentifera, and possibly Sipunculida and Myzostomida as well, are all parts of Annelida (McHugh 1997; Rouse 2001; Bleidorn et al. 2007; Struck et al. 2007), and xenoturbellids are deuterostomes with a probable sister group relationship with the Ambulacraria (Bourlat et al. 2006). Molecular data have also helped to constrain the number of existing hypotheses for the perennially problematic chaetognaths to being sister group to either Protostomia or Lophotrochozoa (Marlétaz et al. 2006; Matus et al. 2006). This resolution is intriguing. Peterson & Eernisse (2001) and Eernisse & Peterson (2004) hypothesized that chaetognaths may retain various bilaterian plesiomorphies. It could be that morphological phylogenetic analyses misplaced them based on symplesiomorphies that were erroneously interpreted as synapomorphies. For example, an emphasis on mouth position and cuticle composition would support ecdysozoan affinities, while embryological features are more in line with a deuterostome affinity.
However, in spite of increasingly broad taxon sampling, in places ‘overall resolution remains discouraging’ (Rousset et al. 2007, p. 54), especially within the Lophotrochozoa. Consequently, molecular data have not yet provided a reliable picture of lophotrochozoan phylogeny and the relationships of non-bilaterians in particular.
The combination of molecular and morphological evidence into single analyses has also yielded interesting insights, but differences in the methods and datasets used make it very difficult to judge the relative merits of different studies (Eernisse & Peterson 2004; Glenner et al. 2004). Also, considered in isolation, analyses based on morphology versus molecules show varying degrees of conflict depending on the datasets. Some of the more conspicuous differences concern Gastrotricha, placed in Ecdysozoa (morphology) or Lophotrochozoa (molecules), the monophyly or paraphyly of Cycloneuralia (Nematoida+Scalidophora; morphology and molecules, respectively) and the lack of robust molecular support for the monophyly of morphologically widely accepted taxa such as Mollusca and Gastrotricha and some of their major subtaxa (Giribet et al. 2006; Todaro et al. 2006).
(d) Guidelines for future progress in metazoan phylogeny
A large literature exists on troubleshooting molecular systematics. Some excellent recent reviews include Gribaldo & Philippe (2002), Sanderson & Shaffer (2002), Delsuc et al. (2005), Philippe et al. (2005), Boore (2006), Philippe & Telford (2006), Rokas & Carroll (2006), Wiens (2006) and Whitfield & Lockhart (2007). We extract a number of guidelines to be kept in mind to ensure continued progress in understanding rather than mere stochastic change of opinions.
It makes increasingly little sense to label and target particular taxa as Problematica. Their correct placement in the tree of the Metazoa is unlikely to arise through the isolated accumulation of data. Several factors need to be balanced to produce a good phylogenetic analysis—number of taxa, number of characters, quality of data and quality of analytical models. The interaction between these variables determines whether the results of a phylogenetic analysis are informative and reliable, or suffer from stochastic or systematic error. Stochastic error arises as chance correspondences overwhelm true phylogenetic signal when there are not enough informative data. Systematic error results when reconstruction methods are inaccurate and are unable to deal with bias in the raw data, which can have several causes (Philippe & Telford 2006). The common problem of long branch attraction (Anderson & Swofford 2004; Waegele & Mayer 2007) can be a result of both stochastic and systematic errors.
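Long branch attraction as a systematic error can be demonstrated with a standard four-taxon thought experiment. The sketch below rests on assumptions that go beyond the text: a symmetric two-state character model, a true tree ((A,B),(C,D)) in which A and C sit on long branches, and parsimony choosing the split supported by the most frequent informative site pattern. Branch "lengths" are expressed as the probability that the state differs between the two ends of a branch. Function names and the parameter values are hypothetical.

```python
from itertools import product

def edge_prob(parent, child, change):
    """Probability of the child state given the parent state on one branch."""
    return change if parent != child else 1.0 - change

def pattern_probability(tips, long_p, short_q):
    """Probability of tip states (A, B, C, D) on the true tree ((A,B),(C,D))."""
    a, b, c, d = tips
    total = 0.0
    for u, v in product((0, 1), repeat=2):  # sum over both internal node states
        total += (0.5                            # uniform prior on node u
                  * edge_prob(u, v, short_q)     # short internal branch
                  * edge_prob(u, a, long_p)      # long branch to A
                  * edge_prob(u, b, short_q)     # short branch to B
                  * edge_prob(v, c, long_p)      # long branch to C
                  * edge_prob(v, d, short_q))    # short branch to D
    return total

def split_support(long_p, short_q):
    """Expected frequency of the site patterns supporting each possible split."""
    patterns = {"AB|CD": (0, 0, 1, 1), "AC|BD": (0, 1, 0, 1), "AD|BC": (0, 1, 1, 0)}
    # Factor 2 accounts for the mirror-image pattern (all states swapped).
    return {split: 2.0 * pattern_probability(tips, long_p, short_q)
            for split, tips in patterns.items()}

if __name__ == "__main__":
    for long_p in (0.1, 0.4):
        support = split_support(long_p, 0.05)
        print(long_p, max(support, key=support.get), support)
```

When the long branches are long enough, the pattern grouping the two long-branched taxa (AC|BD) becomes the most frequent even though the data were generated on the tree AB|CD. Adding more characters then only strengthens the wrong answer, which is why this counts as systematic rather than stochastic error.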
The problem is that in trying to avoid stochastic error by increasing the number of characters in a dataset, the chances of systematic error may increase when the number of taxa sampled is too small or the amount of data across taxa is uneven. So far the molecular data generated for different phyla are wildly uneven (figure 1) owing to the bias towards key taxa that are important as model organisms, or organisms of biomedical or economic importance, or simply because they are the easiest to collect. To avoid systematic error it is therefore important to strive for a better balance in the number of taxa and characters (Philippe & Telford 2006).
(i) Avoiding stochastic error
Increase the number of characters. In molecular systematics this is the main rationale for doing phylogenomics, based on large amounts of data generated through genome projects, EST projects, or large-scale projects targeting particular genes with degenerate primers (Delsuc et al. 2005; Philippe et al. 2005; Philippe & Telford 2006; Baurain et al. 2007).
(ii) Avoiding systematic error
sample more taxa, including at least several species representing a high-level metazoan taxon, which may do more to prevent systematic error than aiming to have whole genome sequences for fewer taxa (Hillis et al. 2003),
recognize and remove problematic data, such as fast evolving taxa or characters, or characters the evolution of which violates phylogenetic model assumptions (Lecointre & Deleporte 2004; Delsuc et al. 2005; Philippe et al. 2005).
(iii) Other considerations
Care should be taken not to be misled by gene duplication (paralogy), causing gene trees to diverge from the species tree,
to maximize the power to test the phylogenetic position of a particular taxon, try to include at least all the taxa that have previously been proposed to be its closest relatives,
if practical, reconstruct a phylogenetic scaffold based on a restricted number of taxa scored for many characters. Additional taxa can then be added sequentially on the basis of a smaller number of characters (Wiens 2006). Addition of incompletely known taxa can boost accuracy and confidence. To prevent systematic error it may be better to add a smaller number of characters scored for many taxa, rather than many characters for fewer taxa,
if there is not enough phylogenetic signal, focus on characters with higher rates of evolution,
assess data quality as a standard part of any phylogenetic analysis (Waegele & Mayer 2007),
exploit combined evidence analyses, including fossil data wherever possible (Giribet 2002; Eernisse & Peterson 2004), while recognizing the interpretational difficulties associated with combining molecular exemplar species and inferred morphological ground patterns,
boost the amount of descriptive and comparative morphological studies to revise outdated received wisdom, and provide more data crucial for the inference of body plan evolution (Nielsen 2001; Jenner 2006),
adopt an experimental approach (sensitivity analysis) to phylogenetic analysis to see how results change depending on different assumptions,
re-evaluate contentious morphological evidence in the light of independent molecular phylogenies, especially to detect cases of unrecognized character loss (Jenner 2004b),
carefully construct morphological datasets to maximize testing power (Jenner 2004a),
adopt standardized methods for the presentation, annotation and analysis of molecular data. To this end Leebens-Mack et al. (2006) have called for a standard for reporting on phylogenies, the Minimum Information about a Phylogenetic Analysis (MIAPA), in which each component of a phylogenetic analysis (alignment procedures, alignment, sequences, voucher specimens, methods and parameters used, etc.) is outlined using universally accepted criteria. This will facilitate better evaluation and comparison of results of different analyses.
In summary, the recognition of Problematica reveals more than the sum of their missing or ambiguous parts. In avoiding fragmentary fossils or extant organisms with combinations of chimaeric or autapomorphic features, and by excluding long branching taxa or heavily biased nucleotide and protein sequences from molecular analyses, we may bring near completeness to data matrices and greater stability to our phylogenetic analyses, but probably at the expense of accuracy and an understanding of the full evolutionary picture. Problematica reveal themselves as supremely important; for without their inclusion and accurate placement, other relationships are liable to change. In understanding how to deal with Problematica, we realize the limits of systematics and our ability to have faith in our reconstructions of the tree of life.
The authors dedicate this paper to the memories and careers of Reinhard M. Rieger (1943–2006) and Ellis L. Yochelson (1928–2006). R.A.J. gratefully acknowledges the UK Biotechnology and Biological Sciences Research Council for financial support. We thank Matthew Wills, Ciara Ni Dhubhghaill and Sarah Webb for their insightful comments on the manuscript. We are especially indebted to Greg Edgecombe, Graham Budd and Andrew Smith for their ruthlessly perceptive critiques of the final version of this paper.
One contribution of 17 to a Discussion Meeting Issue ‘Evolution of the animals: a Linnean tercentenary celebration’.
© 2008 The Royal Society