Although a large part of this work is a classical morphological revision of the New World Siricidae, DNA barcoding analysis was used to identify potential new species and develop a method to identify siricid larvae.
DNA barcoding as used here was originally proposed by Hebert et al. (2003) as “a new approach to taxon identification.” They postulated that if we wished to identify extant biodiversity we needed a faster, easier system than classical morphological methods and proposed that animal species could be uniquely identified by an approximately 600 base pair DNA sequence (barcode) of the mitochondrial Cytochrome Oxidase 1 gene. The stated advantages of barcode analysis were that it is fast and inexpensive, the characters are relatively uniform and unbiased, the analysis is quantitative, it can be used on all life stages, and it requires no specialized taxonomic experience or knowledge.
Since the proposal of Hebert et al. in 2003, barcodes have been used to identify animals including birds, fish and arthropods, discover cryptic species and associate life stages (Hajibabaei et al. 2006, Hebert et al. 2004, Hebert et al. 2004A, Hogg and Hebert 2004, Ball and Armstrong 2006, Smith et al. 2006, Ward 2005). However, as more studies were published, theoretical and practical difficulties were used to challenge the use of DNA barcodes alone for new species identification and classification (summarized in Rubinoff et al. 2006). These issues included heteroplasmy, where more than one mitochondrial haplotype is present in an individual (Frey and Frey 2004); numts (Lopez et al. 1994), where a nuclear pseudogene of mitochondrial origin is sequenced instead of the mitochondrial gene itself (Song et al. 2008, Pamilo et al. 2007, Koutroumpa et al. 2009); hybridization or indirect selection resulting from organisms like Wolbachia mediating mitochondrial introgression in closely related species (Whitworth et al. 2007, Linnen and Farrell 2007, 2008); effects related to the biology of mitochondria such as reduced population size, maternal inheritance and limited recombination; and, finally, the question of how much genetic distance should be used to delimit species (see Rubinoff et al. 2006 and the references therein). These limitations made it very difficult to use DNA barcoding as an easy alternative to classical or more sophisticated molecular methods for identifying new species. However, DeSalle (2006) in a rebuttal to Rubinoff et al. (2006) made a distinction between “species discovery” and “species identification.” He argued that using barcodes alone for species discovery was indeed rife with difficulties, but that once a set of barcodes was established for a group of species, unidentified specimens could be identified with the caveat that some specimens might not be resolvable.
He suggested that a novel barcode sequence should be viewed only as a new species hypothesis to be tested and verified with more established methods. Although this resolution does not solve the challenge of how to recognize the vast number of undescribed species in the world, it should, with our combined morphological and barcoding approach, give us a means to identify adults and thus immature stages of New World Siricidae.
As with many groups of Hymenoptera, there are no morphological keys to immature stages of Siricidae, for several mostly practical reasons. First, until recently, there has been no pressing need for morphological keys to siricid larvae. Sirex noctilio, the most significant siricid pest, has only been an economic pest in conifer plantations in the Southern Hemisphere where there were no native woodwasps to confuse it with (Hoebeke et al. 2005). Second, rearing larvae from trees is costly and time consuming. Locating, harvesting and storing infested trees is labor intensive and because many species of woodwasps take up to several years to attain maturity it is quite time consuming and thus expensive. Third, until this manuscript, most woodwasps were not considered to be particularly host specific and because many species can attack the same host it was not easy to associate specific larvae with reared adults.
The primary reasons to identify larvae are to recognize an infestation of a pest species and to prevent further introductions of exotic species. As the larval stage is present for 11 months and adults are only present for a few weeks, it would be advantageous to be able to identify larvae immediately using molecular methods (hours or days) rather than wait a year or more until identifiable adults can be reared. Because DNA is the same for all life stages, a molecular technique that identifies adults will also identify immature life stages.
The 622 specimens of woodwasps sequenced were resolved into 31 taxa including 28 taxa of Siricidae (603 sequences) and one taxon each of Xiphydriidae (Xiphydria mellipes, 3 sequences), Anaxyelidae (Syntexis libocedrii, 12 sequences) and Orussidae (Orussus thoracicus, 4 sequences) (Fig. E2.1). Complete consensus sequences, 658 base pairs, were obtained for 29 of the 31 taxa ultimately resolved. The consensus sequences for Sirex obesus and Sirex near californicus were only 613 and 615 base pairs, respectively. Of the 622 specimens sequenced, 476 (76.5%) were complete sequences; of the rest, 88 specimens were longer than 600 base pairs, 48 were longer than 500 base pairs, 6 were longer than 400 base pairs and 4 were longer than 300 base pairs. Length of sequence for individual specimens is recorded under each species description. All species except Sirex obesus and Sirex near californicus had at least one specimen with a full-length sequence.
All 622 specimens were unambiguously assigned to the correct family, genus and species/taxon according to the revision of Siricidae proposed here. When this work was started, however, under the former classification (summarized in Smith 1979, Smith and Schiff 2002, Schiff et al. 2006), barcoding results generated several new species-level hypotheses. In two cases, one in Xeris and one in Sirex, pairs of what were considered to be good species or subspecies were found to share identical barcodes. What were formerly classified as Sirex nigricornis and S. edwardsii are now listed as S. nigricornis, and what were formerly listed as Xeris spectrum townesi and X. morrisoni indecisus are now listed as X. indecisus. Further, two pairs of subspecies, X. morrisoni morrisoni and X. morrisoni indecisus, and Urocerus gigas gigas and U. gigas flavicornis, were easily separated using barcodes and are now elevated to species as Xeris morrisoni, X. indecisus, Urocerus gigas and U. flavicornis, respectively. DNA barcodes also hypothesized or supported several new taxa. Sirex abietinus was a single novel sequence until the species was characterized morphologically and more specimens were obtained and sequenced. Xeris melancholicus was initially recognized by its unique barcode and then characterized morphologically. Sirex obesus was identified morphologically and then, when fresh specimens were obtained and sequenced, supported by barcodes. Two other taxa, Sirex near nitidus and especially Sirex near californicus, are recognized by barcodes but have not been assigned species names because we have been unable to find supporting morphological characters with so few specimens.
The neighbor-joining tree of consensus sequences of each taxon (Fig. E2.1) showed well-delimited taxa. Separate neighbor-joining trees (Figs. E2.2, E2.3, E2.4a, E2.4b, E2.4c, E2.5a, E2.5b, E2.5c, E2.5d, E2.5e and E2.5f) for individual specimens of small groups of species showed low intra-specific and high inter-specific divergence with no overlap between species. Percent identity and divergence for consensus sequences of all taxa are presented in Table E2.6. The greatest divergences were between families of woodwasps (30–40%). Anaxyelidae was most divergent from the others (34.1%–45.5%) followed by Orussidae (30.5%–42.6%) and Xiphydriidae (30.5%–40.3%). Within the Siricidae, the genera were well defined, with divergences between genera in the 20s and 30s and divergences within genera ranging from as low as 1.7% up to the 20s. Divergences for the closest pairs of taxa were 1.7% for Sirex nitidus and S. near nitidus, 2.2% for Xeris indecisus and X. morrisoni, 2.8% for Urocerus gigas and U. flavicornis, 3.3% for Xeris caudatus and X. melancholicus, 4.6% for Sirex abietinus and S. varipes, 5.1% for Sirex californicus and S. near californicus and approximately 3.7% for Sirex cyaneus and S. nitidus or S. near nitidus. Of these least divergent pairs, the smallest (1.7%) and largest (5.1%) divergences were for pairs that lacked morphological support.
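The closest-pair comparison above can be tabulated in a few lines. The following Python sketch is illustrative only: the divergence values are those quoted in the text, while the morphological-support flag is our reading of the results (the two "near" taxa are the ones recognized by barcode alone). The approximate 3.7% value for Sirex cyaneus is omitted because it spans two comparisons.

```python
# Closest-pair divergences (percent) as quoted in the text. The final flag
# (assumption: inferred from the results section) records whether both
# members of the pair are supported by morphological characters.
closest_pairs = [
    ("Sirex nitidus", "S. near nitidus", 1.7, False),
    ("Xeris indecisus", "X. morrisoni", 2.2, True),
    ("Urocerus gigas", "U. flavicornis", 2.8, True),
    ("Xeris caudatus", "X. melancholicus", 3.3, True),
    ("Sirex abietinus", "S. varipes", 4.6, True),
    ("Sirex californicus", "S. near californicus", 5.1, False),
]

# Pairs with the smallest and largest divergence among the closest pairs.
smallest = min(closest_pairs, key=lambda pair: pair[2])
largest = max(closest_pairs, key=lambda pair: pair[2])

if __name__ == "__main__":
    for a, b, pct, supported in sorted(closest_pairs, key=lambda p: p[2]):
        note = "morphological support" if supported else "barcode only"
        print(f"{pct:4.1f}%  {a} / {b}  ({note})")
```

Sorting by divergence makes it easy to see that the two barcode-only pairs sit at the two ends of the list.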
The most important question when deciding to use a new technique to identify species is: does the technique unambiguously identify specimens of each species correctly 100% of the time? In the case of using DNA barcodes to identify New World Siricidae the answer is yes, but it was difficult to get to this answer because the Siricidae was in need of revision when the project was started. Our simultaneous morphological and barcoding analyses are in almost complete agreement. Unique barcodes exist for all morphologically distinct species for which we could obtain sequences. However, two of the morphologically distinct species, Sirex californicus and S. nitidus, each appear to harbor a cryptic taxon that is only recognizable by DNA barcode. The question remains: are these cryptic taxa good species? It is possible they could be artifacts of barcoding such as heteroplasmy or numts, or it may be they are very good cryptic species and we have been unable as yet to discover morphological or behavioral support for them. To reduce the risk of artifacts from heteroplasmy we directly sequenced double-stranded PCR products. If there were rare haplotypes they would be masked by the most common haplotype. If there were two or more common haplotypes there would have been double peaks and the sequences would have been difficult to read. To reduce the possibility of having amplified numts we isolated samples from mitochondria-rich tissue and we inspected translated sequences to look for artifacts common in numts such as stop codons, insertions and deletions. There were no stop codons, insertions or deletions in any of the samples except for Orussus thoracicus, which was missing one codon, in frame. We do not believe this is indicative of a nuclear mitochondrial pseudogene, however, as the same codon is absent in three other Orussus species (data not presented).
Either all four Orussus species have the same pseudogene, which is amplified preferentially over the mitochondrial gene (which seems unlikely), or the missing codon reflects a genuine difference between Orussus and all the other woodwasps. Although we believe the cryptic taxa are probably valid species, until we can examine more specimens and do further analyses we have chosen to leave the cryptic taxa unnamed. Despite the utility of barcodes for identifying Siricidae we still believe new species require a morphological description.
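The numt screen described above (inspecting translated sequences for stop codons, insertions and deletions) can be sketched in Python as follows. This is a minimal illustration and not the procedure actually used in this study: it assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), takes an already aligned sequence with "-" marking gaps, and the function names and toy sequences are hypothetical.

```python
import re
from itertools import product

# NCBI translation table 5 (invertebrate mitochondrial code). Codons are
# enumerated with each base cycling in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate an ungapped sequence from frame offset 0, 1 or 2.

    Codons containing ambiguous bases are reported as 'X'; '*' is a stop.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def internal_gap_lengths(aligned_seq):
    """Lengths of gap runs inside the sequence (end gaps are ignored)."""
    core = aligned_seq.strip("-")
    return [len(run) for run in re.findall(r"-+", core)]


def screen_for_numt_signs(aligned_seq):
    """Report simple warning signs of a pseudogene in one aligned sequence."""
    gaps = internal_gap_lengths(aligned_seq)
    ungapped = aligned_seq.replace("-", "")
    stops = {frame: translate(ungapped, frame).count("*") for frame in range(3)}
    return {
        "min_stop_codons": min(stops.values()),       # 0 if any frame is open
        "internal_gaps": gaps,                        # e.g. [3] = one codon missing
        "frameshift_gap": any(g % 3 for g in gaps),   # gap not a multiple of 3
    }


if __name__ == "__main__":
    # Toy sequence (invented): one whole codon missing, in frame.
    print(screen_for_numt_signs("ATGTGA---ATA"))
```

Under these assumptions, an in-frame deletion of a whole codon, as reported for Orussus thoracicus, would show up as a single internal gap of length 3 with no stop codons and no frameshift flag.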
One of the reasons barcoding was so useful in revising the North American Siricidae is that it is color blind. Prior to this study, abdomen and leg color were often used as simple diagnostic characters for siricid species (Middlekauff 1960, Smith and Schiff 2002, Schiff et al. 2006). However, identical DNA barcodes supported by morphological characters suggested that pairs or groups of what were considered to be good species based on abdomen color were really single species. In this study there were three examples, Sirex nigricornis, Xeris indecisus and Tremex columba. In the first two examples, each species has two female color morphs with either red (the former Sirex nigricornis and the former Xeris morrisoni indecisus) or black (the former Sirex edwardsii and the former Xeris spectrum townesi) abdomens. In the third example, females of T. columba have one of three color morphs associated with wing color differences. These color morphs were recognized as separate species until Bradley (1913) lumped them together, a position supported by the current barcode results. Whereas it is easy to understand why such dramatic characters would be considered diagnostic for species, this study demonstrates that abdomen color can be misleading. Interestingly, in the original description Brullé suggested that the only difference he saw between Sirex edwardsii and Sirex nigricornis was that the abdomen was blue and he even suggested that it might just be a variety of Sirex nigricornis. Genetic control of abdomen color must be fairly loose in Symphyta because there are several examples of different color morphs in at least four different families. Species with both red and black abdominal color morphs have been recorded in the Xiphydriidae (Xiphydria tibialis Say, in Smith 1976), Xyelidae (Macroxyela ferruginea (Say), in Smith and Schiff 1998), Tenthredinidae (Lagium atroviolaceum (Norton), in Smith 1986) and Siricidae (present study).
Barcodes were also useful in resolving leg color morphs. Sirex californicus, S. nitidus and S. noctilio each have pale and dark leg color morphs. At least for Sirex californicus and S. nitidus both color forms have the same barcode. We have no sequences for the dark color morph of Sirex noctilio. Ironically, abdomen and leg color are still useful characters for identifying woodwasps (e.g., Sirex varipes) but this work shows that they should not be used as sole diagnostic characters. Instead, they should be combined with other characters, as we do here, to lead to a diagnosis.
To identify any stage of woodwasps using barcodes, a novel sequence should be aligned with the 31 consensus sequences reported here (see Appendix 3) using Clustal V and then visualized in a neighbor-joining tree using appropriate software. The novel sequence should cluster very closely with the branch of its conspecifics. The range of intra-specific variation is represented in the species trees (Figs. E2.2, E2.3, E2.4a, E2.4b, E2.4c, E2.5a, E2.5b, E2.5c, E2.5d, E2.5e and E2.5f) and it should be easy to recognize if a specimen falls outside the expected range of its species. Determining a species threshold limit for barcode data of unknown taxa is quite controversial (Rubinoff et al. 2006). Hebert et al. (2003) originally proposed that a 2–3% difference would be sufficient to separate animal species. At that level, we might not be able to separate Sirex nitidus from the cryptic taxon S. near nitidus, or two pairs of closely related but morphologically distinct species, Urocerus flavicornis from U. gigas and Xeris morrisoni from X. indecisus. Later, Hebert et al. (2004A) proposed a threshold that was 10 times the mean intraspecific variation for the group under study. This new threshold addresses the diagnostic value of the relationship of interspecific to intraspecific variation but still presupposes a level of species uniformity. Both of these thresholds could be problematic if we were trying to separate species from a sea of unknowns; fortunately, we are trying to identify unknowns by comparison to a relatively well sampled database of recognized species. Unknown sequences will either match one of the known species or become a new hypothesis to be evaluated with morphological or other methods. Although all the species represented here are well delimited, it is possible that barcodes for newly recognized, closely related species could overlap and this database would not be able to resolve them.
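The logic of matching an unknown against the reference database can be illustrated with a short Python sketch. It is a simplification and not the workflow described above: instead of a Clustal V alignment followed by a neighbor-joining tree, it assumes the sequences are already aligned to equal length and simply finds the reference with the smallest uncorrected percent difference. The function names, the user-supplied cutoff and the toy sequences are all hypothetical.

```python
def p_distance(a, b):
    """Uncorrected percent difference between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y) for x, y in zip(a.upper(), b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not columns:
        raise ValueError("no comparable positions")
    differences = sum(x != y for x, y in columns)
    return 100.0 * differences / len(columns)


def ten_times_rule(mean_intraspecific_pct):
    """Threshold of 10 times the mean intraspecific variation (Hebert et al. 2004A)."""
    return 10.0 * mean_intraspecific_pct


def identify(query, references, cutoff_pct):
    """Return (nearest taxon, distance, verdict) for an aligned query.

    references maps taxon name to aligned consensus sequence; cutoff_pct is
    an assumed upper bound on intraspecific divergence chosen by the user.
    """
    name, dist = min(
        ((taxon, p_distance(query, seq)) for taxon, seq in references.items()),
        key=lambda item: item[1],
    )
    verdict = "match" if dist <= cutoff_pct else "new hypothesis"
    return name, dist, verdict


if __name__ == "__main__":
    # Toy 10-base "consensus sequences" (invented, far shorter than real barcodes).
    refs = {"taxon A": "ACGTACGTAC", "taxon B": "ACGTTTTTAC"}
    print(identify("ACGTACGTAA", refs, cutoff_pct=ten_times_rule(0.25)))
```

A query that falls outside the cutoff is not assigned to the nearest taxon; it is returned as a new hypothesis to be evaluated with morphological or other methods, in keeping with the approach described above.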
We believe the consensus tree (Fig. E2.1) is robust because of the species sampling that went into it. We obtained representatives of each species from as much of the geographic and temporal ranges as possible, as can be seen in the specimens for molecular studies section under each species description. Although sampling can never be complete, multiple samples across the range are a more cogent representation of the species variation than a single specimen from one location in its range.
The combination of classical morphological and DNA barcoding methods has allowed us to revise New World Siricidae and develop a DNA database that will enable identification of most New World siricid larvae. Each morphological species has a corresponding well-delimited barcode. Two species each appear to harbor a cryptic taxon; we have chosen to leave these taxa unnamed because they lack morphological support. Our work demonstrates that barcodes are a useful addition to other taxonomic methods, especially for tasks such as associating life stages.
Orussus thoracicus:
USA. California: 2005, CBHR 35, 655; 2005, CBHR 306, 655; 2005, CBHR 307, 655; 2005, CBHR 308, 655.
Syntexis libocedrii:
USA. California: 2005, CBHR 86, 658; 2005, CBHR 87, 658; 2005, CBHR 88, 658; 2005, CBHR 89, 658; 2005, CBHR 90, 658; 2005, CBHR 91, 658; 2005, CBHR 92, 658; 2005, CBHR 93, 658; 2005, CBHR 94, 658; 2005, CBHR 95, 658. Oregon: 2003, CBHR 7, 658; 2003, CBHR 9, 658.
Xiphydria mellipes:
CANADA. Ontario: 2005, CBHR 1055, 658; 2005, CBHR 1095, 658. USA. Wisconsin: 2005, CBHR 149, 658.