Copy number variation (CNV) makes a major contribution to overall genetic variation and is suspected to play an important role in adaptation. However, aside from a few model species, the extent of CNV in natural populations has seldom been investigated. Here, we report on CNV in the pea aphid Acyrthosiphon pisum, a powerful system for studying the genetic architecture of host-plant adaptation and speciation thanks to multiple host races forming a continuum of genetic divergence. Recent studies have highlighted the potential importance of chemosensory genes, including the gustatory and olfactory receptor gene families (Gr and Or, respectively), in the process of host race formation. We used targeted resequencing to achieve a very high depth of coverage, and thereby revealed the extent of CNV of 434 genes, including 150 chemosensory genes, in 104 individuals distributed across eight host races of the pea aphid. We found that CNV was widespread in our global sample, with a significantly higher occurrence in multigene families, especially in Ors. We also observed that the probability of a gene being completely duplicated or deleted (CDD) decreased with increasing coding sequence length. Genes with CDD variants were usually more polymorphic for copy number, especially in the P450 gene family where toxin resistance may be related to gene dosage. We found that Grs were overrepresented among genes discriminating host races, as were CDD genes and pseudogenes. Our observations shed new light on CNV dynamics and are consistent with CNV playing a role in both local adaptation and speciation.
The Oxford Nanopore MinION device represents a unique sequencing technology. As a mobile sequencing device powered by the USB port of a laptop, the MinION has huge potential applications. To enable these applications, the bioinformatics community will need to design and build a suite of tools specifically for MinION data.
The earliest stages of convergent evolution are difficult to observe in the wild, limiting our understanding of the incipient genomic architecture underlying convergent phenotypes. To address this, we capitalized on a novel trait, flatwing, that arose and proliferated at the start of the 21st century in a population of field crickets (Teleogryllus oceanicus) on the Hawaiian island of Kauai. Flatwing erases sound-producing structures on male forewings. Mutant males cannot sing to attract females, but they are protected from fatal attack by an acoustically orienting parasitoid fly (Ormia ochracea). Two years later, the silent morph appeared on the neighboring island of Oahu. We tested two hypotheses for the evolutionary origin of flatwings in Hawaii: (1) that the silent morph originated on Kauai and subsequently introgressed into Oahu and (2) that flatwing originated independently on each island. Morphometric analysis of male wings revealed that Kauai flatwings almost completely lack typical derived structures, whereas Oahu flatwings retain noticeably more wild-type wing venation. Using standard genetic crosses, we confirmed that the mutation segregates as a single-locus, sex-linked Mendelian trait on both islands. However, genome-wide scans using RAD-seq recovered almost completely distinct markers linked with flatwing on each island. The patterns of allelic association with flatwing on either island reveal different genomic architectures consistent with the timing of two mutational events on the X chromosome. Divergent wing morphologies linked to different loci thus cause identical behavioral outcomes--silence--illustrating the power of selection to rapidly shape convergent adaptations from distinct genomic starting points.
Genetic linkage maps are useful tools for mapping quantitative trait loci (QTL) influencing variation in traits of interest in a population. Genotyping-by-sequencing approaches such as Restriction-site Associated DNA sequencing (RAD-Seq) now enable the rapid discovery and genotyping of genome-wide SNP markers suitable for the development of dense SNP linkage maps, including in non-model organisms such as Atlantic salmon (Salmo salar). This paper describes the development and characterisation of a high density SNP linkage map based on SbfI RAD-Seq SNP markers from two Atlantic salmon reference families.
Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array.
Next-generation sequencing (NGS) technologies have dramatically expanded the breadth of genomics. Genome-scale data, once restricted to a small number of biomedical model organisms, can now be generated for virtually any species at remarkable speed and low cost. Yet non-model organisms often lack a suitable reference to map sequence reads against, making alignment-based quality control (QC) of NGS data more challenging than cases where a well-assembled genome is already available. Here we show that by generating a rapid, non-optimized draft assembly of raw reads, it is possible to obtain reliable and informative QC metrics, thus removing the need for a high quality reference. We use benchmark datasets generated from control samples across a range of genome sizes to illustrate that QC inferences made using draft assemblies are broadly equivalent to those made using a well-established reference, and describe QC tools routinely used in our production facility to assess the quality of NGS data from non-model organisms.
Molecular markers produced by next-generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows allele frequencies at single nucleotide polymorphisms (SNPs) to be estimated with at least the same accuracy as individual-based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user-friendly application assessing the accuracy of allele frequency estimation from both pool- and individual-based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site-associated DNA (RAD) sequencing in pool- and individual-based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost-effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome-wide patterns of genetic variation for large numbers of individuals in multiple populations.
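The intuition behind the pooled-versus-individual comparison can be illustrated with a minimal sketch. It is not the derivation from the abstract above: it assumes a textbook two-stage binomial sampling model in which every individual contributes equally to the pool, reads are drawn independently, and there is no sequencing or genotyping error. The function names are hypothetical.

```python
def pool_freq_variance(p, n_individuals, depth):
    """Sampling variance of an allele frequency estimated from a DNA pool.

    Assumed model (illustrative only): 2n chromosomes are sampled from the
    population, then `depth` reads are sampled with replacement from the
    pool, with equal contribution of every chromosome and no errors.
    """
    n_chrom = 2 * n_individuals
    return p * (1 - p) * (1 / n_chrom + (1 - 1 / n_chrom) / depth)


def individual_freq_variance(p, n_individuals):
    """Sampling variance when n diploid individuals are genotyped without error."""
    return p * (1 - p) / (2 * n_individuals)


if __name__ == "__main__":
    p = 0.5
    # A pool of 100 individuals at 100x versus 20 individually genotyped samples.
    print("pool:      ", pool_freq_variance(p, n_individuals=100, depth=100))
    print("individual:", individual_freq_variance(p, n_individuals=20))
```

Under these simplifying assumptions, a large pool sequenced at moderate depth can have a lower sampling variance than a small set of individually genotyped samples. Unequal individual contributions, which the abstract addresses through an effective pool size, are not modelled here.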
BACKGROUND: Caligid copepods, also called sea lice, are fish ectoparasites, some species of which cause significant problems in the mariculture of salmon, where the annual cost of infection is in excess of €300 million globally. At present, caligid control on farms is mainly achieved using medicinal treatments. However, the continued use of a restricted number of medicine actives potentially favours the development of drug resistance. Here, we report transcriptional changes in a laboratory strain of the caligid Lepeophtheirus salmonis (Krøyer, 1837) that is moderately (~7-fold) resistant to the avermectin compound emamectin benzoate (EMB), a component of the anti-salmon louse agent SLICE® (Merck Animal Health). RESULTS: Suppression subtractive hybridisation (SSH) was used to enrich transcripts differentially expressed between EMB-resistant (PT) and drug-susceptible (S) laboratory strains of L. salmonis. SSH libraries were subjected to 454 sequencing. Further L. salmonis transcript sequences were available as expressed sequence tags (EST) from GenBank. Contiguous sequences were generated from both SSH and EST sequences and annotated. Transcriptional responses in PT and S salmon lice were investigated using custom 15 K oligonucleotide microarrays designed using the above sequence resources. In the absence of EMB exposure, 359 targets differed in transcript abundance between the two strains, these genes being enriched for functions such as calcium ion binding, chitin metabolism and muscle structure. γ-Aminobutyric acid (GABA)-gated chloride channel (GABA-Cl) and neuronal acetylcholine receptor (nAChR) subunits showed significantly lower transcript levels in PT lice compared to S lice. Using RT-qPCR, the decrease in mRNA levels was estimated at ~1.4-fold for GABA-Cl and ~2.8-fold for nAChR. Salmon lice from the PT strain showed few transcriptional responses following acute exposure (1 or 3 h) to 200 µg L⁻¹ of EMB, a drug concentration tolerated by PT lice, but toxic for S lice. CONCLUSIONS: Avermectins are believed to exert their toxicity to invertebrates through interaction with glutamate-gated and GABA-gated chloride channels. Further potential drug targets include other Cys-loop ion channels such as nAChR. The present study demonstrates decreased transcript abundances of GABA-Cl and nAChR subunits in EMB-resistant salmon lice, suggesting their involvement in avermectin toxicity in caligids.
Atlantic halibut (Hippoglossus hippoglossus) is a high-value, niche market species for cold-water marine aquaculture. Production of monosex female stocks is desirable in commercial production since females grow faster and mature later than males. Understanding the sex determination mechanism and developing sex-associated markers will shorten the time for the development of monosex female production, thus decreasing the costs of farming.
In this study, we used restriction site-associated DNA (RAD) sequencing to discover SNP markers suitable for population genetic and parentage analysis with the aim of using them for monitoring the reintroduction of the Eurasian beaver (Castor fiber) to Scotland. In the absence of a reference genome for beaver, we built contigs and discovered SNPs within them using paired-end RAD data, so as to have sufficient flanking region around the SNPs to conduct marker design. To do this, we used a simple pipeline which catalogued the Read 1 data in stacks and then used the assembler cortex_var to conduct de novo assembly and genotyping of multiple samples using the Read 2 data. The analysis of around 1.1 billion short reads of sequence data was reduced to a set of 2579 high-quality candidate SNP markers that were polymorphic in Norwegian and Bavarian beaver. Both laboratory validation of a subset of eight of the SNPs (1.3% error) and internal validation by confirming patterns of Mendelian inheritance in a family group (0.9% error) confirmed the success of this approach.
The salmon louse (Lepeophtheirus salmonis (Krøyer, 1837)) is a parasitic copepod that can, if untreated, cause considerable damage to Atlantic salmon (Salmo salar Linnaeus, 1758) and incurs significant costs to the Atlantic salmon mariculture industry. Salmon lice are gonochoristic and normally show sex ratios close to 1:1. While this observation suggests that sex determination in salmon lice is genetic, with only minor environmental influences, the mechanism of sex determination in the salmon louse is unknown. This paper describes the identification of a sex-linked Single Nucleotide Polymorphism (SNP) marker, providing the first evidence for a genetic mechanism of sex determination in the salmon louse. Restriction site-associated DNA sequencing (RAD-seq) was used to isolate SNP markers in a laboratory-maintained salmon louse strain. A total of 85 million raw Illumina 100 base paired-end reads produced 281,838 unique RAD-tags across 24 unrelated individuals. RAD marker Lsa101901 showed complete association with phenotypic sex for all individuals analysed, being heterozygous in females and homozygous in males. Using an allele-specific PCR assay for genotyping, this SNP association pattern was further confirmed for three unrelated salmon louse strains, displaying complete association with phenotypic sex in a total of 96 genotyped individuals. The marker Lsa101901 was located in the coding region of the prohibitin-2 gene, which showed a sex-dependent differential expression, with mRNA levels determined by RT-qPCR about 1.8-fold higher in adult female than adult male salmon lice. This study's observations of a novel sex-linked SNP marker are consistent with sex determination in the salmon louse being genetic and following a female heterozygous system. Marker Lsa101901 provides a tool to determine the genetic sex of salmon lice, and could be useful in the development of control strategies.
Sex in Oreochromis niloticus (Nile tilapia) is principally determined by an XX/XY locus but other genetic and environmental factors also influence sex ratio. Restriction Associated DNA (RAD) sequencing was used in two families derived from crossing XY males with females from an isogenic clonal line, in order to identify Single Nucleotide Polymorphisms (SNPs) and map the sex-determining region(s). We constructed a linkage map with 3,802 SNPs, which corresponded to 3,280 informative markers, and identified a major sex-determining region on linkage group 1, explaining nearly 96% of the phenotypic variance. This sex-determining region was mapped in a 2 cM interval, corresponding to approximately 1.2 Mb in the O. niloticus draft genome. In order to validate this, a diverse family (4 families; 96 individuals in total) and population (40 broodstock individuals) test panel were genotyped for five of the SNPs showing the highest association with phenotypic sex. From the expanded data set, SNPs Oni23063 and Oni28137 showed the highest association, which persisted both in the case of family and population data. Across the entire dataset all females were found to be homozygous for these two SNPs. Males were heterozygous, with the exception of five individuals in the population and two in the family dataset. These fish possessed the homozygous genotype expected of females. Progeny sex ratios (over 95% females) from two of the males with the "female" genotype indicated that they were neomales (XX males). Sex reversal induced by elevated temperature during sexual differentiation also resulted in phenotypic males with the "female" genotype. This study narrows down the region containing the main sex-determining locus, and provides genetic markers tightly linked to this locus, with an association that persisted across the population. These markers will be of use in refining the production of genetically male O. niloticus for aquaculture.
Population bottlenecks can restrict variation at functional genes, reducing the ability of populations to adapt to new and changing environments. Understanding how populations generate adaptive genetic variation following bottlenecks is therefore central to evolutionary biology. Genes of the major histocompatibility complex (MHC) are ideal models for studying adaptive genetic variation due to their central role in pathogen recognition. While de novo MHC sequence variation is generated by point mutation, gene conversion can generate new haplotypes by transferring sections of DNA within and across duplicated MHC loci. However, the extent to which gene conversion generates new MHC haplotypes in wild populations is poorly understood. We developed a 454 sequencing protocol to screen MHC class I exon 3 variation across all 13 island populations of Berthelot's pipit (Anthus berthelotii). We reveal that just 11-15 MHC haplotypes were retained when Berthelot's pipit dispersed across its island range in the North Atlantic ca. 75,000 years ago. Since then, at least 26 new haplotypes have been generated in situ across populations. We show that most of these haplotypes were generated by gene conversion across divergent lineages, and that the rate of gene conversion exceeded that of point mutation by an order of magnitude. Gene conversion resulted in significantly more changes at nucleotide sites directly involved with pathogen recognition, indicating selection for functional variants. We suggest that the creation of new variants by gene conversion is the predominant mechanism generating MHC variation in genetically depauperate populations, thus allowing them to respond to pathogenic challenges.
Genetic variation has been shown to play a significant role in determining susceptibility to the salmon louse, Lepeophtheirus salmonis. However, the mechanisms involved in differential response to infection remain poorly understood. Recent findings in Atlantic salmon (Salmo salar) have provided evidence for a potential link between marker variation at the major histocompatibility complex (MHC) and differences in lice abundance among infected siblings, suggesting that MHC genes can modulate susceptibility to the parasite. In this study, we used quantitative trait locus (QTL) analysis to test the effect of genomic regions linked to MHC class I and II on linkage groups (LG) 15 and 6, respectively.
The application of inexpensive short-read sequencing technologies to reduced-representation genomes is revolutionizing genetic research, especially population genetics analysis, by allowing the genotyping of massive numbers of single-nucleotide polymorphisms (SNPs) for large numbers of individuals and populations. Restriction site-associated DNA (RAD) sequencing is a recent technique based on the characterization of genomic regions flanking restriction sites. One of its potential drawbacks is the presence of polymorphism within the restriction site, which makes it impossible to observe the associated SNP allele (i.e. allele dropout, ADO). To investigate the effect of ADO on genetic variation estimated from RAD markers, we first mathematically derived measures of the effect of ADO on allele frequencies as a function of different parameters within a single population. We then used RAD data sets simulated using a coalescence model to investigate the magnitude of biases induced by ADO on the estimation of expected heterozygosity and F_ST under a simple demographic model of divergence between two populations. We found that ADO tends to cause genetic variation to be overestimated both within and between populations. Assuming a mutation rate per nucleotide between 10^-9 and 10^-8, this bias remained low for most studied combinations of divergence time and effective population size, except for large effective population sizes. Averaging F_ST values over multiple SNPs, for example, by sliding window analysis, did not correct ADO biases. We briefly discuss possible solutions to filter the most problematic cases of ADO using read coverage to detect markers with a large excess of null alleles.
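For readers unfamiliar with the two summary statistics named in the abstract above, the following minimal sketch computes expected heterozygosity and a simple heterozygosity-based F_ST for one biallelic SNP in two populations. It assumes equal weighting of the two populations and known allele frequencies, and it does not model allele dropout or the coalescent simulations described in the abstract. The function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)


def fst_two_populations(p1, p2):
    """F_ST = (H_T - H_S) / H_T for two equally weighted populations.

    H_S is the mean within-population expected heterozygosity and H_T is
    the expected heterozygosity at the mean allele frequency.
    """
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2
    h_t = expected_heterozygosity((p1 + p2) / 2)
    if h_t == 0:
        # Locus is monomorphic overall, so differentiation is undefined; report 0.
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(expected_heterozygosity(0.2))
    print(fst_two_populations(0.2, 0.8))
```

Any bias in the estimated allele frequencies, such as that caused by unobserved null alleles, propagates directly into both statistics through `p1` and `p2`.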
Restriction site-associated DNA Sequencing (RAD-Seq) is an economical and efficient method for SNP discovery and genotyping. As with other sequencing-by-synthesis methods, RAD-Seq produces stochastic count data and requires sensitive analysis to develop or genotype markers accurately. We show that there are several sources of bias specific to RAD-Seq that are not explicitly addressed by current genotyping tools, namely restriction fragment bias, restriction site heterozygosity and PCR GC content bias. We explore the performance of existing analysis tools given these biases and discuss approaches to limiting or handling biases in RAD-Seq data. While these biases need to be taken seriously, we believe RAD loci affected by them can be excluded or processed with relative ease in most cases and that most RAD loci will be accurately genotyped by existing tools.
Stenotrophomonas maltophilia PML168 was isolated from Wembury Beach on the English Coast from a rock pool following growth and selection on agar plates. Here we present the permanent draft genome sequence, which has allowed prediction of function for several genes encoding enzymes relevant to industrial biotechnology, including a novel flavoprotein monooxygenase.
Tilapia species exhibit great ecological diversity and a marked propensity for interspecific hybridisation. This has been shown in the wild and used in aquaculture. However, despite its important evolutionary implications, few studies have focused on the analysis of hybrid genomes and their meiotic segregation. Intergeneric hybrids between Oreochromis niloticus and Sarotherodon melanotheron, two species highly differentiated genetically, ecologically, and behaviourally, were produced experimentally. The meiotic segregation of these hybrids was analysed in reciprocal second generation hybrid (F2) and backcross families and compared to the meiosis of both parental species, using a panel of 30 microsatellite markers. Hybrid meioses showed segregation in accordance with Mendelian expectations, independent of sex and the direction of crosses. In addition, we observed a conservation of linkage associations between markers, which suggests a relatively similar genome structure between the two parental species and the apparent lack of postzygotic incompatibility, despite their substantial divergence. These results provide genomic insights into the relative ease of hybridisation within cichlid species when prezygotic barriers are disrupted. Overall our results support the hypothesis that hybridisation may have played an important role in the evolution and diversification of cichlids.
Restriction site-associated DNA sequencing (RAD-Seq) is a genome complexity reduction technique that facilitates large-scale marker discovery and genotyping by sequencing. Recent applications of RAD-Seq have included linkage and QTL mapping with a particular focus on non-model species. In the current study, we have applied RAD-Seq to two Atlantic salmon families from a commercial breeding program. The offspring from these families were classified into resistant or susceptible based on survival/mortality in an Infectious Pancreatic Necrosis (IPN) challenge experiment, and putative homozygous resistant or susceptible genotype at a major IPN-resistance QTL. From each family, the genomic DNA of the two heterozygous parents and seven offspring of each IPN phenotype and genotype was digested with the SbfI enzyme and sequenced in multiplexed pools.
The transfer of the genomic resources developed in the Nile tilapia, Oreochromis niloticus, to other Tilapiines sensu lato and African cichlids would provide new possibilities to study this amazing group from genetic, ecological, evolutionary, aquaculture, and conservation points of view. We tested the cross-species amplification of 32 O. niloticus microsatellite markers in a panel of 15 species from 5 different African cichlid tribes: Oreochromines (Oreochromis, Sarotherodon), Boreotilapiines (Tilapia), Chromidotilapines, Hemichromines, and Haplochromines. Amplification was successfully observed for 29 markers (91%), with a frequency of polymorphic (P(95)) loci per species around 70%. The mean number of alleles per locus and species was 3.2 but varied from 3.7 within Oreochromis species to 1.6 within the nontilapia species. The high level of cross-species amplification and polymorphism of the microsatellite markers tested in this study provides powerful tools for a wide range of molecular genetic studies within tilapia species as well as for other African cichlids.