Soybean oil and meal are major contributors to worldwide food production. Consequently, the genetic basis for soybean seed composition has been intensely studied using family-based mapping. Population-based mapping approaches, in the form of genome-wide association (GWA) scans, have been able to resolve loci controlling moderately complex quantitative traits (QTL) in numerous crop species. Yet, it is still unclear how soybean's unique population history will affect GWA scans. Using one of the populations in this study, we simulated phenotypes resulting from a range of genetic architectures. We found that with a heritability of 0.5, ~100% and ~33% of the 4 and 20 simulated QTL can be recovered, respectively, with a false-positive rate of less than ~6×10⁻⁵ per marker tested. Additionally, we demonstrated that combining information from multi-locus mixed models and compressed linear-mixed models improves QTL identification and interpretation. We applied these insights to exploring seed composition in soybean, refining the linkage group I (chromosome 20) protein QTL and identifying additional oil QTL that may allow some decoupling of highly correlated oil and protein phenotypes. Because the value of protein meal is closely related to its essential amino acid profile, we attempted to identify QTL underlying methionine, threonine, cysteine, and lysine content. Multiple QTL were found that have not been observed in family-based mapping studies, and each trait exhibited associations across multiple populations. Chromosomes 1 and 8 contain strong candidate alleles for essential amino acid increases. Overall, we present these and additional data that will be useful in determining breeding strategies for the continued improvement of soybean's nutrient portfolio.
The insertion of DNA into a genome can result in the duplication and dispersal of functional sequences through the genome. In addition, a deeper understanding of insertion mechanisms will inform methods of genetic engineering and plant transformation. Exploiting structural variations in numerous rice accessions, we have inferred and analyzed intermediate length (10-1,000 bp) insertions in plants. Insertions in this size class were found to be approximately equal in frequency to deletions, and compound insertion-deletions comprised only 0.1% of all events. Our findings indicate that, as observed in humans, tandem or partially tandem duplications are the dominant form of insertion (48%), although short duplications from ectopic donors account for a sizable fraction of insertions in rice (38%). Many nontandem duplications contain insertions from nearby DNA (within 200 bp) and can contain multiple donor sources--some distant--in single events. Although replication slippage is a plausible explanation for tandem duplications, the end homology required in such a model is most often absent and rarely is >5 bp. However, end homology is commonly longer than expected by chance. Such findings lead us to favor a model of patch-mediated double-strand-break creation followed by nonhomologous end-joining. Additionally, a striking bias toward 31-bp partially tandem duplications suggests that errors in nucleotide excision repair may be resolved via a similar, but distinct, pathway. In summary, the analysis of recent insertions in rice suggests multiple underappreciated causes of structural variation in eukaryotes.
Plants from the Zingiberaceae family are a key source of spices and herbal medicines. Species identification within this group is critical in the search for known and possibly novel bioactive compounds. To facilitate precise characterization of this group, we have sequenced chloroplast genomes from species representing five major groups within Zingiberaceae. Generally, the structure of these genomes is similar to the basal angiosperm excepting an expansion of 3 kb associated with the inverted repeat A region. Portions of this expansion appear to be shared across the entire Zingiberales order, which includes gingers and bananas. We used whole plastome alignment information to develop DNA barcodes that would maximize the ability to differentiate species within the Zingiberaceae. Our computation pipeline identified regions of high variability that were flanked by highly conserved regions used for primer design. This approach yielded hitherto unexploited regions of variability. These theoretically optimal barcodes were tested on a range of species throughout the family and were found to amplify and differentiate genera and, in some cases, species. Still, though these barcodes were specifically optimized for the Zingiberaceae, our data support the emerging consensus that whole plastome sequences are needed for robust species identification and phylogenetics within this family.
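The barcode search described above (variable regions flanked by conserved regions suitable for primers) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the window and flank sizes, and the toy alignment are all hypothetical, and the sketch assumes the sequences are already aligned to equal length and treats any column with more than one distinct character as variable.

```python
def variable_columns(alignment):
    """Flag each alignment column that contains more than one distinct character."""
    return [len(set(column)) > 1 for column in zip(*alignment)]


def candidate_barcodes(alignment, window, flank, min_variable):
    """Return (start, end, n_variable) for windows whose flanks are fully conserved.

    window, flank and min_variable are illustrative parameters, not values
    taken from the study.
    """
    variable = variable_columns(alignment)
    n = len(variable)
    hits = []
    for start in range(flank, n - window - flank + 1):
        left = variable[start - flank:start]
        core = variable[start:start + window]
        right = variable[start + window:start + window + flank]
        # Keep the window only if both primer flanks are invariant
        # and the core carries enough variable sites.
        if not any(left) and not any(right) and sum(core) >= min_variable:
            hits.append((start, start + window, sum(core)))
    return hits


# Toy alignment (hypothetical data): conserved 4-bp flanks around a variable core.
toy_alignment = [
    "ACGTAAAATGCA",
    "ACGTCCAATGCA",
    "ACGTGAGATGCA",
]
print(candidate_barcodes(toy_alignment, window=4, flank=4, min_variable=3))
```

A real pipeline would also need to handle alignment gaps, primer melting temperatures, and amplicon length limits, none of which are modeled here.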
We review the evidence that upstream open reading frames (uORFs) function as RNA sequence elements for post-transcriptional control of gene expression, specifically translation. uORFs are highly abundant in the genomes of angiosperms. Their negative effect on translation is often attenuated by ribosomal translation reinitiation, a process whose molecular biochemistry is still being investigated. Certain uORFs render translation responsive to small molecules, thus offering a path for metabolic control of gene expression in evolution and synthetic biology. In some cases, uORFs form modular logic gates in signal transduction. uORFs thus provide eukaryotes with a functionality analogous to, or comparable to, riboswitches and attenuators in prokaryotes. uORFs exist in many genes regulating development and point toward translational control of development. While many uORFs appear to be poorly conserved, and the number of genes with conserved-peptide uORFs is modest, many mRNAs have a conserved pattern of uORFs. Evolutionarily, the gain and loss of uORFs may be a widespread mechanism that diversifies gene expression patterns. Last but not least, this review includes a dedicated uORF database for Arabidopsis.
Upstream open reading frames (uORFs) are protein coding elements in the 5′ leader of messenger RNAs. uORFs generally inhibit translation of the main ORF because ribosomes that perform translation elongation suffer either permanent or conditional loss of reinitiation competence. After conditional loss, reinitiation competence may be regained by, at the minimum, reacquisition of a fresh methionyl-tRNA. The conserved h subunit of Arabidopsis eukaryotic initiation factor 3 (eIF3) mitigates the inhibitory effects of certain uORFs. Here, we define more precisely how this occurs, by combining gene expression data from mutated 5′ leaders of Arabidopsis AtbZip11 (At4g34590) and yeast GCN4 with a computational model of translation initiation in wild-type and eif3h mutant plants. Of the four phylogenetically conserved uORFs in AtbZip11, three are inhibitory to translation, while one is anti-inhibitory. The mutation in eIF3h has no major effect on uORF start codon recognition. Instead, eIF3h supports efficient reinitiation after uORF translation. Modeling suggested that the permanent loss of reinitiation competence during uORF translation occurs at a faster rate in the mutant than in the wild type. Thus, eIF3h ensures that a fraction of uORF-translating ribosomes retain their competence to resume scanning. Experiments using the yeast GCN4 leader provided no evidence that eIF3h fosters tRNA reacquisition. Together, these results attribute a specific molecular function in translation initiation to an individual eIF3 subunit in a multicellular eukaryote.
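To make the definition of a uORF concrete, the following is a minimal sketch that scans a 5′ leader for AUG start codons followed by an in-frame stop codon. It is an illustration only, not the computational model used in the study: the function name and the example sequence are hypothetical, and the sketch ignores non-AUG starts and uORFs that run past the end of the leader into the main ORF.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(leader):
    """Return (start, end, n_codons) for each AUG with an in-frame stop in the leader.

    start and end are 0-based, end is exclusive and includes the stop codon;
    n_codons counts sense codons, including the AUG.
    """
    seq = leader.upper().replace("T", "U")  # accept DNA or RNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "AUG":
            continue
        # Walk codon by codon in the frame set by this AUG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, (pos - start) // 3))
                break
    return uorfs


# Hypothetical leader with two short uORFs.
example_leader = "GGAUGAAAUAAGGCAUGCCCUGAGG"
print(find_uorfs(example_leader))  # [(2, 11, 2), (14, 23, 2)]
```

Each reported tuple marks one uORF; an AUG with no in-frame stop inside the leader is skipped, which is where a fuller treatment would need the main ORF sequence as well.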
We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
The sequence elements that mediate post-transcriptional gene regulation often reside in the 5′ and 3′ untranslated regions (UTRs) of mRNAs. Using six different families of dicotyledonous plants, we developed a comparative transcriptomics pipeline for the identification and annotation of deeply conserved regulatory sequences in the 5′ and 3′ UTRs. Our approach was robust to confounding effects of poor UTR alignability and rampant paralogy in plants. In the 3′ UTR, motifs resembling PUMILIO-binding sites form a prominent group of conserved motifs. Additionally, Expansins, one of the few plant mRNA families known to be localized to specific subcellular sites, possess a core conserved RCCCGC motif. In the 5′ UTR, one major subset of motifs consists of purine-rich repeats. A distinct and substantial fraction possesses upstream AUG start codons. Half of the AUG-containing motifs reveal hidden protein-coding potential in the 5′ UTR, while the other half point to a peptide-independent function related to translation. Among the former, we added four novel peptides to the small catalog of conserved-peptide uORFs. Among the latter, our case studies document patterns of uORF evolution that include gain and loss of uORFs, switches in uORF reading frame, and switches in uORF length and position. In summary, nearly three hundred post-transcriptional elements show evidence of purifying selection across the eudicot branch of flowering plants, indicating a regulatory function spanning at least 70 million years. Some of these sequences have experimental precedent, but many are novel and encourage further exploration.