Often of primary interest in the analysis of multivariate data are the copula parameters describing the dependence among the variables, rather than the univariate marginal distributions. Since the ranks of a multivariate dataset are invariant to changes in the univariate marginal distributions, rank-based estimators are natural candidates for semiparametric copula estimation. Asymptotic information bounds for such estimators can be obtained from an asymptotic analysis of the rank likelihood, i.e. the probability of the multivariate ranks. In this article, we obtain limiting normal distributions of the rank likelihood for Gaussian copula models. Our results cover models with structured correlation matrices, such as exchangeable or circular correlation models, as well as unstructured correlation matrices. For all Gaussian copula models, the limiting distribution of the rank likelihood ratio is shown to be equal to that of a parametric likelihood ratio for an appropriately chosen multivariate normal model. This implies that the semiparametric information bounds for rank-based estimators are the same as the information bounds for estimators based on the full data, and that the multivariate normal distributions are least favorable.
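The rank-invariance property this abstract relies on can be checked numerically. The following is a minimal sketch, not the authors' method: the helper names `ranks` and `simulate_gaussian_pair`, the correlation value 0.6, and the two transformations are all illustrative assumptions. It draws correlated normal pairs, applies strictly increasing transformations to each margin, and confirms that the ranks are unchanged.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def simulate_gaussian_pair(n=200, rho=0.6, seed=0):
    """Draw n pairs from a bivariate normal with correlation rho (hypothetical settings)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(z2)
    return xs, ys


x, y = simulate_gaussian_pair()

# Strictly increasing transformations change the marginal distributions
# but leave the dependence structure (the copula) untouched.
x_transformed = [math.exp(v) for v in x]
y_transformed = [v ** 3 + v for v in y]

same_ranks = ranks(x) == ranks(x_transformed) and ranks(y) == ranks(y_transformed)
print("ranks unchanged by marginal transformations:", same_ranks)
```

Because the ranks carry no information about the margins, any estimator built only from them depends on the data through the copula alone, which is the motivation for rank-based estimation given above.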
ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays that can adapt to the presence of such interactions. These prior distributions are based on a type of array-variate normal distribution, for which a covariance matrix for each factor is estimated. This prior is able to adapt to potential similarities among the levels of a factor, and incorporate any such information into the estimation of the effects in which the factor appears. In the presence of such similarities, this prior is able to borrow information from well-estimated main effects and lower-order interactions to assist in the estimation of higher-order terms for which data information is limited.
Male and female sexes have evolved repeatedly in eukaryotes but the origins of dimorphic sexes and their relationship to mating types in unicellular species are not understood. Volvocine algae include isogamous species such as Chlamydomonas reinhardtii, with two equal-sized mating types, and oogamous multicellular species such as Volvox carteri with sperm-producing males and egg-producing females. Theoretical work predicts genetic linkage of a gamete cell-size regulatory gene(s) to an ancestral mating-type locus as a possible step in the evolution of dimorphic gametes, but this idea has not been tested. Here we show that, contrary to predictions, a single conserved mating locus (MT) gene in volvocine algae, MID, which encodes an RWP-RK domain transcription factor, evolved from its ancestral role in C. reinhardtii as a mating-type specifier to become a determinant of sperm and egg development in V. carteri. Transgenic female V. carteri expressing male MID produced functional sperm packets during sexual development. Transgenic male V. carteri with RNA interference (RNAi)-mediated knockdowns of VcMID produced functional eggs, or self-fertile hermaphrodites. Post-transcriptional controls were found to regulate cell-type-limited expression and nuclear localization of VcMid protein that restricted its activity to nuclei of developing male germ cells and sperm. Crosses with sex-reversed strains uncoupled sex determination from sex chromosome identity and revealed gender-specific roles for male and female mating locus genes in sexual development, gamete fitness and reproductive success. Our data show genetic continuity between the mating-type specification and sex determination pathways of volvocine algae, and reveal evidence for gender-specific adaptations in the male and female mating locus haplotypes of Volvox.
These findings will enable a deeper understanding of how a master regulator of mating-type determination in an ancestral unicellular species was reprogrammed to control sexually dimorphic gamete development in a multicellular descendant.
We describe protein interaction quantitation (PIQ), a computational method for modeling the magnitude and shape of genome-wide DNase I hypersensitivity profiles to identify transcription factor (TF) binding sites. Through the use of machine-learning techniques, PIQ identified binding sites for >700 TFs from one DNase I hypersensitivity analysis followed by sequencing (DNase-seq) experiment with accuracy comparable to that of chromatin immunoprecipitation followed by sequencing (ChIP-seq). We applied PIQ to analyze DNase-seq data from mouse embryonic stem cells differentiating into prepancreatic and intestinal endoderm. We identified 120 and experimentally validated eight 'pioneer' TF families that dynamically open chromatin. Four pioneer TF families only opened chromatin in one direction from their motifs. Furthermore, we identified 'settler' TFs whose genomic binding is principally governed by proximity to open chromatin. Our results support a model of hierarchical TF binding in which directional and nondirectional pioneer activity shapes the chromatin landscape for population by settler TFs.
Micromonospora species live in diverse environments and exhibit a broad range of functions, including antibiotic production, biocontrol, and degradation of complex polysaccharides. To learn more about these versatile actinomycetes, we sequenced the genome of strain L5, originally isolated from root nodules of an actinorhizal plant growing in Mexico.
Heteromorphic sex-determining regions or mating-type loci can contain large regions of non-recombining sequence where selection operates under different constraints than in freely recombining autosomal regions. Detailed studies of these non-recombining regions can provide insights into how genes are gained and lost, and how genetic isolation is maintained between mating haplotypes or sex chromosomes. The Chlamydomonas reinhardtii mating-type locus (MT) is a complex polygenic region characterized by sequence rearrangements and suppressed recombination between its two haplotypes, MT+ and MT-. We used new sequence information to redefine the genetic contents of MT and found repeated translocations from autosomes as well as sexually controlled expression patterns for several newly identified genes. We examined sequence diversity of MT genes from wild isolates of C. reinhardtii to investigate the impacts of recombination suppression. Our population data revealed two previously unreported types of genetic exchange in Chlamydomonas MT: gene conversion in the rearranged domains and crossover exchanges in flanking domains, both of which contribute to maintenance of genetic homogeneity between haplotypes. To investigate the cause of blocked recombination in MT we assessed recombination rates in crosses where the parents were homozygous at MT. While normal recombination was restored in MT+ × MT+ crosses, it was still suppressed in MT- × MT- crosses. These data revealed an underlying asymmetry in the two MT haplotypes and suggest that sequence rearrangements are insufficient to fully account for recombination suppression. Together our findings reveal new evolutionary dynamics for mating loci and have implications for the evolution of heteromorphic sex chromosomes and other non-recombining genomic regions.
Novel dose-finding designs for Phase I cancer clinical trials, using estimation to assign the best estimated Maximum Tolerated Dose (MTD) at each point in the experiment, most prominently via Bayesian techniques, have been widely discussed and promoted since 1990.
It is common for novel dose-finding designs to be presented without a study of their convergence properties. In this article we suggest that examination of convergence is a necessary quality check for dose-finding designs. We present a new convergence proof for a nonparametric family of methods called "interval designs," under certain conditions on the toxicity-frequency function F. We compare these conditions with the convergence conditions for the popular CRM one-parameter Phase I cancer design, via an innovative numerical sensitivity study generating a diverse sample of dose-toxicity scenarios. Only a small fraction of scenarios meet the Shen-O'Quigley convergence conditions for CRM. Conditions for "interval design" convergence are met more often, but still less than half the time. In the discussion, we illustrate how convergence properties and limitations help provide insight about small-sample behavior.
Approximately 10% of ulcerative colitis patients develop colorectal neoplasia. At present, identification of this subset is markedly limited and necessitates lifelong colonoscopic surveillance for the entire ulcerative colitis population. Better risk markers are needed to focus surveillance onto the patients who are most likely to benefit. Using array-based comparative genomic hybridization, we analyzed single, non-dysplastic biopsies from three patient groups: ulcerative colitis progressors (n=9) with cancer or high-grade dysplasia at a mean distance of 18 cm from the analyzed site; ulcerative colitis non-progressors (n=8) without dysplasia during long-term surveillance; and non-ulcerative colitis normal controls (n=2). Genomic DNA from fresh colonic epithelium purified from stroma was hybridized to 287 (low-density) and 4342 (higher-density) feature bacterial artificial chromosome arrays. Sample-to-reference fluorescence ratios were calculated for individual chromosomal targets and globally across the genome. The low-density arrays yielded pronounced genomic gains and losses in 3 of 9 (33%) ulcerative colitis progressors but in none of the 10 control patients. Identical DNA samples analyzed on the higher-density arrays, using a combination of global and individual high variance assessments, distinguished all nine progressors from all 10 controls. These data confirm that genomic alterations in ulcerative colitis progressors are widespread, even involving single non-dysplastic biopsies that are far distant from neoplasia. They therefore show promise toward eliminating full colonoscopic surveillance with extensive biopsy sampling in the majority of ulcerative colitis patients.
We describe here a system for the expression and purification of small ubiquitin-related modifier (SUMO) fusion proteins, which often exhibit dramatically increased solubility and stability during expression in bacteria relative to unfused proteins. The vector described here allows expression of a His-tagged protein of interest fused at its N-terminus to SUMO. Using this vector, we have produced a polypeptide consisting of SUMO fused to the Q domain of Drosophila Groucho in a concentrated soluble form. Hydrodynamic analysis shows that, consistent with previous studies on full-length Groucho, the fusion protein forms an elongated tetramer, as well as higher order oligomers. After expressing a protein as a fusion to SUMO, it is often desirable to cleave the SUMO off of the fusion protein using a SUMO-specific protease such as Ulp1. To facilitate such processing, we have constructed a dual expression vector encoding two fusion proteins: one consisting of SUMO fused to Ulp1 and a second consisting of SUMO fused to a His-tagged protein of interest. The SUMO-Ulp1 cleaves both itself and the other SUMO fusion protein in the bacterial cells prior to lysis, and the proteins retain solubility after cleavage.
Although dimorphic sexes have evolved repeatedly in multicellular eukaryotes, their origins are unknown. The mating locus (MT) of the sexually dimorphic multicellular green alga Volvox carteri specifies the production of eggs and sperm and has undergone a remarkable expansion and divergence relative to MT from Chlamydomonas reinhardtii, which is a closely related unicellular species that has equal-sized gametes. Transcriptome analysis revealed a rewired gametic expression program for Volvox MT genes relative to Chlamydomonas and identified multiple gender-specific and sex-regulated transcripts. The retinoblastoma tumor suppressor homolog MAT3 is a Volvox MT gene that displays sexually regulated alternative splicing and evidence of gender-specific selection, both of which are indicative of cooption into the sexual cycle. Thus, sex-determining loci affect the evolution of both sex-related and non-sex-related genes.
Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does.
Lectins are a diverse group of carbohydrate-binding proteins that are found within and associated with organisms from all kingdoms of life. Several different classes of plant lectins serve a diverse array of functions. The most prominent of these include participation in plant defense against predators and pathogens and involvement in symbiotic interactions between host plants and symbiotic microbes, including mycorrhizal fungi and nitrogen-fixing rhizobia. Extensive biological, biochemical, and molecular studies have shed light on the functions of plant lectins, and a plethora of uncharacterized lectin genes are being revealed at the genomic scale, suggesting unexplored and novel diversity in plant lectin structure and function. Integration of the results from these different types of research is beginning to yield a more detailed understanding of the function of lectins in symbiosis, defense, and plant biology in general.
The percentile-finding experimental design known variously as forced-choice fixed-staircase, geometric up-and-down or k-in-a-row (KR) was introduced by Wetherill four decades ago. To date, KR has been by far the most widely used up-and-down (U&D) design for estimating non-median percentiles; it is implemented most commonly in sensory studies. However, its statistical properties have not been fully documented, and the existence of a unique mode in its asymptotic treatment distribution has been recently disputed. Here we revisit the KR design and its basic properties. We find that KR does generate a unique stationary mode near its target percentile, and also displays better operational characteristics than two other U&D designs that have been studied more extensively. Supporting proofs and numerical calculations are presented. A recent experimental example from anesthesiology serves to highlight some of the up-and-down design family's properties and advantages.
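The KR rule described in this abstract can be sketched as a short simulation. This is an illustrative sketch under stated assumptions, not the paper's code: it assumes the convention in which the level moves up after k consecutive non-responses at the current level and moves down after any single response, for which the target percentile is 1 - 0.5^(1/k). The mirror-image convention common in sensory studies targets 0.5^(1/k) instead. The function names, the logistic response curve `example_F`, and the ten-level grid are hypothetical.

```python
import math
import random
from collections import Counter


def kr_target(k):
    """Response probability targeted when k consecutive non-responses are needed to move up."""
    return 1.0 - 0.5 ** (1.0 / k)


def simulate_kr(F, n_levels, k, n_trials, start=0, seed=0):
    """Simulate a k-in-a-row experiment; F(level) is the response probability at a level."""
    rng = random.Random(seed)
    level, streak, visited = start, 0, []
    for _ in range(n_trials):
        visited.append(level)
        if rng.random() < F(level):
            # Any response: step down one level and reset the streak.
            level = max(level - 1, 0)
            streak = 0
        else:
            streak += 1
            if streak == k:
                # k non-responses in a row: step up one level and reset the streak.
                level = min(level + 1, n_levels - 1)
                streak = 0
    return visited


def example_F(level):
    """Hypothetical logistic response curve over levels 0..9."""
    return 1.0 / (1.0 + math.exp(-0.8 * (level - 6)))


visited = simulate_kr(example_F, n_levels=10, k=2, n_trials=5000)
mode_level = Counter(visited).most_common(1)[0][0]
print("target response probability:", round(kr_target(2), 3))
print("most visited level:", mode_level, "with F =", round(example_F(mode_level), 3))
```

Comparing the response probability at the most visited level with `kr_target(k)` gives an informal check of the claim that the stationary mode sits near the target percentile.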
Interactions between DNA and transcription factors (TFs) guide cellular function and development, yet the complexities of gene regulation are still far from being understood. Such understanding is limited by a paucity of techniques with which to probe DNA-protein interactions. We have devised magnetic protein immobilization on enhancer DNA (MagPIE), a simple, rapid, multi-parametric assay using flow cytometric immunofluorescence to reveal interactions among TFs, chromatin structure and DNA. In MagPIE, synthesized DNA is bound to magnetic beads, which are then incubated with nuclear lysate, permitting sequence-specific binding by TFs, histones and methylation by native lysate factors that can be optionally inhibited with small molecules. Lysate protein-DNA binding is monitored by flow cytometric immunofluorescence, which allows for accurate comparative measurement of TF-DNA affinity. Combinatorial fluorescent staining allows simultaneous analysis of sequence-specific TF-DNA interaction and chromatin modification. MagPIE provides a simple and robust method to analyze complex epigenetic interactions in vitro.