Genovese G, Fromer M, Stahl EA, Ruderfer DM, Chambert K, Landén M, Moran JL, Purcell SM, Sklar P, Sullivan PF, Hultman CM, McCarroll SA
Nature Neuroscience, 2016
Our results suggest that synaptic dysfunction may mediate a large fraction of strong, individually rare genetic influences on schizophrenia risk.
Boettger LM, Salem RM, Handsaker RE, Peloso GM, Kathiresan S, Hirschhorn JN, McCarroll SA.
Nature Genetics, 2016
We describe a way to analyze the polymorphism of the HP gene by imputation from SNP haplotypes and find that HP exonic deletions associate with reduced LDL and total cholesterol levels.
Sekar A, Bialas AR, de Rivera H, Davis A, Hammond TR, Kamitaki N, Tooley K, Presumey J, Baum M, Van Doren V, Genovese G, Rose SA, Handsaker RE, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Daly MJ, Carroll MC, Stevens B, McCarroll SA.
The results implicate excessive complement activity in the development of schizophrenia and may help explain the reduced numbers of synapses in the brains of individuals with schizophrenia.
Structural forms of the human amylase locus and their relationships to SNPs, haplotypes and obesity
Usher CL, Handsaker RE, Esko T, Tuke MA, Weedon MN, Hastie AR, Cao H, Moon JE, Kashin S, Fuchsberger C, Metspalu A, Pato CN, Pato MT, McCarthy MI, Boehnke M, Altshuler DM, Frayling TM, Hirschhorn JN, McCarroll SA.
Nature Genetics, 2015
We describe a way to analyze genomic regions of high structural complexity and apply it to the human amylase locus, which encodes the enzymes that digest starch into sugar. Though this variation has been reported to be the human genome’s largest influence on obesity, we find that this is not the case.
Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA.
We describe a way to profile genome-wide gene expression in thousands of individual cells simultaneously – in facile, inexpensive experiments. We call this approach “Drop-seq”.
Large multiallelic copy number variations in humans
Handsaker RE, Van Doren V, Berman JR, Genovese G, Kashin S, Boettger LM, McCarroll SA.
Nature Genetics, 2015
We describe an intriguing form of copy number variation, in which a gene or genetic locus is present in widely varying numbers of copies in different individuals.
Regan JF, Kamitaki N, Legler T, Cooper S, Klitgord N, Karlin-Neumann G, Wong C, Hodges S, Koehler R, Tzonev S, McCarroll SA.
PLoS One, 2015
We describe a molecular method for quickly determining the chromosomal phase of pairs of sequence variants, even when they are separated by hundreds of thousands of base pairs, by using droplets to isolate long chromosomal segments.
Clonal Hematopoiesis and Blood-Cancer Risk Inferred from Blood DNA Sequence
Genovese G, Kähler AK, Handsaker R, Lindberg J, Rose SA, Bakhoum SF, Chambert K, Mick E, Neale BM, Fromer M, Purcell SM, Svantesson O, Landén M, Höglund M, Lehmann S, Gabriel SB, Moran JL, Lander ES, Sullivan PF, Sklar P, Grönberg H, Hultman CM, McCarroll SA.
New England Journal of Medicine, 2014
We describe a common pre-cancerous state, involving the clonal amplification of blood cells with somatic mutations, that is readily detected by DNA sequencing, is increasingly common as people age, and is associated with increased risk of blood cancer later in life.
Genetic Variation in Human DNA Replication Timing
Koren A, Handsaker RE, Kamitaki N, Karlić R, Ghosh S, Polak P, Eggan K, McCarroll SA.
We describe a new way to study DNA replication by using increasingly abundant whole genome sequence data, which we find contains signatures of DNA replication processes that were active in cells at the moment DNA was extracted from them. Using data from the 1000 Genomes Project, we find that aspects of genome replication vary from person to person and are controlled by genetic variation that affects the presence and utilization of replication origins.
Random replication of the inactive X chromosome
Koren A, McCarroll SA.
Genome Research, 2014
We find that DNA replication follows two strategies: slow, ordered replication associated with transcriptional activity, and rapid, unstructured, “random” replication of silent chromatin on the inactive X chromosome and the autosomes. The two strategies coexist in the same cell, yet are segregated in space and time.
Genome-scale neurogenetics: methodology and meaning.
McCarroll SA, Feng G, Hyman SE.
Nature Neuroscience, 2014
Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.
Mapping the human reference genome’s missing sequence by three-way admixture in Latino genomes
Genovese G, Handsaker RE, Li H, Kenny EE, McCarroll SA.
American Journal of Human Genetics, 2013
We show that data from Latino genomes can be used to map a substantial fraction of the human genome’s remaining unmapped sequence.
Using population admixture to help complete maps of the human genome
Genovese G, Handsaker RE, Li H, Altemose N, Lindgren AM, Chambert K, Pasaniuc B, Price AL, Reich D, Morton CC, Pollak MR, Wilson JG, McCarroll SA.
Nature Genetics, 2013
We describe a way to map the human genome’s “missing pieces” – tens of megabases of apparently human genome sequence that had no home on maps of the human genome – by using mathematical patterns in the sequence variation that is present in admixed populations such as African Americans. Surprisingly, we find that much of this sequence has been hiding in and around the centromeres of human chromosomes.
Of rats and men [Review Article]
Patil CK, McCarroll SA.
The selective breeding of rats as physiological, behavioral, and disease models generated a wealth of variation relevant to the genetics of complex traits.
Progress in the genetics of polygenic brain disorders: significant new challenges for neurobiology [Review Article]
McCarroll SA, Hyman SE.
Advances in genome analysis are making possible successful genetic analyses of polygenic brain disorders. We outline the challenges and opportunities for neurobiology that lie ahead.
Our fallen genomes [Review Article]
Macosko EZ, McCarroll SA.
Few human conceits are as relentlessly undermined by science as humans’ naïve assumptions about our own perfection. Charles Darwin abolished one such set of assumptions by showing that “inferior creations” are man’s evolutionary cousins. However, Darwin’s theory of evolution ultimately abetted a modern conceit—that the genomes in our cells are highly optimized end products of evolution.
The variation within [Review Article]
Macosko EZ, McCarroll SA.
Nature Genetics, 2012
We usually think of an individual’s cells as sharing the same genome.
Differential relationship of DNA replication timing to different forms of human mutation and variation
Koren A, Polak P, Nemesh J, Michaelson JJ, Sebat J, Sunyaev SR, McCarroll SA.
American Journal of Human Genetics, 2012
We describe how DNA replication timing shapes the generation of new mutations across the human genome.
Structural haplotypes and recent evolution of the human 17q21.31 region
Boettger LM, Handsaker RE, Zody MC, McCarroll SA.
Nature Genetics, 2012
We describe an extreme form of structural variation at the human 17q21.31 inversion locus, which we find is segregating in at least nine different structural forms in human populations. We further show that complex genome structures can be analyzed by imputation from SNPs.
Discovery and genotyping of genome structural polymorphism by sequencing on a population scale
Handsaker RE, Korn JM, Nemesh J, McCarroll SA.
Nature Genetics, 2011
We describe a new class of methods for analyzing structural variation in whole genome sequence data.
Copy number variation and human genome maps [Review Article]
Nature Genetics, 2010
Maps of human genome copy number variation (CNV) are maturing into useful resources for complex disease genetics.
Donor-recipient mismatch for common gene deletion polymorphisms in graft-versus-host disease.
McCarroll SA, Bradner JE, Turpeinen H, Volin L, Martin PJ, Chilewski SD, Antin JH, Lee SJ, Ruutu T, Storer B, Warren EH, Zhang B, Zhao LP, Ginsburg D, Soiffer RJ, Partanen J, Hansen JA, Ritz J, Palotie A, Altshuler D.
Nature Genetics, 2009
Transplantation and pregnancy, in which two diploid genomes reside in one body, can each lead to diseases in which immune cells from one individual target antigens encoded in the other’s genome. One such disease, graft-versus-host disease (GVHD) after hematopoietic stem cell transplantation (HSCT, or bone marrow transplant), is common even after transplants between HLA-identical siblings, indicating that cryptic histocompatibility loci exist outside the HLA locus. The immune system of an individual whose genome is homozygous for a gene deletion could recognize epitopes encoded by that gene as alloantigens. Analyzing common gene deletions in three HSCT cohorts (1,345 HLA-identical sibling donor-recipient pairs), we found that risk of acute GVHD was greater (odds ratio (OR) = 2.5; 95% confidence interval (CI) 1.4-4.6) when donor and recipient were mismatched for homozygous deletion of UGT2B17, a gene expressed in GVHD-affected tissues and giving rise to multiple histocompatibility antigens. Human genome structural variation merits investigation as a potential mechanism in diseases of alloimmunity.
Integrated detection and population-genetic analysis of SNPs and copy number variation.
McCarroll SA, Kuruvilla FG, Korn JM, Cawley S, Nemesh J, Wysoker A, Shapero MH, de Bakker PIW, Maller J, Kirby A, Elliott AL, Parkin M, Hubbell E, Webster T, Mei R, Veitch J, Collins PJ, Handsaker R, Lincoln S, Nizzari MM, Blume J, Jones K, Rava R, Daly MJ, Gabriel SB, Altshuler DM.
Nature Genetics, 2008
Dissecting the genetic basis of disease risk requires measuring all forms of genetic variation, including SNPs and copy number variants (CNVs), and is enabled by accurate maps of their locations, frequencies and population-genetic properties. We designed a hybrid genotyping array (Affymetrix SNP 6.0) to simultaneously measure 906,600 SNPs and copy number at 1.8 million genomic locations. By characterizing 270 HapMap samples, we developed a map of human CNV (at 2-kb breakpoint resolution) informed by integer genotypes for 1,320 copy number polymorphisms (CNPs) that segregate at an allele frequency >1%. More than 80% of the sequence in previously reported CNV regions fell outside our estimated CNV boundaries, indicating that large (>100 kb) CNVs affect much less of the genome than initially reported. Approximately 80% of observed copy number differences between pairs of individuals were due to common CNPs with an allele frequency >5%, and more than 99% derived from inheritance rather than new mutation. Most common, diallelic CNPs were in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated on specific SNP haplotypes.
Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn’s disease.
McCarroll SA, Huett AS, Kuballa P, Chilewski S, Landry A, Goyette P, Zody MC, Hall JL, Brant SR, Cho JH, Duerr RH, Silverberg MS, Taylor KD, Rioux JD, Altshuler D, Daly MJ, Xavier RJ.
Nature Genetics, 2008
Following recent success in genome-wide association studies, a critical focus of human genetics is to understand how genetic variation at implicated loci influences cellular and disease processes. Crohn’s disease (CD) is associated with SNPs around IRGM, but coding-sequence variation has been excluded as a source of this association. We identified a common, 20-kb deletion polymorphism, immediately upstream of IRGM and in perfect linkage disequilibrium (r² = 1.0) with the most strongly CD-associated SNP, that causes IRGM to segregate in the population with two distinct upstream sequences. The deletion (CD risk) and reference (CD protective) haplotypes of IRGM showed distinct expression patterns. Manipulation of IRGM expression levels modulated cellular autophagy of internalized bacteria, a process implicated in CD. These results suggest that the CD association at IRGM arises from an alteration in IRGM regulation that affects the efficacy of autophagy and identify a common deletion polymorphism as a likely causal variant.
Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs.
Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, Cawley S, Hubbell E, Veitch J, Collins PJ, Darvishi K, Lee C, Nizzari MM, Gabriel SB, Purcell S, Daly MJ, Altshuler D.
Nature Genetics, 2008
Accurate and complete measurement of single nucleotide (SNP) and copy number (CNV) variants, both common and rare, will be required to understand the role of genetic variation in disease. We present Birdsuite, a four-stage analytical framework instantiated in software for deriving integrated and mutually consistent copy number and SNP genotypes. The method sequentially assigns copy number across regions of common copy number polymorphisms (CNPs), calls genotypes of SNPs, identifies rare CNVs via a hidden Markov model (HMM), and generates an integrated sequence and copy number genotype at every locus (for example, including genotypes such as A-null, AAB and BBB in addition to AA, AB and BB calls). Such genotypes more accurately depict the underlying sequence of each individual, reducing the rate of apparent mendelian inconsistencies. The Birdsuite software is applied here to data from the Affymetrix SNP 6.0 array. Additionally, we describe a method, implemented in PLINK, to utilize these combined SNP and CNV genotypes for association testing with a phenotype.
Copy-number variation and association studies of human disease.
McCarroll SA, Altshuler DM.
Nature Genetics, 2007
The central goal of human genetics is to understand the inherited basis of human variation in phenotypes, elucidating human physiology, evolution and disease. Rare mutations have been found underlying two thousand mendelian diseases; more recently, it has become possible to assess systematically the contribution of common SNPs to complex disease. The known role of copy-number alterations in sporadic genomic disorders, combined with emerging information about inherited copy-number variation, indicates the importance of systematically assessing copy-number variants (CNVs), including common copy-number polymorphisms (CNPs), in disease. Here we discuss evidence that CNVs affect phenotypes, directions for basic knowledge to support clinical study of CNVs, the challenge of genotyping CNPs in clinical cohorts, the use of SNPs as markers for CNPs and statistical challenges in testing CNVs for association with disease. Critical needs are high-resolution maps of common CNPs and techniques that accurately determine the allelic state of affected individuals.
Common deletion polymorphisms in the human genome.
McCarroll SA, Hadnott TN, Perry GH, Sabeti PC, Zody MC, Barrett J, Dallaire S, Gabriel SB, Lee C, Daly MJ, Altshuler DM.
Nature Genetics, 2006
The locations and properties of common deletion variants in the human genome are largely unknown. We describe a systematic method for using dense SNP genotype data to discover deletions and its application to data from the International HapMap Consortium to characterize and catalogue segregating deletion variants across the human genome. We identified 541 deletion variants (94% novel) ranging from 1 kb to 745 kb in size; 278 of these variants were observed in multiple, unrelated individuals, 120 in the homozygous state. The coding exons of ten expressed genes were found to be commonly deleted, including multiple genes with roles in sex steroid metabolism, olfaction and drug response. These common deletion polymorphisms typically represent ancestral mutations that are in linkage disequilibrium with nearby SNPs, meaning that their association to disease can often be evaluated in the course of SNP-based whole-genome association studies.