Peter White, PhD, Assistant Professor of Pediatrics and Director of the Biomedical Genomics Core at Nationwide Children’s Hospital, led the winning team in CLARITY Undiagnosed, an international challenge that interpreted genomic data from five families with undiagnosed conditions. The team also took part in CLARITY’s first challenge in 2012, receiving special mention. Here, White describes the team’s approach to these “toughest of the tough” patients.
The CLARITY Undiagnosed challenge was markedly harder than the first CLARITY challenge. This time around, we were given whole-genome sequence datasets for five families and asked to produce clinically useful results through improved interpretation and reporting. It turned out to be a fantastic learning experience for all of us, and we will be using the collaborative approach we developed to solve genomics challenges in our own patients.
The cases were particularly demanding. The patients had already been through extensive clinical and genetic testing, so we had to look for changes in unusual genes or for rare genetic variants unlikely to be part of routine clinical genetic testing, including structural variants and non-coding variants. This information, with its uncertainties, then had to be distilled into a report that would be understandable, help guide the clinicians and provide value to the families.
The key to our approach was the diversity of the team we assembled. Members included geneticists, genomic researchers, bioinformaticians, big data informatics experts, genetic counselors, medical geneticists and other clinicians. Several had competed in the first CLARITY challenge and were eager to participate again.
We met for the first time in August and continued to meet weekly over the six weeks we were given for the project. During those first meetings, the clinicians and geneticists provided differential diagnoses for each patient after carefully reviewing their clinical information (including medical and family history, videos and laboratory, imaging and pathology results). They then suggested likely gene candidates to evaluate.
Given this guidance from the clinical members of the team, our bioinformaticians set out to analyze the whole-genome sequence data. A significant component of our approach involved using improved bioinformatics methodology to identify pathogenic genetic variation within these data in a way that would generate reliable results and simplify the interpretation process.
We had recently developed a computational pipeline, Churchill, that fully automates the analytical steps required to take raw sequencing output through the complex, computationally intensive process of genomic data analysis. For CLARITY Undiagnosed, we added functionality to enable the pipeline to detect both copy number variants and other structural variants in chromosomes. A significant advantage of Churchill is that it produces higher quality, more accurate variant calls than alternative pipelines, making analysis more reliable.
We next developed a comprehensive strategy for prioritizing variants for further team review. To help rule out common genetic variants unlikely to play a role in these families’ disease, we analyzed thousands of control genomes with the Churchill pipeline with support from GenomeNext, Amazon Web Services and Intel. Initially, we focused on changes in coding sequences, sorting candidate gene lists for the main affected member of each family according to suggested patterns of inheritance. These lists were then reviewed by our clinicians and other team members.
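For readers who want a concrete picture of this kind of prioritization, the following is a minimal Python sketch. It is an illustration only, not the Churchill pipeline or the code used in the challenge. The `Variant` fields, the 1% control-frequency cutoff, the genotype labels and the two inheritance models shown are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # All fields are hypothetical, chosen only for this illustration.
    gene: str
    control_freq: float  # allele frequency observed in control genomes
    is_coding: bool      # True if the change falls in a coding sequence
    proband_gt: str      # affected family member: "het" or "hom"
    mother_gt: str       # "ref", "het" or "hom"
    father_gt: str       # "ref", "het" or "hom"


def inheritance_models(v):
    """Return the (simplified) inheritance patterns a trio's genotypes fit."""
    models = []
    if v.proband_gt == "het" and v.mother_gt == "ref" and v.father_gt == "ref":
        models.append("de_novo")
    if v.proband_gt == "hom" and v.mother_gt == "het" and v.father_gt == "het":
        models.append("recessive_homozygous")
    return models


def prioritize(variants, max_control_freq=0.01):
    """Keep rare coding variants and group their genes by inheritance pattern.

    max_control_freq is an assumed cutoff, not a value from the article.
    """
    buckets = {}
    for v in variants:
        # Drop non-coding changes and variants common in the controls.
        if not v.is_coding or v.control_freq > max_control_freq:
            continue
        for model in inheritance_models(v):
            buckets.setdefault(model, []).append(v.gene)
    # One sorted, de-duplicated candidate gene list per inheritance pattern.
    return {model: sorted(set(genes)) for model, genes in buckets.items()}
```

A real analysis would also consider compound heterozygous and X-linked patterns, variant consequence and call quality; the sketch only shows how a control-frequency filter and an inheritance sort combine to produce candidate gene lists for review.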
When it became clear that the families had few, if any, “good” candidate variants within the coding regions of their genomes, we expanded the search to include copy number changes, even though, for several of the families, a clinical microarray had been performed with no significant results. In addition, we analyzed non-coding regulatory regions for high priority candidate genes in several families, guided by the RegulomeDB database, but found no likely candidates with this approach.
At this point, having exhausted our search for variants within our candidate gene lists, we were tempted to just give up. With no obvious genetic variants that we could link back to the patients’ clinical features, we had to get more creative.
Our clinicians went back and identified key clinical features of each patient’s condition, which we used to conduct a more systematic, phenotype-driven search for gene variants. Our big data experts mined links between these individual clinical features and the entire list of potentially pathogenic genetic variants for each patient. This yielded an expanded list of variants for each family.
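A phenotype-driven search of this kind can be pictured with a small Python sketch. This is an assumption-laden illustration rather than the method our big data experts actually used: the function name `rank_by_phenotype`, the gene-to-feature lookup table and the simple overlap score are all hypothetical.

```python
def rank_by_phenotype(patient_features, candidate_genes, gene_to_features):
    """Rank genes by the fraction of the patient's features they are linked to.

    gene_to_features is a hypothetical lookup (gene -> set of clinical
    features reported for that gene), e.g. built from literature mining.
    """
    patient = set(patient_features)
    if not patient:
        return []
    scored = []
    for gene in candidate_genes:
        shared = patient & set(gene_to_features.get(gene, ()))
        if shared:
            # Score = share of the patient's key features matched by this gene.
            scored.append((gene, len(shared) / len(patient), sorted(shared)))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda row: (-row[1], row[0]))
    return scored
```

Genes with no known link to any of the patient's features drop out of the ranking, so a scheme like this can only surface genes that already carry some phenotype annotation. Variants in genes never previously linked to disease still need manual review of the literature.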
We carefully considered each new variant as a team, searching publications and clinical databases. In some cases, the variants were in genes that had never previously been linked to human disease, but had evidence in the literature that supported a potential role for the variant in the patient’s disease.
We gathered all this data into clinical summaries, using guidelines for classifying sequence variants and for reporting of incidental findings developed by the American College of Medical Genetics and Genomics. Finally, our genetic counselors drafted letters for each family, compassionately explaining what we had done and what was found.
We are thankful to Boston Children’s and Harvard Medical School for reaching out to the genomics community to help with diagnoses for these patients. We are already beginning to apply this experience to our own work and patients. Beginning this month, we are launching a new research genomics program at Nationwide Children’s Hospital, made possible through The Nationwide Pediatric Innovation Fund and the CLARITY Undiagnosed challenge prize money we received.
We especially thank the families for their bravery in sharing their stories and their willingness to give us and all the participating teams an opportunity to provide them some answers.
The Nationwide Children’s Hospital Team
Peter White, PhD (Team Leader). Principal Investigator, Center for Microbial Pathogenesis. Director of Molecular Bioinformatics and Director, Biomedical Genomics Core, The Research Institute at Nationwide Children’s Hospital and Assistant Professor, Department of Pediatrics, The Ohio State University.
Donald J. Corsmeier, DVM (Project Coordinator). Postdoctoral Scientist, The White Lab, The Research Institute at Nationwide Children’s Hospital.
Gail E. Herman, MD, PhD. Principal Investigator, Center for Molecular and Human Genetics, The Research Institute at Nationwide Children’s Hospital and Professor, Department of Pediatrics, The Ohio State University.
Kim L. McBride, MD, MS. Principal Investigator, Center for Cardiovascular and Pulmonary Research. Director, Cell Line Core Lab, The Research Institute at Nationwide Children’s Hospital and Associate Professor, Department of Pediatrics, The Ohio State University.
Kevin M. Flanigan, MD. Principal Investigator, Center for Gene Therapy, The Research Institute at Nationwide Children’s Hospital and Professor, Department of Pediatrics and Neurology, The Ohio State University.
Robert E. Pyatt, PhD. Assistant Director, Cytogenetics/Molecular Genetics Laboratory, Nationwide Children’s Hospital and Associate Professor-Clinical, Department of Pathology, The Ohio State University.
Elizabeth Varga, MS, LGC. Licensed Genetics Counselor, Genetic and Genomic Services Coordinator, Co-Director of Personalized Medicine, Hematology/Oncology/BMT, Nationwide Children’s Hospital.
Sayaka Hashimoto, MS, LGC. Genetics Counselor, Laboratory Medicine, Nationwide Children’s Hospital.
Sara M. Fitzgerald-Butt, MS, LGC. Licensed Genetic Counselor, The Heart Center and The Research Institute at Nationwide Children’s Hospital and Clinical Assistant Professor, Department of Pediatrics, The Ohio State University.
Benjamin J. Kelly, MS. Bioinformatics Scientist, The White Lab, The Research Institute at Nationwide Children’s Hospital.
James Fitch. Bioinformatics Scientist, The White Lab, The Research Institute at Nationwide Children’s Hospital.
Harkness Kuck. Bioinformatics Scientist, The White Lab, The Research Institute at Nationwide Children’s Hospital.
Soheil Moosavinasab. NLP & Big Data Specialist, Research Information Solutions and Innovation Research & Development, Nationwide Children’s Hospital.
Yungui Huang, PhD, MBA. Director, Research Information Solutions and Innovation Research & Development, The Research Institute at Nationwide Children’s Hospital.
Simon Lin, MD, MBA. Chief Research Information Officer, The Research Institute at Nationwide Children’s Hospital.