Abstract
The Human Variome Project (HVP) is a global initiative to collect, curate and make accessible data on genetic variations that affect human health. The HVP was launched in 2006 following the recommendations of a meeting co-sponsored by the World Health Organization (WHO) in Melbourne, Australia. The meeting produced at least 96 recommendations, which were grouped into 12 categories. The idea of the HVP was prompted by the fact that a great deal of work on the curation of genetic variation data was under way across the globe, but the efforts lacked coordination and often resulted in incomplete or inaccurate data. The HVP is therefore a collaborative effort involving many countries, institutions and individuals. The general aim of the HVP is to improve health outcomes through the synchronization and standardization of data on human genetic variation and its impact on health. The specific objectives are to encourage the development and adoption of standards; reach consensus on and implement ethical requirements (including informed consent forms and approaches for protecting patient confidentiality); develop systems for automated data submission; develop community education and communication programs; enable participation by developing countries; support curation processes; promote evidence-based genetic medicine; and create usable systems for contribution, curation, search and retrieval. This research paper will describe the HVP, focusing on the recommendations that formed its foundation, its objectives, and its tools, scope and data sources. The paper will also explore the benefits of the HVP.
1.0 Introduction
1.1 Background Information
Since the early 1970s there have been concerted efforts to collect and document human gene variations and their effects on human health for diagnostic purposes. However, the gathering and curation of these variations lacked coordination and standardization, which limited and complicated their use. With the myriad possible variations in the approximately twenty-four thousand genes in the human genome, standardizing the criteria for collecting these variations would be extremely useful. The incomplete collection and verification of these variations in different parts of the world underscores the need for a universal standard system that is easily accessible to all. It is this and other problems that prompted the World Health Organization and other organizations to sponsor a meeting in Melbourne, Australia on June 20-23, 2006, from which the idea of the Human Variome Project (HVP) emerged. The meeting was organized into eight sessions: the clinic and phenotype, research laboratory, diagnostic laboratories, informatics, curation and collection, federation and integration, relevance to the emerging world, and funding and sustainability. The meeting produced and documented 96 recommendations as the foundation and framework for the future work of the HVP.
The main objective of the Human Variome Project is to systematically collect information on human gene variations with their associated phenotypes and make it easily accessible to those who need it. The project is a global collaboration in which different human genetics projects, developed, funded and implemented by different working groups all over the world, interact with one another. The working groups develop the finer details of their projects, but the entire HVP is founded on the recommendations of the 2006 Melbourne meeting and on work already done in different parts of the world. Coordination of the project aims at eliminating duplication, misreporting and the reporting of data that are not useful. The HVP will therefore collect, curate and electronically record gene variations (alleles), in the form of either polymorphisms or mutations in the human genome, with emphasis on pathological conditions (disease) and other phenotype associations. The HVP was preceded by the Human Genome Project, which generated a database of the nucleotide sequence of the human genome. The availability of this sequence database and other databases, as well as a systematic nomenclature for genes, exons, mutations and other vocabulary in genetics, makes the HVP feasible.
Being a collaborative effort, the HVP complements and coordinates the ongoing resequencing efforts that continue to contribute to variation databases such as dbSNP and to the growing body of knowledge from genome association studies. The massive body of data already available provides excellent information on variations compatible with life as well as variations contributing to common disease. More important to the project is the verification of variations or mutations through observation of rare phenotypes, which is essential for capturing information that would otherwise be missed. The available databases fall into two major categories: the locus-specific databases (LSDBs) and the genomic repositories. LSDBs focus on variations with a major or “causative” direct effect on one or more phenotypes related to disease. The second group of databases consists of data on neutral variations, those that slightly modify the resulting protein and those indirectly associated with disease. In some cases the latter type of database contains gene mutations without distinguishing those that are pathogenic from those that are not. Such databases also have little detail or do not contain all mutations, and the data are often delayed before publication. The HVP therefore draws on some of these databases, verifies the significant variations, especially with regard to disease, and produces a common and synchronized database that will hopefully have no “grey” areas.
1.2 Research questions
This research paper will seek to address the following questions about the HVP: What is the Human Variome Project? What are the objectives of the HVP? What is the scope of the HVP? What are the sources of HVP data? Finally, what are some of the tools that are useful to the HVP?
2.0 Methodology
The research paper will be an informative retrospective study. As such, the method employed will be a review of the available literature on the HVP. The literature to be reviewed and analyzed will include journal articles and reports on the ongoing HVP. It is important to note that this research paper is not empirical research, so statistical analysis of results will not be employed. Twelve literature sources will be critically and analytically reviewed to give an accurate and detailed description of the Human Variome Project. From the twelve literature sources the research paper will seek to define the HVP with emphasis on its objectives, scope, progress, tools and benefits. Once the HVP has been described, the research paper will give a brief conclusion.
3.0 Description of the HVP
The HVP was launched in Melbourne, Australia in June 2006 as a global initiative to collect and curate genetic variations and their disease-associated phenotypes. The data collected by the HVP is to be made available to healthcare providers such as clinicians and diagnosticians, as well as to genetic researchers, with the aim of improving the quality of healthcare provision. It is important to note that the collection and curation of genetic variations began a decade before the launch of the HVP, with the Human Genome Variation Society participating in most of the initial work. However, the HVP brought in the much-needed coordination and standardization. The launch of the project was aimed at attracting major funding as well as international collaboration. Richard Cotton, the Founder and Director of the Genomic Disorders Research Centre (GDRC) in Melbourne, is often credited with instigating the HVP after using different mutation databases that were rapidly expanding, becoming diverse and thus frustrating to use.
Though there are several objectives of the HVP, as will be seen later, the main aim of the project is to facilitate rapid diagnosis of rare genetic disorders as well as the development of effective pharmacological agents to treat these conditions.
The motivation for launching the HVP was the need to coordinate and standardize the ongoing efforts to collect and curate the gene variations associated with specific disease-associated phenotypes (symptoms). About 100,000 mutations have been discovered in humans, representing only 5% of the predicted mutations, and only about 2000 of these mutations have been linked to disease, so a lot of work is cut out for the HVP. Adding to the complexity of the HVP's work is the fact that a single disease can be associated with multiple gene variations; for instance, over 100 mutations have been associated with cystic fibrosis. While it is easy to detect some genetic disorders at the molecular level, for example triplet repeat expansion in Huntington disease, detection is more complicated in neurogenetic disorders, in which the genotype-phenotype interactions are rather complex. The neurological phenotypes are complex and varied, and may evolve with time. Two individuals with the same condition may have different phenotypes, and a particular phenotype may be present in different conditions. Furthermore, mutations at the mitochondrial level may complicate the understanding of genetic disorders. With such complexities, coordination and standardization are paramount; otherwise the work could result in haphazard, inapplicable data. It is hoped that complete cataloguing of the genetic variations will prove invaluable in dealing with genetic disorders.
The June 2006 Melbourne meeting that culminated in the launch of the HVP brought together scholars, clinicians and representatives from funding organizations, the World Health Organization (WHO) and the United Nations Educational, Scientific and Cultural Organization (UNESCO). The participants, who were organized into eight working groups, made 96 recommendations and grouped them into 12 classes that formed the basis for the HVP. In order to fully describe the HVP it is imperative to look briefly at some of these recommendations.
4.0 Recommendations of the First HVP Meeting
The Melbourne meeting recommended that the coordinating office, which had coordinated the activities of the Human Genome Variation Society since 1996, be incorporated into the HVP. The office, located at the Genomic Disorders Research Centre in Melbourne, was given the responsibility of developing oversight and steering committees and working groups. The office, in conjunction with the working groups, would develop a business plan for fundraising, lobby governments to facilitate data collection, generate a list of WHO-accredited databases and communicate with all stakeholders.
The meeting also recommended accurate and complete documentation of clinical phenotypes (symptoms). It was concluded that bioinformaticians and clinicians would develop systems for evaluating and accrediting the various data types and users. It was also recommended that standards, guidance and incentives be made available to all those involved in the collection, reporting and interpretation of data. This would also include structured descriptions of phenotypes for specific diseases using data sheets. The third set of recommendations encouraged the participation of diagnostic laboratories by having consent forms and archiving as part of quality control and licensing. This set of recommendations also encourages the inclusion of less wealthy societies. The fourth category of recommendations deals with supporting disease-specific databases as well as networks of collaboration.
There were also recommendations on the centralization of databases. These recommendations include the use of standard data models, the placement of variations on assemblies with links to LSDBs, the design of unified LSDB sets with individually curated and user-friendly interfaces, and maximal automation and linkage between databases. The sixth set of recommendations had to do with curation: collecting from LSDBs accredited by the HVP and using predetermined standard reference sequences, nomenclature and standards. Formation of a federation of curators with informatics support and good software would facilitate these recommendations. Another vital recommendation was the involvement of developing countries. It was recommended that the HVP be inclusive, develop capacity and skills in the developing world and enhance national and international collaboration. There were also recommendations dealing with funding and sustainability, nomenclature and standards, ethics and education, reporting in publications and journals, and translation.
5.0 Objectives of the HVP
During the same Melbourne meeting in 2006 the mission and the objectives of the HVP were drawn up. The main aim of the HVP is to improve health outcomes through the unification and standardization of data on human genetic variation and its impact on health. The project supports the application of these data in the clinical environment by developing resources to carry out ten specific objectives. The key objectives include capturing and cataloguing all gene variations associated with human disease through gene-specific curation in a central location. This would also include the creation of multiple sites in different countries to ensure maximum security and integrity of data and to allow comprehensive gene searches using a common interface. The HVP also provides standardization of gene variation nomenclature and reference sequences, and supports systems enabling diagnostic labs to use and contribute to knowledge of human gene variation. Another role of the HVP is to establish a system that ensures sufficient curation of gene variation information from gene-specific, disease-specific or country-specific databases, thereby improving accuracy, reducing errors and developing a comprehensive database of all human genes.
The HVP also facilitates the development of software to collect and exchange human variation data among various databases. The project provides a structured mechanism for clinicians to determine health outcomes associated with genetic variation, thus linking the users and providers of the data. The project creates a support system for research labs to collect and log discovered phenotypic and genotypic data in a free, unrestricted and open-access system. Another objective of the HVP is to formulate ethical standards that ensure free access to all human gene variation data, to be used for the public good, and that address the needs of indigenous communities. The HVP also supports the participation of developing countries in the collection, analysis and sharing of human variation data by building capacity and skills in these countries. The HVP establishes a communication and education program to collect and disseminate information related to human gene variation across all countries. Finally, the HVP continues to carry out research on human genetic variation and presents the findings to users for the benefit of all. All the objectives of the HVP are to be achieved by executing the at least 96 recommendations of the 2006 Melbourne meeting.
6.0 The participants and scope of the HVP
The project involves thousands of people in a coordinated effort using specifically designed tools and protocols. Different organizations are involved in the HVP: some provide funding, others provide expertise and others are involved in capacity building. Institutions such as WHO, UNESCO, COSMIC, National Health Institutions, the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) and the National Cancer Institute, to mention just a few, have been engaged in the HVP efforts. It is however important to note that, as per WHO policy, drug companies are not allowed to participate because of conflicting interests, but they may be involved in the project individually after discussions with legal experts at WHO. An incentive is provided for the submission of mutation information, with a tracing system being used to verify its origin. It is worth noting that several publications and journals are involved in the project by giving credit and publishing findings from gene variation research. The Adopt-a-Gene program, run through the HVP, encourages industry and patients to sponsor the curation efforts. In addition, the project is international and seeks to involve as many countries as possible. Approximately 20 countries, including the USA, Japan and the UK, are involved in the project, with each having established a national repository (HVP Country Node).
The HVP initially included information on single-gene disorders, with the number of these diseases predicted to increase from the currently identified 6000 to about 23,000 as new disorders are discovered. The clinician or diagnostician who discovers a variation in a patient decides whether it is causing the disease using the algorithms already available. The algorithms predict pathogenicity based on features such as evolutionary conservation, frequency, protein structural changes and the nature of the missense change. There are ongoing efforts to develop an ideal algorithm that combines all of the above features to give the probability of pathogenicity. The HVP annotates the gene variations, prioritizing them based on the prevalence, severity and treatability of the disorder. Different tools and databases are used to capture all disease mutations in the project.
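The prioritization by prevalence, severity and treatability described above can be illustrated with a minimal sketch. The source names only the three criteria; the 0 to 1 scales, the equal weighting, the `DisorderAnnotation` record and the placeholder gene labels are all assumptions made for illustration, not the HVP's actual scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisorderAnnotation:
    """Hypothetical record; each score is an assumed value between 0 and 1."""
    gene: str
    prevalence: float
    severity: float
    treatability: float


def priority_score(annotation: DisorderAnnotation) -> float:
    # Assumption: the three criteria are weighted equally.
    return (annotation.prevalence + annotation.severity + annotation.treatability) / 3


def prioritize(annotations: list[DisorderAnnotation]) -> list[DisorderAnnotation]:
    # Highest combined score first.
    return sorted(annotations, key=priority_score, reverse=True)


# Placeholder labels, not real genes or real scores.
queue = prioritize([
    DisorderAnnotation("GENE_B", prevalence=0.2, severity=0.9, treatability=0.1),
    DisorderAnnotation("GENE_A", prevalence=0.9, severity=0.8, treatability=0.7),
    DisorderAnnotation("GENE_C", prevalence=0.5, severity=0.5, treatability=0.8),
])
print([a.gene for a in queue])
```

A real scheme would likely weight the criteria differently and draw the scores from curated clinical data rather than hand-assigned numbers.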
7.0 The data sources and tools in the HVP
The HVP depends on data already available from databases such as OMIM, GenBank, dbSNP, dbGaP and HapMap, and adds value to these databases. The information from gene-specific, disease-specific or country-specific databases is evaluated and verified to ensure that it is accurate and that there is no duplication. In addition to these electronic databases, data is also obtained from refereed publications, clinicians' formal patient records, registries of specific hereditary diseases, and diagnostic and research laboratories. This wide range of data sources is intended to ensure that all disease-causing mutations are captured and documented to facilitate rapid diagnosis of genetic disorders. The data gathered and documented will save the valuable time otherwise spent going through many publications and databases to determine whether a mutation found in a patient has been characterized. This will in turn help ensure that diagnosis and prognosis are based on evidence rather than conjecture and guesswork. The project not only makes information available but also affords many individuals and organizations an opportunity to contribute to the growing body of knowledge on human genetic variation.
The 2006 Human Genome Variation Society scientific meeting discussed five tools used in the HVP to evaluate the pathogenicity of genetic variants. The first tool multiplies the odds ratios (ORs) from several lines of evidence; a variant with a combined OR > 20 is considered deleterious and a variant with a combined OR < 0.05 is considered neutral. The second tool, Mutalyzer, is designed to name gene variants consistently and so improves the description of DNA sequence changes in mutation databases. The third tool, Splicing Sequences Finder, is used to analyze nucleotide sequences for traditional splicing signals that affect transcription. The fourth tool, PhenCode, links genotypes, phenotypes and clinical data, forming a rather complete loop. The final tool described is MutationView, an integrated database for mutations in monogenic diseases.
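The odds-ratio rule used by the first tool can be made concrete with a short sketch. The thresholds of 20 and 0.05 come from the description above; the function name, the "unclassified" label for values between the thresholds and the example odds ratios are assumptions made for illustration only.

```python
from math import prod

DELETERIOUS_THRESHOLD = 20.0   # combined OR above this: deleterious
NEUTRAL_THRESHOLD = 0.05       # combined OR below this: neutral


def classify_variant(odds_ratios: list[float]) -> tuple[float, str]:
    """Multiply the ORs from independent lines of evidence and label the result."""
    if not odds_ratios:
        raise ValueError("at least one odds ratio is required")
    if any(odds_ratio <= 0 for odds_ratio in odds_ratios):
        raise ValueError("odds ratios must be positive")

    combined = prod(odds_ratios)
    if combined > DELETERIOUS_THRESHOLD:
        label = "deleterious"
    elif combined < NEUTRAL_THRESHOLD:
        label = "neutral"
    else:
        # Assumed label: the source does not say how intermediate values are treated.
        label = "unclassified"
    return combined, label


# Illustrative odds ratios, not real evidence.
print(classify_variant([4.0, 3.0, 2.5]))   # (30.0, 'deleterious')
print(classify_variant([2.0, 1.5]))        # (3.0, 'unclassified')
```

Multiplying the ratios treats the lines of evidence as independent, which is why several moderately strong observations can together push a variant past a threshold.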
8.0 Conclusions
The HVP is an ambitious global effort to collect, curate and make accessible data on genetic variations affecting human health, and it arose from the recommendations of a meeting co-sponsored by WHO in 2006. The HVP is aimed at coordinating the ongoing curation efforts and thereby facilitating rapid diagnosis of genetic disorders. Various tools and data sources are employed in the HVP, leading to a better understanding of how gene variations directly or indirectly affect human health. The HVP involves the collaboration of different organizations, individuals and countries to ensure that the information is comprehensive and credible and to prevent duplication. The objectives, participants and scope of the HVP are clearly defined, with different working groups and committees meeting and publishing their progress reports regularly.
References
AlAama, J., Smith, T. D., Lo, A., Howard, H., Kline, A. A., Lange, M., et al. (2011). Initiating a Human Variome Project Country Node. Human Mutation, 1-24.
An. (2006, August). Human Variome Project to identify all human gene mutations launched. Personalized Medicine, p. 227.
Appelbe, W., Auerbach, A. D., Becker, K., Bodmer, W., Boone, D. J., Boulyjenkov, V., et al. (2007). Recommendations of the 2006 Human Variome Project meeting. Melbourne: Nature Publishing Group.
Cotton, R. G., & Hardman, L. (2008, March 4). Human Variome Project: progress and plans. Personalized Medicine, p. 99.
Cotton, R. G., Aqeel, A. I., Al-Mulla, F., Carr, P., Claustres, M., Ekong, R., et al. (2009). Capturing all disease-causing mutations for clinical and research use: Toward an effortless system for the Human Variome Project. Genetics in Medicine, 843–849.
Cotton, R. G., Auerbach, A. D., Axton, M., Barash, C. I., Berkovic, S. F., Brookes, A. J., et al. (2008, November 7). The Human Variome Project. National Institutes of Health-Author Manuscript, pp. 861-862.
Editorial. (2007, April 23). What is the Human Variome Project? Nature Genetics, pp. 423-422.
Howard, H. J., Horaitis, O., Cotton, R. G., Vihinen, M., Dalgleish, R., Robinson, P., et al. (2010). The Human Variome Project (HVP) 2009 Forum “Towards Establishing Standards”. Vienna: Wiley-Liss, Inc.
Kohonen-Corish, M. R., Al-Aama, J. Y., Auerbach, A. D., Axton, M., Barash, C. I., Bernstei, I., et al. (2010). How to Catch All Those Mutations—The Report of the Third Human Variome Project Meeting. The third Human Variome Project (HVP) Meeting “Integration and Implementation” (pp. 1374–1381). Paris: Wiley-Liss, Inc.
Kohonen-Corish, M., Weber, T. K., Lindblom, A., & Macrae, F. (2009). Report on Combined Meeting of the International Society for Gastrointestinal Hereditary Tumours, the Human Variome Project and the National Cancer Institute Colon Cancer Family Registry. Combined Meeting of the International Society for Gastrointestinal Hereditary Tumours, the Human Variome Project and the National Cancer Institute Colon Cancer Family Registry (pp. 705-711). Duesseldorf: Springer Science+Business Media.
Oetting, W. S. (2006). Human Mutation. The 2006 Human Genome Variation Society Scientific Meeting (pp. 517-521). New Orleans, Louisiana: Wiley-Liss, Inc.
Ring, D. Z., Kwok, P.-Y., & Cotton, R. G. (2006, October). Human Variome Project: an international collaboration to catalogue human genetic variation. Pharmacogenomics, p. 96.