TECHNOLOGY: Searching the Druggable Genome

May 28, 2017

As the launching pad for all things druggable genome, the TargetCentral website is a trove of material for investigators seeking information about the Illuminating the Druggable Genome (IDG) Consortium, its partnering sites, and technology additions. Improving the understanding of the properties and functions of proteins that are currently unannotated within some of the most commonly drug-targeted protein families, such as G-protein coupled receptors, nuclear receptors, ion channels, and protein kinases, has been a guiding principle of the IDG program. The IDG Consortium builds a network of Knowledge Management Centers that collect and integrate data from across various resources to help prioritize the understudied protein targets, and connects these with Technology Development Centers that have brought forth new methodologies to shed light on these targets.

The Harmonizome curation project search tool tackles the important task of mining biomedical literature to extract and aggregate decades' worth of research findings into online databases, integrating known gene and protein knowledge. This is accomplished by distilling information from original datasets into attribute tables that define significant associations between genes and attributes, where attributes may be genes, proteins, cell lines, tissues, experimental perturbations, diseases, phenotypes, or drugs, depending on the dataset. Gene and protein identifiers are mapped to NCBI Entrez Gene Symbols, and attributes are mapped to appropriate ontologies. The output data can be integrated to perform many types of computational analyses for knowledge discovery and hypothesis generation. Harmonizome is a collection of information about genes and proteins from 114 datasets provided by 66 online resources.
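
The idea of an attribute table can be illustrated with a minimal Python sketch. It is not Harmonizome's actual pipeline: the records, the gene and attribute names, the scoring scale, and the threshold of 1.0 are all hypothetical placeholders chosen only to show how scored gene-attribute pairs might be reduced to a table of significant associations.

```python
from collections import defaultdict

# Hypothetical (gene, attribute, score) records standing in for one processed
# dataset. Names and scores are placeholders, not real measurements.
records = [
    ("GENE_A", "tissue_x", 2.1),
    ("GENE_A", "tissue_y", 0.3),
    ("GENE_B", "tissue_z", 1.8),
    ("GENE_C", "tissue_y", 1.5),
]


def build_attribute_table(records, threshold=1.0):
    """Keep only gene-attribute pairs whose absolute score meets the threshold.

    The threshold is an assumed cutoff for "significant"; a real dataset
    would need its own statistically justified criterion.
    """
    table = defaultdict(set)
    for gene, attribute, score in records:
        if abs(score) >= threshold:
            table[gene].add(attribute)
    return dict(table)


def genes_for_attribute(table, attribute):
    """Return the sorted list of genes associated with one attribute."""
    return sorted(gene for gene, attrs in table.items() if attribute in attrs)


table = build_attribute_table(records)
print(genes_for_attribute(table, "tissue_y"))
```

Once several datasets are reduced to tables of this shape, with shared gene symbols as keys, they can be combined or queried uniformly, which is the kind of integration the paragraph above describes.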

DrugCentral is a search tool and online drug information resource created and maintained by the Division of Translational Informatics at the University of New Mexico. When searching for either proprietary or non-proprietary drug names, DrugCentral provides information on active ingredients, chemical entities, pharmaceutical products, drug mode of action, indications, and pharmacologic action. Moreover, the database is searchable by disease name, returning drugs that are indicated for the treatment of the queried disorder. DrugCentral administrators regularly monitor the FDA, EMA, and PMDA for new drug approvals to keep the resource current. Limited information on discontinued drugs and on drugs approved outside the US is also available; however, regulatory approval information for these cannot be verified.

The Drug Target Ontology (DTO) project has developed a novel semantic framework to formalize knowledge about drug targets with a focus on the current IDG protein families (G-protein coupled receptors, nuclear receptors, ion channels, and protein kinases). The DTO was developed as a reference for drug targets with the long-term goal of creating a community standard that will facilitate the integration of diverse drug discovery information from numerous heterogeneous resources. In this search and viewer tool, protein classes are linked to tissues and diseases at different levels of confidence. DTO also contains drug target development classifications, a large collection of cell lines from the LINCS project, and relevant cell-disease and cell-tissue relations.

An interactive visualization tool called Target Importance and Novelty Explorer (TIN-X) helps users discover interesting targets in the context of a disease. The developers have incorporated natural language processing to identify disease and protein mentions in the text of PubMed abstracts. Results from queries enable users to explore the relationships between the novelty of potential drug targets and their importance to diseases. Novelty measures the relative scarcity of specific publications about a given concept (such as a target or a disease), while importance measures the relative strength of the association between two concepts.
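
To make the two measures concrete, the sketch below shows one plausible way to compute them from text-mined abstracts. It assumes a fractional counting scheme in which each abstract splits its weight evenly among the concepts it mentions. That scheme, the function names, and the toy abstracts are assumptions for illustration and are not confirmed here as the developers' exact formulas.

```python
from collections import defaultdict

# Hypothetical text-mining output: the target and disease mentions found in
# each abstract. "T1", "D1", etc. are placeholder identifiers.
abstracts = [
    {"targets": {"T1"}, "diseases": {"D1"}},
    {"targets": {"T1", "T2"}, "diseases": {"D1"}},
    {"targets": {"T1"}, "diseases": {"D2"}},
]


def novelty_scores(abstracts):
    """Novelty as the inverse of a target's fractional publication count.

    Assumption: an abstract mentioning k targets contributes 1/k to each,
    so rarely discussed targets end up with high novelty.
    """
    counts = defaultdict(float)
    for abstract in abstracts:
        for target in abstract["targets"]:
            counts[target] += 1.0 / len(abstract["targets"])
    return {target: 1.0 / count for target, count in counts.items()}


def importance_scores(abstracts):
    """Importance as the summed fractional co-mention weight of a pair.

    Assumption: an abstract's weight is split evenly across every
    (target, disease) pair it mentions.
    """
    scores = defaultdict(float)
    for abstract in abstracts:
        n_pairs = len(abstract["targets"]) * len(abstract["diseases"])
        for target in abstract["targets"]:
            for disease in abstract["diseases"]:
                scores[(target, disease)] += 1.0 / n_pairs
    return dict(scores)


novelty = novelty_scores(abstracts)
importance = importance_scores(abstracts)
print(novelty)
print(importance)
```

In this toy data, T2 appears in only one shared abstract and therefore scores as more novel than T1, while the T1-D1 pair accumulates the highest importance because the two are mentioned together most often.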