Data Source for Gene Network Dataset

1. Data collection

    Gene networks play an important role in causing complex diseases. To examine schizophrenia candidate genes in the context of the protein-protein interaction (PPI) network, we first constructed a comprehensive human PPI network by integrating protein-protein interaction pairs from six databases: Human Protein Reference Database (HPRD) (Peri et al. 2003), BIND (Bader et al. 2001), IntAct (Hermjakob et al. 2004), MINT (Zanzoni et al. 2002), Reactome (Matthews et al. 2009) and DIP (Salwinski et al. 2004). We then mapped our core genes onto this PPI network and found that 32 of the core genes are present in the network. In terms of network topology, proteins on a shortest path tend to share the same or similar biological processes (Managbanag et al. 2008). We therefore identified the shortest paths between every pair of the 32 core genes. Finally, we included all genes whose encoded proteins lie on these shortest paths. A total of 1035 genes were selected based on this network feature.
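
    The selection step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names (build_network, bfs_distances, genes_on_shortest_paths) and the toy gene labels are hypothetical, the network is treated as undirected and unweighted, and the sketch assumes that every gene on any shortest path between two core genes is kept, including the core genes themselves. The source does not say how ties between equally short paths were handled.

```python
from collections import deque
from itertools import combinations


def build_network(*pair_lists):
    """Merge interaction pairs from several sources into one undirected graph."""
    adjacency = {}
    for pairs in pair_lists:
        for a, b in pairs:
            if a == b:
                continue  # assumption: self-interactions are ignored
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def bfs_distances(adjacency, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def genes_on_shortest_paths(adjacency, core_genes):
    """Genes lying on at least one shortest path between two core genes."""
    core = [g for g in core_genes if g in adjacency]  # core genes found in the network
    dist = {g: bfs_distances(adjacency, g) for g in core}
    selected = set()
    for a, b in combinations(core, 2):
        if b not in dist[a]:
            continue  # the two core genes are not connected
        d_ab = dist[a][b]
        for node, d_a in dist[a].items():
            # node is on a shortest a-b path iff d(a, node) + d(node, b) == d(a, b)
            if node in dist[b] and d_a + dist[b][node] == d_ab:
                selected.add(node)
    return selected


# Toy data standing in for two of the six databases (hypothetical labels).
source_one = [("A", "B"), ("B", "C")]
source_two = [("C", "D"), ("A", "E"), ("E", "C"), ("D", "F")]
network = build_network(source_one, source_two)

# "Z" plays the role of a core gene that is absent from the network.
selected = genes_on_shortest_paths(network, ["A", "D", "Z"])
print(sorted(selected))  # ['A', 'B', 'C', 'D', 'E']
```

    Running one breadth-first search per core gene and then testing the distance identity avoids enumerating paths explicitly, which matters when many equally short paths exist between a pair.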

2. Scoring system

    Each gene on a shortest path between a pair of core genes was assigned a score describing its closeness to the phenotype (schizophrenia in this case). We modified the method of Wu et al. (2008) to calculate the closeness of a gene on a shortest path to schizophrenia (i.e., to the core genes). The closeness of a gene g to a schizophrenia core gene g' is calculated with a Gaussian kernel of L, the distance between g and g' along the shortest path. The final score of the gene is then the sum of its closeness to all core genes:

        score(g) = Σ_{g' ∈ C} closeness(g, g')

    where C is the set of core genes.
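
    As a rough illustration of the scoring step, the sketch below sums a Gaussian kernel of the shortest-path distance over all core genes. The exact kernel used by the authors is not given in the text above, so the form exp(-L^2) is an assumption, as are the function names (hop_distances, closeness, score_genes) and the toy network. The sketch also assumes that a core gene unreachable from g contributes nothing, and that a core gene contributes to its own score with L = 0.

```python
import math
from collections import deque


def hop_distances(adjacency, source):
    """Shortest-path hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(distance):
    """Assumed Gaussian kernel of the distance L: exp(-L^2)."""
    return math.exp(-distance ** 2)


def score_genes(adjacency, core_genes, candidates):
    """Score each candidate as the sum of its closeness to all core genes."""
    scores = {gene: 0.0 for gene in candidates}
    for core in core_genes:
        if core not in adjacency:
            continue  # core gene absent from the network
        dist = hop_distances(adjacency, core)
        for gene in candidates:
            if gene in dist:  # unreachable core genes add nothing (assumption)
                scores[gene] += closeness(dist[gene])
    return scores


# Toy undirected network: a simple chain A - B - C (hypothetical labels).
toy_network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
scores = score_genes(toy_network, core_genes=["A", "C"], candidates=["A", "B", "C"])
for gene in sorted(scores):
    print(gene, round(scores[gene], 4))
```

    In this toy chain, B sits one step from both core genes and scores 2·exp(-1), while each core gene scores exp(0) + exp(-4). Because the kernel decays quickly with distance, core genes more than two or three steps away add almost nothing to a gene's score.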

3. Score distribution

    Figure 1. Score distribution of the PPI network dataset.

References

  • Bader, G.D., Donaldson, I., Wolting, C., Ouellette, B.F., Pawson, T., and Hogue, C.W. (2001) BIND--The Biomolecular Interaction Network Database. Nucleic Acids Res. 29: 242-245.
  • Hermjakob, H., Montecchi-Palazzi, L., Lewington, C., Mudali, S., Kerrien, S., Orchard, S., Vingron, M., Roechert, B., Roepstorff, P., Valencia, A., et al. (2004) IntAct: an open source molecular interaction database. Nucleic Acids Res. 32: D452-D455.
  • Matthews, L., Gopinath, G., Gillespie, M., Caudy, M., Croft, D., de Bono, B., Garapati, P., Hemish, J., Hermjakob, H., Jassal, B., et al. (2009) Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res. 37: D619-D622.
  • Peri, S., Navarro, J.D., Amanchy, R., Kristiansen, T.Z., Jonnalagadda, C.K., Surendranath, V., Niranjan, V., Muthusamy, B., Gandhi, T.K., Gronborg, M., et al. (2003) Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res. 13(10): 2363-2371.
  • Salwinski, L., Miller, C.S., Smith, A.J., Pettit, F.K., Bowie, J.U., and Eisenberg, D. (2004) The Database of Interacting Proteins: 2004 update. Nucleic Acids Res. 32: D449-D451.
  • Wu, X., Jiang, R., Zhang, M.Q., and Li, S. (2008) Network-based global inference of human disease genes. Mol. Syst. Biol. 4.
  • Zanzoni, A., Montecchi-Palazzi, L., Quondam, M., Ausiello, G., Helmer-Citterich, M., and Cesareni, G. (2002) MINT: a Molecular INTeraction database. FEBS Lett. 513(1): 135-140.

Copyright © Bioinformatics and Systems Medicine Laboratory All Rights Reserved since 2009.