Bioinformatics combines biology, mathematics and computer science into a single field of study. Various tools are available to help biomedical researchers find the biochemical, genomic and medical information they need. Search programs, databases and analysis software are among the bioinformatics tools that support the analysis of proteomic and genomic data. This study uses a web-based tutorial of bioinformatics programs to analyze phospholipase C-gamma, regarded as the most important enzyme of fertilization. The tutorial uses bioinformatics tools freely available from the website of the National Center for Biotechnology Information (NCBI, part of the National Institutes of Health), including GenBank, PubMed, Gene and RefSeq.
- Gene is a database that provides information about genes, for example a brief summary of a given gene, information about its function, the common gene symbol, and links to sequence information, articles and websites for that gene.
- GenBank is a database that holds the historical record of gene sequences. It includes every sequence that has ever been published, even those published more than once, and is therefore regarded as a redundant database.
- RefSeq is also a database of gene sequences, but it is non-redundant because it is curated by NCBI. It therefore holds what NCBI considers the most reliable version of each gene sequence.
- ClustalW, a multiple sequence alignment program, is also used in this paper. It takes as input a series of gene or protein sequences that may be similar and evolutionarily related; such sequences are usually collected with a BLAST search. ClustalW then aligns them so that the largest number of matching residues line up with the fewest gaps.
- PLC gamma is regarded as the most important fertilization enzyme. In Xenopus laevis, fertilization proceeds as follows:
a) The sperm binds to the egg.
b) This binding activates the PLD1b enzyme.
c) PLD1b breaks phosphatidylcholine down into choline and phosphatidic acid.
d) Phosphatidic acid activates Src, a tyrosine kinase, which transfers a phosphate from ATP to other proteins in order to switch them on or off, a process called phosphorylation.
e) The activated Src then phosphorylates PLC gamma.
f) PLC gamma in turn cleaves the lipid PIP2 into DAG and IP3. IP3 diffuses away from the cell membrane and releases the calcium stored in the endoplasmic reticulum.
g) The calcium flowing into the cytoplasm then triggers the events of fertilization. It spreads from the site of sperm binding across the zygote, producing a wave of surface contraction, a wave of fertilization envelope elevation and a wave of cortical granule exocytosis, and initiating other developmental events that lead to cytokinesis, or first cleavage.
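The alignment goal described for ClustalW in the tool list above (line up as many matching residues as possible with as few gaps as possible) can be illustrated for the simpler case of two sequences. The sketch below is not ClustalW's algorithm: it uses the classic Needleman-Wunsch global alignment instead, and the scoring values (match +1, mismatch -1, gap -2) and the toy sequences are assumptions chosen only for illustration.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment of two sequences.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    The scoring values are illustrative defaults, not ClustalW's.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for pairing residue a[i-1] with residue b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # pair two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, top, bottom = global_alignment("ACGT", "AGT")
    print(s)
    print(top)
    print(bottom)
```

Multiple alignment programs such as ClustalW build on pairwise comparisons of this kind, but they add further steps (for example guide trees and residue-specific substitution scores) that are outside the scope of this sketch.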
This paper uses freely available web-based programs and databases to analyze genes and proteins together with their function and structure. So far, the human forms of the enzymes and their genes have been explored through the bioinformatics databases. The next step is to look for references to the PLCG enzyme in Xenopus laevis using the Entrez Gene database. To do this, return to the home page, enter the search string ‘Phospholipase C-gamma in Xenopus laevis’ and select ‘Gene’ as the database type. How many references are there for Xenopus PLCG? What is the preferred name of the PLCG enzyme in each of the reference links, and how do they differ from each other?
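The same Gene search can also be issued programmatically rather than through the web page. The sketch below assumes NCBI's E-utilities `esearch` endpoint with `db=gene` and JSON output; the helper names `build_gene_search_url` and `parse_esearch` are hypothetical, and the `retmax` value is an arbitrary choice. The code only builds the request URL and parses a response, so it makes no network call itself.

```python
import json
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search service
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(term, retmax=20):
    """Build an esearch URL that queries the Gene database for `term`."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_esearch(json_text):
    """Extract (hit count, list of record IDs) from an esearch JSON reply."""
    result = json.loads(json_text)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


if __name__ == "__main__":
    # Same search string as used on the Entrez home page
    print(build_gene_search_url("Phospholipase C-gamma in Xenopus laevis"))
```

Fetching the printed URL (for example with `urllib.request.urlopen`) and passing the response text to `parse_esearch` would give the number of Gene records, which corresponds to the number of references asked about above.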
Three lists are found under ‘Related Sequences’ for the first reference, as given below:
The above table includes the nucleotide base sequences along with their respective amino acids.
Now select the second reference to Xenopus PLCG and go to the ‘Pathways’ section under General gene information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database maintained by a Japan-based research institute that holds records of all recognized metabolic and signaling pathways. Every small-molecule metabolite and every protein in a pathway has its own entry in the database, which can be retrieved by clicking on that metabolite or protein in the pathway figure. The website helps to anticipate what would happen in the downstream events of a pathway when the protein under study becomes more or less active. Many links are available for determining the function of PLC gamma 1b in metabolism. Clicking on the link for inositol metabolism gives the following figure.
Figure 1:
The red arrow in this figure indicates the location of PLC gamma 1b in the first pathway; the enzyme (EC) number at this location is 3.1.4.11.
(Source: KEGG)
The red text inside the green box marks PLC.
STRING Database
The STRING database holds information on protein-protein interactions based on the following seven types of evidence:
1. Neighborhood: genes that encode functionally related proteins are expected to be kept close together in the genome (von Mering et al., 2003).
2. Gene fusion: genes that encode the members of an interacting protein complex tend to be fused into a single gene encoding one polypeptide (Enright et al., 1999).
3. Co-occurrence: functional partners usually show similar patterns of occurrence across the different organisms in which the genes are conserved (von Mering et al., 2003).
4. Co-expression: based on microarray data; functionally associated proteins are generally up-regulated or down-regulated together.
5. Experiments: physical binding has been demonstrated by experimental tests.
6. Databases: interactions are taken from the Pathway Interaction Database, KEGG and other curated databases.
7. Text mining: gene names that co-occur in a single sentence of PubMed abstracts are located through text mining.
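When several of these evidence types support the same protein pair, their individual confidence values have to be merged into one score. The sketch below shows one simple way to do this, a noisy-OR combination that treats the channels as independent: combined = 1 - (1 - s1)(1 - s2)... This is a simplification and an assumption, not STRING's exact procedure (STRING's published scoring also corrects for a prior probability of interaction, which is omitted here). The channel names and score values in the example are invented for illustration.

```python
def combine_evidence(scores):
    """Combine per-channel confidence scores (each in [0, 1]) with a noisy-OR.

    `scores` maps a channel name to its confidence. The result is the
    probability that at least one channel is right, assuming independence.
    """
    remaining_doubt = 1.0
    for channel, s in scores.items():
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score for {channel!r} must be in [0, 1]")
        remaining_doubt *= 1.0 - s
    return 1.0 - remaining_doubt


if __name__ == "__main__":
    # Invented example values, not taken from the STRING database
    example = {"experiments": 0.5, "textmining": 0.5}
    print(combine_evidence(example))
```

With this rule, two moderately confident channels reinforce each other, and a pair supported by no channel at all receives a combined score of zero.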
During fertilization, SH2 domains are involved in the activation of PLC gamma, which points to a tyrosine kinase as the activator (Rhee and Choi, 1992). The source of this kinase could be either the egg or the sperm. The kinase could arise in one of the following ways: stimulation of a receptor in the egg membrane that in turn activates a cytosolic kinase; activation of a receptor in the egg membrane through contact with a sperm ligand; introduction of a receptor kinase from the sperm membrane into the egg membrane when sperm and egg fuse; or introduction from the sperm into the egg cytoplasm of a protein that activates a kinase, such as a cytosolic kinase. Although receptor tyrosine kinases activate PLCg by phosphorylating it, it has also been found that PLCg can be activated in vitro through a non-catalytic interaction with a tyrosine kinase receptor (Hernandez-Sotomayor and Carpenter, 1993).
Results
The evidence for the possible protein interactors is presented in Table 2 below:
Table 2 above lists 21 proteins with evidence of interaction. The degree of interconnectivity within this group of proteins is expressed by a graph parameter called the clustering coefficient. The clustering coefficient ranges from 0 to 1: a value close to 1 means that the proteins are tightly interconnected, whereas a value close to zero means that they share few connections with one another.
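The clustering coefficient can be made concrete with a small sketch. It assumes the standard graph-theoretic definition: for one node, the fraction of pairs of its neighbors that are themselves connected, a value between 0 and 1; the network-level figure is taken here as the average over all nodes. The four-node network and the node names A to D are invented for illustration and are not taken from Table 2.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are directly connected.

    `graph` maps each node to the set of its neighbors (undirected).
    """
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pairs to test
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(graph):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


if __name__ == "__main__":
    # Invented toy network: A, B and C form a triangle; D hangs off A
    toy = {
        "A": {"B", "C", "D"},
        "B": {"A", "C"},
        "C": {"A", "B"},
        "D": {"A"},
    }
    print(local_clustering(toy, "A"))
    print(average_clustering(toy))
```

In the toy network, B and C sit in a fully connected triangle and score 1, D has a single neighbor and scores 0, and A falls in between because only one of its three neighbor pairs is linked.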
The result of the BLAT search against the human genome is shown in Table 3 below:
In the sequence returned by the BLAT search, given below, blue capitalized bases mark positions where the genomic and cDNA sequences match, whereas light blue bases mark the boundaries of gaps in either sequence.
The UCSC Genome Browser result for human platelets, shown alongside comparison data for other animals such as rhesus, mouse, dog, elephant, chicken and zebrafish, is given below:
The solid black boxes in the figure represent the SH2 domains, the darker grey boxes represent the SH3 domains and the light grey boxes represent the GST.
Querying the protein sequence database with the amino acid string in a BLASTP search yielded the following result. The table below lists the matching sequences together with the score of the alignment to each sequence.
The result below was obtained with ClustalW and shows the multiple sequence alignment for the following sequences:
The following figure was obtained from the STRING database and shows the network of proteins, indicating how the different proteins are linked together.
Figure 2: Relationship between signaling proteins.
Figure 2 above shows proteins surrounding PLC gamma that were found to interact with it. The differently colored lines between them indicate the different aspects of each link.
GRB2:
Growth factor receptor-bound protein 2; Adapter protein that provides a critical link between cell surface growth factor receptors and the Ras signaling pathway
Identifier: ENSP00000339007
GRB2’s sequence:
GRB2’s domains:
SH2 Domain:
The Src homology 2 (SH2) domain binds phosphotyrosine-containing polypeptides through two surface pockets. Specificity is provided by interactions with residues other than the phosphotyrosine. Only a single occurrence of an SH2 domain has been found in S. cerevisiae.
SH2 sequence:
HPWFFGKIPRAKAEEMLSKQRHDGAFLIRESESAPGDFSLSVKFGNDVQHFKVLRDGAGKYFLWVVKFNSLNELVDYHRSTSVS
SH3 Domain:
The SH3 domain binds its target proteins through sequences that contain proline and hydrophobic amino acids. Proline-containing polypeptides may bind to the SH3 domain in two distinct binding orientations.
SH3 sequence:
MEAIAKYDFKATADDELSFKRGDILKVLNEECDQNWYKAELNGKDGFIPKNYIEMKP
Discussion
The results indicate that PLC gamma is regulated via the SH2 and SH3 domains, possibly through interactions with other proteins.
The idea that the phosphoinositide pathway is central to egg activation has been supported by many studies over the last 20 years or more. Biochemical measurements in frog and sea urchin eggs showed a rise in the levels of phospholipids and inositol trisphosphate during fertilization (Turner et al., 1986). Moreover, PLC gamma 1 also plays a central role in the activation of eggs during fertilization, particularly in humans (Swann et al., 2006). It has also been found that the sperm activates G proteins, which in turn trigger the production of IP3, and Ca2+ is subsequently released through PLC.
References
Cohen, P. (2002) The origins of protein phosphorylation. Nat. Cell Biol., 4, E127–E30.
Enright, A.J., Iliopoulos, I., Kyrpides, N.C. & Ouzounis, C.A. Protein interaction maps for complete genomes based on gene fusion events. Nature 402, 86–90 (1999).
Mabuchi, I., Y. Hamaguchi, T. Kobayashi, H. Hosoya, S. Tsukita, and S. Tsukita. 1985. α-Actinin from sea urchin eggs: biochemical properties, interaction with actin, and distribution in the cell during fertilization and cleavage. J. Cell Biol. 100:375–383.
M. Yoshida, N. Kawano, and K. Yoshida, “Control of sperm motility and fertility: diverse factors and common mechanisms,” Cellular and Molecular Life Sciences, vol. 65, no. 21, pp. 3446–3457, 2008.
Parrington, J., K. Swann, V.I. Shevchenko, A.K. Sesay, and F.A. Lai. 1996. Calcium oscillations in mammalian eggs triggered by a soluble sperm protein. Nature (Lond.). 379:364–368.
Rhee, S.G., and K.D. Choi. 1992. Regulation of inositol phospholipid-specific phospholipase C isozymes. J. Biol. Chem. 267:12393–12396.
Rindflesch, T.C. et al. (1999) Mining molecular binding terminology from biomedical text. Proc. AMIA Symp., 127–131.
Swann, K, et al. (2006). PLCzeta(zeta): a sperm protein that triggers Ca2+ oscillations and egg activation in mammals. Semin Cell Dev Biol 17: 264–273.
T. Hunter, “Tyrosine phosphorylation: thirty years and counting,” Current Opinion in Cell Biology, vol. 21, no. 2, pp. 140–146, 2009.
Turner, P.R., et al. (1986). Regulation of cortical vesicle exocytosis in sea urchin eggs by inositol 1,4,5-trisphosphate and GTP-binding protein. J Cell Biol 102: 70–76.
von Mering, C. et al. STRING: a database of predicted functional associations between proteins. Nucleic Acids Res 31, 258–61 (2003).