February 2013

Overview:
This month I am continuing to work with Francis on his project. I am going to build a gene ontology from the data I analyzed last month. The data consist of genes, each with a UniGene identifier, that interact with Protein1. Protein1 is the main focus of Francis's work, and he would like a gene ontology of the selected genes. A gene ontology is a controlled vocabulary for describing related genes in any organism, in this case //Danio rerio// (zebrafish). The ontology is made up of three parts: Molecular Function, Biological Process and Cellular Component. The terms are arranged as a graph, with these three terms at the roots. I will be doing the gene ontology in MATLAB because the Bioinformatics Toolbox has a built-in function for working with gene ontologies, **geneont**. The commands to type are:

GO = geneont('live', true)
get(GO)

This brings up information about the gene ontology object as a whole, not about my actual data. To look up a particular term, use GO(accession number).terms, which brings up the details of that specific term.
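A minimal sketch of the two steps above, assuming the Bioinformatics Toolbox is installed and an internet connection is available for the live download. The term number 5840 (ribosome) is only an illustrative choice borrowed from the MathWorks example; it is not one of the genes in the Protein1 data.

```matlab
% Download the current ontology from the Gene Ontology website.
% Assumes the Bioinformatics Toolbox and an internet connection.
GO = geneont('live', true);

% Show the properties of the ontology object as a whole
% (format version, date, number of terms, and so on).
get(GO)

% Look up a single term by its numeric GO ID.
% 5840 (GO:0005840, ribosome) is an illustrative ID, not project data.
% Leaving off the semicolon prints the term's fields.
GO(5840).terms
```

Leaving the semicolon off the last line is deliberate: it lets MATLAB print the term so the name and definition can be checked by eye.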

Other functions to use are **getancestors**, **getdescendants**, **getrelatives** and **getmatrix**. The first three pull related terms from the entire ontology and list the accession numbers of those terms; **getmatrix** returns the matrix of connections between terms.
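The three lookup functions can be sketched as follows. This assumes the Bioinformatics Toolbox and a live download of the ontology; the starting term 5840 is again an illustrative ID rather than something taken from the project data.

```matlab
% Load the ontology (assumes the toolbox and an internet connection).
GO = geneont('live', true);

termId = 5840;  % illustrative GO ID (ribosome), not from the project data

% Less specific terms on the parent side of the graph.
anc = getancestors(GO, termId);

% More specific terms on the child side of the graph.
desc = getdescendants(GO, termId);

% Nearby terms; by default one level up and one level down.
rel = getrelatives(GO, termId);

% Each result is a numeric vector of GO IDs.
% num2goid displays them in the usual GO:xxxxxxx form.
disp(num2goid(anc))

% Indexing the ontology with a vector of IDs gives a smaller
% geneont object containing only those terms.
subGO = GO(anc);
```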

Using **getmatrix** and **biograph**, the relationships between the terms can be turned into a visual representation. Here riboanc is a geneont object holding a subset of the terms:

cm = getmatrix(riboanc);
BG = biograph(cm, get(riboanc.Terms, 'name'))
view(BG)
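The snippet above uses a variable, riboanc, that has to be built first. A fuller sketch, following the MathWorks ribosome example, might look like this. It assumes the Bioinformatics Toolbox, an internet connection, and a release of the toolbox that still includes biograph; the ID 5840 is illustrative only.

```matlab
% Load the ontology (assumes the toolbox and an internet connection).
GO = geneont('live', true);

% Collect the ancestors of an illustrative term (5840, ribosome).
ancestors = getancestors(GO, 5840);

% Build a sub-ontology that holds only those terms.
riboanc = GO(ancestors);

% Connection matrix: which term is linked to which.
cm = getmatrix(riboanc);

% Draw the graph, labelling each node with the term name.
BG = biograph(cm, get(riboanc.terms, 'name'));
view(BG)
```

Starting from a small set of ancestors keeps the picture readable; drawing the whole ontology at once would give a graph with tens of thousands of nodes.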

Questions:
While working through the commands I have had a few questions including the following:

What is microarray data analysis?

What file format does my gene ontology data have to be in? So far, Excel (xls) hasn't been very compatible with the Bioinformatics Toolbox in MATLAB. It accepts other, more widely used formats like SOFT and MINiML, but not Excel.

- MATLAB's gene ontology function, geneont, will read an OBO-formatted file that is on the MATLAB search path and turn it into an object I can work with. It can also pull the current ontology directly from the Gene Ontology website and its database.

How do I change the Excel worksheet I have into a new format? - I learned a MATLAB command, GOIDs = num2goid(X), where X is a vector of numbers. The command turns each number into a 7-digit, zero-padded number preceded by the prefix GO:. Note that it only formats numbers that are already GO term numbers; it does not look up which GO terms belong to a UniGene identifier, so I will still need a way to map my UniGene identifiers to GO IDs first. An example, taken from mathworks.com:

t = [5575 5622]
ids = num2goid(t)

'GO:0005575'
'GO:0005622'

I will then be able to use ids in the next step of running a gene ontology, because it will hold all the term IDs. - I have gotten this to work in MATLAB and it stores the IDs as strings, but I am not sure how to turn the strings into an OBO-formatted object that can be found on the MATLAB search path.
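To make the formatting step concrete, here is a small sketch that compares num2goid with a hand-written version of the same zero-padding. It assumes the Bioinformatics Toolbox is installed. The variable name manual is made up for this illustration, and the input numbers are the two from the MathWorks example, not UniGene identifiers.

```matlab
% Numeric GO term numbers (from the MathWorks example, not project data).
t = [5575 5622];

% Toolbox function: returns a cell array of strings such as 'GO:0005575'.
ids = num2goid(t);

% The same formatting done by hand in base MATLAB:
% pad each number with zeros to 7 digits and add the 'GO:' prefix.
manual = arrayfun(@(n) sprintf('GO:%07d', n), t, 'UniformOutput', false);

% Both versions should give the same strings.
assert(isequal(ids(:), manual(:)))

disp(ids)
```

The comparison shows that num2goid is purely a formatting step, so a number only becomes a meaningful GO ID if it was a GO term number to begin with.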

The next question I have is about accessing the gene ontology in MATLAB in general. I cannot seem to get a hold of any function that will let me access anything gene ontology related.

- The next thing is that I can use the FileValue input, a "string specifying the file name of an OBO-formatted file that is on the MATLAB search path". - MOST IMPORTANTLY: the GeneontObj output is a "MATLAB object containing gene ontology information".
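A possible way to put FileValue and GeneontObj together is sketched below. It assumes the Bioinformatics Toolbox is installed. The file name gene_ontology.obo is hypothetical and stands for whatever OBO file has been downloaded and placed on the MATLAB search path.

```matlab
% Hypothetical name of an OBO file saved on the MATLAB search path.
oboFile = 'gene_ontology.obo';

if exist(oboFile, 'file') == 2
    % FileValue: read the ontology from the local OBO file.
    GO = geneont('File', oboFile);
else
    % Fall back to downloading the current ontology
    % (needs an internet connection).
    GO = geneont('live', true);
end

% GO is the GeneontObj: the MATLAB object holding the ontology.
get(GO)
```

Reading from a local file keeps the ontology version fixed between sessions, while the live download always fetches the latest release.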

This image is an example of what a gene ontology actually does. It links different cellular processes by biological process, molecular function and cellular component. It is then able to create networks between genes.

[[media type="custom" key="22296450" align="center"]]

I found this video very helpful in my quest for truly understanding what a gene ontology is and what tools are going to be the most helpful to me as I learn and push forward. I have the data and now I just need to learn what to do with it.