March 2013

Gene Ontology
I am continuing my work from January and February in hopes of completing a gene ontology in MATLAB. I still need to transform the data into something MATLAB can read. My questions are as follows:

How do I change my Excel file into something other than a list of accession numbers in an array in MATLAB?

Once I have them, how do I follow the functions for creating the annotated ontology?

How do I create a phylogenetic tree to analyze the data? - The phylogenetic tree is actually a biograph that links the paths and ancestors of the data set together. Creating a biograph calls on one function that takes a matrix of the ancestor-linked genes and then labels the links using their names.

Is the tree necessary, or is there a different way to map? - The tree is extremely necessary. It is not actually a phylogenetic tree; it is a biograph that links many similar ancestors together in order to find similar paths for overall gene expression. The tree is the most important part of running a gene ontology because it is the key to discovering the mapping.



After many months of perseverance, this is an example of a biograph that links many paths together by using ancestors and how they connect. It deals only with biological process, excluding cellular component and molecular function. With these also taken into consideration, I will now be able to pull together many different trees. I had many difficulties, but I found that if I import the Excel sheet into MATLAB and then turn it into an array, I can easily call the data from that array.
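The import step described above could look roughly like the sketch below. The file name accessions.xlsx, the column name GO_ID, and the assumption that the sheet stores identifiers as text such as 'GO:0035143' are all hypothetical; the two fallback values are placeholders so that the snippet runs without the workbook.

```matlab
% Minimal sketch: load identifiers from a spreadsheet into a numeric array.
% 'accessions.xlsx' and the column name GO_ID are hypothetical names.
xlsFile = 'accessions.xlsx';

if exist(xlsFile, 'file')
    T = readtable(xlsFile);      % first row is treated as column headers
    rawIds = T.GO_ID;            % assumed to be a cell array of text IDs
else
    % Placeholder values so the conversion below can be tried on its own.
    rawIds = {'GO:0035143'; 'GO:0005840'};
end

% Strip the 'GO:' prefix and leading zeros to get the numeric IDs that
% geneont objects are indexed with. Entries that do not match the
% 'GO:<digits>' pattern will cause cellfun to error.
goIds = cellfun(@(s) sscanf(s, 'GO:%d'), rawIds);

disp(goIds)
```

Saving goIds to a .mat file with save would avoid re-importing the worksheet in every session.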

The first code that has to be run each time I do a gene ontology is GO=geneont('live',true); get(GO) This updates the system each time to include any new data that has been added to the gene ontology database online. The next step is bringing my array data set back into memory. I re-import the worksheet into MATLAB and save the array into the workspace to be pulled up later on. The most important code is GO(accessionnumber).terms, where the number is the numeric GO ID. This goes into the gene ontology database and brings up something like the following:

As illustrated in the figure above, GO(35143).terms brings forth the data underneath. It gives the term's id in the database, the name of the term to which the id belongs, and then the ontology, followed by further fields describing the term. The most important parts I will be looking at are the name and the ontology. The ontology will be important in creating a biograph like the one illustrated above. It creates paths of important genes to be used for discovering retinal regeneration in the case of my studies.
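Written out as a short script, the lookup described above might look like the following sketch. It assumes the Bioinformatics Toolbox is installed and that an internet connection is available for the 'live' download; the ID 35143 is the one used in this entry, and the variable name term is just illustrative.

```matlab
% Sketch: download the current Gene Ontology and inspect one term.
% Requires the Bioinformatics Toolbox and network access.
GO = geneont('live', true);   % fetch the latest ontology from the web
get(GO)                       % list the properties of the geneont object

% Index the ontology by numeric GO ID and pull out the term object.
term = GO(35143).terms;

% The two fields this analysis relies on most.
fprintf('id:       %d\n', term.id);
fprintf('name:     %s\n', term.name);
fprintf('ontology: %s\n', term.ontology);
```

Because the download can take a while, it may be worth keeping GO in the workspace for the whole session instead of recreating it before every lookup.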

The next most important thing to do is to check for ancestors. How many ancestors we find for a specific gene tells us whether creating a biograph is worthwhile. If few ancestors show up, the web will not reveal much to us, but if there are many ancestors, then chances are there is something hiding there. The gene being expressed is important and is clearly highlighted. The genes that show up most often in the ancestors category are also important because of their ties to multiple genes.
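A rough sketch of this ancestor check is given below, using the toolbox function getancestors. The list goIds is a hypothetical stand-in for the array imported from the worksheet, and the tallying of shared ancestors is only one possible way to find the terms that recur most often. As far as I understand, getancestors includes the queried term itself in its result, so the counts are one higher than the number of true ancestors.

```matlab
% Sketch: count ancestors per term and find ancestors shared across terms.
% Requires the Bioinformatics Toolbox and network access.
GO = geneont('live', true);
goIds = [35143 5840];          % hypothetical IDs; replace with the real array

allAnc = [];
for k = 1:numel(goIds)
    anc = getancestors(GO, goIds(k));   % numeric IDs of the ancestor terms
    fprintf('GO:%07d has %d terms in its ancestor set\n', goIds(k), numel(anc));
    allAnc = [allAnc; anc(:)];          % collect for the tally below
end

% Tally how often each ancestor ID occurs across all queried terms.
[uniqueAnc, ~, idx] = unique(allAnc);
counts = accumarray(idx, 1);
[~, order] = sort(counts, 'descend');

% Print the most frequently shared ancestors (at most five).
for k = 1:min(5, numel(order))
    id = uniqueAnc(order(k));
    t = GO(id).terms;
    fprintf('GO:%07d appears in %d ancestor sets (%s)\n', id, counts(order(k)), t.name);
end
```

Terms near the root of the ontology will tend to top this list for any input, so the more informative entries are probably the specific terms further down.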



For the data pulled from this ID, the function tells me how many terms show up, which in this case is 17. Since 17 is a relatively high number of terms for my data set, I created a biograph to explore the connections.

The biograph function is important for creating a web, but it is also confusing. The command is:

name=getmatrix(ribo); BG=biograph(name,get(ribo.Terms,'name')); view(BG)

This set of commands will retrieve the connection matrix from the object that I saved under the name ribo, then it will use the biograph function to label the nodes of the web with the names of the ancestors, and the final command will let us view the graph.
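For completeness, here is a sketch of the whole sequence from download to graph, following the pattern of the commands above. It assumes that ribo was built as the sub-ontology of ancestors for a single ID; the ID 35143 is reused from earlier in this entry, and the variable names cm and ancestors are illustrative.

```matlab
% Sketch: build and view a biograph of one term's ancestors.
% Requires the Bioinformatics Toolbox and network access.
GO = geneont('live', true);

% Collect the ancestor IDs and cut the matching sub-ontology out of GO.
ancestors = getancestors(GO, 35143);
ribo = GO(ancestors);

% Connection matrix: which term in the sub-ontology links to which.
cm = getmatrix(ribo);

% Label each node with its term name, then open the graph viewer.
BG = biograph(cm, get(ribo.terms, 'name'));
view(BG)
```

The same steps can be repeated for each ID of interest to compare the resulting graphs side by side.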

Now, by looking at different graphs, I will be able to find the major connections between the genes.