Introduction
Methods have been created for the quantitative analysis of fluorescence microscope images of a large number of proteins to determine their subcellular locations.
August 5, 2004
Project Goal
Improve the algorithm to extract the subcellular locations contained in structured protein databases.
GO Tree
A node of the GO tree contains:
GO ID: a unique number.
GO Code: a code of length 15. A GO ID may have more than one GO Code.
Example:
GO ID: 0005792, 0019718
GO Code: 3132410000000000, 3132411000000000
GO Codes shown in the tree figure: 3132410000000000, 3132411000000000, 3132412000000000
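The GO Codes listed above look like zero-padded paths from the root of the tree, with each extra non-zero digit going one level deeper. The slides do not state the encoding, so the following is only a minimal sketch under that assumption (one character per level, `0` used purely as right padding). The helper names `path_digits`, `depth`, and `is_ancestor` are hypothetical and are not part of GOfetcher.

```python
def path_digits(go_code: str) -> str:
    """Drop the trailing zero padding, leaving the assumed root-to-node path."""
    return go_code.rstrip("0")


def depth(go_code: str) -> int:
    """Number of levels below the root, under the one-character-per-level assumption."""
    return len(path_digits(go_code))


def is_ancestor(ancestor: str, descendant: str) -> bool:
    """True if `ancestor` lies strictly above `descendant` in the assumed encoding."""
    a, d = path_digits(ancestor), path_digits(descendant)
    return len(a) < len(d) and d.startswith(a)


# GO Codes taken from the slide; the parent/child reading is an assumption.
parent = "3132410000000000"
child_1 = "3132411000000000"
child_2 = "3132412000000000"

print(is_ancestor(parent, child_1))   # parent path is a proper prefix of the child path
print(is_ancestor(child_1, child_2))  # siblings share a prefix but neither contains the other
```

Under this reading, a GO ID with more than one GO Code would simply be a term reachable by more than one path from the root.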
Methods
Protein Name -> GOfetcher -> Protein Location
Example
A protein may have more than one GO ID and more than one GO Code.
Protein name: Actin
GO IDs: 0015629, 0030482, 0005884, 0030864, 0030864
GO Codes: 318911000000000, 318911100000000, 318911200000000, 318912000000000, 318411000000000
GO Term / Output
Extracellular, Cell Surface, Membrane, Cell (other component), Intracellular, Chromosome, Cytoplasm, Cytoplasmic Vesicle, Cytoskeleton, Cytosol, ER, Endosome, ER Golgi intermediate compartment, Golgi Apparatus, Mitochondrion, Ribosome, Nucleus, Nuclear Membrane, Nucleolus, Nucleoplasm
Methods (continuation)
Protein name -> GO IDs -> GO Codes -> LCA -> GO Term
The GO Term obtained is the result.
Diagram labels: GO Term, GO ID, GO Code
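The LCA (lowest common ancestor) step can be illustrated with a short sketch. It assumes, without confirmation from the slides, that a GO Code is a zero-padded path from the root with one character per tree level, so that the LCA of several codes is their longest common prefix padded back to full width. The function name `lowest_common_ancestor` is hypothetical, and this is not necessarily how the presented program computes it.

```python
from os.path import commonprefix


def lowest_common_ancestor(go_codes: list[str]) -> str:
    """Return the GO Code of the deepest node shared by all paths (assumed encoding)."""
    if not go_codes:
        raise ValueError("need at least one GO Code")
    width = len(go_codes[0])
    # Strip the zero padding so only the assumed root-to-node path remains.
    paths = [code.rstrip("0") for code in go_codes]
    # commonprefix compares character by character across all strings.
    shared = commonprefix(paths)
    # Pad back to the original width so the result is itself a GO Code.
    return shared.ljust(width, "0")


# GO Codes listed for Actin in the Example slide.
actin_codes = [
    "318911000000000",
    "318911100000000",
    "318911200000000",
    "318912000000000",
    "318411000000000",
]
print(lowest_common_ancestor(actin_codes))  # 318000000000000 under the stated assumption
```

The resulting code would then be looked up to obtain the GO Term reported for the protein. Whether the real program reports this node for Actin is not stated in the slides.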
Results
Program Versions Running Time
New Old
8
0 0 0 0 0 0 5
0 0
0 0 0
0 0 0 0
0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 1 4 0 14
0
0 0 0 0 0 7
1
0 0 0 0 0
3
0 0 0 4
0
0 0 0
2
0 1
0
2
4 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 2
0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 2 2 0 2
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0
5 0 1 2 2 2 2 0 0 13
Conclusions
The LCA algorithm is faster and more accurate than the old version.
Higher recall
Agreement with previous results
Friendlier interface
Future Work
With this program we can compare the subcellular locations determined by image analysis with those extracted from protein databases. This comparison can reveal whether the description obtained from the images is consistent with the database annotations and can, for example, identify proteins that were mislocalized due to tagging artifacts.
August 5, 2004 SURP Final Presentation
Acknowledgments
Dr. Robert F. Murphy
Murphy Lab Group
Juchang Hua
NSF