and gene expression data analysis
Protein structure prediction
• Proteins are chains of amino acids joined together by peptide bonds.
• Many conformations of this chain are possible due to the rotation of the chain about each atom.
• It is these conformational changes that are responsible for differences in the three-dimensional structure of proteins.
Why we are using cloud computing
• These applications require high computing capabilities and often operate on large datasets, which causes extensive I/O operations.
• Protein structure prediction is a computationally intensive task that is fundamental to different types of research in the life sciences.
Benefits of protein structure prediction
• Manual determination of 3D structure is difficult, slow, and expensive.
• Structure helps in the design of new drugs for the treatment of diseases.
• The geometric structure of a protein cannot be directly inferred from the sequence of genes that compose its structure; it is the result of complex computations aimed at identifying the structure that minimizes the required energy.
• In the above figure, the web portal frees scientists from worrying about the prediction task; all the work is done by the cloud service.
An implementation based on support vector machines divides the pattern recognition problem into three phases:
• initialization,
• classification,
• and a final phase.
Parallel execution within these phases reduces the computational time of the prediction. The prediction algorithm is then translated into a task graph that is submitted to Aneka. Once the task is completed, the middleware makes the results available for visualization through the portal.
Gene expression data analysis
• Gene expression profiling is the measurement of the expression levels of thousands of genes at once; consequently, it is widely used for cancer prediction.
• It is also used in medical diagnosis and drug design.
Cancer
• Cancer is a disease characterized by uncontrolled cell growth and proliferation. This behavior occurs because the genes regulating cell growth mutate, which means that all cancerous cells contain mutated genes.
• This uncontrolled growth produces different types of tumors. In this context, gene expression profiling is used to provide a more accurate classification of tumors.
• The dimensionality of typical gene expression datasets ranges from several thousand to tens of thousands of genes.
• This large classification problem is addressed with the eXtended Classifier System (XCS), which has been successfully used for classifying large datasets.
• Cloud-CoXCS is a machine learning classification system for gene expression datasets on the Cloud infrastructure. It extends the XCS model by introducing a coevolutionary approach.
• CoXCS divides the entire search space into subdomains and employs the standard XCS algorithm in each of these subdomains.
Working of CoXCS
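The idea of dividing the search space into subdomains and running one classifier per subdomain can be sketched roughly as follows. This is only an illustration under several assumptions, not the CoXCS implementation: the gene indices are split into contiguous, near-equal blocks; `CentroidClassifier` is a hypothetical stand-in for an XCS instance (it is a simple nearest-centroid rule, not XCS); and the majority vote used to combine the subdomain predictions is an assumption made for this sketch. The coevolutionary part of CoXCS is not modeled here. All function and class names are invented for this example.

```python
from collections import Counter
from statistics import mean


def split_into_subdomains(n_genes, n_subdomains):
    """Partition gene (feature) indices into contiguous, near-equal blocks."""
    base, extra = divmod(n_genes, n_subdomains)
    subdomains, start = [], 0
    for i in range(n_subdomains):
        size = base + (1 if i < extra else 0)  # spread the remainder
        subdomains.append(list(range(start, start + size)))
        start += size
    return subdomains


class CentroidClassifier:
    """Hypothetical stand-in for one XCS instance (NOT the XCS algorithm)."""

    def fit(self, rows, labels):
        grouped = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append(row)
        # One centroid per class: the per-gene mean of its training rows.
        self.centroids = {
            label: [mean(column) for column in zip(*members)]
            for label, members in grouped.items()
        }
        return self

    def predict(self, row):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(row, centroid))

        return min(self.centroids,
                   key=lambda label: squared_distance(self.centroids[label]))


def project(sample, indices):
    """Keep only the genes that belong to one subdomain."""
    return [sample[i] for i in indices]


def train_per_subdomain(samples, labels, n_subdomains):
    """Train one independent classifier on each subdomain of the genes."""
    models = []
    for indices in split_into_subdomains(len(samples[0]), n_subdomains):
        rows = [project(sample, indices) for sample in samples]
        models.append((indices, CentroidClassifier().fit(rows, labels)))
    return models


def predict(models, sample):
    """Combine subdomain predictions by majority vote (an assumption)."""
    votes = Counter(model.predict(project(sample, indices))
                    for indices, model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy expression levels for 6 genes; real datasets have thousands.
    samples = [
        [1.0, 1.2, 0.9, 1.1, 1.0, 0.8],
        [0.9, 1.0, 1.1, 1.0, 1.2, 1.0],
        [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],
        [4.9, 5.1, 5.0, 5.2, 4.8, 5.0],
    ]
    labels = ["normal", "normal", "tumor", "tumor"]
    models = train_per_subdomain(samples, labels, n_subdomains=3)
    print(predict(models, [5.1, 5.0, 5.0, 4.9, 5.1, 5.0]))
```

Because each subdomain is trained without reference to the others in this sketch, the per-subdomain training calls are natural candidates for separate tasks on cloud nodes.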