
Computers are from Mars, Organisms are from Venus: Interrelationship guide to Biology and Computer Science

Junhyong Kim, Department of Ecology and Evolutionary Biology

The media is abuzz with phrases like "biology and computation," "bioinformatics," "DNA computing," and "genetic algorithms." What is all this about? Like human relationships, scientific disciplines continuously undergo mergers and splits. In recent years, one active area of interdisciplinary merger has been biology and computer science. The noise we hear is from the strenuous struggle of these two disciplines coming together. Is this the agonizing grinding of unmatched gears? Or is it the birth sound of a constructive synthesis of two youthful sciences?

There is a kind of natural affinity between biology and computer science. E. Schrödinger envisioned life as an "aperiodic crystal": he observed that the organizing structure of life is neither completely regular, like a pure crystal, nor completely chaotic and without structure, like dust in the wind. Perhaps because of this, biological information has never satisfactorily yielded to classical mathematical analysis. Yet a simple look out the window shows a great abundance of structure in biological objects, from the fractal-like branches of an oak tree to the symmetries of DNA's double helix. Machine computation combines elegant algorithms with brute-force calculation, and as a first guess this seems a reasonable way to approach such aperiodic structure. On the other side of this relationship, the idea of a computer is to have a machine that can flexibly solve diverse problems. In nature, such plastic problem solving is uniquely the domain of organic matter. Whether through historical evolution or individual behavior, organisms are always adaptively solving the problems posed by their environments. Thus an examination of the ways organisms solve their problems leads to new approaches to computation and algorithm development.

In this essay, I will take a quick tour of how computation is affecting biology and how biology is affecting computation. No matter how compelling, interdisciplinary research is always difficult. The reasons for the difficulties lie largely in problems of human nature rather than in problems of science. In the last section, I will explore these difficulties and suggest that the solution is to get over the mythical idea of an expert.

Biology meets Computer Science: Computational Biology and Bioinformatics


Biology is the youngest of the natural sciences. All natural sciences progress from an information-gathering phase (so-called "stamp collecting") to an information-processing phase (theorizing) when the collected information reaches a critical density. Information processing is the dominant activity in more mature sciences like physics, where new information is scarce and theoretical abstraction and prediction play a much more important role. Until recently, the major activity in biology has been gathering new information, in the lab and in the field. In the last five years, the growth of biological information, especially at the molecular level, has been astonishing. The growth curve of the total information stored in GenBank (http://www.ncbi.nlm.nih.gov/, the primary database of molecular biology information) is exponential, quite closely mimicking the exponential curve of computer processing power (so-called Moore's Law). When I started doing laboratory work about ten years ago, it took us approximately 4-5 days to obtain 200 base pairs of DNA sequence data. This year, the biotechnology corporation Celera produced the approximately 170 million bases of the fruitfly genome in a matter of several months. Current estimates of the public sequencing capacity devoted to just the human genome project are about 28 million bases a month, and the private capacity is several times that. We are overwhelmed with this volume of information. No single person, or even a large group of people, can ever hope to make sense of it, especially at the rate at which it is being produced. The demand for organization, abstraction, and theory is a ringing siren in the information storm.
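To get a feel for the pace, here is a back-of-the-envelope sketch in Python using only the throughput figures quoted above; the ten-year interval and the reading of "several months" as roughly ninety days are my own rough assumptions, not figures from the text.

    import math

    # Rough throughput figures quoted above (order-of-magnitude assumptions):
    # ~200 bases in about 5 days circa ten years ago,
    # ~170 million bases in roughly 90 days this year.
    old_rate = 200 / 5        # bases per day, about ten years ago
    new_rate = 170e6 / 90     # bases per day, assuming "several months" ~ 90 days

    fold_increase = new_rate / old_rate
    years = 10
    doubling_time_days = years * 365 / math.log2(fold_increase)

    print(f"throughput grew ~{fold_increase:,.0f}-fold")
    print(f"implied doubling time: ~{doubling_time_days:.0f} days")

The implied doubling time comes out to roughly eight months, in line with the claim that sequence output has kept pace with, or outpaced, the 18-to-24-month doubling usually quoted for Moore's Law.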

What are some examples of computational approaches in use? Currently, the two most successful uses of computers in biology are comparative sequence analysis and in silico cloning. When we isolate new molecular sequence data in the laboratory, we would like to know everything there is to know about that sequence. One easy idea is to see whether other people have already studied molecular sequences similar to mine. Similarity between sequences is generated by two mechanisms: functional constraints and evolutionary descent. Like relatives who share similarities, biomolecular sequences related by evolutionary descent share sequence similarities. Likewise, the structural requirements of performing a particular function constrain two molecules with similar function to resemble each other. Therefore, a great deal can be learned and extrapolated about a biomolecular sequence by comparing it to already well-studied similar sequences. Probably the most widely used computational tool in biology is the program BLAST (http://www.ncbi.nlm.nih.gov/). BLAST is used to search databases like GenBank for all sequences that are similar to a target sequence. These days, when we isolate a new molecular sequence, pretty much the first thing anybody does is run a BLAST search against the existing databases. If Nobel prizes were given out for the scientific utility of a tool (as they sometimes are), the creators of BLAST certainly should be at the head of the line.

In silico cloning is the nickname given to cloning a gene using a computer search of existing databases. Most people are aware that cloning is an important activity in biology. Cloning might be analogized to finding a particular sentence in a book in a very large library. "Find a particular sentence" is a purposefully vague statement. It might mean searching by semantics, such as "find a sentence that expresses the angst of a young prince"; by syntax, such as "find a sentence that starts with a preposition, then a present subjunctive verb, ..."; or by pattern, such as "find a sentence that starts with 'To be or not to be'." All of these search clauses occur in real experimental settings. We may want to clone a gene by its phenotype (say, olfaction), by its structure (say, a G protein-coupled receptor), or by a pattern fragment (say, the DNA pattern ACCAGTC). Doing this in the laboratory is like physically going to the library to do your search; there are a lot of logistical problems. Genome projects are somewhat like putting all the library books on a CD-ROM: this alleviates the physical access problems. However, problems remain, because the data from a genome project are like having all the books on a CD-ROM without any titles, annotations, cross references, and so on. If we were asked to find a sentence that expresses the angst of a young prince, there would still be a lot of reading and interpreting to do. Yet this is a far better state than physically going through the books, and in the case of a pattern search it can be done very rapidly. Thus, in silico cloning is an important benefit of the genome projects.
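As an illustration of the pattern-search flavor of in silico cloning, here is a minimal sketch: scanning a toy "database" for the DNA pattern mentioned above. The scaffold names and sequences are invented, and a real search would use an indexed tool such as BLAST rather than a linear scan.

    # Scan a set of (hypothetical) genomic sequences for a short DNA pattern.
    PATTERN = "ACCAGTC"

    # Toy stand-in for a sequence database; names and sequences are made up.
    database = {
        "scaffold_1": "TTACCAGTCGGATTACAGGCATGAGCCACCAGTCGGCC",
        "scaffold_2": "GGGCTTAAACCCGTTACGATTTAAGCCATT",
    }

    def find_pattern(seq, pattern):
        """Return every start position of an exact match of pattern in seq."""
        hits = []
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
        return hits

    for name, seq in database.items():
        print(name, find_pattern(seq, PATTERN))
    # scaffold_1 [2, 27]
    # scaffold_2 []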
One of the remaining challenges is how to do the "find a sentence that expresses the angst" kind of search automatically by computer. As with the language-semantics problem, it remains difficult to search genome databases with a criterion like "olfactory genes." Even so, computational approaches can solve long-standing laboratory problems. Last year, in a collaborative project between Dr. John Carlson's laboratory and mine, we used a computer program to successfully attack a 15-year-old problem: isolating olfactory genes from the fruitfly. The program narrowed down the candidate genes sufficiently to make the experimental work manageable. The utility of computational approaches is often best demonstrated in such close collaborations, where the computer is used to guide the more expensive and time-consuming wet-lab experiments. These days computers are routinely used in biology for biomolecular sequence alignment, assembly of DNA fragments, multivariate analysis of large-scale gene expression, and metabolic pathway analysis, just to name a few tasks.

What are some of the important challenges in computational biology? Here are some of my favorites. Historically, and somewhat inherently, biology is a diverse field with information coming from many different distributed sources. For example, 50 different laboratories around the world might study a given gene. Given these heterogeneous sources of information, one major problem is collecting and integrating them into a coherent whole. Curating and integrating distributed databases is a fundamental problem that is critically important, especially given the pace of data production. The kinds of problems encountered range from relatively simple to almost philosophical. A simple problem is that different labs may give the same object ten different names (the toy sketch below illustrates the idea). A difficult problem is the seemingly trivial question of how to define a gene, which is in fact not a philosophical problem but a very practical one for a database specialist. Currently, most major databases such as GenBank and SWISS-PROT (http://www.ebi.ac.uk/swissprot/index.html) operate partly through human curation and partly through various automated data-sharing schemes.
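Here is the toy sketch of the naming problem; every laboratory, identifier, synonym, and record below is invented for illustration.

    # Reconcile records from different labs that use different names for
    # the "same" gene, via a hand-curated synonym table.
    synonym_table = {            # maps lab-specific names to a canonical id
        "or22a": "GENE_0001",
        "dor53": "GENE_0001",
        "a45":   "GENE_0001",
        "or47b": "GENE_0002",
    }

    records = [
        {"lab": "lab_A", "gene": "or22a", "note": "expressed in antenna"},
        {"lab": "lab_B", "gene": "dor53", "note": "candidate odorant receptor"},
        {"lab": "lab_C", "gene": "or47b", "note": "maps to chromosome 3"},
    ]

    merged = {}
    for rec in records:
        canonical = synonym_table.get(rec["gene"], rec["gene"])
        merged.setdefault(canonical, []).append((rec["lab"], rec["note"]))

    for gene, notes in merged.items():
        print(gene, notes)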

Issues of data quality, interoperation, and integration of dispersed sources of information remain problematic, and there is active research in this area at Yale (http://bioinfo.mbb.yale.edu/ and http://www.cs.yale.edu/Linda/linda.html).

As mentioned above, one of the important goals of computational biology is to use existing information to extrapolate knowledge about novel biomolecules. Genome projects generate raw data without giving it biological meaning. An important problem, therefore, is to annotate this raw data with all the pieces of information that might be biologically relevant. Useful information includes whether a stretch of DNA contains amino-acid-coding sequence, transposons, or regulatory sequence; if an amino acid sequence is coded, what its putative function is; and so on. One of the most detailed such annotations can be found at http://www.flybase.org/publications/Adh.html, covering the three million bases surrounding the Drosophila melanogaster ADH region. Much of this annotation was carried out using a variety of computer tools together with human interpretation. However, given the rate of DNA sequence generation, careful human analysis is becoming increasingly difficult. The computational challenge is to annotate the raw data automatically. Some of the necessary tools include gene prediction, gene classification, comparative genomics, and evolutionary modeling. (These are some of the topics addressed in my laboratory: http://jkim.eeb.yale.edu/.)

Although biological information is increasingly being synthesized into general theories, our knowledge remains scattered and complex. For example, we know a great deal about the molecular events governing the early development of the fruitfly, Drosophila melanogaster. However, this knowledge is extremely complex, taking the form of statements like "gene A and gene B interact to positively induce gene C, but in the presence of gene D the positive regulation is modulated by...". The main problem is that we usually do not have an a priori theory under which we gather coordinated data; rather, hundreds of different research groups make independent observations hoping to synthesize that theory. These hundreds of independent observations are published in thousands of articles using scores of variations in terminology, methodology, and so on. And this is just for the biology of fruitfly development. I would wildly guess that the number of articles published each year on cancer runs into the tens of thousands. What all of this calls for is some kind of system for automatic knowledge extraction: a computer program that will scan all of these articles, classify them, and produce synthetic new information. This is obviously a tall order, something that would be useful not only in the biological context but also in everyday life. Automatic text retrieval and knowledge extraction is therefore an important area of research in computer science (http://www.cs.cmu.edu/cald/research.html). These tools are just beginning to be applied to biological research.
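To make the idea concrete, here is a deliberately naive sketch of pulling gene-interaction statements out of free text. The sentences and the pattern are illustrative only; real literature mining requires far more sophisticated natural-language processing.

    import re

    # Toy "abstracts"; the sentences are invented for illustration.
    abstracts = [
        "We show that hunchback represses knirps in the posterior domain.",
        "bicoid activates hunchback in a concentration-dependent manner.",
        "These results do not address wing development.",
    ]

    # Look for the simplest possible interaction statements.
    pattern = re.compile(r"(\w+) (activates|represses) (\w+)")

    interactions = []
    for text in abstracts:
        for source, relation, target in pattern.findall(text):
            interactions.append((source, relation, target))

    print(interactions)
    # [('hunchback', 'represses', 'knirps'), ('bicoid', 'activates', 'hunchback')]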
Most people, when asked what the Holy Grail of computational biology is, would answer "sequence-structure-function prediction." Sequence-structure-function prediction refers to the idea that, given the sequence identity of a molecule, we would like to predict its three-dimensional structure and, from that structure, infer its molecular function. In the past, we were remarkably successful in deducing the universal genetic code. The universal genetic code is a relational map from DNA sequences to amino acid sequences, allowing us to know the amino acid identity completely once we know the DNA sequence identity. The universality of this code and its elegant combinatorial structure are remarkable facts of nature. In addition, our knowledge of the code has extremely practical consequences. It is difficult and costly to identify the amino acid sequence of a protein but easy to identify the corresponding DNA sequence; since we have the genetic code, we only need to do the latter. For the step from amino acid sequence to protein structure, we have reason to believe that the amino acid sequence completely determines the three-dimensional structure of the protein. Therefore, much as with the genetic code, we should be able to construct a relational map from amino acid sequences to protein structures. It would be another wonderful triumph if we were able to deduce this map, and again it would be tremendously useful. Direct protein structure determination is an extremely difficult problem. Once we have this "second genetic code," all we would have to know would be the DNA sequence, and we would have complete information on the corresponding protein structure.
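Before turning to why this second map is so much harder to build, it may help to see what "relational map" means in the most literal sense for the first one. Below is a sketch of a small fragment of the genetic code as a lookup table; only a handful of the 64 codons are included.

    # A fragment of the genetic code as a literal relational map:
    # a lookup table from DNA codons to amino acids.
    CODON_TABLE = {
        "ATG": "M",                          # methionine, the usual start
        "TTT": "F", "TTC": "F",              # phenylalanine
        "GGA": "G", "GGC": "G",              # glycine
        "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
    }

    def translate(dna):
        """Translate a DNA string codon by codon; '?' marks codons missing
        from the partial table above."""
        protein = []
        for i in range(0, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            aa = CODON_TABLE.get(codon, "?")
            if aa == "*":        # stop codon ends translation
                break
            protein.append(aa)
        return "".join(protein)

    print(translate("ATGTTTGGATAA"))   # -> "MFG"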

The problem of structure prediction from sequence is extremely hard for many reasons. In the case of the genetic code, the possible values were the 20 different amino acids; in the case of protein structures, we estimate that there are roughly 1,000 different major structural classes, called folds, each with tens of thousands of variations (http://scop.stanford.edu/scop/). Also, in the case of the genetic code, an enzymatic mechanism (tRNA and the associated protein-synthesis machinery) provides the physical basis of the map in a relatively straightforward manner. In proteins, the structure is determined by the physical forces governing the interactions of hundreds, sometimes thousands, of amino acid residues. Not only do we not know the details of these interactions; even if we knew them, it would be nearly impossible to compute their consequences (a many-body problem in physics involving hundreds or thousands of non-ideal bodies). Still, significant progress has been made, especially due to a competition called CASP (Critical Assessment of Structure Prediction, http://www.ncbi.nlm.nih.gov/Structure/RESEARCH/casp3/index.shtml). CASP is a worldwide open challenge to computational biologists to predict protein structures for test cases. The test cases are drawn from protein structures that have been solved by experimental techniques but not yet released to the public. This contest format has drawn a great deal of attention and has resulted in dramatic improvements in prediction rates.

There is reasonable evidence that a protein's structure approximately determines its molecular function, such as catalysis, binding to DNA, or binding to cell components. Therefore, some people think there should be a deducible relational map between structure and function (perhaps a "third genetic code"). This is also the main idea behind so-called rational drug design. If we were able to predict the action of a protein by looking at its three-dimensional structure, like an engineer looking at an automobile design, we would be able to say "if we streamline the structure here and reduce the bump there, we will get a better-working drug." And if we had solved the sequence-structure problem, we would also know exactly how to make those changes. Unfortunately, we are far from that ideal. Not the least of our problems is the fact that the idea of a "function" is murky, both in theory and in practice. The function of an object, whether it is a screw holding together my chair or the same screw in a car jack, is quite often context dependent. The parts of an object conferring function are also often interrelated, such that we cannot simply chop off a part and expect nothing else to happen. The whole question is whether we can take an engineering approach to biological objects. Some of the most spectacular advances in biology came from just such an optimistic view, that we will achieve a physics-like understanding of biology. Perhaps, then, this is the Holy Grail of biology: a mechanical, physical understanding of the organism. Is there in fact a grand theory of the organism, a set of laws that governs organisms' form and function? Evolutionary theory has provided us with such laws and theories for the generative process of populations. Can we now obtain similar theories for the generative process of the individual? Whatever the answer, it is clear that the growing body of molecular data and computation will play a fundamental role in its formation.
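As an aside on the CASP-style assessment mentioned above: one simple ingredient of scoring a prediction against the experimentally solved structure is the root-mean-square deviation between matched atom coordinates. The sketch below uses invented coordinates, and real assessment requires structural alignment and more sophisticated measures.

    import math

    def rmsd(coords_a, coords_b):
        """Root-mean-square deviation between two equal-length lists of
        (x, y, z) coordinates, assumed to be already matched atom-for-atom."""
        assert len(coords_a) == len(coords_b)
        total = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
        return math.sqrt(total / len(coords_a))

    # Invented coordinates for three matched atoms (in angstroms).
    predicted = [(0.0, 0.0, 0.0), (1.5, 0.2, 0.0), (3.1, 0.1, 0.4)]
    solved    = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.1), (2.9, 0.3, 0.2)]
    print(f"RMSD = {rmsd(predicted, solved):.2f} angstroms")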

Computer Science meets Biology: DNA Computing and Genetic Algorithms


The possibility of using DNA for computation was suggested in a landmark paper by Leonard Adleman (summary at http://www.hks.net/~cactus/doc/science/molecule_comp.html), who used sequence-specific hybridization of DNA molecules and the polymerase chain reaction to solve a computational problem: finding a Hamiltonian path in a graph. A graph is a diagram with vertices and edges, more or less like a map of roads (edges) connecting cities (vertices). A Hamiltonian path is a route that starts at a designated vertex of the graph, traverses every vertex exactly once, and ends at a second designated vertex. For example, we may be given ten major cities on the Eastern seaboard connected by a variety of one-way highways. An example Hamiltonian path problem would be to find a route (if one exists) from New York to Boston that goes through each of the ten cities exactly once. The problem solved in the original paper involved a small number of vertices and would have been no challenge for any computer. But the problem of finding a Hamiltonian path becomes radically harder as the number of vertices grows. In fact, it belongs to a class of problems in computer science called NP-complete. NP stands for "nondeterministic polynomial," which can be roughly interpreted as saying "this problem cannot be solved in reasonable time except by a lucky sequence of guessed computation steps." In order to solve the ten-cities problem posed above, a program may have to consider all possible routes through the ten cities, which can be a very large number. On the other hand, a lucky person might guess New York -> Hartford -> New Haven -> ... -> Boston, and we could quickly check that the guess was correct.
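Here is a minimal sketch of both halves of that observation: checking a proposed route is fast, while finding one blindly means trying many orderings. The little graph of one-way roads below is made up for illustration, not a real highway map.

    from itertools import permutations

    # Made-up one-way roads between five cities.
    edges = {("NY", "Hartford"), ("Hartford", "NewHaven"),
             ("NewHaven", "Providence"), ("Providence", "Boston"),
             ("NY", "NewHaven"), ("Hartford", "Providence")}
    cities = ["NY", "Hartford", "NewHaven", "Providence", "Boston"]

    def is_valid_route(route):
        """Check a proposed route in time linear in its length."""
        return all((a, b) in edges for a, b in zip(route, route[1:]))

    # Verifying a lucky guess is fast:
    print(is_valid_route(["NY", "Hartford", "NewHaven", "Providence", "Boston"]))

    # Finding one blindly means trying up to (n-2)! orderings of the middle cities:
    middle = [c for c in cities if c not in ("NY", "Boston")]
    found = [("NY",) + p + ("Boston",) for p in permutations(middle)
             if is_valid_route(("NY",) + p + ("Boston",))]
    print(len(found), "route(s) found out of",
          sum(1 for _ in permutations(middle)), "orderings tried")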

NP-complete problems are among the hardest problems in practical computation. The notion that this class of problems can be solved in reasonable time by a lucky set of computations is extremely important for the idea of DNA computing. Another way of interpreting the statement is this: if an infinite number of computers started solving the problem, at least one of them would finish in reasonable time. That is, success may be a matter of chance for a single computer, but it is a sure thing for an infinite number of computers. This was the original advantage sought in Adleman's paper. If we can encode the problem and solve it with molecular machines, we may not have infinite machines, but we can have very many of them: a mole of DNA (albeit pretty weighty) contains Avogadro's number of molecules, about 6 x 10^23! Since the original work, further studies have shown that DNA can be used to encode a universal computer (the same kind of computer we all use on our desktops), and that this property comes purely from the ability of DNA to find complementary pairs of sequences.

There is still considerable skepticism about the practical use of DNA computers. The main issues are the difficulty of encoding the problem and reading the output (usually on the order of days even for simple problems), inherent error in the computation, and the amount of DNA required to solve practically hard problems. As an example of the last problem, the largest supercomputers currently perform approximately 10^19 elementary switch operations per second. While the molecular interactions of DNA hybridization can be extremely fast, our ability to cycle such operations is on the order of minutes (~10^2 seconds). It is not clear how many switch operations of a logical circuit are equivalent to a single DNA hybridization reaction, but if for the moment we assume a one-to-one equivalence, then matching 10^19 operations per second at ~10^2 seconds per cycle would require on the order of 10^21 DNA molecules acting in parallel, roughly 10 millimoles of DNA. Using, say, 24-base-pair sequences, this would be about 100 g of DNA, an unbelievably large amount for a molecular experiment. Despite these problems, the idea of DNA computing has opened exciting new avenues of research in nanotechnology and in models of computation. And real organisms do perform complex computations (http://www.princeton.edu/~lfl/FRS.html); the question is whether we can harness this ability for our specific use.
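The arithmetic behind the amount-of-DNA estimate above, written out as a sketch. The one-operation-per-hybridization equivalence and the 24-base-pair strand length are the same rough assumptions used in the text; with Avogadro's number taken exactly rather than rounded to 10^23, the figures come out a few-fold smaller, but the conclusion is the same.

    # Back-of-the-envelope version of the DNA-computing estimate above.
    AVOGADRO = 6.0e23
    ops_per_second_target = 1e19   # elementary switch ops of a top supercomputer
    cycle_time_seconds = 1e2       # minutes-scale cycling of wet-lab reactions

    # Assume (roughly) one hybridization event = one switch operation.
    molecules_needed = ops_per_second_target * cycle_time_seconds
    moles_needed = molecules_needed / AVOGADRO

    base_pairs = 24
    grams_per_mole = base_pairs * 650   # ~650 g/mol per base pair of duplex DNA
    grams_needed = moles_needed * grams_per_mole

    print(f"{moles_needed * 1e3:.1f} mmol, about {grams_needed:.0f} g of DNA")
    # ~1.7 mmol and ~26 g; rounding Avogadro's number down to 10^23 gives the
    # ~10 mmol and ~100 g quoted in the text. Either way, the amount dwarfs
    # the microgram quantities typical of a molecular biology experiment.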
The foundations of genetic algorithms were laid in the 1960s, when researchers such as John Holland began describing the possibility of machines that could adaptively solve complex problems. In recent years, the diverse activities in this field have been given the name "evolutionary computing," because the predominant idea is to emulate the evolutionary adaptive behavior of real organisms in order to solve complex problems. The goal of evolutionary computing is to solve hard computational problems such as the Hamiltonian path problem described above. (This contrasts with the related topic of artificial life, where the goal is to produce artificial entities that have organism-like behavior.) A somewhat loose collection of introductions to evolutionary computing can be found at http://www.cerias.purdue.edu/coast/archive/clife/FAQ/www/.

Strict evolutionary adaptation requires three components. First, there should be a property (or a suite of properties) of the organism that governs its differential survival. Second, individuals should reproduce, with inheritance of those properties. Third, there should be a mechanism that generates variation (mutation) in those properties. The idea of evolutionary computing is to generate a population of computer programs and choose those that are particularly good at solving a posed problem by tying their survival to problem solving. Reproduction and inheritance ensure that this selection for problem-solving ability continues through many cycles; mutation allows the population to continuously try out new variants of the solutions. Many studies show that these classes of algorithms may be particularly suitable for hard problems where the objective-function landscape is rough.

The meaning of that last phrase may be unclear. Many difficult problems can be visualized as finding the highest peak in a mountain range. In this metaphor, the height of the mountains represents some optimality criterion, often called the objective function. For example, in the protein structure determination problem, the optimality criterion might be the total free energy of the folded structure. The land occupied by these mountains represents the search space: the collection of possible solutions whose quality is measured by the objective function. Again, for protein structure, the search space would consist of all possible ways to fold an amino acid sequence. Together, the search space and the objective function make up the "landscape" of the problem.

Just as in real life, when the landscape is rough it is difficult to find the highest peak. We often become trapped on local peaks or find it hard to move across rugged terrain. Many computer algorithms traverse such landscapes using a rule like "look around and go up in the steepest direction," and rugged landscapes often foil such simple rules. While the exact theory is unclear, the principles used in evolutionary algorithms seem to work quite well in these situations.

I distinguished above between evolutionary computation and artificial life: in evolutionary computation we are generally interested in a problem-solving device, whereas in artificial life we are interested in mimicking organismal behavior. There is a kind of mixture of these two ideas called evolutionary programming. Basically, these are programs whose code self-replicates and adaptively changes in response to some kind of selection scheme (usually their ability to solve some problem in the shortest amount of time). In a genetic algorithm, the problem to be solved is schematically coded into the program and this architecture does not change; only the parameters of the scheme change adaptively. In evolutionary programming, the actual execution of the program changes fundamentally. An example will make this a little clearer. Consider again the protein structure determination problem: given a string of amino acids, compute the three-dimensional positions of the individual atoms such that the total free energy is minimized. To implement a genetic algorithm, we might start our population with a set of random solutions, that is, assign random three-dimensional coordinates to the atoms. We then evaluate the implied free energy of each solution and select for the lower-free-energy solutions. We then mutate the solutions by changing the three-dimensional coordinates. Under the evolutionary programming idea, we instead implement a program that takes a string of amino acids as input and produces a set of coordinates as output. The program itself then changes randomly, rearranging its own code, say by duplicating some parts and deleting others. The environment selects those programs that produce the right solutions and compute efficiently.

Some of the ideas about programs that replicate and change their own code go back to games played at Bell Labs in the early 1960s (later popularized as Core War in the pages of the May 1984 Scientific American). The first successful, truly evolving self-replicating program was created by a former tropical ecologist, Tom Ray. (Shades of interdisciplinary research!) Ray created an artificial world, built around a simplified programming language, called Tierra, which allows programs to mutate, replicate, and evolve. Originally, small programs were randomly generated in this environment and those that learned to self-replicate were selected. Pretty soon, all kinds of variants appeared that self-replicated more efficiently. Furthermore, a whole ecology of computer programs evolved, including some that reproduced as parasites, hitching a ride on normal programs. Many developments have followed, including a version that lives on the Internet, going after spare computer cycles. A version of Tierra that combines some form of problem solving with a more structured environment, called AVIDA, can be found at http://www.krl.caltech.edu/avida/. A link to Tierra can also be found there.
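As a purely illustrative sketch of the selection-inheritance-mutation loop described above, here is a toy genetic-algorithm-style search on an invented rugged one-dimensional landscape; the function and every parameter are arbitrary choices, not a recommended design.

    import math
    import random

    random.seed(0)

    def fitness(x):
        # A bumpy one-dimensional "landscape" with many local peaks.
        return math.sin(5 * x) + 0.5 * math.sin(17 * x) - 0.1 * (x - 2.0) ** 2

    # Start with a random population of candidate solutions.
    population = [random.uniform(0.0, 4.0) for _ in range(30)]

    for generation in range(100):
        # Selection: the better half of the population survives.
        population.sort(key=fitness, reverse=True)
        survivors = population[:15]
        # Inheritance with mutation: each survivor leaves a perturbed copy.
        children = [x + random.gauss(0.0, 0.1) for x in survivors]
        population = survivors + children

    best = max(population, key=fitness)
    print(f"best x = {best:.3f}, fitness(best) = {fitness(best):.3f}")

A simple steepest-ascent rule started from one random point can stall on any of the small local bumps; maintaining a mutating population is one way of hopping across them, which is the intuition behind the "rough landscape" claim above.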

Mars and Venus: the difficulties of interdisciplinary research


In the preceding sections, I briefly surveyed research at the interface of computer science and biology. The topics are fascinating, and there are connections and ties all over the place. Yet, in practice, interdisciplinary work is difficult. Many of the most creative people have trouble finding support and institutional positions. It is easy to think that interdisciplinary research is hard because it requires the combined knowledge of two different fields. While this is certainly true, such problems can be overcome if we work hard and do our homework. There is absolutely no reason why an abecedarian of the biological sciences should not also be able to learn computer science, mathematics, and statistics deeply. The main blocks to interdisciplinary science, I believe, come from symptoms of our hubris, manifested in what I call the collaboration fallacy and the expert fallacy.

The collaboration fallacy occurs in several steps. The first fallacy is the idea that an expert in one field needs help with "just this little problem" and nothing else. An expert from one field, say a biologist, might approach a computer scientist and ask, "I have this gene-finding problem; can you run a program for me?" In general, most such problems require tedious work on the part of the computer scientist.

Without being given the scientific context, such as why the gene is interesting and why the problem is important, the second expert has no motivation to collaborate; it would be mere make-work. The assumption of the biologist is that (a) the computer scientist would not be interested in the biology, (b) it is too involved to explain everything, and (c) the computer scientist would not understand anyway.

The second fallacy is the desire to revolutionize the other field. After having been told the context of the problem, the computer scientist may say, "That is completely wrong; your whole concept of a gene is flawed. Here's how we should define a gene." Indeed, such fundamental restructuring of an idea in field A using insights from field B can be extremely rewarding and revolutionizing. However, this attitude cannot be applied to every interdisciplinary problem; we would simply not get anywhere. Moreover, every discipline has its own narrative tradition under which questions are posed and answered. Making progress requires each expert to respect the tradition of the other field. Fundamental revisions are a long-term, careful undertaking, not to be hastily embarked upon.

The third fallacy is assuming that an expert in field A will have nothing useful to say about field B, and vice versa. That is, when two experts get together, they expect each other to stay within their own domains and some narrowly circumscribed interface. If a computer scientist makes a remark on the molecular mechanisms of carcinogenesis, or a biologist makes a remark on the lambda calculus, our first instinct is to ask, "Does s/he know what s/he is talking about?" This is extremely unfortunate, given that it is precisely such insights from outsiders that can lead to important breakthroughs. This, our hubris about the certainty of our scientific pedigree, is what I would like to call the expert fallacy.

We live in an age of specialization. Baseball games that were once pitched by a single person are now pitched by starters, middle relievers, and stoppers. I hold a faculty position in Ecology and Evolutionary Biology, carefully differentiated from Molecular Biophysics and Biochemistry. The idea of the expert rules. We talk of going to the world's foremost expert in X and being taught by an expert in Y. Indeed, there is probably more information today than there ever was in history, and no single person can be expected to cover even a small fraction of that knowledge. When two distinct disciplines such as biology and computer science are involved, it may be asking too much for one person to acquire knowledge equivalent to that of specialists in each field. (Although, historically, significant advances were made precisely when such crossover individuals arose.) Yet, by its very nature, the idea of an expert is a greedy idea. It is in the self-interest of experts to make sure that they, and no one else, are considered the foremost experts; thus the expert in computer science cannot tolerate the computer-science knowledge of a biologist, and vice versa. More importantly, given a collection of experts, it is in each expert's interest to make sure his or her expertise is considered the most important; thus we are taught disdain for the "soft" sciences, the applied sciences, the non-natural sciences, "mere" engineering, and so on. And because of this need to differentiate ourselves, the domain of our expertise continues to shrink, until pretty soon we will have the Department of the Left Wing of Drosophila. I argue that the key to interdisciplinary research is to get over this idea of the expert. All knowledge is equal.
Indeed, if we really knew which knowledge is important and which is not, that is, if we had an algorithm for knowledge, we would all be using it with shared certainty. We do not have such an algorithm. (In fact, we have results showing that such an algorithm cannot exist.) The growth of knowledge, whether personal or that of a field, is haphazard and anfractuous. Our best hope is to keep our eyes open in all directions, just in case something interesting happens. We should pursue knowledge vigorously but with agnostic value judgments.

Love and Prospectus


What is in the future? What can we expect from all these interactions between biology and computer science? As a biologist, my ideal would be to be aboard the starship Enterprise, where experiments are conducted by asking, "Computer: compare all known alien genetic profiles to the current sample, cross-match to Mr. Spock, and report any anomalies related to radiation from the Crab Nebula." That is, in the ideal world, science will be driven by creativity and insight, not by technical processes. I remember the thrill of sitting at a terminal attached to one of the early supercomputers and suddenly realizing that I was getting results back as fast as I could think of things to do!

My grand hope for computational biology is that this is what we will be able to do in the future for all biological problems. Not being a card-carrying computer scientist, I cannot say what the dreams of a computer scientist would be in relation to biology. But I suspect they would not be far from the idea of a new model of computation that leads to robust, adaptive, flexible problem-solving machines, just like organisms. As I noted above, the fastest computers currently operate at about 10^19 elementary switch operations per second. The human brain contains about 10^12 neurons with a peak synaptic activity of about 1 kHz, which comes to roughly 10^15 neuronal operations per second. We obviously do not know how a single synaptic event translates into a computer chip's elementary switch operations. But if the equivalence is within 10,000 to 100,000 switch operations per synaptic event, then with respect to pure information-processing capability it seems our machines are already there, comparable with the human brain. It then becomes a question of the software and the model of computation, and we would hope that this is where the interface with biology becomes important.

The prognosis for interdisciplinary research in biology and computing is great. Practically speaking, career opportunities in computational biology are increasing exponentially. Scientifically, computational approaches are rapidly gobbling up the "gimmies": those problems that are easily approachable with a computer and very hard in the laboratory. The interface between computation and biology is one of the most exciting growth fields today. Certainly some of the noise will turn out to be just noise, and some of the heat will inevitably cool. But this is also a compelling and natural relationship, and I strongly believe that the growth of either field will critically depend on this interaction. Our best hope is to embrace the relationship enthusiastically, without the hubris of prejudice, and hasten the day when we see revolutions from individuals who can discuss quantum computation and post-translational regulation with equal ease.
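For what it is worth, here is the brain-versus-machine comparison above written out as a sketch; the neuron count, synaptic rate, and machine speed are the order-of-magnitude figures quoted in the text, not measurements.

    # Back-of-the-envelope brain vs. machine comparison from the text.
    neurons = 1e12                 # order-of-magnitude neuron count used above
    peak_synaptic_rate_hz = 1e3    # ~1 kHz
    brain_ops_per_second = neurons * peak_synaptic_rate_hz   # ~1e15

    machine_ops_per_second = 1e19  # elementary switch operations, fastest machines

    ratio = machine_ops_per_second / brain_ops_per_second
    print(f"machine/brain ratio: {ratio:.0e}")
    # ~1e4, i.e., within the 10,000-to-100,000 range quoted above.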
