
Perspective

An Online Bioinformatics Curriculum


David B. Searls*
Independent Consultant, Philadelphia, Pennsylvania, United States of America

Abstract: Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field.

Citation: Searls DB (2012) An Online Bioinformatics Curriculum. PLoS Comput Biol 8(9): e1002632. doi:10.1371/journal.pcbi.1002632
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published September 13, 2012
Copyright: © 2012 David B. Searls. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The author received no specific funding for this article.
Competing Interests: The author has declared that no competing interests exist.
* E-mail: david.b.searls@gmail.com
David B. Searls is an Associate Editor of PLOS Computational Biology.

Online Learning Comes of Age

Online academic courseware at the university level has now been available to the public for a decade, the earliest concerted effort having originated in 2002 with the Massachusetts Institute of Technology (MIT) and their OpenCourseWare initiative (http://ocw.mit.edu). This project offered up the syllabi, lecture notes, quizzes, exams, and/or other study materials for a very large number of courses, at the discretion of professors but with strong support and encouragement from the MIT administration. Only in a minority of cases were videos of lectures posted.

Even before this, the University of California, Berkeley, had started webcasting lectures, and eventually began posting both audio and video for public consumption at their Berkeley Webcast site (http://webcast.berkeley.edu), though without the ancillary materials of MIT's OpenCourseWare. A number of other universities followed suit, though seldom so extensively; among these was Stanford with its ClassX streaming service (http://classx.stanford.edu/ClassX) and an earlier effort called Stanford Engineering Everywhere (http://see.stanford.edu/see/courses.aspx). In many cases, individual faculty members took the initiative to post course materials, including video, in widely varying formats. Some adopted the use of Khan-style videos or tablet-based screencasts of the sort popularized by the Khan Academy with its vast library of instructional videos, which started as a viral YouTube sensation and has now become its own well-funded institution (http://www.khanacademy.org).

YouTube indeed became the destination of many academic videos, which are now aggregated by institution under YouTube EDU (http://www.youtube.com/education). Apple has also put its distinctive stamp on online learning with iTunes U (http://www.apple.com/education/itunes-u), also organized by institution but with integrated search capability and, of course, deployment to iPad and iPhone apps. Countless aggregators also assemble collections of video courses, but generally with little value added.

Yale University began in 2007 to release Open Yale Courses (http://oyc.yale.edu) in a more curated and consistent format than most other efforts, including high-quality video and extensive syllabi; courses appeared incrementally, with just under 50 available to date. Then, in 2011, MIT revamped several of its online courses into a much more structured instructional format, with learning modules in outline form containing videos interspersed with self-assessment and other activities. In a somewhat different vein, the non-profit Saylor Foundation compiled a comprehensive online university curriculum comprising courses that are essentially mash-ups of video and text resources from many existing sources, including a number of those described above (http://www.saylor.org).

In the fall of 2011, a highly publicized online course, "Introduction to Artificial Intelligence (AI)," was conducted by Stanford University Prof. Sebastian Thrun and Google's Director of Research, Peter Norvig, based on the Stanford AI course. It ran live in the sense that new videos were released and homework assignments collected on a weekly basis, and quizzes and exams were given at set times, while discussion logs allowed for some degree of interaction. The course attracted 160,000 students from 190 countries, 22,000 of whom finished successfully and were granted certificates of completion [1]. Shortly afterwards, MIT set up a similar approach on a new platform called MITx, offering a course in electronic circuits that attracted comparable numbers of students (https://6002x.mitx.mit.edu).

The trend to structured presentation and high production quality then accelerated remarkably, and took an entrepreneurial turn. The AI course was effectively spun off by Prof. Thrun into a Web startup called Udacity (http://www.udacity.com), which is currently live with six courses. In April of 2012, two other Stanford scientists, Profs. Andrew Ng and Daphne Koller, announced a similar newco called Coursera (https://www.coursera.org), with backing from major Silicon Valley venture capital firms. Coursera, also now live, is being stocked with courses from academic partners Stanford, Princeton University, the University of Pennsylvania, and the University of Michigan; this list was recently augmented with a tranche of a dozen more top-tier universities. And in May of 2012, barely six months after MIT had rolled out its

PLOS Computational Biology | www.ploscompbiol.org 1 September 2012 | Volume 8 | Issue 9 | e1002632


new MITx platform, they and Harvard announced that the institutions were investing $30 million each in a joint online learning initiative called edX (http://www.edxonline.org).

All of these initiatives promise to offer undiluted, highly interactive university-level courses to the public, free of charge. Moreover, there is every indication that the instruction can be effective; the U.S. Department of Education, in an exhaustive meta-analysis of 51 published head-to-head trials, found that "on average, students in online learning conditions performed better than those receiving face-to-face instruction" [2].

An Online Bioinformatics Education

Clearly a revolution in open online learning is at hand. This is a welcome addition to a movement that also encompasses open online scientific publication, of which this journal is an example. As such, this is an appropriate forum to assess the current potential for a freely accessible online bioinformatics education.

Both the completeness and the quality of such an unconventional education should be evaluated. Such judgments cannot be entirely objective, and even curricula in conventional university settings vary widely. Thus, this must ultimately be considered an opinion piece. Even its purely factual content has to be viewed as evanescent, given the rate of change in online education, and the fact that newly announced initiatives may increase the selection and quality of courses available to a considerable extent even within the year.

Even so, the first opinion offered here is that it is probably already possible for a motivated student to become a competent, employable bioinformatics professional in the comfort of his or her own home, with certain important caveats to be elaborated in the discussion at the end. By way of evidence, a suggested curriculum will be laid out that is supported by existing online resources.

This central thesis, that online bioinformatics education has in some sense arrived, can certainly be challenged on a number of counts. The fundamental question of the optimal content for bioinformatics training would probably elude universal consensus in any case, and perhaps the most that can be hoped for is that what follows will contribute meaningfully to the dialogue. Even so, the reader has a right to question both the author's qualifications and methodology in offering these opinions.

The author has advanced degrees in both biology and computer science, has published original research in both fields, and has passing familiarity with, but is by no means expert in, all of the advanced course topics described below. He has helped design academic curricula as part of a major training grant and taught at both the undergraduate and graduate levels, though not extensively, having spent most of his career in the computer and then the pharmaceutical industries. However, in the latter positions he was directly or indirectly responsible for hiring well over a hundred scientists and engineers for bioinformatics-related roles. Thus if any bias exists, it is probably in favor of the practical over the theoretical, though the author's own research is somewhat more in the latter category.

In terms of methodology, the author has personally sampled all of the main courses listed below that are currently available, as well as most of those offered as alternatives or suggested for advanced study. Of these, he has actually completed six of the main courses and seven in the latter categories (most recently, two of the inaugural offerings by Coursera), and has made significant progress in several more. In each case the main course offering for a given topic was adjudged superior to the alternatives based on a variety of criteria including coverage, production quality, availability of ancillary course material, and incorporation of the latest modular courseware technologies described above. Less tangible factors such as teaching style, clarity, and pace were also considered. Courses listed as alternatives to the main courses still met basic standards of quality, and in addition to offering redundancy often had other features that might appeal to specific students, for instance in terms of areas of emphasis. In several cases, courses were selected as main offerings despite being scheduled but not yet online; such judgments were made based on instructors' proven teaching backgrounds and in some instances after direct consultation with them on the syllabi.

Only courses offered without charge were considered. Online courses and entire degree programs for money are widely available, though troubling to some given issues of accreditation and mounting student debt. Course discussion logs on free resources like Coursera indicate a tremendous demand for online education in the developing world, and students anywhere may need to be thrifty, particularly if they are retraining or exploring career change. There are certainly extension programs of universities and other for-profit resources that offer good value-for-money in this arena, and those who can afford it should not be discouraged from taking advantage of such benefits as personalized instruction. Nevertheless, part of the challenge in the present instance is to see just how far the free resources have come. Moreover, there is the practical issue that extending the analysis to paid courses would open up a much larger set of alternatives, most of which are inaccessible to evaluation without expenditure.

Only video courses are included, either showing the instructor with slides and/or blackboard, or in screencast format. Learning from course notes only, or even disembodied audio, simply doesn't have the immediacy of the visual experience of a lecture hall or even a tablet-based screencast. At the other extreme, one could maintain that reading textbooks at one's own speed is a more efficient and focused way to learn. That is certainly true for some, and perhaps more so for experienced and mature scholars, but it is probably also true that a lecture format offers much-needed structure to the learning process for others. Moreover, cognitive psychology offers both a theoretical basis and empirical evidence for the benefits of multimedia learning [3]. In any case, most of the courses below require reading at least selections from one or more textbooks in close coordination with the lectures (though in a surprising number of cases the textbooks are freely available online).

What follows, then, is a virtual catalog for a course of study in bioinformatics. It includes both core courses and electives, as will be evident in the commentaries included with each course. Even at that, different paths are possible depending on preparation (whether the student starts with a biology and/or computer science background already) and inclination (whether the student plans to focus on bioinformatics analysis and needs less programming experience, or hopes to develop algorithms and systems that require considerably more computational sophistication). Since this virtual program awards no degrees and makes no guarantees, it will not attempt to set absolute standards for numbers of credits and distribution of core and elective subjects, but will suggest possible study threads in the penultimate section of this article.

Biology Department

Fundamentals of Biology
Source. MIT, 7.012, Profs. Eric Lander, Robert Weinberg, Tyler Jacks, Hazel



Sive, Graham Walker, Sallie Chisholm, Dr. Michelle Mischke (Fall 2011)
Link. http://ocw.mit.edu/courses/biology/7-01sc-fundamentals-of-biology-fall-2011
Provider description. Fundamentals of Biology focuses on the basic principles of biochemistry, molecular biology, genetics, and recombinant DNA. These principles are necessary to understanding the basic mechanisms of life and anchor the biological knowledge that is required to understand many of the challenges in everyday life, from human health and disease to loss of biodiversity and environmental quality.
Commentary. Anyone motivated to enter the field of bioinformatics is unlikely to need a freshman-level introduction to biology, but this one is included for the sake of completeness. The faculty are stellar, and the course has recently been converted to modular form with interactive quizzes, problem sets, exams, and additional helpful features.
Alternatives. Berkeley's Biology 1A covers similar material plus somewhat more physiology and is available in several versions taught by a range of instructors, most recently one offered in Spring 2012 (http://webcast.berkeley.edu/playlist#c,d,Biology,CF8E59B3C769FB01).
Going further. All of the remaining courses in this virtual Department extend the material in this course in various ways.

Principles of Evolution, Ecology, and Behavior
Source. Yale, EEB122, Prof. Stephen Stearns (Spring 2009)
Link. http://oyc.yale.edu/ecology-and-evolutionary-biology/eeb-122
Provider description. This course presents the principles of evolution, ecology, and behavior for students beginning their study of biology and of the environment. ... Recent advances have energized these fields with results that have implications well beyond their boundaries: ideas, mechanisms, and processes that should form part of the toolkit of all biologists and educated citizens.
Commentary. This is a modern treatment of evolution and ecology but not one especially geared to quantitative analysis, so may be considered optional for students of bioinformatics. Still, it is a valuable reminder that molecular biology is not all there is. Especially interesting is the coverage of evolutionary medicine, in which Prof. Stearns is a leading light.
Alternatives. The continuation of the first-year Berkeley program, Biology 1B, spends a third of the course covering and plant biology in more detail than is necessary for bioinformatics, but also provides a solid introduction to genetics and phylogeny that may be preferred as being more molecular (http://webcast.berkeley.edu/playlist#c,d,Biology,434C6A29FA3A4580). Another interesting alternative is the introductory course by Stanford Prof. Robert Sapolsky on "Human Behavioral Biology," which actually covers a wide swath of evolution, molecular genetics, and neuroscience (http://www.youtube.com/playlist?list=PL848F2368C90DDC3D).
Going further. The U.S. National Institutes of Health has a series of 16 invited lectures on evolution (http://nihvideoidol1.cit.nih.gov:8080/NIH/main.jsp and click on Lectures, then Evolution and Medicine). Among a number of resources inspired by the recent Darwin centennial, one of the best is the Stanford course "Darwin's Legacy" (http://www.youtube.com/playlist?list=PLF2E17B4CDCCE15F5).

Biochemistry
Source. Indian Institute of Technology (IIT), Kharagpur, BT20001, Prof. Swagata Dasgupta
Link. http://nptel.iitm.ac.in/video.php?subjectId=102105034
Provider description. Chemistry and metabolism of biopolymers (carbohydrates, lipids, proteins, nucleic acids, and nucleoproteins), vitamins, and hormones. Amino acid, primary, secondary, tertiary, and quaternary structure of proteins ... Enzymes and co-enzymes. Glycolytic pathway and TCA cycle. Electron transport and oxidative phosphorylation.
Commentary. Exposure to biochemistry in greater detail than is found in the introductory biology courses is particularly recommended for those interested in biochemical pathway analysis, metabolomics, and structural bioinformatics. With this video course we introduce a resource developed by the Indian National Programme on Technology Enhanced Learning (NPTEL), whose ambition is to build at least one version of each course offered in all of Science and Engineering in India, from BTech/BSc to PhD programs (http://nptel.iitm.ac.in). It currently offers some 110 full video courses, skewed toward engineering, but with plans for up to 400 total. The courses tend to follow very traditional syllabi and sometimes move slowly, but are generally well produced and exhaustive in their coverage. The lectures are delivered in English that is more or less accented but nearly always impeccable, and altogether make for a rather refreshing multicultural experience.
Prerequisites. Introduction to Biology. Organic Chemistry.
Alternatives. Dr. Heather Tienson of the University of California, Los Angeles, teaches the introductory course in their biochemistry series entitled "Biochemistry: Introduction to Structure, Enzymes, and Metabolism" (http://www.oid.ucla.edu/webcasts/courses/2011-2012/2012winter/chem153a-1). The Stanford ClassX streaming service has a biochemistry course taught by Prof. Lynette Cegelski, but again it is only the first in a series of three and this one does not extend to metabolism (http://classx.stanford.edu/ClassX/system/users/web/pg/view_subject.php?subject=CHEM181_WINTER_2010_2011). Oregon State University offers a two-term course in General Biochemistry taught by Dr. Kevin Ahern, both terms of which are available, but the visuals are sometimes unclear (http://www.youtube.com/playlist?list=PL850269AA28EF394A and http://www.youtube.com/playlist?list=PL347B70A1CC0D91C6). Profs. Reginald Garrett and Charles Grisham of the University of Virginia have a free online version of their textbook "Biochemistry" [4].

Genetics
Source. Berkeley, PMB 160, Profs. Robert Fischer and Jennifer Fletcher (Spring 2012)
Provider description. A consideration of plant genetics and molecular biology. Principles of nuclear and organellar genome structure and function: regulation of gene expression in response to environmental and developmental stimuli; clonal analysis; investigation of the molecular and genetic bases for the exceptional cellular and developmental strategies adopted by plants.
Link. http://webcast.berkeley.edu/playlist#c,d,PMB,2B7E0C3DBF1D43ED
Source. Berkeley, MCB C148, Profs. Daniel Barsky and Louise Glass (Spring 2011)
Provider description. Course emphasizes bacterial and archaeal genetics and comparative genomics. Genetics and genomic methods used to dissect metabolic and development processes in bacteria, archaea, and selected microbial eukaryotes. Genetic mechanisms integrated with genomic information to address integration and diversity of microbial processes. Introduction to the use of computational tools for a comparative analysis of microbial genomes and determining relationships among bacteria, archaea, and microbial eukaryotes.



Link. http://webcast.berkeley.edu/playlist#c,s,Spring_2011,59C08AE05E752758
Commentary. Together, this pair of courses provides in-depth coverage of classical genetics through modern genomics of the non-human variety. The first course, entitled "Plant Molecular Genetics," actually begins with a comprehensive introduction to general Mendelian genetics, before delving into plant genetics in detail. The student may wish to skip some of the later lectures, but they do cover many aspects of molecular genetics that are completely general. The second entry, "Microbial Genetics and Genomics," starts halfway through the actual course with the lectures of Prof. Glass, focusing on comparative genomics, and includes an extended exercise in annotation of a new microbial genome from the Joint Genome Institute. Finally, for some exposure to current human genetics, the student should take the "Genetics for Epidemiologists" short course conducted by the National Human Genome Research Institute in 2008 (http://www.youtube.com/playlist?list=PL6D747D95EBB33F2D). While this pastiche of sources may not be ideal, it touches on the major themes in this diverse subject and will give a good sense of the tools underlying many laboratory methods used in molecular biology.
Prerequisites. Introduction to Biology.
Going further. The book "Human Molecular Genetics" by Drs. Tom Strachan and Andrew Read, now in its 4th edition, goes deeper into modern techniques [5]. Though now a bit dated, a freely accessible online version of the 2nd edition is available from the National Center for Biotechnology Information (NCBI) of the U.S. National Institutes of Health (NIH) (http://www.ncbi.nlm.nih.gov/books/NBK7580).

Molecular Biology
Source. Berkeley, MCB110, Profs. Thomas Alber, Qiang Zhou and Qing Zhong (Fall 2009)
Link. http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=354820440
Provider description. Molecular biology of prokaryotic and eukaryotic cells and their viruses. Mechanisms of DNA replication, transcription, translation. Structure of genes and chromosomes. Regulation of gene expression. Biochemical processes and principles in membrane structure and function, intracellular trafficking and subcellular compartmentation, cytoskeletal architecture, nucleocytoplasmic transport, signal transduction mechanisms, and cell cycle control.
Commentary. This upper-level Berkeley course in their Biochemistry and Molecular Biology track, which is subtitled "Macromolecular Synthesis and Cellular Function," is a thorough introduction to basic cellular information processing and as such is important background for bioinformatics. The first third (taught by Prof. Alber) covers DNA replication and repair, the second third (Prof. Zhou) does RNA and protein synthesis, and the final third (Prof. Zhong) includes cell membranes, membrane proteins, trafficking, signaling, the cell cycle, and apoptosis. Note that there are some missing lectures in the first third of the Fall 2009 version, but the student can use the Fall 2008 version for Prof. Alber's lectures (http://itunes.apple.com/itunes-u/molecular-cell-biology-110/id354820355), which, however, is missing the final third of the course. Note that in all cases iTunes has the order of courses reversed in its listing. (An iTunes link is provided rather than a Berkeley Webcast link because a significant number of courses were dropped from the latter website during a redesign in 2011.)
Prerequisites. Introduction to Biology, Biochemistry, or equivalent.

Cell and Systems Biology
Source. Berkeley, MCB130, Profs. Randy Schekman, Kunxin Luo and David Drubin (Spring 2009)
Link. http://itunes.apple.com/itunes-u/molecular-cell-biology-130/id354820424
Provider description. This course is aimed at conveying an understanding of how cellular structure and function arise as a result of the properties of cellular macromolecules. An emphasis will be placed on the dynamic nature of cellular organization and will include a description of physical properties of cells (dimensions, concepts of free energy, diffusion, biophysical properties). Students will be introduced to quantitative aspects of cell biology and a view of cellular function that is based on integrating multiple pathways and modes of regulation (systems biology).
Commentary. Another upper-level Berkeley course, this one in their Cell and Developmental Biology track, offers a different take on the cell that is geared to current systems biology. Berkeley does not allow this course and the previous one to be taken together for elective credit, but the overlap is mainly with the last third of the Molecular Biology course, so students may want to take only the first two thirds of that course and then this course in its entirety.
Prerequisites. Introduction to Biology, Biochemistry, or equivalent.
Going further. One particular subfield of biology that constitutes an exceedingly complex system is immunology, which has even spawned its own discipline of immunoinformatics. There are several introductory immunology courses available, including a shorter one presented from a medical perspective by Dr. Harris Goldstein of Albert Einstein Medical College (http://www.youtube.com/playlist?list=PL5703ABB5D07584D7) and another from a molecular and evolutionary standpoint by Prof. Gregory Beck of the University of Massachusetts (http://itunes.apple.com/us/itunes-u/intro-to-immunology-biol-378/id476313031).

Eukaryotic Gene Expression
Source. Indian Institute of Science (IISc), Bangalore, Prof. P.N. Rangarajan
Link. http://nptel.iitm.ac.in/courses/104108056
Provider description. [Topics include] cis-acting elements and trans-acting factors ... domain structure of eukaryotic transcription factors ... role of chromatin ... synthesis of mRNA, rRNA, and tRNA ... cell surface receptors ... intracellular receptors ... regulation of gene expression during development ... recombinant protein expression systems ... gene therapy and transgenic technology
Commentary. This NPTEL course offers a significantly more detailed view of gene regulation than the courses above, though it overlaps with them. It is not absolutely current but will still be of interest to those concerned with the bioinformatics of signaling pathways and genetic networks. For the larger perspective students should also view a seminar by Dr. Robert Tjian on "The Molecular Biology of Gene Regulation" (http://www.ibioseminars.org/lectures/bio-mechanisms/robert-tjian.html) and, for more recent aspects of microRNA-based regulation, talks by Dr. Adrian Ferré-D'Amaré on "Catalytic and Gene Regulatory RNAs" (http://videocast.nih.gov/launch.asp?17170), by Dr. Victor Ambros on "MicroRNA Pathways in Animal Development" (http://videocast.nih.gov/launch.asp?14844), and by Dr. Witold Filipowicz on "Regulating the Regulators: Mechanisms Controlling Function and Metabolism of microRNAs" (http://videocast.nih.gov/launch.asp?17234).
Prerequisites. Introduction to Biology and Biochemistry or equivalent.



Computational Molecular Biology uiuc.edu/Training/SumSchool/lectures.html). Current Topics in Genome Analysis
Source. Stanford, Biochem 218, Prof. The laboratory of Prof. Burkhard Rost of Source. National Human Genome
Doug Brutlag (Spring 2012) the Technische Universitat Munchen Research Institute (Winter 2012)
Link. http://biochem218.stanford.edu maintains several short video courses with Link. http://www.genome.gov/12514288
Provider description. a separate slides, having titles such as Pro- Provider description. A lecture
practical, hands-on approach to the field tein Prediction and Computational Sys- series covering contemporary areas in
of computational molecular biology. The tems Biology (http://rostlab.org/cms/ genomics and bioinformatics.
course is recommended for both molecular teaching/materials). The Canadian Bioin- Commentary. This series of 13
biologists and computer scientists desiring formatics Workshops provide a number of extended guest lecturers in course format
to understand the major issues concerning short courses annually on topics including is offered every other year by the National
analysis of genomes, sequences and pathway and network analysis, high- Human Genome Research Institute
structures. throughput sequencing data, metabolo- (NHGRI) of the U.S. National Institutes
Commentary. A wide-ranging bioin- mics, microarrays, and cancer genomics, of Health (NIH). Coverage includes bio-
formatics practicum covering aspects of all of which are archived (http:// logical sequence analysis, genome brow-
sequence analysis, genomics, phylogenetic bioinformatics.ca/workshops/open_access); sers, regulatory and epigenomic landscapes
reconstruction, gene regulation, and meta- some lecture videos are missing, but the slide of mammalian genomes, next-generation
bolic networks. There is an excellent set of sets are complete. sequencing technologies, population gene-
slides in PDF format, which should be tics, genome-wide association studies,
viewed in parallel with the video lectures, pharmacogenomics, large-scale expression
and a set of practical how-to videos as well. Introduction to Genome Science analysis, genomic medicine, and genomics
This course provides a biologists approa- Source. University of Pennsylvania of microbes and microbiomes. Handouts
ch to computational biology, and is thus on Coursera, Profs. John Hogenesch and are provided. As part of this course,
listed separately from a corresponding John Isaac Murray (Fall 2012) students should also do the NHGRI
course in the Computer Science Depart- Link. https://www.coursera.org/course/ tutorial Next-Gen 101 from 2011,
ment. The emphasis here is more on how genomescience which has 9 shorter lectures on whole-
to use the algorithms than on the details of Provider description. This course exome sequencing and analysis (http://
their construction. serves as an introduction to the main la- videocast.nih.gov/launch.asp?16885), as
Prerequisites. Molecular Biology. boratory and theoretical aspects of genomics well as the 1000 Genomes Tutorial of 6
Alternatives. MIT offers Genomics and is divided into themes: genomes, even shorter lectures on this important
and Computational Biology by Prof. genetics, functional genomics, systems resource for bioinformatics (http://www.
George Church (http://ocw.mit.edu/ biology, single cell approaches, proteomics, youtube.com/playlist?list=PLF61543E11FF
courses/health-sciences-and-technology/hst- and applications. We start with the basics, 78240).
508-genomics-and-computational-biology- DNA sequencing and the genome project, Prerequisites. Molecular Biology.
fall-2002), but the online version is now 10 then move to high throughput sequenc- Going further. The EMBO Practi-
years old, and is audio-only so that the user ing methods and applications. Next we cal Course on Analysis of High-Throughput
must coordinate the lecture with a sepa- introduce principles of genetics and then Sequence Data (http://www.ebi.ac.uk/
rate, rather massive set of slides. One hopes apply them in clinical genetics and other training/online/course/embo-practical-
that the recently announced edX initiative large-scale sequencing projects. In the course-analysis-high-throughput-seq) is
will provide a Harvard-MIT course in this functional genomics unit, we start with highly recommended as a hands-on
area soon. A short practical course on RNA expression dynamics, analysis of introduction to modern genomic analysis.
DNA/Protein Sequence Analysis is of- alternative splicing, epigenomics and ChIP- It closely coordinates video lectures with
fered by Prof. Amy Denton of California seq, and metagenomics. Model organisms detailed analysis exercises, with tutorial
State University, Channel Islands (http:// and forward and reverse genetics screens are handouts and code supplied, using R and
itunes.apple.com/WebObjects/MZStore. then discussed, along with quantitative trait Bioconductor. Topics include short read
woa/wa/viewPodcast?id=472584215). The locus (QTL) and eQTL analysis. After that, analysis, ChIP-Seq data and analysis,
author is aware of at least one graduate- we introduce integrative and single cell statistical concepts, differential expression
level course in bioinformatics that is in genomics approaches and systems biology. by RNA-Seq, and allele-specific expres-
preparation for one of the major online Finally, we conclude by introducing sion and eQTL.
venues, but is as yet unannounced. While proteomic approaches.
lacking any videos, Stanford Prof. Russ Commentary. This anticipated Coursera Biological Seminars
Altman's course Representations and entry promises to touch on all the hot Source. Howard Hughes Medical
Algorithms for Computational Molecular topics in genomics, chip technologies, and Institute, iBioSeminars
Biology has a wealth of notes, slides, next-generation sequencing, making it Link. http://www.ibioseminars.org
readings, and other useful links (http:// central to this curriculum. It will be closely Provider description. iBioSeminars
helix-web.stanford.edu/bmi214-2006). based on the long-established core course in is a freely available library of video seminars
Going further. The University of the Penn Graduate Group in Genomics and from outstanding scientists, including many
Illinois at Urbana-Champaign conducted Computational Biology, and in fact the HHMI investigators. These lectures, which
a Summer School on Computational instructors plan to use the material with describe on-going research in leading
Approaches for Simulation of Biological their own students. Prof. Hogenesch in laboratories, feature an extensive intro-
Systems in 2003 that posted a number of particular has a strong computational duction to the subject matter, making them
videos relating to biophysical modeling orientation and indicates that the material accessible to advanced undergraduates or
and bioinformatics analyses of macro- taught in this course will be bioinformatics- beginning graduate students and researchers
molecular structures, a topic otherwise ready (personal communication). outside of the specific field. The main subject
underrepresented here (http://www.ks. Prerequisites. Molecular Biology. areas are biological mechanisms, cell biology

PLOS Computational Biology | www.ploscompbiol.org 5 September 2012 | Volume 8 | Issue 9 | e1002632


and medicine, developmental biology and evolution, chemical biology and biophysics, and global health and energy.
Commentary. Much of a biologist's advanced training is down to departmental seminars, invited speakers, conferences, etc. This star-studded collection amassed by the Howard Hughes Medical Institute now has some 80 extended seminars covering a wide range of topics, including some that are underrepresented in the available online courseware, such as neurosciences and developmental biology. An important side benefit of learning the scientific content itself is the educational experience of becoming familiar with the names, faces, and presentation techniques of many of the top scientists in the American biological community.
Alternatives. A particularly rich lode of talks by distinguished scientists is the NIH Director's Wednesday Afternoon Lecture series (http://videocast.nih.gov/PastEvents.asp?c=3). While there are almost 15 years' worth of these videos available for mining, the online student might be well advised to make a habit of tuning in to the live streaming of these events, for more of a flavor of the campus experience.

Mathematics Department

Differential Equations
Source. MIT, 18.03SC, Prof. Arthur Mattuck (Fall 2011)
Link. http://ocw.mit.edu/courses/mathematics/18-03sc-differential-equations-fall-2011
Provider description. The laws of nature are expressed as differential equations. Scientists and engineers must know how to model the world in terms of differential equations, and how to solve those equations and interpret the solutions. This course focuses on the equations and techniques most useful in science and engineering.
Commentary. Bioinformatics students who have somehow only studied math through integral calculus may find that some knowledge of differential equations is an important addition to their skill set. Not only are differential equations a mainstay of mathematical biology in areas such as enzyme kinetics and population dynamics, but they are the basis of many approaches to modeling of biological systems. Prof. Mattuck's development of the subject is fairly traditional, but is supplemented by updated wrappers in the MIT courseware that provide helpful visualizations and simulations of the sort to which many modern treatments of the subject are trending.
Prerequisites. Differential and Integral Calculus. Refreshers are widely available online, including vintage videos made four decades ago at MIT (when the author of this article was learning the subject there!) called "Calculus Revisited" (http://ocw.mit.edu/resources/res-18-006-calculus-revisited-single-variable-calculus-fall-2010 and http://ocw.mit.edu/resources/res-18-007-calculus-revisited-multivariable-calculus-fall-2011). The first edition of Prof. Strang's (see next course) calculus textbook is freely available online [6], as is one by U.C. San Diego's Prof. Gil Williamson [7].
Alternatives. The ancient MIT videos mentioned above also round out sophomore-level math with coverage of Complex Variables, Differential Equations, and Linear Algebra (http://ocw.mit.edu/resources/res-18-008-calculus-revisited-complex-variables-differential-equations-and-linear-algebra-fall-2011). The free online resource "Interactive Differential Equations" can be helpful for visualization (http://www.aw-bc.com/ide).
Going further. For classic applications of differential equations to mathematical modeling in biology, Prof. Jeffrey Chasnov of the Hong Kong University of Science and Technology makes his course notes freely available in book form (http://www.math.ust.hk/~machas/mathematical-biology.pdf). For a more recent perspective, Drs. Adam Arkin and John Doyle gave "A Short Course on Mathematical Modeling of Signaling Mechanisms in Biology" at NIH (http://videocast.nih.gov/launch.asp?9948). For those who will not be going further with formal math but would like to acquire some tools for self-defense in this arena, MIT Prof. Sanjoy Mahajan provides a free online textbook called "Street-Fighting Math" (http://ocw.mit.edu/courses/mathematics/18-098-street-fighting-mathematics-january-iap-2008/readings/sf_math.pdf).

Numerical Methods
Source. University of South Florida, EML3041, Prof. Autar Kaw (Summer 2012)
Link. http://numericalmethods.eng.usf.edu
Provider description. Numerical methods are techniques to approximate mathematical procedures. Approximations are needed because we either cannot solve the procedure analytically or because the analytical method is intractable. In this course, you will learn the numerical methods for the following mathematical procedures and topics: Differentiation, Nonlinear Equations, Simultaneous Linear Equations, Interpolation, Regression, Integration, and Ordinary Differential Equations. Calculation of errors and their relationship to the accuracy of the numerical solutions is emphasized throughout the course.
Commentary. Numerical methods are an important skill set for those who will actually need to solve differential equations and other formulations that have no easy closed form expression, which applies to a lot of real-world mathematical biology. While math packages can handle much of the dirty work, the real pros need to understand what's under the hood. While Prof. Kaw's course at the University of South Florida is listed here, this link is actually for an independent e-learning course funded by major grants to Prof. Kaw from the U.S. National Science Foundation and used by a variety of universities. It is modular, including not only hundreds of short videos but also quizzes, slides, examples, and demonstrations using a free Mathematica Player. An associated textbook is also freely available online, a chapter at a time [8]. Sample code is provided in each of Maple, MathCad, Mathematica, and MatLab, none of which are free, but the Octave free software package (http://www.gnu.org/software/octave) closely approaches the core functionality of MatLab, which is heavily used in this and several other listed courses for numerical computation and matrix math.

Linear Algebra
Source. MIT, 18.06SC, Prof. Gilbert Strang (Fall 2011)
Link. http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011
Provider description. This course covers matrix theory and linear algebra, emphasizing topics useful in other disciplines such as physics, economics and social sciences, natural sciences, and engineering.
Commentary. Prof. Strang is a legend as an educator, charmingly diffident in his delivery yet never lacking in clarity. He has long held that the subject of linear algebra should be given as much or more teaching emphasis than calculus and differential equations, and the rise of Big Data is now proving him correct beyond any doubt. No bioinformatics professional dealing with high-dimensional data can afford to neglect an understanding of matrix math, with many bioinformatics methods currently making use of various matrix factorizations, transformations, decompositions, and eigenwhatevers.
Alternatives. The charismatic Prof. N.J. Wildberger of the University of New South Wales offers a similar course
(http://www.youtube.com/playlist?list=PL01A21B9E302D50C1). Prof. Jim Hefferon of Saint Michael's College has a nice introductory online textbook (http://joshua.smcvt.edu/linearalgebra).
Going further. The Harvard Extension School has an advanced course in "Abstract Algebra" taught by Prof. Benedict Gross, starting from a linear algebra foundation to study group theory, vector spaces, fields, etc. (http://www.extension.harvard.edu/open-learning-initiative/abstract-algebra). Prof. Edwin Connell of the University of Miami has a free online textbook "Elements of Abstract and Linear Algebra" with a similar approach (http://www.math.miami.edu/~ec/book). While these may be overkill for bioinformatics, they might just inspire some to seek deeper insights into structures in large datasets. Prof. Strang himself teaches two follow-on video courses in applied mathematics, developing his linear algebra-oriented approach to networks, structures, estimation, Fourier analysis, convolution filtering, etc. (http://ocw.mit.edu/courses/mathematics/18-085-computational-science-and-engineering-i-fall-2008 and http://ocw.mit.edu/courses/mathematics/18-086-mathematical-methods-for-engineers-ii-spring-2006). His magisterial self-published textbook for these courses includes a treatment of microarray analysis to discover eigengenes [9].

Statistics
Source. Princeton on Coursera, Prof. Andrew Conway (Fall 2012)
Link. https://www.coursera.org/course/stats1
Provider description. Statistics One is designed to be a friendly introduction to very simple, very basic, fundamental concepts in statistics. Random sampling and assignment. Distributions. Descriptive statistics. Measurement. Correlation. Causality. Multiple regression. Ordinary least squares. Confidence intervals. Statistical power. t-tests, chi-square tests. Analysis of Variance.
Commentary. Only those with no exposure at all to statistics, or those who would benefit from a refresher, should feel the need to take this rather elementary introduction, but the skills are certainly essential to bioinformatics analysis. If necessary it can also provide a gentle lead-in to the "Introduction to Probability" course, which in turn will be required for more advanced work in statistics. The course makes use of the free statistical software package R (http://www.r-project.org), which bioinformatics practitioners should have in their toolbox not only for classical statistical tests taught here but for more advanced applications such as linear and nonlinear modeling, time-series analysis, classification, clustering, etc.
Alternatives. Udacity is offering a similar introductory course by Stanford Prof. Sebastian Thrun (http://www.udacity.com/overview/Course/st101). Profs. Susan Dean and Barbara Illowsky of De Anza College offer an "Elementary Statistics" video course that also has a free online textbook and a full complement of quizzes, exams, and assignments (http://sofia.fhda.edu/gallery/statistics/index.html). For a stimulating change, one can consider learning or reviewing the basics of statistics from the perspectives of other disciplines. For instance, another way to pick up R while learning a little epidemiology is through Berkeley Prof. Tomas Aragon's course in "Applied Epidemiology using R" (http://www.youtube.com/view_play_list?p=1CBCB8C53D0CBE1F). A somewhat more detailed (but also considerably more protracted) treatment of basic research statistics is to be found in Berkeley Prof. Frederic Theunissen's "Research and Data Analysis in Psychology" (http://www.youtube.com/view_play_list?p=A07B0BAB1D82C53C). For those with more math and less time, an "Introduction to Statistical Methods for High-Energy Physics" by Prof. Glen Cowan (http://videolectures.net/cernstudentsummerschool09_cowan_is) is a four-lecture overview of material taught in the University of London course.
Going further. Prof. Wim Krijnen of Hanze University in the Netherlands has a free online textbook "Applied Statistics for Bioinformatics using R" [10] that does a lovely job of combining a course in statistics with instruction in R and more advanced applications to bioinformatics such as microarray analysis. Further study of statistics should be undertaken only after completing the "Introduction to Probability" below.

Introduction to Probability
Source. Harvard, Statistics 110, Prof. Joseph Blitzstein (Fall 2011)
Link. http://itunes.apple.com/us/course/statistics-110-probability/id502492375
Provider description. A comprehensive introduction to probability. Basics: sample spaces and events, conditional probability, and Bayes' Theorem. Univariate distributions: density functions, expectation and variance, Normal, t, Binomial, Negative Binomial, Poisson, Beta, and Gamma distributions. Multivariate distributions: joint and conditional distributions, independence, transformations, and Multivariate Normal. Limit laws: law of large numbers, central limit theorem. Markov chains: transition probabilities, stationary distributions, convergence.
Commentary. Bioinformatics methods depend on statistics to a much greater degree and in much greater depth than biologists typically encounter in their training for analysis of variance and experimental design. Consequently a solid foundation in probability is de rigueur, particularly in preparation for data mining and machine learning applications. Prof. Blitzstein has an unintimidating, even laid-back style, always striving to convey valuable intuitions, but does not lack in rigor or depth of coverage.
Prerequisites. As noted above, those who lack even a basic working knowledge of statistics should take Statistics One, which can also serve as a less demanding lead-in to this course.
Alternatives. UCLA offers "Probability for Life Science" (Math 3C), a somewhat gentler approach to the topic, taught by the late Prof. Herbert Enderton (best known for his work in mathematical logic) (http://www.youtube.com/playlist?list=PL5BE09709EECF36AA&feature=plcp). The Harvard University Extension School apparently competes with the mother ship by fielding a video course by Prof. Paul Bamberg entitled "Sets, Counting, and Probability" (http://www.extension.harvard.edu/open-learning-initiative/math-sets-probability). A good textbook entitled "Introduction to Probability" by Swarthmore College Prof. Charles Grinstead and Dartmouth College Prof. J. Laurie Snell is available in a free online version [11].
Going further. IIT Kharagpur offers two courses through NPTEL that start with a more mathematically intensive treatment of probability founded in measure theory (usually kept behind the curtain for non-mathematicians), but then extend it in two different directions: "Probability and Statistics" by Prof. Somesh Kumar (http://nptel.iitm.ac.in/courses/111105041) and "Probability and Random Processes" by Prof. Mrityunjoy Chakraborty (http://www.youtube.com/playlist?list=PLD85E88483F782338). One flavor of stochastic processes that is especially important in bioinformatics is taught in "Introduction to Markov Processes" by Prof. Christof Schütte, head of the Biocomputing Group at the Freie Universität Berlin (http://www.networkmaths.ie/videos/list_videos.php?course=mar). In terms of books, a quick tour of statistical inference suited to a computer science world view can be found in the ambitiously titled "All of Statistics" by Carnegie-Mellon
University Prof. Larry Wasserman [12]. For a treatment of probability, statistics, and stochastic processes that makes reference to bioinformatics throughout, see the book "Statistical Methods in Bioinformatics" by University of Pennsylvania Prof. Warren Ewens and Gregory Grant [13]. The first edition of Stanford Prof. Robert Gray's "Probability, Random Processes, and Ergodic Properties", since reissued in a revised second edition, is freely available online [14].

Automata
Source. Stanford, CS154 on Coursera, Prof. Jeffrey Ullman (Spring 2012)
Link. https://www.coursera.org/course/automata
Provider description. The course covers four broad areas: (1) Finite automata and regular expressions, (2) Context-free grammars, (3) Turing machines and decidability, and (4) the theory of intractability, or NP-complete problems.
Prerequisites. Data Structures or equivalent. Prof. Ullman recommends portions of his free online textbook "Foundations of Computer Science" as preparation [15]. The optional programming assignments require Java or Python.
Commentary. Despite the name, this course also extends to formal language theory and introduces tractability. The primary attraction of this Coursera offering is its illustrious instructor, who literally wrote the book on automata (and on databases, on algorithms, etc.). It's hard to imagine a better way for biologists to be introduced to the theory of computation. Topics such as automata and grammars are important in areas like pattern matching and RNA fold prediction, while an awareness of tractability and decidability is essential in contemplating algorithmic approaches to new problems. Perhaps most importantly, as Prof. Ullman points out, surveys of Stanford grads show that this course was one of the most useful in their subsequent careers, for the mindset it engendered in solving many real-world computational challenges.
Alternatives. For a somewhat more extensive treatment, the Harvard Extension School has an outstanding "Introduction to Formal Systems and Computation" by Prof. Harry Lewis (http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=429428100). A "Theory of Automata, Formal Languages and Computation" course is offered by Prof. Kamala Krithivasan of IIT Madras through NPTEL, which includes lectures on natural language processing and DNA computing (http://nptel.iitm.ac.in/courses/106106049). The book by MIT Prof. Michael Sipser is standard [16], but for a free online alternative try the text by the late Prof. Eitan Gurari of Ohio State University [17].

Discrete Math
Source. Stony Brook University, Prof. Steven Skiena, CSE 547 (1999)
Link. http://www.cs.sunysb.edu/%7Ealgorith/math-video
Provider description. The mathematical analysis of algorithms uses a variety of topics from discrete mathematics: combinatorial analysis, number theory, and graph theory. The purpose of this course is to provide fluency with summations, congruences, generating functions, graph theory, and other tools of the trade. The emphasis will be on learning how to attack and solve problems.
Commentary. Discrete math provides much of the theoretical foundation for computer science. At its more rarefied levels it must be considered an elective for the purposes of bioinformatics. Nevertheless it is important in the analysis of algorithms and even certain aspects of biology, and for those with ambitions to speak at RECOMB or publish in the Journal of Computational Biology, this sort of course is a necessary first step. The video is of marginal quality, displaying old-fashioned hand-written transparencies, so the student may wish to consider the alternatives below, but this course has the considerable advantage of closely following the truly gem-like textbook "Concrete Mathematics" by U.C. San Diego's Prof. Ronald Graham, Stanford's Prof. Donald Knuth and Oren Patashnik [18].
Alternatives. Udacity has an introductory course in "Logic and Discrete Mathematics" by Dr. Jonathan Farley, a mathematician (http://www.udacity.com/overview/Course/cs221). Prof. Kamala Krithivasan of IIT Madras also teaches a comprehensive math-oriented course in "Discrete Structures" via NPTEL (http://nptel.iitm.ac.in/video.php?subjectId=106106094). A textbook entitled "A Short Course in Discrete Mathematics" is now available online for free, and offers a traditional approach by U.C. San Diego Profs. Edward Bender and Gil Williamson [19].
Going further. Several topics that fall under the rubric of discrete math are covered more extensively by other courses in this curriculum, such as "Introduction to Probability" and "Analytic Combinatorics". Additional topics in discrete math include Boolean algebra and mathematical logic, which are very well covered in a Coursera offering by Stanford Prof. Michael Genesereth (https://www.coursera.org/course/intrologic).

Analytic Combinatorics
Source. Princeton on Coursera, Prof. Robert Sedgewick (Spring 2013)
Link. https://www.coursera.org/course/introACpartI
Provider description. Analytic Combinatorics aims to enable precise quantitative predictions of the properties of large combinatorial structures. The theory has emerged over recent decades as essential both for the scientific analysis of algorithms in computer science and for the study of scientific models in many other disciplines, including probability theory, statistical physics, computational biology and information theory. Part I of this course covers recurrence relations, generating functions, asymptotics, and fundamental structures such as trees, permutations, strings, tries, words, and mappings, in the context of applications to the analysis of algorithms.
Commentary. Although more narrowly focused than the preceding entry, many will prefer this course as an entry point to discrete math because it will be on Coursera. It should serve as good training for the mathematical mindset and rigor of the subject area in general, and relative to some other treatments of combinatorics it will have the advantage of being closely tied to algorithms. In addition, its textbook will be made freely available online [20].
Alternatives. There is a free online version of another textbook by U.C. San Diego Profs. Edward Bender and Gil Williamson entitled "Foundations of Combinatorics with Applications" [21].
Going further. Prof. Sedgewick will also be offering Part II of this course on Coursera (https://www.coursera.org/course/introACpartII), delving further into his approach to generating functions. For a deep mathematical exploration of generating functions there is a free online version of a textbook by University of Pennsylvania Prof. Herbert Wilf, with the intriguing title "generatingfunctionology" (sic) [22].

Networks: Theory and Application
Source. University of Michigan, SI 508, Prof. Lada Adamic (Winter 2009)
Link. http://open.umich.edu/education/si/si508/fall2008
Provider description. The course covers topics in network analysis, from social networks to applications in information networks such as the Internet. I will introduce basic concepts in network
theory, discuss metrics and models, use software analysis tools to experiment with a wide variety of real-world network data, and study applications to areas such as information retrieval.
Commentary. Networks have a central place in current approaches to systems biology, and this course introduces important ideas about their forms and properties, using the Gephi open platform (http://gephi.org) for visualization and analysis. This course does not actually have any video in its original form, but a version of it re-labelled "Social Network Analysis" is being offered on Coursera in Fall 2012 (https://www.coursera.org/course/sna). Prof. Adamic, trained in physics like many in the field, has taught the material at both an undergraduate and graduate level, and the online version will be less demanding so as to be more generally accessible (personal communication). However, in addition to doing the online course's optional programming assignments (in R or NetLogo) the more advanced student can and should make use of the rich array of slides, tutorials, demonstrations, and sample data in the original course posting, thus using the videos as a framework to explore the course materials in greater depth. For purposes of bioinformatics, students should also take the online tutorials associated with the ubiquitous Cytoscape platform (http://www.cytoscape.org), and apply the learnings from the course itself to biological datasets wherever possible.
Alternatives. The wide-ranging textbook "Networks, Crowds and Markets" by Cornell Profs. David Easley and Jon Kleinberg is excellent, and a prepublication draft is available online for free [23].
Going further. Graph theory contributes a rich foundation of techniques to current network theory as well as underlying a large branch of the field of algorithms. Students will have some exposure to graph theory in both this course and the Algorithms course, but can find a much fuller treatment in the short course "Graph Theory and Network Analysis" taught by Prof. Paul van Dooren of the Université catholique de Louvain (http://www.networkmaths.ie/videos/list_videos.php?course=gra), which covers not only the math (at a fairly intuitive level) but also its application to practical problems such as graph similarity, ranking, clustering, etc. Unfortunately a few lectures are missing, but all the slides are separately available (http://perso.uclouvain.be/paul.vandooren/DublinCourse.pdf). A still more comprehensive treatment of graph theory proper is offered by Prof. L. Sunil Chandran of IISc Bangalore through NPTEL (http://nptel.iitm.ac.in/courses/106108054).

Applied Optimization
Source. Purdue University, Profs. Ragu Balakrishnan and Stephen Cauley (Summer 2009)
Link. http://www.networkmaths.ie/videos/list_videos.php?course=opt-2
Provider description. 1) The basic optimization problem: a) general formulation, b) special cases, c) motivating examples. 2) Linear programming: a) general form, b) Simplex method, c) applications in network flow. 3) Convex optimization: a) algorithms, b) applications. 4) General optimization: a) mixed integer programming, b) algorithms and heuristics.
Commentary. Optimization is a vast field, often associated with operations research or engineering disciplines but not seen as a core aspect of bioinformatics to date. Nevertheless applications can be found and are emerging in systems biology, modeling, experimental design, metabolic engineering, and now synthetic biology. This well-produced introductory course was actually taught by Purdue faculty in a summer program at Trinity College Dublin, the Network Mathematics Graduate Programme, along with several other courses listed in this curriculum. (Be sure to select the high resolution option on the video, and if the built-in player misbehaves simply download the MP4 files.) Students should supplement the course with the seminar "Combinatorial Optimization in Bioinformatics" by Prof. Clarisse Dhaenens of the University of Lille (http://videolectures.net/prib2010_dhaenens_oaab).
Prerequisites. Differential Equations, Linear Algebra.
Alternatives. A standard introduction to optimization is offered by Prof. Prabha Sharma of IIT Kanpur through NPTEL in "Linear Programming and Extensions" (http://nptel.iitm.ac.in/courses/111104027).
Going further. Stanford Engineering has two advanced courses in "Convex Optimization" by the estimable Prof. Stephen Boyd (http://see.stanford.edu/see/courseinfo.aspx?coll=2db7ced4-39d1-4fdb-90e8-364129597c87 and http://see.stanford.edu/see/courseinfo.aspx?coll=523bbab2-dcc1-4b5a-b78f-4c9dc8c7cf7a); the textbook is available online for free [24]. Other free online books cover heuristic optimization methods that are of interest in bioinformatics, such as simulated annealing and genetic algorithms. One, by Prof. Sean Luke of George Mason University, offers general coverage [25], while another by Profs. Riccardo Poli and William Langdon of Essex and Prof. Nicholas McPhee of the University of Minnesota at Morris focuses on genetic algorithms [26].

Dynamical Systems and Chaos
Source. Texas A&M, Math 614, Prof. Michael Pilant (2004)
Link. http://www.math.tamu.edu/~mpilant/math614
Provider description. Discrete maps; continuous flows; dynamical systems; Poincaré maps; symbolic dynamics; chaos, strange attractors; fractals; computer simulation of dynamical systems.
Commentary. This should be considered an advanced elective for mathematically talented students interested in a deep understanding of dynamical systems modeling in biology. It is an individual effort by a math professor, in screencast format, with a wealth of ancillary web resources including training in MatLab. (From the main page, click "Video Lectures" on the left, and then "Archival Videos" at the top.)
Prerequisites. Differential Equations is essential, and Linear Algebra highly recommended.
Alternatives. IIT Kharagpur through NPTEL offers a course in "Chaos, Fractals, and Dynamic Systems" by Prof. Soumitro Banerjee that is similarly exhaustive but approaches the subject from an engineering perspective (http://nptel.iitm.ac.in/video.php?subjectId=108105054).

Information Theory
Source. Stanford ClassX, EE376A, Prof. Tom Cover (Winter 2011)
Link. http://171.64.93.201/ClassX/system/users/web/pg/view_subject.php?subject=EE376A_WINTER_2010_2011
Provider description. The fundamental ideas of information theory. Entropy and intrinsic randomness. Data compression to the entropy limit. Huffman coding. Arithmetic coding. Channel capacity, the communication limit. Gaussian channels. Kolmogorov complexity. Asymptotic equipartition property. Information theory and Kelly gambling. Applications to communication and data compression.
Commentary. It goes without saying that much of molecular biology deals with the storage and transmission of information, which by itself makes information theory a proper topic of study for bioinformatics. Basic elements of the theory are important in machine


learning approaches to data mining and appear frequently in bioinformatics tools and algorithms, including sequence motif analysis and many other applications. However, the mathematical depth of this course will only be necessary for serious theorists. The late Prof. Cover was the author of the standard textbook in the field [27].
Prerequisites. Introduction to Probability and general mathematical sophistication.
Alternatives. Stanford Prof. Tsachy Weissman will offer a new online version of this course on Coursera (http://www.infotheory-class.org). While it would be a shame to miss the chance to learn this material at the feet of the esteemed Prof. Cover, this newer version will provide the distinct benefits of a structured, modular format. The first edition of the book Entropy and Information Theory by Stanford Prof. Robert Gray, just reissued in a second edition, is available free online [28], as is Information Theory, Inference, and Learning Algorithms by Cambridge University Prof. David MacKay [29].

Signals and Systems
Source. MIT, RES.6-007, Prof. Alan Oppenheim (1987)
Link. http://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011
Provider description. The course presents and integrates the basic concepts for both continuous-time and discrete-time signals and systems. Signal and system representations are developed for both time and frequency domains. These representations are related through the Fourier transform and its generalizations, which are explored in detail. Filtering and filter design, modulation, and sampling for both analog and digital systems, as well as exposition and demonstration of the basic concepts of feedback systems for both analog and digital systems, are discussed and illustrated.
Commentary. Traditional engineering approaches to signal processing and linear systems theory have not had a huge impact in bioinformatics to date, despite the fact that signal transduction and transmission are central aspects of cell biology. Still, some training in the engineering math that relates to feedback systems, filters, convolution, and the like is recommended as an elective, given trends in areas like systems biology and neuroinformatics. Nor should it be forgotten that Fourier analysis is at the foundation of crystallographic structure determination, and that signal processing is directly relevant to instrumentation used in omics and image processing, among other things. Although this course was recorded a quarter-century ago, it still feels very well put-together, and the eminent Prof. Oppenheim wrote the definitive text in the subject [30].
Prerequisites. Differential Equations. Linear Algebra and Probability are helpful. While not strictly speaking a prerequisite, the MITx course Circuits and Electronics introduces some of the material in Signals and Systems in a beautifully structured format, as well as teaching circuit theory that may also be very useful in studying biological networks and the neurosciences (https://6002x.mitx.mit.edu).
Alternatives. Prof. Mark Wickert of the University of Colorado at Colorado Springs has put up a very nice screencast series with good notes (http://www.eas.uccs.edu/wickert/ece2610). Prof. Richard Baraniuk of Rice University, who has been a long-time advocate for open source learning (http://www.ted.com/talks/richard_baraniuk_on_open_source_learning.html), maintains a free online textbook (http://cnx.org/content/col10064). In a somewhat different vein, Stanford Engineering offers an excellent course by Prof. Brad Osgood on The Fourier Transform and its Applications that adopts more of a deep mathematical than an engineering approach to the subject, so for those who passionately prefer i to j (and you know who you are) this may be a better choice (http://see.stanford.edu/see/courseInfo.aspx?coll=84d174c2-d74f-493d-92ae-c3f45c0ee091).
Going further. Prof. Wickert (see above) has also created an advanced video course on Statistical Signal Processing that again has good notes (http://www.eas.uccs.edu/wickert/ece5615). The book Introduction to Statistical Signal Processing by Stanford Prof. Robert Gray and University of Maryland Prof. L. D. Davisson is freely available online [31]. For a deeper dive into modern linear systems theory, Stanford Engineering has a wonderful course by Prof. Stephen Boyd called Introduction to Linear Dynamical Systems (http://see.stanford.edu/see/courseinfo.aspx?coll=17005383-19c6-49ed-9497-2ba8bfcfe5f6). Linear Algebra is an absolute prerequisite for both these advanced courses, and the former would require Probability as well.

Computer Science Department

Introduction to Computer Science and Programming
Source. MIT, 6.00SC, Prof. John Guttag (Fall 2008)
Link. http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-00sc-introduction-to-computer-science-and-programming-spring-2011
Provider description. This subject is aimed at students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems. It also aims to help students, regardless of their major, to feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class will use the Python programming language.
Commentary. For biologists possessing only end-user experience with computers, several courses are available that offer a modest introduction to actual programming, generally in the context of an overview of computer science. This one is chosen somewhat arbitrarily, but in particular because it makes use of Python. While opinions about languages vary, a case can be made that Python is both well accepted in the bioinformatics community and pedagogically useful in encompassing many features (perhaps even too many) of object-oriented, imperative, and functional programming, in addition to ample libraries and bindings to other languages and resources. It is also a good compromise between a traditional Java or C++ language approach, which only serious bioinformatics developers will need, and the sort of lightweight scripting for string, file, and process manipulation (think Perl) that every analyst will do sooner or later. In addition, Python is simply easier to manage on a home laptop, a requirement for many online learners.
Alternatives. The Harvard Extension School has an Intensive Introduction to Computer Science that, instead of occupying a Pythonesque middle ground, uses the C language on the one hand and PHP and JavaScript on the other (http://www.extension.harvard.edu/open-learning-initiative/introduction-computer-science). MIT also offers a fascinating course that combines introductory Python programming with aspects of electrical engineering, such as signals and systems, circuits, probability and planning (http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-01sc-introduction-to-electrical-engineering-and-computer-science-i-spring-2011). Prof. David Evans of the University of Virginia offers an Intro to Computer Science (CS101) on Udacity that teaches Python by building a web crawler (http://www.udacity.com/overview/Course/cs101/CourseRev/apr2012), and his textbook is also available in a free online version (though it uses Scheme rather than Python) [32]. To simply learn Python if you already have



significant experience in some other language, try Nick Parlante's video tutorial at Google (https://code.google.com/edu/languages/google-python-class). He is also a long-time instructor at Stanford, and does an introductory Computer 101 course on Coursera (https://www.coursera.org/course/cs101), yet another alternative starting point.
Going further. The courses above offer taster menus of various aspects of computer science and only basic programming skills, and as such are appropriate for bioinformatics professionals who need exposure to programming but will not be doing it for a living. Those who plan to do coding in-the-large or create compute-intensive applications should start with the following three courses instead, which offer greater depth and breadth in programming principles. Those who would like to focus immediately on data-driven scientific computing could do worse than Advanced Scientific Computing with Python taught by Berkeley Astronomy Prof. Joshua Bloom (http://itunes.apple.com/itunes-u/astronomy-250-001-spring-2012/id497766986); this course is not particularly tied to astronomy (which is wrestling with Big Data from sky surveys rather than omics), and introduces packages ranging from statistics to visualization to parallel computing, although the resolution of the videos may lead to eye strain.

The Structure and Interpretation of Computer Programs
Source. Berkeley, Computer Science 61A, Prof. Paul Hilfinger (Spring 2012)
Link. http://webcast.berkeley.edu/playlist#c,d,Computer_Science,EE65657BC5C79469
Provider description. Introduction to programming and computer science. This course exposes students to techniques of abstraction at several levels: (a) within a programming language, using higher-order functions, manifest types, data-directed programming, and message-passing; (b) between programming languages, using functional and rule-based languages as examples.
Commentary. This is the first in a cycle of three core courses that Berkeley requires of computer science majors; the other two follow below. It now teaches Python 3 (after many years of using Scheme, a LISP dialect and thus more purely functional) to get across the big ideas of programming, covering design principles, analysis of performance, confirmation of correctness, and management of complexity. As a matter of historical interest, the title and pedigree of this course trace back to the legendary MIT course 6.001, which for decades started many computer scientists and electrical engineers on their careers, and to the associated Scheme-based book (now available online [33]) by Profs. Hal Abelson and Gerald Jay Sussman, young versions of whom can be seen delivering the full set of lectures online in 25-year-old videos (http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005).
Alternatives. Stanford Engineering has long made available its own triad of core courses for CS majors, which are only very slightly showing their age. Stanford's first course is Programming Methodology, which teaches Java by jumping in the deep end, paying a fair amount of attention along the way to good software engineering practice (http://see.stanford.edu/SEE/courseinfo.aspx?coll=824a47e1-135f-4508-a5aa-866adcae1111). Udacity has a post-introductory programming course taught by Google's Dr. Peter Norvig with a bit of an artificial intelligence flavor (http://www.udacity.com/overview/Course/cs212).

Data Structures
Source. Berkeley, Computer Science 61B, Prof. Paul Hilfinger (Fall 2011)
Link. http://webcast.berkeley.edu/playlist#c,d,Computer_Science,63AE13B304CE443E
Provider description. Fundamental dynamic data structures, including linear lists, queues, trees, and other linked structures; arrays, strings, and hash tables. Storage management. Elementary principles of software engineering. Abstract data types. Algorithms for sorting and searching. Introduction to the Java programming language.
Commentary. Moving on from Python to Java, the Berkeley sequence not only lays out the standard toolbox of data structures but begins to sprinkle in more software engineering techniques, awareness of machine architecture, abstraction, and classic algorithms. Prof. Hilfinger, who teaches these first two courses, moves at a good clip and is unfailingly rigorous yet clear. For enterprise-wide bioinformatics programming Java is the language of choice, and the class text, Head First Java [34], is reputed to be one of the least painful ways to learn this (or any) language, high praise indeed.
Prerequisites. The Structure and Interpretation of Computer Programs or equivalent.
Alternatives. The second course in the Stanford sequence is Programming Abstractions, which covers much of the same ground as the Berkeley course but does it by introducing the C++ language (http://see.stanford.edu/SEE/courseinfo.aspx?coll=11f4f422-5670-4b4c-889c-008262e09e4e). A much more recent instantiation of this same course is currently on the Stanford ClassX streaming service (http://classx.stanford.edu/ClassX/system/users/web/pg/view_subject.php?subject=CS106B_SPRING_2010_2011).

Machine Structures
Source. Berkeley, Computer Science 61C, Profs. Dan Garcia and Michael Franklin (Fall 2011)
Link. http://webcast.berkeley.edu/playlist#c,d,Computer_Science,B96D778365083506
Provider description. The internal organization and operation of digital computers. Machine architecture, support for high-level languages (logic, arithmetic, instruction sequencing) and operating systems (I/O, interrupts, memory management, process switching). Elements of computer logic design. Tradeoffs involved in fundamental architectural design decisions.
Commentary. Despite the title of this course, it brings hardware into the picture only as it relates to designing fast and memory-efficient code. The student will learn the C language, mainly because it is close to the machine, and this is still very important to bioinformatics developers who need to tune the performance of compute-intensive applications. The current version of this course touches on not only parallelism but Cloud computing, also very relevant to bioinformatics.
Prerequisites. Data Structures or equivalent.
Alternatives. The third course in the Stanford sequence is Programming Paradigms (http://see.stanford.edu/SEE/courseinfo.aspx?coll=2d712634-2bf1-4b55-9a3a-ca9d470755ee), which also delves into bit-level machine details and memory management using C and C++, but then also introduces the functional paradigm (with LISP) and concurrency, as well as surveying (briefly) other languages such as Python and C#. Note that the Stanford series as a whole thus teaches the languages Java, then C++, and finally a bit of LISP, Python, etc., while the Berkeley series does Python, then Java, and then C. The latter ordering is probably more appropriate for bioinformatics.

Building Dynamic Websites
Source. Harvard Extension School, Computer Science E-75, Prof. David Malan (Fall 2010)



Link. http://cs75.tv/2010/fall
Provider description. This course teaches students how to build dynamic websites with Ajax and with Linux, Apache, MySQL, and PHP (LAMP), one of today's most popular frameworks. Students learn how to set up domain names with DNS, how to structure pages with XHTML and CSS, how to program in JavaScript and PHP, how to configure Apache and MySQL, how to design and query databases with SQL, how to use Ajax with both XML and JSON, and how to build mashups. The course explores issues of security, scalability, and cross-browser support and also discusses enterprise-level deployments of websites, including third-party hosting, virtualization, colocation in data centers, firewalling, and load-balancing.
Commentary. Sooner or later, anyone doing bioinformatics is likely to have to create web pages that provide data and/or services to others. Although the technologies continue to evolve rapidly, this course provides both practical experience in recent tools and good discussions of general considerations that will carry over to whatever comes down the pike next.
Prerequisites. Programming ability and some familiarity with HTML.
Alternatives. A more foundational course on Internet Technology is taught by Prof. Indranil Sengupta of IIT Kharagpur through NPTEL (http://nptel.iitm.ac.in/video.php?subjectId=106105084). Udacity offers a Web Application Engineering course taught by web entrepreneur Steve Huffman (http://www.udacity.com/overview/Course/cs253). There are a large number of practical tutorial videos available on web design and the relevant scripting languages, easily found by search.
Going further. One possible direction to go from here is into the realm of iPhones and iPads. Although this architecture hasn't proven friendly to bioinformatics to date, students wishing to experiment can find many online courses, including one by Stanford Prof. Paul Hegarty (http://itunes.apple.com/itunes-u/ipad-iphone-application-development/id473757255).

Software Engineering
Source. Berkeley, Computer Science 169, Profs. Armando Fox and David Patterson (Spring 2012)
Link. http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=496893325
Provider description. Ideas and techniques for designing, developing, and modifying large software systems. Function-oriented and object-oriented modular design techniques, designing for re-use and maintainability. Specification and documentation. Verification and validation. Cost and quality metrics and estimation. Project team organization and management.
Commentary. Programming is one thing, software engineering is quite another. Bioinformatics applications are increasingly yielding to bioinformatics systems, thus the need for practitioners hoping to do significant development to study this topic in depth. The provider description is taken from the Berkeley course catalog, but in fact the instructors have lately been morphing the course toward an agile development approach to Software as a Service (SaaS) using Ruby on Rails for Cloud deployment. In other words, they are hitting many themes that are important to recent bioinformatics trends. A version of this course is also on Coursera (https://www.coursera.org/course/saas).
Prerequisites. Programming proficiency in an object-oriented language such as Java, C#, C++, Python, or Ruby.
Alternatives. MIT's approach in their Computer System Engineering course tends to view software and hardware as a whole, focusing on controlling complexity, strong modularity, networks, parallelism, recovery, reliability, and security (http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-033-computer-system-engineering-spring-2009). A more traditional course, with greater emphasis on project management, is available from IIT Bombay Profs. N.L. Sarda, Umesh Bellur, and Rushikesh Joshi through NPTEL (http://nptel.iitm.ac.in/video.php?subjectId=106101061).
Going further. MIT also offers a higher-level course called Performance Engineering of Software Systems that focuses on performance analysis, algorithmic techniques for high performance, instruction-level optimizations, cache and memory hierarchy optimization, parallel programming, and building scalable distributed systems (http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010). A more elementary course but one that focuses on an important specific skill is Udacity's Software Testing by Prof. John Regehr of the University of Utah (http://www.udacity.com/overview/Course/cs258).

Introduction to Databases
Source. Stanford, Prof. Jennifer Widom (Fall 2011)
Link. http://www.db-class.org/course
Provider description. This course covers database design and the use of database management systems for applications. It includes extensive coverage of the relational model, relational algebra, and SQL. It also covers XML data including DTDs and XML Schema for validation, and the query and transformation languages XPath, XQuery, and XSLT. The course includes database design in UML, and relational design principles based on dependencies and normal forms. Many additional key database topics from the design and application-building perspective are also covered: indexes, views, transactions, authorization, integrity constraints, triggers, on-line analytical processing (OLAP), and emerging NoSQL systems.
Commentary. This is a relatively short but well-constructed course that was yet another variation on Stanford Engineering's courseware initiatives. The quizzes and short segments, presaging the approach used by Coursera, seem particularly effective for learning efficiently. This material should be considered core to bioinformatics of any stripe.
Alternatives. The University of Washington has an archived distance learning course by Prof. Alon Halevy (now at Google) that is titled Introduction to Database Systems but emphasizes data management (http://www.cs.washington.edu/education/courses/csep544/04sp). There is a more classical and in-depth database course by Profs. Dharanipragada Janakiram of IIT Madras and Srinath Srinivasa of IIT Bangalore via NPTEL (http://nptel.iitm.ac.in/video.php?subjectId=106106093).

Computer Graphics
Source. UC Davis, ECS 175, Prof. Kenneth Joy (Fall 2009)
Link. http://itunes.apple.com/us/itunes-u/computer-graphics-fall-2009/id457893733
Provider description. Principles of computer graphics. Current graphics hardware, elementary operations in two- and three-dimensional space, transformational geometry, clipping, graphics system design, standard graphics systems, individual projects.
Commentary. Given the importance of scientific visualization to bioinformatics, this should be a popular elective. This course goes straight to 3D graphics, using OpenGL and Qt for a considerable



amount of high-level coding. It is also a good opportunity to get some exposure to graphical processing units (GPUs), which can also be used to greatly speed up non-graphical computations of relevance to bioinformatics.
Prerequisites. Linear Algebra, Data Structures, strong programming skills.
Alternatives. The Harvard Extension School has a substantially similar offering entitled Introduction to Computer Graphics and GPU Programming by Prof. Hanspeter Pfister and Eric Chan (http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=429428034). A more exhaustive introduction to the algorithms (but with no coding) is provided by IIT Madras Prof. Sukhendu Das in Computer Graphics via NPTEL (http://nptel.iitm.ac.in/video.php?subjectId=106106090).
Going further. UC Davis also offers advanced courses through their Institute for Data Analysis and Visualization, including Graphics Architecture (http://itunes.apple.com/us/itunes-u/graphics-architecture-winter/id404606990), which does GPUs in-depth; Geometric Modeling (http://itunes.apple.com/us/itunes-u/computer-science-introduction/id389259246); and Advanced Visualization (http://itunes.apple.com/us/itunes-u/advanced-visualization-ecs277/id389259186).

Digital Image Processing
Source. Indian Institute of Technology (IIT) Kharagpur, EC61501, Prof. P.K. Biswas
Link. http://nptel.iitm.ac.in/video.php?subjectId=117105079
Provider description. Digital image fundamentals; image enhancement in spatial domain; edge detection; image filtering in frequency domain; image restoration; color image processing; morphological image processing; image segmentation; texture analysis.
Commentary. Image processing has long been important in biomedical imaging and in certain omic technologies such as microarrays. It also comes into play with next-generation sequencing platforms as well as high-content screening that involves image processing of cell-based assays. This is a rigorous engineering approach to the subject for hard-core pixel jockeys.
Prerequisites. Differential Equations, Linear Algebra, Signals and Systems.
Alternatives. The UC Davis program described in the previous entry also offers an Image Processing and Analysis course (http://itunes.apple.com/us/itunes-u/image-processing-analysis/id458753849).
Going further. Machine learning techniques for computer vision and image understanding are useful extensions of the basic techniques of image processing. Berkeley Prof. Jitendra Malik has a Coursera entry entitled Computer Vision: The Fundamentals that covers segmentation of biological images (https://www.coursera.org/course/vision). Short courses available on Videolectures.net (see Computational Seminars below) include, among others, Learning in Computer Vision by Prof. Simon Lucey of Carnegie Mellon University (http://videolectures.net/mlss08au_lucey_linv) and Markov Random Fields for Vision and Graphics by Prof. Richard Hartley of the Australian National University (http://videolectures.net/ssll09_hartley_covi). Students should first take Learning Systems or similar.

Massively Parallel Computing
Source. Harvard Extension School, CSCI E-292, Profs. Hanspeter Pfister and Nicolas Pinto (Spring 2011)
Link. http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=429428651
Provider description. In this course, students get hands-on experience in developing software for massively parallel computing resources. We cover parallel programming models, hardware architectures, multi-threaded programming, GPU programming, cluster computing, cloud computing, and MapReduce using Hadoop and Amazon's EC2.
Commentary. Another set of skills highly relevant to current bioinformatics practice, and therefore an attractive elective. This course focuses first on GPU programming with CUDA and then on MapReduce/Hadoop programming on the Amazon Cloud. For the former, a home computer with a high-end Nvidia GPU should be sufficient (the pyCUDA Python binding is used), though online students will of course not have access to the GPU cluster used in the course. For the Cloud, EC2 accounts are free but Amazon will charge a modest amount for cycles (http://aws.amazon.com/ec2).
Prerequisites. Programming skills and some exposure to UNIX systems programming.
Alternatives. Stanford offers a course more narrowly focused on GPUs (http://itunes.apple.com/itunes-u/programming-massively-parallel/id384233322?mt=2) as well as shorter practical courses in GPUs (http://classx.stanford.edu/ClassX/system/users/web/pg/view_subject.php?subject=NVIDIA_ICME_SPRING_2010_2011), Hadoop (http://classx.stanford.edu/ClassX/system/users/web/pg/view_subject.php?subject=HADOOP_WINTER_2010_2011), and the Amazon Cloud (http://classx.stanford.edu/ClassX/system/users/web/pg/view_subject.php?subject=AEC2_WINTER_2010_2011).

Introduction to Algorithms
Source. MIT, 6.046J, Profs. Charles Leiserson and Erik Demaine (Fall 2005)
Link. http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-046j-introduction-to-algorithms-sma-5503-fall-2005
Provider description. This course teaches techniques for the design and analysis of efficient algorithms, emphasizing methods useful in practice. Topics covered include: sorting; search trees, heaps, and hashing; divide-and-conquer; dynamic programming; amortized analysis; graph algorithms; shortest paths; network flow; computational geometry; number-theoretic algorithms; polynomial and matrix calculations; caching; and parallel computing.
Commentary. The ability to understand the workings of and even create novel algorithms makes some formal training in algorithms mandatory in bioinformatics. This has never been more relevant given the volumes of data now being managed, particularly from next-generation sequencing. This is a classic course at MIT, using perhaps the most famous textbook in the field [35], co-authored by one of the instructors.
Prerequisites. A strong programming background. Exposure to aspects of discrete math, especially proof techniques and basic probability theory, that would be well satisfied by the Automata and Introduction to Probability courses above.
Alternatives. There are a number of viable alternatives. Coursera is offering a two-part course from Princeton, by Profs. Robert Sedgewick and Kevin Wayne, also using their own textbook, which however requires knowledge of Java (https://www.coursera.org/course/algs4partI). Coursera will also have the first part of the Stanford algorithms sequence, by Prof. Tim Roughgarden (https://www.coursera.org/course/algo). Berkeley offers the course taught by Profs. Christos Papadimitriou (who has bioinformatics papers as well as several textbooks among his publications) and Satish Rao (http://itunes.apple.com/itunes-u/computer-science-170-001-spring/id496893325). UC Davis has a course by Prof. Dan Gusfield, who has also published a book on



computational biology algorithms [36] and intelligence (AI), including the origins of Learning Systems
includes two lectures on RNA folding in his the Intelligent Systems for Molecular Source. California Institute of Tech-
discussion of dynamic programming (http:// Biology conference series. Besides intro- nology, CS 156, Prof. Yaser Abu-Mostafa
www.cs.ucdavis.edu/,gusfield/cs122f10/videolist. ducing machine learning, which should be (Spring 2012)
html). pursued further in the next course listed, Link. http://work.caltech.edu/telecourse.
Going further. UC Davis also offers this course introduces knowledge repre- html
a graduate-level algorithms course given sentation, important as a foundation for Provider description. Introduction
by Prof. Gusfield (http://www.cs.ucdavis. biological ontologies; Bayesian nets, useful to the theory, algorithms, and applications
edu/,gusfield/cs222f07/videolist.html). in biological network causal analysis; and of automated learning. How much
Meanwhile his colleague Prof. Chip natural language understanding, which is information is needed to learn a task,
Martel has his own, significantly differ- highly relevant to biomedical text mining. how much computation is involved, and
ent version of the same graduate course The course uses Python, and refers to but how it can be accomplished. Special
on iTunes U (http://itunes.apple.com/ does not require the very popular text by emphasis will be given to unifying the
us/itunes-u/design-analysis-algorithms/ Berkeley Prof. Stuart Russell and Googles different approaches to the subject coming
id389258657). Peter Norvig, Artificial Intelligence: A from statistics, function approximation,
Modern Approach [38]. optimization, pattern recognition, and
Computational Biology Alternatives. As noted in the neural networks.
Source. Stony Brook University, CSE introduction, a well-publicized live course Commentary. Prof. Abu-Mostafa is
549, Prof. Steven Skiena (2010) by Stanfords Prof. Sebastian Thrun and an acclaimed teacher and the material
Link. http://www.algorithm.cs.sunysb. Googles Peter Norvig was offered in the Fall covered is absolutely central to current
edu/computationalbiology of 2011 (https://www.ai-class.com); the lec- bioinformatics practice. His self-published
Provider description. This course
tures and quizzes are now accessible on book [41] and web site are actually
focuses on current problems in compu- YouTube but in a rather awkward format. entitled Learning from Data, which
tational biology and bioinformatics. Our However, Prof. Thrun now has a similar AI gives a better sense of the relevance of
emphasis will be algorithmic, on disco- course on Udacity, which uses Python and is the course to bioinformatics than does the
vering appropriate combinatorial algori-
keyed to programming a robotic car (http:// Provider Description above.
thm problems and the techniques to solve
www.udacity.com/overview/Course/cs373). Prerequisites. Introduction to Probability
them. Primary topics will include DNA
The U.S. Naval Postgraduate School's Prof. and Linear Algebra are recommended.
sequence assembly, DNA/protein sequence
Neil Rowe has a book entitled Artificial
assembly, DNA/protein sequence compari-
Intelligence through Prolog that presents Alternatives. Stanford Engineering
son, hybridization array analysis, RNA and
some core topics using logic programming offers a Machine Learning course by
protein folding, and phylogenic trees.
(favored by this author) and is now available Prof. Andrew Ng, also available now on
Commentary. This course provides
online for free [39]. Coursera (which he co-founded). It is
a computer scientist's approach to com-
Prerequisites. Data Structures or excellent in its own way and heavily
putational biology, and is thus listed sepa-
equivalent. Basic probability and propo- overlaps the material in this one, though
rately from a corresponding course in the
sitional logic. with less of a data mining focus and some
Biology Department. The emphasis here is
Going further. This course provides attention paid to robotics (http://see.
more on how the algorithms work than on
modest coverage of the topics in the stanford.edu/see/courseinfo.aspx?coll=
how to use them. Prof. Skiena's back-
Commentary, which may well lead the 348ca38a-3a6d-4052-937d-cb017338d7b1).
ground is algorithms and discrete math,
interested student to pursue additional There is an accompanying set of very
and he uses the book An Introduction to
Bioinformatics Algorithms by Neil Jones elective courses below. Students interested polished course notes (http://cs229.
and Prof. Pavel Pevzner of the University in robotic technologies, for instance in stanford.edu/materials.html).
of California at San Diego [37]. control of laboratory automation, should Going further. The Machine
Prerequisites. Introduction to Algorithms consider Stanford Prof. Oussama Khatib's Learning Summer School that took
is recommended, though Prof. course Introduction to Robotics (http:// place at Cambridge University in 2009
Skiena encourages the participation of see.stanford.edu/see/courseinfo.aspx?coll= has 20 introductory and specialized tutorials
biologists. 86cc8662-f6e4-43c3-a1be-b30d1d179743). of 2-3 hours each in a coordinated
For a look at the deepest philosophical video and slide format (http://
Artificial Intelligence foundations of ontologies, students may videolectures.net/mlss09uk_cambridge). A
Source. Berkeley, CS 188, Prof. enjoy a short course by Prof. Barry Smith very popular albeit advanced text, Ele-
Pieter Abbeel (Spring 2012) of the University of Buffalo entitled An ments of Statistical Learning by Stanford
Link. http://itunes.apple.com/Web Introduction to Ontology (http://ontology. Profs. Trevor Hastie, Robert Tibshirani,
Objects/MZStore.woa/wa/viewPodcast? buffalo.edu/smith/IntroOntology_Course. and Jerome Friedman, is also available
id=496298636 html). For a more computational approach, online for free [42]. Students should obtain
Provider description. Basic ideas Prof. John Sowa has a well-organized but and become proficient in machine learning
and techniques underlying the design of text-only Guided Tour of Ontology tools, which can be done from R or Octave
intelligent computer systems. Topics in- (http://www.jfsowa.com/ontology/guided. (as a free alternative to MatLab) environ-
clude heuristic search, problem solving, htm) that includes readings from his book ments (see above). A friendlier user envi-
game playing, knowledge representation, Knowledge Representation [40]. Dr. ronment is provided by tools like Weka
logical inference, planning, reasoning un- Doug Lenat, another knowledge represen- (http://www.cs.waikato.ac.nz/ml/weka),
der uncertainty, expert systems, learning, tation pioneer, gave an interesting seminar widely used in teaching, or Orange,
perception, language understanding. at NIH called Computers versus Common which has add-ons for bioinformatics
Commentary. Bioinformatics has a Sense (http://videocast.nih.gov/launch. and text mining (http://orange.biolab.
long tradition relating it to artificial asp?15085). si); both are open source. Python also has

PLOS Computational Biology | www.ploscompbiol.org 14 September 2012 | Volume 8 | Issue 9 | e1002632


resources in this arena, for example the editing of videos for copyright reasons Link. http://oyc.yale.edu/chemistry/
PyML machine learning framework (http://see.stanford.edu/see/courseinfo.aspx? chem-125a
(http://pyml.sourceforge.net). For bioin- coll=63480b48-8819-4efd-8412-263f1a472f5a). Provider description. This is the
formatics, Hidden Markov Models Prerequisites. Programming skills in first semester in a two-semester intro-
(HMMs) are terrifically important (not Python or Java. Some Calculus, Pro- ductory course focused on current theo-
just for sequence profiles, but also Copy bability, and Linear Algebra are used, but ries of structure and mechanism in organic
Number Variation discovery, Single Nu- also introduced in the course. The Automata chemistry, their historical development,
cleotide Polymorphism genotyping, gene course would be excellent preparation. and their basis in experimental observa-
prediction, etc.). HMMs are not covered Alternatives. A good book for self- tion. The course is open to freshmen with
in the core courses above (though they teaching much of the basic material (also excellent preparation in chemistry and
are introduced in the course below), nor recommended by the instructor of this course, physics, and it aims to develop both taste
are there the same sort of user-friendly and freely available online) is Natural for original science and intellectual skills
environments for HMMs, but there are Language Processing with Python [43], necessary for creative research.
toolkits that the bioinformatics student which actually teaches Python alongside Commentary. Computer scientists
can use to study the associated algo- NLP, and introduces a powerful open who have managed to avoid organic
rithms, such as HMMoC (http:// source supporting library for NLP and text chemistry may benefit from the insight it
biowiki.org/HMMoC) or HMMConverter analytics called NLTK (Natural Language provides about the molecular basis of
(http://people.cs.ubc.ca/~irmtraud/ Toolkit, http://www.nltk.org). biological systems, including the nature
hmmconverter), as well as R packages. of the chemical bond and considerations
Stanford Prof. Daphne Koller, the other Computational Seminars of energy and entropy that carry over into
academic co-founder of Coursera, is Source. Videolectures.net, Computer certain computational methods. This
offering a course on Probabilistic Science/Bioinformatics category course is especially interesting for its
Graphical Models, another important Link. http://videolectures.net/Top/ wide-ranging scope and historical per-
flavor of machine learning that includes Computer_Science/Bioinformatics spective, and in particular an illumi-
Bayesian nets and Markov random fields, Provider description. VideoLec- nating case study on drug testing and
which already have had significant im- tures.net is a free and open access usage.
pact in network bioinformatics (https:// educational video lectures repository. Going further. Yale also provides a
www.coursera.org/course/pgm). The lectures are given by distinguished second-semester continuation of this
scholars and scientists at the most course by the same professor (http://oyc.
important and prominent events like yale.edu/chemistry/chem-125b).
Natural Language Processing conferences, summer schools, workshops Alternatives. UC Irvine also offers a
Source. Stanford on Coursera, CS 224N, and science promotional events from beginning course by Prof. James Nowick,
Profs. Dan Jurafsky and Christopher Manning many fields more tightly focused on straight organic
(TBA) Commentary. As is the case for chemistry (http://ocw.uci.edu/courses/
Link. https://www.coursera.org/course/ biology, there are myriad individual se- Chemistry-51A-Organic-Chemistry.aspx).
nlp minars online in computer science. One of
Provider description. This course the best aggregations for advanced com- Fundamentals of Pharmacology
covers a broad range of topics in natural putational aspects of bioinformatics can be Source. University of Pennsylvania
language processing, including word and obtained from Videolectures.net, which on Coursera, Prof. Emma Meagher
sentence tokenization, text classification consists of talks from a large number of (Summer 2012)
and sentiment analysis, spelling correction, European Union-sponsored events, many Link. https://www.coursera.org/course/
information extraction, parsing, meaning of which tend to take the form of com- pharm101
extraction, and question answering. We prehensive mini-courses. Recent meetings Provider description. This [course]
will also introduce the underlying theory include ones on Machine Learning in will discuss the discipline of pharmacology
from probability, statistics, and machine Systems Biology, Cancer Bioinformatics, and its integration throughout medical
learning that are crucial for the field, and Pattern Recognition in Bioinformatics, Learn- science. Specifically, the content will be
cover fundamental algorithms like n-gram ing and Inference in Computational Systems organized as follows: 1) Basic Pharma-
language modeling, naive bayes and Biology, and many more, amounting to a total cological Principles; 2) Applied Pharma-
maxent classifiers, sequence models like of some 200 talks to date. cology, the concept of applying the basic
Hidden Markov Models, probabilistic Alternatives. Google Tech Talks (http:// principles to each organ system with an
dependency and constituent parsing, and www.youtube.com/user/GoogleTechTalks/ emphasis on melding pathophysiology with
vector-space models of meaning. videos) are another source of seminars, biologic targets for drug therapy; 3)
Commentary. Not only is natural though with over 1,600 videos and little Therapeutics, considered to be the clinical
language processing (NLP) technology organization, it's necessary to use the search application of applied pharmacology,
important in biological text mining function judiciously. The Santa Fe Institute including the financial implications of
applications, but grammars and parsing also has a large collection of video seminars therapy, evidence-based medicine, and
are relevant to several aspects of sequence on various aspects of complexity research, the limitations of drug therapy and future
analysis. The probabilistic methods their specialty (http://santafe.edu/research/ directions of therapeutics in all disease
introduced are very generally applicable videos/catalog). states, as well as the legal implications of
to bioinformatics, especially classifiers and prescription writing; and 4) Advanced
Hidden Markov Models. This course, or a Other Departments Pharmacological Principles, such as can-
version of it by Prof. Manning alone, is Organic Chemistry cer therapeutics.
available on the Stanford Engineering Source. Yale, Chem 125A, Prof. J. Commentary. This brief overview
open courseware site, though with some Michael McBride (Fall 2008) will be a useful elective for bioinformatics



practitioners interested in drug discovery (http://www.ibioseminars.org/lectures/cell- Entrepreneurship
and/or translational research, from either bio-a-med/sangeeta-bhatia.html). Source. Stanford Technology Ven-
a scientific or employment standpoint. tures Program Entrepreneurship Corner
Supplementary seminars that may be of Link. http://ecorner.stanford.edu
interest include Introduction to Drug Game Theory Provider description. The Stan-
Discovery by Drs. James Wells and Source. Yale, ECON 159, Prof. Ben ford Technology Ventures Program (ST-
Michelle Arkin (http://www.ibioseminars. Polak (Fall 2007) VP) Entrepreneurship Corner is a free
org/lectures/bio-techniques/james-wellsmichelle- Link. http://oyc.yale.edu/economics/ online archive of entrepreneurship resour-
arkin.html), Imatinib (Gleevec) as a Par- econ-159 ces for teaching and learning. The mission
adigm of Targeted Cancer Therapies by Provider description. This course of the project is to support and encourage
Dr. Brian Druker (http://www.ibioseminars. is an introduction to game theory and faculty around the world who teach
org/lectures/cell-bio-a-med/brian-druker.html), strategic thinking. Ideas such as domi- entrepreneurship to future scientists and
Protein Kinases; Structure, Function, and nance, backward induction, Nash equili- engineers, as well as those in management
Regulation by Dr. Susan Taylor (http:// brium, evolutionary stability, commitment, and other disciplines.
www.ibioseminars.org/lectures/bio- credibility, asymmetric information, ad- Commentary. Many students who
mechanisms/susan-taylor.html), and Sev- verse selection, and signaling are discussed learn bioinformatics will be exposed to
en Transmembrane Receptors by Dr. and applied to games played in class and to the very latest advances in both biote-
Robert Lefkowitz (http://ibioseminars. examples drawn from economics, politics, chnology and computing, probably the
org/lectures/cell-bio-a-med/robert-lefkowitz- the movies, and elsewhere. two fields that result in the greatest rate of
1.html). Commentary. Game theory has long business startups, especially from acade-
Going further. For an exploration of been used in the study of evolutionary mic spinoffs. Thus learning entreprene-
the interface of systems biology with phar- dynamics, an increasingly important field, urship skills is entirely appropriate as an
macology, the two-day NIH workshop on and backward induction is a genera- elective in this curriculum. The STVP is
Quantitative and Systems Pharma- lization of the same sort of dynamic housed in Stanford Engineering and
cology held in 2008 is still very relevant programming used in biological sequence hosted by the department of Manage-
(http://videocast.nih.gov/launch.asp?14673 analysis, applied to such problems as ment Science and Engineering. The web
and http://videocast.nih.gov/launch.asp? choosing optimal strategies in sports. site has hundreds of videos, including se-
14674). Game theory also bears on modeling and minars, case studies, and tutorials, many
network theory. Scientists in any quanti- by Silicon Valley luminaries. As a way of
Frontiers of Biomedical Engineering tative field probably ought to be familiar organizing the student's approach to this
Source. Yale, BENG 100, Prof. Mark with such basic ideas as the prisoner's cornucopia, two collections in particular
Saltzman (Spring 2008) dilemma, Pareto optimality, and Nash are recommended: Invitation to Ven-
Link. http://oyc.yale.edu/biomedical- equilibria, or indeed with any field that ture (http://ecorner.stanford.edu/collections.
engineering/beng-100 has produced eight Nobel prizes. The html?collectionId=1) as an introduction, and
Provider description. The course subject tends to be taught in economics then Technology Ventures (http://ecorner.
covers basic concepts of biomedical departments, which makes for an stanford.edu/collections.html?collectionId=2)
engineering and their connection with interesting change of perspective, and as a more directed approach of relevance to
the spectrum of human activity. It serves also provides a link to the fascinating bioinformatics.
as an introduction to the fundamental new field of neuroeconomics. Going further. Students with
science and engineering on which Alternatives. Stanford is planning a strongly entrepreneurial tendencies might
biomedical engineering is based. Case Coursera offering on this topic taught by also wish to take a look at University of
studies of drugs and medical products an economist and computer scientist Michigan Prof. Gautam Kaul's Introduction
illustrate the product development- tandem, which may make it a better to Finance on Coursera (https://
product testing cycle, patent protection, choice in this context (https://www. www.coursera.org/course/introfinance). For
and FDA approval. It is designed for coursera.org/course/gametheory). The the basics, there are countless economics
science and non-science majors. University of Pennsylvania's Prof. Michael courses online, but the Annenberg Center
Commentary. This topic falls outside Kearns mixes a little game theory and has a particularly nicely produced overview
the usual definition of bioinformatics, but network theory together in Networked (http://www.learner.org/resources/series79.
the course has so many useful and Life, also on Coursera (https://www. html).
interesting elements, including imaging, coursera.org/course/networks).
cell culture and tissue engineering, Going further. For an advanced, Justice
cardiovascular and renal physiology, and more purely mathematical approach to Source. Harvard, ER22, Prof. Mich-
immunology, not to mention product the subject matter, see Non-Cooperative ael Sandel (Fall 2008)
development, that it is likely to make an Game Theory as taught by Prof. Tamer Link. http://www.justiceharvard.org
intriguing elective. Basar of the University of Illinois at Provider description. A critical ana-
Alternatives. MIT offers an Urbana-Champaign (http://www.networkmaths. lysis of classical and contemporary theories of
Introduction to Bioengineering, which, ie/videos/list_videos.php?course=game). Two justice, including discussion of present-day
however, has just a few lectures supple- worthwhile seminars relating game theory to applications. The course examines debates
mented by a large number of extended neurosciences are Neural Basis of Strategic about justice prominent in moral and political
interviews with relevant faculty (http:// Choice by Dr. Giorgio Coricelli (http:// philosophy, and invites students to subject
ocw.mit.edu/courses/biological-engineering/ videocast.nih.gov/launch.asp?17030) and Neu- their own views on these controversies to
20-010j-introduction-to-bioengineering-be- roeconomic Approaches to Mental Disorders critical examination.
010j-spring-2006). See also a seminar by Dr. by Dr. P. Read Montague (http://videocast.nih. Commentary. At the inception of
Sangeeta Bhatia on Tissue Engineering gov/launch.asp?16632). the Genome Project significant emphasis



was placed on ELSI or ethical, legal, Jonathan Moreno will be covering the explanations of source abbreviations and
and social implications, and these are even interaction of neurosciences with ethics further elaboration of requirements.)
more prominent today in such issues as for Neuroethics (https://www.cour- There may well be other paths, and
personal data privacy, bioethics in human sera.org/course/neuroethics). The NIH certainly a variety of more specialized
and animal experimentation, and the like. offers a comprehensive short course on ones, but these broad categories would
Biologists nowadays often have some Ethical and Regulatory Aspects of Clin- seem to be a useful start.
training in bioethics but for computer ical Research (http://www.bioethics.nih. In Tables 14, the courses in each
scientists it may be more novel, yet gov/hsrc and click on Podcasts for the virtual department indicated as prerequi-
increasingly important given new videos). sites for a given track represent an
capacities for mining Big Data. This is a assumed background for individuals en-
relatively short and very general tering the track, and should certainly be
introduction to ethics, but one that is
Courses of Study taken if the material is unfamiliar or needs
highly intellectually stimulating, so much As noted at the outset, students will refreshing. Core courses are those
so that it fills a large theatre whenever it is come to online learning from different deemed central to the track, and should
presented at Harvard by Prof. Sandel, backgrounds and with different goals in be taken if the material has not already
with production values worthy of a one- mind, and moreover will have different been mastered elsewhere. Electives are at
man show on Broadway. You are likely to amounts of time to devote to the process. the option of the student, but certain of
discover useful things about yourself, for Therefore it is not helpful to be overly these are indicated as recommended, and
example, whether you are a deontologist prescriptive about course selection. How- several at least should be taken as time
or a consequentialist (which, for you ever, it is possible to identify some basic permits. Finally, for some tracks, addi-
computer types, has something to do types of bioinformatics practitioners, tional study is recommended to extend
with whether your moral judgments are and to suggest possible course selections certain course topics (denoted by plus
determined at compile-time or run-time). best suited to those career paths. It should signs), as discussed below under Indepen-
Alternatives. Oxford has a course of be emphasized that different institutions dent Study.
similar (short) length called A Romp and individuals may have other views on Bioinformatics Analysis (BA). This
through Ethics for Complete Beginners, bioinformatics curricula, disagreeing on track prepares an individual to do
taught by Prof. Marianne Talbot with more appropriate electives and even on core biological data analysis with a view to
focus on traditional moral philosophy courses. To this, the author can only plead interpretation or prediction. It involves
(http://podcasts.ox.ac.uk/series/romp- editorial privilege, and remind the reader such skills as sequence, expression, and
through-ethics-complete-beginners). that these are opinions based on one functional analysis by means of a standard
Going further. UCLA Prof. Bob bioinformatics tool set, as well as an ability
person's experience in the field. It would
Goldberg teaches an honors collegium be prudent for potential students to seek a to write computational scripts, database
entitled Genetic Engineering in Medi- queries, and simple programs.
variety of opinions.
cine, Law, & Agriculture that focuses on Data Mining (DM). This track
a range of legal and ethical issues in begins with the analyst skill set but goes
biotechnology (http://www.mcdb.ucla. Curriculum Tracks further to enable more sophisticated
edu/Research/Goldberg/HC70A_W12/ We identify below a set of five possible analyses of datasets that are especially
videos.php). On Coursera from the Uni- tracks, noting two-letter abbreviations complex, for example, by virtue of being
versity of Pennsylvania, Prof. Ezekiel used in Tables 1-4 where the recommended very large scale, noisy, high-dimensional,
Emanuel has a timely course on Health distributions of courses for each semantically rich, poorly organized or
Policy and the Affordable Care Act track are indicated using symbols defined integrated, etc. It entails a greater depth
(https://www.coursera.org/course/ in the key at the bottom of each table. (See of both mathematical knowledge and
healthpolicy), while his colleague Prof. individual course descriptions above for programming skills.

Table 1. Biology Department curriculum with recommended tracks.

Course Source BA DM BT SW CB

Fundamentals of Biology MIT


Principles of Evolution, Ecology, & Behavior Yale #
Biochemistry NPTEL #
Genetics Berkeley # #
Molecular Biology Berkeley #
Cell and Systems Biology Berkeley #
Eukaryotic Gene Expression NPTEL # #
Introduction to Genome Science U. Penn #
Computational Molecular Biology Stanford
Current Topics in Genome Analysis NHGRI #
Biological Seminars HHMI

: Prerequisite; : Core; : Recommended; #: Elective; +: Independent Study.


doi:10.1371/journal.pcbi.1002632.t001



Table 2. Mathematics Department curriculum with recommended tracks.

Course Source BA DM BT SW CB

Differential Equations MIT


Numerical Methods U. S. Florida # #
Linear Algebra MIT
Statistics Princeton
Introduction to Probability Harvard
Automata Stanford
Discrete Math Stony Brook #
Analytic Combinatorics Princeton #
Networks: Theory and Application U. Michigan # #
Applied Optimization Purdue # #
Dynamical Systems and Chaos Texas A&M #
Information Theory Stanford #
Signals and Systems MIT # #

: Prerequisite; : Core; : Recommended; #: Elective; +: Independent Study.


doi:10.1371/journal.pcbi.1002632.t002

Bioinformatics Tools (BT). This the-large, at a level sufficient to participate science and engineering disciplines re-
track is meant to afford the capability to in or lead the development of major levant to the sciences of complexity, infor-
develop standalone tools of significant bioinformatics systems and/or products, mation, and systems.
sophistication for bioinformatics analysis, for instance supporting data management
visualization, presentation, and local data and analysis from novel technological Independent Study
management. It requires programming platforms through complex downstream Even in a university environment, it is
skills in a variety of languages and the analysis pipelines. not unusual for the classes that are
ability to implement complex algorithms Computational Biology (CB). This necessary or desirable for a given course
efficiently, based on solid biological track is intended to prepare individuals to of study to be unavailable when needed.
domain knowledge. do original research in biological modeling Certainly the curriculum above is con-
Bioinformatics Systems (BS). This and analysis by way of advanced mathe- strained by the available online courses, as
track adds to the previous one the matical and computational techniques. It discussed below in the conclusion. In
competency for software engineering in- provides a deeper grounding in computer addition the patchwork nature of the

Table 3. Computer Science Department curriculum with recommended tracks.

Course Source BA DM BT SW CB

Introduction to Computer Science & Programming MIT


Structure & Interpretation of Computer Programs Berkeley # #
Data Structures Berkeley
Machine Structures Berkeley
Building Dynamic Websites Harvard #
Software Engineering Berkeley #
Introduction to Databases Stanford #
Computer Graphics UC Davis # # #
Digital Image Processing NPTEL # # #
Massively Parallel Computing Harvard #
Introduction to Algorithms MIT #
Computational Biology Stony Brook #
Artificial Intelligence Berkeley # # #
Learning Systems Cal Tech
Natural Language Processing Stanford # # # #
Computational Seminars E.U.

: Prerequisite; : Core; : Recommended; #: Elective; +: Independent Study.


doi:10.1371/journal.pcbi.1002632.t003



Table 4. Other Departments curriculum with recommended tracks.

Course Source BA DM BT SW CB

Organic Chemistry Yale # # #


Fundamentals of Pharmacology U Penn # # # #
Frontiers of Biomedical Engineering Yale # # # # #
Game Theory Yale # # #
Entrepreneurship Stanford # # # # #
Justice Harvard # # # #

: Prerequisite; : Core; : Recommended; #: Elective; +: Independent Study.


doi:10.1371/journal.pcbi.1002632.t004

courses, arising as they do from many may be true of the Data Mining track, One undeniable truism is that indepen-
institutions, can be a strength but also a though these individuals are probably dent study requires motivation and disci-
weakness, with less opportunity for coor- more likely to be committed to a career pline in the extreme. Students must be
dination and seamless sequencing of in exclusively dry biology. committed to doing assigned readings,
course contents. As in academia, any gaps Students in the two software tracks, exercises and assessments faithfully to
can be addressed, or special interests Bioinformatics Tools and Bioinformatics achieve maximum uptake, the more so
accommodated, by independent study. Systems, may wish to take additional for being on their own. A companion
The major disadvantage is the lack of a courses in subjects such as machine article by the author, Ten Simple Rules
faculty mentor, which requires students to architecture, operating systems, or theory for Online Learning [44], attempts to
be proactive, self-sufficient, and conscien- of programming languages, but by far the provide practical advice along these lines.
tious in discerning the needs and means most important requirement for indepen- A particular piece of advice it offers is to
for supplementing their coursework. Per- dent study is actual programming experi- pay special attention to doing program-
haps the best way to approach this is for ence. These individuals would be well ming projects in the biological domain.
students to make a habit of reading the key advised to take on substantial projects in One great risk to the proposition of online
journals in their field so as to discover the biological domain that go beyond the bioinformatics education is that students
systematic gaps in their knowledge. requirements of the courses taken. never really get to grips with applying
The type of independent study needed Finally, the Computational Biology newfound computational or analytic skills
will depend on the background of the track may call for independent study in a to real biological data and actual problems
student and on the track they are follow- variety of topics in advanced mathematics in the full context of the scientific estab-
ing. Some suggestions for individual tracks and computer science as well as biological lishment. To be sure, biological databases
are indicated by plus signs in Tables 1-4. background necessary for a particular are readily accessible and datasets may be
A plus to the right of a course symbol (whether prerequisite or core) indicates that advanced work in the topic area of that course is recommended for students in that track. Often some specific suggestions for additional study are indicated in the "Going Further" sections of the course catalog, but where specialized courses are not to be found online (as is likely), one hopes that the basic course has provided sufficient background for the student to learn by self-study of more advanced texts and journals.

For Bioinformatics Analysis, additional biology coursework or other study would be required for the student to approach problems with the expected degree of domain sophistication, so that interpretations of data are placed in an appropriate biological context. Ideally this would include exposure to laboratory science, which of course is unlikely in the case of online learners. However, it is expected that many individuals embarking on this track would already be degreed biologists who are seeking additional training to do advanced analyses with their own data or that of others. To some degree the same

specialization. The curriculum offered here is slanted toward systems biology in this regard, but individuals may prefer to study topics such as evolutionary dynamics or mathematical genetics that would require additional study.

Conclusion

As noted at the outset, any proposed curriculum must be based on the shifting sands of available offerings, and moreover is necessarily a matter of opinion, both scientific and pedagogical. Without a doubt there are gaps, and quality is not uniform. For instance, there are few suitable resources in important areas such as neuroscience and structural biology, and several other areas are thin. But the offerings are only getting better and more numerous, and so any imperfections in the current collection should be increasingly easy to correct with the passage of time. A more pertinent question is whether an online education is an adequate substitute for what is termed a resident education, in general and in the particular case of bioinformatics.

found online that can serve as challenge problems for classification, and so forth. But that is not the same as the interactive process of designing a novel experimental program, acquiring data direct from instrumentation, cleaning and reducing it, and taking responsibility for storing it in both persistent and queryable form. Nor does classroom learning by itself, virtual or otherwise, fully prepare one for establishing real-world error models, dealing with missing data, establishing a statistical case for some result, arguing and defending scientific positions, navigating the publication process, and sundry other practical skills.

Thus, a useful adjunct to online learning in bioinformatics might be a portfolio of suggested projects based on real-world datasets that would help exercise the skills of trainees, perhaps in the context of an online community of peers. One can even imagine a future in which the use of virtual laboratories makes it possible for students to undertake mixed wet/dry studies of their own. Just as the Amazon Cloud now makes large-scale computing accessible and economically feasible without the



support of a large institutional data center, the decreasing cost of sequencing technology and the synthetic biology movement are both suggestive of the possibility of analogous sorts of remote biology. Educational grants for the creation of virtual laboratories to enrich the online learning experience might be public (or philanthropic) money well spent.

Any amount of study in any context cannot substitute for immersion in the social context of science. In an online learning environment, direct interaction with peers is certainly possible after a fashion, through discussion logs and the like, but to date hasn't addressed such important educational elements as the development of public speaking skills.

Perhaps the last great barrier to self-learning is the absence of an advisor, with all that implies, and of membership in a working lab. Even the most imaginative web technology will only go so far in this regard, and probably not far enough in the case of wet biology. However, the field of bioinformatics by its nature may offer the best chance for finding ways to involve distance learners directly in ongoing scientific research, and that would seem to be a worthy goal for the burgeoning online education movement.

References
1. Markoff J (18 Apr 2012) Online education 15. Aho AV, Ullman JD (1994) Foundations of bridge University Press. 640 p. Available: http://
venture lures cash infusion and deals with 5 top computer science. San Francisco, CA: W.H. www.inference.phy.cam.ac.uk/mackay/itila. Ac-
universities. The New York Times. Available: Freeman. 786 p. Available: http://i.Stanford. cessed 16 August 2012.
http://www.nytimes.com/2012/04/18/ edu/,ullman/focs.html. Accessed 16 August 30. Oppenheim AV, Willsky AS, Hamid S (1996)
t ec hn olo gy /co ur sera -pl an s-t o- an noun ce - 2012. Signals and systems (2nd edition). Englewood
university-partners-for-online-classes.html. Ac- 16. Sipser M (1997) Introduction to the theory of Cliffs, NJ: Prentice Hall.
cessed 16 August 2012. computation. Boston, MA: PWS Publishing. 31. Gray RM, Davisson LD (2010) Introduction to
2. Means B, Toyama Y, Murphy R, Bakia M, Jones 396 p. statistical signal processing. Cambridge, UK:
K (SRI International) (2009) Evaluation of 17. Gurari E (1989) An introduction to the theory of Cambridge University Press. 478 p. Available:
evidence-based practices in online learning: a computation. New York, NY: Computer Science http://ee.stanford.edu/,gray/sp.html. Accessed
meta-analysis and review of online learning Press. 314 p. Available: http://www.cse.ohio- 16 August 2012.
studies. Final Report September 2010. Washing- state.edu/,gurari/theory-bk/theory-bk.html. 32. Evans D (2011) Introduction to computing:
ton (D.C.): Department of Education. Contract Accessed 16 August 2012. explorations in language, logic, and machines.
number ED-04-CO-0040 Task 0006. 66 p. 18. Graham RL, Knuth DE, Patashnik O (1989) Charleston, SC: CreateSpace. 266 p. Available:
Available: http://www2.ed.gov/rschstat/eval/ Concrete mathematics. Reading, MA: Addison- http://www.computingbook.org. Accessed 16
tech/evidence-based-practices/finalreport.pdf. Wesley. 625 p. August 2012.
Accessed 16 August 2012. 19. Bender EA, Williamson SG (2004) A short course 33. Abelson H, Sussman GJ, Sussman J (1996)
3. Mayer RE (2001) Multimedia learning. New in discrete mathematics. New York: Dover. 256 p. Structure and interpretation of computer pro-
York, NY: Cambridge University Press. Available: http://cseweb.ucsd.edu/,gill/ grams. 2nd edition. Cambridge, MA: MIT Press.
4. Garrett RH, Grisham CM (2004) Biochemistry. BWLectSite. Accessed 16 August 2012. Available: http://mitpress.mit.edu/sicp/full-
3rd edition. St. Paul, MN: Brooks/Cole Publish- 20. Flagolet P, Sedgewick R (2012) Analytic combi- text/book/book.html. Accessed 16 August 2012.
ing. Available: http://www.web.virginia.edu/ natorics. Cambridge: Cambridge University. 824 34. Bates B, Sierra K (2003) Head first java: your
Heidi/home.htm p. Available: http://ac.cs.princeton.edu/home. brain on java - a learners guide. Sebastopol, CA:
5. Strachan T, Reed A (2010) Human molecular Accessed 16 August 2012. OReilly Media.
genetics. 4th edition. New York: Garland Science. 21. Bender EA, Williamson SG (2006) Foundations of 35. Cormen TH, Leiserson CE, Rivest RL, Stein C
807 p. combinatorics with applications. New York: (2009) Introduction to algorithms. 3rd edition.
6. Strang G (1991) Calculus. Wellesley, MA: Well- Dover. 480 p. Available: http://cseweb.ucsd. Cambridge, MA: MIT Press.
esley-Cambridge Press. 615 p. Available: http:// edu/,gill/FoundCombSite. Accessed 16 August
36. Gusfield D (1997) Algorithms on strings, trees and
ocw.mit.edu/resources/res-18-001-calculus- 2012.
sequences: computer science and computational
online-textbook-spring-2005/textbook. Accessed 22. Wilf HS (2005) generatingfunctionology. 3rd
biology. Cambridge, UK: Cambridge University
16 August 2012. edition. Natick, MA: A.K Peters/CRC Press.
Press. 556 p.
7. Williamson SG (1987) Top-down calculus. Rock- 245 p. Available: http://www.math.upenn.edu/
37. Jones NC, Pevzner PA (2004) An introduction to
ville, MD: Computer Science Press. 429 p. ,wilf/DownldGF.html. Accessed 16 August
bioinformatics algorithms. Cambridge, MA: MIT
Available: http://cseweb.ucsd.edu/,gill/ 2012.
Press.
TopDownCalcSite. Accessed 16 August 2012. 23. Easley D, Kleinberg J (2010) Networks, crowds
8. Kaw A, Kalu EE (2011) Numerical methods with and markets: reasoning about a highly connected 38. Russell S, Norvig P (2009) Artificial intelligence: a
applications. Raleigh, NC: Lulu. 740 p. Available: world. Cambridge, UK: Cambridge University modern approach. 3rd edition. Englewood Cliffs,
http://numericalmethods.eng.usf.edu/topics/ Press. 744 p. Available: http://www.cs.cornell. NJ: Prentice Hall. 1152 p.
textbook_index.html. Accessed 16 August 2012. edu/home/kleinber/networks%2Dbook. Ac- 39. Rowe NC (1988) Artificial intelligence through
9. Strang G (2007) Computational science and cessed 16 August 2012. prolog. 2nd edition. Englewood Cliffs, NJ:
engineering. Wellesley, MA: Wellesley-Cam- 24. Boyd S, Vandenberghe L (2004) Convex optimi- Prentice Hall. 481 p. Available: http://faculty.
bridge Press. 713 p. zation. Cambridge, UK: Cambridge University nps.edu/ncrowe/book/book.html. Accessed 16
10. Krijnen WP (2009) Applied statistics for bioinfor- Press. 730 p. Available: http://www.stanford. August 2012.
matics using R. Available: http://cran.r-project. edu/,boyd/cvxbook. Accessed 16 August 2012. 40. Sowa JF (2000) Knowledge representation. Pacific
org/doc/contrib/Krijnen-IntroBioInfStatistics. 25. Luke S (2009) Essentials of metaheuristics. Grove, CA: Brooks Cole Publishing. 594 p.
pdf.Accessed 16 August 2012. Raleigh, NC: Lulu. 230 p. Available: http://cs. 41. Abu-Mostafa YS, Magdon-Ismail M, Lin H-T
11. Grinstead CM, Snell JL (1997) Introduction to gmu.edu/,sean/book/metaheuristics. Accessed (2012) Learning from data. Pasadena: AMLBook.
probability. New York: American Mathematical 16 August 2012. 42. Hastie T, Tibshirani R, Friedman J (2009) The
Society. 510 p. Available: http://www. 26. Poli R, Langdon WB, McPhee NF (2008) A field elements of statistical learning: data mining,
dartmouth.edu/,chance/teaching_aids/books_ guide to genetic programming. Raleigh, NC: inference, and prediction. 2nd edition. New York:
articles/probability_book/book.html. Accessed Lulu. 252 p. Available: http://www.gp-field- : Springer. 768 p. Available: http://www-stat.
16 August 2012. guide.org.uk. Accessed 16 August 2012. stanford.edu/,tibs/ElemStatLearn. Accessed 16
12. Wasserman L (2003) All of statistics. New York: 27. Cover TM, Thomas JA (1991) Elements of August 2012.
Springer. 461 p. information theory. New York: Wiley. 748 p. 43. Bird S, Klein E, Loper E (2009) Natural language
13. Ewens WJ, Grant GR (2001) Statistical methods 28. Gray RM (2011) Entropy and information. 2nd processing with python. Sebastopol, CA: OReilly
in bioinformatics. New York: Springer. 476 p. edition. New York: Springer. 436 p. Available: Media. Available: http://www.nltk.org/book.
14. Gray RM (2010) Probability, random processes, http://ee.stanford.edu/,gray/it.html. Accessed Accessed 16 August 2012.
and ergodic properties. 2nd edition. New York: 16 August 2012. 44. Searls DB (2012) Ten simple rules for online
Springer. 357 p. Available: http://ee.stanford. 29. MacKay D (2003) Information theory, inference, learning. PLoS Comp Biol 8: e1002631.
edu/,gray/arp.html. Accessed 16 August 2012. and learning algorithms. Cambridge, UK: Cam- doi:10.1371/journal.pcbi.1002631

PLOS Computational Biology | www.ploscompbiol.org 20 September 2012 | Volume 8 | Issue 9 | e1002632