Vous êtes sur la page 1sur 22

CS305/503, Spring 2009 Introduction

Michael Barnathan

Welcome
Who I am:
Michael Barnathan Monmouth University CS alumnus, now an adjunct professor. This is a course I originally taught at Monmouth University during Spring 2009. I now donate the slides to the Polymath Foundation in hopes that they will edify additional students. The slides may periodically contain references to a class that no longer exists Ill record voiceovers, but only as I have the time. It may take a while. Please be patient.

What this course will be about:


Data Structures Algorithms Java

Prereq: Basic Java and object-oriented programming experience. What I expect of you:
Its like learning to play an instrument: View the lectures dont miss your lessons. Keep programming on your own practice.
You can eventually learn how to play by practicing without taking lessons (though it will take longer), but youll never learn how if you go to lessons but dont practice!

Much as I like teaching, this class is offered for your own benefit! Data Structures and Algorithms form the core of CS. You will use these skills whenever you touch software design in any context, and this is the best opportunity you will ever receive to learn them.

Your goals:

Let them guide your learning.

Syllabus
Its online. If you have any questions about it, feel free to email me.

Java
This is the first time this course is being taught in Java (it was previously taught in C++). Have any of you worked with Java before? Because Java may be new to some of you, we will cover it in some detail. However:
You should also make an effort to learn the language on your own (its fairly similar to C++). You will be expected to know the language on the assignments and exams. You will also be expected to know how to write, compile, and execute a Java program without using an IDE.
The systems in this lab are unreliable when it comes to running IDEs. They may not always work. Even when they do, you may find yourself struggling with the IDE longer than it would have taken you to just write the code! I recommend learning how to use vim or emacs, as these are very powerful editors, but failing that, at least know how to use nano (which is like Notepad for the Linux terminal). Know how to invoke the javac and java commands to compile and run your programs.

What is a Computer Program?


What is a computer? What is a program? What are the key components of a program? Must things be this way?
Must a computer be a machine? Need a program run on a computer? Can a program be created while avoiding those components?

The Class Model


According to the class, a computer:
Is a machine (whats a machine?) Performs tasks designated by a user. Has security in place to protect its integrity.

And a program:
Interacts with the computer/user (I/O). Consumes system resources. Accomplishes a goal.

What is a computer?
Given infinite time and space, you CAN build a computer using rocks*.

Or neurons
Computer systems are actually organized in a fairly similar manner to brains.

* Warning: do not attempt in a finite universe unless otherwise recommended by a physicist.

--xkcd

First Two Computers

Here the first known computer is seen using the second to perform a calculation.

--Office online clip art.

Turing Machines
Computers can actually be implemented with abstract models that dont exist at all One such example is a Turing machine. This model consists of:
An infinite length of tape (memory, storage) A head, which can read or write a single strip of tape and move one strip left or right at a time. A stack, containing instructions and a stack pointer.

Why implement a fake computer?


Because Turing machines are able to perform any calculation a real machine can. Therefore, things that cannot be done on a Turing machine cant be done on real machines either.

A machine or programming language capable of performing every computation that can be performed on a Turing machine is called turing complete.

Data Structures and Algorithms: What are they?


Data structures are models.
They are means of arranging data within your program. Choosing the right arrangement can make your job easier!

Algorithms are recipes.


For solving problems in general:
Programming. Cooking. Driving. Solving math problems. Many other areas.

You dont need a computer to execute them. In fact, very often you are the one executing them:
Following driving directions. Solving puzzles. Interacting with a program (such as one well discuss today).

What is the difference between an algorithm and a program?


The differences are slight. Programs are implementations of data structures and algorithms. Data structures and algorithms are abstract concepts. Just as the Turing machine was an abstract computer. Programs are generally directly interpretable, and can somehow be used directly to solve problems.
Very often you will need to translate an algorithm, given in pseudocode or by a diagram, into a program. Sometimes youll derive an algorithm but leave the implementation to others. Often youll be expected to do both!

How do we measure algorithms?


Correctness:
Produces the correct answer given the correct input.

Halting:
Stops after it does what its supposed to. Doesnt crash.

Performance:
Time (e.g. CPU time) Space (e.g. Memory)

Development time dont overlook it.


Is saving 5 seconds per run worth a week of your time? Unless the app. is time-critical, probably not.

Complexity:
Is the solution simple, intuitive, easy to understand?

Tradeoffs
As computer scientists, you will make many tradeoffs:
Space vs. time. Space and time vs. complexity. Everything vs. dev. time.
Optimization of an algorithm takes time and effort.

Making these tradeoffs wisely is what separates good from great.

Measuring Time and Space


An abstract notion, not usually measured in seconds, but units that only have meaning relative to one another.
For example, 2 takes twice as long as 1, but we have no idea how long 1 takes.

We care only about the order of growth.


Not how long something takes. Instead, how much longer it takes with more data.

Well start with Worst-case analysis.


Murphys Law: assume that everything that can slow the algorithm down will.
If your algorithm runs slower when it reads in the number 2, worst -case analysis would have you assume every element in the dataset is 2! However unlikely this may be it is the worst case, not the worst probable case.

The purpose of this is to find out the absolute longest amount of time / largest amount of space that an algorithm might use.

Theres also a best-case, which assumes the opposite.

Big-O (Asymptotic) Notation.


O(f(n)), where f(n) is some function; e.g.:
Best O(1) O(log n) O(n) O(n log n) O(n^2) O(n^p) O(2^n) Constant time Logarithmic Linear n log n (less common: Linearithmic) Quadratic Polynomial Exponential

Worst

This only indicates the dominant term of the algorithms growth.


For example, the function 3n^2 + 10n + 5 is O(n^2).

We dont take constant terms into account.


y = .0001x^2 is worse than y = 10000x. This is because as x gets large, the x^2 term will dominate.

Therefore, even if algorithm B takes twice as long as algorithm A, they would still belong to the same complexity class.
There is no such thing as O(2n); its just O(n). Constants are not counted.

Big-O visualized

How bad is exponential time?


This bad:
O(2^n)

O(n log n)

Other Symbols
Big-O is only one of the symbols used in asymptotic analysis. Others are and .
O represents an upper bound (no worse than). represents a lower bound (no better than). represents a tight bound (exactly; no better or worse than).

It is not wrong to say a linear function is O(n2); it just means we dont know enough to refine our estimate. O(n) represents a tighter bound and a more complete state of knowledge about the running time of the algorithm.
It is wrong to say the function is anything other than (n), however. An (n) bound, if correctly proven, will never change.

For the next few weeks, we will use O and interchangeably (many sources do, although its not precise to do so) and will avoid altogether. But do be aware of the difference; well start to distinguish between them later.

Optimality
Unfortunately, exponential time is not always avoidable. Certain problems cant be solved any faster. Thus, an optimal solution might not be fast
Its just the fastest one that can exist.

Our subject is the study of algorithms: how to solve problems efficiently. The study of how quickly problems can be solved by any algorithm is called complexity theory.
We unfortunately wont have time to cover it. There is a famous unsolved problem in this field known as P=NP?
This has to do with whether a certain class of problems whose solutions can be verified in polynomial time can also be solved in polynomial time.
If not, these algorithms are all exponential (or worse!)

Theres a $1 million prize for solving it, but its an extremely difficult question!

The Guessing Game: Writing a simple Java program.


Problem:
Guess a random number from 1 to 100. Prompt the user to input guesses. Print out whether the number is actually higher, lower, or equal to the users guess. Repeat until the user guesses correctly.

How would you do this in Java?


The answer is in Thursdays notes.

Decomposition of the problem


It helps to make problems simpler; to iteratively take them down to the level at which you can implement them. Higher/Lower/Equal translate into if comparisons between the number and the guess. Repeat until usually means use a loop. Prompt the user means youll be using Javas input facilities (hint: look up the Scanner class). Random numbers can be generated using Math.random() or the Random class. When in doubt about how to use a class, consult the Java API documentation!

Thats all for now!


The lesson:
Algorithmic thought exists outside of computer science. You can apply what you will learn in this class in many unexpected places.

Next class: The guessing game, review of asymptotic analysis, arrays.

Vous aimerez peut-être aussi