
Let's talk about arrays. So why would we ever want to use arrays?

Well, let's say you have a program that needs to store 5 student IDs. It might
seem reasonable to have 5 separate variables. For reasons we'll see in a bit,
we'll start counting from 0. The variables we'll have will be int id0, int id1,
and so on. Any logic we want to perform on a student ID will need to be copied
and pasted for each of these student IDs. If we want to check which students
happen to be in CS50, we'll first need to check if id0 represents a student in
the course. Then to do the same for the next student, we'll need to copy and
paste the code for id0 and replace all occurrences of id0 with id1, and so on
for id2, id3, and id4.
As soon as you hear that we need to copy and paste, you should start thinking
that there is a better solution. Now what if you realize you don't need 5
student IDs but rather 7? You need to go back into your source code and add in
an id5 and an id6, and copy and paste the logic for checking if the IDs belong
to the class for these 2 new IDs. There is nothing connecting all these IDs
together, and so there is no way of asking the program to do this for IDs 0
through 6. Well, now you realize you have 100 student IDs. It's starting to seem
less than ideal to need to separately declare each of these IDs, and copy and
paste any logic for those new IDs. But maybe we are determined, and we do it for
all 100 students. But what if you don't know how many students there actually
are? There are just some n students, and your program has to ask the user what
that n is. Uh oh. This isn't going to work very well. Your program only works
for some constant number of students.
Solving all of these problems is the beauty of arrays. So what is an array? In
some programming languages an array type might be able to do a bit more, but
here we'll focus on the basic array data structure just as you'll see it in C.
An array is just a big block of memory. That's it. When we say we have an array
of 10 integers, that just means we have some block of memory that is large
enough to hold 10 separate integers. Assuming that an integer is 4 bytes, this
means that an array of 10 integers is a contiguous block of 40 bytes in memory.
Even when you use multidimensional arrays, which we won't go into here, it's
still just a big block of memory. The multidimensional notation is just a
convenience. If you have a 3 by 3 multidimensional array of integers, then your
program will really just treat this as a big block of 36 bytes. The total number
of integers is 3 times 3, and each integer takes up 4 bytes.
Let's take a look at a basic example. We can see here 2 different ways of
declaring arrays. We'll have to comment 1 of them out for the program to compile
since we declare x twice. We'll take a look at some of the differences between
these 2 types of declarations in a bit. Both of these lines declare an array of
size N, where we have #define N as 10. We could just as easily have asked the
user for a positive integer and used that integer as the number of elements in
our array. Like our student ID example before, this is kind of like declaring 10
completely separate imaginary variables: x0, x1, x2, and so on up to xN-1.
Ignoring the lines where we declare the array, notice the square bracket syntax
inside the for loops. When we write something like x[3], which I'll just read as
x bracket 3, you can think of it like asking for the imaginary x3. Notice that
with an array of size N, this means that the number inside of the brackets,
which we'll call the index, can be anything from 0 to N-1, which is a total of N
indices.
To think about how this actually works, remember that the array is a big block
of memory. Assuming that an integer is 4 bytes, the entire array x is a 40-byte
block of memory. So x[0] refers to the very first 4 bytes of the block. x[1]
refers to the next 4 bytes, and so on. This means that the start of x is all the
program ever needs to keep track of. If you want to use x[400], then the program
knows that this is equivalent to just 1,600 bytes after the start of x. Where'd
we get 1,600 bytes from? It's just 400 times 4 bytes per integer.
Before moving on, it's very important to realize that in C there is no
enforcement of the index that we use in the array. Our big block is only 10
integers long, but nothing will yell at us if we write x[20] or even x[-5]. The
index doesn't even have to be a literal number. It can be any expression that
evaluates to an integer. In the program we use the variable i from the for loop
to index into the array. This is a very common pattern: looping from i = 0 up
to, but not including, the length of the array, and then using i as the index
for the array. In this way you effectively loop over the entire array, and you
can either assign to each spot in the array or use it for some calculation.
In the first for loop, i starts at 0, and so it will assign to spot 0 in the
array the value 0 times 2. Then i increments, and we assign to spot 1 in the
array the value 1 times 2. Then i increments again, and so on up until we assign
to position N-1 in the array the value N-1 times 2. So we've created an array
with the first 10 even numbers. Maybe evens would have been a bit better name
for the variable than x, but that would have given things away. The second for
loop then just prints the values that we have already stored inside of the
array.
Let's try running the program with both types of array declarations and take a
look at the output of the program. As far as we can see, the program behaves the
same way for both types of declarations. Let's also take a look at what happens
if we change the first loop to not stop at N but rather, say, 10,000. Way beyond
the end of the array. Oops. Maybe you've seen this before. A segmentation fault
means your program has crashed. You start seeing these when you touch areas of
memory you shouldn't be touching. Here we are touching 10,000 places beyond the
start of x, which evidently is a place in memory we shouldn't be touching. So
most of us probably wouldn't accidentally put 10,000 instead of N, but what if
we do something more subtle, like, say, write less than or equal to N in the for
loop condition as opposed to less than N? Remember that an array only has
indices from 0 to N-1, which means that index N is beyond the end of the array.
The program might not crash in this case, but it's still an error. In fact, this
error is so common that it has its own name: an off-by-one error.
That's it for the basics. So what are the major differences between the 2 types
of array declarations? One difference is where the big block of memory goes. In
the first declaration, which I'll call the bracket-array type, though this is by
no means a conventional name, it will go on the stack. Whereas in the second,
which I'll call the pointer-array type, it will go on the heap. This means that
when the function returns, the bracket array will automatically be deallocated,
whereas you must explicitly call free on the pointer array or else you have a
memory leak. Additionally, the bracket array isn't actually a variable that you
can reassign. This is important. Its name is just a symbol. You can think of it
as a constant that the compiler chooses for you. This means that we can't do
something like x++ with the bracket type, though this is perfectly valid with
the pointer type.
The pointer type is a variable. For the pointer type, we have 2 separate blocks
of memory. The variable x itself is stored on the stack and is just a single
pointer, but the big block of memory is stored on the heap. The variable x on
the stack just stores the address of the big block of memory on the heap. One
implication of this is with the sizeof operator. If you ask for the size of the
bracket array, it will give you the size of the big block of memory, something
like 40 bytes, but if you ask for the size of the pointer type of array, it will
give you the size of the variable x itself, which on the appliance is likely
just 4 bytes (on a 64-bit system it would typically be 8). Using the
pointer-array type, it is impossible to directly ask for the size of the big
block of memory. This isn't usually much of a restriction since we very rarely
want the size of the big block of memory, and we can usually calculate it if we
need it.
Finally, the bracket array happens to provide us with a shortcut for
initializing an array. Let's see how we could write the first 10 even integers
using the shortcut initialization. With the pointer array, there isn't a way to
do a shortcut like this.
This is just an introduction to what you can do with arrays. They show up in
almost every program you write. Hopefully you can now see a better way of doing
the student IDs example from the beginning of the video.
My name is Rob Bowden, and this is CS50.
