11/6/2015 Solutions 2 - Google Docs

Distributed Systems
Thilo Kielmann, Arno Bakker, Konstantin Urysov
Vrije Universiteit Amsterdam, University of Amsterdam
Fall 2015

Solutions to Exercise 2

Architectures
1. Consider Fig. 2-14 from the book:

How can you map the cases from left to right to modern Web (application) technology such as
HTML/CSS, JavaScript, PHP, and a SQL database?

A: The leftmost case depicts a “dumb terminal” with mostly pixel rendering capacities,
and part of the user interface running on a server. This is not the case with web
technology, as even simple HTML already lets the client build the whole user interface.
The second case (from the left) is the one where a static web page (with HTML and
CSS) is displayed by the browser, while all interactions are performed by the server. In
the middle case, parts of the application are running on the client (Web browser) by
means of embedded JavaScript, while other parts of the application are running on the
server, e.g. using PHP scripts. In the second case (from the right), the complete
application is running in the browser which is communicating directly with the database,
likely via AJAX GET and POST calls. In the rightmost case, even part of the storage is
located on the client. This applies when a (mobile) client stores or caches application
state to bridge a temporary loss of connectivity. A well-known example is the Chrome
browser, which allows Google documents to be edited and stored offline until the
connection to the Internet is restored.

https://docs.google.com/document/d/1g6ZdTvZqcJfbQMvlSZr2qZpSmXAMhADPMuz6PxUivzk/edit 1/5

2. Using the Linda model, write a simple bulletin board application in which some user
can start a discussion on a topic and to which other users can react. It is important that
everybody can see all contributions in the same order. In your favourite language (or in
pseudo code), write the following functions:
createTopic(topicName, firstPost)
getAllPosts(topicName)
addPost(topicName, newPost)

The three operations (rd, in, out) shown in class are not enough to make this happen.
Linda was extended early on with so-called predicate versions of in and rd, namely inp
and rdp. These versions do not block until a matching tuple becomes available; if there
is no matching tuple, they return immediately, indicating that nothing could be found.

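A: To order the posts, each post tuple carries a sequence number. In addition, each topic
has a counter tuple that holds the last sequence number handed out; it is initialized when
the topic is created. The counter is protected because in removes the tuple from the tuple
space, so only one writer at a time can take it, increment it, and put it back.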

function createTopic(topicName, firstPost){
  // Look up the tuple space that holds the bulletin board.
  var ts = linda.universe.rd("Bulletinboard", linda.tupleSpace);
  ts.out(topicName, 1, firstPost);     // the first post gets sequence number 1
  ts.out("counter", topicName, 1);     // last sequence number handed out
}

function getAllPosts(topicName){
  var i, p, ok, posts = [];
  var ts = linda.universe.rd("Bulletinboard", linda.tupleSpace);
  i = 1;
  do {
    ok = ts.rdp(topicName, i, p);      // non-blocking read; binds p on success
    if ( ok ) {
      posts.push(p);
      i = i + 1;                       // move on to the next sequence number
    }
  } while ( ok );
  return posts;
}

function addPost(topicName, newPost){
  var i;
  var ts = linda.universe.rd("Bulletinboard", linda.tupleSpace);
  ts.in("counter", topicName, i);      // removes the counter, so it acts as a lock
  ts.out("counter", topicName, i+1);
  ts.out(topicName, i+1, newPost);
}
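The pseudo code above can be tried out with a toy tuple space. The following is a minimal Python sketch, not a real Linda implementation: the class `TupleSpace`, the method name `in_` (Python reserves `in`), and the convention that `None` in a pattern matches any value are all assumptions made for this illustration.

```python
import threading


class TupleSpace:
    """Toy in-memory tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def _find(self, pattern):
        for t in self._tuples:
            if len(t) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, t)
            ):
                return t
        return None

    def out(self, *tup):
        """Add a tuple and wake up any blocked readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def in_(self, *pattern):
        """Blocking, destructive read (Linda's in)."""
        with self._cond:
            while (t := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(t)
            return t

    def rdp(self, *pattern):
        """Non-blocking read: returns the tuple, or None if nothing matches."""
        with self._cond:
            return self._find(pattern)


def create_topic(ts, topic, first_post):
    ts.out(topic, 1, first_post)
    ts.out("counter", topic, 1)


def add_post(ts, topic, new_post):
    _, _, i = ts.in_("counter", topic, None)  # take the counter: acts as a lock
    ts.out("counter", topic, i + 1)
    ts.out(topic, i + 1, new_post)


def get_all_posts(ts, topic):
    posts, i = [], 1
    while (t := ts.rdp(topic, i, None)) is not None:
        posts.append(t[2])
        i += 1
    return posts


ts = TupleSpace()
create_topic(ts, "linda", "What is a tuple space?")
add_post(ts, "linda", "A shared associative memory.")
print(get_all_posts(ts, "linda"))
```

Because every writer has to take the counter tuple before it can publish, two concurrent calls to `add_post` can never obtain the same sequence number, which is what gives all readers the same order.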


Processes
1. What is the difference between a process and a thread?

A: A process is a program in execution: code, together with the data it operates on, that
is currently being executed on a machine. Its data and code are protected from other
processes by means of the hardware and the operating system. A thread is also a piece
of code in execution, but it runs within a process, and its data is not protected from
other threads in that process.
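The lack of protection between threads can be illustrated with a small Python sketch. The names `shared`, `lock`, and `worker` are made up for this example. All four threads update the same dictionary directly, which is only possible because they live in one process; separate processes would have to exchange the value explicitly, for example over a pipe or a socket.

```python
import threading

shared = {"hits": 0}     # lives in the address space of the one process
lock = threading.Lock()  # threads have to coordinate access themselves


def worker(n):
    for _ in range(n):
        with lock:       # without the lock, concurrent updates could be lost
            shared["hits"] += 1


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["hits"])
```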

2. What is an asynchronous (non-blocking) I/O operation? How can you implement
asynchronous I/O when your communication API is synchronous?

A: An asynchronous I/O operation is an operation that does not wait for the I/O (e.g.
reading from a disk, or receiving a network packet) to complete. Instead, it instructs the
system to perform the I/O and returns immediately. The system may notify the caller when
the I/O has completed, or the caller will find the result on the next invocation (cf.
network packets).
You can build asynchronous I/O operations from synchronous ones by calling them
within a dedicated thread, so that only this thread blocks until the operation is
complete.
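The dedicated-thread approach might look as follows in Python. This is a sketch under assumptions: `blocking_read` is a hypothetical stand-in for a synchronous API call (it only sleeps), and the wrapper class `AsyncRead` with its `done` and `result` methods is invented for this example. Error handling is left out.

```python
import threading
import time


def blocking_read(delay):
    """Stand-in for a synchronous I/O call: blocks, then returns data."""
    time.sleep(delay)
    return b"payload"


class AsyncRead:
    """Runs a blocking call in its own thread so the caller does not wait."""

    def __init__(self, func, *args):
        self._result = None
        self._done = threading.Event()
        helper = threading.Thread(target=self._run, args=(func, args), daemon=True)
        helper.start()

    def _run(self, func, args):
        self._result = func(*args)  # only this helper thread blocks
        self._done.set()

    def done(self):
        """Poll whether the result has arrived yet."""
        return self._done.is_set()

    def result(self):
        """Wait for the result; the caller blocks only if it chooses to."""
        self._done.wait()
        return self._result


op = AsyncRead(blocking_read, 0.2)  # returns immediately
# ... the caller is free to do other work here ...
data = op.result()
print(data)
```

Python's standard library offers the same pattern ready-made through `concurrent.futures.ThreadPoolExecutor`; the hand-written version is shown only to make the mechanism visible.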

3. Does it make sense to use threads on a single-core CPU? Will multi-threaded
applications run faster on a single core?

A: Yes, it makes sense. Like processes, threads time-share the CPU, and they offer
programmers a convenient way of structuring their programs. If an application performs a
lot of (network) I/O on which it has to wait, letting other threads compute in the
meantime may actually speed up the program, even though there is only a single core.
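The effect of overlapping waits can be observed with a short Python experiment. It is only a sketch: `fetch` is a hypothetical stand-in for a network request that simulates waiting with `time.sleep`, and the request count and delay are arbitrary. While a thread waits it does not need the CPU, so the waits overlap regardless of the number of cores.

```python
import threading
import time


def fetch(delay):
    """Stand-in for a network request: the thread waits, the CPU stays idle."""
    time.sleep(delay)


def run_sequential(n, delay):
    start = time.monotonic()
    for _ in range(n):
        fetch(delay)
    return time.monotonic() - start


def run_threaded(n, delay):
    start = time.monotonic()
    threads = [threading.Thread(target=fetch, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


seq = run_sequential(4, 0.1)
thr = run_threaded(4, 0.1)
print(f"sequential: {seq:.2f}s, threaded: {thr:.2f}s")
```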

4. VM images such as AMIs can be quite big. How does this impact cloud providers that
have many customers creating many different virtual machines all the time?

A: If there are many customers with various AMIs of 10 or more Gigabytes of data, the
cloud provider needs to devise a distribution and storage solution for these images that
is efficient and allows VMs to be deployed quickly on all clusters. Building such a
solution is quite a challenge. One property that can be exploited is that the number of
operating systems and OS distributions used is fairly small. So the disk blocks making
up that part of the VM image can be cached in the clusters, reducing the amount of
unique data that needs to be transported to the VM host.
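One way to exploit shared blocks is to identify each disk block by a hash of its contents and to transfer only the blocks a host does not hold yet. The Python sketch below illustrates that idea and is not a description of how any particular cloud provider works; the block size, the function names, and the toy images are assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes; an assumed block size for this sketch


def block_hashes(image: bytes):
    """Split an image into fixed-size blocks and return each block's SHA-256 digest."""
    return [
        hashlib.sha256(image[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(image), BLOCK_SIZE)
    ]


def blocks_to_transfer(image: bytes, cache: set):
    """Return the digests of the unique blocks that are not in the cache yet."""
    return set(block_hashes(image)) - cache


# Two toy images that share the same OS part but differ in customer data.
os_part = b"\x01" * (8 * BLOCK_SIZE)
image_a = os_part + b"A" * (2 * BLOCK_SIZE)
image_b = os_part + b"B" * (2 * BLOCK_SIZE)

cache = set(block_hashes(image_a))  # the host has already deployed image A
missing = blocks_to_transfer(image_b, cache)
print(len(block_hashes(image_b)), "blocks in image B,", len(missing), "unique block(s) to transfer")
```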

5. Are Web servers stateless or stateful?

A: This depends on the implementation of the sites that the Web server hosts. In
general, the HTTP protocol prescribes that each request is independent of the others.
This lack of state has led to developments like “cookies”, which store some application
state on the client side. The client sends this state along with the requests of a user’s
session, adding some state to the otherwise stateless protocol.
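How a cookie carries state for a stateless server can be sketched with Python's standard `http.cookies` module. The handler `handle_request`, the cookie name `visits`, and the plain dictionaries used for headers are assumptions of this example; no real HTTP traffic is involved.

```python
from http.cookies import SimpleCookie


def handle_request(headers: dict):
    """Stateless handler: all it knows about the session arrives with the request."""
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    visits = int(cookie["visits"].value) + 1 if "visits" in cookie else 1

    reply = SimpleCookie()
    reply["visits"] = str(visits)
    response_headers = {"Set-Cookie": reply["visits"].OutputString()}
    return response_headers, f"visit number {visits}"


# A toy client that sends the cookie back with each request, as a browser would.
headers = {}
for _ in range(3):
    response_headers, body = handle_request(headers)
    headers = {"Cookie": response_headers["Set-Cookie"]}
    print(body)
```

The server keeps nothing between calls: a request without the cookie is treated as a first visit again.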

6. What is the difference in request dispatching for local-area and wide-area clusters? At
what point will we need a redirection policy?

A: A server cluster uses a frontend switch (machine) that accepts incoming (TCP)
requests from clients and forwards them to a selected server machine. In a local-area
cluster, the switch can use TCP handoff so that it does not become a performance
bottleneck by forwarding all the data from clients to servers and back. In a wide-area
cluster, TCP handoff cannot be used, because the server would need to pretend to have
the IP address of the switch, which is in a different IP network than its own. As a
result, the client cannot communicate with the server, as its packets will not be routed
to it.
Keeping the frontend switch in the loop for all data transfers may not scale. At that
point a redirection policy is needed: the switch redirects clients to the switch machines
of remote clusters, which can in turn use TCP handoff with their local server machines.
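The decision a frontend switch has to make can be sketched in Python as follows. This is a toy policy, not a real load balancer: the `Cluster` class, the per-server connection limit, and the rule "redirect once every local server is full" are assumptions chosen only to show where handoff ends and redirection begins.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    servers: dict = field(default_factory=dict)  # server name -> active connections
    capacity: int = 2                            # assumed per-server connection limit

    def least_loaded(self):
        return min(self.servers, key=self.servers.get)


def dispatch(local: Cluster, remotes: list):
    """Return ("handoff", server), ("redirect", cluster name), or ("reject", None)."""
    server = local.least_loaded()
    if local.servers[server] < local.capacity:
        local.servers[server] += 1
        return ("handoff", server)            # TCP handoff works inside the cluster
    for remote in remotes:                    # local cluster is full: redirect
        candidate = remote.least_loaded()
        if remote.servers[candidate] < remote.capacity:
            return ("redirect", remote.name)  # client reconnects to the remote switch
    return ("reject", None)


local = Cluster("ams", {"a": 0, "b": 0}, capacity=1)
remote = Cluster("nyc", {"x": 0}, capacity=1)
for _ in range(3):
    print(dispatch(local, [remote]))
```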

7. Wide-area redirection requires a method for measuring the distance between two IP
addresses. Think of two different methods and discuss pros and cons.

A: One class of methods maps an IP address to a geographical location and uses
geographical distance as the metric. This makes sense because, in an ideal world, network
latency is bounded only by the speed of light. So for a client in Western Europe, a server
in Africa should be faster to reach than a US server. In reality, however, a server in
Africa is likely to be slower to reach than a US one, because of the layout of network
connections on that continent. It may therefore be better to use network information such
as that provided by the Border Gateway Protocol (BGP). We return to this topic in Chapter 6.
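The geographical metric can be computed with the haversine formula for great-circle distance. The Python sketch below assumes that the coordinates have already been obtained from some IP geolocation database; the city coordinates are approximate, and the server names are made up. As the answer points out, the geographically nearest server is not necessarily the fastest one to reach.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two (latitude, longitude) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, assumed to come from an IP geolocation lookup.
client = (52.37, 4.90)  # Amsterdam
servers = {
    "tunis": (36.81, 10.18),
    "new-york": (40.71, -74.01),
}
nearest = min(servers, key=lambda name: great_circle_km(*client, *servers[name]))
print(nearest)
```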

8. What problems will you need to solve to allow live migration of virtual machines between
different wide-area clusters?


A: When a VM moves between different clusters, its IP address will have to change. One
solution is to use Mobile IP. In addition, the virtual disk used by the VM needs to be
migrated to the other cluster. Normally, a single cluster shares a storage facility for
the virtual disks, so that their VMs can easily be moved within the cluster.
The need to move the virtual disk can greatly impact migration speed.

9. According to Fuggetta (Note 3.9) there are three segments in a process. Which segment
do you think is typically more difficult to migrate?

A: The code segment is relatively easy to migrate; as long as the receiving machine is
able to execute the same format, there is no problem here. Likewise, the execution
segment, holding the current execution state, does not pose much of a challenge.
The resource segment, however, contains links to local resources. These resources
(files, devices, …) cannot easily be re-established on another machine.
