
Conference Title
The Fourth International Conference on Digital Information Processing, E-Business and Cloud Computing (DIPECC2016)

Conference Dates
September 6-8, 2016

Conference Venue
Asia Pacific University of Technology and Innovation (APU), Malaysia

ISBN
978-1-941968-37-6 © 2016 SDIWC

Published by
The Society of Digital Information and Wireless Communications (SDIWC)
Wilmington, New Castle, DE 19801, USA
www.sdiwc.net

Table of Contents

Approaches to IT Service Management in Improving IT Management in the Banking Sector: A Case Study in Tanzanian Banks
Collect and Disseminate Layer Protocol for Searching Cloud Services
Practicality of Migration to Cloud Computing
Domain Oriented Divisible Load Scheduling in the Private Cloud
FPGA-Based Processor Array Architecture for Profile Hidden Markov Models


Approaches to IT Service Management in Improving IT Management in the Banking Sector: A Case Study in Tanzanian Banks

Jiten Kirit Vaitha
School of Computing & Technology
Asia Pacific University of Technology & Innovation
Kuala Lumpur, Malaysia
jiten.vaitha@gmail.com

Nicholas Jeremy Francis
School of Computing & Technology
Asia Pacific University of Technology & Innovation
Kuala Lumpur, Malaysia
nicholas_jeremy@apu.edu.my

Abstract — This paper discusses how various frameworks and guidelines can be used to improve the current management of IT departments in Tanzanian banks, after first identifying the challenges the banks currently face. IT management involves managing, according to priority, all the resources in an organization or firm that make use of information technology. Good management of these resources brings benefits in staffing and budgeting, organization and control, and the use of technology. In today's world, many new firms that rely heavily on information technology are emerging, which at times makes managing IT infrastructure complex. Organizations therefore have to make use of guidelines that can help them manage their IT infrastructure in the desired way and bring about profitable benefits.

Keywords — ITIL, COBIT, ITSM, IT, ISO, Banks

INTRODUCTION
Information Technology Service Management (ITSM) refers to operating the IT side of an organization by focusing on the day-to-day services provided by its IT department, so as to meet customers' needs. ITSM treats the IT provided by the organization as a service function. Since technology is growing rapidly, it is difficult for IT service providers to afford new technology all the time; instead, they focus more
on giving a better quality of service to the customers [1].
A bank is a type of business organization that needs proper IT services in order to operate and deliver its services to customers in the most efficient way. Due to the rapid growth of IT, systems should be more flexible and scalable in order to improve the agility of the organization. In some situations, IT managers have to look for cost-effective ways to handle IT management, which is done by outsourcing IT services [2].
There are different approaches to ITSM. Some of the common ones are IT Governance, the Information Technology Infrastructure Library (ITIL), Control Objectives for Information and related Technology (COBIT), ISO/IEC 20000, and Business Service Management (BSM) [3]. Implementing ITSM brings changes in organizational culture, new processes and staff training, which help reorganize the way IT-supported business is done. ITSM is regarded as a form of Business Process Change Management (BPCM) and is of high concern because it requires considerable resources and expenditure [4].
IT GOVERNANCE
Reference [3] notes that IT governance differs from the other terms used for ITSM frameworks in that it does not focus on managing the IT organization on a daily basis, but aims at the critical factors required to


achieve business needs in alignment with IT. It refers to applying decision-making rights within the structure of an organization [3].
IT governance concerns the management of IT at the top level of decision making, through the provision of adjustment and control; the organization's duty is to specify the risk drivers and the business value so that the governance processes and objectives can be monitored [5].
In the banking sector, good IT governance can help manage finances as well as risks in a well-considered manner. For a developing country like Tanzania, good IT governance can expose banks to new IT challenges and emerging technologies that can help make banking services better.
ITIL
ITIL is one of the frameworks that best complies with ITSM; the current version is ITIL V3. It is mainly aimed at the provision of top-quality IT services that can be used for ITSM. Two main reasons for this are that ITIL is more customer-focused and that it provides effective governance for IT [6].
Organizations are moving towards ITIL because it provides a way to manage IT services in a systematic manner. It can help an organization improve customer satisfaction and productivity as well as reduce costs. ITIL is focused on the project life cycle [7].
Many organizations are moving towards using ITIL, but the question is: do they follow the exact guidelines? ITIL provides specific guidelines so that an organization can run its business by providing good quality service to customers while focusing on ITSM. A good implementation of this would bring improvements and benefits to the banking sector in Tanzania.

COBIT
COBIT is a standard widely used by organizations around the world. It is considered one of the most applicable frameworks for an organization to achieve business-related goals by making use of

IT, since its control objectives are focused on the needs of the business. There are 34 control objectives associated with the framework, which help balance IT-related risks against investment in IT controls [8].
For organizations to stay competitive, it is important to manage the progress of different projects as they move through their life cycle. COBIT is one of the best frameworks for management control; however, it receives little academic attention even though it is widely used in IT organizations [9].
Organizations in the current world have to be highly dependent on IT to improve business performance, and they must ensure that risks are properly managed. The COBIT framework helps manage these risks by focusing on the internal control of the organization, which consists of the different processes, methods, policies and organizational structures constructed to give acceptable assurance that business objectives can be met while unwanted factors are identified and corrected [10].
ISO/IEC 20000
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard usually called ISO 20000 aims to improve the way IT services are delivered by providing the best methods to be used for service management. ISO 20000 and ITIL are very compatible, even though their approaches are different. ISO 20000 has two parts: part 1 is a standard to be adhered to, which helps in the assessment of an organization, and part 2 acts as a base supporting part 1 with guidance [11].
According to [2], ISO 20000 is a dedicated standard for ITSM and was one of the first such standards to be adopted around the world. IT services use this standard as a benchmark because it provides specifications to service providers and indicates whether the service management standard of the organization is adequate for use.
Unlike the other frameworks, ISO 20000 is a standard that helps improve IT service delivery by looking at methods that can


satisfy service management so that the best outcome can be achieved; it is also used to measure the maturity of ITSM. To produce the best results, it is used together with the ITIL framework so that all related issues between business and IT can be covered in a systematic way.
COMPARING ITIL AND COBIT
Most organizations that use IT as an enabler are focused on how to achieve their goals and improve their business practices. These organizations use the guidelines from ITIL and COBIT to overcome the challenges they face in aligning IT with the business [12]. ITIL and COBIT focus on similar areas, as follows [12]:
1. To provide efficient IT governance and control so that IT can meet all the requirements of the organization.
2. To create the best ITSM processes so that IT can be managed properly from a business perspective and business goals can be reached.
3. To provide a way to measure the progress of the business as well as the process goals of the organization.
Reference [12] also points out that companies are still confused by the two frameworks, treating them as two different approaches to achieving the same goal. In fact, they complement each other, and if used together they contribute higher value: ITIL is focused on how to overcome the challenges, while COBIT is focused on what needs to be done to overcome them.

ITIL VS COBIT
Table I shows the main differences identified between the ITIL and COBIT frameworks.

TABLE I. DIFFERENCES BETWEEN ITIL AND COBIT

| ITIL | COBIT |
| Focused on operations management through the provision of best practices and principles. | Focused on IT governance by defining, implementing, measuring and improving definite processes involved in the IT life cycle. |
| Improves the effectiveness and quality of IT customer service and IT operations. | Improves IT governance on a qualitative and quantitative basis. |
| Focused on HOW to meet the challenge. | Focused on WHY to meet the challenge. |
| Focused more on ITSM, using the five stages of the service life cycle (service strategy, service design, service transition, service operation and continual service improvement). | Based on five principles (meet stakeholder needs, single integrated framework, holistic approach, separation of governance and management, and end-to-end coverage of the enterprise). |

FINDINGS ON BANKS IN TANZANIA
Research was conducted on 11 different banks in Tanzania to understand what kind of frameworks or guidelines they use to manage their IT services. Questionnaires were sent to IT staff to provide the requirements of the research and to identify which areas are the most challenging for them. The following areas, identified by the banks, were found to be the most challenging: Helpdesk, Operations, New Technology and Customer Service (Figure 1).

Figure 1. Pie chart identifying the most challenging management areas in the banks

From the research conducted on the 11 banks, it was noticed that every bank complies with its own guidelines in order to manage its IT department. No specific guideline is followed fully, and it can be seen that some

of the banks have started to adopt well-known guidelines such as ITIL. The majority of the banks sampled for the research still use the Microsoft Operations Framework (MOF) and ISO standards alone to manage their IT.
One question asked how strictly the banks follow their framework on a scale of 0-10 (Figure 2). Most of the banks follow their chosen framework to about 80% but do not follow it strictly, apart from one bank; some banks follow it only to about 50% or 60%. ITIL is used by some of the banks, but if it is not fully implemented the bank may not get the full benefit of the framework, so improvements are needed.

Figure 2. Bar graph showing how strictly the banks follow their frameworks

Nine of the 11 banks use their own guidelines to help them resolve their challenges. One of the banks has implemented a recent technology called ServiceNow, which helps it manage the helpdesk area. A question was asked to identify how costly it is to manage any incidents, based on a scale of 0-10 (Figure 3). Most of the banks mentioned IT equipment, networks and applications as the most expensive parts to manage in case of a problem; some said their database would be the most costly area in case of any failure. Adopting new technology would also raise costs due to equipment replacement.

Figure 3. Bar graph showing how costly it is for the banks to manage and solve problems

In summary, the questions asked mainly concerned the different problems that the banks face, the areas that have to be improved, and their current problem-solving methods.
IDENTIFIED AREAS OF CHALLENGES
Some common areas of challenge were found across the 11 banks. These areas were mainly helpdesk, operations, customer service and new technology.
1. Helpdesk - This area is concerned with technical problems, where customers can call and seek help to solve a problem. Most of the banks mentioned this as a challenge when their customers need information regarding a certain problem, because some issues take time and cannot be resolved immediately, yet the customer demands an immediate solution.
2. Customer service - This is considered a very important area in the banks, because the banks need to care for customers so that their needs and requirements can be met. Good customer service can help improve customer satisfaction and ensures that customers keep using the services of that bank. However, this is also one of the challenges identified by the banks, because there are different types of people and each needs to be handled in their own way. For this reason, the banks have to make the utmost effort to fulfil everyone's needs in a timely manner in order to avoid complaints.





3. Operations - This concerns the daily tasks actively operated by the banks so that services can be carried out efficiently. This was also one of the major areas of challenge seen in the research, faced by most of the banks, because the banks have to deliver quality services to both their staff and their customers. Close monitoring and observation are therefore needed to provide a better service. Customers require immediate services and results, making it difficult for the banks to maintain their service level for internal and external operations.
4. New technology - Tanzania being a developing country, there are always new technologies to adapt to and/or adopt in order to stay current. The banking sector is moving very quickly towards new technology, which is a challenge for some of the banks because newer technology might not be compatible with older versions, so the whole system has to be changed. New technology can help improve processes, but on the other hand it can become a challenge.
LIMITATIONS OF THE RESEARCH
Data gathering focused on the IT departments of the banks, which are among the most active parts of the banks. Information from IT departments is highly confidential, so it is not known how accurate the data provided by the respondents was. There may be other areas of challenge in other banks; however, this research was aimed at 11 banks in order to study whether there is any trend in the areas they identified.

RECOMMENDATIONS
To tackle all the areas of challenge faced by the banking sector, a combined framework of the Microsoft Operations Framework (MOF), ITIL and COBIT is recommended. Since some of the banks already practice and comply with at least one of the three, they should also use elements of the


other frameworks to minimize the occurrence of challenges. Combining these three frameworks would produce a positive and optimal result in the areas of operations, customer service and service desk. At present, the majority of the banking sector does not follow the readily existing frameworks, so more awareness of the frameworks would help improve business efficiency and bring more advantages. To achieve this, senior management should study the importance of the frameworks so that changes can take place. At the same time, they should promote the frameworks and motivate their employees to explore their capabilities. Furthermore, the banks should be aware of the initial cost of implementing a combined framework: it can be high, but the return in the long run will be much higher. To accomplish this, the banking sector should take small but vital steps progressively towards implementing the combined frameworks for continuous improvement.
CONCLUSION
The research brought to light the different methods the banks in Tanzania are adopting to manage IT services, as well as the challenges they are currently facing. It shows how important it is for the banks to follow a good ITSM method so that optimum results can be achieved in managing the business. There is still poor awareness of ITIL in Tanzania, and for some of the banks that have already started implementing it, customers are still not aware of the standard procedures when requesting services from the IT department.
There should therefore be more awareness campaigns for the banks about existing frameworks such as ITIL and COBIT, to increase business efficiency. The banks should consider implementing the existing frameworks fully and understanding how profitable they could be by bringing changes to their operational processes. If training is provided by senior management to the employees, they will be aware of what changes can be made.


It is undeniable that technology keeps advancing every day and is necessary in today's era. For businesses to serve their customers best, they have to adopt new technologies, and the banking industry is no exception. Thus, it is necessary to invest in IT and apply better ITSM processes to improve operations and customer service.

REFERENCES
[1] Iden, J. and Eikebrokk, T. (2013). Implementing IT Service Management: A systematic literature review. International Journal of Information Management, 33(3), pp. 512-523.
[2] Kumbakara, N. (2008). Managed IT services: the role of IT standards. Information Management & Computer Security, 16(4), pp. 336-359.
[3] Winniford, M., Conger, S. and Erickson-Harris, L. (2009). Confusion in the Ranks: IT Service Management Practice and Terminology. Information Systems Management, 26(2), pp. 153-163.
[4] Wu, M., Huang, S. and Chen, L. (2011). The preparedness of critical success factors of IT service management and its effect on performance. The Service Industries Journal, 31(8), pp. 1219-1235.
[5] Zhu, D. and Li, F. (2014). The IT governance: operating model and governance framework. 2014 International Conference on Management of e-Commerce and e-Government, 1(1), p. 4.
[6] Ahmad, N., Tarek Amer, N., Qutaifan, F. and Alhilali, A. (2013). Technology adoption model and a road map to successful implementation of ITIL. Journal of Enterprise Information Management, 26(5), pp. 553-576.
[7] Cervone, F. (2008). ITIL: a framework for managing digital library services. OCLC Systems & Services: International digital library perspectives, 24(2), pp. 87-90.
[8] Ridley, G., Young, J. and Carroll, P. (2008). Studies to Evaluate COBIT's Contribution to Organisations: Opportunities from the Literature, 2003-06. Australian Accounting Review, 18(4), pp. 334-342.
[9] Bernroider, E. and Ivanov, M. (2011). IT project management control and the Control Objectives for IT and related Technology (COBIT) framework. International Journal of Project Management, 29(3), pp. 325-336.
[10] Kerr, D. and Murthy, U. (2013). The importance of the COBIT framework IT processes for effective internal control over financial reporting in organizations: An international survey. Information & Management, 50(7), pp. 590-597.
[11] Cooper, L. (2008). ISO20000 qual complements ITIL. ITNOW, 50(4), pp. 20-20.
[12] Hill, P. and Turbitt, K. (2015). Combine ITIL and COBIT to Meet Business Challenges. BMC Software, 1(1), p. 16.


Collect and Disseminate Layer Protocol for Searching Cloud Services

Dr Thamilvaani Arvaree @ Alvar
University of Nottingham Malaysia Campus
Faculty of Science, Jalan Broga, 43500 Semenyih, Selangor Darul Ehsan
Thamil.Vaani@nottingham.edu.my

ABSTRACT

Cloud computing is a contemporary technology on which the research community has recently embarked. Developers with innovative ideas for new Internet services no longer require large capital outlays in hardware to deploy their service or the human expense to operate it.
Searching for cloud services on the Internet is becoming difficult for end users due to the variety of services and resources offered in the clouds. Philip has pointed out in detail the absence of standards, open protocols and search mechanisms for discovering different kinds of clouds [1]. The lack of common cloud standards has delayed interoperability across providers, leading cloud customers to face challenges and problems in selecting the right service provider to meet their needs [2]. The main objective of this research is to enhance this environment with a search protocol that eases the task of users seeking cloud services.

KEYWORDS
Cloud computing, searching protocol, cloud services, protocol.

1 INTRODUCTION

Cloud computing is a paradigm for large-scale distributed computing that makes use of existing technologies such as virtualization, service orientation and grid computing. It offers different ways to acquire and manage IT resources on a larger scale. Even though cloud computing provides numerous benefits to users, many risks are still involved in this technology. Risks are classified into four main groups, namely operational risks, contingent risks, security risks and business risks. The common sub-categories of operational risks are availability, reliability, integrity, fit and maintainability [3].

2 LITERATURE REVIEW
This research mainly focuses on fit. Fit in this study means matching the user's search query in the cloud environment, or more specifically, how accurately users obtain their services from the available service providers. The possible risk in meeting the user's query occurs when the available services mismatch the user's search query [3]. From a detailed study of existing research and a preliminary study, the accuracy of meeting and matching the user's query involves three main categories of issues or gaps, namely services, searching, and standards or protocols. The following sections explain them in detail.


2.1 Focused Gap: Cloud Services

Meeting the user's search query is an essential task at the user level in a cloud environment. The cloud computing environment is expected to provide clear and easy interfaces, concise information and other supportive services to users. Cloud service providers offer their available services to users through web-based applications, where users are required to have basic skills and knowledge to access the correct services. Having these basic skills and knowledge helps users reduce the number of searches needed to find the correct services. Accessing the web-based application of a service provider and providing an accurate search query is a challenging issue for the user. However, Kang states that very limited research focuses on search engines and web portals for cloud computing systems aimed at users who want to find cloud services [4]. Therefore, besides formulating the search query, the number of available resources from which to obtain the services is also an issue for users.
2.2 Focused Gap: Searching
There are many existing generic search engines that users can use to find cloud services. These engines may return URLs of webpages that are not relevant to the user's original search query, and visiting all the webpages is a time-consuming job. Generic search engines (e.g., Google, MSN) are very effective tools for finding URLs for generic user queries, but their designs are inappropriate for capturing the relations among different types of cloud services and determining which service(s) would best meet a consumer's needs [5].
Searching for cloud services on the Internet is difficult for users because of the lack of portal sites and search mechanisms specialized for cloud computing [4]. Users have to perform the search using their own knowledge, entering keywords randomly into common search engines.
2.3 Focused Gap: Standard/Protocol
Explosive growth in the cloud computing market has resulted in a variety of heterogeneous and poorly interoperable cloud infrastructures. In recent years the number of cloud service providers has increased, and at the same time the lack of common cloud standards has further delayed interoperability across these providers. This leads cloud users to face challenges and problems in selecting the right service provider to meet their queries [2]. Standard interfaces and intermediate services are therefore needed to overcome the interoperability issues. Kang states that the absence of standards, open protocols and search mechanisms for discovering different kinds of clouds leads users to search the clouds randomly. Besides Kang, Philip has mentioned that, with advances in hardware and software, more and more medium and small companies may introduce clouds aimed at specific areas to the cloud computing community. At that point, manual search will not be an appropriate approach for service discovery, and users will find it difficult to locate the right services and select the right service providers without proper standards, procedures or protocols.
2.3.1 Existing Searching Protocols
One common link between all information resources on the Internet is the searching mechanism. It is vital for users to be able to distil the information relevant to them. Most search mechanisms involve a client sending a set of search criteria (e.g., a textual keyword) to a server and the server performing the search over some large data set. However, for some applications a client would like to hide the search criteria, that is, which data they are interested in. A client might want to protect the privacy of their search queries for a variety of reasons, ranging from personal privacy to the protection of commercial interests. Existing searching protocols known as private searching protocols address security issues in searching; they are the Ostrovsky protocol and the Cooperative Private Searching (COPS) protocol.

Ostrovsky Protocol

Private searching was first proposed by Ostrovsky [6]. In private searching, the data is stored in clear form, while the query is encrypted with the Paillier cryptosystem, which exhibits homomorphic properties. The server processes the encrypted query on each file and stores the encrypted file into a compact buffer, from which the user can recover all wanted files with high probability. Since the query and the results are encrypted under the user's public key, the server cannot learn the user's interests.
The Ostrovsky protocol consists of three main steps. The first step is the QueryGen algorithm: each user runs QueryGen to send an encrypted query to the cloud. Based on a dictionary containing a list of keywords, each user generates a query and then encrypts it with the Paillier cryptosystem under his public key. Because the encrypted query is semantically secure, the cloud cannot guess the user's interests. The second step is the PrivateSearch algorithm, which the cloud executes to return a compact buffer to each user. The third step is the FileRecover algorithm, which recovers the matched files: given the buffer, each user uses his private key to decrypt the buffer entry by entry to recover the files.
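To make the three-step data flow concrete, the following Python sketch replaces the Paillier cryptosystem with a toy additively homomorphic scheme and the compact buffer with one encrypted match count per file. It is only an illustration under these assumptions; the class and function names are invented here and do not come from the protocol specification.

import random

K = 10**6   # plaintext space bound for the toy scheme (assumed parameter)

class ToyAdditiveCipher:
    """Stand-in for Paillier: Enc(m) = m + K*r, Dec(c) = c mod K.
    Adding ciphertexts adds the plaintexts (mod K). Not secure; illustration only."""
    def encrypt(self, m):
        return m + K * random.randrange(1, 1000)
    def decrypt(self, c):
        return c % K

def query_gen(cipher, dictionary, wanted):
    # Step 1 (user): encrypt a 0/1 interest flag for every dictionary keyword.
    return {kw: cipher.encrypt(1 if kw in wanted else 0) for kw in dictionary}

def private_search(enc_query, files):
    # Step 2 (cloud): homomorphically sum the encrypted flags of each file's
    # keywords; the cloud never learns which keywords the user wants.
    return {fid: sum(enc_query[kw] for kw in kws if kw in enc_query)
            for fid, kws in files.items()}

def file_recover(cipher, buffer):
    # Step 3 (user): decrypt each buffer entry and keep files with > 0 matches.
    return [fid for fid, c in buffer.items() if cipher.decrypt(c) > 0]

cipher = ToyAdditiveCipher()
dictionary = ["iaas", "paas", "saas", "storage"]
files = {"f1": ["saas", "storage"], "f2": ["iaas"], "f3": ["paas"]}
query = query_gen(cipher, dictionary, wanted={"saas"})
print(file_recover(cipher, private_search(query, files)))   # -> ['f1']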


The drawback of this protocol is that requiring every query keyword to be selected from a public, unencrypted dictionary can reveal too much information about the client's interests.

Cooperative Private Searching (COPS) Protocol

Unlike the Ostrovsky protocol, the COPS protocol introduces an Aggregation and Distribution Layer (ADL) as a middleware layer between the users and the cloud [6]. Users send their queries to the ADL, which combines the queries and sends them to the cloud. The COPS protocol consists of five steps. The first step is the QueryGen algorithm, which sends a shuffled query to the ADL using a shuffle function. Step 2 is the QueryMerge algorithm, which the ADL runs to send the combined query to the cloud; the ADL encrypts each merged query with the Paillier cryptosystem under its own public key. Step 3 is the PrivateSearch algorithm, which returns two compact buffers to the ADL. Step 4 is the ResultDivide algorithm, which divides the appropriate results among the users. The last step is the FileRecover algorithm, which recovers the matched files.
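As an illustration of the ADL's merge-and-divide role, the heavily simplified sketch below omits the shuffling and encryption of the real protocol; the function names and data shapes are assumptions made only for this sketch.

def query_merge(user_queries):
    # ADL: union of all users' keyword sets, sent to the cloud once.
    combined = set()
    for keywords in user_queries.values():
        combined |= set(keywords)
    return combined

def result_divide(cloud_results, user_queries):
    # ADL: give each user only the files that match that user's own keywords.
    return {user: {f for f, kws in cloud_results.items() if kws & set(wanted)}
            for user, wanted in user_queries.items()}

user_queries = {"u1": ["saas"], "u2": ["iaas", "saas"]}
combined_query = query_merge(user_queries)                # one request to the cloud
cloud_results = {"f1": {"saas"}, "f2": {"iaas"}}          # file -> matched keywords
print(result_divide(cloud_results, user_queries))         # e.g. {'u1': {'f1'}, 'u2': {'f1', 'f2'}}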
Figure 1 shows the working processes of the Ostrovsky and COPS protocols.

Figure 1. Working processes of the Ostrovsky and COPS protocols


Performance Comparison

The performance of the Ostrovsky and COPS protocols is evaluated in terms of computational cost and communication cost. Figure 2 shows the comparison of the computational and communication costs of the two protocols.

Figure 2. Performance comparison

The running time of the Ostrovsky protocol is linear in the number of users (n) and the number of files stored in the cloud (t). The running time of the COPS protocol, dominated by the PrivateSearch algorithm, depends on the generation of t + d pairs, where t refers to the number of files stored in the cloud and d to the number of keywords in the dictionary.
The Ostrovsky and COPS protocols are suited to private searching, mainly for solving security issues. Within the concern and scope of this research, we focus on the open cloud and on accurate search results. Therefore, this research proposes a new search protocol that caters for the open cloud.

3 METHODOLOGIES
Four phases were required for this research: identifying the prominent problems via a preliminary study, designing the proposed protocol and developing a tool for evaluation, evaluating the proposed protocol via the developed tool, and verifying the performance analysis of the proposed protocol with a set of experiments. Figure 3 shows the details of each phase.

Figure 3. Research methodology

As indicated in Figure 4, the proposed protocol involves five distinct steps: possible query expansion, database index, searching and matching algorithm, service catalogue, and similarity ranking. All five steps are grouped together as one layer during the development of the proposed protocol. The basic reason for introducing the layer is to collect the users' search queries and disseminate appropriate results to the users from the service providers. In other words, the CDL reduces the number of iteration cycles needed to get the services from the provider; the CDL is required to retrieve matched results for the user instead of simply returning everything. A sketch of how these steps could be chained is given after Figure 4.



Figure 4. Cloud Keyword Search protocol with Collective and Dissemination Layer (CDL)
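The following minimal Python sketch shows how the five CDL steps named above could be chained. The catalogue, the synonym table and every function name here are assumptions made for illustration and are not taken from the implementation described in this paper.

from difflib import SequenceMatcher

SYNONYMS = {"storage": ["backup"]}                      # assumed expansion table
CATALOGUE = {                                           # assumed service catalogue
    "CloudBox Backup": "backup storage service",
    "FastVM": "iaas virtual machine hosting",
}

def expand_query(query):
    # Step 1: possible query expansion with synonyms.
    terms = query.lower().split()
    for term in list(terms):
        terms += SYNONYMS.get(term, [])
    return terms

def build_index(catalogue):
    # Step 2: database index mapping each description term to services.
    index = {}
    for name, description in catalogue.items():
        for term in description.split():
            index.setdefault(term, set()).add(name)
    return index

def search_and_match(terms, index):
    # Step 3: searching and matching the expanded terms against the index.
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits

def rank_by_similarity(query, hits, catalogue):
    # Steps 4-5: look hits up in the service catalogue and rank by similarity.
    score = lambda name: SequenceMatcher(None, query, catalogue[name]).ratio()
    return sorted(hits, key=score, reverse=True)

query = "storage"
hits = search_and_match(expand_query(query), build_index(CATALOGUE))
print(rank_by_similarity(query, hits, CATALOGUE))       # matched results only, ranked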

The idea of this research was to convert the way search results are presented from an ordinary view to a well-organized view, and also to support users while they perform service searches. To support this idea, the study proposes a method called cascading guided search, to be implemented in the proposed search engine. The Cascading Guided Search method is a combination of a cascading design strategy and a guided search method. The reason for adopting it is to guide the user to search layer by layer, so that the search process is easy for the user. The Cascading Guided Search derives the benefits of both methods, the cascading layer design strategy and the guided search method. It also enables the user to find search results faster and more securely, and, most importantly, the user feels comfortable during the search process because multiple search attempts due to incorrect search queries are avoided. A small sketch of the idea follows.
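The toy sketch below, with an invented GUIDE structure, illustrates the layer-by-layer narrowing that cascading guided search aims for; it is an assumption-laden illustration, not the implemented method.

GUIDE = {
    "SaaS": {"Office": ["Docs Suite", "Mail Pro"], "CRM": ["SalesCloud"]},
    "IaaS": {"Compute": ["FastVM"], "Storage": ["CloudBox"]},
}

def guided_search(category, subcategory):
    # Each layer narrows the candidate set before the final selection,
    # so the user is never left with one large free-form query.
    return GUIDE.get(category, {}).get(subcategory, [])

print(guided_search("IaaS", "Storage"))    # ['CloudBox']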
Three important metrics were considered for the evaluation of the proposed protocol and search engine, which together define the term "better": response time, success rate and similarity ranking. These three metrics were selected based on a previous study conducted and validated by Kang for a cloud computing environment search engine. Three further metrics, namely computation time, computation cost and searching overhead, were selected to measure the performance analysis.

4 RESULTS AND DISCUSSIONS

The evaluation experiments for this research were carried out using two schemes. The first scheme, "without CSSE", refers to the situation before the implementation of the proposed protocol and search engine; the second scheme, "with CSSE", refers to the situation after their implementation. Three experiments were conducted for each scheme using the measurement metrics response time, success rate and similarity ranking. Figure 5 shows the response time for both schemes.

Figure 5. Response time with CSSE and without CSSE

The comparison of response times with and without CSSE provides a valid conclusion about the proposed search engine: its implementation improves the search response time. Moreover, using a generic search engine for cloud service searching is time consuming for a user who wishes to obtain the services.

The second experiment concerns the success rate. Table 1 shows the success rate obtained for both schemes.


Table 1. Success rate

| Scheme | Answer | Frequency | Percentage | Cumulative percentage |
| Without CSSE | Yes | 18 | 27.7 | 27.7 |
|  | No | 47 | 72.3 | 100.0 |
|  | Total | 65 | 100.0 |  |
| With CSSE | Yes | 58 | 89.2 | 89.2 |
|  | No | 7 | 10.8 | 100.0 |
|  | Total | 65 | 100.0 |  |

From the second experiment of the evaluation stage, this research concludes that fewer attempts by more users indicate a better matching protocol. It is therefore shown that the proposed protocol, CDL, performs better in searching and matching the search results. Table 2 shows the similarity rate obtained for both schemes.
Table 2. Similarity rate

| Statistic | Similarity (with CSSE) | Similarity (without CSSE) |
| Mean | 87.69 | 27.23 |
| Std. Deviation | 33.108 | 18.883 |
| Minimum |  |  |
| Maximum | 100 | 75 |

The evaluation activity shows that the complete implementation of the proposed protocol (CDL) and search engine (CSSE) provides a better similarity rate.
All of these evaluation experiments and the results obtained support the objective of this study: the CDL and CSSE have proved effective enough for users who wish to perform cloud service searches with faster response time, higher success rate and higher similarity ranking.

The computation time, collected via a questionnaire form, is used to compare the performance of the proposed protocol and search engine. Figure 6 plots the average computation time calculated.

Figure 6. Computation time against the number of users (20 to 200), for 100-keyword and 200-keyword databases
The computation time refers to the running time of a piece of the algorithm. The proposed protocol consists of five layers, and in this research the execution of the searching algorithm is treated as the main part of the entire program. The computation time of this code is O(n) in big-O notation: the searching algorithm is fast enough because it executes O(n) times, which gives linear search time.
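A minimal sketch consistent with this O(n) statement is given below: the search is a single linear scan over the n keywords stored in the database. The function and data names are assumed for illustration only.

def linear_keyword_search(database_keywords, query):
    # Executes once per stored keyword, i.e. O(n) with n = len(database_keywords).
    return [kw for kw in database_keywords if query in kw]

print(linear_keyword_search(["cloud storage", "cloud backup", "vm hosting"], "cloud"))
# ['cloud storage', 'cloud backup']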
Referring to Figure 6, the graph shows that the computation times for 100 keywords and 200 keywords in the database grow linearly at the beginning of the experiment, with a difference between the two computation times at the start. This is because the proposed protocol and search engine store new keywords as they are added to the database. Meanwhile, when users enter a search query, the protocol needs time to search and map the query to the appropriate search results. However, once the


process of adding new search queries is repeated many times, the searching time becomes constant after a while. This is because the proposed protocol is able to indicate the search results early after many runs of repeated keywords, so it takes less time.
The results shown by the graph prove that the implementation of the proposed protocol and search engine provides faster searching time even when the number of users or the size of the database increases. This is an important criterion for any search engine, and CSSE meets it while maintaining the computation time.
From the searching algorithm discussed in the previous section, the computation cost of the searching process can be calculated. Since the searching algorithm executes O(n) times, the computation cost is also expected to grow linearly. However, since the growth of computation time for the proposed protocol and search engine becomes constant after the addition of 100 users for 100 keywords and 80 users for 200 keywords, this research concludes that the computation cost will also become constant. In a cloud environment, n users send their queries to the cloud independently, so a search engine would normally perform the search n times and return results n times to the users. The proposed protocol reduces this: instead of searching n times, it collects and disseminates the queries and results to the users.
The proposed protocol shows better performance compared with existing protocols such as the Ostrovsky and COPS protocols. Table 3 shows the comparison between the Ostrovsky, COPS and proposed protocols. The value of n in the proposed protocol grows towards a constant after several executions of the searching process using CSSE. Moreover, the computation cost of the existing protocols depends on the number of users, the number of stored files in the cloud and the number of keywords in the dictionary, whereas the computation time of the proposed protocol depends only on the number of keywords in the database.
Table 3. Performance comparison

| Protocol | Communication cost | Computation cost |
| Ostrovsky |  |  |
| COPS |  |  |
| Proposed protocol |  |  |

Searching overhead in this research refers to the similarity-ranking percentage of the searching process as the number of keywords and the number of users increase. Figure 7 shows the similarity ranking as the numbers of users and keywords increase.

Figure 7. Searching overhead

The results in Figure 7 show that the similarity increases as the number of keywords increases. The maximum similarity of a search result is 100. The similarity percentage increases linearly and is able to produce a similarity ranking above 80% when the keywords are increased, as was the aim of the experiment. This indicates that CSSE provides better search results. Conversely, the similarity performance was low at some points of the experiment, due to search keys that were incompatible with the search query entered by users.
By conducting the performance analysis experiment, this study shows that the number of


users is not an issue, since there is no effect on the time when the number of users increases. However, the experiment also shows that when the number of keywords in the database varies, the time taken by the search process increases. With 100 keywords and 20 users, the recorded execution time is 0.001274 milliseconds; for 200 keywords it is 0.001339 milliseconds, and the recorded time differs further as the number of keywords increases. The value of n in this research therefore depends purely on the number of keywords stored in the database. This experiment also shows that the proposed protocol and search engine produce a significant improvement in service searching in terms of the three metrics, namely computation time, computation cost and searching overhead.

5 CONCLUSIONS

Considering the various comparison studies and the results, it can be concluded that the accuracy of searching with the proposed protocol is better than that of existing generic search engines. This research focuses on searching methods and techniques in cloud environments, which means that its results and findings are only applicable to the cloud environment. Many other environments also need a search protocol and search engine to map search results or information, which leaves room for other domains that need such a protocol, such as Information Retrieval, Image Retrieval and so on.

This research focused on the Software as a Service (SaaS) type, where the downloadable services offered to the user by the proposed search engine are lists of software. As cloud computing offers other types of services too, the protocol can be extended to other service types as well, such as Platform as a Service (PaaS) and Infrastructure as a Service (IaaS).
Since the number and types of cloud users are expanding all around the world and are taking on more responsibility for providing business solutions, researchers are recommended to consider growing fields such as medicine, pharmacy and science when implementing the proposed search protocol. Semantic search and ontologies are other advanced techniques that can be considered for searching, especially on web services.

Finally, more research is required to cover other aspects and details of the searching protocol. This research focused mostly on query searching and matching; for the protocol, more security details need to be investigated and more solutions discovered and applied.


REFERENCES

[1] Philip, S. W. Semantic Computing, Cloud Computing, and Semantic Search Engine. In Proceedings of the International Conference on Semantic Computing, Berkeley, 2009, 654-657.
[2] Jrad, J. T. SLA Based Service Brokering in Intercloud. In Proceedings of the 2nd International Conference on Cloud Computing and Services Science (CLOSER'12), Portugal, 2012, 76-81.
[3] Rellermeyer, M. D. Engineering the Cloud from Software Modules. In Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing, Vancouver, Canada, 2009, 32-37.
[4] Kang. A Cloud Portal with a Cloud Service Search Engine. In Proceedings of the International Conference on Information and Intelligent Computing (IACSIT), Singapore, 2011, 1-8.
[5] Han, K. M. An Ontology-enhanced Cloud Service Discovery System. In Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS'10), Hong Kong, 2010, 644-649.
[6] Qin Liu, Chiu C. Tan, Jie Wu and Guojun Wang. (2012). Cooperative Private Searching in Clouds. Journal of Parallel and Distributed Computing, 72(8): 1019-1031.



Practicality of Migration to Cloud Computing

Mohamed S. Hajji (1) and Tamer Abdullah (2)
(1) Faculty of Computing and IT, University of Sohar, Oman, mhajji@soharuni.edu.om
(2) Syrian Virtual University, Syria

ABSTRACT
The aim of this research is to assess some published measurement models that evaluate the economic benefits for organizations of moving to cloud computing. The research also aims to check whether any additional factors need to be incorporated, or any fine tuning is required, to improve the models. Assessment and modification of the models are based on applying them to real-life case studies and comparing the output of the models with available real information. A realistic case study from a big university was considered. The case study highlighted the need to consider more factors in the model, in particular the impact of any previous partial move to cloud computing.

KEYWORDS
Cloud computing, Migration model, Suitability index

1 INTRODUCTION

According to the National Institute of Standards and Technology (NIST), cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction [1]. Cloud computing provides three service models: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).
The NIST asserts that it is vital for organizations and government bodies to move into cloud computing to meet economic difficulties. It acknowledges, however, challenges regarding interoperability, portability and security.
Though the economic benefits sound unquestionable, this research works with a model for evaluating the cost-effectiveness of moving to cloud computing. The research tries to improve on some already suggested models and to apply the resulting model to a significant real case study based on real organizational data.
Section 2 sheds light on the background of this work. Section 3 summarizes some related work and describes models for migration to cloud computing. A case study is described in Section 4, with discussion of the outcome. Section 5 is dedicated to conclusions and future work.



2 BACKGROUND

Cloud computing is the crowning of several known computing models and concepts such as Grid Computing, Cluster Computing, Virtualization and Peer-to-peer Computing [2]. With regard to observed trends, it is expected that a huge amount of investment will be involved in the cloud computing industry. Cloud computing is deployed as a Private cloud, a Community cloud (private for a specific community), a Public cloud, or a Hybrid cloud.
Articles and the literature cite several economic benefits of cloud computing, including lower costs, capital-expense-free computing, faster project deployment and lower maintenance costs [1].
Numerous studies and research groups have provided reviews, tools and models to support moving totally or partially to cloud computing [3] [4] [5]. Some articles tackle more profound issues such as cloud interoperability [6]. On the other hand, a number of researchers have presented economic models or studies on the economic aspects of moving to cloud computing [7].
3 MODELS FOR MIGRATION TO CLOUD COMPUTING

Different factors can play an important role in the decision to move to cloud computing [8]. Some researchers have worked on economic studies such as Return on Investment for a specific organization. Incorporating intangible factors, Misra and Mondal [7] identify four characteristics to consider before moving to cloud computing:

- size of the IT resources,
- pattern of usage of these resources,
- sensitivity of the data concerned, and
- criticality of the work concerned.


Misra and Mondal developed a Suitability Index based on comprehensive factors that detail those four characteristics. Each characteristic is broken down into subcategories; for example, the size of resources is based on the number of servers used, the size of the customer base, the annual revenue from IT, and the number of countries in which the company operates. Each subcategory is well defined and given an evaluation range, and evaluation numbers are associated with each range. The model also associates credits (weights) with each factor. These credit values can be customized for the company concerned; initial guidelines are, however, provided in tables of credits.
The calculation ends in a score that indicates the suitability of moving to cloud computing. Three intervals are considered: a score below 3760 means that moving is not recommended; a score above 4600 indicates that it is good to move to cloud computing; between the two thresholds the result is not conclusive.
4 CASE STUDY

An academic institution that hosts around 15000 students provided data which was used to apply the Misra and Mondal model. The application followed these steps:

Step 1: Set up the credits.
Step 2: Do the partial calculations.
Step 3: Calculate the suitability index.
Step 4: Show the results and discuss the outcome.

Table 1 highlights the available information, while Table 2 summarizes the credits set based on the guidelines given in [7] and the data provided by the academic institution. The long formulae given in [7] were applied; the results of the calculation are as follows. (Note that in the formulas any factor name prefixed with C denotes the credit of that factor; for example, AR is the Annual Revenue while C_AR is the Credit of Annual Revenue.)

Peak Usage value: PU = DoP*C_DoP + PbA*C_PbA, so PU = 4*5 + 2*9 = 38

Largeness value: L = NoS*C_NoS + NoC*C_NoC + AR*C_AR, so L = 4*8 + 5*4 + 3*4 = 64

Average Usage value: AU = ToS*C_ToS + (4 - SCB)*C_SCB, so AU = 3*5 + (4 - 3)*7 = 22

Workload Variability: WV = PU*C_PU + AU*C_AU + ADH*C_ADH, so WV = 38*6 + 22*8 + 3*5 = 419


Table 1. Available information

| Factor | Available information | Related number in the model |
| Number of servers (NoS) | <100 | 4 |
| Number of countries it is spread across (NoC) | 1 | 5 |
| Annual revenue from IT offerings (AR) | < 20 m $ | 3 |
| Duration of peak usage/year (DoP) | Few hours | 4 |
| Peak by average (PbA) | <5 times | 2 |
| Type of services (ToS) | Profile 4: moderately variable workload with occasional surges | 3 |
| Type of projects undertaken | n/a |  |
| Size of customer base (SCB) | Below 20000 | 3 |
| Amount of data handling (ADH) | 500 GB - 1 TB/month | 3 |
| Sensitivity of data (SoD) | Sensitive | 3 |

Table 2. Credits of different factors related to the case study

| Characteristic | Sub-factor | Credit |
| Size of IT resources | Number of servers | 8 |
|  | Number of countries it is spread across | 4 |
|  | Annual revenue from IT offerings | 4 |
| Workload variability | Peak usage | 6 |
|  | Duration of peak usage/year | 5 |
|  | Peak by average | 9 |
|  | Average usage | 8 |
|  | Type of services | 5 |
|  | Type of projects undertaken |  |
|  | Size of customer base | 7 |
|  | Amount of data handling | 5 |
| Sensitivity of data |  |  |
| Criticality of work done |  | 4 |



Data Sensitivity: DS = SoD = 3
Criticality: C = CWD = 3

Suitability index = L*C_L + WV*C_WV + DS*C_DS*ADH + C*C_C*(65 - L)
                  = 64*8 + 419*8 + 3*C_DS*ADH + 3*4*(65 - 64)
                  = 3930
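The following Python sketch reproduces the case-study arithmetic and the threshold classification of the cited Misra and Mondal model. Only values stated in the text above are used; the overall index is taken as the reported 3930 because the credits of the data-sensitivity term could not be recovered here.

PU = 4 * 5 + 2 * 9             # peak usage value            -> 38
L  = 4 * 8 + 5 * 4 + 3 * 4     # largeness value             -> 64
AU = 3 * 5 + (4 - 3) * 7       # average usage value         -> 22
WV = PU * 6 + AU * 8 + 3 * 5   # workload variability        -> 419
assert (PU, L, AU, WV) == (38, 64, 22, 419)

SUITABILITY_INDEX = 3930        # result reported for this case study

def classify(score, lower=3760, upper=4600):
    # Below the lower threshold: do not move; above the upper: move; otherwise gray area.
    if score < lower:
        return "moving to cloud computing is not recommended"
    if score > upper:
        return "moving to cloud computing is recommended"
    return "inconclusive (gray area)"

print(classify(SUITABILITY_INDEX))          # inconclusive (gray area)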
The result is in the gray area, which could indicate a weakness in the model. Available information from staff in the institution (though it was not possible to obtain accurate information due to authorization issues) indicates otherwise.
The concerned university had partially moved to cloud computing before we applied the model, and this issue of legacy partial conversion is not considered in the available model. We need to incorporate a factor for the percentage of load already designated to cloud computing prior to applying the model, by studying more similar cases and comparing them with some pure cases (i.e. cases where information is available about the institution before any move to cloud computing and after a full move).
One case study is not enough to give strongly credible feedback about the model, so another academic institution is being considered. It is also important to apply the model to an industrial or commercial organization and check whether there are any clear general differences between academic and non-academic organizations regarding the appropriateness of cloud computing.


5 CONCLUSIONS AND FUTURE WORK

This preliminary paper describes research that tries to assess and improve a measurable, quantitative mathematical model against real-life cases. One case study has been made, and work on other cases is being carried out with the aim of studying the factors used in the current model, fine tuning some of them, and introducing new factors. The preliminary case study indicates weaknesses in the model, but the result is not conclusive with only one case study.
More case studies, including industrial and commercial organizations, should be considered in the future. It would be interesting to observe any clear differences in the appropriateness of cloud computing between different categories of organizations, and to find out whether different types of organizations require different migration models.

REFERENCES

[1] NIST, "NIST Cloud Computing Program," [Online]. Available: http://www.nist.gov/itl/cloud/index.cfm. [Accessed March 2016].
[2] S. Androutsellis-Theotokis and D. Spinellis, "A survey of peer-to-peer content distribution technologies," ACM Computing Surveys (CSUR), vol. 36, no. 4, pp. 335-371, 2004.
[3] S. S. Manvia and G. K. Shyam, "Resource management for Infrastructure as a Service (IaaS) in cloud computing: A survey," Journal of Network and Computer Applications, vol. 41, pp. 424-440, 2014.
[4] F. Gagliardi, B. Jones, F. Grey and M. E. Bégin, "Building an infrastructure for scientific Grid computing: status and goals of the EGEE project," Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 363, no. 1833, pp. 1729-1742, 2005.
[5] A. Chien, B. Calder, S. Elbert and K. Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, vol. 63, no. 5, pp. 597-610, 2003.
[6] A. N. Toosi, R. N. Calheiros and R. Buyya, "Interconnected Cloud Computing Environments: Challenges, Taxonomy, and Survey," ACM Computing Surveys (CSUR), vol. 47, no. 1, pp. 1-47, 2014.
[7] S. C. Misra and A. Mondal, "Identification of a company's suitability for the adoption of cloud computing and modelling its corresponding Return on Investment," Mathematical and Computer Modelling, vol. 53, no. 3, pp. 504-521, 2011.
[8] R. Buyya, R. Ranjan and R. N. Calheiros, "InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services," in Algorithms and Architectures for Parallel Processing: 10th International Conference, ICA3PP 2010, Busan, Korea, May 21-23, 2010, Proceedings, Part I, vol. 25, Berlin Heidelberg: Springer, 2010, pp. 13-31.

ISBN: 978-1-941968-37-6 2016 SDIWC

19

Proceedings of the Fourth International Conference on Digital Information Processing, E-Business and Cloud Computing (DIPECC), Kuala Lumpur, Malaysia, 2016

Domain Oriented Divisible Load Scheduling in the Private Cloud


Min Zhong1, Qian Wei1, Yan Kong1 and Junzhao Li2
1. College of Computer and Communication Engineering, China University of Petroleum
No. 66 Changjiang West Road, Huangdao District, 266580, Qingdao, Shandong Province, P. R. China
zhongmin@upc.edu.cn, 1536622574@qq.com and 276249526@qq.com
2. Puyang Radio and TV Station
No. 52 Zhongyuan East Road, Hualong District, 457001, Puyang, Henan Province, P.R. China
1070062444@qq.com
ABSTRACT
Seismic data processing tasks can be arbitrarily divided into independent subtasks with no precedence relations or inter-subtask communication. To expand the scale of data processing, shorten the processing makespan and ease distributed parallel computing in a private cloud, this paper presents a hierarchical scheduling model with an application-level scheduler to support divisible seismic data processing applications. It then proposes a probing based data partitioning (PDP) algorithm and introduces the deduction of a closed-form solution of the data partition in detail. The FFD pre-stack depth migration task of the Marmousi model was executed using the PDP algorithm, and the result was finally compared with those of sequential processing and of an average partitioning strategy.

KEYWORDS
Seismic Data Processing, Private Cloud, Divisible
Load, Scheduling, Data Partition

1 INTRODUCTION and RELATED WORK


The amount of seismic data acquisition and
computing is growing rapidly with the
development and application of the new seismic
technology in the field of seismic exploration[1].
Seismic data processing is one of the main
application fields of high performance
computing technology. Seismic data has the
characteristics of massive volume, read-only processing and a regular data format, and can be processed in a distributed environment to


expand the processing scale and accordingly


shorten the processing time.
During the second period of ChinaGrid [2] construction, our university developed an energy and power key-disciplines private cloud and typical applications based on our existing grid infrastructure [3]. We integrated heterogeneous high performance computing resources to provide the teachers and students of our university, as well as ChinaGrid users, with high performance computing services. The typical applications of the private cloud are seismic data processing tasks, which can be arbitrarily divided into independent subtasks with no precedence relations or inter-subtask communication. One of the key problems of the private cloud is to schedule the divisible seismic data processing tasks so as to minimize the makespan. In recent years, with its development, divisible load theory (DLT) has become a potent model for divisible data processing [4]. In this paper, an application-level scheduling algorithm based on DLT is presented to process seismic data in a distributed heterogeneous system. The performance metric is the makespan, i.e. the time from the moment the schedule is generated to the completion of processing of the last chunk of data.
DLT was first put forward by Cheng and Robertazzi in 1988 and used to handle communication and computation problems in sensor networks [5]. Subsequently, many researchers began to study DLT for scheduling divisible tasks in distributed computing environments such as clusters, grids and clouds. [6-9] studied


divisible load scheduling in clusters. [10] presented an adaptive divisible load model for grid applications. [11-17] investigated divisible load scheduling in cloud systems. All of the above work mainly focused on theoretical analysis and simulation tests. [18] implemented APST-DV, the first usable environment for deploying divisible load applications on grid platforms. [19] used divisible load analysis for parallel video encoding applications. Our research focuses on seismic data processing based on DLT in the cloud. Since the system overhead is much less than the computation and communication costs, we ignore it; likewise, the imaging-result collection time can be omitted compared with the source data distribution time.
The remainder of this paper is organized as follows. In Section 2, the hierarchical scheduling model is presented along with an analysis of the application characteristics and the job description file. In Section 3, a probing based data partitioning algorithm is proposed and the deduction of the closed-form data partition is described in detail. The experimental results based on the Marmousi model and their analysis are shown in Section 4. The final section concludes the paper with a discussion and future work.
2 HIERARCHICAL SCHEDULING
MODEL
2.1 Application Characteristics
Seismic data processing is a computing- and data-intensive application. A typical three-dimensional seismic data set reaches terabytes in size and needs more than 10^18 floating-point operations [20]. There are two types of processing techniques: traditional processing and special-purpose processing such as pre-stack migration imaging, which needs massive parallelism [21]. Seismic data is always organized by gathers in the SEG-Y file format [22], such as Common Shot Point (CSP) gathers, Common Midpoint (CMP) gathers and Common Imaging Point (CIP) gathers, and is divisible by gathers


during the processing. With Fourier finite


difference (FFD) pre-stack depth migration as
an example, data is organized by CSP and can
be divided into arbitrary shot gathers when
processed in parallel.
2.2 Job Description
There are three types of jobs in the private cloud. Single tasks are traditional processing programs with a reasonable computational requirement that need only one resource. MPI tasks are parallel programs for special seismic data processing that need multiple resources. Divisible tasks are sequential programs used to process different chunks of data on multiple resources. In the private cloud, task requests are always described in job template (JT) files. The meta-scheduler then parses the job parameters in the JT files and schedules the tasks accordingly. The main parameters of JT files are shown in Table 1.
Table 1. Main parameters of JT files

Parameter name: meaning
- EXECUTABLE: path of the executable program
- ARGUMENTS: execution parameters
- TYPE: type of job (single, MPI, divisible)
- NP: required number of resources
- FILE-IN-DIV: path of the data file
- DIV-PAR: parameter of the SEG-Y file used to guide the data partition
- FILE-IN-OTH: path of other input files
- FILE-OUT: file name of the result
- STD-OUT-FILE: name of the redirection file of STDOUT
- STD-ERR-FILE: name of the redirection file of STDERR
- REQUIREMENTS: requirements on resource attributes, e.g. REQUIREMENTS = CPU-MHZ > 2000
- RANK: weight of the resource attributes listed in REQUIREMENTS, e.g. RANK = CPU-MHZ*0.5 + MEM-MB*0.5
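To make the parameters concrete, a hypothetical JT description for a divisible migration job might look as follows. It is represented here as a Python dictionary purely for illustration; the actual JT file syntax is not shown in the paper, and every value below is an assumption.

# Hypothetical JT parameters for a divisible seismic processing job.
# The keys follow Table 1; all paths and values are made up for illustration.
jt_parameters = {
    "EXECUTABLE":   "/apps/seismic/ffd_migration",   # path of the executable program
    "ARGUMENTS":    "-v 1",                           # execution parameters
    "TYPE":         "divisible",                      # single, MPI, or divisible
    "NP":           5,                                # required number of resources
    "FILE-IN-DIV":  "/data/marmousi/shots.segy",      # path of the divisible data file
    "DIV-PAR":      "CSP",                            # SEG-Y parameter guiding the partition
    "FILE-IN-OTH":  "/data/marmousi/velocity.bin",    # other input files
    "FILE-OUT":     "image.segy",                     # file name of the result
    "STD-OUT-FILE": "job.out",
    "STD-ERR-FILE": "job.err",
    "REQUIREMENTS": "CPU-MHZ > 2000",
    "RANK":         "CPU-MHZ*0.5 + MEM-MB*0.5",
}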

2.3 Hierarchical Scheduling Model


In the private cloud, divisible seismic data processing tasks need multiple resources working cooperatively and involve massive data and long processing times. The scheduler needs not only to schedule the divisible tasks according to the user requirements and the resource workloads but also to track the status of job execution. We present a hierarchical scheduling model, as shown in Figure 1.

Figure 1. Hierarchical scheduling model: a central services layer (information service and meta-scheduler), application-level schedulers, a dynamic aggregation layer of logical communities (each with a local scheduler and a local information service), and a resources layer of resource nodes.

As for divisible tasks in the model, users install the application-level scheduler locally. When there is a task to be processed, the user submits a JT file to the meta-scheduler. Next, the meta-scheduler interacts with the information service and selects the optimal resources to form a logical community according to the requirements in the JT file. The meta-scheduler then sends the resource list to the application-level scheduler. Finally, the application-level scheduler partitions the task according to the scheduling strategy, distributes the chunks to the resources and tracks the execution.

In the private cloud, many resources are heterogeneous, and even where resources are homogeneous the workloads of the systems change dynamically. The computing and communication performance of a resource can be gathered by monitoring tools or from the history execution information of the applications, but this would increase the system complexity and the burden on the monitoring system. In the next section, a probing based data partitioning strategy is therefore described in detail.

3 PROBING BASED DATA PARTITIONING (PDP) ALGORITHM

3.1 Related Conventions and Definitions

The PDP strategy is based on several assumptions.
(i) The system topology is a single-level tree, as shown in Figure 2. The root p0 is the client on which the application-level scheduler is installed; it does not take part in the computing. The workers p1-pn form the logical community that executes the task.

Figure 2. Single-level tree topology: root p0 connected to workers p1, p2, ..., pn through links l1, l2, ..., ln.
(ii) The resources in the system may be both heterogeneous and homogeneous.
(iii) The task is dispatched to the workers in a single round during the scheduling phase. In special seismic data processing, the computing time is far greater than the communication time for each gather, so we adopt single-round scheduling; multi-round scheduling is left as future work to realize real-time load balance.
(iv) The root distributes the load in sequence.
(v) System overheads such as the startup of communication and computing are ignored because they are much less than the computing time in practice.
(vi) The time for retrieving the result data is ignored because the result data is very small compared with the pre-stack data.
(vii) In the optimal schedule, all the workers finish computing simultaneously, and all the


chunks are sent in the order of non-decreasing


link capacities. [23]
The following definitions are used in the PDP algorithm.
n denotes the number of workers.
p0 denotes the application-level scheduling root.
pi (i = 1~n) denotes the workers.
z denotes the total load.
α0 denotes the chunk size in the probing phase.
αi denotes the chunk size of worker i in the scheduling phase.
tcmi denotes the unit communication time of worker i.
tcpi denotes the unit computation time of worker i.

3.2 PDP Algorithm

The basic idea of the PDP strategy is to divide the scheduling into two phases, a probing phase and a data distributing phase, as shown in Figure 3. In the probing phase, the root sends a chunk of size α0 to every worker and waits for each worker's feedback in order to compute its unit communication time tcmi and unit computation time tcpi. The root then calculates the chunk size αi for each worker. The closed-form solution is deduced as follows.

Figure 3. Timing diagram of the PDP strategy: in the probing phase each worker pi receives α0·tcmi of communication followed by α0·tcpi of computation; the scheduling phase repeats the pattern with the chunk sizes αi.

α = {α1, α2, α3, ..., αn} denotes the chunk size vector. During the scheduling phase, the total size to be scheduled is z − α0. According to assumption (vii), we obtain Eq. (1) and Eq. (2):

\alpha_i t_{cp,i} = \alpha_{i+1} t_{cm,i+1} + \alpha_{i+1} t_{cp,i+1}, \quad i = 1, \ldots, n-1    (1)

\sum_{i=1}^{n} \alpha_i = z - \alpha_0    (2)

The relation between α2, ..., αn and α1 can be derived from Eq. (1):

\alpha_k = \frac{t_{cp,k-1}}{t_{cm,k} + t_{cp,k}}\,\alpha_{k-1}
         = \frac{t_{cp,k-1}\, t_{cp,k-2}}{(t_{cm,k} + t_{cp,k})(t_{cm,k-1} + t_{cp,k-1})}\,\alpha_{k-2}
         = \cdots
         = \frac{\prod_{j=1}^{k-1} t_{cp,j}}{\prod_{i=2}^{k} (t_{cm,i} + t_{cp,i})}\,\alpha_1    (3)

Substituting Eq. (3) into Eq. (2), we obtain Eq. (4):

\alpha_1 \left[ 1 + \frac{t_{cp,1}}{t_{cm,2} + t_{cp,2}} + \frac{t_{cp,1} t_{cp,2}}{(t_{cm,2} + t_{cp,2})(t_{cm,3} + t_{cp,3})} + \cdots + \frac{\prod_{j=1}^{n-1} t_{cp,j}}{\prod_{i=2}^{n} (t_{cm,i} + t_{cp,i})} \right] = z - \alpha_0    (4)

Then we obtain α1 as follows:

\alpha_1 = \frac{z - \alpha_0}{1 + \sum_{k=2}^{n} \prod_{j=1}^{k-1} t_{cp,j} \Big/ \prod_{i=2}^{k} (t_{cm,i} + t_{cp,i})}    (5)

From α1 we can derive α = {α1, α2, α3, ..., αn}. In practice, Eq. (1) and Eq. (2) can also be transformed into the linear system of Eq. (6) and solved directly.

4 CASE STUDY: FFD MIGRATION IMAGING

4.1 FFD Migration Imaging Scheduling Algorithm based on PDP

The basic steps of the algorithm are as follows:
(1) The root sends α0 shots to all the workers.


(2) The root calculates tcmi and tcpi, then solves Eq. (6) to obtain α = {α1, α2, α3, ..., αn}. αi denotes the number of shots that pi needs to process.
(3) The root calculates the index of the beginning shot start_nxshoti and the ending shot end_nxshoti allocated to pi. For i = 1, start_nxshot1 = 1 and end_nxshot1 = α1; for i > 1, start_nxshoti = end_nxshoti-1 + 1 and end_nxshoti = start_nxshoti + αi − 1.
(4) The root calculates the index of the beginning gather start_tracei = (start_nxshoti − 1)·shot_trace + 1 and of the ending gather end_tracei = end_nxshoti·shot_trace, where shot_trace denotes the total number of gathers in one shot.
(5) The root distributes the chunks to the workers pi in the order of non-increasing tcmi.
(6) The root collects the results of all pi and calculates the stack result.

\begin{bmatrix}
t_{cp,1} & -(t_{cm,2}+t_{cp,2}) & & & \\
 & t_{cp,2} & -(t_{cm,3}+t_{cp,3}) & & \\
 & & \ddots & \ddots & \\
 & & & t_{cp,n-1} & -(t_{cm,n}+t_{cp,n}) \\
1 & 1 & \cdots & 1 & 1
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n-1} \\ \alpha_n \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ z-\alpha_0 \end{bmatrix}    (6)
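As a concrete illustration of Eq. (6), the following sketch (written in Python with NumPy, which the paper itself does not use; variable names and the example inputs are ours) builds and solves the linear system for the chunk-size vector α from probed unit times:

# Sketch: solve Eq. (6) for the chunk sizes alpha_1..alpha_n, given the unit
# communication times tcm, unit computation times tcp, the total load z and
# the probing chunk size alpha0. Illustrative only; not the authors' code.
import numpy as np

def pdp_chunk_sizes(tcm, tcp, z, alpha0):
    n = len(tcp)
    A = np.zeros((n, n))
    b = np.zeros(n)
    # Rows 1..n-1 encode alpha_i*tcp_i - alpha_{i+1}*(tcm_{i+1}+tcp_{i+1}) = 0 (Eq. 1)
    for i in range(n - 1):
        A[i, i] = tcp[i]
        A[i, i + 1] = -(tcm[i + 1] + tcp[i + 1])
    # The last row encodes sum(alpha_i) = z - alpha0 (Eq. 2)
    A[n - 1, :] = 1.0
    b[n - 1] = z - alpha0
    return np.linalg.solve(A, b)

# Example with made-up probe results for three workers and a 138-shot task:
tcm = [0.05, 0.04, 0.06]    # assumed unit communication times (s per shot)
tcp = [12.0, 20.0, 30.0]    # assumed unit computation times (s per shot)
print(pdp_chunk_sizes(tcm, tcp, z=138, alpha0=1))  # fractional sizes; the paper rounds to whole shots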

4.2 Experiment based on the Marmousi Model

We executed the FFD pre-stack depth migration of the Marmousi model using the PDP algorithm. Figure 4 shows the velocity field of the Marmousi model.

Figure 4. Marmousi model velocity field.

The model is a standard 2D geologic model which has three large faults in the upper part, a salt dome in the middle, high-velocity bodies on both sides of the lower part, and a low-velocity target stratum below 2,400 m. In total, there are 240 shots with 96 traces per shot and 750 samples per trace. The shot and receiver spacing are both 25 m, and the velocity sample interval is 4 m.

The task comprised 138 shots of data in total, from shot 103 to shot 240, all migrated from the first gather on 5 workers. The configuration of the workers is shown in Table 2.

Table 2. Configuration of the workers

Worker | CPU | Memory
p1 | Intel Pentium(R) Dual-Core E5300, 2.60 GHz | 2 GB
p2 | Intel Pentium(R) Dual-Core E5400, 2.70 GHz | 2 GB
p3 | Intel Pentium(R) Dual-Core E5400, 2.70 GHz | 2 GB
p4 | Intel Core2 Duo, 2.53 GHz | 1 GB
p5 | Dell Intel Core i5-3470, 3.2 GHz | 4 GB

Firstly, the root sent one shot to each worker at random (in the order p5, p4, p2, p1, p3) and obtained the performance of each worker, as shown in Table 3. The makespan of the probing phase Tpb can be calculated by Eq. (7).


T_{pb} = \max_i T_{pb,i} = \max_i \left( \sum_{j=1}^{i} \alpha_0 t_{cm,j} + \alpha_0 t_{cp,i} \right)    (7)

Table 3. Probing result

Worker | tcm (s) | tcp (s) | Tpb (m)
p1 | 0.0493 | 11.8169 | 0.1978
p2 | 0.0413 | 19.5106 | 0.3267
p3 | 0.0331 | 28.5794 | 0.4784
p4 | 0.0063 | 29.9638 | 0.5016
p5 | 0.0375 | 30.6262 | 0.5132

Consequently, the root calculated α = {20, 20, 19, 30, 49} and the indices of the beginning and ending gathers for each worker, as shown in Table 4.

Table 4. Beginning and ending gathers for each worker

Worker | Beginning | Ending
p1 | 9793 | 11712
p2 | 11713 | 13632
p3 | 13633 | 15456
p4 | 15457 | 18336
p5 | 18337 | 23040
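The gather indices in Table 4 follow directly from steps (3)-(4) once the chunk sizes are known. A small sketch of that arithmetic (our own illustration, using absolute shot numbers starting at shot 103 and 96 gathers per shot) is shown below.

# Reproduce the beginning/ending gather indices of Table 4 from the chunk
# sizes alpha = {20, 20, 19, 30, 49}, the first shot (103) and 96 gathers/shot.
alpha = [20, 20, 19, 30, 49]
shot_trace = 96           # gathers per shot
first_shot = 103          # the task covers shots 103..240

ranges = []
start_shot = first_shot
for a in alpha:
    end_shot = start_shot + a - 1
    start_trace = (start_shot - 1) * shot_trace + 1
    end_trace = end_shot * shot_trace
    ranges.append((start_trace, end_trace))
    start_shot = end_shot + 1

print(ranges)
# [(9793, 11712), (11713, 13632), (13633, 15456), (15457, 18336), (18337, 23040)]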

Figure 5 Imaging of Parallel Processing

In comparison, we processed all the chunks on worker p5 alone. The makespan is 17.7869 m, and the stack result is shown in Figure 6, which is consistent with the result of the parallel processing.

Then the root distributed the chunks to the workers in the order p1, p2, p3, p4, p5. The communication and computing time of each worker is shown in Table 5. The makespan of the scheduling phase can be calculated by Eq. (8), and the stack result is shown in Figure 5. The makespans of the workers are not exactly equal because of the dynamics of the links and workloads. The total makespan of the task is 10.5221 m including the probing phase.
T_S = \max_i T_{s,i} = \max_i \left( \sum_{j=1}^{i} \alpha_j t_{cm,j} + \alpha_i t_{cp,i} \right)    (8)
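Eqs. (7) and (8) are easy to check against Tables 3 and 5. The sketch below recomputes the per-worker finish times (our own illustration; the input values are taken from the tables as printed, and the Table 5 columns already include the αi factors):

# Recompute T_pb,i (Eq. 7) from Table 3 and T_s,i (Eq. 8) from Table 5.
# tcm/tcp are the per-worker communication and computation times in seconds.
def finish_times(tcm, tcp):
    times, sent = [], 0.0
    for cm, cp in zip(tcm, tcp):
        sent += cm                      # sequential distribution (assumption iv)
        times.append((sent + cp) / 60)  # finish time in minutes
    return times

# Probing phase, Table 3 (one probe shot per worker):
print(finish_times([0.0493, 0.0413, 0.0331, 0.0063, 0.0375],
                   [11.8169, 19.5106, 28.5794, 29.9638, 30.6262]))
# ~[0.198, 0.327, 0.478, 0.502, 0.513]  -> matches the Tpb column

# Scheduling phase, Table 5 (aggregate times, i.e. already multiplied by alpha_i):
print(finish_times([0.0940, 0.7350, 0.5470, 1.2500, 2.4380],
                   [580.7180, 578.5620, 599.1560, 576.8600, 562.2190]))
# ~[9.68, 9.66, 10.01, 9.66, 9.45]      -> matches the Tsi column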

Table 5. Communication and computing time of each worker

Worker | tcm (s) | tcp (s) | Tsi (m)
p1 | 0.0940 | 580.7180 | 9.6802
p2 | 0.7350 | 578.5620 | 9.6565
p3 | 0.5470 | 599.1560 | 10.0089
p4 | 1.2500 | 576.8600 | 9.6581
p5 | 2.4380 | 562.2190 | 9.4547


Figure 6 Imaging of Sequential Processing

We then processed the task with the average partitioning strategy: each of the workers p1-p4 was allocated 27 shots and p5 was allocated 30 shots. The result is shown in Table 6. The total makespan is 13.3882 m, and the difference between the makespans of the workers is significant.


Table 6. Result of the average partitioning strategy

Worker | tcm (s) | tcp (s) | Makespan (m)
p1 | 0.1350 | 803.1560 | 13.3882
p2 | 0.9510 | 778.4900 | 12.9929
p3 | 1.3110 | 836.7380 | 13.9856
p4 | 1.2170 | 517.2970 | 8.6819
p5 | 1.4990 | 364.5190 | 6.1605

5 CONCLUSION
In this paper we have presented a hierarchical scheduling model with an application-level scheduler to support divisible seismic data processing applications, and we have proposed a probing based data partitioning (PDP) algorithm whose closed-form data partition was deduced in detail. Finally, the FFD pre-stack depth migration task of the Marmousi model was executed using the PDP algorithm. The results validate the PDP strategy under the condition that the resource list is given. In future work we will study multi-round scheduling algorithms to overlap communication with computation, and we will also investigate dynamic load balancing during the data processing.

ACKNOWLEDGEMENTS
This work was supported by the Fundamental Research Funds for the Central Universities (15CX02046A) and the ChinaGrid project funded by the MOE of China.
REFERENCES
[1] Matheny, Paul, et al. "Evolution of the land seismic
super crew." Seg Technical Program Expanded
Abstracts 2009:4338.
[2] Hai Jin. "Constructing a resources sharing platform - ChinaGrid." China Education Network 9(2006):25-26 (in Chinese).
[3] Min Zhong, et al. "Grid platform for seismic data parallel processing and its application." Journal of China University of Petroleum (Edition of Natural Science) 38.02(2014):180-186 (in Chinese).
[4] Bharadwaj, Veeravalli, D. Ghose, and T. G.
Robertazzi. "Divisible Load Theory: A New
Paradigm for Load Scheduling in Distributed
Systems." Cluster Computing 6.1(2003):7-17.
[5] Cheng, Y. C., and T. G. Robertazzi. "Distributed computation with communication delay (distributed intelligent sensor networks)." IEEE Transactions on Aerospace & Electronic Systems 24.6(1988):700-712.
[6] Yang, Yang, and H. Casanova. "UMR: a multi-round
algorithm for scheduling divisible workloads."
International Parallel & Distributed Processing
Symposium IEEE, 2003:24b.
[7] Ghose, Debasish, H. J. Kim, and T. H. Kim.
"Adaptive Divisible Load Scheduling Strategies for
Workstation Clusters with Unknown Network
Resources." IEEE Transactions on Parallel &
Distributed Systems 16.10(2005):897-907.
[8] Lin, Xuan, et al. "Real-Time Divisible Load
Scheduling for Cluster Computing." IEEE Real Time
& Embedded Technology & Applications
Symposium IEEE, 2007:303-314.
[9] Chuprat, Suriayati. "Divisible load scheduling of
real-time task on heterogeneous clusters."
Information Technology IEEE, 2010:721-726.
[10] Othman, M., et al. "Adaptive Divisible Load Model
for Scheduling Data-Intensive Grid Applications."
Lecture
Notes
in
Computer
Science
4487.1(2007):446-453.
[11] Iyer, G. N., B. Veeravalli, and S. G. Krishnamoorthy.
"On
Handling
Large-Scale
Polynomial
Multiplications in Compute Cloud Environments
using Divisible Load Paradigm." Aerospace &
Electronic Systems IEEE Transactions on
48.1(2012):820-831.
[12] Abdullah, Monir, and M. Othman. "Cost-based
Multi-QoS Job Scheduling Using Divisible Load
Theory in Cloud Computing. " Procedia Computer
Science 18.1(2013):928-935.
[13] Nisha, L., A. S. Ajeena Beegom, and M. S. Rajasree.
"Management of data intensive divisible load in
cloud systems with gossip protocol." International
Conference
on
Control,
Instrumentation,
Communication and Computational Technologies
IEEE, 2014:856-861.
[14] Rosas, Claudia, et al. "Improving Performance on
Data-Intensive Applications Using a Load Balancing
Methodology Based on Divisible Load Theory."
International Journal of Parallel Programming
42.1(2014):94-118.
[15] Ismail, Leila, and L. Khan. "Implementation and
performance evaluation of a scheduling algorithm
for divisible load parallel applications in a cloud
computing environment." Software Practice &
Experience 45.6(2015):765781.
[16] Kang, Seungmin, B. Veeravalli, and K. M. M. Aung.
"Scheduling Multiple Divisible Loads in a Multicloud
System."
IEEE/ACM,
International
Conference on Utility and Cloud Computing IEEE,
2015:371-378.
[17] Suresh, S., H. Huang, and H. J. Kim. "Scheduling in compute cloud with multiple data banks using divisible load paradigm." IEEE Transactions on Aerospace & Electronic Systems 51.2(2015):1288-1297.


[18] Raadt, Krijn Van Der, Y. Yang, and H. Casanova.


"Practical Divisible Load Scheduling on Grid
Platforms with APST-DV." Parallel and Distributed
Processing Symposium, 2005. Proceedings. IEEE
International 2005:29b.
[19] Li, Ping, B. Veeravalli, and A. A. Kassim. "Design
and implementation of parallel video encoding
strategies using divisible load analysis." IEEE
Transactions on Circuits & Systems for Video
Technology 15.9(2005):1098-1112.
[20] You Jang, Jun Chen and Jun Huang. "Application of HPC to Seismic Data Processing." Computer Engineering & Science 31.S1(2009):328-330 (in Chinese).
[21] Hong Liu. GPU/CPU Co-processing Parallel Computation for Seismic Data Processing in Oil and Gas Exploration. Petroleum Industry Press, 2010 (in Chinese).
[22] Calderon, Karynna. "Seismic Data Format (SEG-Y
Format) Archive of Chirp Seismic Reflection Data
Collected During Usgs Cruises 00SCC02 and
00SCC04, Barataria Basin, LOUISIANA, May 12 31 and June 17 - July 2, 2000." USGS - U.S.
Geological Survey, Center for Coastal & Watershed
Studies.
[23] Beaumont, Olivier, et al. "Scheduling Divisible
Loads on Star and Tree Networks: Results and Open
Problems." IEEE Transactions on Parallel &
Distributed Systems 16.3(2005):207-218.


FPGA-Based Processor Array Architecture for Profile Hidden Markov Models


Atef Ibrahim1,2,a Hamed Elsimary1,b Abdullah Aljumah1,c Fayez Gebali3,d
1
Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
2
Electronics Research Institute, Cairo, Egypt
3
University of Victoria, Victoria, BC, Canada
a
aa.mohamed@psau.edu.sa, b hamed@eri.sci.eg, c aljumah88@hotmail.com, d Fayez@ece.uvic.ca.
ABSTRACT
This paper proposes a novel processor array structure to speed up the Viterbi algorithm for profile Hidden Markov Models. The structure is amended to allow hardware reuse instead of repeating the processing elements of the processor array on multiple FPGAs, and it reduces the FPGA area overhead compared with the previously reported processor array structure. It therefore increases the maximum number of Processing Elements (PEs) that can be implemented on the FPGA and hence the throughput. FPGA implementation results show that the proposed design has a considerably higher speedup (up to 165%) over the previously reported one.

KEYWORDS
processor arrays, bioinformatics, profile hidden
Markov model, sequencing technology, biological
computation, reconfigurable computing, digital circuits design.

INTRODUCTION

Predicting the biological functions of protein


sequences is a critical task in bioinformatics.
The determination of protein function can be
done by using the homology search method.
In this method, protein sequences of unknown
function are aligned against known protein sequences stored in the database. In this alignment process, profile Hidden Markov models
(profile HMMs) have produced great results
[1], and are utilized by different databases.
The profile HMM is a probabilistic model used
to represent the protein family in database
searches. The Viterbi algorithm [2], which
is a dynamic programming based (DP-based)
algorithm, can be efficiently used to align a
profile HMM query and a subject sequence.


Protein alignment with the Viterbi algorithm on general purpose processors (microprocessors) has quadratic time complexity, and hence searching the database requires very long computation times. Research on accelerating the DP-based Viterbi algorithm has therefore grown massively. FPGA implementations of the Viterbi algorithm were among the early works reported in the literature [3, 4]. They presented a simplified architecture of what is called the full Plan 7 model [5] by neglecting the feedback loop, which led to an effective architecture with fine-grained parallelism of processor arrays and an estimated speedup of one to two orders of magnitude. Other FPGA implementations with no feedback loop dependency have also been reported in [6, 7, 8]. Oliver et al. then reported studies on accelerating the Viterbi algorithm with the full Plan 7 architecture in 2007 [9] and 2008 [10]. They presented a different approach to calculating the alignment matrix by computing the cells of the Dynamic Programming (DP) matrix in row-major order, but the strategy they proposed was not suitable for parallel computation due to the feedback loop dependency.
Typical FPGA-based HMMER acceleration computes alignment scores using processor arrays that allocate a single processing element (PE) per HMM node. Each PE utilizes from 300 to 500 logic slices for the implementation of the Viterbi algorithm, resulting in 10 to 100 PE systolic arrays implemented in hardware depending on the FPGA chip used. With profile HMMs of about 200 nodes on average [11], more logic slices and a larger amount of block RAM (BRAM) are consumed, since PEs are replicated to increase parallelism. Therefore, a folding technique has been utilized to allow for the implementation


of longer profile HMMs on an FPGA chip. This approach reuses PEs to carry out the computation of the alignment scores over multiple passes. Computing the alignment matrix in several passes requires an update of all PEs with coefficients during each fold computation. This raises the configuration time of each PE by a factor q, where q is the number of required folds. Isa et al. [12] presented a new hardware architecture that has a fixed number of 2 Configuration Elements (CEs) in the PE to store the coefficients for the alignment matrix calculation. Moreover, the CE is realized in plentiful FPGA logic slices, which preserves the limited block RAM for other critical tasks in profile HMM-based sequence alignment.
A new folded processor array structure is proposed in this paper for the Viterbi algorithm. This structure outperforms the processor array structure presented in [12] in speed and area. The paper also presents the hardware implementation of the PE of this structure and applies the scheduling technique of [12] to the PE architecture, without losing the time required for configuration, so that the processor array can be reused for multiple-pass processing.
The organization of the paper is as follows.
Section 2 gives a brief discussion about the
Viterbi algorithm that is used to align the profile HMM to a protein (subject) sequence. Section 3 presents the proposed processor array
architecture and shows the modification of the
proposed architecture, using folding technique,
to be reused for multiple pass processing. Section 4 compares the proposed processor array
architecture to the previously reported one in
terms of speed and area. At the end, Section 5
provides the conclusion of the paper.
2 VITERBI ALGORITHM

Figure 1 shows the profile HMM with the simplified Plan 7 architecture [5]. The full Plan 7 architecture has a feedback loop that makes parallel computation of the Viterbi algorithm difficult. According to the experiments performed by Takagi [13], the rate at which the feedback loop is chosen is extremely low, about 0.01%, and the loss in performance resulting from the recalculation process is less than a few percent. Therefore, the feedback loop can be neglected in the simplified Plan 7 architecture. The architecture is composed of a sequence of sets, each consisting of three states: a match state M, an insert state I, and a delete state D. There are transitions with associated probabilities between the sets. Also, each match state and insert state is associated with a location-specific table of 20 residue emission probabilities. Both emission and transition probabilities are produced from an alignment of multiple sequences of a protein family.
To determine the probability that a protein sequence is related to the protein family, a profile HMM should be aligned to that protein sequence. The most likely route through the profile HMM producing a sequence similar to the specified sequence yields a similarity score. The following recurrence equations of the Viterbi algorithm can be used to compute this score.

M(i,j) = e(M_j, s_i) + \max \begin{cases} M(i-1, j-1) + tr(M_{j-1}, M_j) \\ I(i-1, j-1) + tr(I_{j-1}, M_j) \\ D(i-1, j-1) + tr(D_{j-1}, M_j) \end{cases}    (1)

I(i,j) = e(I_j, s_i) + \max \begin{cases} M(i-1, j) + tr(M_j, I_j) \\ I(i-1, j) + tr(I_j, I_j) \end{cases}    (2)

D(i,j) = \max \begin{cases} M(i, j-1) + tr(M_{j-1}, D_j) \\ D(i, j-1) + tr(D_{j-1}, D_j) \end{cases}    (3)

where e(Mj, si) represents the emission cost of residue si at state Mj, and tr(X1, X2) represents the transition cost from state X1 to state X2, where X may be M, I or D. M(i, j) indicates the best route score that matches the sub-sequence s1...si to the sub-model up to state Mj, resulting in si being produced by state Mj. Also, I(i, j) indicates the best path score resulting in si being produced by Ij, and D(i, j) indicates the best path score terminating at state Dj. M(0, 0) = 0 and M(n, m) identify the initial and final matching scores, respectively, for a query profile HMM of length m and a subject sequence of length n.
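As an illustration of Eqs. (1)-(3), a plain software evaluation of the recurrences might look like the sketch below (log-space scores; e and tr are simple Python dictionaries supplied by the caller). This is our own illustrative code, not the hardware design described in this paper.

# Straightforward software version of the simplified Plan 7 Viterbi recurrences.
# e[(state, node, residue)] and tr[(state, node, state, node)] are log-score
# lookups supplied by the caller; seq is the subject sequence, m the number of
# profile HMM nodes. Illustrative only.
NEG_INF = float("-inf")

def viterbi_score(seq, m, e, tr):
    n = len(seq)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    I = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0  # initial matching score, Eq. convention M(0,0) = 0
    for i in range(1, n + 1):
        s = seq[i - 1]
        for j in range(1, m + 1):
            M[i][j] = e[("M", j, s)] + max(      # Eq. (1)
                M[i - 1][j - 1] + tr[("M", j - 1, "M", j)],
                I[i - 1][j - 1] + tr[("I", j - 1, "M", j)],
                D[i - 1][j - 1] + tr[("D", j - 1, "M", j)],
            )
            I[i][j] = e[("I", j, s)] + max(      # Eq. (2)
                M[i - 1][j] + tr[("M", j, "I", j)],
                I[i - 1][j] + tr[("I", j, "I", j)],
            )
            D[i][j] = max(                       # Eq. (3)
                M[i][j - 1] + tr[("M", j - 1, "D", j)],
                D[i][j - 1] + tr[("D", j - 1, "D", j)],
            )
    return M[n][m]  # final matching score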
Figure 1. Simplified transition structure of a profile HMM of length 4: delete states D1-D4, insert states I0-I4, and match states M0 (start) through M5 (end).

3 PROPOSED PROCESSOR ARRAY ARCHITECTURE

Figure 2 shows the processor array of the proposed design. It consists of m PEs. The subject sequence input bits, s(km + i), should be assigned to each PE in the processor array through the whole computation cycle (m clock cycles), where k takes values ranging from 0 to n/m − 1. The intermediate bits of I(km + i, j), D(km + i, j) and M(km + i, j) are pipelined between adjacent PEs. After a latency of n + m − 1 clock cycles, the output score is produced serially from the last PE (PEm).

Figure 2. Proposed processor array design: PE1 through PEm receive the subject sequence inputs s(km+1) ... s(km+m), pass intermediate results between adjacent PEs, and the last PE outputs the best score.

3.1 Proposed folded processor array

To solve the problem of long profile HMM queries, this design architecture has been modified to be reused for multiple-pass processing and implemented on a single FPGA rather than repeating the PEs on a cluster of multiple FPGAs. This design approach is called the folding approach [14, 15], and it has the advantage of reducing the significant amounts of energy consumed, besides decreasing design size, maintenance and operational costs. The design modification starts by partitioning the considered algorithm into small partitions and mapping these partitions onto a linear processor array of fixed size. This problem was previously studied in several papers [16, 14, 15]. The following illustrates the modification process.

Let us start by supposing the common case of a query profile HMM of size m and a fixed size v processor array, where q = ⌈m/v⌉ and m > v. At first, the m-sized processor array is theoretically expanded to a q·v-sized array with the last q·v − m PEs loaded with zero values; by that means the extra PEs do not affect the overall alignment results. After the expansion step, the resulting q·v-sized processor array is folded onto the actual fixed size v array. Due to this folding process, the alignment process will be completed in kq passes over the fixed size processor array. The intermediate results of each pass should be stored in a first-in-first-out buffer (FIFO) before they are passed back to the input of the array for the following pass (see Figure 3).

Figure 3. Proposed folded fixed size processor array: PE1 through PEv with M, I and D intermediate-result FIFOs of depth m − v feeding the results of one pass back to the array input for the next pass.
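A small sketch of the folding bookkeeping described above (our own illustration, not the paper's hardware): it pads the profile to q·v nodes and shows which HMM node each physical PE handles in each pass.

# Illustration of the folding approach: a profile HMM of m nodes is mapped
# onto a fixed array of v PEs over q = ceil(m/v) passes; the last q*v - m
# virtual PEs are padded with "zero" (inactive) nodes.
from math import ceil

def fold_schedule(m, v):
    q = ceil(m / v)
    schedule = []
    for p in range(q):                       # pass (fold) index
        nodes = []
        for pe in range(v):                  # physical PE index
            node = p * v + pe + 1            # HMM node handled in this pass
            nodes.append(node if node <= m else None)  # None = zero-padded PE
        schedule.append(nodes)
    return q, schedule

q, schedule = fold_schedule(m=9, v=4)
print(q)          # 3 passes
for nodes in schedule:
    print(nodes)  # [1, 2, 3, 4]  [5, 6, 7, 8]  [9, None, None, None]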

The computing over several passes requires a


different set of emission and transition probability scores for each pass computation. The
PE temporarily stores the emission and transition probability scores in a look-up table rather
than a block RAM. In [12], the authors referred to the look-up table as the configuration element (CE), and this term is also used in this paper. The mapping of a CE to its corresponding profile HMM node position is dictated by a controller inside a CE-NODE MAPPER, as discussed in [12]. The design of the proposed sequence alignment core architecture is based on the scheduling strategy (known as overlapped computation and configuration, OCC) proposed in [12] (see [12] for more details regarding this strategy).
Figure 4 shows the PE hardware structure of


the folded processor array. The PE has two
configuration elements, CE0 and CE1, which
temporarily store emission and transition probability scores of a specific profile HMM node.
During the configuration mode, the controller
of the CE-NODE MAPPER selects the emission and transition probability scores corresponding to the CE by mux1 or mux2.
The CE briefly holds three look-up tables (one
holds 20 elements emission scores of the M
state, the other one holds 20 elements scores
of the I state, and the last one holds 9 elements of the transition state scores) for alignment matrix computation. Thus, each CE with
a total CE depth of 49 elements represents a
specific profile HMM node position. The CEs
are used alternately for alignment matrix computation, where the turn for the CE (either
CE0 or CE1) is dictated by the computational
passes. The controller sets all CE0 elements
in the processor array to hold the probability
scores during all even numbered fold computations, whereas CE1 elements are used during all odd-numbered computation. mux3
is used to select the probability scores either
from CE0 or CE1 since both CEs are used alternately for computation and this selection is
set by the CE-SEL port. The selected probability scores are used as inputs for the processing core. The input residue of the subject sequence, s(km + (qv + i)), is loaded into a register during the configuration phase and applied for the whole computational cycle. After a latency of n + m − 1 clock cycles, the output score is produced serially from the last PE (PEm) in the processor array. The obtained best score output and the corresponding subject sequence address are stored in a best-score FIFO if the score satisfies a given threshold value; otherwise they are ignored.
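The alternating use of CE0 and CE1 and the thresholded best-score FIFO can be captured in a few lines of control logic. The sketch below is a purely behavioral Python model of that mechanism (all names are ours, not the RTL).

# Behavioral model of the CE selection and best-score filtering described above.
# CE0 holds the node coefficients for even-numbered folds, CE1 for odd ones.
def select_ce(fold_index):
    return 0 if fold_index % 2 == 0 else 1   # value driving the CE-SEL port (mux3)

def filter_best_scores(scores_with_addresses, threshold):
    """Keep (score, subject_address) pairs that reach the threshold (best-score FIFO)."""
    return [(s, addr) for (s, addr) in scores_with_addresses if s >= threshold]

print(select_ce(4), select_ce(7))                      # 0 1
print(filter_best_scores([(12, 0x10), (3, 0x11)], 8))  # [(12, 16)]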
Figure 5 shows the processing core inside the PE. It implements the basic operations of the Viterbi algorithm, calculating the matrices M(i, j), I(i, j) and D(i, j) as described by Equations (1), (2), and (3). It consists of three arithmetic instances: the M(km + (qv + i), j) instance, which computes the scores of the M state; the I(km + (qv + i), j) instance, which computes the scores of the I state; and the D(km + (qv + i), j) instance, which computes the scores of the D state.

Figure 5. PE processing core.
Figure 4. PE logic diagram: CE0 and CE1 receive the emission and transition probability scores for the profile HMM nodes from the CE-NODE MAPPER through mux1/mux2, mux3 selects the active CE via the CE-SEL port, and the selected scores e(Mj, s), e(Ij, s) and tr feed the processing core together with the registered subject residue s(km + (qv + i)).

4 COMPLEXITY COMPARISONS
In this section, we discuss the performance
evaluation and resource usage of the proposed
novel design (Design1) and the previously reported design of [12] (Design2). These designs
are modeled using VHDL language and realized using Xilinx ISE8.1 tools on Alpha Data
ADM-XRC-5LX card. The developed VHDL
code for each design is parameterizable, in
which we can change the number of PEs in
the processor array, the PE word size, and the
lengths of both subject sequence n and the profile HMM query m. The synthesis results after place and route (using PE word width of
16) show that the maximum number of PEs
(for m = 2295 and n = 35, 213) that can be
implemented on the FPGA in the case of the


proposed Design1 is 53, against a maximum of 43 PEs that can be fitted on the same FPGA for
the case of the conventional Design2. The excess number of implemented PEs in the case of the proposed Design1 is attributed to the great saving in resources that results from removing the subject sequence feedback FIFO as well as reducing the sizes of the intermediate-results feedback FIFOs. The maximum operating frequency obtained from the post place-and-route synthesis results is 164 MHz for the proposed design and 166 MHz for the conventional one. The slight reduction in frequency in the case of the proposed design is due to its increased critical path delay, and hence clock period, compared with the conventional design. This increase in critical path delay is due to the larger number of PEs, which leads to longer wires (i.e., larger parasitic resistances and capacitances) and larger fan-outs of the gates in the processor array.
In an attempt to perform real testing for
both architectures, several samples of profile


HMM queries (of lengths ranging from 38


to 2295) picked from the Pfam database [17]
are aligned against a UniprotKB/Swiss-Prot
knowledgebase [18] of 549,215 subject sequences (195,767,212 residues) in a fixed size
processor array of v = 38 with different folds.
Table 1 shows the resulting execution time and normalized speedup in each case. In this table, ET1 and ET2 are the execution times, in seconds, needed by Design1 and Design2, respectively, to finish an alignment operation, and #LC1 and #LC2 are the numbers of logic cells (LCs) occupied by the Design1 and Design2 processor arrays. The design metrics Speedup, Area Ratio and Area Normalized Speedup are computed in order to evaluate the amount of improvement achieved by each design. The Speedup is computed by dividing the Design2 execution time ET2 by the Design1 execution time ET1, and the Area Ratio is calculated by dividing the Design1 logic cells #LC1 by the Design2 logic cells #LC2. Furthermore, the Area Normalized Speedup is computed by dividing the Speedup by the Area Ratio. The column entitled % Normalized Speedup is the percentage increase in Area Normalized Speedup; it is calculated by subtracting one (the default value) from the Area Normalized Speedup values given in Table 1 and multiplying the result by 100%.
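For instance, the first row of Table 1 (m = 38) can be reproduced as follows (a small illustrative computation, not part of the paper itself):

# Reproduce the derived metrics for the m = 38 row of Table 1.
ET1, ET2 = 0.12, 0.22            # execution times (s) of Design1 and Design2
LC1, LC2 = 38_123, 55_154        # logic cells occupied by Design1 and Design2

speedup = ET2 / ET1                                    # 1.83
area_ratio = LC1 / LC2                                 # 0.691
area_normalized_speedup = speedup / area_ratio         # 2.65
pct_normalized_speedup = (area_normalized_speedup - 1) * 100   # ~165 %

print(round(speedup, 2), round(area_ratio, 3),
      round(area_normalized_speedup, 2), round(pct_normalized_speedup))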
Table 1 also shows that the area of the proposed design (#LC1) varies with the size of the profile HMM query, while the area of the conventional design (Design2) is fixed. The variation in the proposed design (Design1) arises because its processor array FIFO depth is determined by the length of the profile HMM query currently being aligned, whereas the fixed area of the conventional design (Design2) arises because its processor array FIFO depth is determined by the longest subject sequence (n = 35,213) in the UniprotKB/Swiss-Prot database. The table further shows that the proposed design has a considerably higher normalized speedup, ranging from 69% to 165%, over the conventional design for profile HMM query lengths ranging from 38 to 2295.

Table 1. Performance comparison using a folded processor array of size v = 38.

Query length m | Folds q | Design1 ET1 (s) | Design1 #LC1 | Design2 ET2 (s) | Design2 #LC2 | Speedup | Area Ratio | Area Normalized Speedup | % Normalized Speedup
38   | 1  | 0.12   | 38,123 | 0.22   | 55,154 | 1.83 | 0.691 | 2.65 | 165 %
76   | 2  | 0.32   | 38,745 | 0.54   | 55,154 | 1.71 | 0.702 | 2.43 | 143 %
152  | 4  | 2.13   | 39,185 | 3.47   | 55,154 | 1.63 | 0.710 | 2.30 | 130 %
304  | 8  | 9.54   | 39,867 | 14.41  | 55,154 | 1.51 | 0.723 | 2.09 | 109 %
380  | 10 | 14.52  | 40,124 | 20.76  | 55,154 | 1.43 | 0.727 | 1.97 | 97 %
456  | 12 | 21.45  | 40,722 | 29.81  | 55,154 | 1.39 | 0.738 | 1.88 | 88 %
532  | 14 | 22.98  | 41,287 | 31.48  | 55,154 | 1.37 | 0.749 | 1.83 | 83 %
901  | 24 | 39.39  | 41,987 | 53.18  | 55,154 | 1.35 | 0.761 | 1.77 | 77 %
2295 | 61 | 100.11 | 42,987 | 132.14 | 55,154 | 1.32 | 0.779 | 1.69 | 69 %

5 CONCLUSION

In this paper, we presented a novel processor array structure for accelerating the Viterbi algorithm. This structure has been amended to allow hardware reuse and so avoid repeating the PEs of the processor array on multiple FPGAs. Moreover, it achieves a significant reduction in area over the conventional design by removing the subject FIFO and using small FIFOs for the intermediate results. This has led to an increase in the maximum number of PEs that can be implemented on the FPGA and hence in the total throughput. The implementation results showed that the proposed design has a significantly higher normalized speedup, ranging from 69% to 165%, over the conventional design for profile HMM query lengths ranging from 38 to 2295.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the support of the deanship of scientific research at Prince Sattam Bin Abdulaziz University under research project # 2015/01/3570.

REFERENCES
[1] S. R. Eddy, Profile hidden markov models,
Bioinformatics, vol. 14, pp. 755763, 1998.
[2] A. Viterbi, Error bounds for convolutional codes
and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory,
vol. 13, no. 2, 1967.
[3] P. M. Rahul, B. Jeremy, D. C. Roger, A. F. Mark,
and H. Brandon, Accelerator design for protein
sequence hmm search, in Proceedings of the 20th
annual international conference on Supercomputing, (Cairns, Queensland, Australia), pp. 288296,
2006.
[4] T. F. Oliver, B. Schmidt, Y. Jakop, and D. L.
Maskell, Accelerating the viterbi algorithm for
profile hidden markov models using reconfigurable hardware, in Lecture Notes in Computer
Science, (Springer Berlin / Heidelberg), pp. 522
529, 2006.


[5] S. R. Eddy, Hmmer users guide, in Washington


University School of Medicine, 2003.
[6] A. C. Jacob, J. M. Lancaster, J. D. Buhler, and
R. D. Chamberlain, Preliminary results in accelerating profile hmm search on fpgas, in Proceedings of IEEE International Symposium on Parallel and Distributed Processing (IPDPS), (Long
Beach, CA), pp. 18, 2007.
[7] K. Benkrid, P. Velentzas, and S. Kasap, A
high performance reconfigurable core for motif
searching using profile hmm, in Proceedings of
NASA/ESA Conference on Adaptive Hardware and
Systems (AHS 2008), (Noordwijk), pp. 285292,
2008.
[8] T. F. Oliver, B. Schmidt, Y. Jakop, and D. L.
Maskell, High speed biological sequence analysis with hidden markov models on reconfigurable platforms, IEEE Transactions on Information Technology in Biomedicine, vol. 13, pp. 740
746, 2009.

[15] D. Moldovan and J. Fortes, Partitioning and mapping of algorithms into fixed size systolic arrays,
IEEE Trans. on Computers, vol. 35, pp. 112,
1986.
[16] K. Benkrid, Y. Liu, and A. Benkrid, A highly
parameterized and efficient fpga- based skeleton
for pairwise biological sequence alignment, IEEE
Trans. On VLSI systems, vol. 17, pp. 561570,
2009.
[17] R. Finn, A. Bateman, J. Clements, P. Coggill,
R. Eberhardt, S. Eddy, A. Heger, K. Hetherington,
L. Holm, J. Mistry, E. Sonnhammer, J. Tate, and
M. Punta, The pfam protein families database,
Nucleic Acids Research, vol. 42, 2014.
[18] UniProtKB/Swiss-Prot protein knowledgebase, release 2015 09. http://web.expasy.org/docs/relnotes/relstat.html, 2015.

[9] T. Oliver, L. Yeow, and B. Schmidt, High performance database searching with hmmer on fpgas,
in Proceedings of IEEE International Symposium
on Parallel and Distributed Processing (IPDPS),
(Long Beach, CA), pp. 17, 2007.
[10] T. Oliver, L. Y. Yeow, and B. Schmidt, Integrating
fpga acceleration into hmmer, Journal of Parallel
Computing, vol. 34, pp. 681691, 2008.
[11] M. Punta, P. Coggill, R. Eberhardt, J. Mistry,
J. Tate, C. Boursnell, N. Pang, k. Forslund,
G. Ceric, J. Clements, A. Heger, L. Holm,
E. Sonnhammer, S. Eddy, A. Bateman, and
R. Finn, The pfam protein families database, Nucleic Acids Research, vol. 40, pp. D290D301,
2012.
[12] M. Isa, K. Benkrid, and T. Clayton, A novel efficient fpga architecture for hmmer acceleration,
in Proceedings of the International Conference on
Reconfigurable Computing and FPGAs (ReConFig), (Cancun), pp. 16, 2012.
[13] T. Takagi and T. Maruyama, Accelerating hmmer
search using fpga, in Proceedings of the International Conference on Field Programmable Logic
and Applications (FPL 2009), (Prague), pp. 332
337, 2009.
[14] S. Kung, VLSI Array Processors. Englewood Cliffs, N.J.: Prentice-Hall, 1988.
