Conference Dates
Conference Venue
ISBN
Published by
Table of Contents
Collect and Disseminate Layer Protocol for Searching Cloud Services ... 7
FPGA-Based Processor Array Architecture for Profile Hidden Markov Models ... 28
Proceedings of the Fourth International Conference on Digital Information Processing, E-Business and Cloud Computing (DIPECC), Kuala Lumpur, Malaysia, 2016
INTRODUCTION
Information Technology Service Management
(ITSM) refers to the way an organization operates
the IT part of its business, focusing on the
day-to-day services provided by the IT department
so as to meet customers' needs; ITSM treats IT as
a service function of the organization.
Since technology is growing rapidly, it is
difficult for IT service providers to afford new
technology all the time, so instead they focus more
COBIT
This is a standard used widely by organizations
around the world. It is considered one of the
most applicable frameworks for an organization to
achieve business-related goals by making use of
ITIL VS COBIT
The following table shows the main differences
identified between the ITIL and COBIT frameworks.

TABLE I. ITIL VERSUS COBIT

ITIL:
- Focused on operation management through the provision of best practices.
- Improvement of the effectiveness and quality of IT customer service and IT operations.
- Focused on HOW to meet the challenge.
- Focused more on ITSM, using the five (5) stages of the service life-cycle (service strategy, service design, service transition, service operation and continual service improvement).

COBIT:
- Focused on IT governance by defining, implementing, measuring and improving principles.
of the banks have started to adopt well-known
guidelines such as ITIL. The majority of the
banks sampled in the research still use the
Microsoft Operations Framework (MOF) and the ISO
standard alone for managing their IT.

One question asked how strictly the banks follow
their framework, on a scale of 0-10. Most of the
banks follow their chosen framework to about 80%
but do not follow it strictly, apart from one bank;
some follow it only to about 50% or 60%. ITIL is
used by some of the banks, but if it is not fully
implemented a bank may not get the full benefit of
the framework, so some improvements are needed.
RECOMMENDATIONS
In order to tackle all the areas of challenge faced
by the banking sector, a combined framework of the
Microsoft Operations Framework (MOF), ITIL and
COBIT is recommended. Since some of the banks
already practice and comply with at least one
of the three, they should use some elements of the
ABSTRACT
1 INTRODUCTION
KEYWORDS
Cloud computing, searching protocol, cloud services, protocol.
2 LITERATURE REVIEW
This research mainly focuses on fitting. Fitting
in this study means matching the user's search
query in a cloud environment; more specifically,
how accurately users find their services among the
available service providers. The risk in meeting
the user's query arises when the available services
mismatch the user's search query [3]. From a
detailed study of existing research and a
preliminary study, the accuracy of meeting and
matching the user's query involves three main
categories of issues or gaps, namely services,
searching, and standard or protocol. The following
sections explain them in detail.
Ostrovsky Protocol
Performance Comparison
3 METHODOLOGIES
Four phases were required for this research:
identifying the prominent problems via a
preliminary study, designing the proposed protocol
and developing a tool for its evaluation,
evaluating the proposed protocol via the developed
tool, and verifying the performance analysis of the
proposed protocol with a set of experiments.
Figure 3 shows the details of each phase.
The computation time collected via the
questionnaire form is used to compare the
performance of the proposed protocol and the search
engine. Figure 6 shows the graph plotted using the
calculated average computation time.
[Table: responses without and with CSSE]

                Answer   Frequency   Percentage   Cumulative percentage
Without CSSE    Yes      18          27.7         27.7
                No       47          72.3         100.00
                Total    65          100
With CSSE       Yes      58          89.2         89.2
                No       7           10.8         100
                Total    65          100
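The percentage and cumulative-percentage columns above follow directly from the raw frequencies (65 respondents per condition); a minimal sketch of the arithmetic (the function name is ours, not from the paper):

```python
# Recompute a percentage / cumulative-percentage table from raw answer
# frequencies, as in the CSSE response table (n = 65 per condition).
def frequency_table(counts, total):
    rows, cumulative = [], 0.0
    for answer, freq in counts:
        pct = 100.0 * freq / total      # share of all respondents
        cumulative += pct               # running total of the shares
        rows.append((answer, freq, round(pct, 1), round(cumulative, 1)))
    return rows

without_csse = frequency_table([("Yes", 18), ("No", 47)], 65)
with_csse = frequency_table([("Yes", 58), ("No", 7)], 65)
print(without_csse)  # [('Yes', 18, 27.7, 27.7), ('No', 47, 72.3, 100.0)]
```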
[Table: descriptive statistics (mean, std. deviation, minimum, maximum) of time and of similarity, with and without CSSE.]

[Figure 6: bar chart comparing average computation time for 100-keyword and 200-keyword queries.]
[Table: communication cost and computation cost comparison of the Ostrovsky protocol, COPS, and the proposed protocol.]
5 CONCLUSIONS
ABSTRACT
The aim of this research is to assess some published
measurement models that evaluate the economic
benefits, for organizations, of moving to cloud
computing. This research also aims at checking
whether any additional factors need to be
incorporated or any fine tuning is required to
improve the models. Assessment and modification
of the models are based on applying them to
real-life case studies and comparing their output
with the available real information. A realistic
case study from a large university was considered.
The case study highlighted the need to consider
more factors in the model, in particular the
impact of any previous partial move to cloud
computing.
KEYWORDS
Cloud computing, suitability index.
INTRODUCTION
BACKGROUND

The suitability of a move to cloud computing depends on:
- IT resources size,
- the pattern of using these resources,
- the sensitivity of the concerned data, and
- the criticality of the concerned work.
CASE STUDY
Largeness value:
L = NoS*C_NoS + NoC*C_NoC + AR*C_AR
So L = 4*8 + 5*4 + 3*4 = 64

Workload variability:
WV = PU*C_PU + AU*C_AU + ADH*C_ADH
So PU = 4*5 + 2*9 = 38
So AU = 3*5 + (4 - 3)*7 = 22
[Table: case-study characteristics mapped to the model variables (tabulation sheet)]

Characteristic                              Symbol   Value
Number of servers                           NoS      < 100
Number of countries it is spread across     NoC      1
Annual revenue from IT offerings            AR       < 20 m$
Duration of peak usage/year                 DoP      few hours
Peak by average                             PbA      < 5 times
Type of services                            ToS      n/a
Size of customer base                       SCB      below 20,000
Amount of data handling                     ADH      500 GB - 1 TB/month
Sensitivity of data                         SoD      sensitive

Workload variability: Profile 4 (moderately variable workload with occasional surges).
Data sensitivity: DS = SoD = 3
Criticality: C = CWD = 3

Suitability index = L*C_L + WV*C_WV + DS*C_DS + C*C_C*(65 - L)
                  = 64*8 + 419*8 + 3*18 + 3*4*(65 - 64)
                  = 3930
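The credit-weighted sums above can be checked numerically. The sketch below is our reconstruction of the arithmetic: the weights (e.g. C_L = 8, C_C = 4, and C_DS = 18 inferred from the stated total of 3930) and the WV value of 419 are read off the worked example, not taken from the model's published credit tables:

```python
# Reconstruction of the worked suitability-index example.
# All weights here are read off / inferred from the paper's numbers,
# not from the model's official credit tables.
def weighted_sum(values_and_credits):
    return sum(v * c for v, c in values_and_credits)

L = weighted_sum([(4, 8), (5, 4), (3, 4)])   # largeness: NoS, NoC, AR terms
PU = weighted_sum([(4, 5), (2, 9)])          # peak usage terms
AU = weighted_sum([(3, 5), (4 - 3, 7)])      # average usage terms
DS, C = 3, 3                                 # data sensitivity, criticality
WV = 419                                     # workload variability (as stated)

suitability = L * 8 + WV * 8 + DS * 18 + C * 4 * (65 - L)
print(L, PU, AU, suitability)  # 64 38 22 3930
```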
The result is in the gray area, which could
indicate a weakness in the model. Available
information from staff at the institution (though
it was not possible to get accurate figures due to
authorization issues) indicates otherwise.

The concerned university had partially moved to
cloud computing before we applied the available
model. This issue of legacy partial conversion is
not considered in the available model. We need to
incorporate a factor for the percentage of load
already designated to cloud computing prior to
applying the model. This can be refined by studying
more similar cases and comparing them with some
pure cases (i.e. cases where information is
available about the institution before any move to
cloud computing and after a full move).
One case study is not enough to give strongly
credible feedback about the model. Another academic
institution is being considered. It is also
important to apply the model to one or more
industrial or commercial organizations and check
whether there are any clear general differences
between academic and non-academic organizations
regarding the appropriateness of cloud computing.
KEYWORDS
Seismic Data Processing, Private Cloud, Divisible
Load, Scheduling, Data Partition
3 PROBING BASED DATA PARTITIONING (PDP) ALGORITHM
[Figure: hierarchical scheduling model. A central services layer (information service, meta-scheduler, application-level schedulers) sits above a dynamic aggregation layer of logical communities (each with a local scheduler and local information service) and a resources layer of resource nodes (node 1 .. node m).]

[Figure: probing-phase diagram over links l_1 .. l_n, with probing computation times T_cp1 .. T_cpn on processors p_1 .. p_n.]
[Timing diagram of the probing and scheduling phases, with per-processor communication times t_cm_i and computation times t_cp_i. Equations (2)-(5) derive the closed-form data partition by equating the finish times of adjacent processors, with the load fractions summing to 1.]
[Equation (6): continuation of the closed-form partition derivation over terms t_cp3, t_cm4, t_cp4, ..., with the n load fractions summing to 1.]

[Experimental configuration fragment: p4 — 1 GB, p5 — 4 GB.]
[Table: probed per-worker communication time tcm, computation time tcp, and probing time Tpb]

worker     p1        p2        p3        p4        p5
tcm (s)    0.0493    0.0413    0.0331    0.0063    0.0375
tcp (s)    11.8169   19.5106   28.5794   29.9638   30.6262
Tpb (m)    0.1978    0.3267    0.4784    0.5016    0.5132
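Given probed per-worker times like those above, a single-round divisible-load split can be sketched by giving each worker a fraction of the data inversely proportional to its compute time, so all workers finish together. This is a generic divisible-load illustration under our own simplification (communication time ignored), not the paper's PDP closed formula:

```python
# Generic single-round divisible-load split: worker i gets a fraction
# alpha_i inversely proportional to its measured per-unit compute time,
# so every worker finishes at the same moment.  Communication time tcm
# is ignored here for simplicity; the paper's PDP formula accounts for it.
tcp = {"p1": 11.8169, "p2": 19.5106, "p3": 28.5794,
       "p4": 29.9638, "p5": 30.6262}

speed = {w: 1.0 / t for w, t in tcp.items()}      # relative processing speed
total = sum(speed.values())
alpha = {w: s / total for w, s in speed.items()}  # load fractions, sum to 1

# Each worker's finish time alpha_i * tcp_i is identical by construction.
finish = {w: alpha[w] * tcp[w] for w in tcp}
```

The fastest worker (p1) receives the largest fraction, and the common finish time equals 1 / sum(1 / tcp_i).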
5 CONCLUSION

In this paper we have presented a hierarchical
scheduling model with an application-level
scheduler to support divisible seismic data
processing applications. We then proposed a
probing-based data partitioning algorithm and
introduced the deduction of the closed formula of
the data partition in detail. Finally, an FFD
pre-stack depth migration task on the Marmousi
model was executed using the PDP algorithm. The
result validated the PDP strategy on the condition
that the resources list is given. In future work we
will study multi-round scheduling algorithms to
overlap communication with computing, and will also
investigate dynamic load balancing during data
processing.

ACKNOWLEDGEMENTS

This work was supported by the Fundamental Research
Funds for the Central Universities (15CX02046A) and
the ChinaGrid project funded by the MOE of China.
REFERENCES
[1] Matheny, Paul, et al. "Evolution of the land seismic super crew." SEG Technical Program Expanded Abstracts, 2009:4338.
[2] Hai Jin. "Constructing a resources sharing platform: ChinaGrid." China Education Network 9(2006):25-26.
[3] Min Zhong, et al. "Grid platform for seismic data parallel processing and its application." Journal of China University of Petroleum (Edition of Natural Science) 38.02(2014):180-186.
[4] Bharadwaj, Veeravalli, D. Ghose, and T. G. Robertazzi. "Divisible Load Theory: A New Paradigm for Load Scheduling in Distributed Systems." Cluster Computing 6.1(2003):7-17.
[5] Cheng, Y. C., and T. G. Robertazzi. "Distributed computation with communication delay (distributed intelligent sensor networks)." IEEE Transactions on Aerospace & Electronic Systems 24.6(1988):700-712.
[6] Yang, Yang, and H. Casanova. "UMR: a multi-round algorithm for scheduling divisible workloads." International Parallel & Distributed Processing Symposium, IEEE, 2003:24b.
[7] Ghose, Debasish, H. J. Kim, and T. H. Kim. "Adaptive Divisible Load Scheduling Strategies for Workstation Clusters with Unknown Network Resources." IEEE Transactions on Parallel & Distributed Systems 16.10(2005):897-907.
[8] Lin, Xuan, et al. "Real-Time Divisible Load Scheduling for Cluster Computing." IEEE Real-Time & Embedded Technology & Applications Symposium, IEEE, 2007:303-314.
[9] Chuprat, Suriayati. "Divisible load scheduling of real-time task on heterogeneous clusters." Information Technology, IEEE, 2010:721-726.
[10] Othman, M., et al. "Adaptive Divisible Load Model for Scheduling Data-Intensive Grid Applications." Lecture Notes in Computer Science 4487.1(2007):446-453.
[11] Iyer, G. N., B. Veeravalli, and S. G. Krishnamoorthy. "On Handling Large-Scale Polynomial Multiplications in Compute Cloud Environments using Divisible Load Paradigm." IEEE Transactions on Aerospace & Electronic Systems 48.1(2012):820-831.
[12] Abdullah, Monir, and M. Othman. "Cost-based Multi-QoS Job Scheduling Using Divisible Load Theory in Cloud Computing." Procedia Computer Science 18.1(2013):928-935.
[13] Nisha, L., A. S. Ajeena Beegom, and M. S. Rajasree. "Management of data intensive divisible load in cloud systems with gossip protocol." International Conference on Control, Instrumentation, Communication and Computational Technologies, IEEE, 2014:856-861.
[14] Rosas, Claudia, et al. "Improving Performance on Data-Intensive Applications Using a Load Balancing Methodology Based on Divisible Load Theory." International Journal of Parallel Programming 42.1(2014):94-118.
[15] Ismail, Leila, and L. Khan. "Implementation and performance evaluation of a scheduling algorithm for divisible load parallel applications in a cloud computing environment." Software Practice & Experience 45.6(2015):765-781.
[16] Kang, Seungmin, B. Veeravalli, and K. M. M. Aung. "Scheduling Multiple Divisible Loads in a Multi-cloud System." IEEE/ACM International Conference on Utility and Cloud Computing, IEEE, 2015:371-378.
[17] Suresh, S., H. Huang, and H. J. Kim. "Scheduling in compute cloud with multiple data banks using divisible load paradigm." IEEE Transactions on Aerospace & Electronic Systems 51.2(2015):1288-1297.
KEYWORDS
processor arrays, bioinformatics, profile hidden
Markov model, sequencing technology, biological
computation, reconfigurable computing, digital circuits design.
INTRODUCTION
Protein alignment by the Viterbi algorithm using
general-purpose processors (microprocessors)
results in quadratic time complexity, and hence
searching the database requires very long
computation time. Research focusing on
accelerating the DP-based Viterbi algorithm has
therefore grown massively. FPGA implementations of
the Viterbi algorithm were among the early works
reported in the literature [3, 4]. They presented a
simplified architecture relative to the so-called
full Plan 7 [5] by neglecting the feedback loop,
which led to an effective architecture with the
fine-grained parallelism of processor arrays and an
estimated speedup of one to two orders of
magnitude. Other FPGA implementations with no
feedback-loop dependency have also been reported in
[6, 7, 8]. Oliver et al. then reported studies
accelerating the Viterbi algorithm with the full
Plan 7 architecture in 2007 [9] and 2008 [10]. They
presented a different approach that calculates the
alignment matrix by computing the cells of the
Dynamic Programming (DP) matrix in row-major order,
but the strategy they proposed was not suitable for
parallel computation due to the feedback-loop
dependency.

A typical FPGA-based HMMER accelerator computes
alignment scores using processor arrays, allocating
a single processing element (PE) per HMM node. Each
PE uses from 300 to 500 logic slices for the
implementation of the Viterbi algorithm, resulting
in systolic arrays of 10 to 100 PEs in hardware,
depending on the FPGA chip used. With profile HMMs
having about 200 nodes on average [11], more logic
slices and a larger amount of block RAM (BRAM) are
consumed as PEs are replicated to increase
parallelism. Therefore, a folding technique has
been utilized to allow for the implementation
VITERBI ALGORITHM
Figure 1 shows the profile HMM with the simplified
Plan 7 architecture [5]. The full Plan 7
architecture has a feedback loop that makes
parallel computation of the Viterbi algorithm
difficult. According to the experiments performed
by Takagi [13], the ratio that the feed-

M(i, j) = e(Mj, si) + max{ M(i-1, j-1) + tr(Mj-1, Mj),
                           I(i-1, j-1) + tr(Ij-1, Mj),
                           D(i-1, j-1) + tr(Dj-1, Mj) }     (1)

I(i, j) = e(Ij, si) + max{ M(i-1, j) + tr(Mj, Ij),
                           I(i-1, j) + tr(Ij, Ij) }         (2)

D(i, j) = max{ M(i, j-1) + tr(Mj-1, Dj),
               D(i, j-1) + tr(Dj-1, Dj) }                   (3)
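In software, recurrences (1)-(3) map directly onto a dynamic-programming loop. A minimal log-space sketch with toy, position-independent transition scores (the dictionaries are illustrative placeholders, not a real profile HMM):

```python
# Minimal Viterbi DP over a profile HMM in log space, following
# Equations (1)-(3).  e_match/e_ins/tr are toy score tables, and the
# transition scores are simplified to be position-independent.
NEG_INF = float("-inf")

def viterbi(seq, m, e_match, e_ins, tr):
    n = len(seq)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    I = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0                         # start state M0
    for i in range(1, n + 1):
        s = seq[i - 1]
        for j in range(1, m + 1):
            # Eq. (1): match state, emits symbol s
            M[i][j] = e_match[j][s] + max(
                M[i - 1][j - 1] + tr[("M", "M")],
                I[i - 1][j - 1] + tr[("I", "M")],
                D[i - 1][j - 1] + tr[("D", "M")])
            # Eq. (2): insert state, emits symbol s
            I[i][j] = e_ins[j][s] + max(
                M[i - 1][j] + tr[("M", "I")],
                I[i - 1][j] + tr[("I", "I")])
            # Eq. (3): delete state, silent
            D[i][j] = max(
                M[i][j - 1] + tr[("M", "D")],
                D[i][j - 1] + tr[("D", "D")])
    return M[n][m]
```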
[Figure 1: simplified Plan 7 profile HMM with match states M0 (start) through M5 (end), insert states I0-I4, and delete states D2-D4.]
3 PROPOSED PROCESSOR ARRAY ARCHITECTURE
than repeating the PEs on a cluster of multiple
FPGAs. This design approach is called the folding
approach [14, 15]; it has the advantage of
significantly reducing the energy consumed, besides
decreasing the design size and the maintenance and
operational costs. The design modification starts
by partitioning the considered algorithm into small
partitions and mapping these partitions onto a
linear processor array of fixed size. This problem
was previously studied in several papers [16, 14,
15]. The following illustrates the modification
process.

Let us start by supposing the common case of a
query profile HMM of size m and a fixed-size
processor array of v PEs, where q = ⌈m/v⌉ and
m > v. At first, the m-size processor array is
theoretically expanded to a q×v-size array, with
the last qv - m PEs loaded with zero values; by
that means the extra PEs do not affect the total
alignment results. After the expansion step, the
resulting q×v-size processor array is folded onto
the actual fixed-size v array. Due to this folding
process, the alignment is completed in kq passes
over the fixed-size processor array. The
intermediate results of each pass are stored in
first-in-first-out buffers (FIFOs) before being
passed back to the input of the array for the
following pass (see Figure 3).
[Figure 3: folded processor array. Subject sequence symbols s(km+1) .. s(km+m) stream through the fixed-size array PE1 .. PEv; intermediate results are fed back through M_FIFO, I_FIFO and D_FIFO buffers of depth m - v, and the best score is output.]
The obtained best score output and the
corresponding subject sequence address are stored
in a best-score FIFO if the score satisfies a given
threshold value; otherwise they are ignored.

Figure 5 shows the processing core inside the PE.
It implements the basic operations of the Viterbi
algorithm, calculating the matrices M(i, j),
I(i, j) and D(i, j) as described by Equations (1),
(2) and (3). It consists of three instances: the
M(km + (qv + i), j) instance, which computes scores
of the M state; the I(km + (qv + i), j) instance,
which computes scores of the I state; and the D(km +
[Figure: configuration-element array inside the PE, fed from the CE-NODE MAPPER through multiplexers (mux1, mux2) into configuration elements CE0 .. CE, with a demultiplexer producing s(km+(qv+i)).]
to the look-up table as the configuration element
(CE), and this term is also used in this paper. The
mapping of a CE to its corresponding profile HMM
node position is dictated by a controller inside a
CE-NODE MAPPER, as discussed in [12]. The design of
the proposed sequence alignment core architecture
is based on the scheduling strategy known as
overlapped computation and configuration (OCC)
proposed by [12] (see [12] for more details
regarding this strategy).
[Figure fragment: CE-SEL and mux3 select among the 9×w-bit emission and transition scores e(Mj, s), e(Ij, s) and tr feeding the processing core.]
4 COMPLEXITY COMPARISONS
In this section, we discuss the performance
evaluation and resource usage of the proposed novel
design (Design1) and the previously reported design
of [12] (Design2). These designs were modeled in
VHDL and realized using Xilinx ISE 8.1 tools on an
Alpha Data ADM-XRC-5LX card. The developed VHDL
code for each design is parameterizable: we can
change the number of PEs in the processor array,
the PE word size, and the lengths of both the
subject sequence n and the profile HMM query m. The
synthesis results after place and route (using a PE
word width of 16) show that the maximum number of
PEs (for m = 2295 and n = 35,213) that can be
implemented on the FPGA in the case of the
[Figure 5: PE processing core datapath. Adders, MAX units and registers compute M(km+(qv+i), j), I(km+(qv+i), j) and D(km+(qv+i), j) from the previous-column values, the emission scores e(Mj, s) and e(Ij, s), and the transition scores tr(·,·), per Equations (1)-(3).]
from 38 to 2295.

[Table: execution time, logic-cell usage and speedup of the proposed design (Design1) versus the conventional design (Design2) for profile HMM query lengths m from 38 to 2295]

m     Folds q  ET1 (sec.)  #LC1    ET2 (sec.)  #LC2    Speedup  Area Ratio  Normalized Speedup  % Normalized Speedup
38    1        0.12        38,123  0.22        55,154  1.83     0.691       2.65                165 %
76    2        0.32        38,745  0.54        55,154  1.71     0.702       2.43                143 %
152   4        2.13        39,185  3.47        55,154  1.63     0.710       2.30                130 %
304   8        9.54        39,867  14.41       55,154  1.51     0.723       2.09                109 %
380   10       14.52       40,124  20.76       55,154  1.43     0.727       1.97                97 %
456   12       21.45       40,722  29.81       55,154  1.39     0.738       1.88                88 %
532   14       22.98       41,287  31.48       55,154  1.37     0.749       1.83                83 %
901   24       39.39       41,987  53.18       55,154  1.35     0.761       1.77                77 %
2295  61       100.11      42,987  132.14      55,154  1.32     0.779       1.69                69 %

5 CONCLUSION

In this paper, we presented a novel processor array
structure for accelerating the Viterbi algorithm
with optimal results. The structure has been
amended to allow hardware reuse, avoiding
repetition of the PEs of the processor array on
multiple FPGAs. Moreover, it achieves a significant
reduction in area over the conventional design by
removing the subject FIFO and using small FIFOs for
the intermediate results. This has increased the
maximum number of PEs that can be implemented on
the FPGA and hence the total throughput. The
implementation results showed that the proposed
design has a significantly higher normalized
speedup, ranging from 69% to 165% over the
conventional design for profile HMM query lengths
ranging from 38 to 2295.

ACKNOWLEDGEMENTS
REFERENCES
[1] S. R. Eddy, "Profile hidden Markov models," Bioinformatics, vol. 14, pp. 755-763, 1998.
[2] A. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Transactions on Information Theory, vol. 13, no. 2, 1967.
[3] P. M. Rahul, B. Jeremy, D. C. Roger, A. F. Mark, and H. Brandon, "Accelerator design for protein sequence HMM search," in Proceedings of the 20th Annual International Conference on Supercomputing, (Cairns, Queensland, Australia), pp. 288-296, 2006.
[4] T. F. Oliver, B. Schmidt, Y. Jakop, and D. L. Maskell, "Accelerating the Viterbi algorithm for profile hidden Markov models using reconfigurable hardware," in Lecture Notes in Computer Science, (Springer Berlin / Heidelberg), pp. 522-529, 2006.
[9] T. Oliver, L. Yeow, and B. Schmidt, "High performance database searching with HMMER on FPGAs," in Proceedings of the IEEE International Symposium on Parallel and Distributed Processing (IPDPS), (Long Beach, CA), pp. 1-7, 2007.
[10] T. Oliver, L. Y. Yeow, and B. Schmidt, "Integrating FPGA acceleration into HMMER," Journal of Parallel Computing, vol. 34, pp. 681-691, 2008.
[11] M. Punta, P. Coggill, R. Eberhardt, J. Mistry, J. Tate, C. Boursnell, N. Pang, K. Forslund, G. Ceric, J. Clements, A. Heger, L. Holm, E. Sonnhammer, S. Eddy, A. Bateman, and R. Finn, "The Pfam protein families database," Nucleic Acids Research, vol. 40, pp. D290-D301, 2012.
[12] M. Isa, K. Benkrid, and T. Clayton, "A novel efficient FPGA architecture for HMMER acceleration," in Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig), (Cancun), pp. 1-6, 2012.
[13] T. Takagi and T. Maruyama, "Accelerating HMMER search using FPGA," in Proceedings of the International Conference on Field Programmable Logic and Applications (FPL 2009), (Prague), pp. 332-337, 2009.
[14] S. Kung, VLSI Array Processors. Englewood Cliffs, N.J.: Prentice-Hall, 1988.
[15] D. Moldovan and J. Fortes, "Partitioning and mapping of algorithms into fixed size systolic arrays," IEEE Trans. on Computers, vol. 35, pp. 1-12, 1986.
[16] K. Benkrid, Y. Liu, and A. Benkrid, "A highly parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment," IEEE Trans. on VLSI Systems, vol. 17, pp. 561-570, 2009.
[17] R. Finn, A. Bateman, J. Clements, P. Coggill, R. Eberhardt, S. Eddy, A. Heger, K. Hetherington, L. Holm, J. Mistry, E. Sonnhammer, J. Tate, and M. Punta, "The Pfam protein families database," Nucleic Acids Research, vol. 42, 2014.
[18] "UniProtKB/Swiss-Prot protein knowledgebase release 2015_09," http://web.expasy.org/docs/relnotes/relstat.html, 2015.