Aaron Wilhelm
we generally wish to have results quickly and memory is relatively cheap, the faster algorithm is generally the preferred choice. One problem that requires the use of dynamic programming techniques is the edit distance problem. The edit distance problem is concerned with finding the minimum set of changes that can be made to one string to get a desired resultant string. Different implementations of the algorithms used to solve this problem use different transformations for string modification. The transformations used are generally a subset of the following transformations:
Copy: simply copies a character from the output string to the input string
Replace: sets a character in the input string to a value in the output string
Delete: deletes a character in the input string
Insert: inserts a character from the output string
Twiddle: swaps the next two characters
Kill: removes the rest of the characters in the text
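To make the six transformations concrete, the following is a minimal C++ sketch of applying a sequence of them. It is an illustration only: the names `Kind`, `Op`, and `apply_ops` are hypothetical and do not appear in the appendix, and the sketch assumes the input is read through a cursor while the result is built in a separate string, which is one common way to realize these operations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names for illustration; not the Trans_types enum from the appendix.
enum class Kind { Copy, Replace, Delete, Insert, Twiddle, Kill };

struct Op {
    Kind kind;
    char ch;  // only used by Replace and Insert
};

// Apply the operations left to right, reading `input` through a cursor
// and appending characters to the result.
std::string apply_ops(const std::string& input, const std::vector<Op>& ops) {
    std::string out;
    std::size_t i = 0;  // next unread position in input
    for (const Op& op : ops) {
        switch (op.kind) {
        case Kind::Copy:     // keep the current character
            assert(i < input.size());
            out += input[i++];
            break;
        case Kind::Replace:  // consume one character, emit a different one
            assert(i < input.size());
            out += op.ch;
            ++i;
            break;
        case Kind::Delete:   // consume one character, emit nothing
            assert(i < input.size());
            ++i;
            break;
        case Kind::Insert:   // emit a character without consuming input
            out += op.ch;
            break;
        case Kind::Twiddle:  // emit the next two characters in swapped order
            assert(i + 1 < input.size());
            out += input[i + 1];
            out += input[i];
            i += 2;
            break;
        case Kind::Kill:     // discard the rest of the input
            i = input.size();
            break;
        }
    }
    assert(i == input.size());  // every input character must be consumed
    return out;
}
```

For example, applying Twiddle followed by Copy to "abc" yields "bac", and Delete, Insert 'x', Kill applied to "abc" yields "x".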
An understanding of the edit distance problem can be used in several applications. In spell checking, a spell checker must search for the closest word to a misspelled word. DNA sequencing needs to find similarities in DNA structure to determine which sequences are responsible for which attributes. Diff algorithms are used to find and track differences in source code and other files to manage what was done by whom.
Procedures
In order to get the algorithms to work, appropriate weights and transformations had to be picked. To pick the transformations, I attempted to pick the most basic transformations and also make sure that I was guaranteed the ability to transform any string into any other string. Since kill is simply a repeated delete until the string is gone, I decided not to use it. Since replace can easily be substituted by a delete and an insert, I didn't implement replace either. Copy, delete, insert and twiddle are what I decided to use. Twiddle was used because I wished to possibly use these algorithms with DNA sequencing or to aid in tracking changes in text documents, where it is common for letters or lines to be switched.

For the weights, the original values relative to each other were picked by using logic. For instance, the copy transformation should be the only transformation used when the beginning and ending strings are the same, therefore the copy operator should be lowest in weight. Copying should also be lower in weight than inserting or deleting, since copying changes the string the least. Deleting and inserting do almost the same amount in modifying the string, so their weights should be close to the same. Twiddle should have a fairly low weight since nothing is created or destroyed, only flipped around. From there I modified the values by hand until I got transformations that matched what I believed to be intuitive.
Algorithm 1 RecursiveMinWeight(input, i, output, j)
Pre: When originally called i and j equal 1
Post: Return minimum weight needed to change input[i..input.size] to output[j..output.size]
    smallest = ∞
    if i ≤ input.size OR j ≤ output.size then
        if j ≤ output.size then
            weight = INSERT_WEIGHT + RecursiveMinWeight(input, i, output, j + 1)
            smallest = min(smallest, weight)
        end if
        if i ≤ input.size AND j ≤ output.size AND input[i] == output[j] then
            weight = COPY_WEIGHT + RecursiveMinWeight(input, i + 1, output, j + 1)
            smallest = min(smallest, weight)
        end if
        if i ≤ input.size then
            weight = DELETE_WEIGHT + RecursiveMinWeight(input, i + 1, output, j)
            smallest = min(smallest, weight)
        end if
        if i + 1 ≤ input.size AND j + 1 ≤ output.size AND input[i] == output[j + 1] AND output[j] == input[i + 1] then
            weight = TWIDDLE_WEIGHT + RecursiveMinWeight(input, i + 2, output, j + 2)
            smallest = min(smallest, weight)
        end if
    else
        smallest = 0
    end if
    return smallest
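Algorithm 1 can be sketched in C++ roughly as follows. This is a simplified, 0-based illustration and not the appendix code: it works on `std::string` rather than a templated vector, it returns only the weight without recording the transformation list, and the function name `recursive_min_weight` is hypothetical. The weights are the values listed under Final Weights.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>

// Weights as listed under "Final Weights" in this paper.
const unsigned COPY_WEIGHT = 1;
const unsigned DELETE_WEIGHT = 3;
const unsigned INSERT_WEIGHT = 3;
const unsigned TWIDDLE_WEIGHT = 2;

// Minimum weight needed to change in[i..] into out[j..] (0-based indices).
unsigned recursive_min_weight(const std::string& in, std::size_t i,
                              const std::string& out, std::size_t j) {
    // Base case: both strings are exhausted, nothing left to do.
    if (i >= in.size() && j >= out.size()) return 0;

    unsigned smallest = std::numeric_limits<unsigned>::max();

    // Insert: produce out[j] without consuming any input.
    if (j < out.size())
        smallest = std::min(smallest,
            INSERT_WEIGHT + recursive_min_weight(in, i, out, j + 1));

    // Copy: only possible when the current characters match.
    if (i < in.size() && j < out.size() && in[i] == out[j])
        smallest = std::min(smallest,
            COPY_WEIGHT + recursive_min_weight(in, i + 1, out, j + 1));

    // Delete: consume in[i] without producing output.
    if (i < in.size())
        smallest = std::min(smallest,
            DELETE_WEIGHT + recursive_min_weight(in, i + 1, out, j));

    // Twiddle: the next two characters appear swapped in the output.
    if (i + 1 < in.size() && j + 1 < out.size() &&
        in[i] == out[j + 1] && out[j] == in[i + 1])
        smallest = std::min(smallest,
            TWIDDLE_WEIGHT + recursive_min_weight(in, i + 2, out, j + 2));

    return smallest;
}
```

With these weights, `recursive_min_weight("ab", 0, "ba", 0)` evaluates to 2 (a single twiddle), whereas the delete, copy, insert alternative would cost 7.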
Complexity Analysis
The recurrence for this algorithm is
T(n, m) = T(n - 1, m) + T(n, m - 1) + T(n - 1, m - 1) + T(n - 2, m - 2)    (1)
(2)
The memory usage, on the other hand, is constant per recursive call, and the maximum recursion depth is n + m, leading to a memory complexity of
O(n + m)
(3)
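To get a feel for how quickly the recurrence for the running time grows, it can be evaluated numerically. The sketch below does so with memoization. The base case (T = 1 whenever n ≤ 0 or m ≤ 0) is an assumption made purely for illustration, since the recurrence as stated does not give one, and the function name `T` with its `memo` parameter is hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <utility>

typedef std::map<std::pair<int, int>, std::uint64_t> Memo;

// Evaluate T(n, m) = T(n-1, m) + T(n, m-1) + T(n-1, m-1) + T(n-2, m-2).
// ASSUMED base case: T = 1 whenever n <= 0 or m <= 0.
std::uint64_t T(int n, int m, Memo& memo) {
    if (n <= 0 || m <= 0) return 1;

    const std::pair<int, int> key(n, m);
    Memo::const_iterator it = memo.find(key);
    if (it != memo.end()) return it->second;  // already computed

    const std::uint64_t value = T(n - 1, m, memo) + T(n, m - 1, memo) +
                                T(n - 1, m - 1, memo) + T(n - 2, m - 2, memo);
    memo[key] = value;
    return value;
}
```

Under this assumed base case T(1, 1) = 4 and T(2, 2) = 19, and the values keep growing rapidly with n, so the sketch is only meaningful for small arguments before the 64-bit counter overflows.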
Algorithm 2 IterativeMinWeight(input, output)
Pre: None
Post: This will return the minimum weight of transforms to turn input into output
    Let weighttable[1..input.size + 1][1..output.size + 1] be a new array
    Let transformtable[1..input.size + 1][1..output.size + 1] be a new array
    for i = input.size + 1 to 1 do
        for j = output.size + 1 to 1 do
            calc_weight(transformtable, weighttable, input, output, i, j)
        end for
    end for
    FindMinPath(transformtable, weighttable)
    return weighttable[1][1]
Algorithm 3 calc_weight(transformtable, weighttable, input, output, i, j)
Pre: The weight and transform tables must be filled correctly for all (x, y) other than (i, j) such that x ≥ i and y ≥ j
Post: This will save the optimal solution in weighttable[i][j]
    smallest = ∞
    if i ≤ input.size OR j ≤ output.size then
        if j ≤ output.size then
            weight = INSERT_WEIGHT + weighttable[i][j + 1]
            if weight < smallest then
                trans = INSERT
                smallest = weight
            end if
        end if
        if i ≤ input.size AND j ≤ output.size AND input[i] == output[j] then
            weight = COPY_WEIGHT + weighttable[i + 1][j + 1]
            if weight < smallest then
                trans = COPY
                smallest = weight
            end if
        end if
        if i ≤ input.size then
            weight = DELETE_WEIGHT + weighttable[i + 1][j]
            if weight < smallest then
                trans = DELETE
                smallest = weight
            end if
        end if
        if i + 1 ≤ input.size AND j + 1 ≤ output.size AND input[i] == output[j + 1] AND output[j] == input[i + 1] then
            weight = TWIDDLE_WEIGHT + weighttable[i + 2][j + 2]
            if weight < smallest then
                trans = TWIDDLE
                smallest = weight
            end if
        end if
    else
        smallest = 0
    end if
    weighttable[i][j] = smallest
    transformtable[i][j] = trans
Algorithm 4 FindMinPath(transformtable, weighttable)
Pre: The 2-D arrays transformtable and weighttable are properly filled, with entry (1,1) of weighttable holding the minimum transform weight
Post: The returned list will contain the minimum-weight sequence of transformations
    Let minList be an empty list
    i = 1
    j = 1
    while i ≤ start.size OR j ≤ end.size do
        t = transformtable[i][j]
        minList.push_back(t)
        if t == COPY then
            i = i + 1
            j = j + 1
        else if t == DELETE then
            i = i + 1
        else if t == INSERT then
            j = j + 1
        else if t == TWIDDLE then
            i = i + 2
            j = j + 2
        end if
    end while
    return minList
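The table fill of Algorithms 2 and 3 and the walk of Algorithm 4 can be sketched together in C++ as below. This is a simplified 0-based illustration under stated assumptions, not the appendix implementation: it uses `std::string` and nested `std::vector` instead of raw arrays, and the names `Trans`, `Cell`, `iterative_min_path`, and the extra `NONE` marker are hypothetical. The weights are the values listed under Final Weights.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// NONE is a hypothetical marker for the finished corner of the table.
enum Trans { NONE, COPY, DELETE, INSERT, TWIDDLE };
struct Cell { unsigned weight; Trans trans; };

const unsigned COPY_WEIGHT = 1, DELETE_WEIGHT = 3,
               INSERT_WEIGHT = 3, TWIDDLE_WEIGHT = 2;

// Fills the table bottom-up, stores the minimum weight in min_weight,
// and returns one minimum-weight transformation sequence.
std::vector<Trans> iterative_min_path(const std::string& in,
                                      const std::string& out,
                                      unsigned& min_weight) {
    const std::size_t n = in.size(), m = out.size();
    std::vector<std::vector<Cell> > table(
        n + 1, std::vector<Cell>(m + 1, Cell{0, NONE}));

    // Fill from the bottom-right corner so every cell a step depends on
    // (to the right, below, and diagonally below) is already final.
    for (std::size_t i = n + 1; i-- > 0;) {
        for (std::size_t j = m + 1; j-- > 0;) {
            if (i == n && j == m) continue;  // base case keeps weight 0
            Cell best{std::numeric_limits<unsigned>::max(), NONE};
            auto consider = [&best](unsigned w, Trans t) {
                if (w < best.weight) best = Cell{w, t};
            };
            if (j < m)
                consider(INSERT_WEIGHT + table[i][j + 1].weight, INSERT);
            if (i < n && j < m && in[i] == out[j])
                consider(COPY_WEIGHT + table[i + 1][j + 1].weight, COPY);
            if (i < n)
                consider(DELETE_WEIGHT + table[i + 1][j].weight, DELETE);
            if (i + 1 < n && j + 1 < m &&
                in[i] == out[j + 1] && out[j] == in[i + 1])
                consider(TWIDDLE_WEIGHT + table[i + 2][j + 2].weight, TWIDDLE);
            table[i][j] = best;
        }
    }
    min_weight = table[0][0].weight;

    // Walk from the top-left corner, following the recorded choices.
    std::vector<Trans> path;
    std::size_t i = 0, j = 0;
    while (i < n || j < m) {
        const Trans t = table[i][j].trans;
        path.push_back(t);
        switch (t) {
        case COPY:    ++i; ++j;       break;
        case DELETE:  ++i;            break;
        case INSERT:  ++j;            break;
        case TWIDDLE: i += 2; j += 2; break;
        case NONE:    return path;  // not reached for a correctly filled table
        }
    }
    return path;
}
```

Ties are broken in favour of the transformation considered first (insert, copy, delete, twiddle), because a later candidate replaces the current best only when its weight is strictly smaller.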
Complexity Analysis
FindMinPath runs in time at most proportional to the number of elements in the weight table, which means that FindMinPath runs in O(nm) time. Since calc_weight runs in constant time and is called a number of times proportional to nm, the running time of the iterative algorithm is
O(nm)
(4)
The memory usage is constant everywhere except for the variables weighttable and transformtable, which each have a size of (m + 1)(n + 1). So the memory complexity is
O(nm)
(5)
Results
Recursive Version Empirically Derived Run Times:
[Figure: Recursive Edit Distance Results. Series: Recursive Plot and Recursive Approximation. x-axis: Input Size (8 to 15); y-axis: Time (Sec.).]
T(n) = 0.00162938 · 2^n
(6)
n_0 = 13
(7)
[Figure: run times for input sizes up to 20000. x-axis: Input Size (0 to 20000); y-axis: Time (Sec.).]
T(n) = 5.90954e-08 · n^2
(8)
n_0 = 9000
(9)
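As a quick arithmetic check on the two fitted models, equations (6) and (8) can be evaluated at the largest benchmarked input sizes. The sketch below does only that; the function names are hypothetical, and the constants are the fitted values from equations (6) and (8).

```cpp
#include <cmath>

// Fitted model from equation (6): recursive version, result in seconds.
double recursive_model_seconds(double n) {
    return 0.00162938 * std::pow(2.0, n);
}

// Fitted model from equation (8): iterative version, result in seconds.
double iterative_model_seconds(double n) {
    return 5.90954e-08 * n * n;
}
```

Evaluating these gives roughly 53 seconds for the recursive model at n = 15 and roughly 24 seconds for the iterative model at n = 20000. Extrapolating the recursive model to n = 20 already gives about 1700 seconds, which illustrates why the recursive benchmark stops at such small inputs.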
Final Weights
COPY_WEIGHT = 1
DELETE_WEIGHT = 3
INSERT_WEIGHT = 3
TWIDDLE_WEIGHT = 2
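To show how these weights combine, the sketch below sums the weight of a transformation sequence. It is an illustration only: `total_weight` and `weight_of` are hypothetical names, and the enum here is a stand-in for the Trans_types enum in the appendix, whose exact definition is not reproduced.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the appendix's Trans_types enum (assumed enumerators).
enum Trans { COPY, DELETE, INSERT, TWIDDLE };

// Weight of a single transformation, using the final weights above.
unsigned weight_of(Trans t) {
    switch (t) {
    case COPY:    return 1;
    case DELETE:  return 3;
    case INSERT:  return 3;
    case TWIDDLE: return 2;
    }
    return 0;  // invalid transformation
}

// Total weight of a sequence of transformations.
unsigned total_weight(const std::vector<Trans>& seq) {
    unsigned sum = 0;
    for (std::size_t k = 0; k < seq.size(); ++k) sum += weight_of(seq[k]);
    return sum;
}
```

For example, turning "ab" into "ba" with a single twiddle costs 2, while the delete, copy, insert sequence that reaches the same result costs 3 + 1 + 3 = 7, so the twiddle is preferred.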
Problems
One problem was that there was no clear-cut best set of weights or even best sequences for the more complicated input strings. This was dealt with by setting up many different test cases and hand-tweaking the weights until sensible solutions were found. Another is that the recursive version was so slow that using it to compare transformation sequences with the iterative version quickly became unbearable. Another problem is that the exact value of n0, such that the actual time taken for input sizes n ≥ n0 behaves like the asymptotic complexity, is difficult to find due to the timer resolution being only one second.
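One possible way around the one-second timer resolution, assuming a C++11 compiler is available, is to time with `std::chrono::steady_clock` instead of `time()`. The helper below is a hypothetical sketch, not part of the benchmark code in the appendix.

```cpp
#include <chrono>

// Run f once and return the elapsed wall-clock time in seconds,
// with sub-second resolution.
template <class F>
double time_seconds(F f) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    f();
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A benchmark loop could then wrap each call, for example `time_seconds([&] { iterate.it_min_weight(start, finish); })`, and write the fractional result to the data file.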
Conclusion
The data show that dynamic programming drastically cuts down the running time complexity, at the cost of increased memory complexity. Both algorithms exhibited their expected run time complexities. The recursive algorithm's time approached its asymptotic complexity very quickly, at about n0 = 13, while the iterative version did so at about n0 = 9000. With the experiments I performed I arrived at
Appendix A
Source Code
main.cpp
////////////////////////////////////////////////////////////////////////////
/// \file   main.cpp
/// \author Aaron Wilhelm
/// \brief  Testing and benchmarking of the Edit Distance algorithms
////////////////////////////////////////////////////////////////////////////

#include <iostream>
#include <vector>
#include <cassert>
#include <string>
#include <fstream>
#include <cstdlib>
#include <time.h>
#include "edit_distance.h"

#define ITER_START_SIZE 1000
#define ITER_STEP_SIZE 1000
#define ITER_END_SIZE 20300
#define RECV_START_SIZE 8
#define RECV_STEP_SIZE 1
#define RECV_END_SIZE 15

using namespace std;

void str2vector(const string & str, vector<char> & vct)
{
    vct.clear();
    for (unsigned int i = 0; i < str.size(); ++i)
    {
        vct.push_back(str[i]);
    }
}
void
{
str2vector ( vct . c l e a r ( ) ;
const char
char> &
vct )
vct . push_back ( s t r [ i ] ) ;
void
{
int
argc ,
char
argv [ ] ,
bool
r,
bool
i,
bool
u )
for ( int
{ } { } { } { } {
s t r = argv [ j ] ;
if (
s t r == "h" ) cout << " Well your screwed " << endl ; exit (0); s t r == " b i " )
else if (
( i ) =
else if (
s t r == "br " )
( r ) =
else if (
s t r == "u" )
(u) =
else
} {
int
main (
int
argc ,
Edit_Dist< Edit_Dist<
char> char>
char
argv [ ] )
//
benchmark
recursive
algo
//
benchmark
iterative
algo
vector < > start , f i n i s h ; ofstream r e c _ f i l e , i t _ f i l e ; time_t start_time , end_time ; srand ( time ( 0 ) ) ; argc ; argv++; parse_inputs ( argc , argv , &bench_recv , &bench_it , &u n i t _ t e s t ) ; {
char
if (
bench_recv ) cout << " S t a r t i n g R e c u r s i v e Benchmark" << endl ; r e c _ f i l e . open ( " r e c u r s e . dat " ) ; start . clear (); finish . clear (); {
s t a r t . push_back ( rand ( ) ) ; f i n i s h . push_back ( rand ( ) ) ; i >= RECV_START_SIZE && ( i RECV_START_SIZE) % RECV_STEP_SIZE == 0) start_time = time (NULL) ; r e c u r s e . find_min_weight ( s t a r t , f i n i s h ) ; end_time = time (NULL) ; r e c _ f i l e << i << "\ t " << ( end_time start_time ) << endl ;
} {
} } rec_file . close (); cout << " Ending R e c u r s i v e Benchmark" << endl ; bench_it )
if (
cout << " S t a r t i n g I t e r a t i v e Benchmark" << endl ; i t _ f i l e . open ( " i t e r a t e . dat " ) ; start . clear (); finish . clear (); {
if (
{
i >= ITER_START_SIZE && ( i ITER_START_SIZE) % ITER_STEP_SIZE == 0) start_time = time (NULL) ; i t e r a t e . it_min_weight ( s t a r t , f i n i s h ) ; end_time = time (NULL) ; i t _ f i l e << i << "\ t " << ( end_time start_time ) << endl ;
}
//
it_file . close (); cout << " Ending I t e r a t i v e Benchmark" << endl ;
some
simple
tests
if (
EMPTY VECTOR /
i t e r a t e . it_min_weight ( s t a r t , f i n i s h ) ; r e c u r s e . find_min_weight ( s t a r t , f i n i s h ) ; a s s e r t ( i t e r a t e . s i z e ( ) == 0 ) ; a s s e r t ( f i n i s h . s i z e ( ) == 0 ) ;
/
for ( int
i = 0 ; i < 1 0 ; ++i )
} i t e r a t e . it_min_weight ( s t a r t , f i n i s h ) ; r e c u r s e . find_min_weight ( s t a r t , f i n i s h ) ; 16
a s s e r t ( i t e r a t e . s i z e ( ) == s t a r t . s i z e ( ) ) ; { }
/
for ( int
i = 0 ; i < 1 0 ; ++i )
a s s e r t ( i t e r a t e [ i ] == r e c u r s e [ i ] ) ; a s s e r t ( i t e r a t e [ i ] == COPY ) ;
OBVIOUS INSERT TEST /
for ( int
i = 0 ; i < 1 0 ; ++i )
i = 0 ; i < i t e r a t e . s i z e ( ) ; ++i )
}
/
a s s e r t ( i t e r a t e [ i ] == r e c u r s e [ i ] ) ; a s s e r t ( i t e r a t e [ i ] == INSERT ) ;
OBVIOUS DELETE TEST /
for ( int
i = 0 ; i < 1 0 ; ++i )
{ }
i = 0 ; i < i t e r a t e . s i z e ( ) ; ++i )
a s s e r t ( i t e r a t e [ i ] == r e c u r s e [ i ] ) ; a s s e r t ( i t e r a t e [ i ] == DELETE ) ;
OBVIOUS TWIDDLE TEST /
i = 0 ; i < i t e r a t e . s i z e ( ) ; ++i )
a s s e r t ( i t e r a t e [ i ] == r e c u r s e [ i ] ) ; a s s e r t ( i t e r a t e [ i ] == TWIDDLE ) ;
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
s t r 2 v e c t o r ( " This t e x t w i l l be m od if i ed " , s t a r t ) ; s t r 2 v e c t o r ( "Text m o d i f c a t i o n done " , f i n i s h ) ; i t e r a t e . it_min_weight ( s t a r t , f i n i s h ) ; a s s e r t ( i t e r a t e [ 0 ] == COPY) ; i = 1 ; i <= 5 ; i ++) { a s s e r t ( i t e r a t e [ i ] == DELETE) ; } ( i = 6 ; i <= 9 ; i++ ) { a s s e r t ( i t e r a t e [ i ] == COPY) ; } ( i = 1 0 ; i <= 1 7 ; i ++)
{ } { } {
} }
return
edit_distance.h
#ifndef EDIT_DISTANCE_H
#define EDIT_DISTANCE_H

/// @file   edit_distance.h
/// @author Aaron Wilhelm
/// @brief  Declarations for Class to handle the Edit distance problem
#include <list>
#include <vector>
#include "edit_distance_trans.h"

struct _ed_table_cell
{
    unsigned int weight;
    Trans_types  t_type;
};
////////////////////////////////////////////////////////////////////////////
/// @class Edit_Dist
/// @brief Declarations for Class to handle the Edit distance problem
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
unsigned int
( );
std : : vector <g e n e r i c > & s t a r t , std : : vector <g e n e r i c > & end
void c l e a r ( ) ; unsigned int get_min_weight ( ) ; void g e t _ t r a n s f o r m a t i o n s ( std : : vector <std : : s t r i n g > static bool apply_transformations ( const Transform_list &, const std : : vector <g e n e r i c > & in ,
std : : vector <g e n e r i c > out 20
&);
);
unsigned int it_min_weight ( const std : : vector <g e n e r i c > & s t a r t , const std : : vector <g e n e r i c > & end
);
size ();
);
unsigned int pos_i , std : : vector <g e n e r i c > & unsigned int pos_o
cell_weight _ed_table_cell t ,
void
(
); };
const std : : vector <g e n e r i c > & const std : : vector <g e n e r i c > & unsigned int x , unsigned int y
input , output ,
Transform_list c u r r _ t r a n s _ l i s t ;
edit_distance.tpp
////////////////////////////////////////////////////////////////////////////
/// @file   edit_distance.tpp
/// @author Aaron Wilhelm
/// @brief  This implements the Edit Distance problem both iteratively and
///         recursively
////////////////////////////////////////////////////////////////////////////
template<class
}
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / // / // / @fn @brief @pre @post @param @return operator [ ] The none returns index The the index transform optimal you want index th optimal transform
Which th
index
transform
template<class return
@brief
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
index )
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
of
weights end
it
takes
to
transform
the
start
vector
minimum
weight
to
transform
start
into
end
v e c t o r <g e n e r i c >
start
// / // / // /
@param @return
v e c t o r <g e n e r i c > e n d The minimum start sum to of end weights that it takes to transform the
vector
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
return
@fn @brief
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / // / // / // / // / @post @return @pre get_min_weight After that One the min weight has been calculated this can be used to get
weight of the find_min_weight empty get the ones ) zero most efficient list transformation list functions must be must be to called get with information
( even
called
valuable
just of
weight of the
Weight
transformation
template<class generic>
unsigned int Edit_Dist<generic>::get_min_weight()
{
    return min_trans_list.weight();
}
// / // / // / // / // / // / // / // / // / // / // / // / // / @param @param @post @pre @fn @brief _find_min_weight This it is the be recursive spliting implementation the to problem up
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
of and
the only
edit
distance
problem
works
transforming
yield have
must
invalid
output to
weight
transform list
the
vector (
output
string
doesn ' t
actually
The a
position
into of
the
this of
splits
the
problem .. end ]
into
subproblem
find
output
the
that
vector
unsigned The a
pos_o the output the min vector , weight this of splits the problem .. into
position
subproblem minimum
find of
o u t p u t [ pos_i
size ()] to
The
weight
transforming
i n p u t [ pos_i . . s i z e ( ) ]
.. size ()] step the does shown algorithm not that picks the optimal decision solution . made something possible
decision be
with
global
optimal is
optimal
choice
because that
each
time
algorithm is the
attempts smallest
the
resulting
value
value
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
unsigned int pos_i , std : : vector <g e n e r i c > & unsigned int pos_o
unsigned int
/
if (
if (
Insert
pos_o < output . s i z e ( ) ) c u r r _ t r a n s _ l i s t . push_back (INSERT ) ; c u r r = INSERT_WEIGHT + _find_min_weight ( input , pos_i , output , pos_o +1); c u r r _ t r a n s _ l i s t . pop_back ( ) ; {
if (
}
/
Copy
if (
&& 24
) {
pos_o < output . s i z e ( ) && input [ pos_i ] == output [ pos_o ] c u r r _ t r a n s _ l i s t . push_back (COPY) ; c u r r = COPY_WEIGHT + _find_min_weight ( input , pos_i +1, output , pos_o +1); c u r r _ t r a n s _ l i s t . pop_back ( ) ; {
if (
}
/
if (
Delete
pos_i < input . s i z e ( ) ) c u r r _ t r a n s _ l i s t . push_back (DELETE) ; c u r r = DELETE_WEIGHT + _find_min_weight ( input , pos_i +1, output , pos_o ) ; c u r r _ t r a n s _ l i s t . pop_back ( ) ;
{ }
/
if (
if (
Twiddle
pos_i + 1 < input . s i z e ( ) && pos_o + 1 < output . s i z e ( ) && input [ pos_i ] == output [ pos_o+1] && output [ pos_o ] == input [ pos_i +1] c u r r _ t r a n s _ l i s t . push_back (TWIDDLE) ; c u r r = TWIDDLE_WEIGHT + _find_min_weight ( input , pos_i +2, output , pos_o +2); c u r r _ t r a n s _ l i s t . pop_back ( ) ;
{ }
at to /
if (
}
end min
}
/
and
need
to
compare to see
weight if bettter
weight
list
else
{ }
if (
} }
else if (
return
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / // / proof : @return @param @param @pre @post @fn @brief it_min_weight This it is the be iteraive spliting implementation the to problem up of and the edit distance problem
works
only
transforming
o u t p u t [ pos_o . . s i z e ( ) ]
transform list
input is
into in is
output
is
calculated
and
the
vector (
output
string
doesn ' t
actually
modified v e c t o r <g e n e r i c > is The turned into weight of transforming i n p u t [ pos_i . . s i z e ( ) ] to output
the
vector
that
the
input
vector
minimum
.. size ()] step the does shown algorithm not that picks the optimal decision solution . made something possible
decision be
with
global
optimal is
optimal
choice
because that
each
time
algorithm is the
attempts smallest
the
resulting
value
value
template<class generic>
unsigned int Edit_Dist<generic>::it_min_weight(
    const std::vector<generic> & start,
    const std::vector<generic> & end
)
{
    unsigned int ret;
    _ed_table_cell ** table = NULL;

    // Create table
    table = new _ed_table_cell*[start.size() + 1];
    for (unsigned int i = 0; i < start.size() + 1; i++)
    {
        table[i] = new _ed_table_cell[end.size() + 1];
    }
// //
Fill
s t a r t . s i z e ( ) ; ; i )
would
check
that
table
is
empty
j = end . s i z e ( ) ; ; j )
here would is check loop condition properly a local minimum for x > i and y > j and being filled is
// // //
table
c e l l _ w e i g h t ( t a b l e , s t a r t , end , i , j ) ;
check
that
table [ x ][ y]
{ } { }
// // //
if (
j == 0 )
break ;
if (
i == 0 )
}
and that
break ;
here the check ! loop condition is true that post condition has the min table [0][0] weight
invariant
//
Find
min
transformation
list
// // //
could value
be of
used in the
to
show
at at
each i , j
value is
of
and
table
truely
optimal
subproblem
check
};
and the !
break
invariance
r e t = t a b l e [ 0 ] [ 0 ] . weight ;
condition
{ }
}
// / // / // /
delete [ ] t a b l e ; return r e t ;
@fn @brief @pre cell_weight Calculate the table the must be filled in the positive pos_i and positive pos_o
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
p r o b l e m
will
be
saved
t [ p o s _ i ] [ pos_o ] table of minimum weights and corresponding that input is transformations with that get is
input the
the
vector (
started
output
string
doesn ' t
actually
modified v e c t o r <g e n e r i c > is turned into int into of pos_i the input the min vector , weight this of splits the problem .. end ] into output
the
vector
that
the
input
vector
unsigned The a
position
subproblem int
find
i n p u t [ pos_i
unsigned The a
pos_o into of the input the min vector , weight this of splits the problem .. end ] into
position
subproblem
find
o u t p u t [ pos_o
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
) {
const std : : vector <g e n e r i c > & const std : : vector <g e n e r i c > & unsigned int pos_i , unsigned int pos_o unsigned int
/
_ed_table_cell t ,
input , output ,
if (
check
to
see
if
at
end
if (
Insert
if (
}
/
if (
Copy
) {
pos_i < input . s i z e ( ) && pos_o < output . s i z e ( ) && input [ pos_i ] == output [ pos_o ] c u r r = COPY_WEIGHT + t [ pos_i +1][ pos_o +1]. weight ; { }
if (
}
/
if (
Delete
{ }
if (
}
/
if (
Twiddle
pos_i + 1 < input . s i z e ( ) && pos_o + 1 < output . s i z e ( ) && input [ pos_i ] == output [ pos_o+1] && output [ pos_o ] == input [ pos_i +1] c u r r = TWIDDLE_WEIGHT + t [ pos_i +2][ pos_o +2]. weight ;
{ }
if (
}
//
} }
Write
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / @fn @brief @pre @post clear clear none clears internal data internal data
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / // / @fn @brief @pre @post @return size get None get size the of size of transformation list list the size of the transformation list
transformation
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
edit_distance_trans.h
////////////////////////////////////////////////////////////////////////////
/// @file   edit_distance_trans.h
/// @author Aaron Wilhelm
/// @brief  This manages the transform list and weight sums for the list
///         of transforms
#ifndef EDIT_DISTANCE_TRANS_H
#define EDIT_DISTANCE_TRANS_H

#include <list>
#include <vector>

/* Weights of different transforms */
#define COPY_WEIGHT    1
#define DELETE_WEIGHT  3
#define INSERT_WEIGHT  3
#define TWIDDLE_WEIGHT 2
enum
/
Different
Trans_types
types
of
transformations
};
// / // / // / // / // / // / // /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / @fn @brief @param @pre @post @return get_weight get the none get the in the weight of a of a transformation unless an a is invalid the type weight of of a transformation that you want the weight of
transformation
weight which
transformation is returned
case
zero
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
inline unsigned int get_weight ( Trans_types switch ( a ) { case COPY: return COPY_WEIGHT; case DELETE: return DELETE_WEIGHT;
a)
};
// / // /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
class Transform_list
{
  public:
    Transform_list();
    Transform_list & operator=(Transform_list & A);
    void push_back(Trans_types a);
    void pop_back();
    unsigned int weight();
    unsigned int size();
    void clear();
};
unsigned int
#endif /* EDIT_DISTANCE_TRANS_H */
edit_distance_trans.cpp
#include
// / // / // / // / @fn @pre @post
" e d i t _ d i s t a n c e _ t r a n s . h"
Transform_list default nono none constructor
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
@brief
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
list
you
none This this object object will contain the same data as A
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
operator=
( Transform_list & A)
return this
@brief
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / // / // / // / // / // / @fn push_back add a a transformation transformation to you the end of the to list the list
the
want
added
{ }
void
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / // /
@pre @post
must last
not
be
transformation
void
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
Transform_list : : pop_back ( )
}
// / // / // / // / // /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / @fn @brief @pre @post @return size get none size size of of list the gotten list size of the list
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
recieved list
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
{ }
void
// / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
Transform_list : : c l e a r ( )