
Computational Biology and Chemistry 71 (2017) 274–279

Contents lists available at ScienceDirect

Computational Biology and Chemistry


journal homepage: www.elsevier.com/locate/compbiolchem

Research Article

Solving probability reasoning based on DNA strand displacement and probability modules☆
Qiang Zhang a,b,*, Xiaobiao Wang a, Xiaojun Wang a, Changjun Zhou a
a Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, Dalian University, Dalian, 116622, China
b School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China

ARTICLE INFO

Article history:
Received 12 September 2017
Accepted 25 September 2017
Available online 28 September 2017

Keywords:
DNA strand displacement
Conditional probability
Total probability
Probability reasoning

ABSTRACT

In computational biology, DNA strand displacement technology is used to simulate the computation process and has shown strong computing ability. Most researchers use it to solve logic problems, but it is only rarely used in probabilistic reasoning. To process probabilistic reasoning, a conditional probability derivation model and a total probability model based on DNA strand displacement were established in this paper. The models were assessed through the game “read your mind.” It has been shown to enable the application of probabilistic reasoning in genetic diagnosis.

© 2017 Elsevier Ltd. All rights reserved.

☆ This paper was presented at the IDMB 2017 conference in Xi’an Medical University on 23–24th March 2017. This paper was recommended for publication in revised form.
* Corresponding author at: Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, Dalian University, Dalian, 116622, China.
E-mail address: zhangq@dlut.edu.cn (Q. Zhang).

https://doi.org/10.1016/j.compbiolchem.2017.09.011
1476-9271/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Ever since Adleman proposed biological computing in 1994, all kinds of applications have gradually been put forward. As one of the most popular technologies in biological computing, DNA strand displacement has been applied to various computing systems (Seelig et al., 2006; Qian and Winfree, 2011; Lakin and Phillips, 2011; Boemo et al., 2015), from simple logical structures (Seelig et al., 2006; Qian and Winfree, 2011) to complex chemical reaction networks (Condon et al., 2014; Zhang and Seelig, 2011). For example, in 2006, Georg Seelig et al. designed AND, OR, and NOT gates in order to realize logic gates (Seelig et al., 2006). They used single nucleotide strands as input and output. This structural model laid the foundation for later computation models. In 2013, Andrew Phillips et al. implemented and simulated a chemical reaction network structure using DNA strand displacement, and designed a voting system based on two-domain strands (Lakin et al., 2013). The process of DNA strand self-assembly is similar to the relationship between antigen and antibody. In 2013, Rizki Mardian et al. designed a decision maker using DNA strands according to the principle of the clonal selection algorithm of the immune system.

The structure of a network itself facilitates the expression of computation. Bio-molecular computation can also be used to simulate neural networks. In 2011, Winfree et al. used DNA strand displacement to simulate a neural network, and they designed a model of a four-neuron Hopfield associative memory. In 2013, Anthony J. Genot et al. scaled down DNA circuits with competitive neural networks and simulated the winner-take-all effect. It was also proven that DNA is an exquisite substrate.

In recent years, reasoning modules have been established based on DNA strand displacement (Sainz de Murieta et al., 2014; Ran et al., 2009; Lakin et al., 2012). For example, in 2009, logical reasoning at the molecular level was put forward again by Shapiro’s team, who developed an enzyme-driven system to perform simple logical deductions. In 2014, Inaki Sainz de Murieta et al. presented a computing model based on DNA strand displacement that uses Bayesian inference. It has been applied to genetic diagnosis in vitro.

Probabilistic reasoning is widely used in various applications. In 2014, Sainz de Murieta et al. implemented conditional probability based on DNA strand displacement. They used single strands to encode the prior probabilities and double-strand complexes to encode the conditional probabilities. However, in many cases and computations, the total probability must also be used to obtain the result of probabilistic reasoning.

In this paper, modules for conditional probability and total probability were designed based on DNA strand displacement. To reduce the complexity of the strands, two-domain strands are used in the modules (Cardelli, 2010). All signal strands have two domains, and the double strands serve only as auxiliary strands. In the experiment, the ratio of the corresponding strand concentrations was used to represent the results, so that all kinds of results, and systems with more types of status, can be represented. In order to verify the correctness of the model, the network structure was constructed and a module of the “read your mind” game was prepared using the constructed probability model. The results are shown using DNA strand displacement (DSD) software.
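The concentration-ratio readout mentioned above can be illustrated with a short Python sketch. It is not the authors' Visual DSD model: the helper name `concentrations_to_probabilities` and the example concentrations are hypothetical, and the sketch only assumes that each outcome's share of the total concentration is read as its probability.

```python
def concentrations_to_probabilities(concentrations_nm):
    """Normalize strand concentrations (nM) so that they sum to 1.

    Hypothetical helper: the share of each outcome in the total
    concentration is interpreted as the probability of that outcome.
    """
    total = sum(concentrations_nm.values())
    if total <= 0:
        raise ValueError("total concentration must be positive")
    return {name: c / total for name, c in concentrations_nm.items()}


# Hypothetical output-strand concentrations (nM) for three outcomes.
readout = concentrations_to_probabilities({"A": 100.0, "C": 100.0, "E": 600.0})
print(readout)
```

Because only ratios matter in this reading, scaling all concentrations by the same factor leaves the resulting probabilities unchanged.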

2. The principle of probability model and implementation

In this section, the principles of the probability models and the computational models based on the DNA strand displacement reaction are presented.

2.1. Conditional probability

2.1.1. The definition of conditional probability

Sometimes the probability P(A) of an event A must be considered under an additional condition, namely that the proposition Q is known to have occurred. This probability is commonly denoted P(A|Q).

Definition 1. Let (Ω, F, P) be a probability space, with events A ∈ F and Q ∈ F and P(Q) > 0. When the proposition Q is known to have occurred, the conditional probability of the proposition A is defined as:

\[ P(A \mid Q) = \frac{P(A \cap Q)}{P(Q)} \tag{2.1} \]

Formula (2.1) can be rearranged as follows:

\[ P(A \cap Q) = P(A \mid Q) \times P(Q) \tag{2.2} \]

Formula (2.2) is also called the multiplication formula of proposition probability. Two-domain strands were used here to build the model for formula (2.2). In logical computation, values are expressed only as 0/1 or T/F. In probability reasoning, values are not only 0 or 1 but also proportions in between. The multiplication of probabilities can be carried out by controlling strand concentrations.

2.1.2. Implementation of the model

Strand displacement is used by many researchers because it is autonomous. Throughout the computation, the energy is provided by the structures themselves. The mechanism is toehold-mediated branch migration. The reaction does not need organic sources or any driving force other than the DNA molecules. Strands can be divided into two-domain strands, three-domain strands, and multi-domain strands (Cardelli, 2009; Soloveichik et al., 2010). The structure of the two-domain strand is simple, invariant, and easily implemented, and it is amenable to formalization and to mechanical verification (Cardelli, 2010).

In this paper, two-domain strands were used to construct the model. All input and output strands are single strands, and the double strands serve as gates or intermediate auxiliary structures. Formula (2.2) shows that P(A∩Q) is produced when P(Q) and P(A|Q) are present, and its amount is related to P(A) and P(A|Q). The structure of the model is shown in Fig. 1 (assuming that the ratio of P(A) to P(A|Q) is greater than 1).

Fig. 1. Encoding conditional probabilities.

As shown in Fig. 1, the input strands <t^ A> and <t^ A_Q> (representing P(A) and P(A|Q), respectively) react with the auxiliary strands a1 and a2, respectively, and produce the intermediate product strands <a tb^> and <b tb^>. The intermediate strands then react with the auxiliary strand a4 to produce the output strand <t^ AQ> (representing P(A∩Q)). Throughout the whole reaction, the other strands were used for the intermediate reactions of the auxiliary strands and as recovery units or gates.

The two input strands react with auxiliary strands in the model. The input strands are present at different initial concentrations. Dividing the reaction into steps speeds up the reaction process, and it also helps to extend the model and track the reactants. The magnitude of a concentration represents a probability value. The final output of the multiplication formula is expressed by controlling the concentrations of the reactants.

2.2. Total probability formula

2.2.1. The definition of total probability formula

The total probability formula is an important formula in the theory of probability. It turns the probability of a complex event into a sum of the probabilities of simple events occurring in different situations.

Definition 2. Let (Ω, F, P) be a probability space, let A1, A2, ..., An be a finite partition of Ω, and let P(Aj) > 0 (j = 1, 2, ..., n). Then for any event Qi ∈ F:

\[ P(Q_i) = \sum_{j=1}^{n} P(Q_i \mid A_j)\,P(A_j) \tag{2.3} \]

Formula (2.3) is called the total probability formula. It shows that the total probability is a sum of n terms, each of which is the product of a conditional probability and the probability of its condition. For this reason, the results of the conditional probability formula must be obtained first and then added together.

2.2.2. Implementation of the total probability model

As in the conditional probability model, signal strands serve as the input and output signals. In the current model, we assume n = 4, so the formula simplifies to:

\[ P(Q_i) = P(A_1 Q_i) + P(A_2 Q_i) + P(A_3 Q_i) + P(A_4 Q_i) \tag{2.4} \]

Here P(A1Qi) denotes P(Qi|A1) × P(A1); it can be produced by the conditional probability model above, and the same holds for the other terms. The model is constructed as shown in Fig. 2.

Fig. 2. Encoding total probability formula.

As shown in Fig. 2, the input strands A1Q, A2Q, A3Q, and A4Q react with the auxiliary strands a1, a2, a3, and a4, respectively, and produce the common intermediate product strand <a t^>. The product strand <a t^> and the quantitative intermediate strand <Q t^> then react with the strand a5 and finally generate the output strand <t^ Q>. In the model, each input strand reacts with its corresponding gate independently, and the reactions do not affect each other. By adjusting the concentration of the auxiliary strands, the reaction can be controlled effectively and the concentration of the output strand can be obtained. The concentration of the output strand is the sum of the concentrations of the input strands. To drive the reaction closer to completion, a unit that recycles intermediate products was added to our model.

A two-domain strand structure was used in the current model. The structure of each two-domain strand is simpler than that of multi-domain strands, and the uniform structure is relatively easy to identify.

3. Implementation and simulation

This section introduces the read your mind game. Strand displacement theory was used to model the game based on probability reasoning. When the result is not unique, all possible results are expressed by their probability values. Visual DSD was used for the experiment.

3.1. Read your mind game

The game involves four questions. As shown in Fig. 3(a), a specific person can be identified from a list of four possibilities using the answers to four questions. For example, when Q3 = 0, the result is Santiago Ramón y Cajal, provided that the answer to the question is correct. When Q1 = 0, the result is not certain: it may be Rosalind Franklin or Claude Shannon. In this case there is not enough information to decide. Qian et al. used a DNA circuit and a neural network to implement the game in 2011. However, when a single result could not be confirmed from the given input signal, the output was wrong. To produce a result in this situation, we use the principle of probability reasoning.

This type of situation is common in real life. When there is not enough information, it is very hard to select the correct result. If there is more than one possible result, the results can be expressed with probabilities: all possible answers can be shown and the probability of each case can be established. Probability reasoning is used here. In the current paper, the game was modified and the questioning process was analyzed to design a network framework based on the principle of neurons. The whole process is divided into many small modules, which are designed from the models above. The results are shown as probabilities.

As shown in Fig. 3(a), the process of the game is known, and the answer to each question is constrained by the answers to the other questions. The game can be seen as a mapping function between a set of problems and actions (Fig. 3(b)). When not all values of Q are known, we have to “guess” the result. To obtain accurate results, the network framework shown in Fig. 3(c) was built. First, the process was divided into four conditions according to the number of known signals: one signal, two signals, three signals, and four signals. Then different modules were activated according to the specific signal values, and finally the conclusion was drawn.

It is assumed here that Q = {Q1, Q2, Q3, Q4} and Y = {A, B, C, D, E}. According to the game content, Q is the set of input signals, Y is the set of output signals, and E covers all cases other than the given ones. When the input signal Qi is obtained, according to formula (2.1), the following can be determined:
\[ P(A \mid Q_i) = \frac{P(A \cap Q_i)}{P(Q_i)} \tag{3.1} \]

\[ P(Q_i \mid A) = \frac{P(A \cap Q_i)}{P(A)} \tag{3.2} \]

From (3.2), the following can be determined:

\[ P(A \cap Q_i) = P(Q_i \mid A)\,P(A) \tag{3.3} \]

According to the formula (2.3), the following is true:

\[ P(Q_i) = \sum_{j=1}^{n} P(Q_i \mid Y_j)\,P(Y_j) \tag{3.4} \]

Thus

\[ P(A \mid Q_i) = \frac{P(Q_i \mid A)\,P(A)}{\sum_{j=1}^{n} P(Q_i \mid Y_j)\,P(Y_j)} \tag{3.5} \]

The reasoning is similar for other cases.
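
The chain from formula (3.3) through formula (3.5) can be checked numerically with a minimal Python sketch. This is plain arithmetic on probabilities, not a strand displacement simulation; the function name `posterior` and all numbers are hypothetical and were chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Return P(Y_j | Q_i) for every candidate Y_j.

    priors[y]      : P(Y_j)
    likelihoods[y] : P(Q_i | Y_j)
    """
    # Formula (3.3): joint probability P(Y_j and Q_i) = P(Q_i | Y_j) * P(Y_j).
    joint = {y: likelihoods[y] * priors[y] for y in priors}
    # Formula (3.4): total probability P(Q_i) is the sum of the joint terms.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observed answer has zero probability")
    # Formula (3.5): P(Y_j | Q_i) = P(Y_j and Q_i) / P(Q_i).
    return {y: joint[y] / evidence for y in joint}


# Hypothetical numbers: five equally likely candidates, and an observed
# answer Q_i that is compatible with A and C only.
priors = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.2, "E": 0.2}
likelihoods = {"A": 1.0, "B": 0.0, "C": 1.0, "D": 0.0, "E": 0.0}
result = posterior(priors, likelihoods)
print(result)
```

With these illustrative inputs the two compatible candidates share the probability equally and the remaining candidates receive zero, which mirrors the situation in which a single answer cannot identify a unique person.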

Fig. 3. (a) The process of the read your mind game; (b) the relation between problems and actions; (c) the network framework.

3.2. Simulation and results

Formula (3.5) is the foundation of the game process. The computation model was designed for formula (3.5) according to the model of conditional probability and the model of total probability based on DNA strand displacement. The Microsoft Visual DSD simulator was used to evaluate the designed scenario (Phillips and Cardelli, 2009). The model was designed for two cases: one signal and two signals. Some of the results are shown in Figs. 4 and 5. The horizontal axis represents time, and the vertical axis represents concentration.

Fig. 4. (a) The result when the signal Q1 = 0; (b) the result when the signal Q2 = 0; (c) the result when the signal Q3 = 1; (d) the result when the signal Q4 = 1.

As shown in Fig. 4, when only the input signal Q1 = 0 is known, the corresponding set of probability values is obtained: P(A), P(C), P(E0), P(Q1 = 0|A), P(Q1 = 0|C), P(Q1 = 0|E0). According to formulas (3.3) and (3.4), the conditional probabilities and the total probability can be calculated. In Fig. 4, these values are represented by the strands <t^ PQ10>, <t^ PQ10A>, and <t^ PQ10C>. Their ratios are the final results. If there are two or more possible answers, they can also be found. The other single-signal cases are similar. Table 1 lists the probability represented by each strand when only the input signal Q1 = 0 is known. The concentrations of the output strands are approximate.

As shown in Fig. 5, when the input signals are Q1 = 0 and Q3 = 1, the corresponding probability set is P(A), P(C), P(E), P(Q1 = 0 ∩ Q3 = 1|A), P(Q1 = 0 ∩ Q3 = 1|C), P(Q1 = 0 ∩ Q3 = 1|E). The probability values can then be found based on formulas (3.3) and (3.4). Here, two probability values are found. In other words, when Q1 = 0 and Q3 = 1, the result may be A or C, and they have the same probability value. The inference for other cases is similar.

In the current simulation, the concentration ratio was used to represent the probability value. An answer can be obtained even when there is not enough information. When the input signal has nothing to do with any of the possible results, there is no output. The process of probability reasoning could be applied in many fields in the future.

4. Conclusions

In this paper, models of conditional probability and total probability were implemented based on DNA strand displacement. They can be extended to probabilistic reasoning. The entire reaction process is completely autonomous and enzyme free. The “read your mind” game was used to examine the computation capacity of the model. In the current experiment, whether one answer or more than one answer was provided in the problem set, the corresponding possible actions were obtained. The process of computation is transformed into a network structure flow chart and divided into smaller frameworks. The models provide a means of implementing probability reasoning based on DNA strand displacement, and they have the potential to support other applications of probability. We plan to extend the capacity of the model so that it can work with other network structures, such as neural networks.
278 Q. Zhang et al. / Computational Biology and Chemistry 71 (2017) 274–279

Fig. 5. (a) The result when Q1 = 0 and Q4 = 0; (b) the result when Q1 = 0 and Q3 = 1; (c) the result when Q2 = 1 and Q3 = 1; (d) the result when Q3 = 1 and Q4 = 1.

Table 1
Relationship between the strand and the probability when the signal Q1 = 0.

Probability      Corresponding strand   Concentration value (nM)
P(A)             <t^ PA1>               100
P(C)             <t^ PC1>               100
P(E0)            <t^ PQ1E0>             600
P(Q1 = 0|A)      <t^ Q10_A>             1600
P(Q1 = 0|C)      <t^ Q10_C>             1600
P(Q1 = 0|E0)     <t^ Q10_E0>            1600
P(Q1 = 0)        <t^ PQ10>              800
P(A∩Q1 = 0)      <t^ PQ10A>             100
P(C∩Q1 = 0)      <t^ PQ10C>             100

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61425002, 61772100, 61672121, 61572093, 61402066, 61402067, 61370005, 31370778), Program for Changjiang Scholars and Innovative Research Team in University (No. IRT_15R07), the Program for Liaoning Innovative Research Team in University (No. LT2015002), the Basic Research Program of the Key Lab in Liaoning Province Educational Department (Nos. LZ2014049, LZ2015004), Scientific Research Fund of Liaoning Provincial Education (Nos. L2015015, L2014499), and the Program for Liaoning Key Lab of Intelligent Information Processing and Network Technology in University.

References

Adleman, L.M., 1994. Molecular computation of solutions to combinatorial problems. Science 266, 1021–1024.
Boemo, M.A., Turberfield, A.J., Cardelli, L., 2015. Automated Design and Verification of Localized DNA Computation Circuits. DNA Comput. Mol. Program. 9211, 168–180.
Cardelli, L., 2010. Two-domain DNA strand displacement. Comput. Models 26, 47–61.
Cardelli, L., 2009. Strand algebras for DNA computing. Nat. Comput. 10, 407–428.
Condon, A., Kirkpatrick, B., Maňuch, J., 2014. Reachability bounds for chemical reaction networks and strand displacement systems. Nat. Comput. 4, 499–516.
Genot, A.J., Fujii, T., Rondelez, Y., 2013. Scaling down DNA circuits with competitive neural networks. J. R. Soc. Interface 10 (85) (20130212-20130212).
Lakin, M.R., Phillips, A., 2011. Modelling, Simulating and Verifying Turing-Powerful Strand Displacement Systems. Springer, Berlin Heidelberg.
Lakin, M.R., Phillips, A., Parker, D., Cardelli, L., 2012. Design and Analysis of DNA strand displacement devices using probabilistic model checking. R. Soc. 9, 1470–1485.
Lakin, M.R., Phillips, A., Stefanovic, D., 2013. Modular verification of DNA strand displacement networks via serializability analysis. Proceedings of the 19th International Conference on DNA Computing and Molecular Programming 133–146 (8141).
Mardian, R., Sekiyama, K., Fukuda, T., 2013. DNA strand displacement for stochastic decision making based on immune’s clonal selection algorithm. Inf. Technol. Know. 7, 34–44.
Phillips, A., Cardelli, L., 2009. A programming language for composable DNA circuits. J. R. Soc. Interface 6, 419–436.
Qian, L.L., Winfree, E., 2011. Scaling up digital circuit computation with DNA strand displacement cascades. Science 332, 1196–1201.
Qian, L.L., Winfree, E., Bruck, J., 2011. Neural network computation with DNA strand displacement cascades. Nature 475, 368–372.
Ran, T., Kaplan, S., Shapiro, E., 2009. Molecular implementation of simple logic programs. Nat. Nanotechnol. 4, 642–648.
Sainz de Murieta, I., Rodriguez-Paton, A., 2014. Probabilistic reasoning with a Bayesian DNA device based on strand displacement. Nat. Comput. 4, 549–557.
Seelig, G., Soloveichik, D., Zhang, D.Y., Winfree, E., 2006. Enzyme-free nucleic acid logic circuits. Science 314, 1585–1588.
Soloveichik, D., Seelig, G., Winfree, E., 2010. DNA as a universal substrate for chemical kinetics. Proc. Natl. Acad. Sci. 12, 5393–5398.
Zhang, D.Y., Seelig, G., 2011. Dynamic DNA nanotechnology using strand displacement reaction. Nat. Chem. 3, 103–113.
