Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

Add Data into Business Process Verification: Bridging the Gap between Theory and Practice

Riccardo De Masellis,1 Chiara Di Francescomarino,1 Chiara Ghidini,1 Marco Montali,2 Sergio Tessaris2
1 FBK-IRST, Via Sommarive 18, 38050 Trento, Italy
2 Free University of Bozen-Bolzano, piazza Università, 1, 39100 Bozen-Bolzano, Italy

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

The need to extend business process languages with the capability to model complex data objects along with the control flow perspective has led to significant practical and theoretical advances in the field of Business Process Modeling (BPM). On the practical side, there are several suites for control flow and data modeling; nonetheless, when it comes to formal verification, the data perspective is abstracted away due to the intrinsic difficulty of handling unbounded data. On the theoretical side, there is significant literature providing decidability results for expressive data-aware processes. However, these results struggle to produce a concrete impact, being far from real BPM architectures and, most of all, not providing actual verification tools. In this paper we aim at bridging such a gap: we provide a concrete framework which, on the one hand, being based on Petri Nets and relational models, is close to the widely used BPM suites, and on the other is grounded on a solid formal basis which allows formal verification tasks to be performed. Moreover, we show how to encode our framework in an action language so as to perform reachability analysis using virtually any state-of-the-art planner.

1 Introduction

The need to extend business processes with the capability to handle complex data objects has been increasingly recognized both in the BPM and AI areas and has led to significant practical and theoretical advances in the BPM field (Hull 2008; Meyer, Smirnov, and Weske 2011; Reichert 2012; Calvanese, De Giacomo, and Montali 2013; Hull and Motahari Nezhad 2016). On the practical side, several well-established suites, capturing both the process control-flow and its relevant data, are nowadays available as commercial and non-commercial tools. Examples are the Bizagi BPM Suite, Bonita BPM, Camunda and YAWL. Despite different modeling choices, all such tools share a common feature: the way data are modified is often hidden inside the logic of activities/tasks implemented, e.g., with Java classes, hence resulting in an essentially activity-centric model, where data are introduced in an ad-hoc way as a sort of "procedural attachment" (Calvanese, De Giacomo, and Montali 2013). As a consequence, when coming to the formal verification of data related aspects, these tools either offer only basic features (such as Data Type checks) without considering the interaction between control- and data-flow, or they fail to incorporate data into verification questions, thus producing misleading answers. This does not come as a surprise as, when analyzing the evolution of data in a process which interacts with the external world, unboundedly many new, i.e., fresh, data values have (in general) to be considered, making verification of even simple properties undecidable.

On the theoretical side there is a significant body of literature on the boundaries of decidability and complexity for the verification of data-aware processes against different formal properties. The problem of these frameworks is that either they are far from the modeling languages used in real BPM suites and lack any tool support, or the data model is not adequate for expressive scenarios (see §2 and §7).

In this paper we aim at bridging the gap between theory and practice by providing a concrete framework, called RAW-SYS (from: Relational-AWare SYStem), for modeling and verifying data-aware processes as represented by well established BPM suites. In particular we provide:
1. a language for modeling the control-flow, the data and their interaction based on the most popular, yet formal, frameworks for modeling these three components, namely Petri Net (PN) (Van Der Aalst 1998), relational models, and actions à la Data Centric Dynamic System (DCDS) (Bagheri Hariri et al. 2013) (§4);
2. a reference architecture that mimics how data-aware processes are represented by well established BPM suites (§2) and an encoding of such an architecture in RAW-SYS (§4);
3. a decidable, yet very expressive, customization of the reference architecture to enable formal verification (§5);
4. an actual verification mechanism based on an encoding of RAW-SYS into an action language which allows for exploiting automated planners (§6).

2 Motivations and Architecture

Commercial and non-commercial BPM suites, such as the ones listed in Table 1, nowadays support the modeling of both control and data flow and provide consistency and verification support. Focusing on the latter we can identify three different levels at which control and data flow are verified. We illustrate them with the help of the following simple example where activity A is followed by an exclusive choice (xor-split) that
Tool      Workflow Language   Data Model   Data Handling   Formal Verification
Bonita    BPMN                ER           global, local   None
Bizagi    BPMN                ER           global, local   None
YAWL      YAWL                XML schema   global, local   (1)
Camunda   BPMN                None         global, local   None

Table 1: Tool analysis

[Figure 1: RAW-SYS architecture. Diagram labels: Case, Task, i, o, local, global, read, write.]

leads the process either to terminate or to re-execute A.

[Figure: Activity A, with effect $R(x, y) := S(x, y)$, followed by an xor-split with condition $R, S \neq \emptyset \rightarrow (\exists x, y.\, R(x, y) \wedge S(x, y))$? and outgoing arcs labeled True and False.]

Activity A modifies data by setting relation R to be equal to S, and the xor-split requires conditions (on data) to be specified on each outgoing flow: specifically, whether there is a common tuple to both R and S, if they are not empty. At level (1) verification focuses only on the control flow. In this case the process is considered sound as one path exists that leads to termination. At level (2) verification also takes into account the conditions on arcs, e.g., checking whether they are satisfiable. Again the process is considered sound. At level (3) verification also takes into account the effects of activities on data: only in this case can we conclude that the process never terminates, as all tuples in S are also in R.

Although the process never terminates, when verified by existing BPM suites such as Bizagi^1 or YAWL (ter Hofstede et al. 2010), this critical issue is not revealed. Indeed YAWL offers verification features limited to the control flow (i.e., at level (1)) and thus it wrongly reports that such a process can always reach the termination state. All other tools in Table 1 instead only offer a simulation environment (i.e., no formal verification) that checks whether the process passes through all the sequence flows, without taking into account data.

^1 http://www.bizagi.com/

Framework            Form. Verif.
CPN                  (3⁻)
Conceptual WF-Nets   (3⁻)
DCDS                 (3)

Table 2: Framework analysis

Moving from actual tools to existing theoretical frameworks the situation improves (see Table 2). Colored Petri Nets (CPN) and extensions of Workflow-Nets (WF-Nets) (Sidorova, Stahl, and Trcka 2011) can offer verification support that takes into account both conditions on labels and some interaction between activities and data. Nonetheless these frameworks suffer from two problems: first, they do not rely on an explicit data model but rather encode data within the Net, thus making it difficult to model tools which are mostly based on a relational data model; second, decidability is guaranteed only by strongly limiting the number of colors/types (CPNs) or abstracting from actual values (WF-Nets) (see §7). For the above reasons, they offer a restricted form of (3), whence (3⁻), thus posing a serious limitation to process execution, which often needs unboundedly many new, i.e., fresh, data values. Conversely, Data Centric Dynamic System (DCDS) does offer, from a theoretical point of view, a more expressive, yet decidable, reference framework. Thus one may think to provide a specific encoding from each BPM tool to DCDS, in order to perform verification at level (3). Such a solution is however not practical for (at least) two important reasons: on the one hand the formalization of control and data flow in DCDS is provided in terms of a STRIPS-like language that is extremely abstract and far from the languages and system architectures used in BPM suites, and on the other hand DCDSs do not have any tool support to concretely perform verification.

This analysis has motivated us towards the introduction of RAW-SYS, a new framework for modeling and verifying data-aware processes as represented by well established BPM suites. To do that, RAW-SYS is based on two basic pillars: first, a conceptual model close to the ones used by BPM suites and, at the same time, amenable to formal verification; second, a system architecture that mimics the way BPM suites deal with process executions and data.

Concerning the conceptual model, we define RAW-SYS on top of three reference components: (i) Petri Net (PN), for specifying the process control-flow; (ii) relational models with expressive constraints for describing data; and (iii) actions à la DCDS for expressing the interaction between control-flow and data. These components are, undoubtedly, among the most popular formal frameworks available in literature. Also, as we can observe in Table 1, PNs and relational models act as suitable formal models for the specific workflows and data models used by BPM suites^2.

^2 For the usage of PNs to provide a formal semantics of BPMN see (Dijkman, Dumas, and Ouyang 2008); for the integration of XML schemas with relational database systems see (Kappel, Kapsammer, and Retschitzegger 2004).

Concerning the system architecture, we adopt the one illustrated in Fig. 1. As the majority of BPM suites, we allow for both modeling and executing process (and data) instances. Therefore, we distinguish the intensional level/schema of a RAW-SYS system, which we call model, from the extensional level/instance, called snapshot, which captures the status of the system at a certain time. From a high-level perspective, a RAW-SYS model is made up of a data store and a set of process models. Following a common practice in BPM suites (see column Data Handling in Table 1), the data store is structured in a global data store, that is a standard relational database schema with integrity constraints common to all process models, and a local data store, which is again a full-fledged relational database schema, defined within a single process model. At runtime processes are instantiated in a number of cases. Thus a snapshot contains all cases active in a certain moment of time. Note that each case creates an instantiation of the local data store private to the case itself, which is hence used to keep auxiliary information that is
of interest only as long as the case is active. Conversely, the global data store provides the stable memory of the company, and can be accessed by every case. Note also that tasks can interact with external (human or system) agents and the interaction can result in modifying data, including the injection of new, fresh data into the system (e.g., think of a user filling a form with arbitrary data).

We conclude this section presenting a reimbursement (RB) example used throughout the paper to illustrate our work.

Example 1 Company GoodsKit is equipped with an information system which manages, among others, the employees' reimbursement procedure (RB) for their business trips. At runtime, the system deploys an architecture as in Fig. 1 where each process is instantiated by a number of cases. E.g., an RB case deals with the trip reimbursement of employee john to NewYork. As customary, data are contained in a global and a local component: an example of local variable is one that maintains the status of the request.

3 The Workflow Nets modeling language

Petri Nets (PNs) is a widely-known language for modeling distributed systems that has become the de-facto standard for the formal representation of (the control-flow of) business processes (van der Aalst and Stahl 2011). Structurally, a PN is a directed bipartite graph with two node types, called places and transitions, connected via directed arcs. Connections between two nodes of the same type are not allowed.

Definition 1 (Petri Net) A Petri Net is a triple $\langle P, T, F \rangle$ where $P$ is a set of places; $T$ is a set of transitions, such that $P \cap T = \emptyset$; $F \subseteq (P \times T) \cup (T \times P)$ is the flow relation describing the arcs connecting places and transitions.

The preset of a transition $t$ is the set of its input places: ${}^{\bullet}t = \{p \in P \mid (p, t) \in F\}$. The postset of $t$ is the set of its output places: $t^{\bullet} = \{p \in P \mid (t, p) \in F\}$. Intuitively, places represent states/conditions associated to threads (i.e., tokens) dynamically moving through the PN, whereas transitions represent atomic tasks/operations used to evolve such tokens from one state to another. To characterize the global configuration of the system, tokens are distributed over places. Technically, this is done by marking the net, where a marking is a total mapping $M : P \to \mathbb{N}$ indicating how many tokens are present in each place of the PN.

PNs come with a graphical notation where places are represented by circles, transitions by rectangles, and tokens by full dots within places. Fig. 2 depicts a PN with a marking $M(p_0) = 2$, $M(p_1) = 0$, $M(p_2) = 1$. The preset and postset of $t$ are $\{p_0, p_1\}$ and $\{p_2\}$, respectively.

[Figure 2: A PN, with places $p_0$, $p_1$, $p_2$ and transition $t$.]

The idea of using PNs to model processes starts from the observation that business processes are basically ordered sets of tasks. Thus they can be mapped onto a PN in the following way: tasks are modeled by transitions and precedence relations are modeled by places. However, processes have specific characteristics: they have a clear starting and completion state, with control-flows connecting such two extreme points. This observation resulted in the definition of workflow nets (WF-nets) (Van Der Aalst 1998).

Definition 2 (WF-net) A Petri net $\langle P, T, F \rangle$ is a WF-net if it has a single source place $i$, a single sink place $o$, and every place/transition is on a path from $i$ to $o$, i.e., for all $n \in P \cup T$, $(i, n) \in F^*$ and $(n, o) \in F^*$, where $F^*$ is the reflexive transitive closure of $F$.

Example 2 The WF-net modeling the RB control-flow is:

[Figure: WF-net with places i, p0, p1, o and transitions reviewReq, no-op, fillReimb, reviewReimb.]

At the start of the process a request is examined. If it is not approved the process terminates; if it is approved, the employee can fill a reimbursement which will then be reviewed before termination. The no-op transition is needed to prevent connections between nodes of the same type.

The semantics of a PN (thus also of a WF-net), and in particular the notion of valid firing, defines how transitions route tokens through the net so that they correspond to a process execution. A firing of a transition $t \in T$ from $M$ to $M'$ is valid, in symbols $M \xrightarrow{t} M'$, iff (i) $t$ is enabled in $M$, i.e., $\{p \in P \mid M(p) > 0\} \supseteq {}^{\bullet}t$; and (ii) the marking $M'$ satisfies the property that for every $p \in P$: $M'(p) = M(p) - 1$ if $p \in {}^{\bullet}t \setminus t^{\bullet}$; $M'(p) = M(p) + 1$ if $p \in t^{\bullet} \setminus {}^{\bullet}t$; and $M'(p) = M(p)$ otherwise.

A case of a WF-Net is a sequence of valid firings $M_0 \xrightarrow{t_1} M_1, M_1 \xrightarrow{t_2} M_2, \ldots, M_{k-1} \xrightarrow{t_k} M_k$ where $M_0$ is the marking with a single token in $i$. We say that a WF-Net reaches a marking $M$ if there exists a finite sequence $t_1, \ldots, t_k$ of transitions such that $M_0 \xrightarrow{t_1} M_1, \ldots, M_{k-1} \xrightarrow{t_k} M$.

The number of tokens flowing through a PN is usually subject to a maximum threshold that is never exceeded. PNs enjoying this property are called "safe".

Definition 3 (k-safeness) A marking of a PN is k-safe if it assigns no more than $k$ tokens to each place. A PN is k-safe if the initial marking $M_0$, and all markings reachable from $M_0$, are k-safe.

4 The RAW-SYS Framework

We start the formal presentation from the data stores. We fix once and for all a countably infinite set $\Delta$ of constants to be used as domain and we observe that technically there is no difference between the global and local data stores.

Definition 4 (Data store) A (local or global) data store is a tuple $D = \langle R, C, I_0 \rangle$, where:
• $R$ is a database schema, i.e., a set of relation schemas;
• $C$ is a set of safe range FO-constraints^3 over $R$, capturing the real-world constraints of the application domain;
• $I_0$ is the initial database instance of $D$, i.e., a data store instance conforming to $R$, satisfying the constraints $C$, and made of values in $\Delta$.

^3 This is standard. Recall that relational algebra is safe range by construction.

Example 3 Due to lack of space, we present some relations of the RB local data store only (underlined attributes are primary keys): CurrReq(empl, dest, st) listing the employee that requested a reimbursement, the trip destination and the
status of the request and TrvlMax(ma) containing the maximum amount estimated by the travels office for the trip.

We now move to processes, which are modeled in a prescriptive fashion leveraging standard WF-nets, but we notably enrich them with data. We call the resulting nets Relational-AWare workflow nets (RAW-nets). Intuitively, RAW-nets are standard WF-nets equipped with a (local) data store as in Definition 4. We concentrate on 1-safe nets, which generalize the class of structured workflows and are the basis for best practices in process modeling (Kiepuszewski, ter Hofstede, and Bussler 2013). It is important to notice that our approach can be seamlessly generalized to other classes of Petri nets, as long as it is guaranteed that they are k-safe. This reflects the fact that the process control-flow is well-defined. Also, transitions are data-aware: their execution is guarded by queries over the local and global data store, and their effects update the local and the global data store.

Definition 5 (Relational-aware WF-net) A RAW-net over a global data store $D_G$ is a tuple $\langle D, P, T, F, \mathcal{F}, A, G \rangle$ where:
• $D$ is the process local data store (as in Def. 4) with $I_0 = \emptyset$;
• $\langle P, T, F \rangle$ is a WF-net;
• $\mathcal{F}$ is a function that associates each transition with a finite set of functions from $\bigcup_{n \geq 0} (\Delta^n \to \Delta)$, each representing the interface to a (nondeterministic) external service;
• $A$ is a function that associates each transition $t$ with an action (see later);
• $G$ is a function that associates each transition with a safe range query over $D \cup D_G$ (the guard).
Transition actions may include parameters (see later): in that case the guard free variables must match the parameters of the corresponding tasks.

Among the many different data interaction formalisms existing in the literature, we adopt the approach in (Bagheri Hariri et al. 2013; Montali and Calvanese 2016), which allows us to express virtually any pattern of update, including CRUD operations over the data stores, but also bulk operations that simultaneously manipulate large portions of the data store at once. Intuitively, an action is described by a set of add and remove operations on data store instances conditioned by a domain independent query. More formally, an action $act(p_1, \ldots, p_n) : \{e_1, \ldots, e_m\}$ is characterized by: (i) the name $act$; (ii) a list of parameters $\vec{p} = p_1, \ldots, p_n$; and (iii) a set of effects $e_1, \ldots, e_m$. The parameters are substituted with actual values $\vec{d}$ when the action is invoked. Such values are every answer of the corresponding guard query given by $G$ on the local and/or global data source. Each effect $e_i$ is of the form $Q(\vec{p}, \vec{x}) \rightsquigarrow \mathbf{add}\ A(\vec{p}, \vec{x}, \vec{y})\ \mathbf{del}\ D(\vec{p}, \vec{x})$, where $Q$ is a domain independent query over a data schema with open variables $\vec{x}$, while $A$ and $D$ are sets of facts over (possibly another) schema. Intuitively, $Q$ is used to select some values (its answers) that are then used to add and/or remove facts. Notice that added facts $A$ may include Skolem terms $\vec{y}$ that model the interaction with external services: at runtime Skolem terms are substituted by the value returned by issuing the corresponding service call given by $\mathcal{F}$. Some of those services can be guaranteed to return fresh values (i.e., constants not included in the active domain) and this is essential to enable the creation of new objects via new identifiers. The effects $e_i$ are assumed to take place simultaneously.^4

^4 As in STRIPS, additions have higher priority than deletions.

Example 4 We now show a simple action of the running scenario. Activity reviewRequest examines the employee's request to leave for a business trip and evaluates the maximum reimbursable amount. The (only) effect of the corresponding action rvwReq() (with no parameters) is specified as:

$$\mathit{CurrReq}(e, d, s) \rightsquigarrow \mathbf{del}\{\mathit{CurrReq}(e, d, s)\}\ \mathbf{add}\{\mathit{CurrReq}(e, d, \mathit{status}()),\ \mathit{TrvlMax}(\mathit{ma}())\}$$

In order to update the request status, the tuple representing the current travel request in CurrReq must be first deleted and then added with the new status (recall that additions have higher priority than deletions). Query CurrReq(e, d, s) selects such a tuple, effect del{CurrReq(e, d, s)} deletes it, while add{CurrReq(e, d, status())} adds the same tuple but with a new value for the status. Also, the value of the maximum reimbursable amount is added by the fact TrvlMax(ma()). We use functions status() and ma() to model unknown values coming from the external environment, in our case the company travels office^5.

^5 If we require a function to return a value from a given set, e.g., status() from {accp, rejc}, it is enough to include a foreign key constraint to an auxiliary relation containing accp and rejc.

The valid firing of a transition, in addition to the usual marking conditions on the input places as specified by the RAW-net, is conditioned by the satisfiability of the guard and the successful application of the action. Note that an action might be unsuccessful because of a violation of a constraint in the local data store.

Finally, a RAW-SYS model simply incorporates the global data store together with a set of processes:

Definition 6 (RAW-SYS model) A RAW-SYS model is a tuple $\langle D_G, W \rangle$ where:
• $D_G$ is a global data store as in Definition 4;
• $W$ is a set of RAW-nets as in Definition 5 over the global data schema $D_G$.

Semantics is provided in terms of global states (snapshots) that include the set of all active cases as well as the instance of the global data store.

Definition 7 (RAW-SYS snapshot) A snapshot $s$ of a RAW-SYS model $\langle D_G, W \rangle$ is a tuple $\langle I_S, K \rangle$ where:
• $I_S$ is an instance of the global data store $D_G$;
• $K$ is a set of cases, i.e., tuples $\langle w, (M, I) \rangle$ where $w \in W$ and $(M, I)$ represents the state of the RAW-net $w$, namely its marking $M$ and the instance $I$ of its local data store.

The behavior of the system is described by means of all the possible sequences of snapshots evolving from the initial one. The initial snapshot $s_0$ has an empty set of cases (none of the processes is active), and the global instance $I_0$ specified, i.e., $s_0 = \langle I_0, \emptyset \rangle$. It is worth noting that, due to the presence of external service calls and also due to the possibility of nondeterministically spawning new process cases, the execution semantics of a RAW-SYS needs in general to account for an
infinite number of states, as well as truly infinite runs that may visit infinitely many different data store instances.

Sequences of valid snapshots are defined according to two types of transitions that make the system evolve nondeterministically. In what follows we assume the current snapshot to be $s = \langle I_S, K \rangle$.

Creation of a new case: A new case $\langle w, (M_0, I_0) \rangle$ of process $w \in W$ is created, where $M_0$ is the initial marking and the local data store instance $I_0$ is empty. The global data store is untouched: the new snapshot is therefore $s' = \langle I_S, K' \rangle$ with $K' = K \cup \{\langle w, (M_0, \emptyset) \rangle\}$.

Case firing: one of the transitions $t \in T$ of an active case $\langle w, (M, I) \rangle \in K$ with $w = \langle D, P, T, F, \mathcal{F}, A, G \rangle$ can be executed by picking a suitable parameter substitution $\vec{d}$ from the answers of query $G(t)$ evaluated on $I_S \cup I$ and for the involved service calls $\mathcal{F}(t)$. The corresponding action $A(t)$ with actual parameters $\vec{d}$ is fired and possibly updates the local data store instance from $I$ to $I'$ and the global data store from $I_S$ to $I_S'$. Also the net marking is updated to $M'$, and we distinguish two cases:
(i) if $M'$ is a final marking for $w$, then the process has terminated and it is removed from the set of active cases, resulting in a new snapshot $s' = \langle I_S', K' \rangle$ with $K' = K \setminus \{\langle w, (M, I) \rangle\}$;
(ii) otherwise the new snapshot is $s' = \langle I_S', K' \rangle$ with $K' = K \setminus \{\langle w, (M, I) \rangle\} \cup \{\langle w, (M', I') \rangle\}$.
Notice that in both cases, if $I'$ or $I_S'$ are not legal instances for $D$ or $D_G$, i.e., they do not satisfy the integrity constraints, then the transition cannot be fired.

5 Verification of RAW-SYS models

We now consider verification of RAW-SYS models, focusing on fundamental dynamic properties such as reachability as well as model checking against first-order temporal logics. In our setting, reachability amounts to checking whether there exists a run of the RAW-SYS under study that starts from the initial state and eventually achieves a state whose data satisfy a boolean query of interest. Since the execution semantics of RAW-SYS gives rise to an infinite-state transition system, verification is much more challenging than in the conventional, finite-state setting (Calvanese, De Giacomo, and Montali 2013). In particular, even the most basic forms of reachability are highly undecidable in the general case. We hence exploit reachability as a test for isolating classes of RAW-SYS: whenever reachability turns out to be undecidable for a class, we consider it not amenable to verification.

In this work, we rely on the notion of state-boundedness, which has been extensively exploited for providing strong, robust decidability conditions in a plethora of data-aware process frameworks (Belardinelli, Lomuscio, and Patrizi 2012; Bagheri Hariri et al. 2013; De Giacomo, Lesperance, and Patrizi 2012). Essentially, state-boundedness requires limiting, a priori, the number of objects that can co-exist in the same state. A state-bounded system still accepts unboundedly many different objects to appear within and across runs.

In all such previous works, there is a single, global data store, whose size is subject to the (state) bound. In our setting, there are three information sources to which such a notion of boundedness can be applied: the global data store, the local data store, and the number of running cases. We respectively call the three corresponding notions of state boundedness global size-, local size-, and case-boundedness. Two research questions consequently arise: (1) Can state-boundedness help towards decidability of verification over RAW-SYSs? (2) If so, which of the information sources must necessarily be bounded towards decidability?

Undecidability. We attack the second research question, showing that as soon as one of the RAW-SYS information sources is not size-bounded, reachability of a query constituted by an atomic proposition is undecidable even when the modeling elements of the framework are severely restricted. More specifically, we consider a class of RAW-SYS called isolated: $\langle D_G, W \rangle$ is isolated if every RAW-net $w$ in $W$ satisfies the following two conditions:
1. $w$ has the following shape:

[Figure: an input place i, a transition init($\vec{x}$) with guard $G_i(\vec{x})$, the sub-net inner(w), a transition finalize($\vec{y}$) with guard $G_f(\vec{y})$, and an output place o.]

where inner(w) is a workflow net.
2. All guards and actions attached to inner(w) refer only to the local data store of the RAW-net (i.e., they do not read, nor write, $D_G$).

Intuitively, in an isolated system each case interacts with the global data store only when it is created, for initializing the local data store, or when it completes its execution, for determining which local data have to be persistently stored. All other guards and actions only operate over the local data store, thus limiting the interactions with the other, simultaneously running, cases.

We report in the following the results for each of the three dimensions of state-boundedness: isolated RAW-SYSs where no bound is imposed on the local data stores, on the global data stores and on cases. The proofs of the first two theorems are by reduction from the halting problem for deterministic two-counter machines (2CMs), well-known to be undecidable, which can be simulated by isolated global size- (local size-) and case-bounded RAW-SYSs. The proof of the third theorem requires instead a more convoluted approach.

Theorem 1 Checking reachability of an atomic proposition is undecidable over isolated, global size-bounded and case-bounded RAW-SYSs, where: (i) the global data store contains only propositions; (ii) there is only one RAW-net equipped with a local data store constituted by two unary relations; (iii) there is at most one running case.

Proof 1 (Proof Sketch) The proof is by reduction from the halting problem for deterministic, two-counter machines (2CMs), well-known to be undecidable (Minsky 1967). We fix a simple, isolated RAW-SYS $S$ with the following features: (i) the global data source contains only a single proposition Hit used as a flag; (ii) the single RAW-net $w$ of $S$ has a data store equipped with two unary relations $C_1$ and $C_2$. Net $w$ has an init transition that lets only one single token flow into the internal isolated workflow net, and a finalize transition that raises the Hit flag in the global store. The internal workflow net inner(w), in turn, encodes a 2CM
as follows. The instruction identifiers of the 2CM become places of inner(w), where the first instruction corresponds to the input place of inner(w), and the halting state of the 2CM corresponds to its output place. The values of the two counters correspond to the size of the extensions of the two relations $C_1$ and $C_2$, which are initially empty. In this light, increment and conditional decrement of the first counter are simulated as follows:

• An increment operation k1 : c1++; goto k2 is simulated in inner(w) by the net fragment below, where $inc_{k_1,k_2}() : \{true \rightsquigarrow \mathbf{add}\{C_1(newval())\}\}$ injects a new value in $C_1$ through the fresh service call newval().

[Figure: place $k_1$, transition $inc_{k_1,k_2}()$, place $k_2$.]

• A conditional decrement operation k3 : if c1==0 then goto k4; else goto k5 is instead simulated using the net fragment below.

[Figure: place $k_3$ with two outgoing branches: transition $noop_{k_3,k_4}()$ with guard $\neg\exists x.\, C_1(x)$ leading to place $k_4$, and transition $dec_{k_3,k_4}(x)$ with guard $C_1(x)$ leading to place $k_5$.]

The left branch captures the case where the counter is 0, which only requires updating the program counter, and in fact $noop_{k_3,k_4}() : \{\}$. The right branch captures the case where the counter is positive, and must consequently decrease by one unit. This is captured by guard $C_1(x)$, which nondeterministically picks an element from $C_1$, and by the related action $dec_{k_3,k_4}(x) : \{true \rightsquigarrow \mathbf{del}\{C_1(x)\}\}$, which removes it. Similar structures are used to simulate increment and conditional decrement for counter 2. It is then easy to see that the 2CM halts if and only if $S$ reaches a state where Hit holds.

Theorem 2 Checking reachability of an atomic proposition is undecidable over isolated, local size-bounded and case-bounded RAW-SYS, where: (i) the global data store contains only propositions and two unary relations; (ii) there is only one RAW-net equipped with an empty local data store; (iii) there is at most one running case.

Proof 2 (Proof Sketch) The proof is similar to that of Theorem 1, with the difference that the unary relations used to simulate the counters are now in the global data store. The global data store is also equipped with a finite set of atomic propositions, one per state of the 2CM. In a given snapshot, only one of such atomic propositions holds (modeling that the 2CM is in a certain current state). The single RAW-net has

[...]

only unary relations, whose extension contains at most one tuple (i.e., they act as registers); (ii) there is only one RAW-net equipped with an empty local data store; (iii) there is at most one running case.

Proof 3 (Proof Sketch) This case is the trickiest as we can no longer exploit the same technique adopted in the previous two cases, because it is no longer possible to exploit the local/global data store to remember the value of the two counters. Furthermore, when a certain process case becomes running, its evolution cannot be affected by that of other running cases, since it only works on its own local data.

Let $w$ be the single RAW-net of $S$. The WF-net of $w$ is again the trivial net containing a single, no-op transition. However, it is now associated to two sophisticated updates to be applied when an instance of $w$ is created or terminates its execution.

The two counters are simulated using two chains of running cases, where the value of the counter is the length of the chain minus 1. The main difficulty is how to rigidly keep track of the ordering between cases in a chain, and how to properly manipulate the chain, given the fact that the information about the chain itself cannot be stored anywhere, all the data stores being bounded. We attack this problem as follows. First of all, each instance of $w$ exposes itself via a ticket, i.e., a unique identifier that is made explicit in the global data store (this is necessary, since the internal case identifiers are not visible). The global data store remembers the current control-state of the 2CM by adopting the same technique as in the proof of Theorem 2. It also remembers the extreme points of the two chains (i.e., the tickets of the instances at their top/bottom).

Since the 2CM is deterministic, we can assume that each control-state has a unique successor state obtained via a counter-increment operation, or two successor states, one achieved when a counter is positive and gets decremented, the other achieved when the same counter is zero. Increment is then simulated by allowing for the creation of a new w-instance only if the current control-state has an increment transition. Upon creation, the instance consumes the information about the instance that is currently at the top of the corresponding chain, remembering the corresponding ticket in a local, previous ticket relation. At the same time, it generates a fresh ticket identifying itself, and updates the global data store by declaring that this ticket is now at the top of the chain. This immediately simulates increment, and therefore the very same update also updates the control-state.
a trivial structure with an input and output places connected
Decrement transitions for a counter are simulated by the
by a single, no-op transition. Upon creation of an instance of
termination of the running w-instance that is at the top of
such a RAW-net, it is checked which of such propositions cur-
the corresponding chain. However, termination is not explic-
rently holds, triggering a suitable, immediate update of the
itly controllable in the specication: all running w-instances
current state (i.e., substituting the current proposition with
evolve in parallel and in isolation to each other, and conse-
the next one, in accordance to the control-state update of the
quently there is no explicit way of selecting only the instance
2CM), and an update on the extension of the two counter
at the top of the chain and induce its termination. To enforce
relations, with the same strategy discussed in the proof of
this, we leverage the fact that a RAW-net case can properly
Theorem 1.
terminate only if the corresponding update satises the con-
Theorem 3 Checking reachability of an atomic proposition straints of the global data store. In particular, we make sure
is undecidable over isolated, local and global size-bounded that whenever a w-instance wants to terminate, a constraint
RAW-SYSs, where: (i) the global and local data stores contain is violated if the ticket of that instance does not correspond to

that at the top of the chain. When the top-instance is picked, it successfully terminates by updating the control-state, and declaring that the top ticket is now the one stored in its previous ticket relation.

Transitions triggered by a zero test are simulated in a similar way, discriminating them from decrement transitions by simply checking whether the selected, running w-instance is not only at the top of the chain, but also at the bottom (i.e., the chain contains only such an instance).

Decidability for Bounded RAW-SYSs. The undecidability results in Theorems 1–3 show that as soon as one of the three information sources has unbounded size, reachability turns out to be undecidable. We thus analyze the situation where all of them are bounded. We simply refer to such systems as size-bounded RAW-SYSs. We stress that such systems are by no means finite-state: they allow for storing unboundedly many data within and across system runs, provided that such data do not accumulate in the same snapshot.

For bounded RAW-SYSs, we prove the following positive result, thanks to a reduction to DCDSs.

Theorem 4 Checking reachability over size-bounded RAW-SYSs is decidable in PSPACE in the size of the initial global data store.

Proof 4 (Proof Sketch) The proof is done in three steps: (1) we provide a behavior-preserving encoding of RAW-SYSs into DCDSs; (2) we argue that if the input RAW-SYS is bounded, then the DCDS obtained via the encoding is state-bounded in the sense of (Bagheri Hariri et al. 2013); (3) we formulate reachability over the input RAW-SYS as a verification problem over the corresponding DCDS; decidability is then obtained by (Bagheri Hariri et al. 2013), in which verification over state-bounded DCDSs has been shown to be decidable. The main guideline for the encoding is as follows. The DCDS data component is obtained by combining the global data store together with the local data stores. All such relations are augmented with an additional attribute that explicitly accounts for the provenance of each tuple, that is, the case id of the process that generated it (through actions). In addition, an accessory relation is added so as to track the control-flow state of each such instance. The dynamic component is obtained by introducing dedicated actions for the creation/dismissal of cases. In addition, each RAW-net is translated into a set of dedicated actions (one per transition in the net), following the same strategy as in (Hariri et al. 2014). The fact that the obtained DCDS is state-bounded, which is essential for decidability, derives from the size-boundedness of the input RAW-SYS, and from the fact that we focus on bounded Petri nets. As for the complexity, in general verification over state-bounded DCDSs requires exploring, in the worst case, a number of states that is exponential in the size of the initial database. However, in the case of reachability, this exploration can be done on-the-fly, using space that is polynomial in the size of the initial data store.

Thanks to the reduction to DCDS, we also get a stronger result: RAW-SYSs can be model checked against sophisticated temporal properties expressed in a first-order variant of μ-calculus called $\mu L_p$ (Bagheri Hariri et al. 2013).

Corollary 1 Verification of $\mu L_p$ properties over size-bounded RAW-SYSs is decidable.

We close this section by briefly discussing why size-bounded RAW-SYS are reasonable in practice. Bounding the number of simultaneously running instances can be seen as a sort of limited resources assumption: recalling our running example, the GoodsKit travel office has only n people taking care of reimbursement procedures, and hence, assuming each person to handle at most c cases in parallel, only m = n · c RB cases can be running at the same time. Still, an unbounded number of different RB cases can be executed during GoodsKit life. Bounding the size of the local and global data stores reflects the fact that the progression of cases depends only on a limited, i.e., not unbounded, amount of data. As for local data stores, size-boundedness immediately applies to all practical frameworks that rely on local variables as case data. Concerning the global data store, it can still be unbounded in practice: we only require a bound on the part used to decide about the progression of currently running cases. Therefore we can easily imagine an unbounded archival memory that is used only for other forms of analysis, such as reporting, auditing, and mining. The unboundedness of the archival memory does not undermine our decidability results as, from the verification perspective, it can be seen as a write-only component.

6 Implementing Verification using Planning

To support automated verification of RAW-SYS models from a practical point of view we encode the model and verification problem using the C action language (Lifschitz 1999; Gelfond and Lifschitz 1998). We selected this language because it enables a simple and clear encoding of both the dynamic and static (i.e., data constraints) aspects of RAW-SYS; moreover there are different state-of-the-art systems implementing it (e.g., the DLV^K planner (Eiter et al. 2003)). However, the same encoding can be easily adapted to different planners and similarly expressive planning representation languages, e.g., ADL (Calvanese et al. 2016). Because of space constraints we only sketch the main ideas of the encoding.

States are represented by means of so-called fluents, and the dynamics of the planning domain is specified by means of actions and rules. Actions may have preconditions, and rules, defining how fluents change, may mention them. This allows rules to be used as action postconditions.

The data component is encoded using a fluent for every relation name, and the initial data instance is the planner initial state. Integrity constraints of the data model are encoded by means of denial constraints; that is, rules with false in the head. To ensure separation between different processes, each relation includes an additional attribute holding the process id, and queries and constraints are modified in such a way that processes can only access local data, i.e., tuples marked with the corresponding process id. The control-flow of processes, defined by means of workflow nets, is encoded using an additional fluent (mrk) representing the marking (similar to what is done in (Di Francescomarino et al. 2015)).

A case transition associated to a guard/action pair is encoded with (i) a C action having the guard and the specific
marking as precondition, (ii) rules asserting straight facts if added by the effects and asserting negated facts if deleted by the effects, and (iii) rules taking care of updating the mrk fluent. Functional terms are not directly available in most of the implementations of C-based action languages, therefore we simulate them by means of predicates and denial constraints enforcing functionality. Moreover, nondeterministic selection of the value is forced by means of default negation through a commonly used ASP pattern of rules. Using the results outlined in Section 5 we can restrict to a domain of constants for the planning problem including those appearing in the initial instance, actions, and the (finite) set of constants obtained by abstracting the infinite elements from the original domain. The finiteness of the domain is guaranteed by the fact that the model is state-bounded.

Soundness and completeness of such an encoding w.r.t. the reachability problem are inspired by the results presented in (Calvanese et al. 2016). In particular, we establish a correspondence between state transitions in RAW-SYS and states of the action language specification.

The verification of reachability properties of the model is encoded with a query describing the goal state for the planner. These properties can be related to the state of the (global) data store as well as more general conditions over the dynamics of the system, e.g., the verification that there are no running processes can be performed by checking that no places are in the mrk relation (i.e., there are no tokens).

An experiment with a concrete planner. As a proof of the feasibility of the proposed approach we investigated the applicability of RAW-SYS with a concrete reasoning system (DLV^K) in the realistic setting of the RB example. Specifically, we tested the encoding of RB on different scenarios in which we varied the following two dimensions: (i) the size of the set of values of the active domain of employees and destinations (ADOM); specifically, ADOM ∈ {5, 10, 20}; (ii) the number of pending requests (PR); specifically, PR ∈ {2, 5, 10}. Considering the range of these parameters, we obtain a total of 8 scenarios per query. We focus on the following two existentially quantified queries (i.e., goals in the planning encoding):
Q1. a pending request has been processed;
Q2. a pending request has been accepted.

The experimentation has been performed on a PC running Windows 8 with 8GB RAM and a 2.4 GHz Intel Core i7, and the results averaged on 10 iterations. Table 3 shows the performance of RAW-SYS on the RB example: the time required by RAW-SYS for achieving the goal ranges from about 0.5s (482.2 ms) for Q1 with very small active domains, to about a couple of minutes (116554.4 ms) for Q2 with a size of active domains (ADOM) and pending requests (PR) which is realistic for a small enterprise. An execution time of two minutes looks reasonable for a verification system dealing with complex data that has not been optimized, thus giving a glimpse of the applicability of the approach to realistic scenarios. Moreover the required time seems to be not particularly affected by the number of pending requests (PR), while it seems to strongly depend on the size of ADOM.

Table 3: Performance results

Query   ADOM   PR   Time (ms)
Q1      5      2    482.2
Q1      5      5    499.2
Q1      10     2    3278.8
Q1      10     5    3950.4
Q1      10     10   3840.3
Q1      20     2    41660.1
Q1      20     5    64515.9
Q1      20     10   50019.9
Q2      5      2    982.2
Q2      5      5    9805.8
Q2      10     2    7143.2
Q2      10     5    7577.3
Q2      10     10   62415.5
Q2      20     2    90709.1
Q2      20     5    98616.2
Q2      20     10   116554.4

7 Related Work and Concluding Remarks

On the theoretical side, a number of works exist both in the area of data-aware processes and of (variants of) PNs. Unfortunately, when combining processes and data, verification problems suddenly become undecidable (Calvanese, De Giacomo, and Montali 2013). We can divide this literature in two streams. In the first stream, variants of PNs are enriched, by making tokens able to carry various forms of data, and by making transitions aware of such data, such as in CPNs (van der Aalst and Stahl 2011) or data variants such as (Structured) Data Nets (Badouel, Hélouët, and Morvan 2015; Lazić et al. 2007), ν-PNs (Rosa-Velardo and de Frutos-Escrig 2011) and Conceptual WF-nets with data (Sidorova, Stahl, and Trcka 2011). For full CPNs, reachability is undecidable, and decidability is usually obtained by imposing finiteness of color domains. Data variants instead weaken data-related aspects. Specifically, Data Nets and ν-PNs consider data as unary relations, while semistructured data tokens are limited to tree-shaped data structures. Also, for these models coverability is decidable, but reachability is not. The work in (Sidorova, Stahl, and Trcka 2011) considers data elements (e.g., Price) that can be used on transition preconditions. However, reasoning does not consider data values (e.g., 50$) but only whether the value is defined or undefined. The second stream contains proposals that take a different approach: instead of making the control-flow model increasingly data-aware, they consider standard data models and make them increasingly dynamics-aware. Notable examples are relational transducers (Abiteboul et al. 2000), active XML (Abiteboul, Segoufin, and Vianu 2009), the artifact-centric paradigm (Gerede, Bhattacharya, and Su 2007; Damaggio, Deutsch, and Vianu 2011; Bagheri Hariri et al. 2013), and DCDSs (Bagheri Hariri et al. 2013). Such works differ on the limitations imposed to achieve decidability, but they all lack an intuitive control-flow perspective. RAW-SYS instead directly combines a control-flow model based on PNs and standard data models (à la DCDS) as first-class citizens.

As future work, we plan to set up an extensive experimental evaluation optimizing our DLV^K encoding, as well as to tackle verification properties beyond reachability, allowed by the result of DCDS state boundedness, through an encoding in state-of-the-art model checkers.

Acknowledgement. This research has partially been carried out within the Euregio IPN12 KAOS, which is funded by the European Region Tyrol-South Tyrol-Trentino (EGTC) under the first call for basic research projects.
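The shape of the encoding described in Section 6 (a marking fluent, actions whose preconditions combine the guard with the required marking, add/delete effects, and denial constraints) can be illustrated on a toy instance. The following Python sketch is not DLV^K or C syntax and is not the encoding used in the experiments; the two-transition net, the relation names Pending and Accepted, the denial constraint, and all function names are assumptions introduced only for illustration. Reachability is checked with a plain breadth-first search rather than a planner.

```python
from collections import deque
from typing import Callable, FrozenSet, List, NamedTuple, Optional, Tuple

Fact = Tuple[str, str]        # e.g. ("mrk", "start") or ("Pending", "r1")
State = FrozenSet[Fact]       # a state is the set of fluents that hold


class Action(NamedTuple):
    name: str
    pre: Callable[[State], bool]   # guard plus required marking
    add: FrozenSet[Fact]           # facts asserted by the effects
    delete: FrozenSet[Fact]        # facts negated by the effects


def violates(state: State) -> bool:
    # Hypothetical denial constraint: no request is both pending and accepted.
    return any(("Accepted", f[1]) in state for f in state if f[0] == "Pending")


def apply(state: State, act: Action) -> Optional[State]:
    """Successor state, or None if the action is disabled or breaks a constraint."""
    if not act.pre(state):
        return None
    nxt = (state - act.delete) | act.add
    return None if violates(nxt) else nxt


def plan(init: State, actions: List[Action],
         goal: Callable[[State], bool]) -> Optional[List[str]]:
    """Breadth-first search for an action sequence that reaches a goal state."""
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        state, path = frontier.popleft()
        if goal(state):
            return path
        for act in actions:
            nxt = apply(state, act)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [act.name]))
    return None


# Toy net (assumed): start --check--> review --accept--> end.
ACTIONS = [
    Action("check",
           pre=lambda s: ("mrk", "start") in s,
           add=frozenset({("mrk", "review")}),
           delete=frozenset({("mrk", "start")})),
    Action("accept",
           pre=lambda s: ("mrk", "review") in s and ("Pending", "r1") in s,
           add=frozenset({("mrk", "end"), ("Accepted", "r1")}),
           delete=frozenset({("mrk", "review"), ("Pending", "r1")})),
]
INIT = frozenset({("mrk", "start"), ("Pending", "r1")})


def accepted(state: State) -> bool:
    return ("Accepted", "r1") in state
```

Calling plan(INIT, ACTIONS, accepted) searches for a run in which the pending request is accepted, in the spirit of query Q2. The search terminates here only because the toy state space is finite; in the paper this role is played by state-boundedness together with the finite abstraction of the domain.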

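The counter simulation used in the proof sketches of Theorems 1 and 2 (a unary relation per counter, an increment that adds a fresh value, and a conditional decrement that either detects emptiness or removes an arbitrary element) can be replayed in a few lines of Python. This is only a sketch under assumptions: the program format, the function names run and newval, and the example program are hypothetical, and a step budget is imposed because halting of a 2CM is undecidable in general.

```python
import itertools
from typing import Dict, Set, Tuple

# Instruction formats (assumed for this sketch):
#   ("inc", counter, next_state)
#   ("dec", counter, state_if_zero, state_if_positive)
Program = Dict[str, tuple]

_fresh = itertools.count()


def newval() -> int:
    """Stand-in for the fresh service call: returns a value never returned before."""
    return next(_fresh)


def run(program: Program, start: str, halt: str,
        max_steps: int = 10_000) -> Tuple[str, int, int]:
    """Execute the 2CM, storing each counter as a set of fresh values."""
    C: Dict[int, Set[int]] = {1: set(), 2: set()}   # unary relations C1 and C2
    state = start
    for _ in range(max_steps):
        if state == halt:
            return state, len(C[1]), len(C[2])
        instr = program[state]
        if instr[0] == "inc":
            _, c, nxt = instr
            C[c].add(newval())          # add{C(newval())}
            state = nxt
        else:
            _, c, if_zero, if_pos = instr
            if not C[c]:                # guard: no x with C(x)
                state = if_zero         # no-op, only the control state changes
            else:
                C[c].pop()              # guard C(x) picks some x; del{C(x)}
                state = if_pos
    raise RuntimeError("step budget exhausted")


# Example program (assumed): set c1 to 3, then move its content to c2.
PROGRAM: Program = {
    "k0": ("inc", 1, "k1"),
    "k1": ("inc", 1, "k2"),
    "k2": ("inc", 1, "k3"),
    "k3": ("dec", 1, "halt", "k4"),
    "k4": ("inc", 2, "k3"),
}
```

Only the cardinality of each relation matters: which element the decrement removes is irrelevant, which is why a nondeterministic pick suffices in the net fragment.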
References

Abiteboul, S.; Vianu, V.; Fordham, B. S.; and Yesha, Y. 2000. Relational transducers for electronic commerce. Journal of Computer and System Sciences 61(2):236–269.
Abiteboul, S.; Segoufin, L.; and Vianu, V. 2009. Modeling and verifying Active XML artifacts. Bull. of the IEEE Computer Society Technical Committee on Data Engineering 32(3):10–15.
Badouel, E.; Hélouët, L.; and Morvan, C. 2015. Petri nets with semi-structured data. In Proc. of 36th International Conference on Application and Theory of Petri Nets and Concurrency.
Bagheri Hariri, B.; Calvanese, D.; De Giacomo, G.; Deutsch, A.; and Montali, M. 2013. Verification of relational data-centric dynamic systems with external services. In 32nd Symposium on Principles of Database Systems (PODS '13), 163–174. ACM.
Belardinelli, F.; Lomuscio, A.; and Patrizi, F. 2012. An abstraction technique for the verification of artifact-centric systems. In Proc. of 13th Int. Conf. on Principles of Knowledge Representation and Reasoning, KR 2012, 319–328.
Calvanese, D.; Montali, M.; Patrizi, F.; and Stawowy, M. 2016. Plan synthesis for knowledge and action bases. In Proc. of the 25th Int. Joint Conf. on Artificial Intelligence (IJCAI 2016). AAAI Press.
Calvanese, D.; De Giacomo, G.; and Montali, M. 2013. Foundations of data-aware process analysis: A database theory perspective. In 32nd Symposium on Principles of Database Systems (PODS '13), 1–12. ACM.
Damaggio, E.; Deutsch, A.; and Vianu, V. 2011. Artifact systems with data dependencies and arithmetic. In Proc. of the 14th Int. Conf. on Database Theory (ICDT 2011), 66–77.
De Giacomo, G.; Lesperance, Y.; and Patrizi, F. 2012. Bounded situation calculus action theories and decidable verification. In 13th Int. Conf. on Principles of Knowledge Representation and Reasoning, KR 2012, 467–477.
Di Francescomarino, C.; Ghidini, C.; Tessaris, S.; and Sandoval, I. V. 2015. Completing workflow traces using action languages. In Proc. of 27th Int. Conf. on Advanced Information Systems Engineering (CAiSE 2015), volume 9097 of LNAI, 314–330. Springer.
Dijkman, R. M.; Dumas, M.; and Ouyang, C. 2008. Semantics and analysis of business process models in bpmn. Information and Software Technology 50(12):1281–1294.
Eiter, T.; Faber, W.; Leone, N.; Pfeifer, G.; and Polleres, A. 2003. A logic programming approach to knowledge-state planning, II: The DLV^K system. Artificial Intelligence 144(1-2):157–211.
Gelfond, M., and Lifschitz, V. 1998. Action Languages. Electronic Transactions on AI 2(3-4):193–210.
Gerede, C. E.; Bhattacharya, K.; and Su, J. 2007. Static analysis of business artifact-centric operational models. In IEEE Int. Conf. on Service-Oriented Computing and Applications, SOCA 2007, 133–140. IEEE Computer Society.
Hariri, B. B.; Calvanese, D.; Montali, M.; and Deutsch, A. 2014. State-boundedness in data-aware dynamic systems. In Proc. of 14th Int. Conf. on the Principles of Knowledge Representation and Reasoning, KR 2014. AAAI Press.
Hull, R., and Motahari Nezhad, H. R. 2016. Rethinking BPM in a cognitive world: Transforming how we learn and perform business processes. In 14th Int. Conf. on Business Process Management, BPM 2016, volume 9850 of LNCS, 3–19. Springer.
Hull, R. 2008. Artifact-centric business process models: Brief survey of research results and challenges. In Proceedings of the OTM 2008 Confederated International Conferences, volume 5332 of LNCS, 1152–1163. Springer.
Kappel, G.; Kapsammer, E.; and Retschitzegger, W. 2004. Integrating xml and relational database systems. World Wide Web 7(4):343–384.
Kiepuszewski, B.; ter Hofstede, A. H. M.; and Bussler, C. J. 2013. On structured workflow modelling. In 12th Int. Conf on Advanced Information Systems Engineering (CAiSE 2000), volume 1789 of LNCS. Springer. 431–445.
Lazić, R.; Newcomb, T.; Ouaknine, J.; Roscoe, A. W.; and Worrell, J. 2007. Nets with Tokens Which Carry Data. In 28th Int. Conf. on Applications and Theory of Petri Nets and Other Models of Concurrency, (ICATPN 2007), volume 4546 of LNCS, 301–320. Springer.
Lifschitz, V. 1999. Action languages, answer sets and planning. In The Logic Programming Paradigm: a 25-Year Perspective. Springer Verlag. 357–373.
Meyer, A.; Smirnov, S.; and Weske, M. 2011. Data in business processes. Technical Report 50, Hasso-Plattner-Institut for IT Systems Engineering, Universität Potsdam.
Minsky, M. L. 1967. Computation: Finite and Infinite Machines. Prentice-Hall, Inc.
Montali, M., and Calvanese, D. 2016. Soundness of data-aware, case-centric processes. International Journal on Software Tools for Technology Transfer 18(5):535–558.
Reichert, M. 2012. Process and data: Two sides of the same coin? In Proceedings of the OTM 2012 Confederated International Conferences, volume 7565 of LNCS, 2–19. Springer.
Rosa-Velardo, F., and de Frutos-Escrig, D. 2011. Decidability and complexity of petri nets with unordered data. Theoretical Computer Science 412(34):4439–4451.
Sidorova, N.; Stahl, C.; and Trcka, N. 2011. Soundness verification for conceptual workflow nets with data: Early detection of errors with the most precision possible. Information Systems 36(7):1026–1043.
ter Hofstede, A. H. M.; van der Aalst, W. M. P.; Adams, M.; and Russell, N., eds. 2010. Modern Business Process Automation - YAWL and its Support Environment. Springer.
van der Aalst, W., and Stahl, C. 2011. Modeling Business Processes: A Petri Net-Oriented Approach. MIT Press.
Van Der Aalst, W. M. P. 1998. The application of petri nets to workflow management. Journal of Circuits, Systems and Computers 08:21–66.
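The chain-of-tickets construction from the proof sketch of Theorem 3 can also be mimicked in Python. In this sketch the "global store" of a chain holds only the top and bottom tickets, each running case remembers its own ticket and its previous ticket, and termination is accepted only for the case at the top. The class and method names are hypothetical, the dictionary of running cases is simulator bookkeeping rather than a data store, and the zero test is simplified to a comparison of top and bottom without terminating any case, which is an assumption on my part and not necessarily how the full proof handles it.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional

_tickets = itertools.count(1)   # source of fresh ticket identifiers


@dataclass
class Case:
    ticket: int                 # fresh ticket exposed by the running instance
    previous: Optional[int]     # local "previous ticket" relation


class Chain:
    """One counter, encoded as a chain of running cases (value = length - 1)."""

    def __init__(self) -> None:
        self.running: Dict[int, Case] = {}    # bookkeeping only, not a data store
        first = self._create(previous=None)
        self.top = self.bottom = first.ticket  # bounded global information

    def _create(self, previous: Optional[int]) -> Case:
        case = Case(next(_tickets), previous)
        self.running[case.ticket] = case
        return case

    def increment(self) -> None:
        # A new case records the old top locally and becomes the new top.
        case = self._create(previous=self.top)
        self.top = case.ticket

    def try_terminate(self, ticket: int) -> bool:
        """Decrement: only the top case (if it is not the bottom) may terminate."""
        if ticket != self.top or ticket == self.bottom:
            return False                       # the update would violate a constraint
        case = self.running.pop(ticket)
        self.top = case.previous               # the previous case is the new top
        return True

    def is_zero(self) -> bool:
        return self.top == self.bottom

    def value(self) -> int:
        return len(self.running) - 1
```

A case that is not at the top can attempt to terminate at any time, but the attempt is rejected, which mirrors the use of global constraints to select the terminating instance.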
