
2012 European Intelligence and Security Informatics Conference

Association and Centrality in Criminal Networks


Rasmus Rosenqvist Petersen
University of Southern Denmark, The Maersk Mc-Kinney Moeller Institute
Email: rrp@mmmi.sdu.dk

Abstract: Network-based techniques are widely used in criminal investigations because patterns of association are actionable and understandable. Existing network models with nodes as first class entities and their related measures (e.g., social networks and centrality measures) are unable to capture and analyze the structural richness required to model and investigate criminal network entities and their associations. We demonstrate a need to rethink entity associations with one specific case (inspired by The Wire, a TV series about organized crime in Baltimore, United States) and corroborated by similar evidence from other cases. Our goal is to develop centrality measures for fragmented and non-navigational states of criminal network investigations. A network model with three basic first class entities is presented together with a topology of associations between network entities. We implement three of these associations and extend and test two centrality measures using CrimeFighter Investigator, a novel tool for criminal network investigation. Our findings show that the extended centrality measures offer new insights into criminal networks.

I. INTRODUCTION

Network-based techniques are widely used in crime investigations because patterns of association are actionable and understandable. Target-centric investigation, where a group of people shares and restructures information in a common information space in order to coordinate or reach consensus, is a special type of investigation. Criminal network information structures are by nature emergent and evolving, and a target-centric and iterative approach to tool support of this information domain is therefore suitable. Existing criminal network models with nodes as first class objects and their related measures (e.g., social networks and centrality measures) are unable to capture the structural richness required to model and investigate criminal network entities and their associations.

Our target-centric model for criminal network investigation is based on a model for intelligence analysis [1] and involves five processes: acquisition, synthesis, sense-making, dissemination, and cooperation (see [2] for a detailed description of the model). All individuals in the target-centric model are stakeholders: from information collectors (e.g., undercover agents and automated web crawlers) through information analysts (investigators) to decision-makers (intelligence customers).

We found that a target-centric approach is best for the investigations we have analyzed. The traditional alternative is a sequential approach where investigative processes guide the investigation. This sequential model is appealing to intelligence agencies and law enforcement since the exchange of information between individuals responsible for different processes can be controlled. However, such compartmentalization has been found
978-0-7695-4782-4/12 $26.00 © 2012 IEEE. DOI 10.1109/EISIC.2012.63

to cause intelligence failures for a number of high-profile investigations. Examples include the interrogations of the Iraqi defector Curveball who sought asylum in Germany and the subsequent invasion of Iraq in 2003 [3], [4], the investigation of links between Operation Crevice and the July 7th 2005 bombings in the United Kingdom [5], [6], and the investigation into the al-Qaeda organization prior to the September 11th 2001 attacks on the United States [7], [8].

In this paper, we present a criminal network model with three first class entities (node, link, and group) that supports emerging and evolving information structures. Based on a study of criminal network investigations we present a topology of entity associations that occur in these networks. We argue that relevant entity associations are not only direct (relationship) links, but could also be based on more semantic associations such as the spatial co-location of entities. Together, the network model and the topology of associations have guided our development of support for dealing with the uncertainty present in fragmented and partial networks. In this paper we use that ability to dynamically extend two measures of entity centrality in a network, degree and betweenness, and our results show that our approach provides investigators with new insights into criminal networks.

The CrimeFighter Investigator tool supports a target-centric and iterative approach to criminal network investigation. CrimeFighter Investigator is part of the CrimeFighter Toolbox for counterterrorism [9]. Besides the Investigator tool, CrimeFighter consists of the Explorer tool targeted at open source collection and processing and the Assistant tool targeted at advanced structural analysis and visualization.

The remainder of this paper is organized as follows: Section II discusses and defines the concepts on which our work is based. First a conceptual model defining three first class entities is presented.
Then, we review a criminal network investigation from The Wire, followed by a review of entity association and centrality. The section is concluded with a topology of criminal network entity associations. In Section III we describe how CrimeFighter Investigator supports dynamic extension of centrality algorithms with associations from our topology. Section IV tests and evaluates extensions of degree and betweenness centrality measures. Section V concludes the paper.

II. ENTITY ASSOCIATION AND CENTRALITY

The building blocks of criminal networks are information entities. Our network model (Figure 1) defines three such entities, namely information elements (nodes), relations (links),

and composites (groups). Nodes hold information about real-world objects. Investigators basically think in terms of people, places, things, and their relationships. We use rectangles as visual abstractions here for simplicity, but any symbol (circles, triangles, etc.) could have been used to illustrate different types of real-world objects. Links of different types and weights can associate information entities directly. Links have two endpoints, they can be either directed or undirected, and they have different visual abstractions (see Figure 1, middle). Composites are used to associate entities in subgroups. We work with two types of composites [2]: reference composites are used to group entities in the common information space, and inclusion composites can collapse and expand information to let investigators work with subspaces. The circles in Figure 1 indicate connection points for direct association of entities.

Fig. 1. Our network model's three first class entities: information elements (left), relations (middle), and composites (right). Points of direct association are indicated using circles.
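As a minimal illustration of the three first class entities, the following Python sketch models nodes, links, and composites as plain data classes. It is not the CrimeFighter Investigator implementation: the class names, field names, example entities, and the use of None for an endpoint that is not yet known are assumptions made here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Node:
    """Information element holding data about a real-world object."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Group:
    """Composite that associates entities in a subgroup."""
    name: str
    kind: str = "reference"  # assumed values: "reference" or "inclusion"
    members: list = field(default_factory=list)
    collapsed: bool = False  # only meaningful for inclusion composites


Entity = Union[Node, Group]


@dataclass
class Link:
    """Relation with two endpoints; None marks an endpoint that is unknown."""
    source: Optional[Entity]
    target: Optional[Entity]
    link_type: str = "unspecified"
    weight: float = 1.0
    directed: bool = False

    def has_empty_endpoint(self) -> bool:
        return self.source is None or self.target is None


# Hypothetical entities, purely for illustration.
suspect = Node("Suspect A", {"role": "distributor"})
associate = Node("Suspect B")
crew = Group("Crew", kind="inclusion", members=[suspect, associate])
known = Link(suspect, associate, link_type="acquaintance", weight=0.5)
unknown = Link(suspect, None, link_type="distributes-to", directed=True)
```

Treating the link as an object in its own right, rather than as a pair of node identifiers, is what allows it to carry a type, a weight, and a direction, and to exist before both endpoints are known.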

Information entities are normally synthesized in a classic nodes-and-links way before visualization. Typical network structures that form during this synthesis include hierarchical structures, cellular structures comprised of cohesive subgroups (cliques) connected by bridges, and flat (or fluid) structures where individual entities are distributed in some (more or less) random manner, maybe based on factions or their relationship with nearby nodes, or simply because of a more desirable visual layout. But criminal network structures are emergent and evolving and the networks go through many iterations after a target is selected until the structure types mentioned above emerge.

A large organization like al-Qaeda has evolved many entity structures. Sageman depicts al-Qaeda as four clusters with one leadership cluster, the Central Staff. After 1996, the Central Staff was no longer directly involved in terrorist operations, but the other three major clusters were connected to their Central Staff contacts by their lieutenants in the field [10]. Two of the al-Qaeda clusters are comprised of several cohesive subgroups, while the Southeast Asian cluster is more hierarchically structured, with a leader and a consultative council at the top. When the cluster was created it was divided into four geographical regions, and each region had several branches. All the network information was gathered from public domain sources: "documents and transcripts of legal proceedings [...], government documents, press and scholarly articles, and Internet articles" [10]. The synthesis of the elaborate list of data set attributes alone must have been quite a tedious and time-consuming task.

After 10 years of investigative journalism the Pearl Project published a report on the kidnapping and murder of Daniel

Pearl depicting five cells responsible for various tasks, with all cells connecting to the mastermind behind the kidnapping [11]. However, from the account of the official investigation we know how fragmented and inconsistent information about the kidnappers initially was [12], and from another account we get a vivid description of how investigations faced "the eternal problem of any investigation into Islamist groups or Al-Qaida in particular: the extreme difficulty of identifying, just identifying, these masters of disguise, one of whose techniques is to multiply names, false identities, and faces" [13]. Krebs's almost iconic network of 9/11 hijackers has been referenced widely [14]. It was aggregated based on open sources, but we don't know the intermediate states of the network prior to the published version. And we don't know the exact evidence that formed the links between the hijackers.

When investigations start, criminal network entities are often associated in other ways than through well established relationships to other entities. First, the entities are randomly positioned in the information space and maybe only a few are directly linked (e.g., the known accomplices of the target). Later, more entities are linked, groups are created, and structures emerge. During the first iterations, spatial associations like entity co-location play an important role. A spatial association with certain semantics could be entities placed in close proximity of each other to indicate a subgroup in the network or snippets of information about a certain individual. Or entities might be placed above and below each other to indicate hierarchical importance. And it may take many synthesis-sense-making iterations before it is clear what attributes (node meta data) are relevant as input for sense-making algorithms. In other words, "semantics happen" [15].
The network visualizations we see in magazines, newspapers, and scientific journals and proceedings are often created specifically for dissemination purposes. They tell very little about the investigative efforts required to synthesize and make sense of the respective networks. The networks therefore convey limited information to the reader about which processes, tasks, and techniques a tool for criminal network investigation should support.

A. The Wire: investigating organized crime

The Wire is a TV series, renowned for its authentic depiction of urban life on each side of the law¹. In the first season it is drug dealers on one side and law enforcement officers on the other [17]. The Wire is interesting as a security informatics case study for a number of reasons. First of all, the target-centric, board-based approach² chosen by the investigative
¹ The primary writers are David Simon and Ed Burns. Burns has worked as a Baltimore police detective for the homicide and narcotics divisions. Simon is an author and journalist who worked for the Baltimore Sun city desk for twelve years. He authored Homicide: A Year on the Killing Streets and co-authored The Corner: A Year in the Life of an Inner-City Neighborhood with Burns [16]. We have previously focused on policing and investigative journalism as two investigation types that could benefit from the concepts we develop and implement in CrimeFighter Investigator [2].
² We have previously described the advantages of a board-based approach for the planning domain, where information structures are also emergent and evolving (see [18]).


team maps well onto our criminal network investigation model [2]. Secondly, Analyst's Notebook [19], a commercial software tool for visualization and analysis of criminal networks, is used to narrow down a list of suspects, based on a large number of intercepted phone calls. Finally, the show's ability to describe investigative context is exceptional. By context, we mean factors such as power, law enforcement culture, resources, and politics that ultimately can decide the success or failure of investigations [20].

The Barksdale organization is a hierarchical and somewhat flat structure that maintains a top-down chain of command (see [16], [21]). The top consists of the leader Avon Barksdale, his second-in-command Stringer Bell who administrates and manages the organization, and Avon's sister Briana Barksdale, who is responsible for the financial side together with Stringer. Maurice Levy is the organization's lawyer who offers legal advice and acts as defense lawyer for members of the organization. At the bottom of the organization are the drug selling crews: typically a crew is responsible for a high-rise, an area in the low-rises, or a street corner (so-called open-air drug markets [22]). Each crew has a chief and one or more high ranking lieutenants who control a number of dealers and runners, responsible for arranging a buy, getting the money, retrieving the drugs from a nearby location, and handing them over to the buyer. For communicating strategies and commands to the crews, the leadership (primarily Stringer) has lieutenants to enforce his commands (in season one Anton Artis and Roland Brice work as the lieutenants), and they in turn have their enforcers to whom they forward tasks. But Stringer Bell also shows up in person in the pit (nickname for the low-rises) to ask the crew chief to solve a specific task or follow a new strategy.
The first season begins with narcotics lieutenant Cedric Daniels being ordered to organize a detail of narcotics and homicide cops to take down Avon Barksdale's drug crew which runs the distribution of heroin in several of Baltimore's projects. Realizing that low-level buy-and-busts are getting them nowhere³, the detail of cops [...] add visual and audio surveillance to their law enforcement tools [20]. The team is provided with office space in the basement, from where they can work the case and monitor the many wires they set up in an attempt to map out the network of individuals in the Barksdale organization. A senior police officer, recognizing that all the pieces matter, is put in charge of information collection and processing and he starts adding snippets of information onto the investigation board shown in Figure 2a, which functions as the team's common information space. Figure 2b shows some of the information entities used on the investigation board.
³ After years of random buy-and-bust interventions, law-enforcement controls of serious crime networks have gradually come to follow the key player strategy [23]. Morselli follows up by stating that a more accurate appraisal of the social organization of drug-trafficking [...] would follow a resource-sharing model in which collaboration among resourceful individuals would be at the base of coordination in such operations [23]. We find that this is also the approach taken by the investigators in The Wire by targeting not only Avon Barksdale but a range of important individuals in and around the decision-making body of the organization.

There are polaroid close-ups of individuals, and two types of text cards: one with meta information about entities and one functioning as headers. In the middle there is a surveillance photo and at the bottom a newspaper clipping. We have defined the following four information entities used on the investigation board and use colored rectangles to represent them in Figure 2c: portrait pictures are blue, large surveillance photos are orange, text cards with meta data about individuals are green, and header text cards with red text are dark red. Based on this augmentation of the investigation board we observe a number of semantics. Most obviously, all portrait polaroid pictures are placed below a meta data text card. Sometimes a surveillance photo is placed next to the portraits. Finally, the investigation board is divided horizontally into areas by the header text cards placed at the top.

Based on The Wire and other reviewed cases⁴, we define three tool requirements describing investigative needs that we aim to support:

1) When node-link-node associations are not dominant, semantic associations will reduce investigation uncertainty through the computation of extended centrality measures.
2) Centrality measures for criminal network entities must support empty endpoint associations for more accurate results.
3) Support for a combination of several direct and semantic associations can be necessary when computing centrality measures for criminal network entities.

B. Entity association

During target-centric criminal network investigations, the investigative team adds information pieces as they are discovered, and information structures emerge step by step as entities are associated. We have observed that initially the information entities are placed randomly in an information space. If a new entity is somehow associated with an entity already in the shared information space, then it is positioned next to that entity (co-located).
Later, some co-located entities are directly associated using link entities, because the investigators have learned the nature of the relationship between the entities. Depending on the level of time criticality (e.g., high security risk), a decision has to be made at some point. When the network is fragmented and incomplete such decision-making can be a challenging task due to the uncertainty. Sense-making
⁴ Several criminal network investigations have inspired our work. The investigation of Daniel Pearl's kidnapping and murder was target-centric and used large pieces of paper on a wall to synthesize information entities as they were discovered [11]–[13]. The investigation by the Federal Bureau of Investigation to locate and arrest the 9/11 mastermind Khalid Sheikh Mohammed (both before and after the attacks) was conducted in a target-centric manner and always with a focus on gathering evidence, both for later potential trials and to map and understand the network of individuals, events, and places that was emerging [7]. Researchers and writers Strick van Linschoten and Kuehn have been mapping a network of Afghan Taliban to investigate their associations with the Afghan Arabs from 1970 to 2010 [24]. They use Tinderbox for their mapping efforts [25]. Tinderbox is a software tool that takes a board-based approach to synthesis of networks and supports multiple structures [26].


(a) investigation board

(b) information entities

(c) augmented investigation board

Fig. 2. The Wire case - a shared information space, in this case a physical board (left), with different types of information entities (right). Close-up pictures are blue, surveillance photos are orange, text cards with meta information about individuals are green and text cards functioning as headers are dark red.

algorithms are often applied to assist investigators in making these decisions and we discuss measures of centrality for individual network entities below.

Information entity associations form information structures and centralities are computed based on these associations. Consequently, associations impact the measures of centrality we want to calculate. Criminal network investigation has to a large degree so far focused on the direct association of nodes. Links are seldom first class objects in the terrorism domain models with the same properties as nodes. This is in contrast to the fact that the links between the nodes provide at least as much relevant information about the network as the nodes themselves [27]. The nodes and links of criminal networks are often laid out at the same level in the information space when the network is visualized. Composites (groups) are first class entities that add depth to the information space. For investigative purposes navigable structures and entities (including composites) are useful for synthesis tasks such as manipulating, re-structuring, and grouping entities. Our understanding of information links (relations) and groups (composites) is based on hypertext research [2].

C. Entity centrality

Measures of centrality have been developed for different types of networks. Most prominent are social network analysis techniques (see [23], [28], [29]) that can measure the centrality of entities in criminal networks based on their direct and indirect associations to other entities in the network. But although the premise that centrality is an indication of importance, influence, or control in a network may appear valid, it is also contestable, particularly in criminal contexts. [...] What does it mean to be central in a criminal network? [23]. We argue that centrality is dependent on the specific criminal network being investigated.
It depends on the associations between entities that investigators deem important, and it depends on the weights of those associations. Furthermore, the accuracy of centrality measures depends on the investigators' ability to embed their tacit knowledge and novel associations into centrality algorithms. We review a selection of techniques below, which we find to be relevant for criminal network

analysis on the above-mentioned premises.

An entity is central when it has many associations to other entities in the network. This kind of centrality is measured by the degree of the entity and is also known as local centrality since only entities at a distance of 1 or 2 links are included. The higher the degree, the more central the entity. For networks with directed links, both in-degree and out-degree centrality can be measured, that is, the number of incoming and outgoing links an entity has. A network with high degrees of both is a highly cohesive network.

Usually, not all entities are connected to each other in a network. Therefore, a path from one entity to another may go through one or more intermediate entities. Betweenness centrality is measured as the frequency of occurrence of an entity on the geodesics (shortest paths) connecting other pairs of entities. A high frequency indicates a central entity. These entities bridge networks, clusters, and subgroups: "betweenness centrality fleshes out the intermediaries or the brokers within a network" [23].

Closeness, also known as global centrality, indicates whether or not an entity has easy access to other entities in the network. Eigenvector centrality is like a recursive version of degree centrality where an entity is central to the extent that the entity is connected to other entities that are central.

Specific techniques for terrorist network analysis often take the mentioned centrality measures as input to their computations. Examples include measures of link importance based on secrecy and efficiency [9], the prediction of covert network structure, missing links, and missing key players [30], and custom-made techniques developed by investigators to target network-specific analysis tasks, such as the node removal technique described in [31].

D. Hypertext and semantic web technology

Hypertext systems aim at augmenting human intellect, i.e., increasing the capability of man to approach a complex problem situation, to gain comprehension to suit particular needs, and to derive solutions to problems [32]. CrimeFighter Investigator supports a range of domain-independent hypertext structures that are used to support synthesis of information entities: navigational structures allow arbitrary pieces of information (entities) to be linked (associated, see discussion above); spatial structures were designed to deal with emergent and evolving structures of information, which is a central task in information analysis; taxonomic structures can support various classification tasks. In the context of criminal network investigation, spatial structures are useful in various synthesis, sense-making, and dissemination tasks such as re-structuring, brainstorming, retracing the steps, creating alternative interpretations, and storytelling. Taxonomic structures are in essence hierarchical (tree) structures. Hierarchical structures are also known from other structuring domains (such as composites from the associative domain and collections from the spatial domain). In the context of investigation, taxonomic structures can provide a different visual (hierarchical) perspective of associative and spatial structures, hence supporting the exploring perspectives task of sense-making. See [2] for further details on the application of hypertext structures to criminal network investigation.

Semantic web concepts have many characteristics in common with our understanding of criminal network entities and their associations. Similar to centrality measures for criminal networks, semantic web concepts have been developed to measure the centrality of entities in online social networks. We are interested in analysis of complex systems in which nodes could be any object, relations (links) could be of any nature, and structures are generated by the users (investigators). Semantic web technology can explicitly model the interactions between individuals, places and things in complex systems of information entities, but classical social network analysis methods are typically applied to these semantic representations without fully exploiting their rich expressiveness [33].
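The contrast between a raw graph and a typed representation can be sketched in a few lines of Python. The sketch is illustrative only: the statements, entity names, and predicate names are hypothetical, and the counting function is a simple stand-in for a degree computation, not a method taken from [33].

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) statements, for illustration only.
TRIPLES = [
    ("A", "knows", "B"),
    ("A", "worksWith", "C"),
    ("B", "locatedAt", "Corner"),
    ("C", "knows", "B"),
]


def degree(triples, predicates=None):
    """Count, per entity, the distinct entities it is associated with.

    If `predicates` is given, only statements whose predicate is in that
    set are counted; otherwise every statement counts as an untyped link.
    """
    neighbours = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicates is not None and predicate not in predicates:
            continue
        neighbours[subject].add(obj)
        neighbours[obj].add(subject)
    return {entity: len(others) for entity, others in neighbours.items()}


untyped = degree(TRIPLES)               # every statement is just a link
typed = degree(TRIPLES, {"knows"})      # only "knows" associations count
```

In the untyped count, entity B is associated with three entities; restricted to the assumed "knows" predicate, it is associated with two, and the place "Corner" drops out entirely. Which reading is appropriate depends on the associations the investigators deem important.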
A short summary of semantic web technology and a social network analysis example is given in [34]: "Semantic web [technologies] provide a graph model, a query language and type and definition systems to represent and exchange knowledge online. These [technologies] provide a [...] way of capturing social networks in much richer structures than raw graphs. Several ontologies can be used to represent social networks. The most popular is FOAF⁵, used for describing people, their relationships and their activity. A large set of properties is dedicated to the definition of a user profile: family name, nick, interest, etc. The knows property is used to connect people and to build a social network. [...] The properties in the RELATIONSHIP⁶ ontology specialize the knows property of FOAF to type relationships in a social network more precisely (familial, friendship, or professional relationships). For instance the relation livesWith specializes the relation knows." We believe that the outlined approach can be adopted and extended to support other association types such as the
⁵ http://www.foaf-project.org/
⁶ http://vocab.org/relationship/

Fig. 3. Queries that extract the degree centrality of [individuals] linked by the property foaf:knows and its specialization relationship:worksWith [34].

semantic associations described below.

E. Topology of associations

Based on the concepts of centrality and association, we outline a topology of associations between criminal network entities which impact the centrality of individual entities to varying degrees. Our topology is divided into direct and semantic associations (see Figures 4 and 5).

Direct associations are expressed using link entities. The link may be weak by weight (low), by type (rumor, acquaintance, one-visit-to, etc.), or by evidence (uncorroborated, questionable newspaper, etc.), but it is nonetheless interpreted as a direct association by sense-making algorithms and in visualizations. Semantic associations between criminal network entities are built incrementally based on the tacit knowledge of investigators and the investigation domain their target operates within. Initially, investigators express information via visual or textual means and later formalize that [information] in the form of attributes, values, types, and relations [15].

The visual symbol for direct associations is a thick solid line, and thin solid circles indicate entity connection points. The visual symbol for semantic associations is a dashed line and dashed circles indicate connection points. We realize that some of these associations are more relevant than others, and it is exactly this relevance of alternative associations that we are investigating in this paper.

In Figures 4a to 4c, we show three classic associations: the node-link-node association is the most frequently used (4a), together with the less frequently used node-link-group (4b) and group-link-group (4c) associations. Figures 4d to 4g show four examples of direct associations that occur in criminal network investigations, but are not included when entity centrality is computed.
A link could be the target of an investigation, e.g., Daniel Pearl was investigating whether or not there was a link between Richard Reid (the shoe bomber) and the leader of a local radical Islamist group [12]. Other examples include knowledge about a money transfer between two individuals or that a third individual had seen them talk at the same location on numerous occasions (Figure 4d). The empty endpoint is another example of a direct association that occurs in criminal network investigations, but is not (directly) addressed by traditional centrality algorithms. The need to include empty endpoints in centrality is straightforward: if investigators know that someone is distributing drugs to three individuals, e.g., based on wire taps, but they don't know who those individuals are, then an empty endpoint can be used until their identities become clear. This could be the case for both nodes and groups (see Figure 4e and 4g). Finally, direct associations between entities outside groups and entities inside groups are needed


(a) node-node (b) node-group (c) group-group (d) link-link (e) empty endpoint I (f) node-sub node (g) empty endpoint II

Fig. 4. Direct associations in our topology include classic associations (a-c) and novel associations in terms of centrality measures (d-g).

(a) clique I (b) clique II (c) meta data (d) sequential (e) group-subgroup (f) node-subnode (g) node below

Fig. 5. Semantic associations in our topology include spatial associations (a-d) and hierarchical associations (e-g).

(both for reference and inclusion composites, see Figure 4f). When criminal network investigators start grouping entities, structures where entities outside the group are linked to entities inside the group might emerge. But the relation is still associated with that entity in the subgroup.

The semantic co-location association should be used carefully by investigators. If the investigators position entities near each other spatially because they are assumed to be related somehow, then it will make sense to use spatially based associations. But if not, then it will simply clutter the network with non-relevant relations. If entities are placed near each other or as overlapping entities it could mean that they are forming a sort of clique (Figure 5a and 5b). Also, as is the case on the analyzed investigation board from The Wire, positioning entities next to or around a (centered) entity could mean that the information entities are meta data about the centered entity (Figure 5c). Entities positioned next to each other horizontally or vertically could mean that the entities represent a sequence (Figure 5d).

Semantic hierarchical associations can occur either when composites are used or when information entities are positioned spatially in a manner that resembles that of a hierarchy. If a group contains single information entities and subgroups, the single entities must have some sort of relationship to the entities in the subgroups since their overall classification is the same (Figure 5e). Also it could be that a single entity is associated with a composite (group) and therefore might have some sort of relation with entities within that composite (Figure 5f). Finally, positioning entities in spatial hierarchies as shown in Figure 5g indicates that entities below other entities represent sub entities.

The topology of associations can be seen as a wish list of requirements for what an investigative tool should support in this regard.
The topology is not exhaustive; we expect to uncover additional associations over time. Especially new semantic associations based on temporal distance (when individuals appear on an investigation time line together with other individuals and events etc.), distance between entities in the real world, distance in family ties, and so on.
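The semantic co-location association can be illustrated with a simple distance test on the investigation board. The following Python sketch is only an illustration under stated assumptions: the function name `colocated_pairs`, the circular boundary given by `radius`, and the board coordinates are all hypothetical and are not taken from the analyzed investigation.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def colocated_pairs(positions, radius):
    """Return name pairs whose board positions lie within `radius` of each other.

    `positions` maps an entity name to its (x, y) position on the board.
    A circular boundary is an assumption; other shapes are equally possible.
    """
    pairs = []
    # Sorting the items makes the output order deterministic.
    for (a, pos_a), (b, pos_b) in combinations(sorted(positions.items()), 2):
        if dist(pos_a, pos_b) <= radius:
            pairs.append((a, b))
    return pairs


# Hypothetical board layout (illustrative coordinates only).
board = {"A.A.": (0.0, 0.0), "2": (1.0, 0.0), "5": (5.0, 5.0), "R.B.": (1.0, 1.0)}
print(colocated_pairs(board, radius=1.5))
```

With this layout, individual 5 lies outside the boundary of every other entity and therefore receives no co-location association, while the other three entities are pairwise associated.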

III. CRIMEFIGHTER INVESTIGATOR

CrimeFighter Investigator [2], [35] is based on a number of concepts (see Figure 6). At the center is a shared information space. Spatial hypertext research has inspired the features of the shared information space, including the support of investigation history [2]. The view concept provides investigators with different perspectives on the information in the space and provides alternative interaction options with information (hierarchical view to the left (top); satellite view to the left (bottom); spatial view at the center; algorithm output view to the right). Finally, a structural parser assists the investigators by relating otherwise unrelated information in different ways, either based on the entities themselves or by applying algorithms to analyze them (see the algorithm output view to the right). In the following, central CrimeFighter Investigator features supporting measures of centrality are presented.

Fig. 6. CrimeFighter Investigator showing an altered version of the investigation board from The Wire.

A. Extending centrality algorithms with new associations

The classic centrality algorithms have been extended by adding an analysis step prior to the existing steps. Our implemented betweenness algorithm (described in [31]), with the extra step for the selected centrality extension(s), works as follows:

1) Pre-analysis: The algorithm analyzes whether or not the included association types appear in the criminal network. If they do, then changes are temporarily made to the network accordingly.
2) List all entity pairs: This step creates a list of all entity pairs that exist in the network, again based on the included associations. This means that if the direct node-group association is included, then all entities that are directly or indirectly (by association through intermediary entities) associated to the group with links are added to the list of entity pairs.
3) List all shortest path(s) for each entity pair: We calculate the shortest path(s) for all entity pairs without considering the cost-efficiency of our algorithm: we take a breadth-first, brute-force approach [36], visiting all nodes at depth d before visiting nodes at depth d + 1, removing all loops and all paths to the destination node longer than the shortest path(s) in the set, until only the shortest path(s) remain.
4) Node occurrence: We calculate the ratio by which each node in the network appears in the accumulated set of shortest path(s).
5) Bubble sort: The results are sorted according to the user's choice, usually descending with the highest centrality first.
6) Generate report: If the user requests it, a PDF report is generated for easy dissemination of the results of the centrality measure. The user can decide what report elements to include.

Pre-analysis is the algorithm step of primary interest to the work presented here. For the direct empty endpoint association, pre-analysis involves adding temporary information elements as placeholders of empty endpoints. For the semantic co-location association, we create a temporary relation between two entities if they are not already related and they are within the user-defined boundaries of each other (see Figure 7).

B. Customizing sense-making and sense-making algorithms

CrimeFighter Investigator algorithms are managed using a structural parser, where investigators can select different algorithms to run and control the order in which they are executed, for example either simultaneously or sequentially. Figure 8 (left) shows how individual centrality algorithms can be customized by the user. The user must decide how to run an algorithm (Figure 8a) and what entities to include for the respective centrality algorithm (Figure 8b). This is done using drag and drop between two defined areas as shown in Figure 8 (right, top frame). For included entities the user can set a weight (maybe a location counts less than a person for a measure of betweenness centrality), and for excluded entities the user decides how the algorithm should deal with them, e.g., when tracing a shortest path: should it discard that shortest path, or simply ignore the entity and continue along the path? Direct and semantic associations are included or excluded using the same drag and drop approach as for entities (see Figures 8c and 8d). Again, weights can be set up for included associations, as can the algorithm's action(s) for excluded associations.

Finally, we imagine many settings for how to format and list results (Figure 8e). Typically, normalization is important for comparison of results. If an investigation has many of the included entities, it can be useful to display only, for example, 10 results based on some parameter, e.g., highest centrality.

(a) without (b) with (c) without (d) with

Fig. 7. The two implemented algorithm extensions, the empty endpoint association and the co-location association. Without the empty endpoint association, the link from the empty endpoint to the connected entity is not included in measures of betweenness centrality, and degree centrality is not calculated for the empty endpoint (a); with that association, the link is included (b). Without the co-location association, entities positioned near each other in the information space are not included in measures of centrality (c); if entities fall within the boundaries defined by the investigators and the association is included, then those entities are included in measures of centrality (d).
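The pre-analysis step for empty endpoints, followed by the brute-force shortest-path counting of the betweenness algorithm, might be sketched in Python as follows. This is not the CrimeFighter Investigator implementation: the function names, the use of `None` for an empty endpoint, the placeholder naming, and the choice to count only intermediate nodes on each path are assumptions made for illustration. The resulting score is the plain share of shortest paths a node lies on, which differs from Freeman's per-pair normalized betweenness. Links produced by a co-location test could be appended to the link list before the call.

```python
from collections import deque
from itertools import combinations


def with_placeholders(links):
    """Pre-analysis: replace each empty endpoint (None) with a temporary node."""
    fixed, counter = [], 0
    for a, b in links:
        if a is None:
            counter += 1
            a = f"placeholder-{counter}"
        if b is None:
            counter += 1
            b = f"placeholder-{counter}"
        fixed.append((a, b))
    return fixed


def shortest_paths(adj, src, dst):
    """Enumerate all shortest loop-free paths from src to dst, one depth at a time."""
    found, frontier = [], deque([[src]])
    while frontier and not found:      # stop at the first depth that reaches dst
        next_frontier = deque()
        for path in frontier:
            for nxt in adj[path[-1]]:
                if nxt in path:        # drop loops
                    continue
                if nxt == dst:
                    found.append(path + [nxt])
                else:
                    next_frontier.append(path + [nxt])
        frontier = next_frontier
    return found


def betweenness(links):
    """Share of all shortest paths on which each node occurs as an intermediate node."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    paths = []
    for s, t in combinations(sorted(adj), 2):   # list all entity pairs
        paths.extend(shortest_paths(adj, s, t))
    counts = {node: 0 for node in adj}
    for path in paths:
        for node in path[1:-1]:                 # assumption: endpoints are not counted
            counts[node] += 1
    total = len(paths) or 1
    return {node: count / total for node, count in counts.items()}


# Hypothetical chain with one empty endpoint attached to R.B.
links = [("A.A.", "2"), ("2", "R.B."), ("R.B.", None)]
without = betweenness([link for link in links if None not in link])
with_assoc = betweenness(with_placeholders(links))
print(without["R.B."], with_assoc["R.B."])
```

In this toy chain, R.B. lies on no shortest path when the link with the empty endpoint is dropped, and on one third of all shortest paths once the placeholder is added, which mirrors the kind of shift the extension is meant to expose.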

Fig. 8. Setting up centrality algorithms using structural parser windows: the centrality algorithm settings window is shown on the left, and the window for inclusion and exclusion of entities together with specific settings for each of those entities is shown on the right.
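The customization options (entity weights, exclusion of entities, normalization, and limiting the number of listed results) can be illustrated with a small Python sketch of a configurable degree centrality. All names and parameters here are hypothetical: the rule that a link touching an excluded entity is skipped, the normalization by n - 1, and the use of Python's built-in sort instead of a bubble sort are assumptions, not the tool's actual behavior.

```python
def degree_centrality(links, weights=None, excluded=(), normalize=True, top=None):
    """Weighted degree per entity, optionally normalized and cut to `top` results.

    weights:   maps an entity to the weight it contributes to its neighbours
               (default 1.0), e.g. a location may count less than a person.
    excluded:  entities left out; links touching them are skipped (assumed policy).
    normalize: divide by (n - 1), with n the number of entities that were scored.
    top:       number of results to list, highest centrality first.
    """
    weights = weights or {}
    scores = {}
    for a, b in links:
        if a in excluded or b in excluded:
            continue
        for node, other in ((a, b), (b, a)):
            scores[node] = scores.get(node, 0.0) + weights.get(other, 1.0)
    if normalize and len(scores) > 1:
        n = len(scores) - 1
        scores = {node: value / n for node, value in scores.items()}
    # Descending by score; ties are broken by name for a stable listing.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked if top is None else ranked[:top]


# Hypothetical network; "corner" stands for a location weighted below a person.
links = [("A.A.", "2"), ("A.A.", "corner"), ("2", "R.B."), ("R.B.", "5")]
print(degree_centrality(links, weights={"corner": 0.5}, top=3))
```

Lowering the weight of the location reduces the score of A.A., its only neighbour, so A.A. is listed third rather than tied with individuals 2 and R.B.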

It is currently possible to set the visual symbols for the information space and the algorithm view (see Figure 8f). For the information space, the user can decide whether or not to overlay entities with a geometric shape (circle, square, or rectangle) containing the calculated centrality (instead of just showing the results in the algorithm view). The color, size, and outline of the shape can be decided together with the font and font size of the printed centrality. For the algorithm view, it can be decided how to display the results textually in a list: a certain attribute can be printed (e.g., person name or email date), and the font (type, size, and color) can be set.

IV. EVALUATION

We have tested CrimeFighter Investigator's support of three tool requirements on a filtered version of the investigation from The Wire and a semi-altered version of the same investigation. We calculate two centrality measures, degree and betweenness, for two conditions, with and without two designed and implemented associations.

We test the co-location association on an investigation inspired by The Wire to evaluate the requirement for support of semantic associations. The investigation had no direct associations between entities prior to the test. We have filtered out all entities except the close-up photos (i.e., the blue rectangles) and created an investigation using CrimeFighter Investigator where individuals are positioned with the same relative distance. All individuals are given numbers or letters as names, except for the two lieutenants Anton Artis (A.A.) and Roland Brice (R.B.). The network with the semantic co-location association included is shown in Figure 9a and the calculated centralities are shown in Figure 9b.

Prior to testing the empty endpoint association we found that empty endpoints rarely occurred in the investigation we analyzed. Links are used to connect two entities, and even if the content of one entity is unknown it is still created as a placeholder. It is unclear whether this is simply because it does not make sense to work with empty endpoints or because of a structural bias toward links as simple entity connectors. To test the influence of the empty endpoint association we have used some of the links from the previous test to create a new test case (see Figure 6). We assume that a number of subgroups have been detected (the four colored composites) and that the investigators know there is some connection from the main network to each of these subgroups, but it is unclear how, and therefore an empty endpoint is positioned next to each subgroup.

To test the requirement for centrality measures to consider multiple associations, we use the same network as for the empty endpoint requirement (see Figure 6). However, this time we test both the empty endpoint association and the co-location association together. The "with" condition therefore means that the algorithm replaces empty endpoints with actual nodes (placeholders) and creates links between co-located nodes that are not already directly associated.

(a) test scenario 1 (b) co-location results (c) empty endpoint results (d) two associations results

Fig. 9. The Wire investigation with links representing co-location associations (a). The degree and betweenness centralities for each of three tests: co-location association (b), empty endpoints association (c), and both co-location and empty endpoints associations (d).

A. Discussion and summary of results

Testing the requirement for semantic associations illustrated how centrality measures can be applied to spatial network structures using a co-location association. It is evident that when no relations exist in an investigation prior to analysis, there is a need to define associations between entities in a different way if the investigators need to calculate node centrality to deal with the uncertainty of an ongoing investigation. We see that degree centrality indicates the individuals on the right-hand side in Figure 9b as central to the network (e.g., 9, 6, 8, and 10), but they are of little importance. At the same time, degree does not point to the two lieutenants A.A. or R.B. as key players as we expected. We therefore find that one should be careful about treating spatial co-location as a basis for network degree centrality. Betweenness centrality clearly points to A.A. and R.B. as key players in the network together with individual 2. Given the results of our two other tests, it is also interesting that individual 5 is placed in the top four in terms of betweenness.

When we tested the empty endpoints requirement we found that the measure of degree centrality provides investigators with no clear tendencies, although it more strongly indicates individuals F, D, A.A., and 3 as central to the network. The betweenness results more distinctly point to A.A. and R.B. when including the empty endpoint association. We also observe that individual 2 is ranked fourth instead of seventh, which is a more realistic depiction of this individual's betweenness in the network. Individual 5 has the highest change in betweenness when including empty endpoints, making him an interesting subject for further investigation. As mentioned earlier, it would be possible to model empty endpoints using information element placeholders until the content of the empty endpoint is known. This also means that traditional social network analysis measures of centrality could be applied. We therefore recommend testing whether empty endpoints have higher value for restructuring tasks during synthesis than for centrality algorithms.

Our test of the requirement for support of multiple associations was successful in terms of extending two measures of centrality with more than one association from our topology. But for the test investigation, the test results did not add much investigative value. The inclusion of both empty endpoint and co-location associations connects all entities in the criminal network through the empty endpoints (individual 5 is connected to individuals 6 and 12, individual F to individual H, and individual A.A. to individual M). This makes the degree and betweenness centrality of the nodes that were key without the associations less distinctive. The numbers are flattened because the information elements in the subgroups achieve higher measures of betweenness centrality with the associations included. The most interesting result of this final test was that the degree and betweenness centrality of individual 5 increased considerably when the associations were added.

Together, our three requirement tests have shown that measures of centrality extended with novel types of associations provided new insights into two organized crime networks that traditional centrality measures could not provide. The most important result was that the centrality of individual 5 increased in all three tests. Individual 5 was not known to be a central entity in the network before the tests.

V. CONCLUSION

We have presented two novel sense-making algorithms based on new interpretations of information entity association and centrality. The algorithms are extensions of classic social network analysis algorithms where the user can include and exclude specific entities and associations for analysis, to match the analysis with the structures they have built when investigating a criminal network. More specifically, this paper has three main contributions:

1) A novel network model with nodes, links, and groups as first class entities.
2) A topology of direct and semantic network entity associations based on an analysis of various criminal network investigations following a target-centric approach.
3) An implementation that supports three of these associations: the traditional node-link-node association, the novel empty endpoint association, and the semantic co-location association. The two novel associations have been tested on a criminal network investigation from The Wire and an altered version of that same investigation.

We can conclude that target-centric criminal network investigation creates structures that are not clear from the beginning of an investigation, and that in order to apply traditional centrality measures, associations other than node-link-node have to be supported.

We plan to implement support of the other associations in our topology in the near future. We would like to test them on real-world investigations (either post-crime or ongoing) to learn if and how they could provide useful insights into the investigated criminal networks. As an alternative to manually applying a specific radius or geometric shape to decide co-location association, it could be interesting to apply a standard machine learning algorithm that suggests co-location, not in terms of position on the investigation board, but based on temporal distance, physical distance in the real world, or distance in family ties.

REFERENCES
[1] R. Clark, Intelligence analysis: a target-centric approach. CQ Press, 2007.
[2] R. R. Petersen and U. K. Wiil, "Hypertext structures for investigative teams," in Proceedings of the 22nd ACM conference on hypertext. ACM Press, 2011, pp. 123-132.
[3] B. Drogin, Curveball. Ebury Press, 2008.
[4] T. Weiner, Legacy of Ashes: The History of the CIA. Anchor Books, 2008.
[5] "Could 7/7 have been prevented? Review of the intelligence on the London terrorist attacks on 7 July 2005," Intelligence and Security Committee, United Kingdom, 2009.
[6] R. R. Petersen, "Presentation of CrimeFighter Investigator." British Home Office, London, United Kingdom: Presented and demonstrated work on prediction of covert network structure and missing links to a group of British intelligence analysts, March 2011.
[7] T. McDermott and J. Meyer, The Hunt for KSM - Inside the Pursuit and Takedown of the Real 9/11 Mastermind, Khalid Sheikh Mohammad. Little, Brown and Company, 2012.
[8] "The 9/11 Commission Report (Executive Summary)," National commission on terrorist attacks upon the United States, United States, 2004. [Online]. Available: http://www.9-11commission.gov/report/911Report_Exec.pdf
[9] U. K. Wiil, N. Memon, and J. Gniadek, "CrimeFighter: A toolbox for counterterrorism," Lecture notes in communications in computer and information science (Knowledge discovery, knowledge engineering and knowledge management), vol. 128, pp. 337-350, 2011.
[10] M. Sageman, Understanding terrorist networks. Philadelphia, Pennsylvania: University of Pennsylvania Press (PENN), 2004.
[11] B. F. Todd and A. Nomani, "The truth left behind: inside the kidnapping and murder of Daniel Pearl," 2011.
[12] M. Pearl, A mighty heart. Virago Press, 2004.
[13] B. H. Levy, Who killed Daniel Pearl? Melville House Publishing, 2003.
[14] V. Krebs, "Mapping networks of terrorist cells," CONNECTIONS, vol. 24, no. 3, pp. 43-52, 2002.
[15] F. Shipman, J. M. Moore, P. Maloor, H. Hsieh, and R. Akkapeddi, "Semantics happen: knowledge building in spatial hypertext," in Proceedings of the thirteenth ACM conference on Hypertext and hypermedia, ser. HYPERTEXT '02. ACM, 2002, pp. 25-34.
[16] R. Alvarez and D. Simon, The Wire: Truth Be Told. Pocket Books, 2004.
[17] R. Penfold-Mounce, D. Beer, and R. Burrows, "The wire as social science-fiction?" Sociology, vol. 45, no. 1, pp. 152-167, Feb. 2011.
[18] R. R. Petersen and U. K. Wiil, "Analysis of emergent and evolving information: the agile planning case," in Software and data technologies, ser. Communications in computer and information science, J. Cordeiro, A. Ranchordas, and B. Shishkov, Eds. Springer Berlin Heidelberg, 2011, vol. 50, pp. 263-276.
[19] (2012) IBM i2 Analyst's Notebook. [Online]. Available: http://www.i2group.com/us/products/analysis-product-line/ibm-i2-analysts-notebook
[20] B. Capers, "Crime, legitimacy, our criminal network, and the wire," Ohio state journal of criminal law, vol. 8, pp. 459-471, 2011.
[21] D. Simon and E. Burns, "The wire (the complete first season)," 2002.
[22] T. A. Taniguchi, J. H. Ratcliffe, and R. B. Taylor, "Gang set space, drug markets, and crime around drug corners in Camden," Journal of research in crime and delinquency, vol. 48, pp. 327-363, 2011.
[23] C. Morselli, "The criminal network perspective," in Inside criminal networks, ser. Studies of organized crime. Springer New York, 2009, vol. 8, pp. 1-21.
[24] A. S. Linschoten and F. Kuehn, An enemy we created: the myth of the Taliban/Al-Qaeda merger in Afghanistan, 1970-2010. Hurst, 2012.
[25] R. R. Petersen, "Interview with Alex Strick van Linschoten." Trafalgar Square, London, United Kingdom: A discussion of CrimeFighter Investigator, Tinderbox, Gephi, Analyst's Notebook in relation to Alex's work with mapping the temporal evolution of Afghan Taliban, March 2011.
[26] M. Bernstein, The Tinderbox way. Eastgate Systems, 2006.
[27] P. A. Gloor and Y. Zhao, "Analyzing actors and their discussion topics by semantic social network analysis," in Proceedings of information visualization, 2006, pp. 130-135.
[28] J. Scott, Social network analysis, a handbook (second edition). Sage, 2000.
[29] L. R. Irons, "Recent patterns of terrorism prevention in the United Kingdom," Homeland security affairs, vol. 4, 2008.
[30] C. J. Rhodes and P. Jones, "Inferring missing links in partially observed social networks," Journal of the operational research society, vol. 60, no. 10, pp. 1373-1383, 2009.
[31] R. R. Petersen, C. J. Rhodes, and U. K. Wiil, "Node removal in criminal networks," in Proceedings of European intelligence and security informatics conference. IEEE, 2011, pp. 360-365.
[32] D. C. Engelbart, "A conceptual framework for the augmentation of man's intellect," in Computer-supported cooperative work. Kaufmann, 1988, pp. 35-65.
[33] G. Erétéo, F. Limpens, F. Gandon, O. Corby, M. Buffa, M. Leitzelman, and P. Sander, "Semantic social network analysis: a concrete case," in Handbook of Research on Methods and Techniques for Studying Virtual Communities: Paradigms and Phenomena. IGI Global, 2011, pp. 122-156.
[34] G. Erétéo, M. Buffa, F. Gandon, P. Grohan, M. Leitzelman, and P. Sander, "A state of the art on social network analysis and its applications on a semantic web," 2008.
[35] R. R. Petersen and U. K. Wiil, "CrimeFighter Investigator: a novel tool for criminal network investigation," in Proceedings of European intelligence and security informatics conference. IEEE, 2011, pp. 360-365.
[36] M. Sipser, Introduction to the theory of computation. PWS Publishing Company, 1997.
