Kathleen M. Carley
Carnegie Mellon University
email: kathleen.carley@cmu.edu
tel: 1-412-268-6016
fax: 1-412-268-6938
Overview
Dynamic network analysis (DNA) is an emergent field centered on the collection, analysis,
understanding and prediction of dynamic relations (such as who talks to whom and who knows
what) and the impact of such dynamics on individual and group behavior. DNA facilitates
reasoning about real groups as complex dynamic systems that evolve over time. In this chapter,
the basic tenets of DNA are described and contrasted with those of Social Network Analysis and
Link Analysis. Some of the basic techniques are then illustrated through the analysis of data on
al Qaeda. The technology described enables the analyst to identify vulnerabilities in the terrorist
network and to assess how that network might change in response to strategic interventions.
1
The research reported herein was supported by the National Science Foundation NSF IRI9633 662, the Office of Naval
Research (ONR) Grant No. N00014-02-10973 on Dynamic Network Analysis, Grant No. N00014-97-1-0037 on Adaptive
Architecture, the DOD, and the NSF MKIDS program. Additional support was provided by the NSF IGERT 9972762 for
research and training in CASOS and by the center for Computational Analysis of Social and Organizational Systems at
Carnegie Mellon University (http://www.casos.cs.cmu.edu). The views and conclusions contained in this document are
those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the
Office of Naval Research, Department of Defense, the National Science Foundation or the U.S. government.
The author would like to thank Jana Diesner, Jeff Reminga, Max Tsvetovat and Dan Wood for providing supporting
material and comment on related work.
2
The term group is used in a very generic sense to refer to any organized collective. As such it might be an informal set of
actors such as a bridge club, a formal organization such as IBM, a collective entity such as al Qaeda, and so on. Often in
this chapter, the term group and organization are used interchangeably.
individuals' positions within it influence behavior. There are numerous SNA computational
tools, ranging from network visualizers to packages for analyzing network data.
[Table 1. Feature comparison across approaches to relational analysis. Column headings other than MAS, and the alignment of cell entries to columns, were lost in extraction.
Features compared: Multi-link; Multi-mode; Networks evolve; Locate network elite; Locate patterns across networks; Agents and groups evolve; Predict person behavior; Predict group behavior; Handles missing information; Sensitivity analysis; Optimized search; Analysis of current group; Analysis of strategic intervention; Requires massive human resources; Elite identification; Pattern identification; Analysis of change.
Cell entries used: X; Needs work; Quantitative; Qualitative; Abstract; Assumes future = past.]
Traditional SNA work is a strongly quantitative area dealing with small, complete networks.
Often the data focused on a single type of relation, such as friendship, and a single type of
node, e.g., people, at a single point in time. In the social sciences, and information science in
particular, relational data is used in an information-centered way, to discover the structure of a
complex socio-technical system in terms of the set of inter-relationships and the impact of that
structure on behavior. This is true whether the focus is groups of friends (Krackhardt and
Kilduff, 1990), business elite and boards of directors (Mizruchi, 1996; 2000) or the internet (Bar-Ilan, 2004; Chen, Newman, Newman, and Rada, 1998). Network data is processed to determine
the importance of individual actors (Borgatti, 2002; Van Aelst and Walgrave, 2002) and the roles
that they play. The interpretation of results is a central issue to these researchers.
To be sure, within SNA there was always some work on multi-link or multi-mode data;
however, it was the exception more than the norm. Traditional SNA led to a wealth of findings
on how to accurately extract and assess networks, how to identify elites or key actors, and how to
measure various network properties. This work is tightly tied to statistics and led to a new
branch of statistics for working with relational data as such data violates the independence
assumption of traditional statistics.
Recently, researchers in statistical physics have discovered network science. In this case,
powerful mathematical techniques are applied to understanding differences in stylized network
topologies, with applications to the structure of the World Wide Web and its growth (Barabási,
2002). The statistical physicists mathematically model links in a very abstract sense, divorced
from content and social context, and often on a very large scale. This work has led to the re-identification of some traditional measures and a host of new techniques for dealing with
massive networks. This work lies between the SNA work and that on modern link analysis.
A second major research area that uses relational data is forensic science. For example, in
criminal investigations, law enforcement agencies face the problem of identifying associations
between a group of entities such as individuals and organizations. To do this, they use a
technique referred to as link analysis. Traditional link analysis represents information in terms
of the links between locations, people, resources, and events. Early emphasis was on
visualization of the links and the use of human intuition to extract patterns. There are numerous
link analysis tools for criminal investigation; however, for the most part these simply aid in
visualization and do not use the computer for analysis (Sparrow, 1991). Social factors that
define the context leading to link formation and that enable interpretation are brought in by the
human analyst to aid in interpretation. The situation is just beginning to change.
Modern link analysis (MLA), largely deriving from work in computer science, particularly
that in machine learning, provides tools for extraction of links from databases (Goldberg and
Senator, 1998) and texts (Lee, 1998), and analysis of the extracted links (Chen and Lynch, 1992;
Hauck et al., 2002). Extraction of links often requires massive data pre-processing or
restructuring of databases (Goldberg and Wong, 1998). Modern tools and techniques in link
analysis derive from recent work in computer science. Advanced data-processing techniques are
combined with machine learning to enable rapid database transformation and pattern extraction.
The main goal here is identification and recognition of patterns.
Much of the work in MLA has been applied to web page analysis, typically from an
information retrieval perspective. Links have been incorporated into various algorithms (most
commonly into search engines) to retrieve authoritative information from various data sources
including the web (Arasu, Cho, Garcia-Molina, Paepcke, and Raghavan, 2001; Kleinberg, 1999).
In these applications, the general research perspective tends to ignore the social context as to
why the link was created and the interpretation as to what the links mean, although social factors
are used to motivate the ideas (e.g., why links might help information retrieval) and to evaluate
the outcomes (e.g., comparative evaluation of search engines). Rather, the focus of the research
is the efficacy, scalability and robustness of the algorithms.
Both traditional social network analysis and link analysis are effectively static analyses. In
both cases, little attention has been paid to how networks evolve and change over time,
how they grow, and how they can be destabilized. That is, there has been little work
linking networks to action. The scientific field that has focused on dynamics is computer
simulation, and in particular, multi-agent simulation (MAS). Research using multi-agent
technology has demonstrated the ability to grow societies (Epstein and Axtell, 1997) and evolve
networks over time (Carley, 1991; Carley, Lee and Krackhardt, 2001).
Multi-agent simulation techniques are used to model and reason about complex socio-technical systems. In general, the non-linearities inherent in these systems, coupled with the
large number of processes, agents and variables, make it difficult for humans,
unassisted by computation, to reason effectively about the consequences of any one action or
change. Computational analysis, and in particular multi-agent simulation, is an important tool for
generating hypotheses about the behavior of these systems that can then be tested in the lab and
field (Carley, 1999). Complex systems typically have internal change, adaptation, or
evolutionary mechanisms that result in behavior that on the surface might appear random but
actually has an underlying order (Holland, 1995). In these systems, complex outcomes emerge
from simple processes; however, there is a plethora of possible outcomes depending on input
conditions and history (Kauffman, 1995), some of which may be catastrophic (McKelvey,
1999b). Some complex systems have the ability to self-organize (Bak, 1996) particularly when
the agents involved have the ability to engage in reflection as do humans. MAS techniques are
powerful for thinking through the complexities of these systems. However, the vast majority of
MAS systems have dealt with unrealistic or toy problems, have moved agents about on grids,
and have ignored the constraints and enablers on human behavior afforded by being embedded in
social networks.
The past five years have seen the birth of a new field of science, dynamic network analysis
(DNA).3 The science of DNA entails the theory and design of dynamic networks among diverse
entities and the study of all phenomena emerging from, enabled by, or constrained by such
networks. Entities include both intelligent agents such as humans or robots and artifacts such as
events or resources. DNA makes possible the simultaneous evaluation of multiple networks
linking diverse entities leading to an analysis of multi-color, multi-link, dynamic graphs. An
example is the simultaneous analysis of the social network and the knowledge network for
purposes of improved organizational learning (Carley and Hill, 2001).
3
The term DNA first appeared in print in a paper published by the National Academy of Science (Carley, 2003).
Dynamic Network Analysis (DNA) extends the power of thinking about networks to the
realm of large scale, dynamic systems with multiple co-evolving networks under conditions of
information uncertainty with cognitively realistic agents (Carley, 2003). DNA sits at the crossroads of these other techniques and draws on ideas and methods from all of the aforementioned
approaches, resulting in a powerful approach to relational analysis (see Table 1). DNA has been
made possible by three key advances: 1) conceptualizing networks as meta-networks (Carley,
2002a; Krackhardt and Carley, 1998) connecting various entities such as agents, knowledge and
events, 2) treating ties as variable and so having a weight and/or probability, and 3) combining
social networks with cognitive science and multi-agent systems to endow the agents with the
ability to adapt (Carley, 2002b). In a meta-network perspective a set of networks is defined
using an organizational ontology that defines networks in terms of relations and a set of entity
classes; e.g., people, knowledge, resources, events, organizations and locations. These entity
classes delineate a set of networks, often referred to as the meta-matrix, in terms of the set of
possible relations within and between two entity classes. For example, between people there is a
social network that might be further divided into friendship, mentoring and family relations, and
between people and knowledge there is a knowledge network indicating who has expertise about
what and possibly at what level. Relationships are defined from a variable tie perspective. As
such, connections between entities are seen as ranging in their likelihood, strength, and direction
rather than as being simple binary connections indicating exclusively whether or not there is a
connection. Finally, the utilization of multi-agent network models enables the user to reason
about the dynamics of complex adaptive systems. In particular, these computational models
combine our understanding of human cognition, biology, knowledge management, artificial
intelligence, organization theory and geographical factors into a comprehensive system for
reasoning about the complexities of social behavior.
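The meta-matrix and variable-tie ideas described above can be sketched as a small data structure. This is only an illustrative sketch, not the representation used by the CASOS tools: the class name `MetaNetwork`, its methods, the node names, and the choice to restrict weights to the range 0 to 1 are all assumptions made for the example.

```python
from collections import defaultdict

# Entity classes named in the text; everything else here is hypothetical.
ENTITY_CLASSES = {
    "people", "knowledge", "resources", "events", "organizations", "locations",
}


class MetaNetwork:
    """Meta-matrix sketch: one weighted, directed network per pair of entity classes."""

    def __init__(self):
        self.entity_class = {}  # node name -> entity class
        # (source class, target class, relation label) -> {(source, target): weight}
        self.networks = defaultdict(dict)

    def add_node(self, name, cls):
        if cls not in ENTITY_CLASSES:
            raise ValueError(f"unknown entity class: {cls}")
        self.entity_class[name] = cls

    def add_tie(self, source, target, relation, weight=1.0):
        """Record a directed tie; weight is a strength or likelihood in [0, 1] (assumed)."""
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        key = (self.entity_class[source], self.entity_class[target], relation)
        self.networks[key][(source, target)] = weight

    def network(self, source_cls, target_cls, relation):
        """Return a copy of one cell of the meta-matrix, or an empty dict if unused."""
        return dict(self.networks.get((source_cls, target_cls, relation), {}))


meta = MetaNetwork()
meta.add_node("Ann", "people")
meta.add_node("Bob", "people")
meta.add_node("chemistry", "knowledge")
meta.add_tie("Ann", "Bob", "friendship", weight=0.8)   # social network, one link type
meta.add_tie("Ann", "Bob", "mentoring", weight=0.3)    # same pair, second link type
meta.add_tie("Bob", "chemistry", "expertise", weight=0.6)  # knowledge network
```

Keying each network by the pair of entity classes plus a relation label keeps multi-mode and multi-link data separate while still allowing all of it to be analyzed together.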
A key aspect of DNA is the dynamic approach to the co-evolution of agents, knowledge,
tasks, organizations and the set of inter-linked networks that connect these entities. Multi-agent
network modeling is used to capture the complexities by which who people know influences
what they know and so what they can do and what organizations they join. Changes at each unit
of analysis, from person to group to organization to society, impact changes at the next; however, the
rate of change decreases and the size of the impact increases as unit size increases. Another
feature is that each agent (and indeed each unit) has transactive knowledge, that is, knowledge of who
knows whom, who knows what, who is doing what, and who is a member of what. This knowledge is typically
incomplete, sparse, and potentially wrong. However, the actions of the agents are based on their
perception of the network, not the actual network. Cognitive, social, task, and cultural constraints
limit what entities are present, what/who can be connected to what/who, when and how those
connections can change, when new entities (such as new agents) can be added or old ones
dropped, and so on.
3. DNA Tool Chain
The application of DNA techniques to a large complex system, such as al Qaeda, entails a
series of procedures. First, one needs to gather the relational data. One approach for doing this
is to extract relations from a corpus of texts such as open-source items like web pages, news
articles, journal papers, stock holder reports, community rosters, and various forms of humint
and sigint. Second, the extracted networks need to be analyzed. That is, given the relational data,
can we identify key actors and sub-groups, points of vulnerability, and so on? Third, given a set
of vulnerabilities, we want to ask what would happen to the system were the vulnerabilities to be
exploited. How might the networks change with and without strategic intervention? The CMU
CASOS group has developed an interoperable suite of tools that acts as a chain to extract
networks from texts, analyze these networks, and then engage in what-if reasoning. This tool
suite takes into account multi-mode, multi-link, and multi-time period data including attributes of
nodes and edges. This toolset contains the following tools: AutoMap for extracting networks
from texts, ORA for analyzing the extracted networks, and DyNet for what-if reasoning about
the networks (see Figure 1). Each of these tools is described in turn.
Figure 1. DNA tool chain for reasoning about complex socio-technical systems.
AutoMap is a semi-automated Network Text Analysis (NTA) tool for extracting network
data from texts (CMU: http://casos.cs.cmu.edu/projects/automap/, Diesner and Carley, 2004;
2005). NTA is a specific text analysis method that encodes the relations between words in texts
and constructs a network of the linked words (Popping, 2000). In AutoMap the technique is
based on a distance-based approach, also referred to as windowing (Danowski, 1993).
Windowing slides a fictitious window over the text, and words within the size of that
window are linked together if they match the coding rules specified by the analysts (ref Carley).
Previous research has shown how map analysis (Carley, 1997; Carley and Palmquist,
1992) and its implementation in AutoMap (Diesner and Carley, 2004; 2005) can be applied to
systematically extract links between words in texts in order to model the author's mental map
as a semantic network. Since we implemented the meta-matrix model in AutoMap as a general
ontology for classifying concepts as entities of the meta-matrix, adding meta-matrix text analysis
as a further type of NTA, the software supports the extraction of the structure of
social and organizational systems, such as covert networks, from text collections
(Diesner and Carley, 2004; 2005). The tool also facilitates the comparison of maps generated
with AutoMap and the fusion of the networks per text into a network that represents the
structure of a system reflected in a corpus.
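The windowing idea can be illustrated with a short sketch. It is not AutoMap's implementation: the function name `window_links`, the `keep` argument (a simple stand-in for analyst-specified coding rules), and the token list are assumptions made for the example.

```python
from collections import Counter


def window_links(tokens, window_size, keep=None):
    """Link every pair of concepts that co-occur within a sliding window.

    tokens: pre-processed concepts in text order.
    window_size: number of consecutive tokens the window covers.
    keep: optional set of concepts allowed to form links (assumed coding rule).
    Returns a Counter mapping (earlier concept, later concept) to a count.
    """
    links = Counter()
    for i, left in enumerate(tokens):
        if keep is not None and left not in keep:
            continue
        # Pair the token with the window_size - 1 tokens that follow it, so each
        # pair of positions closer than window_size is counted exactly once.
        for right in tokens[i + 1 : i + window_size]:
            if right == left or (keep is not None and right not in keep):
                continue
            links[(left, right)] += 1
    return links


# Illustrative token list, not actual AutoMap output.
tokens = ["Abu_Madja", "leader", "Abu_Sayyaf", "branch", "Al_Qaeda"]
links = window_links(tokens, window_size=3)
```

With a window of three, "Abu_Madja" is linked to "leader" and "Abu_Sayyaf" but not to "Al_Qaeda", which lies outside the window. Widening the window adds more distant links at the cost of more spurious ones.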
ORA is a statistical toolkit for analyzing dynamic network data composed of multiple
entities and relations (CMU: http://www.casos.cs.cmu.edu/projects/ora/, Carley and Reminga
2004; Kamneva and Carley, 2004). ORA facilitates analyzing the entire meta-network with a
series of measures that have been found to be highly valuable in both the command and control
and counter-terrorism contexts (Carley, 2004). The metrics in ORA were developed by drawing
on state of the art research in organization theory, social networks, communication theory,
operations research, economics, and computer science. ORA takes meta-matrix data and
generates a series of reports that can be used to identify key actors or organizations, evaluate
their sphere of influence and locate who influences them, and identify vulnerabilities in the
overall structure of the meta-network for the group. In addition, ORA enables the analyst to
compare and contrast two different networks and to estimate possible relations between actors
based on factors such as relative similarity and expertise. To aid the analyst, ORA generates
seven different reports:
Risk Report: evaluates the overall system using measures of risk or vulnerability in seven different areas.
Intelligence Report: identifies key actors, individuals and groups who by virtue of their position in the network are critical to its operation.
Management Report: identifies over- and under-performing individuals and assesses the state of the network as a functioning organization.
Context Report: compares measured values against various stylized forms of networks in an effort to characterize the network's topology.
SubGroup Report: identifies the subgroups present in the network using various grouping algorithms.
Sphere of Influence Report: for each individual, identifies the set of actors, groups, knowledge, resources, etc. that influence and are influenced by that actor.
Optimization Report: enables the analyst to locate the optimal form of the target organization and/or assess how far the current structure is from the optimum.
DyNet is a multi-agent network simulation package for assessing network change under
various conditions of information assurance (CMU: http://casos.cs.cmu.edu/projects/dynet/,
Carley, 2004). DyNet is built on top of the Construct simulation engine (CMU:
http://casos.cs.cmu.edu/projects/construct/, Carley, 1990; 1991; Schreiber and Carley, 2004).
Using DyNet the analyst is able to assess how the networked organization is likely to evolve if
left alone, how its performance could be affected by various information warfare and isolation
strategies, and how robust these strategies are in the face of varying levels of information
assurance. The basic engine evolves the network in response to agent interaction and the
exchange of information. Two basic mechanisms underlie this diffusion process. The first
mechanism is relative similarity whereby individuals are more likely to exchange information if
they are comfortable interacting with each other and share culturally relevant factors in common.
The second mechanism is relative expertise whereby individuals are more likely to exchange
information if one actor seeks out the other in search of particular information.
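The two mechanisms can be made concrete with a simplified sketch. The normalizations below are an assumption chosen for illustration and should not be read as DyNet's or Construct's actual formulas; the function names and the toy knowledge sets are likewise hypothetical.

```python
def relative_similarity(knowledge, i, j):
    """Share of i's knowledge overlap with all others that is overlap with j."""
    overlap = {k: len(knowledge[i] & knowledge[k]) for k in knowledge if k != i}
    total = sum(overlap.values())
    return overlap[j] / total if total else 0.0


def relative_expertise(knowledge, i, j):
    """Share of the knowledge i lacks, summed over all others, that j holds."""
    novel = {k: len(knowledge[k] - knowledge[i]) for k in knowledge if k != i}
    total = sum(novel.values())
    return novel[j] / total if total else 0.0


# Toy knowledge network: agent -> set of knowledge areas (illustrative only).
knowledge = {
    "a": {"finance", "logistics"},
    "b": {"finance", "logistics", "chemistry"},
    "c": {"recruiting"},
}

similarity_ab = relative_similarity(knowledge, "a", "b")
expertise_ab = relative_expertise(knowledge, "a", "b")
expertise_ac = relative_expertise(knowledge, "a", "c")
```

In this toy case agent a shares knowledge only with b, so similarity-driven interaction goes entirely to b, while expertise-driven interaction is split evenly between b and c because each holds one item that a lacks.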
DyNetML is an XML-based interchange language for relational data (CMU:
http://www.casos.cs.cmu.edu/projects/dynetml/, Tsvetovat, Reminga and Carley, 2004). By
using DyNetML as a unified interchange language, other tools such as UCINET (Borgatti,
Everett and Freeman, 2002) can be linked in and data can be easily exchanged. AutoMap exports
the coded text in DyNetML. ORA imports and exports meta-network data, and does so in a
variety of formats, including DyNetML. We note that for extremely large datasets, an XML
interchange language is unwieldy. Hence, sparse matrix representation schemes, such as DL,
are also used. DyNet can also read and write network data in DyNetML.
Several principles guided the development of this suite of tools. First, the tools needed to be
interoperable, so that all tools are capable of using (reading/writing) the same set of data.
The goal is to move over time to interoperability of analytical results, so that results from one tool
can be used as input to other tools in terms of data and summary statistics. Second, it needed
to be possible to collect data in many ways but store it in a common format. This facilitates using
data extracted or collected by means other than AutoMap. Third, it is important to link the
CMU tool suite to other tools with unique and valuable capabilities. This makes it possible to
extend the overall approach and to work in multiple venues. Fourth, the tool set needed to scale
to large data sets and be robust in the face of missing data. To date, we have processed thousands
of texts with AutoMap and most ORA measures run in less than an hour with 30,000 nodes. The
least scalable of the technologies is the simulation engine DyNet. Fifth, the approach needed to
be expandable as new entity types and relations become critical. We have made this possible by
enabling the meta-matrix ontology to be augmented by user defined entity classes. Finally,
attributes of nodes and relations need to also be captured and analyzed. This facilitates
interpretation and enables context and content information to be used to evaluate the results.
This suite of tools is now applied to data collected on al Qaeda. The purpose of this
application is to demonstrate the utility and breadth of these tools for addressing issues
surrounding covert networks. A secondary purpose is to provide some insight into the structure
of al Qaeda as available from open-source information.
4. Al Qaeda: Extracting the Network
A set of 591 articles was gathered from the web and then processed with AutoMap. Of
these articles, 113 were published in 2002. It is the data extracted from just these 113 articles
that will be discussed in this paper. This is a sample of the available data and not a
comprehensive set of texts. The texts include news articles, web pages, and academic texts.
The first step in processing the texts was to convert them to a .txt format. Second, a thesaurus
was constructed that enabled greater generalization of the concepts used by this community. The
generalization thesaurus was created by converting meaningful unigrams, e.g. Al-Mohsen, and
bigrams, e.g. Abd Al-Mohsen or Abu Hajjer, contained in the lists of Named Entities and
collocations into unique, single-word core concepts, e.g. Abd_Al-Mohsen. The third step
was the construction of a specific thesaurus for meta-matrix data. In this case we cross-classified
the concepts into the following entity classes: Actors, Knowledge, Resources, Tasks, Locations,
Roles and Organizations. In AutoMap, a semantic network is coded using the general thesaurus
and then cross-classified using the entity classes as an ontology.
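The two thesaurus steps can be sketched as follows. The thesaurus entries, the function names `generalize` and `classify`, and the sample sentence are hypothetical; real thesauri are built by analysts, and AutoMap's own matching rules may differ from the simple longest-phrase-first replacement assumed here.

```python
import re

# Hypothetical generalization thesaurus: surface form -> core concept.
GENERALIZATION = {
    "Abd Al-Mohsen": "Abd_Al-Mohsen",
    "Al-Mohsen": "Abd_Al-Mohsen",
    "Abu Hajjer": "Abu_Hajjer",
}

# Hypothetical meta-matrix thesaurus: core concept -> entity class.
META_MATRIX = {
    "Abd_Al-Mohsen": "Actors",
    "Abu_Hajjer": "Actors",
}


def generalize(text, thesaurus):
    """Replace each listed phrase with its core concept, longest phrases first."""
    phrases = sorted(thesaurus, key=len, reverse=True)
    # The lookarounds stop a phrase from matching inside a longer word or concept.
    alternatives = "|".join(re.escape(p) for p in phrases)
    pattern = re.compile(r"(?<![\w-])(?:" + alternatives + r")(?![\w-])")
    return pattern.sub(lambda m: thesaurus[m.group(0)], text)


def classify(text, meta_thesaurus):
    """Return (concept, entity class) for every token found in the thesaurus."""
    return [(tok, meta_thesaurus[tok]) for tok in text.split() if tok in meta_thesaurus]


generalized = generalize("Abd Al-Mohsen met Abu Hajjer", GENERALIZATION)
classified = classify(generalized, META_MATRIX)
```

Sorting phrases by length before matching ensures that the bigram "Abd Al-Mohsen" is replaced as a unit rather than having its second word rewritten on its own.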
It is important to note that the creation of a generalization and a meta-matrix thesaurus
requires significant domain knowledge. Subject matter experts can help in the creation of these
thesauri. Once developed, generalization thesauri enable aliases and various misspellings to be
converted into core concepts and synonyms to be cross-classified with equating concepts. Once
developed, the meta-matrix thesauri enable the semantic network to be cross-classified, resulting
in an extraction of various alternative networks such as the social network and the knowledge
network. The creation of the thesauri is the most labor-intensive part of the coding process. Once
created, however, there is no limit on the number of texts that can be processed using the same
thesauri. It is also possible to gain some economies of scale by using general context thesauri,
such as general location and terrorism thesauri, for multiple similar contexts such as coding data
on Hamas and al Qaeda.
The fourth step is the extraction of the networks using AutoMap. In extracting networks a
window is slid over the processed text putting links between concepts within a certain distance of
each other. For this analysis, a window of size six is used. Note that this window roughly
corresponds to the average length of a sentence after minimally content-bearing words, such as
articles, are deleted. Each text is processed and then the resulting networks are combined into a
single database.4
To understand the operation of AutoMap consider the following example. The following is
an excerpt from various text files that has been annotated by underlining the basic concepts:
Hisham Al Hussein
the Philippine government booted the second secretary at Iraq's Manila
embassy, Hisham Al Hussein, on February 13, 2003, after discovering that the
same mobile phone that reached his number on October 3, 2002, six days later
rang another cell phone strapped to a bomb at the San Roque Elementary School
in Zamboanga.
Abu Madja and Hamsiraji Ali
That mobile phone also registered calls to Abu Madja and Hamsiraji Ali,
leaders of Abu Sayyaf, Al Qaeda's Philippine branch.
Abdurajak Janjalani
It was launched in the late 1980s by the late Abdurajak Janjalani, with the
help of Jamal Mohammad Khalifa, Osama bin Laden's brother-in-law.
.
AutoMap takes this text, and processes it with the thesaurus, and then returns a multi-mode,
multi-link network like that shown in Figure 2. It is important to note that there are limitations to
this extraction. In particular, the system does not make the inferences that a human might
between content at the beginning and end of a particular text.
Coding the 591 texts on al Qaeda produced quite detailed networks. The
resulting networks cover 10 years (1995 to 2004), 604 actors, 237 resources, 157 knowledge
areas, 215 tasks or events, 309 locations and 161 organizations. For the remainder of this paper
we will concentrate on only that data extracted for the year 2002.
5. DNA Analysis
The 2002 data on al Qaeda is analyzed using ORA. Four analyses are conducted. First, the
nature of the network is assessed. Second, the network elite are identified. Third, the sphere of
influence around one of the elite is examined. And finally, the likely impact of various courses of
action is discussed.
4
The database is called NetIntel and specifications can be found in Tsvetovat, Diesner and Carley, 2005.
these additional network dimensions, the majority of work on network topologies considers only
the actor-to-actor network.
Table 2: Description of Network Topologies in Terms of the Actor-to-Actor Network
Random: In this structure, the ties among actors are distributed randomly, leading, on average, to each actor having the same number of ties.
Hierarchy: In this structure, the ties are distributed into a simple tree structure. Hierarchies are characterized by the absence of cycles.
Matrix: In this structure, the ties are distributed into a modified tree structure such that at some level, each child has two or more parent nodes.
Cellular: In this structure, the actors are distributed into a large number of groups such that all actors within a group are fully connected and there are minimal connections among sub-groups.
Scale Free: In this structure, the ties among actors are distributed according to a power law.
Core Periphery: In this structure, the actors are distributed into two groups, a core and a periphery, such that the actors in the core are connected by a dense network of ties and those in the periphery are only loosely connected to each other.
[Figure panels: Matrix; Cellular]
have more cliques, less spread in degree centrality, a higher spread in cognitive demand and task
exclusivity than would a matrix structured network. Due to topological differences, the isolation
of individuals high in degree centrality is likely to have more of an impact on a network with a
matrix topology than a network with a cellular topology; whereas, the isolation of individuals
high in cognitive demand is likely to have more of an impact on cellular than matrix
organizations.
ORA can be used to characterize the topology of a complex socio-technical system.
Running ORA on the 2002 al Qaeda network reveals the following structure. The actor-to-actor
network is displayed in Figure 4. In the 2002 data there are 1067 nodes distributed as follows:
201 Actors, 106 Knowledge, 157 Resources, 142 Tasks, 184 Locations, 193 Roles and 84
Organizations. The network is extremely sparse with an overall complexity of 0.0031, and the
social network itself has a density of only 0.0017. The comparison of al Qaeda with stylized
forms reveals that the observed network is decidedly non-random (see Table 3). Nor does it
match the profile of the other structures. Note, in creating the stylized structures the number of
nodes and the density were held constant and then ties were distributed according to the profile
of the stylized structure. The stylized structure is then compared with the real network. The
results indicate that al Qaeda does not match other simple structures. For example, the
individuals in al Qaeda exhibit much lower betweenness and much higher closeness than we see
in either a random or a core-periphery network (Table 3). In other words, most members of al
Qaeda do not connect otherwise disconnected groups and most are connected to a small group of
others. In part, the differences may be because the top structure in al Qaeda is hierarchical and
the rest is cellular. However, the data does suggest that al Qaeda is simply not organized in
either a random or core-periphery structure.
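The comparison procedure described above, in which the number of nodes and the density are held constant while ties are redistributed, can be sketched for the random case. This is a minimal illustration under stated assumptions and not ORA's procedure: ties are treated as directed, the function names are hypothetical, and the spread of degree is used only as an example of a profile on which to compare networks.

```python
import random


def density(n_nodes, ties):
    """Directed density: observed ties over the n * (n - 1) possible ordered pairs."""
    return len(ties) / (n_nodes * (n_nodes - 1))


def random_network(n_nodes, n_ties, seed=0):
    """Stylized random network with the same node and tie counts as the observed one."""
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(n_nodes) for j in range(n_nodes) if i != j]
    return set(rng.sample(pairs, n_ties))


def degree_spread(n_nodes, ties):
    """Variance of total degree (in plus out), a crude profile for comparison."""
    degree = [0] * n_nodes
    for i, j in ties:
        degree[i] += 1
        degree[j] += 1
    mean = sum(degree) / n_nodes
    return sum((d - mean) ** 2 for d in degree) / n_nodes


# Toy observed network: a star centred on node 0 (illustrative, not al Qaeda data).
observed = {(0, j) for j in range(1, 6)}
stylized = random_network(6, len(observed))

observed_spread = degree_spread(6, observed)
stylized_spread = degree_spread(6, stylized)
```

Because both networks have the same size and density by construction, any difference between `observed_spread` and `stylized_spread` reflects how the ties are arranged rather than how many there are.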
In order to compute a core-periphery network the user must specify a value alpha. In this case we used the common setting
of alpha equal to 2.
diverse cognitive activities that they temporarily act as change agents directing others to do
things. Individuals who exhibit high task or knowledge exclusivity are in unique positions in the
system for which there is no backup or redundancy.
Using ORA, the network elite can be identified. The main report for this is the Intelligence Report, in
which the top five individuals who stand out on each of these critical dimensions are identified.
Note that the top five organizations on the comparable dimensions are also identified. For
individual actors, these elite are listed in Table 4.
Table 4: Network Elite in the Actor-to-Actor Network for al Qaeda 2002
Level | Degree | Betweenness | Cognitive Demand | Task Exclusivity | Knowledge Exclusivity
1 | Bin Laden | Bin Laden | Bin Laden | Bin Laden | Bin Laden
2 | Mokhtar Haouari | Jose Padilla | Jose Padilla | Ariel Sharon | Mohammed Fadlallah
3 | Adel Boumezbeur | Cherie Stultz | Aziz Nassour | Ibrahim Hassouna | Jose Padilla
4 | Jose Padilla | Khalid Mohammed | Benjamin Netenyahu | Samih Osailly | Bashar Assad
5 | Mustapha Labsi | Mullah Omar | Mohamad Hammoud | Bilal Marwan | Bilal Marwan
Note that although Bin Laden shows up as the top in all measures, other individuals rank
high in other measures, in particular, Padilla. Given the nature of the data, a sample of open-source information, one should not assume that these individuals necessarily hold the position
shown. Rather, the point here is to illustrate the kinds of findings possible, not to identify
specific individuals. As such, what is important to note is that different individuals stand out on
different dimensions and therefore, depending on the effect one wants to have, different
intervention strategies are called for. For example, if the goal is to disrupt operations those high
in cognitive demand might be isolated; whereas, if the goal is to discover more information, then
those high in degree centrality should be interviewed or traced. This data suggests that the
isolation of Bin Laden or Padilla would be disruptive and that both could provide important
intelligence. However, they may be hard to access. Stultz, K. Mohammed, and Omar are likely
to be connecting disconnected groups. Hassouna, Osailly and Marwan play specialized roles.
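The degree and betweenness rankings in a table of this kind can be illustrated with a minimal sketch. It is not ORA's implementation: it assumes an undirected, unweighted actor-to-actor network, uses Brandes' algorithm for betweenness, and runs on a hypothetical toy network whose node names ("a", "broker", and so on) are invented for illustration. Cognitive demand and the exclusivity measures are not sketched because they require the task and knowledge networks.

```python
from collections import deque

def degree_centrality(adj):
    """Number of direct ties per node in an undirected network."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {node: [] for node in adj}
        n_paths = {node: 0 for node in adj}
        dist = {node: -1 for node in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes inward.
        dep = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1.0 + dep[w])
            if w != source:
                score[w] += dep[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2.0 for node, value in score.items()}

def top_k(scores, k=5):
    """Names of the k highest-scoring nodes, ties broken alphabetically."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]

# Hypothetical toy network: two clusters joined through a single broker.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "broker"),
         ("broker", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree_elite = top_k(degree_centrality(adj), k=2)
betweenness_elite = top_k(betweenness_centrality(adj), k=2)
```

In this toy network the best-connected nodes ("c" and "d") differ from the node that bridges the two clusters ("broker"), which mirrors the point that different individuals stand out on different dimensions.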
5.3. Sphere of Influence
Around each actor is a sphere of influence. This is the set of others (actors and
organizations), events (or tasks), and items (knowledge or resources) that the actor influences or
is influenced by. In a standard social network, which contains only actor-to-actor connections, the
sphere of influence is simply the actor's ego-net. The ego-net is the set of others (alters) to
whom the actor is directly connected and the connections among those alters. When we move
beyond SNA to DNA, this idea is expanded to encompass all the entity classes. Thus the ego-net
in the meta-network, i.e., the sphere of influence, is the set of all other nodes (regardless of entity
class) that the focal node is directly connected to and the connections among those nodes.
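As a minimal sketch of this definition, the function below collects the nodes directly tied to a focal node, groups them by entity class, and lists the ties among them. The edge-list representation, the `node_class` mapping, and all node names are assumptions made for illustration; they are not ORA's data format.

```python
def sphere_of_influence(edges, node_class, focal):
    """Return the focal node's direct neighbors grouped by entity class,
    plus the ties among those neighbors (the ego-net in the meta-network)."""
    alters = set()
    for u, v in edges:
        if u == focal:
            alters.add(v)
        elif v == focal:
            alters.add(u)
    alters.discard(focal)  # ignore any self-tie on the focal node

    by_class = {}
    for alter in sorted(alters):
        by_class.setdefault(node_class[alter], []).append(alter)

    # Ties whose two endpoints are both alters of the focal node.
    alter_ties = [(u, v) for u, v in edges
                  if u in alters and v in alters and u != v]
    return by_class, alter_ties

# Hypothetical meta-network mixing several entity classes.
node_class = {
    "actor_1": "actor", "actor_2": "actor", "actor_3": "actor",
    "task_1": "task", "knowledge_1": "knowledge", "org_1": "organization",
}
edges = [
    ("actor_1", "actor_2"), ("actor_1", "task_1"), ("actor_1", "knowledge_1"),
    ("actor_2", "task_1"), ("actor_2", "actor_3"), ("actor_3", "org_1"),
]

by_class, alter_ties = sphere_of_influence(edges, node_class, "actor_1")
```

Here `by_class` gives the per-class membership of the sphere of influence, the kind of tally reported for a focal actor, and `alter_ties` holds the connections among those nodes.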
To illustrate this idea, the sphere of influence for Bin Laden based on the al Qaeda 2002 data
is shown in Figure 5. For illustrative purposes the node for Bin Laden is enlarged. In this figure,
we see that Bin Laden is connected to 7 other actors, 7 resources, 18 knowledge areas, 11 tasks
or events, 8 locations, and 6 organizations. It is highly likely that Bin Laden is connected to
more individuals than those shown here. The point is not the specific content of this figure but
the fact that we can identify the sphere of influence and determine where an individual can be
influenced. For example, information that gets to those directly connected to Bin Laden, such as
Zawahiri or Mohammed Atef, is likely to get to Bin Laden.
critical for different reasons. The final stage of analysis is to engage in a series of what-if
experiments to determine how the network is likely to change on its own or in response to
various strategic interventions. What-if analysis techniques can be used to provide insight into
the possible effects of following various courses of action.
The analysis can address both the likely immediate, short-term effect of a particular course
of action and the longer- or moderate-term effects. The immediate
effects can be seen by examining what linkages are broken, or what reduction in capacity there
is, when an actor is isolated. These can be evaluated directly with ORA by doing a static
comparison. Longer term effects can be seen by evolving the network over a few time periods to
see moderate term changes. These effects need to be generated using the simulation tool DyNet.
From an immediate impact perspective, assume that the following actors are isolated: Bin
Laden, Jose Padilla, Aziz Nassour, Benjamin Netenyahu, and Bilal Marwan. It is interesting to
isolate these individuals as they are all high in cognitive demand and so represent the current
emergent leaders of the group. Also, it should be difficult to recover from the loss of these
individuals given their extensive expertise and their position in terms of complex
tasks. The isolation of these individuals reduces the overall density of the social network from
0.0017 to 0.0013, increases the average speed from 0.5455 to 0.7791, and increases accuracy
from 0.9223 to 0.9224. Note that average speed is measured as the average geodesic value, so the
higher the number, the longer it takes information to diffuse. In the immediate term, the isolation
of these individuals has minimal impact on performance, but it will reduce the rate at which
information spreads and make the overall organization less cohesive.
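The static before-and-after comparison can be sketched as follows. This is an illustration under stated assumptions, not ORA's computation: isolated actors are assumed to stay in the network with all their ties removed, density is ties divided by possible ties in an undirected network, and the average geodesic is taken over reachable pairs only. The toy network and node names are hypothetical.

```python
from collections import deque

def build_adjacency(nodes, edges):
    """Undirected adjacency sets for the given nodes and ties."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def isolate(adj, targets):
    """Copy of the network with every tie to a target node removed."""
    targets = set(targets)
    return {n: (set() if n in targets else nbrs - targets)
            for n, nbrs in adj.items()}

def density(adj):
    """Existing ties divided by possible ties (undirected, no self-ties)."""
    n = len(adj)
    ties = sum(len(nbrs) for nbrs in adj.values()) / 2
    return ties / (n * (n - 1) / 2) if n > 1 else 0.0

def average_geodesic(adj):
    """Mean shortest-path length over reachable pairs (assumption:
    unreachable pairs are skipped rather than given a penalty)."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical network: a hub tied to everyone, plus a chain a-b-c-d.
nodes = ["hub", "a", "b", "c", "d"]
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"),
         ("a", "b"), ("b", "c"), ("c", "d")]

before = build_adjacency(nodes, edges)
after = isolate(before, ["hub"])

density_before, density_after = density(before), density(after)
geodesic_before, geodesic_after = average_geodesic(before), average_geodesic(after)
```

In this toy case isolating the hub lowers density and raises the average geodesic, the same direction of change as reported above for the five isolated actors.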
Another effect is that a new network elite is likely to emerge. The predicted new elite is
shown in Table 5. The analyst can compare the new elite (Table 5) with the old (Table 4) to
estimate whether the change will be beneficial or harmful to US interests. For example, in Table 5 the
individuals who are now listed as high in cognitive demand are likely to be the new emergent
leaders. If these individuals are harder to influence or less likely to support US interests, then the
foregoing course of action should perhaps be avoided.
Table 5: Network Elite in the Actor-to-Actor Network for al Qaeda 2002 after Removal of
Top Cognitive Demand Actors
Level  Degree           Betweenness      Cognitive Demand    Task Exclusivity  Knowledge Exclusivity
1      Mokhtar Haouari  Mohammed Atta    Richard Reid        Ariel Sharon      Mohammed Fadlallah
2      Adel Boumezbeur  Samih Osailly    Ariel Sharon        Ibrahim Hassouna  Bashar Assad
3      Mustapha Labsi   Yasien Taher     Mourad Ikhlef       Samih Osailly     Mohamad Hammoud
4      Abdel Dahoumane  Ziad Jarrah      Ibrahim Bah         Qaed al-Harethi   Rohan Gunaratna
5      Fateh Kamel      Sahim Alwan      Rabah Kadri         Khalid Mohammed   Ahmed Ghailani
In the near term, changes in interaction are likely to occur and individuals who are not
currently interacting are likely to start. In Figure 6, those connections that are likely to form in
the near term are shown. These are the changes that are likely to occur if there is no
intervention. There are two ways to use this information. First, this analysis suggests where
changes are likely to occur. Second, to the extent that these changes are extremely likely, then it
may be that the connection already exists. As such, one might want to direct intelligence
gathering to confirming whether these ties actually exist. However, with the intervention
(isolation of the five actors), fewer new ties should form. Further, the individuals most likely to
form new ties after the intervention are Amar Makhlulif, Faysal Galab, and Mourad Ikhlef.
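How DyNet generates these predictions is not described in this passage. As a much simpler stand-in, and explicitly not DyNet's method, the sketch below ranks currently unconnected pairs by the number of neighbors they share (a common-neighbor heuristic), with and without an intervention. The network, node names, and the helper `likely_new_ties` are hypothetical.

```python
from itertools import combinations

def likely_new_ties(adj, k=3):
    """Rank currently unconnected pairs by their number of shared neighbors.
    A stand-in heuristic, not the simulation used in the chapter."""
    candidates = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already tied
        shared = len(adj[u] & adj[v])
        if shared > 0:
            candidates.append((shared, u, v))
    candidates.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(u, v) for _, u, v in candidates[:k]]

def isolate(adj, targets):
    """Copy of the network with every tie to a target node removed."""
    targets = set(targets)
    return {n: (set() if n in targets else nbrs - targets)
            for n, nbrs in adj.items()}

# Hypothetical network in which "d" links a tight group to a peripheral actor.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

without_intervention = likely_new_ties(adj)
with_intervention = likely_new_ties(isolate(adj, ["d"]))
```

In this toy case the isolation of "d" removes every candidate tie, which illustrates, under the heuristic's assumptions, why fewer new ties would be expected to form after an intervention.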
such as knowledge, resources, tasks, locations and organizations. By combining techniques and
ideas from statistics, computer science and organizational theory, an integrated tool chain for the
extraction, analysis and prediction of relational data is possible. Herein, one instantiation of this
tool chain is presented and then used to examine data on al Qaeda 2002.
The results presented here should not be interpreted as findings about al Qaeda. Only a
sample of data for a single year is shown. As such, all findings, such as who the elite are, how
the group is likely to change, and the nature of its basic makeup, are not likely to be correct. The import of
the work, however, is in the methodology and the type of activities that these techniques enable
the analyst to engage in.
The techniques presented here suffer from a few limitations. First, although the extraction of
networks from texts is rapid once the thesaurus is created, the main limitation in
applying AutoMap to a new corpus is the time it takes to create new thesauri. Future work
should explore techniques to make thesaurus construction more automatic. The second limitation
is that the tool does not infer linkages across texts. Future work should explore whether some
type of limited expert system or learning algorithm could be used to infer additional links.
Second, the analysis of the network focuses only on relational data. Future work should explore
augmenting the analysis to consider non-relational information such as node attributes (e.g., are
the actors married or not or personality factors) to provide a more complete While these are
critical to groups, they are somewhat removed from more detailed performance indicators that
we might wish to influence such as ability to engage in recruiting, gathering finances, and
planning. Future work should explore how to link alternative performance metrics to general
network or relational data.
Nevertheless, this work demonstrates that it is possible to consider multiple types of
networks simultaneously. Moreover, we see that stronger metrics for assessing the shape of the
terrorist group and identifying its vulnerabilities are made possible by examining multiple
networks at the same time. By taking into account the entire meta-network, actors who occupy
unique positions, not just in whom they are connected to but in what they know and are capable of,
can be identified. Once identified, courses of action for intervening can be assessed. This
assessment can be in terms of both the immediate effects and near-term effects (next few
months). Predicted changes should be viewed in two ways: as pointing out what might happen
and as suggesting connections that might already exist. Hence these tools can be used both to
understand change and to afford guidance for information acquisition.
Author Information
Contact
Kathleen M. Carley
1323 Wean Hall
Institute for Software Research International, SCS
Carnegie Mellon University
Pittsburgh, PA 15213
Email: kathleen.carley@cmu.edu
Fax: 1-412-268-1744
Tel: 1-412-268-6016
Bio
Kathleen M. Carley's research combines cognitive science, social networks and computer
science. Her specific research areas are computational social and organization theory, group,
organizational and social adaptation and evolution, dynamic network analysis, computational
text analysis, and the impact of telecommunication technologies and policy on communication,
information diffusion, disease contagion and response within and among groups, including
command and control teams, particularly in disaster or crisis situations. Her models meld
multi-agent technology with social network dynamics and empirical data. Four of the large-scale
multi-agent network models she and the CASOS group have developed are: BioWar, a
city-scale model of weaponized biological attacks; OrgAhead, a model of strategic and natural
organizational adaptation; Construct, a model of the co-evolution of social and knowledge
networks and personal/organizational identity and capability; and DyNet, a system for evaluating
alternative destabilization strategies on covert networks.