
1) AI definitions.

Artificial intelligence (AI) is the intelligence exhibited by machines or software. It is
also the name of the academic field that studies how to create computers and
computer software capable of intelligent behavior. Major AI researchers and
textbooks define this field as "the study and design of intelligent agents", in which an
intelligent agent is a system that perceives its environment and takes actions that maximize
its chances of success. John McCarthy, who coined the term in 1955, defines it as "the
science and engineering of making intelligent machines".
AI research is highly technical and specialized, and is deeply divided into subfields that
often fail to communicate with each other. Some of the division is due to social and cultural
factors: subfields have grown up around particular institutions and the work of individual
researchers. AI research is also divided by several technical issues. Some subfields focus on
the solution of specific problems. Others focus on one of several possible approaches or on
the use of a particular tool or towards the accomplishment of particular applications.
The central problems (or goals) of AI research include reasoning, knowledge, planning,
learning, natural language processing (communication), perception and the ability to move
and manipulate objects. General intelligence is still among the field's long-term
goals. Currently popular approaches include statistical methods, computational intelligence
and traditional symbolic AI. There are a large number of tools used in AI, including versions
of search and mathematical optimization, logic, methods based on probability and
economics, and many others. The AI field is interdisciplinary, in which a number of sciences
and professions converge, including computer science, mathematics, psychology, linguistics,
philosophy and neuroscience, as well as other specialized fields such as artificial psychology.
2. The Turing test.
The Turing test is a test, developed by Alan Turing in 1950, of a machine's ability to
exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. Turing
proposed that a human evaluator would judge natural language conversations between a
human and a machine that is designed to generate human-like responses. The evaluator
would be aware that one of the two partners in conversation is a machine, and all
participants would be separated from one another. The conversation would be limited to a
text-only channel such as a computer keyboard and screen so that the result would not be
dependent on the machine's ability to render words as speech. If the evaluator cannot
reliably tell the machine from the human (Turing originally predicted that an average
interrogator would have no more than a 70% chance of making the correct identification
after five minutes of questioning), the machine is
said to have passed the test. The test does not check the ability to give correct answers to
questions, only how closely answers resemble those a human would give.
The test was introduced by Turing in his paper, "Computing Machinery and

Intelligence", while working at the University of Manchester (Turing, 1950, p. 460). It opens
with the words: "I propose to consider the question, 'Can machines think?'" Because
"thinking" is difficult to define, Turing chooses to "replace the question by another, which is
closely related to it and is expressed in relatively unambiguous words." Turing's new
question is: "Are there imaginable digital computers which would do well in the imitation
game?" This question, Turing believed, is one that can actually be answered. In the
remainder of the paper, he argued against all the major objections to the proposition that
"machines can think".
3. What are the cognitive sciences and the relation between them and AI.
Cognitive science is the interdisciplinary scientific study of the mind and its processes. It
examines what cognition is, what it does and how it works. It includes research on
intelligence and behaviour, especially focusing on how information is represented,
processed, and transformed (in faculties such as perception, language, memory, attention,
reasoning, and emotion) within nervous systems (humans or other animals) and machines
(e.g. computers). Cognitive science consists of multiple research disciplines, including
psychology, artificial intelligence, philosophy, neuroscience, linguistics, and anthropology. It
spans many levels of analysis, from low-level learning and decision mechanisms to high-level
logic and planning; from neural circuitry to modular brain organization. The
fundamental concept of cognitive science is that "thinking can best be understood in terms
of representational structures in the mind and computational procedures that operate on
those structures."
4. What are the roots of AI.
Philosophy: logic, methods of reasoning, mind as physical system, foundations of learning,
language, rationality
Mathematics: formal representation and proof, algorithms, computation, (un)decidability,
(in)tractability, probability
Psychology: adaptation phenomena of perception and motor control experimental
techniques (psychophysics, etc.)
Economics: formal theory of rational decisions
Linguistics: knowledge representation, grammar
Neuroscience: plastic physical substrate for mental activity
Control theory: homeostatic systems, stability, simple optimal agent designs
5. Name at least 3 important facts from the history of AI.
1. Greek myths of Hephaestus and Pygmalion incorporated the idea of intelligent robots
(such as Talos) and artificial beings (such as Galatea and Pandora).

2. René Descartes proposed that the bodies of animals are nothing more than complex machines
(but that mental phenomena are of a different "substance").
3. Samuel Butler suggested that Darwinian evolution also applies to machines, and
speculated that they will one day become conscious and eventually supplant humanity.
6. The General Problem Solver.
General Problem Solver or G.P.S. was a computer program created in 1959 by Herbert
A. Simon, J.C. Shaw, and Allen Newell intended to work as a universal problem solver
machine. Any problem that can be expressed as a set of well-formed formulas (WFFs) or
Horn clauses, and that constitute a directed graph with one or more sources (viz., axioms)
and sinks (viz., desired conclusions), can be solved, in principle, by GPS. Proofs in
predicate logic and Euclidean geometry problem spaces are prime examples of the domain
of applicability of GPS. It was based on Simon and Newell's
theoretical work on logic machines. GPS was the first computer program which separated its
knowledge of problems (rules represented as input data) from its strategy of how to solve
problems (a generic solver engine). GPS was implemented in the list-processing
language IPL.
While GPS solved simple problems such as the Towers of Hanoi that could be
sufficiently formalized, it could not solve any real-world problems because search was easily
lost in the combinatorial explosion. Put another way, the number of "walks" through the
inferential digraph became computationally untenable. (In practice, even a straightforward
state space search such as the Towers of Hanoi can become computationally infeasible,
although judicious pruning of the state space can be achieved by such elementary AI
techniques as alpha-beta pruning and min-max.)
7) The Mycin system
MYCIN was an early expert system that used artificial intelligence to identify bacteria
causing severe infections, such as bacteremia and meningitis, and to recommend
antibiotics, with the dosage adjusted for the patient's body weight. The name derived from
the antibiotics themselves, as many antibiotics have the suffix "-mycin". The MYCIN system
was also used for the diagnosis of blood clotting diseases.
MYCIN was developed over five or six years in the early 1970s at Stanford University.
It was written in Lisp as the doctoral dissertation of Edward Shortliffe under the direction of
Bruce G. Buchanan, Stanley N. Cohen and others. It arose in the laboratory that had created
the earlier Dendral expert system.
MYCIN was never actually used in practice but research indicated that it proposed an

acceptable therapy in about 69% of cases, which was better than the performance of
infectious disease experts who were judged using the same criteria.
8) The Eliza system.
ELIZA is a computer program and an early example of primitive natural language
processing. ELIZA operated by processing users' responses to scripts, the most famous of
which was DOCTOR, a simulation of a Rogerian psychotherapist. Using almost no
information about human thought or emotion, DOCTOR sometimes provided a startlingly
human-like interaction. ELIZA was written at MIT by Joseph Weizenbaum between 1964 and
1966.
When the "patient" exceeded the very small knowledge base, DOCTOR might provide a
generic response, for example, responding to "My head hurts" with "Why do you say your
head hurts?" A possible response to "My mother hates me" would be "Who else in your
family hates you?" ELIZA was implemented using simple pattern matching techniques, but
was taken seriously by several of its users, even after Weizenbaum explained to them how it
worked. It was one of the first chatterbots.
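The pattern-matching idea behind DOCTOR can be sketched in a few lines. This is a minimal illustration of the technique, not Weizenbaum's original script; the rules and templates here are our own:

```python
import re

# ELIZA-style rules: (pattern, response template). A matched group from the
# user's sentence can be reflected back into the response.
RULES = [
    (re.compile(r'my (.+) hurts', re.I), 'Why do you say your {0} hurts?'),
    (re.compile(r'my (\w+) hates me', re.I), 'Who else in your family hates you?'),
    (re.compile(r'i am (.+)', re.I), 'How long have you been {0}?'),
]

def respond(sentence):
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            return template.format(*match.groups())
    return 'Please tell me more.'        # generic fallback response

print(respond('My head hurts'))          # Why do you say your head hurts?
print(respond('My mother hates me'))     # Who else in your family hates you?
```

The fallback response is what the text above describes: when the input falls outside the small rule base, the program answers with a generic, content-free prompt.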
9) Name at least 3 programming languages and/or frameworks specific for AI. (Note: general
purpose programming languages, such as Java are excluded).
Prolog is a declarative language where programs are expressed in terms of relations, and
execution occurs by running queries over these relations. Prolog is particularly useful for
symbolic reasoning, database and language parsing applications. Prolog is widely used in AI
today.
Python is very widely used for Artificial Intelligence, and has packages for many AI
subfields: general AI, machine learning, natural language processing and neural networks.
Companies like Narrative Science use Python to create artificial intelligence for natural
language generation.
IPL was the first language developed for artificial intelligence. It includes features
intended to support programs that could perform general problem solving, including lists,
associations, schemas (frames), dynamic memory allocation, data types, recursion,
associative retrieval, functions as arguments, generators (streams), and cooperative
multitasking.
10) Name at least 5 subfields of AI.

Neural Networks e.g. brain modelling, time series prediction, classification


Evolutionary Computation e.g. genetic algorithms, genetic programming
Vision e.g. object recognition, image understanding
Robotics e.g. intelligent control, autonomous exploration
Expert Systems e.g. decision support systems, teaching systems
Speech Processing e.g. speech recognition and production
Natural Language Processing e.g. machine translation
Planning e.g. scheduling, game playing
Machine Learning e.g. decision tree learning, version space learning

11) Describe the Breadth-First search algorithm. Example.


Breadth-first search (BFS) is an algorithm for traversing or searching tree or graph data structures. It
starts at the tree root (or some arbitrary node of a graph, sometimes referred to as a "search key") and
explores the neighbor nodes first, before moving to the next-level neighbours.
BFS was invented in the late 1950s by E. F. Moore, who used it to find the shortest path out of a maze,
and discovered independently by C. Y. Lee as a wire routing algorithm. (ex: a tree)
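A minimal sketch of BFS on a small tree (node names are our own illustration). The FIFO queue is what makes the search level-by-level:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search over an adjacency-list graph.
    Returns a shortest path (fewest edges) from start to goal, or None."""
    frontier = deque([[start]])      # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

# A small tree with root A.
tree = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F']}
print(bfs(tree, 'A', 'F'))   # ['A', 'C', 'F']
```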
12) Describe the Depth-First search algorithm. Example.
Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures.
One starts at the root (selecting some arbitrary node as the root in case of a graph) and explores as
far as possible along each branch before backtracking.
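A recursive sketch of DFS (the graph is our own toy example); the recursion goes as deep as possible along a branch before backtracking:

```python
def dfs(graph, node, goal, visited=None):
    """Recursive depth-first search; returns a path to goal or None."""
    if visited is None:
        visited = set()
    visited.add(node)
    if node == goal:
        return [node]
    for neighbor in graph.get(node, []):
        if neighbor not in visited:
            rest = dfs(graph, neighbor, goal, visited)
            if rest is not None:
                return [node] + rest      # unwind: prepend current node
    return None                           # dead end: backtrack

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': []}
print(dfs(graph, 'A', 'E'))   # ['A', 'C', 'E']
```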

13) Describe the A* search algorithm. Example.


- computer algorithm that is widely used in pathfinding and graph traversal, the process of plotting
an efficiently traversable path between multiple points called nodes. Noted for its performance and
accuracy, it enjoys widespread use. However, in practical travel-routing systems, it is generally
outperformed by algorithms which can pre-process the graph to attain better performance,
although other work has found A* to be superior to other approaches.
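A compact A* sketch on a toy 4-connected grid (the grid, obstacle and Manhattan heuristic are our own illustration). Nodes are expanded in order of f = g + h, where g is the cost so far and h the heuristic estimate to the goal:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: neighbors(n) yields (cost, next_node); h is the heuristic."""
    open_heap = [(h(start), 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        for cost, nxt in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float('inf')):   # found a cheaper route
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

# 3x3 grid with an obstacle at (1, 1); unit step costs.
blocked = {(1, 1)}
def grid_neighbors(p):
    x, y = p
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 3 and 0 <= ny < 3 and (nx, ny) not in blocked:
            yield 1, (nx, ny)

goal = (2, 2)
manhattan = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
cost, path = a_star((0, 0), goal, grid_neighbors, manhattan)
print(cost)   # 4
```

Manhattan distance never overestimates on a 4-connected grid, so it is admissible and the returned path is optimal.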

14) Describe the IDA* search algorithm. Example.


Iterative deepening A* (IDA*) is a graph traversal and path search algorithm that can find the
shortest path between a designated start node and any member of a set of goal nodes in a
weighted graph. It is a variant of iterative deepening depth-first search that borrows from the
A* search algorithm the idea of using a heuristic function to evaluate the remaining cost to
get to the goal. Since it is a depth-first search algorithm, its memory usage is lower than in A*, but
unlike ordinary iterative deepening search, it concentrates on exploring the most promising
nodes and thus doesn't go to the same depth everywhere in the search tree. Unlike A*, IDA*
doesn't utilize dynamic programming and therefore often ends up exploring the same nodes
many times.
While the standard iterative deepening depth-first search uses search depth as the cutoff for
each iteration, IDA* uses the more informative f(n) = g(n) + h(n), where g(n) is the
cost to travel from the root to node n and h(n) is a problem-specific heuristic estimate of the
cost to travel from n to the solution. As in A*, the heuristic has to have particular properties
(admissibility) to guarantee optimality (shortest paths).


Applications of IDA* are found in such problems as planning. The algorithm was first described
by Richard Korf in 1985.
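The cutoff-raising loop described above can be sketched as follows (the weighted graph, edge costs and heuristic values are our own toy example; h is admissible by construction):

```python
def ida_star(start, goal, neighbors, h):
    """IDA*: depth-first search bounded by f = g + h; after each iteration
    the bound is raised to the smallest f-value that exceeded it."""
    def search(path, g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f                      # exceeded: report candidate bound
        if node == goal:
            return path                   # success: return the path itself
        minimum = float('inf')
        for cost, nxt in neighbors(node):
            if nxt not in path:           # avoid cycles on the current path
                result = search(path + [nxt], g + cost, bound)
                if isinstance(result, list):
                    return result
                minimum = min(minimum, result)
        return minimum

    bound = h(start)
    while True:
        result = search([start], 0, bound)
        if isinstance(result, list):
            return result
        if result == float('inf'):        # no f-value left to try: unsolvable
            return None
        bound = result                    # deepen with the next larger bound

# Edge lists of (cost, neighbor); shortest path A->B->C->G costs 4.
graph = {'A': [(1, 'B'), (4, 'C')], 'B': [(2, 'C'), (5, 'G')], 'C': [(1, 'G')]}
h = {'A': 3, 'B': 2, 'C': 1, 'G': 0}
print(ida_star('A', 'G', lambda n: graph.get(n, []), lambda n: h[n]))
# ['A', 'B', 'C', 'G']
```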

15) Describe the SMA* search algorithm. Example.


SMA* or Simplified Memory Bounded A* is a shortest path algorithm based on the A*
algorithm. The main advantage of SMA* is that it uses a bounded memory, while the A*
algorithm might need exponential memory. All other characteristics of SMA* are inherited from
A*.
Like A*, it expands the most promising branches according to the heuristic. What sets SMA*
apart is that it prunes nodes whose expansion has revealed them to be less promising than expected.
This approach allows the algorithm to explore branches and backtrack to explore other branches.
Expansion and pruning of nodes is driven by keeping two values of f for every node. A node
stores a value f(n) which estimates the cost of reaching the goal by taking a path through that
node. The lower the value, the higher the priority. As in A* this value is initialized
to g(n) + h(n), but will then be updated to reflect changes to this estimate when its children
are expanded. A fully expanded node will have an f value at least as high as that of its
successors. In addition, the node stores the f value of the best forgotten successor. This value
is restored if the forgotten successor is revealed to be the most promising successor.
Starting with the first node, it maintains OPEN, ordered lexicographically by f and depth. When
choosing a node to expand, it chooses the best according to that order. When selecting a node
to prune, it chooses the worst.

16) Describe the RBFS search algorithm. Example.

RBFS is a linear-space algorithm that expands nodes in best-first order even with a nonmonotonic cost function and generates fewer nodes than iterative deepening with a
monotonic cost function.

In order to be expanded, the upper bound on a node must be at least as large as its stored
value.

If a node has been previously expanded, its stored value will be greater than its static value.

If the stored value of a node is greater than its static value, its stored value is the minimum
of the last stored values of its children.

In general, a parent's stored value is passed down to its children, which inherit the value
only if it exceeds both the parent's static value and the child's static value.
17) Describe the Hill-climbing search algorithm. Example.
In computer science, hill climbing is a mathematical optimization technique which belongs to
the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a
problem, then attempts to find a better solution by incrementally changing a single element of
the solution. If the change produces a better solution, an incremental change is made to the
new solution, repeating until no further improvements can be found.
For example, hill climbing can be applied to the travelling salesman problem. It is easy to find an
initial solution that visits all the cities but will be very poor compared to the optimal solution. The
algorithm starts with such a solution and makes small improvements to it, such as switching the
order in which two cities are visited. Eventually, a much shorter route is likely to be obtained.
Hill climbing is good for finding a local optimum (a solution that cannot be improved by
considering a neighbouring configuration) but it is not necessarily guaranteed to find the best
possible solution (the global optimum) out of all possible solutions (the search space). In convex
problems, hill-climbing is optimal. Examples of algorithms that solve convex problems by
hill-climbing include the simplex algorithm for linear programming and binary search.
The relative simplicity of the algorithm makes it a popular first choice amongst optimizing
algorithms. It is used widely in artificial intelligence, for reaching a goal state from a starting
node. Choice of next node and starting node can be varied to give a list of related algorithms.
Although more advanced algorithms such as simulated annealing or tabu search may give
better results, in some situations hill climbing works just as well. Hill climbing can often produce
a better result than other algorithms when the amount of time available to perform a search is
limited, such as with real-time systems. It is an anytime algorithm: it can return a valid solution
even if it's interrupted at any time before it ends.
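The travelling-salesman example above can be sketched directly: start from a random tour and keep any city swap that shortens it (the 4-city distance matrix is our own toy instance; real TSP neighbourhoods usually use 2-opt moves rather than plain swaps):

```python
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def hill_climb(dist, seed=0):
    """Hill climbing for a tiny TSP: repeatedly swap two cities and keep
    the swap whenever it produces a shorter tour."""
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    rng.shuffle(tour)                     # arbitrary initial solution
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                candidate = tour[:]
                candidate[i], candidate[j] = candidate[j], candidate[i]
                if tour_length(candidate, dist) < tour_length(tour, dist):
                    tour, improved = candidate, True   # keep the improvement
    return tour, tour_length(tour, dist)

# Symmetric distance matrix for 4 cities; the optimal tour has length 18.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour, length = hill_climb(dist)
print(length)   # 18
```

With only 4 cities every local optimum is also global here; on larger instances the algorithm can get stuck, which is exactly the limitation discussed above.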

18) Describe the Genetic algorithm principle.


In the field of artificial intelligence, a genetic algorithm (GA) is a search heuristic that mimics
the process of natural selection. This heuristic (also sometimes called a metaheuristic) is
routinely used to generate useful solutions to optimization and search problems. Genetic
algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions
to optimization problems using techniques inspired by natural evolution, such as
inheritance, mutation, selection, and crossover.
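A minimal GA sketch showing all four operators named above, applied to the classic OneMax toy problem (maximize the number of 1-bits); the population sizes and rates are arbitrary choices of ours:

```python
import random

def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=60,
                      p_mut=0.05, seed=1):
    """Minimal GA: binary-tournament selection, one-point crossover
    (inheritance), and per-bit mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def select():                           # selection: binary tournament
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n_bits)  # crossover point
            child = p1[:cut] + p2[cut:]     # inheritance from both parents
            child = [bit ^ 1 if rng.random() < p_mut else bit
                     for bit in child]      # mutation: random bit flips
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# OneMax: fitness is simply the number of 1-bits in the chromosome.
best = genetic_algorithm(fitness=sum)
print(sum(best))
```

After a few dozen generations the best chromosome is close to the all-ones optimum, though a GA gives no guarantee of reaching it.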

19) Describe the MIN-MAX algorithm. Example.


Minimax (sometimes MinMax or MM) is a decision rule used in decision theory, game
theory, statistics and philosophy for minimizing the possible loss for a worst case (maximum
loss) scenario. Originally formulated for two-player zero-sum game theory, covering both the
cases where players take alternate moves and those where they make simultaneous moves, it
has also been extended to more complex games and to general decision-making in the
presence of uncertainty.
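A sketch of minimax on a hand-made two-ply game tree (the tree and leaf values are our own illustration): MAX picks the child with the highest value, MIN the lowest.

```python
def minimax(node, maximizing, game):
    """Minimax over a game tree given as game = {'tree': {node: children},
    'values': {leaf: payoff}}; leaves are nodes with no children."""
    children = game['tree'].get(node, [])
    if not children:
        return game['values'][node]        # leaf: return its payoff
    scores = [minimax(c, not maximizing, game) for c in children]
    return max(scores) if maximizing else min(scores)

# MAX moves first (root), then MIN chooses among the leaves.
game = {
    'tree': {'root': ['L', 'R'], 'L': ['L1', 'L2'], 'R': ['R1', 'R2']},
    'values': {'L1': 3, 'L2': 5, 'R1': 2, 'R2': 9},
}
print(minimax('root', True, game))   # 3
```

MIN would answer L with 3 and R with 2, so MAX prefers L: the root value is max(3, 2) = 3.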

20) Describe the Alpha-Beta pruning algorithm. Example.


Alpha-beta pruning is a search algorithm that seeks to decrease the number of nodes that
are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm
used commonly for machine playing of two-player games (Tic-tac-toe, chess, etc.). It stops
completely evaluating a move when at least one possibility has been found that proves the
move to be worse than a previously examined move. Such moves need not be evaluated
further. When applied to a standard minimax tree, it returns the same move as minimax would,
but prunes away branches that cannot possibly influence the final decision.
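The same toy game tree used for minimax illustrates the cutoff (the tree and values are our own): once MAX has secured 3 from the left branch, the right branch is abandoned as soon as MIN finds the value 2 there.

```python
def alphabeta(node, maximizing, game, alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: same result as plain minimax,
    but branches outside the (alpha, beta) window are skipped."""
    children = game['tree'].get(node, [])
    if not children:
        return game['values'][node]
    if maximizing:
        value = float('-inf')
        for c in children:
            value = max(value, alphabeta(c, False, game, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                 # beta cutoff: MIN would avoid this line
        return value
    value = float('inf')
    for c in children:
        value = min(value, alphabeta(c, True, game, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break                     # alpha cutoff: MAX would avoid this line
    return value

game = {
    'tree': {'root': ['L', 'R'], 'L': ['L1', 'L2'], 'R': ['R1', 'R2']},
    'values': {'L1': 3, 'L2': 5, 'R1': 2, 'R2': 9},
}
print(alphabeta('root', True, game))   # 3
```

Here leaf R2 is never evaluated: after R1 = 2, MIN's bound (beta = 2) falls below MAX's guarantee (alpha = 3).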

21) Describe the Constraint Satisfying Problem (CSP).


Constraint satisfaction problems (CSPs) are mathematical problems defined as a set of
objects whose state must satisfy a number of constraints or limitations. CSPs represent the
entities in a problem as a homogeneous collection of finite constraints over variables, which is
solved by constraint satisfaction methods. CSPs are the subject of intense research in both
artificial intelligence and operations research, since the regularity in their formulation provides a
common basis to analyze and solve problems of many seemingly unrelated families. CSPs
often exhibit high complexity, requiring a combination of heuristics and combinatorial
search methods to be solved in a reasonable time. The Boolean satisfiability problem (SAT),
satisfiability modulo theories (SMT) and answer set programming (ASP) can be roughly
thought of as certain forms of the constraint satisfaction problem.
Examples of simple problems that can be modeled as a constraint satisfaction problem

Eight queens puzzle

Map coloring problem

Sudoku.

Examples demonstrating the above are often provided with tutorials of ASP, Boolean SAT and
SMT solvers. In the general case, constraint problems can be much harder, and may not be
expressible in some of these simpler systems.
"Real life" examples include automated planning and resource allocation.

22) Describe the Backtracking search algorithm. Example.


Backtracking is a general algorithm for finding all (or some) solutions to some computational
problems, notably constraint satisfaction problems, that incrementally builds candidates to the
solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c
cannot possibly be completed to a valid solution.
The classic textbook example of the use of backtracking is the eight queens puzzle, that
asks for all arrangements of eight chess queens on a standard chessboard so that no queen
attacks any other. In the common backtracking approach, the partial candidates are
arrangements of k queens in the first k rows of the board, all in different rows and columns. Any
partial solution that contains two mutually attacking queens can be abandoned.
Backtracking can be applied only for problems which admit the concept of a "partial
candidate solution" and a relatively quick test of whether it can possibly be completed to a valid
solution. It is useless, for example, for locating a given value in an unordered table. When it is
applicable, however, backtracking is often much faster than brute force enumeration of all
complete candidates, since it can eliminate a large number of candidates with a single test.
Backtracking is an important tool for solving constraint satisfaction problems, such as
crosswords, verbal arithmetic, Sudoku, and many other puzzles. It is often the most convenient
(if not the most efficient) technique for parsing, for the knapsack problem and other
combinatorial optimization problems. It is also the basis of the so-called logic programming
languages such as Icon, Planner and Prolog.
Backtracking depends on user-given "black box procedures" that define the problem to be
solved, the nature of the partial candidates, and how they are extended into complete
candidates. It is therefore a metaheuristic rather than a specific algorithm although, unlike

many other meta-heuristics, it is guaranteed to find all solutions to a finite problem in a bounded
amount of time.
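The eight queens example above can be written as a short backtracking routine: place one queen per row, and abandon ("backtrack from") any partial placement containing two attacking queens.

```python
def solve_queens(n=8):
    """Backtracking for the n-queens puzzle. Partial candidates are
    placements of queens in the first len(queens) rows."""
    solutions = []

    def place(queens):                     # queens[i] = column of row i's queen
        row = len(queens)
        if row == n:
            solutions.append(queens[:])    # complete, conflict-free placement
            return
        for col in range(n):
            # safe iff no earlier queen shares a column or a diagonal
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(queens)):
                queens.append(col)         # extend the partial candidate
                place(queens)
                queens.pop()               # backtrack

    place([])
    return solutions

print(len(solve_queens(8)))   # 92
```

A single safety test rejects an entire subtree of completions, which is why this runs in milliseconds while brute-force enumeration of all 8^8 placements would not.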

23) Describe the Distributed Constraint Satisfying Problem (DCSP).


Distributed constraint optimization (DCOP or DisCOP) is the distributed analogue of constraint
optimization. A DCOP is a problem in which a group of agents must, in a distributed fashion, choose
values for a set of variables such that the cost of a set of constraints over the variables is either
minimized or maximized.
Distributed Constraint Satisfaction is a framework for describing a problem in terms of
constraints that are known and enforced by distinct participants (agents). The constraints are
described on some variables with predefined domains, and have to be assigned to the same values
by the different agents.
Problems defined with this framework can be solved by any of the algorithms that are proposed
for it.
The framework was used under different names in the 1980s. The first known usage with the
current name is in 1990.
24) Describe the Asynchronous BackTracking (ABT) Family of algorithms.
Asynchronous Backtracking (ABT) was a pioneer algorithm to solve DisCSP, dating its first version
from 1992. ABT is an asynchronous algorithm executed autonomously by each agent in the distributed
constraint network. Each agent takes its own decisions and informs other agents of them, and no agent
has to wait for decisions of others. It computes a globally consistent solution (or detects that no solution
exists) in finite time; its correctness and completeness have been proven.
ABT requires constraints to be directed. A constraint causes a directed link between the two
constrained agents: the value-sending agent, from which the link departs, and the constraint-evaluating
agent, to which the link arrives. When the value-sending agent makes an assignment, it informs the
constraint-evaluating agent, which tries to find a consistent value. If it cannot, it sends back a message
to the value-sending agent to cause backtracking. To make the network cycle-free there is a total order
among agents, which is followed by the directed links. The ABT algorithm is executed on each agent,
keeping its own agent view and nogood list. Considering a generic agent self, the agent view of self is the
set of values that it believes to be assigned to agents connected to self by incoming links. The nogood
list keeps the nogoods received by self as justifications of inconsistent values. Agents exchange
assignments and nogoods. ABT always accepts new assignments, updating the agent view accordingly.
When receiving a nogood, it is accepted if it is consistent with the agent view of self, otherwise it is
discarded as obsolete. An accepted nogood is used to update the nogood list. When an agent cannot
find any value consistent with its agent view, because of the original constraints or the received
nogoods, new nogoods are generated from its agent view and sent to the closest agent in the new
nogood, causing backtracking. If self receives a nogood including another agent not connected with it,
self requires to add a link from that agent to self. From this point on, a link from the other agent to self

will exist. The process terminates when quiescence is achieved, meaning that a solution has been found,
or when the empty nogood is generated, meaning that the problem is unsolvable.

25) Knowledge Base (KB). Definition. Describe the KB agent's actions.


Knowledge-based agents are best understood as agents that know about their world and reason about
their courses of action.
The knowledge-base (KB): a set of representations of facts about the world
The knowledge representation language: a language whose sentences represent facts about the
world.
TELL and ASK interface: operations for adding new sentences to the KB and querying what is known. This
is similar to updating and querying in databases.
The inference mechanism: a mechanism for determining what follows from what has been TELLed to
the knowledge base. The ASK operation utilizes this inference mechanism.
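The TELL/ASK interface can be sketched as a tiny propositional KB (our own illustration; here the inference mechanism is simply repeated rule application over definite clauses):

```python
class KnowledgeBase:
    """Minimal KB sketch: TELL adds facts and IF-THEN rules;
    ASK answers a query by deriving facts to a fixpoint."""
    def __init__(self):
        self.facts = set()
        self.rules = []                      # (premises, conclusion) pairs

    def tell(self, fact=None, rule=None):
        if fact is not None:
            self.facts.add(fact)
        if rule is not None:
            self.rules.append(rule)

    def ask(self, query):
        derived = set(self.facts)
        changed = True
        while changed:                       # apply rules until nothing new
            changed = False
            for premises, conclusion in self.rules:
                if conclusion not in derived and premises <= derived:
                    derived.add(conclusion)
                    changed = True
        return query in derived

kb = KnowledgeBase()
kb.tell(fact='rain')
kb.tell(rule=({'rain'}, 'wet_ground'))
kb.tell(rule=({'wet_ground'}, 'slippery'))
print(kb.ask('slippery'))   # True
```

As in the description above, ASK does not look up stored sentences directly; it invokes the inference mechanism over everything that has been TELLed.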

26) Expert system. Definition and working principles.


The most important applied area of AI is the field of expert systems. An expert system (ES) is a
knowledge-based system that employs knowledge about its application domain and uses an inferencing
(reason) procedure to solve problems that would otherwise require human competence or expertise.
The power of expert systems stems primarily from the specific knowledge about a narrow domain
stored in the expert system's knowledge base.
It is important to stress to students that expert systems are assistants to decision makers and not
substitutes for them. Expert systems do not have human capabilities. They use a knowledge base of a
particular domain and bring that knowledge to bear on the facts of the particular situation at hand. The
knowledge base of an ES also contains heuristic knowledge - rules of thumb used by human experts who
work in the domain.
The knowledge base of an ES contains both factual and heuristic knowledge. Knowledge
representation is the method used to organize the knowledge in the knowledge base. Knowledge bases
must represent notions such as actions to be taken under circumstances, causality, time, dependencies,
goals, and other higher-level concepts.
Several methods of knowledge representation can be drawn upon. Two of these methods include:
1. Frame-based systems
- are employed for building very powerful ESs. A frame specifies the attributes of a complex object and
frames for various object types have specified relationships.
2. Production rules
- are the most common method of knowledge representation used in business. Rule-based expert
systems are expert systems in which the knowledge is represented by production rules.
A production rule, or simply a rule, consists of an IF part (a condition or premise) and a THEN part
(an action or conclusion). IF condition THEN action (conclusion).
The explanation facility explains how the system arrived at the recommendation. Depending on

the tool used to implement the expert system, the explanation may be either in a natural language or
simply a listing of rule numbers.

27) Knowledge representation using the first order predicate logic. Constants, predicates,
functions, variables, connectives, quantifiers.

First-order logic (FOL) models the world in terms of

Objects, which are things with individual identities

Properties of objects that distinguish them from others

Relations that hold among sets of objects

Functions, which are a subset of relations where there is only one value for any given
input

A sentence is

satisfiable if it is true under some interpretation

valid if it is true under all possible interpretations

inconsistent if there does not exist any interpretation under which the sentence is true

Basics:

- empty set = constant = { }

- unary predicate Set( ), true for sets

- binary predicates:

x ∈ s (true if x is a member of the set s)

s1 ⊆ s2 (true if s1 is a subset of s2)


- binary functions:
intersection s1 ∩ s2, union s1 ∪ s2, adjoining {x|s}
First-order logic:

Much more expressive than propositional logic

Allows objects and relations as semantic primitives

Universal and existential quantifiers

syntax: constants, functions, predicates, equality, quantifiers
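As a small illustration of the syntax (the predicate names King, Greedy and Evil are our own, not from the notes), quantified FOL sentences look like:

```latex
\forall x\; \bigl(King(x) \land Greedy(x) \Rightarrow Evil(x)\bigr)
\qquad
\exists s\; \forall t\; \bigl(Set(t) \Rightarrow s \subseteq t\bigr)
```

The first reads "every greedy king is evil"; the second, using the set vocabulary above, says "there is a set that is a subset of every set" (the empty set).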

28) The modus ponens inference rule. Example.


In propositional logic, modus ponendo ponens (Latin for "the way that affirms by affirming";
generally abbreviated to MP or modus ponens) or implication elimination is a valid, simple argument
form and rule of inference. It can be summarized as "P implies Q; P is asserted to be true, so therefore Q
must be true." The history of modus ponens goes back to antiquity.
While modus ponens is one of the most commonly used concepts in logic, it must not be mistaken
for a logical law; rather, it is one of the accepted mechanisms for the construction of deductive proofs
that includes the "rule of definition" and the "rule of substitution". Modus ponens allows one to
eliminate a conditional statement from a logical proof or argument (the antecedents) and thereby not
carry these antecedents forward in an ever-lengthening string of symbols; for this reason modus ponens
is sometimes called the rule of detachment. Enderton, for example, observes that "modus ponens can
produce shorter formulas from longer ones", and Russell observes that "the process of the inference
cannot be reduced to symbols".
A justification for the trust in inference is the belief that "if the two former assertions [the
antecedents] are not in error, the final assertion [the consequent] is not in error". In other words: if one
statement or proposition implies a second one, and the first statement or proposition is true, then the
second one is also true. If P implies Q and P is true, then Q is true. An example is:
If it is raining, I will meet you at the theater.
It is raining.
Therefore, I will meet you at the theater.
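As a sketch of how this rule can be applied mechanically (Python, with propositions modeled as plain strings; the function name is ours, not from the original notes):

```python
def modus_ponens(implications, facts):
    """Apply modus ponens exhaustively: from P -> Q and P, conclude Q.
    implications: (antecedent, consequent) pairs; facts: known-true propositions."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for p, q in implications:
            if p in derived and q not in derived:
                derived.add(q)   # P -> Q and P, therefore Q
                changed = True
    return derived

rules = [("raining", "meet-at-theater")]
print(sorted(modus_ponens(rules, {"raining"})))  # ['meet-at-theater', 'raining']
```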

29) The modus tollens inference rule. Example.


In propositional logic, modus tollens (or modus tollendo tollens and also denying the
consequent)(Latin for "the way that denies by denying") is a valid argument form and a rule of
inference. It is an application of the general truth that if a statement is true, then so is its contrapositive.
The first to explicitly describe the argument form modus tollens were the Stoics.
The inference rule modus tollens validates the inference from P implies Q and the contradictory of
Q to the contradictory of P.

P → Q
¬Q
∴ ¬P

An example, parallel to the one for modus ponens:
If it is raining, I will meet you at the theater.
I will not meet you at the theater.
Therefore, it is not raining.
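A minimal mechanical sketch of the rule (Python, propositions as plain strings; the function name is ours, not from the original notes):

```python
def modus_tollens(implications, false_props):
    """Apply modus tollens: from P -> Q and not Q, conclude not P.
    implications: (antecedent, consequent) pairs;
    false_props: propositions known to be false."""
    known_false = set(false_props)
    changed = True
    while changed:
        changed = False
        for p, q in implications:
            if q in known_false and p not in known_false:
                known_false.add(p)   # P -> Q and ~Q, therefore ~P
                changed = True
    return known_false

# "If it is raining, I will meet you at the theater. I will not meet you.
#  Therefore, it is not raining."
print(sorted(modus_tollens([("raining", "meet-at-theater")],
                           {"meet-at-theater"})))  # ['meet-at-theater', 'raining']
```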

30) Forward chaining algorithm. General description.


Forward chaining is one of the two main methods of reasoning when using an inference engine and
can be described logically as repeated application of modus ponens. Forward chaining is a popular
implementation strategy for expert systems, business and production rule systems. The opposite of
forward chaining is backward chaining.
Forward chaining starts with the available data and uses inference rules to extract more data (from
an end user, for example) until a goal is reached. An inference engine using forward chaining searches
the inference rules until it finds one where the antecedent (If clause) is known to be true. When such a
rule is found, the engine can conclude, or infer, the consequent (Then clause), resulting in the addition
of new information to its data.
Inference engines will iterate through this process until a goal is reached.
For example, suppose that the goal is to conclude the color of a pet named Fritz, given that he
croaks and eats flies, and that the rule base contains the following four rules:
If X croaks and X eats flies - Then X is a frog
If X chirps and X sings - Then X is a canary
If X is a frog - Then X is green
If X is a canary - Then X is yellow
Let us illustrate forward chaining by following the pattern of a computer as it evaluates the rules.
Assume the following facts:
Fritz croaks
Fritz eats flies
With forward reasoning, the inference engine can derive that Fritz is green in a series of steps:
1. Since the base facts indicate that "Fritz croaks" and "Fritz eats flies", the antecedent of rule #1 is
satisfied by substituting Fritz for X, and the inference engine concludes:
Fritz is a frog
2. The antecedent of rule #3 is then satisfied by substituting Fritz for X, and the inference engine
concludes:
Fritz is green
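The Fritz example above can be sketched in a few lines of Python (a simplified sketch: the rules are pre-instantiated with X = Fritz rather than pattern-matched, and the names are ours):

```python
def forward_chain(rules, facts):
    """Forward chaining: repeatedly fire any rule whose antecedents are all
    known, adding its consequent, until no rule adds anything new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if all(a in facts for a in antecedents) and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

# The four rules from the text, instantiated for Fritz (X = Fritz)
rules = [
    ({"Fritz croaks", "Fritz eats flies"}, "Fritz is a frog"),
    ({"Fritz chirps", "Fritz sings"}, "Fritz is a canary"),
    ({"Fritz is a frog"}, "Fritz is green"),
    ({"Fritz is a canary"}, "Fritz is yellow"),
]
facts = forward_chain(rules, {"Fritz croaks", "Fritz eats flies"})
print("Fritz is green" in facts)  # True
```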

31) Backward chaining algorithm. General description.


Backward chaining (or backward reasoning) is an inference method that can be described (in lay
terms) as working backward from the goal(s). It is used in automated theorem provers, inference
engines, proof assistants and other artificial intelligence applications.
Backward chaining is implemented in logic programming by SLD resolution, which is based on the
modus ponens inference rule. It is one of the two most commonly used methods of reasoning with
inference rules and logical implications; the other is forward chaining. Backward chaining systems
usually employ a depth-first search strategy, as Prolog does.
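A minimal depth-first backward chainer, in the spirit of Prolog, might look like this (a Python sketch with our own naming; the rule set is assumed acyclic):

```python
def backward_chain(goal, rules, facts):
    """Backward chaining: to prove goal, either find it among the facts, or
    find a rule concluding it and recursively prove its antecedents,
    depth-first (as Prolog does). Assumes the rules contain no cycles."""
    if goal in facts:
        return True
    for antecedents, consequent in rules:
        if consequent == goal and all(
                backward_chain(a, rules, facts) for a in antecedents):
            return True
    return False

rules = [
    (["Fritz croaks", "Fritz eats flies"], "Fritz is a frog"),
    (["Fritz is a frog"], "Fritz is green"),
]
print(backward_chain("Fritz is green",
                     rules, {"Fritz croaks", "Fritz eats flies"}))  # True
```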

32) Knowledge representation using production rules. Example.

One of the most popular approaches to knowledge representation is to use production
rules, sometimes called IF-THEN rules. They can take various forms, e.g.:
IF condition THEN action
IF premise
THEN conclusion
IF proposition p1 and proposition p2 are true
THEN proposition p3 is true

Some of the benefits of IF-THEN rules are that they are modular, each defining a
relatively small and, at least in principle, independent piece of knowledge. New rules
may be added and old ones deleted, usually independently of other rules.
Mycin was designed to help the doctor to decide whether a patient has a bacterial
infection, which organism is responsible, which drug may be appropriate for this
infection, and which may be used on the specific patient.
The global knowledge base contains facts and rules relating for example symptoms to
infections, and the local database will contain particular observations about the patient
being examined. A typical rule in Mycin is as follows:
IF the identity of the germ is not known with certainty
AND the germ is gram-positive
AND the morphology of the organism is "rod"
AND the germ is aerobic
THEN there is a strong probability (0.8) that the germ is of type
enterobacteriaceae

33) Knowledge representation using semantic networks. Example.


A semantic network, or frame network, is a network which represents semantic relations between
concepts. This is often used as a form of knowledge representation. It is a directed or undirected graph
consisting of vertices, which represent concepts, and edges.
An example of a semantic network is WordNet, a lexical database of English. It groups English words
into sets of synonyms called synsets, provides short, general definitions, and records the various
semantic relations between these synonym sets. Some of the most common semantic relations defined
are meronymy (A is part of B, i.e. B has A as a part of itself), holonymy (B is part of A, i.e. A has B as a
part of itself), hyponymy (or troponymy) (A is subordinate of B; A is kind of B), hypernymy (A is
superordinate of B), synonymy (A denotes the same as B) and antonymy (A denotes the opposite of B).

WordNet properties have been studied from a network theory perspective and compared to other
semantic networks created from Roget's Thesaurus and word association tasks. From this perspective,
all three have a small-world structure.

34) Knowledge representation using Frames. Example.


Frames were proposed by Marvin Minsky in his 1974 article "A Framework for Representing
Knowledge." A frame is an artificial intelligence data structure used to divide knowledge into
substructures by representing "stereotyped situations." Frames are the primary data structure used in
artificial intelligence Frame languages.
Frames are also an extensive part of knowledge representation and reasoning schemes. Frames
were originally derived from semantic networks and are therefore part of structure based knowledge
representations. According to Russell and Norvig's "Artificial Intelligence, A Modern Approach,"
structural representations assemble "...facts about particular object and event types and arrange the
types into a large taxonomic hierarchy analogous to a biological taxonomy."

35) Knowledge representation using decision trees. Example.


A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their
possible consequences, including chance event outcomes, resource costs, and utility. It is one way to
display an algorithm.
Decision trees are commonly used in operations research, specifically in decision analysis, to help
identify a strategy most likely to reach a goal, but are also a popular tool in machine learning.

36) Knowledge representation using AND-OR trees. Example.


An AND-OR tree is a graphical representation of the reduction of problems (or goals) to conjunctions
and disjunctions of subproblems (or subgoals). For example, the following rules reduce goal P either to
the pair of subgoals Q and R (an AND branch) or to the single subgoal S (an OR branch):
P if Q and R
P if S
Q if T
Q if U
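These four rules can be evaluated as an AND-OR tree. The following Python sketch (a representation of our own devising) treats each goal as a list of OR alternatives, each alternative being an AND list of subgoals:

```python
# AND-OR tree for:  P if Q and R;  P if S;  Q if T;  Q if U.
# Each goal maps to its OR branches; each branch is an AND list of subgoals.
and_or = {
    "P": [["Q", "R"], ["S"]],
    "Q": [["T"], ["U"]],
}

def solve(goal, facts):
    """A goal succeeds if it is a known fact, or if some OR branch has
    all of its AND subgoals succeeding."""
    if goal in facts:
        return True
    return any(all(solve(g, facts) for g in branch)
               for branch in and_or.get(goal, []))

print(solve("P", {"S"}))        # True: P if S
print(solve("P", {"T", "R"}))   # True: Q if T, then P if Q and R
print(solve("P", {"T"}))        # False: R is missing
```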

37) The OAV model for representing semantic nets. Example.

The OAV (Object-Attribute-Value) model represents knowledge as triplets: an object, an attribute of
that object, and the value of that attribute. A semantic net can be built from OAV triplets by drawing
an edge labeled with the attribute from the object node to the value node. For example, the triplet
(apple, color, red) becomes an edge labeled "color" from the node "apple" to the node "red".

38) CLIPS/JESS. Structured facts (defined with deftemplate). Example

The deftemplate construct is used to create a template which can then be used by
nonordered facts to access fields of the fact by name. The deftemplate construct is
analogous to a record or structure definition in programming languages such as
Pascal and C.
The syntax of the deftemplate construct is:
Syntax
(deftemplate <deftemplate-name> [<comment>]
<slot-definition>*)
<slot-definition> ::= <single-slot-definition> |
<multislot-definition>
<single-slot-definition>
::= (slot <slot-name>
<template-attribute>*)
<multislot-definition>
::= (multislot <slot-name>
<template-attribute>*)
<template-attribute> ::= <default-attribute> |
<constraint-attribute>
<default-attribute>
::= (default ?DERIVE | ?NONE | <expression>*) |
(default-dynamic <expression>*)

Redefining a deftemplate will result in the previous definition being discarded. A
deftemplate cannot be redefined while it is being used (for example, by a fact or
pattern in a rule). A deftemplate can have any number of single or multifield slots.
CLIPS always enforces the single and multifield definitions of the deftemplate. For
example, it is an error to store (or match) multiple values in a single-field slot.
Example
(deftemplate object
(slot name)
(slot location)
(slot on-top-of)
(slot weight)
(multislot contents))

39) CLIPS/JESS. Unstructured facts. Example

It is sometimes convenient to have a series of ad-hoc facts in the database, as well as
the structured ones. These are simply lists of words between parentheses, such as:
(alarm on)
(alarm off)

(temperature high)
(valve a3572 open)
40) CLIPS/JESS. Assert and retract. Examples.

Asserting facts
There are two ways of introducing facts into the CLIPS database. One way is to
include them in a set of initial facts
(deffacts initial-facts
(student (sno 123) (sname maigret) (major pre-med)
(advisor simenon)) ... )

deffacts are asserted after the CLIPS file containing them has been loaded into CLIPS
(see below) and then after the (reset) command.
Another way to assert a fact is to assert it "on the fly", generally as an action in a rule:
(assert (student (sno 123) (sname maigret)
(major pre-med) (advisor simenon)))
Notice that, just as in LISP, parentheses are important. Although CLIPS was written
in C (CLIPS stands for "C Language Integrated Production System"), the syntax
for using CLIPS is very much LISP-like.
41) CLIPS/JESS. Constraints. Examples

One type of field constraint is called a connective constraint. There are three types of
connective constraints. The first is called a ~ constraint. Its symbol is the tilde "~".
The ~ constraint acts on the one value that immediately follows it and will not allow
that value.
As a simple example of the ~ constraint, suppose you wanted to write a rule that
would print out "Don't walk" if the light was not green. One approach would be to
write rules for every possible light condition, including all possible malfunctions:
yellow, red, blinking yellow, blinking red, blinking green, winking yellow, blinking

yellow and winking red, and so forth. However, a much easier approach is to use the ~
constraint as shown in the following rule:
(defrule walk
(light ~green)
=>
(printout t "Don't walk" crlf))

By using the ~ constraint, this one rule does the work of many other rules that
required specifying each light condition.
42) CLIPS/JESS. Relational expressions. Examples.

A condition or logical expression is an expression that can only take the
values true or false. A simple form of logical expression is the relational expression.
The following is an example of a relational expression:
x < y

which takes the value true if the value of the variable x is less than the value of the
variable y.
The general form of a relational expression is:
operand1 relational-operator operand2
The operands can be either variables, constants or expressions. If an operand is an
expression then the expression is evaluated and its value used as the operand.
The relational operators allowable in C++ are:
<   less than
>   greater than
<=  less than or equal to
>=  greater than or equal to
==  equals
!=  not equals

Note that equality is tested for using the operator == since = is already used for
assigning values to variables.
The condition is true if the values of the two operands satisfy the relational operator,
and false otherwise.

43) CLIPS/JESS. Reading data from the keyboard.

The following example shows how (read) is used to input data. Note that no extra
(crlf) is needed after the (read) to put the cursor on a new line. The (read)
automatically resets the cursor to a new line.
CLIPS> (clear)
CLIPS> (defrule read-input
(initial-fact)
=>
(printout t "Name a primary color" crlf)
(assert (color (read))))
CLIPS>
(defrule check-input
?color <- (color ?color-read&red|yellow|blue)
=>
(retract ?color)
(printout t "Correct" crlf))
CLIPS> (reset)
CLIPS> (run)
Name a primary color
red
Correct
CLIPS> (reset)
CLIPS> (run)
Name a primary color
green
CLIPS> ; No "correct"

The rule is designed to use keyboard input on the RHS, so it's convenient to trigger
the rule with (initial-fact). Otherwise, you'd have to make up some dummy fact to
trigger the rule.
The (read) function is not a general-purpose function that will read anything you type
on the keyboard. One limitation is that (read) will read only one field. So if you try to
read
primary color is red

only the first field, "primary", will be read. To (read) all the input, you must enclose
the input within double quotes. Of course, once the input is within double quotes, it is
a single literal field. You can then access the substrings "primary", "color", "is", and
"red" with the str-explode or sub-string functions.
The second limitation of (read) is that you can't input parentheses unless they are
within double quotes. Just as you can't assert a fact containing parentheses, you can't
(read) parentheses directly except as literals.
The readline function is used to read multiple values until terminated by a carriage
return. This function reads in data as a string. In order to assert the (readline) data, an
(assert-string) function is used to assert the nonstring fact, just as input by (readline).
A top-level example of (assert-string) follows.
CLIPS> (clear)
CLIPS> (assert-string "(primary color is red)")
<Fact-0>
CLIPS> (facts)
f-0 (primary color is red)
For a total of 1 fact.
CLIPS>

Notice that the argument of (assert-string) must be a string. The following shows how
to assert a fact of multiple fields from (readline).
CLIPS> (clear)
CLIPS> (defrule test-readline
(initial-fact)

=>
(printout t "Enter input" crlf)
(bind ?string (readline))
(assert-string (str-cat "(" ?string ")")))
CLIPS> (reset)
CLIPS> (run)
Enter input
primary color is red
CLIPS> (facts)
f-0 (initial-fact)
f-1 (primary color is red)
For a total of 2 facts.
CLIPS>

Since (assert-string) requires parentheses around the string to be asserted, the (str-cat)
function is used to put them around ?string.
Both (read) and (readline) also can be used to read information from a file by
specifying the logical name of the file as the argument. For more information, see
the CLIPS Reference Manual.
44) CLIPS/Jess. Loop structures and/or techniques.
Loop
The loop loop is used when a single variable is being incremented with each successive loop.
Syntax: loop (variable, startVal, endVal) {script};
A simple increment loop structure. Initializes variable with the value of startVal. Executes script.
Increments variable and tests if it is greater than endVal. If it is not, executes script and continues to loop.
For example, the following script outputs numbers from 1 to 4:

loop (ii, 1, 4)

{type "$(ii)";};

Note: The loop command provides faster looping through a block of script than does the for command. The
enhanced speed is a result of not having to parse out a LabTalk expression for the condition required to stop the
loop.

Doc -e
The doc -e loop is used when a script is being executed to affect objects of a specific type, such as graph
windows. The doc -e loop tells Origin to execute the script for each instance of the specified object type.
Syntax: doc -e object {script};
The different object types are listed in the document command.
For example, the following script prints the window title of all graph windows in the project:

doc -e P {%H=}

For
The for loop is used for all other situations.
Syntax: for (expression1; expression2; expression3) {script};
In the for statement, expression1 is evaluated. This specifies initialization for the loop. Second, expression2 is
evaluated and if true (non-zero), the script is executed. Third, expression3, often incrementing of a counter, is
executed. The process repeats at the second step. The loop terminates when expression2 is found to be false
(zero). Any expression can consist of multiple statements, each separated by a comma.
For example, the following script outputs numbers from 1 to 4:

for(ii=1; ii<=4; ii++)


{
type "$(ii)";
}

45) Planning. The language of planning problems (from section 11.1 of [1]).
PLANNING - In which we see how an agent can take advantage of the structure of a problem to
construct complex plans of action.
The language of planning problems. The preceding discussion suggests that the representation of
planning problems (states, actions, and goals) should make it possible for planning algorithms to take
advantage of the logical structure of the problem. The key is to find a language that is expressive enough
to describe a wide variety of problems, but restrictive enough to allow efficient algorithms to operate
over it.

46) Planning. Forward state-space search


It is sometimes called progression planning, because it moves in the forward direction.
We start in the problem's initial state, considering sequences of actions until we find a sequence that
reaches a goal state. The formulation of planning problems as state-space search problems is as follows:
The initial state of the search is the initial state from the planning problem. In general, each state will
be a set of positive ground literals; literals not appearing are false.
The actions that are applicable to a state are all those whose preconditions are satisfied. The successor
state resulting from an action is generated by adding the positive effect literals and deleting the
negative effect literals. (In the first-order case, we must apply the unifier from the preconditions to the
effect literals.) Note that a single successor function works for all planning problems, a consequence of
using an explicit action representation.
The goal test checks whether the state satisfies the goal of the planning problem.
The step cost of each action is typically 1. Although it would be easy to allow different costs for
different actions, this is seldom done by STRIPS planners.
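The successor function for this search can be sketched as set operations on ground literals (Python; the Fly action and its literals are hypothetical illustrations in the air-cargo style, not taken from the text):

```python
from collections import namedtuple

# A STRIPS-style action: precondition, add and delete lists, all given as
# sets of ground literals. A state is a set of positive literals.
Action = namedtuple("Action", ["name", "precond", "add", "delete"])

def applicable(state, action):
    """An action is applicable when all its preconditions hold in the state."""
    return action.precond <= state

def successor(state, action):
    """Delete the negative-effect literals, then add the positive ones."""
    return (state - action.delete) | action.add

# Hypothetical air-cargo-style action: fly plane P1 from airport A to B
fly = Action("Fly(P1,A,B)",
             precond={"At(P1,A)"},
             add={"At(P1,B)"},
             delete={"At(P1,A)"})

state = {"At(P1,A)", "At(C1,A)"}
if applicable(state, fly):
    state = successor(state, fly)
print(sorted(state))  # ['At(C1,A)', 'At(P1,B)']
```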

47) Planning. Backward state-space search.


Backward search can be difficult to implement when the goal states are described by a set of constraints
rather than being listed explicitly. In particular, it is not always obvious how to generate a description of
the possible predecessors of the set of goal states. We will see that the STRIPS representation makes
this quite easy because sets of states can be described by the literals that must be true in those states.
RELEVANCE The main advantage of backward search is that it allows us to consider only relevant actions.
An action is relevant to a conjunctive goal if it achieves one of the conjuncts of the goal. For example,
the goal in our 10-airport air cargo problem is to have 20 pieces of cargo at airport B, or more precisely,
At(C1, B) ∧ At(C2, B) ∧ . . . ∧ At(C20, B).

48) Bayes' Rule and Its Use (from section 13.6 of [1]).
In probability theory and applications, Bayes' rule relates the odds of event A_1 to the odds of
event A_2, before (prior to) and after (posterior to) conditioning on another event B. The odds on A_1
against A_2 is simply the ratio of the probabilities of the two events. The prior odds is the ratio of the
unconditional or prior probabilities; the posterior odds is the ratio of conditional or posterior
probabilities given the event B. The relationship is expressed in terms of the likelihood ratio or Bayes
factor, Λ. By definition, this is the ratio of the conditional probabilities of the event B given that
A_1 is the case or that A_2 is the case, respectively. The rule simply states: posterior odds equals prior
odds times Bayes factor (Gelman et al., 2005, Chapter 1).
Bayes' rule is an equivalent way to formulate Bayes' theorem. If we know the odds for and against
A we also know the probabilities of A. It may be preferred to Bayes' theorem in practice for a number of
reasons.
Bayes' rule is widely used in statistics, science and engineering, for instance in model selection,

probabilistic expert systems based on Bayes networks, statistical proof in legal proceedings, email spam
filters. As an elementary fact from the calculus of probability, Bayes' rule tells us how unconditional and
conditional probabilities are related whether we work with a frequentist interpretation of probability or
a Bayesian interpretation of probability. Under the Bayesian interpretation it is frequently applied in the
situation where A_1 and A_2 are competing hypotheses, and B is some observed evidence. The rule
shows how one's judgement on whether A_1 or A_2 is true should be updated on observing the
evidence B .
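In code, the odds form of the rule is a one-line multiplication. The numbers below are hypothetical, chosen only to illustrate the update:

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * Bayes factor."""
    return prior_odds * likelihood_ratio

# Hypothetical spam-filter numbers: P(spam) = 0.2, so prior odds = 0.25;
# suppose the word "winner" is 8x more likely in spam, so Lambda = 8.
odds = posterior_odds(0.2 / 0.8, 8.0)
print(odds)               # 2.0
print(odds / (1 + odds))  # posterior probability P(spam | "winner") = 2/3
```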

49) The Semantics of Bayesian Networks (from section 14.2 of [1]).


A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed
acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a
set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For
example, a Bayesian network could represent the probabilistic relationships between diseases and
symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of
various diseases.
Efficient algorithms exist that perform inference and learning in Bayesian networks. Bayesian
networks that model sequences of variables (e.g. speech signals or protein sequences) are called
dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision
problems under uncertainty are called influence diagrams.
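A minimal disease-symptom network with hypothetical conditional probabilities can be queried by enumerating the joint distribution (a Python sketch; the numbers are our own, purely illustrative):

```python
# Two-node network Disease -> Symptom with hypothetical CPTs.
p_disease = 0.01                           # P(D)
p_symptom_given = {True: 0.9, False: 0.1}  # P(S | D) and P(S | ~D)

def p_disease_given_symptom():
    """Infer P(D | S) by enumerating the joint distribution of the network."""
    joint_d_s = p_disease * p_symptom_given[True]          # P(D, S)
    joint_nd_s = (1 - p_disease) * p_symptom_given[False]  # P(~D, S)
    return joint_d_s / (joint_d_s + joint_nd_s)

# A rare disease stays unlikely even after observing the symptom
print(round(p_disease_given_symptom(), 4))  # 0.0833
```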

50) Inference in Temporal Models (from section 15.2 of [1]).


Inference may be defined as the non-logical, but rational means, through observation of patterns
of facts, to indirectly see new meanings and contexts for understanding. Of particular use to this
application of inference are anomalies and symbols. Inference, in this sense, does not draw conclusions
but opens new paths for inquiry. (See second set of Examples.) In this definition of inference, there are
two types of inference: inductive inference and deductive inference.
Human inference (i.e. how humans draw conclusions) is traditionally studied within the field of
cognitive psychology; artificial intelligence researchers develop automated inference systems to
emulate human inference.
Statistical inference uses mathematics to draw conclusions in the presence of uncertainty. This
generalizes deterministic reasoning, with the absence of uncertainty as a special case. Statistical
inference uses quantitative or qualitative (categorical) data which may be subject to random variation.

51) Hidden Markov Models (from section 15.3 of [1]).

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is
assumed to be a Markov process with unobserved (hidden) states. An HMM can be presented as the
simplest dynamic Bayesian network.
In simpler Markov models (like a Markov chain), the state is directly visible to the observer, and
therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state
is not directly visible, but the output, dependent on the state, is visible. Each state has a probability
distribution over the possible output tokens. Therefore the sequence of tokens generated by an HMM
gives some information about the sequence of states. The adjective 'hidden' refers to the state
sequence through which the model passes, not to the parameters of the model; the model is still
referred to as a 'hidden' Markov model even if these parameters are known exactly.
Hidden Markov models are especially known for their application in temporal pattern recognition
such as speech, handwriting and gesture recognition, part-of-speech tagging, musical score following,
partial discharges and bioinformatics.
A hidden Markov model can be considered a generalization of a mixture model where the hidden
variables (or latent variables), which control the mixture component to be selected for each
observation, are related through a Markov process rather than independent of each other. Recently,
hidden Markov models have been generalized to pairwise Markov models and triplet Markov models
which allow consideration of more complex data structures and the modelling of nonstationary data.
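The standard forward algorithm for computing the probability of an observation sequence under an HMM can be sketched as follows (Python; the weather model and all its numbers are hypothetical):

```python
# Toy weather HMM: hidden states Rainy/Sunny, observed activities.
states = ["Rainy", "Sunny"]
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def forward(observations):
    """Forward algorithm: alpha[s] = P(observations so far, current state = s).
    Summing alpha over states gives P(observation sequence)."""
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: emit[s][obs] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())  # marginalize out the final hidden state

print(forward(["walk", "shop"]))
```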

52) Machine learning. Forms of learning (from section 18.1 of [1]).


Machine learning is a subfield of computer science that evolved from the study of pattern
recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined
machine learning as a "field of study that gives computers the ability to learn without being explicitly
programmed". Machine learning explores the study and construction of algorithms that can learn from
and make predictions on data. Such algorithms operate by building a model from example inputs in
order to make data-driven predictions or decisions, rather than following strictly static program
instructions.
Machine learning is closely related to and often overlaps with computational statistics, a discipline
which also focuses on prediction-making through the use of computers. It has strong ties to
mathematical optimization, which delivers methods, theory and application domains to the field.
Machine learning is employed in a range of computing tasks where designing and programming explicit
algorithms is infeasible. Example applications include spam filtering, optical character recognition
(OCR), search engines and computer vision. Machine learning is sometimes conflated with data mining,
where the latter subfield focuses more on exploratory data analysis and is known as unsupervised
learning.

53) Machine learning. Inductive learning (from section 18.2 of [1])


Inductive learning is the area with the largest number of methods.
Goal: to discover general concepts from a limited set of examples (experience).
It is based on the search for similar characteristics among examples (common patterns).

All its methods are based on inductive reasoning.

From a formal point of view, the knowledge that we obtain is not valid:
we assume that a limited number of examples represents the characteristics of the concept that we
want to learn, and just one counterexample invalidates the result.
But most human learning is inductive!

54) Machine learning. Learning decision trees algorithm (from section 18.3 of [1])
Decision tree learning uses a decision tree as a predictive model which maps observations about an
item to conclusions about the item's target value. It is one of the predictive modelling approaches used
in statistics, data mining and machine learning. Tree models where the target variable can take a finite
set of values are called classification trees. In these tree structures, leaves represent class labels and
branches represent conjunctions of features that lead to those class labels. Decision trees where the
target variable can take continuous values (typically real numbers) are called regression trees.
In decision analysis, a decision tree can be used to visually and explicitly represent decisions and
decision making. In data mining, a decision tree describes data but not decisions; rather, the resulting
classification tree can be an input for decision making. The discussion here deals with decision trees in
data mining.
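At each node, a tree learner in the ID3 style chooses the attribute with the highest information gain. A sketch of that computation (Python; the toy data set and attribute names are our own invention):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, label):
    """Reduction in entropy from splitting the examples on one attribute;
    this is the quantity an ID3-style tree learner maximizes at each node."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for v in {e[attribute] for e in examples}:
        subset = [e[label] for e in examples if e[attribute] == v]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

# Hypothetical data: will the customer wait for a table?
data = [
    {"raining": "yes", "hungry": "yes", "wait": "yes"},
    {"raining": "yes", "hungry": "no",  "wait": "no"},
    {"raining": "no",  "hungry": "yes", "wait": "yes"},
    {"raining": "no",  "hungry": "no",  "wait": "no"},
]
print(information_gain(data, "hungry", "wait"))   # 1.0: splits perfectly
print(information_gain(data, "raining", "wait"))  # 0.0: uninformative
```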

55) Statistical Learning Methods. Naive Bayes models (from section 20.2 of [1])
Naive Bayes models have been widely used for clustering and classification. However, they are
seldom used for general probabilistic learning and inference (i.e., for estimating and computing arbitrary
joint, conditional and marginal distributions). It has been shown that, for a wide range of benchmark
datasets, naive Bayes models learned using EM have accuracy and learning time comparable to Bayesian
networks with context-specific independence. Most significantly, naive Bayes inference is orders of
magnitude faster than Bayesian network inference using Gibbs sampling and belief propagation. This
makes naive Bayes models a very attractive alternative to Bayesian networks for general probability
estimation, particularly in large or real-time domains.
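A toy naive Bayes classifier with Laplace smoothing can be written in a few lines (a Python sketch; the training data and word lists are invented purely for illustration):

```python
from collections import Counter, defaultdict

# P(class | words) is proportional to P(class) * product of P(word | class),
# under the naive assumption that words are independent given the class.
train = [("spam", ["win", "money", "now"]),
         ("spam", ["win", "prize"]),
         ("ham",  ["meeting", "now"]),
         ("ham",  ["project", "meeting"])]

class_counts = Counter(label for label, _ in train)
word_counts = defaultdict(Counter)
vocab = set()
for label, words in train:
    word_counts[label].update(words)
    vocab.update(words)

def predict(words):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = class_counts[label] / len(train)  # prior P(class)
        for w in words:
            # Laplace smoothing so unseen words do not zero out a class
            score *= (word_counts[label][w] + 1) / (total + len(vocab))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict(["win", "money"]))  # spam
print(predict(["project"]))       # ham
```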

56) Single layer feed-forward neural networks (perceptrons) (from section 20.5 of [1])
Any number of McCulloch-Pitts neurons can be connected together in any way we like. The
arrangement that has one layer of input neurons feeding forward to one output layer of McCulloch-Pitts
neurons, with full connectivity, is known as a Perceptron.
This is a very simple network, but it is already a powerful computational device. Later we shall see
variations of it that make it even more powerful.
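The perceptron learning rule can be sketched as follows (Python, with our own naming); here it learns the linearly separable AND function:

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Perceptron learning rule: nudge weights toward each misclassified
    example. samples: list of (inputs, target) pairs with 0/1 targets."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - out
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Learn logical AND, a linearly separable function
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```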

57) Multilayer feed-forward neural networks (from section 20.5 of [1]).

MLF (multilayer feed-forward) neural networks, trained with a back-propagation learning algorithm,
are the most popular neural networks. They are applied to a wide variety of chemistry related
problems. An MLF neural network consists of neurons that are ordered into layers. The first layer is
called the input layer, the last layer is called the output layer, and the layers in between are called
hidden layers.
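A single forward pass through a small 2-2-1 network illustrates why hidden layers add power: with suitable weights (hand-picked and hypothetical, not learned here) it computes XOR, a function no single-layer perceptron can represent:

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def forward(x, hidden_weights, output_weights):
    """One forward pass of a 2-2-1 feed-forward network: two inputs, one
    hidden layer of two sigmoid neurons, one sigmoid output neuron.
    Each weight vector is [w1, w2, bias]."""
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in hidden_weights]
    w = output_weights
    return sigmoid(w[0] * hidden[0] + w[1] * hidden[1] + w[2])

# Hand-picked weights approximating XOR:
hw = [[20, 20, -10], [-20, -20, 30]]  # OR-like and NAND-like hidden units
ow = [20, 20, -30]                    # AND of the two hidden units
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(forward(x, hw, ow)))  # 0, 1, 1, 0
```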
