Empirical Methods

1 Empirical Research

Physicists ask what kind of place this universe is and seek to characterize its behavior systematically. Biologists ask what it means for a physical system to be living. We in AI wonder what kind of information-processing system can ask such questions.
Avron B. Barr and Edward A. Feigenbaum, The Handbook of Artificial Intelligence, Vol. 1, p. 11

Empirical methods enhance our observations and help us see more of the structure of the world. We are fundamentally empirical creatures, always asking, What is going on? Is it real or merely apparent? What causes it? Is there a better explanation? Great developments were born of microscopes, telescopes, stethoscopes, and other observational tools. No less important are "datascopes" (visualization methods) and techniques for drawing sound conclusions from data.

Empirical methods cluster loosely into exploratory techniques for visualization, summarization, exploration, and modeling; and confirmatory procedures for testing hypotheses and predictions. In short, empirical = exploratory + experimental.

Because empirical studies usually observe nondeterministic phenomena, which exhibit variability, both exploratory and experimental methods are based in statistics. Exploratory statistical methods are called, collectively, exploratory data analysis, whereas methods for confirmatory experiments go by the name statistical hypothesis testing. You can find any number of textbooks on the latter, very few devoted to the former. You might get the impression from these books and from scientific journals that all empirical work involves controlled experiments and hypothesis testing.
But experiments don't spring like Athena fully formed from the brow of Zeus: They are painstakingly constructed by mortals who usually get it wrong the first time, who require lengthy periods of exploration before formulating a precise experimental question, yet find themselves still amazed at Nature's ability to confound.

Even if experiments were easy, you should take a more encompassing view of empirical work for one reason if no other: experiments and hypothesis testing answer yes-or-no questions. You can reduce all your research interests to a series of yes-or-no questions, but you'll find it slow work. A more efficient approach is to flip back and forth between exploratory and experimental work, engaging in the latter only when the former produces a question you really want to answer.

Empirical methods and statistics are worth learning because a handful of each goes a long way. Psychologists, for example, know perhaps ten major statistical methods and a similar number of techniques for visualizing and exploring data; and they know perhaps two dozen important experiment designs.[1] With these methods they address thousands of research questions. We are AI researchers, not psychologists, but we too can expect a few appropriate methods to go a long way.

The American Heritage Dictionary[2] defines empirical and empiricism as follows:

Empirical (1) Relying upon or derived from observation or experiment: empirical methods. (2) Relying solely on practical experience and without regard for system or theory.

Empiricism (1) The view that experience, esp. of the senses, is the only source of knowledge. (2) The employment of empirical methods, as in science.

In this book, you should read empirical in its first sense and empiricism in its second sense. The other interpretations are too exclusive (Cohen, 1991). Down with empiricism (in its first sense); down with the equally exclusive view that theory is the only source of knowledge.
Up with empirical studies informed by theories and theories informed by observations.

[1] I get these numbers from textbooks on statistics for psychologists and from books on experiment design such as Cook and Campbell 1979 and articles such as Bower and Clapper 1990 and Newell 1973.
[2] American Heritage Dictionary of the English Language, American Heritage Publishing Co. Inc. and Houghton Mifflin Company, 1971.

1.1 AI Programs as Objects of Empirical Studies

Our subject is empirical methods for studying AI programs, methods that involve running programs and recording their behaviors. Unlike other scientists, who study chemical reactions, processes in cells, bridges under stress, animals in mazes, and so on, we study computer programs that perform tasks in environments. It shouldn't be difficult: Compared with biological systems, AI systems are simple; compared with human cognition, their tasks are rudimentary; and compared with everyday physical environments, those in which our programs operate are extremely reduced. Yet programs are in many ways like chemical, biological, mechanical, and psychological processes. For starters, we don't know how they work. We generally cannot say how long they will take to run, when they will fail, how much knowledge is required to attain a particular error rate, how many nodes of a search tree must be examined, and so on. It is but a comforting fiction that building a program confers much understanding of its behavior. Predictive theories of program behavior are thin on the ground and no better than predictive theories of chemical, biological, mechanical, and psychological processes: at best incomplete and approximate, at worst vague or wrong.

[Figure 1.1: How the structure, task, and environment of an organism influence its behavior.]

Studying AI systems is not very different from studying moderately intelligent animals such as rats.
One obliges the agent (rat or program) to perform a task according to an experimental protocol, observing and analyzing the macro- and microstructure of its behavior. Afterward, if the subject is a rat, its head is opened up or chopped off; and if it is a program, its innards are fiddled with. Six components are common to these scenarios: agent, task, environment, protocol, data collection, and analysis. The first three are the domain of theories of behavior, the last three are in the realm of empirical methods. Behavior is what we observe and measure when an agent attempts a task in an environment. As figure 1.1 suggests, an agent's behavior is not due entirely to its structure, nor its environment, nor its task, but rather to the interaction of these influences.

Whether your subject is a rat or a computer program, the task of science is the same, to provide theories to answer three basic research questions:

- How will a change in the agent's structure affect its behavior given a task and an environment?
- How will a change in an agent's task affect its behavior in a particular environment?
- How will a change in an agent's environment affect its behavior on a particular task?

1.2 Three Basic Research Questions

The three basic research questions all have the same form, so let's pick one to examine: How will a change in the agent's structure affect its behavior given a task and an environment? Bill Clancey and Greg Cooper asked such a question of the MYCIN system some years ago (Buchanan and Shortliffe, 1984, p. 219). They asked, how sensitive is MYCIN to the accuracy of its certainty factors? What will happen if each certainty factor, very precisely represented on the scale -1000...1000, is replaced by the nearest of just seven values (-1000, -666, -333, 0, 333, 666, 1000)?
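The seven-value replacement just described is a nearest-level quantization, and a few lines of code make the manipulation concrete. The following is a minimal sketch only, not MYCIN's code: the names `SEVEN_LEVELS`, `coarsen`, and `coarsen_all`, the tie-breaking rule, and the sample certainty factors are all assumptions made for illustration.

```python
# Hypothetical sketch of the coarsening manipulation; not taken from MYCIN.
SEVEN_LEVELS = (-1000, -666, -333, 0, 333, 666, 1000)


def coarsen(cf, levels=SEVEN_LEVELS):
    """Return the level nearest to cf.

    Assumption: when cf is equally close to two levels, the level listed
    first in `levels` is chosen (this is how min() breaks ties).
    """
    if not -1000 <= cf <= 1000:
        raise ValueError(f"certainty factor out of range: {cf}")
    return min(levels, key=lambda level: abs(level - cf))


def coarsen_all(cfs, levels=SEVEN_LEVELS):
    """Coarsen every certainty factor in a sequence."""
    return [coarsen(cf, levels) for cf in cfs]


if __name__ == "__main__":
    # Invented certainty factors, used only to exercise the function.
    sample = [-1000, -820, -400, -150, 0, 170, 499, 500, 910]
    for original, coarse in zip(sample, coarsen_all(sample)):
        print(f"{original:>6} -> {coarse:>6}")
```

Running a program once with the original values and once with the coarsened ones, then comparing its outputs, is the shape of the comparison described here; how MYCIN itself was modified is not specified in this text.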
When Clancey and Cooper ran the modified and unmodified versions of MYCIN and compared the answers, they found essentially no decrement in the adequacy of MYCIN's recommendations (see chapter 6 for details).

Clancey and Cooper's question was exploratory; their answer was descriptive. They asked, "What will happen if ...?"; they answered with a description of MYCIN's performance. Question and answer belong in the lower left-hand corner of figure 1.2, the dimensions of which (understanding and generality) define a space of versions of the basic research questions and answers. Early in a research project we ask, "What will happen if ...?" and answer, "Here's what happens ..." Later, we ask, "Does this model accurately predict what happens?" and "Does this model provide an accurate causal explanation of what happens?" Early in a project we ask short questions that often have lengthy descriptions of behavior as answers; later, the questions are lengthy because they refer to predictive and causal models, but the answers are short, often just "yes" or "no." This shift in balance characterizes progress from exploratory studies to experimental studies.

[Figure 1.2: Generalization and understanding define a space of basic research questions. The generalization axis runs from "specific to system" to "general"; the understanding axis runs from description through prediction to causal explanation.]

The progression is called "understanding" in figure 1.2 because descriptions (the low end of the dimension) require no understanding (you can describe leaves turning color in autumn without understanding the process); whereas prediction requires at least an understanding of the conditions under which behavior occurs (leaves turn color in the autumn, or when the tree is diseased, and turn more reliably after a period of intensely cold weather). In practice, the transition from description to prediction depends on identifying terms in the description that appear to have predictive power.
This often involves a succession of descriptions: "leaves turn color in October," then, identifying a predictive feature, "leaves turn color when the weather turns cold." An important role of exploratory studies is to identify predictive features. What we aim for as scientists, however, is causal explanation: leaves turn color because chlorophyll, which masks other pigments, disappears. And not only causal explanations, but general causal explanations: why the leaves of aspen, maple, sumac, and oak (but not pine) change color in the autumn.

1.3 A Strategy for Answering Basic Research Questions

Clancey and Cooper did, in fact, offer a post hoc, informal causal explanation of their results. MYCIN's task was to prescribe therapy for all the organisms likely to have caused one or more infections. Inaccuracies in certainty factors affected MYCIN's judgments about the likelihood of organisms, but because most antibiotics kill many sorts of organisms, the chances are good that MYCIN will recommend an effective antibiotic even if it judges the likelihood of each organism incorrectly. Clancey and Cooper didn't generalize this explanation beyond MYCIN, but we can imagine how they might have done it. Here is a generalization in terms of features of a program's structure, task, environment, and behavior:

Suppose a program's task is to select actions to "cover" sets of situations. An instance of the task is a set S of situations. A feature of the environment is that during problem-solving, S doesn't change. A feature of the task is that it provides too little information for the program to be certain whether a hypothesized situation h_i is in S. Another task feature is that S is relatively small. A feature of the program is that it can judge Pr(h_i ∈ S), the likelihood that h_i is in S.
Another program feature is that it can decide to cover only some h's (e.g., likely ones). The program selects actions to cover these. The actions have the interesting property that each deals with several h's. The only behavior of interest is whether the program selects "good" actions. A generalization of Clancey and Cooper's result is this: The program's behavior is robust against inaccurate estimates of Pr(h_i ∈ S).

The point of this long-winded exercise is to illustrate a central conjecture: General theories in artificial intelligence arise from featural characterizations of AI programs, their environments, tasks, and behaviors. Progress toward general theories depends on finding these features. Empirical methods, in particular, help us find general features by studying specific programs. The empirical generalization strategy, around which this book is organized, goes like this:

1. build a program that exhibits a behavior of interest while performing particular tasks in particular environments;
2. identify specific features of the program, its tasks and environments that influence the target behavior;
3. develop and test a causal model of how these features influence the target behavior;
4. once the model makes accurate predictions, generalize the features so that other programs, tasks, and environments are encompassed by the causal model;
5. test whether the general model predicts accurately the behavior of this larger set of programs, tasks, and environments.

Chapter 2 on exploratory data analysis addresses step 2. Chapters 3, 4, and 5, on experiment design and hypothesis testing, address the testing part of steps 3 and 5. Chapters 6, 7, and 8 address the model formation parts of step 3.
Chapter 9, on generalization, addresses step 4.

1.4 Kinds of Empirical Studies

Many researchers in artificial intelligence and computer science speak casually of experiments, as if any activity that involves building and running a program is experimental. This is confusing. I once flew across the country to attend a workshop on "experimental computer science," only to discover this phrase is shorthand for nonexperimental, indeed, nonempirical research on operating systems and compilers. We can distinguish four classes of empirical studies:

Exploratory studies yield causal hypotheses that are tested in observation or manipulation experiments. To this end, exploratory studies usually collect lots of data, analyzing it in many ways to find regularities.

Assessment studies establish baselines and ranges, and other assessments of the behaviors of a system or its environment.

Manipulation experiments test hypotheses about causal influences of factors by manipulating them and noting effects, if any, on one or more measured variables.

Observation experiments disclose effects of factors on measured variables by observing associations between levels of the factors and values of the variables. These are also called natural and quasi-experimental experiments.

Manipulation and observation experiments are what most people regard as proper experiments, while exploratory and assessment studies seem informal, aimless, heuristic, and hopeful. Testing hypotheses has the panache of "real science," whereas exploration seems like fishing and assessment seems plain dull. In fact, these activities are complementary; one is not more scientific than another; a research project will involve them all. Indeed, these activities might just as well be considered phases of a research project as individual studies.
The logic of assessment and exploratory studies is different from that of manipulation and observation experiments. The latter are confirmatory: you test explicit, precise hypotheses about the effects of factors. Exploratory and assessment studies suggest hypotheses and help to design experiments. These differences are reflected in methods and conventions for analyzing data. Manipulation experiments are analyzed with the tools of statistical hypothesis testing: t tests, analysis of variance, and so on. Results are outcomes that are very unlikely to have occurred by chance. Hypothesis testing is also applied to some observation experiments, but so are other forms of analysis. For example, it's common to summarize the data from an observation experiment in a regression model of the form

$y = b_1 x_1 + b_2 x_2 + \dots + b_k x_k + c$

where $y$ is called the response variable and $x_1, \dots, x_k$ are predictor variables. Typically, we'll appraise how well or poorly the regression model predicts $y$, but this is less a conclusion, in the hypothesis-testing sense, than an assessment of whether $x_1, \dots, x_k$ explain $y$. Not surprisingly, there are fewer conventions to tell you whether a regression model is acceptable. Obviously, a model that accounts for much of the variance in $y$ is preferred to one that doesn't, all other things equal, but nobody will insist that predictive power must exceed some threshold, as we conventionally insist that a result in hypothesis testing must have no more than one-in-twenty chance of being wrong.

[Figure 1.3: A frequency distribution. Vertical axis: number of plans; horizontal axis: time required to complete a plan.]

Data analysis for exploratory and assessment studies is freer still of conventions and definitions of significance. The purpose of these studies is to find suggestive patterns in data. For example, what does the distribution in figure 1.3 suggest to you?
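As a rough illustration of how a frequency distribution like the one in figure 1.3 might be tallied and its peaks located, here is a minimal sketch. It rests on several assumptions: the completion times are invented for the example (they are not the data behind the figure), the bin width of one time unit is arbitrary, and the function names `frequency_distribution` and `local_modes` are hypothetical. A peak is taken, simplistically, to be any bin taller than both of its neighbours.

```python
from collections import Counter

# Invented completion times, for illustration only (not data from the text).
SAMPLE_TIMES = [6.5, 7.2, 7.4, 7.8, 8.6, 10.4, 11.0, 11.1, 11.3, 11.7, 11.9,
                12.5, 17.3, 18.0, 18.2, 18.6, 19.4]


def frequency_distribution(times, bin_width=1.0):
    """Map each bin index to the number of times falling in that bin.

    Bin i covers the half-open interval [i * bin_width, (i + 1) * bin_width).
    """
    return Counter(int(t // bin_width) for t in times)


def local_modes(counts):
    """Return bin indices whose count is strictly greater than both neighbours.

    A missing neighbouring bin counts as zero; flat plateaus yield no mode.
    """
    return sorted(
        i for i, n in counts.items()
        if n > counts.get(i - 1, 0) and n > counts.get(i + 1, 0)
    )


if __name__ == "__main__":
    counts = frequency_distribution(SAMPLE_TIMES)
    for i in sorted(counts):
        print(f"{i:>3} | {'#' * counts[i]}")  # crude text histogram
    print("modes at bins:", local_modes(counts))
```

Changing `bin_width` can make peaks appear or vanish, which is one reason a dip in a histogram may reflect nothing more than sampling or binning.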
Suppose the horizontal axis represents the time required to complete a plan in a simulated environment and the heights of the bars represent the number of plans that required a particular amount of time: roughly thirty plans required 7.5 time units, more than fifty plans required 11 time units, and at least twenty plans required 18 time units to finish. These three peaks, or modes, suggest we are collecting data in three rather different conditions. Are we seeing the effects of three types of simulated weather? Three kinds of plans? Three different operating systems? Or perhaps there are just two conditions, not three, and the first "dip" in the histogram represents a sampling error. Are the modes equally spaced, or increasingly separated, suggesting a nonlinear relationship between conditions and time requirements? Why are more plans found in the second "hump" than in the first? Is it a sampling bias? Do fewer plans fail in this condition?

These questions are representative of exploratory studies. They are answered by exploratory data analysis, which includes arranging and partitioning data, drawing pictures, dropping data that seem to obscure patterns, doing everything possible to discover and amplify regularities. The strict standards of statistical hypothesis testing have no place here. Exploratory data analysis finds things in haystacks (data are rarely as clear as figure 1.3), whereas