
2.1.3 Ecological Optics
Gibson opposes structuralism, influenced by Gestalt HOLISM, but went further in rejecting structuralism (what a little rebel): organismic structure is NOT the basis for perceptual theory; the structure of the environment = ecology. Ask "what the head's inside of." Ecological optics: the INFO basis of perception lies in the enviro rather than a MECHANISTIC basis in the brain.

Analyzing Stimulus Structure. Wow, his theory had an outright goal: specify how the world structures light in the ambient optic array (AOA) so ppl can perceive by sampling this info; how proximal characteristics give info about the distal. Texture gradient: smaller, denser, differently shaped elements in the proximal image on the eye = more distant distal surface... remember, the distal texture is ACTUALLY uniform! We perceive distance when shown texture gradients (see the sketch after this block).
-also, ACTIVE EXPLORATION of the enviro (when we MOVE, the spatial pattern of stimulation on the retina changes... a dynamic AOA). We evolved perceptual systems to get around for food, water, mates, shelter, damnit! Don't do expts in restricted, artificial labs! B/c of the dynamic AOA, optical info tells us 1) the layout of the enviro, 2) our trajectory through it (moving through a hallway). Information pickup: how the brain allows the actively exploring organism to perceive the enviro unambiguously! He didn't care much about mechanisms (remember, ecological optics is about the basis in the enviro rather than the brain) but suggested the resonance metaphor, where the brain is like a series of tuning forks: enviro info causes the appropriate neural structures to fire, just as a mechanical vibration of a particular frequency in air causes a tuning fork of the same characteristic frequency to ring. Never developed further.
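
A minimal sketch of the texture-gradient idea above (my own toy code, assuming simple pinhole projection where projected size scales with 1/distance): uniformly sized, uniformly spaced distal texture elements project to smaller, denser proximal elements the farther away they are, so the gradient itself carries distance info.

    # Toy illustration: uniform distal texture -> graded proximal image.
    # Assumes pinhole projection: projected size ~ focal_length * size / distance.
    focal_length = 0.017   # meters; rough value for the human eye
    element_size = 0.10    # each ground-texture element is 10 cm across (uniform!)

    for distance in [1.0, 2.0, 4.0, 8.0]:   # meters from the eye
        projected_mm = 1000 * focal_length * element_size / distance
        print(f"distance {distance:4.1f} m -> projected size {projected_mm:.2f} mm")
    # The projected size shrinks with distance even though the distal texture
    # is perfectly uniform: the proximal gradient specifies the distal layout.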

Direct Perception. Soo controversial: visual perception is FULLY SPECIFIED by the optical info in the enviro available at the retina of a MOVING, ACTIVELY EXPLORING organism, w/out mediating processes or internal reps. He thought unconscious inferences going beyond the info in the enviro aren't necessary b/c the many sources of optical info are sufficient (esp. optic flow in, yep, an actively exploring org ;)). BUT there's still the problem of underdetermination... adding the temporal dimension of the actively exploring organism doesn't help, b/c THAT'S in the enviro too (from 3D+time down to 2D+time... still missing one spatial dimension). Still can't uniquely determine the enviro purely from optical info.

2.1.4 Constructivism
Clearly one of the coolest things about vision: logically, optical info is insufficient for the underdetermination (inverse) problem to be solved uniquely, yet we still do it regularly! How??? Most perceptual theorists (yeah, sorry Gibson) have decided there's some additional source of info BEYOND retinal images used in seeing (seriously, I'm SORRY, Gibson). So our visual system is SOMEHOW contributing info (beyond retinal images, yes, even dynamic ones) that lets us arrive at the single most likely possibility out of the infinite set: constructivism. Combines the best of the 3 (structuralism, Gestalt, eco). Obvs doesn't take strong stands on the 4 issues; mostly internal, but the internal mechs are for extracting the enviro from patterns of retinal stim (aw, throw Gibby a bone). So global percepts (internal) are constructed from local info (external)... ooh, CONSTRUCTED. Gestalt importance of emergent properties like lines, edges, angles, whole figures. Nativism/empiricism: neutral. Methodological behaviorists: introspective analysis is an important first step, but THEN must draw inferences about perceptual processes by studying QUANTITATIVE MEASURES of human/animal behavior to TEST their ideas.

Unconscious Inference. Acknowledges the logical gap (unlike Gibby) b/w the optical info available from retinal stim and the perceptual knowledge derived from it. Need a process of "inference" (hidden assumptions) to transform 2-D optical INFO into a 3-D perceptual INTERPRETATION of the enviro. No awareness of it. Inferences made on the basis of the likelihood principle: the visual system computes the INTERPRETATION w/ the highest probability given the retinal stimulation. Contrasted with Prägnanz (simplest), but simple often = likely.
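
A minimal way to put the likelihood principle in symbols (my notation, not the book's): choose the interpretation H of the retinal stimulation I with the highest probability, which by Bayes' rule splits into how well H predicts I and how plausible H is a priori (those priors are the hidden assumptions):

    \hat{H} = \arg\max_{H} P(H \mid I) = \arg\max_{H} P(I \mid H)\, P(H)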

Heuristic Interpretation: when assumptions are coupled with the sensory data in the incoming image, the result is a heuristic interpretation process in which the visual system makes inferences about the most likely environmental condition that produced the image; heuristic because it's only right if the underlying assumptions are true... so the more likely the assumptions are true, the more likely the perception is correct! (Evolutionary benefit of correct assumptions, haha, obvs!) Literal/strict interpretation of unconscious inference: perception accomplished by sequentially applying rules of symbolic logic / solving math equations. Helmholtz yay, Gibson NAY NAY NAY. Other, more plausible interpretations: connectionist networks reach perceptual conclusions partly from a) sensory data and b) additional assumptions embodied in the pattern of connections among neuronlike units, so they make inferences on the basis of heuristic assumptions ("soft constraints") w/out symbolic logic or math! (Conceptual rather than literal interpretation.) Okkk guys, the behavior of such networks is ALSO closely related to Köhler's physical Gestalts and Gibby's mechanical resonance for info pickup.
2.2 A BRIEF HISTORY OF INFO PROCESSING
The modern era starts in the 1950s-1960s. Why? 3 developments changed the way we understand vision: 1) computer simulations to model cognitive processes, 2) application of info processing ideas to psych, 3) the idea that the brain is a biological processor of info (okk, I thought THAT was around earlier!)

2.2.1 Computer Vision


Soo important when we realized digital computers can simulate complex perceptual processes; before, we thought perception/cognition were unique to living things... uhhh, then we'd have to construct theories and investigate them with expts on living organisms (hard, expensive). Now we can build synthetic systems whose mode of operation is known explicitly in advance (since designed by the theorist) and compare their behavior w/ sighted organisms!

The Invention of Computers. The modern computer comes from hypothetical Turing machines (for info processing). Haha, the first behemoth digital computer was ENIAC at UPenn; ALL of them are examples of Turing machines. From the start Turing understood the possibility of simulating intelligent thought: AI... which led to computer vision: programming computers to extract useful info about the enviro from optical images. Wowww: 2 important developments changing the theoretical branch of vision science dramatically... and forever. (RIGHT ABOUT DRAMATICALLY HAHA)
1. Real images: when you test vision theories on computers, you apply them to gray-scale recordings... classic theories were built for perfect, noiseless line drawings (ideal, not real situations). Now we're testing real images, warts and all.
2. Explicit theories: from vague, informal, incomplete, conceptual... to explicit!

First insight from this craziness: vision is HARD, MAN! Hard to get computers to see! (Edge detection, finding regions, grouping.) Jeez, you need HEROIC computational efforts.

Blocks World. A microworld where all to-be-perceived objects are simple, uniformly colored geometrical solids on a flat tabletop. Success: Roberts's program could recognize objects in gray-scale images, if nonoverlapping, from a limited set of known prototypes. It could make a clean line drawing based on luminance edges (changes in the amount of light falling on two adjacent regions), with edges detected at just 4 orientations spaced 45 deg apart, then linked to form smooth lines/contours. It would then construct a geometrical description by finding primitive volumes that fit together to form the object, and predict its appearance from different angles.
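
A minimal sketch of the luminance-edge idea (my own toy code, not Roberts's actual program): slide small oriented difference filters over the gray-scale image at 4 orientations and keep the locations where the change in luminance between adjacent regions is large.

    import numpy as np

    # Toy luminance-edge detector at 4 orientations (0, 45, 90, 135 degrees).
    kernels = {
          0: np.array([[-1, -1], [ 1,  1]], float),   # horizontal edges
         90: np.array([[-1,  1], [-1,  1]], float),   # vertical edges
         45: np.array([[ 0,  1], [-1,  0]], float),   # one diagonal
        135: np.array([[ 1,  0], [ 0, -1]], float),   # the other diagonal
    }

    def edge_responses(image, threshold=0.5):
        """Return {orientation: boolean map of strong luminance changes}."""
        h, w = image.shape
        out = {}
        for angle, k in kernels.items():
            resp = np.zeros((h - 1, w - 1))
            for y in range(h - 1):
                for x in range(w - 1):
                    resp[y, x] = np.sum(image[y:y+2, x:x+2] * k)
            out[angle] = np.abs(resp) > threshold
        return out

    # Example: a dark block sitting on a bright tabletop.
    img = np.ones((8, 8)); img[2:6, 2:6] = 0.0
    edges = edge_responses(img)
    print(edges[0].astype(int))   # horizontal edges along the block's top and bottom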

Computational Approaches to Ecological Optics. Formally analyze the info in optical images... yay Gibby! Mathematical analysis of how environmental structure of various sorts is reflected in image structure. Lots of interest in how we recover more complete info about the visual scene directly from the image... particularly about DEPTH and SLANT in 3-D (yahh, 'cause those are super 3-D concepts!). Koenderink and Van Doorn applied differential geometry etc. to shit like MOTION PERCEPTION from optical flow, DEPTH PERCEPTION from stereoscopic info, 3-D ORIENTATION from shading, etc. They didn't build computer vision programs themselves, but inspired them.
Marr and colleagues: mathematical analyses of LUMINANCE structure in 2-D images providing info about the structure of surfaces and objects in 3-D space. (Loepell obvs thinks Marr is luminant!) Very ecological (focus on optical info in the enviro).

Connectionism and Neural Networks. The most recent development in computer simulation is this explosion of interest! Assumption: human vision depends on the massively parallel structure of the brain's neural circuits. So: networks of interconnected computing elements/units, like simplified neurons! An element's current state = its activation level (like a firing rate). Activation SPREADS through excitatory/inhibitory connections (synapses)... connectionist models can be specified mathematically, UGH, but their behavior usually depends on nonlinear equations that can't be solved analytically... so they must be simulated on computers, not solved. Historical precursors? Perceptrons (hahaa, sounds even MORE modern/high-tech): a particular class of neuronlike network models studied intensively by Frank Rosenblatt, interesting b/c they can learn to identify examples of new categories by adjusting the weights on their connections according to explicit rules. A simple learning rule lets a perceptron learn any categorical discrimination it can represent!! Hebb did human/animal learning research w/in a neural network framework... associations formed within and between "cell assemblies" that fire at the same time. But serious limitations of perceptrons brought neural network research to an effective halt for over a decade. Parallel distributed processing (PDP) models have recently brought it back, yahh.
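
A minimal sketch of the perceptron idea (a generic textbook version, not Rosenblatt's exact formulation): a unit takes a weighted sum of its inputs, fires if the sum crosses a threshold, and nudges its connection weights whenever it misclassifies a training example.

    import numpy as np

    def train_perceptron(examples, labels, lr=0.1, epochs=20):
        """Classic perceptron rule: w += lr * (target - output) * input."""
        examples = np.asarray(examples, float)
        w, b = np.zeros(examples.shape[1]), 0.0
        for _ in range(epochs):
            for x, target in zip(examples, labels):
                output = 1 if np.dot(w, x) + b > 0 else 0
                w += lr * (target - output) * x   # adjust connection weights
                b += lr * (target - output)       # adjust the threshold (bias)
        return w, b

    # Toy categorical discrimination: "is at least one feature on?" (logical OR).
    X = [(0, 0), (0, 1), (1, 0), (1, 1)]
    y = [0, 1, 1, 1]
    w, b = train_perceptron(X, y)
    print([1 if np.dot(w, x) + b > 0 else 0 for x in X])   # -> [0, 1, 1, 1]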

2.2.2 Information Processing Psychology


Dominated by behaviorism (in its extreme form, no perception in psych theories b/c perception is simply the internal experience of the external world): no internal processes, no "mentalism," no "consciousness." But then seeing the brain as an info processing machine helped (a new PRECISE language, like writing a computer program)... w/in a decade (starting from the '60s) it was firmly rooted in cog psych. Broadbent: a flow chart in which an initial analysis of gross sensory factors is followed by attention operating as a switch among several info processing channels... nice b/c, although quite simple, it specifies the TEMPORAL STRUCTURE of info processing events. Sperling discovers iconic memory: lasts for like ½ second... and studies the mechanisms behind it! Encourages further exploration w/in this framework. Info pro yay.

2.2.3 Biological Information Processing


Physiological techniques for studying neural activity in the visual system: the 3rd important component of the info processing paradigm! The neuron is thought to be the appropriate unit of analysis in the visual system; so if we can study INDIVIDUAL neurons we can, in theory, map out the functional wiring of the entire visual system neuron by neuron, with the operations of each neuron specified! And it's expected to be done by 2010! ;). Studying living brains: cool too!
Early Developments. Ppl thought the brain was just a random bio organ, not an info processor, but then ppl used really explicit analogies... neural spikes are like binary code... in the '50s we realized neurons aren't directly connected but communicate across synapses. The binary-code idea was bullshit, but we take the brain = info processing machine idea seriously! The first studies: lesion expts showing fascinating localization, same with electrical brain stimulation... neither is adequate b/c they don't measure the electrochemical behavior of INDIVIDUAL neurons (we're not looking for localization, but PROCESS/operations!)

Single-Cell Recordings. Study a neuron's output (electrical potential changes) to see what stimulus conditions elicit it (thus its role). With vision: project patterns of light onto the retina. Hubel & Wiesel, receptive field: the region of the retina that influences the firing rate of the target neuron (excitatory/inhibitory); complex cells tuned to orientation and position, blah blah neural mechs... so portions of the visual cortex are now perhaps the best understood of the entire cerebral cortex! But a limitation: confined to individual cells! What, I thought that's what we wanted??! Oh, now we want lots of individual cells at the same time... mapping the overall structure (architecture) cell by cell would be very laborious!
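
A minimal sketch of the receptive-field idea (toy numbers, not Hubel and Wiesel's data): model a neuron's response as a weighted sum of light over a small retinal patch, with excitatory weights along a preferred orientation and inhibitory weights around it, so an appropriately oriented bar drives the unit hardest.

    import numpy as np

    # Toy orientation-selective receptive field: + along a vertical strip, - around it.
    rf = np.full((5, 5), -0.25)
    rf[:, 2] = 1.0   # excitatory column, inhibitory surround

    def firing_rate(stimulus, baseline=5.0):
        """Firing rate ~ baseline + weighted sum of light over the receptive field."""
        return max(0.0, baseline + np.sum(rf * stimulus))

    vertical_bar = np.zeros((5, 5)); vertical_bar[:, 2] = 1.0
    horizontal_bar = np.zeros((5, 5)); horizontal_bar[2, :] = 1.0

    print(firing_rate(vertical_bar))     # strong: bar matches the preferred orientation
    print(firing_rate(horizontal_bar))   # weak: most of the bar hits inhibitory regions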

Autoradiography. More efficient than single-cell recording at revealing overall architecture! Inject a radioactive chemical like 2-DG, which is taken up by active neurons and accumulates b/c it's not metabolized, show the animal the item of interest, then kill the animal and slice the brain.

Brain Imaging Techniques. CT, PET, MRI (PET is not autoradiography, remember!). B/c they're NON-INVASIVE, they can examine bodily tissues, including the brain, without breaking the skin! First was X-ray CT: a beam is shot through and its intensity measured on the other side. The strength of the transmitted beam is correlated with the AVG density of all the tissue it passes through... one beam is insufficient to pin down the density at any single point, but combining many different angles = a complete map of tissue density... the CT scan (see the toy reconstruction after this block). MRI: a strong magnetic field polarizes molecules in the brain, radio-wave pulses make the tissue emit detectable RADIO signals, processed by algorithms similar to CT scans... more detailed and no X-rays! PET: indirect, through blood flow; functional, but slow, with poor spatial resolution. fMRI: measures blood oxygenation, no radioactive tracer needed.
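
A minimal sketch of the CT principle above (a toy 2x2 "tissue" grid, nothing like a real scanner): each beam measurement gives roughly the summed density along its path, and beams from enough different angles form a solvable linear system that recovers the density at every point.

    import numpy as np

    # Unknown tissue densities in a 2x2 grid, flattened row-major: [a, b, c, d].
    true_density = np.array([1.0, 3.0, 2.0, 4.0])

    # Each row = one beam path; 1 marks the cells the beam passes through.
    beams = np.array([
        [1, 1, 0, 0],   # horizontal beam through the top row
        [0, 0, 1, 1],   # horizontal beam through the bottom row
        [1, 0, 1, 0],   # vertical beam through the left column
        [0, 1, 0, 1],   # vertical beam through the right column
        [1, 0, 0, 1],   # diagonal beam
    ], float)
    measurements = beams @ true_density   # what the detectors record

    # Combine all the angles to recover the full density map.
    recovered, *_ = np.linalg.lstsq(beams, measurements, rcond=None)
    print(recovered.reshape(2, 2))        # ~[[1, 3], [2, 4]]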

2.3 Information Processing Theory


The information processing paradigm: theorize about the human mind as a computational process. Works for vision and other shit too! Scientific paradigm: the set of working assumptions a community of scientists share in conducting RESEARCH on a given topic. Like the Newtonian paradigm (broad assumptions underlying lots of physics etc.). Some, like Gibson's folks, don't believe in the info processing paradigm, but most vision scientists do.

2.3.1 The Computer Metaphor. Why is the computer so important? 1) The preferred tool for testing new visual processing theories by running them on real images! 2) A theoretical ANALOGY: mental processes relate to the brain the way programs relate to the computer. Minds are the "software" (programs) of biological computation, brains the "hardware" (computer). Replaced Gestalt ideas and Gibson's info pickup / mechanical resonance. Very compatible with constructivism, b/c making inferences is what computers do when they execute programs! Maybe the brain actually IS a biological computer, and a properly programmed computer could do it too! The view of the relation b/w computer programs and mental events is sometimes called strong AI (a machine can actually have conscious visual experiences) vs. weak AI (a machine only SIMULATES mental events).

2.3.2 Three Levels of Info Processing


Metatheory: a theory about theories, oh Marr. So it was his theory about the info processing paradigm: conceptual distinctions b/w 3 levels, all essential for understanding vision as info processing.
The Computational Level
The informational constraints available for mapping input to output: what computation needs to be performed and what info it's based on... not HOW it's accomplished. (WHAT are the inputs (INFO), and how are they formally related to the outputs (COMPUTATION)?) A thermostat must map both the temperature and the setting into an on/off signal, depending on whether the temp is higher or lower than the setpoint.

The Algorithmic Level


More specific: specify HOW the computation is executed in terms of info processing operations. Corresponds most closely to a program in computer science. So you need to form representations for the input and output and decide on processes that convert the input rep into the output rep in a well-defined manner! A REP is a way of encoding information about the input/output, and a PROCESS is a way of transforming one rep into the other. So for the thermostat: one rep for the temperature, one for the setting, and then some kind of comparison process b/w the two to determine whether the temp is higher or lower than the setpoint.
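
A minimal sketch of the thermostat at the algorithmic level (my own toy code, just to make the rep/process distinction concrete): two representations (current temperature and setpoint) and one comparison process mapping them to an on/off output.

    # Representations: numbers encoding the current temperature and the setting.
    # Process: a comparison that transforms those reps into an on/off output rep.
    def thermostat(current_temp: float, setpoint: float) -> str:
        """Algorithmic level: HOW the input reps are mapped to the output rep."""
        if current_temp < setpoint:
            return "heat on"
        return "heat off"

    print(thermostat(17.5, 20.0))   # -> "heat on"
    print(thermostat(22.0, 20.0))   # -> "heat off"

The same algorithm could run on a bimetallic strip or a microcontroller; that physical difference belongs to the implementational level described next.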

The Implementational Level.


How the algorithm is actually EMBODIED as a physical process (the same algorithm can be implemented in many physically different devices): different types of thermostats, same algorithm.

2.4 Four Stages of Visual Perception


We're looking at the algorithmic level, broken into 4 major stages beyond the retinal image itself, each defined by a different output rep and the processes required to compute it from the input rep: image-based, surface-based, object-based, category-based.

2.4.1 The Retinal Image. A pair of 2-D images projected from the enviro onto the observer's two eyes from a particular viewpoint. Although the optical image striking the eye is completely CONTINUOUS, we register it as DISCRETE samples with the retinal receptors! (So we can tell objects apart... not just a blur!) The complete set of FIRING RATES of the receptors in both eyes = the 1st rep of optical info... complicated by the distribution of receptors (fovea etc.). Usually retinas are simplified as homogeneous arrays of receptors with x, y coordinates, the center being the middle of the fovea; pixels: square, primitive, indivisible, explicitly represented visual units of info. I(x,y) is the image intensity at a given location. So the coordinate system of the retinal image is explicitly tied to the intrinsic structure of the retina. No edges, no objects, just this I(x,y) crap... how hard to interpret! Why? Our intuitions are about info in intensity IMAGES, not NUMBERS, yet this is the crazy problem we face: perceiving objects in 3-D on the basis of a 2-D array of #'s.
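
A minimal sketch of the I(x,y) idea (toy values, not from the book): the retinal image rep is just a grid of intensity numbers indexed by retina-centered position, and nothing in the rep itself says "edge" or "object."

    import numpy as np

    # Toy "retinal image": a 6x6 array of intensities, origin at the fovea.
    # A bright patch sits on a darker background -- but the rep is only numbers.
    I = np.full((6, 6), 0.2)
    I[1:4, 2:5] = 0.9

    def intensity(x, y):
        """I(x, y): intensity at retina-centered coordinates (fovea at 0, 0)."""
        return I[y + 3, x + 3]

    print(intensity(0, -1))   # a "bright" number, but nothing marks it as an object
    print(intensity(-2, 2))   # a "dark" number; edges and objects must be inferred later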

2.4.2 The Image-Based Stage. Additional reps based on the 2-D retinal organization, not just the initial registration of the images in the two eyes: local edges/lines, linking them, matching up left and right images, detecting line ends, blobs, etc. We (as perceivers) can tell apart shadow/shading luminance edges from surface edges, but in the image they're all just luminance changes... whoa, the set of luminance edges detected in an image is NOT the same as a clean line drawing!! Primal sketches: 1) the raw primal sketch: just elementary detection processes locating edges, bars, blobs, line terminations; then the full primal sketch: global grouping/organization. He (Marr) says the image-based rep is defined by: 1) image-based primitives: not info about physical objects in the external world, like surface edges or shadow edges... just stuff determined by LIGHT INTENSITY, 2) two-dimensional geometry: the geometry of these primitives is 2-D, 3) a retinal reference frame: the coordinate system w/in which the 2-D features are located is specified relative to the retina (principal axes align with the EYE, not other stuff).

2.4.3 The Surface-Based Stage. Recover intrinsic properties of visible surfaces in the EXTERNAL WORLD (the main difference): represents info about the external world in terms of the spatial layout of visible surfaces in 3-D, whereas image-based is just image features in the 2-D pattern of light on the retina. Perceiving surface layout was actually Gibson's idea first; he realized it was easier than the more popularly studied task of perceiving 3-D objects... discredited until Marr started supporting it, partly b/c Gibson never suggested a specific rep for this layout (b/c he wasn't an info processing theorist). The surface-based rep by Marr is the 2.5-D sketch... others called such reps intrinsic images: they represent intrinsic properties of surfaces in the external world rather than properties of the input image. OOOH, HEADING TOWARD THE EXTERNAL! 1st step in recovering the 3rd spatial dimension... not about all surfaces, just the ones visible from that viewpoint. Need assumptions to infer the properties of visible surfaces, but they're relatively few and almost always true (esp. compared to assumptions about surfaces hidden from view)!! Yay, good for heuristics! Only visible surfaces, so imagine a rubber sheet shrink-wrapped onto just those surfaces reflecting light into the viewer's eyes, made of small, locally flat pieces (even curved surfaces are approximately flat over a small area). So this rep is just the color, slant, and distance-from-viewer of each locally flat patch of surface, in each direction radially outward from the viewer's position. Crucial properties: 1) surface primitives: local patches of 2-D surface at some particular slant, located at some distance from the viewer in 3-D space, with color and texture, 2) 3-D geometry: the surfaces themselves ARE 2-D, but their spatial distribution is in 3-D space! 3) viewer-centered reference frame: the coordinate system w/in which the 3-D layout of surfaces is represented is specified in terms of direction and distance from the observer's STATION POINT to the surface, not the retina.
Reps of surfaces come via stereopsis (binocular depth cues), motion parallax, shading/shadows, etc.
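
A minimal sketch of what one entry in a surface-based (2.5-D-sketch-style) rep might hold (my own toy structure, not Marr's actual formalism): each visible patch stores its direction and distance from the viewer's station point plus its slant and color, and hidden surfaces simply have no entries.

    from dataclasses import dataclass

    @dataclass
    class SurfacePatch:
        """One locally flat visible patch, in a viewer-centered frame."""
        azimuth_deg: float    # direction from the station point (left/right)
        elevation_deg: float  # direction from the station point (up/down)
        distance_m: float     # how far along that direction the patch lies
        slant_deg: float      # orientation of the patch relative to the line of sight
        color: str            # grossly simplified surface color

    # A tiny viewer-centered layout: only surfaces visible from this station point.
    layout = [
        SurfacePatch(-5.0, -10.0, 2.0, 60.0, "gray"),   # a floor patch
        SurfacePatch( 0.0,   0.0, 3.0,  0.0, "red"),    # a frontal wall patch
    ]
    print(layout[0])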

2.4.4 The Object-Based Stage. At this point we don't even know that the backs of cups exist, since they're not visible surfaces! Need more: this stage is TRULY 3-D, with further assumptions now covering unseen stuff... obvs object-based, b/c including unseen surfaces means involving EXPLICIT REPS OF WHOLE OBJECTS IN THE ENVIRO. Recover 3-D here! How? Maybe just extend the surface-based rep to include unseen surfaces... the boundary approach; OR conceive of objects as intrinsically 3-D, represented as arrangements of some set of primitive 3-D volumes... the volumetric approach (since it reps objects explicitly as volumes of particular shape in 3-D space). Volumetric kinda wins. Harder to characterize than the other stages, b/c even the general form is unclear: is it surfaces or volumes? Current best guess: 1) volumetric primitives: the primitive elements are descriptions of truly 3-D volumes, including info about unseen parts, 2) 3-D geometry: the space w/in which the volumetric primitives are located is fully 3-D, 3) object-based reference frames: the coordinate system is defined in terms of the intrinsic structure of the volumes themselves.
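
A minimal sketch of a volumetric primitive (a toy structure of my own, not the book's specific proposal): each object is a set of simple 3-D volumes positioned in a frame defined by the object itself, so unseen parts (like the back of a cup) are represented too.

    from dataclasses import dataclass

    @dataclass
    class Cylinder:
        """A primitive 3-D volume, located in the OBJECT's own reference frame."""
        center: tuple   # (x, y, z) relative to the object, not the viewer
        axis: tuple     # direction of the cylinder's long axis
        radius: float
        length: float

    # A toy "cup": one fat cylinder for the body, one thin one for the handle.
    # The rep covers the whole volume -- back and inside included -- not just
    # the surfaces visible from any particular viewpoint.
    cup = {
        "body":   Cylinder(center=(0, 0, 0), axis=(0, 0, 1), radius=4.0, length=9.0),
        "handle": Cylinder(center=(5, 0, 0), axis=(0, 1, 0), radius=0.5, length=6.0),
    }
    print(cup["body"])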

2.4.5 The Category-Based Stage. We want vision to aid survival/reproduction, so the final stage is about FUNCTIONAL PROPERTIES: the category-based stage (what objects afford the organism, given its beliefs/desires/goals/motives). These properties are obtained through categorization; categorization = pattern recognition. 2 operations: 1) the visual system classifies the object as a member of one of a large number of known categories according to its visible properties (size, shape, location, color), 2) identification then allows access to a large body of stored info about the affordances of that object. Advantage of the 2-step scheme? Any functional property can be associated with any object, b/c the relation b/w the form of an object and the info stored about its function can be purely ARBITRARY, thanks to its basis in the process of categorization! (See the toy 2-step example below.) There's also the Gestalt way of registering functional properties from visible characteristics W/OUT categorizing, via physiognomic characters: the fruit says "eat me." Gibson liked this... the visible functions of an object are its affordances for the perceiver, no need to categorize first. Maybe both processes exist (indirect categorizing and direct affordances): some stuff whose function is so intimately tied to its visible structure (like chairs) doesn't need categorizing, other stuff (like telephones) does.
Later processes may feed back to earlier ones!
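
A minimal sketch of the 2-step scheme above (toy categories and affordances of my own): step 1 maps visible properties to a category label, step 2 looks up stored functional knowledge keyed only by that label, so the link between an object's form and its function can be completely arbitrary.

    # Step 2's stored knowledge: affordances keyed by category label only.
    # The visible form never appears here, so the association is arbitrary.
    affordances = {
        "cup":       ["holds liquid", "can be drunk from"],
        "telephone": ["lets you call people far away"],
    }

    def categorize(visible_properties: dict) -> str:
        """Step 1 (grossly simplified): classify from visible properties alone."""
        if visible_properties.get("shape") == "hollow cylinder with handle":
            return "cup"
        return "telephone"

    seen = {"shape": "hollow cylinder with handle", "size": "small", "color": "white"}
    label = categorize(seen)                 # step 1: pattern recognition
    print(label, "->", affordances[label])   # step 2: retrieve functional properties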
