Data need to be imagined as data to exist and function as such, and the imagination of data entails
an interpretive base. (Gitelman & Jackson, 2013, p. 3)
A key claim of big data enthusiasts is that such data exists prior to interpretation, and is
thus able to provide transparent patterns and connections that tell us about the
social. The great insight of Bowker and Star (1999) and Bowker (2005), in their
analyses of infrastructures-as-processes, is how the conditions for the
possibility of information gradually become invisible and eventually ‘naturalized’
and ‘inevitable’ until they break down. Digital data is often seen as floating
externally, as disembodied or immaterial (see Hayles, 1999 on this problem),
rather than the outcome of complex value-laden ‘standards’, protocols and
technologies that make up the infrastructures of heterogeneous data sets. If we
reduce phenomena to data, they are divided and classified, often obscuring the
ambiguity, ambivalence, conflict and contradiction involved (Bowker & Star,
1999; Gitelman, 2013). Forgetting this gives credence to the notion that
routinely and automatically produced digital data produces a ‘distanced
objectivity’ and thus a specific claim to truth. In recent accounts of the social
and cultural history of data, it is argued that, on the contrary, data of any kind is
always-already an interpretation. For example, with regard to the terms ‘raw’ and
‘cooked’ applied to big data, Boellstorff (2013, p. 9) argues:
These categories are incredibly important with regard to big data. One reason is the implication
that the “bigness” of data means it must be collected prior to interpretation – ‘raw’. This is revealed
by metaphors like data ‘scraping’ that suggest scraping flesh from bone, removing something taken
as a self-evidently surface phenomenon. Another implication is that in a brave new world of big
data, the interpretation of that data, its ‘cooking’, will increasingly be performed by computers
themselves.
The turn towards social media data in sociology, media and communication
studies is usefully complemented, then, by the ‘material turn’ related to
scholarship in science and technology studies (see Goffey, Pettinger & Speed,
2014; Hand, 2014; Lohmeier, 2014). Data does not exist outside of its material
substrate, and is shaped by ethico-political constraints and agendas, engrained
practices and technical knowledge, regulations and protocols, orientations
towards valued outcomes and so on. This includes the ways in which specific
disciplines imagine and construct data as part of ‘the operations of knowledge
production more broadly’ (Gitelman & Jackson, 2013, p. 3). In this sense, it has
been argued that data is ‘co-produced’ through application programming
interfaces (APIs) and researchers themselves, who make and select data, and
also by the tools used to delimit and make that data visible and amenable for
analysis (Vis, 2013, p. 2).
The notion that the social sciences and humanities should simply take ‘the
computational turn’ (Berry, 2012) is thus highly contested, raising complex
issues about what forms of ‘the social’ are being constructed and enacted
through designed computational processes and the disciplinary methods
employed to analyse and interpret them. Thinking carefully about the powerful
effects of data in shaping social life, while at the same time being able to
critically engage with its sociotechnical ambivalences and affordances, would
seem to require a range of approaches and modes of expertise. New media
scholars have drawn upon work in STS and histories of media to situate data in
relation to the material and semiotic conditions of its production as data, and
the processes through which it becomes black-boxed, stabilized and mobilized
in a variety of contexts. A second way of socializing digital data turns
attention to the sociotechnical processes at work in structuring the flows of data
in the first instance, asking how algorithms and other devices become
stabilized and, most importantly, how this form of data becomes and
remains a legitimate and persuasive form of knowledge. Bruns (2013, p. 4)
argues that:
There is a substantial danger that social media analytics services and tools are treated by
researchers as unproblematic black boxes which convert data into information at the click of a
button, and that subsequent scholarly interpretation and discussion build on the results of the black
box process without questioning its inner workings.
Drawing on insights from STS and software studies, the black-boxing of
algorithms is taken up in detail by Gillespie (2013) who argues that, on the one
hand, researchers must strive to deconstruct the workings of algorithmic
processes, but on the other hand recognize the obdurate affordances of these
processes that are designed to remain invisible:
Computational research techniques are not barometers of the social. They produce hieroglyphs:
shaped by the tool by which they are carved, requiring of priestly interpretation, they tell powerful
but often mythological stories – usually in the service of the gods. (Gillespie, 2013, pp. 191, 193)
A sociological inquiry into algorithms should aspire to reveal the complex workings of this
knowledge machine, both the process by which it chooses information for users and the social
process by which it is made into a legitimate system. But there may be something, in the end,
impenetrable about algorithms. They are designed to work without human intervention, they are
deliberately obfuscated, and they work with information on a scale that is hard to comprehend (at
least without other algorithmic tools) … [S]o in many ways, algorithms remain outside our grasp,
and they are designed to be. (Gillespie, 2013)
In a similar vein, Bowker argues for a humanistic, interpretive approach to data:
If data are so central to our lives and our planet, then we need to understand just what they are
and what they are doing. We are managing the planet and each other using data and just getting
more data on the problem is not necessarily going to help. What we need is a strongly humanistic
approach to analyzing the forms that data take; a hermeneutic approach which enables us to
envision new possible futures even as we risk being swamped in the data deluge. (Bowker, 2013,
p. 171)
The opportunities to use existing web tools to pull together and triangulate web
data of many kinds – for example, Twitter feeds with geolocational and temporal
data – might in many cases be more fruitful than ‘offline data’, if one is trying to
understand the mediation of social activities. This is especially significant for
digital social research that seeks to re-appropriate the forms of automated
expertise at play in constituting ‘publics’ (visualized, mapped, represented
through data) that are then subjects to be acted upon (e.g. by the state). In other
words, questions of data-analytic expertise are being explored both by researchers
trying to utilize such analytics and, qualitatively, by researchers asking critical
questions about the politics of this ‘redistribution of expertise’ (Bassett, 2014;
Kennedy & Moss, 2014). Big data is an intensification of the automation of
expertise (Bassett, 2014), where expertise is being redistributed between
humans and machines in ways that are not always progressive let alone
democratic. For example, how are analytics framing the ways in which ‘publics’
are constituted and understood, and to what extent do people outside of big
data companies have a say in what become powerful inscriptions and
representations? How might publics be enabled by analytics? Could analytics
be used to form more ‘knowing publics’? How might analytics be drawn upon to
form public opinion (as a process), rather than represent it (as captured)?
There are also limits to this approach if one is trying to understand the conditions
through which this data has been produced as data. Here, I would suggest, is
the continuing value of ethnographic approaches that situate digital
technologies within the fabric of people’s lives (e.g. boyd, 2014; Miller, 2011)
and try to understand the complex forms of negotiation that are taking place
that both constitute much of the data in the first place and are the contexts within
which people reflexively engage with that data. Any account of the recursive
processes of data circulation must surely benefit from detailed explorations of
this kind. In this regard, Crawford (2013) makes an explicit call for developing
robust combinations of big and small data studies, bringing computational social
science together with ‘traditional qualitative methods’.
In this essay I have aimed to do several things. I have sought to provide a partial
but hopefully useful reading of how digital social research has shifted much of
its emphasis from studies of mediated spaces, to networks, to mediated life in
a dataverse. Bowker (2013) employs this term, while acknowledging its
hyperbole, to force us to think about how data is coming to define us and our
actions, as well as what we claim to know about the world and each other. This
is what many researchers in the social sciences and humanities are responding
to: the sense of a world being remade through data and the need to critically
engage with these processes and their implications, in terms of both the conduct
of social research and the lives of the researched. In briefly discussing three
key debates at the present time I have simply sought to identify what I think are
profitable trajectories. By resolutely returning to the ongoing problems of
contextualizing and localizing digital data, qualitative research can, I think, make
major contributions to our understanding of digital data-in-society.