Ethnography, it is argued, should include visual images alongside verbal evidence. Pink (2006, p. 1), for example, maintained that:
The visual is therefore inextricably interwoven with our personal identities, narratives, lifestyles, cultures and societies, as well as with definitions of history, time, space, place, reality and truth. Ethnographic research is likewise intertwined with visual technologies, images, metaphors and ways of seeing. When ethnographers produce photographs or video, these images, as well as the experience of producing and discussing them, become part of their ethnographic knowledge.
Marcus Banks (2007) argued that the ubiquity of visual representation warrants the incorporation of still and moving images into ethnographic analysis. Furthermore, incorporating images into ethnographic evidence may provide sociological insight not accessible by other means.
Dipesh Kharel (2015, p. 152), for example, explained how his research relied heavily on visual ethnography, as words alone could not convey the life of the Nepalese slate workers he was studying:
I realized that my field notes about slate mining never provided me with the understanding of the mining process that the rough footage provided. Thami slate porters and miners find it difficult to explain their work. They have unique words and idioms to describe the process of slate mining and to define the slate structure, which are very difficult to explain to audiences without visual means. I therefore thought that my film (A life with Slate [sic], 2006) would provide a better anthropological understanding of knowledge in the context of slate mining and carrying. For about four months I was engaged with the daily life of slate miners and porters, but I still do not find it easy to explain to readers about their knowledge of slate mining in the way that I feel the film can, much more effectively.
Regula Burri (2012) commented not only on the ubiquity of images in the 21st Century but also on how, in areas such as medicine, images are vital tools in practice and diagnosis, providing a quicker means of communication than written text or conversational explanation. (See CASE STUDY Visual logic.)
Anthropologists have used photography in ethnographic field studies since the start of the 20th Century. Charles Hattersley's (1908) study of the Baganda in Uganda includes a hundred photographs, although their principal purpose seems to be to illustrate the civilising impact of white European influence and of Christian missionaries in the country. Early anthropological studies assumed that photographs revealed the 'truth', without much consideration of the point of view of the photographer, the researcher or the subsequent book editor: reflexivity did not loom large.
Later anthropologists, including Margaret Mead, used photographs as research tools rather than as simple illustrations or confirmations of preconceptions. In Balinese Character: A photographic analysis, Bateson and Mead (1942, p. xii) were concerned about the inadequacy of English (or any other 'non-native' language) for capturing the essence of the culture of the peoples under study. They concluded that augmenting verbal studies with photographic evidence would enable a better understanding. They placed photographs side by side for comparison while retaining the holistic image. They wrote:
In this monograph we are attempting a new method of stating the intangible relationship among different types of culturally standardized behavior by placing side by side mutually relevant photographs. Pieces of behaviour, spatially and contextually separated—a trance dancer carried in procession, a man looking up at an aeroplane, a servant greeting his master in a play, the painting of a dream—may all be relevant to a single discussion: the same emotional thread may run through them. To present them together in words, it is either necessary to resort to devices which are inevitably literary, or to dissect the living scenes so that only desiccated items remain....By the use of photographs, the wholeness of each piece of behavior can be preserved, while the special cross referencing desired can be obtained by placing the series of photographs on the same page.
The acclaim afforded Bateson and Mead's (1942) book would, one might expect, have led to more such studies. However, as Harper (2000, p. 3) showed, it was 30 years before the tradition established by Bateson and Mead was 'revitalized in studies on a smaller scale, including Danforth and Tsiaras' (1982) study of death rituals of rural Greece, Cancian's (1974) visual ethnography of Mexican peasants, and Gardner and Heider's (1968) visual ethnography of ritualistic war among the Dani of New Guinea'. It was not until the 1970s in the US, sparked by the Vietnam War, that sociologists started using images, reviving an 80-year-old approach. For example, Jacob Riis's (1890) study of the squalor of industrializing cities 'could easily have found a way into Marx's Capital or Engels' study of the condition of the working class in England during the 1840s' and 'Lewis Hine's photographic study of child labor in the early 20th century documented the extraction of surplus value from a working class of children' (Harper, 2000, p. 4).
Context and interpretation
Howard Becker (1995, p. 10), in exploring the difference between visual sociology, documentary photography and photojournalism, emphasised the importance of context.
As opposed to much contemporary photography made in the name of art, the three photographic genres discussed here insist on giving a great deal of explicit social context for the photographs they present…. Contemporary art photographs… often show us something that might well have been the subject of a documentary photograph….But they seldom provide any more than the date and place of the photograph, withholding the minimal social data we ordinarily use to orient ourselves to others, leaving viewers to interpret the images as best they can from the clues of clothing, stance, demeanor and household furnishings they contain. What might seem to be artistic mystery is only ignorance created by the photographer's refusal to give us basic information.
Conversely, documentary photography, photojournalism and visual sociology 'routinely provide at least a minimally sufficient background to make the images intelligible'. Becker regards Bateson and Mead (1942) as a classic example:
Each photograph is part of a two page layout, one page devoted to photographs, the other to two kinds of text: a one or two paragraph interpretive essay, describing a topic like "The Dragon and the Fear of Space" or "Boys' Tantrums" or "The Surface of the Body," these essays having a further context in a long introductory theoretical essay on culture and personality, and a full paragraph of annotation for each photograph, telling when it was made, who is in it, and what they are doing. (Becker, 1995, p. 12)
So much context might be regarded as over-interpreting the text, using the photographic evidence as illustration of a thesis, but Bateson and Mead argued (see below), to the contrary, that the photographs were, essentially, empirical data, in the same way as researcher observations or notes of conversations. The photograph and the related text together provide a fuller picture: what Clifford Geertz (1973) called a 'thick description'.
Thick description, Geertz's approach to participant observation, is a theme taken up by other visual anthropologists and sociologists. The approach has been characterised by Norman Denzin (1989, p. 33) as: providing a context for an observable act; specifying intentions and meanings of the act; tracing the evolution and development of the act; and presenting the action as an interpretable text.
Dipesh Kharel (2015, pp. 155–6) characterised Geertz's thick description as a process that:
specifies many details, social structures, social actions and meanings, and which is contrast to "thin description" which is a factual and superficial account without any interpretation. According to Geertz (1973) "thin description" is not only an insufficient account of an aspect of a culture; it is also a misleading one. Therefore, Geertz (1973) suggests that an ethnographer must present a "thick description" which is composed not only of facts but also of commentary, interpretation and interpretations of those comments and interpretations. He points out:
The claim to attention of an ethnographic account does not rest on its author's ability to capture primitive facts in faraway places and carry them home like a mask or carving, but on the degree to which he is able to clarify what goes on in such places, to reduce the puzzlement—what manner of men are these? (1973, p. 16)
...Geertz's idea of "thick description" can be achieved by images, gestures, or sequences that convey meaning. Thickness is created by the ability of the visual description to transmit what is really being 'said.' In ethnographic filmmaking, "thick descriptions" result from what has been recorded and edited. Mead remarks that a camera can be used to record thick descriptions of informants and their socio-cultural context through their own voices and activities, based on their understandings of their world, which may not possible with verbal descriptions.
An example of the use of photographic images in an historical sociological study is that of George Dowdall and Janet Golden (1989), which also refers to thick description. It is a case study in the use of photographs as data about institutional life, featuring Buffalo State Asylum for the Insane. Photographs came from a wide range of sources, both internal and external to the institution. A total of 800 images were collected, of which 350 were used for the analysis. The images were categorised into 13 groups. Content analysis was rejected because it ignores or reduces 'contextual meanings in the interests of standardization and comparability' (Emerson, 1983, p. 25). Instead, they adopted a layered analysis, probing each image in depth. Images were first viewed in historical context, a level the authors referred to as appraisal, which involved comparing written and visual data for congruence or lack of correspondence. Appraisal showed, for example, that the photographs published in Annual Reports gave a misleading impression of the hospital setting compared to photographs of activities that revealed the nature of the urban environment.
The second level, inquiry, considered the images as a whole: questions are stimulated by the prevalence of certain images and patterns, for example the difference between worker culture and patient culture. 'The most arresting contrast is between the well-furnished, well-lit and well-staffed hallways (image #5) of the asylum's first years presented in several official photographs from the early Annual Reports…and the shabbily dressed patients jammed against these same walls only a few decades later.'
The third layer, interpretation, focuses on individual images and uses 'thick description' (Geertz, 1973).
...only through our close reading of a select group of images could we uncover the two phenomena discussed below: "compelled activity" and "enforced idleness". For example, glassed-in porches overlooking the grounds were added to the original design and photographs show rows of men sitting on benches all facing outwards but with seemingly no access to the outside, despite earlier promises of 'pleasure grounds'.
The authors concluded that photographs need to be historically contextualised but that they also provide added meaning. The images, for example, enhanced the researchers' understanding of the overcrowding statistics.
The researchers were aware of the issues raised by photographs, such as invasion of privacy and confidentiality of subjects, and so decided not to publish photographs of people who may still be alive.
However, all research has ethical issues to address, not just work using photographs (see Section 10 for a broader discussion of research and publication ethics). The researchers also raised the issue of the costs of photography and of printing in journals. In an era of digital photography and the Internet, these cost issues are no longer relevant; more costly is the time needed to analyse what could be a vast array of images.
Contextualisation appears to be a key issue in ethnographic analyses of visual images. Becker (1995, p. 5), in distinguishing photographs as research data from photographs as art, asserted that 'Just as paintings get their meaning in a world of painters, collectors, critics, and curators, so photographs get their meaning from the way the people involved with them understand them, use them, and thereby attribute meaning to them.' For Becker:
Photographs get meaning, like all cultural objects, from their context. Even paintings or sculptures, which seem to exist in isolation, hanging on the wall of a museum, get their meaning from a context made up of what has been written about them, either in the label hanging beside them or elsewhere, other visual objects, physically present or just present in viewers' awareness, and from discussions going on around them and around the subject the works are about. (Becker, 1995, p. 10)
However, he admits the distinction between art and research data is not simple: 'leaving the context implicit does not make a photograph art'. In some cases, documentary work may be devoid of textual context. He cites Robert Frank's (1959) The Americans, which gives no more textual support to the images than most art photographs. However, it is documentary because 'the images themselves, sequenced, repetitive, variations on a set of themes, provide their own context, teach viewers what they need to know in order to arrive, by their own reasoning, at some conclusions about what they are looking at'.
Barbara Niskac (2011, p. 129) also addresses the issue of categorisation of photography and the interpretation of images:
I would argue that, depending on the context, photography can be seen as either artistic, touristic, documentary, or ethnographic. Unwritten rules regarding what is and what is not ethnographic are generally formed in our minds, rather than by methodologies themselves…. And for that matter, "an anthropological photograph is any photograph from which an anthropologist could gain useful, meaningful visual information" (Edwards, 1996). In various situations, images are invested with new (and perhaps conflicting) meanings by different audiences and at different stages of ethnographic research and representation. It is only in relation to the discourses that people use to define them and through representation that they gain a certain meaning.
Kharel (2015, p. 154), similarly, raised the issue of multiple interpretations in relation to his own research output:
...ethnographic films are open to audiences constructing their own meanings based on their own observation of the film. The meaning can be produced on any level: emotional, ideological or practical level. I have experienced this during my film A Life with Slate screening at international film festivals. I found several different interpretations of the film based on the audience's socio-cultural background, professional area, and level of film literacy.