Document analysis

core definition

Document analysis in social research is the process of analysing any cultural product to provide data for or insights into a research issue.

explanatory context

There is no single 'document analysis' method; various techniques (from content analysis to semiology) are used depending on the type of document and the intention of the research.


A document is any cultural product, including hand-written documents such as letters, printed documents, paintings, photographs, charts, maps, films, videos, television programmes and newspapers. The latter items are part of the mass media, and the whole field of media analysis is applied to them. Conversely, letters to family and friends, along with diaries, notes, drafts and files, would be regarded as personal documents and would not necessarily be analysed in the same way. Media analysis, for example, often seeks out the underlying ideology, while personal document research is more likely to seek to identify motivations.


'Document' as an all-encompassing term is not dissimilar to the term 'text' as used in social research.


Much of the advice about analysing documents refers to historical documents but applies equally to 'contemporary' documents; after all, once written, all documents are in effect historical.

analytical review

Heffernan (undated):

What is document analysis?

Document analysis is a social research method and is an important research tool in its own right and is an invaluable part of most schemes of triangulation. Documentary work involves reading lots of written material (it helps to scan the documents onto a computer and use a qualitative analysis package). A document is something that we can read and which relates to some aspect of the social world. Official documents are intended to be read as objective statements of fact but they are themselves socially produced.

Sources of Documents: 1. Public records; 2. The media; 3. Private papers; 4. Biography; 5. Visual documents
The term 'biography' has two meanings in social research. Firstly, it is a particular style of interviewing, where the informant is encouraged to describe how his or her life (or some aspect of it) has changed and developed over time. In doing so, they reflect his/her own conception of self, identity and personal history. Secondly, 'biography' refers to a work that draws on whatever materials are available to an author to represent an account of a person's life and achievements. Narrative analysis is used to elicit results.

Heffernan suggests the following forms of analysis: content analysis, semiotics, discourse analysis, conversation analysis, grounded theory and 'interpretative analysis', which she says 'aims to capture hidden meaning and ambiguity. It looks how messages are encoded, latent or hidden. You are also acutely aware of who the audience is'.

Powell (2013), referring to genealogical research, provides sound advice applicable to any document analysis:

It can be easy when examining a historical document that relates to an ancestor to look for the one "right answer" to our question - to rush to judgement based on the facts presented in the document or text, or the conclusions we make from it. It is easy to look at the document through eyes clouded by personal bias and perceptions engendered by the time, place and circumstances in which we live. What we need to consider, however, is the bias present in the document itself. The reasons for which the record was created. The perceptions of the document's creator. When weighing the information contained in an individual document we must consider the extent to which the information reflects reality. Part of this analysis is weighing and correlating evidence obtained from multiple sources. Another important part is evaluating the provenance, purpose, motivation and constraints of the documents which contain that information within a particular historical context.
Questions to consider for every record we touch:

1. What type of document is it?
Is it a census record, will, land deed, memoir, personal letter, etc.? How might the record type affect the content and believability of the document?

2. What are the physical characteristics of the document?
Is it handwritten? Typed? A pre-printed form? Is it an original document or a court-recorded copy? Is there an official seal? Handwritten notations? Is the document in the original language in which it was produced? Is there anything unique about the document that stands out? Are the characteristics of the document consistent with its time and place?

3. Who was the author or creator of the document?
Consider the author, creator and/or informant of the document and its contents. Was the document created first-hand by the author? If the document's creator was a court clerk, parish priest, family doctor, newspaper columnist, or other third party, who was the informant?

What was the author's motive or purpose for creating the document? What was the author or informant's knowledge of and proximity to the event(s) being recorded? Was he educated? Was the record created or signed under oath or attested to in court? Did the author/informant have reasons to be truthful or untruthful? Was the recorder a neutral party, or did the author have opinions or interests that might have influenced what was recorded? What perception might this author have brought to the document and description of events? No source is entirely immune to the influence of its creator's predilections, and knowledge of the author/creator helps in determining the document's reliability.

4. For what purpose was the record created?
Many sources were created to serve a purpose or for a particular audience. If a governmental record, what law or laws required the document's creation? If a more personal document such as a letter, memoir, will, or family history, for what audience was it written and why? Was the document meant to be public or private? Was the document open to public challenge? Documents created for legal or business reasons, particularly those open to public scrutiny such as those presented in court, are more likely to be accurate.

5. When was the record created?
When was this document produced? Is it contemporary to the events it describes? If it is a letter is it dated? If a bible page, do the events predate the bible's publication? If a photograph, does the name, date or other information written on the back appear contemporaneous to the photo? If undated, clues such as phrasing, form of address, and handwriting can help to identify the general era. First-hand accounts created at the time of the event are generally more reliable than those created months or years after the event occurred.

6. How has the document or record series been maintained?
Where did you obtain/view the record? Has the document been carefully maintained and preserved by a government agency or archival repository? If a family item, how has it been passed down to the present day? If a manuscript collection or other item residing in a library or historical society, who was the donor? Is it an original or derivative copy? Could the document have been tampered with?

7. Were there other individuals involved?
If the document is a recorded copy, was the recorder an impartial party? An elected official? A salaried court clerk? A parish priest? What qualified the individuals who witnessed the document? Who posted the bond for a marriage? Who served as godparents for a baptism? Our understanding of the parties involved in an event, and the laws and customs which may have governed their participation, aids in our interpretation of the evidence contained within a document.

In-depth analysis and interpretation of a historical document is an important step in the genealogical research process, allowing us to distinguish between fact, opinion, and assumption, and explore reliability and potential bias when weighing the evidence it contains. Knowledge of the historical context, customs and laws influencing the document can even add to the evidence we glean. The next time you hold a genealogical record, ask yourself if you have really explored everything the document has to say.
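
Powell's seven questions can be kept as a structured record for each document, so that unanswered questions are easy to spot. The following is only a minimal sketch in Python: the class name DocumentAssessment, its field names and the example values are hypothetical illustrations and are not part of Powell's text.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DocumentAssessment:
    """One record per document; each field mirrors one of the seven questions.

    All names here are hypothetical and chosen only for illustration.
    """
    document_type: Optional[str] = None             # 1. What type of document is it?
    physical_characteristics: Optional[str] = None  # 2. Physical characteristics
    author: Optional[str] = None                    # 3. Author, creator or informant
    purpose: Optional[str] = None                   # 4. Why the record was created
    date_created: Optional[str] = None              # 5. When the record was created
    custody_history: Optional[str] = None           # 6. How it has been maintained
    other_parties: Optional[str] = None             # 7. Other individuals involved

    def unanswered(self) -> List[str]:
        """Return the names of the questions that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a will copied by a court clerk, with most questions still open.
assessment = DocumentAssessment(document_type="will", author="court clerk")
print(assessment.unanswered())
```

A record of this kind only tracks whether each question has been addressed; judging the reliability of the answers remains an interpretative task for the researcher.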

Bélanger (2006) also provides advice on dealing with historical documents:

A document may be of various types: a written document, a painting, a monument, a map, a photograph, a statistical table, a film or video, etc. Anything from the past that helps us learn what happened, and why, is a document. The technique of document analysis outlined below is generally applicable to all types of documents. However, it is especially appropriate for the written documents.

Analyzing a document (external analysis)
The introduction of the document: You do not have to follow exactly the sequence of issues given below. The first purpose of this section is to introduce your document and its subject (briefly) as well as to clarify the following:

a. The author: Who is the author? What do we know about the author? What motive (purpose) might the author have had in writing this document? What biases or assumptions might colour the views of the author? What is the degree of familiarity of the author with the subject discussed in the document? Was the author a direct observer of the event/issue [if this is pertinent] or was the information obtained second-hand? Had the author any personal involvement in the events/issues described [if pertinent]? Do we have any reason to think that the author does not describe what he/she believes to be true?

b. The time frame: When was this document produced? Is it contemporary to the events/issues it describes? In what context was it produced? How has it come down to us? Could it have been tampered with?

c. Place: Where was this document produced? Does the geographical location influence the content? Was this document meant to be public or private?

d. Category of document: What is the category in which this document falls (memoirs, poem, novel, speech, law, study, sermon, Church document, song, letter, etc.)? How would the type of writing affect the content and believability of the document? Is the document in the original language in which it was produced? Is the translation authoritative?

e. Audience: What is the intended audience of this document? Was the author representing a specific group? Or addressing the document to a specific group (or speaking to a specific group)?
Analyzing the document (internal analysis)
Main body of the document:

a. Content of the document: What does the author argue (main theme; secondary themes: summarize them briefly but thoroughly. You might need to regroup ideas under some themes)? What specific information of importance is provided? What light does it shed on the society/events/issues described? Do not only summarize but analyze the document as well: What does the author really mean? Does the source tell a consistent story? Are there contradictions? Evident errors [why would this be]? Does the source provide us unwittingly with information (what can be read between the lines)? Are there allusions made by the author that need to be explained?

b. Believability of the document: Given the external analysis and the content of the document, how credible is the information? Is it corroborated by other sources? Are important facts ignored? Why would such facts be omitted? Using other credible evidence, can you confirm or contradict the thesis of the document? Is the testimony sincere, exact? What makes you think so? Are there assertions made that are incorrect?

Campus Labs (2011) states:

Document analysis is a form of qualitative research in which documents are interpreted by the researcher to give voice and meaning around an assessment topic. Analyzing documents incorporates coding content into themes similar to how focus group or interview transcripts are analyzed. A rubric can also be used to grade or score a document. There are three primary types of documents:
• Public Records: The official, ongoing records of an organization’s activities. Examples include student transcripts, mission statements, annual reports, policy manuals, student handbooks, strategic plans, and syllabi.
• Personal Documents: First-person accounts of an individual’s actions, experiences, and beliefs. Examples include calendars, e-mails, scrapbooks, blogs, Facebook posts, duty logs, incident reports, reflections/journals, and newspapers.
• Physical Evidence: Physical objects found within the study setting (often called artifacts). Examples include flyers, posters, agendas, handbooks, and training materials.
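
The idea of coding content into themes and then scoring a document against a rubric can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the codebook, its themes and keywords, and the three-level rubric are all hypothetical, and simple keyword matching is a crude stand-in for the interpretative coding a researcher would actually carry out.

```python
import re

# Hypothetical codebook: each theme is indicated by a few keywords.
CODEBOOK = {
    "belonging": {"community", "welcome", "inclusive"},
    "safety": {"incident", "report", "security"},
}


def code_document(text, codebook=CODEBOOK):
    """Count the words in the text that match each theme's keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        theme: sum(1 for word in words if word in keywords)
        for theme, keywords in codebook.items()
    }


def rubric_score(counts, threshold=2):
    """Assumed rubric: 0 = theme absent, 1 = mentioned, 2 = emphasised."""
    return {
        theme: 0 if n == 0 else 1 if n < threshold else 2
        for theme, n in counts.items()
    }


sample = "The community centre is inclusive. One incident was filed."
counts = code_document(sample)
print(counts)
print(rubric_score(counts))
```

In practice the codebook would be developed and revised by reading the documents, and counts of this kind would support rather than replace the researcher's interpretation.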

Marshall (1998) discussed personal document analysis:

personal documents: These are documents, used in social science, which record part of a person's life, most frequently in their own words. The most obvious examples are letters, diaries, biographies and life-histories, but the term can be stretched to include many other items from photographs to inscriptions on tombstones. (The surprisingly wide sources of data are fully described in K. Plummer's Documents of Life, 1983.) Personal documents aim to capture the subjective side of a person's life and are valuable as part of an idiographic research strategy. They are often used in the early and exploratory stages of research but can also be used as case-studies for theory generation and falsification. Personal documents were particularly popular in the work of some of the early Chicago sociologists: for example, Clifford Shaw gathered many life-histories of delinquents, and the classic study by William Isaac Thomas and Florian Znaniecki, The Polish Peasant in Europe and America (1918), analysed a series of letters, as well as presenting a major life-history.

Corti (1993), discussing the use of diaries in social research, states:

Biographers, historians and literary scholars have long considered diary documents to be of major importance for telling history. More recently, sociologists have taken seriously the idea of using personal documents to construct pictures of social reality from the actors' perspective (see Plummer's 1983 book Documents of Life). In contrast to these 'journal' type of accounts, diaries are used as research instruments to collect detailed information about behaviour, events and other aspects of individuals' daily lives.

Self-completion diaries have a number of advantages over other data collection methods. First, diaries can provide a reliable alternative to the traditional interview method for events that are difficult to recall accurately or that are easily forgotten. Second, like other self-completion methods, diaries can help to overcome the problems associated with collecting sensitive information by personal interview. Finally, they can be used to supplement interview data to provide a rich source of information on respondents' behaviour and experiences on a daily basis. The 'diary interview method' where the diary keeping period is followed by an interview asking detailed questions about the diary entries is considered to be one of the most reliable methods of obtaining information.

associated issues

Paul Lazarsfeld, a member of the Columbia School, is best known for his development of multivariate analysis techniques and the notion of the interchangeability of indicators, and was criticised by C. Wright Mills as an abstracted empiricist. He also contributed to an influential account of the effects of unemployment that used personal document analysis (Jahoda et al., 1933).

related areas

See also

content analysis



