Content analysis
Summary
Content analysis is a method of summarising a large body of fairly short
statements into a small statistical table in a report. The method described
here presupposes a spreadsheet; specialised computer programs are also
available for the same purpose.
Benefits
This method imposes a discipline on the process of summarising
user comments that will produce a more objective result than simply
picking out 'representative statements.'
Method
The method of content analysis (CA) described here is applicable
to analysing a body or corpus of many short statements or records.
Each statement should be no more than approximately 20 words long and
there should be at least 12 such statements in the corpus. The statements
may come from different persons, from one person, or be notes on
the behaviour of one or more persons. The starting point in any
case is a series of short records which can be represented in written
form.
First ensure that each record represents one and only one theme.
Look carefully at statements with connectives such as 'and' and
'but' in them. Such statements may need to be broken down into simpler
units. Enter the statements into the second column of a spreadsheet,
one statement per cell. You may if you wish add any identifying
information about each statement in the cell to its right.
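As a rough illustration of the screening step above, the following Python sketch flags statements that contain 'and' or 'but' so they can be reviewed by hand. The function name needs_review and the sample statements are hypothetical, and a flagged statement is only a candidate for splitting; the decision remains a judgement for the analyst.

```python
import re

# Connectives that often signal more than one theme in a statement.
CONNECTIVES = ("and", "but")

def needs_review(statement: str) -> bool:
    """Return True if the statement contains a connective as a whole word."""
    pattern = r"\b(" + "|".join(CONNECTIVES) + r")\b"
    return re.search(pattern, statement, flags=re.IGNORECASE) is not None

# Hypothetical sample statements, not taken from any real study.
statements = [
    "The fonts are too small and the colours are hard to tell apart",
    "I could not find the print command",
    "Help was useful but slow to load",
]

# Statements to inspect by hand before entering them in the spreadsheet.
flagged = [s for s in statements if needs_review(s)]
```

Matching on whole words avoids false alarms from words such as 'command' that merely contain a connective.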
In each cell to the left of the statement, insert a numeric code
corresponding to the theme of the statement. For instance, any mention
of the legibility of the screen fonts may be coded as a 20. It is
best to code these in tens (10, 20, 30 etc.) at first.
Once you have coded most of the items, sort the rows of data
by the leftmost column. Cells which are as yet uncategorised will
appear together; you will be able to update your categorisation
scheme, and add new levels of 'delicacy' to your analysis by
using units, and if necessary decimal points. Thus legibility and
colour may be coded as 22, legibility and size as 24, and so on.
References to legibility pure and simple will still however remain
as 20s.
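The sorting and refinement steps can also be sketched outside a spreadsheet. The Python fragment below is a minimal sketch, assuming each row is a (code, statement) pair and that an uncoded statement carries None in place of a code; the sample rows and the helper broad_category are hypothetical.

```python
# Each row mirrors the spreadsheet layout: (code, statement).
# A code of None marks a statement that has not been categorised yet.
rows = [
    (20, "Menu text is hard to read"),
    (None, "Installer asked for a restart twice"),
    (22, "Grey text on blue is unreadable"),
    (10, "Could not find the print command"),
    (24, "Labels are too small"),
    (None, "Manual is out of date"),
]

def sort_key(row):
    code, _ = row
    # Uncoded rows sort after all coded rows, so they appear together.
    return (code is None, code if code is not None else 0)

sorted_rows = sorted(rows, key=sort_key)

def broad_category(code):
    """Map a refined code such as 22 or 24.5 back to its tens-level theme."""
    return int(code // 10) * 10
```

Because the refined codes share their tens digit with the parent theme, a rolled-up count per broad theme remains possible after the scheme has been made more delicate.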
Iteratively refine and sort until you have categorised the items
to your satisfaction. It is common to have a number of 'miscellaneous'
items. At this stage you may add up the number of instances of each
category, and rank order the record types by frequency of occurrence
in the corpus. This enables you to make statements such as "the
most frequent cause of complaint about this software is the legibility
of the menu wordings."
In academic research, CA is usually also subjected to a process of
verification. In such a process, a second rater applies the categories
that the first rater generated to each of the statements 'blind', i.e.
without knowing how rater one categorised them. A common criterion is
80% agreement between the raters, which is taken to show that the
categories are generally replicable. More
stringent methods and criteria can be employed, but they are not
usually applicable to usability testing.
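The agreement check can be computed as simple percent agreement. The sketch below assumes the two raters' codes are held in two lists in the same statement order; the function name percent_agreement and the sample codes are hypothetical. It does not correct for chance agreement, which the more stringent methods mentioned above would do.

```python
def percent_agreement(rater_one, rater_two):
    """Percentage of statements given the same code by both raters."""
    if len(rater_one) != len(rater_two):
        raise ValueError("Both raters must code the same statements")
    if not rater_one:
        raise ValueError("No statements to compare")
    matches = sum(a == b for a, b in zip(rater_one, rater_two))
    return 100.0 * matches / len(rater_one)

# Hypothetical codes assigned to the same ten statements.
rater_one = [20, 20, 10, 30, 22, 10, 20, 30, 10, 20]
rater_two = [20, 20, 10, 30, 20, 10, 20, 30, 10, 10]

agreement = percent_agreement(rater_one, rater_two)
# The 80% threshold is the criterion mentioned in the text.
replicable = agreement >= 80.0
```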
More Information
Two classic texts on CA are:
Holsti, O.R. 1969. Content Analysis for the Social Sciences and
the Humanities. Reading, MA: Addison-Wesley.
Krippendorff, K. 1980. Content Analysis: An Introduction to its
Methodology. Beverly Hills, CA: Sage Publications.
See also the discussion in:
Kirakowski, J. and Corbett, M. 1990. Effective Methodology for the
Study of Human-Computer Interaction. North Holland/Elsevier.
Computer programs to help with content analysis can be found at:
NUDIST
(the industry leader) http://www.qualisresearch.com/
http://www.simstat.com/wordstat.htm
Alternative Methods
Card sorting is a way of carrying out CA that emphasises
the subjective nature of the sorting activity and keeps the evaluator
'at a distance' from making decisions about the data. Concept Walls
and Affinity Diagrams are ways of eliciting the latent structure
in the records being sorted, usually expressed as hierarchies.
Next Steps
CA is usually carried out as part of an analysis of a large corpus
of data, for instance, audience reactions or user suggestions. It
is not uncommon for 12 person-hours of work to be summarised in one
small table in a report. The table is usually 2 columns by n rows,
where in each row the first column is a brief summary of the category,
and the second column is the percentage frequency of occurrence
of that category in the corpus.
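A table of this shape can be produced from the list of assigned codes. The Python sketch below assumes one code per statement and a lookup from code to category summary; the labels and codes shown are hypothetical examples, not results from any study.

```python
from collections import Counter

# Hypothetical category summaries keyed by code.
labels = {10: "Navigation", 20: "Legibility", 30: "Speed"}

# Hypothetical codes, one per statement in the corpus.
codes = [20, 10, 20, 30, 20, 10, 20, 30]

counts = Counter(codes)
total = len(codes)

# Rank by frequency (highest first); ties are broken by code number.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Two columns per row: category summary, percentage of the corpus.
table = [(labels[code], round(100.0 * n / total, 1)) for code, n in ranked]

for label, pct in table:
    print(f"{label:<12}{pct:>6.1f}%")
```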
Case studies
http://irm.cit.nih.gov/itmra/weptest/app_a6.htm
http://www.mindspring.com/~etamplin/research/5305.htm
West, M.D., ed. 2001. Theory, Method, and Practice in Computer Content
Analysis. Westport, CT: Ablex.
Background Reading
You can find a page of resources on content analysis at: http://www.gsu.edu/~wwwcom/