Lies and statistics

Alison Petch,
Researcher 'The Other Within' project

"There are three kinds of lies: lies, damned lies and statistics." - Autobiography of Mark Twain (apparently)

In 2006 I published a paper about the use of statistics in the Relational Museum project. Certain aspects of this paper are as relevant to the Other Within project as they were to its predecessor so I include a (mostly silently) edited version of the paper here. The full version can be found in the Journal of Museum Ethnography, no. 18 (May 2006), pp. 149–156.

Overview of the Statistical Research

The statistical analyses carried out as part of the project were made possible by the fact that the primary records for all the museum’s object collections had been retrospectively computerized before the project began.[1] Indeed, the fact that all the records had been computerized and were thus available for statistical analysis was one of the inspirations for the development of the ‘Relational Museum’ project in the first place. The statistics were prepared using a version of the museum’s collections-management FilemakerPro database for all the objects accessioned up to and including 1945. Most of the analysis was of a very basic kind, allowing graphs, charts, and tables to be produced using Microsoft Excel.[2]
We tried to guard against potential biases in the statistics by asking standard questions for each geographical area and for each named collector. For the geographical statistics, we calculated the total number of objects from each continent, each region, and each country and broke these figures down into archaeological, ethnographic, and undetermined objects. We then broke these figures down decade-by-decade and by type of object.[3] We looked into how the figures correlated with the colonial status of the regions/countries vis-à-vis the United Kingdom and how they compared with each other. We also tried not to ignore the particular, paying attention where we could to statistics that seemed relevant to the geographical area in question even when they seemed to be specific to that area alone. We also looked at how all these figures might correlate with different types of collectors, such as colonial officers, missionaries, anthropologists, archaeologists, etc. Similarly, for the statistics relating to individual collectors, we investigated the total number of objects donated and/or bequeathed by the collector and then broke these down by whether they were archaeological, ethnographic, or undetermined; also by continent, region, and country. We then broke these figures down decade-by-decade and by type of object.[4]
These were the sorts of questions we regarded as being the most likely to provide statistics that would be useful in furthering the project’s key areas of investigation. In each instance, the questions were answered in full so that there is consistency between each set of statistics.
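The standard breakdowns described above (by geography, by archaeological, ethnographic, or undetermined status, and by decade) amount to grouped counts over the object records. What follows is a minimal sketch in Python rather than the project's actual workflow, which used FilemakerPro and Excel. The record fields, the sample values, and the function names are all invented for illustration.

```python
from collections import Counter

# Hypothetical records: field names and values are invented for illustration
# and do not reproduce the museum's database schema or its data.
RECORDS = [
    {"continent": "Australia", "category": "archaeological", "accession_year": 1940},
    {"continent": "Australia", "category": "undetermined", "accession_year": 1940},
    {"continent": "Australia", "category": "ethnographic", "accession_year": 1899},
    {"continent": "Africa", "category": "ethnographic", "accession_year": 1884},
    {"continent": "Africa", "category": "ethnographic", "accession_year": 1950},
]


def decade_of(year):
    """Return the first year of the decade containing `year` (1940 for 1945)."""
    return (year // 10) * 10


def tally(records, *keys, cutoff=1945):
    """Count records accessioned up to and including `cutoff`, grouped by `keys`.

    `keys` may name any record field, or the derived field "decade".
    """
    counts = Counter()
    for rec in records:
        if rec["accession_year"] > cutoff:
            continue  # outside the period under study
        row = dict(rec, decade=decade_of(rec["accession_year"]))
        counts[tuple(row[k] for k in keys)] += 1
    return counts


# The same standard questions, asked in the same way of every area.
by_continent = tally(RECORDS, "continent")
by_continent_and_category = tally(RECORDS, "continent", "category")
by_continent_and_decade = tally(RECORDS, "continent", "decade")
```

Running every area and every collector through one function of this kind is one way of obtaining the consistency between sets of statistics that the standard questions were meant to secure.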
One of the first sets of questions we asked related to the broad geographical provenance, by continent, of the collections. ...
Further statistical analysis was carried out on the collections from each continent; I have chosen the figures for the Australian collections as an illustration. Of the collections from Australia accessioned up to 1945, 14% had been recorded on the computer as ‘definitely’ archaeological and 13% as ‘definitely’ ethnographic, while the remaining 73% were recorded as either archaeological or ethnographic, that is, as undetermined. Moreover, 86% of all the objects from Australia were recorded as stone tools and 82% as being from Tasmania. The explanation for this is the acquisition in 1940 of Ernest Westlake’s collection of more than 13,000 Tasmanian stone tools, which represents 71% of the Australian collections accessioned by the museum up to 1945.
A further set of questions was asked about the collections donated and bequeathed to the museum by the six selected collectors; the statistics produced were used to compare and contrast their collections. In addition to the broad questions outlined above, we investigated whether there were other objects in the collection from members of the same family as the selected collector. ...
The examples I have discussed are, of course, only a very few of the many statistics generated by the project. In and of themselves, they tell us little. Rather, they are best regarded as no more than suggestive jumping-off or starting-points for further research, both internally within the museum and externally as they are made available to the research community through the project website and related publications. Even as starting-points, however, they need to be reliable. So how reliable are they?

Critique

In beginning to think about the reliability of the statistical information generated by the project, I could not escape the nagging doubts expressed in a number of well-known sayings. If one looks up ‘statistics’ in a dictionary of quotations one’s doubts are immediately confirmed. Most famously, Benjamin Disraeli remarked that ‘there are three kinds of lies: lies, damned lies, and statistics’, while others have made similar points, perhaps more subtly: Mark Twain remarked that ‘facts are stubborn things, but statistics are more pliable’, while the American humourist Evan Esar described statistics as ‘the only science that enables different experts to use the same figures to draw different conclusions’, and Jean Baudrillard concluded that ‘like dreams, statistics are a form of wish fulfillment’. More systematically, the compilers of a BBC website set up to help people understand statistics (BBC 2003) suggest that the following points be borne in mind when reviewing any statistics: ‘Where did the data come from? Who ran the survey? Do they have an ulterior motive for having the result go one way?’ and ‘How was the data collected? What questions were asked? How did they ask them? Who was asked?’ They also warn the prospective user to ‘Be wary of comparisons. Two things happening at the same time are not necessarily related, though statistics can be used to show that they are.’ Finally, they advise the user to ‘Be aware of numbers taken out of context. This is called ‘cherry-picking’, an instance in which the analysis only concentrates on such data that supports a foregone conclusion and ignores everything else.’
The accuracy of all statistics is, of course, linked inextricably to the precision of the data on which they are based. This is the first, and largest, problem with all statistics, the present ones included. As I have pointed out previously (Petch 2002, 2003, 2004), the primary documentation of the collections at the Pitt Rivers Museum is generally agreed to be of a very high standard, with most objects being associated with a good deal of information. However, there are still many objects for which the museum holds little or (almost) no information. This is not unusual for a museum of the age and size of the Pitt Rivers, but it is a fact that needs to be taken into account when assessing the value of statistical analyses.
For example, there are still approximately 100 ‘bulk’ entries in the database for the object collections. An example chosen at random—the first to be found on the database—is provided by the computerized record for PRM 1954.6.36, which contains no more information than was provided in the original accession book entry: ‘Graham Hutton…. Objects obtained by him in 1939 in Mexico.—Valley of Mexico, Teotihuacan culture. Box containing flake-knives of obsidian (or portions of them) & other pieces of worked obsidian.’ While ongoing work by collections staff continues to reduce the number of these unsatisfactory entries, which give little clue to the actual number of objects covered, it is difficult to envisage the number of them being reduced to zero in the near future—at least not without special funding.
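The effect of ‘bulk’ entries on any total can be made explicit: a count of records is not a count of objects, and a total that includes bulk entries is at best a minimum. The sketch below is a hedged illustration in Python and not a description of how the museum's database handles such entries. The `object_count` field, the identifiers, and the rule that a bulk entry stands for at least one object are all assumptions made for the example.

```python
# Hypothetical records: `object_count` is None where the entry gives no clue
# to the number of objects it covers (a 'bulk' entry).
RECORDS = [
    {"id": "example-1", "object_count": 1},
    {"id": "example-2", "object_count": 3},
    {"id": "example-3", "object_count": None},  # e.g. a box of worked obsidian
]


def count_objects(records):
    """Return (minimum_total, bulk_entries).

    Assumption: a bulk entry covers at least one object, so it adds 1 to the
    minimum; the true total is higher by an unknown amount.
    """
    minimum_total = 0
    bulk_entries = 0
    for rec in records:
        n = rec.get("object_count")
        if n is None:
            minimum_total += 1
            bulk_entries += 1
        else:
            minimum_total += n
    return minimum_total, bulk_entries


minimum_total, bulk_entries = count_objects(RECORDS)
print(f"at least {minimum_total} objects; {bulk_entries} bulk entry/entries uncounted")
```

Reporting the number of bulk entries alongside the total lets a reader see how far the figure might understate the collection.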
There are also a large number of objects the data for which is not sufficiently clear to be regarded as truly accurate. Many objects in the founding collection, for example, are insufficiently provenanced. An example, again chosen at random, is provided by the record for PRM 1884.7.28, again based on the accession book entry: ‘Primitive Food etc. Vessels (Substitutes for Pottery) Double gourd (figure of 8-shaped) ?Africa’. Some others are confusingly provenanced, such as PRM 1884.19.217: ‘Weapons Spears Darts Spear, all of forged iron ?India or East Africa Sale no 346’.
Moreover, the circumstances in which objects were originally found are too often not described, making it particularly difficult to assess whether some items should be classified on the database as archaeological or ethnographic. This is particularly true for the stone tools brought from Australia in the late nineteenth century. At this time, some Aboriginal people continued to use stone but most had begun to use metal or glass instead. Also, at the time many collectors carried out amateur excavations as well as retrieving surface finds and obtaining objects directly from Aboriginal people. Very few of the Australian stone tools in the museum’s collections came with detailed information about how they were collected. The absence of such information can mean that the very nature of the object may have to be regarded as unclear, for the time being at least.
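The practical consequence of undetermined classifications can be worked through with the Australian figures quoted in the overview (14% ‘definitely’ archaeological, 13% ‘definitely’ ethnographic, 73% either). The short Python sketch below is an illustration and not part of the project's own analysis; the function name is invented. It treats each ‘definite’ figure as a floor and adds the undetermined share to obtain a ceiling.

```python
def share_bounds(definite_pct, undetermined_pct):
    """Return the (lowest, highest) possible share for one category.

    The lowest share assumes no undetermined object belongs to the category;
    the highest assumes that every one of them does.
    """
    return definite_pct, definite_pct + undetermined_pct


# Figures for the Australian collections up to 1945, as quoted in the text.
archaeological = share_bounds(14, 73)  # (14, 87)
ethnographic = share_bounds(13, 73)    # (13, 86)

print(f"archaeological: between {archaeological[0]}% and {archaeological[1]}%")
print(f"ethnographic: between {ethnographic[0]}% and {ethnographic[1]}%")
```

On these figures either category could account for anything from about one-seventh to well over four-fifths of the Australian collections, which is why the ‘definite’ percentages alone say little about their overall character.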

Conclusions

So, on reflection, what do I think of the project’s statistics? On balance, I have concluded that the statistics are of great value, especially if in using them one bears in mind the sort of caveats outlined above.
With regard to the first point raised on the BBC website, I hope that I have made clear above the methodology adopted in the project. With regard to the second, I trust that my explanations in previous contributions to this journal of how the databases used to compile the statistics were prepared (Petch 1999, 2000, 2002) should set at rest any undue concerns. The third point does not seem to me to apply in the present case. ‘Cherry-picking’ is, of course, just what we have had to guard against in preparing the project monograph, and all users of the statistics, whether members of the museum’s staff or of the wider research community, will need to guard against it too as they make use of the resource that the project has created.
There are, of course, many uses to which the statistics could be put (in addition to those to which the ‘Relational Museum’ project itself has already put them). First, they provide overviews of the collections that are just not obtainable by any other means, thus allowing members of the general public, visitors, and researchers to understand the museum’s collections a little better. Secondly, if other museums produced similar statistics the possibility would be created for some interesting comparative work. For example, a detailed statistical comparison between the collections of the Pitt Rivers Museum and those of the Cambridge University Museum of Anthropology and Archaeology—a similar institution with a similar history—might yield a great deal of information about the development of museum ethnography (especially within university museums), patterns of acquisition, and so on. Or, at least, give rise to even more suggestive jumping-off points for further research.
The BBC site concludes, ‘Statistics do have a sort of magical appeal. They appear to the untrained eye to be based on complex maths that is difficult to understand. This is rubbish: statistics are easy to create. Accurate statistics are much more difficult to calculate.’ In producing statistics about the collections of the Pitt Rivers Museum between 1884 and 1945, my colleagues on the ‘Relational Museum’ project and I aimed to be accurate and clear. The users of the products of the project, including this article and the project website, may judge the extent of our success.

Acknowledgements

Five researchers have been employed during the three-and-a-half years of the project, which was funded by a major grant from the Economic and Social Research Council, whose generous support is again acknowledged here. The author and Frances Larson were employed throughout, while Sandra Dudley, Megan Price, and Chris Wingfield were also employed in the project’s earlier stages. Although this article is based primarily on my own research, I gratefully acknowledge the work of my colleagues and the contributions of the project’s joint directors, Christopher Gosden and Michael O’Hanlon. In 2007 a monograph based on the project’s findings, by Chris Gosden, Frances Larson, and myself, will be published by Oxford University Press. In the meantime, the raw data generated by the project has been made available on the project website at < http://history.prm.ox.ac.uk/ >.

Notes

1. Computerization of object records began at the museum in 1985, retrospective computerization of the earlier records being carried out on a piecemeal, project-by-project basis until 1999–2002 when a grant from the UK government’s Designation Challenge Fund was used to complete the task (for further information, see, for example, Coote et al. 2000; Petch 1999, 2002).
2. For examples of the charts, graphs, and tables, see the project website. A moot point that I do not have the space to consider here is the extent to which the presentation of statistical information in graphs, charts, and tables, rather than in prose as adopted here, gives additional—possibly spurious—authority to the figures.
3. Lists of the object names, classes, processes, materials, and other terms currently used in the museum’s cataloguing processes may be found in the relevant section of the museum’s website at < http://www.prm.ox.ac.uk/databases/ >. These draw, of course, on those developed earlier in the history of the museum by Beatrice Blackwood and others (Blackwood 1970).
4. It should be noted that the total number of objects in the founding collection is yet to be finally established. The collection was not systematically recorded until the 1920s (Blackwood 1970: 14–15; Coote et al. 1999: 65), and even then it seems that a number of objects were ‘missed’. As a result, ‘previously unentered’ objects continue to be identified during ongoing inventorying work.

References

BBC 2003. How to Understand Statistics, online at < http://www.bbc.co.uk/dna/h2g2/A1091350 >; created 28 July 2003, last accessed 21 March 2006.
Blackwood, Beatrice 1970. The Classification of Artefacts in the Pitt Rivers Museum, Oxford (Occasional Papers on Technology, 11), Oxford: Pitt Rivers Museum, University of Oxford.
Coote, Jeremy, Chantal Knowles, Nicolette Meister, and Alison Petch 2000. ‘Computerizing the Forster (“Cook”), Arawe, and Founding Collections at the Pitt Rivers Museum’, Pacific Arts, nos 19/20 (July), pp. 48–80.
Petch, Alison 1999. ‘Cataloguing the Pitt Rivers Museum Founding Collection’, Journal of Museum Ethnography, no. 11 (May), pp. 95–104.
—— 2002. ‘Today a Computerized Catalogue: Tomorrow the World’, Journal of Museum Ethnography, no. 14 (March), pp. 94–9.
—— 2003. ‘Documentation in the Pitt Rivers Museum: The Contribution of Sir Francis Knowles (1886–1953)’, Journal of Museum Ethnography, no. 15 (March), pp. 109–14.
—— 2004. ‘Collecting Immortality: The Field Collectors who Contributed to the Pitt Rivers Museum, Oxford’, Journal of Museum Ethnography, no. 16 (March), pp. 127–39.

ENGLAND: THE OTHER WITHIN

Analysing the English Collections at the Pitt Rivers Museum