This is probably you (this article being written for researchers), but what do you want (obviously not an easy life or a guarantee of riches and fame)? You want to know things about the world, and to be sure that what you know is right. You want other people to know things, and you want to be able to rely on their knowledge, so that you can develop further knowledge together. And you want those other people to know that you know things, to share in your knowledge, trusting that you are a reliable source of knowledge – not just other researchers, but also sometimes members of the public, the media, politicians, students, etc.
To achieve these goals, researchers are doing an increasingly wide range of activities. We can categorise their activities into a sequence of stages (although it isn’t always sequential). Note that I use the term “created” rather than “collated” – digital representations are representations, not the thing itself. Making digital representations is a creative act, a sampling, an interpretation, even when done methodically. The stages are:
- Describing aspects of the world – creating representations of things and events, often having to judge which features to include and what to leave out. When we digitise an artefact, for example, we produce a representation of it at a point in time, and exclude some details (a conventional photograph doesn’t capture the internals of an artefact). Describing produces data (note there is a debate about whether “data” is the plural of “datum”, but I prefer the terms “data” and “data points” as that fits better with my programmer perspective). Stage 7 concerns the critical evaluation of the creation of data, but in reality this has to happen throughout to ensure research quality. We also need to be aware of ethical and legal issues, not just in representing living people, but also in representing cultures and contested facts (think of all those museums now struggling with the dilemma of holding artefacts taken from other cultures without permission).
- Organising our data into data structures – lists and relationships, where the relationships carry significant meaning (e.g. data about a set of artefacts relating to data about manufacturers). This can include more story-like or theatrical representations, and now also immersive virtual reality. It’s quite usual for these data structures to be created before the data is created, or for the structures to emerge with the data creation. Again we need to be aware of ethical and legal implications, as well as be concerned for the quality of what we are creating.
- Including “metadata” about data and data structures, to record the details of how the data and relationships were created and changed – including by whom, when, and why.
- Making data, relationships, and metadata available for use by ourselves and other researchers. We need to consider the lifecycle of the data – how will it be sustained over time so that research based on it may be used and evaluated in the future?
- Analysing data and relationships to generate and test hypotheses about the world (as it is now, as it was in the past, as it might be in the future), so as to extend our understanding. This involves speculation, uncertainty, often playful experimentation, and messiness. There’s also no reason why this has to wait until all the data is created. Again, in reality this is an emergent and non-linear process.
- Reporting on what we have found in creating and analysing data and relationships. Although this seems like a final stage, researchers are actually communicating in many ways and with a wide variety of people all through the process.
- Critically evaluating data, relationships, analysis, conclusions, and the processes followed. Again this doesn’t just happen where the official process says it should happen, in peer review, but rather as a continuous process. For example, when creating data, we need to reflect critically on the choices we make when we decide which features are salient and what to leave out. Data structures constrain our thinking in specific ways, so we need to think carefully about how we choose them. For example, in a database of Irish history I once built, we began collecting dates from archives with the assumption that all dates are certain. But it turned out that there is a range of uncertainty in the setting of dates. So we added an additional “cf.” field to identify uncertain dates. We could have gone further, and added fields to indicate ranges of possible dates, or the confidence with which a date is ascribed. These are decisions that must be made early. Critical evaluation of decisions allows us to get things right and avoid unanticipated consequences later on.
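To make the date example in the last stage more concrete, here is a minimal sketch in Python of how uncertainty might be built into a date record from the start. The class and field names (`ArchiveDate`, `uncertain`, `earliest`, `latest`, `confidence`) are illustrative assumptions, not the schema of the database described above. The `uncertain` flag stands in for the “cf.” field, and the other fields show the “going further” options of a date range and a confidence value.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class ArchiveDate:
    """A date taken from an archive, with its uncertainty made explicit.

    All field names are hypothetical and chosen for illustration only.
    """
    value: date                         # best single estimate of the date
    uncertain: bool = False             # simple flag, like the "cf." field
    earliest: Optional[date] = None     # lower bound of the possible range
    latest: Optional[date] = None       # upper bound of the possible range
    confidence: Optional[float] = None  # 0.0 (a guess) to 1.0 (certain)

    def __post_init__(self) -> None:
        # A range needs both ends, and must contain the best estimate.
        if (self.earliest is None) != (self.latest is None):
            raise ValueError("give both earliest and latest, or neither")
        if self.earliest is not None and self.latest is not None:
            if not (self.earliest <= self.value <= self.latest):
                raise ValueError("value must lie within earliest..latest")
        if self.confidence is not None and not (0.0 <= self.confidence <= 1.0):
            raise ValueError("confidence must be between 0.0 and 1.0")

    def span(self) -> Tuple[date, date]:
        """Return the range of possible dates; a single day if no range is set."""
        if self.earliest is None or self.latest is None:
            return (self.value, self.value)
        return (self.earliest, self.latest)


# A date recorded as certain, and one recorded with a range and a confidence.
certain = ArchiveDate(date(1850, 3, 1))
ranged = ArchiveDate(
    date(1850, 3, 1),
    uncertain=True,
    earliest=date(1850, 1, 1),
    latest=date(1850, 6, 30),
    confidence=0.6,
)
```

The point of the sketch is the design choice rather than the code: once analysis has been written against a single “certain” date field, adding a range or a confidence value later means revisiting every record and every query, which is why the decision is best made early.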
In practice each of these steps may be done by the same people, or specialists may be employed. There are people who are skilled at digitising physical artefacts (with more specialisms within that field), there are people who work solely with digital sources and archives, while others specialise in storytelling (e.g. film makers), and others in analysing data. Some of these people would identify as researchers, while others apply their profession more generally, beyond research. But when doing this work for research purposes, all need to adopt the concerns, values, and perspective of the researcher.
Clarity of information and meaning, transparency of origin and process, reliability, integrity – these things matter. But also efficiency and simplicity – your time is precious, you want systems that allow you to get to the result you seek quickly and without unnecessary effort. At the same time, you want connectivity with other researchers, and between data from different sources – this adds complexity, and poses a threat to your other values (simplicity, reliability, integrity, etc.). And to make matters even more tricky, you want to be creative, making the conditions for new perspectives to emerge – this adds a degree of uncertainty and messiness that can sometimes be hard to manage.
You are probably somewhere on a continuum between these poles – perfect integrity and creative messiness. It’s worth reflecting on that now, as it impacts on how you relate to the other roles, especially the programmer.