Participation and contribution quality

Pitch In

In their study on technology-mediated social participation, Preece and Shneiderman (2009, p. 18) define a contribution as “an individual act that adds to a larger communal effort”. For crowdsourcing cultural heritage (CCH), as for all forms of crowdsourcing, both the number of contributors and the quality of their contributions affect the success of a project. Considering these two challenges from the perspective of academic and cultural heritage institutions helps us to better understand the goals of websites for CCH.

Crowd participation

One of the main challenges for CCH projects is “gaining critical mass of engaged and trusted participants” (Crowd Consortium, 2015, p. 40). Crowdsourcers reporting on new initiatives frequently share impressive statistics related to the number of visitors to their website; while this provides an indication of the level of interest in the project and the impact of project promotion, meaningful engagement is only achieved when site visitors are converted to active contributors (Crowd Consortium, 2015, p. 96; McKinley, 2012; Ridge, 2014; Wald et al., 2015).

The untapped potential of website visitors can be significant, as evidenced by a study on the National Library of Finland project Digitalkoot. While the project was considered relatively successful, only 15% of visitors to the site participated in the first 51 days of the project (Chrons & Sundell, 2011). Similarly, of the 1,207 online visitors who registered to participate in the first six months of the Transcribe Bentham project, only 21% did any transcription (Causer, Wallace, & Tonra, 2012).

Generally funded for a limited time, CCH projects must benefit from as many volunteer contributions as possible before funding ends for the staff who support volunteers, moderate contribution quality, and maintain crowdsourcing systems (Ridge, 2014). While numerous case studies report that a core group of dedicated “super contributors” is responsible for completing the majority of the work (Dunn & Hedges, 2012; Noordegraaf et al., 2014; Ridge, 2014), these members of the crowd must still be recruited and retained.

The participation of short-term contributors, who are sometimes referred to as “dabblers” (Eveleigh, Jennett, Blandford, Brohan, & Cox, 2014), “drive-bys” (Ridge & Vershbow, 2014), and “the long-tail” (Michelucci, 2013, p. 688; Ridge & Vershbow, 2014; Wald et al., 2015), is important for several reasons. Research on online communities has found that people are more likely to participate on a website when there is evidence of other users (Kraut & Resnick, 2012; Preece & Shneiderman, 2009). Being part of a community is also a common motivation for volunteering (Ridge, 2014). Furthermore, the purpose of CCH is not solely task completion, but also outreach and social engagement; CCH project teams working within publicly funded institutions need to encourage and demonstrate widespread engagement between the institution and the community, and the engagement of the community with cultural heritage collections (Dunn & Hedges, 2013, p. 149; Ridge, 2014, p. 217).

Case studies on CCH have acknowledged the role of marketing and media coverage in recruiting volunteers (McKinley, 2011; Ridge, 2014, p. 180; Taranto, 2011), but the primary goal of project promotion is to direct traffic to the project site. As the following section will discuss in more detail, it is the project website that plays the largest role in recruiting and retaining volunteers.

Contribution quality

As crowdsourcing becomes more widely adopted by academic and cultural heritage institutions, there is more evidence to support the value and accuracy of crowd contributions (Manzo, Kaufman, Punjasthitkul, & Flanagan, 2015). Nevertheless, data quality continues to be a recurring theme in research on CCH. This is understandable, given the reputations of research and collecting institutions as authoritative sources of accurate information (Earle, 2014; M. Ellis, 2011; Ridge, 2014). In Crowdsourcing our Cultural Heritage (Ridge, 2014, p. 215), Eveleigh paints a picture of issues surrounding quality from the perspective of archives:

Crowdsourcing initiatives in archives, as in related professional fields, are also haunted and constrained by the fear that a contributor might be wrong, or that descriptive data might be pulled out of archival context, and that researchers using collaboratively authored resources might somehow swallow all of this without question or substantiation.

In response to these concerns, some CCH websites require the same micro-task to be completed by several volunteers, and rely on computer algorithms to determine the most common response and, by extension, data accuracy (Ridge, 2014; Ridge & Vershbow, 2014). While this approach is common for projects that involve relatively straightforward tasks such as tagging, small-scale transcription, and data correction, not all CCH tasks are suited to this form of quality assurance. Some projects require members of the project team to manually moderate contributions, and others invite online volunteers to perform this task.
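The "most common response" approach described above can be illustrated with a minimal sketch. It is not taken from any of the cited projects: the function name `consensus`, the `min_agreement` threshold, and the light normalisation of case and spacing are all assumptions made for illustration, and a real project would choose its own rules for comparing responses.

```python
from collections import Counter


def consensus(responses, min_agreement=0.5):
    """Pick the most common response to one micro-task.

    Returns (answer, share), where share is the fraction of volunteers
    who gave that answer. The answer is None when no response is shared
    by more than `min_agreement` of the volunteers (hypothetical rule).
    """
    if not responses:
        return None, 0.0

    # Assumption: differences in case or spacing should not split the vote.
    normalised = [" ".join(r.split()).lower() for r in responses]

    # Count identical responses and take the most frequent one.
    answer, votes = Counter(normalised).most_common(1)[0]
    share = votes / len(normalised)

    # Without a clear majority, leave the task for manual moderation.
    if share <= min_agreement:
        return None, share
    return answer, share


# Three hypothetical volunteers transcribe the same surname.
answer, share = consensus(["Smith", "smith ", "Smyth"])
```

In this sketch a task with no clear majority returns `None`, which corresponds to the cases where manual moderation by the project team or by other volunteers would be needed.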

CCH researchers have responded to the issue of data quality by questioning how crowd contributions should be integrated with online collections (Burford, 2012; Earle, 2014; Ridge, 2014), and by exploring how the design of websites can better support quality contributions (Crowd Consortium, 2015; Hansen, Schone, Corey, Reid, & Gehring, 2013; Jennett & Cox, 2014; McKinley, 2013).