Crowdsourcing CCCT Seminar @UvA

Friday 22 Nov 2013, 16.00-17.00, Science Park 904 (room B0.201)

Julia Noordegraaf & Angela Bartholomew (Faculty of Humanities, UvA)
Modeling Crowdsourcing for Cultural Heritage
The Modeling Crowdsourcing for Cultural Heritage (MOCCA) project aims to help galleries, libraries, archives, and museums run more effective crowdsourcing projects. Its intended outcome is a tool that supports cultural heritage professionals in designing such projects. A first evaluation of existing models and projects shows that the specific conditions of individual crowdsourcing projects, such as the characteristics of the institutions and collections, the level of openness, rewards, and other forms of crowd management, contribute greatly to a project's success or failure. Our challenge is to model these conditions in a structure that allows heritage professionals to determine the design criteria relevant to their specific purposes. My presentation will focus on this modeling problem as input for a brainstorm and discussion.
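
To make the modeling problem concrete, here is a hypothetical sketch of how project conditions could be encoded and mapped to design criteria. The condition names, thresholds, and rules below are illustrative assumptions, not MOCCA's actual model:

```python
# A hypothetical sketch of mapping crowdsourcing project conditions to
# design criteria; all fields and rules are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ProjectConditions:
    institution_type: str   # e.g. "museum", "library", "archive"
    collection_size: int    # number of objects to be processed
    openness: str           # "open", "moderated", or "closed"
    rewards: bool           # whether contributors receive rewards

def design_criteria(c: ProjectConditions) -> list[str]:
    """Derive design recommendations from a project's conditions."""
    criteria = []
    if c.collection_size > 100_000:
        criteria.append("break the work into small, well-defined microtasks")
    if c.openness == "open":
        criteria.append("plan quality control via redundancy or expert review")
    if not c.rewards:
        criteria.append("emphasise intrinsic motivation and community building")
    return criteria

print(design_criteria(ProjectConditions("museum", 250_000, "open", False)))
```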

Lora Aroyo (Computer Science Department, VU University Amsterdam)
Crowd Truth: Disagreement in Crowdsourcing is not Noise but Signal
One of the critical steps in analytics for big data is creating a human-annotated ground truth. Crowdsourcing has proven to be a scalable and cost-effective approach to gathering ground truth data, but most annotation tasks rest on the assumption that each annotated instance has a single right answer. From this assumption it follows that ground truth quality can be measured by inter-annotator agreement, and unfortunately crowdsourcing typically produces high disagreement. We have been working from a different assumption: that disagreement is not noise but signal, and that crowdsourcing can not only be cheaper and more scalable but also of higher quality. In this talk we present a framework for continuously gathering, analyzing, and understanding large amounts of gold-standard annotation disagreement data. We discuss experimental results demonstrating that there is useful information in human disagreement on annotation tasks: our results show .98 accuracy in detecting low-quality crowd workers and .87 F-measure in recognizing sentences useful for training relation extraction systems.
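
As a rough illustration of the disagreement-as-signal idea, the sketch below scores each worker by how similar their annotation vector is to the aggregate of the other workers' vectors. The toy data and the cosine-similarity measure are assumptions for illustration, not the authors' published metrics:

```python
# A minimal sketch of a disagreement-based worker quality score, loosely
# inspired by the talk's theme; data and metric are illustrative assumptions.
import numpy as np

# Rows: workers, columns: candidate labels for ONE annotated sentence.
# A 1 means the worker selected that label.
annotations = np.array([
    [1, 0, 0, 1],   # worker 0
    [1, 0, 0, 0],   # worker 1
    [1, 0, 0, 1],   # worker 2
    [0, 1, 1, 0],   # worker 3 (disagrees with everyone)
])

def worker_quality(worker_idx: int, ann: np.ndarray) -> float:
    """Cosine similarity between one worker's annotation vector and the
    aggregate vector of all OTHER workers. Consistently divergent workers
    (likely spammers) score near zero, while workers who diverge only on
    genuinely ambiguous labels keep a moderate-to-high score."""
    others = np.delete(ann, worker_idx, axis=0).sum(axis=0)
    mine = ann[worker_idx]
    denom = np.linalg.norm(mine) * np.linalg.norm(others)
    return float(mine @ others / denom) if denom else 0.0

for w in range(annotations.shape[0]):
    print(f"worker {w}: quality = {worker_quality(w, annotations):.2f}")
```

Running this, workers 0-2 score high while worker 3 scores 0.0, showing how disagreement patterns can separate low-quality workers from honest annotators without assuming a single right answer.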

Moderator: Maarten de Rijke (Informatics Institute, UvA)
Date and Time: Friday 22 November 2013, 16.00-17.00 (followed by drinks)
Location: Science Park 904 (room B0.201), 1098 XH Amsterdam

Free entrance

About Lora Aroyo

I am a Research Scientist at Google working in the area of Responsible AI, focusing specifically on responsible data for AI. Previously, I was a full professor in Computer Science and head of the Web and Media Group in the Department of Computer Science, VU University Amsterdam, The Netherlands, where I was scientific coordinator of the EU Integrated Project NoTube: Integration of Web and TV Data with the Help of Semantics (http://notube.tv). See my web page for more details: http://lora-aroyo.org