Finding high quality content in social media, with an application to community-based question answering
Source:
Web Search and Data Mining (WSDM), ACM Press, Stanford, USA, pp. 183-194 (2008)
ISBN:
978-1-59593-927-9
URL:
http://www.chato.cl/papers/acdgg_2007_high_quality_content_social_media.pdf
Abstract:
The quality of user-generated content varies drastically from
excellent to abuse and spam. As the availability of such content
increases, the task of identifying high-quality content in sites based
on user contributions (social media sites) becomes increasingly
important. Social media in general exhibit a rich variety of
information sources: in addition to the content itself, there is a
wide array of non-content information available, such as links between
items and explicit quality ratings from members of the community. In
this paper we investigate methods for exploiting such community
feedback to automatically identify high-quality content. As a test
case, we focus on Yahoo! Answers, a large community question answering
portal that is particularly rich in the amount and types of content
and social interactions available in it. We introduce a general
classification framework for combining the evidence from different
sources of information that can be tuned automatically for a given
social media type and quality definition. In particular, for the
community question answering domain, we show that our system is able
to separate high-quality items from the rest with an accuracy close to that of humans.
Notes:
Previously issued as Yahoo! Technical Report YR-2007-05.
Download:
ACM COPYRIGHT NOTICE. Copyright © 2008 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org.