Publication

Issues with Privacy Preservation in Query Log Mining

Source:

Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques, Chapman and Hall/CRC Press (2009)

Abstract:

In this chapter we present and analyze the current state of the art in query log privacy preservation. We focus on two complementary issues: the privacy of users who submit queries, and the privacy of websites featured in search results. We study vulnerabilities that arise in query log publishing, specifically in Web search engine logs, and discuss their effects on the parties involved. Our analysis gives an overview of the anonymization techniques that have been attempted and their weaknesses in preventing attacks on query log data. Furthermore, we study the implications for public data produced by query log mining applications, and how such data poses a risk of involuntary disclosure of private information.