Data Mining - Mehmed Kantardzic [216]
2. do not know to what use the data will be made, and/or
3. have not consented to such collection of data or data use.
In order to alleviate concerns about data privacy, a number of techniques have recently been proposed in order to perform the data-mining tasks in a privacy-preserving way. These techniques for performing privacy-preserving data mining are drawn from a wide array of related topics such as cryptography and information hiding. Most privacy-preserving data-mining methods apply a transformation that reduces the effectiveness of the underlying data when they are applied to data-mining methods or algorithms. In fact, there is a natural trade-off between privacy and accuracy although this trade-off is affected by the particular algorithm that is used for privacy preservation. The key directions in the field of privacy-preserving data mining include:
Privacy-Preserving Data Publishing: These techniques tend to study different transformation methods associated with privacy. They concentrate on how the perturbed data can be used in conjunction with classical data-mining methods.
Changing the Results of Data-Mining Applications to Preserve Privacy: These techniques are concentrated on the privacy of data-mining results where some results are modified in order to preserve the privacy. A classic example of such techniques are association-rule hiding methods, in which some of the association rules are suppressed in order to preserve privacy.
Cryptographic Methods for Distributed Privacy: If the data are distributed across multiple sites, a variety of cryptographic protocols may be used in order to communicate among the different sites so that secure function computation is possible without revealing sensitive information.
Recent research trends propose that issues of privacy protection, currently viewed in terms of data access, be reconceptualized in terms of data use. From a technology perspective, this requires supplementing legal and technical mechanisms for access control with new mechanisms for transparency and accountability of data used in a data-mining process. Current technical solutions of the impact of data mining on privacy have generally focused on limiting access to data at the point of collection or storage. Most effort has been put into the application of cryptographic and statistical techniques to construct finely tuned access-limiting mechanisms. Even if privacy-preserving data-mining techniques prove to be practical, they are unlikely to provide sufficient public assurance that data-mining inferences conform to legal restrictions. While privacy-preserving data-mining techniques are certainly necessary in some contexts, they are not a sufficient privacy protection without the transparency and accountability.
In the long run, access restriction alone is not enough to protect privacy or to ensure reliable conclusions, and the best example of these challenges is Web and Web-mining technology. As we leave the well-bounded world of enterprise databases and enter the open, unbounded world of the Web, data users need a new class of tools to verify that the results they see are based on data that are from trustworthy sources and are used according to agreed-upon institutional and legal requirements. The implications of data mining on digital social networks such as Facebook, Myspace, or Twitter may be enormous. Unless it is part of a public record designed for consumption by everyone or describes an activity observed by strangers, the stored information is rarely known outside our families, much less outside our social networks. An expectation that such information and potential derivatives will remain “private” on the Internet is not anymore a reasonable assumption from the social network perspective. One of the major contributors to these controversies is the absence of clear legal standards. Thirty years ago the lack of relevant law was understandable: The technologies were new; their capacity was largely unknown; and the types of legal issues