META TOPICPARENT | name="SecondEssay" |
Chaffing in practice
-- By SamuelRoth - 15 Jan 2015

Rivest's other algorithm
In his 1998 paper Chaffing and Winnowing, Ronald Rivest proposes a new method that, in theory, achieves strong information confidentiality without traditional encryption. His proposal involves splitting up the plaintext into packets and computing a "message authentication code" (MAC) for each one by combining the data in the packet with a secret key. The sender then adulterates the plaintext packets with a similar number of false packets, each with its own fabricated MAC, so that an eavesdropper cannot tell the "wheat" from the "chaff." Only the recipient, possessed of the secret key, can determine which MACs are valid, and thereby distinguish the true plaintext from the false.
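Rivest's paper describes the scheme abstractly rather than prescribing an implementation. As a minimal sketch of the idea (every name and parameter here is a hypothetical choice of mine, with HMAC-SHA256 standing in for the MAC and single-byte packets chosen for simplicity):

```python
import hashlib
import hmac
import os
import random

KEY = b"shared-secret-key"  # known only to sender and recipient

def mac(serial: int, data: bytes, key: bytes = KEY) -> bytes:
    """Compute a MAC over a packet's serial number and payload."""
    return hmac.new(key, serial.to_bytes(4, "big") + data, hashlib.sha256).digest()

def chaff(plaintext: bytes):
    """Split the plaintext into one-byte packets with valid MACs, then mix in
    an equal number of chaff packets with random payloads and bogus MACs."""
    wheat = [(i, bytes([b]), mac(i, bytes([b]))) for i, b in enumerate(plaintext)]
    fakes = [(i, os.urandom(1), os.urandom(32)) for i, _, _ in wheat]
    stream = wheat + fakes
    random.shuffle(stream)
    return stream

def winnow(stream, key: bytes = KEY) -> bytes:
    """The recipient keeps only packets whose MAC verifies, then reorders them."""
    kept = [(i, d) for i, d, tag in stream if hmac.compare_digest(tag, mac(i, d, key))]
    return bytes(d[0] for i, d in sorted(kept))
```

An eavesdropper without the key sees twice as many packets as the message contains, with no way to tell which tags are genuine; the recipient recovers the message exactly by discarding every packet whose MAC fails to verify.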
At the time of Rivest's writing, encryption techniques were at the center of a fierce public debate. Federal law restricted the exportation of encryption software, which meant that it could not be made publicly available on the internet without fear of criminal prosecution. Rivest proposed chaffing and winnowing, in part, to show that sending a plaintext message and subjecting that bitstream to encryption were two distinct operations that could be carried out by two different actors who had in no way coordinated their efforts. Rivest concluded that the theoretical possibility of such a scheme demonstrated "the difficulty (or impossibility) of drafting any kind of reasonable law restricting encryption or confidentiality technology."
Chaffing and winnowing has not been directly put into use, as it would be cumbersome as a cryptographic method, and the anti-surveillance movement has since won the debate over restriction of encryption techniques. Nevertheless, I propose that chaffing and winnowing does have potentially useful applications: first, as a technical strategy of resistance to corporate surveillance, and second, as a means of preserving anonymity in big data.
Supplementing encryption
As Prof. Moglen pointed out in his comments, some software already puts chaffing-like strategies into use. TrackMeNot, a browser extension for Firefox and Chrome, conceals genuine user requests to search engines in a cloud of automated ones; over time, it learns the user's search habits and designs fake queries that should be hard to distinguish from bona fide searches. And adding bad plaintext to good helps to protect against birthday attacks, a class of cryptographic exploits that rely on the attacker having found two blocks of plaintext that produce the same cryptographic hash.
But the idea's technical potential has not been exhausted. As Prof. Moglen pointed out in class, even encrypted email is not entirely free from surveillance when conducted on an unfree platform such as Gmail, because corporate surveillance can still account for the frequency of communications between users. Moreover, business intelligence can mine valuable data merely by determining who corresponds with whom. Encryption conceals the content of communications, but not the fact of their occurrence.
Chaffing, in that context, offers one solution: users could chaff the provider's database of communications in a reversible way by use of a program that would interface between the user and the provider, along the lines of the Lucent Personalized Web Assistant of the late 1990s. Whenever two users of the chaffing program first communicate with each other, their instances of the program would agree on a private key by means of a secure handshake; this key would then be used to generate valid MACs for genuine emails between those two individuals. Meanwhile, those users' chaffing programs would start up a non-stop exchange of fake communications, so that future valid communications would be indistinguishable from so much noise.
That strategy, used in conjunction with encryption, would complicate a corporate data miner's ability to draw meaningful conclusions from the frequency of communications. As for the second deficiency of plain encryption identified above, the ability of business intelligence to build a network identifying who communicates with whom, the chaffing program also offers a potential solution: its users could consent to have fake emails sent to their address from all the users of the chaffing program with whom they have not yet executed a secure handshake, that is, from strangers. Thus, from the perspective of the email provider, the network of people in communication with one another and the network of people using the chaffing program are identical.
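No such program exists; as a rough sketch of the core logic described above, under the assumption that a handshake has already produced a shared key (the class name, message shapes, and the stand-in key agreement are all hypothetical):

```python
import hashlib
import hmac
import os
import secrets

class ChaffingMailClient:
    """Sketch of a mail-chaffing client: real messages carry a valid HMAC tag
    under a per-peer shared key; decoy messages carry random, invalid tags."""

    def __init__(self):
        self.keys = {}  # peer address -> shared key from the handshake

    def handshake(self, peer: str) -> None:
        # Stand-in for a real key agreement (e.g. Diffie-Hellman); a real
        # client would negotiate this key with the peer, not invent it locally.
        self.keys[peer] = secrets.token_bytes(32)

    def tag(self, peer: str, body: bytes) -> bytes:
        return hmac.new(self.keys[peer], body, hashlib.sha256).digest()

    def send_real(self, peer: str, body: bytes):
        return (peer, body, self.tag(peer, body))

    def send_decoy(self, peer: str):
        # Fake traffic: a plausible-length body with a random (invalid) tag.
        return (peer, os.urandom(64), os.urandom(32))

    def is_real(self, peer: str, body: bytes, tag: bytes) -> bool:
        key = self.keys.get(peer)
        return key is not None and hmac.compare_digest(tag, self.tag(peer, body))
```

From the provider's side, real and decoy messages are indistinguishable; only the two key-holding clients can winnow one from the other.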
Supplementing anonymization
Big data marks a second area in which chaffing could prove useful as a technical method. Free distribution of large quantities of data promises to radically change the way we understand society and the individual. Even putatively anonymized data, however, can often be linked to personal identities through the use of clever analytical methods. Consider the security researcher who combined voting records and supposedly "anonymized" medical data to uncover the health records of the governor of Massachusetts. Similarly, Prof. Moglen discussed in class the release of New York City transit records, which revealed, as he said, every adulterer in City Hall.
Often, it seems, such unmasking depends on a careful process of elimination: the Massachusetts researcher, for instance, succeeded in part because only six people in the city of Cambridge shared the governor's birthdate. Presumably, the City Hall adultery was revealed through similar means: knowing that only one employee left a particular building at a given time, for instance.
The reasons Ron's chaffing paper didn't result in practical technological application are apparent on the surface of the text. Your approach, adding confetti to overwork and confuse data-miners, is less effective than just encrypting the emails that are your example.

There are places where confetti of various kinds is useful in the ways you suggest. You could look at the "Track Me Not" plugin for the Firefox browser, which sends a constant stream of plausible search requests to search engines, fuzzing around the user's own. The use of entropy inside VPNs to prevent various forms of birthday attack on the VPN data (including the isolation of encrypted telephone calls) has also been written about.

But what in the end is the point of the essay? If it is engineering, precisely what is the problem being solved? If it is about law or politics, how is it? Rivest's original paper is an elegant one, about why controlling encryption with legislation doesn't work. It's not about engineering actual solutions to real world problems. If your essay is in the same spirit, it would benefit from an effort to achieve his level of clarity and economy.
But if every good row in such a database were supplemented by a bad row, or five, the process of elimination would be much harder. The chaff could be crafted to statistically mirror the wheat, so that large-scale conclusions and trends drawn from the data would remain essentially valid, but the individual adulterers would be harder to pick out. (This application of chaffing bears some similarities to the idea of differential privacy, a technique that focuses on adding noise to the output from a dataset, rather than adding the noise to the dataset directly.) By adding a MAC to each row, the individual or organization responsible for the data would retain the ability to verify or disclaim individual pieces of data that proved of interest to trustworthy researchers.
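As a rough illustration of that last idea, not a description of any existing tool, a publisher might tag and chaff a dataset along these lines. Every name and parameter below is hypothetical, and drawing fake rows independently from each column's observed values is only one crude way of mirroring the wheat statistically:

```python
import hashlib
import hmac
import os
import random

PUBLISHER_KEY = b"publisher-secret"  # held only by the data publisher

def row_mac(row: tuple, key: bytes = PUBLISHER_KEY) -> str:
    return hmac.new(key, repr(row).encode(), hashlib.sha256).hexdigest()

def chaff_dataset(rows, chaff_per_row: int = 5):
    """Tag each genuine row with a MAC, then add fake rows whose fields are
    sampled from the dataset's own columns, so each column's distribution
    (though not cross-column correlations) is preserved."""
    released = [(row, row_mac(row)) for row in rows]
    columns = list(zip(*rows))
    for _ in range(chaff_per_row * len(rows)):
        fake = tuple(random.choice(column) for column in columns)
        released.append((fake, os.urandom(32).hex()))  # bogus MAC
    random.shuffle(released)
    return released

def vouch(row, tag, key: bytes = PUBLISHER_KEY) -> bool:
    """The publisher can later verify, or disclaim, any individual row."""
    return hmac.compare_digest(tag, row_mac(row, key))
```

A researcher with the publisher's cooperation can have any row of interest vouched for; everyone else faces a dataset in which any given row is, more likely than not, chaff.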
Chaffing and winnowing will never enter wide cryptographic use. Nevertheless, in certain limited contexts, it offers a valuable supplement to encryption and to anonymization.
You are entitled to restrict access to your paper if you want to. But we all derive immense benefit from reading one another's work, and I hope you won't feel the need unless the subject matter is personal and its disclosure would be harmful or undesirable.