Papers tagged ‘Methodology’

Inferring Mechanics of Web Censorship Around the World

I’ve talked a bunch about papers that investigate what is being censored online in various countries, but you might also want to know how the censorship itself is carried out. There are only a few ways to do it, and this paper does a good job of laying them out:

  • By DNS name: intercept DNS queries either at the router or the local DNS relay, and return either a “no such host” error or the address of a server that will hand out errors for everything.

  • By IP address: in an intermediate router, discard packets intended for particular servers, and/or respond with TCP RST packets (which make the client disconnect) or forged responses. (In principle, an intermediate router could pretend to be the remote host for an entire TCP session, but it doesn’t seem that anyone does.)

  • By data transferred in cleartext: again in an intermediate router, allow the initial connection to go through, but if blacklisted keywords are detected then forge a TCP RST.

There are a few elaborations and variations, but those are the basic options if you are implementing censorship in the backbone of the network. The paper demonstrates that all are used. It could also, of course, be done at either endpoint, but that is much less common (though not unheard of) and the authors of this paper ruled it out of scope. It’s important to understand that the usual modes of encryption used on the ’net today (e.g. HTTPS) do not conceal either the DNS name or the IP address of the remote host, but do conceal the remainder of an HTTP request. Pages of an HTTPS-only website cannot be censored individually, but the entire site can be censored by its DNS name or server IP address. This is why GitHub was being DDoSed a few months ago to try to get them to delete repositories being used to host circumvention tools [1]: Chinese censors cannot afford to block the entire site, as it is too valuable to their software industry, but they have no way to block access to the specific repositories they don’t like.
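To make the three mechanisms concrete, here is a minimal sketch of the kind of client-side probes a study like this relies on. This is my illustration, not the paper’s actual tooling: the target hostname and keyword are hypothetical, and a real measurement would compare every result against the same probes run from an uncensored vantage point.

```typescript
// Minimal probe sketch (illustrative only; hostname and keyword are made up).
import { promises as dns } from "node:dns";
import * as net from "node:net";

const HOST = "blocked.example";      // hypothetical censored site
const KEYWORD_PATH = "/banned-term"; // hypothetical blacklisted keyword

// DNS-based blocking: the local resolver returns NXDOMAIN or a bogus address.
async function probeDns(): Promise<string[]> {
  try {
    const addrs = await dns.resolve4(HOST);
    console.log("DNS answer:", addrs, "(compare with an uncensored resolver)");
    return addrs;
  } catch (err) {
    console.log("DNS failure, consistent with DNS-based blocking:", (err as Error).message);
    return [];
  }
}

// IP-based blocking: the TCP handshake itself times out or is reset.
function probeTcp(ip: string): Promise<void> {
  return new Promise((resolve) => {
    const sock = net.connect({ host: ip, port: 80, timeout: 5000 });
    sock.on("connect", () => { console.log("TCP connect OK"); sock.destroy(); resolve(); });
    sock.on("timeout", () => { console.log("timeout: packets silently dropped?"); sock.destroy(); resolve(); });
    sock.on("error", (e) => { console.log("connect error, possibly a forged RST:", e.message); resolve(); });
  });
}

// Keyword-based blocking: the connection succeeds, but a cleartext request
// containing a blacklisted term dies mid-transfer.
function probeKeyword(ip: string): Promise<void> {
  return new Promise((resolve) => {
    const sock = net.connect({ host: ip, port: 80 }, () => {
      sock.write(`GET ${KEYWORD_PATH} HTTP/1.1\r\nHost: ${HOST}\r\nConnection: close\r\n\r\n`);
    });
    sock.on("data", (d) => console.log("received", d.length, "bytes"));
    sock.on("error", (e) => console.log("reset after request, consistent with keyword filtering:", e.message));
    sock.on("close", () => resolve());
  });
}

probeDns().then(async (addrs) => {
  if (addrs.length > 0) { await probeTcp(addrs[0]); await probeKeyword(addrs[0]); }
});
```

Note that from the browser’s point of view all three failures can look identical: the page just doesn’t load.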

Now, if you want to find out which of these scenarios is being carried out by any given censorious country, you need to do detailed network traffic logging, because at the application level, several of them are indistinguishable from the site being down or the network being unreliable. This also means that the censor could choose to be stealthy: if Internet users in a particular country expect to see an explicit message when they try to load a blocked page, they might assume that a page that always times out is just broken. [2] The research contribution of this paper is in demonstrating how you do that, through a combination of packet logging and carefully tailored probes from hosts in-country. They could have explained themselves a bit better: I’m not sure they bothered to try to distinguish packets dropped at the border router from packets dropped by a misconfigured firewall on the site itself, for instance. Also, I’m not sure whether it’s worth going to the trouble of packet logging, frankly. You should be able to get the same high-level information by comparing the results you get from country A with those you get from country B.
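For what it’s worth, here is a rough sketch of that comparison-based shortcut, assuming you already have per-URL probe outcomes from a vantage point inside the country and from a control vantage point outside it. The outcome labels and classification rules are mine, not the paper’s.

```typescript
// Classify URLs by comparing probe outcomes across two vantage points
// (illustrative only; labels and rules are simplifications).
type Outcome = "ok" | "dns-error" | "timeout" | "reset";
type Results = Map<string, Outcome>; // URL -> outcome at one vantage point

function classify(inCountry: Results, control: Results): Map<string, string> {
  const verdicts = new Map<string, string>();
  for (const [url, local] of inCountry) {
    const outside = control.get(url) ?? "timeout";
    if (outside !== "ok") {
      verdicts.set(url, "unreachable everywhere: can't blame the censor");
    } else if (local === "ok") {
      verdicts.set(url, "accessible");
    } else {
      verdicts.set(url, `likely blocked (${local} in-country, ok outside)`);
    }
  }
  return verdicts;
}

// Example: one URL that fails only in-country, one that is down everywhere.
const local = new Map<string, Outcome>([["http://a.example/", "reset"], ["http://b.example/", "timeout"]]);
const remote = new Map<string, Outcome>([["http://a.example/", "ok"], ["http://b.example/", "timeout"]]);
console.log(classify(local, remote));
```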

Another common headache in this context is knowing whether the results you got from your measurement host truly reflect what a normal Internet user in the country would see. After all, you are probably using a commercial data center or academic network that may be under fewer restrictions. This problem is one of the major rationales for Encore, which I discussed a couple weeks ago [3]. This paper nods at that problem but doesn’t really dig into it. To be fair, they did use personal contacts to make some of their measurements, so those may have involved residential ISPs, but they are (understandably) vague about the details.

Encore: Lightweight Measurement of Web Censorship with Cross-Origin Requests

As I’ve mentioned a few times here before, one of the biggest problems in measurement studies of Web censorship is taking the measurement from the right place. The easiest thing (and this may still be difficult) is to get access to a commercial VPN exit or university server inside each country of interest. But commercial data centers and universities have ISPs that are often somewhat less aggressive about censorship than residential and mobile ISPs in the same country—we think. [1] And, if the country is big enough, it probably has more than one residential ISP, and there’s no reason to think they behave exactly the same. [2] [3] What we’d really like is to enlist spare CPU cycles on a horde of residential computers across all of the countries we’re interested in.

This paper proposes a way to do just that. The authors propose to add a script to globally popular websites which, when the browser is idle, runs tests of censorship. Thus, anyone who visits the website will be enlisted. The first half of the paper is a technical demonstration that this is possible, and that you get enough information out of it to be useful. Browsers put a bunch of restrictions on what network requests a script can make—you can load an arbitrary webpage in an invisible <iframe>, but you don’t get notified of errors and the script can’t see the content of the page; conversely, <img> can only load images, but a script can ask to be notified of errors. Everything else is somewhere in between. Nonetheless, the authors make a compelling case for being able to detect censorship of entire websites with high accuracy and minimal overhead, and a somewhat less convincing case for being able to detect censorship of individual pages (with lower accuracy and higher overhead). You only get a yes-or-no answer for each thing probed, but that is enough for many research questions that we can’t answer right now. Deployment is made very easy, a simple matter of adding an additional third-party script to websites that want to participate.
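As a rough illustration of what such a script can and cannot observe under ordinary browser restrictions (this is my sketch, not the authors’ code, and the target URLs are hypothetical):

```typescript
// Cross-origin probes under ordinary browser restrictions (illustrative only).

// An <img> can only fetch image resources, but it does tell the script
// whether the fetch succeeded, via onload/onerror.
function probeImage(url: string): Promise<boolean> {
  return new Promise((resolve) => {
    const img = new Image();
    img.onload = () => resolve(true);    // resource arrived
    img.onerror = () => resolve(false);  // blocked, down, or not an image: can't tell which
    img.src = url + "?cachebust=" + Date.now();
  });
}

// An invisible <iframe> can load an arbitrary page, but the script is not
// reliably told about failures and can never read the page's contents, so
// only weak signals (e.g. load timing) are available.
function probeIframe(url: string, timeoutMs = 10000): void {
  const frame = document.createElement("iframe");
  frame.style.display = "none";
  const started = performance.now();
  frame.onload = () => {
    console.log(url, "iframe load event after", performance.now() - started, "ms");
    frame.remove();
  };
  frame.src = url;
  document.body.appendChild(frame);
  setTimeout(() => frame.remove(), timeoutMs); // give up quietly
}

// Run only when the browser is idle, so the hosting page is not slowed down.
requestIdleCallback(() => {
  probeImage("http://blocked.example/favicon.ico")
    .then((ok) => console.log("image probe:", ok ? "fetched" : "failed"));
  probeIframe("http://blocked.example/");
});
```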

The second half of the paper is devoted to ethical and practical considerations. Doing this at all is controversial—in a box on the first page, above the title of the paper, there’s a statement from the SIGCOMM 2015 program committee, saying the paper almost got rejected because some reviewers felt it was unethical to do anything of the kind without informed consent by the people whose computers are enlisted to make measurements. SIGCOMM also published a page-length review by John Byers, saying much the same thing. Against this, the authors argue that informed consent in this case is of dubious benefit, since it does not reduce the risk to the enlistees, and may actually be harmful by removing any traces of plausible deniability. They also point out that many people would need a preliminary course in how Internet censorship works and how Encore measures it before they could make an informed choice about whether to participate in this research. Limiting the pool of enlistees to those who already have the necessary technical background would dramatically reduce the scale and scope of measurements. Finally they observe that the benefits of collecting this data are clear, whereas the risks are nebulous. In a similar vein, George Danezis wrote a rebuttal of the public review, arguing that the reviewers’ concerns are based on a superficial understanding of what ethical research in this area looks like.

Let’s be concrete about the risks involved. Encore modifies a webpage such that web browsers accessing it will, automatically and invisibly to the user, also access a number of unrelated webpages (or resources). By design, those unrelated webpages contain material which is considered unacceptable, perhaps to the point of illegality, in at least some countries. Moreover, it is known that these countries mount active MITM attacks on much of the network traffic exiting the country, precisely to detect and block access to unacceptable material. Indeed, the whole point of the exercise is to provoke an observable response from the MITM, in order to discover what it will and won’t respond to.

The MITM has the power to do more than just block access. It almost certainly records the client IP address of each browser that accesses undesirable material, and since it’s operated by a state, those logs could be used to arrest and indict people for accessing illegal material. Or perhaps the state would just cut off their Internet access, which would be a lesser harm but still a punishment. It could also send back malware instead of the expected content (we don’t know if that has ever happened in real life, but very similar things have [4]), or turn around and mount an attack on the site hosting the material (this definitely has happened [5]). It could also figure out that certain accesses to undesirable material are caused by Encore and ignore them, causing the data collected to be junk, or it could use Encore itself as an attack vector (i.e. replacing the Encore program with malware).

In addition to the state MITM, we might also want to worry about other adversaries in a position to monitor user behavior online, such as employers, compromised coffee shop WiFi routers, and user-tracking software. Employers may have their own list of material that people aren’t supposed to access using corporate resources. Coffee shop WiFi is probably interested in finding a way to turn your laptop into a botnet zombie; any unencrypted network access is a chance to inject some malware. User-tracking software might become very confused about what someone’s demographic is, and start hitting them with ads that relate to whatever controversial topic Encore is looking for censorship of. (This last might actually be a Good Thing, considering the enormous harms behavioral targeting can do. [6])

All of these are harms to someone. It’s important to keep in mind that, except for poisoning the data collected by Encore (a harm to the research itself), all of them can happen in the absence of Encore. Malware, ad networks, embedded videos, embedded like buttons, third-party resources of any kind: all of these can and do cause a client computer to access material without its human operator’s knowledge or consent, including material that some countries consider undesirable. Many of them also offer an active MITM the opportunity to inject malware.

The ethical debate over this paper has largely focused on the increased risk of legal, or quasi-legal, sanctions against people whose browsers were enlisted to run Encore tests. I endorse the authors’ observation that informed consent would actually make that risk worse. Because there are so many reasons a computer might contact a network server without its owner’s knowledge, people already have plausible deniability regarding accesses to controversial material (“I never did that, it must have been a virus or something”). If Encore told its enlistees what it was doing and gave them a chance to opt out, it would take that away.

Nobody involved in the debate knows how serious this risk really is. We do know that many countries are not nearly as aggressive about filtering the Internet as they could be, [7] so it’s reasonable to think they can’t be bothered to prosecute people just for an occasional attempt to access stuff that is blocked. It could still be that they do prosecute people for bulk attempts to access stuff that is blocked, but Encore’s approach—many people doing a few tests—would tend to avoid that. But there’s enough uncertainty that I think the authors should be talking to people in a position to know for certain: lawyers and activists from the actual countries of interest. There is not one word either in the papers or the reviews to suggest that anyone has done this. The organizations that the authors are talking to (Citizen Lab, Oxford Internet Institute, the Berkman Center) should have appropriate contacts already or be able to find them reasonably quickly.

Meanwhile, all the worry over legal risks has distracted from worrying about the non-legal risks. The Encore authors are fairly dismissive of the possibility that the MITM might subvert Encore’s own code or poison the results; I think that’s a mistake. They consider the extra bandwidth costs Encore incurs, but not the possibility that loading an entire page could expose the enlistee to whatever malware that page carries. More thorough monitoring and reportage on Internet censorship might cause the censor to change its behavior, and not necessarily for the better—for instance, if it’s known that some ISPs are less careful about their filtering, that might trigger sanctions against them. These are just the things I can think of off the top of my head.

In closing, I think the controversy over this paper is more about the community not having come to an agreement about its own research ethics than it is about the paper itself. If you read the paper carefully, you’ll see that the IRB at each author’s institution did not review this research; they declined to engage with it. This was probably a correct decision from the boards’ point of view, because an IRB’s core competency is medical and psychological research. (They’ve come in for criticism in the past for reviewing sociological studies as if they were clinical trials.) They do not, in general, have the background or expertise to review this kind of research. There are efforts underway to change that: for instance, there was a Workshop on Ethics in Networked Systems Research at the very same conference that presented this paper. (I wish I could have attended.) Development of a community consensus here will, hopefully, lead to better handling of future, similar papers.

Taster’s Choice: A Comparative Analysis of Spam Feeds

Here’s another paper about spam; this time it’s email spam, and they are interested not so much in the spam itself, but in the differences between collections of spam (feeds) as used in research. They have ten different feeds, and they compare them to each other looking only at the domain names that appear in each. The goal is to figure out whether or not each feed is an unbiased sample of all the spam being sent at any given time, and whether some types of feed are better at detecting particular sorts of spam. (Given this goal, looking only at the domain names is probably the most serious limitation of the paper, despite being brushed off with a footnote. It means they can’t say anything about spam that doesn’t contain any domain names, which may be rare, but is interesting because it’s rare and different from all the rest. They should have at least analyzed the proportion of it that appeared in each feed.)

The spam feeds differ primarily in how they collect their samples. There’s one source consisting exclusively of manually labeled spam (from a major email provider); two DNS blacklists (these provide only domain names, and are somehow derived from other types of feed); three MX honeypots (registered domains that accept email to any address, but are never used for legitimate mail); two seeded honey accounts (like honeypots, but a few addresses are made visible to attract more spam); one botnet-monitoring system; and one hybrid. They don’t have full details on exactly how they all work, which is probably the second most serious limitation.

The actual results of the analysis are basically what you would expect: manually labeled spam is lower-volume but has more unique examples in it; botnet spam is very high volume but has lots of duplication; everything else is somewhere in between. They made an attempt to associate spam domains with affiliate networks (the business of spamming nowadays is structured as a multi-level marketing scheme) but they didn’t otherwise try to categorize the spam itself. I can think of plenty of additional things to do with the data set—which is the point: it says right in the abstract that “most studies [of email spam] use a single spam feed and there has been little examination of how such feeds may differ in content.” They’re not trying so much to produce a comprehensive analysis themselves as to alert people working in this subfield that they might be missing stuff by looking at only one data source.
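To make the comparison methodology concrete, here is one simple way to quantify how much two feeds overlap once each has been reduced to the set of domains it contains. This is only an illustration: the paper’s own metrics are more involved, and the feed names and domains below are made up.

```typescript
// Pairwise domain-set overlap between spam feeds (illustrative only).
type Feed = { name: string; domains: Set<string> };

// Jaccard index: |A ∩ B| / |A ∪ B|; 1 means identical sets, 0 means disjoint.
function jaccard(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  for (const d of a) if (b.has(d)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

const feeds: Feed[] = [
  { name: "mx-honeypot",   domains: new Set(["pills.example", "watches.example"]) },
  { name: "honey-account", domains: new Set(["pills.example", "lottery.example"]) },
  { name: "botnet",        domains: new Set(["pills.example"]) },
];

for (const a of feeds)
  for (const b of feeds)
    if (a.name < b.name)
      console.log(`${a.name} vs ${b.name}: Jaccard ${jaccard(a.domains, b.domains).toFixed(2)}`);
```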

Tensions and frictions in researching activists’ digital security and privacy practices

Presentations at HotPETS are not as technical as those at PETS proper, and are chosen for relevance, novelty, and potential to generate productive discussion. The associated papers are very short, and both paper and presentation are meant only as starting points for a conversation. Several of this year’s presentations were from people and organizations that are concerned with making effective use of technology we already have, rather than building new stuff, and with highlighting the places where actual problems are not being solved.

The Tactical Technology Collective (tacticaltech.org, exposingtheinvisible.org; if you don’t have time to watch a bunch of videos, I strongly recommend at least reading the transcribed interview with Jesus Robles Maloof) is a practitioner NGO working on digital security and privacy, with deep connections to activist communities around the world. What this means is that they spend a lot of time going places and talking to people—often people at significant risk due to their political activities—about their own methods, the risks that they perceive, and how existing technology does or doesn’t help.

Their presentation was primarily about the tension between research and support goals from the perspective of an organization that wants to do both. Concretely: if you are helping activists do their thing by training them in more effective use of technology, you’re changing their behavior, which is what you want to study. (It was not mentioned, and they might not know it by this name, but I suspect they would recognize the Hawthorne Effect, in which being the subject of research also changes people’s behavior.) They also discussed potential harm to activists due to any sort of intervention from the outside, whether that is research, training, or just contact, and the need for cultural sensitivity in working out the best way to go about things. (I can’t find the talk now—perhaps it was at the rump session—but at last year’s PETS someone mentioned a case where democracy activists in a Southeast Asian country specifically did not want to be anonymous, because it would be harder for the government to make them disappear if they were known by name to the Western media.)

There were several other presentations on related topics, and “technology developers and researchers aren’t paying enough attention to what activists actually need” has been a refrain, both at PETS and elsewhere, for several years now. I’m not sure further reiteration of that point helps all that much. What would help? I’m not sure about that either. It’s not even that people aren’t building the right tools, so much as that the network effects of the wrong tools are so dominant that the activists have to use them even though they’re dangerous.