Circumvention - Monitoring Internet censorship: the case of UBICA

A logical consequence of the awareness of censorship, and of the progress in understanding its working details, is the proposal of methods to dodge it, collectively named censorship circumvention (just “circumvention” in the following). Besides papers focused on circumvention itself, papers discussing censorship and censorship detection often also include an analysis of possible circumvention methods: references to both cases are provided hereafter in the form of a brief survey of the literature on circumvention.

Methods, tools and platforms have been specifically designed to counter censorship: in [Elahi and Goldberg, 2012] a taxonomy is presented that characterizes thirty circumvention tools, platforms and techniques according to a number of properties, namely: the ability to make the damage caused by censorship outweigh its benefits, to avoid the censor's control over entities or traffic, and to avoid the censor's surveillance, together with other means that complement these main ones.

Another valuable source for scientific literature on censorship and circumvention is the webpage “Selected Papers in Censorship”¹⁵.

The early analysis of network-based censorship techniques in [Dornseif, 2003] cites a number of possible circumvention techniques. For each, a brief analysis is presented, considering which censorship technique is circumvented and what the user needs in order to apply it. The conclusion is that, although several workarounds are possible, they can be technically complex, burdensome, or dependent on the collaboration of a third party, and are therefore not easy for a common user to apply.

A few techniques for the circumvention of application-level keyword-based censorship are suggested in [Crandall et al., 2007]. These techniques are based on the knowledge of the blacklist of keywords that trigger censorship, and are described as asymmetric techniques in the sense that they are server-based and do not require changes in the client. We note that some of them (namely IP packet fragmentation, insertion of HTML comments and varying character encodings) exploit standard functionalities of the stack on the client, which can thus remain virtually unaffected, and provide results equivalent to the non-mangled communication; others (captcha, spam-like rewording) will likely disrupt automatic functions such as indexing and text search and are meant solely for human interpretation.
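As an illustration of the HTML comment insertion mentioned above, the following minimal sketch splits each occurrence of a keyword with an empty HTML comment, so that the byte sequence on the wire no longer contains the keyword while a browser renders the same text. The function name `split_keyword_with_comment` and the choice of splitting at the midpoint are assumptions made for this example, not details taken from [Crandall et al., 2007]; the plain string replacement is deliberately naive and would also alter keywords occurring inside tags or attributes.

```python
def split_keyword_with_comment(html: str, keyword: str) -> str:
    """Insert an empty HTML comment in the middle of every keyword occurrence.

    Hypothetical helper: a naive textual replacement, not an HTML-aware one.
    """
    if len(keyword) < 2:
        # A keyword of fewer than two characters cannot be split in two parts.
        return html
    mid = len(keyword) // 2
    mangled = keyword[:mid] + "<!-- -->" + keyword[mid:]
    return html.replace(keyword, mangled)


if __name__ == "__main__":
    page = "<p>an example keyword</p>"
    print(split_keyword_with_comment(page, "keyword"))
```

A production version would need to parse the document and rewrite text nodes only, which is outside the scope of this sketch.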

15 www.cs.kau.se/philwint/censorbib/

In [Clayton et al., 2006] a technique is presented to circumvent a specific censorship mechanism (TCP-RST communication disruption) by ignoring the forged RST packets, also considering TTL-based validation to tell forged RSTs from legitimate ones. The implications are discussed (adopting a non-RFC-compliant stack on both web server and client) and compared with the use of encryption. New censorship techniques and variants, namely the Revised Sequence Number Prediction Attack and the Forged SYN-ACK Response (detected and described in [Polverini and Pottenger, 2011]), suggest that the proposed circumvention technique is no longer effective.
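The idea behind TTL-based validation can be sketched as follows: a RST injected by a middlebox usually travels a different number of hops than packets sent by the real endpoint, so its received TTL tends to differ from the TTL observed on earlier packets of the same flow. The sketch below is a simplified illustration under stated assumptions: the function name `is_suspicious_rst`, the use of the most frequent TTL as baseline, and the default tolerance of 2 are choices made for this example and are not taken from [Clayton et al., 2006].

```python
from collections import Counter
from typing import Sequence


def is_suspicious_rst(flow_ttls: Sequence[int], rst_ttl: int, tolerance: int = 2) -> bool:
    """Flag a RST whose TTL deviates from the TTLs seen earlier in the flow.

    flow_ttls: received IP TTL values of earlier packets from the same peer.
    rst_ttl:   received IP TTL of the RST packet under examination.
    tolerance: assumed allowance for small path changes (hypothetical value).
    """
    if not flow_ttls:
        # Without a baseline no judgement is possible: do not flag the RST.
        return False
    # Use the most frequent TTL of the flow as the expected value.
    baseline, _count = Counter(flow_ttls).most_common(1)[0]
    return abs(rst_ttl - baseline) > tolerance


if __name__ == "__main__":
    seen = [47, 47, 48, 47]
    print(is_suspicious_rst(seen, 47))  # RST consistent with the flow
    print(is_suspicious_rst(seen, 61))  # RST with a markedly different TTL
```

A mismatch is only a hint: route changes or load balancing can also alter the TTL, so the check is better suited to raising a flag than to proving forgery.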

A recent field survey on circumvention techniques in China has been published as a technical report in [Robinson et al., 2013]: the listed circumvention tools are Freegate¹⁶, UltraSurf¹⁷, web proxies and SSH, while only 16% of the respondents used personal VPNs; according to the authors, the 15% who did not name the tool they used possibly rely on GAppProxy¹⁸, the predecessor of the GoAgent¹⁹ tool.

Even if not specifically designed for censorship circumvention, anonymity technologies can be, and have been, used to circumvent censorship: a recent survey on the usage and geographical distribution of several technologies, including proxy servers, remailers, JAP, I2P, and Tor, is provided in [Li et al., 2013].

The availability of Internet Censorship detection and monitoring tools is a strict requirement for the development and evolution of circumvention technologies: this adds another motivation to the present Thesis work.

16 http://www.dit-inc.us/freegate

17 https://ultrasurf.us/

18 http://gappproxy.sourceforge.net/

19 https://code.google.com/p/goagent/ (documentation in Chinese)


[Figure 2.4, panel “DNS+HTTP”. Axes of the diagram: STAGES, STATEFULNESS, CENSORING LOCATION, SURVEILLANCE LOCATION, SYMPTOM, FIRST-STAGE TRIGGER, SECOND-STAGE TRIGGER, ACTION.]

Figure 2.4: Characterization of Network-based Censorship Techniques - the two-stage technique is described in terms of the different axes: matched properties are highlighted.


[Figure 2.5, panel “Great Firewall of China”. Axes of the diagram: STAGES, STATEFULNESS, CENSORING LOCATION, SURVEILLANCE LOCATION, SYMPTOM, FIRST-STAGE TRIGGER, SECOND-STAGE TRIGGER, ACTION.]

Figure 2.5: Characterization of Network-based Censorship Techniques - the two-stage technique (as detected in the analysis of the Great Firewall of China [Xu et al., 2011]) is described in terms of the different axes: unmatched sections are collapsed, matched properties are highlighted.

Chapter 3 Detection of Internet Censorship

To carry on an informed discussion and analysis of censorship, the ability to assess and understand its actual usage is of paramount importance. In fact, significant aspects of censorship, such as its enforceability, its transparency and the accountability of the censors to the affected population, strongly depend on the technical details of the adopted censorship technique, and thus evolve with the technology and its real usage.

Evidence of censorship is fundamental to raise awareness, in international scenarios, of the social cost of censorship methods. Finally, the side effects of the application of censorship (dubbed “overblocking”), which play a major role in the ethical, political and economic feasibility of its enforcement, are also bound to the technical details of the adopted method.

Consistently with the definition of Internet Censorship adopted in Section 2.2, we consider Internet Censorship Detection¹ as “the process that, analyzing network data, proves the existence of impairments in the access to content and services caused by a third party (neither the client system nor the server hosting the resource or service) and not justifiable as an outage”. We implicitly include the collection of suitable network traffic data in Internet Censorship Detection, as it is a fundamental phase of the process. We also note that, as stated in Chapter 2, the focus of the present work is on Network-based Censorship, and thus Detection refers, unless stated otherwise, to the related techniques (Section 2.1.3).

Detection is essentially based on the ability to tell the effect of censorship from the “normal” uncensored result and from involuntary outages; for a class of detection methods (active detection), it also requires the possibility of intentionally triggering the supposed censorship system. Inferring the adopted censorship technique is inherent in identifying the type of third party causing the impairment and in differentiating it from an outage.

1 hereafter also “detection”, when no ambiguity would derive

With reference to, and in addition to, the definitions stated in Section 2.2, censorship detection techniques can be characterized considering two main aspects:

viewpoint : the role of the probe host in the client-server communication model:

client-based : the collected network traffic is initiated by the IP address of the probe host itself;

gateway-based : the collected traffic has neither source nor destination IP addresses belonging to the same network as the probe; in the scenario of interest the probe host is a gateway towards the Internet: all traffic between the served network and the Internet passes through it;

server-based : the collected network traffic is initiated by other addresses towards the IP address of the probe host;

collection : the collection of network traffic data can be performed with active or passive techniques:

active collection : techniques that use client systems (“probes”) to generate network traffic purposely crafted to possibly elicit a censorship response (to be recorded and analyzed);

passive collection : techniques that collect traffic data from network application logs or from traffic traces captured on the device (“probe”) in order to look for evidence of censorship events.
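The viewpoint classification above can be expressed as a small decision procedure. The sketch below is a simplified illustration, not part of the original characterization: the function name `classify_viewpoint` is hypothetical, and the gateway-based case is reduced to "the probe is neither the initiator nor the destination of the flow", which ignores the network-membership aspect of the definition.

```python
import ipaddress


def classify_viewpoint(probe_ip: str, initiator_ip: str, responder_ip: str) -> str:
    """Classify a collected flow by the role of the probe host.

    probe_ip:     address of the host collecting the traffic.
    initiator_ip: address that started the communication.
    responder_ip: address the communication is directed to.
    Assumption: addresses are compared directly, not by network membership.
    """
    probe = ipaddress.ip_address(probe_ip)
    initiator = ipaddress.ip_address(initiator_ip)
    responder = ipaddress.ip_address(responder_ip)

    if initiator == probe:
        return "client-based"    # traffic initiated by the probe itself
    if responder == probe:
        return "server-based"    # traffic initiated by others towards the probe
    return "gateway-based"       # probe only forwards or observes the traffic


if __name__ == "__main__":
    # Addresses taken from the documentation ranges of RFC 5737.
    print(classify_viewpoint("192.0.2.10", "192.0.2.10", "198.51.100.5"))
    print(classify_viewpoint("192.0.2.10", "203.0.113.7", "192.0.2.10"))
    print(classify_viewpoint("192.0.2.10", "203.0.113.7", "198.51.100.5"))
```

Parsing the addresses with `ipaddress.ip_address` makes the comparison independent of textual formatting, which matters mostly for IPv6 addresses.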

A special case crossing these definitions is the usage of active methods towards a controlled destination: as both sides of the communication are controlled (both qualify as “probe”), traffic can also be collected at the receiver side (passive collection method).

As for the viewpoint, if both edges are controlled then the server, once it receives the client traffic, can also respond with a purposely crafted reply (e.g. containing a second-stage trigger for a stateful censorship system); finally, probes can switch roles, allowing directed testing of the network paths in between. This setup is akin to the one used in the methods of network tomography [Castro et al., 2004], thus by analogy we name it “censorship tomography”, as a special case of active methods. In fact, even though censorship tomography implies logging of received traffic, the possibility to generate traffic purposely forged to trigger some specific mechanisms is the essential property that characterizes active methods.
