Policy by the Numbers
Data for policymaking from Google and friends.
The challenges of censorship detection
Monday, April 30, 2012
When I lived in Washington, DC, I was lucky enough to be on the same power grid as the local power company’s headquarters. During an outage, the utility’s site would show me a dashboard displaying how many customers were without power, where they were located and, most importantly, where repair crews were headed next. Similar dashboards for centralized systems are all around us. When my car tires lose air pressure, my car tells me. When I order a package, FedEx tells me where it is. So where is our Internet dashboard? If I can find out that a package is being held in Buffalo, shouldn’t I know why my Internet packets didn’t make it to Beijing?
It turns out that identifying Internet censorship, filtering or other web blockages is much more challenging. As Jonathan Zittrain described in his 2009 TED Talk, the Internet is more akin to a mosh pit than FedEx. There’s no “absolutely positively” on the Internet; instead, there’s a series of interconnected, loosely affiliated servers that voluntarily pass your data in the general direction of its destination.
This decentralized structure is what makes the Internet so robust, but it also makes filtering very hard to pinpoint. For example, when Chinese citizens suddenly inundated President Obama’s Google+ page with comments about Internet freedom and other topics, we could see that Google+ was accessible in China. Although the result was clear, the cause was not.
Often people believe that the “Great Firewall” is the extent of China’s censorship, but, as others have noted, that much-scrutinized feature is only the outermost layer of the censorship apparatus. The self-censorship that ISPs and content providers impose upon themselves is actually far more effective. Researchers at Carnegie Mellon University have shown how this multi-tiered approach can have a disparate impact on Chinese netizens. On Sina Weibo (China’s Twitter equivalent), 53% of messages originating from the politically contentious region of Tibet are deleted, but only 12% and 11.4% of messages from Beijing and Shanghai, respectively, are removed. They also showed that deletion is inconsistent; 17.4% of posts are deleted, while other posts with the same political terms remain.
So when we see something like Google+ comments from China or the unblocking of previously blocked search terms, the distributed nature of both the network and the filtering makes it hard to know exactly what is occurring.
Thus, what we need is a dashboard for Internet health. The Herdict project at Harvard is one piece of that. By aggregating over 200,000 user-submitted reports about accessible and inaccessible websites, we can map a slice of the end-user experience. Constructing a true dashboard, however, will require data as distributed as the network itself. What each piece of the network knows about itself and its neighbors may be inconsequential on its own, but can be powerful when aggregated. Creating a useful measurement tool requires that leaders from browser manufacturers, ISPs, registrars and backbone providers recognize the crucial role they can play in helping identify filtering, censorship and other web blockages.
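To make the aggregation idea concrete, here is a minimal sketch in Python of how individual accessibility reports might be rolled up into a per-country, per-site view. The record layout (country code, site, accessible flag), the function name and the sample data are all assumptions made up for illustration; they are not the schema or method of any real reporting project.

```python
from collections import defaultdict


def inaccessible_share(reports):
    """Return {(country, site): fraction of reports marked inaccessible}.

    Each report is a hypothetical (country, site, accessible) tuple.
    """
    counts = defaultdict(lambda: [0, 0])  # [inaccessible, total] per key
    for country, site, accessible in reports:
        tally = counts[(country, site)]
        tally[1] += 1
        if not accessible:
            tally[0] += 1
    return {key: bad / total for key, (bad, total) in counts.items()}


# Made-up sample reports, for illustration only.
sample_reports = [
    ("CN", "example.org", False),
    ("CN", "example.org", False),
    ("CN", "example.org", True),
    ("US", "example.org", True),
]

shares = inaccessible_share(sample_reports)
for (country, site), share in sorted(shares.items()):
    print(f"{country} {site}: {share:.0%} of reports say inaccessible")
```

A single report says little, but a high share of "inaccessible" reports for one site from one country, compared with others, is the kind of signal a dashboard could surface. It still would not say why the site is unreachable, which is the harder question.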
, Project Director of Herdict.org and Berkman Center fellow
The authors of these posts include Googlers and guest bloggers. Opinions expressed here do not necessarily represent Google’s views. We hope the numbers presented will inspire meaningful conversations and inform policy debates.