SOPA and PROTECT IP chill free speech

There’s a lot of outcry over how pending copyright legislation (SOPA (PDF), formerly known as E-PARASITE, in the House, and PROTECT IP (PDF) in the Senate) would “break the Internet”. Hyperbole aside, the bills would enable the Attorney General and rights holders to go after payment processors, domain name registrars, and the like to disable access to “foreign” websites that infringe U.S. intellectual property rights.

My concern is that the bills are overbroad. They take down too much non-infringing speech in order to get at the stuff that does infringe upon copyright. I’m not sure whether the Supreme Court would hold that the bills abridge free speech rights under the First Amendment, but they would have a serious chilling effect upon free speech.

For example, suppose that the Russian equivalent of Google’s Blogger service hosts infringing content, say at blogger.ru/piratedmovies. Suppose also that this is the only piece of infringing content and that the vast majority of content on blogger.ru is stuff like critiques of Dostoyevsky and recipes for borscht. Under Sec. 102 of SOPA, the Attorney General can obtain a court order to block all U.S. access to blogger.ru. While the Russian operators of blogger.ru could, in theory, appear in a U.S. court to dispute the Attorney General’s actions, it’s unlikely that the operators of a Russian-language website are going to go to that effort for the handful of American users interested in its borscht recipes. Collectively, though, orders like this would cut Americans off from a lot of “foreign” Internet content. It would, in effect, create a “Great Firewall of America”.


Stop Using Tiananmen Square as a Censorship Test

I commented on Robert Scoble’s blog in response to Serkan Toto’s use of search results for “Tiananmen Square” on Google.com vs. “天安门广场” on Google.cn to illustrate that some filtering was still up. He’s right that filtering is still up as of now, but that’s a bad search query for illustrating the point. I complained about this earlier with Nicholas Kristof too, and I think this sort of thing illustrates how our preconceived notions about the People’s Republic of China color our view of events there.

I’ve reposted the relevant bits of my comment on Scoble’s blog below:

[U]sing Tiananmen Square as a test query is misleading. Of course “天安门广场” is going to return images of, you know, the actual square! Here are the search results for “天安门广场” in Google.com, which is US-based and uncensored:

http://bit.ly/7C8EsD

Huh, not much there — but this time you can’t blame censorship for it.

Why? Well, English speakers are very likely to associate Tiananmen with the 1989 crackdown, so Google’s search algorithm associates the term “Tiananmen” with images of the tank guy.

On the other hand, for mainland Chinese, “天安门广场” has a meaning outside of the 1989 crackdown. It’s a place, and one that’s smack dab in the middle of Beijing. When someone in China mentions “天安门广场”, they’re probably using it in the context of “there’s a street vendor near the northwest corner of Tiananmen Square selling kites,” not “never forget the people killed here 21 years ago.” Most people on the Internet use it for boring everyday stuff, not to foment dissent over an event a lot of “netizens” are too young to remember. Google’s algorithm picks up on this kind of thing and organically ranks things related directly to the location itself over things related to the one incident that English speakers associate Tiananmen with.

“天安门广场 1989” and “Tiananmen 1989” are probably much better terms for proving your point.

That said, you’re right that Google.cn hasn’t implemented some or all of the de-censoring yet. You can tell, because at the bottom of the search results on Google.cn, you see “据当地法律法规和政策,部分搜索结果未予显示。”

That is, “According to local laws, regulations and policies, some search results are not shown.”

Bing Censoring in China?

Nicholas Kristof recently put up an article about Bing censoring simplified (mainland) Chinese searches. All of the major search players do this, of course, but what’s new is that the censoring happens even if you’re searching from a U.S. IP address (as opposed to within China itself).

Kristof uses Tiananmen (天安门) as his search term, but I think that’s a little ambiguous. Tiananmen Square has a history that stretches back well before 1989 (trivia of the day: the 1989 incident was not the first Tiananmen Square incident) and, as a popular tourist location, it’s plausible that Bing’s algorithm would turn up lots of friendly-Tiananmen-is-a-nice-place-to-visit results.

So let’s try the name of a certain evil cult outlawed in China.

For comparison, here’re the Google results:

Google has 7,490,000 results and Bing has 0? Now that’s implausible.

Interesting notes:

  • Today’s Bing background is of Potala Palace in Tibet, the former home of the Dalai Lama.
  • Google includes traditional Chinese character results in search results using simplified Chinese characters (see the last item in the screenshot above).