Waxy.org

Andy Baio lives here

The Final ROFLCon and Mobile's Impact on Internet Culture

A little late on this, but wow, ROFLCon III was amazing. I was there to moderate a morning keynote panel on the supercut meme with Rich Juzwiak, Duncan Robson and Aaron Valdez, three of my favorite supercut creators. It was a privilege to share the stage with these guys, who are all amazing at what they do. It ended with a debut of Duncan's Three Point Landing, which the audience adored. Here's the whole thing.

Every talk I saw was amazing. All the sessions are making their way onto YouTube, and are all worth checking out. I posted some of my personal highlights on Twitter, but if you missed them, here are my favorites:

Jonathan Zittrain's introductory keynote was thoughtful and inspiring. Jason Scott's solo talk on the Mysterious Mr. Hokum is a crazy story of a pre-Internet scammer. Flourish Klink's panel on fangirl culture was eye-opening, a glimpse into a massive subculture of the web I know far too little about.

The most entertaining, hands down, was Craig Allen's behind-the-scenes story of the Old Spice campaign, with a surprise Skype cameo by Isaiah Mustafa.

The most underseen and misunderstood session was Wonder-Tonic's pitch for Localoffrly.biz, a douchebag startup turned into comedy performance art. (Bonus points for actually launching a site.) Hard to believe, but some people in the audience weren't sure whether it was a joke, and started to get frustrated when they stopped the gamified talk between each "level." Brave.

And, of course, Chris Poole's solo talk, which ended up inspiring my Wired column that was published last Wednesday. I reprinted it below, hope you enjoy it.


Early this month, the Internet invaded the MIT campus for ROFLCon III, the biennial two-day conference that brings together the subjects of net memes with those who study and adore them.

Among the meme celebrities -- Tron Guy, Paul "Double Rainbow" Vasquez, Antoine Dodson, Scumbag Steve and Chuck Testa all attended -- were those who are deeply invested in the future of Internet culture, both emotionally and financially. Founders of community sites like Reddit and 4chan, academics studying memes, and the cottage industry that's capitalized on them, most notably the Cheezburger Network's Ben Huh. And, of course, the whole audience participated in their propagation.

From the moment I boarded the plane to Boston there was an undercurrent of change running through the conference. I sat next to Whitney Phillips, a University of Oregon doctoral student speaking on a panel about her research on troll culture. She'd attended every ROFLCon since 2008, and realized that she'd have to revise her thesis in the next month -- the meme landscape is in a transitional period, but it's not clear what it's transitioning into. She echoed something I heard repeatedly over the weekend: "It just feels different."

It felt apropos that this was the last ROFLCon, with the organizers "putting this trilogy to bed and riding out into the sunset." Or, at least, until "we can figure out how to continue doing it great justice."

The Internet is still spawning memes at an accelerated rate -- and they'll never go away. But there are some major shifts under way that may fundamentally change the way they're created.

Every meme, like folklore, shares two common characteristics: It must show reproduction (the ability to be copied) and variation (the ability to mutate).

These days, memes spread faster and wider than ever, with social networks acting as the fuel for mass distribution. But it's possible we may see less mutation and remixing in the near future. As Internet usage shifts from desktops and laptops to mobile devices and tablets, the ability to mutate memes in a meaningful way becomes harder.


From the Interest Web to the Social Web

Over the last few years, we've seen a fundamental shift away from discussion forums and other niche communities to social networks and aggregators. In a 20-minute talk at ROFLCon, 4chan and Canvas founder Chris Poole characterized this as a shift from the interest-based web to the friend-based web.

Poole is concerned that the web is losing its emotional depth, a richness that comes from lurking, failing and learning before finding your place in a community. The difficulty gave it more meaning, and the resulting communities added far more value to the web than they extracted.

Now, aggregators like 9GAG and Cheezburger are ridiculously popular, but memes rarely originate there. Unsourced images are posted and watermarked by their new hosts, muddling their origins and diluting the context of the original image. As Poole said, "It's hard to feel emotionally invested in 9GAG."

To me, this is part of the natural expansion of online community. Reddit users hate 9GAG for stealing their memes, but 9GAG is popular because it's easier to use, making it more inclusive to Facebook users than Reddit's sprawling subgenres and somewhat esoteric community norms. It's the same reason that, for years, 4chan users hated Reddit for stealing their memes and bringing them to a community that was much easier to understand.

Unlike social networks, each successive community doesn't seem to cannibalize its predecessor, but instead simply finds a larger, newer audience. The original community stays largely the same, which feels like stagnation relative to the "next big thing." With each new site, the mainstream base and shared knowledge we call "Internet culture" converges into a mixed cultural heritage.

But there's one potential risk that affects the cultural production of memes.


Meme Mutation

Ever tried using 4chan on an iPhone? It's completely impossible to upload images from an iPhone or iPad, immediately limiting your contribution to the community to commenting alone. Sites like Reddit let you post a URL, but modifying and uploading images to a public URL from a mobile device is, for the moment, not easy.

Also for the moment, it's extremely rare for mobile apps to allow community remix and sharing. In fact, I could only find two iOS apps that supported posting your own remixes to a public community space: Mixel and Make Pixel Art. (If you know more, leave them in the comments.) All others only support sharing to your contacts or your own social network, but not the public, unmediated space that memes thrive in.

It's not surprising, then, that the only memes that seem to originate on smartphones are text-based -- autocorrect fail, iPhone whale, and texts from last night.

It feels like we're on the verge of a breakthrough to unleash the creative potential of these devices, but mobile developers are limiting our options to mild tweaking, at best. Instagram's filters made the simplest cosmetic changes, and you weren't able to modify anybody else's work. Draw Something let you draw, but only with a single person and no shared history. Where's the Canvas, Polyvore, deviantArt, and YTMND of the app world?

In the absence of good remix apps, image macro generators like Meme Generator and Quick Meme have filled the gap, making it possible to instantly generate a new meme from a mobile browser in seconds. No tools, or time investment, required.

This is incredibly empowering, but also limiting. Your imagination, and the scope of the meme's breadth, is limited to the capabilities of the meme generator.

It's reasonable to think the shift from desktops and laptops to mobile and tablets will continue, especially for the new generations of young Internet users that typically generate memes. If the app ecosystem doesn't grow to accommodate it, we may see remix participation drop, largely substituted by the lightweight interaction of likes, favs and comments and lightweight prebuilt memes from generators.

In his talk on Saturday, Poole said, "Memes are the instruments with which we play music. The way things are going, we're going to lose our song."

Memes may not go away, but I'm worried we may lose the concert venues where the music is performed -- the quirky, difficult communities that foster creative expression and make it meaningful.

 
Criminal Creativity: Untangling Cover Song Licensing on YouTube

We all break laws. Every day, millions of people jaywalk, download music, and drive above the speed limit. Some laws are obscure, others are inconvenient, and others are just fun to break.

There are millions of cover songs on YouTube, with around 12,000 new covers uploaded in the last 24 hours. Nearly 40,000 people covered "Rolling in the Deep," 11,000 took on "Pumped Up Kicks," 6,000 were inspired by "Somebody That I Used to Know."

Until recently, all but a sliver were illegal, considered infringement under current copyright law. Nearly all were non-commercial, created out of love by fans of the source material, with no negative impact on the market value of the original.

This is creativity criminalized, quite possibly the most popular creative act that's against the law.

I don't think it's an act of civil disobedience; nobody's making a statement. Most people don't know that cover songs need a synchronization license, and even if they did, trying to get one is a confusing and expensive proposition. Unlike the mechanical licenses used to release a cover song on an album, video sync licenses don't have an affordable flat rate and require the publisher's explicit permission.

Even as YouTube forges agreements with publishers to handle the synchronization rights for cover songs, it's nearly impossible for musicians to tell whether their songs are covered or not.

This week, I set out to answer a seemingly simple question: when are YouTube cover songs legal, and how can we do this better?


Conflicting Information

Even trying to determine if a cover song is legal can be confusing for most musicians. There's no shortage of answers online, but most of them are conflicting. Publishers, musicians, and lawyers all give different answers, none of which are totally accurate. Even YouTube's own FAQs are incomplete, made inaccurate by recent settlement agreements.

Like any area of copyright law, there's no shortage of armchair lawyering on blogs and discussion forums about cover songs. A common belief is that cover songs fall under the "fair use" provisions of the Copyright Act, but the question of whether a non-parody cover song could fall under fair use is untested in the courts. Despite this, over 60,000 cover songs on YouTube cite "fair use" in their title or description. (Whether uploaders actually believe that or are preemptively using it as a defense is anyone's guess.)


Content ID detects one of Adrian Holovaty's cover songs

While they happily encourage fans to upload covers, YouTube makes it clear that users must have the rights to all content they upload. "We tell users they must own the copyright or have the necessary rights for any content they upload," said a YouTube representative. "It's ultimately their responsibility to know whether they possess the rights for a particular piece of content."

Their only specific guidance for cover songs is in their Copyright FAQ, which says, "Recording a cover version of your favorite song does not necessarily give you the right to upload that recording without permission from the owner of the underlying music."

But this answer isn't fully accurate. YouTube's negotiated blanket synchronization licenses for its users from thousands of publishers, most notably the settlement with the National Music Publishers Association last August. This agreement allowed publishers to opt in to a program that let them take a cut from a $4 million advance pool and up to 50 percent of the advertising revenue from any cover song they own the rights to.

Frustratingly, we have no idea which publishers have signed on. The NMPA doesn't publish the list, making it impossible to figure out whether your song is covered by the agreement or not. (I contacted the NMPA; a spokesperson confirmed that the information appeared to be unavailable, but said she was looking into it.)


Begging for Forgiveness

In reality, the only way to tell whether a song is legal is to risk breaking the law and losing your YouTube account -- by uploading the video and waiting for copyright notices.

In the last few months, YouTube has quietly expanded Content ID beyond original recordings to detect cover versions and live performances using the underlying melodies. A YouTube representative confirmed with me, "Content ID's technology allows us to identify works in an original sound recording, or in a cover version (by identifying the underlying melody of a song), using information provided to us by the publishers."

YouTube hasn't talked much about its melody matching technology, but it was in the news recently after a drunk Edmonton man belted "Bohemian Rhapsody" in the back of a police car. After Content ID identified the song, EMI initially decided to take the video down, but soon changed its mind and authorized it with advertising.


Adrian shared a screenshot of his copyright disputes page.

Everyblock founder Adrian Holovaty is well known on YouTube for his acoustic guitar covers, which have amassed millions of views. I asked him if Content ID identified the melodies in any of his videos. So far, seven of his videos were identified, with all but one rights holder choosing to leave the video online and collect the revenue. Only one video, his cover of the Village People's "YMCA," was taken down by the songwriter, leaving Adrian with a "copyright strike" on his account. YouTube's policy allows three strikes before the account is terminated and all videos removed.


The Flaws in the System

The system's not perfect, though. Unscrupulous individuals are routinely using Content ID to claim content they don't own to harvest ad dollars from unsuspecting users. For example, two of Adrian Holovaty's disputed tracks are Django Reinhardt songs from the 1930s, claimed by an obscure company named "Social Media Holdings."

Other copyright claims may be accidental, as material they don't actually own finds its way into the Content ID database, like this poor guy who's received eight consecutive claims from companies claiming to own George Romero's public domain Night of the Living Dead.

And Content ID isn't immune to false positives, like the bird calls misidentified as music. Worse, for all these cases, disputed Content ID claims bypass the DMCA process for counter-claims entirely, as I wrote about in February.

How can a musician decide what's legitimate or worth fighting?

Still, YouTube's Content ID is pushing publishers and rights holders into the modern age. It's an ingenious approach for an otherwise dysfunctional copyright system that's too hard for amateurs to navigate, making money for everyone involved while still allowing free creative expression.


The Need for Change

But there's something strange about this begging-for-forgiveness approach to copyright. It's like driving without traffic signs, only finding out you broke the law when you're pulled over.

The real question: Why is it illegal in the first place?

Cover songs on YouTube are, almost universally, non-commercial in nature. They're created by fans, mostly amateur musicians, with no negative impact on the market value of the original work. (If anything, it increases demand by acting as a free promotional vehicle for the track.)

The best solution is the hardest one: To reform copyright law to legalize the distribution of free, non-commercial cover songs.

Copyright law was intended to foster creativity by making it safe for creators to exclusively capitalize on their work for a limited period of time. Cover songs on YouTube don't threaten that ability, and outlawing them may actually prevent new works by chilling talent that could go on to do great things.

As we've seen with countless breakout artists from YouTube, budding musicians have built their careers from cover songs that evolved into original material. Karmin, Pomplamoose, Julia Nunes, Greyson Chance.... Even Justin Bieber started with covers of Chris Brown and Ne-Yo before getting discovered.

Now, the next generation of budding pop stars are covering Justin Bieber, with about 216,000 of them so far. It's all part of the virtuous cycle of culture: We take from it, build on it, and then give back in return. The law should help that along, not hinder it.


Update: I originally published this column over at Wired on May 2. The woman I spoke to at the NMPA confirmed the list of publishers appeared to be unavailable, but promised to look into it. I haven't heard back, so I followed up again. I'll update here if I hear anything.

 
History of Yahoo CEOs: Tenure vs. Stock Price

Just for the hell of it, I charted the tenure for every one of Yahoo's CEOs against the starting and ending stock price. Man, what a mess.


Just so nobody else ever has to do this, here's the data, culled from Google News reports and YHOO stock quotes. Download as a CSV.
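If you want to redo the comparison yourself, here's a minimal sketch of the arithmetic behind a chart like this: days in the job, and percent change between the starting and ending stock price. The column names and the two sample rows are placeholders I made up for illustration; they are not the real Yahoo figures or the layout of the actual CSV.

```python
import csv
import io
from datetime import date

# Placeholder rows with made-up values, NOT the real Yahoo data.
# The column names are assumptions; adjust them to match the actual CSV.
SAMPLE_CSV = """ceo,start_date,end_date,start_price,end_price
Example CEO A,2000-01-01,2001-01-01,10.00,15.00
Example CEO B,2001-01-01,2001-07-01,15.00,12.00
"""


def summarize_tenures(csv_text):
    """Return tenure length in days and percent price change for each row."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        start = date.fromisoformat(record["start_date"])
        end = date.fromisoformat(record["end_date"])
        start_price = float(record["start_price"])
        end_price = float(record["end_price"])
        rows.append({
            "ceo": record["ceo"],
            "tenure_days": (end - start).days,
            # Percent change relative to the price on the first day.
            "price_change_pct": (end_price - start_price) / start_price * 100.0,
        })
    return rows


summary = summarize_tenures(SAMPLE_CSV)
```

Swap `SAMPLE_CSV` for the contents of the downloaded file (and rename the columns to match) to get the numbers for the real chart.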

 
Super Mario Summary Shame

Is there a word for being totally proud of something and completely ashamed at the same time?


Context: Super Mario Summary reworks every level of Super Mario Bros. on a single screen, turning a side-scrolling platformer into a devious, addictive puzzle. Amazingly, it was built in only 48 hours by Swedish developer Johan Peitz as part of the Ludum Dare game competition. Read his postmortem of making the game.

 
In a Rigged Game, Twitter's IPA Lets Developers Rewrite the Rules

Last month, in response to Yahoo's wrongheaded patent infringement lawsuit against Facebook, I wrote about my experience filing patents at Yahoo. Patents I helped to file, ostensibly only for defensive purposes, were turned into blunt weapons to thwart innovation and extort money.

As I said, "I thought I was giving them a shield, but turns out I gave them a missile with my name permanently engraved on it."

This week, Twitter announced their Innovator's Patent Agreement, an open source contract intended to guarantee patents will only be used defensively, even when sold. The IPA seems to directly address the issues raised in my article.

Adam Messinger, Twitter VP of Engineering, wrote that, "With the IPA, employees can be assured that their patents will be used only as a shield rather than as a weapon."

Every one of Twitter's existing patent filings, including Loren Brichter's famous pull-to-refresh patent, will fall under this agreement later this year.

Still, the IPA isn't perfect, and it needs work to protect the intentions of designers and engineers. Instapaper founder Marco Arment pointed out that the contract's definition of "defensive" is overly broad, allowing an unethical company to initiate a lawsuit for a range of reasons without requiring the inventor's permission.

Hypothetically, if Yahoo had adopted the IPA, would it have prevented them from later suing Facebook for patent infringement? Maybe not. Facebook's threatened several startups over trademark name issues in the past, including Lamebook, Placebook, and Teachbook. If any of them were also users, customers or affiliates of Yahoo, then Yahoo could bypass the Patent Agreement and file a patent lawsuit. (Though, if they did, the inventors could choose to sublicense their patents directly to Facebook.)

These problems are correctable though, and Twitter should be commended for taking this important first step. In a deeply broken patent system, it's heartening to see an established company proactively try to work around its flaws. I hope agreements like these find wide industry adoption.

But this isn't a real fix. Union Square's Fred Wilson dubbed it Twitter's "Patent Hack," and that's exactly what it is -- it's duct tape to patch a broken system, but it doesn't solve any of the underlying problems.

The ideal would be patent reform, or if the system's beyond reform, the abolition of business method patents entirely.

Marco Arment wrote, "A truly innovative stance would be for a large technology company to avoid filing patents, and to lobby aggressively for progressive patent reform to make that a practical choice for every technology company."

Like I did last month, Marco vowed not to file any patents. "I fundamentally disagree that software patents (and many other types of patents) are a net gain for society, and I can't participate in that system in good conscience."

After all, if you only use them defensively, why do you need patents at all? Publish your work and establish prior art.

Sadly, prior art only works in an ideal world. As we've seen, the U.S. patent office routinely grants patents even when prior art exists. The recently passed reforms to the patent system, switching from a first-to-invent to a first-to-file system, make this more likely than ever.

For the moment, avoiding patents entirely isn't a realistic legal strategy for large companies. Maintaining a patent arsenal won't ward off shell company-style patent trolls, but it can protect you from competitors by allowing cross-licensing settlements. But all of that feeds into the "cold war" mentality of stockpiling patents you never hope to use.

Until we have real reform or abolition, ethical tech companies are forced to play the patent game, but at least engineers and designers now have a way to rewrite the rules in their favor.

 
Memeorandum Colors 2012: Visualizing Bias on Political Blogs

I don't watch sports, but every four years, I lose myself in the horse race of the U.S. presidential elections. That competition kicked off in earnest Monday, as Gallup started its daily tracking polls for the general election between Barack Obama and Mitt Romney.

In 2008, I was hooked on one drug for my daily fix: Memeorandum, a completely automated aggregator that surfaces popular stories from political news sites, often within minutes.

As you'd expect, the universe of political blogs is largely split in two, with conservative and liberal blogs rarely covering the same stories or linking to the same sites. But it can be very challenging to tell their political leanings at a glance, especially with names like "Balloon Juice," "Weasel Zippers," or "The Volokh Conspiracy."

So, four years ago, I launched a project with Delicious/Tasty Labs founder Joshua Schachter to visualize the linking biases of various political blogs on Memeorandum by looking at their past behavior.

Using singular value decomposition, the linear algebra at the heart of your Netflix recommendations, we reduced the entire matrix of blogger-to-article relationships to a single dimension. Imagine a single line grouping like-minded blogs together based on the diversity of the stories they cover, with hardcore left- and right-leaning blogs on opposite sides of the spectrum.
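To make that idea concrete, here's a rough sketch, not our actual code. It uses power iteration, a simple stand-in for a full SVD routine, to approximate the leading singular direction of a tiny blog-by-story matrix. The blog names, the link data, and the choice to center each column first are all assumptions made for illustration.

```python
import math


def leading_axis_scores(matrix, iterations=100):
    """Score each row along the leading singular direction of the
    column-centered matrix, approximated by power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Assumption: center each column so the axis reflects contrast between
    # blogs instead of overall linking volume.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

    # Deterministic, non-constant starting vector.
    u = [float(i + 1) for i in range(n_rows)]
    for _ in range(iterations):
        # v = A^T u, then u = A v: one step of power iteration on A A^T.
        v = [sum(centered[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        u = [sum(centered[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:
            return [0.0] * n_rows  # every blog linked identically
        u = [x / norm for x in u]
    return u


# Hypothetical blogs (rows) and stories (columns): 1 means the blog linked to the story.
blogs = ["left_a", "left_b", "right_a", "right_b", "centrist"]
links = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]

# One number per blog. The sign is arbitrary; only the grouping matters.
scores = dict(zip(blogs, leading_axis_scores(links)))
```

In this toy example, the two "left" blogs land on one side of zero, the two "right" blogs on the other, and the blog that links to both camps sits in the middle. Which side comes out positive is arbitrary, so a real pipeline would need to pin the sign against a blog whose leaning is known.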

Using those precalculated values, we load the data from Google Spreadsheets and color the links on Memeorandum, based on where they fall on the spectrum. The brighter the color, the more exclusively a site covers the same stories as its like-minded counterparts.
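The coloring step is just a mapping from a score to a color. Here's a hedged sketch of one way to do it, assuming scores run from -1 to 1, with negative values shading toward blue and positive toward red. The function name, the palette, and the scale are illustrative assumptions, not the add-on's real code.

```python
def score_to_hex(score, max_abs=1.0):
    """Map a bias score to a hex color: white at zero, more saturated
    blue (negative) or red (positive) as the score moves toward an extreme."""
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    # Clamp so outliers beyond the expected range get the strongest color.
    strength = min(abs(score) / max_abs, 1.0)
    fade = round(255 * (1.0 - strength))
    if score < 0:
        return "#{:02x}{:02x}ff".format(fade, fade)
    if score > 0:
        return "#ff{:02x}{:02x}".format(fade, fade)
    return "#ffffff"
```

A userscript would then look up each link's source in the precalculated table and apply the resulting color as a style on the link.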

This simple visualization leads to some interesting insights. Compare these two articles, which were trending on Memeorandum at this writing:


Seeing each site's potential bias provides the context for understanding how news is spread. Right-leaning blogs are eager to point out new evidence that George Zimmerman was hurt the day he shot Trayvon Martin, but left-leaning blogs aren't covering that story. Likewise, left-leaning news sites appear to be covering the news of Ted Nugent's threatening remarks to the president, but conservative blogs aren't. This visualization also makes it easy to spot outliers, the sources that are breaking away from their past behavior to link to something beyond their usual circle.

This browser add-on is free and open source on Github. We've updated the data sources for the first time since 2008, and Memeorandum Colors now works natively in Chrome, in addition to Firefox.

You can try the browser add-on by following these simple directions.

Google Chrome

  1. Click the memeorandum_colors.user.js link.

  2. In the warning dialog at the bottom of Chrome window, select "Continue."

  3. Visit Memeorandum and wait a moment for the links to color.

Firefox

  1. Install Greasemonkey.

  2. Restart Firefox.

  3. Click the memeorandum_colors.user.js link, wait three seconds, and Install.

  4. Visit Memeorandum and wait a moment for the links to color.


Four Years of Data

Along with this release, we now have four years of historical activity to work with. The collected scores are on Google Fusion Tables, and I've included a dump of the activity in Github.

Looking at historical activity can reveal some interesting trends, especially in how attitudes have shifted since the last election.

For example, Little Green Footballs is a long-running political weblog started by Charles Johnson, a web developer who aligned himself with the conservative right wing after the World Trade Center attacks. In late 2009, he publicly parted ways with the right.

That shift away from conservatism was reflected in his linking behavior at least a year before his public statement. If you look at the timeline below, you can see that Johnson started linking to a wider variety of stories outside the conservative conversation, until his activity was mostly neutral in early 2010. Now, his activity tends neutral but slightly favors articles popular in the liberal blogosphere.


Bias In Linking, Not Beliefs

Memeorandum was created by San Francisco developer Gabe Rivera, who followed its introduction with aggregators for media, celebrity gossip, and baseball news. The most popular of these is Techmeme, a daily destination for tech industry watchers.

A month after Obama's election, Rivera announced he'd hired a human editor for Techmeme to help prevent inaccurate results from the algorithm. This editorial oversight would affect any link-based analysis on Techmeme, but he confirmed that Memeorandum is still completely machine-driven.

This automated analysis is not a commentary on the personal opinions and beliefs of any blogger -- no amount of linear algebra can prove that. What this shows is the biases in their linking behavior: the stories that each site chooses to cover, or not cover, and their similarity to others like them.

If you'd like to learn more about the math behind how this works, there's more detail and links to tutorials on my original blog entry.

Let me know if you have any questions and I'll try to answer them in the comments.

 
Waxy.org Turns 10

Ten years ago, I started this site with three simple rules: no journaling, no tired memes, and be original. 18 months later, I added a little linkblog.

In those ten years, I've posted 415 entries, including this one, and over 13,000 links.

The decision to start writing here regularly changed my entire life. It's given me exposure, a place to share my projects and crazy experimentation with technology. It's created new opportunities for me, directly or indirectly responsible for every major project I've gotten involved in. It's a place to play and experiment with ideas, some of which led to big breakthroughs and passions. And it connected me to people who cared about the things I did, many of whom became lifelong friends.

Personal homepages and weblogs have long since faded from the popular trends. They're no longer hip and nobody's launching the hot new startup to reinvent them or make them better.

Most of the interest in writing online's shifted to microblogging, but not everything belongs in 140 characters and it's all so impermanent. Twitter's great, but it's not a replacement for a permanent home that belongs to you.

And since there are fewer and fewer individuals doing long-form writing these days, relative to the growing potential audience, it's getting easier to get attention than ever if you actually have something original to say.

Carving out a space for yourself online, somewhere where you can express yourself and share your work, is still one of the best possible investments you can make with your time. It's why, after ten years, my first response to anyone just getting started online is to start, and maintain, a blog.

And now, just for the hell of it, some of my favorite posts from the last ten years. :)


2002

Tracking the All Your Base Meme with Usenet. The first chart appears only two weeks in, setting a precedent for the next ten years.

Dar Kabatoff's In Town. My first deep-dive into Internet kookiness, an amazing example of Usenet lunacy that eventually led to my first stalker. To this day, people still link to this on various forums that Kabatoff appears in.

Spamming Weblog Comments. Where I casually predicted the rise of blog spam and Bayesian filters designed to stop it.

Steve Martin Fans. Another exploration into a sad, weird corner of the Internet, a prolific stalker turned suicidal in a Steve Martin fan forum.

October 2002 Dictionary Domains. I used to periodically run a script, check for the availability of dictionary word .com, .org and .net domains, and post the results. Note the last one in the list, which I later snatched up for myself.


2003

Eldred, Shared Culture Loses. My first mashup landed me in the New York Times and Boston Globe, my first real press coverage ever. Soon after, a Disney exec bought a print of the comic from me, with the sale facilitated by Larry Lessig himself!

NYT and Lost Friends. Two weeks later, I was in the NYT again for my Lost Friends page. This was very new to me.

Google Buys Blogger. I was sitting front and center at the Blogosphere panel in Los Angeles when Ev announced Google bought Blogger, and was one of the first to report the news.

Bias Affects Story Updates on Political Weblogs. My first controversial tech exposé, manually analyzing sites to understand linking behavior. Most of these sites found my article from their referers, leading to some very upset bloggers. People don't like to be accused of bias.

Typo Popularity Tracking with Google. I feel like I started to hit a stride with posts like these, doing some simple analysis to find entertaining results.

Star Wars Kid. The post that launched a meme, melting my server and the servers of most of my friends. I later tracked him down, interviewing him with Jish's help and doing a fundraiser to buy him a newly-introduced iPod. Later, I reported on the lawsuits. Years later, I wrote a final summary of the whole thing, along with the logs for that period.

Santa Monica Farmer's Market Tragedy. My personal reporting from a freak car accident that killed nine people outside my office led to coverage in the BBC. Horrifying.

Upcoming.org Launch! The side project that changed my life.


2004

Researching the 2004 Oscar Screeners. Inspired by a delusional film industry, I sat down and tried to figure out exactly how often Oscar screeners leak online. Eight years later, I'm still doing it every year.

Waxy v2.0. Announcing our pregnancy and, a few months later, the birth of our son.

Danger Mouse's The Grey Album. I was the first person to put the Grey Album on the web, leading to the first takedown request from EMI, which spawned the Grey Tuesday protests.

InfocomBot for AOL Instant Messenger. One of my favorite hacks ever, it let you play classic and modern text adventures over AIM.

Nanniebots: Hoax, Fraud, or Delusion? I helped Ben Goldacre and Cameron Marlow debunk a ridiculous hoax, someone who claimed he developed chatbots to lure pedophiles in chatrooms.

Waxy's Bandwidth Blowout #1: Heat Vision and Jack. In the years before YouTube, serving video was a massive pain in the ass. If you were lucky enough to have a dedicated server, excess bandwidth was a handy commodity. I always loved hosting commercially-unavailable materials.

Amazon Knee-Jerk Contrarian Game. This post, tracking horrible Amazon reviews of critically-loved media, still makes me laugh.

Kleptones, "Night at the Hip-Hopera". Still my favorite mashup album ever, I originally hosted a copy and crowdsourced the sample list for the Kleptones. It netted me my second cease-and-desist, this time from Disney/Hollywood Records.

Afro-Ninja Found! I managed to track down the identity of a stuntman having a very bad day.

Amateur Tsunami Video Footage. Another pre-YouTube phenomenon, the demand for this tragic disaster footage was so high, it melted my server and even took down Archive.org for a time. The videos I uploaded to Archive.org dominated their most downloaded lists for years.


2005

Boing Boing Statistics. I built a simple visualization tool for Boing Boing's five-year archive, following my own Waxy.org Stats and Metafilter growth charts.

Wordpress Website's Search Engine Spam. The biggest story I'd ever broken, at that point, covering search engine spam hidden on Wordpress.org. For me, this was a switch from casual blogging to serious journalism, including quotes from Matt Mullenweg before publishing. More in the followup.

Automating Wikipedia History. I started a contest to make a Greasemonkey script to visually browse Wikipedia history, and got some amazing entries, including one by future-jQuery creator John Resig.

Yahoo and Upcoming, Sitting In A Tree. One of the craziest things that ever happened to me, the optimism in this post is almost blinding.

House of Cosbys, Mirrored. After the brilliant Cosby-inspired animated series was shut down, I mirrored all of the videos and got a takedown order from Bill Cosby's lawyer. I publicly defied it, compiled a list of Cosby parodies in the media, and did an interview about it with the New York Times. I never heard from team Cosby again.


2006

Metafilter Sources 2006. Tracking how the top 50 link sources on Metafilter changed between 2004 and 2006.

Sex Baiting Prank on Craigslist Affects Hundreds. I broke the story of Jason Fortuny's "Craigslist Experiment" after seeing a link to it in a private discussion forum. This ended up being a huge story, involving Craigslist, lawsuits, and ruined lives.


2007

Outgoing. Waxy.org went into cryogenic sleep while I was working at Yahoo and raising my baby boy, so I decided to take some time off to write again and explore new ideas.


2008

Colin's Bear Animation. Four years later, this video still makes me laugh. I tracked down Colin and interviewed him about it.

Personal Ads of the Digerati. I dug up vintage personal ads from Dave Winer and Richard Stallman, and I interviewed RMS about his unusual methods of accessing the web.

The Times (UK) Spamming Social Media Sites. I exposed some nefarious SEO practices from a mainstream newspaper, and interviewed founders of online communities to see what they thought.

Highlights from the British MovieTone Darkweb. Some wonderful vintage videos from a service that doesn't want you to find them. I'm amazed these videos still work.

ForumWarz Postmortem: Interviewing the Game's Creators. This innovative game never got popular, but I was very proud of this interview.

WIRED and The WELL. I have a complete archive of The WELL, and occasionally dig into it for research. For anyone who cares about Wired history, it's a treasure trove.

Internet Power, Volume 1: Flashback to the VHS-Era Web. I set up a VCR and started ripping vintage VHS tapes about the Internet. This was the first of a series of VHS rips, including Internet Power Vol. 2, Olympia School District, and Computability.

Fanboy Supercuts, Obsessive Video Montages. The blog post that named the "supercut" genre, I continued adding to it for years before starting Supercut.org.

Milliways: Infocom's Unreleased Sequel to Hitchhiker's Guide to the Galaxy. This post caused me more pain and heartache than anything I've ever written. On its release, I was extremely proud of it, reconstructing the never-before-told history of an unreleased Infocom game using digital archives. But I didn't ask permission before quoting private emails, causing major fallout on the source that provided me with the archives, ending our friendship forever. You have no idea how often I wish I could unpublish this post.

The Whitburn Project: 120 Years of Music Chart History. I've always loved this story about a group of record collectors on Usenet, illegally swapping Billboard chart spreadsheets. In my followup post, I used the data to analyze music history.

The Machine That Changed the World: Great Brains. An awesome, out-of-print documentary series on computer history. I ripped it from VHS and created annotated show notes for each of the five episodes.

Girl Turk: Mechanical Turk Meets Girl Talk's "Feed the Animals". The first of my Mechanical Turk experiments, crowdsourcing metadata about the album to make neat charts.

Cheap, Easy Audio Transcription with Mechanical Turk. People still cite this post regularly as the guide for DIY crowdsourced transcription.

Kickstarter. The first of many posts about Kickstarter, when I first met the team and joined the board. "Ultimately, everybody should be able to support themselves doing what they love using the web."

Memeorandum Colors: Visualizing Political Bias with Greasemonkey. I worked with Joshua Schachter on this Greasemonkey script analyzing linking behavior on Memeorandum. I still use this every day.

The Faces of Mechanical Turk. I wanted to know what they looked like, and was willing to pay them to find out. This image seems to show up in every conference presentation about Mechanical Turk.


2009

Robin Hood's "Oo De Lally," Translated Into 16 Languages. This makes me happy.

Translating "The Economist" Behind China's Great Firewall. One of the strangest online communities I've ever discovered, a group of Chinese fans of The Economist translating the entire thing cover-to-cover as a learning tool. I ended up writing a shorter version of this piece for the New York Times.

Attribution and Affiliation on All Things Digital. This investigation into AllThingsD's linking practices led to concrete change. They never use long quotes anymore, clearly attribute, and drive traffic to the blogs they link to. Everyone wins.

Category Inflation at the Webbys. In the three years since, the number of categories continues to explode. Planning on writing a followup soon.

Kind of Bloop: An 8-Bit Tribute to Miles Davis. My first Kickstarter project was a big success, hitting its goal in four hours, and went on sale later that year.

Meme Scenery. One of my all-time favorite posts, I removed the subjects of famous memes from their backgrounds. There's something weirdly serene about these background locations without context.

Code Rush in the Creative Commons. In 2008, I'd posted an annotated copy of the classic Mozilla documentary and interviewed the director after he requested I take it offline. A year later, he decided to release it under a Creative Commons license, allowing me to put my annotated version back online.


2010

Interviewing Ted Rall on Comics Journalism in Afghanistan. I interviewed several project creators for the Kickstarter podcast, including this one with author and cartoonist Ted Rall, Pixeljam and James Kochalka, and R.U. Sirius.

Wikileaks Cablegate Reactions Roundup. Sometimes, there's value in just curating the best set of links around a topic. Every time I've ever done this, people seem to like it. I need to remember that more often.

Joining Expert Labs. At the end of 2010, I took a leap and joined Expert Labs to work on tools to help government agencies better listen to citizens using social media.


2011

Metagames: Games About Games. Quite possibly the most entertaining research I've ever done. It took me forever, largely because I ended up playing so many clever games.

The Daily: Indexed. I got a lot of press for creating a public index of The Daily's iPad app, against their will. After my trial was up, I wrote about how I did it.

Making Supercut.org. The product of one very, very long night, I worked with artist Michael Bell-Smith to make a script that generated randomized video clips composed entirely of spliced-together supercuts.

Playable Archaeology: An Interview with Telehack's Anonymous Creator. I was so floored by this tour de force of computing history, I interviewed the brilliant, but anonymous, genius behind it.

Kind of Screwed. The long, frustrating tale of the contested Kind of Bloop artwork, which cost me a large out-of-court settlement and a bunch of legal bills. Makes a good story, though!

Apple's 1987 Knowledge Navigator, Only One Month Late. As I was watching the Knowledge Navigator video, I started piecing together dates to figure out when it was supposed to take place. I was blown away by the coincidence.

Google Kills Its Other Plus, and How to Bring It Back. My first column for Wired ended up being a big one. Lots of other power users were justifiably upset, and it directly led to the "Verbatim" feature being added to Google Search.

Supercut: Anatomy of a Meme. I dug into the supercut meme using Mechanical Turk and my database of clips. This doubled as the launch announcement for Supercut.org, a community-contributed index of videos.

Google Analytics A Threat to Potential Bloggers. Exposing one of my techniques for researching anonymous sites, I was surprised how many people didn't know about this.

Viewing the UC Davis Pepper Spraying from Multiple Angles. Sometimes, the simplest ideas are the most powerful. The video's been viewed on YouTube over 150k times.

No Copyright Intended. Remix culture is the new Prohibition.


I'll wrap it up there. With luck, I'll see you in ten more years. Thanks for reading.

 
Instagram's Buyout: How Does It Measure Up?

Instagram's billion-dollar sale to Facebook raised eyebrows yesterday, renewing cries of a new bubble. But relative to other major acquisitions of the past, how does it measure up?

I crunched the numbers, pulling together data from a selection of 30 notable internet acquisitions over the last ten years, from Broadcast.com to OMGPop, to see if the Facebook/Instagram acquisition was as crazy as everyone thinks. (I left out companies without public purchase prices or user stats.)

The spreadsheet below captures the acquisition date, dollar amounts, and ballpark counts of the users and employees at the time of acquisition. Be warned: many of these numbers are very rough, cobbled together from Internet Archive searches, old news articles, Quora answers, and tech blogs. If you have more accurate information, please leave a comment and I'll fix it.

Download the spreadsheet or view it on Google Docs.


Cost Per User

When a startup's acquired, they're purchased for any combination of the technology, talent, or the user base.

If we look strictly at the acquisition cost per user, Facebook got a relative deal with the Instagram purchase, paying roughly $37 for each of Instagram's 27 million users. (The median cost across all the acquisitions is about $92 per user.)

Compare that to acquisitions like Aardvark ($555/user) or Jaiku ($240/user), and you can quickly see which were likely technology or talent hires. The glaring exception is Yahoo's famous 1999 purchase of Mark Cuban's Broadcast.com, in which Yahoo paid nearly $10,000 for each of Broadcast.com's 520,000 monthly active users, ten times the per-user cost of any other startup. (Broadcast.com skewed the chart so much, I had to leave it off.)
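As a rough sketch of the per-user arithmetic in this section, the Python below divides a purchase price by an active-user count. The function name `cost_per_user` is made up for illustration; the Instagram inputs ($1 billion, 27 million users) are the figures quoted in the post, and the spreadsheet's other rows are not reproduced here.

```python
def cost_per_user(price_usd: float, active_users: int) -> float:
    """Acquisition price divided by active users at the time of the deal."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return price_usd / active_users


# Figures quoted in the post: a billion-dollar sale and 27 million users.
instagram = cost_per_user(1_000_000_000, 27_000_000)
print(f"Instagram: ${instagram:.2f} per user")
```

The same function applies to any row of the spreadsheet, as long as the user count is the active-user figure rather than total registrations.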


Cost Per Employee

But if you look at the payout per employee, Instagram is completely off the charts. If split equally, each of Instagram's 13 employees would make nearly $77 million. The nearest runner-up is YouTube, with a paltry $24M for its 2006-era staff of 67. Skype, Broadcast.com, and Myspace all top the charts. The median? About $3 million.

Some would point to this as a sign of a bubble, but I think it's more likely it just reflects the incredible scalability of modern app architectures. Using cloud services, failover, and solid monitoring, Instagram can quickly scale up to support a million new users overnight with very little additional engineering effort.


The User-to-Employee Ratio

Instagram's numbers are exactly what you'd want to see in a social network -- high user counts with the lowest number of employees. This ratio is a measure of your efficiency, and it's no surprise that Instagram comes out on top here, with a ratio of one employee for every 2.07 million users.

The second highest user-to-employee ratio is OMGPOP, famous for developing Draw Something, the fastest-growing mobile app in history. With only one employee for every 875,000 users, they were able to scale to 50 million users within 50 days.

On the other end of the scale are the short-lived Q&A service Aardvark, with one employee for every 1,800 users, and customer-service giant Zappos with one employee for every 3,400 users.

More than anything, the app ecosystem rewards efficiency: your ability to massively scale with very little engineering effort. I'm guessing these ridiculously lean startups with huge exits aren't a freak occurrence. We'll see more of them as the rest of the world catches up, and learns how to do more with less.
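The two employee-based metrics above can be sketched the same way. This is an illustrative calculation only: the helper names are hypothetical, the Instagram inputs come from the post, and the $1.65 billion YouTube price is the widely reported figure rather than a number stated in the post.

```python
def payout_per_employee(price_usd: float, employees: int) -> float:
    """Purchase price split equally across the staff at acquisition."""
    if employees <= 0:
        raise ValueError("employees must be positive")
    return price_usd / employees


def users_per_employee(active_users: int, employees: int) -> float:
    """User-to-employee ratio at the time of acquisition."""
    if employees <= 0:
        raise ValueError("employees must be positive")
    return active_users / employees


# Instagram: $1 billion, 13 employees, 27 million users (from the post).
instagram_payout = payout_per_employee(1_000_000_000, 13)
instagram_ratio = users_per_employee(27_000_000, 13)

# YouTube: 67 employees (from the post); $1.65 billion is an assumption
# taken from public reporting, not from the post itself.
youtube_payout = payout_per_employee(1_650_000_000, 67)

print(f"Instagram payout per employee: ${instagram_payout / 1e6:.1f}M")
print(f"YouTube payout per employee:   ${youtube_payout / 1e6:.1f}M")
print(f"Instagram users per employee:  {instagram_ratio:,.0f}")
```

The Instagram ratio works out to roughly 2.077 million users per employee, which appears to be the number behind the 2.07 million figure quoted above.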


Methodology

All figures are at the time of acquisition, and I favored active user counts over total registered users for calculating acquisition cost per user.

Thanks to Tristan Louis for providing some of the rumored numbers.


Update

I originally published this yesterday on Wired, under a different headline and revised lede from my editor. To be clear, I don't know if we're in a bubble or not. My only point is that, relative to other acquisitions, the per-user cost for Instagram isn't insane. Union Square Ventures' Albert Wenger added some additional thoughts, noting that the per-user costs should be discounted as the userbase grows.

Many Wired commenters complained I was wrong because Instagram has no revenue. In 2006, YouTube had 34M users, zero revenue, and was bleeding $1M/month for bandwidth alone. Was Google crazy to buy them, too?

Anyway, it was a good excuse to collect all of this data in a spreadsheet for the first time. I went looking, and couldn't find the numbers available in one place anywhere. Hope you liked it.

 
The Fun Pass Is An Awesome Deal

At $2, the Fun Pass is an awesome deal. At $1.50? Unbeatable.

I guess Caine should've worked out a harder-to-crack checksum.

 
The End of Expert Labs, The Start of Something New

Gina and Anil both announced this already, but I was so busy wrapping up loose ends, I didn't get around to my announcement.

Short version: Expert Labs — the non-profit I've worked on for the last 18 months — is over. Gina and Anil are rebooting ThinkUp into a commercial entity, but I've decided to move on. I'll continue to act as a ThinkUp advisor, and have already started work on two brand new, soon-to-be-announced projects.


A Quick Review

I worked on a whole bunch of stuff while at Expert Labs, but it took on two themes: bringing ThinkUp to a new audience, and analysis of the data we collected. Since most of this work wasn't high-visibility outside of the existing ThinkUp community, here's a quick roundup.

Outreach. It's the first time in my career I've ever worked with self-hosted software, and I spent quite a bit of energy trying to help people understand why they'd want to use ThinkUp and make it as easy as possible to get it installed. It's hard enough to get people to sign up with a new web service, but one that requires you to install it on your own web server? Damn hard.

Part of this was marketing: I produced two promo videos, showing off the capabilities of the app at different stages. The first video was overly long, too detailed, and a bit cheesy. With the second, I cut out all the crap and asked Clay to narrate a tight, 74-second elevator pitch for why ThinkUp is an essential utility. If you've never seen it, take a minute to watch.

Unfortunately, offering a hosted version ourselves was never an option. As a nonprofit, it would have been irresponsible for us to archive people's social media activity and then disappear when funding dried up. Instead, we tried to make installation as simple as possible.

My first attempt was just getting it up and running on EC2, and making that process as easy as possible with a step-by-step tutorial. Later, I replaced that with the ThinkUp Launcher, a one-click installer that booted a custom EC2 instance with ThinkUp preinstalled. I released the code on Github, so any open-source project could easily make their own launcher.

Finally, in December, a commercial service appeared that offered drop-dead simple ThinkUp hosting. We worked with PHP Fog, a Portland-based cloud hosting company, to support a one-click ThinkUp jumpstart. Here's the screencast I made, showing off how to get up-and-running in seconds.

To help expand the reach of the app, I worked with Mule Design to figure out what ThinkUp does well, what it could do better, and incorporate those learnings to redesign the next version of ThinkUp. Elements of the redesign have already made their way into ThinkUp 1.0, and will guide later versions of the app.


Analysis. Whether it was making charts, building mashups, or crunching data, I spent quite a bit of effort trying to make sense out of the incredible amount of data being collected by ThinkUp.

I showed off the ThinkUp API with ThinkBack, an open-source mashup that extracted entities from your Twitter history to make a time machine of the people, places, and things in your past.

I analyzed Twitter reactions to 2011 and 2012 State of the Union speeches, as well as the White House's Twitter Town Hall, releasing datasets for each. I even made my first, and only, linkbait infographic summing up the White House's Year in Review on Twitter.

One of the biggest projects I created was the Federal Social Media Index, which used ThinkUp to gather activity from 125 federal agencies on Twitter, and try to measure their engagement for the questions they ask using some simple metrics. The response was great, showing how much interest there is for additional tools in that world.

Over the last few weeks, I've adapted it to use the ThinkUp API and will be open-sourcing the results soon to use on your own projects.

Overall, working with Expert Labs was fascinating for me. I'd never worked with government before, and was able to work with motivated and passionate teams from the White House down to local city government. It was an eye-opening experience, and I learned a ton about cultivating an open-source community, the challenges facing state and federal government agencies, and distributing hosted software. Best of all, I was able to do it all while working with three friends I deeply respect: Gina Trapani, Anil Dash, and Clay Johnson.


The Future

Expert Labs may be ending, but ThinkUp is just getting started. It'll continue to be free and open-source, and Gina and Anil are spinning ThinkUp off into a commercial entity, using the open-source base to create a new media property. You can read more about their plans on their Knight News Challenge application on Tumblr, which you should totally like and reblog. (The number of votes factors into the Knight Challenge judging!)

And me? I'll be doing new stuff, like always. I'm still writing my weekly Wired column, working on Playfic, and thinking about big future projects.

I've started working on two unannounced projects simultaneously that I'm crazy excited about. Both have to do with this problem: how do you use technology to connect people together in new ways, and help people make a living doing what they love? It's a running theme through everything I've ever worked on, and I'll be writing much more about them soon.

For the first time in a very long time, I'm also open to hearing about new opportunities. If you're working on anything along these lines and want help, get in touch!

 
Flashback Trojan Creators Scared of Xcode, But Not Norton Antivirus

On Wednesday, a Russian antivirus firm announced that over 600,000 Macs were infected with the Flashback trojan, exploiting a Java vulnerability to create the first significant malware infection in OS X history.

If you're running a botnet, the goal is to avoid detection for as long as possible. Flashback took an interesting approach to hiding itself — if one of several popular antivirus or monitoring tools is detected, it immediately deletes itself. Merely installing a utility like Avast, Clam Antivirus, Little Snitch or HTTP Scoop was enough to protect you, even if you didn't keep them running.

Funny enough, major commercial antivirus utilities like Norton Antivirus, McAfee VirusScan, and F-Secure weren't included in the blacklist. It seems the Flashback authors aren't afraid of the effectiveness of those utilities or, maybe, the technical expertise of their customers.

From the threat description:

On execution, the malware checks if the following path exists in the system:
/Library/Little Snitch
/Developer/Applications/Xcode.app/Contents/MacOS/Xcode
/Applications/VirusBarrier X6.app
/Applications/iAntiVirus/iAntiVirus.app
/Applications/avast!.app
/Applications/ClamXav.app
/Applications/HTTPScoop.app
/Applications/Packet Peeper.app

If any of these are found, the malware will skip the rest of its routine and proceed to delete itself.
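To make the quoted check concrete, here is a minimal Python sketch that tests the same eight paths on the local machine. It only mirrors the list from the threat description; it is not a detection or removal tool, and the names `FLASHBACK_CHECKED_PATHS` and `find_checked_paths` are invented for this example.

```python
import os
from typing import Callable, List

# Paths copied from the threat description quoted above.
FLASHBACK_CHECKED_PATHS = [
    "/Library/Little Snitch",
    "/Developer/Applications/Xcode.app/Contents/MacOS/Xcode",
    "/Applications/VirusBarrier X6.app",
    "/Applications/iAntiVirus/iAntiVirus.app",
    "/Applications/avast!.app",
    "/Applications/ClamXav.app",
    "/Applications/HTTPScoop.app",
    "/Applications/Packet Peeper.app",
]


def find_checked_paths(exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Return the listed paths that are present, in their original order.

    The `exists` argument is injectable so the function can be exercised
    without touching the real filesystem.
    """
    return [path for path in FLASHBACK_CHECKED_PATHS if exists(path)]


if __name__ == "__main__":
    found = find_checked_paths()
    if found:
        print("Present (per the description, the malware would delete itself):")
        for path in found:
            print("  " + path)
    else:
        print("None of the listed paths are present.")
```

An empty result says nothing about whether a machine is infected; it only means none of the listed tools are installed at those locations.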

Note the presence of Xcode, Apple's IDE for Mac and iOS development. To a virus author, the presence of development tools like Xcode is a perfect indicator of a tech-savvy user... the kind of person most likely to detect your work.

If you want to stay safe, or see if you were infected, Macworld has the best roundup.

 
Crate-Digging Through YouTube

I love when I'm crate-digging through the weird part of YouTube and stumble on something truly amazing, seen only by a handful of other people. Just now, I was looking for the redneck bar scene from 48 Hrs. and found this:

It's the opening titles for 48 Hours of Hallucinatory Sex (originally "48 Horas de Sexo Alucinante"), a 1987 trash/sexploitation film from Brazil. (Don't worry, the clip's safe for work.)

Everything about this video is amazing, from the face-melting porno synth to the Amstrad-like scrolling fonts. (You can see the blinking cursor!) With the VHS warble, it sounds like an unreleased track straight off of DJ Shadow's Endtroducing... I couldn't find any information about the soundtrack online, but would love to hear more.

The sequel to a 1985 movie called 24 Hours of Explicit Sex, the plot of 48 Hours is totally meta: a sex psychologist sees the original film and hires the original cast and crew to make her own. It's like the '80s porno version of The Human Centipede 2: Full Sequence, where a psychopath is inspired to recreate the events of The Human Centipede using the real-life actors from the film.


The last time I stumbled on anything this funky, it was this scene from low-budget indie comedy Apple Pie from 1976, that ends with this insane 15-minute-long choreographed dance sequence set on the streets of 1970s NYC. And the music? An improvised funk jam by Hall & Oates.

This happens to me every time I go to NYC.

 
Sightseeing 8-Bit Maps with 1-Bit Camera

Spent the morning sightseeing in Google Maps 8-Bit, taking snapshots with my handy 1-Bit Camera.

(Click for larger size.)

 
A Patent Lie: How Yahoo Weaponized My Work

I originally wrote this column over at Wired back on March 13 about my experience with patents at Yahoo, but forgot to republish it here on Waxy.org in my permanent archive.

This article received a bigger response, hands-down, than anything I've written for Wired so far, resting at the top of Techmeme for a full day, with widespread coverage from The Telegraph, The Verge, Fox News, and Business Insider. (That's a good signal you've written something notable: when competing tech magazines start linking to your work.)

Almost two weeks later, I'm still angry but happy that the column ignited such a powerful discussion about the patent issue. I'm especially pleased that "weaponizing patents" is entering the lexicon; articles like these use the phrase without mentioning me at all. Awesome.

For two other perspectives on this issue, I enjoyed Mark Cuban's linkbait take and Fred Wilson's short, furious rant.

Anyway, if you hadn't seen it, I hope you enjoy it.


While most of the tech world was partying at South by Southwest in Austin yesterday, Yahoo announced it was filing a lawsuit against Facebook for allegedly infringing on 10 patents from their 1,000+ patent warehouse.

I'm no fan of Facebook, but this is a deplorable move. It's nothing less than extortion, expertly timed during the SEC-mandated quiet period before Facebook's IPO. It's an attack on invention and the hacker ethic.

In the interest of full disclosure, I have a small supporting role in this story. None of the patents I co-invented are cited in the Yahoo complaint, but a handful of applications I worked on with Yahoo were granted patents, weaponized now to use against people like me.

Here's how the process worked, in my case:

In 2005, Yahoo acquired Upcoming.org, the collaborative events calendar I'd launched two years before.

Back then, the Web 1.0 behemoth seemed on the verge of turning things around. A series of smart moves (high-profile hires, the Oddpost and Flickr acquisitions, the launch of the Yahoo! Developer Network, and their Research Lab) was breathing new life into things. Two months after we were acquired, Del.icio.us and Webjay joined us in the Yahoo fold.

After we moved in, we were asked to file patents for anything and everything we'd invented while working on Upcoming.org. Every Yahoo employee was encouraged to participate in their "Patent Incentive Program," with sizable bonuses issued to everyone who took the time to apply.

Now, I've always hated the idea of software patents. But Yahoo assured us that their patent portfolio was a precautionary measure, to defend against patent trolls and others who might try to attack Yahoo with their own holdings. It was a cold war, stockpiling patents instead of nuclear arms, and every company in the valley had a bunker full of them.

Against my better judgement, I sat in a conference room with my co-founders and a couple of patent attorneys and told them what we'd created. They took notes and created nonsensical documents that I still can't make sense of. In all, I helped Yahoo file eight patent applications.

Years after I left, I discovered to my dismay that four of them were granted by the U.S. Patent and Trademark Office.

I thought I was giving them a shield, but turns out I gave them a missile with my name permanently engraved on it.

I was naive. Even if the original intention was truly defensive, a patent portfolio can easily change hands, and a company can even more easily change its mind. Since I left in 2007, Yahoo has had three CEOs and a board overhaul.

The scary part is that even the most innocuous patent can be used to crush another's creativity. One of the patents I co-invented is so abstract, it could not only cover Facebook's News Feed, but virtually any activity feed. It puts into very sharp focus the trouble with software patents: Purposefully vague wording invites broad interpretation.

In their complaint, Yahoo alleges that Facebook's News Feed violates "Dynamic page generator," a patent filed in 1997 by their former CTO related to the launch of My Yahoo, one of the first personalized websites. Every web application, from Twitter to Pinterest, could be said to violate this patent. This is chaos.

Software patents should be abolished, plain and simple. Software is already covered by copyright, making patent protection unnecessary.

Ask any programmer — developing software is as creative and unique as writing poetry.

Yahoo's lawsuit against Facebook is an insult to the talented engineers who filed patents with the understanding they wouldn't be used for evil. Betraying that trust won't be forgotten, but I doubt it matters anymore. Nobody I know wants to work for a company like that.

I'm embarrassed by the patents I filed, but I've learned from my mistake. I'll never file a software patent again, and I urge you to do the same.

For years, Yahoo was mostly harmless. Management foibles and executive shuffles only hurt shareholders and employee morale. But in the last few years, the company's incompetence has begun to hurt the rest of us. First, with the wholesale destruction of internet history, and now by attacking younger, smarter companies.

Yahoo tried and failed, over and over again, to build a social network that people would love and use. Unable to innovate, Yahoo is falling back to the last resort of a desperate, dying company: litigation as a business model.

That it's Yahoo makes it even sadder. The complaint isn't really wrong when it asserts that: "For much of the technology upon which Facebook is based, Yahoo! got there first."

But being first with something generic that would have been invented by someone (like the wheel) — as opposed to something few could have imagined (like the Segway) — is a big difference.

Ask any start-up CEO — execution is everything.

As the fictionalized Mark Zuckerberg says in The Social Network, "If you guys were the inventors of Facebook, you'd have invented Facebook."

 
YouTube's Content ID Disputes Are Judged by the Accuser

Last Friday, a YouTube user named eeplox posted a question to the support forums, regarding a copyright complaint on one of his videos. YouTube's automated Content ID system flagged a video of him foraging a salad in a field, claiming the background music matched a composition licensed by Rumblefish, a music licensing firm in Portland, Oregon.

The only problem? There is no music in the video; only bird calls and other sounds of nature.

Naturally, he filed a dispute, explaining that the audio couldn't possibly be copyrighted.

The next day, amazingly, his claim was rejected. Not by YouTube itself (it's unlikely that a Google employee ever saw the claim) but by a representative at Rumblefish, who reviewed the dispute and reported back to YouTube that their impossible copyright for nonexistent music was indeed violated.

Back at YouTube, eeplox found himself at a dead end. YouTube now stated, "All content owners have reviewed your video and confirmed their claims to some or all of its content." No further disputes were possible; the case was closed.

Whether caused by a mistake or malice, Rumblefish was granted full control over eeplox's video. They could choose to run ads on the video, mute the audio, or remove it entirely from the web.


A History of Screw-Ups

On Sunday night, Reddit took notice. Within hours, the thread was on the homepage, commenters were freaking out and, to his credit, Rumblefish CEO Paul Anthony was fielding questions in an IAmA interview until 2:30am.

His argument: One of Rumblefish's Content ID reps made a mistake by denying the dispute, and they released the claim on Sunday night. "We review a substantial amount of claims every day and the number is increasing significantly," said Anthony. "We have millions of videos now using our songs as soundtracks and keeping up is getting harder and harder."

This is the latest in a long series of foibles or outright abuses of YouTube's Content ID system. Content ID was intended to help copyright holders manage the chaos of YouTube. They'd provide copies of their audio and video for analysis, which would then algorithmically match newly-uploaded videos. If a match was found, rightsholders could automatically block the video or, increasingly, claim money from video advertising.

Content ID's monetization was a huge boon for copyright holders. Uploaders could keep their videos online, while copyright holders profited from the creative reuse of their work.

But the last couple of years have seen a dramatic rise in Content ID abuse, with the system being used for purposes it was never intended for. Scammers are using Content ID to steal ad revenue from YouTube video creators en masse, with some companies claiming content they don't own, deliberately or not. The inability to understand context and parody regularly leads to "fair use" videos getting blocked, muted or monetized.


Bypassing the DMCA

The problem is that media companies and scammers are using Content ID as an end run around the DMCA.

Under the DMCA, the process worked like this. A rightsholder could file a claim against a video with YouTube, and YouTube would immediately take the video offline. If there was a mistake, the uploader could file a counter-notice. The video would then be restored by YouTube within 10-14 business days of the counter-notice, unless it went to court.

It wasn't perfect, by any means, but it was fair. Disputes could always be appealed, and both parties were given equal power. And if a claimant lied about owning the copyright to the material in question, they could face perjury charges.

The current system, led by Content ID, tips the balance far in favor of the claimant.

Rumblefish never needed to prove they were the copyright holder, but were still given ultimate control over the video's fate. Uploaders can dispute claims, but the only people reviewing those disputes are the Content ID partners that filed the claims in the first place, who are free to deny them wholesale.
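
To make the contrast concrete, here's a toy model of the two dispute processes as described above. It's a sketch of this article's description only, not YouTube's actual implementation; the function names and outcome strings are made up for illustration.

```python
def dmca_outcome(counter_notice_filed: bool, claimant_goes_to_court: bool) -> str:
    """Fate of a video under the DMCA process as described in this article."""
    if not counter_notice_filed:
        return "offline"            # the takedown stands unchallenged
    if claimant_goes_to_court:
        return "decided in court"   # a neutral third party has the final word
    return "restored"               # restored after the counter-notice window


def content_id_outcome(dispute_filed: bool, claimant_rejects_dispute: bool) -> str:
    """Fate of a video under Content ID as described in this article."""
    if not dispute_filed:
        return "claimed"
    if claimant_rejects_dispute:
        return "claimed"            # the claimant judged its own claim; no appeal
    return "released"


# Same situation in both systems: the uploader objects, the claimant insists.
print(dmca_outcome(counter_notice_filed=True, claimant_goes_to_court=False))
print(content_id_outcome(dispute_filed=True, claimant_rejects_dispute=True))
```

In the first case the claimant has to escalate to keep the video down. In the second, simply saying "no" is enough.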


A Simple Fix

The solution is simple: if a copyright holder wants to pursue a disputed Content ID match, they should file a DMCA claim. That's the only way to guarantee their rights, and make the copyright holder legally responsible for telling the truth.

In fact, this is exactly how YouTube says that Content ID "fair use" claims should work. In practice, this doesn't appear to be true any longer. Content ID partners, of course, can file a DMCA notice at any time, but why bother if they can reject the counter-claims themselves?

(Preferred partners like Universal Music Group can go a step further and block videos directly without filing a claim.)

This problem has been on YouTube's radar for at least two years, but it's only getting worse as unsavory companies discover this nascent business model. Claim copyright on media you may or may not own, and let Content ID do the rest.

By letting Content ID partners have the final word, and not trusting their own users, YouTube is violating its trust with its community and damaging fair use in the process.


Update

I originally published this article over at Wired, where a commenter pointed out that this process may actually violate YouTube's "safe harbor" granted through the DMCA. If they choose to ignore disputes, they're effectively giving content providers an end run around fair use and the DMCA.

Selfish Crab wrote:

It seems like by providing the Content ID system, Youtube was trying to pre-emptively identify copyrighted material, like a first-pass dispute system. Their lawyers probably concluded that so long as the content ID system falls back onto DMCA takedown procedure, they are still in compliance with the DMCA sufficiently to retain their safe harbor.

So if Content ID claim disputes do not fall back onto DMCA takedown, as Andy's article suggests, there's a case to be made that YouTube no longer has liability protection from users. It is a whole another can of worms to analyze what a legal claim against youtube would look like. You'd have to look at the YouTube Terms of Service (i.e., the contract) to see if maybe they contracted around this problem already, you'd have to figure out damages, etc etc. Or I guess you can just raise a shitstorm and that's enough of a moral victory.

In a Google+ comment last December, senior copyright counsel for Google and former EFF staff attorney Fred von Lohmann acknowledged the problem.

Yes, we're aware of that problem in the Content ID dispute process and are looking at what we can do to fix it. It's the result of a complicated collision of how to handle geographically limited Content ID claims, disputes, and global DMCA removals. Turns out to be a hard problem to figure out. But we're thinking on it.

Virginia law student Patrick McKay got in touch with Annie Baxter, a public relations manager at YouTube, about this issue.

This is one of those corner-case outcomes that emerges from several different rules, none of which was intended to yield the result you've encountered (i.e., DMCA takedowns are global, but Content ID ownership claims are territorial). Unfortunately, addressing it YouTube-wide is going to take some time, both for pondering and implementing.

So while we can promise you that we're thinking about this, we can't promise you a fix or time-table. And feel free to tell the OVC we're looking at it and trying to come up with something.

In the meantime, anyone in the Content ID program is given free rein to claim copyright on your videos and profit directly from them. I'm hoping this gets cleared up soon.

 
Introducing Playfic

So, I made a weird new thing with my 15-year-old nephew, Cooper McHatton. It's experimental and has lots of rough edges, but quite frankly, I'm tired of working on it, so here you go.

Playfic is a community for writing, sharing, and playing interactive fiction games (aka "text adventures") entirely from your browser, using a "natural language"-inspired language called Inform 7.


Inform 7 is incredibly awesome and weird. For example, this is a fully functional game:

East of the Garden is the Gazebo. Above is the Treehouse. A billiards table is in the Gazebo. On it is a trophy cup. A starting pistol is in the cup. In the Treehouse is a container called a cardboard box.

Type that into Playfic, and you end up with this simple game, ready to send to the world.


The official documentation is extensive, with a great manual and recipe book. I've collected a list of resources to help you get started.

For now, there's very little documentation on Playfic itself, but you can click the "View game source" link on every game to see how it was made, and Cooper's adding sample games from the official Recipe Book.


My hope is that Playfic opens up the world of interactive fiction to a much wider audience — young writers, fanfic authors, and culture remixers of all ages.

While the language can be tricky, building simple games is surprisingly easy. Cooper had never coded anything or made a game before trying Playfic, and within 30 minutes of futzing around, he'd made his first game.

Some stuff is broken and missing, but I'd love to hear what you make of it. Open to any and all feedback. Go make some games!

 
The Perpetual, Invisible Window Into Your Gmail Inbox

The other day, I tried out Unroll.me, a clever new service that reads your inbox to let you unsubscribe from mailing lists and other unwanted e-mail flotsam with a single click.

As I was about to connect my Gmail account, my finger hovered over the "Grant access" button.

Wait a second. Who am I giving access to my Gmail account, anyway? There was no identifying information on their site: no company address, no team page listing the names of its team members, and only broken links to their privacy policy and terms of service.

For all I knew, it could be run by unscrupulous spammers or an Anonymous troll looking for lulz. And I was about to give them unfettered access to eight years of my e-mail history and, with password resets, the ability to access any of my online accounts?

I had to dig around online to find out who's behind it, and fortunately, Unroll.me is a totally legit NYC-based startup providing a useful service. I spoke to Perri Blake Gorman, Unroll.me's cofounder and CMO, who assured me they'll add all the company information as they roll out their public beta.

But since Gmail added OAuth support in March 2010, an increasing number of startups are asking for a perpetual, silent window into your inbox.

I'm concerned OAuth, while hugely convenient for both developers and users, may be paving the way for an inevitable privacy meltdown.


The Road to OAuth

For most of the last decade, alpha geeks railed against "the password anti-pattern," the common practice of web apps prompting for your password to a third-party service, usually to scrape your e-mail address book to find friends on a social network. It was insecure and dangerous, effectively training users how to be phished.

The solution was OAuth, an open standard that lets you grant permission for one service to connect to another without ever exposing your username or password. Instead of passwords getting passed around, services are issued a token they can use to connect on your behalf.
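
Here's a minimal sketch of that idea: a toy, in-memory token store. It is not the OAuth protocol itself (there are no redirects, scopes, or signatures), and the `TokenStore` class and its method names are hypothetical, but it shows how a token can stand in for a password and be revoked per app.

```python
import secrets


class TokenStore:
    """Toy model of a provider that hands out per-app access tokens."""

    def __init__(self):
        self._grants = {}  # token -> (user, app)

    def grant(self, user: str, app: str) -> str:
        """The user approves an app; the app gets a random token, never the password."""
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, app)
        return token

    def user_for(self, token: str):
        """Return the user a token acts for, or None if it is unknown or revoked."""
        grant = self._grants.get(token)
        return grant[0] if grant else None

    def revoke_app(self, user: str, app: str) -> None:
        """The user disconnects one app; other apps' tokens keep working."""
        self._grants = {t: g for t, g in self._grants.items() if g != (user, app)}


store = TokenStore()
mail_token = store.grant("alice", "inbox-organizer")
backup_token = store.grant("alice", "backup-service")

store.revoke_app("alice", "inbox-organizer")
print(store.user_for(mail_token))    # None: this app is cut off
print(store.user_for(backup_token))  # alice: this one still has access
```

Note that the token works for whoever holds it until it's revoked. The store never asks who is presenting it.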

If you've ever granted permission for a service to use your Twitter, Facebook, or Google account, you've used OAuth.

This was a radical improvement. It's easier for users, taking a couple of clicks to authorize accounts, and passwords are never sent insecurely or stored by services that shouldn't have them. And developers never have to worry about storing or transmitting private passwords.

But this convenience creates a new risk. It's training people not to care.

It's so simple and pervasive that even savvy users have no issue letting dozens of new services access their various accounts.

I'm as guilty as anyone, with 49 apps connected to my Google account, 80 to Twitter, and over 120 connected to Facebook. Others are more extreme. My friend Sam is a developer at Kickstarter, and he authorized 148 apps to use his Twitter account. Anil counted 88 apps using his Google account, with nine granted access to Gmail.

For Twitter, the consequences are unlikely to be serious since almost all activity is public. For Facebook, a mass leak of private Facebook photos could certainly be embarrassing.

But for Gmail, I'm very concerned that it opens a major security flaw that's begging to be exploited.


The Privacy Danger

A long list of services, large and small, request indefinite access to your Gmail account.

I asked on Twitter and Google+ for people to check their Google app permissions to see who they've granted Gmail access to. The list includes a range of inbox organizers, backup services, email utilities, and productivity apps: TripIt, Greplin, Rapportive, Xobni, Gist, OtherInbox, Unsubscribe, Backupify, Blippy, Threadsy, Nuevasync, How's My Email, ToutApp, ifttt, Email Game, Boomerang, Kwaga, Mozilla F1, 0boxer, Taskforce, and Cloudmagic.

Once granted, all of these services are issued a token that gives unlimited access to your complete Gmail history. And that's where the danger lies.

You may trust Google to keep your email safe, but do you trust a three-month-old Y Combinator-funded startup created by three college kids? Or a side project from an engineer working in his 20 percent time? How about a disgruntled or curious employee of one of these third-party services?

Any of these services becomes the weakest link to access the e-mail for thousands of users. If one's hacked or its list of tokens leaked, everyone who ever used that service risks exposing their complete Gmail archive.

The scariest thing? If the third-party service doesn't discover the hack or chooses not to invalidate its tokens, you may never know you're exposed.

In the past, Gmail's issued security warnings to accounts being accessed from multiple IP addresses. I spoke to OtherInbox founder Joshua Baer, and he said that Google's eased up on the warnings because of the prevalence of third-party services.

It's entirely possible for someone with a stolen token to read, search, and download all your mail to their server for months, and you'd never find out unless they exposed themselves, or you were diligently auditing your "Last account activity" history.


Stay Safe

Clearly, we're not going to stop using awesome new utilities just because there's a privacy risk. But there are best practices you can follow to stay safe.


  • Clean up your app permissions. The best thing you could do, right now, is to log into each service you care about and revoke access to the apps you no longer use or care about, especially those that have access to Gmail. Finding the permissions pages can be tricky, but the nice folks at MyPermissions.org made a handy dashboard linking to every one.

  • Think before you authorize. Before authorizing an account, find out who you're granting access to. Look for a staff page and a contact address, and take a look at the privacy policy to make sure they're not sharing your info with third parties or selling it to them. Bonus points if they outline their security policies and offer a way to disconnect the service from within the app. If anything seems off, don't do it.

  • When in doubt, change your password. Have a feeling that someone might be reading your mail, but not sure which app is to blame? Changing your password instantly invalidates all your Google and Facebook OAuth tokens, though Twitter tokens persist after password changes.

Google could improve, as well. Their permissions page is too hard to find, even for experienced users, and it's impossible to see which apps have accessed your account recently.

Facebook does an excellent job with this, but Google only shows you the IP address and the protocol it used to connect. Surfacing this information, as a periodic e-mail or on-site notification, would go a long way to averting a potential disaster.


The Greatest Troll of All

So, I originally published everything above over on my Wired column yesterday, but I left off something else I've been thinking about.

While I think a compromised database is the most likely scenario, there's another possibility that disturbs me more.

Imagine that a brand new service pops up, offering a simple, fun service that uses your Gmail account. Maybe a neat visualization like Tout's Year in Review, or maybe something more practical like sending all your attachments to Dropbox.

But it's all just a giant troll, where the app's creators are silently running targeted searches, downloading your mail, and looking for compromising photos and sensitive documents behind-the-scenes. They could collect the documents for months or years, and then release it all online in an anonymous blast. Lulz!

You'd likely never find out where the data came from, and the perpetrators would never be caught. Hell, if you've Gmail-authed a questionable app, this could be happening to you right now and you'd never know. Whee!

 
Pirating the Oscars 2012: Ten Years of Data

Every year, the MPAA tries desperately to stop Oscar screeners -- the review copies sent to Academy voters -- from leaking online. And every year, teenage boys battling for street cred seem to defeat whatever obstacles Hollywood throws at them.

For the last 10 years, I've tracked the online distribution of Oscar-nominated films, going back to 2003. Using a number of sources (see below for methodology), I've compiled a massive spreadsheet, now updated to include 310 films.

This year, for the first time, I'm calling it: after three years of declines, the MPAA seems to be winning the battle to stop screener leaks. But why?

A record 37 films were nominated this year, and the studios sent out screeners for all but four of them. But, so far, only eight of those 33 screeners have leaked online, a record low that continues the downward trend from last year.

(Disclaimer: Any of this could change before the Oscar ceremony, and I'll keep the data updated until then.)

They may be winning the battle, but they've lost the war.

While screeners declined in popularity, 34 of the nominated films (92 percent) were leaked online by nomination day, with 25 of them available as high-quality DVD or Blu-ray rips. Only three films -- Extremely Loud & Incredibly Close, My Week with Marilyn and W.E. -- haven't leaked online in any form (yet!).
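
For anyone checking the math, here's a quick sketch using only the counts quoted above. The variable names are mine, and percentages are rounded to the nearest whole number.

```python
def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


nominated = 37         # films nominated this year
leaked_any = 34        # leaked online in some form by nomination day
retail_rips = 25       # available as DVD or Blu-ray rips
screeners_sent = 33    # nominees for which screeners went out
screeners_leaked = 8   # screeners that leaked

print(percent(leaked_any, nominated))             # 92
print(percent(retail_rips, nominated))            # 68
print(percent(screeners_leaked, screeners_sent))  # 24
print(nominated - leaked_any)                     # 3 films not leaked at all
```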

If the goal of blocking leaks is to keep the films off the internet, then the MPAA still has a long way to go.

There are a number of theories about what's causing the decline.

It could be attributed to tighter controls -- personalized watermarks, the aggressive prosecution of leakers, and greater awareness of the risks for Academy voters.

But the MPAA may have little to do with the decline. Oscar-nominated films could be coming out earlier in the year, making screeners less important.

Or maybe the interests of mainstream downloaders and the industry's favorites are diverging? If the Oscar nominees are mostly arthouse fare and critical darlings with low gross receipts, they'll be less desirable to leak online. It would be very interesting to track the historical box office performance of nominees to see how it affects downloading. (Maybe next year!)

The continuously shrinking window between theatrical and retail releases may be to blame. After all, once the retail Blu-ray or DVD is released, there's no reason for pirate groups to release a lower-quality watermarked screener.

The chart below tracks the window between each film's U.S. release and its first DVD/Blu-ray leak online. It shows how the gap between theatrical and retail release dates has been slowly closing since 2003.

Whatever the reason, online movie releasing groups are taking longer than ever to pirate movies. When I first started tracking releases in the early- to mid-2000s, the median time between a film's theatrical release and its first leak online was 1 to 2 days. Now, that number's crept up to over three weeks.

The rise in leak time correlates with a dip in popularity for lower-quality sources, like camcorder-sourced footage. This year, only eight of the 37 nominees (about 22 percent) were sourced from camcorder footage. (This is likely because there are fewer blockbuster nominees than in the mid-2000s.)

As the industry slowly transitions from physical media to streaming video, it'll be interesting to see if the downward trend continues, or if the ease of capturing streaming video spawns a new renaissance for screeners. Last year, Fox Searchlight distributed screeners with iTunes, and all were quickly and easily pirated.


The Data Dump

Skeptical of my results? Want to dig into it yourself? Good! Here's the complete dataset, available on Google Spreadsheets or downloadable as an Excel spreadsheet or comma-separated text file.
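
If you grab the comma-separated file, something like this rough sketch would compute the median gap between theatrical release and first leak. The column names (`title`, `us_release`, `first_leak`) and the sample rows are placeholders I made up for illustration, not the real spreadsheet's headers or data, so adjust them to match the actual file.

```python
import csv
import io
from datetime import date
from statistics import median

# Made-up placeholder rows; swap in the contents of the real CSV export.
SAMPLE = """title,us_release,first_leak
Film A,2011-11-01,2011-11-20
Film B,2011-12-09,2012-01-05
Film C,2011-10-14,2011-10-30
Film D,2011-12-25,
"""


def median_leak_lag(csv_text: str) -> float:
    """Median number of days from U.S. release to first leak, skipping unleaked films."""
    lags = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["first_leak"]:
            continue  # no leak recorded for this film
        released = date.fromisoformat(row["us_release"])
        leaked = date.fromisoformat(row["first_leak"])
        lags.append((leaked - released).days)
    return median(lags)


print(median_leak_lag(SAMPLE))  # 19 for the placeholder rows above
```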


Methodology

I include the full-length feature films in every category except documentary and foreign films (even music, makeup, and costume design).

I use Yahoo! Movies for the release dates, always using the first available U.S. date, even if it was a limited release, and falling back to the first available U.S. date in IMDb.

All the cam, telesync, and screener leak dates are taken from VCD Quality, supplemented by dates in ORLYDB. I always use the first leak date, excluding unviewable or incomplete nuked releases.

The official screener release dates are from Academy member Ken Rudolph, who kindly lists the dates he receives each screener on his personal homepage. Thanks again, Ken!

For previous years, see 2004, 2005, 2007, 2008 (part 1 and part 2), 2009, 2010, and 2011.

 
Why SOPA and PIPA Must Die

Today, you're going to hear a million solid reasons why SOPA and PIPA -- the two proposed bills sponsored by the entertainment industry to censor the web -- have to die. Wikipedia, Google, Reddit, craigslist, Metafilter, and many, many more have made their cases. Here's mine.

Virtually every project I've ever worked on is threatened by this legislation:

Upcoming.org faced copyright complaints for event posters and listings that users added to the site.

Kickstarter gets DMCA takedowns from artists who find their work used in pitch videos, and from project founders quarreling with each other.

Supercut.org indexes hundreds of video remixes that reuse copyrighted content.

Kind of Bloop faced a lawsuit over the cover art.

And here on Waxy.org, I've had a number of battles over copyright. Among them, I received a cease-and-desist from EMI for being the first person to host DJ Danger Mouse's Grey Album on the web, from Disney for hosting the Kleptones' Night at the Hip-Hopera, and from Bill Cosby for hosting House of Cosbys, which was clearly fair use as a parody.


No cease-and-desist or DMCA request I've received was fun to get in my inbox, but each one allowed me to deal with the issue directly with the copyright holder or through the due process of the court system.

Imagine, instead, a world where a bill like SOPA or PIPA passes. A copyright holder could bypass due process entirely, demanding that search engines stop linking to my sites, that ad providers drop me, and that DNS providers stop resolving my domain name. All in the name of stopping piracy.

The chilling effect would be huge.

Every online community that allows for community-contributed content -- discussion forums, imageboards, Usenet newsgroups, photo sharing communities, video sites, and many more -- would be forced to pre-emptively self-censor, shut down, or risk getting blown off the net entirely.

That fucking sucks.


Everything I love about the web requires the unfettered freedom to build new ways to let people express themselves, and with that, comes the risk of copyright infringement.

Breaking the web isn't a solution.

Please take 10 minutes today to call your representatives -- or show up in person! -- and let them know you won't stand for this. SOPA and PIPA must die.

 
No Copyright Intended

On October 26, a YouTube user named crimewriter95 posted a full-length version of Pulp Fiction, rearranged in chronological order.


A couple things struck me about this video.

First, I'm surprised that a full-length, 2.5-hour very slight remix of a popular film can survive on YouTube for over six weeks without getting removed. Now that it's on Kottke and Buzzfeed, I'm guessing it won't be around for much longer.

But I was just as amused by the video description:

"The legendary movie itself placed into chronological order. If you'd like me to put the full movie itself up, let me know and I'll be glad to oblige. Please no copyright infringement. I only put this up as a project."

These "no copyright infringement intended" messages are everywhere on YouTube, and about as effective as a drug dealer asking if you're a cop. It's like a little voodoo charm that people post on their videos to ward off evil spirits.

How pervasive is it? There are about 489,000 YouTube videos that say "no copyright intended" or some variation, and about 664,000 videos have a "copyright disclaimer" citing the fair use provision in Section 107 of the Copyright Act.

Judging by his username, I'm guessing crimewriter95 is 16 years old. I wouldn't be surprised if most of those million videos were uploaded by people under 21.

He's hardly alone. On YouTube's support forums, there's rampant confusion over what copyright is. People are genuinely confused that their videos were blocked even with a disclaimer, and confused that audio was removed even though there was no "intentional copyright infringement." Some ask for the best wording of a disclaimer, not knowing that virtually all blocking is done by Content ID without human intervention.

YouTube's tried to combat these misconceptions with its Copyright School, but it seems futile. For most people, sharing and remixing with attribution and no commercial intent is instinctually a-okay.

Under current copyright law, nearly every cover song on YouTube is technically illegal. Every fan-made music video, every mashup album, every supercut, every fanfic story? Quite probably illegal, though largely untested in court.

No amount of lawsuits or legal threats will change the fact that this behavior is considered normal — I'd wager the vast majority of people under 25 see nothing wrong with non-commercial sharing and remixing, or think it's legal already.


Here's a thought experiment: Everyone over age 12 when YouTube launched in 2005 is now able to vote.

What happens when — and this is inevitable — a generation completely comfortable with remix culture becomes a majority of the electorate, instead of the fringe youth? What happens when they start getting elected to office? (Maybe "I downloaded but didn't share" will be the new "I smoked, but didn't inhale.")

Remix culture is the new Prohibition, with massive media companies as the lone voices calling for temperance. You can criminalize commonplace activities by law-abiding people, but eventually, something has to give.

Update, February 11: Everybody's singing the YouTube Disclaimer Blues.