As sci-fi author William Gibson once said, “The future is already here—it’s just not very evenly distributed.” It shows up in some places before others. And one of the places this particular aspect of the future has shown up first, oddly enough, is the Coca-Cola Village Amusement Park, a holiday village, theme park, and marketing event that opens seasonally in Israel. At the park, which was sponsored by Facebook and Coke, teenagers attending in the summer of 2010 were given bracelets containing a tiny piece of circuitry that allowed them to Like real-world objects. Wave the bracelet at the entrance to a ride, for example, and a status update posted to your account announces that you’re about to embark. Take a picture of your friends with a special camera, wave the bracelet at it, and the photo is posted with your identity already tagged.
Embedded in each bracelet is a radio-frequency identification (RFID) chip. RFID chips don’t need batteries, and there’s only one way to use them: call-and-response. Provide a little wireless electromagnetic power, and the chip chirps out a unique identifying code. Correlate the code with, say, a Facebook account, and you’re in business. A single chip can cost as little as $.07, and they’ll cost far less in the years to come.
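The correlation step is simple enough to sketch in a few lines of Python. Everything below is hypothetical: the chip IDs, the account table, and the posting function are stand-ins for illustration, not the park’s or Facebook’s actual system.

```python
# Minimal sketch of the "correlate the code with an account" step described above.
# The tag IDs, the ACCOUNTS table, and post_status are hypothetical stand-ins;
# real RFID readers and the Facebook API are not modeled here.

# A pretend lookup table mapping each bracelet's unique chip ID to a user account.
ACCOUNTS = {
    "04:A3:2B:91:7F": "dana.levi",
    "04:77:C0:12:9E": "yoni.katz",
}

def post_status(account: str, message: str) -> None:
    """Stand-in for posting a status update to the user's account."""
    print(f"[{account}] {message}")

def on_tag_read(chip_id: str, location: str) -> None:
    """Called each time a reader powers up a bracelet and receives its ID."""
    account = ACCOUNTS.get(chip_id)
    if account is None:
        return  # unknown bracelet: ignore it
    post_status(account, f"is about to ride {location}")

# A reader at a ride entrance reports a chip it just powered up.
on_tag_read("04:A3:2B:91:7F", "the roller coaster")
```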
Suddenly it’s possible for businesses to track each individual object they make across the globe. Affix a chip to an individual car part, and you can watch as the part travels to the car factory, gets assembled into a car, and makes its way to the show floor and then someone’s garage. No more inventory shrinkage, no more having to recall whole models of products because of the errors of one factory.
Conversely, RFID provides a framework by which a home could automatically inventory every object inside it—and track which objects are in which rooms. With a powerful enough signal, RFID could be a permanent solution to the lost-keys problem—and bring us face-to-face with what Forbes writer Reihan Salam calls “the powerful promise of a real world that can be indexed and organized as cleanly and coherently as Google has indexed and organized the Web.”
This phenomenon is called ambient intelligence. It’s based on a simple observation: The items you own, where you put them, and what you do with them are, after all, great signals about what kind of person you are and what kind of preferences you have. “In the near future,” writes a team of ambient intelligence experts led by David Wright, “every manufactured product—our clothes, money, appliances, the paint on our walls, the carpets on our floors, our cars, everything—will be embedded with intelligence, networks of tiny sensors and actuators, which some have termed ‘smart dust.’”
And there’s a third set of powerful signals that is getting cheaper and cheaper. In 1990, it cost about $10 to sequence a single base pair—one “letter”—of DNA. By 1999, that number had dropped to $.90. In 2004, it crossed the $.01 threshold, and now, as I write in 2010, it costs one ten-thousandth of a cent. By the time this book comes out, it’ll undoubtedly cost far less. At some point mid-decade, we ought to be able to sequence any random whole human genome for less than the cost of a sandwich.
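To put those per-letter prices in perspective, here is a back-of-the-envelope calculation of what a whole genome would cost at each of them. The genome size used, roughly three billion base pairs, is an assumption added for illustration and not a figure from the passage.

```python
# Back-of-the-envelope arithmetic for the per-genome cost implied by the
# per-base-pair prices above. The genome size (~3 billion base pairs) is an
# assumed figure for illustration only.

BASE_PAIRS_PER_GENOME = 3_000_000_000  # rough size of a human genome (assumption)

cost_per_base_pair = {
    1990: 10.0,           # about $10 per base pair
    1999: 0.90,           # ninety cents
    2004: 0.01,           # one cent
    2010: 0.01 / 10_000,  # one ten-thousandth of a cent
}

for year, price in cost_per_base_pair.items():
    print(f"{year}: about ${price * BASE_PAIRS_PER_GENOME:,.0f} per whole genome")
```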
It seems like something out of Gattaca, but the allure of adding this data to our profiles will be strong. While it’s increasingly clear that our DNA doesn’t determine everything about us—other cellular information sets, hormones, and our environment play a large role—there are undoubtedly numerous correlations between genetic material and behavior waiting to be found. It’s not just that we’ll be able to predict and avert upcoming health issues with far greater accuracy—though that alone will be enough to get many of us in the door. By adding together DNA and behavioral data—like the location information from iPhones or the text of Facebook status updates—an enterprising scientist could run statistical regression analysis on an entire society.
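For a sense of what such an analysis might look like in miniature, here is a toy regression in Python. Every number in it is synthetic, and the variables (allele counts, a made-up behavioral score) are hypothetical stand-ins; it sketches the technique, not any real study.

```python
# Toy illustration of regressing a behavioral measure against genetic markers.
# All data is synthetic; nothing here reflects a real dataset or result.
import numpy as np

rng = np.random.default_rng(0)

n_people, n_markers = 1_000, 20
genotypes = rng.integers(0, 3, size=(n_people, n_markers)).astype(float)  # 0/1/2 allele counts
true_effects = rng.normal(0, 0.1, size=n_markers)

# A made-up behavioral signal (say, check-ins per week): mostly noise plus a small genetic term.
behavior = genotypes @ true_effects + rng.normal(0, 1.0, size=n_people)

# Ordinary least squares: estimate how strongly each marker correlates with the behavior.
X = np.column_stack([np.ones(n_people), genotypes])  # add an intercept column
coeffs, *_ = np.linalg.lstsq(X, behavior, rcond=None)

print("estimated marker effects:", np.round(coeffs[1:], 3))
```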
In all this data lie patterns yet undreamed of. Properly harnessed, it will fuel a level of filtering acuity that’s hard to imagine—a world in which nearly all of our objective experience is quantified, captured, and used to inform our environments. The biggest challenge, in fact, may be thinking of the right questions to ask of these enormous flows of binary digits. And increasingly, code will learn to ask these questions itself.
In December 2010, researchers at Harvard, Google, Encyclopædia Britannica, and the American Heritage Dictionary announced the results of a four-year joint effort. The team had built a database spanning the entire contents of over five hundred years’ worth of books—5.2 million books in total, in English, French, Chinese, German, and other languages. Now any visitor to Google’s “Ngram Viewer” page can query it and watch how phrases rise and fall in popularity over time, from neologism to the long fade into obscurity. For the researchers, the tool suggested even grander possibilities—a “quantitative approach to the humanities,” in which cultural changes can be scientifically mapped and measured.
The initial findings suggest how powerful the tool can be. By looking at references to earlier dates, the team found that “humanity is forgetting its past faster with each passing year.” And, they argued, the database could serve as “a powerful tool for automatically identifying censorship and propaganda” by flagging countries and languages in which there was a statistically abnormal absence of certain ideas or phrases. Leon Trotsky, for example, shows up far less often in midcentury Russian books than in English or French books from the same period.
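The underlying comparison is simple enough to sketch. The Python snippet below compares a phrase’s relative frequency in two corpora year by year and flags years where one corpus falls unusually far below the other; all of the counts, and the 0.2 threshold, are invented placeholders rather than real Ngram data.

```python
# Sketch of the frequency comparison described above: compare how often a phrase
# appears in two corpora, year by year, and flag years where one corpus shows an
# unusual absence. All counts below are made-up placeholders, not real Ngram data.

# (mentions, total 1-grams) per year, per corpus: hypothetical numbers.
english = {1930: (900, 10_000_000), 1940: (850, 11_000_000), 1950: (800, 12_000_000)}
russian = {1930: (700, 9_000_000),  1940: (60, 10_000_000),  1950: (40, 11_000_000)}

def relative_frequency(counts, year):
    mentions, total = counts[year]
    return mentions / total

for year in sorted(english):
    f_en = relative_frequency(english, year)
    f_ru = relative_frequency(russian, year)
    ratio = f_ru / f_en
    flag = "  <-- abnormally scarce" if ratio < 0.2 else ""
    print(f"{year}: English {f_en:.2e}, Russian {f_ru:.2e}, ratio {ratio:.2f}{flag}")
```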
The project is undoubtedly a great service to researchers and the casually curious public. But serving academia probably wasn’t Google’s only motive. Remember Larry Page’s declaration that he wanted to create a machine “that can understand anything,” which some people might call artificial intelligence? In Google’s approach to creating intelligence, the key is data, and the 5 million digitized books contain an awful lot of it. To grow your artificial intelligence, you need to keep it well fed.
To get a sense of how this works, consider Google Translate, which can now do a passable job translating automatically among nearly sixty languages. You might imagine that Translate was built with a really big, really sophisticated set of translating dictionaries, but you’d be wrong. Instead, Google’s engineers took a probabilistic approach: They built software that could identify which words tended to appear in connection with which, and then sought out large chunks of data that were available in multiple languages to train the software on. One of the largest chunks was patent and trademark filings, which are useful because they all say the same thing, they’re in the public domain, and they have to be filed globally in scores of different languages. Set loose on a hundred thousand patent applications in English and French, Translate could determine that when “word” showed up in the English document, “mot” was likely to show up in the corresponding French filing. And as users correct Translate’s work over time, it gets better and better.
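Here is a toy version of that co-occurrence counting in Python. The aligned English/French pairs are invented, and a production system (Google’s included) is vastly more sophisticated, but the sketch shows the basic move: count which foreign words keep appearing whenever a given English word does.

```python
# Toy co-occurrence counting over aligned English/French text, in the spirit of
# the approach described above. The document pairs are invented examples; real
# statistical translation systems are far more elaborate.
from collections import Counter, defaultdict

aligned_pairs = [
    ("the word is new",          "le mot est nouveau"),
    ("a word in the claim",      "un mot dans la revendication"),
    ("each claim must be clear", "chaque revendication doit être claire"),
]

cooccurrence = defaultdict(Counter)
for english, french in aligned_pairs:
    french_words = french.split()
    for en_word in english.split():
        cooccurrence[en_word].update(french_words)

# For each English word, the French word it co-occurs with most often.
for en_word in ("word", "claim"):
    fr_word, count = cooccurrence[en_word].most_common(1)[0]
    print(f"{en_word!r} most often appears alongside {fr_word!r} ({count} times)")
```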
What Translate is doing with foreign languages Google aims to do with just about everything. Cofounder Sergey Brin has expressed his interest in plumbing genetic data. Google Voice captures millions of minutes of human speech, which engineers are hoping they can use to build the next generation of speech recognition software. Google Research has captured most of the scholarly articles in the world. And of course, Google’s search users pour billions of queries into the machine every day, which provide another rich vein of cultural information. If you had a secret plan to vacuum up an entire civilization’s data and use it to build artificial intelligence, you couldn’t do a whole lot better.