This week I reread the futuristic Berners-Lee article on the Semantic Web (pdf). It describes smart desktop software agents that can arrange appointments, delegate tasks to trusted family members, and even negotiate prices on behalf of their users. These agents depend on semantically tagged Web information -- open data that's been described in a standard way so machines can use it.
Nine years later, depending on your perspective, this compelling vision still seems like mission impossible. Some major websites, such as ibm.com, are now built with validated, standards-driven code and tagged with semantic metadata for machines to repurpose at will. But, in a familiar chicken-and-egg adoption cycle, most organizations with useful online data are still waiting for applications to come along and justify the effort of recoding for the semantic web.
Information architects get the value of semantic data applications, but end users like you and me don't seem to miss them much. I believe we put up with their absence because we already have a secret agent of our own: Agent Google. No one's better at pulling the meaning out of disparate web documents and making information findable. Google doesn't need the semantic web -- they've already semanticized it for their own search results and ad serving.
Of course, there's a big difference between desktop software drawing on open data formats and Agent Google's secret data factories. Perhaps the most significant change in the Web since Berners-Lee wrote his article is the mass migration of personal data into the cloud. Many people now routinely write all their documents, share their schedules, post their photos and videos, save all their email, and map their social lives online. Meanwhile, Facebook Connect and OpenID are bridging your data islands under identifiable accounts.
Agent Google's near-omniscience is no news flash to many of us, but it's worth staying alert for alternatives. The White House's recent adoption of the semantic RDF and OWL standards may lead other organizations to publish their data in agent-friendly formats. Someday soon, you may finally have your own trustworthy semantic agent, with a license to gather data that's for your eyes only.