The Sociology of Web 3.0
This year the World Wide Web is 25 years old. Web 1.0 was made possible by Sir Tim Berners-Lee’s creation of the HTTP protocol, which enabled us to retrieve a copy of a document by accessing its address on a network. Web 2.0 was made by us, the content providers.
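To make that concrete, here’s roughly what ‘retrieving a copy of a document by its address’ looks like in code; a minimal sketch using Python’s standard library, with a placeholder address:

```python
# A minimal illustration of Web 1.0's basic operation: ask for a document
# at a given address (URL) and receive a copy of it over HTTP.
from urllib.request import urlopen

# Placeholder address; any public web page would do.
url = "https://example.org/"

with urlopen(url) as response:                   # send an HTTP GET request
    document = response.read().decode("utf-8")   # receive a copy of the document

print(document[:200])  # show the first few hundred characters of the copy
```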
To realise the semantic web, or Web 3.0, technologists envision everything (documents, data, inanimate objects, even us) having an address on a network, and artificial intelligence having the ability to understand and ‘learn’ relationships between these addressed entities. So Web 3.0 will be created by machines as we become data sources on the network.
Understandably, owing to the scale of its ambition, the semantic web is yet to be realised. Many of the technologies that make it technically feasible do, however, have powerful implications for the way we, and the computers we program, collect, store, use, and share data.
These include linked data and ontologies. Those of you familiar with relational databases will know about pieces of data, well… having relations. For example, if you and your partner are both registered on the vehicle licensing agency’s database, even if you live at different addresses, it is easy for the database to establish that you have a relationship with your partner because your records share your car’s registration number.
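Here’s a toy sketch of that relational idea in code, using Python’s built-in SQLite support; the table and column names are invented for illustration:

```python
# Two licence-holder records that share a car registration number.
# Table and column names are invented for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE keepers (name TEXT, address TEXT, registration TEXT)")
con.executemany(
    "INSERT INTO keepers VALUES (?, ?, ?)",
    [
        ("You", "1 First Street", "AB12 CDE"),
        ("Your partner", "2 Second Street", "AB12 CDE"),
    ],
)

# A self-join on the shared registration number reveals the relationship,
# even though the two records hold different addresses.
rows = con.execute(
    """
    SELECT a.name, b.name, a.registration
    FROM keepers a JOIN keepers b
      ON a.registration = b.registration
    WHERE a.name < b.name
    """
).fetchall()

print(rows)  # [('You', 'Your partner', 'AB12 CDE')]
```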
The aim of linked data, via data-structuring models such as RDF, is to connect and consolidate records that exist in different databases, even data in different formats (if linked data were fully realised, the Web would become one huge distributed database). The query language SPARQL enables us to query across databases by mining linked data. For example, if the vehicle licensing agency and your university stored their data as linked data, it would at least be technically possible, via related records like your car’s registration, to find out which department you work in. You may park your car at the university, which stores your car registration along with your employment records. Theoretically, a search engine could pull all the details about your car and employment in one data grab. Now imagine all the possible connections that could be made using all the digital data stored about you on all the databases out there.
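For the curious, here’s a rough sketch of that idea using Python and the third-party rdflib library; the URIs and property names are invented, but the shape of the thing is right: two separate datasets merge into one graph, and a single SPARQL query walks across both.

```python
# A sketch of the linked-data idea using the third-party rdflib library
# (pip install rdflib). URIs and property names are invented; real datasets
# would use published vocabularies.
from rdflib import Graph

licensing_agency = """
@prefix ex: <http://example.org/> .
ex:you ex:registeredKeeperOf ex:carAB12CDE .
"""

university = """
@prefix ex: <http://example.org/> .
ex:you ex:parksCar ex:carAB12CDE ;
       ex:worksIn  ex:sociologyDept .
"""

# Merging the two datasets into one graph stands in for the Web acting
# as a single distributed database.
g = Graph()
g.parse(data=licensing_agency, format="turtle")
g.parse(data=university, format="turtle")

# One SPARQL query now spans what were two separate databases: from the
# car registration it reaches the department its keeper works in.
query = """
PREFIX ex: <http://example.org/>
SELECT ?person ?dept WHERE {
    ?person ex:registeredKeeperOf ?car .
    ?person ex:parksCar ?car .
    ?person ex:worksIn ?dept .
}
"""
for person, dept in g.query(query):
    print(person, dept)
```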
Ontologies introduce another layer of functionality to this scenario. These are not ontologies as we understand them in sociology; in computer science an ontology is a framework for organising data in a way that gives it meaning. Ontologies allow computers to ‘learn’ the semantics of relationships between data. For the more technically minded, here’s how the BBC are using an ontology to characterise news stories.
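To give a flavour of what such a framework looks like, here’s a toy ontology in the spirit of (but not taken from) the BBC’s news ontologies, again sketched in Python with rdflib; all the names are invented:

```python
# A toy ontology: it declares what kinds of things exist (classes) and how
# they may relate (properties), which is what gives otherwise bare data its
# meaning. All names are invented for illustration.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/news/")
g = Graph()

# Classes: the kinds of thing the ontology talks about.
g.add((EX.NewsStory, RDF.type, OWL.Class))
g.add((EX.Person, RDF.type, OWL.Class))

# A property: news stories may mention people.
g.add((EX.mentions, RDF.type, OWL.ObjectProperty))
g.add((EX.mentions, RDFS.domain, EX.NewsStory))
g.add((EX.mentions, RDFS.range, EX.Person))

print(g.serialize(format="turtle"))
```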
All that can be known in the previous example is that you and your partner share a registration number; computers can’t know or think that this means you both drive that car. An ontology, however, introduces other variables to the data relations. It could define you as a driver, and your partner as a driver, and then state that you both own only one car. The computer would then ‘know’ you both drive the same car. As more variables are added, the computer builds a richer picture until it’s able to infer relationships between data. Given that you and your partner live at separate addresses and the car is a sports car, and add to that your age and a recent trip to Ibiza, the computer could conclude you are having a mid-life crisis! The use of this technology has, therefore, important implications for the way data can be exploited.
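Here’s a hand-rolled sketch of that inference step. In a real semantic-web setup an OWL reasoner would derive the conclusion from the ontology automatically; the rule below is written out by hand, with invented names, purely to show the idea:

```python
# From "you are a driver", "your partner is a driver" and "you are both
# keepers of the same car", the machine concludes that you drive the same
# car. The inference rule is hand-written here for illustration only.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()
g.parse(format="turtle", data="""
@prefix ex: <http://example.org/> .
ex:you     a ex:Driver ; ex:registeredKeeperOf ex:carAB12CDE .
ex:partner a ex:Driver ; ex:registeredKeeperOf ex:carAB12CDE .
""")

# Rule: two distinct drivers who are keepers of the same car drive the same car.
keepers = list(g.triples((None, EX.registeredKeeperOf, None)))
for person_a, _, car_a in keepers:
    for person_b, _, car_b in keepers:
        if (person_a != person_b and car_a == car_b
                and (person_a, RDF.type, EX.Driver) in g
                and (person_b, RDF.type, EX.Driver) in g):
            g.add((person_a, EX.drivesSameCarAs, person_b))

print((EX.you, EX.drivesSameCarAs, EX.partner) in g)  # True
```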
Take, for example, the store club card data collected by DunnHumby. If a customer buys a certain magazine, razors, a ready meal for one, a fitness video, some baby clothes, cigarettes, and multi-pack lager when certain football games are on, the reasoning software can, as it gains knowledge, produce a sophisticated profile of this customer: his social class, relationship status, lifestyle, etc. The more semantic capabilities we add, the more we are able, via machine processing, to reverse engineer aspects of identity. There’s no doubt that we should be worried about the personal data we give away to Facebook et al. for free, but this technology could effectively mean the end of anonymous data. The better semantic A.I. technology becomes, the better it becomes at de-anonymising data.
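A deliberately crude sketch of that kind of profiling, in plain Python; the rules are invented and bear no relation to any retailer’s actual models, the point being only that co-occurring purchases can be machine-processed into a profile:

```python
# Toy rule-based profiling from a shopping basket. The rules are invented
# for illustration; real systems would learn far richer associations.
basket = {"magazine", "razors", "ready meal for one", "fitness video",
          "baby clothes", "cigarettes", "multi-pack lager"}

rules = [
    ({"ready meal for one"}, "possibly lives alone"),
    ({"baby clothes"}, "has a young child in their life"),
    ({"multi-pack lager", "cigarettes"}, "certain lifestyle indicators"),
    ({"fitness video", "razors"}, "personal-care and fitness conscious"),
]

# Each rule fires when its items are a subset of the basket.
profile = [label for items, label in rules if items <= basket]
print(profile)
```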
This more ‘intelligent’ data collection and processing offers a new, fertile territory for sociologists to analyse and critique.
P.S. This is only a fairly frivolous primer; for a more technical academic guide to these technologies please see:
Halford, S., Pope, C. and Weal, M. (2013) ‘Digital Futures? Sociological Challenges and Opportunities in the Emergent Semantic Web’, Sociology, 47: 173. DOI: 10.1177/0038038512453798