It wasn’t so long ago that we were talking about data being the new oil. But today, just like oil, its value is diminishing rapidly. Over-supply will be the cause of big data’s demise, and as more and more people stop feeding and using those vast data stores, their value will plummet as well.
Well, that’s my theory anyway, because until relatively recently, the public had no idea that its data was being collected in a wholesale fashion and being used by all and sundry for purposes ranging from marketing and national security to epidemic tracking and social engineering.
Remember the headlines not too long ago, when the seemingly innocent Google StreetView vehicles were caught collecting home Wi-Fi information and WikiLeaks exposed government efforts to collect data on ‘people of interest’? Personal privacy was all but gone by the time Edward Snowden divulged the incredible lengths national security organizations went to in order to collect and decipher data on anyone, under the guise of protecting citizens from terrorists.
However altruistic the motives and reasoning behind these data collection activities, questions still emerged as to who else might be doing the same and just how safe the data was. The adage “if one person can hack a system, anyone can” certainly holds true. If governments can collect whatever they like from the internet or phone conversations, so can the crooks.
How surprising, then, to find that almost everybody and their dog is collecting information on us. The practice is so widespread and so entrenched that even those collecting the data, for whatever purpose, are blissfully unaware that someone else is pinching theirs from under their very noses. It is quite common for a single internet transaction to be tracked by any number of other systems, and for the data to be dispersed across the internet before you finish typing the next character.
The screenshots below (courtesy of RedMorph) are taken from a product called Spyderweb. They illustrate that after consciously visiting just four websites (cnn.com, ft.com, huffingtonpost.com and tmz.com), 402 sites are actually shown as having been visited. “Visited” in this case means that your web browser shared information about you with those sites, and that they were allowed to place a cookie, bug or other tracker in your web browser.
The second screenshot was taken with a privacy tool (RedMorph) enabled. The same four sites were visited, but only 24 sites in total were needed to experience them (the additional 20 are CDNs, font libraries and other necessary partners).
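To make those counts a little more concrete, here is a rough sketch of how the number of "extra" sites behind a browsing session could be tallied from a log of the URLs a browser requested. It is purely illustrative and is not how Spyderweb or RedMorph work internally. The request log, the `example-cdn.com` and `example-ads.net` hosts, and the function names are all made up, and the "last two labels" rule for finding a site's base domain is a simplifying assumption (a real tool would consult the Public Suffix List).

```python
from urllib.parse import urlsplit


def base_domain(url_or_host: str) -> str:
    """Return a naive base domain: the last two labels of the hostname.

    Assumption: this is good enough for simple .com/.net hosts only;
    it gets multi-part suffixes such as .co.uk wrong.
    """
    if "://" in url_or_host:
        host = urlsplit(url_or_host).hostname or ""
    else:
        host = url_or_host
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def third_party_domains(visited_sites, request_urls):
    """Domains the browser contacted that the user did not knowingly visit."""
    first_party = {base_domain(site) for site in visited_sites}
    contacted = {base_domain(url) for url in request_urls}
    return contacted - first_party


# Hypothetical data: two sites visited on purpose, five requests actually made.
visited = ["cnn.com", "ft.com"]
requests_log = [
    "https://www.cnn.com/index.html",
    "https://cdn.example-cdn.com/lib.js",
    "https://tracker.example-ads.net/pixel.gif?id=123",
    "https://www.ft.com/",
    "https://tracker.example-ads.net/sync",
]

extra = third_party_domains(visited, requests_log)
print(len(extra), sorted(extra))
# prints: 2 ['example-ads.net', 'example-cdn.com']
```

Note that this only counts who was contacted. Telling a necessary partner such as a CDN apart from a tracker, as the second screenshot does, needs a curated list of known trackers, which is outside the scope of this sketch.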
How many times this data is copied and stored somewhere else is anyone’s guess, but the sheer volume of replication and duplication must be staggering. We hear about Google, Facebook and Amazon storing data for profiling and marketing reasons, and telcos doing the same to provide a better customer experience, but what are all those other collectors using it for? Once upon a time, users were the customers of the internet – today they are the product!