Why Social Media is Vulnerable to Data Mining and a Tool to Manipulate Minds

March 24, 2018


March 24, 2018 – Twenty years ago there was no such thing as online social media. I remember the beginnings with MySpace in 2003, followed by hundreds more in ensuing years. At one point I was a member of more than 50 of these new online communities, from Classmates to Orkut, Google+, Goodreads, Instagram, LinkedIn, Twitter, Reddit, Tumblr, and of course, Facebook.

Creating profiles on these sites, of course, took considerable effort and time. I was willing to share pretty basic data about myself: what I did, my likes, where I liked to travel, and eventually my first attempts at blogging. Back then I was curious to learn and write about online social interaction that went beyond dating sites. I remember thinking that tools like these would eliminate barriers, political and social boundaries, and bring people closer together. Little did I imagine a future in which social networking would prove a weapon for division and the spread of hate.

Once the novelty wore off and I had written some blog posts about my social networking experiences, I started minimizing my time on these sites. Today I remain active on only a few.

From early on, it was obvious that Facebook, with its clean interface and easy navigability, was going to become a dominant player in the social media realm. And today more than 2 billion people count themselves as active Facebook users.

Here are a few demographic facts about the site:

it’s available in 101 languages with 300,000 users helping to do translation
it has 1.66 billion monthly active users and 1.57 billion active daily users who access the site through their mobile phones
it has 1.368 billion daily desktop users
53% of its users are female
the average number of “friends” is 155
the average user is connected to 80 other pages, groups, and events
the age demographic for 87% is between 18 and 29
74% of users are college graduates
over 40 million small businesses use Facebook to promote their products and services
the average time spent on Facebook per visit is 20 minutes
48% of users, in the age demographic between 18 and 34, wake up to check in to their Facebook page before getting on with their day
users upload 350 million photos to the site daily
users share 1 million links every 20 minutes
every 20 minutes there are 20 million “friend” requests
every 20 minutes 3 million messages are sent
status updates amount to 55 million daily

Other than Google, no site has as much traffic as Facebook. Its user population is larger than the population of any country on the planet. It can be deemed a remarkable success story in terms of its global presence.

In its early days, one of the great challenges was to monetize the site. Since access was free, there needed to be a revenue stream to cover operating costs. Facebook’s product was people and their information. To make money it could allow companies and users to buy advertising, a passive means of attempting to influence the Facebook audience. It could also sell the data voluntarily contributed by its users. For the latter, it needed to be transparent and let the community know what was fair game. So like every other software and application company, Facebook provided user acceptance agreements, and buried within them were paragraphs denoting the social network’s right to sell user data.

Up until 2014, this latter stream of income was a Wild West show with few constraints on companies and organizations prepared to harvest information from Facebook users. Facebook started to note some pretty significant buys and decided to tamp down the volume by limiting the size of data units that could be purchased.

In the past week, however, the public has learned that data from 2014 and earlier was mined to the tune of 50 million users by one company, Cambridge Analytica, which focuses on data-driven campaigning on behalf of political and commercial clients. On its site, Cambridge describes how the company’s political services and products have supported “more than 100 campaigns across five continents.” It goes on to state, “within the United States alone, we have played a pivotal role in winning presidential races as well as congressional and state elections.”

Cambridge Analytica states that its business is predictive analytics aimed at developing individual voter profiles. And where did it get much of its information for the American 2016 election? Some of it came from records, including voter demographics, available in the public domain. But as we have learned, the company also got 50 million user profiles from Facebook.

Cambridge Analytica boasts that its tools give it the ability to forecast voter behaviour. Thus it can use persuasion techniques including tailored language, visual advertising, and targeted messaging to individual voters. The mining of Facebook data caught Americans by surprise. They were unaware that the information they shared with friends and family on Facebook could be exploited for analytical purposes.

Was the data purchased by Cambridge Analytica? No. Screen scraping isn’t used if your intentions are to purchase data. It is used to grab information. And Cambridge harvested Facebook profiles using screen scrapers, data copying programs that grab information directly from web pages.
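For readers who have never seen one, here is a minimal sketch of what a screen scraper does, written in Python with only the standard library. It is not Cambridge Analytica's code and it does not touch Facebook. The sample page, its class names, and the names ProfileScraper and scrape_profile are all invented for illustration. The point is only to show how a program can read the same HTML a browser displays and lift out the fields it wants.

```python
from html.parser import HTMLParser

# Hypothetical profile markup. The structure and class names are assumptions
# made up for this sketch; they do not describe any real site's pages.
SAMPLE_PAGE = """
<div class="profile">
  <span class="name">Jane Example</span>
  <span class="hometown">Toronto</span>
  <span class="likes">Hiking, Jazz</span>
</div>
"""


class ProfileScraper(HTMLParser):
    """Collects the text inside <span> tags whose class is a field of interest."""

    FIELDS = {"name", "hometown", "likes"}

    def __init__(self):
        super().__init__()
        self.record = {}       # field name -> scraped text
        self._current = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in self.FIELDS:
            self._current = css_class

    def handle_data(self, data):
        # Only keep text that appears inside a span we are tracking.
        if self._current is not None:
            self.record[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def scrape_profile(html):
    """Parse one page of HTML and return the fields found as a dict."""
    parser = ProfileScraper()
    parser.feed(html)
    return parser.record


profile = scrape_profile(SAMPLE_PAGE)
print(profile)
```

Run the same function over millions of pages instead of one and you have a harvesting operation. Nothing in the technique distinguishes a legitimate migration project from a data grab; the difference lies entirely in whose pages are read and what is done with the results.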

I’m familiar with screen scrapers, having used them in the past when converting legacy mainframe applications to client-server PC-based network systems on behalf of clients. Screen scraping saved hundreds of hours of coding, giving my colleagues and me a shortcut to recreate complex user interfaces.

But in the case of Facebook, Cambridge’s screen scraping wasn’t about developing anything, but rather harvesting information. Some may even call it pirating tens of millions of profiles to create audiences for targeted messaging. I don’t think any Facebook users, including me, signed up for this type of wholesale sharing of their personal information.

Today screen scraping is controversial. It is seen as an invasive instrument associated with pirating sensitive online content. A number of banks have called for its banning because of the threat the technology represents to personal financial information. Of course, the fintech companies that are building online mobile banking applications to compete with banks want to keep screen scrapers alive for obvious reasons.

But who would have thought that a useful development tool could be used to grab the personal information of online users and then develop targeted messaging to influence the outcome of elections? I doubt the developers of screen scrapers anticipated such unintended consequences. And what elections were influenced by the use of this private data? Donald Trump’s ascent to the presidency of the United States is one. Other clients include Ted Cruz, Ben Carson, and John Bolton (the new U.S. National Security Advisor), whose political action committee (PAC) is a client.

Facebook’s CEO, Mark Zuckerberg, after a number of days of silence, finally admitted that his company “made mistakes” regarding past sharing of data with third-party apps, a practice that was allowed until 2014. At that time Facebook began recognizing just how vulnerable personal data was to third-party apps and began to make changes to limit the wholesale grabbing of millions of profiles. It still was okay to access Facebook accounts if third-party app developers paid for the privilege. Going forward, the social network that dominates all others on the planet intends to restrict the amount of data third parties can access. It also has announced changes to data sharing tools and feeds to be built into the user interface, settings, and Facebook newsfeeds.

But the Cambridge Analytica data theft and its use to subvert democracy will not be the end of the problems that social networks face unless access by third parties is restricted to anonymized data only. And then there is the added threat of artificial intelligence (AI) tools more sophisticated than today's pattern recognition applications, and what these represent to the data sharing principles upon which social networking is built. The final question is: will AI tools, exploited by bad actors like Cambridge Analytica, ultimately kill Facebook? Look at what far less sophisticated algorithms have already done to the reputation of the company and we may be witnessing a tipping point for Mark Zuckerberg’s creation.