Law in the Internet Society

Know Your Enemy - A Data Scientist's View

-- By GraceHopper - 04 Oct 2023

A Revolutionary Act

In 2021 I viewed my last social media post. It was a picture of a high school friend’s baby. For a brief second, I felt a prickle of oxytocin. I shared a moment of closeness with this stranger I had not spoken to in over a decade. Then the wash of anxiety came. The baby wore matching clothes against a Pinterest-worthy background, a stark contrast to the collection of empty Monster cans on my desk. Should I post? With a picture of a can of Monster and code running in the background, I could cosplay as a hacker in a spy thriller, with the caption “living the dream.” Would people respond? And then I decided to stop playing the game. I logged off social media that night for the second-to-last time. In late 2022 I requested that the companies delete my data, and for the first time since childhood I broke free from the Silicon Trap that held my mind captive. From firewalls to VPNs to segmenting my online identity across various VMs and phones, I now protect my mind from the parasite.

In my freedom, I'd like to consider myself as much of a philosopher as a computer nerd, but I’m not exceptional at either. I leave the philosophers to wax poetic about the meaning of freedom and the computer nerds to code. Instead, I will explain an algorithm of our adversary, the threat it poses, and a call to action.

The Recommendation Engine

The recommendation engine is the industry standard for powering what you see online. From social media newsfeeds to actual newsfeeds to embedded advertisements and content, these engines are ubiquitous. Built for addiction, they hook users' attention and keep it. They are also open source. Companies do not try to hide it. They post papers on their research sites boasting: this is what we do, this is how we get you.

Recommendation engines have three main components: the content, your data, and the model.

The Content.

The first is a matrix of content on the social media platform. This content is a mix of advertisements and posts by humans and bots. Classification models label each item along dimensions such as tone (you can quantify love), subject matter (political post versus puppies), and more.

Your Data.

The second is a matrix of user profiles. Data scientists are hungry for any information, however minute, on you. Your browsing history, the locations you’ve visited, your search histories in other applications, even when you take your daily bowel movement: we want it all. And generally, we have it all. Surveillance capitalism took the wheel and runs rampant. That site where you scrolled past the terms and conditions to read about Hailey Bieber’s new nail color took your data and packaged it into 24-hour, 48-hour, one-week, and three-week intervals for the convenience of data teams worldwide.
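To make the interval packaging concrete, here is a minimal sketch of how one user's activity might be rolled up into trailing windows. The event log format, the category names, and the build_profile helper are all my own assumptions for illustration; no real platform's pipeline is being reproduced.

```python
from collections import Counter
from datetime import datetime, timedelta

# Trailing windows mirroring the intervals named above (names are hypothetical).
WINDOWS = {
    "24h": timedelta(hours=24),
    "48h": timedelta(hours=48),
    "1w": timedelta(weeks=1),
    "3w": timedelta(weeks=3),
}

def build_profile(events, now):
    """Count a user's events per category inside each trailing window.

    events: list of (timestamp, category) tuples, an assumed log format.
    Returns {window_name: Counter({category: count})}.
    """
    profile = {}
    for name, span in WINDOWS.items():
        cutoff = now - span
        profile[name] = Counter(
            category for ts, category in events if cutoff <= ts <= now
        )
    return profile

now = datetime(2023, 10, 4, 12, 0)
events = [
    (now - timedelta(hours=2), "celebrity"),
    (now - timedelta(hours=30), "celebrity"),
    (now - timedelta(days=5), "politics"),
    (now - timedelta(days=20), "puppies"),
    (now - timedelta(days=40), "puppies"),  # older than every window, so dropped
]
profile = build_profile(events, now)
for window, counts in profile.items():
    print(window, dict(counts))
```

Each window becomes one row of features for the same person, so the model can see both what you did today and what you have been doing for weeks.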

The Model.

The third component is the model. The first two parts, the content and your data, go into the model to train it and to output recommendations. A black box of statistics and optimization algorithms, the model is optimized for clicks (a measure of immediate engagement) and long-term engagement (“stickiness” is the industry term, a euphemism for addictiveness). The more engagement an algorithm generates, the more money it makes and the more successful it is considered.
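A toy version of "optimized on clicks" might look like the sketch below. I am assuming a plain logistic regression trained by gradient descent on a made-up click log; production models are far larger and their internals are not described here. The topic order, the item names, and the features, train, and recommend helpers are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def features(user, item):
    # Element-wise product: large where the user's interest matches the item's topic.
    return [u * c for u, c in zip(user, item)]

def score(user, item, w, b):
    x = features(user, item)
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

def train(rows, epochs=200, lr=0.5, seed=0):
    """Fit click probability; rows are (user_vector, item_vector, clicked)."""
    rng = random.Random(seed)
    rows = list(rows)  # copy so the caller's log is not reordered
    w = [0.0] * len(rows[0][0])
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(rows)
        for user, item, clicked in rows:
            x = features(user, item)
            err = score(user, item, w, b) - clicked  # log-loss gradient at the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def recommend(user, catalog, w, b, k=2):
    ranked = sorted(((score(user, item, w, b), name)
                     for name, item in catalog.items()), reverse=True)
    return [name for _, name in ranked[:k]]

# Assumed topic order: [politics, puppies, celebrity].
catalog = {
    "politics_post": [1, 0, 0],
    "puppy_post": [0, 1, 0],
    "celebrity_post": [0, 0, 1],
}
# Made-up click log: users click only on items matching their interest.
click_log = [
    ([1, 0, 0], [1, 0, 0], 1), ([1, 0, 0], [0, 1, 0], 0), ([1, 0, 0], [0, 0, 1], 0),
    ([0, 1, 0], [0, 1, 0], 1), ([0, 1, 0], [1, 0, 0], 0), ([0, 1, 0], [0, 0, 1], 0),
    ([0, 0, 1], [0, 0, 1], 1), ([0, 0, 1], [1, 0, 0], 0),
]
w, b = train(click_log)
top = recommend([0.9, 0.1, 0.0], catalog, w, b, k=2)
print(top)
```

The only objective in this sketch is predicted clicks. Nothing in the loss asks whether the content is true, healthy, or wanted, which is the point of the paragraph above.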

“Topic clustering” reinforces the model. Topic clustering creates groups of users who behave a certain way (a “cluster”), discovers other users who look like the cluster, and surfaces similar recommendations to them as well. Clusters generally revolve around classifications of the content, so a single user can belong to multiple clusters, and that combination forms a unique identity and derivative clusters. Topic clustering industrializes the recommendation engine and crystallizes the silicon cage, mass-producing recommendations to keep people clicking within their clusters.
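The look-alike step can be sketched with k-means, which I am swapping in as a stand-in because the essay does not name a specific clustering algorithm. This simplification gives each user exactly one cluster, whereas the engines described above allow several. The user names, interest vectors, click histories, and helper functions are invented for illustration.

```python
from collections import Counter

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda j: dist2(point, centroids[j]))

def init_centroids(points, k):
    # Farthest-first seeding: deterministic, used only to keep the sketch reproducible.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iters=20):
    centroids = init_centroids(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[j])
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids

# Assumed interest vectors: [politics, puppies].
users = {"ana": [0.9, 0.1], "ben": [0.8, 0.2], "cy": [0.1, 0.9], "dee": [0.2, 0.8]}
clicked = {
    "ana": ["rally_video"],
    "ben": ["rally_video", "debate_clip"],
    "cy": ["puppy_reel"],
    "dee": ["puppy_reel", "kitten_reel"],
}
names = list(users)
labels, centroids = kmeans([users[n] for n in names], k=2)

def lookalike_recs(new_user, top_n=1):
    """Place a newcomer in the nearest cluster and surface what that cluster clicks."""
    cluster = nearest(new_user, centroids)
    counts = Counter(
        item
        for name, label in zip(names, labels) if label == cluster
        for item in clicked[name]
    )
    return [item for item, _ in counts.most_common(top_n)]

recs = lookalike_recs([0.7, 0.3])
print(labels, recs)
```

A newcomer who merely resembles the politics cluster is handed that cluster's most-clicked item before expressing any preference of their own.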

The Threat

People say they have nothing to hide, that the parasite can have their data. People don’t understand that every piece of personal data fed to this engine chips away at their freedom of choice. The more information the model has, the better the recommendations it produces and the narrower the results it gives. The recommendation engine transforms our online choices from a cornucopia of ideas to slightly different flavors of the same thing. It whittles the mind, reducing choice to a narrow band of prepackaged recommendations that the engine is statistically sure we will click on and get addicted to. Advertisements sell products, and the profits get pumped into data science teams, who optimize recommendation engines, which recommend additional advertisements with products to sell. In this world, what does that mean for freedom of thought?

Even more nefarious is topic clustering. The more information the model has, the more accurate the cluster it places users into. Attention stealing and addiction happen en masse as advertising campaigns target entire clusters. In practice, this bends minds to succumb to the tyranny of the masses. It silences voices challenging the majority by placing them in clusters of similar voices. Topic clustering imprisons dissenting cries, locking them in echo chambers. Sentenced to choirs of similar voices, dissents become majorities and society becomes stupid. I do not have to ask what happens when bad actors utilize this power. Cambridge Analytica and the 2016 election already answered that question. In this world, what does that mean for democracy?

Call to Action

For every byte of data given to the parasite, a crumb of freedom succumbs to its voracious appetite. We must cut ties with the parasite and reclaim our minds. Once we take that critical, revolutionary step, then we strategize. Can tort law hold companies responsible for the harms that they cause? Or can contract law protect as consideration the unbridled use of data? What role does antitrust law have? Or do we place a cratering charge in the heart of the parasite and build anew? I do not have answers, only an appetite for knowledge and a stomach to say “fuck you” to power.



r5 - 08 Oct 2023 - 10:24:52 - GraceHopper
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.