Law in the Internet Society

Learning Management System Applications: Privacy Risks and Suggested Principles.

-- By SapirAzur - 08 Dec 2021

In the past few years, we have witnessed exponential growth in distance-learning applications, including learning management and analytics systems. With the social-distancing restrictions introduced by the COVID-19 pandemic, professionals worldwide advocated even more passionately for a hybrid, analytics-oriented approach to academic learning. This paper presents the data collection practices of the "Canvas" learning management system, discusses privacy and security concerns, and suggests principles for minimizing risks to students' rights.

Numerous studies have indicated that learning management and analytics applications can support higher education and improve students' learning by providing data about learning activities and engagement. Although there is no large-scale empirical proof supporting those findings, in recent years various educational institutions have partnered with learning-analytics companies to collect and utilize data to assess students' behavior and formulate predictive analyses of performance, enabling faculty to personalize learning.

Whether or not those systems are effective, the privacy risks they pose are substantial and should be thoroughly considered, especially given the nature of the relationship between universities and students, as illustrated hereafter through the example of Canvas.

Canvas

Canvas, a learning management system used at Columbia University, possesses an extensive student information database. This database includes personally identifiable information such as name, gender, profile information, pictures, linked social media accounts, association with interest groups, student memberships, user engagement, browser type, IP address, and cookies.

According to Canvas's privacy policy, this data is collected, analyzed, and made (partially) accessible to the university, Canvas, their affiliates, and service providers, including companies whose sole purpose is to increase profitability, such as Google Analytics. Furthermore, Canvas may use the data for its own purposes, including improving Canvas's platform.

Canvas claims it does not tie the information gathered using third-party analytics to identifiable information. However, there are still some apparent privacy risks regarding the data aggregation mechanism, the aggregator's identity, its security measures, and the implications of security attacks from an AI perspective.

Aggregated Data

Aggregated data combines two or more fields or attributes into a single field in a database. Though looking at a single data point usually reveals no meaningful information, aggregating multiple points may lead to non-trivial insights: one can connect specific data points by adding context or by linking a particular dataset to other datasets, and the smaller and more diverse the database, the greater the risk to the data subject's privacy. For example, the smaller a classroom is, the higher the risk of identifying an individual within that group using two or more attributes (e.g., location, age). Thus, even unidentifiable attributes may be used to narrow down an individual in a dataset with an easy three-step query: women > age 25-30 > Israel. The field of statistical learning, and specifically data mining, systematically leverages computational methods to infer conclusions from aggregated datasets.
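The three-step query above can be sketched concretely. The following is a minimal, hypothetical illustration (the records, names, and attribute values are invented for the example, not taken from Canvas): even with no name field, stacking a few "non-identifying" filters can isolate a single person.

```python
# Hypothetical pseudonymized records: no names, only "harmless" attributes.
records = [
    {"id": "u01", "gender": "woman", "age": 27, "country": "Israel"},
    {"id": "u02", "gender": "woman", "age": 41, "country": "Israel"},
    {"id": "u03", "gender": "man",   "age": 28, "country": "Israel"},
    {"id": "u04", "gender": "woman", "age": 26, "country": "France"},
]

# The three-step query from the text: women > age 25-30 > Israel.
matches = [r for r in records
           if r["gender"] == "woman"
           and 25 <= r["age"] <= 30
           and r["country"] == "Israel"]

print(len(matches))  # a single record remains: the individual is isolated
```

The smaller and more diverse the dataset, the fewer filters are needed before exactly one record survives.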

Security challenges from an AI perspective

AI introduces some potentially new security risks. One of them is the "model inversion attack," in which an attacker has access to partial identifying data belonging to individuals already included in a particular model's training data. Since the attacker holds both that initial data and the model, they can infer further information about the individuals by reconstructing the model's inputs from its outputs.
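The core idea can be shown with a deliberately simplified toy: real model inversion attacks use optimization against complex models, but the sketch below (all parameters and values invented for illustration) shows the underlying hazard, namely that when a model's input-to-output mapping leaks enough structure, an output can be run "backwards" to recover a sensitive input.

```python
# Toy model: a linear function whose parameters the attacker knows.
w, b = 2.0, 5.0

def model(x):
    # Target model: maps a sensitive input attribute to a published score.
    return w * x + b

def invert(y):
    # The "inversion": recovering the input from the observed output.
    return (y - b) / w

secret_input = 21.0                  # e.g., a sensitive student attribute
observed_output = model(secret_input)
print(invert(observed_output))       # recovers 21.0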

Another risk is the "membership inference attack," which allows a malicious actor to deduce whether a given individual is present in the training data of an AI model: armed with the target model and the information they already hold about the individual, the attacker can determine whether that person is in the database. If the database covers a specific group of data subjects (e.g., individuals with disabilities), this information alone constitutes a privacy concern.
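A common way such attacks work in practice is exploiting overfitting: models tend to be more confident on examples they were trained on. The sketch below simulates that gap with a stand-in function (the names, confidence values, and threshold are all invented for the illustration); real attacks query an actual model, but the attacker's logic is the same simple threshold test.

```python
# Toy simulation of a membership-inference attack.
train_set = {"alice", "bob"}          # records the model was trained on

def model_confidence(record):
    # Stand-in for querying the target model: overfitting makes it return
    # near-certain scores for training members, lower scores otherwise.
    return 0.99 if record in train_set else 0.70

def infer_membership(record, threshold=0.9):
    # The attacker needs only query access to the model's output scores.
    return model_confidence(record) >= threshold

print(infer_membership("alice"))  # True  -> likely in the training data
print(infer_membership("carol"))  # False -> likely not
```

Note that the attacker never sees the training data itself; the model's behavior alone leaks who is in it.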

The Point

Whether the data is kept securely by its purported owners or is compromised, and whether it is identifiable or aggregated, there is no way of knowing how and to what extent it will be used in the future by those who gain access to it. In light of that, I would like to propose several principles for maintaining privacy when using learning applications, Canvas in particular:

Voluntary participation: Universities should provide adequate alternatives and allow students to opt out of the system. Another option is to let students choose which features they would like to enable and which data they agree to have collected.

Transparency: Students should understand what data is collected and how it will be used. Those terms should be properly agreed upon, not merely imposed through a one-sided privacy policy, and should reflect the university's responsibility toward its students.

Data minimization: The data collected should be anonymized and limited to the minimum necessary in both amount and retention period; more educational data does not always mean better educational data. This would not eliminate the risks, but it would minimize them.

Access: Data access should be strictly limited and continuously monitored. All authorized personnel should have security authorization.

Supervision: A designated committee of students and university professionals could examine, advise on, and monitor the learning system. The committee should include privacy law attorneys.

Conclusion

Weighing the advantages of acquiring analytical insights to enhance learning against the privacy risks and ethical concerns, it is becoming increasingly clear that the risks are substantial. That does not mean the innovative or engaging approach should be forsaken, nor that the benefits of enhanced learning programs are small, but rather that an interactive system in which students can provide feedback must be sufficiently protective of their data and privacy. As more companies monetize our data, we need to develop technologies dedicated to privacy and security. This is especially true given the special relationship between universities and their students, in which students may labor under the misconception that universities, as places of higher education, know best and would never jeopardize their privacy. From my point of view, the most desirable solution would be to develop an independent management system, so that the university alone decides what data it collects, how that data is stored, and who may access it.

This is a fine start. Substantively, there are some issues to address:

  1. Using curricular delivery software as a surveillance tool is an extraordinary step that demands extraordinary justification. Instead, as you say, there is nothing in the professional literature that justifies this intrusion of surveillance capitalism into learning. So you need to show your claim is correct. Don't stop research with the tertiary journalistic sources. Get the underlying papers. If you don't use research software of your own (that should respect your privacy completely), learn to use Zotero now. You could attach the results of your literature search as a Zotero database or as Bibtex to this topic. Then anyone who wants to follow your research can do so.
  2. You don't need to recapitulate the possible harms to privacy of individuals resulting from de-anonymization or other misuses of aggregate data. A couple of links to well-chosen secondary sources will get the reader started on learning more if she wants to. You can use that space instead to set up the underlying question: If there is no proven benefit to spying on learning, while there are reasons both practical and normative to put all information about students' learning under students' control, why are we doing what we are doing instead of whatever would be better?
  3. Merely proposing rules that will not be followed when there is so much money to be made is perhaps not the most efficient way to use our time on the mudball we are making out of the miracle that is Earth. The root of the "learning management software" problem is that it does the wrong thing. I hate content management systems because all they do is manage content. I hate learning management systems even more because all they do is manage learning, and learning is, of all human processes, the one that doesn't need management. Human curiosity, the desire to learn, which peaks in childhood, needs nurturing in the context of human relationship, of dialogue, of mutual empowerment. That process of teaching and learning in relationship has made the human race what it is, and can still save it. Management is the antithesis of assisting learning.

    So we want instead of this learning management stuff to use technology that enacts our educational philosophy. The wiki is a fundamental tool for social constructionist education: we make the course by writing it. Students control whether they use it minimally or maximally to support their learning, individually or collectively, as they choose. People can comment or edit as they please, and can both learn and teach as they do. Everyone has equal access to all the data, and equally complete control over access to their work. The public has access to that which is published, but no third party has preferential access to anything, and all non-public data is under the instructor's, and only the instructor's, control.

    Making such educational enablement software is easy: you are, after all, living with me in it right now. It is made of free software parts anyone can put together freely and share as widely as they like. It's all operated and maintained by one person: me. It runs in virtual machines sitting on servers in my apartment and my office that I assembled from loose parts with my own hands. The education was in this literal, technical sense constructed out of the learning that goes into it, including mine.

So in substance, we are choosing between software that does the wrong thing, with no demonstrable benefit, at immense cost to a fundamental value, freedom of thought, once learning becomes a comprehensively surveilled activity; and software that embodies our educational philosophy and preserves our values, for whose educational benefits we have at minimum our own experience in doing what we say we want society to be able to do. Proof of concept plus running code equals revolution.

These may not be your conclusions, of course, though they are mine. But your own learning should be in dialogue with them.



r3 - 02 Jan 2022 - 13:48:14 - EbenMoglen
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.