Law in the Internet Society

Learning Management System Applications: Privacy Risks and Suggested Principles.

-- By SapirAzur - 08 Dec 2021

In the past few years, we have witnessed exponential growth in distance-learning applications, including learning management and analytics systems. With the social-distancing restrictions introduced by the COVID-19 pandemic, professionals worldwide advocated even more passionately for a hybrid, analytics-oriented approach to academic learning. This paper will present the data collection practices of the "Canvas" learning management system, discuss privacy and security concerns, and suggest principles for minimizing the risks to students' rights.

Numerous studies have indicated that learning management and analytics applications can support higher education and improve students' learning by providing data about learning activities and engagement. Although there is no large-scale empirical proof supporting those findings, in recent years various educational institutions have partnered with learning-analytics companies to collect and use data to assess students' behavior and build predictive analyses of performance, enabling faculty to personalize learning.

Whether those systems are effective or not, the privacy risks they pose are substantial and should be thoroughly considered, especially given the nature of the relationship between universities and students, as illustrated hereafter through the example of Canvas.

Canvas

Canvas, a learning management system used at Columbia University, maintains an extensive database of student information. This database includes personally identifiable information such as name, gender, profile information, pictures, linked social media accounts, associations with interest groups, student memberships, user engagements, browser type, IP address, and cookies.

According to Canvas's privacy policy, this data is collected, analyzed, and made (partially) accessible to the university, Canvas, its affiliates, and its service providers, including companies whose sole purpose is to increase profitability, such as Google Analytics. Furthermore, Canvas may use the data for its own purposes, including improving its platform.

Canvas claims it does not tie the information gathered through third-party analytics to identifiable information. However, there are still apparent privacy risks concerning the data aggregation mechanism, the aggregator's identity, its security measures, and the implications of security attacks from an AI perspective.

Aggregated Data

Aggregated data combines two or more fields or attributes into a single field in a database. Though a single data point usually reveals no meaningful information on its own, aggregating multiple points may lead to non-trivial insights: one can connect specific data points by adding context or by linking one dataset to others, and the smaller and more diverse the database, the greater the risk to the data subjects' privacy. For example, the smaller a classroom is, the higher the risk of identifying an individual within it using two or more attributes (e.g., location, age). Thus, even unidentifiable attributes may be used to narrow a dataset down to an individual in an easy three-step query: women > age 25-30 > Israel. The field of statistical learning, and specifically data mining, systematically leverages computational methods to infer corollaries from aggregated datasets.
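The three-step query above can be sketched in a few lines of code. This is a minimal illustration with entirely hypothetical records; it shows how three "anonymous" attributes, combined, can single out one person in a small dataset.

```python
# Hypothetical records stripped of names -- yet still re-identifiable.
records = [
    {"id": "A", "gender": "woman", "age": 27, "country": "Israel"},
    {"id": "B", "gender": "woman", "age": 41, "country": "Israel"},
    {"id": "C", "gender": "man",   "age": 28, "country": "Israel"},
    {"id": "D", "gender": "woman", "age": 29, "country": "France"},
]

# Step 1: women; step 2: age 25-30; step 3: Israel.
matches = [r for r in records
           if r["gender"] == "woman"
           and 25 <= r["age"] <= 30
           and r["country"] == "Israel"]

print(len(matches))  # 1 -- the three filters isolate a single record
```

The narrower the population (a small classroom, a niche interest group), the fewer attributes an attacker needs before the remaining set contains exactly one person.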

Security challenges from an AI perspective

AI introduces some potentially new security risks. One of those risks is a "model inversion attack," wherein an attacker has access to partial identifying data belonging to individuals already included in the data of a particular model. Since the attacker holds both the initial data and the model, they can infer further information about the individual by reconstructing the model's inputs from its outputs.

Another risk is a "membership inference attack," which allows a malicious actor to deduce whether a given individual is present in the training data of an AI model: holding the target model, the attacker uses it in conjunction with information they already have to determine whether the individual is in the database. If the database covers a specific group of data subjects (e.g., individuals with disabilities), that fact alone constitutes a privacy concern.

The Point

Whether the data is kept securely by its purported owners or is compromised, and whether it is identifiable or aggregated, there is no way of knowing how, and to what extent, it will be used in the future by those who gain access to it. In light of that, I would like to propose several principles for maintaining privacy when using learning applications, particularly Canvas:

Voluntary participation: Universities should provide adequate alternatives and allow students to opt out of the system. Another option is to let students choose which features they would like to enable and which data they agree to have collected.

Transparency: Students should understand what data is collected and how it will be used. Those terms should be properly agreed upon, not merely imposed through a one-sided privacy policy, and should reflect the university's responsibility toward its students.

Data minimization: The data collected should be anonymized, limited to the minimum amount necessary, and retained for the shortest time necessary; more educational data does not always mean better educational data. This would not eliminate the risks, but it would reduce them.

Access: Data access should be strictly limited and continuously monitored, and all authorized personnel should undergo security vetting.

Supervision: A designated committee of students and university professionals could examine, advise on, and monitor the learning system. The committee should include privacy law attorneys.

Conclusion

Weighing the advantages of acquiring analytical insights to enhance learning against the privacy risks and ethical concerns, it is becoming increasingly clear that the risks are substantial. This does not mean that the innovative or engaging approach should be forsaken, nor that the benefits of enhanced learning programs are small, but rather that an interactive system in which students can provide feedback must sufficiently protect their data and privacy. As more companies monetize our data, we need to develop protective technologies dedicated to privacy and security. This is especially true given the special relationship between universities and their students, who may be under the misconception that universities, as places of higher education, know best and would not jeopardize their privacy. From my point of view, the most desirable solution would be developing an independent management system. That way, the university alone decides what data it collects, how that data is stored, and who may access it.


