Privacy and Discrimination: An Intertwined Violation

-- JosephHan - 01 Mar 2024

The mass collection of personal data by tech companies, and the surveillance it enables, is not a new phenomenon. However, as technology has improved, algorithms that analyze human data have become more common. They save companies considerable time and money while still achieving the results the companies want. Examples include predictive policing, job application screening, and loan and financing approvals. However, these “objective” models of human behavior may be more dangerous than human bias, because their bias is more likely to go undetected and is less actively corrected. Human relationships are complex, and data can reveal far more than it first appears to. The use of personal data when analyzing human subjects algorithmically results in unlawful discrimination, regardless of any efforts made to counteract those effects.

Data is All Connected

Human data is interconnected and always reveals more than it appears to on the surface. Overt discrimination is unlikely when human data is collected; a company is unlikely to make race an explicit component of its model. However, many “impersonal” data points can make race a large component regardless of the intent of the person designing the model. Take zip codes as an example. On its face, the use of zip codes in assessing human subjects may not seem like an issue. But zip codes are highly reflective of race and income, so their use will lead to harmful effects on marginalized communities. If a high school student is denied a loan because his zip code is “high risk”, that denial could cut off his education and diminish his socioeconomic mobility. Names are another example. It is impossible to submit a job or housing application without a name. Yet if the model assessing the quality of the candidates uses their names, it gains many insights into gender and race. Even data that appears facially neutral will have ties to other human data, and those ties result in discrimination.
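The zip code point can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the applicants are synthetic, the zip labels `zip_A` and `zip_B` and the groups `X` and `Y` are invented, and the "model" is a single hypothetical rule rather than any real lender's system. The rule never sees the group, only the zip code.

```python
from collections import defaultdict

# Entirely synthetic applicants: (zip_code, group). The zip labels and the
# groups "X" and "Y" are invented; group X mostly lives in zip_A and
# group Y mostly lives in zip_B.
APPLICANTS = (
    [("zip_A", "X")] * 80 + [("zip_A", "Y")] * 20
    + [("zip_B", "X")] * 20 + [("zip_B", "Y")] * 80
)

# Hypothetical rule: one zip code has been flagged as "high risk".
HIGH_RISK_ZIPS = {"zip_B"}


def approve(zip_code: str) -> bool:
    """Approve a loan unless the applicant lives in a flagged zip code."""
    return zip_code not in HIGH_RISK_ZIPS


def approval_rate_by_group(applicants):
    """Return the fraction of approved applicants for each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in applicants:
        total[group] += 1
        approved[group] += approve(zip_code)  # True counts as 1
    return {group: approved[group] / total[group] for group in total}


rates = approval_rate_by_group(APPLICANTS)
print(rates)
```

In this toy data, group X is approved 80 percent of the time and group Y 20 percent, even though the rule is written without any reference to group membership. The disparity comes entirely from where each group lives.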

The Feedback Loop Problem

Feedback loops in models will reinforce existing societal biases. The poster child for this phenomenon is predictive policing. The most common method of predictive policing is place-based; it uses pre-existing crime data to find areas and times where crimes are more likely to happen.

Like most models, and AI models especially, a predictive policing algorithm needs input data to get off the ground. However, this initial data is already biased by the prior policing tendencies of human officers. It is not a new idea that police have used race when deciding which areas to patrol. The bias does not stop there. Areas found to have a high incidence of crime will be policed further, and increased policing will inevitably find more instances of crime. This is the feedback loop that algorithms can create. Police departments, however, will use the “objectivity” of the algorithm to deny any discriminatory effects. In fact, they can maintain a facade of neutrality by using computer outputs as justification for discriminatory actions that they would have had a more difficult time justifying without a model that supports their biases.
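The loop described above can be sketched as a small simulation. It is a simplified illustration under stated assumptions, not a model of any real predictive policing product: there are two hypothetical districts with the same true crime rate, recorded crime is assumed to scale with patrol presence, and patrols are assumed to be allocated in proportion to cumulative recorded crime. The function name and parameters are invented for this example.

```python
def simulate_patrols(initial_share_a, rounds=10, true_rate=0.1, total_patrols=100):
    """Return district A's share of patrols in each round.

    Districts A and B have the SAME true crime rate. The only difference
    is the seed data, which reflects how patrols were split historically.
    """
    # Seed records: skewed by past human patrol decisions, not by crime.
    recorded_a = initial_share_a
    recorded_b = 1.0 - initial_share_a
    shares = []
    for _ in range(rounds):
        # Allocate patrols in proportion to crime recorded so far.
        share_a = recorded_a / (recorded_a + recorded_b)
        patrols_a = total_patrols * share_a
        patrols_b = total_patrols - patrols_a
        # Officers can only record crime where they are present, so new
        # records track patrol presence rather than underlying crime.
        recorded_a += patrols_a * true_rate
        recorded_b += patrols_b * true_rate
        shares.append(share_a)
    return shares


shares = simulate_patrols(initial_share_a=0.7)
print([round(s, 3) for s in shares])
```

Under these assumptions, district A keeps receiving roughly 70 percent of patrols in every round. The data the algorithm generates never corrects the initial skew, because each round of records simply mirrors where the patrols were sent.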

Similar effects exist in other human selection methods. Programs that screen candidates for job opportunities use previously successful candidates as input for their models. That training data is composed of candidates selected by people, and employment has a long history of discrimination. While it may seem that an objective algorithm would get rid of previous human bias, in reality the algorithm only perpetuates it. There is a reason that proactive diversity efforts were made to combat intrinsic biases. Yet the notion of having a machine make those same racial assessments scares the public, even though that may be the necessary solution. Feedback loops are an inherent trait of human selection models.
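A rough sketch of how a screening model inherits past decisions follows. It assumes a deliberately naive "model" that memorizes historical hire rates per name style; real screening tools are more elaborate, and the records, the `style_1` and `style_2` labels, and the function names are all invented for illustration. Every candidate in the synthetic history is equally qualified.

```python
from collections import defaultdict

# Synthetic decisions made by past human reviewers:
# (name_style, qualified, hired). All values are invented. Every candidate
# is qualified, but reviewers hired one name style far more often.
HISTORY = (
    [("style_1", True, True)] * 45 + [("style_1", True, False)] * 5
    + [("style_2", True, True)] * 15 + [("style_2", True, False)] * 35
)


def fit_hire_rates(history):
    """'Train' by memorizing the past hire rate for each name style."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for name_style, _qualified, was_hired in history:
        seen[name_style] += 1
        hired[name_style] += was_hired  # True counts as 1
    return {style: hired[style] / seen[style] for style in seen}


def score(model, name_style):
    """Score a new candidate using only what past decisions imply."""
    return model[name_style]


model = fit_hire_rates(HISTORY)
print(score(model, "style_1"), score(model, "style_2"))
```

Two equally qualified new candidates receive different scores here solely because of how earlier reviewers treated people with similar names. The model has learned the reviewers' preferences, not the candidates' quality.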

There is No Transparency

Black Box

The increasing use of AI results in “black-box” algorithms that cannot be fully understood, even by the engineers who create them. Machine learning abstracts personal information into a model that produces an output. That output is informed by a neural network, which essentially breaks the input down into components, performs various calculations on those components and weighs them, then produces an output based on what the model was designed to accomplish. In this system, the components in the intermediate step are usually unknown; the users care only about the inputs and outputs. This is a boon for both the parties using these programs and the ones selling them: because the intermediates are unknown, they can shield themselves from liability by claiming that they did not know their algorithm had discriminatory effects.
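The intermediate step described above can be illustrated with a toy forward pass. This is only a sketch: the weights are hypothetical numbers standing in for the result of training, the three input features are unnamed placeholders, and a real network would have far more units and layers.

```python
import math

# Hypothetical weights, as if produced by training. Nothing here labels
# what each hidden unit "means".
W_HIDDEN = [
    [0.8, -1.2, 0.3],
    [-0.5, 0.9, 1.1],
]
W_OUTPUT = [1.4, -0.7]


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features):
    """Return (hidden activations, output score) for one input vector."""
    # Each hidden unit is a weighted mix of ALL the input features.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)))
        for row in W_HIDDEN
    ]
    # The output is a weighted mix of the hidden units.
    output = sigmoid(sum(w * h for w, h in zip(W_OUTPUT, hidden)))
    return hidden, output


hidden, output = forward([0.2, 0.7, 0.5])
print(hidden)
print(output)
```

The hidden values are just numbers between 0 and 1. Each one blends every input, so even in a network this small there is no line of code to point to that says which input drove the final score.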

What Happened to Me?

Another compounding issue that arises from “black-box” algorithms is the lack of guidance. Those being evaluated cannot be sure why they were selected or rejected by a given algorithm. In fact, the companies that want the evaluation cannot be sure either, but the algorithm decreases their work tremendously while still providing more than enough candidates for whatever purpose they need. Revealing the algorithm may also work against the interests of the party deploying it, since people will find a way to “game the system” for their own optimal results.

Conclusion

Human data reveals far more than it appears to on the surface. Data points are interconnected, and it is impossible to keep “neutral” data points from influencing decisions that should not take protected classes into account. This problem is only exacerbated by the immense amount of data that companies are collecting on people, whether they are customers, users, or even non-users. Even though one may think that a computer is less biased than a human making assessments of other humans, one must realize that those computer algorithms have been built by people with biases, many of which are subconscious. Legislating greater protections for people and their data will have many beneficial effects beyond simply greater privacy and addressing the ensuing Fourth Amendment concerns.


r1 - 01 Mar 2024 - 05:25:51 - JosephHan

All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.
