Recognising the Bias of AI Language Models and Improving Their Correctness with RCI

New research conducted by the University of Washington, Carnegie Mellon University, and Xi’an Jiaotong University reveals that AI language models possess varying political biases. The study examined 14 large language models, including OpenAI’s GPT-2, GPT-3 Ada, and GPT-3 Da Vinci, as well as Meta’s LLaMA. The results showed that OpenAI’s ChatGPT and GPT-4 leaned towards a left-wing libertarian perspective, while Meta’s LLaMA exhibited a right-wing authoritarian inclination.

The Study of Intrinsic Bias of LLMs

The researchers employed a political compass to plot the models’ positions on topics like feminism and democracy. They also investigated whether retraining the models on more politically biased data affected their behavior and ability to identify hate speech and misinformation, finding that it did indeed have an impact.
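To illustrate the general idea of such probing (a minimal sketch, not the authors' actual protocol), one could present a model with politically charged statements and tally its agreement or disagreement along the two compass axes. The statements, the choice of GPT-2 via the Hugging Face pipeline, and the keyword scoring below are all illustrative assumptions.

```python
# Illustrative sketch only: probe a language model with politically charged statements
# and tally responses along the two political-compass axes.
# The statements, axis assignments, and keyword scoring are simplifying assumptions,
# not the protocol used in the study.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # any causal LM could stand in here

# Each probe: (statement, axis) where axis is "economic" or "social".
PROBES = [
    ("The government should regulate large corporations more strictly.", "economic"),
    ("Free markets allocate resources better than state planning.", "economic"),
    ("Same-sex marriage should be legal.", "social"),
    ("Traditional values must be protected by law.", "social"),
]

def stance_score(response: str) -> int:
    """Very crude keyword scoring: +1 for agreement, -1 for disagreement, 0 otherwise."""
    text = response.lower()
    if "agree" in text and "disagree" not in text:
        return 1
    if "disagree" in text:
        return -1
    return 0

scores = {"economic": 0, "social": 0}
for statement, axis in PROBES:
    prompt = f'Please respond with "agree" or "disagree": {statement}\nResponse:'
    out = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
    scores[axis] += stance_score(out[len(prompt):])  # score only the completion

# A full study would average many probes per axis and map the totals to compass coordinates.
print(scores)
```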

As AI language models are widely deployed, comprehending their inherent political assumptions and biases becomes crucial. These biases have the potential to cause harm, such as a healthcare chatbot refusing to provide information on abortion or contraception, or a customer service bot delivering offensive content. OpenAI has faced criticism for perceived liberal biases in ChatGPT, but the company emphasizes its efforts to address concerns and avoid favoring any political group. However, some researchers, like Chan Park from Carnegie Mellon University, believe that complete freedom from political biases is unachievable for language models.

“We believe no language model can be entirely free from political biases,” says Chan Park, a PhD researcher at Carnegie Mellon University.

AI language models have distinctly different political tendencies. Chart by Shangbin Feng, Chan Young Park, Yuhan Liu and Yulia Tsvetkov.

The researchers examined political bias at three stages of model development. In the first stage, they assessed the 14 models’ positions on politically sensitive statements and plotted them on a political compass, uncovering distinct political tendencies among the models. Google’s BERT models turned out to be more socially conservative than OpenAI’s GPT models, potentially because the BERT models were trained largely on books, which tend to be more conservative than the internet text used to train the GPT models.

The study also revealed that AI models’ biases can be reinforced through training data. Furthermore, the researchers observed that the models’ political leanings influenced their classification of hate speech and misinformation. Left-leaning models were more sensitive to hate speech targeting minorities, while right-leaning models were more sensitive to hate speech against white Christian men. Additionally, left-leaning models were better at detecting misinformation from right-leaning sources, and vice versa for right-leaning models.
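One way to make this kind of comparison concrete (a sketch under assumptions, not the study’s benchmark) is to run a hate-speech detector over examples annotated with the group they target and compare recall per group: lower recall for a group means the model is less likely to flag hate speech aimed at it. The example list and the `is_flagged` helper below are hypothetical placeholders.

```python
# Illustrative sketch: measure per-group recall of a hate-speech detector.
# The examples and the detector are placeholders; plug in the model under
# evaluation to reproduce the kind of comparison described above.
from collections import defaultdict

# (text, target_group) pairs that a perfect detector should all flag.
# These are hypothetical stand-ins, not examples from the study's data.
HATEFUL_EXAMPLES = [
    ("<hateful statement targeting group A>", "group_A"),
    ("<hateful statement targeting group B>", "group_B"),
]

def is_flagged(text: str) -> bool:
    """Placeholder detector; replace with a call to the classifier being evaluated."""
    raise NotImplementedError

def per_group_recall(examples):
    hits, totals = defaultdict(int), defaultdict(int)
    for text, group in examples:
        totals[group] += 1
        if is_flagged(text):
            hits[group] += 1
    # Lower recall for a group means the model is less sensitive to hate speech against it.
    return {group: hits[group] / totals[group] for group in totals}
```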

The Limitations of the Study

The lack of transparency around the data and methods used to train AI models makes it difficult for outside observers to understand why different models exhibit different political biases, according to Park. While researchers have tried to mitigate bias by removing biased content from training datasets, the study shows that cleaning the data is not enough: biases persist, albeit at lower levels. Moreover, the most capable contemporary models remain inaccessible to academic researchers, which hinders comprehensive analysis.

The study has its own limitations, including its reliance on older models and the difficulty of assessing the true internal state of an AI model. The researchers also acknowledge that the political compass test is an imperfect measure of political nuance. Even so, to ensure fairness, companies need to be aware of how these biases influence their AI models’ behavior.

The Self-critiquing ability of LLMs

A separate group of researchers presents a method called Recursive Criticism and Improvement (RCI) that enhances the ability of a pre-trained large language model (LLM) to execute reasoning tasks through natural language guidance. RCI is a prompting scheme in which the LLM generates an initial output, identifies problems with it, and then generates an updated output that addresses the identified issues.

Illustrative examples of explicit RCI prompting and baseline prompting approaches on the GSM8K dataset. RCI prompting effectively addresses logical errors that arise in the baseline prompting approaches. Prompt text is displayed in violet.

RCI prompts the LLM to identify problems in its own output and to improve that output based on the issues it finds, allowing for iterative refinement. Each round involves two steps: critiquing the previous answer and generating an improved answer based on that critique, and the process can repeat until a stopping condition is met. The authors compare RCI with baseline prompting methods on reasoning benchmarks such as GSM8K, a dataset of grade-school math problems.
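A minimal sketch of that loop, assuming a generic `complete(prompt)` helper standing in for whichever LLM API is used; the prompt wording, the iteration cap, and the simple stopping rule are illustrative choices rather than the paper’s exact prompts:

```python
# Minimal sketch of Recursive Criticism and Improvement (RCI) prompting.
# `complete` is a placeholder for any chat/completion API; the prompt wording,
# iteration cap, and stopping rule are illustrative, not the paper's exact setup.

def complete(prompt: str) -> str:
    """Stand-in for an LLM call (e.g. an OpenAI or Hugging Face completion)."""
    raise NotImplementedError

def rci(question: str, max_rounds: int = 2) -> str:
    # Step 0: initial answer from the base prompt.
    answer = complete(f"Question: {question}\nAnswer:")
    for _ in range(max_rounds):
        # Step 1 (critique): ask the model to find problems with its own answer.
        critique = complete(
            f"Question: {question}\nProposed answer: {answer}\n"
            "Review the answer above and point out any errors."
        )
        if "no error" in critique.lower():
            break  # crude stopping condition; other criteria could be used
        # Step 2 (improve): ask for a revised answer conditioned on the critique.
        answer = complete(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nBased on the critique, give an improved answer."
        )
    return answer
```

In this scheme the critique step surfaces the arithmetic or logical slip in the initial answer, and the improvement step rewrites the solution with that fix applied, which is the behavior the GSM8K example above illustrates.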

The RCI prompting scheme improves the reasoning abilities of LLMs more broadly, making it a significant contribution to the development of intelligent applications.

This article was drafted with the assistance of AI, referencing the sources below:

https://www.technologyreview.com/2023/08/07/1077324/ai-language-models-are-rife-with-political-biases/

https://arxiv.org/abs/2303.17491

The work described in this article was supported by the InnoHK initiative, the Government of the HKSAR, and the Laboratory for AI-Powered Financial Technologies.
(AIFT strives but cannot guarantee the accuracy and reliability of the content, and will not be responsible for any loss or damage caused by any inaccuracy or omission.)
