Kean University Researchers Reveal How Tone Can Influence AI Accuracy
Assistant Professor Boyang Li, Ph.D., led research on how AI systems respond to linguistic tone
New research from Kean University suggests that something as simple as linguistic tone, whether polite or forceful, can influence how accurately artificial intelligence (AI) systems respond.
Led by Assistant Professor Boyang Li, Ph.D., of the Department of Computer Science and Technology, the study examined how the phrasing of prompts affects the performance of vision-language models (VLMs), particularly their tendency to generate incorrect responses, commonly known as “hallucinations.”
Working with Assistant Professor Meng Xu, Ph.D., and four student researchers from Wenzhou-Kean University who are currently studying at Kean’s Union campus, Li found that linguistic tone has a measurable impact on the reliability of artificial intelligence systems. Their findings, detailed in the paper “Tone Matters: The Impact of Linguistic Tone on Hallucination in VLMs,” have been accepted for presentation at the 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). The research provides new insights and practical guidance for evaluating AI systems.
“As AI becomes more integrated into everyday life, ensuring its reliability is essential,” said Patricia Morreale, Ph.D., chair of the Kean Department of Computer Science and Technology. “This research reflects the responsibility of universities like Kean to advance AI thoughtfully and accurately.”
Li’s long-term research focuses on AI safety, examining both technical design and human interaction with AI systems. This project specifically examined how tone and phrasing interact with models that interpret both images and text.
“The research has centered on AI safety from both the human and technology sides,” Li said. “In this case, we are asking if a model will respect the truth even when a user’s tone tries to turn it in another direction.”
To test that question, the research team conducted a five-level experiment using multiple AI models. In one sequence, the team presented models with images of clocks, watches and name tags and asked them to identify the time or a name, even when the text in the image was actually a random assortment of numbers and letters.
Level One prompts were straightforward. By Level Five, the prompts became increasingly elaborate and demanding.
As the language grew more insistent, researchers observed that some models began guessing rather than acknowledging uncertainty.
“When we increase the pressure in the prompt, the model may start guessing even when there is no real answer in the image,” Li said. “The tone of the user should not change what is and isn’t true.”
The team also discovered that when prompts became overly forceful, some models activated internal safeguards and refused to respond.
Li hopes the research framework will contribute to stronger standards for evaluating AI platforms before they are released publicly.
“Before a model is released, it should pass a benchmark that evaluates whether it consistently respects the truth,” Li said. “If a user strongly pushes for the wrong answer, the model still should not provide it, especially today when misinformation spreads so easily.”
Li also credits Kean’s broader investment in artificial intelligence. The University introduced the state’s first bachelor’s degree in artificial intelligence in the Fall semester, part of a growing AI ecosystem that includes an AI Center of Excellence.
“Kean’s support for AI research has been tremendous,” Li said. “The University’s leadership recognizes that AI is starting to shape society. That support allows us to invest our time into research that can benefit the public.”