Laypeople often find advice from ChatGPT more reliable than from a real lawyer

“When laypeople did not know whether the advice came from a lawyer or a language model, they more often trusted the advice generated by the language model.”

Can you, as a lawyer or other legal professional, still count on the trust of your clients? New research from the University of Antwerp and the University of Southampton shows that laypeople sometimes find legal advice from a language model such as ChatGPT more reliable than advice from real lawyers, especially when they do not know who provided the advice.

What explains this preference? How well can laypeople distinguish advice from a language model from real legal advice? And what are the risks if the public blindly trusts advice from a language model?

Study Design

The researchers used three experiments to investigate how laypeople interact with legal advice from a lawyer versus that from a language model.

Two main questions were central to this:

  1. Are laypeople more willing to act on the basis of legal advice from a language model or from a lawyer?
  2. Can laypeople, if they do not know the source, recognize whether legal advice comes from a language model or from a human?

The advice concerned three areas of law: traffic law, tenancy law, and environmental law. Both the language model (ChatGPT-4o) and British lawyers provided answers to exactly the same legal questions, based on English law.

Insight into Experiment 1

In the first experiment, laypeople rated legal advice from a language model higher than advice from a lawyer when the source of the advice was not disclosed to them.

Methodology

  • 100 participants were randomly divided into two groups:
    • Group A was shown who had written the advice (lawyer or language model).
    • Group B received the same advice, but without mention of who had written it.
  • Each participant read a total of 18 legal advice segments divided across traffic law, tenancy law, and environmental law.
  • After reading each piece of advice, participants indicated to what extent they were willing to follow the advice, on a scale of 1 (not at all) to 9 (completely).
  • The advice from the language model and the advice from the lawyer were distributed evenly and presented in random order (a minimal sketch of such a setup follows below).
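The summary does not include the authors' experiment code, so the sketch below is only an illustration of how such a between-subjects setup could be implemented; the item structure, group labels, and the assumed split of three questions per area-source combination are not taken from the study.

```python
# Minimal sketch (assumed setup, not the authors' materials): assign each
# participant to the "source shown" or "source hidden" group and present the
# 18 advice items in a random order.
import random

AREAS = ["traffic law", "tenancy law", "environmental law"]
SOURCES = ["language model", "lawyer"]

# 18 items: 3 areas x 2 sources x 3 questions per combination (assumed split).
ADVICE_ITEMS = [
    {"area": area, "source": source, "question": q}
    for area in AREAS
    for source in SOURCES
    for q in range(3)
]

def build_session(participant_id: int) -> dict:
    """Randomly assign a group and shuffle the presentation order of the items."""
    group = random.choice(["source shown", "source hidden"])
    items = random.sample(ADVICE_ITEMS, k=len(ADVICE_ITEMS))  # random order
    return {"participant": participant_id, "group": group, "items": items}

session = build_session(participant_id=1)
print(session["group"], len(session["items"]), "items")
```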

Summarized examples of questions asked

  • Traffic law: my father received a fine after his car got stuck on a speed bump. How can he defend himself legally?
  • Tenancy law: my landlord wants to evict me because I have a dog, even though he gave verbal permission at the start. What are my rights?
  • Environmental law: the municipality will not allow me to install a colored metal fence. Is the municipality allowed to refuse this under the law?

Both the language model and the lawyers answered the same questions without knowing each other’s answers.

Results

  • Advice from the language model: average score 7.23.
  • Advice from the lawyer: average score 6.74.

As soon as the source was known, the differences largely disappeared.
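As a rough illustration of how such a comparison of mean willingness scores can be made, the sketch below simulates ratings with means close to those reported and runs a simple independent-samples t-test. This is not the authors' analysis pipeline, and all individual ratings are simulated; only the reported group means are taken from the study.

```python
# Minimal sketch (simulated data, not the study's analysis): compare average
# willingness-to-act ratings for language-model vs lawyer advice when the
# source is hidden. The study reported means of 7.23 and 6.74 on a 1-9 scale.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated per-advice ratings on the 1-9 willingness scale.
llm_ratings = np.clip(np.round(rng.normal(loc=7.2, scale=1.4, size=900)), 1, 9)
lawyer_ratings = np.clip(np.round(rng.normal(loc=6.7, scale=1.4, size=900)), 1, 9)

print(f"Language model advice: mean = {llm_ratings.mean():.2f}")
print(f"Lawyer advice:         mean = {lawyer_ratings.mean():.2f}")

# A simple independent-samples t-test on the two sets of ratings.
t_stat, p_value = stats.ttest_ind(llm_ratings, lawyer_ratings)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```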

Insight into Experiment 2

The second experiment repeated the test setup where participants were not shown whether the legal advice came from the language model or from the lawyer.

Methodology

  • 78 new participants who had not participated in Experiment 1.
  • The same 18 legal cases were presented.
  • Participants again did not know whether the advice came from a language model or a lawyer.
  • For each case, they indicated their willingness to follow the advice on a scale of 1 to 9.

Results

  • Advice from the language model: average score 7.08.
  • Advice from the lawyer: average score 6.82.

In this repetition as well, participants placed more trust in the advice from the language model than in the advice from lawyers. The difference was statistically significant, confirming the results of Experiment 1.

Insight into Experiment 3

The third experiment investigated whether laypeople could recognize the difference between the advice from the language model and the advice from the lawyer.

Methodology

  • 110 participants evaluated 18 pieces of advice and had to estimate who had written them.
  • Judgments were given on a scale from 1 (certainly a language model) to 6 (certainly a lawyer).

Results

The average discrimination score was 0.59. This means that participants could tell whether advice was written by a language model or a lawyer only slightly better than chance; in practice, the distinction was barely visible to them, and their ability remained limited.
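The summary does not state which discrimination metric the researchers used. One common choice for this kind of judgment task is an ROC-AUC, where 0.5 corresponds to pure guessing; the sketch below shows how such a score could be computed from the 1-6 source judgments. Both the metric and the data here are assumptions for illustration only.

```python
# Minimal sketch (assumed metric and simulated data): treat each 1-6 source
# judgment as a confidence score and compute an ROC-AUC against the true
# source. An AUC of 0.5 is chance level; values around 0.59 indicate only
# slightly better-than-chance discrimination.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Simulated ground truth: 1 = written by a lawyer, 0 = written by the model.
true_source = rng.integers(0, 2, size=500)

# Simulated 1 (certainly language model) to 6 (certainly lawyer) judgments,
# weakly shifted by the true source to mimic limited discrimination ability.
judgments = np.clip(
    np.round(rng.normal(loc=np.where(true_source == 1, 3.7, 3.3), scale=1.2)),
    1, 6,
)

auc = roc_auc_score(true_source, judgments)
print(f"Discrimination score (AUC): {auc:.2f}")  # typically lands a bit above 0.5
```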

Key Findings

  • When the source is unknown, laypeople are more likely to trust legal advice generated by a language model than advice from a lawyer. This stems largely from the fact that language models often use confident and complex language.
  • Complex language used by language models can be mistakenly seen as a quality indicator.
  • The preference for the language model’s advice only applied when laypeople did not know who had written the advice. In the group where the source was known, participants made no distinction between the advice from the language model and that from the lawyer.
  • Participants could only distinguish the difference between the advice from the language model and the lawyer to a limited extent.

Conclusion

The research shows that laypeople trust legal advice from a language model more readily than advice from a lawyer, as long as it is not clear who wrote the advice. This underlines the importance of being transparent about the origin of legal advice and of better informing users about how language models work and where their limitations lie. More investment is therefore needed in AI literacy and in the ability to recognize language-model-generated text, as this is the only way to limit risks such as over-reliance on automatically generated advice.

LegalMike in Action

Every two weeks on Friday afternoons, we organize a digital knowledge session. During these sessions, we demonstrate how to optimally utilize LegalMike in your legal practice, from real-world examples to practical tips.

The next knowledge session will take place on April 10.
