Dartmouth researchers have built an artificial intelligence model for detecting mental disorders using conversations on Reddit, part of an emerging wave of screening tools that use computers to analyze social media posts and gain insight into people’s mental states.
What sets the new model apart is a focus on the emotions rather than the specific content of the social media texts being analyzed. In a paper presented at the 20th International Conference on Web Intelligence and Intelligent Agent Technology, the researchers show that this approach performs better over time, irrespective of the topics discussed in the posts.
There are many reasons why people don’t seek help for mental health disorders: stigma, high costs, and lack of access to services are some common barriers. There is also a tendency to minimize signs of mental disorders or conflate them with stress, says Xiaobo Guo, Guarini ’24, a co-author of the paper. It’s possible that people will seek help with some prompting, he says, and that’s where digital screening tools can make a difference.
“Social media offers an easy way to tap into people’s behaviors,” says Guo. The data is voluntary and public, published for others to read, he says.
Reddit, which offers a massive network of user forums, was their platform of choice because it has nearly half a billion active users who discuss a wide range of topics. The posts and comments are publicly available, and the researchers could collect data dating back to 2011.
In their study, the researchers focused on what they call emotional disorders—major depressive, anxiety, and bipolar disorders—which are characterized by distinct emotional patterns. They looked at data from users who had self-reported as having one of these disorders and from users without any known mental disorders.
They trained their model to label the emotions expressed in users’ posts and map the emotional transitions between different posts. A post could be labeled “joy,” “anger,” “sadness,” “fear,” “no emotion,” or a combination of these. The map is a matrix that shows how likely a user is to move from any one state to another, such as from anger to a neutral state of no emotion.
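To make the idea of a transition matrix concrete, here is a minimal sketch in Python. It is not the researchers' code. It assumes each post has already been given exactly one label (the article notes that combinations are also possible), and the function name `transition_matrix` and the sample sequence are hypothetical.

```python
from collections import Counter

# Emotion labels named in the article; one label per post is a simplifying assumption.
EMOTIONS = ["joy", "anger", "sadness", "fear", "no emotion"]

def transition_matrix(labels):
    """Return a row-normalized matrix of transitions between consecutive posts.

    Entry [i][j] estimates how likely a post labeled EMOTIONS[i] is to be
    followed by a post labeled EMOTIONS[j]. Rows with no observations stay 0.
    """
    counts = Counter(zip(labels, labels[1:]))  # count each (current, next) pair
    matrix = []
    for src in EMOTIONS:
        row = [counts[(src, dst)] for dst in EMOTIONS]
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical sequence of labels for one user's posts, in time order.
posts = ["anger", "no emotion", "sadness", "sadness", "anger", "no emotion"]
fingerprint = transition_matrix(posts)
```

In this toy sequence, every "anger" post is followed by a "no emotion" post, so that entry of the matrix is 1.0, while "sadness" is followed equally often by "sadness" and by "anger".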
Different emotional disorders have their own signature patterns of emotional transitions. By creating an emotional “fingerprint” for a user and comparing it to established signatures of emotional disorders, the model can detect these disorders. To validate their results, the researchers tested the model on posts that were not used during training and showed that it accurately predicts which users may or may not have one of these disorders.
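One simple way to picture the comparison step is a nearest-match lookup. The sketch below is an illustration under stated assumptions, not the method from the paper: it swaps in plain Euclidean distance, uses toy 2x2 matrices, and the group names and all numbers are invented for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two equally sized matrices (lists of rows)."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )

def closest_signature(fingerprint, signatures):
    """Return the name of the signature matrix nearest to the fingerprint."""
    return min(signatures, key=lambda name: distance(fingerprint, signatures[name]))

# Toy matrices over two states ("sadness", "no emotion").
# All values are invented for illustration and are not results from the study.
signatures = {
    "group_a": [[0.9, 0.1], [0.6, 0.4]],
    "group_b": [[0.3, 0.7], [0.1, 0.9]],
}
user = [[0.8, 0.2], [0.5, 0.5]]
match = closest_signature(user, signatures)
```

Here the hypothetical user's matrix sits much closer to `group_a` than to `group_b`, so `match` is `"group_a"`. A trained classifier would likely replace this distance rule in practice.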
This approach sidesteps an important problem called “information leakage” that typical screening tools run into, says Soroush Vosoughi, assistant professor of computer science and another co-author. Other models are built around scrutinizing and relying on the content of the text, he says, and while the models show high performance, they can also be misleading.
For instance, if a model learns to correlate “COVID” with “sadness” or “anxiety,” Vosoughi explains, it will naturally assume that a scientist studying and posting (quite dispassionately) about COVID-19 is suffering from depression or anxiety. On the other hand, the new model only zeroes in on the emotion and learns nothing about the particular topic or event described in the posts.
While the researchers don’t look at intervention strategies, they hope this work can point the way to prevention. In their paper, they make a strong case for more thoughtful scrutiny of models based on social media data. “It’s very important to have models that perform well,” says Vosoughi, “but also really understand their working, biases, and limitations.”