Computer Science Department

Health Miner: Using Sentiment Analysis and Machine Learning to Enrich the Diabetes Patient Centric Journey

McKenzie Allaben

Social media (SM) content has become an increasingly valuable source of information in healthcare and pharmaceuticals, providing insights into patients’ emotional perspectives on disease management that would otherwise go unidentified in pharmaceutical claims data. Off-the-shelf SM listening and analytics tools offer quick ways to extract this knowledge from SM outlets. However, the one-solution-fits-all nature of these tools limits their specificity and customization, which prompts the need for an alternative, customizable SM listening and analytics tool.

Health Miner is a KNIME-based text-mining program that identifies and analyzes the sentiment and main topic of online health forum posts. Using Health Miner, researchers can extract, identify, and analyze health forum posts specifically related to type II diabetes and/or insulin within the KNIME workflow. The program performs text preprocessing and lexical analyses to estimate the sentiment expressed in the extracted content (i.e., whether it is positive, slightly positive, neutral, slightly negative, or negative). In addition to sentiment analysis, the workflow will assess the physical and psychological relevance of each forum post and account for negation when scoring sentiment. A combination of unsupervised and supervised machine learning techniques will be used to perform cluster analysis, grouping content by the “main topic” identified within each post.
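
To make the lexical scoring step concrete, the following is a minimal sketch of lexicon-based sentiment scoring with simple negation handling, written in Python rather than as a KNIME workflow. It is not Health Miner's implementation: the toy lexicon, the negator list, the two-token negation window, and the score thresholds for the five sentiment classes are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy lexicon (word -> polarity weight); not Health Miner's lexicon.
LEXICON = {
    "good": 1, "better": 1, "helpful": 1, "great": 2,
    "bad": -1, "worse": -1, "terrible": -2, "painful": -2,
}

# Assumed negation cues; contractions ending in "n't" are handled separately.
NEGATORS = {"not", "no", "never", "without"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text, window=2):
    """Sum lexicon weights, flipping the sign of a sentiment word when a
    negator appears within `window` tokens before it (assumed heuristic)."""
    tokens = tokenize(text)
    total = 0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        weight = LEXICON[token]
        preceding = tokens[max(0, i - window):i]
        if any(t in NEGATORS or t.endswith("n't") for t in preceding):
            weight = -weight
        total += weight
    return total


def sentiment_label(score):
    """Map a numeric score to one of five classes (thresholds are assumptions)."""
    if score >= 2:
        return "positive"
    if score == 1:
        return "slightly positive"
    if score == 0:
        return "neutral"
    if score == -1:
        return "slightly negative"
    return "negative"


if __name__ == "__main__":
    for post in ["My new insulin is great",
                 "This pump is not good",
                 "I checked my glucose today"]:
        print(post, "->", sentiment_label(sentiment_score(post)))
```

Under these assumptions, a post such as "This pump is not good" is scored as slightly negative, because the negator flips the weight of "good". A fixed-window rule like this is only a rough heuristic and will misread longer or nested negations.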