
Facebook Comment Volume Prediction

Yixi Chen (yca3160)

ecchen96153@gmail.com

William Ehrich (wde220)

wdehrich@gmail.com

Chungyuen Li (clt2503)

lizoyu1@gmail.com

EECS 349 - Machine Learning

Northwestern University, Spring 2017

Approach

We experimented with three regression models: (1) regression tree, (2) neural network, and (3) Naïve Bayes. In addition, we calculated single-feature baseline accuracies and applied data transformations in an attempt to increase accuracy.

To measure accuracy, we used three metrics: (1) Hits@10, (2) the proportion of predicted values within a factor of 2 of the actual values, and (3) the mean squared error. The Hits@10 score is determined by providing the model with 100 input feature vectors, generating predicted values, and counting how many of the 10 feature vectors with the highest predicted comment volume are also among the 10 with the highest actual comment volume.
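The three metrics can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the code used in the project: the function names are hypothetical, and the +1 offset in the factor-of-2 check is an assumption added here so that posts with zero comments do not cause a division by zero.

```python
import numpy as np


def hits_at_10(y_true, y_pred, k=10):
    """Count how many of the k posts with the highest predicted volume
    are also among the k posts with the highest actual volume."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # argsort is ascending, so the last k indices are the top k.
    # A stable sort keeps the handling of ties deterministic.
    top_true = set(np.argsort(y_true, kind="stable")[-k:].tolist())
    top_pred = set(np.argsort(y_pred, kind="stable")[-k:].tolist())
    return len(top_true & top_pred)


def within_factor_of_2(y_true, y_pred):
    """Proportion of predictions within a factor of 2 of the actual value.

    Assumption: both values are shifted by +1 before taking the ratio,
    so that an actual count of zero is handled."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ratio = (y_pred + 1.0) / (y_true + 1.0)
    return float(np.mean((ratio >= 0.5) & (ratio <= 2.0)))


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between predicted and actual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))
```

Each function takes the actual comment counts and the predicted counts for one batch of posts (100 posts per batch in the Hits@10 setup described above) and returns a single number. Averaging hits_at_10 over several batches gives an average Hits@10 score.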

Task


In our study, we seek to create a model that predicts the number of comments a Facebook post will receive, based on features of that post and features of the Facebook page on which it was made. From a scientific perspective, this is important because behavior on social media offers great potential for understanding people's thoughts and feelings. From a commercial perspective, companies and other institutions use social media for marketing, so greater insight into social media activity can lead to more effective marketing strategies.

Results

Our findings revealed that regression trees and neural networks were able to yield Hits@10 scores of around 7 out of 10. By comparison, Naïve Bayes yielded lower accuracies. Additionally, we observed that the number of comments on a post was best predicted by the number of comments on the corresponding page in the preceding 24 hours. Ultimately, neither user-controllable features, such as post character length and day of posting, nor data transformations, such as logarithms and square roots, had a significant impact on prediction accuracy.

Average Hits@10 scores for different combinations of regression models and feature selections
