Optimizing Content Moderation for a Leading US-based Media Company

Business Problem: As social media usage grows, moderating conversations and removing abusive input is becoming more demanding. The company needed a predictive model to automate the removal and blocking of insulting content.

Solution: A predictive model was built to identify insulting or abusive content in a conversation or Twitter input so that it could be removed automatically, reducing the moderation effort.

The solution used text mining. The text data was cleansed by removing common English words (stop words), punctuation, digits, etc., and was then converted into a document-term matrix using a term-frequency approach and Term Frequency-Inverse Document Frequency (TF-IDF) weighting.
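
The cleansing and weighting steps described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, not the R code used in the project, and everything in it is an assumption made for illustration: the stop-word list, the sample comments, the function names, and the particular TF-IDF variant (relative term frequency multiplied by log(N / document frequency)). The original work may have used a different word list and weighting scheme.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "you"}


def clean(text):
    """Lowercase the text, drop punctuation and digits, then remove stop words."""
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    return [token for token in letters_only.split() if token not in STOP_WORDS]


def tf_idf_matrix(documents):
    """Return (vocabulary, rows); rows[i][j] is the TF-IDF weight of term j in document i."""
    tokenized = [clean(doc) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against documents that are empty after cleaning
        rows.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, rows


# Hypothetical sample comments, not data from the project.
comments = [
    "You are an idiot!!! 123",
    "Great article, thanks.",
    "Idiot article 2012",
]
vocabulary, matrix = tf_idf_matrix(comments)
```

Each row of the matrix represents one comment and each column one term from the vocabulary. A matrix of this kind is what a linear classifier would take as input when learning to separate insulting comments from acceptable ones.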

Technology: R 2.15.1 with the textir and LiblineaR packages.

About the Author: Site Admin