Using Machine Learning Techniques to Limit Social Media Misinformation

March 10, 2021

Social media can be a great way to stay in touch with family, friends, and co-workers, especially during a pandemic when face-to-face gatherings are extremely limited. Social media may also be used as a way to stay informed of the latest news – nearly every local and national news source uses social media as a method of communication. But how do you separate the real news from the fake news?

Researchers at Rice University sought to answer that question and have developed a new method, using machine learning (ML) and artificial intelligence (AI), to flag fake news. The method is more efficient than existing methods and produces fewer false positives, reports the university in an article for TechXplore.

Computer scientist Anshumali Shrivastava and statistics graduate student Zhenwei Dai presented their findings at the 2020 Conference on Neural Information Processing Systems (NeurIPS) in their paper, Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier with Application to Real-Time Information Filtering on the Web.

Shrivastava and Dai’s methodology is based on the Bloom filter, a probabilistic data structure introduced in 1970 that tests whether an element is a member of a set. A Bloom filter may return false positives but never false negatives. The Rice method adds new algorithms that lower both the false-positive rate and the memory usage compared with existing approaches.
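To make the data structure concrete, here is a minimal Bloom filter sketch in Python. It is an illustration of the general technique, not the authors' implementation; the array size and hash count are arbitrary choices for the example.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array.
    May return false positives, but never false negatives."""

    def __init__(self, m=1024, k=3):
        self.m = m
        self.k = k
        self.bits = [False] * m

    def _indexes(self, item):
        # Derive k bit positions from k salted SHA-256 digests.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx] = True

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[idx] for idx in self._indexes(item))


bf = BloomFilter()
bf.add("known-malicious-url.example")
assert bf.might_contain("known-malicious-url.example")  # no false negatives
```

Because adding an item sets all of its bit positions, a later lookup of that same item always succeeds; false positives arise only when unrelated items happen to share all k positions, which is why the structure trades a small error rate for very compact memory.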

“Using test databases of fake news stories and computer viruses, Shrivastava and Dai showed their Adaptive Learned Bloom Filter (Ada-BF) required 50% less memory to achieve the same level of performance as learned Bloom filters,” shares the university.

The article explains that, while not confirmed, Twitter is presumed to use a Bloom filter to check whether a specific data element within a tweet belongs to an already identified set of elements, such as known computer viruses. The filter catches every instance of such an element but also produces false positives.

With an estimated 500 million tweets a day, even a small percentage of false positives flagged for manual review is cumbersome. So Shrivastava and Dai turned to AI and ML to lower the rate of false positives.

“Language recognition software can be trained to recognize and approve most tweets, reducing the volume that needs to be processed with the Bloom filter,” reports the university. “Use of machine learning classifiers can lower how much computational overhead is needed to filter data, allowing companies to process more information in less time with the same resources.”

Tested on malicious URLs, malware, and fake news, the Rice method reduced the false-positive rate by 80%.

“Any improvement to Bloom filter false positive rates impacts the efficiency and performance of the entire internet,” state Shrivastava and Dai in the paper. “This affects productivity and user experience of all the end-users of the internet.”

Removing misinformation from social media is also a matter of national security. Countries across the globe consider the spread of disinformation on social platforms a threat to national security and to democratic systems of government, an issue that has surfaced at both domestic and international levels quite recently. For this reason, the viral spread of false information via social media platforms is an issue that cybersecurity professionals must be aware of.

Capitol Tech offers bachelor’s, master’s, and doctorate degrees in cyber and information security as well as analytics and data science. Many courses are available both on campus and online. To learn more about Capitol Tech’s degree programs, contact admissions@captechu.edu.

Categories: Machine Learning