NewsWhip’s Head of Machine Learning outlines how machine learning is used in NewsWhip platforms.
Here, our Head of Machine Learning, Bojan Furlan, outlines how we’ve been using machine learning to improve NewsWhip technology, and what’s planned next.
NewsWhip improves the workflows of journalists, marketers and content creators by surfacing relevant stories and information fast, helping inform editorial decisions with audience data.
We aim to do this by applying machine learning and artificial intelligence techniques to our vast database of content and social engagement data. NewsWhip captures social data for millions of digital objects every day, from articles on major news sites and blogs to videos and other posts on social media platforms around the world.
NewsWhip’s machine learning team applies techniques such as Natural Language Processing, Information Extraction, Text Classification and Time Series Analysis to this content to provide deeper insights around the stories that readers are responding to on social media.
Here are four areas that we’ve been developing in Spike, which gives newsrooms real-time streams of stories on their beat that are picking up traction on social media.
Trending Entities: Highlighting the people, places and events that are trending across the web
Millions of pieces of content – videos, social posts and articles – are published online each day, but a much smaller number of common subjects recurs across them.
It would be impossible for any human to read enough content to identify these common themes, and doing so still wouldn’t answer the question of how readers are responding to different stories on social media. Identifying the core subject matter – people, events, places – that is resonating is a key task for any newsroom looking to stay on top of the day’s coverage.
At NewsWhip, we use information extraction techniques to identify people, organisations, locations and more in the full text of articles. The system then highlights these trending entities in Spike, showing users what’s gaining traction in different areas, beyond the obvious names.
Because the model looks at entities from web articles rather than just one social network, it provides a much clearer picture of what’s trending across the web as a whole. This gives our users a detailed view of the people, places and things generating engagement on social media over the last few hours.
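As a rough illustration of the idea, and not a description of NewsWhip’s actual system, the sketch below ranks entities by the combined engagement of the articles that mention them. It assumes entity extraction has already happened upstream; the field names `entities` and `engagements`, the function `trending_entities` and the sample figures are all hypothetical.

```python
from collections import defaultdict


def trending_entities(articles, top_n=3):
    """Rank entities by the total engagement of articles mentioning them.

    `articles` is a list of dicts with hypothetical keys:
    "entities" (names already extracted from the text) and
    "engagements" (social interactions recorded for the article).
    """
    totals = defaultdict(int)
    for article in articles:
        # Count each entity once per article, however often it is mentioned.
        for entity in set(article["entities"]):
            totals[entity] += article["engagements"]
    # Highest engagement first; ties broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up sample data for illustration only.
articles = [
    {"entities": ["Dublin", "Six Nations"], "engagements": 1200},
    {"entities": ["Six Nations", "Twickenham"], "engagements": 300},
    {"entities": ["Dublin"], "engagements": 500},
]

print(trending_entities(articles))
# [('Dublin', 1700), ('Six Nations', 1500), ('Twickenham', 300)]
```

A real system would also need to restrict the articles to a recent time window and merge different spellings of the same entity, which this sketch leaves out.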
Prediction: Highlighting the stories that will be big, while they’re still small
NewsWhip Spike can predict how different stories will perform on social media, based on early signals from the reception of the story in the first hours after its publication.
The model looks at the early volume of social engagement to find the ‘social velocity’ of different stories – or how fast they are spreading on social platforms. Based on that information, the system can then predict how much engagement the story will achieve over the course of the day. The more the system knows about the story, the further ahead it can predict. If a story is one hour old, Spike can predict what its social engagement will look like in six hours’ time.
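To make ‘social velocity’ concrete, here is a deliberately naive sketch: velocity is taken as engagements per hour since publication, and the forecast simply assumes that rate holds constant. This is an illustrative assumption, not the model Spike uses; the function names and numbers are hypothetical.

```python
def social_velocity(engagements, hours_since_publication):
    """Engagements gained per hour since the story was published."""
    if hours_since_publication <= 0:
        raise ValueError("hours_since_publication must be positive")
    return engagements / hours_since_publication


def naive_forecast(engagements, hours_since_publication, horizon_hours):
    """Linear extrapolation: assumes the current velocity stays constant."""
    velocity = social_velocity(engagements, hours_since_publication)
    return engagements + velocity * horizon_hours


# A story with 400 engagements one hour after publication.
print(social_velocity(400, 1))    # 400.0
print(naive_forecast(400, 1, 6))  # 2800.0
```

In practice engagement rarely grows in a straight line, so a production model would more plausibly learn growth curves from historical stories than extrapolate linearly.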
This is useful for editors who want to see how their own stories will perform on social media to inform placement and promotion, but it also allows journalists to see early in the morning which stories are likely to be trending and discussed by lunchtime. This predictive element allows newsrooms to plan their day-to-day coverage much better, using actionable signals from their audience.
Similarly, it allows social media managers at brands and agencies to get on top of the stories that will be discussed throughout the day.
Alerts: Providing trending content alerts for quick reaction
For most editors, reacting to a story that has already been covered heavily on social media and elsewhere isn’t enough. They need their newsroom to be aware as early as possible, so they can get the story to their own readers. NewsWhip’s trending content alerts help that process by quickly alerting users to stories that are gaining engagement at a rate that’s higher than usual.
NewsWhip’s technology uses statistical distributions to identify these outliers. The model compares each story’s performance to the typical engagement rate for each dashboard. If a story is performing beyond expectations, the system alerts journalists and editors via email or Slack.
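One simple way to flag such outliers, shown here only as a sketch, is a z-score test: compare a story’s engagement rate with the mean and standard deviation of recent rates for the same dashboard. The source does not say which distribution or threshold NewsWhip uses, so the threshold of 3 and the sample rates below are assumptions.

```python
from statistics import mean, stdev


def is_outlier(rate, typical_rates, threshold=3.0):
    """Return True if `rate` is unusually high compared with `typical_rates`.

    Uses a z-score: how many standard deviations `rate` sits above the mean
    of the dashboard's typical engagement rates. `typical_rates` needs at
    least two values. The threshold is an assumed cut-off.
    """
    mu = mean(typical_rates)
    sigma = stdev(typical_rates)
    if sigma == 0:
        # All typical rates are identical: anything above them stands out.
        return rate > mu
    return (rate - mu) / sigma > threshold


# Made-up engagements-per-hour figures for one dashboard.
typical = [40, 55, 48, 52, 45, 60, 50]

print(is_outlier(58, typical))   # False: within the normal range
print(is_outlier(400, typical))  # True: far above the usual rate
```

Engagement data is usually heavily skewed, so a real system might work on log-transformed rates or use percentiles instead of a plain z-score.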
Content categorisation: Automatically categorising stories for better niche reporting
NewsWhip’s machine learning model can look at the content of each article it processes, and automatically categorise it with labels such as ‘Cricket’, ‘UK Politics’, or ‘Cybersecurity’, based on what it’s about.
These categories are searchable within Spike, allowing journalists to get very accurate results about the stories that are trending and making an impact on social media around any niche topic. NewsWhip’s machine learning team is currently training the model with a view to increasing the coverage and precision of this data.
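To show what ‘coverage’ and ‘precision’ could mean here, the sketch below measures both against a small set of human-reviewed labels. The definitions are assumptions for illustration: coverage is the share of articles that received at least one category, and precision is the share of assigned categories that a reviewer agrees with. The article IDs and labels are made up.

```python
def coverage(predicted):
    """Share of articles that received at least one category."""
    if not predicted:
        return 0.0
    labelled = sum(1 for labels in predicted.values() if labels)
    return labelled / len(predicted)


def precision(predicted, reference):
    """Share of assigned categories that match the reviewer's categories."""
    assigned = 0
    correct = 0
    for article_id, labels in predicted.items():
        assigned += len(labels)
        # Set intersection: categories the model and the reviewer both chose.
        correct += len(labels & reference.get(article_id, set()))
    return correct / assigned if assigned else 0.0


# Hypothetical model output and reviewer labels, keyed by article ID.
predicted = {
    "a1": {"Cricket"},
    "a2": {"UK Politics", "Cybersecurity"},
    "a3": set(),
    "a4": {"Cricket"},
}
reference = {
    "a1": {"Cricket"},
    "a2": {"UK Politics"},
    "a3": {"Cybersecurity"},
    "a4": {"UK Politics"},
}

print(coverage(predicted))              # 0.75
print(precision(predicted, reference))  # 0.5
```

The two measures tend to pull against each other: labelling more articles raises coverage but risks more wrong labels, which lowers precision.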
What’s coming next: Spotting the ‘white space’ in reporting
In future, NewsWhip will be using machine learning to personalise our technology to help journalists and editors to find the most relevant content possible.
One area that will be extremely useful for editors is using machine learning to identify the ‘white space’ in coverage of major news events – the stories that are not being covered by a set of publishers, but are gaining engagement from social media users.
Users would be able to compare the output of their own site with that of their competitors, and then receive a feed of popular stories that neither has covered. This would allow editors to think creatively about their coverage, and cover stories that their audience haven’t already read about.
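At its simplest, the ‘white space’ idea is a set difference: stories that are gaining engagement, minus stories that the chosen publishers have already covered. The sketch below assumes stories have already been matched to a shared `story_id`, which is itself a hard problem; the function name, the engagement threshold and the sample data are all hypothetical, since the feature is described only as planned.

```python
def white_space(trending, coverage_by_publisher, min_engagements=1000):
    """Return IDs of popular stories that none of the publishers has covered.

    `trending` is a list of dicts with "story_id" and "engagements".
    `coverage_by_publisher` maps a publisher name to the set of story IDs
    it has already covered.
    """
    # Union of everything covered by any of the compared publishers.
    covered = set().union(*coverage_by_publisher.values())
    by_engagement = sorted(trending, key=lambda story: -story["engagements"])
    return [
        story["story_id"]
        for story in by_engagement
        if story["engagements"] >= min_engagements
        and story["story_id"] not in covered
    ]


# Made-up sample data for illustration only.
trending = [
    {"story_id": "flood-warning", "engagements": 5400},
    {"story_id": "budget-vote", "engagements": 9100},
    {"story_id": "transfer-rumour", "engagements": 2200},
    {"story_id": "local-fete", "engagements": 150},
]
coverage_by_publisher = {
    "our-site": {"budget-vote"},
    "competitor": {"budget-vote", "transfer-rumour"},
}

print(white_space(trending, coverage_by_publisher))
# ['flood-warning']
```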