University of Michigan Q&A: Platform Health Metrics and the Iffy Quotient


February 5th, 2019   |   Reading time: 8 minutes   |   Digital Journalism, Interviews


The University of Michigan’s Center for Social Media Responsibility has been in the news recently for its Iffy Quotient. We talked to them about it. 

We sat down with Professor Paul Resnick, the Associate Dean for Research at the Center, to discuss his team’s work on their ‘Iffy Quotient’, how they developed it, and how NewsWhip data is helping them to achieve their goals.

Professor Resnick, thanks for joining us. Would you mind beginning by telling our audience a little bit about yourself and your work at the University of Michigan?

 

Paul Resnick: I’m a professor at the University of Michigan School of Information, where I’m the Associate Dean for Research and director of the Center for Social Media Responsibility. One of the things the Center does is create platform health metrics, a way of providing accountability for the media platforms around issues where they have some public responsibility.

My background is in computer science, but I’ve been floating around social psychology, political science, and economics for the last twenty-something years. I guess I would call myself a computational social scientist. The School of Information is an interdisciplinary school made up of computer scientists, economists, and others. Probably half of our students go into user experience, and another quarter are data scientists of some form. That’s kind of who we are as an institution and who I am.

This particular project is focused on the ‘Iffy Quotient’, can you tell me a little bit about how that came about, and a little about what that means?

 

The Iffy Quotient is the fraction of popular URLs that come from ‘iffy’ sites, which is our whimsical way of referring to sites that often carry misinformation. We were looking for health metrics that we could measure from the outside, without needing to have data provided by the platforms.

This idea came to my attention when I first met Aviv Ovadya, maybe 18 months or so ago. I had seen a graph that Craig Silverman had published showing the twenty most popular stories from fake news sites versus those from non-fake news sites right after the election, and how the popularity of the fake news sites had gone up. That image was in my mind, and Aviv had prototyped something using NewsWhip data to calculate this on an ongoing basis, so I brought him in to do that as a project here at our Center: to turn it into something we could update continually rather than a one-off.

There’s a public-facing website that gets updated once a day, on a three-day delay because it takes a while for the scores to settle down. We did some historical analysis too. We went back to the beginning of 2016, and that’s part of what’s interesting: comparing over time, when it goes up, when it goes back down, as well as comparing between the two platforms, Twitter and Facebook.

Why did you think it was important to focus specifically on social media for this?

 

Well, more and more people are getting their political news from social media sites, so there’s been a lot of commentary about how important it is that people receive accurate information within that news stream. There are also arguments that people should not be in filter bubbles and should have some exposure to things they don’t always agree with. So we’re focusing on this because, basically, it’s a place where a lot of people are getting their information.

In your historical analysis do you see it as something that’s on the rise or falling since 2016?

 

Well, it rose and then it fell! In the second half of 2016 into the first quarter of 2017 there was a big rise on Twitter and on Facebook in the fraction of content that was coming from these iffy sites, but since then, it’s gone down, with a steeper decline on Facebook than on Twitter. So Facebook is now down to below the early 2016 levels, and Twitter is down from the peak but still quite a bit above the early 2016 levels.

We don’t know exactly why. It could be supply differences, it could be demand differences, and over time it could be differences in countermeasures that the platforms are taking. By supply differences I mean, if you’re politically motivated, or even commercially motivated, you might create more misinformation when there’s an election on. People might also be more interested in consuming misinformation during an election cycle.

So that gives you the supply and demand part. Maybe people were less interested in all kinds of political stuff at the end of 2017 than they were at the end of 2016, so there was less of an environment for people to put out surprising, enticing, untrue information.

But then there are also the countermeasures that the platforms are taking. Both Facebook and Twitter are trying to crack down on fake accounts, and there may be other countermeasures the platforms are taking that we just don’t know about. Because we don’t know exactly what the different platforms have done, we’re not able to say what was more effective and what was less effective; we only see the cumulative effect.

Can you tell me about the methodology for the study?

 

The first step is mashing up two information sources. One is NewsWhip, which provides us with information about the most engaged-with URLs, and the other is Media Bias Fact Check, which provides judgments about which sites are iffy and which ones are okay. They obviously don’t use the same term as us; ‘iffy’ is our term. So we get engagement data from NewsWhip, and we look at the overlap between the iffy sites and the engagement data. We start from the list of URLs and cross-check them against the sites that are defined as iffy. That way, at a single point in time, we can compare the Iffy Quotient on Twitter and Facebook, and that’s a meaningful comparison to make.
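To make that overlap check concrete, here’s a minimal sketch of how the calculation might look in code. The domain list and URL records below are placeholders for illustration only, not the Center’s actual data or scripts.

```python
from urllib.parse import urlparse

# Hypothetical inputs: domains flagged as "iffy" (judgments from a list like
# Media Bias Fact Check) and a day's most engaged-with URLs from an engagement source.
iffy_domains = {"example-iffy-news.com", "another-iffy-site.net"}
popular_urls = [
    {"url": "https://example-iffy-news.com/story-1", "engagement": 5400},
    {"url": "https://mainstream-example.com/report", "engagement": 12000},
]

def site_of(url: str) -> str:
    """Extract the site (domain) from a URL, dropping any 'www.' prefix."""
    netloc = urlparse(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

def iffy_quotient(urls, iffy_sites) -> float:
    """Fraction of popular URLs that come from iffy sites."""
    if not urls:
        return 0.0
    iffy_count = sum(1 for item in urls if site_of(item["url"]) in iffy_sites)
    return iffy_count / len(urls)

print(f"Iffy Quotient: {iffy_quotient(popular_urls, iffy_domains):.2%}")
```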

We used the NewsWhip API, and we have a set of scripts that run daily. We were also able to pull historical data, because obviously we haven’t been running this daily since 2016, so fortunately the API allowed us to query going back. Now, on an ongoing basis, we just query for each new day. We have scripts that do that, and then we store everything and maintain the provenance, including the date we ran the query on, in case we have to recheck our work. But the basic pieces are the script that hits the API and the historical data checked against the Media Bias Fact Check list.
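Here’s a rough sketch of what such a daily job could look like. The endpoint, request parameters, and file layout are assumptions made for illustration; they are not the actual NewsWhip API interface or the Center’s scripts.

```python
import datetime
import json
import pathlib
import requests

API_URL = "https://api.newswhip.com/v1/articles"  # hypothetical endpoint, for illustration only
API_KEY = "YOUR_API_KEY"

def fetch_day(day: datetime.date) -> dict:
    """Query the API for the most engaged-with URLs published on a given day."""
    payload = {
        "key": API_KEY,
        "from": day.isoformat(),
        "to": (day + datetime.timedelta(days=1)).isoformat(),
        "size": 5000,
    }
    response = requests.post(API_URL, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()

def store_with_provenance(day: datetime.date, data: dict, outdir: str = "data") -> None:
    """Save the raw response along with when the query was run, for later rechecking."""
    record = {
        "query_date": datetime.date.today().isoformat(),
        "target_date": day.isoformat(),
        "response": data,
    }
    path = pathlib.Path(outdir) / f"{day.isoformat()}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record))

# Daily run, on a three-day delay so engagement scores have time to settle.
target = datetime.date.today() - datetime.timedelta(days=3)
store_with_provenance(target, fetch_day(target))
```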

We do some things with redirects too, to see if there are name changes for sites that are no longer publishing under the same URL, so we do a little bit of that aliasing as well.
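One way that kind of aliasing might be handled, purely as an illustrative sketch (the alias table here is hypothetical):

```python
from urllib.parse import urlparse
import requests

# Hypothetical alias table mapping retired domain names to the name used on the iffy-site list.
DOMAIN_ALIASES = {"old-site-name.com": "new-site-name.com"}

def canonical_site(url: str) -> str:
    """Follow redirects and apply known aliases so renamed sites still match the list."""
    try:
        # A HEAD request that follows redirects shows where the URL actually lands.
        final_url = requests.head(url, allow_redirects=True, timeout=10).url
    except requests.RequestException:
        final_url = url  # fall back to the original URL if the site is unreachable
    domain = urlparse(final_url).netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return DOMAIN_ALIASES.get(domain, domain)
```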

What led you to choose NewsWhip?

 

Aviv, the guy who brought this project to the Center, had already identified NewsWhip as a place that had this data, tracking a large number of sites and URLs. That’s the thing that would have been difficult for us to do had we tried to do it without your data; we wouldn’t have been able to figure out which were the most popular URLs on a particular day. So that helped us figure out what the most popular ones were, which was very useful.

What are some of the roadblocks or challenges you’ve run into on this study?

 

There were a lot of decisions about what to do with redirects, and about lists not matching up perfectly against 2016, so there were questions there we had to settle. We also had to try to understand what we might be missing; there could be URLs that weren’t being tracked, for example, and that’s a challenge we never really had a solution for. There were also decisions about whether we should use the engagement counts and how meaningful that was.

I think the last one is challenges around the data, because we want to restrict our universe to public affairs content only. We’re computing a numerator and a denominator. Do we want to try to exclude a URL from both if it’s a game site or something like that? We didn’t end up having a classifier that was reliable enough, so we just decided to use any URL that was being tracked and became popular on Facebook or Twitter.
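A small sketch of how that numerator/denominator choice might look; the `is_public_affairs` classifier here is purely hypothetical, standing in for the filter the team decided not to use.

```python
from urllib.parse import urlparse

def site_of(url: str) -> str:
    """Domain of a URL, without any 'www.' prefix (same helper as in the earlier sketch)."""
    netloc = urlparse(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

def iffy_quotient(urls, iffy_sites, is_public_affairs=None) -> float:
    """Fraction of popular URLs from iffy sites.

    If a reliable public-affairs classifier were available, it would filter both
    the numerator and the denominator; with none, every tracked URL counts,
    which is the approach described above.
    """
    if is_public_affairs is not None:
        urls = [u for u in urls if is_public_affairs(u)]  # drop game sites, etc.
    if not urls:
        return 0.0
    iffy = sum(1 for u in urls if site_of(u["url"]) in iffy_sites)  # numerator
    return iffy / len(urls)                                         # denominator
```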

So the website is going to be up indefinitely. Eventually the chart will get smooshed because the x-axis will have too many days on it, so we’ll have to change that!

And what’s next?

 

This is the first of what we hope will be many platform health metrics. We’re not announcing the next project yet, but stay tuned!

Thanks for taking the time Professor Resnick! If you’d like your own view of what’s going viral on the web and social right now, you can check out NewsWhip Spike.