By Thomas Baekdal - June 2016

Accurate Analytics is Painful

Just a short note about something you should consider. As all of you know, I am obsessed with accurate analytics. I don't care about views or clicks or all that other nonsense. I want to know exactly how much influence I have, and I'm constantly trying to get as close to the most realistic number as possible.

However, doing this is exceptionally painful and sort of depressing, because it also means that my numbers will be much lower than anyone else's.

Let me give you an example.

On this site I'm measuring this for each article:

And when we look at this for an article like 'The Increasing Problem With the Misinformed', which was fairly popular, we end up with this data:

That's quite a difference, isn't it? But this isn't even the worst example. The worst example is this article:

This is just insane!

And, in the early days of this site, the way I did article analytics was simply to count the number of times an article was requested from the server, which is how many CMSs work (including those at big newspapers).

This gave me a rather impressive number that felt good and was fun to look at. I mean, who wouldn't want to see articles with a ton of views all the time? But, as you can see, this number is completely distorting my reality in the most misleading way possible.

The RAW view numbers recorded by my CMS are total crap.

So, we have to ignore the server-level data and start with the next level of analytics, which is the pageview numbers recorded at the browser level (like with Google/Adobe/Comscore Analytics). This number is far more accurate since it doesn't include all the thousands of views that aren't from actual people (bots and other automated junk traffic).

The result could look like this, which is the data for my recent Plus article 'Publishers and The Snacking Economy'.

Keep in mind, Baekdal Plus is a subscription site and my audience is mostly media executives. This severely limits my traffic potential, because these articles never get any of the random traffic you would get with a free site. This is an insanely focused audience.

But look at this graph and ask yourself, which one represents the most accurate number for indicating real traffic?

Pageviews? Nah... that may be how many times the article was actually loaded into the browser, but it's not very accurate since it doesn't indicate the number of people.

Unique pageviews? Well, kind of... but look at the difference between unique pageviews and the number of people who started scrolling down the page. That's a big difference.

Think about this as if you were running a physical retail shop. If you see a person walking into your shop who then immediately turns around before he or she even has a chance to look at anything, would you consider that person to be a customer?

No, of course not. What actually happened was that this person walked into your store, realized it wasn't where she wanted to go, and left. That might technically be counted as a person, but it's not actually a visitor. It's a mistaken view.

So, neither pageviews nor unique pageviews really tell you what you need to know. They are inflated numbers that may or may not indicate a real audience.

More to the point, there is a huge difference between each article. Here, again, are the graphs from the two articles from before.

As you can see, the number of people who started scrolling and the number of unique pageviews are almost the same in the first article, but they are massively different in the second article.

This is something that you really need to know. If you just look at unique pageviews, you really have no idea what's going on.

Thus, the way I do analytics today is that I look at analytics from the bottom up. My primary metric is the number of people who read each article. That's my number one metric. My second most important metric is the 'started scrolling' metric, because that is a better indication of my real traffic.

And when I do read-rates, I don't compare reads with pageviews (because that would be misleading, considering that a big part of those might be accidental traffic). I compare them to 'started scrolling'.

For instance, the first article I mentioned has an effective read-rate of 24%.
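As a minimal sketch of this bottom-up read-rate, assuming the per-article counts are already available: the `ArticleFunnel` class, its field names, and the example numbers below are all hypothetical and are not taken from my actual analytics setup. The only thing it is meant to show is the choice of denominator.

```python
from dataclasses import dataclass


@dataclass
class ArticleFunnel:
    """Per-article counts, ordered from least to most trustworthy.

    The class and field names are hypothetical, for illustration only.
    """
    raw_requests: int        # server-level requests (includes bots)
    pageviews: int           # loads recorded at the browser level
    unique_pageviews: int    # browser-level loads, deduplicated
    started_scrolling: int   # people who began scrolling down the page
    reads: int               # people who actually read the article

    def read_rate(self) -> float:
        """Reads as a share of the people who started scrolling."""
        if self.started_scrolling == 0:
            return 0.0
        return self.reads / self.started_scrolling

    def naive_read_rate(self) -> float:
        """Reads as a share of pageviews (the misleading comparison)."""
        if self.pageviews == 0:
            return 0.0
        return self.reads / self.pageviews


# Hypothetical numbers, chosen only to illustrate the calculation.
example = ArticleFunnel(
    raw_requests=50_000,
    pageviews=12_000,
    unique_pageviews=10_000,
    started_scrolling=5_000,
    reads=1_200,
)
print(f"read-rate vs started scrolling: {example.read_rate():.0%}")
print(f"read-rate vs pageviews:         {example.naive_read_rate():.0%}")
```

With these made-up counts, the same number of reads looks far weaker when it is divided by pageviews, simply because the denominator includes accidental traffic.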

So why is this a painful way to do analytics? Well, because all the numbers I see are now much lower than before. In the past I would look at my analytics and it would tell me that an article had 53,000 views and I would get really excited about it. Today, however, my analytics system tells me that the same article has 4,852 readers.

It's a far more accurate number, and it really helps me understand my content, but it's not as fun. In fact, it can be quite depressing when an article doesn't work.
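To put the two figures above side by side, here is the arithmetic as a short sketch. The two numbers come from the text; the variable names are just for illustration.

```python
views_reported = 53_000   # old server-level view count (from the text)
readers = 4_852           # readers reported by today's analytics (from the text)

# Share of the old view count that corresponds to actual readers.
share = readers / views_reported
print(f"{share:.1%} of the old view count were readers")

# The same relationship expressed as counted views per reader.
print(f"about 1 reader per {views_reported / readers:.0f} counted views")
```

In other words, fewer than one in ten of the views I used to get excited about was an actual reader.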

The reason I tell you this story is that you need to do the same thing. Too often I see publishers use view metrics that have no relation to reality. I see brands whose conversion calculation is based on an audience that isn't actually an audience, completely missing what actually works.

The worst example these days is Facebook. Every single week I hear publishers talk about their amazing video views, even when we know that they don't mean anything.

The same principle applies. Look at the analytics from the other end.

Let me give you one example:

Here is a very short 18-second video that Facebook is using to explain their video analytics.

First of all, 18 seconds for a video that is autoplaying and that people never really had to think about? That's not really a view. If you are a publisher and you are posting content like this, I wouldn't consider any of this to be a view, because it's so short that it's hardly content in the first place. And the people who do watch it will have an almost zero percent retention rate.

But let's leave that to one side for a moment and imagine that this is actually real content, and then look at the numbers.

Look at the metrics above. Which number represents the most accurate figure for real people viewing this video? The answer is the one at the bottom. It's the 7,726 unique views (6,517 being organic and 1,209 being paid).

All the other numbers are fake. They are completely pointless numbers that give you no real indication of how many people you actually reached. Think, again, about the example of the person walking into a retail store. A person who walks into your store and immediately leaves is not actually a visitor. You never reached that person. And it's the same on Facebook.

We see this even more clearly when we see the metrics for a slightly longer video, like this one:

This video is 1 minute and 21 seconds long, but as you can see, it lost 66% of its views in the first five seconds alone. Those are viewers that you never actually reached. None of those people really looked at the video. They never reached the point of making a decision about it. It was just yet another form of noise.

So, think about what I said about the difference between pageviews, started scrolling, and reads, and convert that thinking into Facebook views. What we get is this:

If we then write that out like I did to begin with, we get a graph like this (Facebook counts views from 3 seconds):

So, I ask you again: Which number represents the most accurate metric? The answer, of course, is the one at the end.
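The same funnel can be sketched in code, assuming you have the watch time for each viewer. Only the 3-second threshold comes from Facebook's definition mentioned above. The `view_funnel` function, the `started_after` and `real_share` cut-offs, and the sample watch times are hypothetical stand-ins, so pick the thresholds that match your own definition of a started and a real view.

```python
def view_funnel(watch_seconds, video_length, started_after=10.0, real_share=0.5):
    """Count viewers at each level of the funnel.

    watch_seconds: seconds watched, one entry per viewer.
    video_length:  length of the video in seconds.
    started_after: ASSUMED cut-off (seconds) for 'started viewing'.
    real_share:    ASSUMED share of the video that makes a 'real view'.
    """
    facebook_views = sum(1 for s in watch_seconds if s >= 3.0)
    started_viewing = sum(1 for s in watch_seconds if s >= started_after)
    real_views = sum(1 for s in watch_seconds if s >= real_share * video_length)
    return {
        "facebook_views": facebook_views,
        "started_viewing": started_viewing,
        "real_views": real_views,
    }


# Hypothetical watch times for ten viewers of an 81-second video.
sample = [1, 3, 4, 4, 6, 12, 20, 45, 81, 81]
funnel = view_funnel(sample, video_length=81)
for level, count in funnel.items():
    print(f"{level:>16}: {count}")
```

The number to report is the last one in the dictionary, and the one to compare it against is `started_viewing`, not `facebook_views`.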

And this is even more important when you start to compare how each video performed. Take a look at the graphs for the two different Facebook videos I showed you above. They have entirely different patterns, which also means that the difference between Facebook views, started views, and real views will follow entirely different patterns.

The only real way to compare them is to look at the bottom of your analytics, rather than at the top. You start with real views, and compare that to started viewing, and you ignore whatever fake number of views there is further up.

It's the same for brands. If you are promoting a video on Facebook, the only number that means anything is the real views, because that indicates the audience that you actually reached. That's the only group that has the potential to convert into sales.

However, it's slightly different for publishers selling views (like with native advertising), because publishers cannot promise brands that their message will work. It's up to the brand to make sure their product and ad are worth looking at. Publishers can only sell brands the 'potential' for a real view.

In other words, publishers can only offer brands this:

The problem is that this isn't what Facebook is doing, and it isn't what many publishers are doing. And it's one of the many reasons that there is this huge conflict between agencies, brands, publishers and the platforms.

For instance, GroupM, the largest media buying planner in the world, recently decided that they would only accept the metric for 'started viewing' for every deal going forward. As they say:

For video ads, at least half of the video must be played, and the viewer actually has to press play - auto-play videos that start up when a page loads or when a user scrolls past them don't count - and the sound must be on.

And while some publishers are annoyed about this, as a media analyst, I absolutely agree with them. Keep in mind that advertising video is usually fairly short, so half the video might only be 15 seconds.
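Written as a rule, the standard quoted above could look something like this. This is only a sketch of the three conditions as they are described in the quote; the `VideoAdPlay` record and its field names are hypothetical, and the 30-second ad is just an example length.

```python
from dataclasses import dataclass


@dataclass
class VideoAdPlay:
    """One playback of a video ad (hypothetical record layout)."""
    duration_seconds: float    # full length of the ad
    played_seconds: float      # how much of it was actually played
    user_pressed_play: bool    # False for auto-play on load or on scroll
    sound_on: bool


def counts_as_view(play: VideoAdPlay) -> bool:
    """Apply the three conditions from the quote: at least half of the
    video played, started by the viewer, and with the sound on."""
    half_played = play.played_seconds >= play.duration_seconds / 2
    return half_played and play.user_pressed_play and play.sound_on


# A 30-second ad: half the video is 15 seconds.
clicked = VideoAdPlay(30, 15, user_pressed_play=True, sound_on=True)
autoplay = VideoAdPlay(30, 30, user_pressed_play=False, sound_on=True)
muted = VideoAdPlay(30, 20, user_pressed_play=True, sound_on=False)

print(counts_as_view(clicked))   # True
print(counts_as_view(autoplay))  # False
print(counts_as_view(muted))     # False
```

Note that the auto-played video is rejected even though it ran to the end, because nobody decided to watch it.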

This is how the world should work. A view should only count from the point where people actually decide whether they want to see it or not. And it's the role of publishers and platforms to build user behavior to facilitate that, rather than optimize for fake views where the numbers don't mean anything.

This doesn't mean people will actually end up viewing the whole thing, nor that brands get any ROI out of it. But the purpose of advertising is to generate a lead. Much of what we see today doesn't do that at all.

For publishers, consider this a competitive advantage. Take the Financial Times, which was one of the first to sell advertising time rather than views. That gave them a considerable competitive advantage and positioned them as a serious place to advertise.

What the Financial Times told its ad partners was that they wouldn't just give them random views, they would give them real leads.

That is a powerful message.

So, turn your analytics upside down. Start with the real numbers at the bottom and ignore the fake numbers at the top. It isn't as fun to look at and it's much harder, but it gives you a much better picture.


Thomas Baekdal

Founder, media analyst, author, and publisher.

"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé

