🖥 Metrics Monday Web Analytics Series
My initial vision for Metrics Monday didn’t include multi-week series. As I considered it further, though, I realized that many metrics (especially in web analytics) build on more fundamental concepts and terminology, so there’s a natural order in which to introduce them. To that end, I’ll be writing about web analytics for the next 12 weeks or so.
🚦 Web Traffic: Pageviews, Users, and Sessions
The first few metrics we’ll tackle are measures of web traffic — pageviews, users, and sessions.
Unsurprisingly, Google Analytics dominates the website performance data collection space with its free service, but plenty of other platforms do the same job, from powerhouse Adobe Analytics to small-but-mighty, Y Combinator-backed Mixpanel.
I’ll use terminology consistent with Google Analytics definitions unless otherwise stated (because it’s ubiquitous, not necessarily because it’s “better” than other platforms).
📜 Pageviews: Counting is hard.
In my career as a data practitioner, across companies and industries, I’ve consistently found that the metrics that seem the simplest — specifically the ones involving counting something — can be the hardest to pin down. Why? There are several factors, but I’m convinced that a big one is that we—as business units and individual contributors—all think we know the definition of some very foundational concept, like what a “customer” is, and go about analyzing data accordingly. At some point, we realize the business intelligence team’s numbers don’t match the customer success team’s, whose don’t match the finance team’s or the marketing team’s.
As you probably intuited, pageviews (or page views, if you prefer) is a measure of the number of views of a webpage. It sounds straightforward, but there are some nuances, and you must proceed with a definition. In writing. That everyone in your organization knows about and can access. Trust me regarding the need for agreed-upon definitions of common metrics if you trust me about nothing else again, ever. I promise that defining something is always — and I very, very rarely say “always” — better than not defining it. Having a consistent, agreed-upon, written-down definition will save you time, effort, and unnecessary meetings, and — most importantly — will enable a trusting relationship between you, the data practitioner, and everyone else at your organization.
Should we count several views of the same page by the same user in quick succession as separate pageviews? How do we define “views”? What interval of time do we consider “quick succession”?
Questions like these, questions asked by data teams everywhere, are the reason you must have a well-defined, well-documented metric before you implement dbt or a data mesh or a lakehouse or whatever shiny, new component of the Modern Data Stack gets prioritized above all else at your company.
Let’s take a look at the Google Analytics definition of pageviews found in the Google Analytics Help Center Glossary:
A pageview is an instance of a page being loaded (or reloaded) in a browser. Pageviews is a metric defined as the total number of pages viewed.
By Google’s definition, the pageviews metric doesn’t involve who the viewer is or when they do the viewing, so to answer one of the questions posed previously, we should count our multiple views by the same viewer in a short timeframe as distinct pageviews.
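To see how much the answer to those questions matters, here is a minimal Python sketch. It is not Google's implementation: the hit log, the two helper functions, and the 30-second window are all hypothetical, chosen only to show that two reasonable-sounding definitions give different totals for the same data.

```python
from datetime import datetime, timedelta

# Hypothetical hit log: (user_id, page_path, timestamp). Not a real export format.
hits = [
    ("u1", "/pricing", datetime(2023, 1, 2, 9, 0, 0)),
    ("u1", "/pricing", datetime(2023, 1, 2, 9, 0, 10)),  # reload 10 seconds later
    ("u1", "/pricing", datetime(2023, 1, 2, 9, 5, 0)),   # return 5 minutes later
    ("u2", "/pricing", datetime(2023, 1, 2, 9, 0, 5)),
]


def count_all_loads(hits):
    """Every load or reload counts, regardless of who viewed or when."""
    return len(hits)


def count_deduplicated(hits, window=timedelta(seconds=30)):
    """Alternative rule (an assumption, for contrast): skip a repeat view of the
    same page by the same user within `window` of the last counted view."""
    last_counted = {}
    total = 0
    for user, page, ts in sorted(hits, key=lambda h: h[2]):
        key = (user, page)
        if key not in last_counted or ts - last_counted[key] >= window:
            total += 1
            last_counted[key] = ts
    return total


print(count_all_loads(hits))     # 4
print(count_deduplicated(hits))  # 3
```

Same four rows, two different "pageviews" numbers. Neither is wrong on its own; they only become a problem when two teams each assume theirs is the definition.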
🛑 But wait. Google Analytics provides a more detailed definition* in an article titled “The difference between Google Ads Clicks, and Sessions, Users, Entrances, Pageviews, and Unique Pageviews in Analytics” (🤯what a mouthful!), also in the Help Center:
A pageview is defined as a view of a page on your site that is being tracked by the Analytics tracking code. If a user clicks reload after reaching the page, this is counted as an additional pageview. If a user navigates to a different page and then returns to the original page, a second pageview is recorded as well.
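As a rough illustration of that definition, the sketch below walks through one visitor's navigation. The page paths, the `TRACKED_PAGES` set, and the `pageviews_by_page` helper are hypothetical; the point is only that reloads and returns each add a pageview, while a page without tracking code adds nothing.

```python
from collections import Counter

# Hypothetical set of pages that carry the tracking code.
TRACKED_PAGES = {"/home", "/pricing"}

# One visitor's page loads, in order. "/status" is assumed to be untracked.
navigation = ["/home", "/home", "/pricing", "/status", "/home"]


def pageviews_by_page(navigation, tracked):
    """Count one pageview per load of a tracked page, reloads and returns included."""
    return Counter(page for page in navigation if page in tracked)


counts = pageviews_by_page(navigation, TRACKED_PAGES)
print(counts["/home"])       # 3: first load, reload, and return
print(counts["/pricing"])    # 1
print(sum(counts.values()))  # 4: the untracked "/status" load is not counted
```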
So this involves tracking code. How do we know if the site our viewer visited is being tracked?
The Google Analytics developer’s guide definition is technical, obviously, but succinct. It tells us exactly which sites will be tracked and counted in Google Analytics pageviews.
There are two ways to send a pageview to Google Analytics:
1. Use the default behavior of the gtag.js snippet
2. Send manual page_view events
I’ll translate the two most technical aspects of the definition:
A gtag.js snippet is a short bit of code written in JavaScript, the language of the web.
A page_view event is a replacement for the aforementioned snippet of JavaScript. It tells Google Analytics NOT to count pages the way it does by default, but to count the pages we want to count, the way we want to count them. (Disclaimer: This definition is not technically 100% accurate, but for the sake of non-developers’ understanding it does accurately reflect the purpose of the page_view event.)
Adobe Analytics documentation defines pageviews similarly, and so do other platforms I’m familiar with.
We went down the long and winding road of Google Analytics’ definition of pageviews to make one point: the numbers you send to stakeholders in reports and display in data visualizations could be very wrong if you don’t know that the software development team holds the most accurate definition of this metric. As a data analyst having just made this discovery, my next move would be to contact a member of the dev team and ask where to find the documentation for this metric. Most companies have probably chosen to stick with Google’s defaults, but some definitely have not. The accuracy of your web traffic metrics depends upon knowing for certain how your company defines pageviews.
Finally, after a long journey into semantics, here’s the calculation:
SUM(Pageviews)
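In Python, that sum might look like the sketch below. The rows are made-up numbers standing in for a daily report, not real data or a real export format.

```python
# Hypothetical daily report rows: (date, page_path, pageviews).
rows = [
    ("2023-01-02", "/home", 120),
    ("2023-01-02", "/pricing", 45),
    ("2023-01-03", "/home", 98),
]

# SUM(Pageviews): add up the pageviews column across all rows.
total_pageviews = sum(pageviews for _, _, pageviews in rows)
print(total_pageviews)  # 263
```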
The mathematics is easy for this one, but getting it right is hard.