[EDIT: This post has been edited to share new findings. All edits are clearly noted].
Yesterday I was asked an interesting question by one of my clients. The organization, which provides services in the United States, wanted to know if a greater percentage of people were booking appointments in a state different from where they live now compared to in the past. The business analyst who was investigating this question noticed a significant uptick in “out of state” bookings and, appropriately confused by the data, also asked me “did Google Analytics change the way they handle location or something?”
This additional inquiry was spot on. One of my analytics mantras is, “if that data looks wrong, it probably is wrong”. My team can affirm that I consistently add the “nose emoji” to team chats imploring everyone to make sure to “smell the data” before we deliver something to a client.
So I started digging, and this is what I initially found:
While the main point of this blog post is to discuss the outcomes of what I found (image above and what it means), I’m going to proceed a bit methodically through my steps for the benefit of readers who are newer to the analytics industry and might not (yet) know how to produce the output that I did.
In order to understand if something happened to “the way Google Analytics handles location” I wanted to see if there was a change in the % Total of Sessions for any given location. In order to do so, I used the following query (easily pulled in a custom report or via API as well).
Although initially the only data I needed was date, region, and sessions, I added Metro as a more granular location dimension for the US, plus a bunch of device information (browser, browser version, operating system, and operating system version). If “strange” behavior in analytics can be isolated to a particular browser or OS, then it is likely to be a cross-browser issue and not a true behavioral issue. Though not visible in the query, I also did a UNION of multiple data sources so that I could determine if the issue was limited to one website or present across a number of different websites.
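For readers who want to see the shape of that combined data set, here is a minimal Python sketch of the UNION step. It is not the query I ran: the field names, the sample rows, and the helper names `union_sources` and `sessions_by` are all made up for illustration. Each row is tagged with the website it came from, so the same breakdown can be checked per site or across all sites.

```python
from collections import defaultdict

# Hypothetical rows, one list per website. The field names and values are
# illustrative only, not the real export schema or real data.
site_a = [
    {"date": "2022-06-01", "region": "New York", "metro": "New York NY",
     "browser": "Safari", "browser_version": "15.5",
     "os": "iOS", "os_version": "15.5", "sessions": 120},
    {"date": "2022-06-01", "region": "Texas", "metro": "Dallas-Ft. Worth TX",
     "browser": "Chrome", "browser_version": "102",
     "os": "Android", "os_version": "12", "sessions": 80},
]
site_b = [
    {"date": "2022-06-01", "region": "New York", "metro": "New York NY",
     "browser": "Safari", "browser_version": "15.5",
     "os": "Macintosh", "os_version": "10.15", "sessions": 40},
]


def union_sources(sources):
    """Concatenate rows from several sources, tagging each row with its origin."""
    combined = []
    for name, rows in sources.items():
        for row in rows:
            combined.append({**row, "source": name})
    return combined


def sessions_by(rows, *keys):
    """Sum sessions grouped by the given row fields."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["sessions"]
    return dict(totals)


combined = union_sources({"site_a": site_a, "site_b": site_b})
by_region = sessions_by(combined, "date", "region")
by_source_region = sessions_by(combined, "source", "date", "region")
```

Grouping with and without the `source` field is what lets you tell a single-site quirk apart from a pattern that shows up everywhere.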
In order to visualize the data, I used Tableau (which is especially nice when dealing with a large number of rows, as this query returned about 7M rows).
The first step is to put Date in the x-axis (columns in Tableau) and the sum of sessions as the y-axis (rows). I chose an area graph, because I knew I was going to do a breakdown of this data soon.
The next step is to add a breakdown under ‘marks’. I chose ‘color’ to distinguish between the different states.
I then sorted Region by field (sessions), ascending. This arranges the data in a more meaningful order.
Then I added a Table Calculation to the measure (sessions) to get the Percent of Total. The calculation is done Table (down) because I’m looking for the percent total for each column (day).
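If you don't have Tableau handy, the same percent-of-total-per-day calculation can be sketched in a few lines of Python. The rows below are invented numbers, and `percent_of_total_by_day` is a hypothetical helper name; the idea is simply to divide each region's sessions by that day's total.

```python
from collections import defaultdict


def percent_of_total_by_day(rows):
    """rows: iterable of (date, region, sessions) tuples.

    Returns {(date, region): percent of that date's total sessions}.
    """
    day_totals = defaultdict(int)
    cell_totals = defaultdict(int)
    for date, region, sessions in rows:
        day_totals[date] += sessions
        cell_totals[(date, region)] += sessions
    return {
        (date, region): 100.0 * sessions / day_totals[date]
        for (date, region), sessions in cell_totals.items()
        if day_totals[date] > 0  # skip days with no sessions at all
    }


# Invented sample data for illustration only.
rows = [
    ("2022-06-01", "California", 50),
    ("2022-06-01", "Texas", 30),
    ("2022-06-01", "New York", 20),
    ("2022-06-02", "California", 40),
    ("2022-06-02", "New York", 60),
]
shares = percent_of_total_by_day(rows)
```

Because every day sums to 100%, a region whose share jumps from one day to the next stands out even when overall traffic is flat.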
An even clearer description of what was happening to the traffic patterns can be shown when using a discrete line graph with Metro as a breakdown. Here, too, the metric is percent of total sessions over time.
New York, Los Angeles, and Chicago are **VERY** popular this time of year….
Non-spoiler alert: when I removed iOS and macOS from the data, the trends looked relatively normal.
Initially, I came to the conclusion that this change in geolocation accuracy was due to Apple’s iCloud Private Relay. This is because I was not thorough enough in my analysis and did not do a further breakdown by Browser and Browser version. I was really excited to publish my first blog post in almost a year, so I cut corners.
Luckily, the internet can be a very helpful place and a famous industry personality took the time and care to point out that what I was noticing was browser based.
Safari hides the user’s IP address in requests sent to domains in DDG’s Tracker Radar. This is in addition to Private Relay which is a subscription product and affects all HTTP traffic.
— Simo Ahava (@SimoAhava) June 29, 2022
Some additional segmentation confirms that the seismic shifts in geolocation accuracy are due to Safari sending bogus signals to (in this case) Google Analytics.
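The segmentation itself is straightforward to reproduce. Below is a rough Python sketch, assuming rows with hypothetical `date`, `metro`, `browser`, `browser_version`, and `sessions` fields (the numbers are invented, and `metro_share_by_segment` is my own illustrative name). For each browser and version, it computes what share of that segment's sessions landed in a handful of big metros on each day.

```python
from collections import defaultdict


def metro_share_by_segment(rows, metros,
                           segment_keys=("browser", "browser_version")):
    """Percent of each segment's daily sessions attributed to `metros`.

    Returns {(segment_tuple, date): percent}.
    """
    total = defaultdict(int)
    in_metros = defaultdict(int)
    for row in rows:
        key = (tuple(row[k] for k in segment_keys), row["date"])
        total[key] += row["sessions"]
        if row["metro"] in metros:
            in_metros[key] += row["sessions"]
    return {
        key: 100.0 * in_metros[key] / total[key]
        for key in total
        if total[key] > 0
    }


# Invented sample data for illustration only.
rows = [
    {"date": "2022-06-01", "metro": "New York NY", "browser": "Safari",
     "browser_version": "15.5", "sessions": 80},
    {"date": "2022-06-01", "metro": "Denver CO", "browser": "Safari",
     "browser_version": "15.5", "sessions": 20},
    {"date": "2022-06-01", "metro": "New York NY", "browser": "Chrome",
     "browser_version": "102", "sessions": 10},
    {"date": "2022-06-01", "metro": "Denver CO", "browser": "Chrome",
     "browser_version": "102", "sessions": 90},
]
big_metros = {"New York NY", "Los Angeles CA", "Chicago IL"}
share = metro_share_by_segment(rows, big_metros)
```

If one browser version shows a sudden jump in big-metro share while every other segment stays flat, that points to the browser rather than to a real change in where people are.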
One thing is clear: your IP-address-based location data in digital analytics just got a lot less accurate. If you’re a large retailer (one of the data sources above), your geo-targeting of ads just got thrown into the blender (especially if your audience is iPhone heavy). Any dynamic language in ad copy (or website copy) that was ‘personalized’ towards users from certain locations will be potentially less relevant (or totally irrelevant) to them. Want to show your customers a convenient buy-online, pick-up-in-store option that is in their locale? Say ‘bye bye’ to using IP addresses for that.
Generally, this leads me to take an angry stance towards Apple. I do not believe that their brute-force privacy policies are anything other than Apple shoving their thumbs into the eyes of Google and Facebook in the name of consumer protection so that they can stick their greedy hands into these competitors’ pockets to try and make off with some cash. ATT and ITP help Apple’s bottom line and hurt advertisers. Hurting advertisers probably ends up being a bad thing for consumers as well. I don’t believe that most people want a less relevant experience online. Yes, the ad tech industry went way off the deep end with its data hoarding and exploitative data practices. A correction is indeed warranted. Still, I can’t help but think that decisions like what was done with Safari 15.5 [edit: originally “Private Relay”] are overkill. Safari 15.5 could still forward the IP address and obfuscate the data (similar to GA IP Anonymization).
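To make the comparison concrete, here is a small Python sketch of the kind of masking I have in mind. It follows my understanding of how GA's IP anonymization behaves (zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address); it is an illustration, not Google's or Apple's actual code, and `anonymize_ip` is a made-up function name.

```python
import ipaddress


def anonymize_ip(address):
    """Mask an IP address by zeroing its low-order bits.

    IPv4: keep the first 24 bits (last octet becomes 0).
    IPv6: keep the first 48 bits (last 80 bits become 0).
    """
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Documentation-range example addresses, not real users.
masked_v4 = anonymize_ip("203.0.113.77")
masked_v6 = anonymize_ip("2001:db8:1234:5678:9abc:def0:1234:5678")
```

A masked address like this still resolves to roughly the right city or region in most cases, which is the middle ground I would have preferred.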
[Edit: Noting that iCloud Private Relay is a subscription product, I have much less of a problem with it. There are plenty of VPNs on the market; if Apple wants to get into that market, more power to them. My gripe is with introducing changes to WebKit etc. that are in the name of privacy but truly are meant to attack Google and Facebook.]
On the other hand, the very reason why Google Analytics has been declared illegal to use [edit] IN THE EU [/edit] is because US governmental and intelligence agencies are able to force big tech companies to disclose information. In light of Roe v. Wade recently being overturned, if I lived in the United States I actually might be thinking twice about whether or not I want the government to potentially access data such as the IP address I used to conduct internet searches or visit websites.
What do you think? Is Apple’s iCloud Private Relay going to hurt businesses, advertisers, and/or consumers? Is it just a greed-driven attempt by Apple to hurt their competitors, or is Apple being truly altruistic and providing this VPN service for the benefit of society? Somewhere in between? Or do you not know because you didn’t even read this far in the article?
Please feel free to comment below or to [at] me on Twitter: @analyticsninja