At Analytics Ninja, we find ourselves more and more excited about using Google Tag Manager Server Side containers (SSGTM). While Google is certainly not the first vendor to introduce Server Side tag management technology, SSGTM is quickly gaining wide adoption because, well, you know…. Google.
I digress… It is important to note that SSGTM manages data on the server. You cannot use it to track things like clicks that happen in the browser. You will still need some sort of client-side measurement to capture these interactions. To understand one of the main functions of SSGTM, here is a flow chart from Jim Gordon.
For our customers on the Shopify platform, using SSGTM became a MUST because Shopify introduced a feature, not a bug, definitely, for sure, 100%, not a bug, but rather an important FEATURE which broke the data layer on transactions quite a bit. They claimed that the feature made the order process execute faster (without supporting evidence).
We saw the same thing. We set up tracking to quantify missing line items and order IDs on a bunch of stores. Some are 40+%.
The js “fix” is using checkout object instead of order. Lots of downsides obviously, but can prevent null.
Native ga<>shopify now has mixed trans IDs
— Elevar (@getelevar) January 21, 2021
The problem in our use case was two-fold. First, the data layer would not load a products array, so within Google Analytics we had no visibility into WHICH products were sold, on some days for more than 10% of sales. Secondly, and more importantly, in the data warehouse that we built for our customer that powered all of our reporting and analysis, the Transaction ID (order_id) is our primary key. Without it, we could not join data from GA with our Shopify database table. We view data loss above 5% as unacceptable, and this was happening.
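To make the join problem concrete, here is a minimal sketch of how one might measure the share of GA transactions that cannot be matched to a Shopify order. The function name, field names, and the toy rows are all hypothetical and are not taken from our actual warehouse.

```python
def missing_rate(ga_transactions, shopify_order_ids):
    """Return the share of GA transactions that cannot be joined to Shopify."""
    if not ga_transactions:
        return 0.0
    unjoinable = [
        row for row in ga_transactions
        # A missing/empty order_id, or one unknown to Shopify, breaks the join.
        if not row.get("order_id") or row["order_id"] not in shopify_order_ids
    ]
    return len(unjoinable) / len(ga_transactions)


# Hypothetical toy rows for illustration only (not real data).
ga_rows = [
    {"order_id": "1001", "revenue": 50.0},
    {"order_id": None, "revenue": 20.0},
    {"order_id": "1003", "revenue": 75.0},
    {"order_id": "", "revenue": 10.0},
]
shopify_ids = {"1001", "1002", "1003"}

rate = missing_rate(ga_rows, shopify_ids)
print(f"{rate:.0%} of GA transactions cannot be joined")
```

A check like this, run daily, is one way to see whether data loss has crossed whatever threshold you consider acceptable.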
Here is a view of Total Transactions over time vs Transactions without data.
We can zoom in on that as a percent of total =>
Sadly, dealing with the Shopify Plus Support team about this issue was not a pleasant experience. They usually provide solid support and I find the team to be knowledgeable and responsive. In this case we spent over a month in a painful back and forth which ended with them saying it was something that they were not going to fix, because they really didn’t think it was broken. They repeatedly claimed that “enterprise clients” use Webhooks to track conversions, a claim that I vigorously disagree with. Having a functional, reliable data layer is a hill that I will die on when it comes to describing enterprise business requirements.
But there was nothing that we could do. We lost the battle of having a data layer that worked within acceptable bounds. So it forced our hand into using SSGTM and Webhooks, something which was good for the Analytics Ninja team because it required us to jump into the SSGTM waters. Further procrastination was not an option.
The logic to get everything to work was relatively straightforward. Shopify was choking on providing data during the order creation process. But BEFORE anyone is able to place an order on Shopify, they get both a cart_token and a checkout_token. The checkout token is easy to access: it’s a simple window-scoped Shopify.Checkout.token variable. When a user is in the checkout, we shoot this data over to our SSGTM container via a Google Analytics Event tag. Notice that in the advanced configuration section of the tag you can configure the transport URL.
The SSGTM endpoint reads the checkout token from the &el parameter (i.e. the Event Label) and stores the hit data by pushing it into Firebase. A Google Analytics hit payload includes all of the browser information that we need to stitch the server-side transaction hit back into the session, the most important piece being the clientId.
The SSGTM tag which stores the checkout_token
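For readers who think in code, the storage step can be sketched roughly as follows. This is not our SSGTM tag. It is a minimal Python illustration that assumes the hit arrives as a Universal Analytics query string, with cid carrying the clientId and el carrying the checkout token. A plain dict stands in for Firebase, and the function name and sample payload are hypothetical.

```python
from urllib.parse import parse_qs

# In-memory stand-in for the Firebase store (assumption for this sketch).
token_store = {}


def store_checkout_token(hit_payload: str, store: dict) -> bool:
    """Map checkout_token -> clientId from a GA hit payload.

    Returns False when either value is missing, so nothing is stored.
    """
    params = parse_qs(hit_payload)
    client_id = params.get("cid", [None])[0]
    checkout_token = params.get("el", [None])[0]  # Event Label carries the token
    if not client_id or not checkout_token:
        return False
    store[checkout_token] = {"client_id": client_id}
    return True


# Hypothetical payload; the tracking ID and token are placeholders.
payload = "v=1&t=event&tid=UA-XXXXX-Y&cid=555.123&ec=checkout&ea=token&el=abc123"
stored = store_checkout_token(payload, token_store)
print(stored, token_store)
```

Keying the store by checkout token is a design choice for this sketch: it is the one value that both the browser hit and the later order data have in common.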
So once we have the clientId and checkout_token stored on the server, we wait. And wait. Will the person purchase? Will their credit card validate? Eeeeks. The tension.
Finally… BOOM!!! After 90 seconds, someone makes a purchase. Since we set up a Webhook in Shopify (like the BEST OF THE BEST OF BREED of all enterprise clients) by going to Settings > Notifications > Webhooks in the Shopify admin, we now have all of the purchase details delivered in real time to a Cloud Function.
When I asked Timur (a wonderful friend who in his spare time is Analytics Ninja’s Data Engineering Lead) what exactly happens next, he explained that the cloud function sends data to pubsub, which then messages the SSGTM container and then WHAM!! we send a GA hit with full Enhanced Ecommerce data to Google Analytics. This is not the only way to do this; we could have sent the webhook directly to SSGTM, but the pubsub queue stores the data as a failsafe so that if the SSGTM container were to fail, we still have all of the transaction data available to send later. There is quite a bit of code (Python) involved in this solution. Fixing what Shopify broke was NOT a trivial matter.
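To give a flavor of the stitching step, here is a heavily simplified sketch, not Timur’s actual code. It assumes the webhook order payload exposes id, checkout_token, total_price and line_items, and that the stored clientId is looked up by checkout token. It then builds a Universal Analytics Measurement Protocol payload with Enhanced Ecommerce parameters. The function name, the event category and action, and the sample order are hypothetical, and the pubsub hop is left out entirely.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode


def build_purchase_hit(order: dict, token_store: dict, tracking_id: str) -> Optional[str]:
    """Build a GA purchase hit for an order, or None if no clientId is stored."""
    stored = token_store.get(order.get("checkout_token"))
    if stored is None:
        return None  # no browser session to stitch this order back into
    params = {
        "v": "1",
        "tid": tracking_id,
        "cid": stored["client_id"],
        "t": "event",
        "ec": "ecommerce",   # hypothetical event category
        "ea": "purchase",    # hypothetical event action
        "pa": "purchase",    # Enhanced Ecommerce product action
        "ti": str(order["id"]),           # transaction ID
        "tr": str(order["total_price"]),  # transaction revenue
    }
    # One set of prN* parameters per line item, indexed from 1.
    for index, item in enumerate(order.get("line_items", []), start=1):
        params[f"pr{index}id"] = str(item["product_id"])
        params[f"pr{index}nm"] = item["title"]
        params[f"pr{index}pr"] = str(item["price"])
        params[f"pr{index}qt"] = str(item["quantity"])
    return urlencode(params)


# Hypothetical order and stored token for illustration only.
sample_store = {"abc123": {"client_id": "555.123"}}
sample_order = {
    "id": 1001,
    "checkout_token": "abc123",
    "total_price": "59.98",
    "line_items": [
        {"product_id": 42, "title": "Ninja Star", "price": "29.99", "quantity": 2},
    ],
}
hit = build_purchase_hit(sample_order, sample_store, "UA-XXXXX-Y")
parsed = parse_qs(hit)
print(parsed["ti"], parsed["cid"], parsed["pr1nm"])
```

Returning None for an unknown checkout token keeps the decision about what to do with unmatched orders (retry, log, or send without a session) outside this function.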
BUSINESS VALUE ==> the Enterprise Data Warehouse is no longer missing 10-15% of transactional data, and our multi-touch, cross device customer journey data model can once again feel comfortable showing its face in public because the stain on the clean white shirt of data quality has been washed away.
With regard to SSGTM, I think that it is fantastic that the platform can solve an important business use case such as this one. SSGTM is not the only solution to this use case. Anecdotally, David Vallejo (who is a part of the top 0.1% of implementation specialists) built a server side solution in PHP on Heroku for one of our clients in 2017 when he used to contract for Analytics Ninja. But SSGTM provides a lot of transparency to the data flows. I’m really happy that we were able to work around broken Shopify data layers in this way. And major props to Timur for crafting the fix. I am super impressed.
That’s it for now. Please share questions and comments on this post below.