How we hacked Google Analytics into an in-house CTR tracking system (video).

Video transcript:

Hey hey, okay I want to walk through our internal click-through-rate tracking system that we created.

This was a while back for a client, and they had a rather large online listings site. They had lots of content (I think they had over fifty thousand individual items) and then different ways of viewing them through search results pages and listing pages and things like that. So you can think of something like an AutoTrader for a good example. It wasn’t AutoTrader, but you can think of this. So you can see that there are these search results pages with lots of listings on them, and then you can drill down to an individual listing page, or vehicle listing detail page, as well. And then there might be other pages as opposed to just the regular search results. There might be some, you know, top-level pages that maybe show a smattering of listings. Doesn’t look like this one does, but you could imagine where, you know, maybe on, let’s see, sedan, maybe you’ll see some listings on here, and they might be of a different nature.

So, why do we want to track click-through rate and, really quickly, what is click-through rate? Most watchers probably know what that means already, but it’s always good to define our metrics quickly. So click-through rate just means the number of clicks that a piece of content receives, divided by the number of impressions. Okay, so for example, 500 / 50,000 equals 0.01, or a 1% click-through rate. So basically, an impression is how many times something is seen; a click is how many times something is, obviously, clicked on. So you can think of anything from a Facebook post to a Google search result, to an ad (you know, a Google ad or a Facebook ad), to a listing.
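As a quick worked version of that definition, here is a minimal Python sketch. The function name `click_through_rate` is made up for illustration; the numbers are the ones from the example above.

```python
from typing import Optional


def click_through_rate(clicks: int, impressions: int) -> Optional[float]:
    """Return clicks divided by impressions, or None when nothing was shown."""
    if impressions == 0:
        # With no impressions the rate is undefined rather than zero.
        return None
    return clicks / impressions


# The example from the transcript: 500 clicks on 50,000 impressions.
rate = click_through_rate(500, 50_000)
print(f"{rate:.2%}")  # prints "1.00%"
```

Returning `None` for zero impressions is a design choice in this sketch, so that "never shown" is not confused with "shown and ignored."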

So just by virtue of loading this page and scrolling through it, all of these individual listings are getting an impression, aren’t they? They’re getting seen by me. But most websites don’t track that, and our client’s website at the time wasn’t tracking that. We didn’t know the impressions that the listings were getting; we only knew the clicks. When somebody clicked, we’d have regular clickstream analytics data, which is useful, but how much more useful is it when we can also pair it with impressions data? Now we’re getting intelligent about things like: do listings at the top get most of the clicks? What about in the middle? What about if there’s a featured area at the top where people pay extra, right, are those worth it? Is it worth the cost? Do they get more clicks? Do they get ignored? What if they have certain banners on them like newly-listed or reduced-price or, you know, great price (for some reason, they all seem to have a great price, haha, which is kind of funny)? So you could even end up later doing some bulk image analysis with machine learning, if we’re getting fancy, and you can ask, you know, are there certain image characteristics that we could draw out statistically that might have some benefit? For example, the appearance of text on an image could certainly be identified, or the angle of the car, perhaps, or how close up the car is, or maybe the car colors, right?

So lots of interesting information that can be gleaned using impressions and, ultimately, click-through-rate data.

So, you can imagine this would be very useful for a listing site, but you may think, well, I don’t have a listing site. It’s also very useful, obviously, for e-commerce. Now, this is an extreme example, obviously: Amazon. But you can imagine we have a bunch of products and it’s the same thing. You might have different products in different positions, different types of products, different ways of positioning them. Another example would be this university site; this was a client of ours, and they have a lot of content. They have all different programs. They have a blog, probably multiple blogs, but you know, lots of content.

So what I like to say is that it’s very useful whenever you have lots of dynamic content.

So that term, dynamic content, basically just means that you have, it could be a blog, a magazine, an ecommerce store, or a classified or listing site. Basically, you have a set design and then you have a database that is populating that with dynamic content that may change. You know, you have new articles, you have new listings. The design doesn’t change, just the content changes. So when there’s lots of dynamic content and you’re trying to optimize for… anything, this becomes very useful.

So I want to tell the story of how we did it. And then, now that GA4 is out, like so many other things, it doesn’t work that way anymore. And so I want to just give a preview of how we might do it today.

So, there is some off-the-shelf CTR tracking software that you can buy, but it’s very expensive. And I was also very concerned at the time, for the client, about implementation and spending a lot on it, because every site is unique and different; it’s not like there’s one type of website, or one type of code that’s used, or anything like that. So we decided, let’s just see what we can do with free off-the-shelf things. So, you know, we were already using Google Analytics, and we were pretty happy with the data we were getting out of it and its performance. And so we thought maybe we can leverage Google Analytics, since it’s free and pretty powerful. So I did some research, and what we found out was a very interesting little piece of information.

Events have been a thing for Google Analytics for a very long time, and they still are in GA4, though they’re different and I’ll explain that in a minute. But historically, or traditionally, Universal Analytics events have four (I think just four, I don’t know, maybe at least four) basically, I guess, what we could call parameters to them: you can set a category, an action, a label, and a value. So whenever you send an event to Google Analytics, you can put in whatever you want for these. Values for category could be like “shoes,” you know, or like “blog.” Action: “click.” Label: I don’t know what you might want to say, but say I want to send some more information about what was clicked, right? So maybe “Banner” or “CTA,” or “Main CTA,” something like that. And value is optional, like the label, but it’s a numeric value, like one.
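To make those four fields concrete, here is a rough Python sketch of one event as plain data. The class name `UniversalAnalyticsEvent`, the event name `ga_event`, and the data layer keys (`eventCategory` and so on) are assumptions for illustration; in Google Tag Manager those keys are whatever your own variables are configured to read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UniversalAnalyticsEvent:
    category: str                # e.g. "blog" or "shoes"
    action: str                  # e.g. "click"
    label: Optional[str] = None  # free text describing what was clicked
    value: Optional[int] = None  # optional numeric value, e.g. 1

    def to_data_layer(self) -> dict:
        """Build a dict shaped like a data layer push (key names are hypothetical)."""
        payload = {
            "event": "ga_event",
            "eventCategory": self.category,
            "eventAction": self.action,
        }
        # Label and value are only included when they were actually set.
        if self.label is not None:
            payload["eventLabel"] = self.label
        if self.value is not None:
            payload["eventValue"] = self.value
        return payload


event = UniversalAnalyticsEvent("blog", "click", label="Main CTA", value=1)
print(event.to_data_layer())
```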

So in my research, what we found out was that the label parameter could hold a maximum of a whopping two thousand characters of data, about two KB. So all of a sudden we realized, wow, we could send all kinds of information, up to this limit, into our Google Analytics just by firing an event. Super cool. Already built into the system. We didn’t have to do anything to customize GA. All we had to do was send the data.

So what did we do? Well, on this page we see a number of listings, right? And each listing has certain information about it that we might be interested in. Of course, the number one thing we want to know is the listing ID. Okay, so you can see here up on AutoTrader, and this was true of our client as well, each listing has an ID. We can get that, whether or not it’s in the URL, and put it into our event. Once we do that, that’s our key, right? We can do anything. So now we can compare any time this ID was shown on the screen as an impression (in other words, when the page loaded) with any time it was clicked. We can tie that together and get our click-through rate. But there’s lots of other information about this listing that we might want to know. For example, it’s in the first position, there’s no price given, and maybe, you know, it doesn’t have delivery available. You know, like I said, maybe it has this “newly listed” designation, right? There’s another designation I saw as well. What position is it in, right? Again, I said that one’s first. Maybe this one’s 24th, right? We want to know that. There might be other information, like maybe they have featured listings up here that customers are paying extra for. Okay, so is it a featured listing? Maybe we want to know something like whether this was on the mobile site instead of the desktop site. Or there could be other interesting information as well that we might want to include.
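Once the listing ID is on both the impression and the click, tying them together is a simple join on that key. A minimal Python sketch, with made-up listing IDs and a hypothetical `ctr_by_listing` helper:

```python
from collections import Counter
from typing import Dict, Iterable


def ctr_by_listing(impression_ids: Iterable[str], click_ids: Iterable[str]) -> Dict[str, float]:
    """Count impressions and clicks per listing ID and divide one by the other."""
    impressions = Counter(impression_ids)
    clicks = Counter(click_ids)
    # Only listings that were actually shown can have a click-through rate.
    # A Counter returns 0 for IDs it has never seen, so unclicked listings get 0.0.
    return {listing_id: clicks[listing_id] / shown for listing_id, shown in impressions.items()}


# Made-up data: listing "A1" was shown four times and clicked once.
shown = ["A1", "B2", "A1", "A1", "B2", "A1"]
clicked = ["A1"]
print(ctr_by_listing(shown, clicked))  # {'A1': 0.25, 'B2': 0.0}
```

The same grouping works for any other field carried on the impression, such as position or featured status, by counting on that field instead of the ID.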

So I have an example of how this looks. You can see here there’s a label. This is being sent through the data layer to Google Tag Manager, and you can see it’s chock-full of this just nasty-looking information. What is this? These are encoded strings. So basically what we did was, we have a position for each one of these letters, and we set the ones that are valid or useful. They might carry information like what platform type (again, desktop or mobile), what position they were in, and whether or not they were in a featured position. We might do that with one of these numbers; I don’t remember if that’s all the ID or what position it’s in. And then anything else interesting, like, for example, price or no price, newly listed, et cetera, one of those statuses. So, for example, in the second position here, if it’s a W it might mean one thing; if it’s a Y it might mean something else. Obviously, we need to get that encoded data back out. And of course we can see at the end here we have the ad ID, or the listing ID, and then the X is just being used as a delimiter. So basically this is like a little comma-separated file that we’re sending in. Then we explode this back into a table and map each one of these characters to a real field, or a real concept. And now we have a properly exploded table of information about every impression we get on every page. So in this case, there would be 25 of these sets of information, since there are no special or featured listings on here. And we put that through a data flow where we explode that information out and put it into a useful table that we can query. And of course we tie it back together with the click data later and aggregate. So that’s kind of how it was done, and it worked really, really well, which was great.
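To give a feel for that kind of fixed-position encoding, here is a small Python sketch of packing listings into one label string and exploding it back out. The actual character codes from the client project are not given in the video, so everything here is an assumption for illustration: the code tables, the field order, the two-digit position, and the requirement that listing IDs are numeric so they can never contain the `X` delimiter.

```python
from dataclasses import dataclass
from typing import List, Optional

DELIMITER = "X"  # separates listings, like the comma in a CSV row

# Hypothetical code tables: one character per field, at a fixed position.
PLATFORM = {"D": "desktop", "M": "mobile"}
FEATURED = {"F": True, "N": False}
BADGE = {"W": "newly listed", "Y": "reduced price", "G": "great price", "N": None}

# Reverse lookups used when encoding.
PLATFORM_CODE = {value: code for code, value in PLATFORM.items()}
FEATURED_CODE = {value: code for code, value in FEATURED.items()}
BADGE_CODE = {value: code for code, value in BADGE.items()}


@dataclass
class Impression:
    platform: str
    featured: bool
    badge: Optional[str]
    position: int      # 1-based slot on the page, assumed to be below 100
    listing_id: str    # assumed numeric, so it never contains the delimiter


def encode(impressions: List[Impression]) -> str:
    """Pack every listing on the page into one delimiter-separated label."""
    parts = [
        PLATFORM_CODE[imp.platform]
        + FEATURED_CODE[imp.featured]
        + BADGE_CODE[imp.badge]
        + f"{imp.position:02d}"
        + imp.listing_id
        for imp in impressions
    ]
    return DELIMITER.join(parts)


def decode(label: str) -> List[Impression]:
    """Explode a label back into one row per impression."""
    return [
        Impression(
            platform=PLATFORM[part[0]],
            featured=FEATURED[part[1]],
            badge=BADGE[part[2]],
            position=int(part[3:5]),
            listing_id=part[5:],
        )
        for part in label.split(DELIMITER)
    ]


page = [
    Impression("desktop", True, None, 1, "48213"),
    Impression("desktop", False, "newly listed", 2, "51907"),
]
label = encode(page)
print(label)  # DFN0148213XDNW0251907
```

Calling `decode(label)` gives back the same two rows, which is the "exploded table" step: each row can then be loaded into whatever table the click data gets joined against.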

Here’s a local example. I’m in Kelowna, BC, and this is called Castanet, castanet.net. It’s an old, kind of not the prettiest website, but it gets tons of traffic, has tons of content, and is very popular, and there are probably a lot of communities like this that have a local hub or local news website. Just another example of dynamic content.

Okay, cool. So it was super successful and worked out really well, and, except for some development time, it was free. And it wasn’t too hard either. So lastly, what about GA4? Well, I already mentioned that this method, unfortunately, like so many things, will not work now in GA4. And we haven’t had a chance to do something like this in GA4 yet, but if and when we do, I think I know what we’re going to do. So we don’t have that same character limit. You can see here, there are just general event parameters now that we can create; there’s no set category, action, label, value, things of that nature, per se. We just create our own parameters. But each parameter has a limit of just a measly 100 characters, which isn’t so hot. However, instead of just that one text-based label that we can really fill up, we have a maximum of 25 event parameters, so we can include lots of data; we just have to break it up to fit the 100-character limit. So it does impose a little bit of an extra step, but I don’t think it’s going to be too bad. I think what we’ll probably do in this case is just map one parameter to one listing, and then send along as many events as we need to. It’s going to result in more events being sent to GA, but we were sending this to its own property anyway. So even though the website had fairly high traffic, in the millions of visits, we didn’t run out of our quota in our free Google Analytics account. We just had to set up a Google Analytics property devoted just to catching those impressions. So actually it works out kind of nicely on this site, because you can see the default is twenty-five results per page, so in this case that would fill up one event nicely, and we would have 100 characters to encode information about each listing.
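Here is a rough Python sketch of that "one parameter per listing" idea, using the two limits mentioned above (25 parameters per event, 100 characters per parameter value). This has not been run against GA4; the function name `build_impression_events`, the parameter names `listing_01` and so on, and the sample encoded strings are all hypothetical.

```python
from typing import Dict, List

MAX_PARAMS_PER_EVENT = 25  # parameters allowed on one event
MAX_PARAM_LENGTH = 100     # characters allowed in one parameter value


def build_impression_events(encoded_listings: List[str]) -> List[Dict[str, str]]:
    """Split a page of encoded listings into events of at most 25 parameters."""
    events = []
    for start in range(0, len(encoded_listings), MAX_PARAMS_PER_EVENT):
        chunk = encoded_listings[start:start + MAX_PARAMS_PER_EVENT]
        params = {}
        for offset, encoded in enumerate(chunk, start=1):
            if len(encoded) > MAX_PARAM_LENGTH:
                raise ValueError(f"encoded listing is {len(encoded)} characters, over the limit")
            # One parameter per listing: listing_01, listing_02, ...
            params[f"listing_{offset:02d}"] = encoded
        events.append(params)
    return events


# Made-up page of 50 encoded listings, which needs two events.
page = [f"DNN{i:02d}{1000 + i}" for i in range(1, 51)]
events = build_impression_events(page)
print(len(events), len(events[0]))  # 2 25
```

With the default of twenty-five results per page this produces exactly one event; a longer page simply spills into additional events.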

Oh look, I just noticed an interesting little factor, right: these first two are indeed featured; they’re sponsored. You see that right there? I didn’t notice that. So that would be some interesting metadata about those listings. So that’s how I think we would do it in GA4. I think it would work out nicely. Of course, you know, we’d have to stitch those events together, so we might have a parameter that gives us a code saying that this event is associated with this batch of events, or something like that, and then tie them together that way whenever there were more than twenty-five listings on a page. If the user, for example, was looking at fifty at a time, we would batch two events together and tie them with some ID code to say that they’re part of the same impression set.
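The stitching step could look something like the Python sketch below. It is only one possible design: the parameter names `batch_id` and `batch_part` are made up, and so are the helper names. One wrinkle worth flagging as an assumption of this sketch: if the batch parameters count toward the 25-parameter cap, they take up two slots, so each event here carries 23 listings rather than 25.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional

RESERVED_PARAMS = 2                       # batch_id and batch_part
LISTINGS_PER_EVENT = 25 - RESERVED_PARAMS


def split_into_batch(encoded_listings: List[str], batch_id: Optional[str] = None) -> List[Dict[str, str]]:
    """Split one page of listings into events that share the same batch_id."""
    batch_id = batch_id or uuid.uuid4().hex  # 32 characters, well under 100
    events = []
    starts = range(0, len(encoded_listings), LISTINGS_PER_EVENT)
    for part, start in enumerate(starts, start=1):
        params = {"batch_id": batch_id, "batch_part": str(part)}
        chunk = encoded_listings[start:start + LISTINGS_PER_EVENT]
        for offset, encoded in enumerate(chunk, start=1):
            params[f"listing_{offset:02d}"] = encoded
        events.append(params)
    return events


def stitch(events: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group events by batch_id and return each batch's listings in page order."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["batch_id"]].append(event)
    stitched = {}
    for batch_id, parts in grouped.items():
        # Events may arrive out of order, so sort by the part number first.
        parts.sort(key=lambda e: int(e["batch_part"]))
        listings = []
        for event in parts:
            keys = sorted(k for k in event if k.startswith("listing_"))
            listings.extend(event[k] for k in keys)
        stitched[batch_id] = listings
    return stitched


# Made-up page of 50 encoded listings, sent as one impression set.
page = [f"DNN{i:02d}{1000 + i}" for i in range(1, 51)]
events = split_into_batch(page, batch_id="pageview-1")
print(len(events))  # 3
```

Reversing the events before calling `stitch` still returns the listings in their original page order, because the part number and the zero-padded parameter names carry the ordering.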

So that’s it! Super interesting, and just something neat that we did that I wanted to share. Maybe it gets you thinking, especially if you’re part of managing a brand that has a website with a lot of dynamic content. This is something that can be done, and maybe something you haven’t thought about doing before, but it really affords you some rich new data, so you can start to get some new life out of your website analytics and data. Ask some new questions, develop some new hypotheses, and, I hope, create a better product and better results as well.
