Wondering what you should do to make your digital tracking “GDPR compliant”? Well, I was too. Over the last year, I’ve read many interpretations of the GDPR legislation and how it will impact digital marketing. Much of it was just utter bullshit (“we can’t do anything anymore!”) and often it only created more confusion.
To spare you the journey (and time investment) I went through, I decided to share my approach to the GDPR legislation in this blog post. This approach enables the marketing teams of our clients to keep on delivering good (and targeted) user experiences, while maintaining a bond of trust with their customers. Note that a lot of this is based on my personal interpretations and the outcome of discussions with other marketeers and lawyers. If you have another view on certain recommendations, don’t hesitate to reach out!
One of the arguments that came up is that, under the GDPR, the ClientID used by Google Analytics to recognize a returning visitor should also be considered personal data. Therefore, I added an additional tip to this post on changing the _ga cookie (which stores the ClientID) to a non-persistent cookie (= session cookie).
Before I get to the recommendations, I’d like to point out a common misconception. The General Data Protection Regulation (GDPR) only concerns personal data. It does not regulate cookies as such. The GDPR does not replace the so-called “cookie law” (in official terms: the ePrivacy Directive).
The ePrivacy Directive focused on the different types of cookies (session or persistent, first-party or third-party, etc.), but not on the information contained in the cookie itself. The GDPR focuses on the information itself. Only when a cookie contains personal data does it become subject to the GDPR.
Note that the European Commission is working on new ePrivacy legislation (expected end of 2018) which will build upon the ePrivacy Directive and the GDPR. This legislation is expected to bring more clarity on how companies should handle tracking and advertising online.
The first step towards a GDPR compliant web analytics configuration is making an inventory of what data you are capturing. Do any of these data points fall under “personal data”?
The definition of personal data used within the GDPR is broad and encompasses both identified and identifiable persons.
“‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly.” (Article 4, GDPR)
For an identified person, the personal data is quite obvious: a name, email address, home address, social security number, etc. I assume that you are not collecting any of these in GA, as this would be a breach of the terms of service (collecting Personally Identifiable Information, or PII, is prohibited). However, the term identifiable person is more complex. It covers any data point that could possibly be linked back to a specific person in the future. This encompasses randomized identifiers, IP addresses, etc.
So, with this in mind, what questions should you ask yourself while reviewing your Google Analytics configuration?
If the answer to all of the above questions is No, your Google Analytics configuration is not in scope of the GDPR legislation. You are only capturing anonymous data. If you had to answer Yes to one or more questions, action is required to make sure you are GDPR compliant.
There are many use cases that require a user identifier to be stored in Google Analytics. For instance: your final conversion happens offline and you want to be able to attribute that offline conversion to an online channel. You want to have a more accurate count of the number of unique users that you are reaching (across browsers and devices). Or you want to inform your customer service agents which web pages were viewed by a customer before they contacted the call center.
User identifiers can be stored in Google Analytics by using the built-in User ID feature or a user-scoped custom dimension. The identifier itself can be any ID (customer ID, prospect ID, hashed email or home address, etc.) as long as the ID itself is anonymous (= it only functions as a key and does not contain any personal information by itself).
Note that it is against Google’s guidelines to store:
To be GDPR compliant, a user identifier should only be stored after you have received the explicit consent of your user. Identifying someone within your tracking does not fall under “legitimate interest” (sorry to crush your hopes). And the GDPR clearly states that data subjects have the right to opt out of (and should be informed beforehand of) personalization and targeted advertising.
To be able to set a User ID dynamically, it is recommended to manage your tracking within a tag manager. This enables you to only set the User ID when the visitor has given consent for identification (thus allowing you to tie their browsing behaviour to their customer profile).
Below are the steps to set this up within Google Tag Manager.
Make sure that you capture the User ID in a variable. There are typically two ways to capture the User ID value in a Tag Manager variable: from a first-party cookie or from the data layer (more on this on Simo’s blog).
Also make sure to capture the consent of the visitor in a Tag Manager variable. You would typically store a visitor’s consent in a first-party cookie. In the example below, we use three different consent levels on the website. The level value is set in the cookieconsent cookie.
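To make the consent variable a bit more concrete, here is a minimal sketch of the lookup it performs. In GTM you would normally use the built-in 1st Party Cookie variable type instead of writing code; the function name `readConsentLevel` is hypothetical, and I am assuming the cookieconsent cookie simply holds the value 1, 2 or 3.

```javascript
// Hypothetical helper: extract the consent level from a cookie string.
// Assumption: the cookie is named "cookieconsent" and holds "1", "2" or "3".
function readConsentLevel(cookieString) {
  var pairs = cookieString.split(';');
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].trim();
    var eq = pair.indexOf('=');
    if (eq === -1) continue; // skip malformed or empty entries
    var name = pair.slice(0, eq);
    var value = pair.slice(eq + 1);
    if (name === 'cookieconsent') {
      var level = parseInt(value, 10);
      // Anything other than 1, 2 or 3 is treated as "unknown".
      return level >= 1 && level <= 3 ? level : undefined;
    }
  }
  return undefined; // cookie not set: consent unknown
}

// In a browser you would pass document.cookie; a literal string is used
// here so the sketch also runs outside a browser.
console.log(readConsentLevel('_ga=GA1.2.123.456; cookieconsent=2')); // 2
```

Returning undefined for a missing or unexpected value keeps "unknown consent" distinct from the three real levels.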
Next, bring both variables together with the use of a Lookup Table variable called Allow UserId. This variable will return the User ID only if the visitor has consent level 2 or 3. By not specifying a value for level 1 (or default value for when the cookie consent level is unknown), the Lookup Table will return the value undefined.
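Expressed as code, the Lookup Table behaves roughly like the sketch below. This only illustrates the logic; in GTM the table is configured in the interface. The function name and the example ID are made up.

```javascript
// Sketch of the "Allow UserId" Lookup Table as a plain function.
// consentLevel: 1, 2, 3 or undefined; userId: an anonymous key (hypothetical).
function allowUserId(consentLevel, userId) {
  // Rows exist only for levels 2 and 3. There is no row for level 1
  // and no default value, so everything else falls through to undefined.
  var table = { 2: userId, 3: userId };
  return Object.prototype.hasOwnProperty.call(table, consentLevel)
    ? table[consentLevel]
    : undefined;
}

console.log(allowUserId(3, 'cust-8841')); // cust-8841
console.log(allowUserId(1, 'cust-8841')); // undefined
```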
Lastly, use this Allow UserId variable in your Google Analytics Settings variable to populate the userId field. In the case of consent level 1, GTM will detect the undefined value and will omit the userId field from the tag, i.e. as if you had not specified a “userId” field (thanks to Yu Hui for this tip).
Note: the GDPR also provides for the right to be forgotten. Therefore, a data controller (you) should be able to delete all data points that can be linked to a specific data subject if this person asks to be forgotten. At the moment, it is not possible to delete a specific User ID or Custom Dimension value from historical Google Analytics data. However, it’s up to Google (the data processor in this case) to enable such functionality. I expect that they will launch such a feature in early May 2018. *Update: Google has announced that they will make a User Deletion API available.*
Google Analytics uses multiple cookies to make its tracking work. Of these cookies, the _ga cookie contains a (randomized) identifier used by Google Analytics: the ClientId. This ID is used to recognize a browser, enabling Google Analytics to tell you if this browser has been on your site before or not (the returning visitor metric).
Whether the ClientId should be considered personal data under the GDPR is debatable. But if you prefer to be safe rather than sorry, you might want to prevent Google Analytics from identifying the visitor’s browser when he or she prefers to stay anonymous. You can do this by changing the _ga cookie to a session cookie. This means that the cookie will be deleted once the user closes the browser.
By default, the _ga cookie (and thus the ClientId) is stored for 2 years in the browser of the visitor. By changing the expiration time of the cookie, you can limit the time that GA will “remember” a specific browser.
As already described above for the User ID, you will need to create a dynamic variable that takes the visitor’s consent into account when setting the expiration time of the _ga cookie. This can be done with a Lookup Table variable that returns the number of seconds that the cookie should be kept.
Use the Cookie Expiration variable in your Google Analytics Settings variable to populate the cookieExpires field. When a user has consent level 2 or 3, the expiration time is set to 63,072,000 seconds, which equals two years. When a user has consent level 1, the expiration time is set to 0 seconds, which turns the cookie into a session cookie that is deleted once the browser is closed.
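As a rough sketch, the Cookie Expiration lookup and the two-year arithmetic look like this. The function name is hypothetical, and treating an unknown consent level the same as level 1 is my own assumption.

```javascript
// Two years expressed in seconds: 2 * 365 days * 24 h * 60 min * 60 s.
var TWO_YEARS_IN_SECONDS = 2 * 365 * 24 * 60 * 60; // 63072000

// Sketch of the "Cookie Expiration" Lookup Table.
// Assumption: an unknown consent level gets the same 0 as level 1.
function cookieExpiration(consentLevel) {
  return consentLevel === 2 || consentLevel === 3 ? TWO_YEARS_IN_SECONDS : 0;
}

// cookieExpires is the analytics.js field that takes a lifetime in seconds;
// 0 turns the _ga cookie into a session cookie.
console.log({ cookieExpires: cookieExpiration(1) }); // { cookieExpires: 0 }
console.log({ cookieExpires: cookieExpiration(3) }); // { cookieExpires: 63072000 }
```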
Note: Google Analytics does not need cookies to function. You can also decide to prevent cookies from being set altogether. To do so, use the storage field and set its value to none.
Just as with identifiers, the GDPR considers an IP address to be data that can be used to identify a person. Although this point of view is debatable, it’s best to anonymize the IP address for visitors that wish to stay anonymous.
Google already made it very easy to anonymize IP addresses a few years ago (May 2010, to be exact). By setting the anonymizeIp field to true in the tracking script, the last octet of the IP address is set to zero (for example, 192.0.2.123 becomes 192.0.2.0). This feature is called IP Masking in Google terminology.
The impact of IP Masking on your data is rather negligible. Google Analytics uses the IP address to determine the location from which your visitors are visiting your website. Stripping the last octet of the IP address will result in a less accurate location, but you probably won’t notice it in your reports, as GA already limits the level of detail shown in the location report to City.
As already described above for the User ID, you will need to create a dynamic variable that takes the visitor’s consent into account when deciding if the IP address should be masked or not. This can be done with a Lookup Table variable that returns a true or false value.
Use this Anonymize IP variable in your Google Analytics Settings variable to populate the anonymizeIp field. The entire IP address will only be sent to GA when a user has consent level 2 or 3.
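The logic of the Anonymize IP lookup is simple enough to sketch in a few lines. The function name is hypothetical, and masking by default when the consent level is unknown is my own assumption (it is the more cautious choice).

```javascript
// Sketch of the "Anonymize IP" Lookup Table.
// Returns true (mask the IP) unless the visitor has consent level 2 or 3.
function anonymizeIpFor(consentLevel) {
  return !(consentLevel === 2 || consentLevel === 3);
}

console.log(anonymizeIpFor(1)); // true
console.log(anonymizeIpFor(2)); // false
```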
Note: the mask IP feature only influences the default IP address field processed by GA. If you also capture the IP address in a custom dimension, you will need to create your own logic to mask the IP address before it is stored in the custom dimension.
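If you do need such custom masking logic, it could look something like the sketch below. It only handles IPv4 and mimics the "last octet to zero" behaviour described earlier; the function name is hypothetical, and how you obtain the IP address in the first place is outside the scope of this example.

```javascript
// Hypothetical helper: zero the last octet of a dotted-quad IPv4 address.
// Returns undefined for anything that is not a valid IPv4 address, so that
// an unexpected value is never stored unmasked.
function maskIpv4(ip) {
  var octets = ip.split('.');
  if (octets.length !== 4) return undefined;
  for (var i = 0; i < 4; i++) {
    if (!/^\d{1,3}$/.test(octets[i]) || Number(octets[i]) > 255) {
      return undefined;
    }
  }
  octets[3] = '0';
  return octets.join('.');
}

console.log(maskIpv4('192.0.2.123')); // 192.0.2.0
```

Apply the function before the value is written to the custom dimension, not afterwards, since data already sent to GA cannot be masked retroactively by this code.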
By enabling advertising features in Google Analytics, you get additional data (demographic and interest reports) and you can take advantage of integrations with your AdWords and DoubleClick accounts. Very interesting for anyone who wants to fine-tune advertising targeting.
However, it’s important to understand that by enabling the advertising features, the GA script will also place DoubleClick cookies in addition to the default Analytics cookies. DoubleClick is Google’s advertising network. By placing the DoubleClick cookie, you share browsing behaviour data from your website visitors with Google.
Under the GDPR, you need to ask your visitors for their explicit consent before you share their data with other parties. This implies that we need a way to turn the use of advertising features on or off dynamically. Again, GTM is the perfect tool for the job.
Create a variable that checks the cookie consent and returns a true or null value to determine if we can use the advertising features.
Apply this Allow Advertising features variable in your Google Analytics Settings variable to populate the displayFeaturesTask field. The DoubleClick cookie will only be set when a user has consent level 3.
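Here is a minimal sketch of that variable’s logic, mirroring the true-or-null output described above. The function name is hypothetical. The null is the important part: setting an analytics.js task to null switches that task off.

```javascript
// Sketch of the "Allow Advertising features" variable.
// Level 3 returns true; every other level (including unknown) returns null,
// which disables the displayFeaturesTask so no DoubleClick cookie is set.
function allowAdvertisingFeatures(consentLevel) {
  return consentLevel === 3 ? true : null;
}

console.log(allowAdvertisingFeatures(3)); // true
console.log(allowAdvertisingFeatures(2)); // null
```

It is worth checking in GTM’s preview mode that no request to DoubleClick goes out for consent levels 1 and 2.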
The same principle applies to other advertising pixels (Facebook, Awin, etc.): by loading their pixels/cookies, you are sharing browsing behaviour data with a third party. Thus, you need your visitors’ consent.
To make sure that you only load advertising pixels when your visitor has agreed to data sharing with third parties, create specific triggers that take the consent level into account.
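Conceptually, such a trigger is just a condition on the consent level. The sketch below shows the idea with a made-up tag list; requiring level 3 for the pixels is my assumption, in line with the DoubleClick setup, and in GTM you would express this as a trigger condition on the consent variable rather than in code.

```javascript
// Hypothetical tag list; names and required levels are illustrative only.
var exampleTags = [
  { name: 'Facebook pixel', requiredLevel: 3 },
  { name: 'Awin pixel', requiredLevel: 3 }
];

// Returns the names of the tags whose trigger condition is met.
// An unknown consent level is treated as 0, so nothing fires.
function tagsAllowedToFire(tagList, consentLevel) {
  var level = typeof consentLevel === 'number' ? consentLevel : 0;
  return tagList
    .filter(function (tag) { return level >= tag.requiredLevel; })
    .map(function (tag) { return tag.name; });
}

console.log(tagsAllowedToFire(exampleTags, 3)); // [ 'Facebook pixel', 'Awin pixel' ]
console.log(tagsAllowedToFire(exampleTags, 2)); // []
```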
To summarize, here is a quick overview of the steps that you should take to make your Google Analytics implementation GDPR compliant.
Jente De Ridder | 22 March 2018