Wondering what you should do to make your digital tracking “GDPR compliant”? Well, I was too. Over the last year, I’ve read many interpretations of the GDPR legislation and how it will impact digital marketing. Much of it was utter bullshit (“we can’t do anything anymore!”) and often it only created more confusion.
To spare you the journey (and time investment) I went through, I decided to share my approach to the GDPR legislation in this blog post. This approach enables the marketing teams of our clients to keep delivering good (and targeted) user experiences, while maintaining a bond of trust with their customers. Note that a lot of this is based on my personal interpretation and the outcome of discussions with other marketers and lawyers. If you have another view on certain recommendations, don’t hesitate to reach out!
Update March 30th, 2018
This blogpost fueled discussion on LinkedIn after publication. See my original status update on LinkedIn and check the comments for interesting follow-up reads from privacy professionals.
One of the arguments that came up is that, under the GDPR, the ClientID used by Google Analytics to recognize a returning visitor should also be considered personal data. I therefore added a tip to this post on changing the _ga cookie (which stores the ClientID) to a non-persistent cookie (= session cookie).
GDPR is NO cookie legislation
Before I get to the recommendations, I’d like to point out a common misconception. The General Data Protection Regulation (GDPR) only involves personal data. It does not talk about cookies in any way. The GDPR does not replace the so-called “cookie law” (in official terms: the ePrivacy Directive).
The ePrivacy Directive focuses on the different types of cookies (session or persistent, first-party or third-party, etc.), not on the information contained in the cookie itself. The GDPR focuses on the information itself. A cookie only becomes subject to the GDPR when it contains personal data.
Note that the European Commission is working on new ePrivacy legislation (expected end of 2018) which will build upon the ePrivacy Directive and the GDPR. This legislation is expected to bring more clarity on how companies should handle tracking and advertising online.
Do you track personal data in Google Analytics?
The first step towards a GDPR compliant web analytics configuration is making an inventory of the data you are capturing. Do any of these data points fall under “personal data”?
The definition of personal data used within the GDPR is broad and encompasses both identified and identifiable persons.
For an identified person, the personal data is quite obvious: a name, email address, home address, social security number, etc. I assume that you are not collecting any of these in GA, as this would be a breach of the terms of service (it’s prohibited to collect Personally Identifiable Information, or PII). However, the term identifiable person is more complex. It covers any data point that could enable you to link that data point back to a specific person in the future. This encompasses randomized identifiers, IP addresses, etc.
So, with this in mind, what questions should you ask yourself while reviewing your Google Analytics configuration?
- Do we use the User ID feature? Or do we capture other identifiers in custom dimensions?
- Do we capture the IP address in GA?
- Are we sharing our Google Analytics data with other parties/tools? For example: DoubleClick, AdWords, etc.
If the answer to all of the above questions is No, your Google Analytics configuration is not in scope of the GDPR legislation. You are only capturing anonymous data. If you answered Yes to one or more questions, action is required to make sure you are GDPR compliant.
When can I use user identifiers in Google Analytics?
There are many use cases that require a user identifier to be stored in Google Analytics. For instance: your final conversion happens offline and you want to be able to attribute that offline conversion to an online channel. You want to have a more accurate count of the number of unique users that you are reaching (across browsers and devices). Or you want to inform your customer service agents which web pages were viewed by a customer before they contacted the call center.
User identifiers can be stored in Google Analytics by using the built-in User ID feature or a user-scoped custom dimension. The identifier itself can be any ID (customer id, prospect id, hashed email or home address, etc.) as long as the ID itself is anonymous (= it only functions as a key and does not contain any personal information by itself).
Note that it is against Google’s guidelines to store:
- Data that permanently identifies a particular device (such as a unique device identifier if such an identifier cannot be reset). (source)
- A User ID after the user has logged out. (source)
User ID and GDPR
To be GDPR compliant, a user identifier should only be stored after you have received the explicit consent of your user. Identifying someone within your tracking does not fall under “legitimate interest” (sorry to crush your hopes). And the GDPR clearly states that data subjects have the right to opt out of (and should be informed beforehand of) personalization and targeted advertising.
To be able to set a User ID dynamically, it is recommended to manage your tracking within a tag manager. This enables you to only set the User ID when the visitor has given consent for identification (thus enabling you to tie their browsing behaviour to their customer profile).
Below are the steps to set this up within Google Tag Manager.
GDPR proof User ID with GTM
Make sure that you capture the User ID in a variable. There are typically two ways to capture the User ID value in a Tag Manager variable: from a first party cookie or from the data layer (more on this on Simo’s blog).
Also make sure to capture the consent of the visitor in a Tag Manager variable. You would typically store a visitor’s consent in a first party cookie. In the example below, we use 3 different consent levels on the website. The level value is set in the cookieconsent cookie.
- level 1 means that the visitor does not want to be identified.
- level 2 means that the visitor agrees to be identified and that we can personalize his experience.
- level 3 means that the visitor agrees to be identified and that we can share his data with 3rd parties.
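As a rough sketch of how the consent level could be read in code, the function below pulls the level out of a cookie string. It assumes the cookieconsent cookie holds a bare number (1, 2 or 3); the function name getConsentLevel and the fallback to undefined are my own illustrative choices, not part of any GTM API. In a GTM Custom JavaScript variable you would pass in document.cookie, although a built-in 1st Party Cookie variable can do the same job without code.

```javascript
// Hypothetical helper: extract the consent level from a cookie string
// such as document.cookie ("name=value; name2=value2").
// Assumes the "cookieconsent" cookie stores a bare number: 1, 2 or 3.
function getConsentLevel(cookieString) {
  var pairs = (cookieString || '').split(';');
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].trim();
    var eq = pair.indexOf('=');
    if (eq === -1) continue;
    if (pair.slice(0, eq) === 'cookieconsent') {
      var level = parseInt(pair.slice(eq + 1), 10);
      // Anything outside 1..3 (including NaN) is treated as "unknown".
      return level >= 1 && level <= 3 ? level : undefined;
    }
  }
  return undefined; // cookie not set: consent unknown
}
```

Returning undefined for an unknown level keeps the "no consent recorded yet" case distinct from an explicit level 1.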
Next, bring both variables together with the use of a Lookup Table variable called Allow UserId. This variable will return the User ID only if the visitor has consent level 2 or 3. By not specifying a value for level 1 (or a default value for when the cookie consent level is unknown), the Lookup Table will return the value undefined.
Lastly, use this Allow UserId variable in your Google Analytics Settings variable to populate the userId field. In the case of consent level 1, GTM will detect the undefined value and will omit the userId field from the tag, i.e. as if you had not specified a “userId” field (thanks to Yu Hui for this tip).
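If you prefer to see the Allow UserId logic as code, here is a minimal sketch of what the Lookup Table does. The function name and its two parameters are hypothetical; in GTM the inputs would come from the consent and User ID variables described above.

```javascript
// Sketch of the "Allow UserId" lookup (function name is illustrative).
// Levels 2 and 3 return the User ID; level 1, unknown or missing consent
// all fall through to undefined, mirroring a Lookup Table without a default.
function allowUserId(consentLevel, userId) {
  var level = Number(consentLevel); // cookie values arrive as strings
  return (level === 2 || level === 3) ? userId : undefined;
}
```

For example, allowUserId('2', 'cust-123') gives back 'cust-123', while allowUserId('1', 'cust-123') gives undefined, so the userId field is left out of the hit.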
Note: the GDPR also provides for the right to be forgotten. Therefore, a data controller (you) should be able to delete all data points that can be linked to a specific data subject if this person asks to be forgotten. At the moment, it is not possible to delete a specific User ID or Custom Dimension value from historic Google Analytics data. However, it’s up to Google (who is the data processor in this case) to enable such functionality. I expect that they will launch such a feature somewhere in the beginning of May 2018. *Update: Google has announced that they will make a User Deletion API available.*
What about the cookies used by Google Analytics?
Google Analytics uses multiple cookies to make its tracking work. Of these cookies, the _ga cookie contains a (randomized) identifier used by Google Analytics: the ClientId. This ID is used to recognize a browser, enabling Google Analytics to tell you if this browser has been on your site before or not (the returning visitor metric).
Whether the ClientId should be considered personal data under the GDPR is debatable. But if you prefer to be safe rather than sorry, you might want to prevent Google Analytics from identifying the visitor’s browser when he/she prefers to stay anonymous. You can do this by changing the _ga cookie to a session cookie. This means that the cookie will be deleted once the user closes the browser.
How to change the _ga cookie to a “session cookie”?
By default, the _ga cookie (and thus the ClientId) is stored for 2 years in the browser of the visitor. By changing the expiration time of the cookie, you can limit the time that GA will “remember” a specific browser.
As already described above for the User ID, you will need to create a dynamic variable that takes the visitor’s consent into account when setting the expiration time of the _ga cookie. This can be done with a Lookup Table variable that returns the number of seconds that the cookie should be kept.
Use the Cookie Expiration variable in your Google Analytics Settings variable to populate the cookieExpires field. When a user has consent level 2 or 3, the expiration time is set to 63,072,000 seconds, which equals two years. When a user has consent level 1, the expiration time is set to 0 seconds, which means that the cookie will be deleted once the browser is closed.
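The arithmetic and the lookup can be sketched as follows. The function and constant names are illustrative, and treating an unknown consent level the same as level 1 (session cookie) is my own assumption; the value is meant for the cookieExpires field of analytics.js, which takes a number of seconds.

```javascript
// Two years expressed in seconds: 2 * 365 days * 24 h * 60 min * 60 s.
var TWO_YEARS_IN_SECONDS = 2 * 365 * 24 * 60 * 60; // 63072000

// Sketch of the "Cookie Expiration" lookup (name is illustrative).
// Assumption: unknown consent is handled like level 1 (session cookie).
function cookieExpiration(consentLevel) {
  var level = Number(consentLevel);
  return (level === 2 || level === 3) ? TWO_YEARS_IN_SECONDS : 0;
}
```

A value of 0 turns _ga into a session cookie, so the browser is only recognized until it is closed.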
Note: Google Analytics does not need cookies to function. You can also decide to prevent cookies from being set altogether. To do so, use the storage field and set its value to none.
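For the cookie-less option, a sketch along these lines might do. storage is the analytics.js field mentioned above; the function name createFieldsFor and the decision to switch on the consent level are assumptions for illustration only.

```javascript
// Sketch: build the tracker creation fields based on consent
// (function name is hypothetical).
// Without consent for identification, no cookie is written at all.
function createFieldsFor(consentLevel) {
  var level = Number(consentLevel);
  var consented = level === 2 || level === 3;
  return consented ? {} : { storage: 'none' };
}

// Possible usage with analytics.js (UA-XXXXX-Y is a placeholder property ID):
// ga('create', 'UA-XXXXX-Y', createFieldsFor(level));
```

In GTM the same value would go into the Fields to Set of your Google Analytics Settings variable instead.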
When should I anonymize the IP address in Google Analytics?
Just as with identifiers, the GDPR considers an IP address to be data that can be used to identify a person. Although this point of view is debatable, it’s best to anonymize the IP address for visitors that wish to stay anonymous.
Google already made it very easy to anonymize IP addresses a few years ago (May 2010 to be exact). By setting the anonymizeIp feature to true in the tracking script, the last octet of the IP address is changed to zero (for example, 192.0.2.144 becomes 192.0.2.0). This feature is called IP Masking in Google terminology.
The impact of IP Masking on your data is rather negligible. Google Analytics uses the IP address to determine the location from which your visitors are visiting your website. Stripping the last octet of the IP address will result in a less accurate location, but you probably won’t notice it in your reports, as GA already limits the level of detail shown in the location report to City.
How to mask IP addresses in GTM?
As already described above for the User ID, you will need to create a dynamic variable that takes the visitor’s consent into account when deciding if the IP address should be masked or not. This can be done with a Lookup Table variable that returns a true or false value.
Use this Anonymize IP variable in your Google Analytics Settings variable to populate the anonymizeIp field. The entire IP address will only be sent to GA when a user has consent level 2 or 3.
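In code, the Anonymize IP lookup boils down to something like the sketch below. The function name is illustrative, and masking the IP when the consent level is unknown is my own assumption of a safe default.

```javascript
// Sketch of the "Anonymize IP" lookup (name is illustrative).
// Returns true (mask the IP) unless the visitor agreed to identification.
// Assumption: unknown consent is treated like level 1, so the IP is masked.
function shouldAnonymizeIp(consentLevel) {
  var level = Number(consentLevel);
  return !(level === 2 || level === 3);
}
```

So shouldAnonymizeIp('1') is true and shouldAnonymizeIp('3') is false.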
Note: the mask IP feature only influences the default IP address field processed by GA. If you also capture the IP address in a custom dimension, you will need to create your own logic to mask the IP address before it is stored in the custom dimension.
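A minimal sketch of such masking logic for a custom dimension could look like this. It only handles IPv4 and zeroes the last octet, in the spirit of IP Masking; the function name maskIpv4 is hypothetical and IPv6 is deliberately left out.

```javascript
// Hypothetical helper: zero the last octet of an IPv4 address before
// storing it anywhere. Returns undefined for anything that is not a
// well-formed IPv4 address, so nothing unmasked slips through.
function maskIpv4(ip) {
  var octets = String(ip).split('.');
  if (octets.length !== 4) return undefined;
  for (var i = 0; i < 4; i++) {
    if (!/^\d{1,3}$/.test(octets[i]) || Number(octets[i]) > 255) {
      return undefined;
    }
  }
  octets[3] = '0';
  return octets.join('.');
}
```

Returning undefined for unexpected input means that, in GTM, the custom dimension is simply not populated rather than filled with a raw value.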
When can I use advertising features in Google Analytics?
By enabling advertising features in Google Analytics, you get additional data (demographic and interest reports) and you can take advantage of integrations with your AdWords and DoubleClick accounts. Very interesting for anyone who wants to fine-tune advertising targeting.
However, it’s important to understand that by enabling the advertising features, the GA script will also place DoubleClick cookies in addition to the default Analytics cookies. DoubleClick is Google’s advertising network. By placing the DoubleClick cookie, you share browsing behaviour data from your website visitors with Google.
Dynamically turn advertising features on or off in GTM
In terms of GDPR, it’s important to ask your visitors for their explicit consent to share their data with other parties before it happens. This implies that we need a way to turn the use of advertising features on or off dynamically. Again, GTM is the perfect tool for the job.
Create a variable that checks the cookie consent and returns a true or null value to determine if we can use the advertising features.
Apply this Allow Advertising features variable in your Google Analytics Settings variable to populate the displayFeaturesTask field. The DoubleClick cookie will only be set when a user has consent level 3.
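As a sketch, the consent check behind the Allow Advertising features variable could be written like this. The function name is illustrative; the true/null return values simply mirror the description above, and how they are wired into the tag is left to the GTM setup.

```javascript
// Sketch of the "Allow Advertising features" check (name is illustrative).
// Only consent level 3 (data sharing with third parties) returns true;
// every other level, or unknown consent, returns null.
function allowAdvertisingFeatures(consentLevel) {
  return Number(consentLevel) === 3 ? true : null;
}
```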
Other advertising pixels
The same principle applies to other advertising pixels (Facebook, Awin, etc.): by loading their pixels/cookies, you are sharing browsing behaviour data with a third party. Thus, you need your visitors’ consent.
To make sure that you only load advertising pixels when your visitor has agreed to data sharing with third parties, create specific triggers that take the consent level into account.
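Conceptually, such triggers act as a filter on the consent level. The sketch below models that idea outside of GTM; the tag list, the minLevel property and the function name are all hypothetical, and treating unknown consent as level 1 is an assumption.

```javascript
// Hypothetical tag registry: each tag declares the minimum consent level
// it needs before it may fire.
var tags = [
  { name: 'Google Analytics', minLevel: 1 },
  { name: 'On-site personalization', minLevel: 2 },
  { name: 'Facebook pixel', minLevel: 3 },
  { name: 'Awin pixel', minLevel: 3 }
];

// Return the names of the tags that may fire for a given consent level.
// Assumption: unknown or invalid consent is treated as level 1.
function tagsAllowedToFire(tagList, consentLevel) {
  var level = Number(consentLevel) || 1;
  return tagList
    .filter(function (tag) { return tag.minLevel <= level; })
    .map(function (tag) { return tag.name; });
}
```

In GTM itself, the equivalent is a trigger condition on the consent variable, for example "consent level equals 3" on every third-party pixel tag.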
Steps towards GDPR compliant Analytics
To summarize, here is a quick overview of the steps that you should take to make your Google Analytics implementation GDPR compliant.
- Start with reviewing your existing implementation and assess if you have any personal data stored in GA at the moment. If you prefer to keep the implementation simple, it’s best to not store any personal data. However, this will limit you in the amount of data that’s available and can also make the data less actionable (depending on your use cases). Note that personal data can show up in many forms within Analytics. Review the guidelines from Google on preventing the storage of personal information in Analytics.
- If you wish to capture personal information in GA, be very thoughtful about what consent is required. Provide a solution to gather the necessary consent and understand that the GDPR does not allow “one overall consent”. The user has the right to be selective in what can and cannot happen with their data.
- Next, configure your Google Tag Manager to take into account the consent level of a specific visitor before any tags are fired. GTM becomes your command center in terms of GDPR compliance for marketing tools.
- Review the users that have access to your Google Analytics data. As GA contains personal data, you must keep track of who is allowed to access that data and what they are doing with the data. This means that you should no longer allow access through generic email addresses (e.g. a shared marketing@ mailbox). Only allow access on personal addresses, so that you can hold people responsible if needed.
- Keep following Google’s updates on what features they are launching to help you to be GDPR compliant and understand your personal obligations as data controller. Some useful links below.
- Infographic on GDPR from the EU
- GDPR legislation text
- Open GDPR – Open source code to help you manage consent levels
- Google on data protection & compliance
- Google will provide a User Deletion API
- Tips from Google to prevent PII in Analytics
- Google Analytics Privacy Settings
- Google’s guidelines on user consent
- LinkedIn Post from Sergio Maldonado going into detail on the discussion if the concept of cookies falls under GDPR or not.