Disclaimer: this blog post has been written by digital analysts, not lawyers. Its purpose is to explain how to avoid processing any personal data with Matomo, so that you do not have to go through the GDPR compliance process for Matomo analytics. It is based on our interpretation of the following sources: the official GDPR text and resources from the UK Information Commissioner’s Office (ICO). It cannot be considered professional legal advice. Like GDPR itself, this information is subject to change. GDPR is also known as RGPD in French, Spanish and Portuguese; Datenschutz-Grundverordnung (DS-GVO) in German; Algemene verordening gegevensbescherming in Dutch; and Regolamento generale sulla protezione dei dati in Italian.
Are you looking for a way to not process any personal data with Matomo? If the answer is yes, you’re in the right place. From our understanding, if you are not processing personal data, then you shouldn’t be concerned about GDPR. Our inspiration came from this official reference:
“The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.”
In this blog post we are going to see how you can configure Matomo in order to not process any personal data and what the consequences are.
Not a Matomo user yet? Check out our live demo or start your free 21-day trial now.
Firstly, what is personal data according to the GDPR?
The term « personal data » is predominantly used in Europe.
From: eur-lex.europa.eu
(1) “‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;”
(30) “Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.”
Our interpretation is that this means any information that can be linked to a person and used to identify them. « Personal data » covers a wider range of information than Personally Identifiable Information (PII).
What is Personally Identifiable Information (PII)?
Personally Identifiable Information (PII) is a term used predominantly in the United States.
The US Office of Privacy and Open Government defines it as follows:
« The term personally identifiable information refers to information which can be used to distinguish or trace an individual’s identity, such as their name, social security number, biometric records, etc. alone, or when combined with other personal or identifying information which is linked or linkable to a specific individual, such as date and place of birth, mother’s maiden name, etc. »
Curious about what PII really means? This PII introduction will walk you through it.
What can be considered Personally Identifiable Information (PII)?
- Full name/usernames
- Home address/mailing address
- Email address
- Credit card numbers
- Date of birth
- Phone numbers
- Login details
- Precise locations
- Account numbers
- Security codes (including biometric records)
What other data is considered « personal data » according to GDPR?
From the above examples we see that all PII is personal data, but not all personal data is PII. To be on the safe side, let’s see what else can be considered « personal data » according to GDPR.
- IP addresses
- Cookie identifiers
- Page URL or page titles
- User ID and Custom “personal” data
- Ecommerce order IDs
- Location
- Heatmaps & Session Recordings
Let’s see each of them in more detail.
1. IP addresses
IP addresses can indirectly identify an individual. They can also give a good approximation of an individual’s location.
IP addresses are therefore considered personal data, which means you need to anonymise them. Matomo has a built-in feature for this in its privacy settings. We recommend that you anonymise at least the last two bytes.
See our configuration guide for more information
What are the consequences of using this feature?
When applying IP anonymisation on two bytes, you will no longer be able to see the full IP in the UI.
Moreover, there is a small chance that two different visitors with the same device and software configuration will be counted as the same visitor if their anonymised IP addresses are identical.
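To make the effect of two-byte anonymisation concrete, here is a minimal sketch. Matomo performs this step on the server; `maskIpv4` is a hypothetical helper written for illustration only, and it assumes IPv4 addresses.

```javascript
// Illustrative only: maskIpv4 is a hypothetical helper, not part of the Matomo API.
function maskIpv4(ip, bytesToMask) {
  const parts = ip.split(".");
  if (parts.length !== 4 || bytesToMask < 0 || bytesToMask > 4) {
    throw new Error("expected a dotted IPv4 address and 0 to 4 bytes to mask");
  }
  // Replace the last `bytesToMask` octets with 0.
  return parts
    .map((octet, index) => (index >= 4 - bytesToMask ? "0" : octet))
    .join(".");
}

// Two different visitors end up with the same anonymised address.
const visitorA = maskIpv4("192.168.100.14", 2);
const visitorB = maskIpv4("192.168.7.201", 2);
console.log(visitorA); // 192.168.0.0
console.log(visitorB); // 192.168.0.0
```

Both addresses collapse to the same value, which is why visitors who share a device and software configuration can occasionally be merged into one.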
2. Cookies
It is not yet clear to us whether all cookies are considered equal under GDPR. At this stage it is too early to make a definite decision.
Did you know? Matomo lets you optionally disable the creation of cookies by adding one extra line to your tracking code.
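As a rough sketch, the extra line is the `disableCookies` command, queued before the first `trackPageView` call. The rest of your tracking snippet (tracker URL, site ID) is omitted here, and `globalThis` is used in place of `window` only so that the sketch also runs outside a browser.

```javascript
// Matomo command queue. In a page this is usually `window._paq`.
var _paq = (globalThis._paq = globalThis._paq || []);

// Queue this before trackPageView so that no tracking cookies are created.
_paq.push(["disableCookies"]);
_paq.push(["trackPageView"]);
```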
See our configuration guide for more information
What are the consequences of using this feature?
Matomo uses a few first-party cookies. With cookies disabled, the following cookies, which may hold personal data, are no longer created:
- _pk_id: contains a visitor ID used to identify unique visitors
- _pk_ref: stores the referrer, used to identify where the visitor came from
3. Page URLs and page titles
URLs are not mentioned within the official GDPR text. However, depending on the CMS you use, some of your URLs may include personal identifiers.
For example, the URL of a profile or account page may contain a username or an email address.
As a result, you need to find a way to anonymise this data.
There are several ways to do this, depending on your website. If your website adds the personal data through query parameters, you can define a rule to exclude them from Matomo.
If the personal data is not in the query parameters, you can use the “setCustomUrl” tracker method to report a cleaned-up URL instead of the real one.
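Here is a minimal sketch of that approach. `scrubUrl`, the `user` and `profile` path prefixes and the example URL are all assumptions made for illustration; only the `setCustomUrl` and `trackPageView` commands come from the Matomo tracker. Adapt the rules to the URL patterns your own site produces.

```javascript
// Hypothetical helper: replaces path segments that follow "user" or "profile",
// and any segment that looks like it contains an email address.
function scrubUrl(rawUrl) {
  const url = new URL(rawUrl);
  const segments = url.pathname.split("/");
  const cleaned = segments.map((segment, index) => {
    const previous = segments[index - 1];
    if (previous === "user" || previous === "profile") return "anonymised";
    if (/@|%40/i.test(segment)) return "anonymised";
    return segment;
  });
  url.pathname = cleaned.join("/");
  return url.toString();
}

var _paq = (globalThis._paq = globalThis._paq || []);
// In a browser you would typically pass window.location.href here.
_paq.push(["setCustomUrl", scrubUrl("https://example.com/user/jane.doe/orders")]);
_paq.push(["trackPageView"]);
```

With these rules, every visitor's order page is reported under the same URL, `https://example.com/user/anonymised/orders`.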
See our developer documentation for more information
If you are also processing personal data within the title tag, you can use the following function: “setDocumentTitle”.
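The same idea applies to titles. In this sketch, `scrubTitle` and its simple email pattern are assumptions for illustration and will not catch every kind of personal data; `setDocumentTitle` is the tracker command named above.

```javascript
// Hypothetical helper: replaces anything that looks like an email address.
function scrubTitle(title) {
  return title.replace(/[^\s@]+@[^\s@]+\.[^\s@]+/g, "[anonymised]");
}

var _paq = (globalThis._paq = globalThis._paq || []);
// In a browser you would typically pass document.title here.
_paq.push(["setDocumentTitle", scrubTitle("Account settings for jane@example.com")]);
_paq.push(["trackPageView"]);
```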
What are the consequences of using this feature?
By anonymising the URLs containing personal data, some of your URLs will be grouped together.
4. User ID and custom personal data
User ID is a feature (a tracking code needs to be added) which allows you to identify the same user across different devices.
A User ID needs a corresponding database in order to link a user across different devices. It can be an email, a username, a name, a random number… All of these are either direct or indirect online identifiers and are therefore under the scope of GDPR.
It will be the same situation if you are using custom variables and/or custom dimensions in order to push personal data to the system.
To continue using the User ID feature but not recording personal data, you can consider using a hash function which will anonymise/convert your actual User ID into something like “3jrj3j34434834urj33j3”.
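One way to do this is a keyed hash computed on your server. The sketch below assumes a Node.js back end; `pseudonymiseUserId` and `USER_ID_SECRET` are hypothetical names, and HMAC-SHA256 is our choice for the example, not something Matomo requires.

```javascript
const { createHmac } = require("node:crypto");

// Hypothetical secret: keep it on the server and out of the page source.
const USER_ID_SECRET = "replace-with-a-long-random-secret";

// Hypothetical helper: returns a 64-character hex string derived from the real ID.
// The same input always gives the same output, so visits can still be linked.
function pseudonymiseUserId(realUserId) {
  return createHmac("sha256", USER_ID_SECRET).update(realUserId).digest("hex");
}

const hashedId = pseudonymiseUserId("jane@example.com");
// The page would then pass this value to the tracker:
// _paq.push(["setUserId", hashedId]);
```

A keyed hash is used instead of a plain hash so that someone who knows a user's email cannot simply recompute the value without the secret.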
Alternatively, you can enable the feature « Anonymise User IDs ». This feature will be available starting in Matomo 3.5.0.
What are the consequences of using this feature?
Under GDPR, User ID is personal data. Anonymising the User ID using a hash function or our built-in functionality makes the User ID pseudonymous, which means it can’t easily be linked to a specific user. As a result, you will still get accurate visits and unique visitors metrics, and the Visitor Profile, but without tracking the original User ID, which is personal data.
5. Ecommerce order IDs
Order IDs are the reference numbers assigned to the products/services bought by your customers. As this information can be cross-referenced with your internal database, it is considered an online identifier and is therefore under the scope of GDPR. As with User ID, you can anonymise order IDs using our built-in functionality to Anonymise Order IDs (see section 4 about User ID).
What are the consequences of anonymising order ID?
It really depends on your former use of order IDs. If you were not using them in the past then you should not see any difference.
6. Location
Based on the IP address of a visitor, Matomo can detect the visitor’s location. Location data is problematic for privacy as this technology has become quite accurate and can detect not only the city a visitor is from, but sometimes an even more precise position.
In order to not leave any accurate traces, we strongly recommend that you enable the IP anonymisation feature. Next, you need to enable the setting “Also use the anonymised IP address when enriching visits”. You will find this setting directly below the IP anonymisation. This is important as otherwise the full IP address will be used to geolocate a visitor.
What are the consequences of anonymising location data?
The more bytes you anonymise from the IP, the more anonymised your location will be. When you remove two bytes as suggested, the city and region location reports will not be as accurate. In some cases even the country may not be detected correctly anymore.
7. Heatmaps & Session Recordings
Heatmaps & Session Recording is a premium feature in Matomo allowing you to see where users click, hover, type and scroll. With session recordings you can then replay their actions in a video.
Heatmaps & Session Recordings are under the scope of GDPR because in some specific cases (for example, when a visitor fills in a contact form) they can disclose personal data.
To avoid this, Matomo will anonymise all keystrokes which a user enters into a form field unless you specifically whitelist a field. Many fields that could contain personal data, such as a credit card, phone number, email address, password, social security number, and more are always anonymised and not recorded.
See our configuration guide for more information
Note that a page may still show personal information as part of its regular content (not a form element), for example an address or the profile page of a forum user. We have added a feature which allows you to set an HTML attribute « data-matomo-mask » to anonymise any personal content shown in the UI.
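The attribute is normally written directly in your HTML templates. If that is not practical, it can also be added from a script, as in the sketch below. `maskPersonalContent` and the two CSS selectors are hypothetical, and the sketch assumes it runs before the recording starts capturing the page.

```javascript
// Hypothetical helper: adds the data-matomo-mask attribute to every element
// matching one of the given CSS selectors. `root` is usually `document`.
function maskPersonalContent(root, selectors) {
  let masked = 0;
  for (const selector of selectors) {
    for (const element of root.querySelectorAll(selector)) {
      element.setAttribute("data-matomo-mask", "");
      masked += 1;
    }
  }
  return masked; // number of elements that were marked
}

// Only runs in a browser. The selectors are made up for this example.
if (typeof document !== "undefined") {
  maskPersonalContent(document, [".user-address", ".forum-profile"]);
}
```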
What are the consequences of using this feature?
Mainly, you will not be able to see in plain text what people are entering into your forms.
What should you do with past data?
Once more, we have to say that we are not lawyers. So do not take our answers as legal advice. From: ec.europa.eu/newsroom/article29/document.cfm?doc_id=50053
“For example, as the GDPR requires that a controller must be able to demonstrate that valid consent was obtained, all presumed consents of which no references are kept will automatically be below the consent standard of the GDPR and will need to be renewed.”
Our interpretation is that if you were previously relying on consent, then unless you can demonstrate that valid consent was obtained, you need to either obtain consent again (which is almost impossible) or anonymise or remove that data.
To anonymise previously tracked data, we are actively working on a feature to do just that directly within Matomo. Alternatively, you may also set up the deletion of logs after a certain amount of time.
We really hope you enjoyed reading this article. GDPR is still evolving and we are pretty sure you have a lot of questions about it. You may also want to know how we see it. So do not hesitate to reach out through our contact form to find out how we are interpreting GDPR at Matomo and InnoCraft.
Matomo Analytics and GDPR
If you’re still wondering whether Matomo is the best option for you, check out our live demo or start your free 21-day trial now, no credit card required.
Matomo Analytics provides safeguards to ensure that you stay on top of privacy trends, are GDPR compliant and can protect your users’ privacy.