Matomo is a powerful analytics suite. However, if you have large amounts of data and want to combine it with data from other platforms, you may benefit from exporting it to a data warehouse platform such as BigQuery. Matomo On-Premise users have direct access to the database and can export their data that way. For Matomo Cloud customers, we offer an add-on that we can enable for your account to provide fully hosted and automated exports of your data. This guide covers not only how to enable and configure the feature on your account, but also how to import the exported data into BigQuery.

Enable BigQuery & Data Warehouse Exports

This feature is currently in beta and has been enabled for a small subset of our users. If this is you, you can jump to the next section of this guide. If you don’t have access yet and would like to enable it, please contact support. Once out of beta, this feature will require a paid subscription that you can add to any Matomo Cloud plan.

Configure Matomo for Data Warehouse Exports

There are several stages to enabling the export of your Matomo Analytics data to a data warehouse platform, and each stage is described below.

Select which sites to enable

The first step is enabling the specific sites in Matomo that you would like to sync. As soon as you enable a site for export, it will begin processing data into the export format on the backend. Hourly data limits apply to this processing, so please only enable sites that you plan to export to a data warehouse platform to ensure the fastest processing times.

  1. Click on the cog icon Settings Cog Icon to access your Settings area.

  2. Click on BigQuery & Data Warehouse under the Export menu item.

  3. Select the Site Id you’d like to export within the Sync Configuration section by clicking the checkbox.
    Data Warehouse Export Configuration

  4. Click the Enable Export button. This will start regular processing of your data, which will be made available at the hosted paths shared further down the page in the Export Table Paths section.

How to backfill data for export from Matomo

When you first enable the BigQuery and Data Warehouse exports, regular processing of your analytics data will begin from the date you enabled the feature. If you would like access to your historic data, you will need to configure the date ranges you’d like to backfill.

Note: This system uses Amazon S3 on the backend, which has strict data limits. It will therefore only process a certain amount of data per hour, and the number of days this covers will vary depending on the amount of data your Matomo instance contains.

To ensure that you get access to your most important data as soon as possible, we recommend only enabling smaller date ranges that you plan to actively access in your data warehouse platform to avoid filling your hourly quotas with data that may not be necessary yet.

  1. Visit the BigQuery & Data Warehouse settings page under the Export menu within the Matomo Admin settings.
  2. Scroll down to the Export Jobs section.
  3. Click on the Add Backfills Export Job button.
  4. Select a start and end date that you want to access in your data warehouse and click on Submit to queue your backfill request.
  5. Once you have submitted your Backfill request, it will enter Queued status. At this stage you will need to wait for the data to be processed and uploaded to the Export Table Paths shown further down the page.

You will notice that each date is added as a unique job, and the processing can take several hours, days, or longer depending on the amount of data you have tracked. There are several possible statuses for backfill jobs in the Export Jobs section.

  1. Queued – This means a backfill has been requested for the date and you will need to wait.
  2. Processing – This means the backfill request is currently processing and should soon be available for importing to your data warehouse platform.
  3. Finished – This means that data for the specific date has finished processing and should be available from the links in the Export Table Paths section for use within your data warehouse platform.

How to import to your data warehouse

There are many data warehouse platforms available and we can’t provide specific guides for each. However, BigQuery is one of the most popular, and Google usually offers free credits for getting started, so we have included a full guide for this process below. Other data warehouse tools are likely to follow a similar methodology, so if you use another platform you may be able to use this guide as a reference too.

Select which data to import to your data warehouse

Before you begin importing your Matomo analytics, it is important to make sure you are only importing data that you will actually access within your data warehouse. Not every site makes use of every Matomo feature, and there is no point wasting time setting up imports of empty tables for features you don’t use. For this reason, we have organised the tables into two lists below: recommended and optional. Everyone is likely to benefit from the recommended tables, whereas the optional tables are most relevant to people using the specific features described in each listing.

Recommended tables for all users

  1. log_action – This contains data about the actions available on your site and is recommended for most users.
  2. log_link_visit_action – This links data from the log_visit and log_action tables and is recommended for most users.
  3. log_visit – This table contains all visit data for your site and is recommended for most users.
  4. site – This table contains metadata about your site and is recommended for most users.
  5. goal – This contains data relating to the Goal Tracking feature. If you don’t use this feature, you can safely ignore this table.
  6. log_conversion – This contains data on conversions and is recommended for most users.

Optional tables

  1. funnel – This contains data relating to the Funnels feature. If you don’t use this feature, you can safely ignore this table.
  2. log_abtesting – This contains data relating to the AB Testing feature. If you don’t use this feature, you can safely ignore this table.
  3. log_clickid – This contains data relating to the Advertising Conversion Export feature. If you don’t use this feature, you can safely ignore this table.
  4. log_conversion_item – This contains data on ecommerce items as part of Ecommerce tracking in Matomo. If you don’t use this, you can safely ignore this table.
  5. log_form – This contains data relating to the Form Analytics feature. If you don’t use this, you can safely ignore this table.
  6. log_form_field – This contains data relating to the Form Analytics feature. If you don’t use this, you can safely ignore this table.
  7. log_form_page – This contains data relating to the Form Analytics feature. If you don’t use this, you can safely ignore this table.
  8. log_hsr – This contains data relating to the Heatmaps and Session Recording feature. If you don’t use this, you can safely ignore this table.
  9. log_hsr_blob – This contains data relating to the Heatmaps and Session Recording feature. If you don’t use this, you can safely ignore this table.
  10. log_hsr_event – This contains data relating to the Heatmaps and Session Recording feature. If you don’t use this, you can safely ignore this table.
  11. log_hsr_site – This contains data relating to the Heatmaps and Session Recording feature. If you don’t use this, you can safely ignore this table.
  12. log_media – This contains data relating to the Media Analytics feature. If you don’t use this, you can safely ignore this table.
  13. log_media_plays – This contains data relating to the Media Analytics feature. If you don’t use this, you can safely ignore this table.
  14. site_form – This contains data relating to the Form Analytics feature. If you don’t use this, you can safely ignore this table.

If you aren’t sure exactly which tables you need to include, you can set up imports for all 20 of them. Any tables that are not used will simply appear empty within your data warehouse as they would in the Matomo database itself. You can access deeper documentation on the database schema within our developer documentation.
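
To give a sense of how the recommended tables fit together once they are in your data warehouse, the example below joins log_visit, log_link_visit_action and log_action to list the pages viewed during each visit. It is only a sketch: it assumes the example project name matomo-project and dataset name matomo_export used later in this guide, and column names from the standard Matomo schema, so adjust them to match your own setup.

SELECT
  v.idvisit,
  lva.server_time,
  a.name AS page_url
FROM `matomo-project.matomo_export.log_visit` AS v
JOIN `matomo-project.matomo_export.log_link_visit_action` AS lva
  ON lva.idvisit = v.idvisit
JOIN `matomo-project.matomo_export.log_action` AS a
  ON a.idaction = lva.idaction_url
ORDER BY lva.server_time DESC
LIMIT 100

This query will only return rows once all three tables have been imported and contain data.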

How to configure your Matomo datasets and tables in BigQuery

  1. Visit Google Cloud Console and enable the BigQuery API and the BigQuery Data Transfer API.

  2. Visit BigQuery in the Google Cloud Console.

  3. Open or create a Project linked to a billing account from the selector at the top left of the screen.
    BigQuery Add Button

  4. Click on the three dots within the Explorer panel to reveal a sub-menu and click Create dataset.
    Create Data Set

  5. Define the settings for your dataset by choosing a descriptive Dataset ID and a Region. We recommend selecting europe-west3 for the most efficient transfers.
    Create BigQuery Data Set

  6. Click on the three dots next to the dataset and then click Create table.
    Create Table

  7. Complete the options as shown in the screenshot below, making sure to select your project and dataset. We recommend using the same table name as in the source data to avoid confusion, for example log_visit.
    Create BigQuery Table
    You will notice in the screenshot that we have toggled the Schema to Edit as text. There is specific schema text that you should include for each table; these schemas are available on GitHub so you can copy and paste them into this section. Once you have pasted the schema in, click the Create table button to finalise.

  8. You now need to repeat steps 6 and 7 for each of the tables that you plan on importing from Matomo.
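
If you are comfortable with SQL, BigQuery also lets you create tables with DDL statements in the query editor instead of the form above. The snippet below is only an illustration using a handful of columns from the site table and the example project and dataset names used later in this guide; the authoritative column lists and types are the schema files on GitHub mentioned in step 7, so base your real tables on those.

CREATE TABLE `matomo-project.matomo_export.site` (
  idsite INT64,
  name STRING,
  main_url STRING,
  ts_created TIMESTAMP,
  timezone STRING,
  currency STRING
);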

How to configure data transfers from Matomo to BigQuery

  1. To get started, visit the BigQuery and Data Warehouse page under the Export menu within Matomo.

  2. Within the Credentials section, click on the Generate Credentials button and securely save these details for later.

  3. Visit BigQuery in Google Cloud Console.

  4. Click on the Data transfers tab and then click Create a transfer.

  5. Select Amazon S3 as the Source.
    Create BigQuery Transfer

  6. Complete the Transfer configuration fields:

    1. Display name: Something descriptive, for example the dataset and table name.
    2. Repeat Frequency: You can leave this at the default of repeating every 24 hours.
    3. Start date and run time: Set this to midnight to ensure full data for the prior day.
    4. Dataset: This should be the dataset you have created within BigQuery.
    5. Destination table: This should be a table that you’ve created within BigQuery and if you followed the guide above will share the same name as the table in Matomo.
    6. Amazon S3 URI: You will need to source this from the Export Table Paths section on the BigQuery and Data Warehouse page in Matomo for the relevant table. The URI looks a little something like the following:
      s3://prod-processed-customer-exports/demo.matomo.cloud-id/site/*
    7. Access key ID: This is the access key you were provided when clicking Generate Credentials on the BigQuery and Data Warehouse page in Matomo.
    8. Secret access key: This is the secret access key you were provided when clicking Generate Credentials on the BigQuery and Data Warehouse page in Matomo.
    9. File format: Select JSON
    10. Write Disposition: For the majority of tables, you should select WRITE_APPEND to ensure that new data is added to the table instead of overwriting prior transfers. However, if you are configuring the site, site_form, funnel, or goal tables, these can safely be configured to use WRITE_TRUNCATE.
  7. Scroll down to the bottom and click Save to enable the data transfer.

  8. To create another transfer, click the + CREATE TRANSFER link at the top of the page and repeat steps 5 to 7 for each table that you want to import into BigQuery.
    Create Transfer

Once you have set up data transfers for all the tables, they should begin regular automated transfers according to your configured schedule from the date you set them up. One thing to keep in mind is that not every table will always contain new data every single day, so it can be normal for a transfer to run without importing new data if nothing has changed.
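
If you want a quick way to confirm that the scheduled transfers are keeping your warehouse up to date, a simple sketch (again assuming the example project and dataset names used later in this guide) is to check the most recent visit timestamp that has reached BigQuery:

SELECT MAX(visit_last_action_time) AS latest_visit
FROM `matomo-project.matomo_export.log_visit`

A result that is roughly a day behind is expected with a daily schedule, while a larger gap may simply mean the site recorded no recent visits.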

Note: If you would like to backfill data from prior dates, you will need to ensure that you have configured and completed the backfill process in Matomo. Then within BigQuery, click on the relevant data transfer and there is a link to SCHEDULE BACKFILL at the top right of the page. Clicking this will load a configuration pop-up that lets you select the date range that you would like to backfill.

Schedule BigQuery Backfill

How to run your first query in BigQuery

Once your first data transfer has completed, you can verify that your tables contain data by running a query on each of the tables. Full support for BigQuery is beyond the scope of this guide. The official BigQuery documentation can be found here.

One example query you could run to check that records have been imported is:

SELECT * FROM `matomo-project.matomo_export.log_visit` LIMIT 1000

Make sure that you replace matomo-project with your actual project name, matomo_export with your dataset name, and log_visit with your table name.

Matomo MySQL Query

As long as this query returns results, you can be sure that your data has begun transferring. For more help navigating the exported tables, we have provided more examples in our FAQ on how to write SQL queries for the Matomo database.
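
As a further example in the same spirit, the sketch below aggregates visits per day from the log_visit table. It assumes the visit_first_action_time column is imported with a timestamp type as defined in the GitHub schema files; if your schema stores it as a string, wrap the column in a CAST before applying DATE().

SELECT
  DATE(visit_first_action_time) AS visit_date,
  COUNT(*) AS visits
FROM `matomo-project.matomo_export.log_visit`
GROUP BY visit_date
ORDER BY visit_date DESC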

Matomo Database Schema Text for BigQuery

When creating tables for the Matomo data in BigQuery, you will need to define a schema for each of them to allow for successful syncing. Currently, the most efficient way of doing this is to copy and paste the database schemas from this GitHub repository as text into the BigQuery editor.