Have you ever been using Google Search Console and found it a pain when you have to perform multiple filters on URLs or queries to analyse performance the way you want? This is where Google Search Console’s new regular expression feature comes into play. Added in April 2021, Search Console’s regex filtering enables you to search for multiple elements in a query or page to return data that is of interest to you. This allows you to perform analysis as broad or as granular as you require.
However, this new feature does have some limitations. Google Search Console uses RE2, a regular expression syntax that you may have come across before if you’ve used regex filters in Google Analytics. RE2 is simpler to use but comes at the cost of some useful functionality. In this article, we will discuss how webmasters and marketers can use regex filters to gain insight into their organic search performance to feed into future strategies.
What is RegEx?
A regular expression, or regex, is a sequence of symbols and characters applied as a condition in a search or filter to match a particular pattern of text. As a simple example, the pipe character “|” acts as an OR function in regular expressions. So, “red|blue” in an RE2 filter would return any string containing “red” or “blue”.
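To see the OR behaviour outside Search Console, here is a minimal Python sketch. Python's built-in `re` module is not the engine Search Console uses, so treat this as an approximation; for simple alternation like this the behaviour is assumed to be the same. The sample strings are made up for illustration.

```python
import re

# Hypothetical sample strings, not real Search Console data.
samples = ["red shoes", "blue jeans", "green hat", "bored panda"]

# "|" means OR. re.search looks for the pattern anywhere in the string,
# which mirrors a "contains" style filter.
pattern = re.compile(r"red|blue")

matches = [s for s in samples if pattern.search(s)]
print(matches)
```

Note that "bored panda" is returned as well, because the pattern is matched as a run of characters rather than as a whole word.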
Regular expressions are often most useful as a means of grouping results to give more insight into a particular area of interest. They are commonly used in JavaScript and tracking setups to capture multiple instances of a user action or to trigger tracking. You can also use them in Google Analytics to understand the performance of themes or content on the site.
For example, on an ecommerce site, you could compare the performance of different page types to understand the performance between category or product list pages (PLPs) versus product detail pages (PDPs). Applying regular expression can enable you to analyse this performance in Google Search Console to understand the organic visibility of different groups of pages or keywords. This information can fuel your SEO strategy.
For the purpose of this piece, we will be focusing on regular expression in the form of the Custom (regex) filter in Google Search Console. This functionality has recently been added to Google Search Console and can be used to make efficiencies when analysing organic data in the performance reports.
How to apply a regular expression filter in Google Search Console?
Google Search Console has introduced the use of regular expression to filter results in the performance report.
Using the custom filter option at the top of the performance report, you can apply a filter with regex using the Custom (regex) dropdown option.
Why is it useful?
When analysing data in Search Console, regex can save time by streamlining the data manipulation needed to extract insights. For example, if you want to isolate a batch of search queries containing several different words, you can combine these into a single filter instead of applying multiple filters and then combining the data once you’ve exported it. This allows you to analyse the data as broadly or as granularly as you require.
The most helpful RegEx filters in Google Search Console
Once you start using regular expressions – if you don’t already – you’ll likely find yourself using some much more than others. Here are a few of the ones that we find ourselves tapping in the most.
Regex cheat sheet
Regular Expression Character | What it does |
| | Or. Place between two strings to match either one. |
. | Any single character. |
(.*) | This “greedy” wildcard will match any string of characters where placed. |
^ | Begins with. Place at the front of a string so it won’t match any characters before it. |
$ | Ends with. Place at the end of a string so it won’t match any characters that follow. |
\ | Escape (backslash, not forward slash). This tells the regex engine to treat literally any character that it would otherwise interpret as a regex function. So, “www.” would be understood by a regex as three Ws followed by any one other character unless you type “www\.” |
[0-9] | Any single digit between 0 and 9. To match a specific number of characters, follow it with the number in curly braces (squiggly brackets to me and you), e.g. [0-9]{4} matches any four-digit string. |
[a-z] | Any single lowercase letter from a to z. Again, use curly braces as above to specify how many characters you want to match. |
The full list of regular expressions supported by Google can be found here.
To debug your regex, we recommend you use a regular expression tool such as regex101 to make sure that only the URLs or queries you want data for are included.
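Alongside an online tester, a small local script can check a pattern against strings you expect it to match and strings you expect it to reject. The sketch below is one possible approach, not a prescribed workflow: `check_pattern` is a hypothetical helper, the URLs are invented, and Python's `re` module stands in for Search Console's engine, so results are only indicative for simple patterns.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return (missed, unexpected): strings that did not behave as expected."""
    compiled = re.compile(pattern)
    missed = [s for s in should_match if not compiled.search(s)]
    unexpected = [s for s in should_not_match if compiled.search(s)]
    return missed, unexpected

# Hypothetical URLs for illustration only.
missed, unexpected = check_pattern(
    r"^https://www\.example\.com/blog/",
    should_match=["https://www.example.com/blog/regex-guide"],
    should_not_match=[
        "https://www.example.com/shop/blog-notebook",
        "https://wwwxexample.com/blog/regex-guide",
    ],
)
print("missed:", missed)
print("unexpected:", unexpected)
```

Both lists come back empty here. Removing the backslashes from the pattern would let the "wwwxexample" URL through, which is the escaping pitfall described in the cheat sheet.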
How to use Custom (regex) filter in Search Console (use cases)
1. Create a list of queries or URLs you want to match
Want to filter your search queries for any that contain trainers OR sneakers OR training shoes? Use the pipe character “|” as an or function in your regex filter:
Example: trainers|sneakers|training shoes
If you want to do the same thing but match only those search terms exactly, add “^” and “$” to indicate the beginning and end of each string respectively.
Example: ^trainers$|^sneakers$|^training shoes$
Tip: when searching for all variations of a term relevant to your website, use the pipe (|) to list each variation.
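The difference between the "contains" and "exact" versions of this filter can be sketched in Python. The queries are hypothetical, and Python's `re` module is used as a stand-in for Search Console's engine, which is an assumption that should hold for anchors and alternation.

```python
import re

# Hypothetical search queries for illustration.
queries = ["trainers", "mens trainers", "sneakers sale", "training shoes", "boots"]

# Matches any query that contains one of the terms.
contains = re.compile(r"trainers|sneakers|training shoes")

# Matches only queries that are exactly one of the terms.
exact = re.compile(r"^trainers$|^sneakers$|^training shoes$")

contains_hits = [q for q in queries if contains.search(q)]
exact_hits = [q for q in queries if exact.search(q)]

print(contains_hits)
print(exact_hits)
```

The first list includes longer queries such as "mens trainers", while the second keeps only "trainers" and "training shoes".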
1.1 Brand search
The approach described above can be exceedingly helpful when you want to include or exclude brand terms in your search query report, but your brand is searched with multiple variations or with common misspellings.
Example: mcdonalds|macdonalds|maccies
1.2 Questions
If you are planning new informational pages, the questions that have already triggered your results in Google can be a good starting point for optimisation. In the string below we’ve used “^” so that each term only matches at the start of a search query. “^are”, for example, matches “are regular expressions worth learning” but not a non-question query like “regular expressions are the best”.
Example: ^what|^when|^why|^who|^how|^where|^do|^are
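One caveat worth checking is that a term like “^do” also matches queries that merely start with those letters, such as “dog food reviews”. The Python sketch below compares the filter above with a stricter variant that adds a word boundary (`\b`). The queries are invented, Python's `re` is a stand-in for Search Console's engine, and although RE2 documents `\b` as an ASCII word boundary, it is worth confirming the stricter pattern in Search Console itself before relying on it.

```python
import re

# Hypothetical search queries for illustration.
queries = [
    "are regular expressions worth learning",
    "regular expressions are the best",
    "how to use regex",
    "dog food reviews",
    "do regex filters work in search console",
]

# The filter as written in the article.
loose = re.compile(r"^what|^when|^why|^who|^how|^where|^do|^are")

# Stricter variant: the question word must be a complete word.
strict = re.compile(r"^(what|when|why|who|how|where|do|are)\b")

loose_hits = [q for q in queries if loose.search(q)]
strict_hits = [q for q in queries if strict.search(q)]

print(loose_hits)
print(strict_hits)
```

The loose version keeps “dog food reviews”; the stricter one drops it while still keeping the genuine questions.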
2. Match unusual URL patterns
It’s easy to isolate your blogs in Google Search Console if your URL structure follows a nice simple /blog/ format. But what if you’re using WordPress and the only thing that identifies blog URLs is the date at the beginning, e.g. “/2021/07/16/blog-name”? Here you can begin to use the regex functions we’ve described to match those tricky dates.
Example: /[0-9]{4}/[0-9]{2}/[0-9]{2}/(.*)
This might look complicated, but if you break it down it simply reads: any four digits, forward slash, any two digits, forward slash, any two digits, forward slash, then any number of any other characters.
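As a quick check of that reading, the same pattern can be run against a few sample URLs in Python. The URLs are hypothetical and Python's `re` module is assumed to behave like Search Console's engine for this pattern.

```python
import re

# The date-based URL pattern from the example above.
date_path = re.compile(r"/[0-9]{4}/[0-9]{2}/[0-9]{2}/(.*)")

# Hypothetical URLs for illustration.
urls = [
    "https://www.example.com/2021/07/16/blog-name",
    "https://www.example.com/blog/blog-name",
    "https://www.example.com/products/2021/summer",
]

blog_urls = [u for u in urls if date_path.search(u)]
print(blog_urls)

# The (.*) group captures whatever follows the date.
slug = date_path.search(urls[0]).group(1)
print(slug)
```

Only the first URL matches: the third contains a year but not the full year/month/day sequence.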
3. Match product ID or SKU
Say, for example, you’ve identified that your customers find your store by typing in your product IDs or SKUs and want to see the extent of this in Google Search Console. No normal filter will do the trick. You’re not going to want to type in every single product ID separately and record the output!
Instead, you can use digits [0-9] and letter identifiers [a-z] to stipulate the pattern to search for. For instance, if your product IDs followed the pattern “p1234567” you could use the regex below to look for one letter followed by seven digits.
Example: [a-z]{1}[0-9]{7}
If the letter is always “p”, you can adjust the regex to include only “p”. This would look like “p[0-9]{7}”.
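Because the filter matches anywhere in the query, the pattern above will also fire inside longer codes. The Python sketch below shows this and one possible tightening that requires a space or the start/end of the query on either side of the ID. The queries are made up, the tightened pattern is a suggestion rather than part of the original example, and Python's `re` is used as an approximation of Search Console's engine.

```python
import re

# The pattern from the example: one letter followed by seven digits.
loose = re.compile(r"[a-z]{1}[0-9]{7}")

# Possible tightening: the ID must be a standalone token in the query.
bounded = re.compile(r"(^| )[a-z][0-9]{7}( |$)")

# Hypothetical search queries for illustration.
queries = ["p1234567", "buy p1234567 online", "ref ab123456789", "p123456"]

loose_hits = [q for q in queries if loose.search(q)]
bounded_hits = [q for q in queries if bounded.search(q)]

print(loose_hits)
print(bounded_hits)
```

The loose pattern also picks up “ref ab123456789”, since “b1234567” sits inside it; the bounded version keeps only the two queries that contain a standalone product ID.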
What’s not supported by Google Search Console’s RegEx filter
If you are already familiar with regular expressions, there are certain things you’ll expect them to do. However, because Google Search Console uses RE2, not all regex syntax is supported, which limits how you can filter your data.
Of particular note, the negative lookahead syntax “(?!...)”, which states that a match must not include something, is not supported by RE2. This means we cannot write a single expression that excludes brand terms in order to compare brand vs. generic performance.
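One way to make that comparison once the data has been exported is to apply the brand pattern outside Search Console and treat everything that does not match as generic. The sketch below is a minimal illustration under stated assumptions: the rows are invented stand-ins for an exported queries report, and the brand pattern is the one from the brand search example.

```python
import re

# Brand variations, as in the brand search example.
brand = re.compile(r"mcdonalds|macdonalds|maccies")

# Hypothetical rows standing in for an exported report: (query, clicks).
rows = [
    ("mcdonalds menu", 120),
    ("burger near me", 45),
    ("maccies breakfast times", 30),
    ("fast food delivery", 25),
]

# Queries that match the pattern count as brand; the rest count as generic.
brand_clicks = sum(clicks for query, clicks in rows if brand.search(query))
generic_clicks = sum(clicks for query, clicks in rows if not brand.search(query))

print("brand:", brand_clicks)
print("generic:", generic_clicks)
```

Negating the match in code sidesteps the need for a lookahead in the expression itself.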
It’s also worth mentioning that the Custom (regex) filter is only available for page and query, not for device or location values. Search Console users can only select one value at a time under these options.
To conclude, regex filtering in Google Search Console is an improvement on what was previously available. However, the limited syntax support means there are still improvements required to assist strategic insight, and webmasters will still need to export their data from the Search Console interface for further manipulation.