Google search queries in the times of COVID-19
Mass-extracting Google search trends
Google Trends is a great tool to explore the online collective mindset of a nation using the aggregate internet queries as an indicator of what people think is important right now.
If you are curious, you can visit the daily search trends page of their website to get an executive summary of what is going on in the country. You can also explore topics and try your own keywords in this great free tool made available by Google. Here is a screenshot of the keyword “beer delivery” for the UK for 2020 so far (the nationwide lockdown started on March 23rd, coinciding with the steep rise in search interest):
Search volume is normalized to a range of 0–100 for the keywords selected, where 100 indicates the date on which the highest search volume occurred.
As I mentioned before, this tool is great. However, it has some limitations when you want to compare several categories across multiple countries, speaking different languages, at the same time. In this blog post I will describe the rough steps you can take to mass-extract data from Google Trends and create your own ad-hoc analysis with Python.
pytrends is the “unofficial” open-source API for extracting data from Google Trends. Once you have installed it, you will find that it is easy to use. I will be showing the example for “interest over time”, but pytrends can also pull “interest by region”, “related topics”, “related queries” and more data points. The basic gist of pytrends lies in the parameters you use to initiate the requests: the list of keywords to pull, the geography (or geographies) that you want to analyse, and the timeframe.
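To make those three parameters concrete, here is a minimal sketch of an “interest over time” extraction loop. It is not the original script from this post: the function names `build_requests` and `fetch_interest_over_time` are made up for illustration, and pairing every keyword with every country is an assumption about how you might want to organise the requests. The pytrends calls used are `TrendReq`, `build_payload` and `interest_over_time`.

```python
from itertools import product


def build_requests(keywords, geos):
    """Pair every keyword with every country code (one request per pair)."""
    return list(product(keywords, geos))


def fetch_interest_over_time(keywords, geos, timeframe):
    """Pull 'interest over time' for each (keyword, geo) pair into one long table.

    Hypothetical helper: requires `pip install pytrends pandas` and network access.
    """
    # Imported lazily so build_requests() works even without pytrends installed.
    import pandas as pd
    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US", tz=0)
    frames = []
    for keyword, geo in build_requests(keywords, geos):
        # The three core parameters: keyword list, geography and timeframe.
        pytrends.build_payload([keyword], timeframe=timeframe, geo=geo)
        df = pytrends.interest_over_time()
        if df.empty:  # no data returned for this keyword/country pair
            continue
        df = df.drop(columns="isPartial", errors="ignore")
        df = df.rename(columns={keyword: "interest"}).reset_index()
        df["keyword"] = keyword
        df["geo"] = geo
        frames.append(df)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

A call might look like `fetch_interest_over_time(["beer delivery"], ["GB"], "2019-01-01 2020-06-30")`. Because each keyword is sent in its own request, the 0–100 values are scaled per keyword and country, so they show how interest moves over time rather than which keyword is bigger.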
After extracting the information, I noticed that the results could be improved if the list of keywords was specific to the official language spoken in each country (otherwise, the results will contain only searches made in English). So I created a dedicated dataframe with the local-language equivalents of the 5 keywords of interest and used this structure to replace the zip_list shown in line 13. The results were much more representative, as you will see below.
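One way to picture that local-language structure is a small lookup table that yields one (category, country, keyword) row per request. This is only a sketch: the name `LOCAL_KEYWORDS`, the helper `localized_requests` and the translations themselves are illustrative assumptions, not the five keywords or the dataframe actually used for this post.

```python
# Hypothetical lookup: category -> {country code: keyword in the local language}.
# The translations are illustrative examples, not the ones used in the post.
LOCAL_KEYWORDS = {
    "beer delivery": {
        "GB": "beer delivery",
        "ES": "cerveza a domicilio",
        "FR": "livraison bière",
    },
}


def localized_requests(table):
    """Flatten the lookup into (category, geo, keyword) rows, one per request."""
    return [
        (category, geo, keyword)
        for category, per_geo in table.items()
        for geo, keyword in per_geo.items()
    ]


for category, geo, keyword in localized_requests(LOCAL_KEYWORDS):
    print(f"{geo}: search '{keyword}' and report it under '{category}'")
```

Keeping the English category name next to each local keyword means the results from different countries can still be grouped and compared under the same label.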
Some Analysis
After saving my data to an Excel file, I was able to create an analysis comparing the variation per week of the year, in order to understand which categories are attracting more search volume vs. a representative baseline (same period, year before). To add context, I added a line that denotes the start of the lockdown in each country. Here are some of the results:
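The week-over-year comparison can be sketched in a few lines of plain Python. This assumes weekly data points given as (date, interest) pairs and a percent-change formula against the same ISO week of the previous year; the function names and the sample numbers are invented for illustration and are not results from this analysis.

```python
from datetime import date


def index_by_iso_week(rows):
    """Turn (date, interest) pairs into {iso_year: {iso_week: interest}}."""
    table = {}
    for day, interest in rows:
        iso_year, iso_week = day.isocalendar()[:2]
        table.setdefault(iso_year, {})[iso_week] = interest
    return table


def weekly_variation(current, baseline):
    """Percent change per week vs. the same week in the baseline year."""
    variation = {}
    for week, value in current.items():
        base = baseline.get(week)
        if not base:  # skip weeks with a missing or zero baseline
            continue
        variation[week] = (value - base) / base * 100
    return variation


# Made-up sample values, only to show the shape of the calculation.
sample = [(date(2019, 3, 25), 20), (date(2020, 3, 23), 90)]
by_week = index_by_iso_week(sample)
print(weekly_variation(by_week[2020], by_week[2019]))
```

For the comparison to be meaningful, both years should come from a single request (one timeframe spanning both), since Google Trends rescales the 0–100 values within each request.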
Conclusion
pytrends is a great way to facilitate the massive extraction of data from Google Trends. It enables fast multi-category, multi-country and multi-language analysis with just a couple of lines of code. Google Trends is very good as it currently is, but this tool will enable you to go much faster and a little bit further for more complex analysis.
If you want to have a conversation about this or any other topic, please feel free to connect with me on LinkedIn: