A small slice of big data could go a long way for social sciences: migration studies.

Big data is a big deal! It will become even more so in the years to come, as we learn to appreciate long data and wide data (that is, longitudinal big data and the question of how representative the data is).

As more people and activities go online, big data will become more and more representative. There will of course still be differences between, say, Google search data and Bing search data, in the same way that the demographic profiles of Mercedes drivers and Audi drivers may differ in ways that change over time. So for scientific purposes, or even for commercial ones, it will be critical to know the demographics of the online population behind each data trove. This is where the fiercest battle is currently being fought: because it is very hard to get demographics from search data, the big players are focusing on social media, where people volunteer their data. This is why, for example, Google+ is a big bet for Google, and why Google will offer you everything you need in terms of disk space and services so that you keep your online activities on servers they control and index.

Regarding long data, a lot of research will be necessary to make sure the data remains meaningful over long time spans. As more and more companies collect and build big data troves on their customers, those customers will eventually become the product, in the same way that Facebook’s users are not its customers but its product.

Search for "turkiye" by Fed. State in Germany

When it comes to big data, nothing is bigger than Google search data. I want to quickly highlight an example of how we could use Google Trends for migration studies. The graph above is a simple heat map showing the relative average intensity of searches for the word “Turkiye” by Federal State in Germany. The premise is that, since essentially only Turkish speakers would search for that spelling, the search intensity will correlate with the distribution of Turkish-speaking people across Germany (and of course also with other, time-dependent variables). It is easy to see that Nordrhein-Westfalen is the state with the most searches, which is consistent with the fact that the largest share of the Turks in Germany (32%) live there. Number two is Baden-Württemberg, which has the second-largest share of the Turkish population (17%). The map also shows very nicely that, except in Berlin, there are practically no Turks in East Germany.
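The premise above can be checked with a rank correlation between search intensity and population share per state. What follows is only a minimal sketch, not the method behind the heat map: the function names (`average_ranks`, `pearson`, `spearman_by_state`) are made up for illustration, and the numbers in the usage example are placeholders, not real Google Trends output or census figures. Spearman's rank correlation is used here as an assumption, because Google Trends reports relative intensities rather than absolute counts, so comparing rankings seems safer than comparing raw values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / (sx * sy)


def spearman_by_state(search_intensity, population_share):
    """Spearman rank correlation over the states present in both dictionaries."""
    states = sorted(set(search_intensity) & set(population_share))
    if len(states) < 3:
        raise ValueError("need at least three states in common")
    x = [search_intensity[s] for s in states]
    y = [population_share[s] for s in states]
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for illustration only (hypothetical states, invented numbers).
example_search = {"State A": 100, "State B": 60, "State C": 35, "State D": 5}
example_share = {"State A": 30.0, "State B": 18.0, "State C": 9.0, "State D": 1.0}
print(spearman_by_state(example_search, example_share))
```

A coefficient close to 1 would support the premise that the two rankings agree; it would not by itself rule out the other time-dependent variables mentioned above.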
