Details on Data - 1: Spatial Details




The key data set for this exercise is the tweet feed for Europe in February 2012. More precisely, the following countries are included in our study:

  • Germany
  • The Netherlands
  • Italy
  • France
  • Spain
  • Great Britain & Ireland
  • Greece

All islands belonging to these countries are of course also included when searching for the money-related tweets.

The dataset was originally extracted from a worldwide dataset. It is hosted locally in Postgres databases; for more information, please see "Details on Software".


To get a better overview of the intensity of tweets relative to the total population of each country, the table containing the tweets was queried using a bounding box (the geographical extent given by X and Y coordinates, i.e. longitude and latitude) for each individual country. This method is a good way to start, although I have to mention that a bounding box is strictly rectangular while country borders are not, which introduces some inaccuracy: tweets sent from a neighbouring country or from the sea can fall inside the box, and the boxes of adjacent countries can overlap. If more detail is wanted, it would be necessary to work with the exact borders and extents of each country.
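As a rough sketch of the more exact approach: assuming a table of country polygons were available, PostGIS could test each tweet against the real border instead of a rectangle. The table `country_borders` and its columns `name` and `geom` are hypothetical and not part of this project's database; the sketch also assumes the polygons use the same coordinate system (SRID) as `coords`.

```sql
-- Sketch only: country_borders(name, geom) is a hypothetical polygon table.
-- st_contains returns true when the tweet point lies inside the country polygon.
select t.txt
from geo_tweets_europe_feb2012 t
join country_borders b
  on st_contains(b.geom, t.coords)
where b.name = 'France';
```

For this exercise the simpler bounding boxes were used instead.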

The following bounding-box queries were used for the individual countries:


--France--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>-4.913 and st_x(coords)<8.234 and st_y(coords)>42.254 and st_y(coords)<50.879;

--Great Britain & Ireland--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>-10.4747 and st_x(coords)<1.7494 and st_y(coords)>49.9116 and st_y(coords)<60.8444;

--Germany--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>7.222 and st_x(coords)<15.134 and st_y(coords)>47.608 and st_y(coords)<54.925;

--Italy--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>6.6197 and st_x(coords)<18.51499 and st_y(coords)>36.6491 and st_y(coords)<47.0947;

--Spain--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>-7.292 and st_x(coords)<4.367 and st_y(coords)>36.127 and st_y(coords)<44.039;

--Netherlands--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>3.3708 and st_x(coords)<7.2216 and st_y(coords)>50.7538 and st_y(coords)<51.5114;

--Greece--

select txt from geo_tweets_europe_feb2012 where st_x(coords)>19.376 and st_x(coords)<28.2380 and st_y(coords)>34.8089 and st_y(coords)<41.74;
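Since only the number of tweets per country is needed for the intensity, the seven queries could also be combined into one counting query. The following is a minimal sketch, not the query that was actually run: the `boxes` list and its column names are introduced here for illustration, with the coordinates copied from the queries above.

```sql
-- Sketch: count the tweets inside each country's bounding box in one pass.
with boxes (country, xmin, xmax, ymin, ymax) as (
  values
    ('France',                  -4.913,   8.234,   42.254,  50.879),
    ('Great Britain & Ireland', -10.4747, 1.7494,  49.9116, 60.8444),
    ('Germany',                 7.222,    15.134,  47.608,  54.925),
    ('Italy',                   6.6197,   18.51499, 36.6491, 47.0947),
    ('Spain',                   -7.292,   4.367,   36.127,  44.039),
    ('Netherlands',             3.3708,   7.2216,  50.7538, 51.5114),
    ('Greece',                  19.376,   28.2380, 34.8089, 41.74)
)
select b.country,
       count(t.coords) as tweets  -- 0 for a box that matches no tweet
from boxes b
left join geo_tweets_europe_feb2012 t
  on  st_x(t.coords) > b.xmin and st_x(t.coords) < b.xmax
  and st_y(t.coords) > b.ymin and st_y(t.coords) < b.ymax
group by b.country
order by b.country;
```

Because some of the boxes overlap, a tweet in an overlapping area is counted once for each box it falls into.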



The population numbers were obtained from www.geohive.com. The intensity of tweets per country was then calculated by dividing the number of tweets by the population of the respective country.
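The division itself can be sketched in SQL as well. Both tables here are hypothetical and only serve as an illustration: `tweet_counts(country, tweets)` would hold the number of tweets per bounding box and `country_population(country, population)` the population figures.

```sql
-- Sketch only: tweet_counts and country_population are hypothetical tables.
select c.country,
       c.tweets,
       p.population,
       -- cast to numeric to avoid integer division; nullif guards against a zero population
       c.tweets::numeric / nullif(p.population, 0) as tweets_per_inhabitant
from tweet_counts c
join country_population p using (country)
order by tweets_per_inhabitant desc;
```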

The results are visualised in the table below:



 
