class: center, middle, inverse, title-slide # Laboratorio Bio-demografico ## Lezione 7 ### Nicola Barban
Alma Mater Studiorum Università di Bologna
Dipartimento di Scienze Statistiche ### 2 Marzo 2021
--- # Outline: * Mappe usando il paccjetto tidyverse ### Pacchetti R usati ```r #install.packages("eurostat") library(eurostat) library(sf) library(tmap) library(tidyverse) library(viridis) library(zoo) ``` --- # Dati Eurostat ### Ottengo indice datasets Eurostat ```r toc <- get_eurostat_toc() # Check the first items library(knitr) kable(head(toc)) ``` |title |code |type |last update of data |last table structure change |data start |data end |values | |:--------------------------------------------------------|:---------|:-------|:-------------------|:---------------------------|:----------|:--------|:------| |Database by themes |data |folder |NA |NA |NA |NA |NA | |General and regional statistics |general |folder |NA |NA |NA |NA |NA | |European and national indicators for short-term analysis |euroind |folder |NA |NA |NA |NA |NA | |Business and consumer surveys (source: DG ECFIN) |ei_bcs |folder |NA |NA |NA |NA |NA | |Consumer surveys (source: DG ECFIN) |ei_bcs_cs |folder |NA |NA |NA |NA |NA | |Consumers - monthly data |ei_bsco_m |dataset |25.02.2021 |25.02.2021 |1980M01 |2021M02 |NA | --- # Cerco dati su "fertility rate" ```r kable(search_eurostat(pattern="fertility rate", type="table", fixed=FALSE)) ``` |title |code |type |last update of data |last table structure change |data start |data end |values | |:-------------------------------------|:--------|:-----|:-------------------|:---------------------------|:----------|:--------|:------| |Total fertility rate by NUTS 2 region |tgs00100 |table |24.02.2021 |24.02.2021 |2008 |2019 |NA | |Total fertility rate |tps00199 |table |24.02.2021 |24.02.2021 |2008 |2019 |NA | |Total fertility rate by NUTS 2 region |tgs00100 |table |24.02.2021 |24.02.2021 |2008 |2019 |NA | --- # Download data dan EUROSTAT ```r # Download attribute data from Eurostat fert_data <- get_eurostat(id="tps00199", time_format = "num") glimpse(fert_data) ``` ``` ## Rows: 563 ## Columns: 4 ## $ indic_de [3m[38;5;246m<chr>[39m[23m "TOTFERRT", "TOTFERRT", "TOTFERRT"… ## $ geo [3m[38;5;246m<chr>[39m[23m "AD", "AL", "AM", "AT", "AZ", "BE"… ## $ time [3m[38;5;246m<dbl>[39m[23m 2008, 2008, 2008, 2008, 2008, 2008… ## $ values [3m[38;5;246m<dbl>[39m[23m 1.25, 1.58, 1.43, 1.42, 1.90, 1.85… ``` --- # Label_eurostat() ### Get definitions for Eurostat codes from Eurostat dictionaries. ```r fert_df<- label_eurostat(fert_data) glimpse(fert_df) ``` ``` ## Rows: 563 ## Columns: 4 ## $ indic_de [3m[38;5;246m<chr>[39m[23m "Total fertility rate", "Total fer… ## $ geo [3m[38;5;246m<chr>[39m[23m "Andorra", "Albania", "Armenia", "… ## $ time [3m[38;5;246m<dbl>[39m[23m 2008, 2008, 2008, 2008, 2008, 2008… ## $ values [3m[38;5;246m<dbl>[39m[23m 1.25, 1.58, 1.43, 1.42, 1.90, 1.85… ``` --- # Seleziono dati del 2018 ```r fert_2018<-fert_data %>% filter(time==2018) glimpse(fert_2018) ``` ``` ## Rows: 48 ## Columns: 4 ## $ indic_de [3m[38;5;246m<chr>[39m[23m "TOTFERRT", "TOTFERRT", "TOTFERRT"… ## $ geo [3m[38;5;246m<chr>[39m[23m "AL", "AM", "AT", "AZ", "BE", "BG"… ## $ time [3m[38;5;246m<dbl>[39m[23m 2018, 2018, 2018, 2018, 2018, 2018… ## $ values [3m[38;5;246m<dbl>[39m[23m 1.37, 1.57, 1.47, 1.73, 1.62, 1.56… ``` --- # ottengo dati Spaziali ```r geodata <-get_eurostat_geospatial(nuts_level=0) head(geodata) ``` ``` ## Simple feature collection with 6 features and 7 fields ## geometry type: MULTIPOLYGON ## dimension: XY ## bbox: xmin: 2.54601 ymin: 34.56908 xmax: 34.56859 ymax: 51.50246 ## CRS: EPSG:4326 ## id CNTR_CODE NUTS_NAME LEVL_CODE FID ## 1 BG BG БЪЛГАРИЯ 0 BG ## 2 CH CH SCHWEIZ/SUISSE/SVIZZERA 0 CH ## 3 CY CY ΚΥΠΡΟΣ 0 CY ## 4 AL AL SHQIPËRIA 0 AL ## 5 CZ CZ ČESKÁ REPUBLIKA 0 CZ ## 6 BE BE BELGIQUE-BELGIË 0 BE ## NUTS_ID geometry geo ## 1 BG MULTIPOLYGON (((22.99717 43... BG ## 2 CH MULTIPOLYGON (((8.61383 47.... CH ## 3 CY MULTIPOLYGON (((33.75237 34... CY ## 4 AL MULTIPOLYGON (((19.831 42.4... AL ## 5 CZ MULTIPOLYGON (((14.49122 51... CZ ## 6 BE MULTIPOLYGON (((5.10218 51.... BE ``` --- # Geographical data 1. **Shapefiles** (vector formats). Forniscono coordinate precise per specifici punti, che possono rappresentare elementi geografici come città o possono essre uniti in linee o poligoni per rappresentare confini, strade fiumi ecc. (https://www.eea.europa.eu/data-and-maps/data/eea-reference-grids-2/gis-files/italy-shapefile) 2. **Rasters.** File di tipo immagine georeferenziati. (https://www.neonscience.org/resources/learning-hub/tutorials/raster-data-r) --- # Mappa Europa ```r map0 <- tm_shape(geodata) + tm_fill("lightgrey") + tm_grid() map0 ``` <img src="lezione7_files/figure-html/unnamed-chunk-8-1.png" width="60%" /> --- ```r geodata %>% tm_shape() + tm_fill("lightgrey") + tm_grid() ``` <img src="lezione7_files/figure-html/unnamed-chunk-9-1.png" width="60%" /> --- # Restringo mappa all'Europa continentale ```r library(tmaptools) bb(geodata) ``` ``` ## xmin ymin xmax ymax ## -63.15176 -21.38696 55.83509 71.17592 ``` ```r *bb_cont<-bb(geodata,xlim=c(-12,44), ylim=c(35,70) ) map1 <- tm_shape(geodata, bbox=bb_cont) + tm_polygons() + tm_fill("lightgrey") ``` --- ```r map1 ``` <img src="lezione7_files/figure-html/unnamed-chunk-11-1.png" width="60%" /> --- # Unisco file geografico e dati ```r mapdata <- geodata %>% left_join(fert_2018, by="geo") %>% mutate(cat=cut_to_classes(values, n=4, decimals=1)) # cut_to_classes è una fuinzione del pacchetto eurostat head(select(mapdata, geo, values, cat),3) ``` ``` ## Simple feature collection with 3 features and 3 fields ## geometry type: MULTIPOLYGON ## dimension: XY ## bbox: xmin: 5.95607 ymin: 34.56908 xmax: 34.56859 ymax: 47.80401 ## CRS: EPSG:4326 ## geo values cat ## 1 BG 1.56 1.4 ~< 1.6 ## 2 CH 1.52 1.4 ~< 1.6 ## 3 CY 1.32 1.2 ~< 1.4 ## geometry ## 1 MULTIPOLYGON (((22.99717 43... ## 2 MULTIPOLYGON (((8.61383 47.... ## 3 MULTIPOLYGON (((33.75237 34... ``` --- # Mappa tasso di fecondità totale ```r map_fert <- tm_shape(mapdata, bbox=bb_cont) + tm_polygons("cat", title = "Tasso Fecondità Totale", palette = "Oranges", border.col = "black") map_fert ``` <img src="lezione7_files/figure-html/unnamed-chunk-13-1.png" width="60%" /> --- # Fertilità 2015 ```r fert_reg_2015 <- get_eurostat(id="tgs00100", time_format = "num") %>% filter(time==2015) head(fert_reg_2015) ``` ``` ## # A tibble: 6 x 5 ## unit age geo time values ## <chr> <chr> <chr> <dbl> <dbl> ## 1 NR TOTAL AT11 2015 1.37 ## 2 NR TOTAL AT12 2015 1.52 ## 3 NR TOTAL AT13 2015 1.42 ## 4 NR TOTAL AT21 2015 1.43 ## 5 NR TOTAL AT22 2015 1.44 ## 6 NR TOTAL AT31 2015 1.61 ``` ```r geodata2 <-get_eurostat_geospatial(nuts_level=2) ``` --- # Cambio cambiare la legenda ### Decido i valori per la mappa ```r mapdata_reg <- geodata2 %>% left_join(fert_reg_2015) %>% mutate(cat=cut(values, breaks=c(0,1.1,1.3, 1.5, 1.7, 1.9, 2.1, 2.3, 2.5, 7))) ``` ### Aggiungo alcuni dettagli alla mappa ```r map_fert_reg <- tm_shape(mapdata_reg, bbox=bb_cont) + tm_polygons("cat", title = "TFT", palette = "Greens", border.col = "black", colorNA="grey") + tm_layout("Tasso Fecondità Totale Regionale", legend.title.size = 1, legend.text.size = 0.6, legend.position = c("right","top"), legend.bg.color = "white", legend.bg.alpha = 1) ``` --- # tasso fecondità regionale Europa ```r map_fert_reg ``` <img src="lezione7_files/figure-html/unnamed-chunk-17-1.png" width="60%" /> --- # Esercizio Creare una mappa regionale mostrando le regioni in cui il tasso di fecondità totale è aumentato, restato uguale o diminuito --- # Scale Colori divergenti ```r library(RColorBrewer) display.brewer.all(type="div") ``` <img src="lezione7_files/figure-html/unnamed-chunk-18-1.png" width="60%" /> --- # Scale qualitative ```r display.brewer.all(type="qual") ``` <img src="lezione7_files/figure-html/unnamed-chunk-19-1.png" width="60%" /> --- # Visualizzo scala colori ```r display.brewer.pal(n = 8, name = 'Dark2') ``` <img src="lezione7_files/figure-html/unnamed-chunk-20-1.png" width="60%" /> --- # Visualizzo codici colori ```r brewer.pal(n = 8, name = "Dark2") ``` ``` ## [1] "#1B9E77" "#D95F02" "#7570B3" "#E7298A" "#66A61E" ## [6] "#E6AB02" "#A6761D" "#666666" ``` ```r # install.pacakges(c("shiny","shinys")) # palette_explorer() ``` --- # Alcuni esempi di color palette https://htmlcolorcodes.com https://coolors.co http://colormind.io --- # Carico shapefile ```r library(sf) geo_prov <- st_read("ProvCM01012018_g_WGS84.shp") ``` ``` ## Reading layer `ProvCM01012018_g_WGS84' from data source `/Users/nicola/Dropbox/Teaching/2021/labSOCIODEMO/webpage/LaboratorioBioDemografico2021/lezione7/ProvCM01012018_g_WGS84.shp' using driver `ESRI Shapefile' ## Simple feature collection with 107 features and 12 fields ## geometry type: MULTIPOLYGON ## dimension: XY ## bbox: xmin: 313279.3 ymin: 3933846 xmax: 1312016 ymax: 5220292 ## CRS: 32632 ``` ```r head(geo_prov) ``` ``` ## Simple feature collection with 6 features and 12 fields ## geometry type: MULTIPOLYGON ## dimension: XY ## bbox: xmin: 313279.3 ymin: 4879338 xmax: 516960.2 ymax: 5088855 ## CRS: 32632 ## COD_RIP COD_REG COD_PROV COD_CM COD_PCM DEN_PROV ## 1 1 1 1 201 201 - ## 2 1 1 2 0 2 Vercelli ## 3 1 1 3 0 3 Novara ## 4 1 1 4 0 4 Cuneo ## 5 1 1 5 0 5 Asti ## 6 1 1 6 0 6 Alessandria ## DEN_CM DEN_PCM SIGLA SHAPE_LENG SHAPE_AREA ## 1 Torino Torino TO 539249.8 6828137329 ## 2 - Vercelli VC 418249.3 2082027827 ## 3 - Novara NO 250242.4 1341319564 ## 4 - Cuneo CN 490111.9 6898443198 ## 5 - Asti AT 315561.4 1508948105 ## 6 - Alessandria AL 474880.9 3560304556 ## SHAPE_LEN geometry ## 1 539249.8 MULTIPOLYGON (((411015 5049... ## 2 418249.3 MULTIPOLYGON (((438328.6 50... ## 3 250242.4 MULTIPOLYGON (((460929.5 50... ## 4 490111.9 MULTIPOLYGON (((420223.9 49... ## 5 315561.4 MULTIPOLYGON (((425041.3 49... ## 6 474880.9 MULTIPOLYGON (((453307.9 50... ``` --- # Dati provinciali ```r tm_shape(geo_prov) + tm_borders() ``` <img src="lezione7_files/figure-html/unnamed-chunk-24-1.png" width="60%" /> --- # cambio stile ```r classic_ita <- tm_shape(geo_prov) + tmap_style("classic")+ tm_polygons() classic_ita ``` <img src="lezione7_files/figure-html/unnamed-chunk-25-1.png" width="60%" /> ```r tm_style("white") ``` ``` ## $tm_layout ## $tm_layout$style ## [1] "white" ## ## ## attr(,"class") ## [1] "tm" ``` --- # elenco stili di tmap ```r #tmap_style_catalog(path = "./tmap_style_previews", styles = NA) ``` --- # Una provincia a caso in Italia :-) ```r classic_ita+ tm_style("white")+ tm_shape(geo_prov %>% filter(DEN_PROV=="Padova")) + tm_fill(col="red") ``` <img src="lezione7_files/figure-html/unnamed-chunk-27-1.png" width="60%" /> --- # Una regione in Italia ```r tm_shape(geo_prov) + tm_fill() + tm_shape( geo_prov %>% filter(COD_REG==8)) + tm_polygons(col="red")+ tm_style("white") ``` <img src="lezione7_files/figure-html/unnamed-chunk-28-1.png" width="60%" /> --- # Scarica dati provinciali COVID ```r storicoProvince <- read.csv(file = "https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-province/dpc-covid19-ita-province.csv", stringsAsFactors = FALSE) storicoProvince <- subset(storicoProvince, codice_provincia<=111) storicoProvince$denominazione_regione[storicoProvince$denominazione_regione %in% list("P.A. Trento", "P.A. Bolzano")] <- "Trentino-Alto Adige" ``` --- # Creo dataset provinciale ```r library(zoo) covid_prov <- storicoProvince %>% group_by(codice_provincia) %>% mutate(diff_totale_casi = diff(c(0, totale_casi)) ) ``` ```r ## Per usare dati media mobile 7 gg #covid_prov <- covid_prov %>% # group_by(codice_provincia) %>% # mutate(diff_totale_casi = rollmean(diff_totale_casi,7, align="right", na.pad =T) ) glimpse(covid_prov) ``` ``` ## Rows: 39,911 ## Columns: 15 ## Groups: codice_provincia [107] ## $ data <chr> "2020-02-24T18:00:0… ## $ stato <chr> "ITA", "ITA", "ITA"… ## $ codice_regione <int> 13, 13, 13, 13, 17,… ## $ denominazione_regione <chr> "Abruzzo", "Abruzzo… ## $ codice_provincia <int> 66, 67, 68, 69, 76,… ## $ denominazione_provincia <chr> "L'Aquila", "Teramo… ## $ sigla_provincia <chr> "AQ", "TE", "PE", "… ## $ lat <dbl> 42.35122, 42.65892,… ## $ long <dbl> 13.398438, 13.70440… ## $ totale_casi <int> 0, 0, 0, 0, 0, 0, 0… ## $ note <chr> "", "", "", "", "",… ## $ codice_nuts_1 <chr> "", "", "", "", "",… ## $ codice_nuts_2 <chr> "", "", "", "", "",… ## $ codice_nuts_3 <chr> "", "", "", "", "",… ## $ diff_totale_casi <dbl> 0, 0, 0, 0, 0, 0, 0… ``` --- # Definisco date ```r ieri<-Sys.Date()-1 sett.scorsa<-ieri-7 library(lubridate) ``` ### seleziono giorno 4 settimane fa (stesso giorno della settimana) ```r mese.scorso<-as.Date(ieri) %m-% weeks(4) ``` ### visualizzo nel formato che preferisco ```r format(ieri, "%a %b %d %Y") ``` ``` ## [1] "Tue Mar 02 2021" ``` ```r format(sett.scorsa, "%a %b %d %Y") ``` ``` ## [1] "Tue Feb 23 2021" ``` ```r format(mese.scorso, "%a %b %d %Y") ``` ``` ## [1] "Tue Feb 02 2021" ``` --- # credo dataset per mappa casi provinciali ```r casi_prov<- covid_prov %>% filter(data==ieri | data==sett.scorsa | data==mese.scorso) %>% select(COD_PROV=codice_provincia, data,diff_totale_casi) %>% arrange(COD_PROV, desc(data)) casi_prov$diff_totale_casi[casi_prov$diff_totale_casi<0]<-NA glimpse(casi_prov) ``` ``` ## Rows: 321 ## Columns: 3 ## Groups: COD_PROV [107] ## $ COD_PROV <int> 1, 1, 1, 2, 2, 2, 3, 3, 3,… ## $ data <chr> "2021-03-02T17:00:00", "20… ## $ diff_totale_casi <dbl> 873, 530, 381, 68, 40, 24,… ``` ```r mappa_prov <- geo_prov %>% left_join(subset(casi_prov, data==ieri)) ``` --- # mappa nuovi casi COVID ```r map1 <- tm_shape(mappa_prov) + tm_polygons("diff_totale_casi")+ tm_style("white") map1 ``` <img src="lezione7_files/figure-html/unnamed-chunk-36-1.png" width="60%" /> --- # Cambio palette ```r map1 <- tm_shape(mappa_prov) + tm_borders() + tm_fill("diff_totale_casi", palette ="Blues")+ tm_style("white") map1 ``` <img src="lezione7_files/figure-html/unnamed-chunk-37-1.png" width="60%" /> --- # mappa a bolle ```r map2 <- tm_shape(mappa_prov) + tm_borders() + tm_bubbles("diff_totale_casi", col ="blue")+ tm_style("white") map2 ``` <img src="lezione7_files/figure-html/unnamed-chunk-38-1.png" width="60%" /> --- # Reshape wide ```r table(casi_prov$data) ``` ``` ## ## 2021-02-02T17:00:00 2021-02-23T17:00:00 ## 107 107 ## 2021-03-02T17:00:00 ## 107 ``` ```r casi_prov_wide<- casi_prov %>% pivot_wider(id= COD_PROV, names_from=data, values_from=diff_totale_casi) %>% ##< setNames( c("COD_PROV","ieri","sett.scorsa", "mese.scorso" ) ) ``` ```r casi_prov_wide<- casi_prov_wide %>% mutate(var_casi_sett=((ieri-sett.scorsa)/sett.scorsa)*100, var_casi_mese=((ieri-mese.scorso)/mese.scorso)*100) %>% mutate(var_casi_sett=na_if(var_casi_sett,"Inf"), var_casi_mese=na_if(var_casi_mese,"Inf")) head(casi_prov_wide) ``` ``` ## # A tibble: 6 x 6 ## # Groups: COD_PROV [6] ## COD_PROV ieri sett.scorsa mese.scorso var_casi_sett ## <int> <dbl> <dbl> <dbl> <dbl> ## 1 1 873 530 381 64.7 ## 2 2 68 40 24 70 ## 3 3 115 93 78 23.7 ## 4 4 313 185 119 69.2 ## 5 5 41 71 54 -42.3 ## 6 6 130 33 44 294. ## # … with 1 more variable: var_casi_mese <dbl> ``` ```r # palette = c("coral2", "aquamarine3", "gray") ``` --- # Summary ```r casi_prov_wide %>% group_by() %>% summarise(variazione_sett_media= mean(var_casi_sett, na.rm=T), variazione_mens_media= mean(var_casi_mese, na.rm=T), N_aumento=sum(var_casi_sett>=0, na.rm=T), N_dimin=sum(var_casi_sett<0, na.rm=T )) %>% select(variazione_sett_media, variazione_mens_media, N_aumento, N_dimin) ``` ``` ## # A tibble: 1 x 4 ## variazione_sett_… variazione_mens… N_aumento N_dimin ## <dbl> <dbl> <int> <int> ## 1 36.8 119. 69 37 ``` --- # Merge dataset con dati geografici ```r mappa_prov2 <- geo_prov %>% inner_join(casi_prov_wide) ``` --- # mappa differenza casi in un mese ```r library(viridis) map3 <- tm_shape(mappa_prov2) + tm_polygons(title="Variazione % casi sett", col="var_casi_sett", midpoint=0, breaks = seq(from=-100, to=100, 25), palette="-Spectral")+ tm_style("white") map3 ``` <img src="lezione7_files/figure-html/unnamed-chunk-43-1.png" width="60%" /> --- # Importo dati ISTAT ```r library(readr) ISTAT.prov <- read_csv("province.csv", col_types = cols(`Codice provincia` = col_integer()), skip = 1) Pop.prov <-ISTAT.prov %>% filter(Età=="Totale") %>% mutate(Tot.Pop=`Totale Maschi`+`Totale Femmine`) %>% select(COD_PROV="Codice provincia" , "Tot.Pop" ) ``` --- # Merge dati e calcolo incidenza ```r mappa_prov3<- left_join(mappa_prov2,Pop.prov ) mappa_prov3<-mappa_prov3 %>% mutate (inc.ieri=10^5*ieri/Tot.Pop, inc.sett.scorsa=10^5*sett.scorsa/Tot.Pop, inc.mese.scorso=10^5*mese.scorso/Tot.Pop) ``` --- # mappa incidenza ```r map4 <- tm_shape(mappa_prov3) + tm_polygons(col="inc.ieri", n=10)+ tm_style("white") map4 ``` <img src="lezione7_files/figure-html/unnamed-chunk-46-1.png" width="60%" /> --- ```r map4.1 <- tm_shape(mappa_prov3) + tm_polygons("inc.ieri", style="cont")+ ##< tm_style("white") map4.1 ``` <img src="lezione7_files/figure-html/unnamed-chunk-47-1.png" width="60%" /> --- # cambio legenda ```r map4 <- tm_shape(mappa_prov3) + tm_polygons(col = "inc.ieri", breaks = c( 0, 25, 50, 75, 100, 150, 200)) + tm_layout(legend.outside = TRUE) map4+tm_style("white") ``` <img src="lezione7_files/figure-html/unnamed-chunk-48-1.png" width="60%" /> --- # Cambio numero di intervalli ```r map5<- tm_shape(mappa_prov3) + tm_polygons(col="inc.ieri",n=10)+ tm_style("white") map5 ``` <img src="lezione7_files/figure-html/unnamed-chunk-49-1.png" width="60%" /> --- # combino diverse mappe ```r library(viridis) map6<-tm_shape(mappa_prov3) + tm_polygons(title="Incidenza gironaliera COVID", col=c("inc.ieri", "inc.sett.scorsa","inc.mese.scorso"), palette = "-inferno", n=8)+ tm_style("white")+ tm_facets(free.scales=FALSE, ncol = 3)+ tm_layout(panel.labels = c(format(ieri, "%b %d %Y"), format(sett.scorsa, "%b %d %Y"), format(mese.scorso, "%b %d %Y")), panel.label.size = 1.5, panel.label.color = 'white', panel.label.bg.color = 'grey', panel.label.height = 1.1) ``` --- ```r map6 ``` <img src="lezione7_files/figure-html/unnamed-chunk-51-1.png" width="80%" /> ---