GNRS R package

Brian Maitner

2021-03-28

Geographic Name Resolution Service

The GNRS package provides an R interface to the Geographic Name Resolution Service (GNRS) of the Botanical Information and Ecology Network (BIEN).

#Installing GNRS

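# install.packages("devtools")  # run this first if devtools is not yet installed
library(devtools)
# The package is installed from the RGNRS repository but is loaded as GNRS
install_github("EnquistLab/RGNRS")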

#The easiest case: one political division

library(GNRS)
GNRS_super_simple(country = "United States",
                  state_province = "Arizona",
                  county_parish = "Pima County")
##                         poldiv_full country_verbatim state_province_verbatim
## 1 United States@Arizona@Pima County    United States                 Arizona
##   state_province_verbatim_alt county_parish_verbatim county_parish_verbatim_alt
## 1                                        Pima County                       Pima
##         country state_province county_parish country_id state_province_id
## 1 United States        Arizona          Pima    6252001           5551752
##   county_parish_id country_iso state_province_iso county_parish_iso geonameid
## 1          5308878          US                 AZ               019   5308878
##   gid_0   gid_1      gid_2 match_method_country match_method_state_province
## 1   USA USA.3_1 USA.3.11_1  exact standard name                  exact name
##   match_method_county_parish match_score_country match_score_state_province
## 1                 exact name                                               
##   match_score_county_parish poldiv_submitted poldiv_matched match_status
## 1                              county_parish  county_parish   full match
##   user_id
## 1       1
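Judging only from the single row printed above, the poldiv_full field appears to be the three submitted names joined with "@". The sketch below rebuilds that string and splits it again in base R. It does not call the service, and how the service fills poldiv_full when a level is missing is not shown here, so treat the format as an assumption.

```r
# Rebuild the poldiv_full value from the example above.
# ASSUMPTION: the field is simply the three submitted names joined by "@".
poldiv_full <- paste("United States", "Arizona", "Pima County", sep = "@")

# Split it back into its parts; fixed = TRUE treats "@" as a literal character.
parts <- strsplit(poldiv_full, "@", fixed = TRUE)[[1]]
names(parts) <- c("country", "state_province", "county_parish")

print(parts)
```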

#Multiple political divisions

#First, we'll load the test data that are included with this package, gnrs_testfile

gnrs_testfile <- gnrs_testfile

head(gnrs_testfile, n = 10)
##    user_id   country          state_province
## 1        1    Russia                 Lipetsk
## 2        2    Mexico       Sonora, Estado de
## 3        3 Guatemala                  Izabal
## 4        4       USA                 Arizona
## 5        5     U.S.A                 Arizona
## 6        6       USA                 Ilinois
## 7        7    Mexico            Quintana Roo
## 8        8    Mexico            Quintana Roo
## 9        9   Ukraine                 Kharkiv
## 10      10    Canada Province of Nova Scotia
##                           county_parish
## 1                      Dobrovskiy rayon
## 2                          Hua^sA(C)pac
## 3                                      
## 4                           Pima County
## 5                                  Pima
## 6                                      
## 7               La^sA°zaro Ca^sA°rdenas
## 8  Municipio de La^sA°zaro Ca^sA°rdenas
## 9                       Novovodolaz'kyi
## 10

As you can see, the sample data include spelling variants (USA vs. U.S.A), a misspelling (Ilinois), and non-standard characters that may cause problems. The GNRS resolves these to standard names wherever it finds a match.

gnrs_results <- GNRS(gnrs_testfile)

#The standardized names are found in these columns:
head(gnrs_results[c("country","state_province","county_parish")], n = 10)
##          country      state_province    county_parish
## 1         Russia  Lipetskaya Oblast' Dobrovskiy Rayon
## 2         Mexico              Sonora                 
## 3      Guatemala              Izabal                 
## 4  United States             Arizona             Pima
## 5  United States             Arizona             Pima
## 6  United States            Illinois                 
## 7         Mexico        Quintana Roo                 
## 8         Mexico        Quintana Roo                 
## 9        Ukraine Kharkivs'ka Oblast'  Novovodolaz'kyi
## 10        Canada         Nova Scotia
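To see what the service changed, it can help to put submitted and standardized names side by side. The sketch below does this in base R for rows 4 to 6, with the values copied by hand from the two printouts above so that it runs without calling the service. It assumes that the results carry the same user_id values as the input; the object names submitted, standardized, and comparison are illustrative and not part of the package.

```r
# Rows 4-6 of the test data, copied from the input printout above.
submitted <- data.frame(
  user_id = c(4, 5, 6),
  country = c("USA", "U.S.A", "USA"),
  state_province = c("Arizona", "Arizona", "Ilinois"),
  county_parish = c("Pima County", "Pima", ""),
  stringsAsFactors = FALSE
)

# The corresponding standardized rows, copied from the results printout above.
# ASSUMPTION: user_id in the results matches user_id in the input.
standardized <- data.frame(
  user_id = c(4, 5, 6),
  country = c("United States", "United States", "United States"),
  state_province = c("Arizona", "Arizona", "Illinois"),
  county_parish = c("Pima", "Pima", ""),
  stringsAsFactors = FALSE
)

# Join on user_id; the suffixes keep both versions of each name column.
comparison <- merge(submitted, standardized, by = "user_id",
                    suffixes = c("_submitted", "_standardized"))

# Flag rows where the state or province name was altered.
comparison$state_changed <-
  comparison$state_province_submitted != comparison$state_province_standardized

print(comparison[, c("user_id", "state_province_submitted",
                     "state_province_standardized", "state_changed")])
```

With real output, the same merge could be applied to gnrs_testfile and gnrs_results, provided the user_id assumption holds.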

The GNRS function expects a data frame with four columns: user_id, country, state_province, and county_parish. All four are optional. If you forget the layout, call GNRS_template for a quick reference, or use its output as an empty template to populate.

head(GNRS_template())
##   user_id country state_province county_parish
## 1      NA      NA             NA            NA
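As a minimal sketch of a populated input, the data frame below uses the four column names from the template printout above. The object name my_poldivs and the values are illustrative only. An empty string marks a missing level, mirroring the blanks in the test data. The final call is left commented out because it needs the GNRS package and a network connection.

```r
# Build an input data frame with the same column names as the template.
# The rows are made-up examples, not data shipped with the package.
my_poldivs <- data.frame(
  user_id = 1:2,
  country = c("United States", "Canada"),
  state_province = c("Arizona", "Nova Scotia"),
  county_parish = c("Pima County", ""),  # "" where no county is known
  stringsAsFactors = FALSE
)

print(my_poldivs)

# ASSUMPTION: the GNRS package is installed and the service is reachable.
# my_results <- GNRS(my_poldivs)
```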