Introduction to aopdata

2021-04-28

This vignette introduces the aopdata package, lists the cities for which data are available for download, and presents the data dictionary.

aopdata is an R package that makes it easy to download data from the Access to Opportunities Project (AOP). AOP is a research initiative led by the Institute for Applied Economic Research (Ipea) that aims to study transport accessibility and inequalities in access to opportunities in Brazilian cities. You can access more information about the project on the AOP website.

Installation

You can install aopdata from CRAN, or the development version from GitHub.

# CRAN
install.packages("aopdata")

# Development version from GitHub (requires the devtools package)
devtools::install_github("ipeaGIT/aopdata", subdir = "r-package")

Usage

After installation, you can download accessibility estimates, as well as population and land use data, for 20 major Brazilian cities. Those cities are:

Abbreviation City Name
bel Belem
bho Belo Horizonte
bsb Brasilia
cam Campinas
cgr Campo Grande
cur Curitiba
duq Duque de Caxias
for Fortaleza
goi Goiania
gua Guarulhos
mac Maceio
man Manaus
nat Natal
poa Porto Alegre
rec Recife
rio Rio de Janeiro
sal Salvador
sgo Sao Gonçalo
slz Sao Luis
spo Sao Paulo

Read accessibility estimates

To download accessibility estimates, use the read_access() function. The example below downloads accessibility estimates by public transport for the city of Curitiba, in Southern Brazil. The function downloads data for 2019, the only year currently available.

read_access() can also download the spatial geometry of each city when geometry = TRUE is set; the geometry can then be used to map accessibility levels across the city. Refer to the "Mapping population and land use" and "Mapping urban accessibility" vignettes for examples of how to use the spatial geometry information. For now, we’ll set geometry = FALSE to download a data frame with no spatial information.
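A minimal sketch of such a call, assuming the city, mode, year and geometry arguments of read_access(). The value mode = "public_transport" is an assumption chosen to match the mode column in the output printed next, and the object name cur matches the one inspected there.

```r
library(aopdata)

# Download accessibility estimates for Curitiba as a plain data frame.
# Assumption: mode = "public_transport" is used because the printed
# output reports that value in its `mode` column.
cur <- read_access(
  city = "Curitiba",
  mode = "public_transport",
  year = 2019,
  geometry = FALSE  # no spatial information, as described above
)
```

The call needs an internet connection, since the data are downloaded on demand.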

Let’s check the results:

dplyr::glimpse(cur)
#> Rows: 4,683
#> Columns: 96
#> $ abbrev_muni <chr> "cur", "cur", "cur", "cur", "cur", "cur", "cur", "cur", "c~
#> $ name_muni   <chr> "Curitiba", "Curitiba", "Curitiba", "Curitiba", "Curitiba"~
#> $ code_muni   <int> 4106902, 4106902, 4106902, 4106902, 4106902, 4106902, 4106~
#> $ id_hex      <chr> "89a804c9593ffff", "89a804ca61bffff", "89a804ca643ffff", "~
#> $ P001        <int> NA, NA, 25, NA, NA, NA, 28, NA, 37, 50, NA, 48, NA, NA, NA~
#> $ P002        <int> NA, NA, 21, NA, NA, NA, 24, NA, 33, 43, NA, 40, NA, NA, NA~
#> $ P003        <int> NA, NA, 3, NA, NA, NA, 4, NA, 5, 8, NA, 7, NA, NA, NA, 0, ~
#> $ P004        <int> NA, NA, 0, NA, NA, NA, 0, NA, 0, 0, NA, 0, NA, NA, NA, 0, ~
#> $ P005        <int> NA, NA, 0, NA, NA, NA, 0, NA, 0, 0, NA, 0, NA, NA, NA, 0, ~
#> $ R001        <dbl> NA, NA, 826.9, NA, NA, NA, 882.4, NA, 770.5, 889.4, NA, 88~
#> $ R002        <int> NA, NA, 3, NA, NA, NA, 3, NA, 2, 3, NA, 3, NA, NA, NA, 3, ~
#> $ R003        <int> NA, NA, 5, NA, NA, NA, 5, NA, 4, 5, NA, 5, NA, NA, NA, 5, ~
#> $ E001        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ E002        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ E003        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ E004        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ S001        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ S002        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ S003        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ S004        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
#> $ mode        <chr> NA, NA, "public_transport", NA, NA, NA, "public_transport"~
#> $ peak        <int> NA, NA, 1, NA, NA, NA, 1, NA, 1, 1, NA, 1, 1, NA, NA, 1, 1~
#> $ CMATT15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATQ15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATD15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAST15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASB15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASM15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASA15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAET15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEI15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEF15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEM15     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATT30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0001, NA, 0.0001, 0.0001, NA~
#> $ CMATQ30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0001, NA, 0.0001, 0.0001, NA~
#> $ CMATD30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0001, NA, 0.0001, 0.0001, NA~
#> $ CMAST30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMASB30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMASM30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMASA30     <dbl> NA, NA, 0, NA, NA, NA, 0, NA, 0, 0, NA, 0, 0, NA, NA, 0, 0~
#> $ CMAET30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMAEI30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMAEF30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMAEM30     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMATT45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATQ45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATD45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAST45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASB45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASM45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMASA45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAET45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEI45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEF45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMAEM45     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
#> $ CMATT60     <dbl> NA, NA, 0.0035, NA, NA, NA, 0.0065, NA, 0.0124, 0.0089, NA~
#> $ CMATQ60     <dbl> NA, NA, 0.0032, NA, NA, NA, 0.0060, NA, 0.0135, 0.0086, NA~
#> $ CMATD60     <dbl> NA, NA, 0.0039, NA, NA, NA, 0.0069, NA, 0.0135, 0.0096, NA~
#> $ CMAST60     <dbl> NA, NA, 0.0044, NA, NA, NA, 0.0133, NA, 0.0221, 0.0133, NA~
#> $ CMASB60     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0136, NA, 0.0272, 0.0136, NA~
#> $ CMASM60     <dbl> NA, NA, 0.0046, NA, NA, NA, 0.0138, NA, 0.0229, 0.0138, NA~
#> $ CMASA60     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0000, NA, 0.0000, 0.0000, NA~
#> $ CMAET60     <dbl> NA, NA, 0.0017, NA, NA, NA, 0.0155, NA, 0.0276, 0.0190, NA~
#> $ CMAEI60     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0142, NA, 0.0228, 0.0171, NA~
#> $ CMAEF60     <dbl> NA, NA, 0.0029, NA, NA, NA, 0.0175, NA, 0.0322, 0.0205, NA~
#> $ CMAEM60     <dbl> NA, NA, 0.0000, NA, NA, NA, 0.0204, NA, 0.0340, 0.0272, NA~
#> $ CMATT90     <dbl> NA, NA, 0.0640, NA, NA, NA, 0.1059, NA, 0.2062, 0.1846, NA~
#> $ CMATQ90     <dbl> NA, NA, 0.0644, NA, NA, NA, 0.1069, NA, 0.1934, 0.1884, NA~
#> $ CMATD90     <dbl> NA, NA, 0.0631, NA, NA, NA, 0.1018, NA, 0.1934, 0.1744, NA~
#> $ CMAST90     <dbl> NA, NA, 0.0708, NA, NA, NA, 0.1106, NA, 0.1858, 0.1814, NA~
#> $ CMASB90     <dbl> NA, NA, 0.0612, NA, NA, NA, 0.0952, NA, 0.1293, 0.1497, NA~
#> $ CMASM90     <dbl> NA, NA, 0.0642, NA, NA, NA, 0.1009, NA, 0.1789, 0.1743, NA~
#> $ CMASA90     <dbl> NA, NA, 0.0833, NA, NA, NA, 0.1944, NA, 0.2500, 0.3333, NA~
#> $ CMAET90     <dbl> NA, NA, 0.0777, NA, NA, NA, 0.0898, NA, 0.1036, 0.1123, NA~
#> $ CMAEI90     <dbl> NA, NA, 0.0741, NA, NA, NA, 0.0798, NA, 0.0798, 0.0997, NA~
#> $ CMAEF90     <dbl> NA, NA, 0.0848, NA, NA, NA, 0.0994, NA, 0.1170, 0.1170, NA~
#> $ CMAEM90     <dbl> NA, NA, 0.0680, NA, NA, NA, 0.0884, NA, 0.1156, 0.1020, NA~
#> $ CMATT120    <dbl> NA, NA, 0.7867, NA, NA, NA, 0.8521, NA, 0.8961, 0.8833, NA~
#> $ CMATQ120    <dbl> NA, NA, 0.7917, NA, NA, NA, 0.8587, NA, 0.8847, 0.8877, NA~
#> $ CMATD120    <dbl> NA, NA, 0.7699, NA, NA, NA, 0.8348, NA, 0.8847, 0.8706, NA~
#> $ CMAST120    <dbl> NA, NA, 0.7168, NA, NA, NA, 0.7566, NA, 0.8053, 0.7876, NA~
#> $ CMASB120    <dbl> NA, NA, 0.6054, NA, NA, NA, 0.6531, NA, 0.7075, 0.6803, NA~
#> $ CMASM120    <dbl> NA, NA, 0.7064, NA, NA, NA, 0.7477, NA, 0.7982, 0.7798, NA~
#> $ CMASA120    <dbl> NA, NA, 1.0000, NA, NA, NA, 1.0000, NA, 1.0000, 1.0000, NA~
#> $ CMAET120    <dbl> NA, NA, 0.5371, NA, NA, NA, 0.6114, NA, 0.6701, 0.6408, NA~
#> $ CMAEI120    <dbl> NA, NA, 0.4587, NA, NA, NA, 0.5242, NA, 0.5869, 0.5527, NA~
#> $ CMAEF120    <dbl> NA, NA, 0.5731, NA, NA, NA, 0.6550, NA, 0.7135, 0.6842, NA~
#> $ CMAEM120    <dbl> NA, NA, 0.6259, NA, NA, NA, 0.7143, NA, 0.7755, 0.7483, NA~
#> $ TMIST       <dbl> NA, NA, 58.5583, NA, NA, NA, 48.1750, NA, 47.8833, 41.5667~
#> $ TMISB       <dbl> NA, NA, 64.2750, NA, NA, NA, 55.0417, NA, 51.7833, 50.6917~
#> $ TMISM       <dbl> NA, NA, 58.5583, NA, NA, NA, 48.1750, NA, 47.8833, 41.5667~
#> $ TMISA       <dbl> NA, NA, 87.6500, NA, NA, NA, 82.3500, NA, 76.2500, 76.7167~
#> $ TMIET       <dbl> NA, NA, 48.6167, NA, NA, NA, 40.8000, NA, 41.1667, 38.1333~
#> $ TMIEI       <dbl> NA, NA, 60.5750, NA, NA, NA, 53.2000, NA, 49.9000, 44.6750~
#> $ TMIEF       <dbl> NA, NA, 48.6167, NA, NA, NA, 40.8000, NA, 41.1667, 38.1333~
#> $ TMIEM       <dbl> NA, NA, 61.1667, NA, NA, NA, 52.7417, NA, 51.0333, 46.9000~

As you can see, read_access() returns a large amount of data: 4,683 rows and 96 columns in this example. The data frame’s columns can be classified into four groups according to the data they contain: geographic, sociodemographic, land use, and accessibility. The following section explains the contents of each column.
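As a rough illustration of this grouping, the sketch below assigns column names to the four groups from their naming pattern alone. The helper classify_column() is hypothetical (it is not part of aopdata), and the patterns are inferred from the column names in the output above. Columns that fit none of the patterns, such as mode and peak, are labelled "other".

```r
# A small subset of the column names shown in the output above.
cols <- c("abbrev_muni", "name_muni", "code_muni", "id_hex",
          "P001", "R001", "E001", "S004", "mode", "peak",
          "CMATT30", "TMIST")

# Hypothetical helper: assign each column to a group based on its name.
classify_column <- function(x) {
  geo <- c("abbrev_muni", "name_muni", "code_muni", "id_hex")
  ifelse(x %in% geo, "geographic",
  ifelse(grepl("^[PR][0-9]{3}$", x), "sociodemographic",
  ifelse(grepl("^[TES][0-9]{3}$", x), "land use",
  ifelse(grepl("^(CMA|TMI)", x), "accessibility", "other"))))
}

# List the columns that fall in each group.
groups <- classify_column(cols)
split(cols, groups)
```

With a downloaded data frame, the same helper could be applied as classify_column(names(cur)).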

Data dictionary

Geographic variables

column Description
abbrev_muni Abbreviation of city name (3 letters)
name_muni City name
code_muni 7-digit code of each city
id_hex Unique id of hexagonal cell

Sociodemographic variables

column Description
P001 Total number of residents
P002 Number of white residents
P003 Number of black residents
P004 Number of indigenous residents
P005 Number of residents of Asian descent
R001 Average household income per capita
R002 Income quintile group
R003 Income decile group

Land use variables

column Description
T001 Total number of formal jobs
T002 Number of formal jobs with primary education
T003 Number of formal jobs with secondary education
T004 Number of formal jobs with tertiary education
E001 Total number of public schools
E002 Number of public schools - early childhood
E003 Number of public schools - elementary schools
E004 Number of public schools - high schools
S001 Total number of healthcare facilities
S002 Number of healthcare facilities - low complexity
S003 Number of healthcare facilities - medium complexity
S004 Number of healthcare facilities - high complexity

Accessibility variables

The names of the columns with accessibility estimates are formed by joining three components, as in CMATT30 (CMA + TT + 30):

  1. Indicator

  2. Type of opportunity

  3. Time threshold (if applicable)

1) Indicator

Indicator  Description                              Note
CMA        Cumulative opportunity measure (active)
TMI        Travel time to closest opportunity       Value = Inf when travel time is longer than 2 h (public transport) or 1.5 h (walking or bicycle)
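Because TMI columns can contain Inf, summary statistics such as the mean need some care. The sketch below uses a toy vector (made-up values, not real AOP data) to show one way of handling this: average only the finite travel times and report separately the share of cells where no opportunity is reachable within the limit.

```r
# Toy travel times in minutes (illustrative values only).
# Inf marks cells where the closest opportunity is beyond the time limit.
tmi <- c(58.6, 48.2, Inf, 41.6, NA)

# is.finite() is FALSE for Inf and for NA, so this keeps real travel times only.
reachable <- tmi[is.finite(tmi)]
mean_time <- mean(reachable)

# Share of non-missing cells with no opportunity within the limit.
share_unreachable <- mean(is.infinite(tmi[!is.na(tmi)]))
```

Calling mean(tmi, na.rm = TRUE) directly would return Inf here, which is why the infinite values are separated first.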

2) Type of opportunity

Type of opportunity Description
TT All jobs
TQ Total jobs with partial match between job education and income quintile
TD Total jobs with partial match between job education and income decile
ST All healthcare facilities
SB Healthcare facilities - Low complexity
SM Healthcare facilities - Medium complexity
SA Healthcare facilities - High complexity
ET All public schools
EI Public schools - early childhood
EF Public schools - elementary schools
EM Public schools - high schools

3) Time threshold (only applicable to CMA estimates)

Time threshold  Description                                Applicable to
15              Opportunities accessible within 15 min.    Active transport modes
30              Opportunities accessible within 30 min.    All transport modes
45              Opportunities accessible within 45 min.    Active transport modes
60              Opportunities accessible within 60 min.    All transport modes
90              Opportunities accessible within 90 min.    Public transport
120             Opportunities accessible within 120 min.   Public transport
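The naming scheme described in this section can be reversed to recover the three components from a column name. The sketch below uses base R regular expressions. The helper parse_access_column() is hypothetical (it is not part of aopdata) and assumes that every accessibility column consists of a three-letter indicator, a two-letter opportunity code and an optional numeric threshold.

```r
# Hypothetical helper: split accessibility column names into their components.
parse_access_column <- function(x) {
  # indicator (CMA or TMI), two-letter opportunity code, optional threshold
  pattern <- "^(CMA|TMI)([A-Z]{2})([0-9]*)$"
  parts <- regmatches(x, regexec(pattern, x))
  data.frame(
    column      = x,
    indicator   = vapply(parts, function(p) p[2], character(1)),
    opportunity = vapply(parts, function(p) p[3], character(1)),
    # TMI columns carry no threshold, so the empty string becomes NA
    threshold   = as.integer(vapply(parts, function(p) p[4], character(1))),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_access_column(c("CMATT30", "CMASB120", "TMIEF"))
parsed
```

For instance, CMASB120 is split into the indicator CMA, the opportunity type SB (low-complexity healthcare facilities) and a threshold of 120 minutes.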

Next Steps

Now, check the next vignettes for demonstrations on how to use aopdata to produce land use and accessibility maps, as well as to analyse accessibility inequalities.