Extract tables from an HTML page — html_getTables • MazamaCoreUtils

html_getTables() parses an HTML page and returns all <table> elements as a list of data frames. html_getTable() returns a single table from the page, selected by index.

html_getTables(url = NULL, header = NA)

html_getTable(url = NULL, header = NA, index = 1)

Arguments

url: URL or local file path of an HTML page.
header: Logical specifying whether the first row should be used as column names. If NA, the first row is used only when it contains <th> elements.
index: Integer position (1-based) of the table to return. Defaults to 1, the first table.

Value

html_getTables(): a list of data frames, one for each HTML table.

html_getTable(): a single data frame containing the requested HTML table.

Details

The url argument may be either a remote URL or a local file path. Tables are parsed with rvest::html_table(). To extract a single table, use html_getTable().
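
The parsing step described above can be sketched directly with rvest. This is only an illustration of the underlying technique, not the package's actual implementation: the inline HTML string and the variable names (html, page, tables) are assumptions made up for the example so that it runs without network access.

```r
library(rvest)

# Hypothetical inline HTML standing in for a remote page or local file.
html <- '<html><body>
<table><tr><th>zone</th><th>offset</th></tr><tr><td>UTC</td><td>0</td></tr></table>
<table><tr><th>code</th></tr><tr><td>US</td></tr><tr><td>CA</td></tr></table>
</body></html>'

# Parse the document, select every <table> node, and convert each to a data frame.
# header = NA uses the first row as column names only when it is made of <th> cells.
page <- read_html(html)
tables <- html_table(html_elements(page, "table"), header = NA)

length(tables)        # number of tables found
names(tables[[1]])    # column names of the first table
nrow(tables[[2]])     # data rows in the second table
```

Selecting tables[[index]] from such a list is roughly what picking a single table by index amounts to.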

Examples

if (FALSE) { # \dontrun{
url <- "https://en.wikipedia.org/wiki/List_of_tz_database_time_zones"

tables <- html_getTables(url)
firstTable <- tables[[1]]

head(firstTable)
nrow(firstTable)
} # }