Title: Taxonomic Information from 'Wikipedia'
Description: 'Taxonomic' information from 'Wikipedia', 'Wikicommons', 'Wikispecies', and 'Wikidata'. Functions included for getting taxonomic information from each of the sources just listed, as well as performing taxonomic search.
Version: 0.4.0
License: MIT + file LICENSE
URL: https://docs.ropensci.org/wikitaxa, https://github.com/ropensci/wikitaxa
BugReports: https://github.com/ropensci/wikitaxa/issues
LazyLoad: yes
LazyData: yes
Encoding: UTF-8
Language: en-US
VignetteBuilder: knitr
Depends: R (≥ 3.2.1)
Imports: WikidataR, data.table, curl, crul (≥ 0.3.4), tibble, jsonlite, xml2
Suggests: testthat, knitr, rmarkdown, vcr
RoxygenNote: 7.1.0
X-schema.org-applicationCategory: Taxonomy
X-schema.org-keywords: taxonomy, species, API, web-services, Wikipedia, vernacular, Wikispecies, Wikicommons
X-schema.org-isPartOf: https://ropensci.org
NeedsCompilation: no
Packaged: 2020-06-29 14:49:03 UTC; sckott
Author: Scott Chamberlain [aut, cre], Ethan Welty [aut]
Maintainer: Scott Chamberlain <myrmecocystus+r@gmail.com>
Repository: CRAN
Date/Publication: 2020-06-29 15:30:03 UTC
wikitaxa
Description
Taxonomic Information from Wikipedia
Author(s)
Scott Chamberlain myrmecocystus@gmail.com
Ethan Welty
List of Wikipedias
Description
data.frame of 295 rows, with 3 columns:
language - language name
language_local - language name in the local language
wiki - language code for the wiki
Details
From https://meta.wikimedia.org/wiki/List_of_Wikipedias
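A minimal sketch of using this dataset to look up a wiki's language code (assuming the data.frame is exported as wikipedias, the name referenced in the wt_wikipedia() documentation below):
## Not run:
library(wikitaxa)
head(wikipedias)
# find the wiki code for a given language, e.g. French
wikipedias[wikipedias$language == "French", ]
## End(Not run)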
Wikidata taxonomy data
Description
Wikidata taxonomy data
Usage
wt_data(x, property = NULL, ...)
wt_data_id(x, language = "en", limit = 10, ...)
Arguments
x: (character) a taxonomic name
property: (character) a property id, e.g., P486
...: curl options, passed on to the underlying HTTP client
language: (character) two letter language code
limit: (integer) records to return. Default: 10
Details
Note that wt_data can take a while to run: when fetching claims, it has to fetch them one at a time for each claim.
You can search things other than taxonomic names with wt_data if you like.
Value
wt_data searches Wikidata, and returns a list with elements:
labels - data.frame with columns: language, value
descriptions - data.frame with columns: language, value
aliases - data.frame with columns: language, value
sitelinks - data.frame with columns: site, title
claims - data.frame with columns: claims, property_value, property_description, value (comma separated values in string)
wt_data_id gets the Wikidata ID for the searched term, and returns the ID as character
Examples
## Not run:
# search by taxon name
# wt_data("Mimulus alsinoides")
# choose which properties to return
wt_data(x="Mimulus foliatus", property = c("P846", "P815"))
# get a taxonomic identifier
wt_data_id("Mimulus foliatus")
# the id can be passed directly to wt_data()
# wt_data(wt_data_id("Mimulus foliatus"))
## End(Not run)
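Building on the Value section above, a sketch of extracting a single identifier from the claims slot; column names follow the claims data.frame documented there, and P846 is the GBIF taxon identifier property on Wikidata:
## Not run:
res <- wt_data("Mimulus foliatus", property = "P846")
# claims is a data.frame; the value column holds the identifier(s)
res$claims[res$claims$property_value == "P846", "value"]
## End(Not run)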
Get MediaWiki Page from API
Description
Supports both static page urls and their equivalent API calls.
Usage
wt_wiki_page(url, ...)
Arguments
url: (character) MediaWiki page url.
...: further arguments passed on to the underlying HTTP request
Details
If the given URL is for a human-readable HTML page, we convert it to the equivalent API call; if the URL is already an API call, we use it as-is.
Value
an HttpResponse response object from crul
See Also
Other MediaWiki functions:
wt_wiki_page_parse(), wt_wiki_url_build(), wt_wiki_url_parse()
Examples
## Not run:
wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
## End(Not run)
Parse MediaWiki Page
Description
Parses common properties from the result of a MediaWiki API page call.
Usage
wt_wiki_page_parse(
page,
types = c("langlinks", "iwlinks", "externallinks"),
tidy = FALSE
)
Arguments
page: (crul::HttpResponse) Result of wt_wiki_page()
types: (character) List of properties to parse.
tidy: (logical) tidy output to data.frames when possible. Default: FALSE
Details
Available properties currently not parsed: title, displaytitle, pageid, revid, redirects, text, categories, links, templates, images, sections, properties, ...
Value
a list
See Also
Other MediaWiki functions:
wt_wiki_page(), wt_wiki_url_build(), wt_wiki_url_parse()
Examples
## Not run:
pg <- wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
wt_wiki_page_parse(pg)
## End(Not run)
Build MediaWiki Page URL
Description
Builds a MediaWiki page url from its component parts (wiki name, wiki type, and page title). Supports both static page urls and their equivalent API calls.
Usage
wt_wiki_url_build(
wiki,
type = NULL,
page = NULL,
api = FALSE,
action = "parse",
redirects = TRUE,
format = "json",
utf8 = TRUE,
prop = c("text", "langlinks", "categories", "links", "templates", "images",
"externallinks", "sections", "revid", "displaytitle", "iwlinks", "properties")
)
Arguments
wiki: (character | list) Either the wiki name or a list with wiki, type, and page elements (as returned by wt_wiki_url_parse())
type: (character) Wiki type.
page: (character) Wiki page title.
api: (boolean) Whether to return an API call (TRUE) or a static page url (FALSE, default).
action: (character) See https://en.wikipedia.org/w/api.php for supported actions. This function currently only supports "parse".
redirects: (boolean) If the requested page is set to a redirect, resolve it.
format: (character) See https://en.wikipedia.org/w/api.php for supported output formats.
utf8: (boolean) If TRUE, request UTF-8 encoded results.
prop: (character) Properties to retrieve, either as a character vector or pipe-delimited string. See https://en.wikipedia.org/w/api.php?action=help&modules=parse for supported properties.
Value
a URL (character)
See Also
Other MediaWiki functions:
wt_wiki_page_parse(), wt_wiki_page(), wt_wiki_url_parse()
Examples
wt_wiki_url_build(wiki = "en", type = "wikipedia", page = "Malus domestica")
wt_wiki_url_build(
wt_wiki_url_parse("https://en.wikipedia.org/wiki/Malus_domestica"))
wt_wiki_url_build("en", "wikipedia", "Malus domestica", api = TRUE)
Parse MediaWiki Page URL
Description
Parse a MediaWiki page url into its component parts (wiki name, wiki type, and page title). Supports both static page urls and their equivalent API calls.
Usage
wt_wiki_url_parse(url)
Arguments
url: (character) MediaWiki page url.
Value
a list with elements:
wiki - wiki language
type - wikipedia type
page - page name
See Also
Other MediaWiki functions:
wt_wiki_page_parse(), wt_wiki_page(), wt_wiki_url_build()
Examples
wt_wiki_url_parse(url="https://en.wikipedia.org/wiki/Malus_domestica")
wt_wiki_url_parse("https://en.wikipedia.org/w/api.php?page=Malus_domestica")
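A small sketch building on the examples above: the parsed pieces (wiki, type, page, per the Value section) can be fed straight back into wt_wiki_url_build(), mirroring the round-trip shown in that function's examples:
parts <- wt_wiki_url_parse("https://en.wikipedia.org/wiki/Malus_domestica")
parts$wiki
parts$page
wt_wiki_url_build(parts)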
WikiCommons
Description
WikiCommons
Usage
wt_wikicommons(name, utf8 = TRUE, ...)
wt_wikicommons_parse(
page,
types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
tidy = FALSE
)
wt_wikicommons_search(query, limit = 10, offset = 0, utf8 = TRUE, ...)
Arguments
name: (character) Wiki name - as a page title; must be length 1
utf8: (logical) If TRUE, request UTF-8 encoded results.
...: curl options, passed on to the underlying HTTP client
page: (crul::HttpResponse) result of wt_wiki_page()
types: (character) List of properties to parse
tidy: (logical) tidy output to data.frames if possible. Default: FALSE
query: (character) query terms
limit: (integer) number of results to return. Default: 10
offset: (integer) record to start at. Default: 0
Value
wt_wikicommons returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
wt_wikicommons_parse returns a list
wt_wikicommons_search returns a list with slots for continue and query, where query holds the results; the query$search slot contains the search results
References
https://www.mediawiki.org/wiki/API:Search for help on search
Examples
## Not run:
# high level
wt_wikicommons(name = "Malus domestica")
wt_wikicommons(name = "Pinus contorta")
wt_wikicommons(name = "Ursus americanus")
wt_wikicommons(name = "Balaenoptera musculus")
wt_wikicommons(name = "Category:Poeae")
wt_wikicommons(name = "Category:Pinaceae")
# low level
pg <- wt_wiki_page("https://commons.wikimedia.org/wiki/Malus_domestica")
wt_wikicommons_parse(pg)
# search wikicommons
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikicommons_search(query = "Pinus", utf8 = FALSE)
## use search results to dig into pages
res <- wt_wikicommons_search(query = "Pinus", utf8 = FALSE)
lapply(res$query$search$title[1:3], wt_wikicommons)
## End(Not run)
Wikipedia
Description
Wikipedia
Usage
wt_wikipedia(name, wiki = "en", utf8 = TRUE, ...)
wt_wikipedia_parse(
page,
types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
tidy = FALSE
)
wt_wikipedia_search(
query,
wiki = "en",
limit = 10,
offset = 0,
utf8 = TRUE,
...
)
Arguments
name: (character) Wiki name - as a page title; must be length 1
wiki: (character) wiki language. Default: en. See wikipedias for language codes.
utf8: (logical) If TRUE, request UTF-8 encoded results.
...: curl options, passed on to the underlying HTTP client
page: (crul::HttpResponse) result of wt_wiki_page()
types: (character) List of properties to parse
tidy: (logical) tidy output to data.frames if possible. Default: FALSE
query: (character) query terms
limit: (integer) number of results to return. Default: 10
offset: (integer) record to start at. Default: 0
Value
wt_wikipedia returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
synonyms - a character vector with taxonomic names
wt_wikipedia_parse returns a list with the same slots, determined by the types parameter
wt_wikipedia_search returns a list with slots for continue and query, where query holds the results; the query$search slot contains the search results
References
https://www.mediawiki.org/wiki/API:Search for help on search
Examples
## Not run:
# high level
wt_wikipedia(name = "Malus domestica")
wt_wikipedia(name = "Malus domestica", wiki = "fr")
wt_wikipedia(name = "Malus domestica", wiki = "da")
# low level
pg <- wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
wt_wikipedia_parse(pg)
wt_wikipedia_parse(pg, tidy = TRUE)
# search wikipedia
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikipedia_search(query = "Pinus", utf8=FALSE)
wt_wikipedia_search(query = "Pinus", wiki = "fr", utf8=FALSE)
wt_wikipedia_search(query = "Pinus", wiki = "br", utf8=FALSE)
## curl options
# wt_wikipedia_search(query = "Pinus", verbose = TRUE, utf8=FALSE)
## use search results to dig into pages
res <- wt_wikipedia_search(query = "Pinus", utf8=FALSE)
lapply(res$query$search$title[1:3], wt_wikipedia)
## End(Not run)
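A sketch (not run: requires network access) combining several language codes with wt_wikipedia() to query multiple language wikis at once; the langlinks slot name follows the Value section above:
## Not run:
codes <- c("en", "fr", "da")
out <- lapply(codes, function(w) wt_wikipedia("Malus domestica", wiki = w))
names(out) <- codes
# number of language links reported by each wiki
vapply(out, function(x) NROW(x$langlinks), numeric(1))
## End(Not run)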
WikiSpecies
Description
WikiSpecies
Usage
wt_wikispecies(name, utf8 = TRUE, ...)
wt_wikispecies_parse(
page,
types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
tidy = FALSE
)
wt_wikispecies_search(query, limit = 10, offset = 0, utf8 = TRUE, ...)
Arguments
name: (character) Wiki name - as a page title; must be length 1
utf8: (logical) If TRUE, request UTF-8 encoded results.
...: curl options, passed on to the underlying HTTP client
page: (crul::HttpResponse) result of wt_wiki_page()
types: (character) List of properties to parse
tidy: (logical) tidy output to data.frames if possible. Default: FALSE
query: (character) query terms
limit: (integer) number of results to return. Default: 10
offset: (integer) record to start at. Default: 0
Value
wt_wikispecies returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
wt_wikispecies_parse returns a list
wt_wikispecies_search returns a list with slots for continue and query, where query holds the results; the query$search slot contains the search results
References
https://www.mediawiki.org/wiki/API:Search for help on search
Examples
## Not run:
# high level
wt_wikispecies(name = "Malus domestica")
wt_wikispecies(name = "Pinus contorta")
wt_wikispecies(name = "Ursus americanus")
wt_wikispecies(name = "Balaenoptera musculus")
# low level
pg <- wt_wiki_page("https://species.wikimedia.org/wiki/Abelmoschus")
wt_wikispecies_parse(pg)
# search wikispecies
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikispecies_search(query = "pine tree", utf8=FALSE)
## use search results to dig into pages
res <- wt_wikispecies_search(query = "pine tree", utf8=FALSE)
lapply(res$query$search$title[1:3], wt_wikispecies)
## End(Not run)