---
title: "Introduction to atpolR"
output: 
  rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to atpolR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: ../inst/REFERENCES.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

oldpar <- par(no.readonly = TRUE)
```

The ATPOL grid was created in the late 1960s at the Institute of Botany at Jagiellonian University in Kraków. [@zajacZalozeniaMetodyczneAtlasu1978; @zajacAtlasDistributionVascular1978] describes the background and methodology. Łukasz Komsta and Marek Verey carried out the extensive mathematical research and GIS implementation [@komstaRewizjaMatematycznaSiatki2016; @vereyTeoretycznaAnalizaPraktyczne2017]. Algorithms provided by Komsta on [OpenATPOL](https://atpol.sourceforge.io/) are basis for implementation in **atpolR** package.

# Basic usage

## Prepare sample data based on published ATPOL data

In our example we'll use a distribution map of _Erigeron acris_ L. subsp. _acris_ from   [@zajacAtlasRozmieszczeniaRoslin2019]. The image was scanned at 150 dpi and georeferenced with  QGIS.

```{r eriacr, fig.width = 7, fig.align='center', fig.cap = "_Erigeron acris_ L. subsp. _acris_ distribution taken from [@zajacAtlasRozmieszczeniaRoslin2019]"}
par(mar = c(0, 0, 0, 0))
tif <- system.file("extdata/eriacr.tif", package = "atpolR")
r <- terra::rast(tif)
terra::plotRGB(r)
```

There is hundreds of records. To get them all, we will use `check_atpol_square()` function, which takes the POINT coordinates and a raster as arguments and checks if the value of raster cell corresponding to some arbitrary buffer around the POINT equals zero. As the points are drawn in centers of ATPOL 10 km grid, the values of atpol10k() centroids with an buffer with radius of 1200 m will be checked. It may be necessary to adjust a buffer slightly depending of the quality of scan and precision of georeferencing.

The raster usually consist of 3 layers, one for each R, G, B component. The difference between them are visible on Fig. \@ref(fig:rgbraster).

```{r rgbraster, echo = FALSE, message=FALSE, warning=FALSE, results='hide', fig.width = 7, fig.align='center', fig.cap="R, G and B layers of a raster"}
par(mar = c(0, 0, 0, 0))
par(mfrow = c(1, 3))
reds <- colorspace::sequential_hcl(255, "Reds", rev = TRUE)
grns <- colorspace::sequential_hcl(255, "Greens", rev = TRUE)
blus <- colorspace::sequential_hcl(255, "Blues", rev = TRUE)

terra::plot(r[[1]], col = reds,  legend = FALSE, axes = FALSE)
terra::plot(r[[2]], col = grns,  legend = FALSE, axes = FALSE)
terra::plot(r[[3]], col = blus,  legend = FALSE, axes = FALSE)

par(mfrow = c(1, 1))
```

We can see a lot of green and blue components in the scan shown in Fig. \@ref(fig:eriacr). We will only look at the first layer for now. Please keep in mind that other layers may be more useful depending on the scan quality and subsequent image processing. 

```{r rraster, echo = FALSE, message=FALSE, warning=FALSE, results='hide', fig.width = 7, fig.asp = 0.7, fig.align='center'}
terra::plot(r[[1]], col = colorspace::sequential_hcl(255, "Reds", rev = TRUE), axes = FALSE)
```

The layer's values are continuous, ranging from 0 to 255. To simplify and facilitate analysis, we will classify the layer by assigning only two values: 0 and 1. We'll make a reclassification matrix called `rclmat` and use the `classify()` function form the **terra** package.

```{r echo = TRUE, message=FALSE, warning=FALSE, results='hide'}
m <- c(0,120, 0,
       120,255, 1)
rclmat <- matrix(m, ncol=3, byrow=TRUE)
rc <- terra::classify(r[[1]], rclmat, include.lowest = TRUE)
```
```{r reclasified, echo = FALSE, message=FALSE, warning=FALSE, fig.width=7, fig.asp=0.7, fig.align='center', fig.cap="Reclasiffied raster with 0 --- as black and 1 --- as white"}
terra::plot(rc, axes = FALSE, col = c("black", "white"), legend = FALSE)
```

After preparing the raster, we can load the package and run the `check_atpol_square()` function for all 10k grids:

```{r eriacr-reverse, echo=TRUE, message=FALSE, warning=FALSE}
library(atpolR)

eriacr <- atpol10k()|>
  dplyr::mutate(a = mapply(function(x) check_atpol_square(x, rc), centroid)) |>
  dplyr::filter(a == "YES")
```

Let's display the results:
```{r echo=TRUE, message=FALSE, warning=FALSE}
eriacr
```
There is 709 observations (grids) in data set.

Let's have a closer look on BE square:

```{r BE, echo=TRUE, fig.width=7, fig.asp=0.7, fig.align='center', message=FALSE, warning=FALSE}
BE <- atpol100k() |>
  subset(Name == "BE") |>
  sf::st_bbox()
par(pty = "s")
plot(NA, type = "n", xlim = c(BE[1], BE[3]), ylim = c(BE[2], BE[4]), axes = FALSE, xlab = "", ylab = "")
terra::plot(rc, legend = FALSE, add = TRUE)

atpol100k() |>
  subset(Name == "BE") |>
  sf::st_cast("LINESTRING") |>
  terra::plot(add = TRUE, col = "blue", lwd = 1.2)

eriacr |>
  subset(substr(Name, 1, 2) == "BE") |>
  sf::st_set_geometry("centroid") |>
  terra::plot(pch = 16, cex = 1.2, col = "blue", add = TRUE)
```

Thus far, so god. Our data set corresponds to published data. In the next step we will add our own observations.

Please keep in mind that `check_atpol_square()` function can produce false positive results in the case of noisy scans. it also does not recognize the various shapes used in original publications for various sites. 

## Extend the data

Assume we want to add our own observations the data set. If we already know the grid name, we can simply use the Name to filter out the grid created by the `atpol10k()` function. We can use `latlon_to_grid()` function if we don't have the grid but have coordinates. In the following example, we are using both methods:

```{r myData, echo = TRUE, warning=FALSE}
myData <- atpol10k() |>
  dplyr::filter(Name %in% c("BE68", 
                            latlon_to_grid(51.13619, 16.95069, 4))) |>
  dplyr::mutate(a = "myData")
```

And let's add them to above BE square plot:

```{r echo = TRUE, eval=FALSE}
myData |>
  sf::st_set_geometry("centroid") |>
  terra::plot(pch = 16, cex = 1.8, col = "red", add = TRUE)
```  


```{r echo=FALSE, fig.align='center', fig.asp=0.7, fig.cap="Data set extended with our observations in grids BE48 and BE68", message=FALSE, warning=FALSE, myDataplot, fig.width=7}
BE <- atpol100k() |>
  subset(Name == "BE") |>
  sf::st_bbox()
par(pty = "s")
plot(NA, type = "n", xlim = c(BE[1], BE[3]), ylim = c(BE[2], BE[4]), axes = FALSE, xlab = "", ylab = "")
terra::plot(rc, legend = FALSE, add = TRUE)

atpol100k() |>
  subset(Name == "BE") |>
  sf::st_cast("LINESTRING") |>
  terra::plot(add = TRUE, col = "blue", lwd = 1.2)

eriacr |>
  subset(substr(Name, 1, 2) == "BE") |>
  sf::st_set_geometry("centroid") |>
  terra::plot(pch = 16, cex = 1.2, col = "blue", add = TRUE)

myData |>
  sf::st_set_geometry("centroid") |>
  terra::plot(pch = 16, cex = 1.8, col = "red", add = TRUE)
```  

## Plot it

**atpolR** package provides a function `plot_points_on_atpol()` which can be used to visualize the data set. Let's merge our two data sets (`eriacr` and `myData`) and plot them together. For removing any duplicates we can use `unique.data.frame()` function from base R, or `distinct(Name)` from **dplyr** package.

```{r}
eriacr <- eriacr |>
  rbind(myData) |>
  unique.data.frame()
```

And final plot.

```{r echo = TRUE, eval = FALSE}
plotPoitsOnAtpol(eriacr$centroid, main = "Erigeron acris subsp. acris", cex = 0.6)
```

```{r eriacratpol, echo = FALSE, fig.width=7, fig.asp=1, fig.align='center', fig.cap="Combined dataset drawn on ATPOL grid"}
par(mar = c(0, 0, 0, 0))
plotPoitsOnAtpol(eriacr$centroid, main = "Erigeron acris subsp. acris", cex = 0.6)
```


# Functions description

Two basic functions `latlon_to_grid()` and `grid_to_latlon()` allows to quickly convert geographical coordinates (given in WGS 84 latitude and longitude degrees) to ATPOL grid and from grid to coordinates respectively.

```{r}
latlon_to_grid(51.01234, 17.23456, 4)
latlon_to_grid(51.01234, 17.23456, 6)
```
The firs two arguments `latlon_to_grid()` function are latitude and longitude respectively. The third argument is the length of returned grid; it might be even number between 2 and 12.

```{r}
grid_to_latlon("CE50")
```
By default `grid_to_latlon()` returns the center of grid square. If you wish to get it's corners, you can pass another 2 arguments to the function, which are X and Y offsets, like:
```{r}
grid_to_latlon("CE50", xoffset = 1, yoffset = 1)
```
for bottom right corner.

ATPOL 10km x 10km and 100km x 100k grids are generated by `atpol10k()` and `atpol100k()` functions respectively. It returns set of simple features geometries with grids as polygons:

```{r}
atpol100k()
```

For 10k grid it returns a centroids as well:
```{r}
atpol10k()
```

Please note, that ATPOL grids are projected in EPSG:2180 coordinate reference system, commonly used in Poland. 

Function `atpol_div(grid, divider)` divides any given `grid` to smaller grids by 2, 4 or 5 as proposed in [@vereyStandaryzacjaZapisuPodzialow2018].

```{r atpoldiv, echo = FALSE, message=FALSE, warning=FALSE, results='hide', fig.width = 7, fig.align='center', fig.cap="Division by 2, 4 and 5 with adopted naming convection d, c, p"}
par(mar = c(0, 0, 0, 0))
par(mfrow = c(1, 3))
for (d in c(2, 4, 5)) {
  a <- atpol_div("BE23", d)
  a$centroid <- sf::st_centroid(a$geometry)

  plot(a$geometry)
  a |>
    sf::st_set_geometry("centroid") |>
    subset(select = c("Name", "centroid")) |>
    terra::vect() |>
    terra::text(labels = substr(a$Name,5,7))
  
}
par(mfrow = c(1, 1))
```


```{r boundaryPL, fig.width=6, fig.align='center', fig.cap="Boundary of Poland on ATPOL grid."}
par(mar = c(0, 0, 0, 0))
b <- boundaryPL()
plot(atpol100k()$geometry)
plot(b, col = "red", add = TRUE)
```

```{r, include = FALSE}
par(oldpar)
```


# Bibliography