All the material presented here, to the extent it is original, is available under CC-BY-SA.
The presentation file is on github at: (https://rsbivand.github.io/geostat19_talk)
I am running R 3.6.1, with recent update.packages()
.
needed <- c("sf", "mapview", "sp", "rgdal", "wordcloud", "RColorBrewer", "rgeos", "magrittr", "igraph", "miniCRAN", "ctv", "RSQLite", "lwgeom")
Script at https://github.com/rsbivand/geostat19_talk/raw/master/geostat_bivand_19.zip. Download to suitable location and use as basis.
Understanding the original uses of S and R (in the 1980s and 1990s), and seeing how these uses affected the development of R lets us appreciate the robustness of R’s ecosystem.
This talk uses readings of the R sources and other information to explore R’s history. The topics to be touched on include the “colour” books (brown, blue, white, green), interlinkages to SICP (Scheme) and LispStat
This should show where and when impulses flowed between R and the development of capabilities in R for spatial data analysis
The focus will be on the period 1997-2004, which is when a lot of the foundations for R spatial came into being
Rasmus Bååth has a useful blog piece on R’s antecedents in the S language
Something similar is present in the second chapter of Chambers (2016), from the viewpoint of one of those responsible for the development of the S language
In addition to S, we need to take SICP and Scheme into account (Abelson and Sussman 1996), as described by Ihaka and Gentleman (1996) and Wickham (2014)
Finally, LispStat and its creators have played and continue to play a major role in developing R (Tierney 1990, 1996, 2005)
Becker and Chambers (1984): S: An Interactive Environment for Data Analysis and Graphics, A.K.A. the Brown Book
Becker and Chambers (1985): Extending the S System
Becker, Chambers and Wilks (1988): The New S Language: A Programming Environment for Data Analysis and Graphics, A.K.A. the Blue Book.
Chambers and Hastie (1992): Statistical Models in S, A.K.A. the White Book.
Chambers (1998): Programming with Data: A Guide to the S Language, A.K.A. the Green Book.
Venables and Ripley (2000): S Programming
The S2 system was described in the Brown Book, S3 in the Blue Book and completed in the White Book, finally S4 in the Green Book
The big advance from S2 to S3 was that users could write functions; that data.frame objects were defined; that formula objects were defined; and that S3 classes and method dispatch appeared
S4 brought connections and formal S4 classes, the latter seen in R in the methods package (still controversial)
S-PLUS was/is the commercial implementation of S and its releases drove S3 and S4 changes
S was a Bell Labs innovation, like Unix, C, C++, and many interpreted languages (like AWK); many of these share key understandings
Now owned by Nokia, previously Alcatel-Lucent, Lucent, and AT&T
Why would a telecoms major (AT&T) pay for fundamental research in computer science and data analysis (not to sell or market other products better)?
Some Green Book examples are for quality control of telecoms components
S-PLUS was quickly adopted for teaching and research, and with S3, provided extensibility in the form of libraries
Most links have died by now, but see this FAQ} for a flavour - there was a lively community of applied statisticians during the 1990s
S built on a long tradition of documentation through examples, with use cases and data sets taken from the applied statistical literature; this let users compare output with methods descriptions
… so we get to R
Luke Tierney was in R core in 1997, and has continued to exert clear influence over development
Because R uses a Scheme engine, similar to Lisp, under the hood, his insight into issues like the garbage collector, namespaces, byte-compilation, serialization, parallelization, and now ALTREP has been crucial (see also the proposal by Luke Tierney, Gabe Becker and Tomas Kalibera})
Many of these issues involve the defensive copy on possible change policy involved in lazy evaluation, which may lead to multiple redundant copies of data being present in memory
Luke Tierney and Brian Ripley have fought hard to let R load fast, something that is crucial to ease the use of R on multicore systems or inside databases
The AI-GEOSTATS listserve/mailing list was started by Gregoire Dubois in 1995; AI was from Arc/Info
GRASS accepted code contributions, then kept in src.contrib
and src.garden
and GRASS mailing lists were active (the archives seem to have been lost in moving from minordomo to mailman in 2001)
Bao et al. (2000) describe S-PLUS links to GRASS and ArcView; the module for spatial statistics was also from the mid-1990’s
MASS
included S and compiled code for kriging, trend surfaces and point pattern analysis, splancs
provided point pattern analysis
Much of the porting of S code to R for spatial statistics was begun by Albrecht Gebhardt
The CRAN archives show tripack
and akima
- both with ACM licenses - from August 1998; ash
and sgeostat
followed in April 1999
The date for the MASS spatial
package is unknown because it was bundled in the VR
metapackage subsequently split into spatial
, MASS
, nnet
and class
In the earliest period, CRAN admins helped practically with porting
Rowlingson and Diggle (1993) describes the S-PLUS version of splancs
for point pattern analysis
I’d contacted Barry in 1997 but only moved forward with porting splancs
as R’s ability to load shared objects advanced
In September 1998, I wrote to him:
splancs
in the CRAN archive is from November 2000Albrecht and I were teaching spatial data analysis using R and code that would go into CRAN packages (I still have the package I wrote, and the exams)
In 1998, the ERSA (European Regional Science Association) meeting was in Vienna, and we gave a talk subsequently published in Bivand and Gebhardt (2000)
We covered R, point pattern analysis, geostatistics, ESDA and spatial econometrics
The first GRASS
package in the CRAN archive is from November 2000
Before that, we’d presented at GeoComputation 2000 in Chatham Bivand and Neteler (2000)
Nicholas Lewin-Koh posted on the R list (subsequently split between R-help and R-devel) in September 2000, asking about coding the reading of ESRI shapefiles
He made a number of key contributions which ended up in spweights
- currently part of spdep
- and what became maptools
He wrote the first vector representation based on the shapefile import format, and I should be blamed for subsequent fumbling implementations in maptools
From interaction with Kurt Hornik, Ross Ihaka and Brian Ripley on the R list during late 2000, contacts widened, including Tim Keitt (rgdal
)
I ended up presenting at the DSC meeting in Vienna in March 2001, getting to meet many R core members, developers and CRAN admins
Further contacts came from a CSISS workshop organised by Luc Anselin in May 2002 in Santa Barbara, including Konstantin Krivoruchko (ESRI) and Gilberto Câmara (INPE)
My talk eventually appeared as Bivand (2006), but the contacts were more useful
Visiting the INPE earth observation lab and presenting at a conference in December 2002 provided further ideas and links to TerraLib
Luc himself contributed code to spdep
and continues be very supportive
I did a further piece for the ERSA 2002 conference in Dortmund (Bivand 2002)
The DSC meeting in March 2003 was also being planned for Vienna, and I’d talked about it with Kurt Hornik when he gave a keynote at the GRASS meeting in Trento in September 2002
I met Virgilio Gómez-Rubio at the GRASS meeting among many others and talked about DCluster
and RArcInfo
In mid November I started emailing contacts to invite sumbissions for a spatial statistics session at DSC, and a workshop at which we could discuss working together
Of the replies, far more would have liked to attend than were able to (Pebesma 2004; Gómez-Rubio, Ferrándiz-Ferragud, and López-Quílez 2005; Gómez-Rubio and López-Quílez 2005)
Edzer was already well known because of his open-source gstat
program (Pebesma and Wesseling 1998)
From Edzer’s reply 13 November 2002
Correspondence developed quickly (21 November)
Contribution even though he couldn’t get to Vienna:
fields
and geoR
in mid-2001, spatstat
in early 2002
gstat
in February 2003, maptools
in August 2003, rgdal
in November 2003
sp
in April 2005 (after a workshop in Lancaster November 2004 and before one in Valencia May 2005)
classInt
in March 2006, spgrass6
in August 2006
A chapter written in 2011 and published three years later has an overview (Bivand 2014)
Before discussing software component stacks for GC, we should acknowledge the importance of open standards for geospatial data interchange
Kralidis (2008) points out the importance of concepts such as that of spatial data infrastructure, whether established within national jurisdictions, within supranational jurisdictions, or by international standards organisations
A fuller coverage of the relationships between spatial data infrastructures and free and open source geospatial software is given by Steiniger and Hunter (2012)
Kralidis also helpfully distinguishes between formal, de facto, and ad hoc standards, which provide the flexibility needed to move ahead somewhat faster than standards committees are usually able to do
Software components appear to have been defined first by McIlroy (1969), as interchangeable subassemblies by analogy with mass production manufacturing
The software component stack has been a core concept of programming at least since the publication of Kernighan and Plauger(1976), systematising the experience of Bell Labs computer scientists
Some of the lessons are made clear in programming itself (Kernighan and Pike 1999), while others affect how one may “glue” small utility functions together in an interactive and/or scripting language (Kernighan and Pike 1984)
Consequently, a software component stack can be taken as sequence of component programs that are used together to achieve a common goal
Using stacks of components becomes attractive when task objectives can more easily be met by using components developed by others than by developing them independently
When the costs of keeping a stack working exceed those of rewriting, the stack may fail, but this is seldom the case
Open source software developers often advertise application programming interfaces (API), with an implicit promise that other downstream developers using the API will be less subject to incompatible changes
It is then vital that changes in these underlying components do not change the way that dependent components function, unless their earlier behaviour had been in error
As already noted, developers wishing to integrate software components in stacks must pay careful attention to the versioning of the components, and to the impacts of upstream changes on downstream components
The terms upstream and downstream refer to the ordering of the components, with data flowing from upstream to downstream components
If the specification of an upstream component changes, those following it will need to be modified
If the changes are forced by real bugs being fixed, or security holes being blocked, downstream components must react in appropriate ways
However, some changes occur for other reasons, such as code cleaning, reimplementation, or the resolution of licence issues in otherwise functioning code
In most cases, upstream developers then attempt to reduce changes in their interfaces with downstream components to an unavoidable minimum
Open source projects are typically most constrained with respect to developer time for maintenance, including the revision of functioning code to accommodate upstream changes that may not improve downstream performance
A particularly troublesome issue for dynamically linked software components in relatively long-running applications is that of thread safety
The Open Source Geospatial Foundation was brought into being in 2006 as a successor to the Mapserver Foundation, itself created the year before
In addition to providing a shared infrastructure and procedural framework for web mapping, desktop application and geospatial library projects, OSGeo aims to promote open source geospatial software use and development, including use integrated with proprietary software
McIhagga (2008) discusses some of the ways in which communities of practice have developed, with particular reference to web mapping, and in his description, the open source web mapping “ecology”
The Geospatial Data Abstraction Library is a crucial part of the upstream geospatial library infrastructure
One of the most important components required by geospatial applications is the provision of robust and clear representations of coordinate reference systems
Because GEOS uses OGC SFS specifications for geometries, it does not “build” topologies in the classical GIS arc-node understanding
GRASS (Geographic Resources Analysis Support System) was already twenty years old when the GRASS developers collaborated in founding OSGeo, and they have been playing an important role in the broader OSGeo movement (Neteler et al. 2008); many of the more recent developments in GRASS are covered by Neteler et al. (n.d.)
Because so much open source (and other) software uses the PROJ library and framework, many are affected when PROJ upgrades. Until very recently, PROJ has been seen as very reliable, and the changes taking place now are intended to confirm and reinforce this reliability. Before PROJ 5 (PROJ 6 is out now, PROJ 7 is coming early in 2020), the +datum=
tag was used, perhaps with +towgs84=
with three or seven coefficients, and possibly +nadgrids=
where datum transformation grids were available. However, transformations from one projection to another first inversed to longitude-latitude in WGS84, then projected on to the target projection.
‘Fast-forward 35 years and PROJ.4 is everywhere: It provides coordinate handling for almost every geospatial program, open or closed source. Today, we see a drastical increase in the need for high accuracy GNSS coordinate handling, especially in the agricultural and construction engineering sectors. This need for geodetic-accuracy transformations is not satisfied by “classic PROJ.4”. But with the ubiquity of PROJ.4, we can provide these transformations “everywhere”, just by implementing them as part of PROJ.4’ (Evers and Knudsen 2017).
Following the introduction of geodetic modules and pipelines in PROJ 5 (Knudsen and Evers 2017; Evers and Knudsen 2017), PROJ 6 moves further. Changes in the legacy PROJ representation and WGS84 transformation hub have been coordinated through the GDAL barn raising initiative. Crucially WGS84 often ceases to be the pivot for moving between datums. A new OGC WKT is coming, and an SQLite EPSG file database has replaced CSV files. SRS will begin to support 3D by default, adding time too as SRS change. See also PROJ migration notes.
There are very useful postings on the PROJ mailing list from Martin Desruisseaux, first proposing clarifications and a follow-up including a summary:
- “Early binding” ≈ hub transformation technique.
- “Late binding” ≈ hub transformation technique NOT used, replaced by a more complex technique consisting in searching parameters in the EPSG database after the transformation context (source, target, epoch, area of interest) is known.
- The problem of hub transformation technique is independent of WGS84. It is caused by the fact that transformations to/from the hub are approximate. Any other hub we could invent in replacement of WGS84 will have the same problem, unless we can invent a hub for which transformations are exact (I think that if such hub existed, we would have already heard about it).
The solution proposed by ISO 19111 (in my understanding) is:
- Forget about hub (WGS84 or other), unless the simplicity of early-binding is considered more important than accuracy.
- Associating a CRS to a coordinate set (geometry or raster) is no longer sufficient. A {CRS, epoch} tuple must be associated. ISO 19111 calls this tuple “Coordinate metadata”. From a programmatic API point of view, this means that getCoordinateReferenceSystem() method in Geometry objects (for instance) needs to be replaced by a getCoordinateMetadata() method.
Maybe watch Even Rouault’s recent FOSS4G talk: https://media.ccc.de/v/bucharest-198-revamp-of-coordinate-reference-system-management-in-the-osgeo-c-c-stack-with-proj-and-gdal
In QGIS built on current PROJ 6 with the proj.h
API (and GDAL built on current PROJ 6 with the proj.h
API), we see the following sequence of GUI windows when trying to open the olinda.gpkg file.
Instead of using the declared coordinate reference system of the added layer to provide a transformation/conversion relationship to possible WGS84 geographical coordinate or web mapping backgrounds, the user of the most recent QGIS version with PROJ 6 faces a choice of three alternatives with varying availabilities and precisions:
The third alternative has better precision, but depends on finding and installing an NTv2 grid file in the PROJ shared/proj
metadata folder:
If we install the file, the choices change to promote the more precise NTv2-based path to the first position:
library(sf)
## Linking to GEOS 3.7.2, GDAL 3.0.1, PROJ 6.2.0
packageVersion("sf")
## [1] '0.8.0'
The final element reported by sf::sf_extSoftVersion()
shows whether sf was built with the proj.h
interface to PROJ, or the legacy proj.api.h
interface. However, GDAL also has to be built with the proj.h
interface for everything to line up:
sf_extSoftVersion()
## GEOS GDAL proj.4 GDAL_with_GEOS USE_PROJ_H
## "3.7.2" "3.0.1" "6.2.0" "true" "true"
st_crs(22525)
## Coordinate Reference System:
## EPSG: 22525
## proj4string: "+proj=utm +zone=25 +south +ellps=intl +towgs84=-205.57,168.77,-4.12,0,0,0,0 +units=m +no_defs"
The OGC WTK2 definition now contains a usage/scope term showing where the definition may be used; there may also be a temporal frame for a definition.
cat(system("projinfo EPSG:22525", intern=TRUE), sep="\n")
## PROJ.4 string:
## +proj=utm +zone=25 +south +ellps=intl +towgs84=-205.57,168.77,-4.12,0,0,0,0 +units=m +no_defs +type=crs
##
## WKT2_2018 string:
## PROJCRS["Corrego Alegre 1970-72 / UTM zone 25S",
## BASEGEOGCRS["Corrego Alegre 1970-72",
## DATUM["Corrego Alegre 1970-72",
## ELLIPSOID["International 1924",6378388,297,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4225]],
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["Brazil - east of 36°W onshore"],
## BBOX[-10.1,-36,-4.99,-34.74]],
## ID["EPSG",22525]]
If we ask about possible transformations/conversions, we see choices we saw among those represented in QGIS (I work on two apparently identical systems, which may give different choice counts)
cat(system("projinfo -s EPSG:22525 -t EPSG:31985", intern=TRUE), sep="\n")
## Candidate operations found: 2
## -------------------------------------
## Operation n°1:
##
## unknown id, Inverse of UTM zone 25S + Corrego Alegre 1970-72 to SIRGAS 2000 (2) + UTM zone 25S, 5 m, Brazil - Corrego Alegre 1970-1972
##
## PROJ string:
## +proj=pipeline +step +inv +proj=utm +zone=25 +south +ellps=intl +step +proj=push +v_3 +step +proj=cart +ellps=intl +step +proj=helmert +x=-206.05 +y=168.28 +z=-3.82 +step +inv +proj=cart +ellps=GRS80 +step +proj=pop +v_3 +step +proj=utm +zone=25 +south +ellps=GRS80
##
## WKT2_2018 string:
## CONCATENATEDOPERATION["Inverse of UTM zone 25S + Corrego Alegre 1970-72 to SIRGAS 2000 (2) + UTM zone 25S",
## SOURCECRS[
## PROJCRS["Corrego Alegre 1970-72 / UTM zone 25S",
## BASEGEOGCRS["Corrego Alegre 1970-72",
## DATUM["Corrego Alegre 1970-72",
## ELLIPSOID["International 1924",6378388,297,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4225]],
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## ID["EPSG",22525]]],
## TARGETCRS[
## PROJCRS["SIRGAS 2000 / UTM zone 25S",
## BASEGEOGCRS["SIRGAS 2000",
## DATUM["Sistema de Referencia Geocentrico para las AmericaS 2000",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4674]],
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## ID["EPSG",31985]]],
## STEP[
## CONVERSION["Inverse of UTM zone 25S",
## METHOD["Inverse of Transverse Mercator",
## ID["INVERSE(EPSG)",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]],
## ID["INVERSE(EPSG)",16125]]],
## STEP[
## COORDINATEOPERATION["Corrego Alegre 1970-72 to SIRGAS 2000 (2)",
## VERSION["PBS-Bra 2005"],
## SOURCECRS[
## GEOGCRS["Corrego Alegre 1970-72",
## DATUM["Corrego Alegre 1970-72",
## ELLIPSOID["International 1924",6378388,297,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4225]]],
## TARGETCRS[
## GEOGCRS["SIRGAS 2000",
## DATUM["Sistema de Referencia Geocentrico para las AmericaS 2000",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4674]]],
## METHOD["Geocentric translations (geog2D domain)",
## ID["EPSG",9603]],
## PARAMETER["X-axis translation",-206.05,
## LENGTHUNIT["metre",1],
## ID["EPSG",8605]],
## PARAMETER["Y-axis translation",168.28,
## LENGTHUNIT["metre",1],
## ID["EPSG",8606]],
## PARAMETER["Z-axis translation",-3.82,
## LENGTHUNIT["metre",1],
## ID["EPSG",8607]],
## OPERATIONACCURACY[5.0],
## ID["EPSG",6193],
## REMARK["Formed by concatenation of tfms codes 6191 and 15485. May be used as transformation between Corrego Alegre 1970-72 and WGS 84 - see tfm code 6194. Used by Petrobras and ANP from February 2005."]]],
## STEP[
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]],
## ID["EPSG",16125]]],
## USAGE[
## SCOPE["unknown"],
## AREA["Brazil - Corrego Alegre 1970-1972"],
## BBOX[-33.78,-58.16,-2.68,-34.74]]]
##
## -------------------------------------
## Operation n°2:
##
## unknown id, Inverse of UTM zone 25S + Corrego Alegre 1970-72 to SIRGAS 2000 (1) + UTM zone 25S, 2 m, Brazil - Corrego Alegre 1970-1972, at least one grid missing
##
## PROJ string:
## +proj=pipeline +step +inv +proj=utm +zone=25 +south +ellps=intl +step +proj=hgridshift +grids=CA7072_003.gsb +step +proj=utm +zone=25 +south +ellps=GRS80
##
## WKT2_2018 string:
## CONCATENATEDOPERATION["Inverse of UTM zone 25S + Corrego Alegre 1970-72 to SIRGAS 2000 (1) + UTM zone 25S",
## SOURCECRS[
## PROJCRS["Corrego Alegre 1970-72 / UTM zone 25S",
## BASEGEOGCRS["Corrego Alegre 1970-72",
## DATUM["Corrego Alegre 1970-72",
## ELLIPSOID["International 1924",6378388,297,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4225]],
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## ID["EPSG",22525]]],
## TARGETCRS[
## PROJCRS["SIRGAS 2000 / UTM zone 25S",
## BASEGEOGCRS["SIRGAS 2000",
## DATUM["Sistema de Referencia Geocentrico para las AmericaS 2000",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4674]],
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## ID["EPSG",31985]]],
## STEP[
## CONVERSION["Inverse of UTM zone 25S",
## METHOD["Inverse of Transverse Mercator",
## ID["INVERSE(EPSG)",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]],
## ID["INVERSE(EPSG)",16125]]],
## STEP[
## COORDINATEOPERATION["Corrego Alegre 1970-72 to SIRGAS 2000 (1)",
## VERSION["IBGE-Bra"],
## SOURCECRS[
## GEOGCRS["Corrego Alegre 1970-72",
## DATUM["Corrego Alegre 1970-72",
## ELLIPSOID["International 1924",6378388,297,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4225]]],
## TARGETCRS[
## GEOGCRS["SIRGAS 2000",
## DATUM["Sistema de Referencia Geocentrico para las AmericaS 2000",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4674]]],
## METHOD["NTv2",
## ID["EPSG",9615]],
## PARAMETERFILE["Latitude and longitude difference file","CA7072_003.gsb"],
## OPERATIONACCURACY[2.0],
## ID["EPSG",5526],
## REMARK["May be used as transformation between Corrego Alegre 1970-72 and WGS 84 - see tfm code 5541."]]],
## STEP[
## CONVERSION["UTM zone 25S",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-33,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",10000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]],
## ID["EPSG",16125]]],
## USAGE[
## SCOPE["unknown"],
## AREA["Brazil - Corrego Alegre 1970-1972"],
## BBOX[-33.78,-58.16,-2.68,-34.74]]]
##
## Grid CA7072_003.gsb needed but not found on the system.
The input data use the Corrego Alegre 1970-1972 setting, and still provide a +towgs84=
key representation for pivoting through WGS84:
olinda <- st_read("data/olinda.gpkg", quiet=TRUE)
st_crs(olinda)
## Coordinate Reference System:
## No EPSG code
## proj4string: "+proj=utm +zone=25 +south +ellps=intl +towgs84=-205.57,168.77,-4.12,0,0,0,0 +units=m +no_defs"
We’ll just use one point to check things out:
xy_c <- st_centroid(st_geometry(olinda[ 1,]))
st_coordinates(xy_c)
## X Y
## 1 295460.1 9120358
If we manually pivot through WGS84 on the way back to SIRGAS2000 UTM, we get:
st_coordinates(st_transform(st_transform(xy_c, 4326), 31985))
## X Y
## 1 295489.3 9120352
Without the NTv2 grid file CA7072_003.gsb
we seem to get the same:
# without CA7072_003.gsb
st_coordinates(st_transform(xy_c, 31985))
## X Y
## 1 295489.3 9120352
but we also get the same with the grid file if we leave the +towgs84=
key in the PROJ string:
# with CA7072_003.gsb
st_coordinates(st_transform(xy_c, 31985))
# X Y
# 1 295489.3 9120352
If however we manipulate the PROJ string to specify the grid file instead of the +towgs84=
key, we can get the improved precision:
# with CA7072_003.gsb
xy_c1 <- xy_c
st_crs(xy_c1) <- "+proj=utm +zone=25 +south +ellps=intl +units=m +nadgrids=CA7072_003.gsb"
print(st_coordinates(st_transform(xy_c1, 31985)), digits=9)
# X Y
# 1 295486.396 9120350.62
Let’s try to use the PROJ utility program cs2cs
in its PROJ 6 version. The cs2cs
version when the grid file is present matches sf::st_transform()
when the input CRS is modified to point to the grid file:
# with CA7072_003.gsb
cat(system(paste0("echo ", paste(xy, collapse=" "), " | cs2cs EPSG:22525 EPSG:31985"), intern=TRUE))
# 295486.40 9120350.62 0.00
cs2cs
without the grid file gives:
xy <- st_coordinates(xy_c)
# without CA7072_003.gsb
cat(system(paste0("echo ", paste(xy, collapse=" "), " | cs2cs EPSG:22525 EPSG:31985"), intern=TRUE))
## 295488.64 9120352.16 0.00
This matches the second set of +towgs84=
coefficients:
# without CA7072_003.gsb
xy_c2 <- xy_c
st_crs(xy_c2) <- "+proj=utm +zone=25 +south +ellps=intl +units=m +towgs84=-206.05,168.28,-3.82,0,0,0,0"
st_coordinates(st_transform(xy_c2, 31985))
## X Y
## 1 295488.6 9120352
Using the lwgeom::st_transform_proj()
for now uses the proh_api.h
interface:
# without CA7072_003.gsb
# -DACCEPT_USE_OF_DEPRECATED_PROJ_API_H
st_coordinates(lwgeom::st_transform_proj(xy_c, 31985))
## X Y
## 1 295489.3 9120352
This is the EPSG description of the grid file: https://epsg.io/5541
It was retrieved from: https://www.eye4software.com/files/ntv2/ca70.zip
This page gives a picture of why the changes in PROJ matter - the arrows are in cm per year displacement.
Some grid files are available from https://proj.org/download.html, but because many others are not as freely available (yet), they may need to be dwnloaded from national mapping agencies. Most are relatively large, and also need to be versioned. Do read the README files in the zip archives!
A recent upgrade of GEOS from 3.7.1 to 3.7.2 on a CRAN test server led to failures in three packages using rgeos for topological operations. rgeos 0.4-3 set the checkValidity=
argument to for example gIntersection()
to FALSE (TRUE threw an error if either geometry was invalid). An issue was opened on the sf github repository (rgeos is developed on R-Forge). The test objects (from an example from inlmisc) will be used here:
rgeos::version_GEOS0()
## [1] "3.7.2"
## attr(,"rev")
## [1] "b55d2125"
For rgeos <= 0.4-3, the default was not to check input geometries for validity before trying topological operations, for >= 0.5-1, the default changes when GEOS > 3.7.1 to check for validity. The mode of the argument also changes to integer from logical:
cV_old_default <- ifelse(rgeos::version_GEOS0() >= "3.7.2", 0L, FALSE)
yy <- rgeos::readWKT(readLines("data/invalid.wkt"))
rgeos::gIsValid(yy, byid=TRUE, reason=TRUE)
## 1 2
## "Ring Self-intersection[2 3]" "Valid Geometry"
## 3 4
## "Valid Geometry" "Valid Geometry"
sf::sf_extSoftVersion()
## GEOS GDAL proj.4 GDAL_with_GEOS USE_PROJ_H
## "3.7.2" "3.0.1" "6.2.0" "true" "true"
The same underlyng GEOS code is used in sf:
sf::st_is_valid(sf::st_as_sf(yy), reason=TRUE)
## [1] "Ring Self-intersection[2 3]" "Valid Geometry"
## [3] "Valid Geometry" "Valid Geometry"
The geometries were also invalid in GEOS 3.7.1, but the operations succeeded:
ply <- rgeos::readWKT(readLines("data/ply.wkt"))
oo <- try(rgeos::gIntersection(yy, ply, byid=TRUE, checkValidity=cV_old_default), silent=TRUE)
print(attr(oo, "condition")$message)
## [1] "TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 2 3 at 2 3"
ooo <- try(sf::st_intersection(sf::st_as_sf(yy), sf::st_as_sf(ply)), silent=TRUE)
print(attr(oo, "condition")$message)
## [1] "TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 2 3 at 2 3"
In rgeos 0.5-1 and GEOS 3.7.2, new warnings are provided, and advice to check validity.
cV_new_default <- ifelse(rgeos::version_GEOS0() >= "3.7.2", 1L, TRUE)
try(rgeos::gIntersection(yy, ply, byid=TRUE, checkValidity=cV_new_default), silent=TRUE)
## Warning in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid"): Ring Self-
## intersection at or near point 2 3
## yy is invalid
## Warning in rgeos::gIntersection(yy, ply, byid = TRUE, checkValidity
## = cV_new_default): Invalid objects found; consider using
## set_RGEOS_CheckValidity(2L)
New options are provided, get_RGEOS_CheckValidity()
and set_RGEOS_CheckValidity()
, because in some packages the use of topological operations may happen through other packages, such as raster::crop()
calling rgeos::gIntersection()
without access to the arguments of the latter function.
If we follow the advice, zero-width buffering is used to try to rectify the invalidity:
oo <- rgeos::gIntersection(yy, ply, byid=TRUE, checkValidity=2L)
## Warning in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid"): Ring Self-
## intersection at or near point 2 3
## yy is invalid
## Attempting to make yy valid by zero-width buffering
rgeos::gIsValid(oo)
## [1] TRUE
equivalently:
oo <- rgeos::gIntersection(rgeos::gBuffer(yy, byid=TRUE, width=0), ply, byid=TRUE, checkValidity=1L)
rgeos::gIsValid(oo)
## [1] TRUE
and by extension to sf until GEOS 3.7.2 is accommodated:
ooo <- sf::st_intersection(sf::st_buffer(sf::st_as_sf(yy), dist=0), sf::st_as_sf(ply))
all(sf::st_is_valid(ooo))
## [1] TRUE
The actual cause was the use of an ESRI/shapefile style/understanding of the self-touching exterior ring. In OGC style, an interior ring is required, but not in shapefile style. Martin Davis responded in the issue:
The problem turned out to be a noding robustness issue, which caused the valid input linework to have a self-touch after noding. This caused the output to be invalid. The fix was to tighten up the internal overlay noding validation check to catch this situation. This has the side-effect of detecting (and failing) all self-touches in input geometry. Previously, vertex-vertex self-touches were not detected, and in many cases they would simply propagate through the overlay algorithm. (This made the output invalid as well, but since the inputs were already invalid this behaviour was considered acceptable).
The change in GEOS behaviour was not planned as such, but has consequences, fortunately detected because CRAN checks by default much more than say Travis by default. Zero-width buffering will not repair all cases of invalidity, but does work here.
In the Soho cholera data set; I also converted the shapefiles from https://asdar-book.org/bundles2ed/die_bundle.zip to GPKG to be more modern (using ogr2ogr
in GDAL 3 built against PROJ 6). sf is installed using the proj.h
interface in PROJ 6:
buildings <- sf::st_read("data/buildings.gpkg", quiet=TRUE)
st_crs(buildings)
## Coordinate Reference System:
## No EPSG code
## proj4string: "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.999601 +x_0=400000 +y_0=-100000 +ellps=airy +units=m +no_defs"
To make an interactive display in mapview()
, conversion/transformation to “Web Mercator” is needed - this uses a WGS84 datum. But PROJ 6 has dropped the +datum=
tag, so the display is not correctly registered.
library(mapview)
mapview(buildings)
The CRS/SRS values in the GPKG file (it is a multi-table SQLite database) include the datum definition:
library(RSQLite)
db = dbConnect(SQLite(), dbname="data/buildings.gpkg")
dbReadTable(db, "gpkg_spatial_ref_sys")$definition[4]
## [1] "PROJCS[\"Transverse_Mercator\",GEOGCS[\"GCS_OSGB 1936\",DATUM[\"OSGB_1936\",SPHEROID[\"Airy 1830\",6377563.396,299.3249646,AUTHORITY[\"EPSG\",\"7001\"]],AUTHORITY[\"EPSG\",\"6277\"]],PRIMEM[\"Greenwich\",0],UNIT[\"Degree\",0.0174532925199433],AUTHORITY[\"EPSG\",\"4277\"]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",49],PARAMETER[\"central_meridian\",-2],PARAMETER[\"scale_factor\",0.999601],PARAMETER[\"false_easting\",400000],PARAMETER[\"false_northing\",-100000],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AXIS[\"Easting\",EAST],AXIS[\"Northing\",NORTH]]"
dbDisconnect(db)
Maybe using rgdal which is built using PROJ 6 but the legacy proj_api.h
interface, and the shapefile as shipped with ASDAR reproduction materials will help?
buildings1 <- rgdal::readOGR("data/buildings.shp", verbose=FALSE)
sp::proj4string(buildings1)
## [1] "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.999601 +x_0=400000 +y_0=-100000 +ellps=airy +units=m +no_defs"
No, same problem:
mapview(buildings1)
But the shapefile has the datum definition:
readLines("data/buildings.prj")
## [1] "PROJCS[\"Transverse_Mercator\",GEOGCS[\"GCS_OSGB 1936\",DATUM[\"D_OSGB_1936\",SPHEROID[\"Airy_1830\",6377563.396,299.3249646]],PRIMEM[\"Greenwich\",0],UNIT[\"Degree\",0.017453292519943295]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",49.00000000000002],PARAMETER[\"central_meridian\",-2],PARAMETER[\"scale_factor\",0.999601],PARAMETER[\"false_easting\",400000],PARAMETER[\"false_northing\",-100000],UNIT[\"Meter\",1]]"
So in both cases with PROJ 6, we need to manipulate the CRS read in with the file to insert our choice of how to make the transformation, because the definition as read no longer contains it:
fixed <- "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +nadgrids=OSTN15_NTv2_OSGBtoETRS.gsb +units=m +no_defs"
st_crs(buildings) <- fixed
sp::proj4string(buildings1) <- sp::CRS(fixed)
mapview(buildings)
mapview(buildings1)
Once S3 permitted extension by writing functions, and packaging functions in libraries, S and R ceased to be monolithic
In R, a library is where packages are kept, distinguishing between base and recommended packages distributed with R, and contributed packages
Contributed packages can be installed from CRAN (infrastructure built on CPAN and CTAN for Perl and Tex), Bioconductor, other package repositories, and other sources such as github
With over 14500 contributed packages, CRAN is central to the R community, but is stressed by dependency issues (CRAN is not run by R core)
Andrie de Vries Finding clusters of CRAN packages using igraph looked at CRAN package clusters from a page rank graph
Here we will be choosing the packages listed in the Spatial and SpatioTemporal task views as contained in the ctv package, and their pagerank relationships
cran <- available.packages()
library(ctv)
obj1 <- read.ctv(system.file("ctv/Spatial.ctv", package="ctv"))
obj2 <- read.ctv(system.file("ctv/SpatioTemporal.ctv", package="ctv"))
sp_ctv_pkgs <- sort(unique(c(obj1$packagelist$name, obj2$packagelist$name)))
pdb <- cran[cran[, "Package"] %in% sp_ctv_pkgs, ]
suppressPackageStartupMessages(library(miniCRAN))
suppressPackageStartupMessages(library(igraph))
suppressPackageStartupMessages(library(magrittr))
pg <- makeDepGraph(pdb[, "Package"], availPkgs = cran, suggests=TRUE, enhances=TRUE, includeBasePkgs = FALSE)
pr <- pg %>%
page.rank(directed = FALSE) %>%
use_series("vector") %>%
sort(decreasing = TRUE) %>%
as.matrix %>%
set_colnames("page.rank")
print(pr[1:30,], digits=4)
## Rcpp sp MASS knitr maptools
## 0.025042 0.018962 0.011704 0.009286 0.008289
## lattice raster rgdal testthat spdep
## 0.008224 0.007997 0.007912 0.007706 0.007587
## sf Matrix spatstat surveillance rgeos
## 0.007581 0.007314 0.007234 0.006617 0.006316
## ggplot2 plm lme4 plotKML RColorBrewer
## 0.006186 0.005782 0.005354 0.005183 0.005108
## rmarkdown magrittr rgl stplanr RandomFields
## 0.005094 0.004956 0.004764 0.004668 0.004605
## gstat jsonlite pkgdown spaMM spacetime
## 0.004551 0.004512 0.004421 0.004267 0.004224
topClusters <- table(cl$membership) %>%
sort(decreasing = TRUE) %>%
head(25)
cluster <- function(i, clusters, pagerank, n=10){
group <- clusters$names[clusters$membership == i]
pagerank[group, ] %>% sort(decreasing = TRUE) %>% head(n)
}
z <- lapply(names(topClusters)[1:15], cluster, clusters=cl, pagerank=pr, n=20)
z[[1]][1:20]
## sp knitr maptools lattice raster
## 0.018961701 0.009286065 0.008289421 0.008223655 0.007996581
## rgdal testthat sf rgeos ggplot2
## 0.007911978 0.007706242 0.007581470 0.006315732 0.006185982
## plotKML RColorBrewer rmarkdown stplanr gstat
## 0.005182947 0.005107896 0.005094181 0.004667842 0.004550959
## spacetime stringr Hmisc mapview igraph
## 0.004224012 0.003855975 0.003654207 0.003606242 0.003573066
z[[2]][1:20]
## MASS spdep Matrix spatstat surveillance
## 0.011703869 0.007586704 0.007313960 0.007234205 0.006617480
## RandomFields nlme foreach sperrorest CARBayes
## 0.004604721 0.003584424 0.003286419 0.003216616 0.002860922
## geostatsp FLightR car biomod2 CARBayesST
## 0.002850512 0.002847370 0.002710488 0.002653262 0.002645306
## mgcv georob boot lgcp geoR
## 0.002580343 0.002522789 0.002427513 0.002255335 0.002160760
z[[3]][1:20]
## magrittr jsonlite pkgdown R6 httr digest
## 0.004955838 0.004512034 0.004421034 0.003471628 0.003464246 0.003267953
## usethis geojsonio roxygen2 covr curl lintr
## 0.002763474 0.002379277 0.002355452 0.002225590 0.001975295 0.001901108
## geojson xml2 RNeXML osmdata crul rstudioapi
## 0.001873022 0.001813405 0.001775832 0.001736993 0.001583010 0.001508990
## rmapshaper rsconnect
## 0.001381217 0.001356919
z[[4]][1:20]
## Rcpp rgl ctmm RcppArmadillo BH
## 0.0250415771 0.0047642536 0.0036110230 0.0021795803 0.0020143644
## RcppEigen dbmss dodgr smam smerc
## 0.0017826835 0.0017406331 0.0017066628 0.0016736023 0.0016242956
## pbapply RSpectra SpatialEpi smacpod Gmedian
## 0.0014511915 0.0012763535 0.0011686080 0.0011488122 0.0011345323
## svglite spsann minqa ngspatial RcppParallel
## 0.0010568585 0.0010399338 0.0010203580 0.0010022323 0.0009753075
library(RColorBrewer)
library(wordcloud)
opar <- par(mar=c(0,0,0,0)+0.1, mfrow=c(1,2))
for (i in 1:2) wordcloud(names(z[[i]]), freq=unname(z[[i]]), scale=rev(8*range(unname(z[[i]]))/max(unname(z[[4]]))))
par(opar)
opar <- par(mar=c(0,0,0,0)+0.1, mfrow=c(1,2))
for (i in 3:4) wordcloud(names(z[[i]]), freq=unname(z[[i]]), scale=rev(8*range(unname(z[[i]]))/max(unname(z[[4]]))))
par(opar)
R itself is very varied in the way that approaches have established niches through time
sp
classes date back to the same time as Bioconductor, which has a fairly definite S4 preference
Communities develop and change over time, and continue to do so, with simultaneous differences in preferences
so what comes after tidyverse and “data science”? Should we be preparing already?
Abelson, Harold, and Gerald Jay Sussman. 1996. Structure and Interpretation of Computer Programs. Boston, MA: MIT Press.
Bao, Shuming, Luc Anselin, Doug Martin, and Diana Stralberg. 2000. “Seamless Integration of Spatial Statistics and GIS: The S-PLUS for ArcView and the S+Grassland Links.” Journal of Geographical Systems 2 (2): 287–306.
Becker, R.A., and J.M. Chambers. 1984. S: An Interactive Environment for Data Analysis and Graphics. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole.
———. 1985. Extending the S System. Pacific Grove, CA, USA: Wadsworth & Brooks/Cole.
Becker, Richard A., John M. Chambers, and Allan R. Wilks. 1988. The New S Language. London: Chapman & Hall.
Bivand, Roger. 2000. “Using the R Statistical Data Analysis Language on GRASS 5.0 GIS Database Files.” Computers & Geosciences 26 (9): 1043–52. https://doi.org/https://doi.org/10.1016/S0098-3004(00)00057-1.
———. 2002. “Spatial Econometrics Functions in R: Classes and Methods.” Journal of Geographical Systems 4 (4): 405–21. https://doi.org/10.1007/s101090300096.
———. 2006. “Implementing Spatial Data Analysis Software Tools in R.” Geographical Analysis 38 (1): 23–40. https://doi.org/10.1111/j.0016-7363.2005.00672.x.
———. 2014. “Geocomputation and Open Source Software:Components and Software Stacks.” In Geocomputation, edited by Robert J. Abrahart and Linda M. See, 329–55. Boca Raton: CRC Press. http://hdl.handle.net/11250/163358.
Bivand, Roger, and Albrecht Gebhardt. 2000. “Implementing Functions for Spatial Statistical Analysis Using the R Language.” Journal of Geographical Systems 2 (3): 307–17. https://doi.org/10.1007/PL00011460.
Bivand, Roger, and Markus Neteler. 2000. Open Source Geocomputation: Using the R Data Analysis Language Integrated with GRASS GIS and PostgreSQL Data Base Systems. http://www.geocomputation.org/2000/GC009/Gc009.htm.
Chambers, John M. 1998. Programming with Data. New York: Springer.
———. 2016. Extending R. Boca Raton: Chapman & Hall.
Chambers, John M., and Trevor J. Hastie. 1992. Statistical Models in S. London: Chapman & Hall.
Evers, Kristian, and Thomas Knudsen. 2017. Transformation Pipelines for Proj.4. https://www.fig.net/resources/proceedings/fig_proceedings/fig2017/papers/iss6b/ISS6B_evers_knudsen_9156.pdf.
Gómez-Rubio, V., J. Ferrándiz-Ferragud, and A. López-Quílez. 2005. “Detecting Clusters of Disease with R.” Journal of Geographical Systems 7: 189–206.
Gómez-Rubio, V., and A. López-Quílez. 2005. “RArcInfo: Using GIS Data with R.” Computers and Geosciences 31: 1000–1006.
Ihaka, Ross, and Robert Gentleman. 1996. “R: A Language for Data Analysis and Graphics.” Journal of Computational and Graphical Statistics 5 (3): 299–314. https://doi.org/10.1080/10618600.1996.10474713.
Kernighan, Brian W., and Rob Pike. 1984. The UNIX Programming Environment. Englewood Cliffs, N. J.: Prentice-Hall.
———. 1999. The Practice of Programming. Reading, Mass.: Addison-Wesley.
Kernighan, Brian W., and P. J. Plauger. 1976. Software Tools. Reading, Mass.: Addison-Wesley.
Knudsen, Thomas, and Kristian Evers. 2017. Transformation Pipelines for Proj.4. https://meetingorganizer.copernicus.org/EGU2017/EGU2017-8050.pdf.
Kralidis, Athanasios Tom. 2008. “Geospatial Open Source and Open Standards Convergences.” In Open Source Approaches in Spatial Data Handling, edited by G. B. Hall and M. Leahy, 1–20. Berlin: Springer-Verlag.
McIhagga, David. 2008. “Communities of Practice and the Business of Open Sourceweb Mapping.” In Open Source Approaches in Spatial Data Handling, edited by G. B. Hall and M. Leahy, 49–64. Berlin: Springer-Verlag.
McIlroy, M. D. 1969. “’Mass produced’ software components.” In Software Engineering, edited by Peter Naur and Brian Randell, 79–87. Brussels: Conference on Software Engineering, NATO Science Committee; NATO Scientific Affairs Division.
Neteler, Markus, M. Hamish Bowman, Martin Landa, and Markus Metz. n.d. “GRASS GIS: A multi-purpose open source GIS.” Environmental Modelling & Software 31 (1): 124–30.
Neteler, M., D.E. Beaudette, P. Cavallini, L. Lami, and J. Cepicky. 2008. “GRASS GIS.” In Open Source Approaches in Spatial Data Handling, edited by G. B. Hall and M. Leahy, 171–99. Berlin: Springer-Verlag.
Pebesma, Edzer J. 2004. “Multivariable Geostatistics in S: The Gstat Package.” Computers and Geosciences 30: 683–91.
Pebesma, E. J., and R. S. Bivand. 2005. “Classes and Methods for Spatial Data in R.” R News 5 (2): 9–13.
Pebesma, E. J., and C. G. Wesseling. 1998. “Gstat, a Program for Geostatistical Modelling, Prediction and Simulation.” Computers and Geosciences 24: 17–31.
Rowlingson, B., and P. J. Diggle. 1993. “Splancs: Spatial Point Pattern Analysis Code in S-Plus.” Computers and Geosciences 19: 627–55.
Steiniger, Stefan, and Andrew J. S. Hunter. 2012. “Free and Open Source GIS Software for Building a Spatial Data Infrastructure.” In Geospatial Free and Open Source Software in the 21st Century, edited by Erwan Bocher and Markus Neteler, 247–61. Lecture Notes in Geoinformation and Cartography. Berlin Heidelberg: Springer.
Tierney, Luke. 1990. LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. Wiley, New York, NY.
———. 1996. “Recent Developments and Future Directions in Lisp-Stat.” Journal of Computational and Graphical Statistics 5 (3): 250–62.
———. 2005. “Some Notes on the Past and Future of Lisp-Stat.” Journal of Statistical Software 13 (9): 1–15. https://doi.org/10.18637/jss.v013.i09.
Venables, William N., and Brian D. Ripley. 2000. S Programming. New York: Springer. http://www.stats.ox.ac.uk/pub/MASS3/Sprog/.
Wickham, Hadley. 2014. Advanced R. Boca Raton, FL: Chapman & Hall. http://adv-r.had.co.nz/.