sf is the R package for spatial analysis that will gradually replace the well-known sp package. The project has gained a lot of traction lately: three vignettes document the package, and the source code is available on CRAN and GitHub.

My impressions so far:

Importing data with sf is significantly faster. For example, importing a national-scale road network of 4,479,738 line features for Great Britain with rgdal takes 86 times longer and creates a dataset 2.5 times larger than importing it with sf.
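A minimal sketch of how such a comparison might be timed. It uses the small North Carolina shapefile bundled with sf as a stand-in for the road network (an assumption, so that the example is self-contained), which means the timings will not reproduce the figures quoted above. The rgdal call is left as a comment because it needs rgdal installed.

```r
library(sf)

# Small sample shapefile shipped with sf; a stand-in for the road network.
shp <- system.file("shape/nc.shp", package = "sf")

# Time the sf import; quiet = TRUE suppresses the driver summary.
t_sf <- system.time(nc <- st_read(shp, quiet = TRUE))

# The sp-era equivalent would be along these lines (requires rgdal):
# t_sp <- system.time(nc_sp <- rgdal::readOGR(dirname(shp), "nc"))

print(t_sf["elapsed"])                # import time in seconds
print(object.size(nc), units = "Kb")  # in-memory size of the sf object
```

On a file this small the difference is negligible; the gap only becomes visible with datasets of several hundred thousand features or more.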

I like the simplicity of the object representation. With sf, vector objects are stored in a data frame and are represented by a single class ‘sf’. Compare that to sp where the data are stored in different S4 Spatial* objects and are represented by a different class depending on the geometry type.
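To illustrate, here is a small sketch that builds an sf object by hand and inspects it. The object name roads and its attribute values are made up for the example.

```r
library(sf)

# Two hypothetical line features with one attribute column.
roads <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(1, 1))),
    st_linestring(rbind(c(1, 1), c(2, 0)))
  )
)

class(roads)          # the 'sf' class sits on top of 'data.frame'
is.data.frame(roads)  # so ordinary data frame tools apply

# Plain data frame subsetting works and keeps the geometry attached.
roads[roads$name == "a", ]

# In sp the same data would be a SpatialLinesDataFrame, while points
# would need a SpatialPointsDataFrame, polygons a SpatialPolygonsDataFrame.
```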

I like the simplicity of the geometry representation. With sf the geometries are stored in a column of class ‘sfc’, each individual feature (e.g. point, line) is of class ‘sfg’, and the column is named geometry by default (some data sources give it another name, such as geom). With sp, by contrast, both the class of the geometry object and the name of the slot containing the geometry differ depending on the geometry type.
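A short sketch of that class hierarchy, using two made-up points. The sf_column attribute records which column holds the active geometry.

```r
library(sf)

pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(1, 2)), st_point(c(3, 4)))
)

geom <- st_geometry(pts)  # the geometry column as a whole

class(geom)             # includes "sfc"
class(geom[[1]])        # a single feature, includes "sfg"
attr(pts, "sf_column")  # name of the active geometry column
```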

It is easier to access the underlying points of the geometries within sf objects. The function st_as_text gives you a human-readable representation of the geometry, and because each geometry is stored as a plain numeric vector or matrix, you can also rbind and cbind your way through the geometry column to extract the coordinates as a matrix.
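A minimal sketch of both routes, using two made-up points. st_coordinates is sf's own helper for the same job and is included only for comparison.

```r
library(sf)

pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(1, 2)), st_point(c(3, 4)))
)
geom <- st_geometry(pts)

# Human-readable (well-known text) representation, one string per feature.
wkt <- st_as_text(geom)

# Each point is a numeric vector underneath its classes, so stripping
# the classes and row-binding yields a plain coordinate matrix.
coords <- do.call(rbind, lapply(geom, unclass))

# The built-in alternative, which also handles lines and polygons.
coords_sf <- st_coordinates(geom)

print(wkt)
print(coords)
```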

Because the spatial object is a data frame, quite a few possibilities open up. First, since the geometry is just a field in the data frame, nothing stops you from adding more geometry fields. In addition, paired with the data.table package, you could manipulate the attribute and geometry fields by reference and so avoid making copies of the data frame. The catch is that coercing the data frame to a data.table removes the ‘sf’ class. Nevertheless, the geometry column keeps its ‘sfc’ class, so you can still pass it to functions that accept ‘sfc’ objects.
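A rough sketch of both ideas, assuming the data.table package is installed. The column names buffered and x are made up for the example, and the points carry no coordinate reference system, so the buffer distance is in plain planar units.

```r
library(sf)
library(data.table)

pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(3, 4)))
)

# Add a second geometry column; 'geometry' remains the active one.
pts$buffered <- st_buffer(st_geometry(pts), dist = 1)

# Switch the active geometry to the new column on a copy.
pts2 <- pts
st_geometry(pts2) <- "buffered"

# Coercing to data.table drops the 'sf' class from the table...
dt <- as.data.table(pts)
class(dt)

# ...but the geometry column is still 'sfc', so functions that accept
# 'sfc' input keep working, here inside an update by reference.
dt[, x := st_coordinates(geometry)[, "X"]]
class(dt$geometry)
```

Once the ‘sf’ class is gone, functions that expect a full sf object no longer recognise the table. Converting back with st_as_sf(dt) would be one way to restore it.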

All in all, sf looks quite promising. I’d like to see better support for data.table (a spatial extension of data.table with support for spatial indexes, maybe?) and perhaps a return to the camelCase naming convention for functions, as in package sp.