# Clusters

The pros and cons of different analysis techniques are discussed in detail in the review of home range analyses. For a more comprehensive review, see "A Manual for Wildlife Radio Tagging" (Kenward 2001) and Kenward et al. 2001.

## Neighbour-linkage methods

Peripheral convex polygons were the earliest approach to range analysis. However, their size is strongly influenced by outlying locations and can include large areas not visited by animals. One way of addressing this was to use outlines based on density of locations, first as ellipses and then as contours. However, these too are outlier-sensitive and their parametric smoothing is prone to expand into unvisited areas. Another approach was to attribute a grid cell, with dimensions based on tracking accuracy, to locations at each point. However, this approach underestimated the areas visited unless either there were very large numbers of locations available, or "joining rules" were used to link grid cells. Joining rules that link neighbouring locations can estimate polygons that define visited areas with great accuracy, estimating range cores as polygons to which isolated grid cells at outlying locations contribute little area.

An early neighbour-linkage method involved restricting polygon edges plotted between locations to a fraction (initially half) of the span, i.e. the maximum distance between any two locations, which gave "concave" polygons. However, the span remains strongly influenced by outliers and the fraction was an arbitrary decision. A subsequent method defined clusters of locations by minimising sums of nearest-neighbour distances as locations with longer distances are added. The analysis starts the first cluster by identifying the two locations that are closest together and have the nearest 3rd location (i.e. the minimal sum of linkage distances). It then finds the location nearest to one in this initial cluster. If this distance is less than the distance to the 3rd location in any other potential cluster, the 4th location joins the original cluster. If not, a new cluster forms. If two clusters have nearest neighbours at equal distances, the location that joins is the one that minimises the distance to all locations in the cluster (i.e. a centroid rule resolves ties). If the nearest neighbour is already assigned to another cluster, the two clusters join. When the required percentage of locations has been assigned, a polygon (which was initially convex but can also be concave) is drawn around each cluster and the polygon areas are summed (Kenward 1987).
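The general idea of neighbour-linkage clustering can be illustrated with a short Python sketch. This is a deliberately simplified single-linkage variant and not the Ranges implementation: it adds the shortest links first, lets a cluster start from two locations rather than three, and omits the centroid tie-break. The function name `linkage_clusters` and its parameters are hypothetical.

```python
from itertools import combinations
from math import dist


def linkage_clusters(points, percent):
    """Cluster locations by adding the shortest links first until
    `percent` of the locations have been assigned (simplified sketch)."""
    n = len(points)
    target = round(n * percent / 100)  # number of locations to assign
    parent = list(range(n))            # union-find forest over location indices

    def find(i):
        # Follow parent pointers to the representative of i's cluster.
        while parent[i] != i:
            i = parent[i]
        return i

    # All pairwise links, shortest first.
    links = sorted(combinations(range(n), 2),
                   key=lambda ij: dist(points[ij[0]], points[ij[1]]))
    assigned = set()
    for i, j in links:
        if len(assigned) >= target:
            break
        new = {i, j} - assigned
        if len(assigned) + len(new) > target:
            continue  # this link would overshoot the requested percentage
        assigned |= new
        parent[find(i)] = find(j)  # join the two clusters (no-op if same)

    clusters = {}
    for i in sorted(assigned):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


locations = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (50, 50)]
cores_85 = linkage_clusters(locations, 85)    # two separate nuclei, outlier left out
cores_100 = linkage_clusters(locations, 100)  # everything fuses into one cluster
```

With these made-up coordinates, the 85% core keeps the two tight groups separate and leaves out the distant location, whereas including all locations fuses everything into one cluster.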

Four additions to the original Cluster Polygon implementation are available in Ranges: Objective cores, concave polygons, alternative joining rules and the ability to construct a single inclusive polygon.

On the results screens, statistics include the number of range nuclei for each % polygon. Outliers tend to have a stronger effect on areas than in other analyses, which makes utilisation plots very suitable for identifying range cores by inspection. Inspection and objective coring typically indicate that up to 15% of locations are used for excursive activity, so that 85% polygons often provide convenient cluster core boundaries. Core clusters often remain separate at this point, whereas separation distances that are less than outlier distances cause fusion when all the locations are included.

These are *.csv* files with column headers. Double-click a file to open it in Microsoft Excel, or import it into an alternative spreadsheet.

## Selected cores

This option allows you to examine range structure and to define core areas. By excluding outlying locations the edges enclose areas most used by the animal. See the introduction to Location Analyses for more details.

You can choose one or more values for the percentage of locations or of location density to be included. Type them in ascending order, separated by either spaces or commas.

In the **Run Specifications** column you can specify an output file for range areas and statistics. The estimates are in column format, suitable for spreadsheets. Each row has the 7 range variables, followed by the X,Y coordinates of the range centre, then 5 range statistics, then as many areas as there were core percentages. The structure statistics that follow each area estimate are the number of nuclei in the core, its partial area (the sum of the areas of the separate polygons divided by the area of a single polygon round all the clusters), Simpson’s Index for the diversity of the number of locations across clusters, and Simpson’s Index for the diversity of area across clusters. This is a .csv file with column headers. Double-click it to open it in Microsoft Excel, or import it into an alternative spreadsheet.
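As a rough illustration of the structure statistics, the Python sketch below computes them from per-cluster polygon areas and location counts. The text does not say which form of Simpson's Index Ranges reports, so the reciprocal form 1 / Σp² is an assumption here, and the function names are hypothetical.

```python
def simpson_diversity(values):
    """Reciprocal Simpson's index, 1 / sum(p_i^2).

    ASSUMPTION: the reciprocal form is used for illustration only;
    Ranges may report a different form of the index.
    """
    total = sum(values)
    return 1.0 / sum((v / total) ** 2 for v in values)


def structure_statistics(cluster_areas, cluster_counts, inclusive_area):
    """Summarise the structure of one core.

    cluster_areas  -- area of the polygon round each cluster
    cluster_counts -- number of locations in each cluster
    inclusive_area -- area of a single polygon round all the clusters
    """
    return {
        "nuclei": len(cluster_areas),
        "partial_area": sum(cluster_areas) / inclusive_area,
        "location_diversity": simpson_diversity(cluster_counts),
        "area_diversity": simpson_diversity(cluster_areas),
    }


# Two equal clusters that together fill half of the inclusive polygon.
stats = structure_statistics([2.0, 2.0], [10, 10], 8.0)
```

Under the reciprocal form, two clusters of equal size give a diversity of 2, and the value falls towards 1 as one cluster comes to dominate.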

## Cores at 5% intervals

This option provides plots that help you decide which locations are part of a core and which are outliers. You can choose to save both edge (polygon) and utilisation files. The cores are saved at 5% intervals from 20% to 100%, a total of 17 sets.

Utilisation files can be plotted in Input & Graphics.

## Objective cores

Rather than choosing a particular core size, it is more scientifically rigorous to have an objective core calculated from the distribution of the locations. The distribution of nearest-neighbour distances can be used to detect and exclude outlying locations (Kenward et al. 2001), resulting in an objective core. The ways in which outliers are excluded are discussed below. Objective coring sometimes estimates core areas larger than those from an equivalent number of locations in the standard analysis. This is because the standard approach estimates polygons as soon as the required percentage of locations is included, whereas objective coring continues to merge clusters that are separated by less than the exclusion distance. In these cases, inclusive polygons give the same result for both methods.

## Incremental area analysis

Incremental area analysis is used to answer the question "how many locations do I need to estimate a home range?" Starting with the first three locations (the minimum needed to estimate a polygon area without a boundary strip), the new area is estimated as each location is added. This permits the consecutive areas, which tend to increase initially as the animal is observed using different parts of its range, to be plotted against number of locations until there is evidence of stability, which indicates that adding further locations will not improve the home range estimate. The default is to plot the edge round all the locations that have been added, but it is also possible to choose a single, smaller core. The consecutive area estimates have to be saved to an output file, so that the result can be examined using Input & Graphics.
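The incremental procedure can be sketched in Python as follows, assuming that the edge round all the locations is a convex polygon. The hull and area helpers are standard textbook routines (monotone chain and the shoelace formula), not code from Ranges, and all function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull by Andrew's monotone chain, vertices anticlockwise."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # last point starts the other half

    return half(pts) + half(reversed(pts))


def polygon_area(vertices):
    """Polygon area by the shoelace formula (0 for fewer than 3 vertices)."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(twice) / 2


def incremental_areas(locations):
    """Area round the first 3, 4, ... locations, in the order recorded."""
    return [polygon_area(convex_hull(locations[:k]))
            for k in range(3, len(locations) + 1)]


track = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2), (1, 3)]
areas = incremental_areas(track)
```

For this made-up track, the area doubles when the fourth location is added and is unchanged by the last two, which is the kind of plateau the plot is meant to reveal.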

## Convex or concave cluster polygons

Concave polygons are now offered as well as convex polygons. The edge restriction of concave polygons can be based on a fraction of the span of each cluster or have the edge-restriction fraction adapted to each individual cluster, ranging from 0.2 (i.e. strongly restricted) for a cluster of all the locations to 1 (i.e. convex) for small clusters. The concave polygon option can be set when animals use strips of non-linear habitat, and prevents the (very rare) overlap of a small cluster within the limits of a larger, curved cluster.

## Outlier exclusion

If objective cores are selected, exclusion can be of locations in the largest 5% of the nearest-neighbour distance distribution (by analogy with plotting contours or ellipses to 95% of the density distribution), which is the Ranges default. Alternatively, an iterative process excludes the location with the most extreme linkage distance if it is beyond 1%, 0.5% or 0.1% of the distribution estimated by the remainder, and repeats this process until all distances are within the chosen alpha-level on a normal distribution. The 0.1% alpha level excludes only the most extreme outliers. The display shows the Outlier Exclusion Distance (OED) beyond which locations are excluded. In cluster analyses, polygons then plot round clusters with no nearest-neighbour locations beyond this distance. For objective-restricted edges, the exclusion distance has a strip added equivalent to the resolution distances between locations.
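The iterative option might be sketched in Python as below. Several details are assumptions made for illustration: the test is one-tailed on the upper tail of a normal distribution fitted to the remaining distances, the linkage distances are not recomputed after each exclusion, and at least three locations are always kept. The function name `iterative_exclusion` is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def iterative_exclusion(distances, alpha=0.01):
    """Repeatedly drop the location with the largest linkage distance while
    that distance lies beyond the upper `alpha` tail of a normal distribution
    estimated from the remaining distances.

    Returns (kept_indices, excluded_indices).
    ASSUMPTIONS: one-tailed test; distances are not recomputed between steps.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # critical value, e.g. about 2.33 for 1%
    kept = sorted(range(len(distances)), key=distances.__getitem__)
    excluded = []
    while len(kept) > 3:
        candidate = kept[-1]                      # most extreme remaining distance
        rest = [distances[i] for i in kept[:-1]]  # distribution of the remainder
        limit = mean(rest) + z * stdev(rest)
        if distances[candidate] <= limit:
            break                                 # all distances now within the limit
        excluded.append(kept.pop())
    return kept, excluded


# Hypothetical nearest-neighbour distances; index 6 is an obvious outlier.
nn_distances = [1.0, 1.1, 0.9, 1.1, 1.0, 0.9, 9.0]
kept, excluded = iterative_exclusion(nn_distances, alpha=0.01)
```

A smaller `alpha` raises the critical value, so fewer locations are excluded, in line with the note that the 0.1% level removes only the most extreme outliers.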

## Objective-restricted-edge polygons

Although incremental cluster analysis conveniently defines groups of core locations separated from outliers, plotting outlines round clusters can be problematic. Convex polygons around separate clusters occasionally overlap (e.g. if a small cluster occurs within the horns of a crescent-shaped cluster) and simple concave solutions to that overlap problem use arbitrary or subjective edge distances. A better solution uses outlier exclusion distances to define Objective-Restricted-Edge Polygons.

OREPs are equivalent to the polygons in cluster analysis when outliers are excluded and the core of locations is defined by an Outlier Exclusion Distance (OED). However, instead of first identifying clusters of locations with the minimal sum of nearest-neighbour distances (which is slow to compute for many locations), OREPs are plotted immediately as concave polygons with an edge restriction based on the OED. OREPs have three advantages over clustering, namely (i) simplicity (hence speed), (ii) polygons cannot overlap and (iii) habitat at all locations is included (because a single grid cell is attributed to outliers beyond the OED).

Polygon edge distances can be based either on the distribution of Nearest-Neighbour Exclusion Distances (NNED), as used for objective coring in cluster analysis, or on the distribution of mean distances from each location to all others as estimated in kernel analyses. The Kernel Exclusion Distance (KED) is our default, because it (a) gives more normal distributions than nearest neighbour distances, (b) gives smoother outlines than NNED especially for small samples and (c) is analogous to exclusion of outlier locations to prevent their excessive influence on contours (reference).
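The two distance distributions can be computed as in the Python sketch below. It only shows the per-location distances; how Ranges turns each distribution into an exclusion distance is not reproduced here, and the function names are hypothetical.

```python
from math import dist
from statistics import mean


def nearest_neighbour_distances(points):
    """Distance from each location to its closest other location."""
    return [min(dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def mean_distances_to_others(points):
    """Mean distance from each location to all other locations."""
    return [mean(dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


locations = [(0, 0), (3, 4), (6, 8)]
nn = nearest_neighbour_distances(locations)
md = mean_distances_to_others(locations)
```

In this three-location example every nearest-neighbour distance is 5, whereas the mean distances separate the two end locations (7.5) from the middle one (5).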

Range cores defined by OREPs are equivalent both to cluster analysis with objective coring (Kenward et al. 2001), but without risk of polygon overlap, and also to the concave hulls derived from neighbour distances (Getz & Wilmers 2004), but with an objective choice of the edge-restriction distance. With kernel-based outlier exclusion distances, OREPs unify range analyses based on grid-cell, polygon and location density techniques.

## Joining priority

The third addition offers the centroid rule, of joining locations to clusters when all linkage distances are minimal, as a priority over the nearest-neighbour rule, which is then used only as a tie-breaker. Centroid priority suppresses chaining along linear habitats and is thus less appropriate than the nearest-neighbour priority for species (probably most species) that minimise their travel distances.

## Separate cluster polygons or Single inclusive polygon

The fourth addition is the option of plotting a single polygon round all the clusters. This excludes locations that are outliers to the main core but includes those between the clusters, which probably represent times when animals were detected in transit between clusters rather than making true excursions. This single polygon, called a "usual area" by Johnstone (1994), may provide a better estimate of a core territory.