Macroecology

Topics

  1. Introduction to Macroecology
  2. Data Format
  3. Defaults
  4. Options
  5. Output
  6. Caveats
  7. Macroecology Tutorial
  8. Literature Cited


1. Introduction to Macroecology

Macroecology is the study of the partitioning of physical space and ecological resources by species (Brown and Maurer 1989, Brown 1995). Macroecological studies often consist of the analysis of species-level traits, such as body size, area of geographic range, and average abundance, measured at large spatial scales. Typically, investigators plot these data in bivariate scatter plots and then begin a search for patterns and underlying biological explanations.

For example, Figure 1 illustrates the relationship between geographic range area and body size of 46 species of midwestern fishes studied in the Cimarron River, Oklahoma. Each point in this graph represents a single species. This data set is available to you in the Tutorial Data Sets folder (Midwest fishes.txt) and will be used to illustrate the features of EcoSim's macroecology module.
[Image: Body size vs. geographic range area]
Figure 1. Relationship between body size (= standard length) and area of geographic range for 46 species of midwestern fishes. Data from Gotelli and Taylor (1999a). The dashed line indicates a potential boundary, suggesting that species with large geographic ranges and large body sizes are perhaps uncommon.

There appear to be relatively few points in the upper right-hand corner of the plot. In other words, species with large geographic ranges tend not to be large in body size. Or do they? Suppose there were no ecological or evolutionary constraints on range size and body size. What would the data in Figure 1 look like? EcoSim provides you with a number of simulation tools for answering this question. Although macroecology emphasizes the holistic nature of such data sets, we think it is essential that patterns such as Figure 1 be tested against an explicit null hypothesis (Blackburn et al. 1990, Enquist et al. 1995).

As in all EcoSim analyses, we can simulate patterns such as Figure 1 by randomizing an original data set. In the context of macroecology, these randomizations assume that, in a null community, the traits of species are independent of one another (Blackburn and Gaston 1998). But in reality, closely related species have similar traits by common ancestry, and they may not represent independent data points for the purposes of statistical analysis. The comparative method (Harvey and Pagel 1991) seeks to address these problems by mapping species traits onto phylogenetic trees and then using methods such as phylogenetic regression (Garland et al. 1993) to adjust for non-independence.

However, there are some limitations to the phylogenetic approach. The most serious is that good phylogenies are still not available for many taxa, although this is changing rapidly with the widespread availability of good molecular sequence data. Phylogenetic analysis also rests on assumptions about the mode of character evolution, which may be difficult to justify for many ecological characters. Finally, phylogenetic "corrections" may not uniquely remove the historical factors that can lead to ecological correlations, and could even remove some of the pattern we're looking for. On a more practical level, the results of many phylogenetic analyses may be similar to those of analyses that ignore the non-independence of species (Ricklefs and Starck 1996). All of the tests provided in this module assume the statistical independence of species.

Although the tests in this module are illustrated with macroecology data, the analytical problems are much broader than this. Other authors have studied the problem of detecting patterns in two-dimensional graphs when there is an "upper bound" or "factor ceiling" in operation. Conventional tests such as linear regression may not always reveal these boundaries or ceilings. Sophisticated alternatives include polynomial regression (Blackburn et al. 1990), quantile regression (Blackburn et al. 1992; Scharf et al. 1998; Cade et al. 1999), path analysis (Thomson et al. 1996), and other techniques (Garvey et al. 1998). The tests we present are applicable to the general problem of detecting non-random patterns in bivariate scatterplots of data (Figure 1).

2. Data Format

The input for macroecology analyses is a matrix of species-level macroecology traits. Each row is a different species and each column is a different variable, such as body size, geographic range area, or population abundance. Each entry in the matrix represents the measurement of a macroecological variable for a particular species.

To keep things simple in two-dimensional space, we have restricted this module to the analysis of non-negative real numbers. This means that all of the data points, when plotted, will fall in the upper right-hand quadrant of Cartesian coordinate space. EcoSim does not carry out transformations, so if some of your data are negative (say, from a logarithmic transformation), be sure to add a constant so that all the values are positive. Although EcoSim will accept a large input matrix, only two columns of data are analyzed at a time. EcoSim lets you specify which column represents the x-variable and which column represents the y-variable.
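
As a minimal sketch of the shift described above (plain Python, not part of EcoSim; the function name shift_to_positive and the margin of 1.0 are illustrative assumptions), log-transformed values can be made positive before the file is imported:

```python
import math

def shift_to_positive(values, margin=1.0):
    """Add one constant to every value so that the smallest becomes `margin` (> 0)."""
    lowest = min(values)
    if lowest > 0:
        return list(values)  # already positive, nothing to do
    constant = margin - lowest
    return [v + constant for v in values]

# Hypothetical abundances; log10 of values below 1 is negative
log_abundance = [math.log10(a) for a in (0.02, 0.5, 3.0, 40.0)]
shifted = shift_to_positive(log_abundance)
```

Because the same constant is added to every value, the spacing between species is unchanged; only the location of the axis moves.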

As in all EcoSim modules, the first column is reserved for species names, and the first row is reserved for site names. See Importing Data for restrictions on species and site names.

3. Defaults

EcoSim generates 1000 random matrices as the default. The default pattern for the shape test is a left triangle, and the default boundary for the boundary test is upper right. For the shape test, the default data distribution is symmetric, which means that the shapes are symmetric within the observed data limits. For all the tests, the default constraint is data defined, which means that the randomization is carried out by reshuffling the observed x and y values. EcoSim uses the first two data columns of your data set (columns 2 and 3, because column 1 holds the species names) as the default x and y variables.

4. Options

X variable, Y variable

You must specify which column represents the x variable and which column represents the y variable. In case you try to be cute, EcoSim will not allow you to use the same variable for x and y. The default x variable is column 2, the first column of data in the matrix (remember that the first column in your data set always contains the species labels), and the default y variable is column 3.

Shapes

EcoSim offers you four choices for shapes to analyze: left triangle, right triangle, pyramid, and inverted pyramid. Icons next to each choice illustrate the shapes. For each shape chosen, EcoSim analyzes the distribution of points that fall within the shape, following the method of Enquist et al. (1995). EcoSim evaluates the number of points that fall within the shape boundary, and the sums of squares of those points.

Data Distribution

[Image panels, symmetric (left column) and asymmetric (right column): left triangle, right triangle, pyramid, inverted pyramid]
Figure 2. Symmetric and asymmetric shapes generated by EcoSim. Asymmetric shapes are generated by using the median x and median y as inflection points. The panels illustrate the left triangle, right triangle, pyramid, and inverted pyramid shapes.
This option determines how the four shapes and boundaries are constructed for both the observed data and the simulated data sets. In all cases, EcoSim uses the limits of the data (min x, min y, max x, max y) to create the space. For the symmetric choice, the shapes appear just as in the icon: the triangle shapes are created by connecting the points at the two corners of the observed data set. In other words, the left triangle is created by connecting the data points (min x, min y), (min x, max y), and (max x, min y). The right triangle is created by connecting the data points (min x, min y), (max x, min y), and (max x, max y). The 3 sides of the pyramid are created by connecting the three points (min x, min y), ((max x + min x)/2, max y), and (max x, min y). And the 3 sides of the inverted pyramid are created by connecting the three points (min x, max y), ((max x + min x)/2, min y), and (max x, max y). In all cases, these shapes occupy exactly 50% of the coordinate space defined by the range of x and y values.
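
The symmetric vertex definitions above can be sketched in Python as follows. This is an illustration rather than EcoSim's own code: the function names are hypothetical, and counting points that lie exactly on an edge as inside the shape is an assumption.

```python
def symmetric_shape(shape, xs, ys):
    """Return the three vertices of a symmetric shape built from the data limits."""
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    xm = (x0 + x1) / 2
    vertices = {
        "left triangle": [(x0, y0), (x0, y1), (x1, y0)],
        "right triangle": [(x0, y0), (x1, y0), (x1, y1)],
        "pyramid": [(x0, y0), (xm, y1), (x1, y0)],
        "inverted pyramid": [(x0, y1), (xm, y0), (x1, y1)],
    }
    return vertices[shape]

def triangle_area(a, b, c):
    """Area of a triangle from its three vertices."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2

def in_triangle(p, tri):
    """True if p is inside or on the edge of tri (sign test on cross products)."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    a, b, c = tri
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (any(v < 0 for v in d) and any(v > 0 for v in d))

# Made-up data whose limits are 0..10 on both axes
xs = [0, 10, 2, 8, 5]
ys = [0, 10, 3, 9, 1]
tri = symmetric_shape("left triangle", xs, ys)
n_inside = sum(in_triangle(p, tri) for p in zip(xs, ys))
box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
```

In this example three of the five points fall within the left triangle, and each of the four shapes has an area equal to half of box_area, in line with the 50% figure stated above.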

For the symmetric option, the four boundaries (upper right, lower right, upper left, and lower left) are found by connecting the four midpoints of the x and y variables [(min x, (max y + min y)/2), ((max x + min x)/2, max y), (max x, (max y + min y)/2), and ((max x + min x)/2, min y)], forming a symmetrical "diamond" in the data space.

The asymmetric option gives a slightly more complex pattern. This option takes into account the fact that the distribution of x and y values may not be symmetric, so that the simple triangles and pyramids may give a distorted impression of where the data points lie. The asymmetric option uses the median x and median y values to define the shapes. For the pyramid shapes, the point of the pyramid is set at the median value of x, whereas in the symmetric option the pyramid points occur at (max x + min x)/2. For the triangle shapes, EcoSim now creates a 4-sided polygon with an "inflection point" at (median x, median y).

For the asymmetric option, the four boundaries (upper right, lower right, upper left, and lower left) are found by connecting the four median points of the x and y variables [(min x, med y), (med x, max y), (max x, med y), and (med x, min y)], forming an asymmetrical "kite" in the data space.
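
The difference between the "diamond" and the "kite" comes down to which cut points are used. A minimal Python sketch, assuming a hypothetical helper boundary_vertices (this is not an EcoSim function):

```python
from statistics import median

def boundary_vertices(xs, ys, asymmetric=False):
    """Vertices of the diamond (symmetric) or kite (asymmetric), in the order
    left, top, right, bottom."""
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    if asymmetric:
        cx, cy = median(xs), median(ys)            # median cut points
    else:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2      # midpoint cut points
    return [(x0, cy), (cx, y1), (x1, cy), (cx, y0)]

# Made-up, right-skewed data: one large value on each axis
xs = [1, 2, 3, 4, 20]
ys = [5, 6, 7, 8, 50]
diamond = boundary_vertices(xs, ys)
kite = boundary_vertices(xs, ys, asymmetric=True)
```

Each corner boundary is the edge joining two adjacent vertices; for example, the upper right boundary runs from the top vertex to the right vertex. With skewed data such as these, the kite's vertices are pulled toward the low values, where most of the points lie.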

All of this is a lot easier to understand by looking at some pictures!

Figure 2 uses the same data from Figure 1 to illustrate the four data shapes with the asymmetric and symmetric data options. Figure 3 illustrates the 4 boundaries that are created in each corner of the data space for the symmetric and asymmetric data options.

[Image panels: symmetric boundaries (left) and asymmetric boundaries (right)]
Figure 3. Symmetric and asymmetric boundaries generated by EcoSim. Symmetric boundaries are generated by using (max x + min x)/2 and (max y + min y)/2 as cut points. Asymmetric boundaries are generated by using the median x and median y as cut points. Each panel illustrates the 4 boundaries (upper right, upper left, lower right, and lower left) that can be selected with the symmetric and asymmetric data options.
One final point is that if the x and y variables in your data have a symmetric distribution, such as a uniform or a normal distribution, then the asymmetric and symmetric data distribution options will generate nearly the same patterns, because (max x + min x)/2 will be close to median x and (max y + min y)/2 will be close to median y.

Constraints

The constraints determine the null model algorithm that is used to create the random data sets. In all of the constraint options, EcoSim generates exactly the same number of data points that were found in your original data set.

EcoSim offers you three options for constructing the null distribution of x and y values.

1) Data-defined. This is the simplest (and best) option, and the one that EcoSim uses for a default. To create the null data sets, EcoSim simply reshuffles the ordering of the y values, randomly pairing them with the x values. This option retains the variances and distributions of the original x and y variables, but eliminates any pattern in the covariance of x and y together.

2) User-defined (Uniform). When this option is chosen, a small edit box appears, and the user enters a minimum and a maximum for both the x and y variables. For both the x and y variables, EcoSim creates random uniform values that are greater than the specified minimum and less than the specified maximum. The specified maximum must be greater than the specified minimum, and both the maximum and minimum values must be greater than zero. The "defaults" that appear in the edit window are the observed data limits themselves.

3) User-defined (Normal). For this option, a dialog box opens and the user is asked to supply the mean and the standard deviation of a normal distribution for the x and y variable. Non-negative real numbers are required for all 4 of these values. EcoSim then draws random values from these distributions to supply the x and y observations. Using the normal distribution, EcoSim may sometimes generate negative values, especially if the requested mean is small and/or the standard deviation is large. If this happens, EcoSim will discard the negative value and draw another observation until it gets a non-negative number. Therefore, the actual mean and variance of the simulated distribution may be different from the values specified in the dialog box. The "defaults" that appear in the edit window are the observed means and standard deviations of the x and y variables.
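
The three constraints can be sketched in Python as follows. The function names are hypothetical and EcoSim's random number generator is not reproduced; the sketch only illustrates the logic of each option as described above.

```python
import random

def data_defined(xs, ys, rng):
    """Keep x as observed and randomly re-pair the observed y values with it."""
    shuffled = list(ys)
    rng.shuffle(shuffled)
    return list(xs), shuffled

def user_uniform(n, lo, hi, rng):
    """n uniform draws between the user-supplied minimum and maximum."""
    return [rng.uniform(lo, hi) for _ in range(n)]

def user_normal(n, mean, sd, rng):
    """n normal draws; negative values are discarded and redrawn."""
    values = []
    while len(values) < n:
        v = rng.gauss(mean, sd)
        if v >= 0:
            values.append(v)
    return values

rng = random.Random(10)  # arbitrary seed, used only to make this sketch repeatable
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]
null_x, null_y = data_defined(xs, ys, rng)
uni = user_uniform(len(xs), 1.0, 4.0, rng)
nor = user_normal(len(xs), 2.5, 5.0, rng)
```

Note that the redraw loop in user_normal truncates the distribution at zero, which is why the realized mean and variance can differ from the requested values.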

Boundary Test

The boundary test allows you to test whether data are significantly concentrated or sparse in each of the four "corners" of the bivariate space (Figure 3). These four corners are defined by intuitive, but objective "boundaries" that can be compared for real and simulated data.

Select one of the 4 corners for the boundary test of your data. The darkened corner of the icon indicates the corner of the space that is being tested with the boundary test. EcoSim will provide you with two tests of the pattern associated with that boundary: the number of points that fall beyond the boundary, and the sum of squares of those points. If some corners of the space are unusually empty, the observed number of points and/or the sum of squares in the real data set will be significantly less than in the simulated data sets.

Our test is similar to a "range restriction" test developed by P. Wilson (unpublished manuscript), which is described by Thomson et al. (1996).
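
A minimal Python sketch of the two boundary statistics for the symmetric option is given below. It is an illustration under stated assumptions, not EcoSim's code: the function name beyond_boundary is hypothetical, points lying exactly on the boundary line are treated as not beyond it, and the x and y values are assumed to vary (otherwise the data limits collapse and the division fails).

```python
def beyond_boundary(xs, ys, corner="upper right"):
    """Number of points beyond a symmetric corner boundary, and the sum of
    their squared vertical distances to that boundary."""
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    xc = x1 if "right" in corner else x0   # x coordinate of the chosen corner
    yc = y1 if "upper" in corner else y0   # y coordinate of the chosen corner
    count, ss = 0, 0.0
    for x, y in zip(xs, ys):
        # t equals 1 on the boundary line and 2 at the corner itself
        t = (x - cx) / (xc - cx) + (y - cy) / (yc - cy)
        if t > 1:
            # height of the boundary line at this x value
            y_line = cy + (yc - cy) * (1 - (x - cx) / (xc - cx))
            count += 1
            ss += (y - y_line) ** 2
    return count, ss

# Made-up data with limits 0..10 on both axes
xs = [0, 10, 9, 2, 5]
ys = [0, 10, 8, 3, 5]
n_beyond, ss_beyond = beyond_boundary(xs, ys, "upper right")
```

Here two points, (10, 10) and (9, 8), lie beyond the upper right boundary. The same two statistics would then be recomputed for each simulated data set to build the null histograms.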

5. Output

Because of the large number of tests and graphical displays, there are no fewer than 11 output tabs for a single run of Macroecology! These should appear as two rows of tabs on your screen, and you should work through them in the following order.

Input Column Tab

The Input tab shows you the original columns of data selected for the x and y variables in your analysis. You cannot edit the data in this window, but you can refer back to the original data set as you study the simulation results.

Simulation Tab

The simulation tab shows one round of simulated data for the x and y variable. If you used the default option of data defined, you will see that the values for the y variable have been reshuffled. If you used user-defined, the simulated values for both the x and y variable will differ from the observed values, because the data will be drawn from a uniform or normal distribution, depending on which of these options you specified.

Dispersion Tab

This histogram tab shows the calculation of the dispersion index for the original data and the simulated data sets. The dispersion index is calculated by dividing the bivariate space into 4 quadrants, based on the location of the point (median x, median y; Figure 4).
[Image: Median quadrants]
Figure 4. Quadrants for Midwestern fishes data. The four quadrants are defined by the position of the point (median x, median y). EcoSim calculates the variance in the number of points that fall in each quadrant.

EcoSim next counts the number of points that occur in each quadrant (ties are scored as half or quarter points). The variance of these 4 numbers is calculated as a simple index of dispersion. If the original data are randomly distributed with covariance that is close to zero, then the observed variance will be similar to the variance that is calculated for the simulated data sets. On the other hand, if points are unusually concentrated in some corners of the space, the null hypothesis will be rejected, and the observed variance will be significantly larger than expected. Finally, if the points are distributed very evenly among the four quadrants, the observed variance will be significantly smaller than expected. This test is similar in spirit to the two-dimensional K-S test described by Garvey et al. (1998).
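
The dispersion index can be sketched in Python as follows. The function names are hypothetical, and the use of the sample variance (n - 1 denominator) is an assumption, since the text does not say which variance formula EcoSim uses.

```python
from statistics import median, variance

def quadrant_counts(xs, ys):
    """Points per quadrant around (median x, median y); ties are split."""
    mx, my = median(xs), median(ys)
    counts = {"UL": 0.0, "UR": 0.0, "LL": 0.0, "LR": 0.0}
    for x, y in zip(xs, ys):
        # Weight on the right/upper side: 1 if clearly there, 0.5 if tied, else 0
        right = 1.0 if x > mx else 0.5 if x == mx else 0.0
        upper = 1.0 if y > my else 0.5 if y == my else 0.0
        counts["UR"] += right * upper
        counts["UL"] += (1 - right) * upper
        counts["LR"] += right * (1 - upper)
        counts["LL"] += (1 - right) * (1 - upper)
    return counts

def dispersion_index(xs, ys):
    """Variance of the four quadrant counts."""
    return variance(list(quadrant_counts(xs, ys).values()))

# The point (2, 2) sits on both medians, so it adds a quarter to each quadrant
counts = quadrant_counts([1, 2, 3], [1, 2, 3])
index = dispersion_index([1, 2, 3, 4], [1, 2, 3, 4])
```

A point tied on only one median contributes half a point to each of the two adjacent quadrants, so the four counts always sum to the number of species.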

Regression Slope Tab

This tab illustrates the least squares regression slope, calculated for both the observed and the simulated data. Although the slope is calculated exactly the same way as in a parametric regression test, the probability value is determined directly from comparisons with the simulated data, and does not depend on assumptions of data normality. Remember that the regression line is the best-fit line, and that it always passes through the center of the cloud of points at (mean x, mean y). It won't necessarily give you the same results as some of the boundary or shape tests, which are measuring patterns at the edges of the data distribution.
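
The logic of this test can be sketched in Python under the data-defined constraint. The function names are hypothetical, and the seed only makes the sketch repeatable; it will not reproduce EcoSim's own random numbers.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def slope_randomization_test(xs, ys, n_sim=1000, seed=10):
    """Observed slope, mean simulated slope, and upper tail probability."""
    rng = random.Random(seed)
    observed = ols_slope(xs, ys)
    shuffled = list(ys)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)              # re-pair y values with x values
        sims.append(ols_slope(xs, shuffled))
    p_upper = sum(s >= observed for s in sims) / n_sim
    return observed, sum(sims) / n_sim, p_upper

# Made-up data with a perfect linear relationship (slope = 2)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * x + 1 for x in xs]
observed, mean_sim, p_upper = slope_randomization_test(xs, ys)
```

Because shuffling destroys the pairing of x and y, the simulated slopes scatter around zero, and the tail probability is simply the fraction of simulated slopes at least as large as the observed one.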

Regression Charts Tab

[Image: Regression charts tab]
Figure 5. Regression charts tab for Midwestern fishes data. The upper panel shows the regression slope (red line) for the observed data set. The lower panel shows one of the simulated data sets, and the average regression line for all of the simulations.
This tab shows you a scatter plot of the observed data and a scatter plot of one of the simulated data sets (Figure 5). On the observed data plot, the measured regression slope is shown as a dashed red line. On the simulated data plot, the average position of the simulated regression slopes is shown, which is very close to a slope of zero.

Shape # of Points Tab

This tab illustrates the test for the number of points within the shape that you have chosen. EcoSim calculates the number of points that fell within the selected shape (left triangle, right triangle, pyramid, or inverted pyramid), then compares it with the histogram of point counts for all of the simulated data sets. If the points are unusually clustered within the shape, then the observed number of points will be significantly larger than the number found in most of the simulated data sets. Note that the shape test depends both on the shape you have selected, and on whether the shape is calculated with symmetric or asymmetric edges.
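
As a sketch of how the observed count is compared with the simulated counts, the following Python example tests the symmetric left triangle under the data-defined constraint. The function names are hypothetical, points on the sloping edge are assumed to count as inside, and the seed serves only to make the sketch repeatable.

```python
import random

def count_in_left_triangle(xs, ys):
    """Number of points on or below the line from (min x, max y) to (max x, min y)."""
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    # Cross-multiplied form of (x - x0)/(x1 - x0) + (y - y0)/(y1 - y0) <= 1
    return sum(
        (x - x0) * (y1 - y0) + (y - y0) * (x1 - x0) <= (x1 - x0) * (y1 - y0)
        for x, y in zip(xs, ys)
    )

def shape_count_test(xs, ys, n_sim=1000, seed=10):
    """Observed count, mean simulated count, and upper tail probability."""
    rng = random.Random(seed)
    observed = count_in_left_triangle(xs, ys)
    shuffled = list(ys)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        sims.append(count_in_left_triangle(xs, shuffled))
    p_upper = sum(s >= observed for s in sims) / n_sim
    return observed, sum(sims) / n_sim, p_upper

# Made-up data in which 9 of 10 points lie within the left triangle
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [9, 1, 7, 2, 5, 1, 3, 1, 1, 10]
observed, mean_sim, p_upper = shape_count_test(xs, ys)
```

Reshuffling leaves the minimum and maximum of x and y unchanged, so with the data-defined constraint the symmetric shape itself is identical in every simulated data set; only the pairing of x and y values changes.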

Shape Sum of Squares Tab

This tab gives the observed sum of squares for the points within the shape, following Enquist et al. (1995). For the shape that you specified, EcoSim calculates the vertical distance from each point within the shape to the shape boundary, squares it, and sums it. Only points that fall within the shape boundary are used to calculate this sum. This single number summarizes the overall deviation of the points from the boundary. If the sum of squares is unusually large, the points are clustered away from the boundary, whereas if the sum of squares is unusually small, the points are clustered near the boundary. Note that the sum of squares depends both on the number of points inside the shape and on the distance of those points from the boundary. This is why it is useful to examine both the number of points and the sum of squares when evaluating the shape test. The shape test is also very sensitive to whether the shape is calculated using the symmetric or asymmetric formula (Figure 2).
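
For the symmetric left triangle, the calculation can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that "the shape boundary" means the sloping edge of the triangle, which the text does not state explicitly.

```python
def left_triangle_stats(xs, ys):
    """Count of points within the symmetric left triangle and the sum of their
    squared vertical distances to its sloping edge."""
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    count, ss = 0, 0.0
    for x, y in zip(xs, ys):
        # Height of the edge from (min x, max y) to (max x, min y) at this x
        y_edge = y1 - (y1 - y0) * (x - x0) / (x1 - x0)
        if y <= y_edge:
            count += 1
            ss += (y_edge - y) ** 2
    return count, ss

# Made-up data with limits 0..10 on both axes
xs = [0, 10, 2, 8, 5]
ys = [0, 10, 3, 9, 1]
n_in, ss_in = left_triangle_stats(xs, ys)
```

In this example the three points within the triangle lie 10, 5, and 4 units below the edge, so the sum of squares is 100 + 25 + 16 = 141. Adding a fourth point inside the shape could only raise this total, which is why the count and the sum of squares should be read together.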

Boundary # of Points Tab

This tab illustrates the test for the number of points that fall beyond the boundary that you have chosen. EcoSim counts the number of points that fall outside the selected boundary (upper left, upper right, lower left, or lower right), then compares it with the histogram of point counts for all of the simulated data sets. If the points are unusually sparse beyond the boundary, then the observed number of points outside the boundary will be significantly smaller than the number found for most of the simulated data sets.

Boundary Sum of Squares Tab

This tab gives the observed sum of squares for all of the points that fall beyond the boundary. For the boundary that you specified, EcoSim calculates the vertical distance from each point outside the boundary to the boundary edge, squares it, and sums it. Only points that fall beyond the boundary are used to calculate this sum. This single number summarizes the overall deviation of the points from the boundary. If the sum of squares is unusually large, the points are clustered away from the boundary, whereas if the sum of squares is unusually small, the points are clustered near the boundary. Note that the sum of squares depends both on the number of points beyond the boundary and on the distance of those points from the boundary. This is why it is useful to examine both the number of points and the sum of squares when evaluating the boundary test.

Boundary Charts Tab

[Image: Boundary charts tab]
Figure 6. Boundary charts tab for Midwestern fishes data. The upper panel shows the observed data set, and the lower panel shows one of the simulated data sets. In both panels, the red line is the upper right boundary.
This tab shows you a scatter plot of the observed data and a scatter plot of one of the simulated data sets (Figure 6). The selected boundary is shown as a red line.

Summary Tab

The summary tab gives the simulation conditions, including the name of the input file, the x and y variables chosen, shape test, boundary test, data distribution, and constraints chosen.

Next, it presents the information that was contained in each of the histogram tabs for this module: dispersion, regression slope, shape # of points, shape sum of squares, boundary # of points, and boundary sum of squares. For each simulation, the summary window shows the observed and expected metric, the probability value, and the histogram bins for the simulated data sets. It also gives the standardized effect size, calculated as:

(observed index - mean(simulated indices)) / standard deviation(simulated indices)

This metric is analogous to the standardized effect size that is used in meta-analyses (Gurevitch et al. 1992). It scales the results in units of standard deviations, which allows for meaningful comparisons among different tests. Roughly speaking, a standardized effect size that is greater than 2 or less than -2 is statistically significant with a tail probability of less than 0.05. However, this is only an approximation, and it assumes that the data are normally distributed, which is often not the case for null model tests. For any individual study, you should always report the actual tail probability, which is calculated directly from the simulation, and does not require any assumptions about normality of the data.
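
Both quantities can be sketched in a few lines of Python. The function names are hypothetical; the use of the sample standard deviation and the counting of ties toward both tails are assumptions, since the text does not specify how EcoSim handles either.

```python
from statistics import mean, stdev

def standardized_effect_size(observed, simulated):
    """(observed - mean of simulated) / standard deviation of simulated."""
    return (observed - mean(simulated)) / stdev(simulated)

def tail_probabilities(observed, simulated):
    """Fractions of simulated values <= observed and >= observed."""
    n = len(simulated)
    p_lower = sum(s <= observed for s in simulated) / n
    p_upper = sum(s >= observed for s in simulated) / n
    return p_lower, p_upper

# Made-up simulated indices (a real run would have 1000 of them)
simulated = [2, 4, 4, 4, 5, 5, 7, 9]
ses = standardized_effect_size(9.5, simulated)
p_lower, p_upper = tail_probabilities(9, simulated)
```

The effect size depends on the simulated values only through their mean and standard deviation, whereas the tail probabilities use the whole simulated distribution, which is why the latter remain valid when that distribution is skewed.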

Finally, the summary tab shows the original data matrix, with labels.

All of these data can be edited, deleted, or annotated. The output can then be saved (Save to File) or discarded (Close). There is also a small time clock in the lower right-hand corner so you can tell how long your simulation took.

6. Caveats

Of all the modules in EcoSim, macroecology represents one of the least tested areas for null model analysis, and many of the tests presented here should be considered preliminary until their properties are studied in more detail.

But even at this point, it is possible to say a few things about how the test should be used. One of the most important issues is the randomization algorithm to be used. Unless you have a good reason to do otherwise, we strongly recommend that you retain the default option of the data-defined constraint. This creates a null data set by simply reshuffling the observed values of x and y. The strength of this approach is that it retains the variances of x and y, so that any significant results are due to patterns in the covariance of the two variables. Outliers and asymmetric data distributions are fully retained with this option.

We have included the user-defined normal and uniform options in case you have other null expectations for the x and y variable. However, we caution that these distributions can easily generate patterns that are very different from those in the original data set, and will often lead to the rejection of the null hypothesis. It is interesting that the uniform distribution is probably the implicit null hypothesis that people use when evaluating macroecology scatterplots, because it implies that the variable space should be randomly and uniformly "filled". However, some parts of the macroecological space may be rare simply because the density of both x and y variables is low in that region, not because the joint distribution of x and y is unfavored.

Of the tests that are presented, the dispersion test and regression slope are the most general tests for non-randomness in the covariance of x and y. They are a good starting place for evaluating the distribution of your data, and they may often indicate strong non-randomness even when the boundary and shape tests do not.

The results of the boundary and shape sum of squares tests need to be carefully interpreted because they will reflect not only the placement of the points relative to the boundary, but also the number of points within the shape or beyond the boundary. Finally, we note that the boundary tests may often give uninformative results if they are applied indiscriminately to all of the corners of the distribution, particularly when there appears to be a three-sided "triangle" shape. We recommend that you examine the major shapes in your data first, or, even better, establish a priori hypotheses about shapes and boundaries from the theoretical literature (Brown 1995, Maurer 1999).

7. Macroecology Tutorial

Midwestern Fishes

Launch EcoSim and you will see the familiar opening 5 x 5 matrix of species and sites. Use the file menu to open the file called "Midwestern fishes.txt" contained in the "Tutorial Data Sets" folder. This data file gives a set of macroecological and metapopulation variables for 46 species of fishes from a long-term study in the Cimarron River. These data were graciously provided by the late Jimmy Pigg. Details of the data set are described in Pigg (1988) and Gotelli and Taylor (1999a, 1999b).

Each row of the data set gives a different species of fish. The macroecological variables (= columns) in this data set are:

FRACT: Ten sites on the Cimarron River were censused between 1976 and 1988, and this variable is the average fraction of sites occupied each year.

EXT: The average annual probability of extinction for an occupied site.

COL: The average annual probability of colonization for an unoccupied site.

DIST: The distance in km from the center to the edge of the species' geographic range.

AREA: The area of the geographic range in km^2.

SIZE: The standard length in mm, a convenient measure of body size for fishes.

EDGE: An index of the position of the sites on the Cimarron River relative to the edge of the geographic range. The larger the index, the closer the sites are to the edge of the geographic range. See Gotelli and Taylor (1999a) for details.

ABUN: The average abundance of each species in occupied sites.

For this tutorial, let's examine the relationship between body size and geographic range area, illustrated in Figures 1-3. Select AREA as the x variable and SIZE as the y variable. Go to the "general" tab and set the random number seed to 10 so that your results will exactly match those in this tutorial.

We are initially interested in whether or not there is a left "triangle" pattern as shown in Figure 1, so we will keep the defaults, which specify a symmetric left triangle, with a boundary test for the upper right-hand corner.

Understanding macroecology output

Let's work through the tabs in order, beginning with the input matrix and simulation tabs. The input matrix tab shows you the data columns that you selected, and the simulation tab shows you one of the simulated data sets. Notice in the simulated data set that the observed values of the y-variable have been randomly reshuffled and reassigned to the x values.

The dispersion tab counts the number of data points in each of the 4 quadrants of the sample space (Figure 4) and calculates the variance of those counts. The observed variance was 27.0, whereas the average variance of the 1000 simulated data sets was only 3.85. The tail probability (shown in the lower panel) for the observed variance is 0.021. These results suggest that the points are not randomly distributed in the two-dimensional space: some quadrants in the space have too many points and others have too few compared to the randomized data sets.

The regression slope tab gives a standard regression slope of 0.00004, which is not significantly greater than the average simulated slope of 0.00 (p = 0.127). The regression charts tab confirms visually that the slope of the observed data is positive, but not an extreme value. The shape # of points tab indicates that 41 of the 46 data points fell within the (symmetric) left triangle shape. This does not differ significantly from the average of the simulated values (41.20; p = 0.751). If the observed data points were unusually concentrated in the triangle, then the simulated data sets would usually have contained substantially fewer than 41 points in the triangle.

The shape sum of squares test suggests that the observed sum of squares is larger than simulated, but not significantly so (p = 0.063).

Finally, the boundary # of points and the boundary sum of squares tests confirm that the upper right-hand corner of the space is not unusually "empty" even though there is only a single observation in that region of the space (# of points p = 0.964; sum of squares p = 0.790). Because there are few species with large geographic ranges and few species with large body sizes, we don't expect many species to occur in this corner of the space and should not be puzzled by their absence. In fact, if there is any pattern in the shapes of these data, it is in the right triangle. Run the simulation again for the "right triangle" shape. Thirty-six points fell within the symmetric right triangle, compared to an average of only 32.57 points for the simulated data sets (p = 0.045).

Non-randomness is also indicated in some of the boundary tests. If you test each of the 4 corners (upper right, lower right, upper left, and lower left), you will discover that there is an odd distribution of points in the lower regions of the graph. Although the observed number of points (25) in the lower left-hand corner of the graph is not unusual, the observed sum of squares (7.52 x 10^6) is much greater than expected (p = 0.006): the observed points are a bit "too close" to the origin. Conversely, there was only 1 data point in the lower right-hand corner of the graph, and this was significantly fewer than expected (expected = 3.02, p = 0.044). These patterns are probably responsible for the significance of the dispersion test.

Understanding data symmetry

Now try re-analyzing your data with the asymmetric option checked. This option creates geometric shapes and boundaries using the median, rather than the midpoint of the data distributions. This option does not affect the regression and dispersion tests, but only applies to the shape and boundary tests. In general, you will see that the overall results are similar: most tests are non-significant, although there is a significantly large sum of squares in the lower left-hand corner of the data. In contrast to the analysis of the symmetric boundaries, the number of points in the lower right-hand corner of the graph is no longer significantly small. Instead, there is now a significant clustering of points within the pyramid shape (p = 0.048).

Understanding constraints

Next, try analyzing this data set with the user-defined uniform option checked. When you first select this option, EcoSim presents you with an edit window that allows you to select the minimum and maximum points for both the x and the y variables. Go ahead and retain the defaults, which are simply the observed minimum and maximum values in the original data. Thus, in the original data set, geographic areas range from 6.25 x 10^4 km^2 to 1.02 x 10^7 km^2. Body sizes range from 50 to 2000 mm. For the null data sets, EcoSim will select an x value and a y value from uniform distributions that are defined by these endpoints.

When you carry out these analyses, nearly all of the statistical tests give significant results. If you examine the chart tabs, you will see that the observed data look quite different from the simulated data sets, for which the sample space is fairly evenly filled with data points.

Another variation is to use the user-defined normal option, which uses a normal distribution to draw the x and y values. As before, EcoSim pops up an edit window to let you specify this distribution. EcoSim conveniently calculates and provides you with the observed means and variances as defaults.

Compared to the data-defined option, the normal distribution also leads to a frequent rejection of the null hypothesis, but not as often as with the uniform distribution. The reason is that the normal distribution leaves the four corners of the space relatively sparse, because points are generated less frequently in the tails of the x and y distributions. Again, examine the chart tabs to see how these simulated data sets look compared to the actual data set.
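The sparseness of the corners under a normal distribution can be checked with a small simulation. The Python sketch below works on a unit square rather than the fish data, gives the normal distribution the same mean and variance as the uniform distribution, and uses a hypothetical definition of a "corner" (both coordinates in the outer quarter of the range). None of these choices is taken from EcoSim.

```python
import math
import random

def in_corner(x, y, lo=0.0, hi=1.0, frac=0.25):
    """True if both coordinates lie in the outer `frac` of the [lo, hi] range.
    This corner definition is hypothetical, not EcoSim's."""
    width = (hi - lo) * frac

    def outer(v):
        return lo <= v < lo + width or hi - width < v <= hi

    return outer(x) and outer(y)

def corner_fraction(points):
    """Fraction of points that fall in any of the four corners."""
    return sum(1 for x, y in points if in_corner(x, y)) / len(points)

rng = random.Random(42)
n = 20000
mean, variance = 0.5, 1.0 / 12.0  # mean and variance of a uniform on [0, 1]
sd = math.sqrt(variance)          # random.gauss expects a standard deviation

uniform_pts = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(n)]
normal_pts = [(rng.gauss(mean, sd), rng.gauss(mean, sd)) for _ in range(n)]

uniform_corners = corner_fraction(uniform_pts)  # about 0.25 in expectation
normal_corners = corner_fraction(normal_pts)    # smaller: tails are sparse
print(uniform_corners, normal_corners)
```

Note that the edit window asks for variances, whereas random.gauss takes a standard deviation, hence the square root.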

Understanding all of the combinations

Although we do not recommend that you "dredge" your data for results, it is instructive to systematically explore all of EcoSim's options for this data set. Table 1 summarizes the statistical output from all of the shape and boundary tests, using symmetric and asymmetric data distributions, and testing against null data sets created with the data-defined, uniform, and normal distributions (whew!).

Symmetric    Asymmetric
Index    Data-Defined    Uniform    Normal    Data-Defined    Uniform    Normal
Dispersion++++++
Regressionnsnsnsnsnsns
Left Triangle (#)ns++++nsnsns
Left Triangle (ss)ns++++++ns+++
Right Triangle (#)++++++nsnsns
Right Triangle (ss)nsnsnsns------
Pyramid (#)ns++++++++++
Pyramid (ss)ns++++ns++++++
Inverted Pyramid (#)ns-----nsnsns
Inverted Pyramid (ss)ns---nsnsnsns
Upper Right (#)ns-nsns-ns
Upper Right (ss)ns-nsnsnsns
Upper Left (#)nsnsnsnsnsns
Upper Left (ss)nsnsnsnsns+
Lower Right (#)--nsnsnsns
Lower Right (ss)nsnsnsns---ns
Lower Left (#)ns++++++ns+++
Lower Left (ss)++++++++++nsns
Table 1. Summary of null model macroecology tests for the relationship between body size and geographic range area (Figure 1). # = number of points; ss = sum of squares; ns = non-significant (p > 0.05). Plus symbols (+, ++, +++) indicate the observed index was significantly greater than expected; minus symbols (-, --, ---) indicate the observed index was significantly less than expected. One symbol = p < 0.05; two symbols = p < 0.01; three symbols = p < 0.001.
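The symbol coding used in Table 1 can be written as a short Python function. The name significance_code is hypothetical, and treating p exactly equal to 0.05 as non-significant is an assumption, since the caption does not cover that case.

```python
def significance_code(p, direction):
    """Convert a tail probability into the Table 1 notation.

    direction is '+' if the observed index is greater than expected,
    '-' if it is less than expected.
    """
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    if p < 0.001:
        return direction * 3
    if p < 0.01:
        return direction * 2
    if p < 0.05:
        return direction
    return "ns"  # p >= 0.05 (p == 0.05 treated as ns by assumption)

# Values quoted in the tutorial text:
print(significance_code(0.006, "+"))  # lower left sum of squares -> ++
print(significance_code(0.044, "-"))  # lower right number of points -> -
print(significance_code(0.048, "+"))  # asymmetric pyramid -> +
```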

Some general results are apparent from these comparisons. The first is that the overall distribution of points is non-random, although the conventional regression slope is not significantly different from zero. Compared to a uniform distribution, the observed data set seems to fit the triangle or pyramid distributions. Using the more realistic data-defined distribution, the null hypothesis is rarely rejected for the shape and boundary tests, although there does appear to be a weak clustering of points within the pyramid, or the right triangle, depending on the symmetry option chosen.

Thus, there does not seem to be a simple "evolutionary boundary" (Figure 1) in which combinations of large geographic range and large body size are uncommon. If anything, there is a clustering of large body sizes at intermediate (pyramid) or large (right triangle) geographic ranges. Most of the tests give a significant sum of squares for points in the lower left-hand corner. In other words, there is a slight excess of species that have especially small geographic ranges and small body sizes. These patterns are subtly different from one in which there are too few species with large geographic ranges and large body sizes.

8. Literature Cited

Blackburn, T.M., P.H. Harvey, and M.D. Pagel. 1990. Species number, population density and body size relationships in natural communities. Journal of Animal Ecology 59: 335-345.

Blackburn, T.M. and K.J. Gaston. 1998. Some methodological issues in macroecology. American Naturalist 151: 68-83.

Blackburn, T.M., J.H. Lawton, and J.N. Perry. 1992. A method of estimating the slope of upper bounds of plots of body size and abundance in natural animal assemblages. Oikos 65: 107-112.

Brown, J.H. and B.A. Maurer. 1989. Macroecology: the division of food and space among species on continents. Science 243: 1145-1150.

Brown, J.H. 1995. Macroecology. University of Chicago Press, Chicago.

Cade, B.S., J.W. Terrell, and R.L. Schroeder. 1999. Estimating effects of limiting factors with regression quantiles. Ecology 80: 311-323.

Enquist, B.J., M.A. Jordan, and J.H. Brown. 1995. Connections between ecology, biogeography, and paleobiology: Relationship between local abundance and geographic distribution in fossil and recent molluscs. Evolutionary Ecology 9: 586-604.

Garland, T., Jr., A.W. Dickerman, C.M. Janis, and J.A. Jones. 1993. Phylogenetic analysis of covariance by computer simulation. Systematic Biology 42: 265-292.

Garvey, J.E., E.A. Marschall, and R.A. Wright. 1998. From star charts to stoneflies: detecting relationships in continuous bivariate data. Ecology 79: 442-447.

Gotelli, N.J. and C.M. Taylor. 1999a. Testing macroecology models with stream-fish assemblages. Evolutionary Ecology Research 1: 847-858.

Gotelli, N.J. and C.M. Taylor. 1999b. Testing metapopulation models with stream-fish assemblages. Evolutionary Ecology Research 1: 835-845.

Harvey, P.H., and M.D. Pagel. 1991. The Comparative Method In Evolutionary Biology. Oxford University Press, Oxford.

Gurevitch, J., L.L. Morrow, A. Wallace, and J.S. Walsh. 1992. A meta-analysis of field experiments on competition. The American Naturalist 140: 539-572.

Maurer, B.A. 1999. Untangling Ecological Complexity: The Macroscopic Perspective. University of Chicago Press, Chicago.

Pigg, J. 1988. Aquatic habitats and fish distribution in a large Oklahoma river, the Cimarron, from 1976-1988. Proceedings of the Oklahoma Academy of Sciences 68: 9-31.

Ricklefs, R.E. and J.M. Starck. 1996. Applications of phylogenetically independent contrasts: A mixed progress report. Oikos 77: 167-172.

Scharf, F.S., F. Juanes, and M. Sutherland. 1998. Inferring ecological relationships from the edges of scatter diagrams: comparison of regression techniques. Ecology 79: 448-460.

Thomson, J.D., G. Weiblen, B.A. Thomson, S. Alfaro, and P. Legendre. 1996. Untangling multiple factors in spatial distributions: lilies, gophers, and rocks. Ecology 77: 1698-1715.


All Pages Copyright © 2003
by Kesey-Bear and Acquired Intelligence, Inc.
All rights reserved.