CURSAT ver. 2.1: A Simple, Resampling-Based, Program to Generate Pseudoreplicates of Data and Calculate Rarefaction Curves

Gabriele Gentile

doi:10.5334/jors.260


Journal of Open Research Software

Volume 8 (2020): Issue 1


Open Access


Figures & Tables

Table 1

Incidence matrix.

	1	2	3	4	5	6	7	8	9	10
a	1	0	1	1	0	0	0	1	1	0
b	1	1	0	0	0	0	1	1	0	0
c	0	0	0	1	0	1	1	0	1	1
d	0	0	0	0	1	0	1	0	1	0
e	0	0	1	1	0	0	0	1	0	0
f	1	1	0	0	1	0	0	1	0	1
g	0	0	1	1	1	0	0	0	1	0
h	1	1	0	1	0	0	1	1	0	1
i	1	0	0	0	1	1	0	1	0	1
j	0	0	1	1	0	0	1	0	1	0
k	0	0	1	1	0	1	0	1	0	1
l	1	1	1	0	0	0	1	1	0	0

File *pseud_incidence_boot.txt* opened in Microsoft Excel. It consists of the output file with 100 bootstrapped pseudoreplicates of the original dataset in the file *incidence.txt*. The first column is a label that marks the resample replicate during which the pseudoreplicate was generated. Only the first two replicates are shown.

Accumulation data file *accum_incidence_boot.txt* opened in Microsoft Excel. It is the output file containing 100 accumulation replicates based on the bootstrap of the original dataset in the file *incidence.txt*. The first column marks the resample replicate during which the accumulation was generated. The second and third columns report the sampling events and the associated cumulative number of objects, respectively. Only the first two replicates are shown.

Accumulation curves (as indicated by black dots) obtained from 100 pseudoreplicates, with seed number = 12348695. The input file was *abundance.txt*. Bootstrap **(A)** and resampling without replacement **(B)** graphs. Dots indicate mean values. Whiskers indicate 2 × standard deviation. Only for reference purposes, logarithmic regression lines are drawn (red) and regression equations are reported. As expected, when using the same seed number, the files *abundance.txt* and *incidence.txt* produced identical results (not shown, but see also output files).
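The dots, whiskers, and red reference lines described in this caption can be sketched as follows. This is a minimal illustration, not CURSAT's own code: the function names `summarize` and `log_fit` are hypothetical, the sample standard deviation is an assumption (the caption does not say which estimator was used), and the regression is an ordinary least-squares fit of y = a·ln(x) + b, with x the sampling-event number.

```python
import math
from statistics import mean, stdev


def summarize(replicates):
    """Per sampling event, return (mean, 2 * SD) across accumulation replicates.

    `replicates` is a list of equal-length accumulation curves.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    return [(mean(col), 2 * stdev(col)) for col in zip(*replicates)]


def log_fit(means):
    """Least-squares fit of y = a * ln(x) + b for x = 1..n sampling events."""
    xs = [math.log(i) for i in range(1, len(means) + 1)]
    mx, my = mean(xs), mean(means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, means))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx
```

Passing the list of mean values from `summarize` to `log_fit` would give the slope and intercept of a reference line of the kind drawn in red.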

An example of a log file generated by running Edition #2, illustrating in detail the algorithm used by CURSAT ver. 2.1. The input data matrix D (top-left) in the file *data.txt* was used in this example. Only the outcomes of replicate n.1 from a shuffling procedure are shown because the procedure is very similar for bootstrap. For each resampling replicate, a pseudoreplicate data matrix (ND) is generated by extracting the elements from a column of the input matrix D, according to the shuffling order stored in vector F. In this example, elements in column #1 of ND are extracted from column #4 of the input matrix D (green); subsequently, elements in column #2 of ND are extracted from column #2 of the input matrix D (orange), and so on. A new matrix (W) is then constructed by cumulatively summing the elements of the pseudoreplicate data matrix ND by row (see asterisks). Finally, the matrix B (same dimensions as W) is constructed from matrix W by replacing elements >0 with 1. This step ensures that accumulation is performed correctly irrespective of whether the input file contains abundance (as in this case) or incidence data. The accumulation replicates are constructed by summing the elements of matrix B by column (see files *data_shuf_log.txt* and *accum_data_shuf.txt*). The accumulation replicates are not stored in memory; they are printed to the output file as they are created.
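The D → ND → W → B → accumulation sequence described in this caption can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the CURSAT source: the function names are hypothetical, column indices are 0-based (the caption counts from 1), the bootstrap is assumed to draw columns with replacement, and Python's random number generator will not reproduce CURSAT's output for a given seed.

```python
import random
from itertools import accumulate


def column_order(n_cols, rng, replace):
    """Vector F: column indices of D, drawn with or without replacement."""
    if replace:  # bootstrap (assumed: columns drawn with replacement)
        return [rng.randrange(n_cols) for _ in range(n_cols)]
    return rng.sample(range(n_cols), n_cols)  # shuffling


def accumulate_once(D, F):
    """One accumulation replicate for data matrix D and column order F."""
    # ND: column j of the pseudoreplicate is column F[j] of D
    ND = [[row[j] for j in F] for row in D]
    # W: cumulative sums along each row of ND
    W = [list(accumulate(row)) for row in ND]
    # B: replace every element > 0 with 1 (works for abundance or incidence)
    B = [[1 if w > 0 else 0 for w in row] for row in W]
    # Column sums of B: cumulative number of objects per sampling event
    return [sum(col) for col in zip(*B)]


def accumulation_replicates(D, n_reps, seed, replace=False):
    """Yield replicates one at a time instead of storing them all."""
    rng = random.Random(seed)
    for _ in range(n_reps):
        yield accumulate_once(D, column_order(len(D[0]), rng, replace))
```

Because B only records whether an object has been seen so far, an abundance matrix and its incidence equivalent yield the same curve for the same column order. The generator yields each replicate as soon as it is computed, loosely mirroring the way the replicates are written to the output file rather than kept in memory.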

Accumulation curves (black dots) obtained from 100 pseudoreplicates. The dataset in the file *seedbank.txt* was used. CURSAT ver. 2.1 **(A)** and **(B)** and EstimateS 9.1.0 **(C)** and **(D)** were tested for bootstrap and shuffling. Bootstrap and shuffling graphs are on the left and right sides, respectively. Dots indicate mean values. Standard deviation (2×) is indicated by continuous black lines. Logarithmic regression lines are drawn (red) and regression equations are reported. CURSAT ver. 2.1 and EstimateS 9.1.0 produced similar results. For EstimateS 9.1.0, the cumulative number of species (objects) was calculated using S Mean ± (2 × bootstrap/shuffling SD (runs)) from the EstimateS 9.1.0 output.

Accumulation curves (black dots) obtained from 100 pseudoreplicates. The input was the 1000 × 1000 abundance matrix in the file *1000.txt*. CURSAT ver. 2.1 **(A)** and **(B)** and EstimateS 9.1.0 **(C)** and **(D)** were tested for bootstrap and shuffling. Bootstrap and shuffling graphs are on the left and right sides, respectively. Dots indicate mean values. Standard deviation (2×) is indicated by whiskers. Logarithmic regression lines are drawn (red) and regression equations are shown. To improve clarity, only cumulative data for the first 20 sampling events are reported here. As expected in the case of a random matrix, bootstrap and shuffling produced almost the same results. For EstimateS 9.1.0, the cumulative number of objects was calculated using S Mean ± (2 × bootstrap/shuffling SD (runs)) from the EstimateS 9.1.0 output.


DOI: https://doi.org/10.5334/jors.260 | Journal eISSN: 2049-9647


Language: English

Submitted on: Feb 22, 2019

Accepted on: Jun 25, 2020

Published on: Aug 21, 2020

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: ecology, zoology, botany, biodiversity, diversity indices

© 2020 Gabriele Gentile, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.


