occumbData() creates a data list compatible with the model fitting function occumb().
Arguments
- y
A 3-D array of sequence read counts (integer values) that may have a dimnames attribute. The dimensions are ordered by species, site, and replicate. The data for missing replicates are represented by zero vectors. NAs are not allowed.
- spec_cov
A named list of species covariates. Each covariate can be a vector of continuous (numeric or integer) or discrete (logical, factor, or character) variables whose length is dim(y)[1] (i.e., the number of species). NAs are not allowed.
- site_cov
A named list of site covariates. Each covariate can be a vector of continuous (numeric or integer) or discrete (logical, factor, or character) variables whose length is dim(y)[2] (i.e., the number of sites). NAs are not allowed.
- repl_cov
A named list of replicate covariates. Each covariate can be a matrix of continuous (numeric or integer) or discrete (logical or character) variables with dimensions equal to dim(y)[2:3] (i.e., number of sites \(\times\) number of replicates). NAs are not allowed.
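The shape requirements above can be sketched in base R before anything is passed to occumbData(). This is a minimal illustration under stated assumptions, not the package's own validation: the dimensions and the covariate names (body_size, elevation, habitat, volume) are hypothetical, and the checks simply mirror the argument descriptions.

```r
set.seed(1)

# Hypothetical dimensions, chosen only for illustration
I <- 3  # number of species
J <- 4  # number of sites
K <- 2  # number of replicates

# Integer read counts ordered as species x site x replicate
y <- array(rpois(I * J * K, lambda = 50), dim = c(I, J, K))

# A missing replicate (here replicate 2 at site 4) is a zero vector
# across all species, not NA
y[, 4, 2] <- 0L

# Covariates shaped to match the corresponding dimensions of y
# (all covariate names are made up for this sketch)
spec_cov <- list(body_size = rnorm(I))                  # length dim(y)[1]
site_cov <- list(
  elevation = rnorm(J),                                 # length dim(y)[2]
  habitat   = factor(sample(c("a", "b"), J, replace = TRUE))
)
repl_cov <- list(volume = matrix(runif(J * K), nrow = J, ncol = K))  # dim(y)[2:3]

# Sanity checks mirroring the documented requirements
stopifnot(
  is.integer(y),
  !anyNA(y),
  all(lengths(spec_cov) == dim(y)[1]),
  all(lengths(site_cov) == dim(y)[2]),
  all(vapply(repl_cov, function(m) identical(dim(m), dim(y)[2:3]), logical(1)))
)
```

Objects prepared this way line up with the y, spec_cov, site_cov, and repl_cov arguments described above.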
Details
The element (i.e., covariate) names for spec_cov, site_cov, and repl_cov must all be unique.
If y has a dimnames attribute, it is retained in the resulting occumbData object and can be referenced in subsequent analyses.
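A quick pre-check for the naming rule can be written in base R. The sketch below assumes that uniqueness is required across the three lists combined, which is one reading of the requirement; find_duplicate_covariates() is a hypothetical helper and not part of the package.

```r
# Hypothetical helper: returns covariate names that occur more than once
# across the three covariate lists (character(0) if there are none)
find_duplicate_covariates <- function(spec_cov = NULL,
                                      site_cov = NULL,
                                      repl_cov = NULL) {
  all_names <- c(names(spec_cov), names(site_cov), names(repl_cov))
  unique(all_names[duplicated(all_names)])
}

# Names are distinct across lists: nothing is reported
ok <- find_duplicate_covariates(
  spec_cov = list(cov1 = c(0.1, 0.2)),
  site_cov = list(cov2 = c(1.5, 2.5), cov3 = factor(1:2)),
  repl_cov = list(cov4 = matrix(0, 2, 2))
)

# "cov1" is reused in site_cov: it is reported as a duplicate
clash <- find_duplicate_covariates(
  spec_cov = list(cov1 = c(0.1, 0.2)),
  site_cov = list(cov1 = c(1.5, 2.5))
)
```

Here ok is an empty character vector and clash contains "cov1", so renaming one of the clashing covariates would be the fix before building the data object.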
Examples
# Generate the smallest random dataset (2 species * 2 sites * 2 reps)
I <- 2 # Number of species
J <- 2 # Number of sites
K <- 2 # Number of replicates
data <- occumbData(
  y = array(sample.int(I * J * K), dim = c(I, J, K)),
  spec_cov = list(cov1 = rnorm(I)),
  site_cov = list(cov2 = rnorm(J), cov3 = factor(1:J)),
  repl_cov = list(cov4 = matrix(rnorm(J * K), J, K))
)
# A case for named y (with species and site names)
y_named <- array(sample.int(I * J * K), dim = c(I, J, K))
dimnames(y_named) <- list(c("common species", "uncommon species"),
                          c("good site", "bad site"), NULL)
data_named <- occumbData(
  y = y_named,
  spec_cov = list(cov1 = rnorm(I)),
  site_cov = list(cov2 = rnorm(J), cov3 = factor(1:J)),
  repl_cov = list(cov4 = matrix(rnorm(J * K), J, K))
)
# A real data example
data(fish_raw)
fish <- occumbData(
  y = fish_raw$y,
  spec_cov = list(mismatch = fish_raw$mismatch),
  site_cov = list(riverbank = fish_raw$riverbank)
)
# Get an overview of the datasets
summary(data)
#> Sequence read counts:
#> Number of species, I = 2
#> Number of sites, J = 2
#> Maximum number of replicates per site, K = 2
#> Number of missing observations = 0
#> Number of replicates per site: 2 (average), 0 (sd)
#> Sequencing depth: 9 (average), 4.2 (sd)
#>
#> Species covariates:
#> cov1 (continuous)
#> Site covariates:
#> cov2 (continuous), cov3 (categorical)
#> Replicate covariates:
#> cov4 (continuous)
#>
#> Labels for species:
#> (None)
#> Labels for sites:
#> (None)
#> Labels for replicates:
#> (None)
summary(data_named)
#> Sequence read counts:
#> Number of species, I = 2
#> Number of sites, J = 2
#> Maximum number of replicates per site, K = 2
#> Number of missing observations = 0
#> Number of replicates per site: 2 (average), 0 (sd)
#> Sequencing depth: 9 (average), 3.5 (sd)
#>
#> Species covariates:
#> cov1 (continuous)
#> Site covariates:
#> cov2 (continuous), cov3 (categorical)
#> Replicate covariates:
#> cov4 (continuous)
#>
#> Labels for species:
#> common species, uncommon species
#> Labels for sites:
#> good site, bad site
#> Labels for replicates:
#> (None)
summary(fish)
#> Sequence read counts:
#> Number of species, I = 50
#> Number of sites, J = 50
#> Maximum number of replicates per site, K = 3
#> Number of missing observations = 6
#> Number of replicates per site: 2.88 (average), 0.33 (sd)
#> Sequencing depth: 77910 (average), 98034.7 (sd)
#>
#> Species covariates:
#> mismatch (continuous)
#> Site covariates:
#> riverbank (categorical)
#> Replicate covariates:
#> (None)
#>
#> Labels for species:
#> Abbottina rivularis, Acanthogobius lactipes, Acheilognathus macropterus, Acheilognathus rhombeus, Anguilla japonica, Biwia zezera, Carassius cuvieri, Carassius spp., Channa argus, Ctenopharyngodon idella, Cyprinus carpio, Gambusia affinis, Gnathopogon spp., Gymnogobius castaneus, Gymnogobius petschiliensis, Gymnogobius urotaenia, Hemibarbus spp., Hypomesus nipponensis, Hypophthalmichthys spp., Hyporhamphus intermedius, Ictalurus punctatus, Ischikauia steenackeri, Lepomis macrochirus macrochirus, Leucopsarion petersii, Megalobrama amblycephala, Micropterus dolomieu dolomieu, Micropterus salmoides, Misgurnus spp., Monopterus albus, Mugil cephalus cephalus, Mylopharyngodon piceus, Nipponocypris sieboldii, Nipponocypris temminckii, Opsariichthys platypus, Opsariichthys uncirostris uncirostris, Oryzias latipes, Plecoglossus altivelis altivelis, Pseudogobio spp., Pseudorasbora parva, Rhinogobius spp., Rhodeus ocellatus ocellatus, Salangichthys microdon, Sarcocheilichthys variegatus microoculus, Silurus asotus, Squalidus chankaensis biwae, Tachysurus tokiensis, Tanakia lanceolata, Tribolodon brandtii maruta, Tribolodon hakonensis, Tridentiger spp.
#> Labels for sites:
#> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
#> Labels for replicates:
#> L, C, R