would be great to have phylogeny but we’re not there yet: there’s hope for using reference collections and grafting COI trees onto better supported backbone trees
reference collection forthcoming
motivation
Dimensions project sampling across the Hawaiian chronosequence and stratified across the forest profile (from floor to canopy) yielded $> 10^6$ specimens.
How to deal with this massive amount of material? Too expensive to key out each specimen. This leads to metabarcoding
metabarcoding
issue: body size and DNA content: DNA content is superlinear with body length
solution:
Hawaiian arthropods have a limited size range, so we can sort into a few size categories to mostly control for DNA content
size classes are 0–2, 2–4, 4–7, >7 mm in length
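A minimal sketch of how a measured body length could be assigned to one of the four size classes above. The notes do not say which class a specimen exactly on a boundary (2, 4, or 7 mm) belongs to; the sketch assumes the upper edge is inclusive, and the function and constant names are hypothetical.

```python
from bisect import bisect_left

# Upper edges (mm) of the three bounded classes; the last class is open-ended.
EDGES_MM = [2.0, 4.0, 7.0]
LABELS = ["0-2", "2-4", "4-7", ">7"]


def size_class(length_mm):
    """Return the size-class label for a body length in mm.

    Assumption: a specimen exactly on a boundary goes to the smaller class
    (e.g. 2.0 mm -> "0-2"), which keeps ">7" strictly greater than 7 mm.
    """
    if length_mm < 0:
        raise ValueError("body length must be non-negative")
    return LABELS[bisect_left(EDGES_MM, length_mm)]


if __name__ == "__main__":
    for mm in (1.5, 2.0, 3.0, 7.0, 12.0):
        print(mm, size_class(mm))
```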
issue: PCR amplification is exponential, so bias from primer mismatch compounds with each cycle
solutions:
some patterns are interesting within close relatives where this bias does not matter, e.g. within lineage response to climate/elevation
taking average of multiple markers averages out the bias of each
statistical approaches for correcting for bias
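One way to read "taking average of multiple markers" is sketched below: convert each marker's read counts to proportions, then take the unweighted mean across markers for each taxon. The notes do not specify how the averaging is done, so this is an assumption; the function name, the second marker name, and the toy counts are hypothetical.

```python
def mean_relative_abundance(reads_by_marker):
    """Average per-taxon read proportions across markers.

    reads_by_marker maps marker name -> {taxon: read count}.
    Assumption: each marker gets equal weight, and a taxon missing
    from a marker counts as zero reads for that marker.
    """
    taxa = sorted({t for counts in reads_by_marker.values() for t in counts})
    n_markers = len(reads_by_marker)
    averaged = {t: 0.0 for t in taxa}
    for counts in reads_by_marker.values():
        total = sum(counts.values())
        for t in taxa:
            averaged[t] += counts.get(t, 0) / total / n_markers
    return averaged


if __name__ == "__main__":
    # Toy data: marker "COI" over-amplifies taxon A relative to "marker_2".
    toy = {
        "COI": {"A": 90, "B": 10},
        "marker_2": {"A": 50, "B": 50},
    }
    print(mean_relative_abundance(toy))
```

Averaging this way only helps if the markers' biases are not all in the same direction for a given taxon.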
we have a high throughput pipeline for taking sorted specimens to sequencing to analysis
issue: sorting is slow
for Hawaii beating samples
barcode by size by site by plant separately
cluster OTUs
make an OTU table by rarefying each pooled sequencing run to a standard number of sequences
also make a haplotype table by clustering “OTUs” with 0 divergence
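The two table-building steps can be sketched as follows, assuming rarefying means subsampling reads without replacement to a fixed depth, and that clustering with 0 divergence amounts to collapsing exactly identical sequences. This is an illustration, not the group's pipeline; the function names, the depth, and the toy data are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample one pooled run to `depth` reads without replacement.

    counts maps OTU id -> read count for a single sequencing run.
    """
    if sum(counts.values()) < depth:
        raise ValueError("run has fewer reads than the requested depth")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(pool, depth)))


def haplotype_table(reads):
    """Collapse reads that are exactly identical (0 divergence)."""
    return dict(Counter(reads))


if __name__ == "__main__":
    run = {"otu1": 500, "otu2": 300, "otu3": 5}
    print(rarefy(run, depth=100))
    print(haplotype_table(["ACGT", "ACGT", "ACGA"]))
```

Expanding counts into a per-read list is simple but memory-hungry for deep runs; a multivariate hypergeometric draw would do the same job more efficiently.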
similar data also exist for Hawaii environmental gradients
Collembola invasion
44% of abundance of arthropods on Big Island
could be predator release: based on gut content no spiders eat them, but we haven’t looked at carabids yet
Collembola abundance relates to substrate age: Collembola are most abundant on young and old substrates
Something to look at: does copy number vary with genetic diversity? We might expect that higher copy number allows for more diversity
Petr’s data assembly
a folder on Google Drive called "data" hosts all the data
so far we have 4 relevant data sets (all data are abundance only unless otherwise noted)
Paulo’s Azores arthropods
Dan Gruner’s Hawaiian arthropods
Jon’s Hawaii trees
Brent’s (via Isaac) Reunion spiders; includes sequence data
potential data
Jairo has data but is not ready for sharing yet, though narrow personal collaborations are possible
Joaquin has mainland European carabid data; includes phylogeny and traits
Petr compiled global mammal database: no abundance; phylo, geo distribution, traits
Christine has snail data in the process of being assembled including phylo, trait, spatial occurrences as abundance proxies
data documentation is key, especially permissions
there’s a README template to be completed by each data provider; it includes an important section on permissions
maybe we should make consistent definitions of terms of use
but we don’t want to make the form more complex because it’s already cumbersome for data providers to fill out
we’ll get more into this
How should we proceed?
Jon proposes a project first approach where project leaders solicit specific data
wouldn’t it be nice if project driven data could be shared more broadly
the proposal for now is that each project gets its own folder, which it is responsible for putting data in and maintaining