| Title: | Phylogenetic Niche Conservatism Analysis for Ecological Communities |
|---|---|
| Description: | Provides functions for testing phylogenetic niche conservatism, a key prerequisite in community assembly studies. The package integrates global functional trait data across major taxonomic groups and implements methods such as Pagel's Lambda and Blomberg's K to quantify phylogenetic signals in ecological communities. Methods are described in Münkemüller et al. (2012) <doi:10.1111/j.2041-210X.2012.00196.x>. |
| Authors: | Yan He [aut, cre], Yu Xia [aut], Rui Yang [aut], Lingfeng Mao [aut] |
| Maintainer: | Yan He <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-06-06 08:19:39 UTC |
| Source: | https://github.com/cran/PNC |
A comprehensive global database of ecological traits for amphibian species, compiled to provide insights into the life history and ecological characteristics of amphibians worldwide.
AmphiBIO
A data frame with multiple variables:
Scientific name of the amphibian species
Taxonomic genus of the species
Taxonomic family of the species
Maximum adult body mass.
Minimum age at maturation or sexual maturity.
Maximum age at maturation or sexual maturity.
Maximum adult body size. In Anura, body size is reported as snout to vent length. In Gymnophiona and Caudata, body size is reported as total length.
Minimum size at maturation or sexual maturity.
Maximum size at maturation or sexual maturity.
Maximum life span.
Minimum no. of offspring or eggs per clutch.
Maximum no. of offspring or eggs per clutch.
Maximum no. reproduction events per year.
Minimum offspring or egg size.
Maximum offspring or egg size.
Oliveira, B. F., São-Pedro, V. A., Santos-Barrera, G., Penone, C., & Costa, G. C. (2017). AmphiBIO, a global database for amphibian ecological traits. Scientific data, 4(1), 1-7. doi:10.1038/sdata.2017.123
# Load the dataset
data(AmphiBIO)
head(AmphiBIO)
Comprehensive morphological dataset for bird species, including taxonomic information from BirdLife International and detailed morphological measurements.
AVONET
A data frame with 11,009 rows and 14 columns, where each row represents a bird species:
Species scientific name
Genus name
Family name, according to BirdLife International taxonomy
Length from beak tip to skull base, in millimeters
Length from nostril anterior edge to beak tip, in millimeters
Beak width at the anterior edge of nostrils, in millimeters
Beak depth at the anterior edge of nostrils, in millimeters
Tarsus length from posterior notch between tibia and tarsus to the last scale end, in millimeters
Length from carpal joint to longest primary feather tip, in millimeters
Length from first secondary feather tip to longest primary feather tip, in millimeters
Length from carpal joint to first secondary feather tip, in millimeters
100*DK/Lw, where DK is Kipp's distance and Lw is wing length
Distance from longest rectrix tip to point where central rectrices protrude from skin, in millimeters
Species average body mass, including both male and female, in grams
This dataset provides comprehensive morphological measurements of birds, including beak, wing, tarsus, and body weight indicators. Data originates from a comprehensive study of bird morphological, ecological, and geographical characteristics.
- Taxonomic information is based on BirdLife International
- Measurements represent species averages
- The Hand-Wing Index reflects flight capability and ecological adaptation
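The hand-wing index formula given above (100*DK/Lw) can be checked with a few lines of R. This is a minimal sketch on made-up measurements; the vector names `kipps_distance` and `wing_length` are hypothetical and are not AVONET column names.

```r
# Hypothetical measurements in millimeters (not taken from AVONET)
wing_length    <- c(60, 120, 300)  # carpal joint to longest primary tip (Lw)
kipps_distance <- c(12, 48, 180)   # first secondary tip to longest primary tip (DK)

# Hand-wing index: 100 * DK / Lw
hand_wing_index <- 100 * kipps_distance / wing_length
hand_wing_index
```

Because the index is a ratio of two lengths, it is dimensionless and comparable across species of different body size.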
Tobias, J. A., Sheard, C., Pigot, A. L., Devenish, A. J. M., Yang, J., Sayol, F., Neate-Clegg, M. H. C., Alioravainen, N., Weeks, T. L., Barber, R. A., Walkden, P. A., MacGregor, H. E. A., Jones, S. E. I., Vincent, C., Phillips, A. G., Marples, N. M., Montaño-Centellas, F. A., Leandro-Silva, V., Claramunt, S., Darski, B., et al. (2022). AVONET: morphological, ecological and geographical data for all birds. Ecology Letters, 25(3), 581-597. doi:10.1111/ele.13898
data(AVONET)
head(AVONET)
The Barro Colorado Island (BCI) dataset contains comprehensive ecological data from the 50-hectare forest dynamics plot on Barro Colorado Island, Panama. This dataset includes phylogenetic information and community composition data for tropical forest species.
BCI
A list containing four main components:
A data frame with species information including species names, genus, and family classifications.
A phylogenetic tree representing species-level evolutionary relationships, rooted and including branch lengths.
A phylogenetic tree with 183 tips and 174 internal nodes, rooted and including branch lengths.
A community matrix showing species abundance across different sampling plots, with species counts for each location.
Barro Colorado Island (BCI)
Condit, R., Pérez, R., Aguilar, S., Lao, S., Foster, R., & Hubbell, S. P. (2019). Complete data from the Barro Colorado 50-ha plot: 423617 trees, 35 years, 2019 version. Dryad Digital Repository. doi:10.15146/5xcp-0d46
# Load the dataset
data(BCI)
head(BCI)
A comprehensive dataset of mammalian traits compiled from multiple sources, providing detailed ecological and biological information for various mammal species.
COMBINE
A data frame with the following columns:
Species name
Genus name
Taxonomic family
Body mass of an adult individual in grams
Weight of the brain of an adult individual in grams
Total length from tip of the nose to anus or base of the tail of an adult individual in millimeters
Total length from elbow to wrist of an adult individual in millimeters, specific to order Chiroptera
Maximum reported age at death for the species in days
The amount of time needed to reach sexual maturity in days
The amount of time needed for a female to reach sexual maturity in days
Age at which females give birth to their first litter or their young attach to teats in days
Age at first reproduction in days
Length of time of fetal growth in days
Total number of teats present in an individual of the species
Number of offspring born per litter per female
Number of litters per female per year
Time between reproduction events in days
Weight of an individual at birth in grams
Age at which primary nutritional dependency on the mother ends and independent foraging begins in days
Weight at weaning in grams
Average age of parents of the current cohort in days
The distance an animal travels from its place of birth to the place where it reproduces in kilometers
Number of individuals of the species per square kilometer
Size of the area within which everyday activities of individuals or groups of individuals are typically restricted in km²
Number of individuals in a group that spends most of their daily time together
Percentage of the diet composed of invertebrates
Percentage of the diet composed of vertebrates
Percentage of the diet composed of plants and/or fungi
Percentage of the diet composed of invertebrates
Percentage of the diet composed of mammals, birds
Percentage of the diet composed of reptiles, snakes, amphibians, salamanders
Percentage of the diet composed of fish
Percentage of the diet composed of vertebrates – general or unknown
Percentage of the diet composed of scavenge, garbage, offal, carcasses, trawlers, carrion
Percentage of the diet composed of fruit, drupes
Percentage of the diet composed of nectar, pollen, plant exudates, gums
Percentage of the diet composed of seed, maize, nuts, spores, wheat, grains
Percentage of the diet composed of other plant elements
Number of prevalent EltonTraits dietary categories consumed at 20 percent or more
Upper elevation limit at which the species can be found in meters
Lower elevation limit at which the species can be found in meters
Difference between the upper and lower elevation limits of a species in meters
Number of distinct suitable level 1 IUCN habitats
Soria, C. D., M. Pacifici, M. Di Marco, S. M. Stephen, and C. Rondinini. (2021). COMBINE: a coalesced mammal database of intrinsic and extrinsic traits. Ecology, 102(6):e03344. doi:10.1002/ecy.3344
data(COMBINE)
head(COMBINE)
This function conducts comprehensive phylogenetic niche conservatism analysis across multiple communities simultaneously. It evaluates phylogenetic signal for trait data across different community assemblages using various statistical methods, enabling comparative assessment of niche conservatism patterns among communities. The function processes community composition matrices, species trait information, and phylogenetic trees to determine whether closely related species consistently occupy similar ecological niches across different habitats or sampling locations.
compnc(
  com,
  trait_data,
  phylo_tree,
  methods = c("lambda", "K"),
  pca_axes = c("PC1", "PC2"),
  sig_levels = c(0.001, 0.01, 0.05),
  min_abundance = 0,
  nsim = 1000,
  verbose = TRUE
)
com |
A community matrix with sites as rows and species as columns |
trait_data |
A data frame or matrix containing trait data with species as rows |
phylo_tree |
A phylogenetic tree object of class "phylo" |
methods |
Character vector specifying methods to use. Options: "lambda", "K" |
pca_axes |
Character vector specifying which PCA axes to include (e.g., c("PC1", "PC2")) |
sig_levels |
Numeric vector of significance levels for marking results |
min_abundance |
Minimum abundance threshold for including species |
nsim |
Number of permutations for significance testing |
verbose |
Logical indicating whether to show progress and warnings |
A data frame containing phylogenetic signal results for all communities
# Load example data
data(BCI)
data(TRY)

# Extract trait data
sp <- colnames(BCI$com)
subtraits <- extract_traits(sp, TRY, rank = "species",
                            traits = c("LA", "LMA", "LeafN", "PlantHeight", "SeedMass", "SSD"))

compnc(com = BCI$com, subtraits, BCI$phy_species, methods = "lambda", pca_axes = NULL)
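The documentation does not say how compnc() derives each community's species set from `com` and `min_abundance`. The following sketch shows one plausible reading on a made-up matrix. It assumes that a species belongs to a community when its abundance is strictly greater than the threshold, which may not match the package's actual rule.

```r
# Toy community matrix: sites as rows, species as columns (made-up data)
com <- matrix(
  c(3, 0, 1, 0,
    0, 2, 0, 5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("site1", "site2"), c("sp_a", "sp_b", "sp_c", "sp_d"))
)
min_abundance <- 0

# Assumption: a species counts as present when its abundance exceeds the threshold
species_by_site <- lapply(rownames(com), function(site) {
  colnames(com)[com[site, ] > min_abundance]
})
names(species_by_site) <- rownames(com)
species_by_site
```

Each element of `species_by_site` could then be used to subset the trait table and prune the tree before a per-community signal test.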
This function evaluates the robustness of phylogenetic signal estimates across multiple communities by simulating trait data with the same phylogenetic signal strength as observed, applying the original missing data pattern, and testing how consistently the statistical significance is recovered across multiple simulations for each community.
compnc_robustness(
  com,
  trait_data,
  phylo_tree,
  methods = "lambda",
  pca_axes = c("PC1", "PC2"),
  sig_levels = c(0.001, 0.01, 0.05),
  min_abundance = 0,
  n_simulations = 100,
  alpha_level = 0.05,
  tolerance = 0.05,
  verbose = TRUE
)
com |
A community matrix with sites as rows and species as columns |
trait_data |
A data frame or matrix containing trait data with species as rows |
phylo_tree |
A phylogenetic tree object of class "phylo" |
methods |
Character string specifying method to use. Options: "lambda" or "K". Default is "lambda" |
pca_axes |
Character vector specifying which PCA axes to include (e.g., c("PC1", "PC2")). Default is c("PC1", "PC2") |
sig_levels |
Numeric vector of significance levels for marking results |
min_abundance |
Minimum abundance threshold for including species |
n_simulations |
Integer. Number of simulations to run for robustness testing. Default is 100 |
alpha_level |
Numeric. Significance level for statistical testing. Default is 0.05 |
tolerance |
Numeric. Acceptable difference between target and estimated signal values during trait simulation. Default is 0.05 |
verbose |
Logical indicating whether to show progress and warnings |
A data frame containing the original phylogenetic signal results with additional columns:
robustness: Percentage of simulations that maintain the same statistical significance conclusion as the original analysis
signal_sd: Standard deviation of phylogenetic signal values across successful simulations
# Load example data
data("HimalayanBirds")
str(HimalayanBirds)
data("AVONET")
head(AVONET)

# Species level
sp <- colnames(HimalayanBirds$com)
sp
subtraits <- extract_traits(sp, AVONET, rank = "species")
head(subtraits)
coverage(subtraits)

pnc(subtraits, HimalayanBirds$phy_species, methods = "lambda", pca_axes = c("PC1", "PC2"))
compnc(com = HimalayanBirds$com, subtraits, HimalayanBirds$phy_species,
       methods = "lambda", pca_axes = NULL)

# Test robustness of phylogenetic signal analysis
# This function's runtime is long
compnc_robustness(HimalayanBirds$com, subtraits, HimalayanBirds$phy_species,
                  methods = "lambda", pca_axes = NULL, n_simulations = 5)
This function calculates comprehensive coverage statistics for trait data, including individual trait coverage rates, complete case coverage, and overall data coverage. It provides both summary statistics and detailed breakdowns of missing and available data.
coverage(data)
data |
A data frame containing trait data. Each column represents a trait and each row represents an observation (e.g., species, samples). |
The function performs the following calculations:
Individual trait coverage: For each trait, calculates the number and percentage of available (non-NA) values
Complete case coverage: Counts rows with no missing values across all traits and calculates the percentage
Overall coverage: Calculates the percentage of all cells in the dataset that contain non-missing values
The function also prints the overall trait coverage rate to the console before returning the detailed summary table.
A data frame with the following columns:
Character. Names of traits plus an "All" row for complete cases
Integer. Number of non-missing values for each trait
Integer. Number of missing (NA) values for each trait
Character. Percentage of available data for each trait
The "All" row shows statistics for complete cases (rows with no missing values).
# Create sample trait data
trait_data <- data.frame(
  PlantHeight = c(1.2, 1.5, NA, 2.1, 1.8),
  LDMC = c(0.5, NA, 0.8, 1.2, 0.9),
  LA = c(15.2, 18.5, 12.3, NA, 16.7)
)

# Calculate coverage statistics
coverage(trait_data)
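The three coverage measures described above can be reproduced in base R, which helps when checking what coverage() reports. This is a sketch of the definitions only, not the package's implementation; the variable names are illustrative.

```r
trait_data <- data.frame(
  PlantHeight = c(1.2, 1.5, NA, 2.1, 1.8),
  LDMC = c(0.5, NA, 0.8, 1.2, 0.9),
  LA = c(15.2, 18.5, 12.3, NA, 16.7)
)

# Logical matrix: TRUE where a value is available
available <- !is.na(trait_data)

# Individual trait coverage: percentage of non-NA values per column
trait_coverage <- 100 * colMeans(available)

# Complete case coverage: percentage of rows with no missing values
complete_case_coverage <- 100 * mean(complete.cases(trait_data))

# Overall coverage: percentage of all cells that are non-missing
overall_coverage <- 100 * mean(available)
```

Note that complete case coverage can be much lower than any single trait's coverage, because one missing value is enough to exclude a row.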
This function extracts plant trait data from the TRY database or similar datasets for a specified list of taxa at different taxonomic ranks (species, genus, or family). For numeric traits at genus and family levels, it calculates mean values across all available records.
extract_traits(sp.list, dataset, rank = "species", traits = NULL)
sp.list |
A character vector containing the names of taxa to extract traits for. The names should match the taxonomic rank specified in the 'rank' parameter. |
dataset |
A data frame containing trait data, such as the TRY dataset used in the examples. Must contain columns named "species", "genus", and "family" for taxonomic information. |
rank |
A character string specifying the taxonomic rank to match against. Must be one of "species", "genus", or "family". Default is "species". |
traits |
A character vector specifying which traits to extract. If NULL (default), all available traits in the dataset will be extracted. Available traits are all columns except "species", "genus", and "family". |
The function performs the following operations:
Validates input parameters
Identifies available traits in the dataset
Matches input taxa with dataset entries
Reports missing taxa
Extracts trait data based on the specified taxonomic rank
For numeric traits at genus/family level, calculates mean values
For non-numeric traits, uses the first available value
Handles NaN values by converting them to NA
A data frame with taxa names as row names and trait names as column names. For species-level extraction, returns the first occurrence of each species. For genus/family-level extraction, returns mean values for numeric traits and the first occurrence for non-numeric traits. Missing values are represented as NA.
# Load the dataset
data(TRY)

# Extract all traits for species
species_list <- c("Acaena novae-zelandiae", "Adiantum capillus-veneris", "Zuelania guidonia")
extract_traits(species_list, TRY, rank = "species")

# Extract specific traits for species
extract_traits(species_list, TRY, rank = "species",
               traits = c("LA", "LMA", "LeafN", "PlantHeight", "SeedMass", "SSD"))

# Extract specific traits at genus level
genus_list <- c("Acaena", "Adiantum")
extract_traits(genus_list, TRY, rank = "genus",
               traits = c("LDMC", "PlantHeight", "SeedMass"))
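The genus-level behavior described above (mean of numeric traits, NaN converted to NA) can be illustrated with base R on a made-up table. This is a sketch under the assumption that missing values are ignored when averaging; the taxon names and values are hypothetical, and the package may aggregate differently.

```r
# Made-up species-level records (hypothetical taxa and values)
records <- data.frame(
  species = c("Genusone alpha", "Genusone beta", "Genustwo gamma"),
  genus = c("Genusone", "Genusone", "Genustwo"),
  PlantHeight = c(0.1, 0.3, NA),
  SeedMass = c(1.0, NA, 2.0),
  stringsAsFactors = FALSE
)
trait_cols <- c("PlantHeight", "SeedMass")

# Mean per genus, ignoring missing values (assumption)
genus_means <- aggregate(
  records[trait_cols],
  by = list(genus = records$genus),
  FUN = function(x) mean(x, na.rm = TRUE)
)

# A genus with no observed values yields NaN; convert it to NA
genus_means[trait_cols] <- lapply(
  genus_means[trait_cols],
  function(x) replace(x, is.nan(x), NA)
)
rownames(genus_means) <- genus_means$genus
genus_means
```

The NaN step matters because `mean()` over an empty set returns NaN rather than NA, and downstream coverage statistics typically test for NA.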
A comprehensive dataset of fish life history traits across multiple species, compiled by Thorson et al. (2023). The dataset provides various morphological, ecological, and biological characteristics of fish species.
Fishlife
A data frame with multiple variables:
Scientific species name
Genus of the fish species
Family classification
Maximum age, years
Trophic level, where 1 is primary producers, etc., dimensionless
Caudal fin height and length divided by area, dimensionless
Annual eggs produced, number/year
von Bertalanffy growth coefficient, year⁻¹
Average temperature from portion of population sampled, degrees Celsius
Maximum length, cm
von Bertalanffy asymptotic maximum length, cm
Length at 50% maturity, cm
Age at 50% sexual maturity, years
Natural mortality rate M, year⁻¹
Asymptotic maximum weight, g
Maximum body depth, cm
Maximum body width, cm
Length of lower jaw, cm
Depth of caudal peduncle, connecting caudal fin to body
Size of offspring, kg
Thorson, J. T., Maureaud, A. A., Frelat, R., Mérigot, B., Bigman, J. S., Friedman, S. T., Palomares, M. L. D., Pinsky, M. L., Price, S. A., & Wainwright, P. (2023). Identifying direct and indirect associations among traits by merging phylogenetic comparative methods and structural equation models. Methods in Ecology and Evolution, 14(5), 1243-1255. doi:10.1111/2041-210X.14076
data(Fishlife)
head(Fishlife)
The 'HimalayanBirds' dataset provides information on bird species in the Himalayas, including their species names, genera, families, phylogenetic relationships, and community composition across elevation bands. This dataset is used to explore elevational patterns of bird functional and phylogenetic diversity and the ecological processes that structure bird communities.
HimalayanBirds
A list with three components:
A data frame with 151 rows and 3 variables:
Scientific name of the bird species.
Genus of the bird species.
Family of the bird species.
A phylogenetic tree (object of class "phylo") representing the evolutionary relationships among the bird species. It contains edge, edge.length, Nnode, tip.label, and node.label.
A community matrix representing the presence (1) or absence (0) of each bird species across 12 elevation bands (ele1 to ele12). The rows represent the elevation bands, and the columns represent the bird species.
Ding, Z., Hu, H., Cadotte, M.W., Liang, J., Hu, Y., & Si, X. (2021). Elevational patterns of bird functional and phylogenetic structure in the central Himalaya. Ecography, 44(9), 1403-1417. doi:10.1111/ecog.05660
# Load the dataset
data(HimalayanBirds)
head(HimalayanBirds)
This function merges two data frames based on the 'species' column, handling missing values and column differences intelligently. It provides flexible options for resolving conflicts when the same species appears in both datasets.
merge_dataset(main_data, additional_data, priority = "main")
main_data |
A data frame containing the primary dataset. Must include a 'species' column. |
additional_data |
A data frame containing the secondary dataset. Must include a 'species' column. |
priority |
A character string specifying how to handle conflicts when both datasets contain non-missing values for the same species and column. Options are:
|
The function performs the following operations:
Combines all unique species from both datasets
Includes all columns from both datasets
Handles missing values by using available non-missing values
Resolves conflicts based on the specified priority
For duplicate species within a dataset, only the first occurrence is used
A data frame containing all unique species from both input datasets, with all columns from both datasets. The 'species' column is placed first, followed by all other columns in alphabetical order.
Both input datasets must contain a 'species' column
If a species appears multiple times in a dataset, only the first occurrence is used
When priority is "mean", non-numeric values default to main_data values
The function preserves the original data types of columns
# Create sample datasets
main_data <- data.frame(
  species = c("Abies alba", "Coussapoa trinervia", "Crataegus monogyna"),
  genus = c("Abies", "Coussapoa", "Crataegus"),
  family = c("Pinaceae", "Urticaceae", "Rosaceae"),
  LA = c(NA, 2050.24, 449.15),
  LeafN = c(13.10, 14.52, 17.46),
  Seedmass = c(53.64, NA, 95.92),
  stringsAsFactors = FALSE
)
additional_data <- data.frame(
  species = c("Abies alba", "Corydalis solida"),
  genus = c("Abies", "Corydalis"),
  family = c("Pinaceae", "Papaveraceae"),
  LA = c(25.58, NA),
  LMA = c(0.19, 0.2),
  PlantHeight = c(53.66, 0.14),
  stringsAsFactors = FALSE
)

# Merge with main data priority (default)
merge_dataset(main_data, additional_data)
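The merge rules listed above (union of species, union of columns, fill gaps from whichever dataset has a value, main data wins on conflict) can be sketched in base R. This is an illustration of the documented behavior for priority = "main" only, with a hypothetical helper `lookup()` and made-up data; it is not the package's code.

```r
main_data <- data.frame(
  species = c("sp1", "sp2"),
  LA = c(NA, 10),
  LeafN = c(1.5, 2.5),
  stringsAsFactors = FALSE
)
additional_data <- data.frame(
  species = c("sp1", "sp3"),
  LA = c(25, 30),
  LMA = c(0.2, 0.3),
  stringsAsFactors = FALSE
)

all_species <- union(main_data$species, additional_data$species)
all_cols <- sort(union(setdiff(names(main_data), "species"),
                       setdiff(names(additional_data), "species")))

# Hypothetical helper: values of one column for every species,
# NA where the species or the column is absent. match() returns
# the first occurrence, so duplicate species rows are ignored.
lookup <- function(df, col) {
  if (!col %in% names(df)) return(rep(NA, length(all_species)))
  df[[col]][match(all_species, df$species)]
}

merged <- data.frame(species = all_species, stringsAsFactors = FALSE)
for (col in all_cols) {
  from_main <- lookup(main_data, col)
  from_add <- lookup(additional_data, col)
  # Main data takes priority; the additional data only fills gaps
  merged[[col]] <- ifelse(is.na(from_main), from_add, from_main)
}
merged
```

Here `sp1` receives LA = 25 from the additional data because the main value is missing, while `sp2` keeps its main value of 10.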
This function performs in-depth phylogenetic niche conservatism analysis for communities by quantifying phylogenetic signal in trait data using multiple statistical methods. The function integrates trait data preprocessing, phylogenetic tree manipulation, optional principal component analysis, and robust statistical testing to provide detailed insights into evolutionary constraints on trait evolution.
pnc(
  trait_data,
  phylo_tree,
  methods = "lambda",
  pca_axes = c("PC1", "PC2"),
  sig_levels = c(0.001, 0.01, 0.05),
  nsim = 1000,
  verbose = TRUE
)
trait_data |
A data frame or matrix containing trait data with species as rows |
phylo_tree |
A phylogenetic tree object of class "phylo" |
methods |
Character vector specifying methods to use. Options: "lambda", "K" |
pca_axes |
Character vector specifying which PCA axes to include (e.g., c("PC1", "PC2")) |
sig_levels |
Numeric vector of significance levels for marking results |
nsim |
Number of permutations for significance testing |
verbose |
Logical indicating whether to show progress and warnings |
A data frame containing phylogenetic signal results
Münkemüller, T., Lavergne, S., Bzeznik, B., Dray, S., Jombart, T., Schiffers, K. and Thuiller, W. (2012). How to measure and test phylogenetic signal. Methods in Ecology and Evolution, 3(4), 743-756. doi:10.1111/j.2041-210X.2012.00196.x
# Load example data
data(BCI)
data(TRY)

# Extract trait data
sp <- colnames(BCI$com)
subtraits <- extract_traits(sp, TRY, rank = "species",
                            traits = c("LA", "LMA", "LeafN", "PlantHeight", "SeedMass", "SSD"))

# Calculate phylogenetic signal using Lambda method
pnc(subtraits, BCI$phy_species, methods = "lambda")

# Calculate without PCA analysis
pnc(subtraits, BCI$phy_species, methods = "lambda", pca_axes = NULL)
This function evaluates the robustness of phylogenetic signal estimates by simulating trait data with the same phylogenetic signal strength as observed, applying the original missing data pattern, and testing how consistently the statistical significance is recovered across multiple simulations.
pnc_robustness(
  trait_data,
  phylo_tree,
  methods = "lambda",
  pca_axes = c("PC1", "PC2"),
  n_simulations = 100,
  alpha_level = 0.05,
  tolerance = 0.05
)
trait_data |
A data frame or matrix containing trait data with species as rows |
phylo_tree |
A phylogenetic tree object of class "phylo" |
methods |
Character string specifying method to use. Options: "lambda" or "K". Default is "lambda" |
pca_axes |
Character vector specifying which PCA axes to include (e.g., c("PC1", "PC2")). Default is c("PC1", "PC2") |
n_simulations |
Integer. Number of simulations to run for robustness testing. Default is 100 |
alpha_level |
Numeric. Significance level for statistical testing. Default is 0.05 |
tolerance |
Numeric. Acceptable difference between target and estimated signal values during trait simulation. Default is 0.05 |
The robustness testing procedure involves:
1. Performing baseline phylogenetic signal analysis using pnc()
2. For each trait, simulating new trait data with the same phylogenetic signal strength as observed in the original data
3. Applying the exact missing data pattern from the original dataset to the simulated data
4. Re-testing phylogenetic signal on the simulated data and recording p-values
5. Calculating the percentage of simulations that maintain the same statistical significance conclusion (significant vs. non-significant)
The function uses simulate_lambda_trait() or simulate_K_trait() internally to generate trait data with target phylogenetic signal values.
For PCA axes, the missing data pattern corresponds to complete cases from the original trait matrix. For individual traits, the original missing pattern is preserved exactly.
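Step 5 of the procedure reduces to comparing significance calls between the baseline and each simulation. The sketch below shows that arithmetic on made-up p-values. It assumes significance means p strictly below `alpha_level`; the p-values and variable names are hypothetical.

```r
alpha_level <- 0.05
original_p <- 0.01                               # hypothetical baseline p-value
simulated_p <- c(0.002, 0.03, 0.20, 0.04, 0.01)  # hypothetical p-values, 5 simulations

# Significance conclusion for the baseline and for each simulation
original_significant <- original_p < alpha_level
simulated_significant <- simulated_p < alpha_level

# Percentage of simulations that reach the same conclusion as the baseline
robustness <- 100 * mean(simulated_significant == original_significant)
robustness
```

With these values four of the five simulations agree with the baseline, so the robustness is 80 percent. The same formula applies when the baseline is non-significant: agreement then means the simulation is also non-significant.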
A data frame containing the original phylogenetic signal results with additional columns:
robustness: Percentage of simulations that maintain the same statistical significance conclusion as the original analysis
signal_sd: Standard deviation of phylogenetic signal values across successful simulations
Returns the enhanced results from the baseline pnc() analysis
# Load example data
data(BCI)
data(TRY)

# Extract trait data
sp <- colnames(BCI$com)
subtraits <- extract_traits(sp, TRY, rank = "species",
                            traits = c("LA", "LMA", "LeafN", "PlantHeight"))

# Test robustness of phylogenetic signal analysis
# This function's runtime is long
pnc_robustness(subtraits, BCI$phy_species, methods = "lambda", n_simulations = 5)
A comprehensive dataset containing ecological and morphological characteristics of reptiles. The dataset provides detailed information about reptile species, including elevation, seasonal precipitation, body mass, and reproductive features.
ReptTraits
A data frame with the following columns:
Scientific species name
Genus name
Family name
Minimum elevation where the species was observed (meters above sea level)
Maximum elevation where the species was observed (meters above sea level)
Mean annual temperature, °C
Temperature seasonality, standard deviation × 100
Seasonal precipitation information
Longevity: maximum age reported for each species in the literature, years
Maximum body mass of the species (grams)
Maximum length ("SVL", mm)/straight carapace length for turtles ("SCL", mm)
Mean number of offspring or eggs per clutch
Minimum clutch/litter size
Maximum clutch/litter size
Mean of the reported body temperatures of the species, °C
Oskyrko, O., Mi, C., Meiri, S., & Du, W. (2024). ReptTraits: a comprehensive dataset of ecological traits in reptiles. Scientific Data, 11(1), 243. doi:10.1038/s41597-024-03079-5
data(ReptTraits)
head(ReptTraits)
This function generates trait data that matches a specified phylogenetic signal strength (Blomberg's K) through iterative simulation and testing.
simulate_K_trait(target_K, tree, max_attempts = 1e+05, tolerance = 0.02)
target_K |
Numeric. The desired phylogenetic signal strength (K value): K = 0, no phylogenetic signal (star phylogeny); 0 < K < 1, weaker phylogenetic signal than expected under Brownian motion; K = 1, expected signal under Brownian motion evolution; K > 1, stronger phylogenetic signal than expected under Brownian motion. |
tree |
An object of class "phylo". The phylogenetic tree for trait simulation. |
max_attempts |
Integer. Maximum number of simulation attempts before giving up. Default is 100000. |
tolerance |
Numeric. Acceptable difference between target and estimated K. Default is 0.02. |
The function works by:
1. Transforming the phylogenetic tree according to the target K value
2. Simulating trait data using phytools::fastBM() on the transformed tree
3. Estimating the phylogenetic signal using phytools::phylosig()
4. Repeating until the estimated K is within tolerance of the target
Tree transformation strategies:
- When target_K = 0: Creates a star phylogeny using ape::stree()
- When target_K = 1: Uses the original tree without transformation
- When target_K > 1: Scales all branch lengths by the target K value
- When 0 < target_K < 1: Interpolates between original tree and uniform branch lengths
A data.frame with one column named 'trait' containing the simulated trait values. Row names correspond to tip labels from the phylogenetic tree. Returns NULL if the target K cannot be achieved within the specified tolerance and attempts.
Blomberg's K measures the strength of phylogenetic signal relative to what would be expected under a Brownian motion model of evolution. Unlike Pagel's lambda, K can exceed 1, indicating stronger phylogenetic clustering than expected.
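For reference, the statistic can be written out as follows. The notation is an assumption made for this sketch: x is the vector of n tip trait values, C the phylogenetic variance-covariance matrix of the tree, and 1 a vector of ones.

```latex
\hat{a} = \frac{\mathbf{1}^{\top} C^{-1} x}{\mathbf{1}^{\top} C^{-1} \mathbf{1}}, \qquad
\mathrm{MSE}_0 = \frac{(x - \hat{a}\mathbf{1})^{\top} (x - \hat{a}\mathbf{1})}{n - 1}, \qquad
\mathrm{MSE} = \frac{(x - \hat{a}\mathbf{1})^{\top} C^{-1} (x - \hat{a}\mathbf{1})}{n - 1}

K = \frac{\mathrm{MSE}_0 / \mathrm{MSE}}
         {\dfrac{1}{n - 1}\left( \operatorname{tr}(C) - \dfrac{n}{\mathbf{1}^{\top} C^{-1} \mathbf{1}} \right)}
```

The denominator is the value of MSE_0 / MSE expected under Brownian motion on the given tree, so K = 1 means the observed ratio matches that expectation.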
The function may take considerable time to converge for certain K values. Consider increasing the tolerance parameter (accepting a looser match) if convergence is slow.
    # Generate a random tree
    tree <- ape::rtree(50)

    # Simulate a trait with signal slightly weaker than expected under Brownian motion (K = 0.9)
    trait_data <- simulate_K_trait(0.9, tree)

    # Verify the phylogenetic signal (simulate_K_trait() returns NULL if it did not converge)
    if (!is.null(trait_data)) {
      trait_vector <- setNames(trait_data$trait, rownames(trait_data))
      phytools::phylosig(tree, trait_vector, method = "K", test = TRUE)
    }
This function generates trait data that matches a specified phylogenetic signal strength (Pagel's lambda) through iterative simulation and testing.
    simulate_lambda_trait(
      target_lambda,
      tree,
      max_attempts = 1e+05,
      tolerance = 0.02
    )
| Argument | Description |
|---|---|
| target_lambda | Numeric. The desired phylogenetic signal strength (lambda value). Should be between 0 and 1. 0: no phylogenetic signal (star phylogeny); 1: full phylogenetic signal (Brownian motion). |
| tree | An object of class "phylo". The phylogenetic tree for trait simulation. |
| max_attempts | Integer. Maximum number of simulation attempts before giving up. Default is 100000. |
| tolerance | Numeric. Acceptable difference between target and estimated lambda. Default is 0.02. |
The function works by:
1. Transforming the phylogenetic tree according to the target lambda value using rescale()
2. Simulating trait data using fastBM() on the transformed tree
3. Estimating the phylogenetic signal using phylosig()
4. Repeating until the estimated lambda is within tolerance of the target
Special cases:
- When target_lambda = 0: Sets internal branch lengths to 0, keeping only terminal branches
- When target_lambda = 1: Uses the original tree without transformation
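The effect of the lambda transformation is easiest to see on the phylogenetic covariance matrix, where lambda scales the shared history between tips (off-diagonal entries) and leaves each tip's total variance (diagonal) unchanged. The sketch below is an illustration in base R under that standard definition, not the package's code, which operates on the tree via rescale(); `lambda_transform` and the toy matrix `C` are hypothetical names.

```r
# Pagel's lambda applied to a covariance matrix C.
lambda_transform <- function(C, lambda) {
  lambda <- min(lambda, 1)   # values above 1 are capped at 1
  out <- C * lambda          # scale every entry ...
  diag(out) <- diag(C)       # ... then restore the tip variances
  out
}

# Toy covariance matrix for a balanced four-tip tree ((A,B),(C,D)) of depth 1.
C <- matrix(c(1,   0.5, 0,   0,
              0.5, 1,   0,   0,
              0,   0,   1,   0.5,
              0,   0,   0.5, 1),
            nrow = 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))

C_star <- lambda_transform(C, 0)    # no shared history: star phylogeny
C_half <- lambda_transform(C, 0.5)  # shared history halved
C_full <- lambda_transform(C, 1)    # unchanged: plain Brownian motion
```

With lambda = 0 the matrix becomes diagonal, which corresponds to the star-like tree described in the first special case; with lambda = 1 it is returned unchanged.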
A data.frame with one column named 'trait' containing the simulated trait values. Row names correspond to tip labels from the phylogenetic tree. Returns NULL if the target lambda cannot be achieved within the specified tolerance and attempts.
The function may take considerable time to converge for certain lambda values, especially intermediate ones.
Consider increasing the tolerance parameter (accepting a looser match) if convergence is slow.
If 'target_lambda' is greater than 1, it will be automatically capped at 1, as lambda values typically range from 0 to 1.
    # Generate a random tree
    tree <- ape::rtree(50)

    # Simulate trait with strong phylogenetic signal
    trait_data <- simulate_lambda_trait(0.8, tree)

    # Verify the phylogenetic signal (simulate_lambda_trait() returns NULL if it did not converge)
    if (!is.null(trait_data)) {
      trait_vector <- setNames(trait_data$trait, rownames(trait_data))
      phytools::phylosig(tree, trait_vector, method = "lambda", test = TRUE)
    }
A comprehensive global database of plant functional traits from the TRY initiative. This dataset contains standardized measurements of key plant functional traits across multiple species, genera, and families.
    TRY
A data frame with 58,964 rows and 23 variables:
Character. Species name
Character. Genus name
Character. Family name
Numeric. Dispersal unit length, mm. (TraitID: 237)
Numeric. Leaf area (in case of compound leaves: leaflet, undefined if petiole is in- or excluded), mm2. (TraitID: 3113)
Numeric. Leaf dry mass per leaf fresh mass (leaf dry matter content, LDMC), g/g. (TraitID: 47)
Numeric. Leaf carbon (C) content per leaf dry mass, mg/g. (TraitID: 13)
Numeric. Leaf nitrogen (N) content per leaf dry mass, mg/g. (TraitID: 14)
Numeric. Leaf nitrogen/phosphorus (N/P) ratio, g/g. (TraitID: 56)
Numeric. Leaf nitrogen (N) content per leaf area, g m-2. (TraitID: 50)
Numeric. Leaf phosphorus (P) content per leaf dry mass, mg/g. (TraitID: 15)
Numeric. Leaf nitrogen (N) isotope signature (delta 15N), per mill. (TraitID: 78)
Numeric. Leaf fresh mass, g. (TraitID: 163)
Numeric. Leaf mass per area. (1/SLA)
Numeric. Plant height vegetative, m. (TraitID: 3106)
Numeric. Root rooting depth, m. (TraitID: 6)
Numeric. Seed length, mm. (TraitID: 27)
Numeric. Seed dry mass, mg. (TraitID: 26)
Numeric. Seed number per reproduction unit, number. (TraitID: 138)
Numeric. Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA): petiole excluded, mm2 mg-1. (TraitID: 3115)
Numeric. Stem specific density (SSD, stem dry mass per stem fresh volume) or wood density, g/cm3. (TraitID: 4)
Numeric. Stem conduit density (vessels and tracheids), mm-2. (TraitID: 169)
Numeric. Wood vessel element length; stem conduit (vessel and tracheids) element length, micro m. (TraitID: 282)
The TRY database represents a global effort to compile plant functional trait data from multiple sources and research groups. Plant functional traits are morphological, physiological, and phenological characteristics that influence fitness and ecosystem functioning. This dataset includes key traits related to:
Leaf economics (SLA, LDMC, leaf nutrients)
Plant architecture (height, rooting depth)
Reproductive strategy (seed mass, seed number)
Wood anatomy (vessel length, conduit density)
Chemical composition (C, N, P content)
Missing values (NA) are common in trait databases due to the difficulty of measuring all traits for all species.
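Before analysing a trait, it is usually worth checking how many species have a value for it. The sketch below shows one way to do that in base R on a toy data frame; the column names are hypothetical stand-ins and are not the column names of the TRY dataset.

```r
# Toy trait table with missing values (hypothetical column names).
traits <- data.frame(
  species      = c("sp1", "sp2", "sp3", "sp4"),
  height_m     = c(1.2, NA, 0.4, 3.0),
  seed_mass_mg = c(NA, NA, 2.5, 10)
)

# Restrict to numeric trait columns.
is_trait <- vapply(traits, is.numeric, logical(1))

# Fraction of species with a non-missing value, per trait.
coverage <- colMeans(!is.na(traits[is_trait]))

# Species with complete data for all numeric traits.
complete <- traits[complete.cases(traits[is_trait]), ]
```

The same pattern applies to a loaded dataset by replacing `traits` with the data frame of interest.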
TRY Plant Trait Database (https://www.try-db.org/)
Kattge, J., Bönisch, G., Díaz, S., et al. (2020). TRY plant trait database – enhanced coverage and open access. Global Change Biology, 26(1), 119-188. doi:10.1111/gcb.14904
    # Load the dataset
    data(TRY)