Q1: How should I interpret PCA, PCoA, and NMDS plots?
A: In these plots, each dot represents a sample, and dots of the same color belong to the same sample group. The distance between two dots reflects how dissimilar the corresponding samples are: the closer two dots are, the more similar the samples. Note that NMDS preserves only the rank order of the dissimilarities, so its distances should be read as relative rather than exact.
Q2: What is the difference between PCA and PCoA?
A: Principal Component Analysis (PCA) simplifies a complex dataset by finding the axes (principal components) that capture the most variance and projecting the samples onto the first few of them, so that differences among groups can be viewed on a 2D plot. Principal Coordinates Analysis (PCoA) is a closely related dimensionality reduction and ordination method.
The key difference lies in the input data. PCA works directly on the original species composition matrix and implicitly preserves Euclidean distances between samples, so it mainly reflects differences in species abundance. PCoA first calculates a sample-by-sample distance matrix with a chosen dissimilarity measure (for example Bray-Curtis, Jaccard, or UniFrac), then finds coordinates whose point-to-point distances approximate those dissimilarities as closely as possible. This lets PCoA represent measures based on presence/absence or phylogeny as quantitative positions on the plot. When Euclidean distance is used, PCoA gives the same ordination as PCA.
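The contrast above can be sketched in a few lines of NumPy. This is a minimal illustration, not the pipeline behind any particular report: the abundance table `abund` is made-up toy data, the function names are hypothetical, and Bray-Curtis is chosen only as an example dissimilarity.

```python
import numpy as np

def pca_scores(X, n_axes=2):
    """PCA: SVD of the column-centered abundance matrix (rows = samples)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_axes] * s[:n_axes]

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between the rows of X."""
    n = X.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = np.abs(X[i] - X[j]).sum() / (X[i] + X[j]).sum()
    return D

def euclidean(X):
    """Pairwise Euclidean distance between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pcoa_scores(D, n_axes=2):
    """PCoA (classical MDS): double-center the squared distances,
    then take the leading eigenvectors scaled by sqrt(eigenvalue)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_axes]
    # Negative eigenvalues can occur for non-Euclidean measures; clip them to 0.
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))

# Toy table (assumption): 4 samples x 4 species, two groups of two samples.
abund = np.array([
    [30.0, 10.0,  0.0,  5.0],
    [28.0, 12.0,  1.0,  4.0],
    [ 2.0,  3.0, 40.0, 20.0],
    [ 1.0,  5.0, 38.0, 22.0],
])

pca_xy = pca_scores(abund)                  # PCA on the raw matrix
pcoa_bc = pcoa_scores(bray_curtis(abund))   # PCoA on Bray-Curtis
pcoa_eu = pcoa_scores(euclidean(abund))     # PCoA on Euclidean distance
```

With Euclidean distance, `pcoa_eu` should match `pca_xy` up to the sign of each axis, which is the sense in which PCA is a special case of PCoA. Swapping in another dissimilarity changes only the matrix passed to `pcoa_scores`.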
Q3: How should I proceed if there is high variation among biological replicates within a group?
A: You should analyze the sample preparation process. Besides the intended experimental conditions, samples within a group might be influenced by various other factors, leading to discrepancies in the results.
Sampling: For environmental samples collected from large areas (e.g., soil, water, sludge), it is recommended to use composite sampling, that is, mixing material from multiple points to form a single biological replicate, to reduce variation between individual sampling points.
Individual Variation: If a specific sample appears as a clear outlier, the cause is most likely specific to that sample rather than to the experimental condition. It is recommended to remove this outlier and re-run the analysis. If removing the outlier leaves you with too few biological replicates, consider sending additional samples to meet the replicate requirement.
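One way to make the outlier check less subjective is to compare each replicate's average dissimilarity to the other replicates in its group. The sketch below is only an illustration under stated assumptions: the function name `flag_outliers`, the `ratio` cutoff of 2.0, and the toy distance matrix are all hypothetical, and the rule has little power with very few replicates (groups with fewer than three samples are skipped).

```python
import numpy as np

def flag_outliers(D, groups, ratio=2.0):
    """Return indices of samples whose mean dissimilarity to the other
    replicates in their group exceeds `ratio` times the group median
    of that statistic. `ratio` is an arbitrary illustrative cutoff."""
    groups = np.asarray(groups)
    flagged = []
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        if len(idx) < 3:
            continue  # too few replicates to judge
        sub = D[np.ix_(idx, idx)]
        # Diagonal is 0, so dividing the row sum by (n - 1) gives the
        # mean distance to the other replicates.
        mean_d = sub.sum(axis=1) / (len(idx) - 1)
        cutoff = ratio * np.median(mean_d)
        flagged.extend(int(i) for i in idx[mean_d > cutoff])
    return flagged

# Toy dissimilarity matrix (assumption): samples 0-2 are close to each
# other, sample 3 is far from all of them.
D = np.array([
    [0.0, 0.1, 0.1, 0.8],
    [0.1, 0.0, 0.1, 0.8],
    [0.1, 0.1, 0.0, 0.8],
    [0.8, 0.8, 0.8, 0.0],
])
groups = ["A", "A", "A", "A"]

outliers = flag_outliers(D, groups)
print(outliers)  # [3]
```

A flagged sample is a candidate for review, not automatic removal; the decision should still take the sampling and preparation records into account.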