| Title: | Tidying, Analysis, and Fast Visualization of Animal and Plant Pedigrees |
|---|---|
| Description: | Provides tools for the analysis and visualization of animal and plant pedigrees. Analytical methods include equivalent complete generations, generation intervals, effective population size (via inbreeding, coancestry, and demographic approaches), founder and ancestor contributions, partial inbreeding, genetic diversity indices, and additive (A), dominance (D), and epistatic (AA) relationship matrices. Core algorithms — ancestry tracing, topological sorting, inbreeding coefficients, and matrix construction — are implemented in C++ ('Rcpp', 'RcppArmadillo') and 'data.table', scaling to pedigrees with over one million individuals. Pedigree graphs are rendered via 'igraph' with support for compact full-sib family display; relationship matrices can be visualized as heatmaps. Supports complex mating systems, including selfing and pedigrees in which the same individual can appear as both sire and dam. |
| Authors: | Sheng Luan [aut, cre] |
| Maintainer: | Sheng Luan <[email protected]> |
| License: | GPL-3 |
| Version: | 1.8.1 |
| Built: | 2026-05-14 05:54:13 UTC |
| Source: | https://github.com/luansheng/vispedigree |
Intercepts data.table's [ method for tidyped objects.
After subsetting, the method checks whether the result is still a valid
pedigree (all referenced parents still present). If so, IndNum,
SireNum, and DamNum are rebuilt and the tidyped class
is preserved. If the pedigree becomes structurally incomplete (missing parent
records), the result is degraded to a plain data.table with a warning.
Selections that drop any core pedigree column also return a plain
data.table.
## S3 method for class 'tidyped'
x[...]
x |
A |
... |
Arguments passed to the |
A tidyped object if the result is still a complete pedigree,
otherwise a plain data.table.
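The completeness rule described above can be illustrated with a minimal base-R sketch. `is_complete_pedigree()` is a hypothetical helper written for this illustration, not a function exported by the package, and it only assumes the `Ind`, `Sire`, and `Dam` columns.

```r
# Hypothetical helper: TRUE when every non-missing Sire/Dam still appears in Ind.
is_complete_pedigree <- function(ped) {
  parents <- unique(c(ped$Sire, ped$Dam))
  parents <- parents[!is.na(parents)]
  all(parents %in% ped$Ind)
}

ped <- data.frame(
  Ind  = c("A", "B", "C", "D"),
  Sire = c(NA, NA, "A", "C"),
  Dam  = c(NA, NA, "B", "B"),
  stringsAsFactors = FALSE
)

complete_full   <- is_complete_pedigree(ped)                    # all parents present
complete_subset <- is_complete_pedigree(ped[ped$Ind != "A", ])  # sire "A" was dropped
```

A subset that fails this kind of check is the case in which the method falls back to a plain data.table.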
Rapidly restores the tidyped class to a data.table or
data.frame that was previously processed by tidyped()
but lost its class attributes due to data manipulation.
as_tidyped(x)
x |
A |
This is a lightweight operation that only checks for the required columns and re-attaches the class—it does not re-run the full pedigree sorting, generation inference, or loop detection.
This helper is intended for objects that still contain the core pedigree
columns and numeric indices, but no longer inherit from tidyped.
A common reproducible case is rbind() on two tidyped
fragments, which typically returns a plain data.table. Converting a
tidyped object to a plain data.frame and then subsetting it
also drops the class.
Some operations, such as merge() or certain dplyr workflows, may or
may not preserve the tidyped class depending on the versions of
data.table, dplyr, and the exact method dispatch path used in
the current R session. Therefore, as_tidyped() should be viewed as a
safe recovery helper rather than something only needed after one specific
verb.
Typical class-loss scenarios include:
rbind(tped1, tped2) — often returns plain data.table
as.data.frame(tped)[rows, ] — returns plain data.frame
manual class removal or serialization / import workflows
After such operations, downstream analysis functions (e.g.,
pedstats, pedne) will either error or
automatically restore the class. You can also call as_tidyped()
explicitly to restore the class yourself.
A tidyped object.
library(visPedigree)
tp <- tidyped(simple_ped)
class(tp)
# [1] "tidyped" "data.table" "data.frame"
# Simulate class loss via rbind()
tp2 <- rbind(tp[1:5], tp[6:10])
class(tp2)
# [1] "data.table" "data.frame"
# Restore the class
tp3 <- as_tidyped(tp2)
class(tp3)
# [1] "tidyped" "data.table" "data.frame"
# It can also restore from a plain data.frame if core columns are intact
tp_df <- as.data.frame(tp)
tp4 <- tp_df[tp_df$Gen > 1, ]
class(tp4)
# [1] "data.frame"
tp5 <- as_tidyped(tp4)
class(tp5)
# [1] "tidyped" "data.table" "data.frame"
A dataset containing a pedigree with many full-sib individuals per family.
big_family_size_ped
A data.table with 8 columns:
Individual ID
Sire ID
Dam ID
Sex of the individual
Year of birth
Numeric ID for individual
Numeric ID for sire
Numeric ID for dam
A dataset containing a large, complex pedigree covering about 100 generations, useful for testing the performance and accuracy of partial inbreeding and similar calculations.
complex_ped
A data.table with a standard pedigree structure.
A dataset containing a pedigree with many generations.
deep_ped
A data.table with 4 columns:
Individual ID
Sire ID
Dam ID
Sex of the individual
Restores a compact pedmat to its original dimensions by mapping each individual to their family representative's values. For non-compact matrices, returns the matrix unchanged.
expand_pedmat(x)
x |
A pedmat object from |
For compact matrices, full-siblings within the same family will have identical relationship values in the expanded matrix because they shared the same representative during calculation.
Matrix or vector with original pedigree dimensions:
Matrices: Row and column names set to all individual IDs
Vectors (e.g., method="f"): Names set to all individual IDs
The result is not a pedmat object (S3 class stripped).
tped <- tidyped(small_ped)
# Compact matrix
A_compact <- pedmat(tped, method = "A", compact = TRUE)
dim(A_compact) # Reduced dimensions
# Expand to full size
A_full <- expand_pedmat(A_compact)
dim(A_full) # Original dimensions restored
# Non-compact matrices are returned unchanged
A <- pedmat(tped, method = "A", compact = FALSE)
A2 <- expand_pedmat(A)
identical(dim(A), dim(A2)) # TRUE
A dataset from ENDOG containing individuals with a single missing parent (half founders). Useful for testing that gene-flow algorithms correctly conserve probability mass for missing lineages.
half_founder_ped
A data.frame with 4 columns:
Individual ID
Sire ID
Dam ID
Sex of the individual
Check whether a tidyped object contains candidate flags
has_candidates(x)
x |
A tidyped object. |
Logical scalar.
Check whether a tidyped object contains inbreeding coefficients
has_inbreeding(x)
x |
A tidyped object. |
Logical scalar.
A simulated pedigree designed to demonstrate high levels of inbreeding and partial inbreeding decomposition. Contains full-sib mating and backcrossing.
inbred_ped
A data.table with 5 columns:
Individual ID
Sire ID
Dam ID
Sex of the individual
Generation number
The inbreed() function calculates the inbreeding coefficients for all individuals in a tidied pedigree.
inbreed(ped, ...)
ped |
A |
... |
Additional arguments (currently ignored). |
This function takes a pedigree tidied by the tidyped function and
calculates the inbreeding coefficients using an optimized C++ implementation of
the Sargolzaei & Iwaisaki (2005) LAP (Longest Ancestral Path) bucket algorithm.
This method is the fastest known direct algorithm for computing all inbreeding
coefficients: it replaces the $O(N)$ linear scan of Meuwissen & Luo (1992)
with $O(1)$ bucket pops and selective ancestor clearing, giving
$O(\sum_i m_i)$ total work, where $m_i$ is the number of distinct
ancestors of individual $i$. In the reported benchmark, the kernel
completes in approximately 0.12 s, over 10× faster than the previous
Meuwissen & Luo (1992) implementation and on par with the pedigreemm
reference C implementation of the same algorithm.
It is the core engine used by both tidyped(..., inbreed = TRUE) and
pedmat(..., method = "f"), ensuring consistent results across the package.
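For intuition about what an inbreeding coefficient is, the sketch below computes F on a five-animal pedigree with the simple tabular method. This is deliberately not the LAP bucket algorithm used by the package; it is a slow reference approach, and `inbreeding_tabular()` is a hypothetical name. It assumes parents precede offspring and that unknown parents are coded 0.

```r
# Tabular method: build the additive relationship matrix row by row.
# sire/dam are integer row indices (0 = unknown); parents must precede offspring.
inbreeding_tabular <- function(sire, dam) {
  n <- length(sire)
  A <- matrix(0, n, n)
  for (i in seq_len(n)) {
    s <- sire[i]
    d <- dam[i]
    # Relationship with each earlier individual: mean of the parents' entries.
    for (j in seq_len(i - 1)) {
      a_sj <- if (s > 0) A[s, j] else 0
      a_dj <- if (d > 0) A[d, j] else 0
      A[i, j] <- A[j, i] <- 0.5 * (a_sj + a_dj)
    }
    # Diagonal is 1 + F_i, where F_i is half the relationship between the parents.
    f_i <- if (s > 0 && d > 0) 0.5 * A[s, d] else 0
    A[i, i] <- 1 + f_i
  }
  diag(A) - 1
}

# Animals 1 and 2 are founders, 3 and 4 are full sibs, 5 is from mating 3 x 4.
sire <- c(0, 0, 1, 1, 3)
dam  <- c(0, 0, 2, 2, 4)
f <- inbreeding_tabular(sire, dam)
```

The last animal is the offspring of a full-sib mating, so its coefficient is 0.25 while all others are 0.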
A tidyped object with an additional column f.
library(visPedigree)
data(simple_ped)
ped <- tidyped(simple_ped)
ped_f <- inbreed(ped)
ped_f[f > 0, .(Ind, Sire, Dam, f)]
Test if an object is a tidyped
is_tidyped(x)
x |
An object to test. |
Logical scalar.
A dataset containing a pedigree with circular mating loops.
loop_ped
A data.table with 3 columns:
Individual ID
Sire ID
Dam ID
Estimates the proportion of genes for each individual that originates from specific founder groups (e.g., breeds, source populations).
pedancestry(ped, foundervar, target_labels = NULL)
ped |
A |
foundervar |
Character. The name of the column containing founder-group labels (e.g., "Breed", "Origin"). |
target_labels |
Character vector. Specific founder-group labels to track.
If NULL, all unique labels in |
A data.table with columns:
Ind: Individual ID.
One column per tracked label (named after each unique value in
foundervar among founders, or as specified by
target_labels).
Each value gives the proportion of genes (0–1) originating from that
founder group. Row sums across all label columns equal 1.
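The proportions can be understood as a simple recursion: a founder carries all of its genes from its own group, and each offspring averages its parents' proportions. The base-R sketch below illustrates that idea under simplifying assumptions (both parents known or both unknown, parents listed before offspring, unknown parents coded 0). `ancestry_proportions()` is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper: propagate founder-group proportions down a sorted pedigree.
ancestry_proportions <- function(sire, dam, label) {
  groups <- sort(unique(label[!is.na(label)]))
  P <- matrix(0, length(sire), length(groups), dimnames = list(NULL, groups))
  for (i in seq_along(sire)) {
    if (sire[i] == 0 && dam[i] == 0) {
      P[i, label[i]] <- 1                           # founder: all genes from own group
    } else {
      P[i, ] <- 0.5 * (P[sire[i], ] + P[dam[i], ])  # offspring: average of both parents
    }
  }
  P
}

# Founders 1 (LineA), 2 and 3 (LineB); 4 = 1 x 2; 5 = 4 x 3.
sire  <- c(0, 0, 0, 1, 4)
dam   <- c(0, 0, 0, 2, 3)
label <- c("LineA", "LineB", "LineB", NA, NA)
P <- ancestry_proportions(sire, dam, label)
```

Individual 5 ends up with one quarter LineA and three quarters LineB, and every row sums to 1.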
library(data.table) # Create dummy labels for founders tp <- tidyped(small_ped) tp_dated <- copy(tp) founders <- tp_dated[is.na(Sire) & is.na(Dam), Ind] # Assign 'LineA' and 'LineB' tp_dated[Ind %in% founders[1:(length(founders)/2)], Origin := "LineA"] tp_dated[is.na(Origin), Origin := "LineB"] # Calculate ancestry proportions for all individuals anc <- pedancestry(tp_dated, foundervar = "Origin") print(tail(anc))library(data.table) # Create dummy labels for founders tp <- tidyped(small_ped) tp_dated <- copy(tp) founders <- tp_dated[is.na(Sire) & is.na(Dam), Ind] # Assign 'LineA' and 'LineB' tp_dated[Ind %in% founders[1:(length(founders)/2)], Origin := "LineA"] tp_dated[is.na(Origin), Origin := "LineB"] # Calculate ancestry proportions for all individuals anc <- pedancestry(tp_dated, foundervar = "Origin") print(tail(anc))
Calculates genetic contributions from founders and influential ancestors. Implements the gene dropping algorithm for founder contributions and Boichard's algorithm for ancestor contributions to estimate the effective number of founders ($f_e$) and ancestors ($f_a$).
pedcontrib(
  ped,
  reference = NULL,
  mode = c("both", "founder", "ancestor"),
  top = 20
)
ped |
A |
reference |
Character vector. Optional subset of individual IDs defining the reference population. If NULL, uses all individuals in the most recent generation. |
mode |
Character. Type of contribution to calculate:
|
top |
Integer. Number of top contributors to return. Default is 20. |
**Founder Contributions ($f_e$):** Calculated by probabilistic gene flow from founders to the reference cohort. When an ancestor has exactly one unknown parent, a temporary "phantom" parent is injected so that the probability mass is conserved.
**Ancestor Contributions ($f_a$):** Calculated using Boichard's iterative algorithm (1997), accounting for:
Marginal genetic contribution of each ancestor
Long-term contributions through multiple pathways
The parameter $f_a$ acts as a stringent metric since it identifies the bottlenecks of genetic variation in pedigrees.
A list with class pedcontrib containing:
founders: A data.table of founder contributions (if mode is "founder" or "both").
ancestors: A data.table of ancestor contributions (if mode is "ancestor" or "both").
summary: A list of summary statistics including:
f_e: Classical effective number of founders ($q = 2$, Lacy 1989).
f_e_H: Information-theoretic effective number of founders
($q = 1$, Shannon entropy): $f_e^{(H)} = \exp(-\sum_i p_i \ln p_i)$, where $p_i$ is the contribution of founder $i$.
f_a: Classical effective number of ancestors ($q = 2$, Boichard 1997).
f_a_H: Information-theoretic effective number of ancestors
($q = 1$): $f_a^{(H)} = \exp(-\sum_k a_k \ln a_k)$, where $a_k$ is the marginal contribution of ancestor $k$.
Each contribution table contains:
Ind: Individual ID.
Contrib: Contribution to the reference population (0-1).
CumContrib: Cumulative contribution.
Rank: Rank by contribution.
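As a worked illustration of the two effective numbers, the sketch below applies the classical ($q = 2$) and Shannon ($q = 1$) formulas to a made-up contribution vector. The values in `p` are illustrative only and do not come from any package dataset.

```r
# Illustrative founder contributions to a reference population (must sum to 1).
p <- c(0.4, 0.3, 0.2, 0.1)

# Classical effective number (q = 2): inverse of the sum of squared contributions.
f_e <- 1 / sum(p^2)

# Shannon effective number (q = 1): exponential of the Shannon entropy.
f_e_H <- exp(-sum(p * log(p)))

# With equal contributions both measures equal the number of founders.
p_equal   <- rep(0.25, 4)
f_e_equal <- 1 / sum(p_equal^2)
```

With unequal contributions the ordering is number of founders >= f_e_H >= f_e, which is why the ratio f_e_H / f_e is at least 1.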
Boichard, D., Maignel, L., & Verrier, É. (1997). The value of using probabilities of gene origin to measure genetic variability in a population. Genetics Selection Evolution, 29(1), 5-23.
library(visPedigree)
library(data.table)
# Load a sample pedigree
tp <- tidyped(small_ped)
# Calculate both founder and ancestor contributions for reference population
ref_ids <- c("Z1", "Z2", "X", "Y")
contrib <- pedcontrib(tp, reference = ref_ids, mode = "both")
# Print results including f_e, f_e(H), f_a, and f_a(H)
print(contrib)
# Access Shannon-entropy effective numbers directly
contrib$summary$f_e_H # Information-theoretic effective founders (q=1)
contrib$summary$f_e # Classical effective founders (q=2)
contrib$summary$f_a_H # Information-theoretic effective ancestors (q=1)
contrib$summary$f_a # Classical effective ancestors (q=2)
# Diversity ratio rho > 1 indicates long-tail founder value
contrib$summary$f_e_H / contrib$summary$f_e
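Estimates the number of equivalent complete generations traced for each individual using the Equi-Generate Coefficient (ECG). Each known parent contributes half of its own ECG plus 1, so $ECG_i = \frac{1}{2}\sum_{p}(ECG_p + 1)$ summed over the known parents $p$ of individual $i$; an individual with no known parents has ECG = 0.
pedecg(ped)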
ped |
A |
A data.table with columns:
Ind: Individual ID.
ECG: Equi-Generate Coefficient.
FullGen: Fully traced generations (depth of complete ancestry).
MaxGen: Maximum ancestral generations (depth of deepest ancestor).
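The ECG recursion can be sketched in a few lines of base R. This reads the definition as "each known parent contributes half of (its ECG + 1)"; `ecg_recursive()` is a hypothetical helper that assumes parents precede offspring and unknown parents are coded 0.

```r
# Hypothetical helper: ECG by a single pass over a sorted pedigree.
ecg_recursive <- function(sire, dam) {
  ecg <- numeric(length(sire))
  for (i in seq_along(sire)) {
    from_sire <- if (sire[i] > 0) ecg[sire[i]] + 1 else 0  # unknown parent adds 0
    from_dam  <- if (dam[i] > 0) ecg[dam[i]] + 1 else 0
    ecg[i] <- 0.5 * (from_sire + from_dam)
  }
  ecg
}

# 1 and 2 are founders; 3 = 1 x 2; 4 has sire 3 and an unknown dam; 5 = 3 x 4.
sire <- c(0, 0, 1, 3, 3)
dam  <- c(0, 0, 2, 0, 4)
ecg  <- ecg_recursive(sire, dam)
```

Individual 4 has only one known parent yet reaches ECG = 1, which shows how the coefficient blends pedigree depth with completeness.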
Boichard, D., Maignel, L., & Verrier, É. (1997). The value of using probabilities of gene origin to measure genetic variability in a population. Genetics Selection Evolution, 29(1), 5-23.
tp <- tidyped(simple_ped)
ecg <- pedecg(tp)
# ECG combines pedigree depth and completeness
head(ecg)
# Individuals with deeper and more complete ancestry have larger ECG values
ecg[order(-ECG)][1:5]
Classifies individuals into inbreeding levels based on their inbreeding coefficients (F) according to standard or user-defined thresholds.
pedfclass(ped, breaks = c(0.0625, 0.125, 0.25), labels = NULL)
ped |
A |
breaks |
Numeric vector of strictly increasing positive upper bounds for
inbreeding classes. Default is |
labels |
Optional character vector of interval labels. If |
The default thresholds follow common pedigree interpretation rules:
F = 0.0625: approximately the offspring of half-sib mating.
F = 0.125: approximately the offspring of avuncular or grandparent-grandchild mating.
F = 0.25: approximately the offspring of full-sib or parent-offspring mating.
Therefore, assigning F = 0.25 to the class "0.125 < F <= 0.25"
is appropriate. If finer reporting is needed, supply custom breaks,
for example to separate 0.25, 0.375, or 0.5.
A data.table with 3 columns:
FClass: An ordered factor. By default it contains 5 levels:
"F = 0", "0 < F <= 0.0625", "0.0625 < F <= 0.125",
"0.125 < F <= 0.25", and "F > 0.25". The number of levels
equals length(breaks) + 2 (the fixed zero class plus one class per
bounded interval plus the open-ended tail).
Count: Integer. Number of individuals in each class.
Percentage: Numeric. Percentage of individuals in each class.
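The class boundaries (a fixed zero class, right-closed bounded intervals, and an open tail) can be reproduced with base R. The sketch below is an illustration of the binning rule only; `classify_f()` is a hypothetical helper and is not how the package necessarily implements it.

```r
# Hypothetical helper: bin inbreeding coefficients into ordered classes.
classify_f <- function(f, breaks = c(0.0625, 0.125, 0.25)) {
  bounded <- paste0(c(0, head(breaks, -1)), " < F <= ", breaks)
  labels  <- c("F = 0", bounded, paste0("F > ", tail(breaks, 1)))
  # left.open = TRUE makes intervals (lower, upper]; F = 0 falls below the first one.
  idx <- findInterval(f, c(0, breaks), left.open = TRUE) + 1L
  factor(labels[idx], levels = labels, ordered = TRUE)
}

f_values <- c(0, 0.0625, 0.25, 0.375)
classes  <- classify_f(f_values)
counts   <- table(classes)
```

Because the intervals are closed on the right, F = 0.25 lands in "0.125 < F <= 0.25" rather than in the tail class.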
tp <- tidyped(simple_ped, addnum = TRUE)
pedfclass(tp)
# Finer custom classes (4 breaks, labels auto-generated)
pedfclass(tp, breaks = c(0.03125, 0.0625, 0.125, 0.25))
# Custom labels aligned to breaks (3 labels for 3 breaks; tail is auto)
pedfclass(tp, labels = c("Low", "Moderate", "High"))
tp_inbred <- tidyped(inbred_ped, addnum = TRUE)
pedfclass(tp_inbred)
Computes the generation intervals for the four gametic pathways: Sire to Son (SS), Sire to Daughter (SD), Dam to Son (DS), and Dam to Daughter (DD). The generation interval is defined as the age of the parents at the birth of their offspring.
pedgenint(
  ped,
  timevar = NULL,
  unit = c("year", "month", "day", "hour"),
  format = NULL,
  cycle = NULL,
  by = NULL
)
ped |
A |
timevar |
Character. The name of the column containing the birth date (or hatch date) of each individual. The column must be one of:
If |
unit |
Character. Output time unit for the interval:
|
format |
Character. Optional format string for parsing |
cycle |
Numeric. Optional target (designed) length of one generation
cycle expressed in
where |
by |
Character. Optional grouping column (e.g., |
Parent-offspring pairs with zero or negative intervals are excluded from
the calculation because they typically indicate data entry errors or
insufficient time resolution. If many zero intervals are expected (e.g.,
when using unit = "year" with annual spawners), consider using a
finer time unit such as "month" or "day".
Numeric year columns (e.g., 2020) are automatically converted to
Date by appending "-07-01" (mid-year) as a reasonable default.
For more precise results, convert to Date before calling this function.
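The mid-year conversion and the exclusion of non-positive intervals can be sketched with base R dates. `to_midyear()` is a hypothetical helper, and the 365.25 days-per-year divisor is an assumption made for this illustration; the package's own unit conversion may differ.

```r
# Convert a numeric year to a Date at mid-year, as described in the text.
to_midyear <- function(year) as.Date(paste0(year, "-07-01"))

parent_birth    <- to_midyear(2018)
offspring_birth <- to_midyear(2020)

# Interval in days, then in years (365.25 days per year is an assumed conversion).
interval_days  <- as.numeric(offspring_birth - parent_birth, units = "days")
interval_years <- interval_days / 365.25

# Zero or negative intervals are dropped before summarising.
intervals <- c(interval_years, 0, -1)
kept <- intervals[intervals > 0]
```

With year-only data, a parent and offspring recorded in the same year would produce a zero interval and be dropped, which is the situation where a finer time unit helps.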
A data.table with columns:
Group: Grouping level (if by is used).
Pathway: One of "SS", "SD", "DS", "DD",
"SO", "DO", or "Average".
SS/SD/DS/DD require offspring sex; SO (Sire-Offspring) and DO (Dam-Offspring)
are computed from all parent-offspring pairs regardless of offspring sex.
N: Number of parent-offspring pairs used.
Mean: Average generation interval in unit.
SD: Standard deviation of the interval.
GenEquiv: (Optional) Generation equivalents based on cycle.
# ---- Basic usage with package dataset (numeric Year auto-converted) ----
tped <- tidyped(big_family_size_ped)
gi <- pedgenint(tped, timevar = "Year")
gi
# ---- Generation equivalents with cycle ----
gi2 <- pedgenint(tped, timevar = "Year", cycle = 2)
gi2
Calculates the diversity half-life ($T_{1/2}$) of a pedigree across
time points using a Rényi-2 entropy cascade framework. The total loss rate of
genetic diversity is partitioned into three additive components:
$\lambda_e$: foundation bottleneck (unequal founder
contributions).
$\lambda_b$: breeding bottleneck (overuse of key ancestors).
$\lambda_d$: genetic drift (random sampling loss).
The function rolls over time points defined by timevar, computing
$f_e$ and $f_a$ (via pedcontrib) and $f_g$
(via the internal coancestry engine) for each time point. No redundant
Ne calculations are performed.
pedhalflife(
  ped,
  timevar = "Gen",
  at = NULL,
  nsamples = 1000,
  ncores = 1,
  seed = NULL
)
## S3 method for class 'pedhalflife'
print(x, ...)
## S3 method for class 'pedhalflife'
plot(x, type = c("log", "raw"), ...)
ped |
A |
timevar |
Character.
Column name in |
at |
Optional vector of values selecting which time points to include
(e.g., |
nsamples |
Integer. Sample size per time point for coancestry estimation (passed to the internal coancestry engine). Default: 1000. |
ncores |
Integer. Number of OpenMP threads for C++ backends. Default: 1. |
seed |
Integer or |
x |
A |
... |
Additional arguments (ignored). |
type |
Character.
|
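The mathematical identity underlying the cascade is:
$$\ln f_g = \ln f_e + \ln\left(\frac{f_a}{f_e}\right) + \ln\left(\frac{f_g}{f_a}\right)$$
Taking the negative time-slope of each term gives the $\lambda$
components, which sum exactly by linearity of OLS:
$$\lambda_{total} = \lambda_e + \lambda_b + \lambda_d$$
$T_{1/2} = \ln 2 / \lambda_{total}$ is the number of time-units
(time points, years, generations) for diversity to halve.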
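The additivity of the decay components can be checked numerically with ordinary least squares. The sketch below uses invented values of $f_e$, $f_a$, and $f_g$ over five time steps; the numbers and the helper `neg_slope()` are hypothetical and only illustrate the decomposition $\ln f_g = \ln f_e + \ln(f_a/f_e) + \ln(f_g/f_a)$.

```r
# Invented trajectories of effective founders, ancestors, and genome equivalents.
time_step <- 0:4
fe <- c(10, 9.5, 9.1, 8.8, 8.4)
fa <- c(10, 9.0, 8.2, 7.5, 6.9)
fg <- c(10, 8.5, 7.3, 6.3, 5.5)

# Negative OLS slope of a series against time.
neg_slope <- function(y, t) -unname(coef(lm(y ~ t))[2])

lambda_e     <- neg_slope(log(fe), time_step)       # foundation bottleneck
lambda_b     <- neg_slope(log(fa / fe), time_step)  # breeding bottleneck
lambda_d     <- neg_slope(log(fg / fa), time_step)  # genetic drift
lambda_total <- neg_slope(log(fg), time_step)       # total decay rate

# Half-life in the same time units as time_step.
thalf <- log(2) / lambda_total
```

Because the three log terms add up to the log of $f_g$ at every time point, their fitted slopes add up to the total slope without any approximation.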
A list of class pedhalflife with two data.table
components:
timeseries: Per-time-point tracking with columns
Time (time-point label from timevar), NRef,
fe, fa, fg and their log transformations
(lnfe, lnfa, lnfg, lnfafe,
lnfgfa), plus TimeStep (numeric OLS time axis).
decay: Single-row table with lambda_e,
lambda_b, lambda_d, lambda_total, and
thalf.
library(visPedigree)
data(simple_ped)
tp <- tidyped(simple_ped)
# 1. Calculate half-life using all available generations
hl <- pedhalflife(tp, timevar = "Gen")
print(hl)
# 2. View the underlying log-linear decay plot
plot(hl, type = "log")
# 3. Calculate half-life for a specific time window (e.g., Generations 2 to 4)
hl_subset <- pedhalflife(tp, timevar = "Gen", at = c(2, 3, 4))
print(hl_subset)
Combines founder/ancestor contributions ($f_e$, $f_a$) and effective population size estimates (Ne) from three methods into a single summary object.
pediv(
  ped,
  reference = NULL,
  top = 20,
  nsamples = 1000,
  ncores = 1,
  seed = NULL
)
ped |
A |
reference |
Character vector. Optional subset of individual IDs defining the reference population. If NULL, uses all individuals in the most recent generation. |
top |
Integer. Number of top contributors to return in founder/ancestor tables. Default is 20. |
nsamples |
Integer. Number of individuals sampled per cohort for the coancestry
Ne method and for |
ncores |
Integer. Number of cores for parallel processing in the coancestry method. Default is 1. |
seed |
Integer or NULL. Random seed passed to |
Internally calls pedcontrib for $f_e$ and $f_a$.
The coancestry method is called via the internal calc_ne_coancestry()
function directly so that $f_g$ and the Ne estimate can be obtained from
the same traced pedigree without duplication.
The inbreeding and demographic Ne methods are obtained via pedne.
All calculations use the same reference population.
If any method fails (e.g., insufficient pedigree depth), its value is NA
rather than stopping execution.
$f_g$ (founder genome equivalents, Caballero & Toro 2000) is estimated from
the diagonal-corrected mean coancestry of the reference population:
$$\bar{C} = \frac{N - 1}{N} \cdot \frac{\bar{a}_{off}}{2} + \frac{1 + \bar{F}_s}{2N}, \qquad f_g = \frac{1}{2\bar{C}}$$
where $N$ is the full reference cohort size, $\bar{a}_{off}$ is the
off-diagonal mean relationship among sampled individuals, and
$\bar{F}_s$ is their mean inbreeding coefficient.
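A small numeric example may help. It assumes the diagonal-corrected mean coancestry is the weighted average of the off-diagonal coancestry (half the mean relationship, weight (N - 1)/N) and the self-coancestry ((1 + F)/2, weight 1/N), with founder genome equivalents taken as 1 / (2 * mean coancestry). All input values are invented for illustration.

```r
# Invented inputs for a reference cohort.
n_ref  <- 200    # full reference cohort size (N)
a_off  <- 0.08   # mean off-diagonal additive relationship among sampled individuals
f_mean <- 0.02   # mean inbreeding coefficient of the sampled individuals

# Assumed form of the diagonal-corrected mean coancestry.
c_bar <- (n_ref - 1) / n_ref * a_off / 2 + (1 + f_mean) / (2 * n_ref)

# Founder genome equivalents and retained gene diversity.
f_g      <- 1 / (2 * c_bar)
gene_div <- 1 - c_bar
```

The diagonal term matters mostly for small cohorts; as N grows, the mean coancestry approaches half the mean off-diagonal relationship.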
A list with class pediv containing:
summary: A single-row data.table with columns
NRef, NFounder, feH, fe, NAncestor,
faH, fa, fafe, fg, MeanCoan,
GeneDiv, NSampledCoan, NeCoancestry, NeInbreeding,
NeDemographic.
Here feH and faH are the Shannon-entropy ($q = 1$)
effective numbers of founders and ancestors, respectively.
GeneDiv is the pedigree-based retained genetic diversity,
computed as $1 - \bar{C}$, where $\bar{C}$ is the
diagonal-corrected population mean coancestry (MeanCoan).
founders: A data.table of top founder contributions.
ancestors: A data.table of top ancestor contributions.
tp <- tidyped(small_ped)
div <- pediv(tp, reference = c("Z1", "Z2", "X", "Y"), seed = 42L)
print(div)
# Access Shannon effective numbers from summary
div$summary$feH # Shannon effective founders (q=1)
div$summary$faH # Shannon effective ancestors (q=1)
# Founder diversity profile: NFounder >= feH >= fe
with(div$summary, c(NFounder = NFounder, feH = feH, fe = fe))
Optimized calculation of additive (A), dominance (D), and epistatic (AA) relationship matrices, their inverses, and inbreeding coefficients (f). Uses Rcpp with the Meuwissen & Luo (1992) algorithm for efficient computation.
pedmat(
  ped,
  method = "A",
  sparse = TRUE,
  invert_method = "auto",
  threads = 0,
  compact = FALSE
)
ped |
A tidied pedigree from |
method |
Character, one of:
|
sparse |
Logical, if |
invert_method |
Character, method for matrix inversion (Dinv/AAinv only):
|
threads |
Integer. Number of OpenMP threads to use. Use 0 to keep the system/default setting. Currently, multi-threading is explicitly implemented for:
For |
compact |
Logical, if |
API Design:
Only a single method may be requested per call. This design prevents
accidental heavy computations. If multiple matrices are needed, call
pedmat() separately for each method.
Compact Mode (compact = TRUE):
Full-siblings share identical relationships with all other individuals. Compact mode exploits this by selecting one representative per full-sib family, dramatically reducing matrix size. For example, a pedigree with 170,000 individuals might compact to 1,800 unique relationship patterns.
Key features:
query_relationship(x, id1, id2): Query any individual
pair, including merged siblings (automatic lookup)
expand_pedmat(x): Restore full matrix dimensions
vismat(x): Visualize directly (auto-expands compact)
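The idea behind compact mode can be sketched as grouping individuals by their sire-dam pair and keeping the first member of each full-sib family as the representative. This base-R sketch is a simplification written for illustration: the column name `Rep` and the grouping rule are assumptions, and the package may apply additional conditions (for example, for siblings that are themselves parents).

```r
ped <- data.frame(
  Ind  = c("S", "D", "O1", "O2", "O3", "P"),
  Sire = c(NA, NA, "S", "S", "S", "S"),
  Dam  = c(NA, NA, "D", "D", "D", NA),
  stringsAsFactors = FALSE
)

# Individuals with both parents known share a family key; others stay on their own.
has_both <- !is.na(ped$Sire) & !is.na(ped$Dam)
family   <- ifelse(has_both, paste(ped$Sire, ped$Dam, sep = "x"), ped$Ind)

# The first individual seen in each family acts as its representative.
representative <- ped$Ind[match(family, family)]
compact_map    <- data.frame(Ind = ped$Ind, Rep = representative,
                             stringsAsFactors = FALSE)
n_compact <- length(unique(representative))
```

Here six individuals reduce to four representatives, and a lookup table of this kind is what allows a query for a merged sibling to be redirected to its representative.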
Performance Notes:
Ainv: O(n) complexity using Henderson's rules. Fast for any size.
Dinv/AAinv: O(n³) matrix inversion. Practical limits:
n < 500: ~10-20 ms
n = 1,000: ~40-60 ms
n = 2,000: ~130-150 ms
n > 2,000: Consider using compact = TRUE
Memory: Sparse matrices use ~O(nnz) memory; dense use O(n²)
Returns a matrix or vector with S3 class "pedmat".
Object type by method:
method="f": Named numeric vector of inbreeding coefficients
All other methods: Sparse or dense matrix (depending on sparse)
S3 Methods:
print(x): Display matrix with metadata header
summary_pedmat(x): Detailed statistics (size, compression, mean, density)
dim(x), length(x), Matrix::diag(x), t(x): Standard operations
x[i, j]: Subsetting (behaves like underlying matrix)
as.matrix(x): Convert to base matrix
Accessing Metadata (use attr(), not $):
attr(x, "ped"): The pedigree used (or compact pedigree if compact=TRUE)
attr(x, "ped_compact"): Compact pedigree (when compact=TRUE)
attr(x, "method"): Calculation method used
attr(x, "call_info"): Full calculation metadata (timing, sizes, etc.)
names(attributes(x)): List all available attributes
Additional attributes when compact = TRUE:
attr(x, "compact_map"): data.table mapping individuals to representatives
attr(x, "family_summary"): data.table summarizing merged families
attr(x, "compact_stats"): Compression statistics (ratio, n_families, etc.)
Meuwissen, T. H. E., & Luo, Z. (1992). Computing inbreeding coefficients in large populations. Genetics Selection Evolution, 24(4), 305-313.
Henderson, C. R. (1976). A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics, 32(1), 69-83.
tidyped for preparing pedigree data,
query_relationship for querying individual pairs,
expand_pedmat for restoring full dimensions,
vismat for visualization,
inbreed for simple inbreeding calculation
# Basic usage with small pedigree
library(visPedigree)
tped <- tidyped(small_ped)
# --- Additive Relationship Matrix (default) ---
A <- pedmat(tped)
A["A", "B"] # Relationship between A and B
Matrix::diag(A) # Diagonal = 1 + F (inbreeding)
# --- Inbreeding Coefficients ---
f <- pedmat(tped, method = "f")
f["Z1"] # Inbreeding of individual Z1
# --- Using summary_pedmat() ---
summary_pedmat(A) # Detailed matrix statistics
# --- Accessing Metadata ---
attr(A, "ped") # Original pedigree
attr(A, "method") # "A"
names(attributes(A)) # All available attributes
# --- Compact Mode (for large full-sib families) ---
A_compact <- pedmat(tped, method = "A", compact = TRUE)
# Query relationships (works for any individual, including merged sibs)
query_relationship(A_compact, "Z1", "Z2") # Full-sibs Z1 and Z2
# View compression statistics
attr(A_compact, "compact_stats")
attr(A_compact, "family_summary")
# Expand back to full size
A_full <- expand_pedmat(A_compact)
dim(A_full) # Original dimensions restored
# --- Inverse Matrices ---
Ainv <- pedmat(tped, method = "Ainv") # Henderson's rules (fast)
# --- Dominance and Epistatic ---
D <- pedmat(tped, method = "D")
AA <- pedmat(tped, method = "AA")
# --- Visualization (requires display device) ---
## Not run:
vismat(A) # Heatmap of relationship matrix
vismat(A_compact) # Works with compact matrices
vismat(A, by = "Gen") # Group by generation
## End(Not run)
Access pedigree metadata from a tidyped object
pedmeta(x)
x |
A tidyped object. |
The ped_meta list, or NULL if not set.
Calculates the effective population size (Ne) based on the rate of coancestry, the rate of inbreeding, or demographic parent numbers.
pedne(
  ped,
  method = c("coancestry", "inbreeding", "demographic"),
  by = NULL,
  reference = NULL,
  nsamples = 1000,
  ncores = 1,
  seed = NULL
)
ped |
A |
method |
Character. The method to compute Ne. One of |
by |
Character. The name of the column used to group cohorts (e.g., "Year", "BirthYear"). If NULL, calculates overall Ne for all individuals. |
reference |
Character vector. Optional subset of individual IDs defining the reference cohort. If NULL, uses all individuals in the pedigree. |
nsamples |
Integer. Number of individuals to randomly sample per cohort when using the |
ncores |
Integer. Number of cores for parallel processing. Currently only effective for |
seed |
Integer or NULL. Random seed passed to |
The effective population size can be calculated using one of three methods:
"coancestry" (Default): Based on the rate of coancestry ($\Delta c_{ij}$) between
pairs of individuals. In this context, $c_{ij}$ is the probability that two alleles
randomly sampled from individuals $i$ and $j$ are identical by descent, which is equivalent
to half the additive genetic relationship ($c_{ij} = a_{ij}/2$).
This method is generally more robust as it accounts for full genetic drift and
bottlenecks (Cervantes et al., 2011).
To handle large populations, this method samples nsamples individuals per cohort
and computes the mean rate of coancestry among them.
"inbreeding": Based on the individual rate of inbreeding ($\Delta F_i$) (Gutiérrez et al., 2008, 2009).
"demographic": Based on the demographic census of breeding males and females (Wright, 1931):
$$N_e = \frac{4 N_m N_f}{N_m + N_f}$$
where $N_m$ and $N_f$ are the numbers of unique male and female parents contributing to the cohort.
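The arithmetic behind the three estimators can be sketched in a few lines of base R. The helper names (`ne_from_rate`, `delta_f`, `delta_c`, `ne_demographic`) and the input values are hypothetical, and the per-individual rate formulas follow the cited papers (Gutiérrez et al., 2009; Cervantes et al., 2011) as commonly stated; pedne() itself may differ in detail.

```r
# Hypothetical helpers for illustration; not part of visPedigree.

# Ne from a mean rate of inbreeding or coancestry: Ne = 1 / (2 * mean rate)
ne_from_rate <- function(rate) 1 / (2 * mean(rate))

# Individual rate of inbreeding, with t = equivalent complete generations:
# dF = 1 - (1 - F)^(1 / (t - 1))
delta_f <- function(f, t) 1 - (1 - f)^(1 / (t - 1))

# Pairwise rate of coancestry for individuals with depths t_i and t_j:
# dc = 1 - (1 - c)^(1 / ((t_i + t_j) / 2))
delta_c <- function(coan, t_i, t_j) 1 - (1 - coan)^(2 / (t_i + t_j))

# Demographic Ne (Wright, 1931): Ne = 4 * Nm * Nf / (Nm + Nf)
ne_demographic <- function(nm, nf) 4 * nm * nf / (nm + nf)

ne_demographic(10, 40)  # 32
ne_from_rate(delta_f(f = 0.0625, t = 3))
ne_from_rate(delta_c(coan = 0.125, t_i = 2, t_j = 2))
```

With 10 sires and 40 dams the demographic estimate is 32, well below the census count of 50 parents, which shows how an unequal sex ratio lowers Ne.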
A data.table with columns:
Cohort: Cohort or grouping variable value.
N: Number of individuals in the cohort.
Ne: Calculated effective population size.
...: Additional columns depending on the selected method (e.g., NSampled, DeltaC, MeanF, DeltaF, Nm, Nf).
Cervantes, I., Goyache, F., Molina, A., Valera, M., & Gutiérrez, J. P. (2011). Estimation of effective population size from the rate of coancestry in pedigreed populations. Journal of Animal Breeding and Genetics, 128(1), 56-63.
Gutiérrez, J. P., Cervantes, I., Molina, A., Valera, M., & Goyache, F. (2008). Individual increase in inbreeding allows estimating effective sizes from pedigrees. Genetics Selection Evolution, 40(4), 359-370.
Gutiérrez, J. P., Cervantes, I., & Goyache, F. (2009). Improving the estimation of realized effective population sizes in farm animals. Journal of Animal Breeding and Genetics, 126(4), 327-332.
Wright, S. (1931). Evolution in Mendelian populations. Genetics, 16(2), 97-159.
library(visPedigree)

# Coancestry-based Ne (default) using a simple pedigree grouped by year
tp_simple <- tidyped(simple_ped)
tp_simple$BirthYear <- 2000 + tp_simple$Gen
ne_coan <- suppressMessages(pedne(tp_simple, by = "BirthYear", seed = 42L))
ne_coan

# Inbreeding-based Ne using an inbred pedigree
tp_inbred <- tidyped(inbred_ped)
ne_inb <- suppressMessages(pedne(tp_inbred, method = "inbreeding", by = "Gen"))
ne_inb

# Demographic Ne from the number of contributing sires and dams
ne_demo <- suppressMessages(pedne(tp_simple, method = "demographic", by = "BirthYear"))
ne_demo
Decomposes individuals' inbreeding coefficients into marginal contributions from specific ancestors. This allows identifying which ancestors or lineages are responsible for the observed inbreeding.
pedpartial(ped, ancestors = NULL, top = 20)
ped |
A |
ancestors |
Character vector. IDs of ancestors to calculate partial inbreeding for. If NULL, the top ancestors by marginal contribution are used. |
top |
Integer. Number of top ancestors to include if |
The sum of all partial inbreeding coefficients for an individual (including contributions from founders) equals $1 + f_i$, where $f_i$ is the total inbreeding coefficient. This function specifically isolates the terms in the Meuwissen & Luo (1992) decomposition that correspond to the selected ancestors.
A data.table with the first column as Ind and subsequent
columns representing the partial inbreeding ($pF$) from each ancestor.
Lacey, R. C. (1996). A formula for determining the partial inbreeding coefficient,
. Journal of Heredity, 87(4), 337-339.
Meuwissen, T. H., & Luo, Z. (1992). Computing inbreeding coefficients in large populations. Genetics Selection Evolution, 24(4), 305-313.
library(visPedigree)
library(data.table)
tp <- tidyped(inbred_ped)

# Calculate partial inbreeding originating from specific ancestors
target_ancestors <- inbred_ped[is.na(Sire) & is.na(Dam), Ind]
pF <- pedpartial(tp, ancestors = target_ancestors)
print(tail(pF))
Computes either the average pairwise additive genetic relationship
coefficients ($a_{ij}$) within cohorts, or the corrected population mean
coancestry used for pedigree-based diversity summaries.
pedrel(ped, by = "Gen", reference = NULL, compact = FALSE, scale = c("relationship", "coancestry"))
ped |
A |
by |
Character. The column name to group by (e.g., "Year", "Breed", "Generation"). |
reference |
Character vector. An optional vector of reference individual IDs to calculate relationships for. If provided, only individuals matching these IDs in each group will be used. Default is NULL (use all individuals in the group). |
compact |
Logical. Whether to use compact representation for large families to save memory. Recommended when pedigree size exceeds 25,000. Default is FALSE. |
scale |
Character. One of |
When scale = "relationship", the returned value is the mean of the
off-diagonal additive relationship coefficients among the selected
individuals. When scale = "coancestry", the returned value is the
diagonal-corrected population mean coancestry:
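$$\bar{C} = \frac{N - 1}{N} \cdot \frac{\bar{a}_{\mathrm{off}}}{2} + \frac{1 + \bar{F}}{2N}$$
where $\bar{a}_{\mathrm{off}}$ is the mean off-diagonal relationship,
$\bar{F}$ is the mean inbreeding coefficient of the selected individuals, and
$N$ is the number of selected individuals. This matches the
internal coancestry quantity used to derive diversity indices in pediv.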
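A small numerical check may help here. The sketch below assumes that the diagonal-corrected mean coancestry is the average coancestry over all N x N pairs, with self-coancestry (1 + F_i)/2 on the diagonal and half the additive relationship off the diagonal. The matrix values are made up, and this is base R rather than the package's internal code.

```r
# Toy additive relationship matrix for 3 individuals (made-up values):
# diagonals are 1 + F, off-diagonals are additive relationships.
A <- matrix(c(1.00, 0.50, 0.25,
              0.50, 1.25, 0.50,
              0.25, 0.50, 1.00), nrow = 3, byrow = TRUE)

n      <- nrow(A)
a_off  <- mean(A[upper.tri(A)])  # mean off-diagonal relationship
f_mean <- mean(diag(A) - 1)      # mean inbreeding coefficient

# Diagonal-corrected mean coancestry:
# (N - 1) / N * a_off / 2 + (1 + F_mean) / (2 * N)
mean_coan <- (n - 1) / n * a_off / 2 + (1 + f_mean) / (2 * n)

# The same quantity obtained directly: mean of all elements of A, halved
all.equal(mean_coan, mean(A) / 2)  # TRUE
```

Under that assumption the corrected value is simply half the grand mean of A, whereas the "relationship" scale ignores the diagonal entirely.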
A data.table with columns:
A grouping identifier column, named after the by parameter (e.g., Gen, Year).
NTotal: Total number of individuals in the group.
NUsed: Number of individuals used in calculation (could be subset by reference).
MeanRel: Present when scale = "relationship"; average of off-diagonal
elements in the Additive Relationship (A) matrix for this group
($\bar{a}_{\mathrm{off}}$).
MeanCoan: Present when scale = "coancestry"; diagonal-corrected
population mean coancestry for this group.
library(visPedigree)
library(data.table)

# Use the sample dataset and simulate a birth year
tp <- tidyped(small_ped)
tp$Year <- 2010 + tp$Gen

# Example 1: Calculate average relationship grouped by Generation (default)
rel_by_gen <- pedrel(tp, by = "Gen")
print(rel_by_gen)

# Example 2: Calculate average relationship grouped by Year
rel_by_year <- pedrel(tp, by = "Year")
print(rel_by_year)

# Example 3: Calculate corrected mean coancestry
coan_by_gen <- pedrel(tp, by = "Gen", scale = "coancestry")
print(coan_by_gen)

# Example 4: Filter calculations with a reference list in a chosen group
candidates <- c("N", "O", "P", "Q", "T", "U", "V", "X", "Y")
rel_subset <- pedrel(tp, by = "Gen", reference = candidates)
print(rel_subset)
Calculates comprehensive statistics for a pedigree, including population structure, generation intervals, and ancestral depth.
pedstats(ped, timevar = NULL, unit = "year", cycle = NULL, ecg = TRUE, genint = TRUE, ...)
ped |
A |
timevar |
Optional character. Name of the column containing the birth date (or hatch date) of each individual. Accepted column formats:
If |
unit |
Character. Time unit for reporting generation intervals:
|
cycle |
Numeric. Optional target generation cycle length in
|
ecg |
Logical. Whether to compute equivalent complete generations
for each individual via |
genint |
Logical. Whether to compute generation intervals via
|
... |
Additional arguments passed to |
An object of class pedstats, which is a list containing:
summary: A data.table with one row summarising the
whole pedigree. Columns:
N — total number of individuals.
NSire — number of unique sires.
NDam — number of unique dams.
NFounder — number of founder individuals
(both parents unknown).
MaxGen — maximum generation number.
ecg: A data.table with one row per individual
(NULL if ecg = FALSE). Columns:
Ind — individual identifier.
ECG — equivalent complete generations.
FullGen — number of fully known generations.
MaxGen — maximum traceable generation depth.
gen_intervals: A data.table of generation intervals
(NULL if no timevar is detected or
genint = FALSE). Columns:
Pathway — gametic pathway label. Seven values:
"SS" (sire to son), "SD" (sire to daughter),
"DS" (dam to son), "DD" (dam to daughter) —
require offspring sex; "SO" (sire to offspring) and
"DO" (dam to offspring) — sex-independent; and
"Average" — all parent-offspring pairs combined.
N — number of parent-offspring pairs.
Mean — mean generation interval.
SD — standard deviation of the interval.
GenEquiv — Mean / cycle (only present when
cycle is supplied).
library(visPedigree)

# ---- Without time variable ----
tp <- tidyped(simple_ped)
ps <- pedstats(tp)
ps$summary
ps$ecg

# ---- With annual Year column (big_family_size_ped) ----
tp2 <- tidyped(big_family_size_ped)
ps2 <- pedstats(tp2, timevar = "Year")
ps2$summary
ps2$gen_intervals
Summarizes pedigree subpopulations and group structure.
pedsubpop(ped, by = NULL)
ped |
A |
by |
Character. The name of the column to group by.
If NULL, summarizes disconnected components via |
When by = NULL, this function is a lightweight summary wrapper around
splitped, returning one row per disconnected pedigree component
plus an optional "Isolated" row for individuals with no known parents
and no offspring. When by is provided, it instead summarizes the pedigree
directly by the specified column (e.g. "Gen", "Year", "Breed").
Use pedsubpop() when you want a compact analytical summary table.
Use splitped when you need the actual re-tidied sub-pedigree
objects for downstream plotting or analysis.
A data.table with columns:
Group: Subpopulation label.
N: Total individuals.
N_Sire: Number of distinct sires.
N_Dam: Number of distinct dams.
N_Founder: Number of founders (parents unknown).
library(visPedigree)
tp <- tidyped(simple_ped)

# Summarize disconnected pedigree components
pedsubpop(tp)

# Summarize by an existing grouping variable
pedsubpop(tp, by = "Gen")
Plot a tidy pedigree
## S3 method for class 'tidyped'
plot(x, ...)
x |
A |
... |
Additional arguments passed to |
Invisibly returns a list of graph data from visped (node/edge
data and layout components) used to render the pedigree; the primary result is
the plot drawn on the current device.
Print Founder and Ancestor Contributions
## S3 method for class 'pedcontrib'
print(x, ...)
x |
A |
... |
Additional arguments. |
Print Genetic Diversity Summary
## S3 method for class 'pediv'
print(x, ...)
x |
A |
... |
Additional arguments. |
Print Pedigree Statistics
## S3 method for class 'pedstats'
print(x, ...)
x |
A |
... |
Additional arguments. |
Print method for summary.tidyped
## S3 method for class 'summary.tidyped'
print(x, ...)
x |
A summary.tidyped object. |
... |
Additional arguments (ignored). |
The input object, invisibly.
Print method for tidyped pedigree
## S3 method for class 'tidyped'
print(x, ...)
x |
A tidyped object |
... |
Additional arguments passed to the data.table print method |
The input object, invisibly.
Retrieves relationship coefficients between individuals from a pedmat object. For compact matrices, automatically handles lookup of merged full-siblings.
query_relationship(x, id1, id2 = NULL)
x |
A pedmat object created by |
id1 |
Character, first individual ID. |
id2 |
Character, second individual ID. If |
For compact matrices (compact = TRUE), this function automatically
maps individuals to their family representatives. For methods A, D, and AA,
it can compute the correct relationship even between merged full-siblings
using the formula:
A: where s, d are parents
D: (for full-sibs in same family)
AA:
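How a relationship between two merged full-sibs can be recovered from parent-level values is sketched below for the A and AA cases only. It assumes the standard tabular rule (an individual's relationship with another is the average of that other's relationships with its two parents) and the usual definition of AA as the element-wise square of A. The numbers and variable names are made up, and the package's exact expressions may differ.

```r
# Parent-level additive relationships (made-up values): sire s and dam d
a_ss <- 1.0   # 1 + F_s
a_dd <- 1.0   # 1 + F_d
a_sd <- 0.25  # relationship between the two parents

# Tabular rule: relationship of a full-sib i with each of its parents
a_is <- 0.5 * (a_ss + a_sd)
a_id <- 0.5 * (a_sd + a_dd)

# Additive relationship between two distinct full-sibs i and j,
# which share the same sire and dam
a_ij <- 0.5 * (a_is + a_id)

# Additive-by-additive epistatic relationship: square of the additive value
aa_ij <- a_ij^2

a_ij   # 0.625
aa_ij  # 0.390625
```

With unrelated, non-inbred parents (`a_sd <- 0`) the same steps give the familiar full-sib value of 0.5.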
If id2 is provided: numeric value (relationship coefficient)
If id2 is NULL: named numeric vector (id1's row)
Returns NA if individual not found
Inverse matrices (Ainv, Dinv, AAinv) are not supported because inverse matrix elements do not represent meaningful relationship coefficients.
library(visPedigree)
tped <- tidyped(small_ped)
A <- pedmat(tped, method = "A", compact = TRUE)

# Query specific pair
query_relationship(A, "A", "B")

# Query merged full-siblings (works with compact)
query_relationship(A, "Z1", "Z2")

# Get all relationships for one individual
query_relationship(A, "A")
A small dataset containing a simple pedigree for demonstration.
simple_ped
A data.table with 4 columns:
Individual ID
Sire ID
Dam ID
Sex of the individual
A small dataset containing a pedigree with some missing parents.
small_ped
A data.frame with 3 columns:
Individual ID
Sire ID
Dam ID
Detects and splits a tidyped object into disconnected groups (connected components). Uses igraph to efficiently find groups of individuals that have no genetic relationships with each other. Isolated individuals (Gen = 0, those with no parents and no offspring) are excluded from group splitting and stored separately.
splitped(ped)
ped |
A tidyped object created by |
This function identifies connected components in the pedigree graph where edges represent parent-offspring relationships. Two individuals are in the same group if they share any ancestry (direct or indirect).
Isolated individuals (Gen = 0 in tidyped output) are those who:
Have no known parents (Sire and Dam are both NA)
Are not parents of any other individual in the pedigree
These isolated individuals are excluded from splitting and stored in the
isolated attribute. Each resulting group contains at least 2 individuals
(at least one parent-offspring relationship).
The function always returns a list, even if there is only one group (i.e., the pedigree is fully connected). Groups are sorted by size in descending order.
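The grouping logic described above can be illustrated without igraph. The following base R sketch uses a hypothetical seven-individual pedigree and a simple label-propagation loop to find connected components, set aside isolated individuals, and sort the remaining groups by size. It is only an illustration of the idea, not the package's implementation.

```r
# Toy pedigree (hypothetical IDs): two unrelated families plus one
# isolated individual "G" with no parents and no offspring
ped <- data.frame(
  Ind  = c("A", "B", "C", "D", "E", "F", "G"),
  Sire = c(NA,  NA,  "A", NA,  NA,  "D", NA),
  Dam  = c(NA,  NA,  "B", NA,  NA,  "E", NA),
  stringsAsFactors = FALSE
)

# Parent-offspring edges
edges <- rbind(
  data.frame(from = ped$Sire, to = ped$Ind, stringsAsFactors = FALSE),
  data.frame(from = ped$Dam,  to = ped$Ind, stringsAsFactors = FALSE)
)
edges <- edges[!is.na(edges$from), ]

# Label propagation: each individual starts in its own component, and the
# smaller label is copied across every edge until nothing changes
comp <- setNames(seq_along(ped$Ind), ped$Ind)
repeat {
  old <- comp
  for (k in seq_len(nrow(edges))) {
    pair <- c(edges$from[k], edges$to[k])
    comp[pair] <- min(comp[pair])
  }
  if (identical(old, comp)) break
}

groups   <- split(names(comp), comp)
isolated <- unlist(groups[lengths(groups) == 1], use.names = FALSE)
groups   <- groups[lengths(groups) >= 2]
groups   <- groups[order(lengths(groups), decreasing = TRUE)]

groups    # two groups of three individuals each
isolated  # "G"
```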
Each group in the result is a valid tidyped object with:
Renumbered IndNum (1 to n for each group)
Updated SireNum and DamNum referencing the new IndNum
Recalculated Gen (generation) based on the group's structure
A list of class "splitped" containing:
GP1, GP2, ...
|
tidyped objects for each disconnected group (with at least 2 individuals), with renumbered IndNum, SireNum, DamNum |
The returned object has the following attributes:
n_groups |
Number of disconnected groups found (excluding isolated individuals) |
sizes |
Named vector of group sizes |
total |
Total number of individuals in groups (excluding isolated) |
isolated |
Character vector of isolated individual IDs (Gen = 0) |
n_isolated |
Number of isolated individuals |
tidyped for pedigree tidying
# Load example data
library(visPedigree)
data(small_ped)

# First tidy the pedigree
tped <- tidyped(small_ped)

# Split into groups
result <- splitped(tped)
print(result)

# Access individual groups (each is a tidyped object)
result$GP1

# Check isolated individuals
attr(result, "isolated")
Computes and displays summary statistics for a pedmat object.
summary_pedmat(x)
x |
A pedmat object from |
Since pedmat objects are often S4 sparse matrices with custom attributes,
use this function instead of the generic summary() to ensure proper
display of pedigree matrix statistics.
An object of class "summary.pedmat" with statistics including
method, dimensions, compression ratio (if compact), mean relationship,
and matrix density.
library(visPedigree)
tped <- tidyped(small_ped)
A <- pedmat(tped, method = "A")
summary_pedmat(A)
Summary method for tidyped objects
## S3 method for class 'tidyped'
summary(object, ...)
object |
A tidyped object. |
... |
Additional arguments (ignored). |
A summary.tidyped object (list) containing:
n_ind: Total number of individuals.
n_male, n_female, n_unknown_sex: Sex composition counts.
n_founders: Number of individuals with no known parents.
n_both_parents: Count of individuals with complete parentage.
max_gen, gen_dist: (Optional) Maximum generation and its distribution.
n_families, family_sizes, top_families: (Optional) Family statistics.
f_stats, n_inbred: (Optional) Inbreeding coefficient statistics.
n_cand, cand_f_stats: (Optional) Candidate-specific statistics.
This function standardizes pedigree records, checks for duplicate IDs and
incompatible parental roles, detects pedigree loops, injects missing
founders, assigns generation numbers, sorts the pedigree, and optionally
traces the pedigree of specified candidates. If the cand parameter
contains individual IDs, only those individuals and their ancestors or
descendants are retained. Tracing direction and the number of generations
can be specified using the trace and tracegen parameters.
tidyped(ped, cand = NULL, trace = "up", tracegen = NULL, addgen = TRUE, addnum = TRUE, inbreed = FALSE, selfing = FALSE, genmethod = "top", ...)
ped |
A data.table or data frame containing the pedigree. The first three columns must be individual, sire, and dam IDs. Additional columns, such as sex or generation, can be included. Column names can be customized, but their order must remain unchanged. Individual IDs should not be coded as "", " ", "0", "*", or "NA"; otherwise, they will be removed. Missing parents should be denoted by "NA", "0", or "*". Spaces and empty strings ("") are also treated as missing parents but are not recommended. |
cand |
A character vector of individual IDs, or NULL. If provided, only the candidates and their ancestors/descendants are retained. |
trace |
A character value specifying the tracing direction: "up", "down", or "all". "up" traces ancestors; "down" traces descendants; "all" traces the union of ancestors and descendants. This parameter is only used if |
tracegen |
An integer specifying the number of generations to trace. This parameter is only used if |
addgen |
A logical value indicating whether to generate generation numbers. Default is TRUE, which adds a Gen column to the output. |
addnum |
A logical value indicating whether to generate a numeric pedigree. Default is TRUE, which adds IndNum, SireNum, and DamNum columns to the output. |
inbreed |
A logical value indicating whether to calculate inbreeding coefficients. Default is FALSE. If TRUE, an f column is added to the output. This uses the same optimized engine as |
selfing |
A logical value indicating whether to allow the same individual to appear as both sire and dam. This is common in plant breeding (monoecious species) where the same plant can serve as both male and female parent. If TRUE, individuals appearing in both the Sire and Dam columns will be assigned Sex = "monoecious" instead of triggering an error. Default is FALSE. |
genmethod |
A character value specifying the generation assignment method: "top" or "bottom". "top" (top-aligned) assigns generations from parents to offspring, starting founders at Gen 1. "bottom" (bottom-aligned) assigns generations from offspring to parents, aligning terminal nodes at the bottom. Default is "top". |
... |
Additional arguments passed to |
Compared to the legacy version, this function reports cyclic pedigrees more
clearly and uses a mixed implementation. There are two candidate-tracing
paths: when the input is a raw pedigree, igraph is used for loop
checking, candidate tracing, and topological sorting; when the input is an
already validated tidyped object and cand is supplied,
tracing and topological sorting use integer-indexed C++ routines. Generation
assignment can be performed using either a top-down approach (default,
aligning founders at the top) or a bottom-up approach (aligning terminal
nodes at the bottom).
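To make the top-down generation assignment concrete, here is a minimal base R sketch. It assumes the pedigree is already sorted so that parents precede their offspring, and it takes one plausible reading of "top-aligned": founders get generation 1 and every other individual sits one generation below its deepest known parent. The IDs are hypothetical, and the package's own routine may handle edge cases differently.

```r
# Toy pedigree, sorted parents-first (hypothetical IDs)
ped <- data.frame(
  Ind  = c("A", "B", "C", "D", "E"),
  Sire = c(NA,  NA,  "A", "A", "C"),
  Dam  = c(NA,  NA,  "B", NA,  "D"),
  stringsAsFactors = FALSE
)

gen <- setNames(integer(nrow(ped)), ped$Ind)
for (i in seq_len(nrow(ped))) {
  parents <- c(ped$Sire[i], ped$Dam[i])
  parents <- parents[!is.na(parents)]
  # Founders start at generation 1; others go one below their deepest parent
  gen[ped$Ind[i]] <- if (length(parents) == 0) 1L else max(gen[parents]) + 1L
}

gen
# A B C D E
# 1 1 2 2 3
```

A bottom-aligned assignment would run the same kind of pass in the opposite direction, from terminal individuals up to their parents.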
A tidyped object (which inherits from data.table). Individual, sire, and dam ID columns are renamed to Ind, Sire, and Dam. Missing parents are replaced with NA. The Sex column contains "male", "female", "monoecious", or NA. The Cand column is included if cand is not NULL. The Gen column is included if addgen is TRUE. The IndNum, SireNum, and DamNum columns are included if addnum is TRUE. The Family and FamilySize columns identify full-sibling families (for example, "AxB" for offspring of sire A and dam B). The f column is included if inbreed is TRUE.
summary.tidyped for summarizing tidyped objects
visped for visualizing pedigree structure
pedmat for computing relationship matrices
vismat for visualizing relationship matrices
splitped for splitting pedigree into connected components
inbreed for calculating inbreeding coefficients
library(visPedigree)
library(data.table)

# Tidy a simple pedigree
tidy_ped <- tidyped(simple_ped)
head(tidy_ped)

# Trace ancestors of a specific individual within 2 generations
tidy_ped_tracegen <- tidyped(simple_ped, cand = "J5X804", trace = "up", tracegen = 2)
head(tidy_ped_tracegen)

# Trace both ancestors and descendants for multiple candidates
# This is highly optimized and works quickly even on 100k+ individuals
cand_list <- c("J5X804", "J3Y620")
tidy_ped_all <- tidyped(simple_ped, cand = cand_list, trace = "all")

# Check for loops (will error if loops exist)
try(tidyped(loop_ped))

# Example with a large pedigree: extract 2 generations of ancestors for 2007 candidates
cand_2007 <- big_family_size_ped[Year == 2007, Ind]
tidy_big <- tidyped(big_family_size_ped, cand = cand_2007, trace = "up", tracegen = 2)
summary(tidy_big)
vismat provides visualization tools for relationship matrices (A, D, AA),
supporting individual-level heatmaps and relationship coefficient histograms.
This function is useful for exploring population genetic structure, identifying
inbred individuals, and analyzing kinship between families.
vismat(mat, ped = NULL, type = "heatmap", ids = NULL, reorder = TRUE, by = NULL, grouping = NULL, labelcex = NULL, ...)
mat |
A relationship matrix. Can be one of the following types:
Note: Inverse matrices (Ainv, Dinv, AAinv) are not supported for visualization because their elements do not represent meaningful relationship coefficients. |
ped |
Optional. A tidied pedigree object ( |
type |
Character, type of visualization. Supported options:
|
ids |
Character vector specifying individual IDs to display. Used to
filter and display a submatrix of specific individuals. If |
reorder |
Logical. If Clustering principle: Based on relationship profile distance (Euclidean distance between rows). Full-sibs have nearly identical relationship profiles with the whole population, so they cluster tightly together and appear as contiguous blocks in the heatmap. |
by |
Optional. Column name in
Useful for visualizing population structure in large pedigrees. |
grouping |
|
labelcex |
Numeric. Manual control for font size of individual labels.
If |
... |
Additional arguments passed to the underlying plotting function: |
When mat is a compact pedmat object (created with
pedmat(..., compact = TRUE)):
With by: Group-level mean relationships are computed
algebraically from the K×K compact matrix, including a correction for
sibling off-diagonal values. This avoids expanding to the full N×N
matrix, making family-level or generation-level visualization feasible
even for pedigrees with hundreds of thousands of individuals.
Without by, N > VISMAT_EXPAND_MAX (5 000): The compact K×K matrix
is plotted directly using representative individuals. Labels show the
number of individuals each representative stands for, e.g.,
"ID (×350)". This avoids memory-intensive full expansion.
Without by, N ≤ 5 000: The compact matrix is
expanded via expand_pedmat to restore full dimensions.
Uses a Nature Genetics style color palette (white to orange to red to dark red).
Hierarchical clustering reordering (Ward.D2) is enabled by default.
Grid lines are shown when N ≤ VISMAT_GRID_MAX (100);
labels are shown when N ≤ VISMAT_LABEL_MAX (500).
mat[1,1] is displayed at the top-left corner.
Shows the distribution of lower-triangular elements (pairwise kinship).
X-axis: relationship coefficient values; Y-axis: frequency percentage.
The following automatic thresholds are defined as package-internal
constants (VISMAT_*) at the top of R/vismat.R:
VISMAT_EXPAND_MAX (5 000): compact matrices with original
N above this are shown in representative view instead of expanding.
VISMAT_REORDER_MAX (2 000): above this size, hierarchical clustering is
automatically skipped.
VISMAT_LABEL_MAX (500): above this size, individual labels are hidden.
VISMAT_GRID_MAX (100): above this size, cell grid lines are hidden.
by grouping uses vectorized rowsum() algebra — suitable
for large matrices.
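The rowsum() idea can be sketched on a toy matrix: sum rows within groups, sum the resulting columns within groups, then divide each block by its cell count. The matrix values and group labels below are made up. This version includes diagonal elements in the within-group blocks, which may not match how vismat() treats them.

```r
# Toy relationship matrix and group labels (made-up values)
A <- matrix(c(1.00, 0.50, 0.25, 0.25,
              0.50, 1.00, 0.25, 0.25,
              0.25, 0.25, 1.00, 0.50,
              0.25, 0.25, 0.50, 1.00), nrow = 4, byrow = TRUE)
grp <- c("G1", "G1", "G2", "G2")

# Sum rows within groups, then columns within groups
row_sums   <- rowsum(A, grp)               # groups x individuals
block_sums <- t(rowsum(t(row_sums), grp))  # groups x groups

# Divide each block sum by the number of cells in that block
n <- as.vector(table(grp))
block_means <- block_sums / outer(n, n)
block_means
```

Because both passes are single vectorized calls, no explicit loop over group pairs is needed, which is what keeps this approach practical for large matrices.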
For the additive relationship matrix A:
Diagonal elements = 1 + F (F = inbreeding coefficient).
Off-diagonal elements = 2 × kinship coefficient.
0: unrelated; 0.25: half-sibs / grandparent–grandchild; 0.5: full-sibs / parent–offspring; 1.0: same individual.
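These reference values can be reproduced with the tabular method on a toy pedigree. The sketch below is base R only, uses hypothetical integer IDs with 0 for an unknown parent, and is meant as an illustration rather than the package's C++ implementation.

```r
# Toy pedigree ordered parents-first (hypothetical IDs); 0 = unknown parent.
# 1 and 2 are founders, 3 and 4 are full-sibs, 5 is the offspring of 3 x 4.
sire <- c(0, 0, 1, 1, 3)
dam  <- c(0, 0, 2, 2, 4)
n <- length(sire)
A <- matrix(0, n, n)

for (i in seq_len(n)) {
  s <- sire[i]
  d <- dam[i]
  # Diagonal: 1 + F_i, where F_i is half the relationship between the parents
  A[i, i] <- 1 + if (s > 0 && d > 0) 0.5 * A[s, d] else 0
  for (j in seq_len(i - 1)) {
    # Off-diagonal: average of j's relationships with i's known parents
    a_js <- if (s > 0) A[j, s] else 0
    a_jd <- if (d > 0) A[j, d] else 0
    A[i, j] <- A[j, i] <- 0.5 * (a_js + a_jd)
  }
}

A[1, 2]  # 0    (unrelated founders)
A[3, 1]  # 0.5  (parent-offspring)
A[3, 4]  # 0.5  (full-sibs)
A[5, 5]  # 1.25 (1 + F, with F = 0.25 from a full-sib mating)
```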
Invisibly returns the lattice plot object. The plot is
rendered on the current graphics device.
pedmat for computing relationship matrices,
expand_pedmat for manually restoring compact matrix dimensions,
query_relationship for querying individual pairs,
tidyped for tidying pedigree data,
visped for visualizing pedigree structure graphs,
levelplot, histogram
library(visPedigree)
data(small_ped)
ped <- tidyped(small_ped)

# ============================================================
# Basic Usage
# ============================================================
# Method 1: from tidyped object (auto-computes A)
vismat(ped)

# Method 2: from pedmat object
A <- pedmat(ped)
vismat(A)

# Method 3: from plain matrix
vismat(as.matrix(A))

# ============================================================
# Compact Pedigree (auto-expanded before plotting)
# ============================================================
# For pedigrees with large full-sib families, compute a compact matrix
# first for efficiency, then pass directly to vismat(); it automatically
# expands back to full dimensions.
A_compact <- pedmat(ped, compact = TRUE)
vismat(A_compact)  # prints: "Expanding compact matrix (N -> M individuals)"

# For very large pedigrees, aggregate to a group-level view instead
vismat(A, ped = ped, by = "Gen", main = "Mean Relationship Between Generations")

# ============================================================
# Heatmap Customization
# ============================================================
# Custom title and axis labels
vismat(A, main = "Additive Relationship Matrix", xlab = "Individual", ylab = "Individual")

# Preserve original pedigree order (no clustering)
vismat(A, reorder = FALSE)

# Custom label font size
vismat(A, labelcex = 0.5)

# Custom color palette (blue-white-red)
vismat(A, col.regions = colorRampPalette(c("blue", "white", "red"))(100))

# ============================================================
# Display a Subset of Individuals
# ============================================================
target_ids <- rownames(A)[1:8]
vismat(A, ids = target_ids)

# ============================================================
# Histogram of Relationship Coefficients
# ============================================================
vismat(A, type = "histogram")
vismat(A, type = "histogram", nint = 30)

# ============================================================
# Group-level Aggregation
# ============================================================
# Group by generation
vismat(A, ped = ped, by = "Gen", main = "Mean Relationship Between Generations")

# Group by full-sib family (founders without a family are excluded)
vismat(A, ped = ped, by = "Family")

# ============================================================
# Other Relationship Matrices
# ============================================================
# Dominance relationship matrix
D <- pedmat(ped, method = "D")
vismat(D, main = "Dominance Relationship Matrix")
The visped function draws a graph of a full or compact pedigree.
visped(
  ped,
  compact = FALSE,
  outline = FALSE,
  cex = NULL,
  showgraph = TRUE,
  file = NULL,
  highlight = NULL,
  trace = FALSE,
  showf = FALSE,
  pagewidth = 200,
  symbolsize = 1,
  maxiter = 1000,
  genlab = FALSE,
  genlabcex = NULL,
  ...
)
ped |
A |
compact |
A logical value indicating whether IDs of full-sib individuals in one generation will be removed and replaced with the number of full-sib individuals. For example, if there are 100 full-sib individuals in one generation, they will be replaced with a single label "100" when |
outline |
A logical value indicating whether shapes without labels will be shown. A graph of the pedigree without individual labels is shown when setting |
cex |
NULL or a numeric value changing the size of individual labels shown in the graph. cex is an abbreviation for 'character expansion factor'. The |
showgraph |
A logical value indicating whether a plot will be shown in the default graphic device (e.g., the Plots panel in RStudio). This is useful for quick viewing without opening a PDF file. However, the graph on the default device may not be legible (e.g., overlapping labels or aliasing lines) due to size restrictions. It is recommended to set |
file |
NULL or a character value specifying whether the pedigree graph will be saved as a PDF file. The PDF output is a legible vector drawing where labels do not overlap, even with many individuals or long labels. It is recommended to save the pedigree graph as a PDF file. The default value is NULL. |
highlight |
NULL, a character vector of individual IDs, or a list specifying individuals to highlight. If a character vector is provided, individuals will be highlighted with a purple border while preserving their sex-based fill color. If a list is provided, it should contain:
For example: |
trace |
A logical value or a character string. If TRUE, all ancestors and descendants of the individuals specified in |
showf |
A logical value indicating whether inbreeding coefficients will be shown in the graph. If |
pagewidth |
A numeric value specifying the width of the PDF file in inches. This controls the horizontal scaling of the layout. The default value is 200. |
symbolsize |
A numeric value specifying the scaling factor for node size relative to the label size. Values greater than 1 increase the node size (adding padding around the label), while values less than 1 decrease it. This is useful for fine-tuning the whitespace and legibility of dense graphs. The default value is 1. |
maxiter |
An integer specifying the maximum number of iterations for the Sugiyama layout algorithm to minimize edge crossings. Higher values (e.g., 2000 or 5000) may result in fewer crossed lines for complex pedigrees but will increase computation time. The default value is 1000. |
genlab |
A logical value indicating whether generation labels (G1, G2, ...) will be drawn on the left margin of the pedigree graph. This helps identify the generation of each row of nodes, especially in deep pedigrees with many generations. The default value is FALSE. |
genlabcex |
NULL or a numeric value controlling the size of generation labels shown when |
... |
Additional arguments passed to |
This function takes a pedigree tidied by the tidyped function and outputs a hierarchical graph for all individuals in the pedigree. The graph can be shown on the default graphic device or saved as a PDF file. The PDF output is a vector drawing that stays legible and avoids overlapping labels, which is especially useful when the number of individuals is large and individual labels are long.
Rendering is performed using a Two-Pass strategy: edges are drawn first to ensure center-to-center connectivity, followed by nodes and labels. This ensures perfect visual alignment in high-resolution vector outputs. The function also supports real-time ancestry and descendant highlighting.
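The drawing order described above can be illustrated with base R graphics. This is a minimal sketch on a hypothetical three-node layout, not the package's internal code: the `nodes` and `edges` tables, the `draw_two_pass()` helper, and the `radius` argument are all assumptions made for illustration.

```r
# Hypothetical toy layout: two parents and one offspring
nodes <- data.frame(
  id = c("Sire", "Dam", "Kid"),
  x  = c(1, 3, 2),
  y  = c(2, 2, 1),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from = c("Sire", "Dam"),
  to   = c("Kid", "Kid"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of visPedigree)
draw_two_pass <- function(nodes, edges, radius = 0.15) {
  plot(NA, xlim = range(nodes$x) + c(-1, 1), ylim = range(nodes$y) + c(-1, 1),
       axes = FALSE, xlab = "", ylab = "", asp = 1)
  # Pass 1: edges, drawn from node center to node center
  from <- match(edges$from, nodes$id)
  to   <- match(edges$to, nodes$id)
  segments(nodes$x[from], nodes$y[from], nodes$x[to], nodes$y[to],
           col = "grey40")
  # Pass 2: nodes and labels, painted on top of the edge ends
  symbols(nodes$x, nodes$y, circles = rep(radius, nrow(nodes)),
          inches = FALSE, add = TRUE, bg = "white", fg = "black")
  text(nodes$x, nodes$y, labels = nodes$id, cex = 0.6)
  invisible(list(from = from, to = to))
}

pdf(NULL)  # null device, so the sketch runs without opening a window
res <- draw_two_pass(nodes, edges)
dev.off()
```

Because the filled node symbols are drawn last, they cover the segment ends, so each edge appears to stop at the node border while still pointing at its center.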
This function can draw the graph of a very large pedigree (> 10,000 individuals per generation) by compacting full-sib individuals. It is highly effective for aquatic animal pedigrees, which usually include many full-sib families per generation in nucleus breeding populations. The outline of a pedigree without individual labels is still shown if the width of a pedigree graph exceeds the maximum width (500 inches) of the PDF file.
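The idea behind compaction can be sketched in base R. This is a simplified illustration under stated assumptions, not the algorithm used by visped(): the toy pedigree and the `compact_fullsibs()` helper are hypothetical, the sketch assumes that only full sibs that are not themselves parents are collapsed, and it ignores sex, whereas the package displays separate family nodes per sex.

```r
# Hypothetical toy pedigree: four full sibs, one of which (A1) has progeny
toy_ped <- data.frame(
  Ind  = c("S1", "D1", "A1", "A2", "A3", "A4", "B1"),
  Sire = c(NA,   NA,   "S1", "S1", "S1", "S1", "A1"),
  Dam  = c(NA,   NA,   "D1", "D1", "D1", "D1", NA),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of visPedigree)
compact_fullsibs <- function(ped) {
  has_parents <- !is.na(ped$Sire) & !is.na(ped$Dam)
  is_parent   <- ped$Ind %in% c(ped$Sire, ped$Dam)
  # Assumption: collapse only non-parent individuals with both parents known
  collapse <- has_parents & !is_parent
  sibs <- ped[collapse, ]
  key  <- paste(sibs$Sire, sibs$Dam, sep = " x ")
  fam  <- as.data.frame(table(Family = key), responseName = "N",
                        stringsAsFactors = FALSE)
  fam$Label <- as.character(fam$N)  # the count becomes the family node label
  list(kept = ped[!collapse, ], families = fam)
}

res <- compact_fullsibs(toy_ped)
res$kept$Ind   # individuals that stay as separate nodes
res$families   # one row per collapsed full-sib family
```

In this toy case A2, A3, and A4 collapse into one family node, while A1 stays visible because it links to the next generation.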
In the graph, two shapes and four colors are used. Circles represent individuals, and squares represent families. Dark sky blue indicates males, dark goldenrod indicates females, purple indicates monoecious individuals (common in plant breeding, where the same individual serves as both male and female parent), and dark olive green indicates unknown sex. For example, a dark sky blue circle represents a male individual; a dark goldenrod square represents all female individuals in a full-sib family when compact = TRUE.
The function mainly produces a plot on the current graphics device and/or a PDF file. It invisibly returns a list containing the graph object, layout coordinates, and node sizes.
Isolated individuals (those with no parents and no progeny, assigned Gen 0) are automatically filtered out and not shown in the plot. A message will be issued if any such individuals are removed.
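The filtering rule above can be reproduced on a plain data frame to preview which records would be dropped. This is a base R sketch on a hypothetical toy pedigree; the column names follow the Ind, Sire, and Dam convention used in the examples, and the code is an approximation of the rule as described, not the package's implementation.

```r
# Hypothetical toy pedigree with one unconnected record ("Lone")
toy_ped <- data.frame(
  Ind  = c("F1", "F2", "K1", "Lone"),
  Sire = c(NA,   NA,   "F1", NA),
  Dam  = c(NA,   NA,   "F2", NA),
  stringsAsFactors = FALSE
)

no_parents <- is.na(toy_ped$Sire) & is.na(toy_ped$Dam)
no_progeny <- !(toy_ped$Ind %in% c(toy_ped$Sire, toy_ped$Dam))
isolated   <- no_parents & no_progeny

if (any(isolated)) {
  message("Isolated individuals not plotted: ",
          paste(toy_ped$Ind[isolated], collapse = ", "))
}
ped_plot <- toy_ped[!isolated, ]
```

Founders such as F1 and F2 are kept because they have progeny; only records with neither parents nor progeny are removed.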
tidyped for tidying pedigree data (required input)
vismat for visualizing relationship matrices as heatmaps
pedmat for computing relationship matrices
splitped for splitting pedigree into connected components
plot.igraph underlying plotting function
library(visPedigree)
library(data.table)

# Drawing a simple pedigree
simple_ped_tidy <- tidyped(simple_ped)
visped(simple_ped_tidy, cex = 0.25, symbolsize = 5.5)

# Highlighting an individual and its ancestors and descendants
visped(simple_ped_tidy, highlight = "J5X804", trace = "all",
       cex = 0.25, symbolsize = 5.5)

# Showing inbreeding coefficients in the graph
simple_ped_tidy_inbreed <- tidyped(simple_ped, inbreed = TRUE)
visped(simple_ped_tidy_inbreed, showf = TRUE, cex = 0.25, symbolsize = 5.5)

# visped() will automatically compute inbreeding coefficients if 'f' is missing
visped(simple_ped_tidy, showf = TRUE, cex = 0.25, symbolsize = 5.5)

# Adjusting page width and symbol size for better layout
# Increase pagewidth to spread nodes horizontally in the pdf file
# Increase symbolsize for more padding around individual labels
visped(simple_ped_tidy, cex = 0.25, symbolsize = 5.5, pagewidth = 100,
       file = tempfile(fileext = ".pdf"))

# Highlighting multiple individuals with custom colors
visped(simple_ped_tidy,
       highlight = list(ids = c("J3Y620", "J1X971"),
                        frame.color = "#4caf50",
                        color = "#81c784"),
       cex = 0.25, symbolsize = 5.5)

# Handling large pedigrees: Saving to PDF is recommended for legibility
# The 'trace' and 'tracegen' parameters in tidyped() help prune the graph
cand_labels <- big_family_size_ped[(Year == 2007) & (substr(Ind, 1, 2) == "G8"), Ind]
big_ped_tidy <- tidyped(big_family_size_ped, cand = cand_labels,
                        trace = "up", tracegen = 2)

# Use compact = TRUE for large families
visped(big_ped_tidy, compact = TRUE, cex = 0.08, symbolsize = 5.5,
       file = tempfile(fileext = ".pdf"))

# Use outline = TRUE if individual labels are not required
visped(big_ped_tidy, compact = TRUE, outline = TRUE,
       file = tempfile(fileext = ".pdf"))