formatClones - Generate an ordered list of airrClone objects for lineage construction
formatClones takes a data.frame or tibble with AIRR or Change-O style columns
as input, masks gap positions and ragged ends, removes duplicate sequences,
and merges the annotations associated with duplicate sequences. If specified,
it will un-merge duplicate sequences that have different values in the
columns given by the trait option. It returns a list of airrClone objects,
ordered by number of sequences, which serve as input for lineage reconstruction.
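To make the duplicate handling described above concrete, here is a minimal base R sketch of collapsing identical sequences while optionally keeping sequences with different trait values apart. It is an illustration only, not the package's implementation: the helper `collapse_duplicates`, the `collapse_count` column, and the toy data are hypothetical, and the sketch matches sequences by exact string equality, which may be stricter than what the package does.

```r
# Toy data; column names mirror AIRR conventions but the values are made up.
seqs <- data.frame(
  sequence_id = c("s1", "s2", "s3", "s4"),
  sequence_alignment = c("ACGT", "ACGT", "ACGT", "ACGA"),
  sample_id = c("A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse rows with identical sequences. When `trait`
# is given, rows that differ in that column are kept as separate entries.
collapse_duplicates <- function(df, seq = "sequence_alignment", trait = NULL) {
  keys <- df[, c(seq, trait), drop = FALSE]
  key <- do.call(paste, c(keys, sep = "\r"))
  groups <- split(df, factor(key, levels = unique(key)))
  merged <- lapply(groups, function(g) {
    out <- g[1, , drop = FALSE]
    # Merge the annotations of the duplicates into the surviving row
    out$sequence_id <- paste(g$sequence_id, collapse = ",")
    out$collapse_count <- nrow(g)
    out
  })
  out <- do.call(rbind, unname(merged))
  rownames(out) <- NULL
  out
}

collapsed <- collapse_duplicates(seqs)                       # s1, s2, s3 merge
by_sample <- collapse_duplicates(seqs, trait = "sample_id")  # s3 stays separate
```

Without a trait, the three identical sequences collapse into one row. With `trait = "sample_id"`, the copy from sample B is kept distinct, which is the un-merging behaviour the description refers to.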
formatClones(
  data,
  seq = "sequence_alignment",
  clone = "clone_id",
  subclone = "subclone_id",
  nproc = 1,
  chain = "H",
  heavy = "IGH",
  cell = "cell_id",
  locus = "locus",
  minseq = 2,
  split_light = FALSE,
  majoronly = FALSE,
  columns = NULL,
  ...
)
- data: data.frame containing the AIRR or Change-O data for a clone. See makeAirrClone for required columns and their defaults.
- seq: name of the sequence alignment column.
- clone: name of the column containing the identifier for the clone. All entries in this column should be identical.
- subclone: name of the column containing the identifier for the subclone.
- nproc: number of cores to parallelize formatting over.
- chain: if HL, include light chain information if available.
- heavy: name of the heavy chain locus (default = "IGH").
- cell: name of the column containing cell assignment information.
- locus: name of the column containing locus information.
- minseq: minimum number of sequences per clone.
- split_light: split or lump subclones? See
- majoronly: only return the largest subclone and sequences without light chains.
- columns: additional data columns to include in the output.
- ...: additional arguments to pass to makeAirrClone.
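As a rough illustration of how a minimum clone size (the minseq argument) and ordering by number of sequences can interact, the base R sketch below filters and sorts clone identifiers from a toy table. It is a sketch under stated assumptions, not the function's internals: the toy data are made up, and the text here does not say whether sequences are counted before or after duplicates are collapsed.

```r
# Toy table with one row per sequence (made-up identifiers)
seqs <- data.frame(
  sequence_id = paste0("s", 1:6),
  clone_id = c("c1", "c2", "c2", "c3", "c3", "c3"),
  stringsAsFactors = FALSE
)

minseq <- 2

# Number of sequences per clone, as a named integer vector
sizes <- vapply(split(seqs$sequence_id, seqs$clone_id), length, integer(1))

# Drop clones below the threshold, then order the rest from largest to smallest
kept <- sizes[sizes >= minseq]
ordered_ids <- names(sort(kept, decreasing = TRUE))
```

Here clone c1 has a single sequence and is dropped, while c3 (three sequences) is listed before c2 (two sequences).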
A tibble of airrClone objects containing modified clones.
This function is a wrapper for makeAirrClone. It also removes whitespace and the characters ;, :, and = from IDs.
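The ID clean-up mentioned above can be approximated with a single regular expression. This is a minimal sketch, assuming the listed characters are simply deleted; the example IDs are made up, and the package's exact replacement behaviour may differ.

```r
# Made-up IDs containing whitespace and the characters ; : =
ids <- c("cell 1;IGH", "seq:42=a", "clean_id")

# Delete whitespace, semicolons, colons, and equals signs
clean_ids <- gsub("[[:space:];:=]", "", ids)

# Stripping characters can make two IDs collide, so check for duplicates
has_collision <- anyDuplicated(clean_ids) > 0
```

These characters are typically a problem because they carry structural meaning in tree formats such as Newick, so checking IDs before formatting can save a confusing error later.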
library(dowser)
data(ExampleAirr)

# Select two clones, for demonstration purposes
sel <- c("3170", "3184")
clones <- formatClones(ExampleAirr[ExampleAirr$clone_id %in% sel, ], trait = "sample_id")
Calls makeAirrClone on each clone in order. Returns a tibble of airrClone objects, which serve as input to getTrees and findSwitches.