Prepare the input files needed to run GSEA. The CLS (phenotype) file is built from the slot active.ident, so set this slot to the identities that capture your populations of interest before calling the function.

Usage

prepare_gsea(
  sobj = NULL,
  do_cls = TRUE,
  do_gmt = TRUE,
  do_expression = TRUE,
  gene_sets_list = NULL,
  signature_builder = NULL,
  signature_population = NULL,
  signature_reference = NULL,
  n = 300,
  assay = "RNA",
  theslot = "data",
  to_upper = FALSE,
  save_path = NULL,
  name_expression = "GSEA_gene_expression",
  name_cls = "GSEA_phenotype_label",
  name_chip = "GSEA_chip_annotations",
  name_gmt = "GSEA_genelists"
)

Arguments

sobj

A Seurat object

do_cls

LOGICAL : whether to build the CLS file or not (default to TRUE)
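
To make the link between active.ident and the CLS file concrete, here is a minimal base R sketch of the categorical CLS layout that GSEA expects: a header line with the number of samples, the number of classes and the constant 1, a line listing the class names, and a line with one label per sample. The helper `build_cls_lines` and the example identities are hypothetical and are not taken from this package; the function's own implementation may differ.

```r
# Hypothetical stand-in for the identities stored in active.ident
idents <- factor(c("Tcell", "Tcell", "Bcell", "Tcell", "Bcell"))

# Assumed helper (not part of the package): build the three CLS lines
build_cls_lines <- function(idents) {
  idents <- as.character(idents)
  classes <- unique(idents)  # class names in order of first appearance
  c(
    paste(length(idents), length(classes), 1),  # <samples> <classes> 1
    paste(c("#", classes), collapse = " "),     # "# class_1 class_2 ..."
    paste(idents, collapse = " ")               # one label per sample
  )
}

cls_lines <- build_cls_lines(idents)

# Write to a temporary directory; the file name mirrors the default name_cls
writeLines(cls_lines, file.path(tempdir(), "GSEA_phenotype_label.cls"))
```

Class names are listed in order of first appearance so that they line up with the labels on the third line.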

do_gmt

LOGICAL : whether to build the GMT file or not (default to TRUE)
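
For readers unfamiliar with the GMT format, the sketch below shows its general shape in base R: one gene set per line, with the set name, a description field, and the gene names separated by tabs. The gene sets, the `"na"` description and the helper `build_gmt_lines` are illustrative assumptions, not the package's code.

```r
# Hypothetical gene sets; in practice these would come from MSigDB
# or from custom signatures
gene_sets <- list(
  SIGNATURE_A = c("CD3E", "CD3D", "IL7R"),
  SIGNATURE_B = c("MS4A1", "CD79A")
)

# Assumed helper (not part of the package): one tab-separated line per set
build_gmt_lines <- function(gene_sets, description = "na") {
  vapply(names(gene_sets), function(set_name) {
    paste(c(set_name, description, gene_sets[[set_name]]), collapse = "\t")
  }, character(1), USE.NAMES = FALSE)
}

gmt_lines <- build_gmt_lines(gene_sets)

# Write to a temporary directory; the file name mirrors the default name_gmt
writeLines(gmt_lines, file.path(tempdir(), "GSEA_genelists.gmt"))
```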

do_expression

LOGICAL : whether to build the TXT and CHIP files or not (default to TRUE)
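
As a rough illustration of what the TXT and CHIP files contain, the base R sketch below builds both tables from a small matrix. It assumes the usual GSEA layouts: a tab-delimited TXT file whose first two columns are NAME and DESCRIPTION followed by one column per sample, and a CHIP file with the columns Probe Set ID, Gene Symbol and Gene Title. The matrix, the helper names and the `"na"` placeholders are hypothetical; the package may fill these fields differently.

```r
# Hypothetical expression matrix (genes in rows, cells in columns),
# standing in for the chosen slot of the chosen assay
expr <- matrix(
  c(0, 1.2, 2.5, 0, 0.7, 3.1),
  nrow = 2,
  dimnames = list(c("Cd3e", "Ms4a1"), c("cell_1", "cell_2", "cell_3"))
)

# Assumed helper: expression table in the GSEA TXT layout
build_expression_table <- function(expr, to_upper = FALSE) {
  genes <- rownames(expr)
  if (to_upper) genes <- toupper(genes)
  data.frame(NAME = genes, DESCRIPTION = "na", expr,
             check.names = FALSE, row.names = NULL, stringsAsFactors = FALSE)
}

# Assumed helper: annotation table in the GSEA CHIP layout
build_chip_table <- function(genes) {
  data.frame("Probe Set ID" = genes, "Gene Symbol" = genes, "Gene Title" = "na",
             check.names = FALSE, stringsAsFactors = FALSE)
}

txt <- build_expression_table(expr, to_upper = TRUE)
chip <- build_chip_table(txt$NAME)

# Both files are tab-delimited, without quotes or row names
write.table(txt, file.path(tempdir(), "GSEA_gene_expression.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(chip, file.path(tempdir(), "GSEA_chip_annotations.chip"),
            sep = "\t", quote = FALSE, row.names = FALSE)
```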

gene_sets_list

CHARACTER : a vector of signature names from MSigDB, for example "BIOCARTA_CFTR_PATHWAY" or "E2F1_UP.V1_DN", to add to the GMT file. This parameter is taken into account only if do_gmt is set to TRUE (default to NULL)

signature_builder

DATA.FRAME : to create your own signatures, provide a data frame with three columns : signature_name, population, reference. The first column (signature_name) is the signature name. The second column (population) is an index into the named list signature_population. The third column (reference) is an index into the list signature_reference. A signature is built by selecting the n most significantly upregulated genes in the chosen population compared with the reference populations. Example : signature_builder = data.frame(signature_name = c("A", "B", "C"), signature_population = c(1,1,2), signature_reference = c(1,2,2)) (no default)

signature_population

LIST : a list whose names correspond to the column population in signature_builder and whose values correspond to populations in the slot active.ident of sobj (no default)

signature_reference

LIST : a list whose names correspond to the column reference in signature_builder and whose values correspond to populations in the slot active.ident of sobj (no default)
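
The three signature arguments work together, so a small base R sketch may help show how one row of signature_builder could map to a contrast. It assumes the column names given in the description of signature_builder (signature_name, population, reference) and character keys as list names; the population names and the helper `resolve_contrasts` are hypothetical. The differential expression step that selects the n genes is not shown.

```r
# One row per signature; population and reference hold keys into the lists below
signature_builder <- data.frame(
  signature_name = c("A", "B", "C"),
  population = c("1", "1", "2"),
  reference = c("1", "2", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical population names, assumed to exist in active.ident
signature_population <- list("1" = "Tcell", "2" = "Bcell")
signature_reference <- list("1" = c("Bcell", "Monocyte"), "2" = "Monocyte")

# Assumed helper (not part of the package): resolve each row to a contrast
resolve_contrasts <- function(builder, populations, references) {
  lapply(seq_len(nrow(builder)), function(i) {
    list(
      name = builder$signature_name[i],
      population = populations[[builder$population[i]]],
      reference = references[[builder$reference[i]]]
    )
  })
}

contrasts <- resolve_contrasts(signature_builder, signature_population,
                               signature_reference)
```

Under these assumptions, signature "A" contrasts Tcell against Bcell and Monocyte, "B" contrasts Tcell against Monocyte, and "C" contrasts Bcell against Monocyte.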

n

INTEGER : number of genes in each signature (default to 300)

assay

CHARACTER : the assay on which to get the slot to build the expression matrix. This parameter is taken into account only if do_expression is set to TRUE (default to 'RNA')

theslot

CHARACTER : the slot in the assay to build the expression matrix. This parameter is taken into account only if do_expression is set to TRUE (default to 'data')

to_upper

LOGICAL : whether to convert all gene names to upper case or not (default to FALSE)

save_path

CHARACTER : full path where to save the files (no default)

name_expression

CHARACTER : file name without extension for the expression matrix (default to 'GSEA_gene_expression')

name_cls

CHARACTER : file name without extension for the CLS file (default to 'GSEA_phenotype_label')

name_chip

CHARACTER : file name without extension for the CHIP file (default to 'GSEA_chip_annotations')

name_gmt

CHARACTER : file name without extension for the GMT file (default to 'GSEA_genelists')

Value

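Saves up to four files to save_path : the expression matrix (TXT) and CHIP files if do_expression is TRUE, the CLS file if do_cls is TRUE, and the GMT file if do_gmt is TRUE.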