My functions to perform the various overrepresentation/enrichment analyses generally assume the result of extract_significant_genes() as the input. Sadly, I do not want to save those data structures to disk because they can be quite monstrously large when serialized (because I keep a reference to the input in them which gets dereferenced on save()). As a result, it is far simpler/safer to just read the xlsx outputs produced by combine_de_tables and extract_significant_genes. Ergo the following two little functions.
table_reader <- function(xlsx, sheet = "outcome") {
input_df <- openxlsx::readWorkbook(xlsx, sheet = sheet, startRow = 2)
rownames(input_df) <- input_df[["row.names"]]
input_df[["row.names"]] <- NULL
print(dim(input_df))
if (nrow(input_df) < 5000) {
warning("Something appears wrong with the de table: ", input_xlsx)
}
return(input_df)
}
sig_reader <- function(xlsx, sheet = "outcome", direction = "up") {
this_sheet <- paste0(direction, "_deseq_", sheet)
input_df <- openxlsx::readWorkbook(xlsx, sheet = this_sheet, startRow = 2)
rownames(input_df) <- input_df[["row.names"]]
input_df[["row.names"]] <- NULL
dim(input_df)
if (nrow(input_df) < 20) {
message("There are less than 20 rows in this significance table, it is unlikely to be interesting.")
}
return(input_df)
}I have been having some difficulties getting the treeplot fonts to match our readability goals. The set of things I have tried so far are:
With that in mind, the following two parameters are going to set the width and height of the output pdf documents which will be sent to Maria Adelaida in order to arrange into the final figures via inkscape.
treeplot_height <- 6
treeplot_width <- 12
wrap_width <- 24
desired_go_ont <- "BP"I once again moved the ontology analyses to their own document. This is primarily because I want a fresh notebook to play around with these and get some of the resulting clutter out of the DE notebooks.
The con of doing this is that some of my enrichment methods are smart enough to directly take the output from combine_de_tables()/extract_significant_genes(). I guess I can dispatch a method to take the xlsx file as input…
Thus, for each enrichment analysis that was in the differential expression documents, there will now be a few stanzas here, one which loads the appropriate xlsx file, one which performs gProfiler, and one which uses clusterProfiler. On occasion there may be another with random stuff like me poking at transcription factors or other side interests…
In addition, I am going to do the Tumaco-only analyses first, because they are what are actually in the paper.
The gene set enrichment will follow each DE analysis during this document. I am adding a series of explicitly GSEA analyses in this most recent iteration, in these I will pass the full DE table and check the distribution of logFC values against the genes in each category as opposed to the simpler over-enrichment of the high/low DE genes.
Most (all?) of the gene set enrichment analyses performed in this document are a combination of gProfiler2 (Kolberg et al. (2020)) and clusterProfiler (Yu et al. (2012)). There are functions available in hpgltools to also perform a few other methods, but for the moment I am sticking to only these two. Oh yeah, I need to wrap clusterProfiler’s tree (taken from topGO) plotter because it spams plots everywhere…
tc_clinical_xlsx <- glue("analyses/3_cali_and_tumaco/DE_Cure_Fail/Clinical_Samples/tc_clinical_cf_table_sva-v{ver}.xlsx")
all_de_cf_table <- table_reader(tc_clinical_xlsx, "outcome")## [1] 12162 85
all_de_cf_up <- sig_reader(glue("analyses/3_cali_and_tumaco/DE_Cure_Fail/Clinical_Samples/tc_clinical_cf_sig_sva-v{ver}.xlsx"), "outcome", "up")
all_de_cf_down <- sig_reader(glue("analyses/3_cali_and_tumaco/DE_Cure_Fail/Clinical_Samples/tc_clinical_cf_sig_sva-v{ver}.xlsx"), "outcome", "down")
all_de_cf <- rbind.data.frame(all_de_cf_up, all_de_cf_down)
t_clinical_xlsx <- glue("analyses/4_tumaco/DE_Cure_Fail/t_all_visitcf_sig_sva-v{ver}.xlsx")
t_de_cf <- openxlsx::readWorkbook(t_clinical_xlsx, sheet = 3, startRow = 2)
rownames(t_de_cf) <- t_de_cf[["row.names"]]
t_de_cf[["row.names"]] <- NULL
t_de_cf_up <- t_de_cf[, c("deseq_logfc", "deseq_adjp", "deseq_basemean", "deseq_lfcse")]
t_de_cf <- openxlsx::readWorkbook(t_clinical_xlsx, sheet = 4, startRow = 2)
rownames(t_de_cf) <- t_de_cf[["row.names"]]
t_de_cf[["row.names"]] <- NULL
t_de_cf_down <- t_de_cf[, c("deseq_logfc", "deseq_adjp", "deseq_basemean", "deseq_lfcse")]
t_de_cf <- rbind.data.frame(t_de_cf_up, t_de_cf_down)The following gProfiler searches use the all_gprofiler() function instead of simple_gprofiler(). As a result, the results are separated by {contrast}_{direction}. Thus ‘outcome_down’.
The same plots are available as the previous gProfiler searches, but in many of the following runs, I used the dotplot() function to get a slightly different view of the results.
The xlsx files are:
t_clinical_cf_xlsx <- glue("{cf_prefix}/All_Samples/t_clinical_cf_table_sva-v{ver}.xlsx")
t_clinical_cf_table_sva <- table_reader(t_clinical_cf_xlsx, "outcome")## [1] 14156 85
t_clinical_cf_sig_xlsx <- glue("{cf_prefix}/All_Samples/t_clinical_cf_sig_sva-v{ver}.xlsx")
t_clinical_cf_sig_sva_up <- sig_reader(t_clinical_cf_sig_xlsx, "outcome")
t_clinical_cf_sig_sva_down <- sig_reader(t_clinical_cf_sig_xlsx, "outcome", "down")And without biopsies
t_clinical_nobiop_xlsx <- glue("{cf_prefix}/All_Samples/t_clinical_nobiop_cf_table_sva-v{ver}.xlsx")
t_clinicalnb_cf_table_sva <- table_reader(t_clinical_nobiop_xlsx, "outcome")## [1] 11910 92
t_clinical_nobiop_sig_xlsx <- glue("{cf_prefix}/All_Samples/t_clinical_nobiop_cf_sig_sva-v{ver}.xlsx")
t_clinicalnb_cf_sig_sva_up <- sig_reader(t_clinical_nobiop_sig_xlsx, "outcome")
t_clinicalnb_cf_sig_sva_down <- sig_reader(t_clinical_nobiop_sig_xlsx, "outcome", "down")
t_clinicalnb_cf_sig_sva_both <- rbind.data.frame(t_clinicalnb_cf_sig_sva_up,
t_clinicalnb_cf_sig_sva_down)After most of the DE analyses, the set of significantly DE genes gets passed to gProfiler2 and clusterProfiler. One slightly neat thing in my gprofiler (and goseq/topGO/gostats) function(s): it coerces the result into the same datastructure produced by clusterProfiler, thus one may play with the various plotting functions in the enrichplot (Yu (n.d.)) package. This is kind of fun because gProfiler2 provides easy access to a few datasets:
My general sense is that the comparisons of primary interest are reactome and one or more GO. I suspect that if there are lots of transcription factors, that might prove interesting.
t_cf_clinical_gp_up <- simple_gprofiler(
t_clinical_cf_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinical_cure_up_gp-v{ver}.xlsx"))
t_cf_clinical_gp_up## A set of ontologies produced by gprofiler using 94
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 3 MF
## 48 BP
## 0 KEGG
## 5 REAC
## 3 WP
## 29 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
go_termsim <- enrichplot::pairwise_termsim(t_cf_clinical_gp_up[["BP_enrich"]])
t_cf_clinical_gp_go_up_tree <- sm(enrichplot::treeplot(go_termsim,
label_format = wrap_width))
pp(file = "images/overrepresentation/t_cf_clinical_gp_up_tree.pdf",
width = treeplot_width, height = treeplot_height)## Warning in pp(file = "images/overrepresentation/t_cf_clinical_gp_up_tree.pdf",
## : The directory: images/overrepresentation does not exist, will attempt to
## create it.
t_cf_clinical_gp_go_up_tree
dev.off()## png
## 2
t_cf_clinical_gp_go_up_treeenrichplot::dotplot(t_cf_clinical_gp_up[["BP_enrich"]])reactome_termsim <- enrichplot::pairwise_termsim(t_cf_clinical_gp_up[["REAC_enrich"]])
t_cf_clinical_gp_reac_up_tree <- sm(enrichplot::treeplot(reactome_termsim,
label_format = wrap_width))
pp(file = "images/overrepresentation/t_cf_clinical_gp_up_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinical_gp_reac_up_tree
dev.off()## png
## 2
t_cf_clinical_gp_reac_up_treepp(file = "images/overrepresentation/t_cf_clinical_gp_up_dot.pdf",
width = treeplot_width, height = treeplot_height)
enrichplot::dotplot(t_cf_clinical_gp_up[["REAC_enrich"]])
dev.off()## png
## 2
enrichplot::dotplot(t_cf_clinical_gp_up[["REAC_enrich"]])tf_termsim <- enrichplot::pairwise_termsim(t_cf_clinical_gp_up[["TF_enrich"]])
## The treeplot fails for this for some reason?
##t_cf_clinical_gp_tf_up_tree <- sm(enrichplot::treeplot(tf_termsim))
enrichplot::dotplot(t_cf_clinical_gp_up[["TF_enrich"]])enrichplot::dotplot(t_cf_clinical_gp_up[["WP_enrich"]])No biopsies!
t_cf_clinicalnb_gp_up <- simple_gprofiler(
t_clinicalnb_cf_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinicalnb_cure_up_gp-v{ver}.xlsx"))
t_cf_clinicalnb_gp_up## A set of ontologies produced by gprofiler using 140
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 6 MF
## 75 BP
## 0 KEGG
## 6 REAC
## 2 WP
## 41 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
go_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_up[["BP_enrich"]])
t_cf_clinicalnb_gp_go_up_tree <- sm(enrichplot::treeplot(go_termsim,
label_format = wrap_width))
pp(file = "figures/t_cf_clinicalnb_gp_up_tree.svg",
width = treeplot_width, height = treeplot_height)
t_cf_clinicalnb_gp_go_up_tree
dev.off()## png
## 2
t_cf_clinicalnb_gp_go_up_treeenrichplot::dotplot(t_cf_clinicalnb_gp_up[["BP_enrich"]])reactome_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_up[["REAC_enrich"]])
t_cf_clinicalnb_gp_reac_up_tree <- sm(enrichplot::treeplot(reactome_termsim,
label_format = wrap_width))
pp(file = "images/overrepresentation/t_cf_clinicalnb_gp_up_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinicalnb_gp_reac_up_tree
dev.off()## png
## 2
t_cf_clinicalnb_gp_reac_up_treepp(file = "images/overrepresentation/t_cf_clinicalnb_gp_up_dot.pdf",
width = treeplot_width, height = treeplot_height)
enrichplot::dotplot(t_cf_clinicalnb_gp_up[["REAC_enrich"]])
dev.off()## png
## 2
enrichplot::dotplot(t_cf_clinicalnb_gp_up[["REAC_enrich"]])tf_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_up[["TF_enrich"]])
## The treeplot fails for this for some reason?
##t_cf_clinicalnb_gp_tf_up_tree <- sm(enrichplot::treeplot(tf_termsim))
enrichplot::dotplot(t_cf_clinicalnb_gp_up[["TF_enrich"]])enrichplot::dotplot(t_cf_clinicalnb_gp_up[["WP_enrich"]])The only significant results for this group appear to be GO.
t_cf_clinical_gp_down <- simple_gprofiler(
t_clinical_cf_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinical_fail_up_gp-v{ver}.xlsx"))
t_cf_clinical_gp_down## A set of ontologies produced by gprofiler using 183
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 9 MF
## 73 BP
## 0 KEGG
## 1 REAC
## 0 WP
## 1 TF
## 1 MIRNA
## 1 HPA
## 0 CORUM
## 0 HP hits.
go_termsim <- enrichplot::pairwise_termsim(t_cf_clinical_gp_down[["BP_enrich"]])
t_cf_clinical_gp_go_down_tree <- sm(enrichplot::treeplot(go_termsim))
pp(file = "images/overrepresentation/t_cf_clinical_gp_down_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinical_gp_go_down_tree
dev.off()## png
## 2
No biopsies!
t_cf_clinicalnb_gp_down <- simple_gprofiler(
t_clinicalnb_cf_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinicalnb_fail_up_gp-v{ver}.xlsx"))
t_cf_clinicalnb_gp_down## A set of ontologies produced by gprofiler using 75
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 6 MF
## 52 BP
## 3 KEGG
## 5 REAC
## 0 WP
## 1 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 1 HP hits.
go_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_down[["BP_enrich"]])
t_cf_clinicalnb_gp_go_down_tree <- sm(enrichplot::treeplot(go_termsim))
pp(file = "images/overrepresentation/t_cf_clinicalnb_gp_down_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinicalnb_gp_go_down_tree
dev.off()## png
## 2
t_cf_clinicalnb_gp_go_down_tree## And both, this is not usually where I put this, but the
## clinical samples are a bit of a special case.
t_cf_clinicalnb_gp_both <- simple_gprofiler(
t_clinicalnb_cf_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinicalnb_fail_up_gp-v{ver}.xlsx"))
t_cf_clinicalnb_gp_both## A set of ontologies produced by gprofiler using 215
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 11 MF
## 127 BP
## 3 KEGG
## 8 REAC
## 2 WP
## 43 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
go_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_both[["BP_enrich"]])
t_cf_clinicalnb_gp_go_both_tree <- sm(enrichplot::treeplot(go_termsim))
pp(file = "images/overrepresentation/t_cf_clinicalnb_gp_both_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinicalnb_gp_go_both_tree
dev.off()## png
## 2
t_cf_clinicalnb_gp_go_both_treereac_termsim <- enrichplot::pairwise_termsim(t_cf_clinicalnb_gp_both[["REAC_enrich"]])
t_cf_clinicalnb_gp_reac_both_tree <- sm(enrichplot::treeplot(reac_termsim))
pp(file = "images/overrepresentation/t_cf_clinicalnb_gp_both_reac_tree.pdf",
width = treeplot_width, height = treeplot_height)
t_cf_clinicalnb_gp_reac_both_tree
dev.off()## png
## 2
t_cf_clinicalnb_gp_reac_both_treeThe following essentially repeats the gProfiler2 invocation using clusterProfiler. I have a couple of functions which compare the GO results from various methods, perhaps I should dig it out and see how similar the results are using these two tools. My assumption is that the primary differences should arise from the fact that gProfiler theoretically is updating their GO data over time and cProfiler uses the information in org.Hs.eg.db, which afaik has not changed in quite a while.
t_cf_clinical_cp_up <- simple_clusterprofiler(
t_clinical_cf_sig_sva_up, de_table = t_clinical_cf_table_sva,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinical_cure_up_cp-v{ver}.xlsx"))## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
## Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Loading required namespace: Vennerable
##
## Writing the MF data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_clinical_cp_up[["enrich_objects"]][["BP_all"]])t_cf_clinical_cp_down <- simple_clusterprofiler(
t_clinical_cf_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinical_fail_up_cp-v{ver}.xlsx"))## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_clinical_cp_down[["enrich_objects"]][["BP_all"]])Note that for both I did not bother with the analysis that included biopsies.
t_cf_clinicalnb_cp_both <- simple_clusterprofiler(
t_clinicalnb_cf_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Overrepresentation/clinical_fail_up_cp-v{ver}.xlsx"))## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Deleting the file analyses/4_tumaco/Gene_Set_Overrepresentation/clinical_fail_up_cp-v202501.xlsx before writing the tables.
##
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_clinical_cp_both[["enrich_objects"]][["BP_all"]])## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'dotplot': object 't_cf_clinical_cp_both' not found
GSEA is not associated with either up nor down, it takes the rank order of genes with respect (in this case) to fold-change.
t_cf_clinical_topn_gsea <- plot_topn_gsea(t_cf_clinical_cp_up)
t_cf_clinical_topn_gsea[["GO"]][[1]]t_cf_clinical_topn_gsea[["GO"]][[2]]t_cf_clinical_topn_gsea[["GO"]][[3]]t_cf_clinical_topn_gsea[["GO"]][[4]]t_cf_clinical_topn_gsea[["GO"]][[5]]The relevant xlsx files are:
tv1_later_xlsx <- glue("{xlsx_prefix}/DE_Visits/tv1_vs_later_tables-v{ver}.xlsx")
tv1_vs_later_table <- table_reader(tv1_later_xlsx, "later_vs_first")## [1] 11910 92
tv1_later_sig_xlsx <- glue("{xlsx_prefix}/DE_Visits/tv1_vs_later_sig-v{ver}.xlsx")
tv1_vs_later_up_sig <- sig_reader(tv1_later_sig_xlsx, "later_vs_first")
tv1_vs_later_down_sig <- sig_reader(tv1_later_sig_xlsx, "later_vs_first", "down")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
tv1_vs_later_both <- rbind.data.frame(tv1_vs_later_up_sig, tv1_vs_later_down_sig)I am not likely to do the decreased in visit 1, there are only 7 genes.
tv1later_up_gp <- simple_gprofiler(
tv1_vs_later_up_sig,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/tv1_vs_later_up_gp-v{ver}.xlsx"))
tv1later_up_gp## A set of ontologies produced by gprofiler using 24
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 1 MF
## 5 BP
## 0 KEGG
## 3 REAC
## 2 WP
## 1 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(tv1later_up_gp[["BP_enrich"]])enrichplot::dotplot(tv1later_up_gp[["REAC_enrich"]])tv1later_both_gp <- simple_gprofiler(
tv1_vs_later_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/tv1_vs_later_both_gp-v{ver}.xlsx"))
tv1later_both_gp## A set of ontologies produced by gprofiler using 31
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 2 MF
## 7 BP
## 1 KEGG
## 3 REAC
## 2 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(tv1later_both_gp[["BP_enrich"]])enrichplot::dotplot(tv1later_both_gp[["REAC_enrich"]])Note to self: make some classes for plot_topn_gsea so it can handle the various inputs it is likely to receive.
tv1later_up_cp <- simple_clusterprofiler(
tv1_vs_later_up_sig, de_table = tv1_vs_later_table,
orgdb = "org.Hs.eg.db", kegg_prefix = "hs", do_kegg = TRUE)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(tv1later_up_cp[["enrich_objects"]][["MF_all"]])enrichplot::dotplot(tv1later_up_cp[["enrich_objects"]][["BP_all"]])written <- write_cp_data(
tv1later_up_cp,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/tv1_vs_later_up_cp-v{ver}.xlsx"))## Writing a sheet containing the legend.
## Writing the MF data.
## Writing the CC data.
## Finished writing excel file.
tv1later_both_cp <- simple_clusterprofiler(
tv1_vs_later_both, de_table = tv1_vs_later_table,
orgdb = "org.Hs.eg.db", kegg_prefix = "hs", do_kegg = TRUE)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(tv1later_both_cp[["enrich_objects"]][["MF_all"]])enrichplot::dotplot(tv1later_both_cp[["enrich_objects"]][["BP_all"]])written <- write_cp_data(
tv1later_both_cp,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/tv1_vs_later_both_cp-v{ver}.xlsx"))## Writing a sheet containing the legend.
## Writing the BP data.
## Writing the MF data.
## Writing the CC data.
## Writing the KEGG data.
## Finished writing excel file.
tv1later_topn_gsea <- plot_topn_gsea(tv1later_up_cp)
tv1later_topn_gsea[["GO"]][[1]]tv1later_topn_gsea[["GO"]][[2]]The relevant input xlsx files are:
Oh, I messed up and put the DE results in the GSEA directory!
Let us see if we observe general male/female differences in the data. This has an important caveat: there are few female failures in the dataset and so the results may reflect that.
t_sex_xlsx <- glue("{xlsx_prefix}/DE_Sex/t_sex_cure_table-v{ver}.xlsx")
t_sex_table <- table_reader(t_sex_xlsx, "male_vs_female")## [1] 13971 92
t_sex_sig_xlsx <- glue("{xlsx_prefix}/DE_Sex/t_sex_cure_sig-v{ver}.xlsx")
t_sex_up_sig <- sig_reader(t_sex_sig_xlsx, "male_vs_female")
t_sex_down_sig <- sig_reader(t_sex_sig_xlsx, "male_vs_female", "down")
t_sex_both_sig <- rbind.data.frame(t_sex_up_sig, t_sex_down_sig)x ### gProfiler
t_sex_up_gp <- simple_gprofiler(
t_sex_up_sig,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/sex_male_up_gp-v{ver}.xlsx"))
t_sex_up_gp## A set of ontologies produced by gprofiler using 176
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 9 MF
## 38 BP
## 2 KEGG
## 7 REAC
## 0 WP
## 35 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 9 HP hits.
enrichplot::dotplot(t_sex_up_gp[["BP_enrich"]])enrichplot::dotplot(t_sex_up_gp[["REAC_enrich"]])enrichplot::dotplot(t_sex_up_gp[["TF_enrich"]])t_sex_down_gp <- simple_gprofiler(
t_sex_down_sig,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/sex_female_up_gp-v{ver}.xlsx"))
t_sex_down_gp## A set of ontologies produced by gprofiler using 134
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 6 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_sex_both_gp <- simple_gprofiler(
t_sex_both_sig,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/sex_female_up_gp-v{ver}.xlsx"))
t_sex_both_gp## A set of ontologies produced by gprofiler using 310
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 9 MF
## 39 BP
## 1 KEGG
## 7 REAC
## 0 WP
## 34 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 9 HP hits.
Given the results from gProfiler, I do not have high expectations for clusterProfiler, so I am not likely to take the time to write them out. It seems like male/female is confounded with cure/fail.
t_sex_up_cp <- simple_clusterprofiler(
t_sex_up_sig, de_table = t_sex_table,
orgdb = "org.Hs.eg.db", kegg_prefix = "hs", do_kegg = TRUE)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(t_sex_up_cp[["enrich_objects"]][["BP_all"]])t_sex_down_cp <- simple_clusterprofiler(
t_sex_down_sig, de_table = t_sex_table,
orgdb = "org.Hs.eg.db", kegg_prefix = "hs", do_kegg = TRUE)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
enrichplot::dotplot(t_sex_down_cp[["enrich_objects"]][["BP_all"]])t_sex_both_cp <- simple_clusterprofiler(
t_sex_both_sig, de_table = t_sex_table,
orgdb = "org.Hs.eg.db", kegg_prefix = "hs", do_kegg = TRUE)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(t_sex_both_cp[["enrich_objects"]][["BP_all"]])t_sex_topn_gsea <- plot_topn_gsea(t_sex_up_cp)
t_sex_topn_gsea[["GO"]][[1]]Once again, the relevant files to load:
t_ethnicity_xlsx <- glue("{xlsx_prefix}/DE_Ethnicity/t_ethnicity_table-v{ver}.xlsx")
t_ethnicity_mestizo_indigenous <- table_reader(t_ethnicity_xlsx, "mestizo_indigenous")## [1] 14156 83
t_ethnicity_mestizo_afrocol <- table_reader(t_ethnicity_xlsx, "mestizo_afrocol")## [1] 14156 83
t_ethnicity_indigenous_afrocol <- table_reader(t_ethnicity_xlsx, "indigenous_afrocol")## [1] 14156 92
t_ethnicity_sig_xlsx <- glue("{xlsx_prefix}/DE_Ethnicity/t_ethnicity_sig-v{ver}.xlsx")
t_ethnicity_mestizo_indigenous_up <- sig_reader(t_ethnicity_sig_xlsx, "mestizo_indigenous")
t_ethnicity_mestizo_indigenous_down <- sig_reader(t_ethnicity_sig_xlsx, "mestizo_indigenous", "down")
t_ethnicity_mestizo_afrocol_up <- sig_reader(t_ethnicity_sig_xlsx, "mestizo_afrocol")
t_ethnicity_mestizo_afrocol_down <- sig_reader(t_ethnicity_sig_xlsx, "mestizo_afrocol", "down")
t_ethnicity_indigenous_afrocol_up <- sig_reader(t_ethnicity_sig_xlsx, "indigenous_afrocol")
t_ethnicity_indigenous_afrocol_down <- sig_reader(t_ethnicity_sig_xlsx, "indigenous_afrocol", "down")
t_ethnicity_any <- rbind.data.frame(
t_ethnicity_mestizo_indigenous_up,
t_ethnicity_mestizo_indigenous_down,
t_ethnicity_mestizo_afrocol_up,
t_ethnicity_mestizo_afrocol_down,
t_ethnicity_indigenous_afrocol_up,
t_ethnicity_indigenous_afrocol_down)## Error in rbind.data.frame(t_ethnicity_mestizo_indigenous_up, t_ethnicity_mestizo_indigenous_down, : numbers of columns of arguments do not match
mestizo_indigenous_up_gp <- simple_gprofiler(
t_ethnicity_mestizo_indigenous_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_mi_up_gp-v{ver}.xlsx"))
mestizo_indigenous_up_gp## A set of ontologies produced by gprofiler using 83
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 1 MF
## 7 BP
## 0 KEGG
## 2 REAC
## 0 WP
## 7 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
indigenous_mestizo_up_gp <- simple_gprofiler(
t_ethnicity_mestizo_indigenous_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_im_up_gp-v{ver}.xlsx"))
indigenous_mestizo_up_gp## A set of ontologies produced by gprofiler using 97
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 4 MF
## 16 BP
## 3 KEGG
## 4 REAC
## 2 WP
## 1 TF
## 0 MIRNA
## 1 HPA
## 0 CORUM
## 0 HP hits.
mestizo_afrocol_up_gp <- simple_gprofiler(
t_ethnicity_mestizo_afrocol_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_ma_up_gp-v{ver}.xlsx"))
mestizo_afrocol_up_gp## A set of ontologies produced by gprofiler using 57
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 5 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
afrocol_mestizo_up_gp <- simple_gprofiler(
t_ethnicity_mestizo_afrocol_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_am_up_gp-v{ver}.xlsx"))
afrocol_mestizo_up_gp## A set of ontologies produced by gprofiler using 92
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 3 MF
## 0 BP
## 0 KEGG
## 1 REAC
## 0 WP
## 7 TF
## 3 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
indigenous_afrocol_up_gp <- simple_gprofiler(
t_ethnicity_indigenous_afrocol_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_ia_up_gp-v{ver}.xlsx"))
indigenous_afrocol_up_gp## A set of ontologies produced by gprofiler using 165
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 6 MF
## 44 BP
## 1 KEGG
## 2 REAC
## 0 WP
## 1 TF
## 0 MIRNA
## 1 HPA
## 0 CORUM
## 0 HP hits.
afrocol_indigenous_up_gp <- simple_gprofiler(
t_ethnicity_indigenous_afrocol_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_ai_up_gp-v{ver}.xlsx"))
afrocol_indigenous_up_gp## A set of ontologies produced by gprofiler using 236
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 6 MF
## 15 BP
## 2 KEGG
## 5 REAC
## 0 WP
## 4 TF
## 1 MIRNA
## 6 HPA
## 0 CORUM
## 1 HP hits.
ethnicity_any_gp <- simple_gprofiler(
t_ethnicity_any,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/ethnicity_any_gp-v{ver}.xlsx"))## Error in eval(expr, envir, enclos): object 't_ethnicity_any' not found
ethnicity_any_gp## Error in eval(expr, envir, enclos): object 'ethnicity_any_gp' not found
enrichplot::dotplot(ethnicity_any_gp[["BP_enrich"]])## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'dotplot': object 'ethnicity_any_gp' not found
enrichplot::dotplot(ethnicity_any_gp[["REAC_enrich"]])## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'dotplot': object 'ethnicity_any_gp' not found
enrichplot::dotplot(ethnicity_any_gp[["TF_enrich"]])## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'dotplot': object 'ethnicity_any_gp' not found
mestizo_indigenous_up_cp <- simple_clusterprofiler(
t_ethnicity_mestizo_indigenous_up,
de_table = t_ethnicity_mestizo_indigenous)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
mestizo_indigenous_up_cp## A set of ontologies produced by clusterprofiler.
indigenous_mestizo_up_cp <- simple_clusterprofiler(
t_ethnicity_mestizo_indigenous_down)## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
indigenous_mestizo_up_cp## A set of ontologies produced by clusterprofiler.
mestizo_afrocol_up_cp <- simple_clusterprofiler(
t_ethnicity_mestizo_afrocol_up,
de_table = t_ethnicity_mestizo_afrocol)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
mestizo_afrocol_up_cp## A set of ontologies produced by clusterprofiler.
afrocol_mestizo_up_cp <- simple_clusterprofiler(t_ethnicity_mestizo_afrocol_down)## Unable to find the fold-change column in the de table, not doing gsea.
afrocol_mestizo_up_cp## A set of ontologies produced by clusterprofiler.
indigenous_afrocol_up_cp <- simple_clusterprofiler(
t_ethnicity_indigenous_afrocol_up,
de_table = t_ethnicity_indigenous_afrocol)## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
indigenous_afrocol_up_cp## A set of ontologies produced by clusterprofiler.
afrocol_indigenous_up_cp <- simple_clusterprofiler(t_ethnicity_indigenous_afrocol_down)## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
afrocol_indigenous_up_cp## A set of ontologies produced by clusterprofiler.
It looks like there are very few groups in the visit 1 significant genes.
The relevant xlsx files are:
t_clinical_v1_xlsx <- glue("{cf_prefix}/Visits/t_clinical_v1_cf_table_sva-v{ver}.xlsx")
t_cf_clinical_v1_table_sva <- table_reader(t_clinical_v1_xlsx)## [1] 14023 92
t_clinical_v1_sig_xlsx <- glue("{cf_prefix}/Visits/t_clinical_v1_cf_sig_sva-v{ver}.xlsx")
t_cf_clinical_v1_sig_sva_up <- sig_reader(t_clinical_v1_sig_xlsx, "outcome")
t_cf_clinical_v1_sig_sva_down <- sig_reader(t_clinical_v1_sig_xlsx, "outcome", "down")
t_cf_clinical_v1_sig_sva_both <- rbind.data.frame(
t_cf_clinical_v1_sig_sva_up,
t_cf_clinical_v1_sig_sva_down)t_cf_clinical_v1_sig_sva_up_gp <- simple_gprofiler(t_cf_clinical_v1_sig_sva_up)
t_cf_clinical_v1_sig_sva_up_gp## A set of ontologies produced by gprofiler using 27
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 1 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_clinical_v1_sig_sva_down_gp <- simple_gprofiler(t_cf_clinical_v1_sig_sva_down)
t_cf_clinical_v1_sig_sva_down_gp## A set of ontologies produced by gprofiler using 75
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 4 MF
## 15 BP
## 0 KEGG
## 1 REAC
## 1 WP
## 1 TF
## 0 MIRNA
## 0 HPA
## 1 CORUM
## 1 HP hits.
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_down_gp[["BP_enrich"]])t_cf_clinical_v1_sig_sva_both_gp <- simple_gprofiler(t_cf_clinical_v1_sig_sva_both)
t_cf_clinical_v1_sig_sva_both_gp## A set of ontologies produced by gprofiler using 102
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 2 MF
## 11 BP
## 1 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_both_gp[["BP_enrich"]])t_cf_clinical_v1_sig_sva_up_cp <- simple_cprofiler(
t_cf_clinical_v1_sig_sva_up,
de_table = t_cf_clinical_v1_table_sva,
orgdb = "org.Hs.eg.db")## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
t_cf_clinical_v1_sig_sva_up_cp## A set of ontologies produced by clusterprofiler.
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_up_cp[["enrich_objects"]][["BP_all"]])t_cf_clinical_v1_sig_sva_down_cp <- simple_cprofiler(
t_cf_clinical_v1_sig_sva_down,
orgdb = "org.Hs.eg.db")## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
t_cf_clinical_v1_sig_sva_both_cp <- simple_cprofiler(
t_cf_clinical_v1_sig_sva_both,
orgdb = "org.Hs.eg.db")## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_both_cp[["enrich_objects"]][["BP_all"]])It appears there are too few results to perform the gsea plots.
The relevant xlsx output may be found at:
t_biopsy_xlsx <- glue("{cf_prefix}/Biopsies/t_biopsy_cf_table_sva-v{ver}.xlsx")
t_cf_biopsy_table_sva <- table_reader(t_biopsy_xlsx, "outcome")## [1] 13513 92
t_biopsy_sig_xlsx <- glue("{cf_prefix}/Biopsies/t_cf_biopsy_sig_sva-v{ver}.xlsx")
t_cf_biopsy_sig_sva_up <- sig_reader(t_biopsy_sig_xlsx, "outcome")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_biopsy_sig_sva_down <- sig_reader(t_biopsy_sig_xlsx, "outcome", "down")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_biopsy_sig_sva_both <- rbind.data.frame(t_cf_biopsy_sig_sva_up,
t_cf_biopsy_sig_sva_down)We only have 17 genes in the biopsies, but perhaps they are still interesting?
t_cf_biopsy_sig_sva_gp_up <- simple_gprofiler(
t_cf_biopsy_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_biopsy_sig_sva_up_gp-v{ver}.xlsx"))
t_cf_biopsy_sig_sva_gp_up## A set of ontologies produced by gprofiler using 17
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 9 MF
## 76 BP
## 4 KEGG
## 1 REAC
## 5 WP
## 2 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_biopsy_sig_sva_gp_up[["BP_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_biopsy_sig_sva_gp_up[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_biopsy_up_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplott_cf_biopsy_sig_sva_gp_down <- simple_gprofiler(
t_cf_biopsy_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_biopsy_sig_sva_down_gp-v{ver}.xlsx"))
t_cf_biopsy_sig_sva_gp_down## A set of ontologies produced by gprofiler using 11
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
## Not nearly as interestingt_cf_biopsy_sig_sva_gp_both <- simple_gprofiler(
t_cf_biopsy_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_biopsy_sig_sva_both_gp-v{ver}.xlsx"))
t_cf_biopsy_sig_sva_gp_both## A set of ontologies produced by gprofiler using 28
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 9 MF
## 71 BP
## 4 KEGG
## 1 REAC
## 4 WP
## 2 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_biopsy_sig_sva_gp_both[["BP_enrich"]])Again, clusterprofiler version
t_cf_biopsy_sig_sva_cp_up <- simple_cprofiler(
t_cf_biopsy_sig_sva_up, de_table = t_cf_biopsy_table_sva,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_biopsy_sig_sva_up_cp-v{ver}.xlsx"))## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_biopsy_sig_sva_cp_up[["enrich_objects"]][["BP_all"]])t_cf_biopsy_sig_topn_gsea <- plot_topn_gsea(t_cf_biopsy_sig_sva_cp_up)
t_cf_biopsy_sig_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_eosinophil_xlsx <- glue("{cf_prefix}/Eosinophils/t_eosinophil_cf_table_sva-v{ver}.xlsx")
t_cf_eosinophil_table_sva <- table_reader(t_eosinophil_xlsx, "outcome")## [1] 10532 92
t_eosinophil_sig_xlsx <- glue("{cf_prefix}/Eosinophils/t_eosinophil_cf_sig_sva-v{ver}.xlsx")
t_cf_eosinophil_sig_sva_up <- sig_reader(t_eosinophil_sig_xlsx, "outcome")
t_cf_eosinophil_sig_sva_down <- sig_reader(t_eosinophil_sig_xlsx, "outcome", "down")
t_cf_eosinophil_sig_sva_both <- rbind.data.frame(t_cf_eosinophil_sig_sva_up,
t_cf_eosinophil_sig_sva_down)t_cf_eosinophil_sig_sva_up_gp <- simple_gprofiler(
t_cf_eosinophil_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_up_gp-v{ver}.xlsx"))
t_cf_eosinophil_sig_sva_up_gp## A set of ontologies produced by gprofiler using 116
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 21 MF
## 119 BP
## 2 KEGG
## 7 REAC
## 5 WP
## 68 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_eosinophil_sig_sva_up_gp[["BP_enrich"]])enrichplot::dotplot(t_cf_eosinophil_sig_sva_up_gp[["TF_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_eosinophil_sig_sva_up_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = "figures/t_cf_eosinophil_up_gp_go.svg",
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplotreac_termsim <- enrichplot::pairwise_termsim(t_cf_eosinophil_sig_sva_up_gp[["REAC_enrich"]])
reac_treeplot <- sm(treeplot(reac_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_eosinophil_up_gp_reac-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
reac_treeplot
dev.off()## png
## 2
t_cf_eosinophil_sig_sva_down_gp <- simple_gprofiler(
t_cf_eosinophil_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_down_gp-v{ver}.xlsx"))
enrichplot::dotplot(t_cf_eosinophil_sig_sva_down_gp[["BP_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_eosinophil_sig_sva_down_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_eosinophil_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplot## There is only one reactome hit, so not plotting it.I evaluated this and the ‘up’ set next to each other and they are extremely similar:
up: 148 GO hits, 68 TF, 0 HPA both: 169 GO, 69 TF, and 2 HPA
Otherwise I think they are the same.
t_cf_eosinophil_sig_sva_both_gp <- simple_gprofiler(
t_cf_eosinophil_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_both_gp-v{ver}.xlsx"))
t_cf_eosinophil_sig_sva_both_gp## A set of ontologies produced by gprofiler using 191
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 25 MF
## 138 BP
## 2 KEGG
## 8 REAC
## 6 WP
## 72 TF
## 1 MIRNA
## 2 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_eosinophil_sig_sva_both_gp[["BP_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_eosinophil_sig_sva_both_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_formap = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_eosinophil_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplotreac_termsim <- enrichplot::pairwise_termsim(t_cf_eosinophil_sig_sva_both_gp[["REAC_enrich"]])
reac_treeplot <- sm(treeplot(reac_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_eosinophil_both_gp_reac-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
reac_treeplot
dev.off()## png
## 2
t_cf_eosinophil_sig_sva_cp_up <- simple_cprofiler(
t_cf_eosinophil_sig_sva_up, de_table = t_cf_eosinophil_table_sva,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_sig_sva_up_cp-v{ver}.xlsx"))## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_eosinophil_sig_sva_cp_up[["enrich_objects"]][["BP_all"]])t_cf_eosinophil_sig_sva_cp_down <- simple_cprofiler(
t_cf_eosinophil_sig_sva_down,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_sig_sva_down_cp-v{ver}.xlsx"))## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_eosinophil_sig_sva_cp_down[["enrich_objects"]][["BP_all"]])t_cf_eosinophil_sig_topn_gsea <- plot_topn_gsea(t_cf_eosinophil_sig_sva_cp_up)
t_cf_eosinophil_sig_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_monocyte_xlsx <- glue("{cf_prefix}/Monocytes/t_monocyte_cf_table_sva-v{ver}.xlsx")
t_cf_monocyte_table_sva <- table_reader(t_monocyte_xlsx, "outcome")## [1] 10862 92
t_monocyte_sig_xlsx <- glue("{cf_prefix}/Monocytes/t_monocyte_cf_sig_sva-v{ver}.xlsx")
t_cf_monocyte_sig_sva_up <- sig_reader(t_monocyte_sig_xlsx, "outcome")
t_cf_monocyte_sig_sva_down <- sig_reader(t_monocyte_sig_xlsx, "outcome", "down")
t_cf_monocyte_sig_sva_both <- rbind.data.frame(t_cf_monocyte_sig_sva_up,
t_cf_monocyte_sig_sva_down)Now that I am looking back over these results, I am not compeltely certain why I only did the gprofiler search for the sva data…
t_cf_monocyte_sig_sva_up_gp <- simple_gprofiler(
t_cf_monocyte_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_monocyte_up_gp-v{ver}.xlsx"))
t_cf_monocyte_sig_sva_up_gp## A set of ontologies produced by gprofiler using 60
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 23 BP
## 1 KEGG
## 0 REAC
## 1 WP
## 5 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_monocyte_sig_sva_up_gp[["BP_enrich"]])enrichplot::dotplot(t_cf_monocyte_sig_sva_up_gp[["TF_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_monocyte_sig_sva_up_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = "figures/overrepresentation/t_cf_monocyte_up_gp_go.svg",
height = treeplot_height, width = treeplot_width)## Warning in pp(file = "figures/overrepresentation/t_cf_monocyte_up_gp_go.svg", :
## The directory: figures/overrepresentation does not exist, will attempt to
## create it.
go_treeplot
dev.off()## png
## 2
go_treeplott_cf_monocyte_sig_sva_down_gp <- simple_gprofiler(
t_cf_monocyte_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_monocyte_down_gp-v{ver}.xlsx"))
enrichplot::dotplot(t_cf_monocyte_sig_sva_down_gp[["BP_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_monocyte_sig_sva_down_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_monocyte_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplot## Insufficient results to make a tree plot.t_cf_monocyte_sig_sva_both_gp <- simple_gprofiler(
t_cf_monocyte_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_monocyte_both_gp-v{ver}.xlsx"))
t_cf_monocyte_sig_sva_both_gp## A set of ontologies produced by gprofiler using 112
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 2 MF
## 80 BP
## 3 KEGG
## 1 REAC
## 1 WP
## 8 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_monocyte_sig_sva_both_gp[["BP_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_monocyte_sig_sva_both_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_monocyte_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplot## Insufficient results for reactome
tf_termsim <- enrichplot::pairwise_termsim(t_cf_monocyte_sig_sva_both_gp[["TF_enrich"]])
tf_treeplot <- sm(treeplot(tf_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_monocyte_both_gp_tf-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
tf_treeplot
dev.off()## png
## 2
tf_treeplott_cf_monocyte_sig_sva_cp_up <- simple_cprofiler(
t_cf_monocyte_sig_sva_up, de_table = t_cf_monocyte_table_sva,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_monocyte_sig_sva_up_cp-v{ver}.xlsx"))## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_monocyte_sig_sva_cp_up[["enrich_objects"]][["BP_all"]])t_cf_monocyte_sig_sva_cp_down <- simple_cprofiler(
t_cf_monocyte_sig_sva_down,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_monocyte_sig_sva_down_cp-v{ver}.xlsx"))## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_monocyte_sig_sva_cp_down[["enrich_objects"]][["BP_all"]])t_cf_monocyte_sig_topn_gsea <- plot_topn_gsea(t_cf_monocyte_sig_sva_cp_up)
t_cf_monocyte_sig_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_neutrophil_xlsx <- glue("{cf_prefix}/Neutrophils/t_neutrophil_cf_table_sva-v{ver}.xlsx")
t_cf_neutrophil_table_sva <- table_reader(t_neutrophil_xlsx, "outcome")## [1] 9101 92
t_neutrophil_sig_xlsx <- glue("{cf_prefix}/Neutrophils/t_neutrophil_cf_sig_sva-v{ver}.xlsx")
t_cf_neutrophil_sig_sva_up <- sig_reader(t_neutrophil_sig_xlsx, "outcome")
t_cf_neutrophil_sig_sva_down <- sig_reader(t_neutrophil_sig_xlsx, "outcome", "down")
t_cf_neutrophil_sig_sva_both <- rbind.data.frame(t_cf_neutrophil_sig_sva_up,
t_cf_neutrophil_sig_sva_down)Now that I am looking back over these results, I am not compeltely certain why I only did the gprofiler search for the sva data…
t_cf_neutrophil_sig_sva_up_gp <- simple_gprofiler(
t_cf_neutrophil_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_up_gp-v{ver}.xlsx"))
t_cf_neutrophil_sig_sva_up_gp## A set of ontologies produced by gprofiler using 130
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 4 MF
## 62 BP
## 0 KEGG
## 5 REAC
## 2 WP
## 53 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_neutrophil_sig_sva_up_gp[["BP_enrich"]])enrichplot::dotplot(t_cf_neutrophil_sig_sva_up_gp[["TF_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_neutrophil_sig_sva_up_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = "figures/t_cf_neutrophil_up_gp_go.svg",
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplott_cf_neutrophil_sig_sva_down_gp <- simple_gprofiler(
t_cf_neutrophil_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_down_gp-v{ver}.xlsx"))
t_cf_neutrophil_sig_sva_down_gp## A set of ontologies produced by gprofiler using 30
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 1 KEGG
## 0 REAC
## 0 WP
## 2 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 1 HP hits.
## Not much to work with here.t_cf_neutrophil_sig_sva_both_gp <- simple_gprofiler(
t_cf_neutrophil_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_both_gp-v{ver}.xlsx"))
t_cf_neutrophil_sig_sva_both_gp## A set of ontologies produced by gprofiler using 160
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 4 MF
## 64 BP
## 0 KEGG
## 5 REAC
## 2 WP
## 55 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
enrichplot::dotplot(t_cf_neutrophil_sig_sva_both_gp[["BP_enrich"]])enrichplot::dotplot(t_cf_neutrophil_sig_sva_both_gp[["TF_enrich"]])go_termsim <- enrichplot::pairwise_termsim(t_cf_neutrophil_sig_sva_both_gp[["BP_enrich"]])
go_treeplot <- sm(treeplot(go_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_cf_neutrophil_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
go_treeplot
dev.off()## png
## 2
go_treeplot## Insufficient results for reactomet_cf_neutrophil_sig_sva_cp_up <- simple_cprofiler(
t_cf_neutrophil_sig_sva_up, de_table = t_cf_neutrophil_table_sva,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_sig_sva_up_cp-v{ver}.xlsx"))## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the BP data.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_neutrophil_sig_sva_cp_up[["enrich_objects"]][["BP_all"]])t_cf_neutrophil_sig_sva_cp_down <- simple_cprofiler(
t_cf_neutrophil_sig_sva_down,
orgdb = "org.Hs.eg.db",
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_sig_sva_down_cp-v{ver}.xlsx"))## Unable to find the fold-change column in the de table, not doing gsea.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Writing a sheet containing the legend.
##
## Writing the MF data.
##
## Writing the CC data.
##
## Writing the KEGG data.
##
## Finished writing excel file.
enrichplot::dotplot(t_cf_neutrophil_sig_sva_cp_down[["enrich_objects"]][["BP_all"]])t_cf_neutrophil_sig_topn_gsea <- plot_topn_gsea(t_cf_neutrophil_sig_sva_cp_up)
t_cf_neutrophil_sig_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_monocyte_v1_xlsx <- glue("{cf_prefix}/Monocytes/t_monocyte_v1_cf_table_sva-v{ver}.xlsx")
t_cf_monocyte_v1_table_sva <- table_reader(t_monocyte_v1_xlsx, "outcome")## [1] 10482 92
t_monocyte_v1_sig_xlsx <- glue("{cf_prefix}/Monocytes/t_monocyte_v1_cf_sig_sva-v{ver}.xlsx")
t_cf_monocyte_v1_sig_sva_up <- sig_reader(t_monocyte_v1_sig_xlsx, "outcome")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_monocyte_v1_sig_sva_down <- sig_reader(t_monocyte_v1_sig_xlsx, "outcome", "down")
t_cf_monocyte_v1_sig_sva_both <- rbind.data.frame(t_cf_monocyte_v1_sig_sva_up,
t_cf_monocyte_v1_sig_sva_down)V1: Up: 14 genes; No categories V1: Down: 52 genes; 20 GO, 5 TF
t_cf_monocyte_v1_sig_sva_up_gp <- simple_gprofiler(
t_cf_monocyte_v1_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_up_gp-v{ver}.xlsx"))
t_cf_monocyte_v1_sig_sva_up_gp## A set of ontologies produced by gprofiler using 14
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_monocyte_v1_sig_sva_down_gp <- simple_gprofiler(
t_cf_monocyte_v1_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_down_gp-v{ver}.xlsx"))
t_cf_monocyte_v1_sig_sva_down_gp## A set of ontologies produced by gprofiler using 52
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 2 MF
## 9 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_monocyte_v1_sig_sva_both_gp <- simple_gprofiler(
t_cf_monocyte_v1_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_down_gp-v{ver}.xlsx"))
t_cf_monocyte_v1_sig_sva_both_gp## A set of ontologies produced by gprofiler using 66
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 5 MF
## 4 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 1 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
I like cats!
t_cf_monocyte_v1_sig_sva_up_cp <- simple_cprofiler(t_cf_monocyte_v1_sig_sva_up,
t_cf_monocyte_v1_table_sva,
orgdb = "org.Hs.eg.db")## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
enrichplot::dotplot(t_cf_monocyte_v1_sig_sva_up_cp[["enrich_objects"]][["BP_all"]])t_cf_monocyte_v1_topn_gsea <- plot_topn_gsea(t_cf_monocyte_v1_sig_sva_up_cp)
t_cf_monocyte_v1_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_neutrophil_v1_xlsx <- glue("{cf_prefix}/Neutrophils/t_neutrophil_v1_cf_table_sva-v{ver}.xlsx")
t_cf_neutrophil_v1_table_sva <- table_reader(t_neutrophil_v1_xlsx, "outcome")## [1] 8717 92
t_neutrophil_v1_sig_xlsx <- glue("{cf_prefix}/Neutrophils/t_neutrophil_v1_cf_sig_sva-v{ver}.xlsx")
t_cf_neutrophil_v1_sig_sva_up <- sig_reader(t_neutrophil_v1_sig_xlsx, "outcome")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_neutrophil_v1_sig_sva_down <- sig_reader(t_neutrophil_v1_sig_xlsx, "outcome", "down")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_neutrophil_v1_sig_sva_both <- rbind.data.frame(t_cf_neutrophil_v1_sig_sva_up,
t_cf_neutrophil_v1_sig_sva_down)t_cf_neutrophil_v1_sig_sva_up_gp <- simple_gprofiler(
t_cf_neutrophil_v1_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_up_gp-v{ver}.xlsx"))## There are only, 5 returning null.
t_cf_neutrophil_v1_sig_sva_up_gp## NULL
t_cf_neutrophil_v1_sig_sva_down_gp <- simple_gprofiler(
t_cf_neutrophil_v1_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_down_gp-v{ver}.xlsx"))## There are only, 8 returning null.
t_cf_neutrophil_v1_sig_sva_down_gp## NULL
t_cf_neutrophil_v1_sig_sva_both_gp <- simple_gprofiler(
t_cf_neutrophil_v1_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_neutrophil_down_gp-v{ver}.xlsx"))
t_cf_neutrophil_v1_sig_sva_both_gp## A set of ontologies produced by gprofiler using 13
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_neutrophil_v1_sig_sva_up_cp <- simple_cprofiler(t_cf_neutrophil_v1_sig_sva_up,
t_cf_neutrophil_v1_table_sva,
orgdb = "org.Hs.eg.db")## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
enrichplot::dotplot(t_cf_neutrophil_v1_sig_sva_up_cp[["enrich_objects"]][["BP_all"]])t_cf_neutrophil_v1_topn_gsea <- plot_topn_gsea(t_cf_neutrophil_v1_sig_sva_up_cp)
t_cf_neutrophil_v1_topn_gsea[["GO_outcome_up"]][[1]]## NULL
t_eosinophil_v1_xlsx <- glue("{cf_prefix}/Eosinophils/t_eosinophil_v1_cf_table_sva-v{ver}.xlsx")
t_cf_eosinophil_v1_table_sva <- table_reader(t_eosinophil_v1_xlsx, "outcome")## [1] 9979 92
t_eosinophil_v1_sig_xlsx <- glue("{cf_prefix}/Eosinophils/t_eosinophil_v1_cf_sig_sva-v{ver}.xlsx")
t_cf_eosinophil_v1_sig_sva_up <- sig_reader(t_eosinophil_v1_sig_xlsx, "outcome")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_eosinophil_v1_sig_sva_down <- sig_reader(t_eosinophil_v1_sig_xlsx, "outcome", "down")## There are less than 20 rows in this significance table, it is unlikely to be interesting.
t_cf_eosinophil_v1_sig_sva_both <- rbind.data.frame(t_cf_eosinophil_v1_sig_sva_up,
t_cf_eosinophil_v1_sig_sva_down)t_cf_eosinophil_v1_sig_sva_up_gp <- simple_gprofiler(
t_cf_eosinophil_v1_sig_sva_up,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_up_gp-v{ver}.xlsx"))
t_cf_eosinophil_v1_sig_sva_up_gp## A set of ontologies produced by gprofiler using 13
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 0 KEGG
## 0 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_eosinophil_v1_sig_sva_down_gp <- simple_gprofiler(
t_cf_eosinophil_v1_sig_sva_down,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_down_gp-v{ver}.xlsx"))
t_cf_eosinophil_v1_sig_sva_down_gp## A set of ontologies produced by gprofiler using 19
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 0 BP
## 1 KEGG
## 1 REAC
## 0 WP
## 1 TF
## 0 MIRNA
## 0 HPA
## 0 CORUM
## 0 HP hits.
t_cf_eosinophil_v1_sig_sva_both_gp <- simple_gprofiler(
t_cf_eosinophil_v1_sig_sva_both,
excel = glue("{xlsx_prefix}/Gene_Set_Enrichment/t_cf_eosinophil_down_gp-v{ver}.xlsx"))
t_cf_eosinophil_v1_sig_sva_both_gp## A set of ontologies produced by gprofiler using 32
## genes against the hsapiens annotations and significance cutoff 0.05.
## There are:
## 0 MF
## 9 BP
## 0 KEGG
## 1 REAC
## 0 WP
## 0 TF
## 0 MIRNA
## 0 HPA
## 1 CORUM
## 0 HP hits.
t_cf_eosinophil_v1_sig_sva_up_cp <- simple_cprofiler(t_cf_eosinophil_v1_sig_sva_up,
t_cf_eosinophil_v1_table_sva,
orgdb = "org.Hs.eg.db")## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
enrichplot::dotplot(t_cf_eosinophil_v1_sig_sva_up_cp[["enrich_objects"]][["BP_all"]])t_cf_eosinophil_v1_topn_gsea <- plot_topn_gsea(t_cf_eosinophil_v1_sig_sva_up_cp)
t_cf_eosinophil_v1_topn_gsea[["GO_outcome_up"]][[1]]## NULL
Speaking with Maria Adelaida, it sounds like the most appropriate set to try for now is |logFC| >= 0.58 and p <= 0.05. I wrote out the dream results in a simpler and less fun way than everything else.
mixed_all_tbl <- table_reader(
glue("excel/mixed_all_celltypes_nobiop_table-v{ver}.xlsx"),
sheet = "failure_vs_cure")## [1] 11910 22
all_sig_up_idx <- mixed_all_tbl[["logfc"]] >= 0.58 &
mixed_all_tbl[["p_value"]] <= 0.05
all_sig_down_idx <- mixed_all_tbl[["logfc"]] <= -0.58 &
mixed_all_tbl[["p_value"]] <= 0.05
all_up_genes <- rownames(mixed_all_tbl)[all_sig_up_idx]
all_down_genes <- rownames(mixed_all_tbl)[all_sig_down_idx]
all_both_genes <- c(all_up_genes, all_down_genes)t_dream_all_up_gp <- simple_gprofiler(
all_up_genes,
excel = glue("excel/dream_all_up_gp-v{ver}.xlsx"))
all_up_termsim <- enrichplot::pairwise_termsim(t_dream_all_up_gp[["BP_enrich"]])
all_up_treeplot <- sm(treeplot(all_up_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_all_up_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
all_up_treeplot
dev.off()## png
## 2
all_up_treeplott_dream_all_down_gp <- simple_gprofiler(
all_down_genes,
excel = glue("excel/dream_all_down_gp-v{ver}.xlsx"))
all_down_termsim <- enrichplot::pairwise_termsim(t_dream_all_down_gp[["BP_enrich"]])
all_down_treeplot <- sm(treeplot(all_down_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_all_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
all_down_treeplot
dev.off()## png
## 2
all_down_treeplott_dream_all_both_gp <- simple_gprofiler(
all_both_genes,
excel = glue("excel/dream_all_both_gp-v{ver}.xlsx"))
all_both_termsim <- enrichplot::pairwise_termsim(t_dream_all_both_gp[["BP_enrich"]])
all_both_treeplot <- sm(treeplot(all_both_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_all_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
all_both_treeplot
dev.off()## png
## 2
all_both_treeplotmixed_monocyte_tbl <- table_reader(
glue("excel/mixed_monocyte_table-v{ver}.xlsx"),
sheet = "failure_vs_cure")## [1] 10862 22
monocyte_sig_up_idx <- mixed_monocyte_tbl[["logfc"]] >= 0.58 &
mixed_monocyte_tbl[["p_value"]] <= 0.05
monocyte_sig_down_idx <- mixed_monocyte_tbl[["logfc"]] <= -0.58 &
mixed_monocyte_tbl[["p_value"]] <= 0.05
monocyte_up_genes <- rownames(mixed_monocyte_tbl)[monocyte_sig_up_idx]
monocyte_down_genes <- rownames(mixed_monocyte_tbl)[monocyte_sig_down_idx]
monocyte_both_genes <- c(monocyte_up_genes, monocyte_down_genes)t_dream_monocyte_up_gp <- simple_gprofiler(
monocyte_up_genes,
excel = glue("excel/dream_monocyte_up_gp-v{ver}.xlsx"))
monocyte_up_termsim <- enrichplot::pairwise_termsim(t_dream_monocyte_up_gp[["BP_enrich"]])
monocyte_up_treeplot <- sm(treeplot(monocyte_up_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_monocyte_up_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
monocyte_up_treeplot
dev.off()## png
## 2
monocyte_up_treeplott_dream_monocyte_down_gp <- simple_gprofiler(
monocyte_down_genes,
excel = glue("excel/dream_monocyte_down_gp-v{ver}.xlsx"))
monocyte_down_termsim <- enrichplot::pairwise_termsim(t_dream_monocyte_down_gp[["BP_enrich"]])
monocyte_down_treeplot <- sm(treeplot(monocyte_down_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_monocyte_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
monocyte_down_treeplot
dev.off()## png
## 2
monocyte_down_treeplott_dream_monocyte_both_gp <- simple_gprofiler(
monocyte_both_genes,
excel = glue("excel/dream_monocyte_both_gp-v{ver}.xlsx"))
monocyte_both_termsim <- enrichplot::pairwise_termsim(t_dream_monocyte_both_gp[["BP_enrich"]])
monocyte_both_treeplot <- sm(treeplot(monocyte_both_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_monocyte_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
monocyte_both_treeplot
dev.off()## png
## 2
monocyte_both_treeplotneutrophil_tbl <- table_reader(
glue("excel/mixed_neutrophil_table-v{ver}.xlsx"),
sheet = "failure_vs_cure")## [1] 19952 22
neutrophil_sig_up_idx <- neutrophil_tbl[["logfc"]] >= 0.58 &
neutrophil_tbl[["p_value"]] <= 0.05
neutrophil_sig_down_idx <- neutrophil_tbl[["logfc"]] <= -0.58 &
neutrophil_tbl[["p_value"]] <= 0.05
neutrophil_up_genes <- rownames(neutrophil_tbl)[neutrophil_sig_up_idx]
neutrophil_down_genes <- rownames(neutrophil_tbl)[neutrophil_sig_down_idx]
neutrophil_both_genes <- c(neutrophil_up_genes, neutrophil_down_genes)
### Increased in failt_dream_neutrophil_up_gp <- simple_gprofiler(
neutrophil_up_genes,
excel = glue("excel/dream_neutrophil_up_gp-v{ver}.xlsx"))
neutrophil_up_termsim <- enrichplot::pairwise_termsim(t_dream_neutrophil_up_gp[["BP_enrich"]])
neutrophil_up_treeplot <- sm(treeplot(neutrophil_up_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_neutrophil_up_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
neutrophil_up_treeplot
dev.off()## png
## 2
neutrophil_up_treeplott_dream_neutrophil_down_gp <- simple_gprofiler(
neutrophil_down_genes,
excel = glue("excel/dream_neutrophil_down_gp-v{ver}.xlsx"))
neutrophil_down_termsim <- enrichplot::pairwise_termsim(t_dream_neutrophil_down_gp[["BP_enrich"]])
neutrophil_down_treeplot <- sm(treeplot(neutrophil_down_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_neutrophil_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
neutrophil_down_treeplot
dev.off()## png
## 2
neutrophil_down_treeplott_dream_neutrophil_both_gp <- simple_gprofiler(
neutrophil_both_genes,
excel = glue("excel/dream_neutrophil_both_gp-v{ver}.xlsx"))
neutrophil_both_termsim <- enrichplot::pairwise_termsim(t_dream_neutrophil_both_gp[["BP_enrich"]])
neutrophil_both_treeplot <- sm(treeplot(neutrophil_both_termsim, label_format = wrap_width))## Error in stats::cutree(hc, nCluster): elements of 'k' must be between 1 and 4
pp(file = glue("images/overrepresentation/t_dream_neutrophil_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
neutrophil_both_treeplot## Error in eval(expr, envir, enclos): object 'neutrophil_both_treeplot' not found
dev.off()## png
## 2
neutrophil_both_treeplot## Error in eval(expr, envir, enclos): object 'neutrophil_both_treeplot' not found
eosinophil_tbl <- table_reader(
glue("excel/mixed_eosinophil_table-v{ver}.xlsx"),
sheet = "failure_vs_cure")## [1] 19952 22
eosinophil_sig_up_idx <- eosinophil_tbl[["logfc"]] >= 0.58 &
eosinophil_tbl[["p_value"]] <= 0.05
eosinophil_sig_down_idx <- eosinophil_tbl[["logfc"]] <= -0.58 &
eosinophil_tbl[["p_value"]] <= 0.05
eosinophil_up_genes <- rownames(eosinophil_tbl)[eosinophil_sig_up_idx]
eosinophil_down_genes <- rownames(eosinophil_tbl)[eosinophil_sig_down_idx]
eosinophil_both_genes <- c(eosinophil_up_genes, eosinophil_down_genes)t_dream_eosinophil_up_gp <- simple_gprofiler(
eosinophil_up_genes,
excel = glue("excel/dream_eosinophil_up_gp-v{ver}.xlsx"))
eosinophil_up_termsim <- enrichplot::pairwise_termsim(t_dream_eosinophil_up_gp[["BP_enrich"]])
eosinophil_up_treeplot <- sm(treeplot(eosinophil_up_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_eosinophil_up_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
eosinophil_up_treeplot
dev.off()## png
## 2
eosinophil_up_treeplott_dream_eosinophil_down_gp <- simple_gprofiler(
eosinophil_down_genes,
excel = glue("excel/dream_eosinophil_down_gp-v{ver}.xlsx"))
eosinophil_down_termsim <- enrichplot::pairwise_termsim(t_dream_eosinophil_down_gp[["BP_enrich"]])
eosinophil_down_treeplot <- sm(treeplot(eosinophil_down_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_eosinophil_down_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
eosinophil_down_treeplot
dev.off()## png
## 2
eosinophil_down_treeplott_dream_eosinophil_both_gp <- simple_gprofiler(
eosinophil_both_genes,
excel = glue("excel/dream_eosinophil_both_gp-v{ver}.xlsx"))
eosinophil_both_termsim <- enrichplot::pairwise_termsim(t_dream_eosinophil_both_gp[["BP_enrich"]])
eosinophil_both_treeplot <- sm(treeplot(eosinophil_both_termsim, label_format = wrap_width))
pp(file = glue("images/overrepresentation/t_dream_eosinophil_both_gp_go-v{ver}.pdf"),
height = treeplot_height, width = treeplot_width)
eosinophil_both_treeplot
dev.off()## png
## 2
eosinophil_both_treeplot