Changelog
- 202401-202405: Cleanups, formatting, ensuring that everything works
in the container.
- 202310: Cleaning up to make everything pass within a containerized
environment.
- 202310: Received a set of colors and contrasts of interest for a
barplot of significance.
- 20230410: Making some changes to improve the differential expression
plots as well as prepare for some different pathway/GSEA/GSVA analyses
on the data.
Introduction
Having established that the TMRC2 macrophage data looks robust and
illustrative of a couple of interesting questions, let us perform a
couple of differential analyses of it.
Also note that as of 202212, we received a new set of samples, some of
which are a completely different cell type: U937. As their ATCC page
states, these are malignant cells taken from the pleural effusion of a
37 year old white male with histiocytic lymphoma, and they exhibit the
morphology of monocytes. Thus, this document now includes some
comparisons of the cell types as well as the various macrophage donors
(given that there are now more donors too).
Human data
I am moving the dataset manipulations here so that I can look at them
all together before running the various DE analyses.
Create sets focused on drug, celltype, strain, and combinations
Let us start by playing with the metadata a little and create sets
with the condition set to:
- Drug treatment
- Cell type (macrophage or U937)
- Donor
- Infection Strain
- Some useful combinations thereof
In addition, keep track of which datasets are composed of all samples
vs. those which are only macrophage vs. those which are only U937.
(Thus, the usage of all_human vs. hs_macr vs. u937 as prefixes for the
data structures.)
Ideally, these recreations of the data should perhaps be in the
datastructures worksheet.
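As a minimal sketch of the bookkeeping described above, the following base R example builds combined condition labels from a toy metadata table. The data frame toy_meta, its values, and the helper make_condition() are hypothetical and exist only for illustration; the real metadata lives in colData() of the hs_macrophage object and is handled by set_conditions().

```r
## Toy metadata standing in for colData(); all values are invented.
toy_meta <- data.frame(
  typeofcells = c("Macrophages", "Macrophages", "U937", "U937"),
  drug = c("antimony", "none", "antimony", "none"),
  macrophagezymodeme = c("z22", "z23", "z22", "none"),
  stringsAsFactors = FALSE)

## Hypothetical helper: concatenate two metadata columns into one
## condition label per sample.
make_condition <- function(meta, first, second) {
  paste0(meta[[first]], "_", meta[[second]])
}

toy_type_drug <- make_condition(toy_meta, "typeofcells", "drug")
toy_type_zymo <- make_condition(toy_meta, "typeofcells", "macrophagezymodeme")

## Tabulate the number of samples per combined condition.
toy_counts <- table(toy_type_drug)
print(toy_counts)
```

The same pattern, paste0() of two metadata columns followed by a table() of the result, is what makes it easy to check that every combined condition has enough samples before running any contrasts.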
all_human <- sanitize_metadata(hs_macrophage, columns = "drug") %>%
set_conditions(fact = "drug", colors = color_choices[["drug"]]) %>%
set_batches(fact = "typeofcells")
## Recasting the data.frame to DataFrame.
## The numbers of samples by condition are:
##
## antimony none
## 34 34
## The number of samples by batch are:
##
## Macrophages U937
## 54 14
## The following 3 lines were copy/pasted to datastructures and should be removed soon.
no_strain_idx <- colData(all_human)[["strainid"]] == "none"
##colData(all_human)[["strainid"]] <- paste0("s", colData(all_human)[["strainid"]],
## "_", colData(all_human)[["macrophagezymodeme"]])
colData(all_human)[no_strain_idx, "strainid"] <- "none"
table(colData(all_human)[["strainid"]])
##
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367 2169 7158 none
## 2 8 2 2 2 8 7 8 2 7 8 2 10
all_human_types <- set_conditions(all_human, fact = "typeofcells") %>%
set_batches(fact = "drug")
## The numbers of samples by condition are:
##
## Macrophages U937
## 54 14
## The number of samples by batch are:
##
## antimony none
## 34 34
type_zymo_fact <- paste0(colData(all_human_types)[["condition"]], "_",
colData(all_human_types)[["macrophagezymodeme"]])
type_zymo <- set_conditions(all_human_types, fact = type_zymo_fact)
## The numbers of samples by condition are:
##
## Macrophages_none Macrophages_z22 Macrophages_z23 U937_none
## 8 23 23 2
## U937_z22 U937_z23
## 6 6
type_drug_fact <- paste0(colData(all_human_types)[["condition"]], "_",
colData(all_human_types)[["drug"]])
type_drug <- set_conditions(all_human_types, fact = type_drug_fact)
## The numbers of samples by condition are:
##
## Macrophages_antimony Macrophages_none U937_antimony
## 27 27 7
## U937_none
## 7
strain_fact <- colData(all_human_types)[["strainid"]]
table(strain_fact)
## strain_fact
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367 2169 7158 none
## 2 8 2 2 2 8 7 8 2 7 8 2 10
new_conditions <- paste0(colData(hs_macrophage)[["macrophagetreatment"]], "_",
colData(hs_macrophage)[["macrophagezymodeme"]])
## Note the sanitize() call is redundant with the addition of sanitize() in the
## datastructures file, but I don't want to wait to rerun that.
hs_macr <- set_conditions(hs_macrophage, fact = new_conditions) %>%
sanitize_metadata(column = "drug") %>%
subset_se(subset = "typeofcells!='U937'") %>%
set_se_colors(color_choices[["treatment_zymo"]])
## The numbers of samples by condition are:
##
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 15 14 14 15 5
## uninf_sb_none
## 5
## Recasting the data.frame to DataFrame.
Separate Macrophage samples
Once again, we should reconsider where the following block is placed,
but these datastructures are likely to be used in many of the following
analyses.
hs_macr_drug_se <- set_conditions(hs_macr, fact = "drug", colors = color_choices[["drug"]])
## The numbers of samples by condition are:
##
## antimony none
## 27 27
hs_macr_strain_se <- set_conditions(hs_macr, fact = "macrophagezymodeme",
colors = color_choices[["zymo"]]) %>%
subset_se(subset = "macrophagezymodeme != 'none'")
## The numbers of samples by condition are:
##
## none z22 z23
## 8 23 23
table(colData(hs_macr)[["strainid"]])
##
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367 2169 7158 none
## 2 6 2 2 2 6 5 6 2 5 6 2 8
Refactor U937 samples
The U937 samples were separated in the datastructures file, but we
want to use the combination of drug/zymodeme with them pretty much
exclusively.
new_conditions <- paste0(colData(hs_u937)[["macrophagetreatment"]], "_",
colData(hs_u937)[["macrophagezymodeme"]])
u937_se <- set_conditions(hs_u937, fact = new_conditions,
colors = color_choices[["treatment_zymo"]])
## The numbers of samples by condition are:
##
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 3 3 3 3 1
## uninf_sb_none
## 1
Contrasts used in this document
Given the various ways we have chopped up this dataset, there are a
few general types of contrasts we will perform, which will then be
combined into greater complexity:
- drug treatment: Antimonial treated or not.
- strains used: Uninfected, z2.3, and z2.2.
- cell types: U937 or macrophage.
- donors: The person from whom the macrophages were taken.
In the end, our actual goal is to consider the variable effects of
drug+strain and see if we can discern patterns which lead to better or
worse drug treatment outcome.
There is a set of contrasts in which we are primarily interested in
this data; these follow. I also created one ratio of ratios contrast
which I think has the potential to ask our biggest question.
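To make the ratio of ratios idea concrete, here is a small base R sketch using invented per-condition mean log2 expression values for a single gene. On the log scale a ratio becomes a difference, so the ratio of ratios is a difference of differences; this mirrors the form of the z23drugnodrug_vs_z22drugnodrug contrast in tmrc2_human_extra, but the numbers and variable names here are hypothetical.

```r
## Invented mean log2 expression for one gene in four conditions.
toy_log2 <- c(inf_sb_z23 = 7.0, inf_z23 = 5.0, inf_sb_z22 = 6.5, inf_z22 = 6.0)

## Drug effect within each zymodeme (log2 fold change: treated - untreated).
drug_effect_z23 <- toy_log2[["inf_sb_z23"]] - toy_log2[["inf_z23"]]
drug_effect_z22 <- toy_log2[["inf_sb_z22"]] - toy_log2[["inf_z22"]]

## Ratio of ratios on the log scale: does the drug effect differ by zymodeme?
ratio_of_ratios <- drug_effect_z23 - drug_effect_z22

## The same quantity written as a contrast vector over the four conditions.
contrast_weights <- c(inf_sb_z23 = 1, inf_z23 = -1, inf_sb_z22 = -1, inf_z22 = 1)
via_weights <- sum(contrast_weights[names(toy_log2)] * toy_log2)
print(c(ratio_of_ratios = ratio_of_ratios, via_weights = via_weights))
```

A value near zero would suggest the drug acts similarly on cells infected with either zymodeme; a large positive or negative value would suggest a zymodeme-specific drug response for that gene.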
## Each of the following lists has the name of the contrast as the key
## followed by a two element vector comprised of the numerator and
## denominator as the value. In the case of this first contrast, that
## is comprised of a string which manually defines a series of more
## complex contrasts than the usual/simple pairwise.
tmrc2_human_extra <- "z23drugnodrug_vs_z22drugnodrug = (conditioninf_sb_z23 - conditioninf_z23) - (conditioninf_sb_z22 - conditioninf_z22), z23z22drug_vs_z23z22nodrug = (conditioninf_sb_z23 - conditioninf_sb_z22) - (conditioninf_z23 - conditioninf_z22)"
tmrc2_human_keepers <- list(
"z23nosb_vs_uninf" = c("inf_z23", "uninf_none"),
"z22nosb_vs_uninf" = c("inf_z22", "uninf_none"),
"z23nosb_vs_z22nosb" = c("inf_z23", "inf_z22"),
"z23sb_vs_z22sb" = c("inf_sb_z23", "inf_sb_z22"),
"z23sb_vs_z23nosb" = c("inf_sb_z23", "inf_z23"),
"z22sb_vs_z22nosb" = c("inf_sb_z22", "inf_z22"),
"z23sb_vs_sb" = c("inf_sb_z23", "uninf_sb_none"),
"z22sb_vs_sb" = c("inf_sb_z22", "uninf_sb_none"),
"z23sb_vs_uninf" = c("inf_sb_z23", "uninf_none"),
"z22sb_vs_uninf" = c("inf_sb_z22", "uninf_none"),
"sb_vs_uninf" = c("uninf_sb_none", "uninf_none"),
"extra_z2322" = c("z23drugnodrug", "z22drugnodrug"),
"extra_drugnodrug" = c("z23z22drug", "z23z22nodrug"))
single_tmrc2_keeper <- list(
"z22sb_vs_sb" = c("inf_sb_z22", "uninf_sb_none"))
tmrc2_drug_keepers <- list(
"drug" = c("antimony", "none"))
tmrc2_type_keepers <- list(
"type" = c("U937", "Macrophages"))
tmrc2_strain_keepers <- list(
"strain" = c("z23", "z22"))
type_zymo_extra <- "zymos_vs_types = (conditionU937_z23 - conditionU937_z22) - (conditionMacrophages_z23 - conditionMacrophages_z22)"
tmrc2_typezymo_keepers <- list(
"u937_macr" = c("Macrophages_none", "U937_none"),
"zymo_macr" = c("Macrophages_z23", "Macrophages_z22"),
"zymo_u937" = c("U937_z23", "U937_z22"),
"z23_types" = c("U937_z23", "Macrophages_z23"),
"z22_types" = c("U937_z22", "Macrophages_z22"),
"zymos_types" = c("zymos_vs_types"))
tmrc2_typedrug_keepers <- list(
"type_nodrug" = c("U937_none", "Macrophages_none"),
"type_drug" = c("U937_antimony", "Macrophages_antimony"),
"macr_drugs" = c("Macrophages_antimony", "Macrophages_none"),
"u937_drugs" = c("U937_antimony", "U937_none"))
u937_keepers <- list(
"z23nosb_vs_uninf" = c("inf_z23", "uninf_none"),
"z22nosb_vs_uninf" = c("inf_z22", "uninf_none"),
"z23nosb_vs_z22nosb" = c("inf_z23", "inf_z22"),
"z23sb_vs_z22sb" = c("inf_sb_z23", "inf_sb_z22"),
"z23sb_vs_z23nosb" = c("inf_sb_z23", "inf_z23"),
"z22sb_vs_z22nosb" = c("inf_sb_z22", "inf_z22"),
"z23sb_vs_sb" = c("inf_sb_z23", "uninf_sb_none"),
"z22sb_vs_sb" = c("inf_sb_z22", "uninf_sb_none"),
"z23sb_vs_uninf" = c("inf_sb_z23", "uninf_none"),
"z22sb_vs_uninf" = c("inf_sb_z22", "uninf_none"),
"sb_vs_uninf" = c("uninf_sb_none", "uninf_none"))
## In some cases, when the set of significant genes was chosen, an
## additional filter was added to exclude genes with expression values
## less than 'high_expression' according to the
## 'high_expression_column' in the table.
high_expression <- 128
high_expression_column <- "deseq_basemean"
combined_to_tsv <- function(combined, celltype = "all") {
  keepers <- combined[["keepers"]]
  for (k in seq_along(keepers)) {
    kname <- names(keepers)[k]
    numerator <- keepers[[k]][1]
    denominator <- keepers[[k]][2]
    filename <- glue("analyses/macrophage_de/tsv_tables/tmrc2_{celltype}_{kname}_n{numerator}_d{denominator}-v{ver}.xlsx")
    kdata <- combined[["data"]][[kname]]
    if (is.null(kdata[["basic_num"]])) {
      next
    }
    wanted <- c("hgnc_symbol", "deseq_logfc", "deseq_adjp",
                "deseq_basemean", "deseq_num", "deseq_den")
    wanted_data <- kdata[, wanted]
    colnames(wanted_data) <- c("hgnc_symbol", "deseq_logfc", "deseq_adjp",
                               "deseq_mean", "deseq_numerator", "deseq_denominator")
    write_xlsx(data = wanted_data, excel = filename)
  }
}
write_all_gp <- function(all_gp, suffix = NULL) {
  all_written <- list()
  for (g in seq_along(all_gp)) {
    name <- names(all_gp)[g]
    datum <- all_gp[[name]]
    filename <- glue("analyses/macrophage_de/gprofiler/{name}_gprofiler-v{ver}.xlsx")
    if (!is.null(suffix)) {
      filename <- glue("analyses/macrophage_de/gprofiler/{name}_gprofiler{suffix}-v{ver}.xlsx")
    }
    written <- sm(write_gprofiler_data(datum, excel = filename))
    all_written[[g]] <- written
  }
  return(all_written)
}
Primary queries
There is a series of initial questions which make some sense to me,
but these do not necessarily match the set of questions which are most
pressing. I am hoping to address both of these sets of queries at once.
Before extracting these groups of queries, let us invoke the
all_pairwise() function and get all of the likely contrasts along with
one or more extras that might prove useful (the ‘extra_contrasts’
argument).
The structure of these blocks will all basically be identical:
- Perform a set of pairwise contrasts of all the conditions against
each other. Optionally use sva.
- Given that result, dump it in its entirety to an xlsx file in the
analyses/ directory.
- Given those combined tables, extract from them the set deemed
‘significant’ by whatever criteria we want to try. (Usually |lfc| >=
1.0, adjusted p <= 0.05; but potentially also expression >= x and
sometimes a set of less stringent values (|lfc| >= 0.6))
- Given one or more gene sets deemed ‘significant’ pass them to
gProfiler2 and see what pops out.
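The significance criteria in the list above can be sketched in base R as follows. This is only an illustration under stated assumptions: the table toy_table, its gene names, and the helper toy_significant() are hypothetical, and the real filtering in this document is performed by extract_significant_genes(). The column names follow the deseq_* naming used elsewhere in this document.

```r
## A toy results table with invented values.
toy_table <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  deseq_logfc = c(2.1, -1.4, 0.7, -3.0),
  deseq_adjp = c(0.001, 0.03, 0.01, 0.2),
  deseq_basemean = c(500, 60, 900, 1500),
  stringsAsFactors = FALSE)

## Hypothetical helper: keep rows passing the |lfc| and adjusted p cutoffs,
## optionally also requiring a minimum mean expression.
toy_significant <- function(tbl, lfc = 1.0, adjp = 0.05, min_mean = NULL) {
  keep <- abs(tbl[["deseq_logfc"]]) >= lfc & tbl[["deseq_adjp"]] <= adjp
  if (!is.null(min_mean)) {
    keep <- keep & tbl[["deseq_basemean"]] >= min_mean
  }
  kept <- tbl[keep, , drop = FALSE]
  list(up = kept[kept[["deseq_logfc"]] > 0, "hgnc_symbol"],
       down = kept[kept[["deseq_logfc"]] < 0, "hgnc_symbol"])
}

toy_sig <- toy_significant(toy_table)                     ## |lfc| >= 1.0
toy_highsig <- toy_significant(toy_table, min_mean = 128) ## plus expression >= 128
toy_lesssig <- toy_significant(toy_table, lfc = 0.6)      ## less stringent |lfc|
print(toy_sig)
```

The three calls correspond to the three flavors of gene set produced for each comparison in this document: the default set, the high-expression set, and the less stringent set.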
Combined U937 and Macrophages: Compare drug effects
When we have the U937 cells in the same dataset as the macrophages,
that provides an interesting opportunity to see if we can observe
drug-dependent effects which are shared across both cell types.
Note to self: given the changes to hpgltools I may need to specify
the statistical model string when I am using svaseq for some/many/all of
these comparisons.
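As a reminder of what a model string like "~ 0 + condition" produces, here is a small base R sketch using model.matrix() on a toy factor. The data frame toy_df is hypothetical. I am assuming that the downstream methods name their coefficients the way model.matrix() does, which would explain why the extra contrast strings in this document refer to names such as conditioninf_sb_z23.

```r
## Toy sample sheet; the real factor is colData(all_human)[["condition"]].
toy_df <- data.frame(
  condition = factor(c("antimony", "none", "antimony", "none")))

## A cell-means design: no intercept, one column per condition level.
toy_design <- model.matrix(~ 0 + condition, data = toy_df)
print(colnames(toy_design))

## Each column name is the factor name pasted onto the level name.
expected_names <- paste0("condition", levels(toy_df[["condition"]]))
```

With no intercept, each coefficient is the estimated mean of one condition, so a pairwise contrast is simply the difference between two columns.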
drug_de <- all_pairwise(all_human, filter = TRUE, model_svs = "svaseq",
model_fstring = "~ 0 + condition")
## antimony none
## 34 34
## Running normalize_se.
## Removing 9198 low-count genes (12283 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 85798 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## antimony none
## 34 34
## conditions
## antimony none
## 34 34
## conditions
## antimony none
## 34 34
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
## nn_vs_ntmn
## basic_vs_deseq 0.8651
## basic_vs_dream 0.8738
## basic_vs_ebseq 0.8367
## basic_vs_edger 0.8671
## basic_vs_limma 0.8771
## basic_vs_noiseq 0.8726
## deseq_vs_dream 0.9719
## deseq_vs_ebseq 0.9344
## deseq_vs_edger 0.9988
## deseq_vs_limma 0.9665
## deseq_vs_noiseq 0.9738
## dream_vs_ebseq 0.9770
## dream_vs_edger 0.9741
## dream_vs_limma 0.9953
## dream_vs_noiseq 0.9616
## ebseq_vs_edger 0.9383
## ebseq_vs_limma 0.9728
## ebseq_vs_noiseq 0.9486
## edger_vs_limma 0.9688
## edger_vs_noiseq 0.9745
## limma_vs_noiseq 0.9584
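A rough sketch of how one might compute a table like the agreement summary above: I am assuming here that each value is a correlation between the per-gene log fold changes reported by two methods, which I have not verified against the hpgltools source. The values and the matrix toy_lfc are invented.

```r
## Invented per-gene log2 fold changes from three hypothetical methods.
toy_lfc <- cbind(
  method_a = c(2.0, -1.5, 0.3, -0.2, 1.1),
  method_b = c(2.1, -1.6, 0.35, -0.25, 1.1),
  method_c = c(1.8, -1.0, 0.6, 0.1, 0.9))

## Pairwise Pearson correlation between the methods' logFC columns.
toy_agreement <- cor(toy_lfc)
print(round(toy_agreement, 4))
```

Under that assumption, values close to 1 (as for deseq_vs_edger above) mean the two methods rank and scale the fold changes nearly identically.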
drug_table <- combine_de_tables(
drug_de, keepers = tmrc2_drug_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_drug_comparison-v{ver}.xlsx"))
drug_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown
## 1 none_vs_antimony-inverted 480 764 480 759
## limma_sigup limma_sigdown
## 1 471 700
## `geom_line()`: Each group consists of only one observation.
## i Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(drug_table, celltype = "all")
drug_sig <- extract_significant_genes(
drug_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_drug_sig-v{ver}.xlsx"))
drug_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## drug 471 700 480 759 480 764 323
## ebseq_down basic_up basic_down
## drug 577 256 393

drug_highsig <- extract_significant_genes(
drug_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_drug_highsig-v{ver}.xlsx"))
drug_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## drug 222 388 233 427 231 429 162
## ebseq_down basic_up basic_down
## drug 346 208 343

drug_lesssig <- extract_significant_genes(
drug_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_drug_lesssig-v{ver}.xlsx"))
drug_lesssig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## drug 1120 1326 1098 1451 1082 1461 647
## ebseq_down basic_up basic_down
## drug 983 772 945

gProfiler2 results of the significant drug genes
all_drug_gp <- all_gprofiler(drug_sig, enrich_id_column = "hgnc_symbol")
all_drug_gp
## BP CC CORUM HP HPA KEGG MIRNA MF REAC TF WP
## drug_up 88 32 0 0 0 5 0 38 78 26 9
## drug_down 320 61 0 0 0 1 1 32 2 297 2
written <- write_all_gp(all_drug_gp)
all_drug_lesssig <- all_gprofiler(drug_lesssig, enrich_id_column = "hgnc_symbol")
written <- write_all_gp(all_drug_lesssig, suffix = "_lfc0.6_")
Combined U937 and Macrophages: compare cell types
There are a couple of ways one might want to directly compare the two
cell types.
- Given that the variance between the two celltypes is so huge, just
compare all samples.
- One might want to compare them with the interaction effects of
drug/zymodeme.
type_de <- all_pairwise(all_human_types, filter = TRUE, model_fstring = "~ 0 + condition",
model_svs = "svaseq")
## Macrophages U937
## 54 14
## Running normalize_se.
## Removing 9198 low-count genes (12283 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 85798 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## Macrophages U937
## 54 14
## conditions
## Macrophages U937
## 54 14
## conditions
## Macrophages U937
## 54 14
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
## U937_vs_Mc
## basic_vs_deseq 0.8587
## basic_vs_dream 0.8853
## basic_vs_ebseq 0.8125
## basic_vs_edger 0.8576
## basic_vs_limma 0.8982
## basic_vs_noiseq 0.9114
## deseq_vs_dream 0.9938
## deseq_vs_ebseq 0.9805
## deseq_vs_edger 0.9974
## deseq_vs_limma 0.9849
## deseq_vs_noiseq 0.9835
## dream_vs_ebseq 0.9682
## dream_vs_edger 0.9947
## dream_vs_limma 0.9932
## dream_vs_noiseq 0.9910
## ebseq_vs_edger 0.9836
## ebseq_vs_limma 0.9490
## ebseq_vs_noiseq 0.9652
## edger_vs_limma 0.9859
## edger_vs_noiseq 0.9829
## limma_vs_noiseq 0.9817
type_table <- combine_de_tables(
type_de, keepers = tmrc2_type_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_type_comparison-v{ver}.xlsx"))
type_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown
## 1 U937_vs_Macrophages 2105 2436 2076 2460
## limma_sigup limma_sigdown
## 1 2247 2129
## `geom_line()`: Each group consists of only one observation.
## i Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(type_table, celltype = "all")
type_sig <- extract_significant_genes(
type_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_sig-v{ver}.xlsx"))
type_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## type 2247 2129 2076 2460 2105 2436 1880
## ebseq_down basic_up basic_down
## type 2485 1972 1784

type_highsig <- extract_significant_genes(
type_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_highsig-v{ver}.xlsx"))
type_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## type 1365 1632 1297 1762 1322 1736 1181
## ebseq_down basic_up basic_down
## type 1789 1345 1613

type_lesssig <- extract_significant_genes(
type_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_lesssig-v{ver}.xlsx"))
type_lesssig

Combined factors of interest: celltype+zymodeme
Given the above explicit comparison of all samples comprising the two
cell types, now let us look at the cell type+zymodeme status with all
samples, macrophages and U937.
type_zymo_de <- all_pairwise(type_zymo, filter = TRUE, model_svs = "svaseq",
model_fstring = "~ 0 + condition",
extra_contrasts = type_zymo_extra)
## Macrophages_none Macrophages_z22 Macrophages_z23 U937_none
## 8 23 23 2
## U937_z22 U937_z23
## 6 6
## Running normalize_se.
## Removing 9198 low-count genes (12283 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 85798 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## The contrast zymos is not in the results.
## If this is not an extra contrast, then this is an error.
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## Macrophages_none Macrophages_z22 Macrophages_z23 U937_none
## 8 23 23 2
## U937_z22 U937_z23
## 6 6
## conditions
## Macrophages_none Macrophages_z22 Macrophages_z23 U937_none
## 8 23 23 2
## U937_z22 U937_z23
## 6 6
## conditions
## Macrophages_none Macrophages_z22 Macrophages_z23 U937_none
## 8 23 23 2
## U937_z22 U937_z23
## 6 6

## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 15 comparisons.
Strangely, as of 20250903, the following line throws an error in the
container: “Error: subscript contains invalid names.”
When I run it manually, however, I get no error. I assume this means
that the copy of hpgltools in the container has fallen behind HEAD.
In addition, the call did return successfully despite the error.
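One way to start narrowing down an error like this is to check which keeper entries refer to names that are not plain condition levels. The sketch below does that in base R; the helper missing_from_levels() and the toy inputs are hypothetical, and this is not a diagnosis of the actual error. Keepers that only name an extra contrast (such as zymos_vs_types) are expected to be reported here, since they are not condition levels.

```r
## Condition levels as they appear in the type_zymo data structure.
toy_levels <- c("Macrophages_none", "Macrophages_z22", "Macrophages_z23",
                "U937_none", "U937_z22", "U937_z23")

## A subset of keepers in the same shape as tmrc2_typezymo_keepers.
toy_keepers <- list(
  "zymo_macr" = c("Macrophages_z23", "Macrophages_z22"),
  "zymos_types" = c("zymos_vs_types"))

## Hypothetical helper: for each keeper, list the entries which are not
## condition levels, then drop the keepers with nothing to report.
missing_from_levels <- function(keepers, levels) {
  unknown <- lapply(keepers, function(pair) setdiff(pair, levels))
  unknown[lengths(unknown) > 0]
}

toy_missing <- missing_from_levels(toy_keepers, toy_levels)
print(toy_missing)
```

Anything reported that is neither a condition level nor the name of an extra contrast would be a good candidate for the invalid name.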
type_zymo_table <- combine_de_tables(
type_zymo_de, keepers = tmrc2_typezymo_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_type_zymo_comparison-v{ver}.xlsx"))
## Error : subscript contains invalid names
## coefficient limma did not find NA or zymos_vs_types.
## coefficient edger did not find conditionNA or conditionzymos_vs_types.
## coefficient limma did not find NA or zymos_vs_types.
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup
## 1 U937_none_vs_Macrophages_none-inverted 2353 2081 2360
## 2 Macrophages_z23_vs_Macrophages_z22 300 459 297
## 3 U937_z23_vs_U937_z22 1 2 1
## 4 U937_z23_vs_Macrophages_z23 2153 2468 2122
## 5 U937_z22_vs_Macrophages_z22 2024 2539 2005
## 6 zymos_vs_types 0 0 334
## edger_sigdown limma_sigup limma_sigdown
## 1 2088 2017 2206
## 2 460 382 318
## 3 3 0 0
## 4 2498 2294 2182
## 5 2558 2272 2154
## 6 219 185 222
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(type_zymo_table, celltype = "all")
type_zymo_sig <- extract_significant_genes(
type_zymo_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_zymo_sig-v{ver}.xlsx"))
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## u937_macr 2017 2206 2360 2088 2353 2081
## zymo_macr 382 318 297 460 300 459
## zymo_u937 0 0 1 3 1 2
## z23_types 2294 2182 2122 2498 2153 2468
## z22_types 2272 2154 2005 2558 2024 2539
## zymos_types 185 222 334 219 0 0
## ebseq_up ebseq_down basic_up basic_down
## u937_macr 1720 1867 0 0
## zymo_macr 211 255 213 113
## zymo_u937 0 1 0 0
## z23_types 1971 2021 1997 1808
## z22_types 1899 2423 2001 1804
## zymos_types 0 0 0 0

type_zymo_highsig <- extract_significant_genes(
type_zymo_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_zymo_highsig-v{ver}.xlsx"))
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
type_zymo_lesssig <- extract_significant_genes(
type_zymo_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_zymo_lesssig-v{ver}.xlsx"))
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## u937_macr 2865 3266 3296 3115 3292 3121
## zymo_macr 737 704 624 946 622 936
## zymo_u937 1 1 1 3 1 2
## z23_types 3432 3109 3197 3550 3242 3501
## z22_types 3465 3072 3141 3536 3181 3509
## zymos_types 338 331 527 374 0 0
## ebseq_up ebseq_down basic_up basic_down
## u937_macr 2157 2696 0 0
## zymo_macr 343 418 442 389
## zymo_u937 2 1 0 0
## z23_types 2933 2687 3151 2748
## z22_types 2949 3250 3231 2760
## zymos_types 0 0 0 0

Combined factors of interest: celltype+drug
The ‘type_drug’ datastructure is the same as above, but the condition
is created from the concatenation of the cell type and drug
treatment.
type_drug_de <- all_pairwise(type_drug, filter = TRUE, model_svs = "svaseq",
model_fstring = "~ 0 + condition")
## Macrophages_antimony Macrophages_none U937_antimony
## 27 27 7
## U937_none
## 7
## Running normalize_se.
## Removing 9198 low-count genes (12283 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 85798 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## Macrophages_antimony Macrophages_none U937_antimony
## 27 27 7
## U937_none
## 7
## conditions
## Macrophages_antimony Macrophages_none U937_antimony
## 27 27 7
## U937_none
## 7
## conditions
## Macrophages_antimony Macrophages_none U937_antimony
## 27 27 7
## U937_none
## 7

## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 6 comparisons.
type_drug_table <- combine_de_tables(
type_drug_de, keepers = tmrc2_typedrug_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_type_drug_comparison-v{ver}.xlsx"))
type_drug_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown
## 1 U937_none_vs_Macrophages_none 2094 2644
## 2 U937_antimony_vs_Macrophages_antimony 2102 2375
## 3 Macrophages_none_vs_Macrophages_antimony-inverted 606 966
## 4 U937_none_vs_U937_antimony-inverted 421 167
## edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 2059 2668 2295 2194
## 2 2083 2385 2254 2133
## 3 605 960 672 910
## 4 442 176 211 162
## Plot describing unique/shared genes in a differential expression table.

#combined_to_tsv(type_drug_table, celltype = "all")
type_drug_sig <- extract_significant_genes(
type_drug_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_drug_sig-v{ver}.xlsx"))
type_drug_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## type_nodrug 2295 2194 2059 2668 2094 2644
## type_drug 2254 2133 2083 2385 2102 2375
## macr_drugs 672 910 605 960 606 966
## u937_drugs 211 162 442 176 421 167
## ebseq_up ebseq_down basic_up basic_down
## type_nodrug 1956 2465 2041 1852
## type_drug 2008 2312 1989 1856
## macr_drugs 482 881 369 569
## u937_drugs 359 157 168 146

type_drug_highsig <- extract_significant_genes(
type_drug_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_drug_highsig-v{ver}.xlsx"))
type_drug_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## type_nodrug 1394 1690 1312 1895 1343 1869
## type_drug 1386 1649 1302 1734 1321 1721
## macr_drugs 330 520 300 564 303 565
## u937_drugs 102 84 210 105 202 101
## ebseq_up ebseq_down basic_up basic_down
## type_nodrug 1275 1789 1402 1659
## type_drug 1266 1719 1379 1657
## macr_drugs 243 517 294 471
## u937_drugs 168 100 99 85

type_drug_lesssig <- extract_significant_genes(
type_drug_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophage_type_drug_lesssig-v{ver}.xlsx"))
type_drug_lesssig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## type_nodrug 3497 3150 3169 3694 3220 3659
## type_drug 3387 3114 3181 3438 3210 3414
## macr_drugs 1422 1592 1311 1747 1303 1743
## u937_drugs 459 429 800 417 770 414
## ebseq_up ebseq_down basic_up basic_down
## type_nodrug 3010 3258 3246 2824
## type_drug 3048 3198 3187 2886
## macr_drugs 999 1467 1071 1179
## u937_drugs 720 454 470 449
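To keep straight what the three flavors of significance (sig, highsig, lesssig) are doing, the following is a rough base R sketch of the filtering logic. It is not the hpgltools implementation: the helper `count_sig` is hypothetical, the toy values are made up, I am assuming inclusive thresholds, and I am assuming `deseq_basemean` as the expression column.

```r
## Toy results table; column names mimic the deseq_* columns, values are invented.
toy <- data.frame(
  row.names = paste0("gene", 1:6),
  deseq_logfc = c(2.1, 0.7, -1.4, -0.65, 0.2, 1.2),
  deseq_adjp = c(0.001, 0.01, 0.03, 0.04, 0.0001, 0.2),
  deseq_basemean = c(500, 20, 300, 10, 900, 50))

## Hypothetical helper: count up/down genes passing the cutoffs.
count_sig <- function(df, lfc = 1, adjp = 0.05, min_mean = NULL) {
  keep <- df[["deseq_adjp"]] <= adjp
  if (!is.null(min_mean)) {
    ## Optional expression filter, analogous in spirit to min_mean_exprs.
    keep <- keep & df[["deseq_basemean"]] >= min_mean
  }
  c(up = sum(keep & df[["deseq_logfc"]] >= lfc),
    down = sum(keep & df[["deseq_logfc"]] <= -lfc))
}

count_sig(toy)                  ## default cutoffs: |logFC| >= 1, adjp <= 0.05
count_sig(toy, lfc = 0.6)       ## relaxed fold-change cutoff
count_sig(toy, min_mean = 100)  ## additionally require higher expression
```

Relaxing the fold-change cutoff can only add genes and the expression filter can only remove them, which matches the direction of the counts in the three tables above.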

Individual cell types
At this point, I think it is fair to say that the two cell types are
sufficiently different that they do not really belong together in a
single analysis.
drug or strain effects, single cell type
One of the queries Najib asked, which I think I misinterpreted, was to
look at drug and/or strain effects. My interpretation is somewhere below
and was not what he was looking for. Instead, he wanted to see
all(macrophage) drug/nodrug and all(macrophage) z23/z22 and compare them
to each other. It may be that this is still a wrong interpretation; if
so, the most likely comparison is either:
- (z23drug/z22drug) / (z23nodrug/z22nodrug), or perhaps
- (z23drug/z23nodrug) / (z22drug/z22nodrug).
I am not sure; those confuse me, and at least one of them is below.
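For what it is worth, the two ratio-of-ratios candidates listed above are algebraically the same quantity, just grouped differently; on the log scale both are the strain-by-drug interaction term. A tiny sketch with made-up numbers (the four values are hypothetical mean expression levels for a single gene):

```r
## Made-up mean expression of one gene in the four infected conditions.
z23drug <- 40
z22drug <- 10
z23nodrug <- 60
z22nodrug <- 30

## Strain effect with drug, relative to the strain effect without drug.
strain_within_drug <- (z23drug / z22drug) / (z23nodrug / z22nodrug)
## Drug effect in z23, relative to the drug effect in z22.
drug_within_strain <- (z23drug / z23nodrug) / (z22drug / z22nodrug)
all.equal(strain_within_drug, drug_within_strain)

## On the log2 scale, both reduce to the same interaction contrast.
log2(z23drug) - log2(z22drug) - log2(z23nodrug) + log2(z22nodrug)
```

So whichever of the two is intended, the per-gene value is the same; only the interpretation ("does the drug change the strain effect" vs. "does the strain change the drug effect") differs.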
Macrophages
In these blocks we will explicitly query only one factor at a time:
first drug, then strain. The eventual goal, I think, is to look for
effects which are shared between drug treatment and infection strain.
Macrophage Drug only
Thus we will start with the pure drug query. In this block we will
look only at the drug/nodrug effect.
hs_macr_drug_de <- all_pairwise(hs_macr_drug_se, filter = TRUE, model_svs = "svaseq",
model_fstring = "~ 0 + condition")
## antimony none
## 27 27
## Running normalize_se.
## Removing 9725 low-count genes (11756 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 40036 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## antimony none
## 27 27
## conditions
## antimony none
## 27 27
## conditions
## antimony none
## 27 27
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
## nn_vs_ntmn
## basic_vs_deseq 0.8930
## basic_vs_dream 0.8901
## basic_vs_ebseq 0.8651
## basic_vs_edger 0.8942
## basic_vs_limma 0.8948
## basic_vs_noiseq 0.9048
## deseq_vs_dream 0.9951
## deseq_vs_ebseq 0.9647
## deseq_vs_edger 0.9997
## deseq_vs_limma 0.9911
## deseq_vs_noiseq 0.9837
## dream_vs_ebseq 0.9725
## dream_vs_edger 0.9952
## dream_vs_limma 0.9961
## dream_vs_noiseq 0.9880
## ebseq_vs_edger 0.9643
## ebseq_vs_limma 0.9694
## ebseq_vs_noiseq 0.9891
## edger_vs_limma 0.9911
## edger_vs_noiseq 0.9837
## limma_vs_noiseq 0.9852
hs_macr_drug_table <- combine_de_tables(
hs_macr_drug_de, keepers = tmrc2_drug_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_onlydrug_table-v{ver}.xlsx"))
hs_macr_drug_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown
## 1 none_vs_antimony-inverted 519 862 525 852
## limma_sigup limma_sigdown
## 1 556 808
## `geom_line()`: Each group consists of only one observation.
## i Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

#combined_to_tsv(hs_macr_drug_table, celltype = "macrophage")
hs_macr_drug_sig <- extract_significant_genes(
hs_macr_drug_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_drug_sig-v{ver}.xlsx"))
hs_macr_drug_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## drug 556 808 525 852 519 862 425
## ebseq_down basic_up basic_down
## drug 821 366 562

hs_macr_drug_highsig <- extract_significant_genes(
hs_macr_drug_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_drug_highsig-v{ver}.xlsx"))
hs_macr_drug_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## drug 283 492 273 539 268 548 225
## ebseq_down basic_up basic_down
## drug 511 285 479

## Creating the following to see how it affects gProfiler.
hs_macr_drug_lesssig <- extract_significant_genes(
hs_macr_drug_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_drug_sig_lfc0.6-v{ver}.xlsx"))
Macrophage Strain only
In a similar fashion, let us look for effects which are observed when
we consider only the strain used during infection.
hs_macr_strain_de <- all_pairwise(hs_macr_strain_se, filter = TRUE, model_svs = "svaseq",
model_fstring = "~ 0 + condition")
## z22 z23
## 23 23
## Running normalize_se.
## Removing 9761 low-count genes (11720 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 32467 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## z22 z23
## 23 23
## conditions
## z22 z23
## 23 23
## conditions
## z22 z23
## 23 23
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
## z23_vs_z22
## basic_vs_deseq 0.8273
## basic_vs_dream 0.8408
## basic_vs_ebseq 0.8021
## basic_vs_edger 0.8313
## basic_vs_limma 0.8629
## basic_vs_noiseq 0.1082
## deseq_vs_dream 0.9850
## deseq_vs_ebseq 0.9721
## deseq_vs_edger 0.9991
## deseq_vs_limma 0.9637
## deseq_vs_noiseq 0.2183
## dream_vs_ebseq 0.9734
## dream_vs_edger 0.9876
## dream_vs_limma 0.9782
## dream_vs_noiseq 0.2520
## ebseq_vs_edger 0.9726
## ebseq_vs_limma 0.9614
## ebseq_vs_noiseq 0.4180
## edger_vs_limma 0.9668
## edger_vs_noiseq 0.2127
## limma_vs_noiseq 0.2754
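Note that every pairing with noiseq scores between 0.11 and 0.42 here, versus 0.90 or higher in the drug comparison. To illustrate how such an agreement table can be read, here is a sketch which assumes the score is a correlation of per-gene logFC values between methods (the output above does not say so explicitly). The data are simulated and the method names are placeholders.

```r
set.seed(1)
## Simulated logFC values for three placeholder methods over the same 200 genes.
truth <- rnorm(200)
lfc <- data.frame(
  method_a = truth + rnorm(200, sd = 0.1),
  method_b = truth + rnorm(200, sd = 0.1),
  ## Unrelated to the others, so it should agree poorly with both.
  method_c = rnorm(200))
## Pairwise agreement among the methods as a correlation matrix.
agreement <- cor(lfc, method = "pearson")
round(agreement, 3)
```

Under that assumption, a low score for one method against all others suggests that its fold changes for this contrast are on a different footing (direction, scale, or filtering), rather than that the other methods disagree among themselves.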
hs_macr_strain_table <- combine_de_tables(
hs_macr_strain_de, keepers = tmrc2_strain_keepers,
excel = glue("analyses/macrophage_de/de_tables/macrophage_onlystrain_table-v{ver}.xlsx"))
hs_macr_strain_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 z23_vs_z22 291 371 288 363 337
## limma_sigdown
## 1 275
## `geom_line()`: Each group consists of only one observation.
## i Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(hs_macr_strain_table, celltype = "macrophage")
hs_macr_strain_sig <- extract_significant_genes(
hs_macr_strain_table,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_onlystrain_sig-v{ver}.xlsx"))
hs_macr_strain_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## strain 337 275 288 363 291 371 199
## ebseq_down basic_up basic_down
## strain 216 210 112

hs_macr_strain_highsig <- extract_significant_genes(
hs_macr_strain_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_onlystrain_highsig-v{ver}.xlsx"))
hs_macr_strain_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## strain 193 101 194 110 194 112 156
## ebseq_down basic_up basic_down
## strain 51 184 93

hs_macr_strain_lesssig <- extract_significant_genes(
hs_macr_strain_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/macrophageonly_onlystrain_lesssig-v{ver}.xlsx"))
hs_macr_strain_lesssig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## strain 667 662 608 822 607 830 331
## ebseq_down basic_up basic_down
## strain 370 446 387

Compare Drug and Strain Effects
Now let us consider the above two comparisons together. First, I will
plot their logFC values against each other (drug on the x-axis and
strain on the y-axis). Then we can extract the significant genes in a
few combined categories of interest. I assume these will focus
exclusively on the categories which include the introduction of the
drug.
drug_strain_comp_df <- merge(hs_macr_drug_table[["data"]][["drug"]],
hs_macr_strain_table[["data"]][["strain"]],
by = "row.names")
drug_strain_comp_plot <- plot_linear_scatter(
drug_strain_comp_df[, c("deseq_logfc.x", "deseq_logfc.y")])
## Contrasts: antimony/none, z23/z22; x-axis: drug, y-axis: strain
## top left: higher no drug, z23; top right: higher drug z23
## bottom left: higher no drug, z22; bottom right: higher drug z22
drug_strain_comp_plot[["scatter"]]
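The merge and the quadrant labels in the comments above can be sketched on toy data as follows. The gene IDs and logFC values are made up; the only assumptions carried over are that both tables are keyed by gene ID in their row names and that the contrasts are antimony/none (x) and z23/z22 (y).

```r
## Toy logFC tables keyed by gene ID in the row names.
drug <- data.frame(row.names = c("g1", "g2", "g3", "g4"),
                   deseq_logfc = c(2, -1.5, 1.2, -2))
strain <- data.frame(row.names = c("g1", "g2", "g3", "g5"),
                     deseq_logfc = c(1.1, 1.3, -1.8, 0.4))
## merge() keeps only shared genes and suffixes clashing columns with .x/.y.
both <- merge(drug, strain, by = "row.names")
## Label each gene by the quadrant it would occupy in the scatter plot.
both[["quadrant"]] <- paste(
  ifelse(both[["deseq_logfc.x"]] > 0, "higher drug", "higher no drug"),
  ifelse(both[["deseq_logfc.y"]] > 0, "z23", "z22"), sep = ", ")
both
```

Genes present in only one of the two tables (g4 and g5 here) are silently dropped by the merge, which is worth remembering because the drug and strain analyses filtered slightly different numbers of low-count genes.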

As I noted in the comments above, some quadrants of the scatter plot
are likely to be of greater interest to us than others (the right side).
Because I get confused sometimes, the following block will explicitly
name the categories of likely interest, then ask which genes are shared
among them, and finally use UpSetR to extract the various gene
intersection/union categories.
higher_drug <- hs_macr_drug_sig[["deseq"]][["downs"]][[1]]
higher_nodrug <- hs_macr_drug_sig[["deseq"]][["ups"]][[1]]
higher_z23 <- hs_macr_strain_sig[["deseq"]][["ups"]][[1]]
higher_z22 <- hs_macr_strain_sig[["deseq"]][["downs"]][[1]]
sum(rownames(higher_drug) %in% rownames(higher_z23))
## [1] 94
sum(rownames(higher_drug) %in% rownames(higher_z22))
## [1] 87
sum(rownames(higher_nodrug) %in% rownames(higher_z23))
## [1] 26
sum(rownames(higher_nodrug) %in% rownames(higher_z22))
## [1] 73
drug_z23_lst <- list("drug" = rownames(higher_drug),
"z23" = rownames(higher_z23))
upset_input <- UpSetR::fromList(drug_z23_lst)
higher_drug_z23 <- UpSetR::upset(upset_input, text.scale = 2)
higher_drug_z23

drug_z23_shared_genes <- overlap_groups(drug_z23_lst)
shared_genes_drug_z23 <- overlap_geneids(drug_z23_shared_genes, "drug:z23")
shared_genes_drug_z23 <- attr(drug_z23_shared_genes, "elements")[drug_z23_shared_genes[["drug:z23"]]]
drug_z22_lst <- list("drug" = rownames(higher_drug),
"z22" = rownames(higher_z22))
higher_drug_z22 <- UpSetR::upset(UpSetR::fromList(drug_z22_lst), text.scale = 2)
higher_drug_z22

drug_z22_shared_genes <- overlap_groups(drug_z22_lst)
shared_genes_drug_z22 <- overlap_geneids(drug_z22_shared_genes, "drug:z22")
shared_genes_drug_z22 <- attr(drug_z22_shared_genes, "elements")[drug_z22_shared_genes[["drug:z22"]]]
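The same shared/unique categories can be cross-checked with plain set operations. This is only a sanity-check sketch on toy ID vectors, which stand in for the rownames() of the significant tables:

```r
## Toy gene ID vectors standing in for rownames() of the significant tables.
higher_drug_ids <- c("g1", "g2", "g3", "g4")
higher_z23_ids <- c("g3", "g4", "g5")

shared <- intersect(higher_drug_ids, higher_z23_ids)
only_drug <- setdiff(higher_drug_ids, higher_z23_ids)
only_z23 <- setdiff(higher_z23_ids, higher_drug_ids)

## With unique IDs, the %in% count used above equals the intersection size.
sum(higher_drug_ids %in% higher_z23_ids) == length(shared)
```

If the intersection size ever disagrees with the %in% count, the most likely explanation is duplicated IDs in one of the inputs.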
Our main question of interest
The data structure hs_macr contains our primary macrophages, which
are, as shown above, the data we can really sink our teeth into.
Note, we expect some errors when running combine_de_tables()
because not all of the methods I use can handle the ratio-of-ratios
contrasts we added via the ‘extra_contrasts’ argument. As a result, when
we combine them into the larger output tables, those peculiar contrasts
fail. This does not stop it from writing the rest of the results,
however.
hs_macr_de_noextra <- all_pairwise(hs_macr, model_svs = "svaseq",
model_fstring = "~ 0 + condition", filter = TRUE)
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## Running normalize_se.
## Removing 9725 low-count genes (11756 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 40036 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4

hs_macr_de <- all_pairwise(hs_macr, model_svs = "svaseq", model_fstring = "~ 0 + condition",
filter = TRUE, extra_contrasts = tmrc2_human_extra)
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## Running normalize_se.
## Removing 9725 low-count genes (11756 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 40036 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## The contrast z23drugnodrug is not in the results.
## If this is not an extra contrast, then this is an error.
## The contrast z23z22drug is not in the results.
## If this is not an extra contrast, then this is an error.
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 12 11 11 12 4
## uninf_sb_none
## 4
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 15 comparisons.
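The count of 15 is simply the number of unordered pairs among the six conditions, which is quick to confirm in base R (the condition names are copied from the output above):

```r
conditions <- c("inf_sb_z22", "inf_sb_z23", "inf_z22", "inf_z23",
                "uninf_none", "uninf_sb_none")
## Every unordered pair of conditions, one pair per column.
condition_pairs <- combn(conditions, 2)
ncol(condition_pairs)  ## 15 pairwise comparisons among 6 conditions
choose(4, 2)           ## 6, as for the four celltype+drug conditions earlier
```

The extra ratio-of-ratios contrasts are in addition to these and are not part of the 15.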
Write out the results.
hs_single_table <- combine_de_tables(
hs_macr_de_noextra, keepers = single_tmrc2_keeper,
excel = glue("analyses/macrophage_de/de_tables/hs_macr_drug_zymo_z22sb_sb-v{ver}.xlsx"))
hs_single_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup
## 1 uninf_sb_none_vs_inf_sb_z22-inverted 33 0 32
## edger_sigdown limma_sigup limma_sigdown
## 1 0 2 0
## Only z22sb_vs_sb_up has information, cannot create an UpSet.
## Plot describing unique/shared genes in a differential expression table.
## NULL
hs_macr_table <- combine_de_tables(
hs_macr_de, keepers = tmrc2_human_keepers,
excel = glue("analyses/macrophage_de/de_tables/hs_macr_drug_zymo_table_macr_only-v{ver}.xlsx"))
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_z2322 using basic does not appear in the pairwise data.
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_z2322 using ebseq does not appear in the pairwise data.
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_z2322 using noiseq does not appear in the pairwise data.
## Error : subscript contains invalid names
## coefficient limma did not find z22drugnodrug or z23drugnodrug.
## coefficient edger did not find conditionz22drugnodrug or conditionz23drugnodrug.
## coefficient limma did not find z22drugnodrug or z23drugnodrug.
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_drugnodrug using basic does not appear in the pairwise
## data.
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_drugnodrug using ebseq does not appear in the pairwise
## data.
## Warning in extract_keepers(extracted, keepers, table_names, all_coefficients, :
## The table for extra_drugnodrug using noiseq does not appear in the pairwise
## data.
## Error : subscript contains invalid names
## coefficient limma did not find z23z22nodrug or z23z22drug.
## coefficient edger did not find conditionz23z22nodrug or conditionz23z22drug.
## coefficient limma did not find z23z22nodrug or z23z22drug.
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup
## 1 uninf_none_vs_inf_z23-inverted 478 265 470
## 2 uninf_none_vs_inf_z22-inverted 359 6 340
## 3 inf_z23_vs_inf_z22 349 539 359
## 4 inf_sb_z23_vs_inf_sb_z22 343 252 339
## 5 inf_z23_vs_inf_sb_z23-inverted 619 828 625
## 6 inf_z22_vs_inf_sb_z22-inverted 505 1040 520
## 7 uninf_sb_none_vs_inf_sb_z23-inverted 461 247 461
## 8 uninf_sb_none_vs_inf_sb_z22-inverted 33 0 32
## 9 uninf_none_vs_inf_sb_z23-inverted 839 923 854
## 10 uninf_none_vs_inf_sb_z22-inverted 660 746 672
## 11 uninf_sb_none_vs_uninf_none 561 748 563
## 12 FALSE 0 0 329
## 13 FALSE 0 0 329
## edger_sigdown limma_sigup limma_sigdown
## 1 270 392 251
## 2 6 264 71
## 3 528 450 390
## 4 253 377 215
## 5 821 571 746
## 6 1009 671 925
## 7 249 374 232
## 8 0 2 0
## 9 906 805 914
## 10 733 555 744
## 11 742 513 696
## 12 63 243 135
## 13 63 243 135
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(hs_macr_table, "macrophage")
hs_macr_sig <- extract_significant_genes(
hs_macr_table,
excel = glue("analyses/macrophage_de/sig_tables/hs_macr_drug_zymo_sig-v{ver}.xlsx"))
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 392 251 470 270 478 265
## z22nosb_vs_uninf 264 71 340 6 359 6
## z23nosb_vs_z22nosb 450 390 359 528 349 539
## z23sb_vs_z22sb 377 215 339 253 343 252
## z23sb_vs_z23nosb 571 746 625 821 619 828
## z22sb_vs_z22nosb 671 925 520 1009 505 1040
## z23sb_vs_sb 374 232 461 249 461 247
## z22sb_vs_sb 2 0 32 0 33 0
## z23sb_vs_uninf 805 914 854 906 839 923
## z22sb_vs_uninf 555 744 672 733 660 746
## sb_vs_uninf 513 696 563 742 561 748
## extra_z2322 243 135 329 63 0 0
## extra_drugnodrug 243 135 329 63 0 0
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 111 112 0 0
## z22nosb_vs_uninf 160 2 0 0
## z23nosb_vs_z22nosb 257 408 281 259
## z23sb_vs_z22sb 106 108 117 44
## z23sb_vs_z23nosb 412 699 371 540
## z22sb_vs_z22nosb 458 886 437 680
## z23sb_vs_sb 33 58 0 0
## z22sb_vs_sb 25 0 0 0
## z23sb_vs_uninf 280 767 350 489
## z22sb_vs_uninf 444 551 276 396
## sb_vs_uninf 316 495 0 0
## extra_z2322 0 0 0 0
## extra_drugnodrug 0 0 0 0

hs_macr_highsig <- extract_significant_genes(
hs_macr_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/hs_macr_drug_zymo_highsig-v{ver}.xlsx"))
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## Warning in get_sig_genes(this_table, lfc = lfc, p = p, z = z, n = n, column =
## this_fc_column, : The column deseq_basemean does not appears to be in the
## table, cannot filter by expression.
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 269 139 317 139 314 138
## z22nosb_vs_uninf 103 4 110 0 115 0
## z23nosb_vs_z22nosb 221 154 247 174 238 178
## z23sb_vs_z22sb 211 105 210 86 211 84
## z23sb_vs_z23nosb 305 482 306 566 303 570
## z22sb_vs_z22nosb 330 545 301 572 288 598
## z23sb_vs_sb 250 130 278 140 274 140
## z22sb_vs_sb 2 0 9 0 13 0
## z23sb_vs_uninf 499 603 491 605 482 618
## z22sb_vs_uninf 310 479 318 501 303 513
## sb_vs_uninf 291 459 294 495 291 498
## extra_z2322 243 135 329 63 0 0
## extra_drugnodrug 243 135 329 63 0 0
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 87 64 0 0
## z22nosb_vs_uninf 41 0 0 0
## z23nosb_vs_z22nosb 207 140 214 169
## z23sb_vs_z22sb 80 37 99 35
## z23sb_vs_z23nosb 212 529 293 465
## z22sb_vs_z22nosb 276 519 326 540
## z23sb_vs_sb 21 28 0 0
## z22sb_vs_sb 5 0 0 0
## z23sb_vs_uninf 177 550 295 401
## z22sb_vs_uninf 235 393 216 321
## sb_vs_uninf 191 352 0 0
## extra_z2322 0 0 0 0
## extra_drugnodrug 0 0 0 0

hs_macr_lesssig <- extract_significant_genes(
hs_macr_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/hs_macr_drug_zymo_sig_lfc0.6-v{ver}.xlsx"))
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no deseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no ebseq_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## There is no basic_logfc column in the table.
## The columns are: ensembl_gene_id, ensembl_transcript_id, version, transcript_version, description, gene_biotype, cds_length, chromosome_name, strand, start_position, end_position, hgnc_symbol, transcript, dream_logfc, dream_adjp, edger_logfc, edger_adjp, limma_logfc, limma_adjp, dream_ave, dream_t, dream_p, dream_b, edger_logcpm, edger_lr, edger_p, limma_ave, limma_t, limma_p, limma_b, limma_adjp_fdr, dream_adjp_fdr, edger_adjp_fdr, lfc_meta, lfc_var, lfc_varbymed, p_meta, p_var
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 701 587 856 594 865 587
## z22nosb_vs_uninf 378 128 464 21 505 22
## z23nosb_vs_z22nosb 867 786 746 964 728 981
## z23sb_vs_z22sb 670 587 649 645 655 641
## z23sb_vs_z23nosb 1237 1395 1297 1536 1279 1545
## z22sb_vs_z22nosb 1492 1542 1287 1692 1211 1747
## z23sb_vs_sb 622 643 761 610 772 614
## z22sb_vs_sb 2 0 33 0 34 0
## z23sb_vs_uninf 1516 1656 1595 1671 1557 1700
## z22sb_vs_uninf 1127 1297 1246 1293 1222 1340
## sb_vs_uninf 1037 1148 1055 1262 1042 1291
## extra_z2322 381 288 482 210 0 0
## extra_drugnodrug 381 288 482 210 0 0
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 141 196 0 0
## z22nosb_vs_uninf 186 4 0 0
## z23nosb_vs_z22nosb 463 602 593 580
## z23sb_vs_z22sb 144 237 193 144
## z23sb_vs_z23nosb 774 1166 1016 1047
## z22sb_vs_z22nosb 1044 1349 1228 1264
## z23sb_vs_sb 39 115 0 0
## z22sb_vs_sb 30 0 0 0
## z23sb_vs_uninf 458 1259 665 812
## z22sb_vs_uninf 746 907 535 686
## sb_vs_uninf 495 788 0 0
## extra_z2322 0 0 0 0
## extra_drugnodrug 0 0 0 0
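To make the two cutoffs concrete: assuming the LFC is on the log2 scale (as is conventional for these methods), a cutoff of 0.6 corresponds to roughly a 1.5-fold change, versus 2-fold for the default of 1. The following is a minimal base-R sketch of this kind of filter on an invented table. The `toy_de` data and the `split_significant()` helper are hypothetical, the `deseq_adjp` column name is assumed by analogy with the `*_adjp` columns listed above, and this is not the implementation of extract_significant_genes().

```r
## Toy DE table; the values are invented purely for illustration.
toy_de <- data.frame(
  deseq_logfc = c(1.4, 0.7, -0.65, 0.3, -2.1),
  deseq_adjp = c(0.001, 0.03, 0.04, 0.01, 0.2),
  row.names = paste0("gene", 1:5))

## Hypothetical helper: split a table into up/down gene sets given a
## |log2FC| cutoff and an adjusted p-value cutoff.
split_significant <- function(tbl, lfc = 1.0, adjp = 0.05) {
  passes_p <- tbl[["deseq_adjp"]] <= adjp
  list(
    ups = rownames(tbl)[passes_p & tbl[["deseq_logfc"]] >= lfc],
    downs = rownames(tbl)[passes_p & tbl[["deseq_logfc"]] <= -lfc])
}

strict <- split_significant(toy_de, lfc = 1.0)
relaxed <- split_significant(toy_de, lfc = 0.6)
sapply(strict, length)
sapply(relaxed, length)
## Fold change corresponding to a log2FC of 0.6 (about 1.5).
2^0.6
```

Relaxing the LFC cutoff can only add genes to each set, which is consistent with the larger counts in the 0.6 table compared to the default table.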

gene group upset
2.3 vs 2.2 up and down vs. uninfected
This is my version of the Venn diagram which includes the text:
“Differentially expressed genes in macrophages infected with
subpopulations 2.2 or 2.3. Volcano plots contrast of: A. Venn diagram
for upregulated and downregulated genes by infection with 2.3 and 2.2
strains. B. infected cells with 2.3 strains and uninfected cells; C.
infected cells with 2.2 strains and uninfected cells; D. infected cells
with 2.3 strains and infected cells with 2.2 strains”
The following upset plot is currently Figure 2E.
nodrug_upset <- upsetr_combined_de(hs_macr_table,
desired_contrasts = c("z22nosb_vs_uninf", "z23nosb_vs_uninf"))
pp(file = "images/nodrug_upset.svg")
nodrug_upset[["plot"]]
dev.off()
## png
## 2
## Plot describing unique/shared genes in a differential expression table.

A point of interest while Olga visits UMD
Najib and Olga asked about pulling the 9 gene IDs which are in the
peculiar situation of increased expression in z2.2/uninf and decreased
expression in z2.3/uninf. In the previous upset plot, these are visible
in the 6th bar. I can access these via the attr() function, which I
admit I can never remember how to use, so I am reusing the code under
the ‘Compare (no)Sb z2.3/z2.2 treatments among macrophages’ heading to
extract these genes.
all_groups <- nodrug_upset[["groups"]]
wanted_group <- "z23nosb_vs_uninf_down:z22nosb_vs_uninf_up"
gene_idx <- all_groups[[wanted_group]]
wanted_genes <- attr(all_groups, "elements")[gene_idx]
wanted_genes
## [1] "ENSG00000004846" "ENSG00000111783" "ENSG00000118298" "ENSG00000120738"
## [5] "ENSG00000126217" "ENSG00000163687" "ENSG00000170345" "ENSG00000244242"
## [9] "ENSG00000277481"
gene_symbol_idx <- rownames(rowData(hs_macr)) %in% as.character(wanted_genes)
rowData(hs_macr)[gene_symbol_idx, "hgnc_symbol"]
## [1] "ABCB5" "RFX4" "CA14" "EGR1" "MCF2L" "DNASE1L3" "FOS"
## [8] "IFITM10" "PKD1L3"
- ABCB5: ATP Binding Cassette Subfamily B Member #5, wide range of
functions in this diverse paralogous family. Associated with skin
diseases (melanoma and Epidermolysis Bullosa); participates in
ATP-dependent transmembrane transport.
- RFX4: Regulatory Factor X #4: transcription factor.
- CA14: Carbonic anhydrase #14: Zinc metalloenzyme which catalyzes the
reversible hydration of CO2. This gene looks pretty neat, but not really
relevant to anything we are likely to care about.
- EGR1: Early Growth Response Protein #1: Another Tx factor
(zinc-finger) – important for cell survival/proliferation/cell death.
Presumably important for healing?
- MCF2L: MCF.2 Cell Line Derived Transforming Sequence Like: guanine
nucleotide exchange factor interacting with GTP-bound Rac1. Apparently
associated with osteoarthritis; potentially relevant to regulation of
RHOA and CDC42 signalling.
- DNASE1L3: Deoxyribonuclease I family member: not inhibited by actin,
breaks down DNA during apoptosis. Important during necrosis.
- FOS: Proto-Oncogene, AP-1 Transcription Factor: leucine zipper
dimerizes with JUN family proteins, forming tx factor complex AP-1.
Important for cell proliferation, differentiation, and
transformation.
- IFITM10: Interferon-Induced Transmembrane Protein #10
- PKD1L3: Polycystin 1 Like #3, Transient Receptor Potential Channel
Interacting: 11 transmembrane domain protein which might help create
cation channels.
As some comparison points, the Venn in the current figure has:
- 387 up z2.3
- 259 up z2.2
- 83 shared up z2.3 and z2.2
- 247 down z2.3
- 3 down z2.2
- 3 shared down z2.3 and z2.2
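As a reminder of how the attr() lookup used earlier in this section works, here is a small self-contained sketch. It assumes, based only on how the code above indexes it, that the ‘groups’ element is a named list of integer index vectors with the gene IDs stored once in an "elements" attribute. The `toy_groups` object and its gene names are invented.

```r
## Toy stand-in for the 'groups' element: a named list of index vectors,
## with the gene IDs stored once in an "elements" attribute.
toy_groups <- list(
  "a_up" = c(1L, 4L),
  "a_down:b_up" = c(2L, 3L),
  "b_up" = 5L)
attr(toy_groups, "elements") <- c("geneA", "geneB", "geneC", "geneD", "geneE")

## The group names show which combinations are available.
names(toy_groups)

## Pull the indices for one group, then map them back to gene IDs.
wanted_idx <- toy_groups[["a_down:b_up"]]
toy_wanted <- attr(toy_groups, "elements")[wanted_idx]
toy_wanted
```

Indexing by group name rather than by position avoids having to count bars in the plot.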
2.2 and 2.3 with SbV vs 2.2 and 2.3 without SbV
This is my version of the Venn with the text:
“Differentially expressed genes in macrophages infected with
subpopulations 2.2 or 2.3, in presence of SbV. Volcano plots contrast
of: A. infected cells with 2.3 strains + SbV and infected cells with 2.3
strains; B. infected cells with 2.2 strains + SbV and infected cells
with 2.2 strains; C. infected cells with 2.3 strains + SbV and infected
cells with 2.2 strains + SbV. D. Venn diagram for upregulated and
downregulated genes by infection with 2.3+SbV and 2.2+SbV strains.”
A query from Olga (20240801): Please include in the upset in figure 3
the contrast of uninfected cells + SbV vs uninfected without SbV.
## I keep mis-interpreting this text, it is z2.3/z2.3SbV and z2.2/z2.2SbV
drugnodrug_upset <- upsetr_combined_de(hs_macr_table,
desired_contrasts = c("z23sb_vs_z23nosb", "z22sb_vs_z22nosb"))
pp(file = "images/drugnodrug_upset.pdf")
drugnodrug_upset[["plot"]]
dev.off()
## png
## 2
## Plot describing unique/shared genes in a differential expression table.

drugnodrug_uninf_contrasts <- c("z23sb_vs_z23nosb", "z22sb_vs_z22nosb", "sb_vs_uninf")
drugnodrug_upset_with_uninf <- upsetr_combined_de(hs_macr_table,
desired_contrasts = drugnodrug_uninf_contrasts)
pp(file = "figures/drugnodrug_with_uninf_upset.svg")
drugnodrug_upset_with_uninf[["plot"]]
dev.off()
## png
## 2
drugnodrug_upset_with_uninf
## Plot describing unique/shared genes in a differential expression table.

For some comparison points, the Venn image has:
- 222 up z2.3 SbV
- 134 up z2.2 SbV
- 182 down z2.3 SbV
- 396 down z2.2 SbV
- 605 shared down z2.2 and z2.3 SbV
- 34 shared down z2.2 SbV and up z2.3 SbV
- 363 shared up z2.2 SbV and z2.3 SbV
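When comparing region counts like these against upset bars, it can help to see how exclusive groups are derived from the underlying gene sets. The following base-R sketch labels each gene by the exact combination of sets that contain it, using the same colon-joined naming seen in the group names earlier. The gene sets are invented, and this is only an illustration of the idea, not the code inside upsetr_combined_de().

```r
## Hypothetical significant gene sets for two contrasts, split by direction.
sig_sets <- list(
  "z23_up" = c("g1", "g2", "g3", "g4"),
  "z22_up" = c("g3", "g4", "g5"),
  "z23_down" = c("g6", "g7"),
  "z22_down" = c("g7", "g8", "g1"))

## Build a gene-by-set membership matrix.
all_genes <- sort(unique(unlist(sig_sets)))
membership <- sapply(sig_sets, function(s) all_genes %in% s)
rownames(membership) <- all_genes

## Label each gene by the exact combination of sets containing it;
## each gene therefore lands in exactly one (exclusive) group.
combo <- apply(membership, 1, function(m) {
  paste(names(sig_sets)[m], collapse = ":")
})
table(combo)
```

In this toy example, g1 is up in one contrast and down in the other, so it is counted only in the discordant group and not in the ‘z23_up’ only group.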
Compare z2.2SbV vs SbV and z2.3SbV vs SbV
drug_upset <- upsetr_combined_de(hs_macr_table,
desired_contrasts = c("z22sb_vs_sb", "z23sb_vs_sb"))
pp(file = "images/drug_upset.pdf")
drug_upset[["plot"]]
dev.off()
## png
## 2
## Plot describing unique/shared genes in a differential expression table.

Significance barplot of interest
Olga kindly sent a set of particularly interesting contrasts and
colors for a significance barplot. They include the following:
- z2.3 vs. uninfected.
- z2.2 vs. uninfected.
- z2.3 vs z2.2
- z2.3Sbv vs z2.3
- z2.2Sbv vs z2.2
- z2.3Sbv vs z2.2Sbv
- Sbv vs uninfected.
The set of ‘keepers’ restricted to these contrasts is taken from the
extant set of ‘tmrc2_human_keepers’ and is as follows:
barplot_keepers <- list(
## z2.3 vs uninfected
"z23nosb_vs_uninf" = c("inf_z23", "uninf_none"),
## z2.2 vs uninfected
"z22nosb_vs_uninf" = c("inf_z22", "uninf_none"),
## z2.3 vs z2.2
"z23nosb_vs_z22nosb" = c("inf_z23", "inf_z22"),
## z2.3Sbv vs z2.3
"z23sb_vs_z23nosb" = c("inf_sb_z23", "inf_z23"),
## z2.2Sbv vs z2.2
"z22sb_vs_z22nosb" = c("inf_sb_z22", "inf_z22"),
## z2.3Sbv vs z2.2Sbv
"z23sb_vs_z22sb" = c("inf_sb_z23", "inf_sb_z22"),
## Sbv vs uninfected.
"sb_vs_uninf" = c("uninf_sb_none", "uninf_none"))
barplot_combined <- combine_de_tables(
hs_macr_de, keepers = barplot_keepers,
excel = glue("analyses/macrophage_de/de_tables/hs_macr_drug_zymo_7contrasts-v{ver}.xlsx"))
Now let us use the colors suggested by Olga to make a barplot of
these…
color_list <- c( "#de8bf9", "#ad07e3","#410257", "#ffa0a0", "#f94040", "#a00000")
barplot_sig <- extract_significant_genes(
barplot_combined, color_list = color_list, according_to = "deseq",
excel = glue("analyses/macrophage_de/sig_tables/hs_macr_drug_zymo_7contrasts_sig-v{ver}.xlsx"))
barplot_sig
## A set of genes deemed significant according to deseq.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## deseq_up deseq_down
## z23nosb_vs_uninf 478 265
## z22nosb_vs_uninf 359 6
## z23nosb_vs_z22nosb 349 539
## z23sb_vs_z23nosb 619 828
## z22sb_vs_z22nosb 505 1040
## z23sb_vs_z22sb 343 252
## sb_vs_uninf 561 748
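For a quick look at these numbers outside of the Excel output, the counts printed above can be drawn directly with base R. This is a minimal sketch and not the barplot produced by extract_significant_genes(); the two fill colors are picked arbitrarily from the list above. Note also that color_list has six entries while there are seven contrasts, and how that is handled is not shown here.

```r
## Up/down counts copied from the DESeq2 summary printed above.
sig_counts <- rbind(
  up = c(478, 359, 349, 619, 505, 343, 561),
  down = c(265, 6, 539, 828, 1040, 252, 748))
colnames(sig_counts) <- c(
  "z23nosb_vs_uninf", "z22nosb_vs_uninf", "z23nosb_vs_z22nosb",
  "z23sb_vs_z23nosb", "z22sb_vs_z22nosb", "z23sb_vs_z22sb", "sb_vs_uninf")

## pdf(NULL) discards the drawing; swap in a real device to keep the plot.
pdf(NULL)
bar_mids <- barplot(
  sig_counts, beside = TRUE, las = 2, legend.text = TRUE,
  col = c("#f94040", "#ad07e3"), ylab = "Significant genes (DESeq2)")
dev.off()

## Total significant genes per contrast.
colSums(sig_counts)
```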

PROPER
In our last meeting there were some questions about the statistical
power of different future experimental designs. One thing I can do is to
use PROPER to estimate the power of an extant dataset and infer from
that the likely power of other designs.
In order to use PROPER, one must feed it one or more DE tables.
power_estimate <- simple_proper(hs_single_table)
## Error in if (all_coverage < cutoff) {: missing value where TRUE/FALSE needed
power_estimate[[1]][["power_plot"]]
## Error: object 'power_estimate' not found
power_estimate[[1]][["powertd_plot"]]
## Error: object 'power_estimate' not found
power_estimate[[1]][["powerfd_plot"]]
## Error: object 'power_estimate' not found
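Since the simple_proper() call fails here, no power curves are available from this run. As a much cruder stand-in for intuition only, base R's power.t.test() shows how power for a single two-group comparison grows with the number of replicates. This is explicitly not PROPER (which simulates count data), and the effect size and standard deviation below are invented.

```r
## Power of a two-sample t-test for one hypothetical gene, assuming a
## true difference of 1 (log2 scale) and a standard deviation of 0.5.
replicates <- c(2, 3, 4, 6, 8)
toy_power <- sapply(replicates, function(n) {
  power.t.test(n = n, delta = 1, sd = 0.5, sig.level = 0.05)[["power"]]
})
names(toy_power) <- paste0("n", replicates)
round(toy_power, 2)
```

A per-gene t-test ignores multiple testing and the mean-variance behavior of counts, so these values should only be read as a qualitative trend.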
Our main questions in U937
Let us do the same comparisons in the U937 samples, though I will not
do the extra contrasts, primarily because I think the dataset is less
likely to support them.
u937_de <- all_pairwise(u937_se, model_svs = "svaseq",
filter = TRUE, model_fstring = "~ 0 + condition")
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 3 3 3 3 1
## uninf_sb_none
## 1
## Running normalize_se.
## Removing 10730 low-count genes (10751 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 938 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 3 3 3 3 1
## uninf_sb_none
## 1
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 3 3 3 3 1
## uninf_sb_none
## 1
## conditions
## inf_sb_z22 inf_sb_z23 inf_z22 inf_z23 uninf_none
## 3 3 3 3 1
## uninf_sb_none
## 1
## Error in NOISeq::noiseqbio(norm_input, k = k, norm = norm, factor = condition_column, :
## ERROR: To run NOISeqBIO at least two replicates per condition are needed.
## Please, run NOISeq if there are not enough replicates in your experiment.
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 15 comparisons.
u937_table <- combine_de_tables(
u937_de, keepers = u937_keepers,
excel = glue("analyses/macrophage_de/de_tables/u937_drug_zymo_table-v{ver}.xlsx"))
u937_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup
## 1 uninf_none_vs_inf_z23-inverted 0 5 2
## 2 uninf_none_vs_inf_z22-inverted 0 0 0
## 3 inf_z23_vs_inf_z22 1 0 17
## 4 inf_sb_z23_vs_inf_sb_z22 0 0 0
## 5 inf_z23_vs_inf_sb_z23-inverted 256 171 311
## 6 inf_z22_vs_inf_sb_z22-inverted 298 154 305
## 7 uninf_sb_none_vs_inf_sb_z23-inverted 0 0 2
## 8 uninf_sb_none_vs_inf_sb_z22-inverted 0 0 2
## 9 uninf_none_vs_inf_sb_z23-inverted 296 151 306
## 10 uninf_none_vs_inf_sb_z22-inverted 294 169 300
## 11 uninf_sb_none_vs_uninf_none 239 119 261
## edger_sigdown limma_sigup limma_sigdown
## 1 5 0 3
## 2 5 0 3
## 3 6 3 3
## 4 1 0 2
## 5 176 221 192
## 6 149 220 190
## 7 0 0 0
## 8 5 1 3
## 9 155 233 181
## 10 175 227 210
## 11 127 192 154
## Plot describing unique/shared genes in a differential expression table.

combined_to_tsv(u937_table, celltype = "u937")
u937_sig <- extract_significant_genes(
u937_table,
excel = glue("analyses/macrophage_de/sig_tables/u937_drug_zymo_sig-v{ver}.xlsx"))
u937_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 0 3 2 5 0 5
## z22nosb_vs_uninf 0 3 0 5 0 0
## z23nosb_vs_z22nosb 3 3 17 6 1 0
## z23sb_vs_z22sb 0 2 0 1 0 0
## z23sb_vs_z23nosb 221 192 311 176 256 171
## z22sb_vs_z22nosb 220 190 305 149 298 154
## z23sb_vs_sb 0 0 2 0 0 0
## z22sb_vs_sb 1 3 2 5 0 0
## z23sb_vs_uninf 233 181 306 155 296 151
## z22sb_vs_uninf 227 210 300 175 294 169
## sb_vs_uninf 192 154 261 127 239 119
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 5 14 0 0
## z22nosb_vs_uninf 0 7 0 0
## z23nosb_vs_z22nosb 8 42 0 0
## z23sb_vs_z22sb 0 0 0 0
## z23sb_vs_z23nosb 328 179 0 0
## z22sb_vs_z22nosb 279 150 0 0
## z23sb_vs_sb 5 4 0 0
## z22sb_vs_sb 7 6 0 0
## z23sb_vs_uninf 267 122 0 0
## z22sb_vs_uninf 226 163 0 0
## sb_vs_uninf 152 175 0 0

u937_highsig <- extract_significant_genes(
u937_table, min_mean_exprs = high_expression, exprs_column = high_expression_column,
excel = glue("analyses/macrophage_de/sig_tables/u937_drug_zymo_highsig-v{ver}.xlsx"))
u937_highsig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 0 3 0 4 0 4
## z22nosb_vs_uninf 0 2 0 4 0 0
## z23nosb_vs_z22nosb 2 3 6 4 1 0
## z23sb_vs_z22sb 0 0 0 0 0 0
## z23sb_vs_z23nosb 149 125 174 116 160 120
## z22sb_vs_z22nosb 130 111 152 104 149 107
## z23sb_vs_sb 0 0 0 0 0 0
## z22sb_vs_sb 0 1 0 1 0 0
## z23sb_vs_uninf 145 99 155 97 154 96
## z22sb_vs_uninf 143 119 155 115 155 116
## sb_vs_uninf 126 91 137 89 136 89
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 2 4 0 0
## z22nosb_vs_uninf 0 2 0 0
## z23nosb_vs_z22nosb 0 25 0 0
## z23sb_vs_z22sb 0 0 0 0
## z23sb_vs_z23nosb 182 119 0 0
## z22sb_vs_z22nosb 139 103 0 0
## z23sb_vs_sb 0 0 0 0
## z22sb_vs_sb 0 0 0 0
## z23sb_vs_uninf 161 83 0 0
## z22sb_vs_uninf 136 112 0 0
## sb_vs_uninf 89 139 0 0

u937_lesssig <- extract_significant_genes(
u937_table, lfc = 0.6,
excel = glue("analyses/macrophage_de/sig_tables/u937_drug_zymo_lesssig-v{ver}.xlsx"))
u937_lesssig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 0.6 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down
## z23nosb_vs_uninf 1 5 6 13 2 8
## z22nosb_vs_uninf 0 7 0 15 0 2
## z23nosb_vs_z22nosb 17 10 40 23 2 2
## z23sb_vs_z22sb 1 3 7 3 1 0
## z23sb_vs_z23nosb 478 433 627 416 499 421
## z22sb_vs_z22nosb 568 506 739 451 678 467
## z23sb_vs_sb 0 1 4 3 0 2
## z22sb_vs_sb 1 16 7 26 2 11
## z23sb_vs_uninf 487 472 600 439 566 409
## z22sb_vs_uninf 517 568 641 522 596 500
## sb_vs_uninf 430 400 535 373 466 345
## ebseq_up ebseq_down basic_up basic_down
## z23nosb_vs_uninf 14 113 0 0
## z22nosb_vs_uninf 0 27 0 0
## z23nosb_vs_z22nosb 78 427 0 0
## z23sb_vs_z22sb 5 1 0 0
## z23sb_vs_z23nosb 765 656 0 0
## z22sb_vs_z22nosb 582 442 0 0
## z23sb_vs_sb 20 10 0 0
## z22sb_vs_sb 26 18 0 0
## z23sb_vs_uninf 488 332 0 0
## z22sb_vs_uninf 406 488 0 0
## sb_vs_uninf 160 209 0 0
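The NOISeqBIO error in the all_pairwise() output above states that at least two replicates per condition are needed, and the U937 condition table shows a single sample for each uninfected condition. A quick check along the following lines flags such conditions before running the contrasts. The condition vector is rebuilt by hand from the counts printed above; in practice it would come from the sample metadata.

```r
## Conditions rebuilt from the U937 sample counts printed above.
toy_conditions <- c(
  rep("inf_sb_z22", 3), rep("inf_sb_z23", 3),
  rep("inf_z22", 3), rep("inf_z23", 3),
  "uninf_none", "uninf_sb_none")

## Count replicates and list the conditions with fewer than two.
replicate_counts <- table(toy_conditions)
singletons <- names(replicate_counts)[replicate_counts < 2]
singletons
```

Any contrast involving these conditions (sb_vs_uninf compares one sample against one) rests on unreplicated groups, so the counts reported for them deserve some caution.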

Compare (no)Sb z2.3/z2.2 treatments among macrophages
In the following block, I will jump back to the macrophage samples
and look for genes which are shared/unique when comparing z2.3/z2.2 for
the drug treated samples and the untreated samples.
upset_plots_hs_macr <- upsetr_sig(
hs_macr_sig, both = TRUE,
contrasts = c("z23sb_vs_z22sb", "z23nosb_vs_z22nosb"))
upset_plots_hs_macr[["both"]]
## [1] TRUE
groups <- upset_plots_hs_macr[["both_groups"]]
shared_genes <- attr(groups, "elements")[groups[[2]]] %>%
gsub(pattern = "^gene:", replacement = "")
length(shared_genes)
## [1] 387
shared_gp <- simple_gprofiler(shared_genes)
shared_gp[["pvalue_plots"]][["MF"]]
## NULL
shared_gp[["pvalue_plots"]][["BP"]]
## NULL
shared_gp[["pvalue_plots"]][["REAC"]]

drug_genes <- attr(groups, "elements")[groups[["z23sb_vs_z22sb"]]] %>%
gsub(pattern = "^gene:", replacement = "")
drugonly_gp <- simple_gprofiler(drug_genes)
drugonly_gp[["pvalue_plots"]][["BP"]]
## NULL
I want to try something: directly including the U937 data in this
comparison. In the following block I repeat the analysis, comparing the
macrophage and U937 samples using the same logic.
both_sig <- hs_macr_sig
names(both_sig[["deseq"]][["ups"]]) <- paste0("macr_", names(both_sig[["deseq"]][["ups"]]))
names(both_sig[["deseq"]][["downs"]]) <- paste0("macr_", names(both_sig[["deseq"]][["downs"]]))
u937_deseq <- u937_sig[["deseq"]]
names(u937_deseq[["ups"]]) <- paste0("u937_", names(u937_deseq[["ups"]]))
names(u937_deseq[["downs"]]) <- paste0("u937_", names(u937_deseq[["downs"]]))
both_sig[["deseq"]][["ups"]] <- c(both_sig[["deseq"]][["ups"]], u937_deseq[["ups"]])
both_sig[["deseq"]][["downs"]] <- c(both_sig[["deseq"]][["downs"]], u937_deseq[["downs"]])
summary(both_sig[["deseq"]][["ups"]])
## Length Class Mode
## macr_z23nosb_vs_uninf 73 DFrame S4
## macr_z22nosb_vs_uninf 73 DFrame S4
## macr_z23nosb_vs_z22nosb 73 DFrame S4
## macr_z23sb_vs_z22sb 73 DFrame S4
## macr_z23sb_vs_z23nosb 73 DFrame S4
## macr_z22sb_vs_z22nosb 73 DFrame S4
## macr_z23sb_vs_sb 73 DFrame S4
## macr_z22sb_vs_sb 73 DFrame S4
## macr_z23sb_vs_uninf 73 DFrame S4
## macr_z22sb_vs_uninf 73 DFrame S4
## macr_sb_vs_uninf 73 DFrame S4
## macr_extra_z2322 0 data.frame list
## macr_extra_drugnodrug 0 data.frame list
## u937_z23nosb_vs_uninf 64 DFrame S4
## u937_z22nosb_vs_uninf 64 DFrame S4
## u937_z23nosb_vs_z22nosb 64 DFrame S4
## u937_z23sb_vs_z22sb 64 DFrame S4
## u937_z23sb_vs_z23nosb 64 DFrame S4
## u937_z22sb_vs_z22nosb 64 DFrame S4
## u937_z23sb_vs_sb 64 DFrame S4
## u937_z22sb_vs_sb 64 DFrame S4
## u937_z23sb_vs_uninf 64 DFrame S4
## u937_z22sb_vs_uninf 64 DFrame S4
## u937_sb_vs_uninf 64 DFrame S4
upset_plots_both <- upsetr_sig(
both_sig, both = TRUE,
contrasts = c("macr_z23sb_vs_z22sb", "macr_z23nosb_vs_z22nosb",
"u937_z23sb_vs_z22sb", "u937_z23nosb_vs_z22nosb"))
upset_plots_both[["both"]]
## [1] TRUE
Compare DE results from macrophages and U937 samples
Looking a bit more closely at these, I think the u937 data is too
sparse to effectively compare.
macr_u937_comparison <- compare_de_results(hs_macr_table, u937_table)



macr_u937_comparison[["lfc_heat"]]

macr_u937_venns <- compare_significant_contrasts(hs_macr_sig, second_sig_tables = u937_sig,
contrasts = "z23sb_vs_z23nosb")


macr_u937_venns[["up_plot"]]

macr_u937_venns[["down_plot"]]

macr_u937_venns_v2 <- compare_significant_contrasts(
hs_macr_sig, second_sig_tables = u937_sig, contrasts = "z22sb_vs_z22nosb")


macr_u937_venns_v2[["up_plot"]]

macr_u937_venns_v2[["down_plot"]]

macr_u937_venns_v3 <- compare_significant_contrasts(
hs_macr_sig, second_sig_tables = u937_sig, contrasts = "sb_vs_uninf")


macr_u937_venns_v3[["up_plot"]]

macr_u937_venns_v3[["down_plot"]]

Compare macrophage/u937 with respect to z2.3/z2.2
comparison_df <- merge(hs_macr_table[["data"]][["z23sb_vs_z22sb"]],
u937_table[["data"]][["z23sb_vs_z22sb"]],
by = "row.names")
macru937_z23z22_plot <- plot_linear_scatter(comparison_df[, c("deseq_logfc.x", "deseq_logfc.y")])
macru937_z23z22_plot[["scatter"]]

comparison_df <- merge(hs_macr_table[["data"]][["z23nosb_vs_z22nosb"]],
u937_table[["data"]][["z23nosb_vs_z22nosb"]],
by = "row.names")
macru937_z23z22_plot <- plot_linear_scatter(comparison_df[, c("deseq_logfc.x", "deseq_logfc.y")])
macru937_z23z22_plot[["scatter"]]
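The merge() calls above rely on two base-R behaviors: by = "row.names" joins on gene IDs and keeps only the genes present in both tables, and identically named columns receive .x and .y suffixes. A minimal sketch with invented values shows where the deseq_logfc.x and deseq_logfc.y columns come from; the toy tables are hypothetical.

```r
## Two toy tables sharing a column name, with partially overlapping genes.
macr_toy <- data.frame(
  deseq_logfc = c(1.2, -0.5, 2.0, 0.1),
  row.names = c("g1", "g2", "g3", "g4"))
u937_toy <- data.frame(
  deseq_logfc = c(0.9, -0.2, 0.3),
  row.names = c("g1", "g2", "g4"))

## Inner join on the row names: g3 is dropped because it is absent from
## the second table, and the shared column is suffixed .x (first) and .y (second).
toy_merged <- merge(macr_toy, u937_toy, by = "row.names")
colnames(toy_merged)

## A simple agreement measure between the two sets of fold changes.
toy_cor <- cor(toy_merged[["deseq_logfc.x"]], toy_merged[["deseq_logfc.y"]])
toy_cor
```

Here .x always refers to the first argument (the macrophage table in the code above) and .y to the second (U937).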

Add donor to the contrasts, no sva
In the following block, I will change the sample condition to include
the donor.
no_power_fact <- paste0(colData(hs_macr)[["donor"]], "_",
colData(hs_macr)[["condition"]])
table(colData(hs_macr)[["donor"]])
##
## d01 d02 d09 d81
## 13 14 13 14
## no_power_fact
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23
## 3 3 2 3
## d01_uninf_none d01_uninf_sb_none d02_inf_sb_z22 d02_inf_sb_z23
## 1 1 3 3
## d02_inf_z22 d02_inf_z23 d02_uninf_none d02_uninf_sb_none
## 3 3 1 1
## d09_inf_sb_z22 d09_inf_sb_z23 d09_inf_z22 d09_inf_z23
## 3 2 3 3
## d09_uninf_none d09_uninf_sb_none d81_inf_sb_z22 d81_inf_sb_z23
## 1 1 3 3
## d81_inf_z22 d81_inf_z23 d81_uninf_none d81_uninf_sb_none
## 3 3 1 1
hs_nopower <- set_conditions(hs_macr, fact = no_power_fact)
## The numbers of samples by condition are:
##
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23
## 3 3 2 3
## d01_uninf_none d01_uninf_sb_none d02_inf_sb_z22 d02_inf_sb_z23
## 1 1 3 3
## d02_inf_z22 d02_inf_z23 d02_uninf_none d02_uninf_sb_none
## 3 3 1 1
## d09_inf_sb_z22 d09_inf_sb_z23 d09_inf_z22 d09_inf_z23
## 3 2 3 3
## d09_uninf_none d09_uninf_sb_none d81_inf_sb_z22 d81_inf_sb_z23
## 1 1 3 3
## d81_inf_z22 d81_inf_z23 d81_uninf_none d81_uninf_sb_none
## 3 3 1 1
hs_nopower <- subset_se(hs_nopower, subset = "macrophagezymodeme!='none'")
hs_nopower_nosva_de <- all_pairwise(hs_nopower, model_svs = FALSE, filter = TRUE)
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## z2.2 z2.3
## 23 23
## Warning: attributes are not identical across measure variables; they will be
## dropped
## Running normalize_se.
## Removing 9761 low-count genes (11720 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 32467 entries to zero.
## converting counts to integer mode
## Error in checkFullRank(modelMatrix) :
## the model matrix is not full rank, so the model cannot be fit as specified.
## One or more variables or interaction terms in the design formula are linear
## combinations of the others and must be removed.
##
## Please read the vignette section 'Model matrix not full rank':
##
## vignette('DESeq2')
## Coefficients not estimable: batchz23
## Warning: Partial NA coefficients for 11720 probe(s)
## Error in variancePartition::dream(exprObj = voom_result, formula = model_fstring, :
## Design matrix is singular, covariates are very correlated
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
## Design matrix not of full rank. The following coefficients not estimable:
## batchz23
## Warning in edger_pairwise(...): estimateGLMCommonDisp() failed. Trying again
## with estimateDisp().
## Warning in edger_pairwise(...): There was a failure when doing the estimations.
## There was a failure when doing the estimations, using estimateDisp().
## Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05, :
## Design matrix not of full rank. The following coefficients not estimable:
## batchz23
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## Coefficients not estimable: batchz23
## Warning: Partial NA coefficients for 11720 probe(s)
## Coefficients not estimable: batchz23
## Warning: Partial NA coefficients for 11720 probe(s)
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
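The ‘not full rank’ and ‘Coefficients not estimable: batchz23’ messages above are what one would expect if the model still contains a batch term for the zymodeme while the new condition names already encode it. Assuming that is the case here (the batch levels printed above are z2.2 and z2.3), the problem can be reproduced on a toy design with base R. The metadata below is invented.

```r
## Toy metadata: the zymodeme is already part of the condition name,
## so a separate zymodeme batch factor is fully determined by condition.
toy_meta <- data.frame(
  condition = factor(rep(
    c("d01_inf_z22", "d01_inf_z23", "d02_inf_z22", "d02_inf_z23"), each = 2)),
  batch = factor(rep(c("z22", "z23", "z22", "z23"), each = 2)))

## With the batch term there are 5 columns but only 4 independent ones.
with_batch <- model.matrix(~ 0 + condition + batch, data = toy_meta)
c(columns = ncol(with_batch), rank = qr(with_batch)[["rank"]])

## Dropping the redundant term gives a full-rank design.
without_batch <- model.matrix(~ 0 + condition, data = toy_meta)
c(columns = ncol(without_batch), rank = qr(without_batch)[["rank"]])
```

If this is indeed the cause, either removing the batch term from the model or resetting the batch factor to something not nested in condition would be the things to try.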

nopower_keepers <- list(
"d01_zymo" = c("d01infz23", "d01infz22"),
"d01_sbzymo" = c("d01infsbz23", "d01infsbz22"),
"d02_zymo" = c("d02infz23", "d02infz22"),
"d02_sbzymo" = c("d02infsbz23", "d02infsbz22"),
"d09_zymo" = c("d09infz23", "d09infz22"),
"d09_sbzymo" = c("d09infsbz23", "d09infsbz22"),
"d81_zymo" = c("d81infz23", "d81infz22"),
"d81_sbzymo" = c("d81infsbz23", "d81infsbz22"))
hs_nopower_nosva_table <- combine_de_tables(
hs_nopower_nosva_de, keepers = nopower_keepers,
excel = glue("analyses/macrophage_de/de_tables/hs_nopower_table-v{ver}.xlsx"))
## The keepers has no elements in the coefficients.
## Here are the keepers: d01infz23, d01infz22, d01infsbz23, d01infsbz22, d02infz23, d02infz22, d02infsbz23, d02infsbz22, d09infz23, d09infz22, d09infsbz23, d09infsbz22, d81infz23, d81infz22, d81infsbz23, d81infsbz22
## Here are the coefficients: d81_inf_z23, d81_inf_z22, d81_inf_z23, d81_inf_sb_z23, d81_inf_z22, d81_inf_sb_z23, d81_inf_z23, d81_inf_sb_z22, d81_inf_z22, d81_inf_sb_z22, d81_inf_sb_z23, d81_inf_sb_z22, d81_inf_z23, d09_inf_z23, d81_inf_z22, d09_inf_z23, d81_inf_sb_z23, d09_inf_z23, d81_inf_sb_z22, d09_inf_z23, d81_inf_z23, d09_inf_z22, d81_inf_z22, d09_inf_z22, d81_inf_sb_z23, d09_inf_z22, d81_inf_sb_z22, d09_inf_z22, d09_inf_z23, d09_inf_z22, d81_inf_z23, d09_inf_sb_z23, d81_inf_z22, d09_inf_sb_z23, d81_inf_sb_z23, d09_inf_sb_z23, d81_inf_sb_z22, d09_inf_sb_z23, d09_inf_z23, d09_inf_sb_z23, d09_inf_z22, d09_inf_sb_z23, d81_inf_z23, d09_inf_sb_z22, d81_inf_z22, d09_inf_sb_z22, d81_inf_sb_z23, d09_inf_sb_z22, d81_inf_sb_z22, d09_inf_sb_z22, d09_inf_z23, d09_inf_sb_z22, d09_inf_z22, d09_inf_sb_z22, d09_inf_sb_z23, d09_inf_sb_z22, d81_inf_z23, d02_inf_z23, d81_inf_z22, d02_inf_z23, d81_inf_sb_z23, d02_inf_z23, d81_inf_sb_z22, d02_inf_z23, d09_inf_z23, d02_inf_z23, d09_inf_z22, d02_inf_z23, d09_inf_sb_z23, d02_inf_z23, d09_inf_sb_z22, d02_inf_z23, d81_inf_z23, d02_inf_z22, d81_inf_z22, d02_inf_z22, d81_inf_sb_z23, d02_inf_z22, d81_inf_sb_z22, d02_inf_z22, d09_inf_z23, d02_inf_z22, d09_inf_z22, d02_inf_z22, d09_inf_sb_z23, d02_inf_z22, d09_inf_sb_z22, d02_inf_z22, d02_inf_z23, d02_inf_z22, d81_inf_z23, d02_inf_sb_z23, d81_inf_z22, d02_inf_sb_z23, d81_inf_sb_z23, d02_inf_sb_z23, d81_inf_sb_z22, d02_inf_sb_z23, d09_inf_z23, d02_inf_sb_z23, d09_inf_z22, d02_inf_sb_z23, d09_inf_sb_z23, d02_inf_sb_z23, d09_inf_sb_z22, d02_inf_sb_z23, d02_inf_z23, d02_inf_sb_z23, d02_inf_z22, d02_inf_sb_z23, d81_inf_z23, d02_inf_sb_z22, d81_inf_z22, d02_inf_sb_z22, d81_inf_sb_z23, d02_inf_sb_z22, d81_inf_sb_z22, d02_inf_sb_z22, d09_inf_z23, d02_inf_sb_z22, d09_inf_z22, d02_inf_sb_z22, d09_inf_sb_z23, d02_inf_sb_z22, d09_inf_sb_z22, d02_inf_sb_z22, d02_inf_z23, d02_inf_sb_z22, d02_inf_z22, d02_inf_sb_z22, d02_inf_sb_z23, d02_inf_sb_z22, d81_inf_z23, d01_inf_z23, d81_inf_z22, d01_inf_z23, 
d81_inf_sb_z23, d01_inf_z23, d81_inf_sb_z22, d01_inf_z23, d09_inf_z23, d01_inf_z23, d09_inf_z22, d01_inf_z23, d09_inf_sb_z23, d01_inf_z23, d09_inf_sb_z22, d01_inf_z23, d02_inf_z23, d01_inf_z23, d02_inf_z22, d01_inf_z23, d02_inf_sb_z23, d01_inf_z23, d02_inf_sb_z22, d01_inf_z23, d81_inf_z23, d01_inf_z22, d81_inf_z22, d01_inf_z22, d81_inf_sb_z23, d01_inf_z22, d81_inf_sb_z22, d01_inf_z22, d09_inf_z23, d01_inf_z22, d09_inf_z22, d01_inf_z22, d09_inf_sb_z23, d01_inf_z22, d09_inf_sb_z22, d01_inf_z22, d02_inf_z23, d01_inf_z22, d02_inf_z22, d01_inf_z22, d02_inf_sb_z23, d01_inf_z22, d02_inf_sb_z22, d01_inf_z22, d01_inf_z23, d01_inf_z22, d81_inf_z23, d01_inf_sb_z23, d81_inf_z22, d01_inf_sb_z23, d81_inf_sb_z23, d01_inf_sb_z23, d81_inf_sb_z22, d01_inf_sb_z23, d09_inf_z23, d01_inf_sb_z23, d09_inf_z22, d01_inf_sb_z23, d09_inf_sb_z23, d01_inf_sb_z23, d09_inf_sb_z22, d01_inf_sb_z23, d02_inf_z23, d01_inf_sb_z23, d02_inf_z22, d01_inf_sb_z23, d02_inf_sb_z23, d01_inf_sb_z23, d02_inf_sb_z22, d01_inf_sb_z23, d01_inf_z23, d01_inf_sb_z23, d01_inf_z22, d01_inf_sb_z23, d81_inf_z23, d01_inf_sb_z22, d81_inf_z22, d01_inf_sb_z22, d81_inf_sb_z23, d01_inf_sb_z22, d81_inf_sb_z22, d01_inf_sb_z22, d09_inf_z23, d01_inf_sb_z22, d09_inf_z22, d01_inf_sb_z22, d09_inf_sb_z23, d01_inf_sb_z22, d09_inf_sb_z22, d01_inf_sb_z22, d02_inf_z23, d01_inf_sb_z22, d02_inf_z22, d01_inf_sb_z22, d02_inf_sb_z23, d01_inf_sb_z22, d02_inf_sb_z22, d01_inf_sb_z22, d01_inf_z23, d01_inf_sb_z22, d01_inf_z22, d01_inf_sb_z22, d01_inf_sb_z23, d01_inf_sb_z22
## Error in extract_keepers(extracted, keepers, table_names, all_coefficients, : Unable to find the set of contrasts to keep, fix this and try again.
## extra_contrasts = extra)
hs_nopower_nosva_sig <- extract_significant_genes(
hs_nopower_nosva_table,
excel = glue("analyses/macrophage_de/sig_tables/hs_nopower_nosva_sig-v{ver}.xlsx"))
## Error: object 'hs_nopower_nosva_table' not found
d01d02_zymo_nosva_comp <- merge(hs_nopower_nosva_table[["data"]][["d01_zymo"]],
hs_nopower_nosva_table[["data"]][["d02_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_nosva_table' not found
d0102_zymo_nosva_plot <- plot_linear_scatter(d01d02_zymo_nosva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd01d02_zymo_nosva_comp' not found
d0102_zymo_nosva_plot[["scatter"]]
## Error: object 'd0102_zymo_nosva_plot' not found
d0102_zymo_nosva_plot[["correlation"]]
## Error: object 'd0102_zymo_nosva_plot' not found
d0102_zymo_nosva_plot[["lm_rsq"]]
## Error: object 'd0102_zymo_nosva_plot' not found
d09d81_zymo_nosva_comp <- merge(hs_nopower_nosva_table[["data"]][["d09_zymo"]],
hs_nopower_nosva_table[["data"]][["d81_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_nosva_table' not found
d0981_zymo_nosva_plot <- plot_linear_scatter(d09d81_zymo_nosva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd09d81_zymo_nosva_comp' not found
d0981_zymo_nosva_plot[["scatter"]]
## Error: object 'd0981_zymo_nosva_plot' not found
d0981_zymo_nosva_plot[["correlation"]]
## Error: object 'd0981_zymo_nosva_plot' not found
d0981_zymo_nosva_plot[["lm_rsq"]]
## Error: object 'd0981_zymo_nosva_plot' not found
d01d81_zymo_nosva_comp <- merge(hs_nopower_nosva_table[["data"]][["d01_zymo"]],
hs_nopower_nosva_table[["data"]][["d81_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_nosva_table' not found
d0181_zymo_nosva_plot <- plot_linear_scatter(d01d81_zymo_nosva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd01d81_zymo_nosva_comp' not found
d0181_zymo_nosva_plot[["scatter"]]
## Error: object 'd0181_zymo_nosva_plot' not found
d0181_zymo_nosva_plot[["correlation"]]
## Error: object 'd0181_zymo_nosva_plot' not found
d0181_zymo_nosva_plot[["lm_rsq"]]
## Error: object 'd0181_zymo_nosva_plot' not found
upset_plots_nosva <- upsetr_sig(hs_nopower_nosva_sig, both = TRUE,
contrasts = c("d01_zymo", "d02_zymo", "d09_zymo", "d81_zymo"))
## Error: object 'hs_nopower_nosva_sig' not found
upset_plots_nosva[["up"]]
## Error: object 'upset_plots_nosva' not found
upset_plots_nosva[["down"]]
## Error: object 'upset_plots_nosva' not found
upset_plots_nosva[["both"]]
## Error: object 'upset_plots_nosva' not found
## The 7th element in the both groups list is the set shared among all donors.
## I don't feel like writing out x:y:z:a
groups <- upset_plots_nosva[["both_groups"]]
## Error: object 'upset_plots_nosva' not found
shared_genes <- attr(groups, "elements")[groups[[7]]] %>%
gsub(pattern = "^gene:", replacement = "")
## Error in groups[[7]]: subscript out of bounds
shared_gp <- simple_gprofiler(shared_genes)
shared_gp[["pvalue_plots"]][["MF"]]
## NULL
shared_gp[["pvalue_plots"]][["BP"]]
## NULL
shared_gp[["pvalue_plots"]][["REAC"]]

shared_gp[["pvalue_plots"]][["WP"]]
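Picking the shared group by position (the 7th element) is fragile: the position shifts if a contrast is added or if some intersection turns out to be empty. The following is a minimal sketch of looking the group up by name instead. It assumes, as the comment about "x:y:z:a" hints but this document does not confirm, that the groups list is named by colon-joined contrast names; the toy_groups object and the find_shared_group() helper are hypothetical.

```r
## Hypothetical helper: return the group whose colon-joined name contains
## exactly the wanted contrasts, in any order, or NULL if there is none.
find_shared_group <- function(groups, wanted) {
  parts <- strsplit(names(groups), ":", fixed = TRUE)
  hit <- vapply(parts, function(p) setequal(p, wanted), logical(1))
  if (!any(hit)) {
    return(NULL)
  }
  groups[[which(hit)[1]]]
}

## Toy stand-in for upset_plots_nosva[["both_groups"]]; the real structure
## is an assumption.
toy_groups <- list(
  "d01_zymo" = c(1L, 4L),
  "d01_zymo:d02_zymo" = 2L,
  "d01_zymo:d02_zymo:d09_zymo:d81_zymo" = c(3L, 5L))
attr(toy_groups, "elements") <- c("gene:A", "gene:B", "gene:C", "gene:D", "gene:E")

wanted <- c("d01_zymo", "d02_zymo", "d09_zymo", "d81_zymo")
shared_idx <- find_shared_group(toy_groups, wanted)
toy_shared <- gsub(pattern = "^gene:", replacement = "",
                   x = attr(toy_groups, "elements")[shared_idx])
```

A NULL return makes the "nothing is shared" case explicit rather than surfacing as a subscript error.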

Add donor to the contrasts, sva
Same deal as the last block, but this time add SVA into the mix!
hs_nopower_sva_de <- all_pairwise(hs_nopower, model_svs = "svaseq",
model_fstring = "~ 0 + condition", filter = TRUE)
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## Running normalize_se.
## Removing 9761 low-count genes (11720 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 32467 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3
## conditions
## d01_inf_sb_z22 d01_inf_sb_z23 d01_inf_z22 d01_inf_z23 d02_inf_sb_z22
## 3 3 2 3 3
## d02_inf_sb_z23 d02_inf_z22 d02_inf_z23 d09_inf_sb_z22 d09_inf_sb_z23
## 3 3 3 3 2
## d09_inf_z22 d09_inf_z23 d81_inf_sb_z22 d81_inf_sb_z23 d81_inf_z22
## 3 3 3 3 3
## d81_inf_z23
## 3

nopower_keepers <- list(
"d01_zymo" = c("d01infz23", "d01infz22"),
"d01_sbzymo" = c("d01infsbz23", "d01infsbz22"),
"d02_zymo" = c("d02infz23", "d02infz22"),
"d02_sbzymo" = c("d02infsbz23", "d02infsbz22"),
"d09_zymo" = c("d09infz23", "d09infz22"),
"d09_sbzymo" = c("d09infsbz23", "d09infsbz22"),
"d81_zymo" = c("d81infz23", "d81infz22"),
"d81_sbzymo" = c("d81infsbz23", "d81infsbz22"))
hs_nopower_sva_table <- combine_de_tables(
hs_nopower_sva_de, keepers = nopower_keepers,
excel = glue("analyses/macrophage_de/de_tables/hs_nopower_table-v{ver}.xlsx"))
## The keepers has no elements in the coefficients.
## Here are the keepers: d01infz23, d01infz22, d01infsbz23, d01infsbz22, d02infz23, d02infz22, d02infsbz23, d02infsbz22, d09infz23, d09infz22, d09infsbz23, d09infsbz22, d81infz23, d81infz22, d81infsbz23, d81infsbz22
## Here are the coefficients: d81_inf_z23, d81_inf_z22, d81_inf_z23, d81_inf_sb_z23, d81_inf_z22, d81_inf_sb_z23, d81_inf_z23, d81_inf_sb_z22, d81_inf_z22, d81_inf_sb_z22, d81_inf_sb_z23, d81_inf_sb_z22, d81_inf_z23, d09_inf_z23, d81_inf_z22, d09_inf_z23, d81_inf_sb_z23, d09_inf_z23, d81_inf_sb_z22, d09_inf_z23, d81_inf_z23, d09_inf_z22, d81_inf_z22, d09_inf_z22, d81_inf_sb_z23, d09_inf_z22, d81_inf_sb_z22, d09_inf_z22, d09_inf_z23, d09_inf_z22, d81_inf_z23, d09_inf_sb_z23, d81_inf_z22, d09_inf_sb_z23, d81_inf_sb_z23, d09_inf_sb_z23, d81_inf_sb_z22, d09_inf_sb_z23, d09_inf_z23, d09_inf_sb_z23, d09_inf_z22, d09_inf_sb_z23, d81_inf_z23, d09_inf_sb_z22, d81_inf_z22, d09_inf_sb_z22, d81_inf_sb_z23, d09_inf_sb_z22, d81_inf_sb_z22, d09_inf_sb_z22, d09_inf_z23, d09_inf_sb_z22, d09_inf_z22, d09_inf_sb_z22, d09_inf_sb_z23, d09_inf_sb_z22, d81_inf_z23, d02_inf_z23, d81_inf_z22, d02_inf_z23, d81_inf_sb_z23, d02_inf_z23, d81_inf_sb_z22, d02_inf_z23, d09_inf_z23, d02_inf_z23, d09_inf_z22, d02_inf_z23, d09_inf_sb_z23, d02_inf_z23, d09_inf_sb_z22, d02_inf_z23, d81_inf_z23, d02_inf_z22, d81_inf_z22, d02_inf_z22, d81_inf_sb_z23, d02_inf_z22, d81_inf_sb_z22, d02_inf_z22, d09_inf_z23, d02_inf_z22, d09_inf_z22, d02_inf_z22, d09_inf_sb_z23, d02_inf_z22, d09_inf_sb_z22, d02_inf_z22, d02_inf_z23, d02_inf_z22, d81_inf_z23, d02_inf_sb_z23, d81_inf_z22, d02_inf_sb_z23, d81_inf_sb_z23, d02_inf_sb_z23, d81_inf_sb_z22, d02_inf_sb_z23, d09_inf_z23, d02_inf_sb_z23, d09_inf_z22, d02_inf_sb_z23, d09_inf_sb_z23, d02_inf_sb_z23, d09_inf_sb_z22, d02_inf_sb_z23, d02_inf_z23, d02_inf_sb_z23, d02_inf_z22, d02_inf_sb_z23, d81_inf_z23, d02_inf_sb_z22, d81_inf_z22, d02_inf_sb_z22, d81_inf_sb_z23, d02_inf_sb_z22, d81_inf_sb_z22, d02_inf_sb_z22, d09_inf_z23, d02_inf_sb_z22, d09_inf_z22, d02_inf_sb_z22, d09_inf_sb_z23, d02_inf_sb_z22, d09_inf_sb_z22, d02_inf_sb_z22, d02_inf_z23, d02_inf_sb_z22, d02_inf_z22, d02_inf_sb_z22, d02_inf_sb_z23, d02_inf_sb_z22, d81_inf_z23, d01_inf_z23, d81_inf_z22, d01_inf_z23, 
d81_inf_sb_z23, d01_inf_z23, d81_inf_sb_z22, d01_inf_z23, d09_inf_z23, d01_inf_z23, d09_inf_z22, d01_inf_z23, d09_inf_sb_z23, d01_inf_z23, d09_inf_sb_z22, d01_inf_z23, d02_inf_z23, d01_inf_z23, d02_inf_z22, d01_inf_z23, d02_inf_sb_z23, d01_inf_z23, d02_inf_sb_z22, d01_inf_z23, d81_inf_z23, d01_inf_z22, d81_inf_z22, d01_inf_z22, d81_inf_sb_z23, d01_inf_z22, d81_inf_sb_z22, d01_inf_z22, d09_inf_z23, d01_inf_z22, d09_inf_z22, d01_inf_z22, d09_inf_sb_z23, d01_inf_z22, d09_inf_sb_z22, d01_inf_z22, d02_inf_z23, d01_inf_z22, d02_inf_z22, d01_inf_z22, d02_inf_sb_z23, d01_inf_z22, d02_inf_sb_z22, d01_inf_z22, d01_inf_z23, d01_inf_z22, d81_inf_z23, d01_inf_sb_z23, d81_inf_z22, d01_inf_sb_z23, d81_inf_sb_z23, d01_inf_sb_z23, d81_inf_sb_z22, d01_inf_sb_z23, d09_inf_z23, d01_inf_sb_z23, d09_inf_z22, d01_inf_sb_z23, d09_inf_sb_z23, d01_inf_sb_z23, d09_inf_sb_z22, d01_inf_sb_z23, d02_inf_z23, d01_inf_sb_z23, d02_inf_z22, d01_inf_sb_z23, d02_inf_sb_z23, d01_inf_sb_z23, d02_inf_sb_z22, d01_inf_sb_z23, d01_inf_z23, d01_inf_sb_z23, d01_inf_z22, d01_inf_sb_z23, d81_inf_z23, d01_inf_sb_z22, d81_inf_z22, d01_inf_sb_z22, d81_inf_sb_z23, d01_inf_sb_z22, d81_inf_sb_z22, d01_inf_sb_z22, d09_inf_z23, d01_inf_sb_z22, d09_inf_z22, d01_inf_sb_z22, d09_inf_sb_z23, d01_inf_sb_z22, d09_inf_sb_z22, d01_inf_sb_z22, d02_inf_z23, d01_inf_sb_z22, d02_inf_z22, d01_inf_sb_z22, d02_inf_sb_z23, d01_inf_sb_z22, d02_inf_sb_z22, d01_inf_sb_z22, d01_inf_z23, d01_inf_sb_z22, d01_inf_z22, d01_inf_sb_z22, d01_inf_sb_z23, d01_inf_sb_z22
## Error in extract_keepers(extracted, keepers, table_names, all_coefficients, : Unable to find the set of contrasts to keep, fix this and try again.
## extra_contrasts = extra)
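The message above points at a likely cause for this failure: the keepers are spelled without underscores (d01infz23) while the coefficients reported for this analysis contain them (d01_inf_z23). The following is a minimal sketch of checking for that mismatch before calling combine_de_tables(). The two vectors are short excerpts typed from the message, and whether respelling the keepers is the only change required is an assumption.

```r
## Excerpts copied from the keepers and coefficients printed above.
keepers_excerpt <- list(
  "d01_zymo" = c("d01infz23", "d01infz22"),
  "d01_sbzymo" = c("d01infsbz23", "d01infsbz22"))
coefficients_excerpt <- c("d01_inf_z23", "d01_inf_z22",
                          "d01_inf_sb_z23", "d01_inf_sb_z22")

## Which keeper entries are absent from the coefficients?
missing_names <- setdiff(unlist(keepers_excerpt), coefficients_excerpt)

## Removing the underscores from the coefficients recovers the keeper
## spelling, so the two differ only in punctuation.
lookup <- setNames(coefficients_excerpt, gsub("_", "", coefficients_excerpt))
fixed_keepers <- lapply(keepers_excerpt, function(pair) unname(lookup[pair]))
```

If the same holds for the full list, respelling the keepers as d01_inf_z23, d01_inf_z22, and so on may be enough to let the table be built.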
hs_nopower_sva_sig <- extract_significant_genes(
hs_nopower_sva_table,
excel = glue("analyses/macrophage_de/sig_tables/hs_nopower_sva_sig-v{ver}.xlsx"))
## Error: object 'hs_nopower_sva_table' not found
d01d02_zymo_sva_comp <- merge(hs_nopower_sva_table[["data"]][["d01_zymo"]],
hs_nopower_sva_table[["data"]][["d02_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_sva_table' not found
d0102_zymo_sva_plot <- plot_linear_scatter(d01d02_zymo_sva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd01d02_zymo_sva_comp' not found
d0102_zymo_sva_plot[["scatter"]]
## Error: object 'd0102_zymo_sva_plot' not found
d0102_zymo_sva_plot[["correlation"]]
## Error: object 'd0102_zymo_sva_plot' not found
d0102_zymo_sva_plot[["lm_rsq"]]
## Error: object 'd0102_zymo_sva_plot' not found
d09d81_zymo_sva_comp <- merge(hs_nopower_sva_table[["data"]][["d09_zymo"]],
hs_nopower_sva_table[["data"]][["d81_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_sva_table' not found
d0981_zymo_sva_plot <- plot_linear_scatter(d09d81_zymo_sva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd09d81_zymo_sva_comp' not found
d0981_zymo_sva_plot[["scatter"]]
## Error: object 'd0981_zymo_sva_plot' not found
d0981_zymo_sva_plot[["correlation"]]
## Error: object 'd0981_zymo_sva_plot' not found
d0981_zymo_sva_plot[["lm_rsq"]]
## Error: object 'd0981_zymo_sva_plot' not found
d01d81_zymo_sva_comp <- merge(hs_nopower_sva_table[["data"]][["d01_zymo"]],
hs_nopower_sva_table[["data"]][["d81_zymo"]],
by = "row.names")
## Error: object 'hs_nopower_sva_table' not found
d0181_zymo_sva_plot <- plot_linear_scatter(d01d81_zymo_sva_comp[, c("deseq_logfc.x", "deseq_logfc.y")])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': object 'd01d81_zymo_sva_comp' not found
d0181_zymo_sva_plot[["scatter"]]
## Error: object 'd0181_zymo_sva_plot' not found
d0181_zymo_sva_plot[["correlation"]]
## Error: object 'd0181_zymo_sva_plot' not found
d0181_zymo_sva_plot[["lm_rsq"]]
## Error: object 'd0181_zymo_sva_plot' not found
upset_plots_sva <- upsetr_sig(hs_nopower_sva_sig, both = TRUE,
contrasts = c("d01_zymo", "d02_zymo", "d09_zymo", "d81_zymo"))
## Error: object 'hs_nopower_sva_sig' not found
upset_plots_sva[["up"]]
## Error: object 'upset_plots_sva' not found
upset_plots_sva[["down"]]
## Error: object 'upset_plots_sva' not found
upset_plots_sva[["both"]]
## Error: object 'upset_plots_sva' not found
## The 7th element in the both groups list is the set shared among all donors.
## I don't feel like writing out x:y:z:a
groups <- upset_plots_sva[["both_groups"]]
## Error: object 'upset_plots_sva' not found
shared_genes <- attr(groups, "elements")[groups[[7]]] %>%
gsub(pattern = "^gene:", replacement = "")
## Error in groups[[7]]: subscript out of bounds
shared_gp <- simple_gprofiler(shared_genes)
shared_gp[["pvalue_plots"]][["MF"]]
## NULL
shared_gp[["pvalue_plots"]][["BP"]]
## NULL
shared_gp[["pvalue_plots"]][["REAC"]]

shared_gp[["pvalue_plots"]][["WP"]]

Donor comparison
Now compare the donors to each other directly.
hs_donors <- set_conditions(hs_macr, fact = "donor")
## The numbers of samples by condition are:
##
## d01 d02 d09 d81
## 13 14 13 14
donor_de <- all_pairwise(hs_donors, model_svs = "svaseq",
model_fstring = "~ 0 + condition", filter = TRUE)
## d01 d02 d09 d81
## 13 14 13 14
## Running normalize_se.
## Removing 9725 low-count genes (11756 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 40036 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## d01 d02 d09 d81
## 13 14 13 14
## conditions
## d01 d02 d09 d81
## 13 14 13 14
## conditions
## d01 d02 d09 d81
## 13 14 13 14

## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 6 comparisons.
donor_table <- combine_de_tables(
donor_de,
excel = glue("analyses/macrophage_de/de_tables/donor_tables-v{ver}.xlsx"))
donor_table
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 d02_vs_d01 310 389 318 381 350
## 2 d09_vs_d01 532 457 533 451 513
## 3 d81_vs_d01 668 753 669 744 663
## 4 d09_vs_d02 414 267 412 272 373
## 5 d81_vs_d02 572 650 561 658 532
## 6 d81_vs_d09 221 421 212 423 218
## limma_sigdown
## 1 359
## 2 467
## 3 753
## 4 309
## 5 672
## 6 416
## Plot describing unique/shared genes in a differential expression table.

donor_sig <- extract_significant_genes(
donor_table,
excel = glue("analyses/macrophage_de/sig_tables/donor_sig-v{ver}.xlsx"))
donor_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## d02_vs_d01 350 359 318 381 310 389 242
## d09_vs_d01 513 467 533 451 532 457 485
## d81_vs_d01 663 753 669 744 668 753 576
## d09_vs_d02 373 309 412 272 414 267 211
## d81_vs_d02 532 672 561 658 572 650 299
## d81_vs_d09 218 416 212 423 221 421 86
## ebseq_down basic_up basic_down
## d02_vs_d01 136 169 185
## d09_vs_d01 190 334 257
## d81_vs_d01 385 435 446
## d09_vs_d02 115 165 105
## d81_vs_d02 378 235 291
## d81_vs_d09 200 73 150
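The six donor contrasts in the tables above are the full set of pairwise combinations of the four donors. The following is a small sketch which reproduces their names with base R. The naming rule (later donor as numerator) is inferred from the table names printed here, not from the all_pairwise() source.

```r
donors <- c("d01", "d02", "d09", "d81")

## Every unordered pair of donors, one pair per column.
pairs <- combn(donors, 2)

## Name each contrast as numerator_vs_denominator.
contrast_names <- apply(pairs, 2, function(p) paste0(p[2], "_vs_", p[1]))
```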

Primary query contrasts
The final contrast in this list is interesting because it depends on
the extra contrasts applied to the all_pairwise() above. In my way of
thinking, the primary comparisons to consider are either cross-drug or
cross-strain, but not both. However, I think in at least a few instances
Olga is interested in strain+drug / uninfected+nodrug.
Write contrast results
Now let us write out the xlsx file containing the above contrasts.
The file with the suffix _table-version will therefore contain all genes
and the file with the suffix _sig-version will contain only those deemed
significant via our default criteria of DESeq2 |logFC| >= 1.0 and
adjusted p-value <= 0.05.
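To make the default criteria concrete, here is a minimal sketch applying them to a toy table with base R. The column names follow the deseq_logfc and deseq_adjp columns used elsewhere in this document; the values are invented, and treating the cutoffs as inclusive follows the text above rather than the source of extract_significant_genes().

```r
## Toy table; the values are made up for illustration.
toy_table <- data.frame(
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  deseq_logfc = c(2.5, -1.0, 0.4, -3.2, 1.7),
  deseq_adjp = c(0.001, 0.05, 0.0001, 0.2, 0.03))

lfc_cutoff <- 1.0
adjp_cutoff <- 0.05

## A gene must pass the adjusted p-value cutoff and the |logFC| cutoff.
passes_p <- toy_table[["deseq_adjp"]] <= adjp_cutoff
toy_up <- toy_table[passes_p & toy_table[["deseq_logfc"]] >= lfc_cutoff, ]
toy_down <- toy_table[passes_p & toy_table[["deseq_logfc"]] <= -lfc_cutoff, ]
```

Here geneC fails on fold change despite its small p-value, and geneD fails on p-value despite its large fold change.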
Over representation searches
I decided to make a change to the organization of this document which
starts small but will, I think, quickly become big: I am moving the GSEA
searches up to immediately after the DE. I will then move the plots of
the gprofiler results to immediately after the various volcano plots so
that they are easier to interpret.
I am reasonably certain this is the place to check that z23 no drug /
uninfected has the expected set of genes and whether or not there is a
Reactome result.
Reproducibility note: Given that this is entirely dependent on an
online service, I must assume that the results will change over time; in
addition their web servers undergo maintenance regularly, which may
result in systematic failure of these analyses. I like gProfiler quite a
lot for this type of stuff, but this is an important caveat.
Conversely, the clusterProfiler results later depend on a consistent
orgdb annotation set (or reactome or whatever); those versions are fixed
by the container installation.
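One way to soften the dependence on an online service is to save each successful result and reuse it on later runs. The following is a minimal sketch of that idea in base R. The cached_query() helper and the stand-in query are hypothetical and not part of this analysis; a real version would wrap the all_gprofiler() call.

```r
## Hypothetical helper: run an online query once and reuse the saved result.
cached_query <- function(cache_file, query_fun) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  result <- tryCatch(query_fun(), error = function(e) {
    message("Query failed: ", conditionMessage(e))
    NULL
  })
  ## Only successful results are written, so a failure is retried next time.
  if (!is.null(result)) {
    saveRDS(result, cache_file)
  }
  result
}

## Stand-in for the online call.
fake_query <- function() list(queried = "toy result")
cache_file <- tempfile(fileext = ".rds")
first <- cached_query(cache_file, fake_query)
## The second call never reaches the failing query because the cache exists.
second <- cached_query(cache_file, function() stop("service is down"))
```

Cached results also pin the answer to the date of the first query, which should be noted alongside any figure made from them.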
all_gp <- all_gprofiler(hs_macr_sig, enrich_id_column = "hgnc_symbol")
for (g in seq_len(length(all_gp))) {
name <- names(all_gp)[g]
datum <- all_gp[[name]]
filename <- glue("analyses/macrophage_de/gprofiler/{name}_gprofiler-v{ver}.xlsx")
written <- sm(write_gprofiler_data(datum, excel = filename))
}
lesssig_all_gp <- all_gprofiler(hs_macr_lesssig, enrich_id_column = "hgnc_symbol")
for (g in seq_len(length(lesssig_all_gp))) {
name <- names(lesssig_all_gp)[g]
datum <- lesssig_all_gp[[name]]
filename <- glue("analyses/macrophage_de/gprofiler/{name}_gprofiler_lesssig-v{ver}.xlsx")
written <- sm(write_gprofiler_data(datum, excel = filename))
}
Explicit GSEA search via clusterProfiler
all_cp <- all_cprofiler(hs_macr_sig, hs_macr_table)
## Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
## Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
## ReactomePA v1.52.0 Learn more at https://yulab-smu.top/contribution-knowledge-mining/
##
## Please cite:
##
## Guangchuang Yu, Qing-Yu He. ReactomePA: an R/Bioconductor package for
## reactome pathway analysis and visualization. Molecular BioSystems.
## 2016, 12(2):477-479
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(up, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this DOSE organism, leaving it as human.
## Warning in simple_clusterprofiler(down, table, orgdb = orgdb, orgdb_from =
## orgdb_from, : I do not know this mesh organism, leaving it as human.
Specific desires in Reactome results
In previous analyses (I think by Dr. Colmenares), a specific
Tryptophan biosynthesis pathway was observed, particularly in the
2.3/uninfected comparison. I think my gprofiler analysis is too
stringent and is therefore not observing this. Olga asked if I could look
at that and see if there are trivial settings I can change to highlight
this pathway. The two most likely things I can change are the
stringencies of the DE analysis and/or gProfiler.
test_z23_uninf_up <- hs_macr_sig[["deseq"]][["ups"]][["z23nosb_vs_uninf"]]
nrow(test_z23_uninf_up)
## [1] 478
test_z23_uninf_down <- hs_macr_sig[["deseq"]][["downs"]][["z23nosb_vs_uninf"]]
nrow(test_z23_uninf_down)
## [1] 265
test_gp_up <- simple_gprofiler(test_z23_uninf_up, enrich_id_column = "hgnc_symbol",
threshold = 1.0)
test_gp_up
written_up <- write_gprofiler_data(test_gp_up, excel = "excel/z23_uninf_gp_up_all.xlsx")
test_gp_down <- simple_gprofiler(test_z23_uninf_down, enrich_id_column = "hgnc_symbol",
threshold = 1.0)
test_gp_down
written_down <- write_gprofiler_data(test_gp_down, excel = "excel/z23_uninf_gp_down_all.xlsx")
Plot contrasts of interest
One suggestion I received recently was to set the axes for these
volcano plots to be static rather than let ggplot choose its own. I am
assuming this is only relevant for pairs of contrasts, but that might
not be true.
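A minimal sketch of how static axes for a pair of contrasts might be chosen: compute one set of limits covering both and give it to each plot. The logFC values are invented and the shared_limits() helper is hypothetical.

```r
## Toy logFC values standing in for two contrasts shown side by side.
logfc_a <- c(-5.2, -1.1, 0.3, 2.8, 9.6)
logfc_b <- c(-1.8, 0.0, 1.4, 3.3, 6.1)

## Hypothetical helper: one pair of limits covering every contrast,
## padded outward to whole numbers.
shared_limits <- function(...) {
  rng <- range(c(...), na.rm = TRUE)
  c(floor(rng[1]), ceiling(rng[2]))
}
x_limits <- shared_limits(logfc_a, logfc_b)

## The same limits could then be given to each plot, for example:
## some_plot + ggplot2::scale_x_continuous(limits = x_limits)
```

The same approach would apply to the y axis by passing the transformed adjusted p-values instead.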
Individual zymodemes vs. uninfected
The following blocks will be a lot of repetition. In each case I am
yanking out the volcano plot for a specific contrast and showing the
original followed by a version with different colors/labelling.
Infected with z2.3 no Antimonial vs. Uninfected
plot_colors <- get_se_colors(hs_macr_table[["input"]][["input"]])
## Error in get_se_colors(hs_macr_table[["input"]][["input"]]): could not find function "get_se_colors"
## The original plot from my xlsx file
hs_macr_table[["plots"]][["z23nosb_vs_uninf"]][["deseq_vol_plots"]]

z23nosb_vs_uninf_volcano <- plot_volcano_condition_de(
input = hs_macr_table[["data"]][["z23nosb_vs_uninf"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_low = plot_colors[["uninfnone"]], color_high = plot_colors[["infz23"]])
## Error: object 'plot_colors' not found
labeled <- z23nosb_vs_uninf_volcano[["plot"]] +
scale_x_continuous(limits = c(-6, 21), breaks = c(-6, -4, -2, 0, 2, 4, 6, 8, 10, 20)) +
ggbreak::scale_x_break(c(10, 19), scales = 0.2, space = 0.02)
## Error: object 'z23nosb_vs_uninf_volcano' not found
pp(file = "figures/fig2a_labeled_with_break.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
plotly::ggplotly(z23nosb_vs_uninf_volcano[["plot"]])
## Error: object 'z23nosb_vs_uninf_volcano' not found
The following provides some of the over-representation plots from
gProfiler2.
all_gp[["z23nosb_vs_uninf_up"]][["pvalue_plots"]][["REAC"]]

## Reactome, zymodeme2.3 without drug vs. uninfected without drug, up.
all_gp[["z23nosb_vs_uninf_up"]][["pvalue_plots"]][["KEGG"]]

## KEGG, zymodeme2.3 without drug vs. uninfected without drug, up.
##all_gp[["z23nosb_vs_uninf_up"]][["pvalue_plots"]][["MF"]]
## MF, zymodeme2.3 without drug vs. uninfected without drug, up.
all_gp[["z23nosb_vs_uninf_up"]][["pvalue_plots"]][["TF"]]

## TF, zymodeme2.3 without drug vs. uninfected without drug, up.
all_gp[["z23nosb_vs_uninf_up"]][["pvalue_plots"]][["WP"]]

## WikiPathways, zymodeme2.3 without drug vs. uninfected without drug, up.
all_gp[["z23nosb_vs_uninf_up"]][["interactive_plots"]][["WP"]]
message("Olga received a query about the following result, I think it is null.")
## Olga received a query about the following result, I think it is null.
all_gp[["z23nosb_vs_uninf_down"]][["pvalue_plots"]][["REAC"]]
## NULL
message("Is the previous plot null?")
## Is the previous plot null?
## Reactome, zymodeme2.3 without drug vs. uninfected without drug, down.
all_gp[["z23nosb_vs_uninf_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 without drug vs. uninfected without drug, down.
all_gp[["z23nosb_vs_uninf_down"]][["pvalue_plots"]][["TF"]]

## TF, zymodeme2.3 without drug vs. uninfected without drug, down.
We have some other categorical enrichment plots available via
enrichplot, let us try a few out for contrasts of interest and see if
any of them prove helpful.
First, as a reminder, here are the contrasts which are available to
examine; in each case there is an _up and _down enrichment object in the
data. Thus in the following list I am going to arbitrarily print out
some invocations which extract putatively interesting bits of data.
- z23nosb_vs_uninf:
all_gp[["z23nosb_vs_uninf_up"]][["BP_enrich"]]
- z22nosb_vs_uninf.
- z23nosb_vs_z22nosb.
- z23sb_vs_z22sb.
- z23sb_vs_z23nosb.
- z22sb_vs_z22nosb.
- z23sb_vs_sb.
- z22sb_vs_sb.
- z23sb_vs_uninf.
- z22sb_vs_uninf.
- sb_vs_uninf.
- extra_z2322.
- extra_drugnodrug.
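The element names implied by the list above can be generated rather than typed. The following is a small sketch which builds the _up and _down names for each contrast; whether every one of them is actually present in all_gp is an assumption to check against names(all_gp).

```r
contrasts <- c(
  "z23nosb_vs_uninf", "z22nosb_vs_uninf", "z23nosb_vs_z22nosb",
  "z23sb_vs_z22sb", "z23sb_vs_z23nosb", "z22sb_vs_z22nosb",
  "z23sb_vs_sb", "z22sb_vs_sb", "z23sb_vs_uninf", "z22sb_vs_uninf",
  "sb_vs_uninf", "extra_z2322", "extra_drugnodrug")

## Interleave the two directions: x_up, x_down, y_up, y_down, ...
gp_names <- paste0(rep(contrasts, each = 2), c("_up", "_down"))

## With the real object available, missing entries could be listed via:
## setdiff(gp_names, names(all_gp))
```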
z23nosb_uninf_up_go <- all_gp[["z23nosb_vs_uninf_up"]][["BP_enrich"]]
z23nosb_uninf_up_go_pair <- pairwise_termsim(z23nosb_uninf_up_go)
dotplot(z23nosb_uninf_up_go)

emapplot(z23nosb_uninf_up_go_pair)

##ssplot(z23nosb_uninf_up_go_pair)
treeplot(z23nosb_uninf_up_go_pair)

upsetplot(z23nosb_uninf_up_go)

cnetplot(z23nosb_uninf_up_go)
## Warning: ggrepel: 5 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Repeat, but using a less strict set of ‘significant genes’
I am not entirely certain whether the Reactome results Olga showed me
included both up and down genes. I am going to assume for the moment
that it was just up/down, but if that proves intractable I will go back
to the manuscript and read more carefully (e.g. I just remembered where
the picture came from!).
Add a little topgo
In the process of exploring the various parameters used with
gProfiler2, I found myself thinking that it would be nice to have some
topgo results to compare against. The following block is the result of
that thought.
test_genes_up <- hs_macr_lesssig[["deseq"]][["ups"]][["z23nosb_vs_uninf"]]
test_query_up <- simple_gprofiler(test_genes_up, threshold = 0.1)
test_query_up[["pvalue_plots"]][["REAC"]]

pdf(file = "images/test_query_biological_process_z23_vs_uninf_up.pdf", height = 12, width = 9)
test_query_up[["pvalue_plots"]][["BP"]]
## NULL
## png
## 2
enrichplot::dotplot(test_query_up[["BP_enrich"]])

test_genes_down <- hs_macr_lesssig[["deseq"]][["downs"]][["z23nosb_vs_uninf"]]
test_query_down <- simple_gprofiler(test_genes_down)
test_query_down[["pvalue_plots"]][["REAC"]]
## NULL
## I keep getting all sorts of annoying biomart errors.
hs_go <- try(load_biomart_go(archive = FALSE, overwrite = TRUE))
## Using mart: ENSEMBL_MART_ENSEMBL from host: useast.ensembl.org.
## Successfully connected to the hsapiens_gene_ensembl database.
## Error in httr2::req_perform(req) : Failed to perform HTTP request.
## Caused by error in `curl::curl_fetch_memory()` at httr2/R/req-perform.R:201:5:
## ! Timeout was reached [useast.ensembl.org]:
## Operation timed out after 300002 milliseconds with 10486132 bytes received
## Unable to download annotation data.
if (inherits(hs_go, "try-error")) {
hs_go <- load_biomart_go(archive = TRUE, month = "04", year = "2020", overwrite = TRUE)
}
test_topgo_up <- simple_topgo(test_genes_up, go_db = hs_go[["go"]], parallel = FALSE)
## Error in go_db[, c("L1", "value")]: incorrect number of dimensions
written_topgo <- write_topgo_data(
test_topgo_up,
excel = glue("analyses/macrophage_de/ontology_topgo/topgo_z23_uninf_less_strict.xlsx"))
## Error: object 'test_topgo_up' not found
Infected with z2.2 no Antimonial vs. Uninfected
Here is where things will get most repetitive. In each instance I am
creating a couple of volcano plots followed by printing some of the
gProfiler2 results (when I get the itch).
The following should be a slightly improved version of our extant
figure 2B.
## The original plot
hs_macr_table[["plots"]][["z22nosb_vs_uninf"]][["deseq_vol_plots"]]

z22nosb_vs_uninf_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z22nosb_vs_uninf"]], "z22nosb_vs_uninf",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_low = plot_colors[["uninfnone"]], color_high = plot_colors[["infz22"]])
## Error: object 'plot_colors' not found
labeled <- z22nosb_vs_uninf_volcano[["plot"]] +
scale_x_continuous(limits = c(-2, 21), breaks = c(-2, 0, 2, 4, 6, 8, 10, 21, 22)) +
ggbreak::scale_x_break(c(11, 20), scales = 0.2, space = 0.02)
## Error: object 'z22nosb_vs_uninf_volcano' not found
pp(file = "figures/fig2b_labeled_with_break.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
plotly::ggplotly(z22nosb_vs_uninf_volcano[["plot"]])
## Error: object 'z22nosb_vs_uninf_volcano' not found
Add some pvalue barplots from gProfiler for this contrast.
all_gp[["z22nosb_vs_uninf_up"]][["pvalue_plots"]][["REAC"]]

## Reactome, zymodeme2.2 without drug vs. uninfected without drug, up.
all_gp[["z22nosb_vs_uninf_up"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.2 without drug vs. uninfected without drug, up.
all_gp[["z22nosb_vs_uninf_up"]][["pvalue_plots"]][["TF"]]

## TF, zymodeme2.2 without drug vs. uninfected without drug, up.
all_gp[["z22nosb_vs_uninf_up"]][["pvalue_plots"]][["WP"]]

## WikiPathways, zymodeme2.2 without drug vs. uninfected without drug, up.
all_gp[["z22nosb_vs_uninf_down"]][["pvalue_plots"]][["REAC"]]
## NULL
## Reactome, zymodeme2.2 without drug vs. uninfected without drug, down.
all_gp[["z22nosb_vs_uninf_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.2 without drug vs. uninfected without drug, down.
all_gp[["z22nosb_vs_uninf_down"]][["pvalue_plots"]][["TF"]]
## NULL
## TF, zymodeme2.2 without drug vs. uninfected without drug, down.
Infected with z2.3
treated vs. Uninfected treated
I do not think this plot is used at this time.
## The original plot
hs_macr_table[["plots"]][["z23sb_vs_sb"]][["deseq_vol_plots"]]

z23sb_vs_uninfsb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z23sb_vs_sb"]], "z23sb_vs_sb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_low = plot_colors[["infsbz23"]], color_high = plot_colors[["uninfsbnone"]])
## Error: object 'plot_colors' not found
z23sb_vs_uninfsb_volcano[["plot"]]
## Error: object 'z23sb_vs_uninfsb_volcano' not found
plotly::ggplotly(z23sb_vs_uninfsb_volcano[["plot"]])
## Error: object 'z23sb_vs_uninfsb_volcano' not found
Infected with z2.3
untreated vs. z2.2 untreated
This is figure 2C at this time.
## The original plot
hs_macr_table[["plots"]][["z23nosb_vs_z22nosb"]][["deseq_vol_plots"]]

z23nosb_vs_z22nosb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z23nosb_vs_z22nosb"]], "z23nosb_vs_z22nosb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_low = plot_colors[["infz23"]], color_high = plot_colors[["infz22"]])
## Error: object 'plot_colors' not found
labeled <- z23nosb_vs_z22nosb_volcano[["plot"]] +
scale_x_continuous(breaks = c(-10, -8, -6, -4, -2, 0, 2, 4, 6))
## Error: object 'z23nosb_vs_z22nosb_volcano' not found
pp(file = "figures/fig2c_labeled.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
Infected with z2.3
treated vs. z2.2 treated
This is currently figure 3C.
FIXME: The axis label isn’t quite right for the ggbreak.
## The original plot
hs_macr_table[["plots"]][["z23sb_vs_z22sb"]][["deseq_vol_plots"]]

z23sb_vs_z22sb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z23sb_vs_z22sb"]], "z23sb_vs_z22sb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_high = plot_colors[["infsbz23"]], color_low = plot_colors[["infsbz22"]])
## Error: object 'plot_colors' not found
labeled <- z23sb_vs_z22sb_volcano[["plot"]] +
scale_x_continuous(breaks = c(-23, -6, -4, -2, 0, 2, 4, 6)) +
ggbreak::scale_x_break(c(-5, -22.5), scales = 10, space = 0.02)
## Error: object 'z23sb_vs_z22sb_volcano' not found
pp(file = "figures/fig3c_labeled_breaks.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
Infected with z2.3
SB treated vs. z2.3 untreated
I think this is currently figure 3A.
FIXME: The axis label for the ggbreak isn’t quite right.
## The original plot
hs_macr_table[["plots"]][["z23sb_vs_z23nosb"]][["deseq_vol_plots"]]

z23sb_vs_z23nosb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z23sb_vs_z23nosb"]], "z23sb_vs_z23nosb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_high = plot_colors[["infsbz23"]], color_low = plot_colors[["infz23"]])
## Error: object 'plot_colors' not found
labeled <- z23sb_vs_z23nosb_volcano[["plot"]] +
scale_x_continuous(limits = c(-19, 6),
breaks = c(-20, -18, -16, -14, -12, -10, -6, -4, -2, 0, 2, 4, 6)) +
ggbreak::scale_x_break(c(-17, -8), scales = 17, space = 0.02)
## Error: object 'z23sb_vs_z23nosb_volcano' not found
pp(file = "figures/fig3a_labeled_with_break.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
Infected with z2.2
SB treated vs. z2.2 untreated
## The original plot
hs_macr_table[["plots"]][["z22sb_vs_z22nosb"]][["deseq_vol_plots"]]

z22sb_vs_z22nosb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z22sb_vs_z22nosb"]], "z22sb_vs_z22nosb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_high = plot_colors[["infsbz22"]], color_low = plot_colors[["infz22"]])
## Error: object 'plot_colors' not found
labeled <- z22sb_vs_z22nosb_volcano[["plot"]] +
scale_x_continuous(breaks = c(-6, -4, -2, 0, 2, 4, 6))
## Error: object 'z22sb_vs_z22nosb_volcano' not found
pp(file = "figures/fig3b_labeled.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
Infected with z2.3
SB treated vs. uninfected treated
x_limits <- c(-6, 6)
## The original plot
hs_macr_table[["plots"]][["z23sb_vs_sb"]][["deseq_vol_plots"]]

z23sb_vs_sb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z23sb_vs_sb"]], "z23sb_vs_sb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol", invert = TRUE,
color_low = plot_colors[["infsbz23"]], color_high = plot_colors[["uninfsbnone"]])
## Error: object 'plot_colors' not found
z23sb_vs_sb_volcano[["plot"]]
## Error: object 'z23sb_vs_sb_volcano' not found
Infected with
z2.2 SB treated vs. uninfected treated
## The original plot
hs_macr_table[["plots"]][["z22sb_vs_sb"]][["deseq_vol_plots"]]

z22sb_vs_sb_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["z22sb_vs_sb"]], "z22sb_vs_sb",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol", invert = TRUE,
color_low = plot_colors[["infsbz22"]], color_high = plot_colors[["uninfsbnone"]])
## Error: object 'plot_colors' not found
z22sb_vs_sb_volcano[["plot"]]
## Error: object 'z22sb_vs_sb_volcano' not found
Uninfected+SbV
vs. Uninfected-SbV
This is currently figure 3D.
FIXME: This needs the BOLA2B ggbreak.
## The original plot
hs_macr_table[["plots"]][["sb_vs_uninf"]][["deseq_vol_plots"]]

sb_vs_uninf_volcano <- plot_volcano_condition_de(
hs_macr_table[["data"]][["sb_vs_uninf"]], "sb_vs_uninf",
fc_col = "deseq_logfc", p_col = "deseq_adjp",
label = 10, label_column = "hgnc_symbol",
color_high = plot_colors[["uninfsbnone"]], color_low = plot_colors[["uninfnone"]])
## Error: object 'plot_colors' not found
labeled <- sb_vs_uninf_volcano[["plot"]] +
scale_x_continuous(breaks = c(-23, -6, -4, -2, 0, 2, 4, 6)) +
ggbreak::scale_x_break(c(-5, -22.5), scales = 10, space = 0.02)
## Error: object 'sb_vs_uninf_volcano' not found
pp(file = "figures/fig3d_labeled_breaks.svg")
labeled
## Error: object 'labeled' not found
## png
## 2
## Error: object 'labeled' not found
Double-check that
gene counts match my perceptions
Check that my perception of the number of significant up/down genes
matches what the table/venn says. In the following block I am performing
some venn/upset analyses to see if the numbers of genes match what we
have in the current version of the manuscript (plus or minus a gene) and
thus if my interpretation of the figure/legend text matches what I think
it means.
shared <- Vennerable::Venn(list(
"drug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z23sb_vs_uninf"]]),
"nodrug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z23nosb_vs_uninf"]])))
pp(file = "images/z23_vs_uninf_venn_up.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

## I see 910 z23sb/uninf and 670 z23nosb/uninf genes in the venn diagram.
length(shared@IntersectionSets[["10"]]) + length(shared@IntersectionSets[["11"]])
## [1] 839
dim(hs_macr_sig[["deseq"]][["ups"]][["z23sb_vs_uninf"]])
## [1] 839 73
shared <- Vennerable::Venn(list(
"drug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z22sb_vs_uninf"]]),
"nodrug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z22nosb_vs_uninf"]])))
pp(file = "images/z22_vs_uninf_venn_up.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

length(shared@IntersectionSets[["10"]]) + length(shared@IntersectionSets[["11"]])
## [1] 660
dim(hs_macr_sig[["deseq"]][["ups"]][["z22sb_vs_uninf"]])
## [1] 660 73
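The same sanity check can be done without Vennerable by using base R set operations. The following is a minimal sketch with made-up gene IDs; `drug_ups` and `nodrug_ups` are hypothetical stand-ins for the rownames of the two significance tables. The size of a set should equal its unique portion plus the shared portion, which is what summing the "10" and "11" intersection sets does above.

```r
## Hypothetical gene IDs standing in for rownames() of two 'ups' tables.
drug_ups <- c("geneA", "geneB", "geneC", "geneD")
nodrug_ups <- c("geneC", "geneD", "geneE")

shared_ids <- intersect(drug_ups, nodrug_ups)
drug_only <- setdiff(drug_ups, nodrug_ups)
nodrug_only <- setdiff(nodrug_ups, drug_ups)

## Unique + shared should reproduce the size of the original set.
venn_counts <- c(
  "drug_only" = length(drug_only),
  "shared" = length(shared_ids),
  "nodrug_only" = length(nodrug_only))
venn_counts
drug_total <- venn_counts[["drug_only"]] + venn_counts[["shared"]]
drug_total == length(drug_ups)
```

If the final comparison is FALSE for real data, the most likely culprit is duplicated rownames rather than a problem with the Venn itself.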
Note to self: There is an error in my volcano plot code
which takes effect when the numerator and denominator of the
all_pairwise contrasts differ from those in combine_de_tables. It
puts the ups/downs on the correct sides of the plot, but calls
the down genes ‘up’ and vice-versa. The reason is that I did
check for this case, but used the wrong argument to handle it.
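To make that failure mode concrete, here is a minimal sketch of what handling a swapped numerator/denominator involves. This is not the hpgltools code; the helper `orient_logfc` and its arguments are hypothetical. When the requested contrast is the reverse of the computed one, the sign of the logFC must be flipped before any gene is labeled 'up' or 'down'.

```r
## Hypothetical helper: flip logFC when the requested contrast is the
## reverse of the one which was actually computed.
orient_logfc <- function(logfc, computed = c("a", "b"), wanted = c("a", "b")) {
  if (identical(computed, wanted)) {
    return(logfc)
  } else if (identical(computed, rev(wanted))) {
    return(-1 * logfc)
  }
  stop("The computed and wanted contrasts do not use the same conditions.")
}

## Computed as uninf/z23, but the table is wanted as z23/uninf.
computed_fc <- c("gene1" = -2.5, "gene2" = 1.2)
wanted_fc <- orient_logfc(computed_fc, computed = c("uninf", "z23"),
                          wanted = c("z23", "uninf"))
## Only call up/down after the orientation has been settled.
calls <- ifelse(wanted_fc > 0, "up", "down")
calls
```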
A likely bit of text for these volcano plots:
The set of genes differentially expressed between the zymodeme 2.3
and uninfected samples without drug treatment was quantified with
DESeq2 and included surrogate variable estimates from SVA. Given the
significance criteria of abs(logFC) >= 1.0 and false discovery rate
adjusted p-value <= 0.05, 670 genes were observed as significantly
increased between the infected and uninfected samples and 386 were
observed as decreased. The most increased genes relative to the
uninfected samples include some which are potentially indicative of a
strong innate immune response and the inflammatory response.
In contrast, when the set of genes differentially expressed between
the zymodeme 2.2 and uninfected samples was visualized, only 7 genes
were observed as decreased and 435 increased. The inflammatory response
was significantly less apparent in this set, but instead included genes
related to transporter activity and oxidoreductases.
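The significance criteria quoted above (abs(logFC) >= 1.0 and adjusted p-value <= 0.05) can be applied by hand to double-check the counts. The sketch below uses a toy table with invented values; only the column names `deseq_logfc` and `deseq_adjp` are taken from this document.

```r
## Toy table using the same column names as the DE tables in this document.
toy_table <- data.frame(
  "deseq_logfc" = c(2.3, -1.4, 0.4, 1.0, -3.1),
  "deseq_adjp" = c(0.001, 0.03, 0.0001, 0.05, 0.2),
  row.names = paste0("gene", 1:5))
lfc_cutoff <- 1.0
adjp_cutoff <- 0.05

## Both criteria must hold; note that the boundaries are inclusive.
significant <- abs(toy_table[["deseq_logfc"]]) >= lfc_cutoff &
  toy_table[["deseq_adjp"]] <= adjp_cutoff
ups <- toy_table[significant & toy_table[["deseq_logfc"]] > 0, ]
downs <- toy_table[significant & toy_table[["deseq_logfc"]] < 0, ]
c("up" = nrow(ups), "down" = nrow(downs))
```

Whether the boundaries are inclusive is exactly the kind of detail which produces the 'plus or minus a gene' differences mentioned earlier.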
Direct zymodeme
comparisons
An orthogonal comparison to that performed above is to directly
compare the zymodeme 2.3 and 2.2 samples with and without antimonial
treatment.
Z2.3 / z2.2
without drug
z23nosb_vs_z22nosb_volcano <- plot_volcano_de(
table = hs_macr_table[["data"]][["z23nosb_vs_z22nosb"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
shapes_by_state = FALSE, color_by = "fc", label = 10, label_column = "hgnc_symbol")
## Error in plot_volcano_de(table = hs_macr_table[["data"]][["z23nosb_vs_z22nosb"]], : could not find function "plot_volcano_de"
plotly::ggplotly(z23nosb_vs_z22nosb_volcano[["plot"]])
## Error: object 'z23nosb_vs_z22nosb_volcano' not found
z23sb_vs_z22sb_volcano <- plot_volcano_de(
table = hs_macr_table[["data"]][["z23sb_vs_z22sb"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
shapes_by_state = FALSE, color_by = "fc", label = 10, label_column = "hgnc_symbol")
## Error in plot_volcano_de(table = hs_macr_table[["data"]][["z23sb_vs_z22sb"]], : could not find function "plot_volcano_de"
plotly::ggplotly(z23sb_vs_z22sb_volcano[["plot"]])
## Error: object 'z23sb_vs_z22sb_volcano' not found
z23nosb_vs_z22nosb_volcano[["plot"]] +
xlim(-10, 10) +
ylim(0, 60)
## Error: object 'z23nosb_vs_z22nosb_volcano' not found
pp(file = "images/z23nosb_vs_z22nosb_reactome_up.svg",
image = all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["REAC"]],
height = 12, width = 9)
## Warning: ImageMagick was built without librsvg which causes poor quality of SVG rendering.
## For better results use image_read_svg() which uses the rsvg package.
## Error in eval(expr, envir): R: geometry does not contain image `/lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/images/z23nosb_vs_z22nosb_reactome_up.svg' @ warning/attribute.c/GetImageBoundingBox/554
all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["REAC"]]

## Reactome, zymodeme2.3 without drug vs. zymodeme2.2 without drug, up.
all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["KEGG"]]

## KEGG, zymodeme2.3 without drug vs. zymodeme2.2 without drug, up.
all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 without drug vs. zymodeme2.2 without drug, up.
all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["TF"]]

## TF, zymodeme2.3 without drug vs. zymodeme2.2 without drug, up.
all_gp[["z23nosb_vs_z22nosb_up"]][["pvalue_plots"]][["WP"]]

## WikiPathways, zymodeme2.3 without drug vs. zymodeme2.2 without drug, up.
all_gp[["z23nosb_vs_z22nosb_up"]][["interactive_plots"]][["WP"]]
pp(file = "images/z23nosb_vs_z22nosb_reactome_down.svg",
image = all_gp[["z23nosb_vs_z22nosb_down"]][["pvalue_plots"]][["REAC"]],
height = 12, width = 9)
## Warning: ImageMagick was built without librsvg which causes poor quality of SVG rendering.
## For better results use image_read_svg() which uses the rsvg package.
## Error in eval(expr, envir): R: geometry does not contain image `/lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/images/z23nosb_vs_z22nosb_reactome_down.svg' @ warning/attribute.c/GetImageBoundingBox/554
all_gp[["z23nosb_vs_z22nosb_down"]][["pvalue_plots"]][["REAC"]]

## Reactome, zymodeme2.3 without drug vs. zymodeme2.2 without drug, down.
all_gp[["z23nosb_vs_z22nosb_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 without drug vs. zymodeme2.2 without drug, down.
all_gp[["z23nosb_vs_z22nosb_down"]][["pvalue_plots"]][["TF"]]

## TF, zymodeme2.3 without drug vs. zymodeme2.2 without drug, down.
z2.3 / z2.2 with
drug
z23sb_vs_z22sb_volcano[["plot"]] +
xlim(-10, 10) +
ylim(0, 60)
## Error: object 'z23sb_vs_z22sb_volcano' not found
pp(
file = "images/z23sb_vs_z22sb_reactome_up.png",
image = all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["REAC"]],
height = 12, width = 9)
## Warning in pp(file = "images/z23sb_vs_z22sb_reactome_up.png", image =
## all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["REAC"]], : There is no device
## to shut down.
## Error in eval(expr, envir): R: improper image header `/lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/images/z23sb_vs_z22sb_reactome_up.png' @ error/png.c/ReadPNGImage/3941
all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.3 with drug vs. zymodeme2.2 with drug, up.
all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["KEGG"]]
## KEGG, zymodeme2.3 with drug vs. zymodeme2.2 with drug, up.
all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 with drug vs. zymodeme2.2 with drug, up.
all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["TF"]]
## TF, zymodeme2.3 with drug vs. zymodeme2.2 with drug, up.
all_gp[["z23sb_vs_z22sb_up"]][["pvalue_plots"]][["WP"]]
## WikiPathways, zymodeme2.3 with drug vs. zymodeme2.2 with drug, up.
all_gp[["z23sb_vs_z22sb_up"]][["interactive_plots"]][["WP"]]
all_gp[["z23sb_vs_z22sb_down"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.3 with drug vs. zymodeme2.2 with drug, down.
all_gp[["z23sb_vs_z22sb_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 with drug vs. zymodeme2.2 with drug, down.
all_gp[["z23sb_vs_z22sb_down"]][["pvalue_plots"]][["TF"]]
## TF, zymodeme2.3 with drug vs. zymodeme2.2 with drug, down.
Venn to see
shared/unique genes
Once again I wish to pull out the significant genes and see how my
numbers match against the text.
shared <- Vennerable::Venn(list(
"drug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z23sb_vs_z22sb"]]),
"nodrug" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z23nosb_vs_z22nosb"]])))
pp(file = "images/drug_nodrug_venn_up.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

shared <- Vennerable::Venn(
list("drug" = rownames(hs_macr_sig[["deseq"]][["downs"]][["z23sb_vs_z22sb"]]),
"nodrug" = rownames(hs_macr_sig[["deseq"]][["downs"]][["z23nosb_vs_z22nosb"]])))
pp(file = "images/drug_nodrug_venn_down.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

A slightly different way of looking at the differences between the
two zymodeme infections is to compare the infected samples directly,
with and without drug. When the comparison of the zymodeme 2.3 vs. 2.2
samples was shown as a volcano plot, 484 genes were observed as
increased and 422 as decreased; these groups include many of the same
inflammatory (up) and membrane (down) genes.
Similar patterns were observed when the antimonial was included.
When Venn diagrams of the increased and decreased gene sets were
plotted, a substantial number of genes were shared between the
untreated and antimonial-treated samples: 313 increased and 244
decreased.
Drug effects on each
zymodeme infection
Another likely question is how the antimonial affects each zymodeme
infection; to visualize this, directly compare the treated vs.
untreated samples for each zymodeme.
z2.3 with and
without drug
z23sb_vs_z23nosb_volcano <- plot_volcano_de(
table = hs_macr_table[["data"]][["z23sb_vs_z23nosb"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
shapes_by_state = FALSE, color_by = "fc", label = 10, label_column = "hgnc_symbol")
## Error in plot_volcano_de(table = hs_macr_table[["data"]][["z23sb_vs_z23nosb"]], : could not find function "plot_volcano_de"
plotly::ggplotly(z23sb_vs_z23nosb_volcano[["plot"]])
## Error: object 'z23sb_vs_z23nosb_volcano' not found
z22sb_vs_z22nosb_volcano <- plot_volcano_de(
table = hs_macr_table[["data"]][["z22sb_vs_z22nosb"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
shapes_by_state = FALSE, color_by = "fc", label = 10, label_column = "hgnc_symbol")
## Error in plot_volcano_de(table = hs_macr_table[["data"]][["z22sb_vs_z22nosb"]], : could not find function "plot_volcano_de"
plotly::ggplotly(z22sb_vs_z22nosb_volcano[["plot"]])
## Error: object 'z22sb_vs_z22nosb_volcano' not found
z23sb_vs_z23nosb_volcano[["plot"]] +
xlim(-8, 8) +
ylim(0, 210)
## Error: object 'z23sb_vs_z23nosb_volcano' not found
pp(file = "images/z23sb_vs_z23nosb_reactome_up.png",
image = all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["REAC"]],
height = 12, width = 9)
## Warning in pp(file = "images/z23sb_vs_z23nosb_reactome_up.png", image =
## all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["REAC"]], : There is no
## device to shut down.
## Error in eval(expr, envir): R: improper image header `/lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/images/z23sb_vs_z23nosb_reactome_up.png' @ error/png.c/ReadPNGImage/3941
all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.3 with drug vs. zymodeme2.3 without drug, up.
all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["KEGG"]]
## KEGG, zymodeme2.3 with drug vs. zymodeme2.3 without drug, up.
all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 with drug vs. zymodeme2.3 without drug, up.
all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["TF"]]
## TF, zymodeme2.3 with drug vs. zymodeme2.3 without drug, up.
all_gp[["z23sb_vs_z23nosb_up"]][["pvalue_plots"]][["WP"]]
## WikiPathways, zymodeme2.3 with drug vs. zymodeme2.3 without drug, up.
all_gp[["z23sb_vs_z23nosb_up"]][["interactive_plots"]][["WP"]]
all_gp[["z23sb_vs_z23nosb_down"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.3 with drug vs. zymodeme2.3 without drug, down.
all_gp[["z23sb_vs_z23nosb_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.3 with drug vs. zymodeme2.3 without drug, down.
all_gp[["z23sb_vs_z23nosb_down"]][["pvalue_plots"]][["TF"]]
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_col()`).
## TF, zymodeme2.3 with drug vs. zymodeme2.3 without drug, down.
z2.2 with and
without drug
z22sb_vs_z22nosb_volcano[["plot"]] +
xlim(-8, 8) +
ylim(0, 210)
## Error: object 'z22sb_vs_z22nosb_volcano' not found
pp(file = "images/z22sb_vs_z22nosb_reactome_up.png",
image = all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["REAC"]],
height = 12, width = 9)
## Warning in pp(file = "images/z22sb_vs_z22nosb_reactome_up.png", image =
## all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["REAC"]], : There is no
## device to shut down.
## Error in eval(expr, envir): R: improper image header `/lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/images/z22sb_vs_z22nosb_reactome_up.png' @ error/png.c/ReadPNGImage/3941
all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.2 with drug vs. zymodeme2.2 without drug, up.
all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["KEGG"]]
## KEGG, zymodeme2.2 with drug vs. zymodeme2.2 without drug, up.
all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.2 with drug vs. zymodeme2.2 without drug, up.
all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["TF"]]
## TF, zymodeme2.2 with drug vs. zymodeme2.2 without drug, up.
all_gp[["z22sb_vs_z22nosb_up"]][["pvalue_plots"]][["WP"]]
## WikiPathways, zymodeme2.2 with drug vs. zymodeme2.2 without drug, up.
all_gp[["z22sb_vs_z22nosb_up"]][["interactive_plots"]][["WP"]]
all_gp[["z22sb_vs_z22nosb_down"]][["pvalue_plots"]][["REAC"]]
## Reactome, zymodeme2.2 with drug vs. zymodeme2.2 without drug, down.
all_gp[["z22sb_vs_z22nosb_down"]][["pvalue_plots"]][["MF"]]
## NULL
## MF, zymodeme2.2 with drug vs. zymodeme2.2 without drug, down.
all_gp[["z22sb_vs_z22nosb_down"]][["pvalue_plots"]][["TF"]]
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_col()`).
## TF, zymodeme2.2 with drug vs. zymodeme2.2 without drug, down.
Shared and unique
genes after/before drug
shared <- Vennerable::Venn(list(
"z23" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z23sb_vs_z23nosb"]]),
"z22" = rownames(hs_macr_sig[["deseq"]][["ups"]][["z22sb_vs_z22nosb"]])))
pp(file = "images/z23_z22_drug_venn_up.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

shared <- Vennerable::Venn(list(
"z23" = rownames(hs_macr_sig[["deseq"]][["downs"]][["z23sb_vs_z23nosb"]]),
"z22" = rownames(hs_macr_sig[["deseq"]][["downs"]][["z22sb_vs_z22nosb"]])))
pp(file = "images/z23_z22_drug_venn_down.png")
Vennerable::plot(shared)
dev.off()
## png
## 2

Note: I am setting the x and y-axis boundaries by allowing the plotter
to pick its own axes the first time, writing down the ranges I observe,
and then setting them to the largest of the pair. It is therefore
possible that I missed one or more genes which lie outside that
range.
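One way to avoid missing a gene is to compute the shared limits from the data rather than reading them off the plots. The following is a sketch under the assumption that both tables carry the `deseq_logfc` and `deseq_adjp` columns used above; the toy values are invented.

```r
## Toy tables; in practice these would be two of the DE tables above.
table_a <- data.frame("deseq_logfc" = c(-7.2, 0.5, 3.1),
                      "deseq_adjp" = c(3e-40, 0.2, 2e-12))
table_b <- data.frame("deseq_logfc" = c(-2.0, 1.5, 5.6),
                      "deseq_adjp" = c(5e-8, 4e-150, 0.01))

## Symmetric x limits which cover the largest absolute logFC in either table.
all_fc <- c(table_a[["deseq_logfc"]], table_b[["deseq_logfc"]])
x_max <- ceiling(max(abs(all_fc)))
x_limits <- c(-x_max, x_max)

## y limits which cover the largest -log10(adjusted p) in either table.
all_p <- c(table_a[["deseq_adjp"]], table_b[["deseq_adjp"]])
y_max <- ceiling(max(-log10(all_p)))
y_limits <- c(0, y_max)
x_limits
y_limits
```

These vectors can then be passed to xlim() and ylim(). Note that ggplot2 drops points outside limits set this way (with a warning), whereas coord_cartesian() zooms without dropping data.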
The contrasts plotted in the direct zymodeme comparisons sought to show
changes between the two strains, z2.3 and z2.2. In contrast, the volcano
plots in this section directly compare each strain before/after drug
treatment.
LRT of the Human
Macrophage
A slightly different tack to examine the data is to perform a
likelihood ratio test in order to look for trends which are shared among
genes when examining different conditions in the data.
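As a reminder of what a likelihood ratio test compares, here is a base R sketch on simulated counts for a single gene. It is not what deseq_lrt() does internally: DESeq2 fits negative binomial models with shared dispersion estimates, and a Poisson glm is swapped in here purely to keep the example self-contained. I am also assuming that the term of interest is the drug:strain interaction. The full model includes that term, the reduced model does not, and the test asks whether adding it improves the fit.

```r
set.seed(1)
## Simulated design: 2 strains x 2 drug states, 6 replicates each.
design <- expand.grid("rep" = 1:6, "drug" = c("no", "sb"),
                      "strain" = c("z22", "z23"))
## One simulated gene whose drug response depends on the strain.
mu <- with(design, ifelse(drug == "sb" & strain == "z23", 200, 50))
design[["count"]] <- rpois(nrow(design), lambda = mu)

## Full model with the interaction vs. reduced model without it.
full <- glm(count ~ drug + strain + drug:strain,
            family = poisson(), data = design)
reduced <- glm(count ~ drug + strain, family = poisson(), data = design)
## The difference in deviance is compared against a chi-squared distribution.
lrt <- anova(reduced, full, test = "Chisq")
lrt_pvalue <- lrt[["Pr(>Chi)"]][2]
lrt_pvalue
```

A small p-value here says only that some pattern across the conditions is not explained by the reduced model; it does not say which condition drives it, which is why the clustering of gene trends is the useful output.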
tmrc2_lrt_strain_drug <- deseq_lrt(hs_macr, interactor_column = "drug",
interest_column = "macrophagezymodeme",
factors = c("drug", "macrophagezymodeme"))
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 38 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## rlog() may take a long time with 50 or more samples,
## vst() is a much faster transformation
## Working with 858 genes.
## Working with 855 genes after filtering: minc > 3
## Joining with `by = join_by(merge)`
## Joining with `by = join_by(merge)`

tmrc2_lrt_strain_drug[["cluster_data"]][["plot"]]

Parasite
Let us consider for a moment differences among the parasite
transcriptomes for the samples which were not drug treated.
One thing I did in the initial implementation of this document was to
repeat the variable ‘up_genes’ for each comparison; I think this time I
will make a different variable for each comparison so I can play with
them a little further.
comparison <- "z23_vs_z22"
lp_macrophage_de <- all_pairwise(lp_macrophage_nosb, model_svs = "svaseq",
model_fstring = "~ 0 + condition", filter = TRUE)
## z2.2 z2.3
## 14 15
## Running normalize_se.
## Removing 119 low-count genes (8591 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Running normalize_se.
## Setting 2387 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in createContrastL(objFlt$formula, objFlt$data, L): Contrasts with only
## a single non-zero term are already evaluated by default.
## conditions
## z22 z23
## 14 15
## conditions
## z22 z23
## 14 15
## conditions
## z22 z23
## 14 15
tmrc2_parasite_keepers <- list(
"z23_vs_z22" = c("z23", "z22"))
lp_macrophage_table <- combine_de_tables(
lp_macrophage_de, keepers = tmrc2_parasite_keepers,
excel = glue("analyses/macrophage_de/de_tables/parasite_infection_de-v{ver}.xlsx"))
lp_macrophage_sig <- extract_significant_genes(
lp_macrophage_table,
excel = glue("analyses/macrophage_de/sig_tables/parasite_sig-v{ver}.xlsx"))
lp_macrophage_table[["plots"]][[comparison]][["deseq_vol_plots"]]

lp_macrophage_table[["plots"]][[comparison]][["deseq_ma_plots"]]

up_genes_z23z22 <- lp_macrophage_sig[["deseq"]][["ups"]][[comparison]]
dim(up_genes_z23z22)
## [1] 48 69
down_genes_z23z22 <- lp_macrophage_sig[["deseq"]][["downs"]][[comparison]]
dim(down_genes_z23z22)
## [1] 91 69
lp_z23sb_vs_z22sb_volcano <- plot_volcano_de(
table = lp_macrophage_table[["data"]][["z23_vs_z22"]],
fc_col = "deseq_logfc", p_col = "deseq_adjp",
shapes_by_state = FALSE, color_by = "fc", label = 10, label_column = "hgnc_symbol")
## Error in plot_volcano_de(table = lp_macrophage_table[["data"]][["z23_vs_z22"]], : could not find function "plot_volcano_de"
plotly::ggplotly(lp_z23sb_vs_z22sb_volcano[["plot"]])
## Error: object 'lp_z23sb_vs_z22sb_volcano' not found
lp_z23sb_vs_z22sb_volcano[["plot"]]
## Error: object 'lp_z23sb_vs_z22sb_volcano' not found
Ontology search
An important note: I recently added a minimum cross-reference argument
(defaulting to 40 genes), which causes many comparisons to fail for
poorly annotated genomes (like panamensis). Thus, I am relaxing that
constraint for these searches.
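The idea behind such a guard can be sketched in a few lines. This is not the simple_goseq() implementation; `count_xrefs` is a hypothetical helper, and the toy annotation table is invented so that 18 of 48 query genes have a GO term.

```r
## Hypothetical guard: report whether enough of the query genes have
## any GO annotation to make an enrichment test worthwhile.
count_xrefs <- function(query_ids, go_table, min_xref = 40) {
  annotated <- unique(go_table[["ID"]])
  found <- sum(query_ids %in% annotated)
  list("found" = found, "enough" = found >= min_xref)
}

## Toy annotation: only 18 of the 48 query genes have a GO term.
query <- paste0("gene", 1:48)
toy_go <- data.frame("ID" = paste0("gene", 1:18),
                     "GO" = rep("GO:0008150", 18))
strict <- count_xrefs(query, toy_go)                  ## default of 40
relaxed <- count_xrefs(query, toy_go, min_xref = 15)  ## relaxed threshold
c("strict" = strict[["enough"]], "relaxed" = relaxed[["enough"]])
```

With so few annotated genes, any resulting enrichment should be read as suggestive at best.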
lp_lengths <- all_lp_annot[, c("gid", "annot_cds_length")]
colnames(lp_lengths) <- c("ID", "length")
up_goseq <- simple_goseq(up_genes_z23z22, go_db = lp_go,
length_db = lp_lengths, min_xref = 15)
## Found 18 go_db genes and 48 length_db genes out of 48.
## Testing that go categories are defined.
## Removing undefined categories.
## Gathering synonyms.
## Gathering category definitions.
## View categories over represented in the 2.3 samples
up_goseq[["pvalue_plots"]][["bpp_plot_over"]]

down_goseq <- simple_goseq(down_genes_z23z22, go_db = lp_go,
length_db = lp_lengths, min_xref = 15)
## Found 25 go_db genes and 91 length_db genes out of 91.
## Testing that go categories are defined.
## Removing undefined categories.
## Gathering synonyms.
## Gathering category definitions.
## View categories over represented in the 2.2 samples
down_goseq[["pvalue_plots"]][["bpp_plot_over"]]

created <- dir.create(glue("analyses/macrophage_de/goseq_parasite"))
written_goseq <- write_goseq_data(
up_goseq,
excel = glue("analyses/macrophage_de/goseq_parasite/lp_macrophage_increased_z2.3_goseq-v{ver}.xlsx"))
## Attempting to generate a id2go file in the format expected by topGO.
written_goseq <- write_goseq_data(
down_goseq,
excel = glue("analyses/macrophage_de/goseq_parasite/lp_macrophage_increased_z2.2_goseq-v{ver}.xlsx"))
GSVA
Note: The following block assumes one is able to download a fresh
copy of msigDB, which I am not sure is possible within the constraints
of a container (I mean it is trivial to do, but I am not sure if it is
ok due to licensing). However, Broad provides a data package of a msigdb
release. As a result, the following block will be repeated using
that.
hs_infected <- subset_se(hs_macrophage, subset = "macrophagetreatment!='uninf'") %>%
subset_se(subset = "macrophagetreatment!='uninf_sb'")
hs_gsva_c2 <- simple_gsva(hs_infected)
hs_gsva_c2_meta <- get_msigdb_metadata(hs_gsva_c2, msig_xml = "reference/msigdb_v7.2.xml")
hs_gsva_c2_sig <- get_sig_gsva_categories(
hs_gsva_c2_meta,
excel = glue("analyses/macrophage_de/gsva/hs_macrophage_gsva_c2_sig.xlsx"))
hs_gsva_c2_sig[["raw_plot"]]
hs_gsva_c7 <- simple_gsva(hs_infected, signature_category = "c7")
hs_gsva_c7_meta <- get_msigdb_metadata(hs_gsva_c7, msig_xml = "reference/msigdb_v7.2.xml")
hs_gsva_c7_sig <- get_sig_gsva_categories(
hs_gsva_c7_meta,
excel = glue("analyses/macrophage_de/gsva/hs_macrophage_gsva_c7_sig.xlsx"))
hs_gsva_c7_sig[["raw_plot"]]
Repeat using the
GSVAdata package.
hs_infected <- subset_se(hs_macrophage, subset = "macrophagetreatment!='uninf'") %>%
subset_se(subset = "macrophagetreatment!='uninf_sb'")
hs_gsva_c2 <- simple_gsva(hs_infected)
## Error: unable to find an inherited method for function 'annotation' for signature 'object = "SummarizedExperiment"'
##hs_gsva_c2_meta <- get_msigdb_metadata(hs_gsva_c2, msig_xml="reference/msigdb_v7.2.xml")
hs_gsva_c2_sig <- get_sig_gsva_categories(
hs_gsva_c2,
excel = glue("analyses/macrophage_de/gsva/hs_macrophage_gsva_c2_sig.xlsx"))
## Error: object 'hs_gsva_c2' not found
hs_gsva_c2_sig[["raw_plot"]]
## Error: object 'hs_gsva_c2_sig' not found
hs_gsva_c7 <- simple_gsva(hs_infected, signature_category = "c7")
## Error: unable to find an inherited method for function 'annotation' for signature 'object = "SummarizedExperiment"'
##hs_gsva_c7_meta <- get_msigdb_metadata(hs_gsva_c7, msig_xml="reference/msigdb_v7.2.xml")
hs_gsva_c7_sig <- get_sig_gsva_categories(
hs_gsva_c7,
excel = glue("analyses/macrophage_de/gsva/hs_macrophage_gsva_c7_sig.xlsx"))
## Error: object 'hs_gsva_c7' not found
hs_gsva_c7_sig[["raw_plot"]]
## Error: object 'hs_gsva_c7_sig' not found
Try out a new
tool
Two reasons: Najib loves him some PCA, and this uses WikiPathways,
which is something I think is neat.
Ok, I spent some time looking through the code and I have some
problems with some of the design decisions.
Most importantly, it requires a data.frame() which has the following
format:
- No rownames; instead column #1 is the sample ID.
- Columns 2 to m are the categorical/survival/etc. metrics.
- Columns m+1 to n are one gene per column with log2 values.
But when I think about it, I think I get the idea: they want to be
able to do modelling stuff more easily with response factors.
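A toy version of that reshaping, using base R only, looks like the sketch below. The gene symbols, sample names, and the `zymodeme` column are invented for illustration, and I have not confirmed exactly how strictly pathwayPCA checks sample order, so the final comparison is a precaution rather than a documented requirement.

```r
## Toy log2 expression matrix: genes in rows, samples in columns.
toy_expr <- matrix(c(5.1, 6.2, 7.3, 2.0, 2.5, 1.9), nrow = 2, byrow = TRUE,
                   dimnames = list(c("TNF", "IL6"), c("s1", "s2", "s3")))
## Toy sample metadata with a single categorical response.
toy_meta <- data.frame("SampleID" = c("s1", "s2", "s3"),
                       "zymodeme" = factor(c("z22", "z23", "z23")),
                       stringsAsFactors = FALSE)

## Transpose so that each gene becomes a column, then put SampleID first.
toy_assay <- as.data.frame(t(toy_expr))
toy_assay <- data.frame("SampleID" = rownames(toy_assay), toy_assay,
                        check.names = FALSE, stringsAsFactors = FALSE)
rownames(toy_assay) <- NULL

## Make sure the response and assay tables list samples in the same order.
same_order <- identical(toy_assay[["SampleID"]], toy_meta[["SampleID"]])
same_order
colnames(toy_assay)
```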
library(pathwayPCA)
library(rWikiPathways)
##
## Attaching package: 'rWikiPathways'
## The following object is masked from 'package:edgeR':
##
## getCounts
downloaded <- downloadPathwayArchive(organism = "Homo sapiens", format = "gmt")
data_path <- system.file("extdata", package = "pathwayPCA")
wikipathways <- read_gmt(paste0(data_path, "/wikipathways_human_symbol.gmt"),
description = TRUE)
se <- subset_se(hs_macrophage, subset = "macrophagetreatment!='uninf'") %>%
subset_se(subset = "macrophagetreatment!='uninf_sb'")
se <- set_conditions(se, fact = "macrophagezymodeme")
## The numbers of samples by condition are:
##
## none z22 z23
## 0 29 29
symbol_column <- "hgnc_symbol"
symbol_vector <- rowData(se)[[symbol_column]]
names(symbol_vector) <- rownames(rowData(se))
symbol_df <- as.data.frame(symbol_vector)
assay_df <- merge(symbol_df, as.data.frame(assay(se)), by = "row.names")
assay_df[["Row.names"]] <- NULL
rownames(assay_df) <- make.names(assay_df[["symbol_vector"]], unique = TRUE)
assay_df[["symbol_vector"]] <- NULL
assay_df <- as.data.frame(t(assay_df))
assay_df[["SampleID"]] <- rownames(assay_df)
assay_df <- dplyr::select(assay_df, "SampleID", everything())
factor_df <- as.data.frame(colData(se))
factor_df[["SampleID"]] <- rownames(factor_df)
factor_df <- dplyr::select(factor_df, "SampleID", everything())
factor_df <- factor_df[, c("SampleID", factors)]
## Error: object 'factors' not found
tt <- CreateOmics(
assayData_df = assay_df,
pathwayCollection_ls = wikipathways,
response = factor_df,
respType = "categorical",
minPathSize = 5)
## 3190 genes have variance < epsilon and will be removed. These gene(s) include:
## [1] "TNMD" "CYP51A1" "KRIT1" "MAD1L1" "ARF5"
## [6] "REXO5" "FBXL3" "REX1BD" "KRT33A" "TAC1"
## [11] "LGALS14" "SLC13A2" "TRAPPC6A" "SELE" "TFAP2B"
## [16] "SS18L2" "IDS" "SLC7A14" "CLDN11" "MDH1"
## [21] "COX15" "MATR3" "ISL1" "INSRR" "EFCAB1"
## [26] "TMSB10" "OTC" "HOXC8" "XK" "NOP16"
## [31] "TNFRSF17" "GUCA1A" "NNAT" "NRIP2" "MCOLN3"
## [36] "SERPINB3" "MRPS24" "SEZ6" "AHRR" "BORCS8.MEF2B"
## [41] "KDM4A" "THUMPD1" "IFT80" "ERLEC1" "PAGE1"
## [46] "FRMPD1" "LNX1" "IPCEF1" "ZNF37A" "TUBA3D"
## [51] "SPAG5" "EXOSC5" "TIGAR" "TP53INP2" "LXN"
## [56] "AFM" "CFHR2" "UBA5" "JMJD4" "PCDHA6"
## [61] "PCDHGA2" "C1QTNF3" "RNF13" "ZNF671" "RRN3"
## [66] "CHERP" "DIMT1" "NME8" "PIGS" "DEFB127"
## [71] "FXYD3" "CMTM1" "FLT3LG" "RBM27" "ANGPT2"
## [76] "RNF31" "SEMA4G" "NUBP2" "KCNK16" "MAGEB2"
## [81] "MTAP" "SERPIND1" "DDT" "SEC14L2" "GGT1"
## [86] "PRODH" "SOX10" "TIMP3" "PSMA3" "SNW1"
## [91] "SERPINA4" "PCK2" "PRORP" "TM9SF1" "RAB5IF"
## [96] "CST9L" "CST4" "SPINT3" "EPPIN" "RBFA"
## [101] "CEP76" "H2BW2" "MCTS2P" "SRPX" "F9"
## [106] "PPP1R2C" "BRS3" "TIMP1" "GLA" "ACP5"
## [111] "DHRS12" "ZNF821" "CMC2" "ZNF174" "CORO7.PAM16"
## [116] "SALL1" "AQP9" "OIP5" "ARHGEF10" "CGB2"
## [121] "CGB3" "PPP1R37" "RNASEH2A" "OAZ1" "C19orf44"
## [126] "MED26" "ZNF419" "LGALS13" "CEACAM5" "BABAM1"
## [131] "ATP1A3" "ZNRF4" "TMEM205" "WDR83OS" "PIK3R2"
## [136] "PDE4C" "DDX49" "ERF" "RASA4" "TFPI2"
## [141] "DUS4L" "NPVF" "WNT2" "HOXA5" "HOXA6"
## [146] "CHN2" "MINDY4" "PSMA2" "OGN" "ASPN"
## [151] "ECM2" "EXOSC3" "VSIR" "RNF43" "ASPA"
## [156] "HOXB6" "SLC16A6" "RANGRF" "VTN" "FOXN1"
## [161] "UNC119" "ALDOC" "ODAM" "SMR3A" "CHIC2"
## [166] "IL2" "CPZ" "DBX1" "SNX15" "APOC3"
## [171] "SCGB2A2" "PTPMT1" "CALCA" "MYF6" "MYF5"
## [176] "PRR4" "AKAP3" "GSG1" "OGFOD2" "ART4"
## [181] "MGP" "FZD10" "LPCAT3" "GYS2" "MAK"
## [186] "ASF1A" "IL17A" "TSPO2" "CCNC" "HDGFL1"
## [191] "OR12D3" "MRPL2" "TMCO6" "PDE8B" "IL5"
## [196] "SMC4" "NPHP3" "BCHE" "ABHD14B" "ABHD14A.ACY1"
## [201] "TP53I3" "INO80B" "REG1A" "KCNJ13" "NEU2"
## [206] "HSPE1" "ABCB6" "PNO1" "ATP6V1B1" "ANGPTL1"
## [211] "NCF2" "PRAMEF1" "AGMAT" "TNNI3K" "TSNAX"
## [216] "CRYGD" "ZC2HC1B" "CCND2" "FGF23" "TRIM32"
## [221] "TGFB3" "ZNF410" "GPR75" "IFIT3" "NKX2.3"
## [226] "IFIT2" "HOXB8" "HOXB5" "CRHR1" "HOXB1"
## [231] "MLANA" "IFNA6" "GRIA2" "LRP11" "MAGEB4"
## [236] "SLC25A2" "GPR31" "TNFSF11" "TRIM6" "TAS2R10"
## [241] "IAPP" "SLITRK3" "CLCC1" "GPSM2" "OBP2A"
## [246] "TBX22" "PRM2" "RWDD3" "MRM2" "SLC25A51"
## [251] "DCAF10" "WDR83" "ACTRT1" "TSFM" "ORMDL2"
## [256] "CDK2" "KBTBD4" "COL10A1" "SERPINA7" "H2BW1"
## [261] "ESX1" "B9D2" "MC3R" "GCNT7" "ANKRD60"
## [266] "C20orf85" "TP53TG5" "MAGEA10" "FASTKD3" "CRISP2"
## [271] "AARS2" "RPS10" "APOBEC2" "GCM2" "TRIM51"
## [276] "SCGB2A1" "GPR18" "IRF1" "AMELX" "NPBWR2"
## [281] "BHLHE23" "FOSB" "DEFB126" "FOXA2" "NKX2.4"
## [286] "NKX2.2" "CSTL1" "FLRT3" "MGME1" "TMEM74B"
## [291] "CITED1" "MMP24" "TMEM115" "KIRREL2" "X.5"
## [296] "STATH" "HTN1" "TIMM17B" "EVI2A" "OMG"
## [301] "AVPR2" "OMD" "TAS2R3" "TAS2R4" "OR7A10"
## [306] "SLC35E1" "OR7C2" "GNG13" "EMC6" "OR1E2"
## [311] "FGL2" "GNAZ" "ADORA2A" "TAS2R16" "ATP6V1F"
## [316] "LRRC4" "LRRC17" "FEZF1" "MRPS12" "HOXD1"
## [321] "HAT1" "HOXD9" "HOXD13" "ELL3" "CALML4"
## [326] "THAP10" "ACKR4" "SOX15" "KLK8" "NEDD8"
## [331] "VCY1B" "VCY" "CDY2B" "INS.IGF2" "KCNA5"
## [336] "ANGPTL8" "ACE2" "GDF1" "MRPL34" "LSM7"
## [341] "ACSBG2" "BMP15" "ARPC1B" "OR11H1" "CALY"
## [346] "PPAN" "HSD17B3" "PRRG1" "BPIFA3" "DEFB118"
## [351] "GJA9" "CDX4" "NAPSA" "PDLIM4" "TMEM204"
## [356] "KRT33B" "FSHB" "USP29" "NR0B2" "ACTR10"
## [361] "ABHD12B" "RTBDN" "TRIM22" "TIMM10B" "SCLY"
## [366] "FTHL17" "NIP7" "VPS4A" "SCP2D1" "SSTR4"
## [371] "APCS" "TOE1" "PPP1R3D" "BHMT2" "ZBED3"
## [376] "ANGPTL3" "STOML3" "IRS4" "ERG28" "GSC"
## [381] "CAMK1" "GSTM1" "TSHB" "GSTM3" "FKBP11"
## [386] "GRP" "PRH2" "SOX3" "BIVM" "ERCC5"
## [391] "UGT2A3" "CSN2" "LACRT" "GLS2" "FAM186B"
## [396] "BLOC1S1" "ZC3H10" "SLC26A10" "MIP" "CHST5"
## [401] "HTR2B" "TMBIM1" "RCBTB2" "KDELR2" "CIDEB"
## [406] "NKX2.8" "NKX2.1" "SRSF1" "CHAD" "GH2"
## [411] "LIMD2" "CFC1" "OR13C9" "ANGPTL2" "TLR4"
## [416] "HINT2" "YIPF3" "CLPS" "FXYD6" "SLTM"
## [421] "LHCGR" "SLC3A1" "LBX1" "CUZD1" "RBP4"
## [426] "GPR87" "MSTN" "CARF" "IL21" "GSTCD"
## [431] "ZCRB1" "NDUFA9" "KERA" "SYCP3" "CCDC65"
## [436] "NPFF" "LPAR6" "RNF113B" "SSTR1" "TSSK4"
## [441] "RAB15" "SERF2" "FGF7" "CELF6" "IGSF6"
## [446] "CHST4" "ZSCAN32" "CTRL" "KIF2B" "ADCYAP1"
## [451] "MEP1B" "SAT2" "ZNF750" "ELAC1" "SLC39A3"
## [456] "CCDC97" "TMEM91" "ZNF593" "EVA1B" "DMRTB1"
## [461] "BARHL2" "PIGM" "CRNN" "JTB" "CREB3L4"
## [466] "PYCR2" "CHAC2" "ANKRD53" "TEX261" "LIPT1"
## [471] "LYG1" "GPR17" "PHOSPHO2" "RPL32" "LRTM1"
## [476] "ZNF660" "EIF2A" "AMT" "ECE2" "SLC26A1"
## [481] "CABS1" "BHMT" "KCNMB1" "LRRTM2" "SLC17A4"
## [486] "H2BC1" "HIGD2A" "TCTE1" "CLVS2" "TAAR2"
## [491] "TAAR6" "TAAR8" "WTAP" "RBAK" "FERD3L"
## [496] "TMEM140" "CLTRN" "LANCL3" "SYTL5" "AKAP4"
## 1103 gene name(s) are invalid. Invalid name(s) include:
## [1] "NME1.NME2" "RTEL1.TNFRSF6B" "STON1.GTF2A1L" "X.1"
## [5] "PTGES3L.AARSD1" "NKX3.2" "X.2" "TMEM189.UBE2V1"
## [9] "H1.3" "X.3" "H1.1" "X.4"
## [13] "CHURC1.FNTB" "X.6" "H3.3B" "ZNF670.ZNF695"
## [17] "X.7" "ERVK3.1" "X.8" "X.9"
## [21] "NKX6.2" "X.10" "H3.3A" "NKX6.1"
## [25] "NKX6.3" "NKX3.1" "X.11" "X.12"
## [29] "H3.4" "H1.4" "JMJD7.PLA2G4B" "X.14"
## [33] "KRTAP4.4" "RAB4B.EGLN2" "X.15" "X.16"
## [37] "H1.8" "HLA.DQB1" "X.17" "NKX2.6"
## [41] "KRTAP9.7" "KRTAP11.1" "NKX2.5" "KRTAP8.1"
## [45] "X.19" "KRTAP19.1" "H1.5" "KRTAP6.1"
## [49] "H1.10" "KRTAP5.5" "KRTAP17.1" "KRTAP21.2"
## [53] "H1.7" "X.20" "H1.6" "H1.2"
## [57] "X.21" "H3.5" "X.22" "H1.0"
## [61] "HLA.DRB1" "KRTAP5.3" "HLA.DQA1" "X.23"
## [65] "H4.16" "X.25" "KRTAP4.1" "HLA.DRB5"
## [69] "MT.ND6" "MT.CO2" "MT.CYB" "MT.ND2"
## [73] "MT.ND5" "MT.CO1" "MT.ND3" "MT.ND4"
## [77] "MT.ND1" "MT.ATP6" "MT.CO3" "X.26"
## [81] "HLA.DOA" "HLA.DMA" "HLA.DRA" "X.28"
## [85] "HLA.C" "KRTAP5.11" "KRTAP5.10" "HLA.E"
## [89] "HLA.G" "HLA.F" "X.29" "CLLU1.AS1"
## [93] "X.30" "TRBV20OR9.2" "KRTAP5.6" "KRTAP5.2"
## [97] "KRTAP5.1" "KRTAP19.8" "HLA.A" "X.31"
## [101] "IGKV4.1" "IGKV6.21" "IGKV3D.20" "IGLV10.54"
## [105] "IGLV5.52" "IGLV1.51" "IGLV1.50" "IGLV1.47"
## [109] "IGLV7.46" "IGLV1.44" "IGLV7.43" "IGLV1.40"
## [113] "IGLV3.25" "IGLV2.23" "IGLV3.21" "IGLV3.19"
## [117] "IGLV3.16" "IGLV2.14" "IGLV2.11" "IGLV3.9"
## [121] "IGLV3.1" "TRBV7.3" "TRBV5.3" "TRBV10.1"
## [125] "TRBV6.5" "TRBV6.6" "TRBV7.6" "TRBV5.1"
## [129] "TRBV20.1" "TRBV24.1" "TRBJ2.1" "TRBJ2.2P"
## [133] "TRBJ2.3" "TRBJ2.6" "TRBJ2.7" "TRAV12.3"
## [137] "TRAV8.7" "IGHD3.10" "IGHV6.1" "IGHV1.2"
## [141] "IGHV1.3" "IGHV2.5" "IGHV3.7" "IGHV3.11"
## [145] "IGHV3.13" "IGHV3.15" "IGHV1.18" "IGHV3.20"
## [149] "IGHV3.21" "IGHV3.23" "IGHV1.24" "IGHV2.26"
## [153] "IGHV4.28" "IGHV3.33" "IGHV4.34" "IGHV4.39"
## [157] "IGHV3.49" "IGHV5.51" "IGHV3.66" "IGHV3.73"
## [161] "KRTAP16.1" "KRTAP3.3" "KRTAP3.2" "MT.ND4L"
## [165] "X.32" "ZNF625.ZNF20" "ERV3.1" "X.33"
## [169] "RPL17.C18orf32" "KRTAP1.5" "ZNF816.ZNF321P" "IGHV3.64"
## [173] "HLA.DPB1" "IGHV4.59" "IGHV3.74" "APOC4.APOC2"
## [177] "X.36" "X.38" "ERVMER34.1" "X.39"
## [181] "MT.ATP8" "IGKV3D.7" "TRBV5.4" "X.40"
## [185] "IGKV1OR2.108" "HLA.DPA1" "IGHV3.43" "HLA.DQB2"
## [189] "TRBV29.1" "X.41" "IGKV3OR2.268" "HLA.B"
## [193] "HNRNPUL2.BSCL2" "NKX1.1" "X.44" "HLA.DQA2"
## [197] "IGKV2D.30" "IGKV1D.8" "IGKV1.6" "X.47"
## [201] "IGKV3.20" "IGKV1D.33" "IGKV1.17" "IGKV1.8"
## [205] "IGKV1.16" "HLA.DOB" "KRTAP5.8" "IGKV2.24"
## [209] "IGKV3.11" "X.48" "KRTAP5.4" "IGKV1.9"
## [213] "X.50" "IGKV1.33" "IGKV1.39" "IGKV2D.28"
## [217] "HLA.DMB" "IGKV1D.17" "ERVW.1" "PPAN.P2RY11"
## [221] "IGKV2.30" "IGKV2D.29" "IGKV1.12" "IGKV1.5"
## [225] "X.51" "X.52" "DNAJC25.GNG10" "KRTAP5.7"
## [229] "IGKV3.15" "KRTAP4.2" "IGKV1.27" "TRIM39.RPP21"
## [233] "X.54" "PRR5.ARHGAP8" "STIMATE.MUSTN1" "RBM14.RBM4"
## [237] "LY75.CD302" "X.55" "X.56" "TNFSF12.TNFSF13"
## [241] "ATP5MF.PTCD1" "X.57" "EPPIN.WFDC6" "X.58"
## [245] "X.59" "X.60" "X.61" "X.62"
## [249] "X.63" "X.64" "RNF103.CHMP3" "X.66"
## [253] "ARPIN.AP3S2" "ARPC4.TTLL3" "X.67" "X.68"
## [257] "X.69" "LY6G6F.LY6G6D" "X.70" "CCDC169.SOHLH2"
## [261] "NT5C1B.RDH14" "X.71" "X.72" "X.73"
## [265] "TMED7.TICAM2" "X.74" "MSANTD3.TMEFF1" "X.75"
## [269] "CENPS.CORT" "X.76" "X.77" "TRBV7.4"
## [273] "CHKB.CPT1B" "X.78" "X.79" "X.80"
## [277] "X.81" "X.82" "X.83" "CKLF.CMTM1"
## [281] "ATP6V1G2.DDX39B" "INMT.MINDY4" "X.85" "STX16.NPEPL1"
## [285] "KRTAP5.9" "X.86" "SAA2.SAA4" "ZFP91.CNTF"
## [289] "X.87" "MSH5.SAPCD1" "FXYD6.FXYD2" "X.88"
## [293] "X.89" "X.90" "X.91" "X.92"
## [297] "X.93" "NEDD8.MDP1" "TRAV1.1" "X.94"
## [301] "X.96" "X.97" "KLRC4.KLRK1" "X.98"
## [305] "X.99" "X.100" "X.101" "X.102"
## [309] "X.103" "X.104" "TRAV1.2" "X.107"
## [313] "X.108" "X.109" "X.110" "X.111"
## [317] "X.112" "X.113" "SLCO1B3.SLCO1B7" "X.114"
## [321] "X.115" "X.117" "X.118" "X.120"
## [325] "X.121" "X.122" "RPL36A.HNRNPH2" "X.123"
## [329] "X.124" "P2RX5.TAX1BP3" "X.125" "X.126"
## [333] "X.127" "PPT2.EGFL8" "X.129" "X.130"
## [337] "X.131" "X.132" "X.133" "X.134"
## [341] "SPECC1L.ADORA2A" "BCL2L2.PABPN1" "X.137" "X.138"
## [345] "PINX1.1" "X.140" "X.141" "X.142"
## [349] "X.143" "UBE2F.SCLY" "X.144" "FPGT.TNNI3K"
## [353] "BLOC1S5.TXNDC5" "X.146" "POC1B.GALNT4" "NDUFC2.KCTD14"
## [357] "X.147" "ZHX1.C8orf76" "X.150" "ST20.MTHFS"
## [361] "X.151" "TGIF2.RAB5IF" "X.152" "X.153"
## [365] "X.155" "X.156" "X.158" "X.159"
## [369] "X.160" "X.161" "X.162" "PMF1.BGLAP"
## [373] "X.163" "X.164" "X.165" "X.166"
## [377] "X.169" "X.170" "X.171" "X.172"
## [381] "X.173" "X.174" "X.175" "X.176"
## [385] "TEN1.CDK3" "X.177" "X.178" "X.179"
## [389] "X.180" "X.181" "X.182" "ISY1.RAB43"
## [393] "X.183" "X.184" "X.185" "X.186"
## [397] "TMEM256.PLSCR3" "X.191" "X.192" "X.193"
## [401] "X.195" "X.196" "LINC02210.CRHR1" "X.197"
## [405] "X.198" "X.200" "X.201" "X.202"
## [409] "X.203" "X.205" "CFAP298.TCP10L" "X.206"
## [413] "EEF1E1.BLOC1S5" "X.207" "X.208" "X.209"
## [417] "X.210" "X.211" "X.213" "X.214"
## [421] "X.215" "X.216" "X.217" "X.218"
## [425] "X.221" "X.222" "X.223" "X.224"
## [429] "X.225" "X.226" "X.227" "X.228"
## [433] "X.229" "X.230" "X.231" "X.232"
## [437] "X.233" "X.234" "X.235" "X.236"
## [441] "X.237" "X.238" "X.239" "X.240"
## [445] "X.241" "X.242" "X.244" "X.245"
## [449] "X.246" "X.247" "X.248" "X.249"
## [453] "X.250" "X.251" "X.252" "X.253"
## [457] "X.254" "X.255" "X.256" "X.257"
## [461] "X.258" "X.259" "X.260" "X.261"
## [465] "X.263" "X.264" "X.266" "X.267"
## [469] "X.268" "X.269" "X.270" "MIA.RAB4B"
## [473] "X.271" "X.272" "X.273" "X.274"
## [477] "X.275" "X.276" "X.277" "X.279"
## [481] "X.280" "X.281" "X.283" "X.285"
## [485] "X.286" "X.288" "ARHGAP19.SLIT1" "COMMD3.BMI1"
## [489] "ZNF559.ZNF177" "X.289" "TSNAX.DISC1" "X.290"
## [493] "X.291" "X.292" "BORCS7.ASMT" "IGHV3.30"
## [497] "URGCP.MRPS24" "RPS10.NUDT3" "TLCD4.RWDD3" "X.293"
## These genes may be excluded from analysis. Proper gene names
## contain alphanumeric characters only, and start with a letter.
## Warning in CheckSampleIDs(assayData_df): Row names will be ignored. Sample IDs must be in the first column of the data
## frame.
## Error in .convertPhenoDF(response, type = respType): Regression and categorical data must be a data frame with two columns, sample ID
## and response, in exactly that order.
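The error above says the response must be a two-column data frame: sample ID first, response second. It follows from the earlier failure, where `factors` was never defined and so factor_df kept every metadata column. A minimal sketch of building such a response on toy data follows. The column name `condition` is an assumption (it is what I expect set_conditions() to fill in), and I have not checked here whether CreateOmics() prefers a factor or a character vector.

```r
## Toy stand-in for as.data.frame(colData(se)); the column names are assumptions.
factor_df <- data.frame(
  condition = c("z22", "z22", "z23", "z23"),
  donor = c("d1", "d2", "d1", "d2"),
  row.names = c("s1", "s2", "s3", "s4"))

## Name the single response column explicitly rather than relying on
## an undefined `factors` object.
response_column <- "condition"
response_df <- data.frame(
  SampleID = rownames(factor_df),
  response = factor(factor_df[[response_column]]),
  stringsAsFactors = FALSE)
response_df
```

A data frame shaped like response_df is what I would then try passing as the `response` argument.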
super <- AESPCA_pVals(
object = tt,
numPCs = 2,
parallel = FALSE,
numCores = 8,
numReps = 2,
adjustment = "BH")
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'AESPCA_pVals': object 'tt' not found
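The "invalid gene name" messages from CreateOmics() above look like a side effect of make.names(): hyphens become dots, and empty symbols (presumably genes without an hgnc_symbol) become "X", "X.1", and so on. A small sketch of that behaviour on hypothetical symbols, plus one possible way to drop the empty ones first:

```r
## Hypothetical symbol vector containing hyphens and empty entries.
symbols <- c("HLA-DQB1", "", "NKX2-5", "", "TLR4")

## What the code above effectively does to such a vector.
mangled <- make.names(symbols, unique = TRUE)
print(mangled)

## One option: keep only non-empty symbols and note which names
## make.names() would alter, so they can be handled deliberately.
keep <- !is.na(symbols) & nzchar(symbols)
kept_symbols <- symbols[keep]
altered <- kept_symbols[make.names(kept_symbols) != kept_symbols]
print(kept_symbols)
print(altered)
```

Whether pathwayPCA would accept the hyphenated names untouched is a separate question that this sketch does not answer.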
Evaluating a log2FC barplot
Figure 2E now consists of a plot which shows log2FC values with
error bars for selected genes and seeks to show differences between
2.3/uninfected and 2.2/uninfected.
I went looking in the xlsx files produced in 202405 and found that
the values in the table Olga used to generate it are the log2FC values
and standard errors produced by DESeq2.
It should be noted that in my most recent version of these analyses,
these numbers did shift slightly. I am looking into that now.
Here is that table:
| Gene | 2.3 vs Uninfected MØ: Mean | SEM | n | 2.2 vs Uninfected MØ: Mean | SEM | n |
|--------|---------|--------|---|--------|--------|---|
| IFI27  | 7.224   | 0.5662 | 6 | 2.702  | 0.5669 | 6 |
| RSAD2  | 6.29    | 0.7312 | 6 | 1.623  | 0.7303 | 6 |
| CCL8   | 6.225   | 0.928  | 6 | -0.314 | 0.941  | 6 |
| IFI44L | 5.895   | 0.612  | 6 | 2.06   | 0.611  | 6 |
| OASL   | 4.726   | 0.4974 | 6 | 1.392  | 0.4973 | 6 |
| USP18  | 3.644   | 0.483  | 6 | 0.999  | 0.4826 | 6 |
| IDO1   | 7.145   | 1.107  | 6 | 1.257  | 1.141  | 6 |
| IDO2   | 3.935   | 1.3    | 6 | 2.557  | 1.341  | 6 |
| KYNU   | 1.07    | 0.2186 | 6 | 0.0207 | 0.2184 | 6 |
| AHR    | 0.9382  | 0.2236 | 6 | 0.5032 | 0.2239 | 6 |
| IL4I1  | 2.593   | 0.4623 | 6 | 0.039  | 0.4618 | 6 |
| SOD2   | 2.76    | 0.349  | 6 | 0.4241 | 0.3528 | 6 |
| NOTCH1 | 0.7572  | 0.275  | 6 | 1.495  | 0.2744 | 6 |
| DLL1   | 0.8268  | 0.5285 | 6 | 3.455  | 0.5228 | 6 |
| DLL4   | 1.116   | 0.737  | 6 | 4.243  | 0.71   | 6 |
| HES1   | -0.0183 | 0.8599 | 6 | 6.536  | 0.7973 | 6 |
| HEY1   | 0.5533  | 0.5789 | 6 | 4.181  | 0.6273 | 6 |
Ok, I think I found a problem: The NOTCH1 value is actually the
adjusted p-value.
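This kind of table-versus-DESeq2 check can be sketched in a few lines. The sketch covers only the ten genes whose 2.3-vs-uninfected values are printed both in the table above and in the DESeq2 output further down in this document. NOTCH1 falls in the elided rows of that output, so it is not included. The 0.01 tolerance is an arbitrary choice of mine.

```r
genes <- c("IL4I1", "AHR", "CCL8", "SOD2", "HES1",
           "HEY1", "IFI27", "USP18", "IDO2", "DLL1")
## Values copied from the figure table (2.3 vs uninfected).
figure_tbl <- data.frame(
  row.names = genes,
  mean = c(2.593, 0.9382, 6.225, 2.76, -0.0183,
           0.5533, 7.224, 3.644, 3.935, 0.8268),
  sem = c(0.4623, 0.2236, 0.928, 0.349, 0.8599,
          0.5789, 0.5662, 0.483, 1.3, 0.5285))
## Values copied from the printed z23nosb_vs_uninf DESeq2 output.
deseq_tbl <- data.frame(
  row.names = genes,
  deseq_logfc = c(2.593, 0.9381, 6.225, 2.76, -0.01786,
                  0.5531, 7.224, 3.644, 3.934, 0.8268),
  deseq_lfcse = c(0.4623, 0.2236, 0.928, 0.349, 0.8599,
                  0.652, 0.5662, 0.483, 1.299, 0.5284))

tolerance <- 0.01  ## arbitrary threshold for "shifted more than rounding"
delta <- data.frame(
  row.names = genes,
  logfc_diff = abs(figure_tbl$mean - deseq_tbl$deseq_logfc),
  se_diff = abs(figure_tbl$sem - deseq_tbl$deseq_lfcse))
## Keep only the genes where either value moved by more than the tolerance.
flagged <- delta[delta$logfc_diff > tolerance | delta$se_diff > tolerance, , drop = FALSE]
flagged
```

On these ten genes the only row that moves by more than the tolerance is HEY1, and only in its standard error (0.5789 in the table vs. 0.6520 in the output).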
- Transporters without drug

| Gene | 2.3 vs Uninfected MØ: Mean | SEM | n | 2.2 vs Uninfected MØ: Mean | SEM | n |
|-------|--------|-------|---|--------|-------|---|
| ABCB1 | -2.354 | 0.442 | 6 | -0.406 | 0.431 | 6 |
| ABCG4 | -3.715 | 0.648 | 6 | -0.653 | 0.630 | 6 |
| ABCB5 | -1.192 | 0.380 | 6 | 1.351  | 0.363 | 6 |
| ABCA9 | 1.880  | 0.648 | 6 | 3.444  | 0.637 | 6 |
| ABCC2 | 0.454  | 0.321 | 6 | 1.818  | 0.314 | 6 |
| AQP2  | -1.191 | 0.529 | 6 | 0.745  | 0.514 | 6 |
| AQP3  | -0.940 | 0.402 | 6 | 0.431  | 0.395 | 6 |
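- Transporters with drug (label inferred from the gene list, which matches transporters_drug below)

| Gene | 2.3 vs Uninfected MØ: Mean | SEM | n | 2.2 vs Uninfected MØ: Mean | SEM | n |
|-------|--------|-------|---|--------|-------|---|
| ABCB1 | -0.697 | 0.349 | 6 | -1.255 | 0.337 | 6 |
| ABCG4 | 1.231  | 0.503 | 6 | 0.547  | 0.484 | 6 |
| AQP2  | 0.816  | 0.399 | 6 | 0.043  | 0.387 | 6 |
| AQP3  | -1.286 | 0.320 | 6 | -1.613 | 0.309 | 6 |
| AQP8  | 0.634  | 0.370 | 6 | 0.943  | 0.365 | 6 |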
Let us now see if I can recapitulate the plot…
nodrug_contrasts <- c("z23nosb_vs_uninf", "z22nosb_vs_uninf")
genes_no_drug <- c("IFI27", "RSAD2", "CCL8", "IFI44L", "OASL", "USP18", "IDO1", "IDO2", "KYNU", "AHR", "IL4I1", "SOD2", "NOTCH1", "DLL1", "DLL4", "HES1", "HEY1")
transporters_no_drug <- c("ABCB1", "ABCG4", "ABCB5", "ABCA9", "ABCC2", "AQP2", "AQP3")
drug_contrasts <- c("z23sb_vs_sb", "z22sb_vs_sb")
transporters_drug <- c("ABCB1", "ABCG4", "AQP2", "AQP3", "AQP8")
These values came out of the data structure called 'hs_macr_table'
z23nosb_uninf_values <- hs_macr_table[["data"]][["z23nosb_vs_uninf"]]
gene_idx <- z23nosb_uninf_values[["hgnc_symbol"]] %in% genes_no_drug
nodrug_rows <- z23nosb_uninf_values[gene_idx, ]
rownames(nodrug_rows) <- nodrug_rows[["hgnc_symbol"]]
z23_nodrug_values <- nodrug_rows[, c("deseq_logfc", "deseq_lfcse")]
z23_nodrug_values
## DataFrame with 17 rows and 2 columns
## deseq_logfc deseq_lfcse
## <numeric> <numeric>
## IL4I1 2.59300 0.4623
## AHR 0.93810 0.2236
## CCL8 6.22500 0.9280
## SOD2 2.76000 0.3490
## HES1 -0.01786 0.8599
## ... ... ...
## HEY1 0.5531 0.6520
## IFI27 7.2240 0.5662
## USP18 3.6440 0.4830
## IDO2 3.9340 1.2990
## DLL1 0.8268 0.5284
z22nosb_uninf_values <- hs_macr_table[["data"]][["z22nosb_vs_uninf"]]
gene_idx <- z22nosb_uninf_values[["hgnc_symbol"]] %in% genes_no_drug
nodrug_rows <- z22nosb_uninf_values[gene_idx, ]
rownames(nodrug_rows) <- nodrug_rows[["hgnc_symbol"]]
z22_nodrug_values <- nodrug_rows[, c("deseq_logfc", "deseq_lfcse")]
z22_nodrug_values
## DataFrame with 17 rows and 2 columns
## deseq_logfc deseq_lfcse
## <numeric> <numeric>
## IL4I1 0.03995 0.4618
## AHR 0.50310 0.2239
## CCL8 -0.31360 0.9406
## SOD2 0.42410 0.3528
## HES1 6.53600 0.7973
## ... ... ...
## HEY1 4.181 0.6273
## IFI27 2.702 0.5669
## USP18 0.999 0.4826
## IDO2 2.557 1.3410
## DLL1 3.455 0.5228
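z23_nodrug_values[["state"]] <- "z23_vs_uninfected"
z22_nodrug_values[["state"]] <- "z22_vs_uninfected"
## Record the gene names before rbind(): it makes duplicated rownames unique
## (e.g. "IFI27" and "IFI271"), which would split each gene into two x-axis entries.
z23_nodrug_values[["gene"]] <- rownames(z23_nodrug_values)
z22_nodrug_values[["gene"]] <- rownames(z22_nodrug_values)
plot_df <- rbind.data.frame(as.data.frame(z23_nodrug_values), as.data.frame(z22_nodrug_values))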
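The extraction above is done twice with the same steps, so it could be folded into a small helper. This is only a sketch: `extract_lfc()` is a hypothetical function, not part of hpgltools, and the toy table merely mimics the three columns used above (hgnc_symbol, deseq_logfc, deseq_lfcse) with made-up values and IDs.

```r
## Hypothetical helper: pull logFC and its standard error for a set of genes
## and tag the rows with the contrast they came from.
extract_lfc <- function(de_table, genes, state) {
  idx <- de_table[["hgnc_symbol"]] %in% genes
  wanted <- as.data.frame(de_table[idx, c("hgnc_symbol", "deseq_logfc", "deseq_lfcse")])
  colnames(wanted)[1] <- "gene"
  wanted[["state"]] <- state
  ## Keep the gene in a column and drop the rownames, so rbind() has nothing to rename.
  rownames(wanted) <- NULL
  wanted
}

## Toy stand-in for one element of hs_macr_table[["data"]] (made-up values).
toy_table <- data.frame(
  hgnc_symbol = c("IFI27", "ACTB", "HES1"),
  deseq_logfc = c(7.2, 0.1, -0.02),
  deseq_lfcse = c(0.57, 0.05, 0.86),
  row.names = c("id_a", "id_b", "id_c"))
genes <- c("IFI27", "HES1")
plot_df_sketch <- rbind(
  extract_lfc(toy_table, genes, "z23_vs_uninfected"),
  extract_lfc(toy_table, genes, "z22_vs_uninfected"))
plot_df_sketch
```

In the sketch the same toy table is used for both states purely to keep it short; in practice each call would get its own contrast table.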
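## I just realized that this is actually just a comparison of z23/z22;
## we should just take the adjusted p-values from that contrast for this.
z23_z22_comparison <- hs_macr_table[["data"]][["z23nosb_vs_z22nosb"]]
## Recompute the index against this table rather than reusing the one
## computed from z22nosb_vs_uninf, in case the row order differs.
gene_idx <- z23_z22_comparison[["hgnc_symbol"]] %in% genes_no_drug
nodrug_rows <- z23_z22_comparison[gene_idx, ]
nodrug_pvalues <- nodrug_rows[, c("deseq_p", "deseq_adjp")]
rownames(nodrug_pvalues) <- nodrug_rows[["hgnc_symbol"]]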
nodrug_pvalues
## DataFrame with 17 rows and 2 columns
## deseq_p deseq_adjp
## <numeric> <numeric>
## IL4I1 1.250e-13 3.949e-12
## AHR 8.308e-03 2.421e-02
## CCL8 3.677e-21 4.197e-19
## SOD2 6.181e-20 5.813e-18
## HES1 9.422e-38 2.215e-34
## ... ... ...
## HEY1 9.854e-17 5.410e-15
## IFI27 6.486e-28 2.310e-25
## USP18 1.772e-13 5.467e-12
## IDO2 1.047e-01 1.895e-01
## DLL1 4.352e-12 1.103e-10
ggplot(plot_df, aes(x = gene, y = deseq_logfc, fill = state)) +
geom_bar(position = position_dodge(), stat = "identity") +
geom_errorbar(aes(ymin = deseq_logfc - deseq_lfcse,
ymax = deseq_logfc + deseq_lfcse),
width = 0.2, position = position_dodge(0.9)) +
scale_fill_manual(values = c("#1B9E77", "#7570B3")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))

comparison <- c("z23_vs_uninfected", "z22_vs_uninfected")
comparisons <- rep(list(comparison), nrow(plot_df) / 2)
ggplot(plot_df, aes(x = gene, y = deseq_logfc, fill = state, add = deseq_lfcse, facet.by = "state")) +
geom_bar(position = position_dodge(), stat = "identity") +
geom_errorbar(aes(ymin = deseq_logfc - deseq_lfcse,
ymax = deseq_logfc + deseq_lfcse),
width = 0.2, position = position_dodge(0.9)) +
stat_compare_means() +
stat_compare_means(comparisons = comparisons, label.y = rownames(z23_nodrug_values)) +
scale_fill_manual(values = c("#1B9E77", "#7570B3")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
## Error in stat_compare_means(): could not find function "stat_compare_means"
Excellent, the values now match up. Now I just need to figure out why
the stupid hgnc IDs got lost… I can see them in the hs_annot data
structure, so I must have messed up when I regenerated the input to the
DE. Ok, I got to the same starting point now with identical values. As
soon as I did that, I looked at the resulting plot and realized that we
are actually just comparing z23 / z22.
Here is why: the plot as it stands is a comparison of the log2FC
values of the following two contrasts: z23/uninfected and
z22/uninfected; stated differently, this is (z23/uninf)/(z22/uninf),
which of course cancels out to just z23/z22.
Therefore it is much more parsimonious to just use the values from
z23/z22. I swear I have gone through this exact exercise on so, so many
occasions in the past; it is terrible.
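The cancellation is easy to check numerically. The values below are hypothetical mean expression levels for a single gene, not numbers from this dataset; in log2 space the ratio of ratios becomes a difference of log2FCs.

```r
## Hypothetical mean expression values for one gene (not from the data).
uninf <- 100
z23 <- 800
z22 <- 200

lfc_z23_uninf <- log2(z23 / uninf)
lfc_z22_uninf <- log2(z22 / uninf)
lfc_z23_z22 <- log2(z23 / z22)

## (z23/uninf) / (z22/uninf) == z23/z22, so the log2FCs subtract.
difference <- lfc_z23_uninf - lfc_z22_uninf
print(c(difference = difference, direct = lfc_z23_z22))
stopifnot(isTRUE(all.equal(difference, lfc_z23_z22)))
```

One caveat: the standard errors do not subtract the same way, so the error bars and p-values should come from the z23/z22 contrast itself rather than from the two separate contrasts.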
ggsignificance of the immune modulators
wanted_genes <- c("IFI27", "RSAD2", "CCL8", "IFI44L", "OASL",
"USP18", "IDO1", "IDO2", "KYNU", "AHR", "IL4I1",
"SOD2", "NOTCH1", "DLL1", "DLL4", "HES1", "HEY1")
modulator_plot <- ggsignif_paired_genes(
hs_macr, conditions = c("inf_z23", "inf_z22"), genes = wanted_genes)
## Running normalize_se.
## Warning in normalize_se(exp, ...): Quantile normalization and sva do not always
## play well together.
## Removing 9725 low-count genes (11756 remaining).
## transform_counts: Found 2226 values less than 0.
## Warning in transform_counts(count_table, method = transform, ...): NaNs
## produced
## Setting 34233 entries to zero.
## Using Row.names, ensembl_gene_id, ensembl_transcript_id, description, gene_biotype, cds_length, chromosome_name, strand, hgnc_symbol, transcript as id variables
## Error in ggsignif_paired_genes(hs_macr, conditions = c("inf_z23", "inf_z22"), : object 'merged' not found
## Error: object 'modulator_plot' not found
ggsignificance of the transporters
## First line is without drug
wanted_genes <- c("ABCB1", "ABCG4", "ABCB5", "AQP2", "AQP3",
## with drug
"ABCB1", "ABCG4", "AQP2", "AQP3", "AQP8")
transporter_plot <- ggsignif_paired_genes(
hs_macr, conditions = c("inf_z23", "inf_z22"), genes = wanted_genes)
## Running normalize_se.
## Warning in normalize_se(exp, ...): Quantile normalization and sva do not always
## play well together.
## Removing 9725 low-count genes (11756 remaining).
## transform_counts: Found 2226 values less than 0.
## Warning in transform_counts(count_table, method = transform, ...): NaNs
## produced
## Setting 34233 entries to zero.
## Error in `dplyr::arrange()` at magrittr/R/pipe.R:136:3:
## i In argument: `..1 = factor(hgnc_symbol, levels = genes)`.
## Caused by error in `levels<-`:
## ! factor level [6] is duplicated
## Error: object 'transporter_plot' not found
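The "factor level [6] is duplicated" error above is consistent with wanted_genes listing the same gene under both the without-drug and with-drug lines: element 6 is the second "ABCB1". A minimal sketch of reproducing and avoiding it in base R follows. Whether ggsignif_paired_genes() then does what I want with the deduplicated vector is not tested here.

```r
wanted_genes <- c("ABCB1", "ABCG4", "ABCB5", "AQP2", "AQP3",
                  "ABCB1", "ABCG4", "AQP2", "AQP3", "AQP8")

## factor() refuses duplicated levels; capture the error instead of stopping.
attempt <- tryCatch(
  factor(wanted_genes, levels = wanted_genes),
  error = function(e) e)
print(conditionMessage(attempt))

## Deduplicate while preserving the first-seen order.
unique_genes <- unique(wanted_genes)
gene_factor <- factor(wanted_genes, levels = unique_genes)
print(levels(gene_factor))
```

Passing unique_genes (or keeping separate vectors for the two drug states) should at least get past this particular error.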
pander::pander(sessionInfo())
## Warning: Your system is mis-configured: '/etc/localtime' is not a symlink
## Warning: It is strongly recommended to set envionment variable TZ to
## 'America/New_York' (or equivalent)
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
locale: C
attached base packages: grid,
stats4, stats, graphics, grDevices,
utils, datasets, methods and
base
other attached packages:
rWikiPathways(v.1.28.0), pathwayPCA(v.1.24.0),
Rgraphviz(v.2.52.0), graph(v.1.86.0),
SparseM(v.1.84-2), topGO(v.2.60.1),
GSVAdata(v.1.44.0), org.Hs.eg.db(v.3.21.0),
AnnotationDbi(v.1.70.0), IRanges(v.2.42.0),
S4Vectors(v.0.46.0), Biobase(v.2.68.0),
BiocGenerics(v.0.54.0), generics(v.0.1.4),
ReactomePA(v.1.52.0), edgeR(v.4.6.3),
ruv(v.0.9.7.1), ggstatsplot(v.0.13.1),
enrichplot(v.1.28.4), tidyr(v.1.3.1),
tibble(v.3.3.0), UpSetR(v.1.4.0),
hpgltools(v.1.2), Heatplus(v.3.16.0),
glue(v.1.8.0), ggplot2(v.3.5.2) and
ggbreak(v.0.1.6)
loaded via a namespace (and not attached):
R.methodsS3(v.1.8.2), dichromat(v.2.0-0.1),
GSEABase(v.1.70.0), progress(v.1.2.3),
Biostrings(v.2.76.0), vctrs(v.0.6.5),
ggtangle(v.0.0.7), shape(v.1.4.6.1),
effectsize(v.1.0.1), digest(v.0.6.37),
png(v.0.1-8), corpcor(v.1.6.10),
DEGreport(v.1.44.0), ggrepel(v.0.9.6),
bayestestR(v.0.17.0), correlation(v.0.8.8),
magick(v.2.9.0), MASS(v.7.3-65),
reshape(v.0.8.10), reshape2(v.1.4.4),
httpuv(v.1.6.16), foreach(v.1.5.2),
qvalue(v.2.40.0), withr(v.3.0.2),
psych(v.2.5.6), xfun(v.0.53), ggfun(v.0.2.0),
survival(v.3.8-3), memoise(v.2.0.1),
clusterProfiler(v.4.16.0), gson(v.0.1.0),
BiasedUrn(v.2.0.12), parameters(v.0.28.1),
GlobalOptions(v.0.1.2), tidytree(v.0.4.6),
gtools(v.3.9.5), logging(v.0.10-108),
R.oo(v.1.27.1), DEoptimR(v.1.1-4),
prettyunits(v.1.2.0), datawizard(v.1.2.0),
rematch2(v.2.1.2), KEGGREST(v.1.48.1),
promises(v.1.3.3), httr(v.1.4.7),
restfulr(v.0.0.16), meshes(v.1.34.0),
UCSC.utils(v.1.4.0), DOSE(v.4.2.0),
reactome.db(v.1.92.0), curl(v.7.0.0),
ggraph(v.2.2.2), polyclip(v.1.10-7),
GenomeInfoDbData(v.1.2.14), SparseArray(v.1.8.1),
RBGL(v.1.84.0), RcppEigen(v.0.3.4.0.2),
doParallel(v.1.0.17), xtable(v.1.8-4),
stringr(v.1.5.1), desc(v.1.4.3),
evaluate(v.1.0.4), S4Arrays(v.1.8.1),
BiocFileCache(v.2.16.1), preprocessCore(v.1.70.0),
hms(v.1.1.3), GenomicRanges(v.1.60.0),
colorspace(v.2.1-1), filelock(v.1.0.3),
magrittr(v.2.0.3), later(v.1.4.3),
viridis(v.0.6.5), ggtree(v.3.17.1.001),
lattice(v.0.22-7), genefilter(v.1.90.0),
robustbase(v.0.99-6), XML(v.3.99-0.19),
cowplot(v.1.2.0), matrixStats(v.1.5.0),
ggupset(v.0.4.1), pillar(v.1.11.0),
nlme(v.3.1-168), iterators(v.1.0.14),
caTools(v.1.18.3), compiler(v.4.5.0),
stringi(v.1.8.7), minqa(v.1.2.8),
SummarizedExperiment(v.1.38.1),
GenomicAlignments(v.1.44.0), plyr(v.1.8.9),
BiocIO(v.1.18.0), crayon(v.1.5.3),
abind(v.1.4-8), ggdendro(v.0.2.0),
gridGraphics(v.0.5-1), locfit(v.1.5-9.12),
graphlayouts(v.1.2.2), bit(v.4.6.0),
dplyr(v.1.1.4), fastmatch(v.1.1-6),
codetools(v.0.2-20), crosstalk(v.1.2.1),
bslib(v.0.9.0), paletteer(v.1.6.0),
GetoptLong(v.1.0.5), plotly(v.4.11.0),
remaCor(v.0.0.20), mime(v.0.13),
splines(v.4.5.0), circlize(v.0.4.16),
Rcpp(v.1.1.0), dbplyr(v.2.5.0), lars(v.1.3),
knitr(v.1.50), blob(v.1.2.4), clue(v.0.3-66),
BiocVersion(v.3.21.1), lme4(v.1.1-37),
fs(v.1.6.6), Rdpack(v.2.6.4), EBSeq(v.2.6.0),
openxlsx(v.4.2.8), ggplotify(v.0.1.2),
Matrix(v.1.7-3), statmod(v.1.5.0),
fANCOVA(v.0.6-1), tweenr(v.2.0.3),
pkgconfig(v.2.0.3), tools(v.4.5.0),
cachem(v.1.1.0), RhpcBLASctl(v.0.23-42),
rbibutils(v.2.3), RSQLite(v.2.4.3),
viridisLite(v.0.4.2), DBI(v.1.2.3),
numDeriv(v.2016.8-1.1), graphite(v.1.54.0),
fastmap(v.1.2.0), rmarkdown(v.2.29),
scales(v.1.4.0), gprofiler2(v.0.2.3),
Rsamtools(v.2.24.0), broom(v.1.0.9),
AnnotationHub(v.3.16.1), sass(v.0.4.10),
patchwork(v.1.3.2), BiocManager(v.1.30.26),
insight(v.1.4.2), varhandle(v.2.0.6),
farver(v.2.1.2), reformulas(v.0.4.1),
aod(v.1.3.3), tidygraph(v.1.3.1),
mgcv(v.1.9-3), yaml(v.2.3.10),
MatrixGenerics(v.1.20.0), rtracklayer(v.1.68.0),
cli(v.3.6.5), purrr(v.1.1.0),
txdbmaker(v.1.4.2), lifecycle(v.1.0.4),
mvtnorm(v.1.3-3), backports(v.1.5.0),
Vennerable(v.3.1.0.9000), BiocParallel(v.1.42.1),
annotate(v.1.86.1), MeSHDbi(v.1.44.0),
rjson(v.0.2.23), gtable(v.0.3.6),
parallel(v.4.5.0), ape(v.5.8-1),
testthat(v.3.2.3), limma(v.3.64.3),
jsonlite(v.2.0.0), bitops(v.1.0-9),
NOISeq(v.2.52.0), bit64(v.4.6.0-1),
brio(v.1.1.5), yulab.utils(v.0.2.1),
zip(v.2.3.3), geneLenDataBase(v.1.44.0),
RcppParallel(v.5.1.11-1), jquerylib(v.0.1.4),
GOSemSim(v.2.34.0), zeallot(v.0.2.0),
R.utils(v.2.13.0), pbkrtest(v.0.5.5),
lazyeval(v.0.2.2), pander(v.0.6.6),
ConsensusClusterPlus(v.1.72.0), shiny(v.1.11.1),
htmltools(v.0.5.8.1), GO.db(v.3.21.0),
rappdirs(v.0.3.3), blockmodeling(v.1.1.8),
tinytex(v.0.57), httr2(v.1.2.1),
XVector(v.0.48.0), RCurl(v.1.98-1.17),
rprojroot(v.2.1.0), treeio(v.1.32.0),
mnormt(v.2.1.1), gridExtra(v.2.3),
ggsankey(v.0.0.99999), EnvStats(v.3.1.0),
boot(v.1.3-31), igraph(v.2.1.4),
variancePartition(v.1.38.1), R6(v.2.6.1),
sva(v.3.56.0), DESeq2(v.1.48.1),
gplots(v.3.2.0), labeling(v.0.4.3),
GenomicFeatures(v.1.60.0), cluster(v.2.1.8.1),
pkgload(v.1.4.0), aplot(v.0.2.8),
GenomeInfoDb(v.1.44.2), nloptr(v.2.2.1),
rstantools(v.2.5.0), DelayedArray(v.0.34.1),
tidyselect(v.1.2.1), xml2(v.1.4.0),
ggforce(v.0.5.0), statsExpressions(v.1.7.1),
goseq(v.1.60.0), KernSmooth(v.2.23-26),
data.table(v.1.17.8), ComplexHeatmap(v.2.24.1),
htmlwidgets(v.1.6.4), fgsea(v.1.34.2),
RColorBrewer(v.1.1-3), biomaRt(v.2.64.0),
rlang(v.1.1.6), lmerTest(v.3.1-3) and
ggnewscale(v.0.5.2)
message("This is hpgltools commit: ", get_git_commit())
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 2315e0db2bd684765eb5d29c71cfe08dc63b4322
## This is hpgltools commit: Mon Sep 8 13:30:32 2025 -0400: 2315e0db2bd684765eb5d29c71cfe08dc63b4322
tmp <- saveme(filename = savefile)
## The savefile is: /lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/savefiles/03differential_expression.rda.xz
## The file does not yet exist.
## The save string is: con <- pipe(paste0('pxz > /lab/singularity/tmrc2_macrophage_deb/202509081525_outputs/savefiles/03differential_expression.rda.xz'), 'wb'); save(list = ls(all.names = TRUE, envir = globalenv()),
## envir = globalenv(), file = con, compress = FALSE); close(con)
## Error in save(list = ls(all.names = TRUE, envir = globalenv()), envir = globalenv(), : ignoring SIGPIPE signal
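I have not tracked down the cause of the SIGPIPE error above. It would be consistent with the pxz process on the other end of the pipe exiting early, for instance if pxz is not installed in the container or the output directory does not exist, but that is a guess. A defensive sketch is below; `save_with_fallback()` is a hypothetical helper, not part of hpgltools, and it falls back to R's built-in xz compression when pxz is not on the PATH.

```r
## Hypothetical helper: use pxz when available, else save(compress = "xz").
save_with_fallback <- function(objects, file, envir = globalenv()) {
  ## Make sure the target directory exists before writing.
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  if (nzchar(Sys.which("pxz"))) {
    con <- pipe(paste0("pxz > ", shQuote(file)), "wb")
    on.exit(close(con))
    save(list = objects, envir = envir, file = con, compress = FALSE)
  } else {
    save(list = objects, envir = envir, file = file, compress = "xz")
  }
  invisible(file)
}

## Small round trip on a temporary file.
demo_env <- new.env()
assign("demo_value", 42, envir = demo_env)
demo_file <- tempfile(fileext = ".rda.xz")
save_with_fallback("demo_value", demo_file, envir = demo_env)
restored <- new.env()
load(demo_file, envir = restored)
print(get("demo_value", envir = restored))
```

The single-threaded fallback is slower on a large workspace, but it does not depend on an external program.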
tmp <- loadme(filename = savefile)
devtools::load_all("~/hpgltools")
