Though there is no fundamental reason to separate the likelihood ratio testing and GSVA from the differential expression analyses, they are both tasks which reach back to the primary expression data, unlike GSEA. Therefore I chose to separate these two tasks into a their own document.
EdgeR (McCarthy, Chen, and Smyth (2012)) and DESeq2 (Love, Huber, and Anders (2014)) both provide easily accessible LRT analyses. If I remember correctly, the function I wrote to simplify these analyses is aware of both, but this document primarily uses the results from DESeq2. I am reasonably certain they end up basically identical…
The likelihood ratio testing performed on the TMRC3 data is intended to look for shared patterns of expression across some known, constant factor. This may be time (are there patterns across the visits of each person), or cell type (do some genes act consistently across type). We could also ask this question of our two clinics, or indeed across the cure/fail samples.
<- normalize_expt(tc_clinical, filter = TRUE) tc_clinical_filt
## Removing 5654 low-count genes (14298 remaining).
<- deseq_lrt(tc_clinical_filt, transform = "vst", interaction = FALSE,
tc_lrt_clinic interactor_column = "visitnumber",
interest_column = "clinic")
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 169 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## A large number of genes was given-- please, make sure this is not an error. Normally, only DE genes will be useful for this function.
## Working with 6279 genes.
## Working with 6279 genes after filtering: minc > 3
## Joining with `by = join_by(merge)`
## Joining with `by = join_by(merge)`
## Warning in `labels<-.dendrogram`(dend, value = value, ...): The lengths of the
## new labels is shorter than the number of leaves in the dendrogram - labels are
## recycled.
"cluster_data"]][["plot"]] tc_lrt_clinic[[
<- normalize_expt(t_clinical, filter=TRUE) t_clinical_filt
## Removing 5796 low-count genes (14156 remaining).
<- deseq_lrt(t_clinical_filt, transform = "vst", interaction = FALSE,
lrt_visit interactor_column = "visitnumber",
interest_column = "finaloutcome")
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 120 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## A large number of genes was given-- please, make sure this is not an error. Normally, only DE genes will be useful for this function.
## Working with 5445 genes.
## Working with 5444 genes after filtering: minc > 3
## Joining with `by = join_by(merge)`
## Joining with `by = join_by(merge)`
## Warning in `labels<-.dendrogram`(dend, value = value, ...): The lengths of the
## new labels is shorter than the number of leaves in the dendrogram - labels are
## recycled.
$cluster_data$plot lrt_visit
summary(lrt_visit[["favorite_genes"]])
## genes cluster
## Length:5444 Min. : 1.00
## Class :character 1st Qu.: 2.00
## Mode :character Median : 3.00
## Mean : 3.65
## 3rd Qu.: 4.00
## Max. :11.00
<- write_xlsx(data = as.data.frame(lrt_visit[["deseq_table"]]),
written excel = glue("excel/lrt_clinical_visit-v{ver}.xlsx"))
<- deseq_lrt(t_visitcf_monocyte, transform = "vst",
lrt_monocyte_visit interaction = FALSE,
interactor_column = "visitnumber",
interest_column = "finaloutcome")
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 68 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## Working with 12 genes.
## Working with 12 genes after filtering: minc > 3
## Joining with `by = join_by(merge)`
## Joining with `by = join_by(merge)`
## Warning in `labels<-.dendrogram`(dend, value = value, ...): The lengths of the
## new labels is shorter than the number of leaves in the dendrogram - labels are
## recycled.
"cluster_data"]][["plot"]] lrt_monocyte_visit[[
<- deseq_lrt(t_visitcf_neutrophil, transform = "vst",
lrt_neutrophil_visit interaction = FALSE,
interactor_column = "visitnumber",
interest_column = "finaloutcome")
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 49 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## Working with 952 genes.
## Working with 952 genes after filtering: minc > 3
## Joining with `by = join_by(merge)`
## Joining with `by = join_by(merge)`
## Warning in `labels<-.dendrogram`(dend, value = value, ...): The lengths of the
## new labels is shorter than the number of leaves in the dendrogram - labels are
## recycled.
"cluster_data"]][["plot"]] lrt_neutrophil_visit[[
<- deseq_lrt(t_visitcf_eosinophil, transform = "vst",
lrt_eosinophil_visit interaction = FALSE,
interactor_column = "visitnumber",
interest_column = "finaloutcome")
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Warning in deseq_lrt(t_visitcf_eosinophil, transform = "vst", interaction =
## FALSE, : There are no significant differences given the 0.05 adjusted p-value.
## Returning the full LRT table just so that you have something to look at.
GSVA(Hänzelmann, Castelo, and Guinney (2013)) is a completely different family of analysis, and at least in my hands is used as a way to explore and look for publications which may be of interest. The general idea is that it performs a specific set of normalizations on the raw data, rank orders the results; then cross references them against extant analyses looking for over represented gene sets in specific papers or categories (e.g. reactome/GO/etc). This may either use a built-in set of mSigDB (Liberzon et al. (2011)) gene sets, or you may pull in manually downloaded (newer) data. On my workstation, I do the latter because I can leverage the downloaded gmt/xml/etc files in order to get the full annotations for the gene sets and accompanying papers. This container just uses the pre-downloaded mSigDB data (“GSVAdata” (n.d.)), which is a bit older.
Reminder: I disabled these for the moment in order to consider how to handle the (non)inclusion of the mSigDB gmt files.
If one chooses, simple_gsva() can either load signatures from the GSVAdata package, which is a little old, or load an arbitrary set. load_gmt_signatures() provides a quick way to extract them from a gmt file.
<- load_gmt_signatures(signatures = "reference/msigdb/c7.all.v7.5.1.entrez.gmt",
broad_c7 signature_category = "c7")
<- load_gmt_signatures(signatures = "reference/msigdb/c2.all.v7.5.1.entrez.gmt",
broad_c2 signature_category = "c2")
<- load_gmt_signatures(signatures = "reference/msigdb/h.all.v7.5.1.entrez.gmt",
broad_h signature_category = "h")
Actually, I am going to try to semi-change my mind: GSVA comes with a dataset comprised of some older mSigDB data. Thus I will repeat each of the following blocks with an invocation which uses the older data and leave the previous invocations here so you can see what I actually ran to make pretty pictures. In the best case scenario, the results should be nearly identical with only a few categories missing between the new copy I downloaded and loaded above and the loaded-by-gsva version.
<- simple_gsva(
tc_celltype_gsva_c2 signatures = broad_c2,
tc_valid, msig_xml = "reference/msigdb/msigdb_v7.5.1.xml")
<- get_sig_gsva_categories(
tc_celltype_gsva_c2_sig
tc_celltype_gsva_c2,excel = "analyses/3_cali_and_tumaco/GSVA/tc_valid_gsva_c2.xlsx")
$subset_plot
tc_celltype_gsva_c2_sig$score_plot tc_celltype_gsva_c2_sig
There are two other important differences between these results and the results obtained when using the above invocations and the following:
heh, only 1 way to find out! Let us try!
Reminder to self, these are the current mSigDB human categories, I am not certain which (I assume all) are included in GSVAdata.
<- simple_gsva(
tc_celltype_gsva_c2 signature_category = "c2") tc_valid,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
tc_celltype_gsva_c2_sig
tc_celltype_gsva_c2,excel = "analyses/3_cali_and_tumaco/GSVA/tc_valid_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: failure_vs_cure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: cure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: failure. Adjust = BH
## The factor cure has 122 rows.
## The factor failure has 62 rows.
## Testing each factor against the others.
## Scoring cure against everything else.
## Scoring failure against everything else.
$subset_plot tc_celltype_gsva_c2_sig
$score_plot tc_celltype_gsva_c2_sig
Ditto for C7
<- simple_gsva(
tc_celltype_gsva_c7 signatures = broad_c7,
tc_valid, msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category="c7")
<- get_sig_gsva_categories(
tc_celltype_gsva_c7_sig
tc_celltype_gsva_c7,excel = "analyses/3_cali_and_tumaco/GSVA/tc_valid_gsva_c7.xlsx")
<- simple_gsva(
tc_celltype_gsva_c7 signature_category = "c7") tc_valid,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
tc_celltype_gsva_c7_sig
tc_celltype_gsva_c7,excel = "analyses/3_cali_and_tumaco/GSVA/tc_valid_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: failure_vs_cure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: cure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: failure. Adjust = BH
## The factor cure has 122 rows.
## The factor failure has 62 rows.
## Testing each factor against the others.
## Scoring cure against everything else.
## Scoring failure against everything else.
$subset_plot tc_celltype_gsva_c7_sig
$score_plot tc_celltype_gsva_c7_sig
I am not too fussed about the hallmark genes.
<- simple_gsva(
tc_celltype_gsva_h
tc_valid,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "h")
<- get_sig_gsva_categories(
tc_celltype_gsva_h_sig
tc_celltype_gsva_h,excel = "analyses/3_cali_and_tumaco/GSVA/tc_valid_gsva_h.xlsx")
The C2 set includes a broad array of databases and studies. Thus when we look at the various gene sets which are deemed to be poor scoring, we see things like ‘the retinoid cycle in cones’ or ‘Nicotine metabolism’.
<- simple_gsva(
t_clinical_gsva_c2
t_clinical,signatures = broad_c2,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c2")
<- get_sig_gsva_categories(
t_clinical_gsva_c2_sig
t_clinical_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_c2.xlsx")
$subset_plot
t_clinical_gsva_c2_sig$score_plot t_clinical_gsva_c2_sig
<- simple_gsva(
t_clinical_gsva_c2 signature_category = "c2") t_clinical,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_clinical_gsva_c2_sig
t_clinical_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: failure_vs_cure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: cure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: failure. Adjust = BH
## The factor cure has 67 rows.
## The factor failure has 56 rows.
## Testing each factor against the others.
## Scoring cure against everything else.
## Scoring failure against everything else.
$subset_plot t_clinical_gsva_c2_sig
$score_plot t_clinical_gsva_c2_sig
The get_sig_gsva_categories() function uses a few metrics to attempt to find msigDB categories of interest from the GSVA results. One of the methods employed which I hope will prove useful is to test the scores observed for one condition against all values in order to look for categories which are the most likely to be interesting.
With that in mind, here are the brief descriptions for a few of the most highly scored C2 groups and their associated PMIDs. Note that this are taken from the set of failed samples vs. all; there are some differences among the cure samples, but they are generally very similar.
In contrast, the C7 set is relatively focused on the immune system, and is therefore likely to be of direct interest.
<- simple_gsva(
t_clinical_gsva_c7
t_clinical,signatures = broad_c7,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c7")
<- get_sig_gsva_categories(
t_clinical_gsva_c7_sig
t_clinical_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_c7.xlsx")
<- simple_gsva(
t_clinical_gsva_c7 signature_category = "c7") t_clinical,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_clinical_gsva_c7_sig
t_clinical_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: failure_vs_cure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: cure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: failure. Adjust = BH
## The factor cure has 67 rows.
## The factor failure has 56 rows.
## Testing each factor against the others.
## Scoring cure against everything else.
## Scoring failure against everything else.
With the above in mind, here are a few of the most highly scored papers/sets when scoring the failed samples:
The H set is both much larger and smaller, it is comprised of just 50 sets, but they have many more genes associated with them.
<- simple_gsva(
t_clinical_gsva_h
t_clinical,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category="h")
<- get_sig_gsva_categories(
t_clinical_gsva_h_sig
t_clinical_gsva_h,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_h.xlsx")
<- simple_gsva(
t_clinical_gsva_h signature_category = "h") t_clinical,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_clinical_gsva_h_sig
t_clinical_gsva_h,excel = "analyses/4_tumaco/GSVA/t_clinical_gsva_h.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: failure_vs_cure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: cure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: failure. Adjust = BH
## The factor cure has 67 rows.
## The factor failure has 56 rows.
## Testing each factor against the others.
## Scoring cure against everything else.
## Scoring failure against everything else.
<- simple_gsva(
t_biopsy_gsva_c2
t_biopsies,signatures = broad_c2,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category="c2")
<- get_sig_gsva_categories(
t_biopsy_gsva_c2_sig
t_biopsy_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_c2.xlsx")
<- simple_gsva(
t_biopsy_gsva_c2 signature_category = "c2") t_biopsies,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_biopsy_gsva_c2_sig
t_biopsy_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 9 rows.
## The factor tumaco_failure has 5 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_biopsy_gsva_c7
t_biopsies,signatures = broad_c7,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c7")
<- get_sig_gsva_categories(
t_biopsy_gsva_c7_sig
t_biopsy_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_c7.xlsx")
<- simple_gsva(
t_biopsy_gsva_c7 signature_category = "c7") t_biopsies,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_biopsy_gsva_c7_sig
t_biopsy_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 9 rows.
## The factor tumaco_failure has 5 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_biopsy_gsva_h
t_biopsies,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "h")
<- get_sig_gsva_categories(
t_biopsy_gsva_h_sig
t_biopsy_gsva_h,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_h.xlsx")
<- simple_gsva(
t_biopsy_gsva_h signature_category = "h") t_biopsies,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_biopsy_gsva_h_sig
t_biopsy_gsva_h,excel = "analyses/4_tumaco/GSVA/t_biopsy_gsva_h.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 9 rows.
## The factor tumaco_failure has 5 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_eosinophil_gsva_c7
t_eosinophils,signatures = broad_c7,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c7")
<- get_sig_gsva_categories(
t_eosinophil_gsva_c7_sig
t_eosinophil_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_c7.xlsx")
<- simple_gsva(
t_eosinophil_gsva_c7 signature_category = "c7") t_eosinophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_eosinophil_gsva_c7_sig
t_eosinophil_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 17 rows.
## The factor tumaco_failure has 9 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_eosinophil_gsva_c2
t_eosinophils,signatures = broad_c2,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c2")
<- get_sig_gsva_categories(
t_eosinophil_gsva_c2_sig
t_eosinophil_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_c2.xlsx")
<- simple_gsva(
t_eosinophil_gsva_c2 signature_category = "c2") t_eosinophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_eosinophil_gsva_c2_sig
t_eosinophil_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 17 rows.
## The factor tumaco_failure has 9 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_eosinophil_gsva_h
t_eosinophils,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "h")
<- get_sig_gsva_categories(
t_eosinophil_gsva_h_sig
t_eosinophil_gsva_h,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_h.xlsx")
<- simple_gsva(
t_eosinophil_gsva_h signature_category = "h") t_eosinophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_eosinophil_gsva_h_sig
t_eosinophil_gsva_h,excel = "analyses/4_tumaco/GSVA/t_eosinophil_gsva_h.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 17 rows.
## The factor tumaco_failure has 9 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_monocyte_gsva_c7
t_monocytes,signatures = broad_c7,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c7")
<- get_sig_gsva_categories(
t_monocyte_gsva_c7_sig
t_monocyte_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_c7.xlsx")
<- simple_gsva(
t_monocyte_gsva_c7 signature_category = "c7") t_monocytes,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_monocyte_gsva_c7_sig
t_monocyte_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 21 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_monocyte_gsva_c2
t_monocytes,signatures = broad_c2,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c2")
<- get_sig_gsva_categories(
t_monocyte_gsva_c2_sig
t_monocyte_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_c2.xlsx")
<- simple_gsva(
t_monocyte_gsva_c2 signature_category = "c2") t_monocytes,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_monocyte_gsva_c2_sig
t_monocyte_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 21 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_monocyte_gsva_h
t_monocytes,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "h")
<- get_sig_gsva_categories(
t_monocyte_gsva_h_sig
t_monocyte_gsva_h,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_h.xlsx")
<- simple_gsva(
t_monocyte_gsva_h signature_category = "h") t_monocytes,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_monocyte_gsva_h_sig
t_monocyte_gsva_h,excel = "analyses/4_tumaco/GSVA/t_monocyte_gsva_h.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 21 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_neutrophil_gsva_c7
t_neutrophils,signatures = broad_c7,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c7")
<- get_sig_gsva_categories(
t_neutrophil_gsva_c7_sig
t_neutrophil_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_c7.xlsx")
<- simple_gsva(
t_neutrophil_gsva_c7 signature_category = "c7") t_neutrophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_neutrophil_gsva_c7_sig
t_neutrophil_gsva_c7,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_c7.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 20 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_neutrophil_gsva_c2
t_neutrophils,signatures = broad_c2,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "c2")
<- get_sig_gsva_categories(
t_neutrophil_gsva_c2_sig
t_neutrophil_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_c2.xlsx")
<- simple_gsva(
t_neutrophil_gsva_c2 signature_category = "c2") t_neutrophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_neutrophil_gsva_c2_sig
t_neutrophil_gsva_c2,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_c2.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 20 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.
<- simple_gsva(
t_neutrophil_gsva_h
t_neutrophils,signatures = broad_h,
msig_xml = "reference/msigdb/msigdb_v7.5.1.xml",
signature_category = "h")
<- get_sig_gsva_categories(
t_neutrophil_gsva_h_sig
t_neutrophil_gsva_h,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_h.xlsx")
<- simple_gsva(
t_neutrophil_gsva_h signature_category = "h") t_neutrophils,
## Converting the rownames() of the expressionset to ENTREZID.
## 574 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.
## Before conversion, the expressionset has 19952 entries.
## After conversion, the expressionset has 19337 entries.
## Adding descriptions and IDs to the gene set annotations.
<- get_sig_gsva_categories(
t_neutrophil_gsva_h_sig
t_neutrophil_gsva_h,excel = "analyses/4_tumaco/GSVA/t_neutrophil_gsva_h.xlsx")
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$libsize.
## Limma step 1/6: choosing model.
## Assuming this data is similar to a micro array and not performign voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Finished make_pairwise_contrasts.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/1: Creating table: tumacofailure_vs_tumacocure. Adjust = BH
## Limma step 6/6: 1/2: Creating table: tumacocure. Adjust = BH
## Limma step 6/6: 2/2: Creating table: tumacofailure. Adjust = BH
## The factor tumaco_cure has 20 rows.
## The factor tumaco_failure has 21 rows.
## Testing each factor against the others.
## Scoring tumaco_cure against everything else.
## Scoring tumaco_failure against everything else.