- Introduce
- Example 1 : Calculating the centrality scores of subpathways.
- Example 2 : Calculating the drug-disease reverse association score and corresponding pvalue of drugs.
- Visualize 1: Plot a subpathway network structure graph.
- Visualize 2: Plot a chemical molecular formula of the drug or compound .
- Visualize 3: Plot a heatmap of the subpathways that are regulated by disease.
- Visualize 4: Plot heatmaps of the subpathways that are regulated by drugs.

We developed a novel software package (DRviaSPCN) that enables repurposing drugs via a subpathway (SP) crosstalk network. The main process include construction of the SP network and calculation of the centrality scores of SPs to reflect the influence of SP crosstalk, calculating the enrichment score of subpathways and weighting them with corresponding centrality score to get weighted enrichment score(weighted-ES), calculation of enrichment scores of drug- and disease-induced dysfunctional SPs and weighted them by the centrality scores of SPs, evaluation of the drug-disease reverse association at the weighted SP level, identification of cancer candidate drugs. There are also several functions used to visualize the results such as visualization of the subpathway network structure of interest, chemical molecular formula of the drug or compound, and heatmap of the expression of subpathways in different sample types.

This vignette illustrates how to easily use the **DRviaSPCN
package**. Here, with the use of functions in this package, users
could identify potential therapeutic drugs for disease through
estimating drug-disease reverse association.

**The method consists of three parts:**

**1.Evaluating influence of SP crosstalk.** We think two
SPs are functionally related if there are some genes in two SPs that
share at least more than one common biological function (GO term). To
analyze the influence of SP crosstalk, we first construct a weighted
SP/GO bipartite network. We defined an edge between a SP and a Go term
if they have at least one common gene. For a certain disease, we used
the different level of the shared genes between two types of samples
(disease and normal), and the Jaccard index between a pair of SP and go
term to define the weight of the edge. Next, we constructed a SP-SP
network based on the SP-GO network. Similarly, we defined an edge
between two subpathways if they have at least a common biology function.
After the above steps, a SP-SP network induced by disease could be
constructed. Then, we applied random walk algorithm to the SP-SP network
to calculate eigenvector centrality score of SPs which can reflect the
influence of SP crosstalk. Finally, the statistical significance
(p-value) of these centrality scores was assessed using a
bootstrap-based randomization method. In the same way, the eigenvector
centrality score and p-value of SPs induced by each drug can also be
calculated.

**2.Evaluating drug-disease reverse association based on
disease- and drug-induced SPs weighted by the SP crosstalk.** In
this part, CMap build 02 raw data was downloaded from the CMap website
(Lamb et al. 2006). After constructing gene expression profiles, the
Gene Set Enrichment Analysis (GSEA) was used to calculate the enrichment
score of SPs For a disease, we applied the GSEA method to the gene
expression profile of the disease. After calculating the enrichment
score (*ES*) of SPs, we weighted them with corresponding
centrality scores to calculate weighted enrichment score (weighted-ES).
In the same way, the weighted enrichment score (weighted-ES) of SPs
induced by each drug can also be calculated. We defined a drug-disease
reverse association score (*RS*) to reflect the treatment extent
of a drug at the SP level. For each drug and certain disease, the SPs
were ranked in descending order based on the corresponding weighed-ES.
We mapped the up-and down-regulated subpathways induced by disease to
the ranked list of SPs induced by each drug to calculate the
*ks.up* and *ks.down*. And the *RS* is equal to
*ks.up - ks.down*.

**3.Identifying statistically significant disease-candidate
drugs.** Generally, a drug was applied with different instances
(different concentrations, cancer cell lines or duration), we thus
calculated the RS for each instance of all drugs. After normalization, a
ranked instance list was constructed according to the normalized scores
of all instances. For a given drug, the instance set of the drug was
extracted and mapped to the ranked list, and the Kolmogorov-Smirnov
statistic was used to compute the drug enrichment score (DES) based on
the instance set. If the instance set of the drug enriches at the
negative RS region of the list, the DES will be strong negative
indicating the drug in different instances may have consistent treatment
effect on the disease. To estimate the statistical significance
(empirical p-values) of the DES, we randomly selected the same number of
instances for the drug to recalculate the DES, and we repeated this
process N times. The p-value was then calculated as p-value = M/N, where
M is the number of randomly selected DESs less than the observed DES. In
the package, the default randomly selected times are set at 1000. To
correct for multiple comparisons, we adjusted the empirical p-values of
drugs using the false discovery rate (FDR) method

This package provides the

`CalCentralityScore`

function to calculate the centrality score of subpathways and the corresponding p-value.This package provides the

`Optimaldrugs`

function to calculate the DES of drugs and corresponding p-value.This package provides the

`plotSPW`

function to plot SP network structure.This package provides the

`getMolecularFM`

function to plot the chemical molecular formula of the drug or compound.This package provides the

`Disease2SPheatmap`

function to plot heatmap of the activities of subpathways in different sample types that are regulated by disease.This package provides the

`Drug2SPheatmap`

function to plot heatmap of the activities of subpathways in different sample types that are regulated by drugs.This package provides the

`GetExample`

function to return example data and environment variables, such as gene expression profile, sample label and so on.

In addition, the essential data `DrugSPESCMatrix`

and
`DrugSPPvalueMatrix`

which are subpathways weighted-ES
induced by all drugs and statistic significance (p-value) of centrality
score of subpathways regulated by all drugs were stored in our
DRviaSPCNData package. Users could download and use this package by the
following code:

```
### Download DRviaSPCNData package from GitHub
library(devtools)
install_github("hanjunwei-lab/DRviaSPCNData",force = TRUE)
library(DRviaSPCNData)
### Get weighted-ES of subpathways
DrugSPESCMatrix<-GetData("DrugSPESCMatrix")
### Get p-value of subpathways centrality score
DrugSPPvalueMatrix<-GetData("DrugSPPvalueMatrix")
```

The function “CalCentralityScore” was used to calculate the centrality scores of SPs to reflect the crosstalk influence, which were used as weights in the calculation of drug-disease reverse association score.

The commands are as follows:

`## Warning: package 'igraph' was built under R version 4.3.2`

```
###Obtain input data
GEP<-GetExample('GEP')# Get the gene expression profile
Slabel<-GetExample('Slabel')# Get the sample class label
```

```
###Run the function
CentralityScoreResult<-CalCentralityScore(ExpData=GEP,Label=Slabel,nperm = 1000)
```

```
## SubPathID Size Centralscore Pvalue FDR
## 1 04920_4 36 0.016816433 0.000 0.000
## 2 05167_28 9 0.015350713 0.000 0.000
## 3 04657_3 10 0.014931183 0.003 0.611
## 4 05205_51 7 0.014164259 0.006 0.611
## 5 05200_31 22 0.010549439 0.007 0.611
## 6 05205_52 14 0.011014076 0.009 0.611
## 7 04625_6 9 0.013364221 0.011 0.611
## 8 05134_7 14 0.009284269 0.011 0.611
## 9 05167_6 30 0.009896507 0.011 0.611
## 10 01521_4 47 0.009160094 0.013 0.611
```

The function `Optimaldrugs`

is used
to calculate the *DES* and statistic significance of drugs. The
detailed algorithm can be seen in the introduction part. Users could
screen out the optimal therapeutic drugs according to a specific
threshold. Here we provide weighted and unweighted methods to calculate
the score, which can be selected by parameters *weight = ’’* .
The screening criteria of the up- and down-regulated subpathways can be
changed through the parameters *pcut = ’’* and *topcut =
’’*. The commands are as follows:

```
###Run the function
Opdrugresult<-Optimaldrugs(ExpData=GEP,Label=Slabel,DrugSPESC=DrugSPESCMatrix,
CentralityScore=CentralityScoreResult,nperm=1000,topcut=10,
pcut=0.01,weight=FALSE)
```

```
## Drug DES pvalue FDR
## 421 valproic acid -0.4621059 0.000 0.0000000
## 1296 xylometazoline -0.9143394 0.000 0.0000000
## 1299 monobenzone -0.9189044 0.000 0.0000000
## 1305 Prestwick-559 -0.9795918 0.000 0.0000000
## 1309 gefitinib -0.9989259 0.000 0.0000000
## 1292 methotrexate -0.8941998 0.001 0.2181667
## 1158 trichostatin A -0.6055317 0.002 0.2908889
## 1182 fluphenazine -0.6356069 0.002 0.2908889
## 1302 4,5-dianilinophthalimide -0.9481740 0.002 0.2908889
## 1275 phthalylsulfathiazole -0.8378088 0.004 0.3490667
```

The function `plotSPW`

used to plot
a subpathway network structure graph. The user just needs to input an
interest subpathway id such as “00020_4”.

The commands are as follows:

```
###load depend package
library(igraph)
###plot network graph of the subpathway "00020_4"
plotSPW("00020_4")
```

The function `getMolecularFm`

can
obtain a chemical molecular formula of the drug or compound. Then users
could visualize the molecular formula through function “plot”.

The commands are as follows:

The function `Disease2SPheatmap`

plots a heat map of the subpathways that are regulated by disease. The
input is the result of function `CalCentralityScore`

, disease
gene expression profile and sample class in the expression profile. We
map subpathways to the disease gene expression through ssgsea to get a
subpathway abundance matrix. Then we visualize the matrix by heatmap.
Users could change the threshold that is used to screen significant
subpathways through the param *pcut*.

The commands are as follows:

`## Warning: package 'pheatmap' was built under R version 4.3.2`

```
###Run the function
Disease2SPheatmap(CentralityScore=CentralityScoreResult,ExpData=GEP,Label=Slabel,pcut=0.05,
bk=c(-2,2),cluster.rows=FALSE,cluster.cols=FALSE,
show.rownames=TRUE,show.colnames=FALSE,
col=c("navy","firebrick3"),cell.width=NA,
cell.height=NA,scale="row",fontsize=7,
fontsize.row=9,fontsize.col=10)
```

The function `Drug2SPheatmap`

plots
heatmaps of the subpathways that are regulated by drugs. The function
input is a character which is drug name, disease gene expression profile
and sample class in the expression profile. We map subpathways to the
disease gene expression through ssgsea to get a subpathway abundance
matrix. Then we visualize the matrix by heatmap. Users could change the
threshold that is used to screen significant subpathways through the
param *pcut*. The result of this function is a list including all
heatmaps.

The commands are as follows:

```
###Load depend package
library(GSVA)
library(pheatmap)
###Run the function
Drug2SPheatmap(drugname="methotrexate_HL60_6_8.8e-06",
DrugSPPvalue=DrugSPPvalueMatrix,ExpData=GEP,
Label=Slabel,pcut=0.05,bk=c(-2,2),cluster.rows=FALSE,
cluster.cols=FALSE,show.rownames=TRUE,
show.colnames=FALSE,col=c("navy","firebrick3"),
cell.width=NA,cell.height=NA,scale="row",
fontsize=6,fontsize.row=9,fontsize.col=10)
```