^{1} Department of Bioinformatics, Tongji University, Shanghai, China
^{2} Institute of Protein Research, Tongji University, Shanghai, China
^{3} CAS-MPG Partner Institute of Computational Biology, Shanghai, China
^{4} Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, China
^{5} Center for Systems Biology, Soochow University, Suzhou, China
^{§} Corresponding author: bairong.shen@suda.edu.cn
DS 1. The Synthetic Dataset from a Typical Mammalian Cell Cycle Pathway
Figure 1-A.
The calculated mutual information matrix for the 36 gene pairs among 9 genes from the mammalian G_{1}/S cell cycle transition network. The axes are numbered GE No. 1 to GE No. 9, representing the species pRB, E2F1, CycDi, CycDa, AP-1, pRBp, pRBpp, CycEi and CycEa, respectively. The diagonal elements all equal one, since mutual information is maximal when a variable is measured against itself. Because mutual information is symmetric (i.e. I(X;Y)=I(Y;X)) and nonnegative (i.e. I(X;Y)≥0), the matrix is symmetric and all of its elements are nonnegative.
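The properties listed in this caption (unit diagonal, symmetry, nonnegativity, 36 pairs for 9 genes) can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the equal-width binning, the normalization by sqrt(H(X)H(Y)) used to obtain a unit diagonal, and the random profiles are all hypothetical choices, not necessarily the estimator used for the figure.

```python
import math
import random
from collections import Counter

def discretize(values, bins=4):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def normalized_mi(x, y, bins=4):
    """I(X;Y) / sqrt(H(X) H(Y)); assumed normalization giving 1 for X == Y."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    hx, hy = entropy(dx), entropy(dy)
    if hx == 0 or hy == 0:
        return 0.0  # a constant profile carries no information
    mi = hx + hy - entropy(list(zip(dx, dy)))
    return max(mi, 0.0) / math.sqrt(hx * hy)

def mi_matrix(profiles, bins=4):
    """Symmetric matrix of normalized mutual information between profiles."""
    n = len(profiles)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = normalized_mi(profiles[i], profiles[j], bins)
    return m

# Hypothetical data: 9 random "expression profiles" with 20 time points each.
rng = random.Random(0)
profiles = [[rng.random() for _ in range(20)] for _ in range(9)]
matrix = mi_matrix(profiles)
n_pairs = sum(1 for i in range(9) for j in range(i + 1, 9))  # 9 * 8 / 2
```

The same pair count formula n(n-1)/2 gives the 276 pairs quoted for the 24 genes of DS2.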
Figure 1-B.
The descending-sorted mutual information, correlation coefficients and corresponding P-values for all pairwise candidates of the mammalian cell cycle pathway. The upper subplot shows the descending-sorted mutual information of all 36 gene pairs, and the lower subplot shows the descending-sorted correlation coefficients and their corresponding P-values. As indicated by the vertical dotted line in the lower plot, 16 pairs in total have P-values smaller than 0.05.
Figure 1-C.
Associativity measure statistics for the APG and QPG groups in DS1, based on the MICORPS concepts defined in the methodology section.
Table 1. Authentic Pairwise Genes from Dataset 1
No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
R | 2 | 2 | 2 | 2 | 3 | 3 | 4 | 8 | 6 | 3 | 1 | 5 | 4 | 7
C | 6 | 4 | 5 | 3 | 4 | 5 | 5 | 9 | 9 | 9 | 3 | 6 | 6 | 9
* R: Row, C: Column
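To read Table 1 in terms of gene names, the row and column indices can be mapped back to the species list. This is a minimal sketch that assumes R and C refer to the GE No. 1 to GE No. 9 numbering given in the caption of Figure 1-A; the helper name `pair_names` is hypothetical.

```python
# Species in the order GE No. 1 to GE No. 9 (from the Figure 1-A caption).
GENES = ["pRB", "E2F1", "CycDi", "CycDa", "AP-1",
         "pRBp", "pRBpp", "CycEi", "CycEa"]

# Row (R) and column (C) indices of the 14 pairs listed in Table 1.
ROWS = [2, 2, 2, 2, 3, 3, 4, 8, 6, 3, 1, 5, 4, 7]
COLS = [6, 4, 5, 3, 4, 5, 5, 9, 9, 9, 3, 6, 6, 9]

def pair_names(rows, cols, genes=GENES):
    """Translate 1-based (row, column) indices into gene-name pairs."""
    return [(genes[r - 1], genes[c - 1]) for r, c in zip(rows, cols)]

apg_pairs = pair_names(ROWS, COLS)
for number, (a, b) in enumerate(apg_pairs, start=1):
    print(number, a, b)
```

Under that assumption, pair No. 1 (R = 2, C = 6) corresponds to E2F1 and pRBp.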
Table 2. Questionable Pairwise Genes from Dataset 1
No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
R | 2 | 3 | 5 | 1 | 1 | 4 | 2 | 6 | 5 | 1 | 4 | 2 | 3 | 3 | 7
C | 9 | 7 | 7 | 7 | 4 | 7 | 7 | 7 | 9 | 5 | 9 | 8 | 8 | 6 | 8
* R: Row, C: Column
DS 2. The Cell Cycle Microarray Dataset under the Elutriation Treatment
Figure 2-A.
The calculated mutual information matrix for the 276 gene pairs among the 24 cell-cycle genes. The diagonal elements all equal one, since mutual information is maximal when a variable is measured against itself. The mutual information matrix is nonnegative (I(X;Y)≥0) and symmetric (I(X;Y) = I(Y;X)).
Figure 2-B.
The descending-sorted mutual information, correlation coefficients and corresponding P-values for all pairwise candidates of the cell cycle regulatory network. As indicated by the vertical dotted line in the lower plot, 105 pairs in total have P-values smaller than 0.05.
Figure 2-C.
The calculated P-value statistical confidence areas for the 276 pairwise gene samples with respect to the correlation coefficient and mutual information. In the upper plot, the blue dots represent the sorted correlation coefficients of the pairwise gene samples, and the khaki area shows how the percentage of selected samples satisfying the confidence criterion varies. Here the confidence criterion is defined as the percentage of pair samples with P-values smaller than 0.05 among all currently selected samples. The lower plot shows the relationship between the confidence variation and the descending-sorted mutual information of the pair samples. The red dotted line gives the mutual information values, and the reseda area denotes the relative confidence statistics.
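The confidence criterion described in this caption can be written as a running fraction over the sorted pairs. The following is a minimal sketch under assumptions: the function name `confidence_curve`, the toy scores and the toy P-values are hypothetical, and "currently selected samples" is read as the top-k pairs after sorting by descending score.

```python
def confidence_curve(scores, p_values, alpha=0.05):
    """For each top-k selection (pairs sorted by descending score), return
    the fraction of selected pairs whose P-value is smaller than alpha."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve, hits = [], 0
    for k, i in enumerate(order, start=1):
        if p_values[i] < alpha:
            hits += 1
        curve.append(hits / k)
    return curve

# Hypothetical scores (e.g. mutual information or correlation coefficient)
# and hypothetical P-values for four gene pairs.
scores = [0.9, 0.2, 0.7, 0.5]
p_values = [0.01, 0.60, 0.20, 0.03]
curve = confidence_curve(scores, p_values)
```

Plotting `curve` against the sorted scores would give a shaded confidence area of the kind described for the upper and lower plots.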
Figure 2-D.
The three-dimensional graph of the authentic (APGs), questionable (QPGs), and unauthentic pairwise genes (UPGs) under different thresholds of mutual information and correlation coefficient. The P-value threshold is set to 0.05. In total, there are 276 pairs among the 24 genes of the cell cycle regulatory network. The horizontal axis represents the mutual information thresholds, and the vertical axis represents the correlation coefficient thresholds.
Figure 2-E. Associativity measure statistics for the APG group in DS2, based on the MICORPS concepts defined in the methodology section.
Figure 2-F.
The phase-shift statistics for the APG group (83 pairwise genes in total, sorted by the descending mutual information of each pair), calculated based on the signal processing concepts defined above. Red (+1) represents a leading phase shift for the pairwise genes, black (-1) a lagging phase shift, and white marks pairs without any phase shift under the specific gain thresholds.
DS 3. The Microarray Dataset of a p53 Pathway with Multiple Feedback Loops
Figure 3-A.
Mutual information matrix for the triplicate MOTL4 microarray experiments, performed under irradiation from 0 to 12 hours at 2-hour intervals.
Figure 3-B. The descending-sorted mutual information, correlation coefficients and corresponding P-values for all pairwise candidates of the multi-feedback p53 pathway. The mutual information values are spread evenly over the range from 0.3134 to 1, whereas only 10 candidates have Pearson correlation P-values below 0.05, as indicated by the vertical dashed line in the lower subgraph.
Figure 3-C.
The calculated P-value statistical confidence areas for the 120 pairwise samples, obtained via dynamic thresholding with respect to the correlation coefficient and mutual information. In the upper plot, the blue dots represent the sorted correlation coefficients of the pairwise gene samples, and the khaki area shows how the percentage of selected samples satisfying the confidence criterion varies. Here the confidence criterion is defined as the percentage of pair samples with P-values smaller than 0.05 among all currently selected samples. The lower plot shows the relationship between the confidence variation and the descending-sorted mutual information of the pair samples. The red dotted line gives the mutual information values, and the reseda area denotes the relative confidence statistics.
Figure 3-D. The calculated statistics of the authentic (APGs), questionable (QPGs), and unauthentic (UPGs) groups under different thresholds of mutual information and correlation coefficient. The P-value threshold is set to 0.8 to ensure enough candidates in the APG group for the network reconstruction.
Figure 3-E.
The phase-shift statistics for the APG group (55 gene pairs in total, sorted by the descending mutual information of each pair), calculated based on the signal processing concepts defined above. Red (+1) represents a leading phase shift for the pairwise genes, black (-1) a lagging phase shift, and white marks pairs without any phase shift under the specific gain thresholds.
Figure 3-F. Associativity measure statistics for the APG group in DS3, based on the MICORPS concepts defined in the methodology section.
Figure 3-G. The constructed genetic graph with the gain threshold at 1. As depicted in the graph, #5 (cdk2) and #6 (Rb) are weakly connected nodes, whereas #3 (MDM2), #10 (β-catenin), etc. are strongly connected ones under the current gain threshold.