Perform Network Diffusion from Seed Nodes
network_diffusion.Rd
Applies network diffusion techniques to propagate influence from a set of seed nodes across a graph. Supports Laplacian smoothing, heat diffusion, and random walk with restart (RWR).
Usage
network_diffusion(
graph,
seed_nodes,
method = c("laplacian", "heat", "rwr"),
alpha = 0.7,
t = 1,
restart_prob = 0.3,
normalize = TRUE,
precompute = NULL
)
Arguments
- graph
An
igraph
object representing the network to analyze or a data frame containing a symbolic edge list in the first two columns. Additional columns are considered as edge attributes. Must have named vertices.- seed_nodes
Character vector of seed node names (must match
V(graph)$name
).- method
Character. Diffusion method to use:
"laplacian"
: Solves the linear system \((I + \alpha L)^{-1} f_0\), where \(L\) is the (normalized) graph Laplacian and \(\alpha\) is a smoothing parameter. Internally, a sparse Cholesky decomposition is used for efficiency."heat"
: Applies the heat diffusion model \(e^{-tL} f_0\), where \(t\) controls diffusion time. A truncated Taylor expansion is used for approximation."rwr"
: Random Walk with Restart. Iteratively solves \(f = (1 - r)Pf + r f_0\), where \(P\) is the transition matrix and \(r\) is the restart probability.
- alpha
Damping factor for the Laplacian method. Default is
0.7
.- t
Time parameter for the heat diffusion method. Default is
1
.- restart_prob
Restart probability (usually between 0.3 and 0.7) for the RWR method. Default is
0.3
.- normalize
Logical. Whether to normalize the adjacency matrix (symmetric normalization for undirected graphs). Default is
TRUE
.- precompute
Optional list of precompute diffusion matrices (e.g., Laplacian, Cholesky factor, or transition matrix). Use
prepare_diffusion()
to generate this object and avoid redundant computations when calling this function repeatedly (e.g., in greedy optimization).
Value
A data frame with two columns:
- node
Node name
- score
Diffusion score representing influence from the seed nodes
Details
This function allows flexible application of network diffusion strategies, useful in systems biology (e.g., gene prioritization, pathway propagation), network analysis, and disease gene discovery. The underlying matrix operations are based on well-established diffusion models from graph theory.
For "rwr"
(random walk with restart), the algorithm iteratively propagates scores until convergence
based on a row-normalized transition matrix. Recommended method for large networks.
For "laplacian"
and "heat"
, the graph Laplacian is computed from the (optionally normalized) adjacency matrix.
For efficiency in iterative applications, precompute Laplacian and Cholesky decomposition using prepare_diffusion()
.
References
Köhler S, Bauer S, Horn D, Robinson PN. Walking the interactome for prioritization of candidate disease genes. Am J Hum Genet. 2008;82(4):949–958. doi:10.1016/j.ajhg.2008.02.013
Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol. 2010;6(1):e1000641. doi:10.1371/journal.pcbi.1000641
Examples
if (FALSE) { # \dontrun{
g <- sample_gnp(100, 0.05, directed = F)
V(g)$name <- as.character(seq_len(vcount(g)))
seed_nodes <- sample(V(g)$name, 5)
network_diffusion(g, seed_npodes, method = "laplacian")
} # }