Bioinfo-Network
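Systems Biology
Systems biology takes an integrative approach: scientists study and model biological pathways and networks through the interplay of experiment and theory.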
Integrative approaches:
Scientists study and model pathways and networks by combining experimental data with theoretical insights.
Integration of biological data:
- Bring together diverse datasets to understand how biological systems function.
Study of relationships and interactions:
- Analyze how the different parts of a biological system relate to and interact with each other.
Inference using mixed data types:
- Explore connections and infer knowledge by using diverse types of data.
Developing a whole-system model:
- Build comprehensive models that represent the entire biological system.
Modeling and prediction:
- Simulate and predict how a system behaves when it is perturbed.
Together, these points reflect the holistic, interdisciplinary approach of systems biology, which aims to understand and predict the complex dynamics of biological systems.
Networks
Networks provide a natural description of the relationships between the components of a system.
Examples of biological networks:
Protein-protein interaction network: Describes the interactions between proteins.
Protein domain co-occurrence network: Shows how protein domains co-occur across different proteins.
Metabolic networks: Represent metabolic pathways and chemical reactions within a cell.
Transcription networks: Illustrate regulatory interactions between transcription factors and target genes.
Key takeaway:
- Networks provide an intuitive and powerful way to visualize and analyze complex relationships in biological systems.
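Basic Network Features
Degree: The number of connections (edges) that a node has.
Distance: The number of edges on the shortest path between two nodes.
Path: A sequence of edges connecting one node to another.
- Reachability: Whether two nodes can reach each other, i.e. whether a path exists between them.
- Closeness: The average shortest-path distance from a node to the other nodes; closeness centrality is commonly defined as the reciprocal of this average.
- Betweenness: How often a node lies on the shortest paths between other nodes (betweenness centrality).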
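The features above can be computed with a breadth-first search on a small graph. The following is a minimal sketch, not a reference implementation: the toy network, the adjacency-dictionary representation, and the helper names `bfs`, `average_distance`, and `betweenness` are all assumptions made for illustration. It treats the network as undirected and unweighted, and counts each pair of nodes once when scoring betweenness.

```python
from collections import deque
from itertools import combinations

def bfs(adj, source):
    """Breadth-first search from source.

    Returns (dist, paths): dist[v] is the number of edges on a shortest path
    from source to v, and paths[v] is how many distinct shortest paths exist.
    Unreachable nodes are absent from both dictionaries.
    """
    dist, paths = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time we see v
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u is a predecessor of v
                paths[v] += paths[u]
    return dist, paths

def average_distance(adj, node):
    """Mean shortest-path distance from node to the nodes it can reach."""
    dist, _ = bfs(adj, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others) if others else float("inf")

def betweenness(adj):
    """For each node, sum over node pairs (s, t) the fraction of shortest
    s-t paths that pass through it (unnormalized, undirected)."""
    info = {n: bfs(adj, n) for n in adj}
    score = {n: 0.0 for n in adj}
    for s, t in combinations(adj, 2):
        dist_s, paths_s = info[s]
        if t not in dist_s:              # s and t are not connected
            continue
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, paths_v = info[v]
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return score

# Hypothetical toy network: a chain A-B-C-D plus an isolated node E.
adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": set(),
}

degree = {node: len(neighbors) for node, neighbors in adj.items()}
dist_from_a, _ = bfs(adj, "A")
reachable_a_e = "E" in dist_from_a       # reachability of E from A

print("degree:", degree)
print("distance A-D:", dist_from_a["D"])
print("A reaches E:", reachable_a_e)
print("average distance from A:", average_distance(adj, "A"))
print("betweenness:", betweenness(adj))
```

In this chain the two middle nodes B and C sit on every shortest path that crosses the chain, so they receive the highest betweenness, while the end nodes and the isolated node score zero.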
DNA-Protein
- Transcriptional regulatory networks
- Methylation networks
RNA-RNA
- miRNA regulatory networks
RNA-Protein
- Splicing regulatory networks
Protein-Protein
- Co-expression networks
- Co-localization networks
- Co-evolution networks
- Structure networks
- Pathway networks
- Protease regulatory networks
- Signal transduction networks
- Gene Ontology networks
How to Build Biological Networks?
- Search/Retrieve from knowledge bases
- Predict from genome sequences
- Predict from “omics” data
- Predict from literature
- Integrate and analyze
- Meta-networks from genome scale data analysis
Steps to Build Biological Networks:
- Search/Retrieve from knowledge bases:
  - Use existing databases such as STRING, KEGG, or Reactome to retrieve pre-built networks or related data.
- Predict from genome sequences:
  - Analyze DNA sequences to infer potential interactions, such as transcription factor binding or regulatory elements.
- Predict from "omics" data:
  - Use transcriptomics, proteomics, metabolomics, or other high-throughput data to identify relationships between biological entities.
- Predict from literature:
  - Extract interaction data from scientific publications using text mining or manual curation.
- Integrate and analyze:
  - Combine multiple data sources to construct comprehensive networks and analyze their properties.
- Meta-networks from genome-scale data analysis:
  - Build higher-order networks by integrating multiple types of interactions or networks.
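As one illustration of the "predict from omics data" step, a co-expression network can be sketched by linking genes whose expression profiles are strongly correlated. This is only a sketch under stated assumptions: the gene names and expression values are made up, the 0.9 cutoff on the absolute Pearson correlation is an arbitrary choice, and `pearson` and `coexpression_edges` are hypothetical helper names. Real analyses usually add normalization and significance testing.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges

# Made-up expression values (one list per gene, one entry per sample).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 2.5],
}

edges = coexpression_edges(expression)
for g1, g2, r in edges:
    print(g1, g2, r)
```

Using the absolute value of the correlation keeps anti-correlated pairs (here geneA and geneC) as edges; whether that is appropriate depends on the question being asked.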
Key takeaway:
Building a biological network draws on several complementary sources (curated databases, sequence-based prediction, omics data, and the literature), which are then integrated and analyzed together to give a deeper understanding of complex biological systems.
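The "integrate and analyze" step can be sketched as merging edge lists from several sources while recording which sources support each edge. Everything here is hypothetical: the source labels, the node names, and the `integrate` helper are placeholders, and treating every interaction as undirected is a simplifying assumption (regulatory edges are directed in practice).

```python
from collections import defaultdict

def integrate(sources):
    """Merge undirected edge lists into {edge: set of supporting sources}."""
    evidence = defaultdict(set)
    for name, edges in sources.items():
        for u, v in edges:
            # frozenset makes (u, v) and (v, u) the same undirected edge
            evidence[frozenset((u, v))].add(name)
    return evidence

# Hypothetical edge lists standing in for data pulled from each source.
sources = {
    "knowledge_base": [("TF1", "geneA"), ("geneA", "geneB")],
    "omics": [("geneA", "geneB"), ("geneB", "geneC")],
    "literature": [("geneB", "geneA")],
}

evidence = integrate(sources)

# Keep only edges supported by at least two independent sources.
well_supported = sorted(
    tuple(sorted(edge)) for edge, names in evidence.items() if len(names) >= 2
)
print(well_supported)
```

Keeping the set of supporting sources on each edge, rather than a plain merged edge list, makes it possible to filter or weight edges by evidence later.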
Models for networks of complex topology
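Erdős-Rényi random graph model (1960)
- Start with $N$ vertices and no edges.
- Connect each pair of vertices with probability $p$.
- Characteristics:
  - A random network: each pair of nodes is connected independently with probability $p$.
  - There is no particular structure or rule, and the degree distribution is approximately Poisson.
- Applications:
  - The earliest network model, used to study randomly connected systems.
- Limitations:
  - Cannot explain the small-world effect seen in real complex networks (short paths combined with high clustering; random graphs have short paths but low clustering) or their power-law degree distributions.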
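The two construction steps of the random graph model translate almost directly into code. The sketch below is a minimal illustration; the function name `erdos_renyi`, the adjacency-dictionary representation, and the parameter values are assumptions. Each node's degree is binomially distributed with mean $p(N-1)$, which is what the Poisson approximation rests on for large $N$ and small $p$.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Random graph on n nodes: each pair is linked with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}       # start with n nodes, no edges
    for u, v in combinations(range(n), 2):   # visit every pair once
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

n, p = 200, 0.05                             # illustrative values only
graph = erdos_renyi(n, p, seed=0)

degrees = [len(neighbors) for neighbors in graph.values()]
mean_degree = sum(degrees) / n
expected_mean = p * (n - 1)

print("observed mean degree:", mean_degree)
print("expected mean degree:", expected_mean)
```

The observed mean degree should land close to the expected value, with no node far above it; that absence of hubs is the contrast with the scale-free model.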
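Watts-Strogatz small-world model (1998)
- Start with a regular network with $N$ vertices.
- Rewire each edge with probability $p$.
- Characteristics:
  - Built by starting from a regular network and randomly rewiring a fraction of its edges.
  - Has:
    - A high clustering coefficient (a node's neighbors tend to be connected to each other, forming triangles).
    - Short path lengths (the small-world effect: the shortest path between any two nodes is short).
- Applications:
  - Modeling networks with small-world properties, such as social networks and biological networks.
- Limitations:
  - The degree distribution does not follow the power law commonly seen in real networks.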
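A minimal sketch of the small-world construction follows. It assumes the regular starting network is a ring lattice in which each node is linked to its `k` nearest neighbors (a common choice, not something the notes specify), and the names `watts_strogatz` and `average_clustering` are hypothetical. Rewiring moves one end of an edge to a uniformly chosen node, avoiding self-loops and duplicate edges.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice (each node linked to its k nearest neighbors, k even),
    then each lattice edge is rewired with probability p."""
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            adj[i].add(j)
            adj[j].add(i)
    for offset in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + offset) % n
            if j in adj[i] and rng.random() < p:
                # candidates exclude i itself and its current neighbors
                candidates = [t for t in range(n) if t != i and t not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].remove(j)
                    adj[j].remove(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def average_clustering(adj):
    """Mean over nodes of (links among neighbors) / (possible such links)."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue                      # counts as clustering 0
        links = sum(1 for u in neighbors for v in neighbors
                    if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

lattice = watts_strogatz(100, 4, 0.0, seed=0)   # no rewiring
rewired = watts_strogatz(100, 4, 0.1, seed=0)   # light rewiring

print("clustering, p = 0.0:", average_clustering(lattice))
print("clustering, p = 0.1:", average_clustering(rewired))
```

Rewiring never adds or removes edges, so the average degree stays at `k`; only the clustering and path lengths change as `p` grows.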
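Barabási-Albert scale-free model (1999)
- GROWTH: Starting with a small number of vertices $m_0$, at every timestep add a new vertex with $m \le m_0$ edges that connect it to vertices already in the network.
- PREFERENTIAL ATTACHMENT: The probability $\Pi$ that a new vertex will be connected to vertex $i$ depends on the connectivity $k_i$ of that vertex: $\Pi(k_i) = k_i / \sum_j k_j$.
- Characteristics:
  - Based on two mechanisms: growth and preferential attachment.
  - The degree distribution follows a power law, $P(k) \sim k^{-\gamma}$, where $\gamma > 2$.
  - Contains hub nodes (nodes with very high degree).
- Applications:
  - Explains the scale-free property of real networks such as the Internet, airline networks, and social networks.
- Advantages:
  - Reproduces topological features found in many real-world complex networks.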
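Growth and preferential attachment can be sketched in a few lines. The version below makes two assumptions that the notes do not fix: the initial network is a fully connected group of `m + 1` nodes (so every starting node already has nonzero degree), and the `m` distinct targets of a new node are drawn by repeated sampling, which only approximates strict degree-proportional selection. The name `barabasi_albert` and the parameter values are illustrative.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m existing nodes
    chosen with probability roughly proportional to their degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    # Assumed seed network: m + 1 fully connected nodes.
    adj = {i: set() for i in range(m + 1)}
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
    # Each node appears here once per incident edge, so a uniform draw
    # from this list picks a node in proportion to its degree.
    endpoints = [u for u, neighbors in adj.items() for _ in neighbors]
    for new in range(m + 1, n):                # growth
        targets = set()
        while len(targets) < m:                # preferential attachment
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj

graph = barabasi_albert(500, 2, seed=1)
degrees = sorted((len(nb) for nb in graph.values()), reverse=True)
mean_degree = sum(degrees) / len(degrees)

print("mean degree:", mean_degree)
print("five largest degrees:", degrees[:5])
```

The largest degrees should sit well above the mean, which is the hub structure described above; nodes added early tend to become the hubs because they have had the most time to accumulate links.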
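Summary comparison:
Model | Key properties | Degree distribution | Small-world effect | Clustering coefficient |
---|---|---|---|---|
Erdős-Rényi | Random connections with a fixed probability | Poisson | No | Low |
Watts-Strogatz | High clustering + short paths (small-world effect) | Not power-law | Yes | High |
Barabási-Albert | Growth and preferential attachment; scale-free network | Power-law | Yes | Varies |
These three models provide a basic framework for studying the structure of complex networks. Of the three, the Barabási-Albert model comes closest to the characteristics of many real-world networks.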