Rohdy
Rohdy의 study
Rohdy
전체 방문자
오늘
어제
  • 분류 전체보기 (32)
    • 석사 이야기 (0)
    • DL Study (17)
    • Pr4AI Study (1)
    • RecSys (3)
    • GraphMining (9)
    • 논문 (2)

블로그 메뉴

  • 홈
  • 태그
  • 방명록

공지사항

인기 글

태그

  • 데이터사이언스
  • 컨택
  • 연세대
  • 로지스틱
  • NeurIPS
  • gradientdescent
  • 대학원입시
  • IDGL
  • logistic
  • 시그모이드
  • 대학원컨택
  • 대학원
  • 정보대학원
  • LinearRegression
  • 논문요약
  • ML
  • KAIST
  • GSDS
  • sigmoid
  • gnn
  • 대학원면접
  • 대학원준비
  • 2020

최근 댓글

최근 글

티스토리

hELLO · Designed By 정상우.
Rohdy

Rohdy의 study

Structural properties
GraphMining

Structural properties

2022. 9. 14. 15:37

KAIST AI 대학원 신기정 교수님의 수업인 데이터마이닝 및 소셜 네트워크 분석 수업 필기입니다.

 

Three parts/goals

  • Statistical properties of large networks
  • Models that help understand these properties
  • Predict behavior of networked systems based on measured srtructural properties and local rules governing individual nodes

 

Random Graph: Null Model

  • What is the simplest way to generate a graph?
  • Random graph model \(G(n,p)\) (= Erdos-Renyi model, Poisson random graph model)
    • Given \(n\) vertices, connect each pair i.i.d. with probability \(p\)

 

Paths / Distance in Graph

  • Path: Sequence of nodes between the two nodes
    Can intersect itself and pass through the same edge
  • Distance: the number of edges along a shortest path
    Not connected, the distance is usually defined as infinite

 

Network Diameter

  • Diameter: The maximum distance between any pair of nodes in a graph -> sensitive to outlier
  • Effective diameter: distance at which 90% of all connected pairs of nodes can be reached
  • Average path length for a connected graph or component

 

Small-world Effect

  • 약 6번의 path를 건너면 거의 모든 사람들과 연결되어 있다는 이론(Diameter in social network)
    EX) Boston에 편지보내기, MSN

MSN network & Milgram's small world experiment

 

Degree Distribution

  • Degree Distribution \(p_k\): Fraction of node with degree \(k\)
    $$ p_k = N_k / N $$

  • Degrees in real graph: Heavily skewed to the right
    Long tail of values far above the mean
    many real world networks contain hubs
    A power-law distribution: \(p_k ~ k^{-\alpha}\)

Degrees in real graph

  • Degree distribution in random graph: binomial
    $$ P(k)={n-1\choose k}p^k(1-p)^{n-1-k} $$
    approximately nomarl distribution

Degree distribution in random graph

 

Clustering Coefficient

  • Local Clustering Coefficient of node \(i\) with degree \(k_i\) -> 인접노드들끼리 얼마나 연결되어 있나
    $$ C_i := {e_i\over k_i(k_i - 1)/2} $$
  • Average Clustering Coefficient
    $$ C := {1 \over N} \sum C_i $$
  • In real-world graphs
    Expected value CC:    \(C >> {2m \over N(N-1)}\)
    Why?
       Homophily: nodes connect to nodes similar to themselves(같은 관심사가 있으면 친구 됨)
       Transitivity: nodes with common friends become friends(공통의 친구가 있으면 친구 됨)

Giant Connected Component(GCC)

  • Largest set where any two vertices can be joined by a path
  • In real world graphs, huge GCC contains most nodes

MSN network

  • In Random Graph \(G(n,p)\)

 

Difference between real-world graph and random graph

  • Degree distribution: Heavily skewed to the right vs Gaussian Distribution
  • Avg path length: 비슷
  • Avg clustering coef: real-world graph가 더 높음
  • Largest Connected Componet: 일정수 이상의 node가 존재하면 비슷

Network Resilience

How the connectivity of the network changes as the nodes get removed?

  • Real graphs are resilient to random attacks
  • Real graphs are vulnerable to targeted attacks(GCC가 attack 받으면 취약)
  • Randm network는 real network에 비해 target attack에 robost

동그라미가 target attack, 네모가 random attack

 

How Graphs Evolve over Time

Conventional wisdom/intuition

  • Constant average degree: the number of edges grows linearly with the number of nodes
    $$ E(t) \propto N(t) $$
    \(E(t)\): number of edges at time
    \(N(t)\): number of nodes at time
  • Slowly growing diameter: as the network grows the distances between nodes grow
    $$ O(N^{1/3})  \; or \;  O(logN) $$

 

Real Network

Networks are denser over time(denser = \({E(t) \over N(t)}\))

  • the number of edges grow faster than that of nodes
  • average degree is increasing
    $$ E(t) \propto N(t)^a $$

with \(1  \le a  \le 2\)

Diameter shrinks over time(= people get closer over time)

'GraphMining' 카테고리의 다른 글

Kronecker Graph Model  (0) 2022.09.26
Community Guided Attachment & Forest Fire Model  (0) 2022.09.26
Basic Models  (0) 2022.09.14
Random Networks  (0) 2022.09.14
Intro & Basic Concepts  (0) 2022.09.06
    'GraphMining' 카테고리의 다른 글
    • Community Guided Attachment & Forest Fire Model
    • Basic Models
    • Random Networks
    • Intro & Basic Concepts
    Rohdy
    Rohdy

    티스토리툴바