๐Ÿ“ฆ๊ทธ๋ž˜ํ”„์˜ ํŠน์ • ํ•˜์œ„ ๊ทธ๋ž˜ํ”„๋ฅผ ์ถ”์ถœํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜, K-Core

๐Ÿ“ฆ๊ทธ๋ž˜ํ”„์˜ ํŠน์ • ํ•˜์œ„ ๊ทธ๋ž˜ํ”„๋ฅผ ์ถ”์ถœํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜, K-Core
Photo by Alina Grubnyak / Unsplash

Definition

  • K-Core ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๊ทธ๋ž˜ํ”„ ์ด๋ก ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ, ๊ทธ๋ž˜ํ”„์˜ ํŠน์ • ๋ถ€๋ถ„์„ ์ถ”์ถœํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.
  • K-Core๋Š” ๊ฐ ๋…ธ๋“œ๊ฐ€ ์ตœ์†Œํ•œ K๊ฐœ์˜ ์ด์›ƒ ๋…ธ๋“œ๋ฅผ ๊ฐ€์ง€๋Š” ์ตœ๋Œ€ ๋ถ€๋ถ„ ๊ทธ๋ž˜ํ”„๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

Motivation

  • ์†Œ์…œ ๋„คํŠธ์›Œํฌ, ์ƒ๋ฌผํ•™์  ๋„คํŠธ์›Œํฌ, ์ธํ„ฐ๋„ท ๊ตฌ์กฐ ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์—์„œ ๋ฐ€์ง‘๋œ ํ•˜์œ„ ๊ตฌ์กฐ๋ฅผ ์‹๋ณ„ํ•˜๊ธฐ ์œ„ํ•ด ๋“ฑ์žฅํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ฐœ์ธ์ ์ธ ๊ฒฝํ—˜์—์„œ๋Š” FDS ๊ด€๋ จ ์‚ฌ๊ธฐ๊ฑฐ๋ž˜๋ฅผ ์ง‘๋‹จ์œผ๋กœ ํ•˜๋Š” ์—…์ฒด๋ฅผ ํƒ์ง€ํ•˜๊ธฐ ์œ„ํ•ด์„œ ํ™œ์šฉํ•œ ๋ฐ” ์žˆ์Šต๋‹ˆ๋‹ค.
  • ์ด๋Š” ๋„คํŠธ์›Œํฌ์˜ ์ฃผ์š” ๊ตฌ์„ฑ ์š”์†Œ๋ฅผ ๋ถ„์„ํ•˜๊ณ , ์ค‘์š”ํ•œ ๋…ธ๋“œ๋“ค์„ ์‹๋ณ„ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.

Pros & Cons

Pros

  1. ๋ฐ€์ง‘๋œ ํ•˜์œ„ ๊ตฌ์กฐ ์‹๋ณ„: ๊ทธ๋ž˜ํ”„ ๋‚ด์—์„œ ๋ฐ€์ง‘๋œ ํ•˜์œ„ ๊ตฌ์กฐ๋ฅผ ํšจ์œจ์ ์œผ๋กœ ์‹๋ณ„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  2. ์ค‘์š” ๋…ธ๋“œ ์‹๋ณ„: ๋„คํŠธ์›Œํฌ ๋‚ด์—์„œ ์ค‘์š”ํ•œ ๋…ธ๋“œ๋ฅผ ์‹๋ณ„ํ•˜๋Š” ๋ฐ ์œ ์šฉํ•ฉ๋‹ˆ๋‹ค.
  3. ๋‹ค์–‘ํ•œ ์‘์šฉ ๋ถ„์•ผ: ์†Œ์…œ ๋„คํŠธ์›Œํฌ, ์ƒ๋ฌผํ•™์  ๋„คํŠธ์›Œํฌ, ์ธํ„ฐ๋„ท ๊ตฌ์กฐ ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์— ์ ์šฉ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

Cons

  1. ๋ณต์žก์„ฑ: ํฐ ๊ทธ๋ž˜ํ”„์— ๋Œ€ํ•ด K-Core๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ๊ฒƒ์€ ๋ณต์žกํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  2. ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ: ํฐ ๊ทธ๋ž˜ํ”„๋ฅผ ์ฒ˜๋ฆฌํ•  ๋•Œ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰์ด ๋งŽ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  3. ์ •ํ™•๋„ ๋ฌธ์ œ: ๋ฐ€์ง‘๋œ ํ•˜์œ„ ๊ตฌ์กฐ๋ฅผ ์‹๋ณ„ํ•˜๋Š” ๊ณผ์ •์—์„œ ์ผ๋ถ€ ์ค‘์š”ํ•œ ๋…ธ๋“œ๊ฐ€ ๋ˆ„๋ฝ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Alternatiive

  1. Clique: ๊ทธ๋ž˜ํ”„ ๋‚ด์—์„œ ๋ชจ๋“  ๋…ธ๋“œ๊ฐ€ ์„œ๋กœ ์—ฐ๊ฒฐ๋œ ๋ถ€๋ถ„ ๊ทธ๋ž˜ํ”„๋ฅผ ์ฐพ๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜.
  2. Community Detection: ๋„คํŠธ์›Œํฌ ๋‚ด์—์„œ ์ปค๋ฎค๋‹ˆํ‹ฐ๋ฅผ ๋ฐœ๊ฒฌํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜.
  3. Modularity Optimization: ๋„คํŠธ์›Œํฌ์˜ ๋ชจ๋“ˆ์„ฑ์„ ์ตœ์ ํ™”ํ•˜์—ฌ ์ปค๋ฎค๋‹ˆํ‹ฐ๋ฅผ ์ฐพ๋Š” ๋ฐฉ๋ฒ•

Sample

import networkx as nx
import matplotlib.pyplot as plt
from networkx.algorithms.clique import find_cliques
from networkx.algorithms.community import greedy_modularity_communities

def compute_and_visualize_k_core(edges, k):
    # ๊ทธ๋ž˜ํ”„ ์ƒ์„ฑ
    G = nx.Graph()
    G.add_edges_from(edges)
    
    # K-Core ๊ณ„์‚ฐ
    k_core = nx.k_core(G, k=k)
    
    # ๊ทธ๋ž˜ํ”„ ์‹œ๊ฐํ™”
    pos = nx.spring_layout(G)
    plt.figure(figsize=(8, 6))
    nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='gray')
    nx.draw(k_core, pos, with_labels=True, node_color='red', edge_color='black')
    plt.title(f"K-Core (k={k}) of the Graph")
    plt.show()
    
    return k_core.nodes(), k_core.edges()

def find_and_visualize_cliques(edges):
    # ๊ทธ๋ž˜ํ”„ ์ƒ์„ฑ
    G = nx.Graph()
    G.add_edges_from(edges)
    
    # Clique ๊ณ„์‚ฐ
    cliques = list(find_cliques(G))
    
    # ๊ทธ๋ž˜ํ”„ ์‹œ๊ฐํ™”
    pos = nx.spring_layout(G)
    plt.figure(figsize=(8, 6))
    nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='gray')
    for clique in cliques:
        nx.draw_networkx_nodes(G, pos, nodelist=clique, node_color='red')
    plt.title("Cliques in the Graph")
    plt.show()
    
    return cliques

def detect_and_visualize_communities(edges):
    # ๊ทธ๋ž˜ํ”„ ์ƒ์„ฑ
    G = nx.Graph()
    G.add_edges_from(edges)
    
    # ์ปค๋ฎค๋‹ˆํ‹ฐ ๊ฐ์ง€
    communities = list(greedy_modularity_communities(G))
    
    # ๊ทธ๋ž˜ํ”„ ์‹œ๊ฐํ™”
    pos = nx.spring_layout(G)
    plt.figure(figsize=(8, 6))
    nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='gray')
    colors = ['red', 'blue', 'green', 'orange', 'purple']
    for i, community in enumerate(communities):
        nx.draw_networkx_nodes(G, pos, nodelist=list(community), node_color=colors[i % len(colors)])
    plt.title("Communities in the Graph")
    plt.show()
    
    return communities

# ์˜ˆ์‹œ ๋ฐ์ดํ„ฐ
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]

# K-Core ๊ฒฐ๊ณผ
k = 2
nodes, edges = compute_and_visualize_k_core(edges, k)
print("Nodes in k-core:", nodes)
print("Edges in k-core:", edges)

# Clique ๊ฒฐ๊ณผ
cliques = find_and_visualize_cliques(edges)
print("Cliques in the graph:", cliques)

# Community Detection ๊ฒฐ๊ณผ
communities = detect_and_visualize_communities(edges)
print("Communities in the graph:", communities)


Read more

๋‹ค์ค‘๊ณต์„ ์„ฑ์€ ์ž˜๋ชป๋œ ์ธ๊ณผ์ถ”๋ก  ๊ฒฐ๊ณผ๋ฅผ ๋งŒ๋“ค์–ด๋‚ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋‹ค์ค‘๊ณต์„ ์„ฑ์€ ์ž˜๋ชป๋œ ์ธ๊ณผ์ถ”๋ก  ๊ฒฐ๊ณผ๋ฅผ ๋งŒ๋“ค์–ด๋‚ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋‹ค์ค‘๊ณต์„ ์„ฑ(Multi Collinearity) * **Multi-Collinearity(๋‹ค์ค‘๊ณต์„ ์„ฑ)**๋Š” ๋…๋ฆฝ ๋ณ€์ˆ˜๋“ค ๊ฐ„์˜ ๊ฐ•ํ•œ ์ƒ๊ด€๊ด€๊ณ„๊ฐ€ ์กด์žฌํ•  ๋•Œ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ํ•œ ๋…๋ฆฝ ๋ณ€์ˆ˜๊ฐ€ ๋‹ค๋ฅธ ๋…๋ฆฝ ๋ณ€์ˆ˜์— ์˜ํ•ด ์„ค๋ช…๋  ์ˆ˜ ์žˆ์„ ์ •๋„๋กœ ์ƒ๊ด€๊ด€๊ณ„๊ฐ€ ๋†’์€ ์ƒํ™ฉ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. * ์ด ๋ฌธ์ œ๋Š” ์ฃผ๋กœ ํšŒ๊ท€ ๋ถ„์„์—์„œ ๋‚˜ํƒ€๋‚˜๋ฉฐ, ๋ณ€์ˆ˜๋“ค ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ํ•ด์„ํ•˜๋Š” ๋ฐ ์žˆ์–ด ํฐ ์žฅ์• ๋ฌผ์ด ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. * ์ผ๋ฐ˜์ ์ธ ํšŒ๊ท€์‹์„ $Y=

Bayesian P-Value๋Š” ๋ถˆํ™•์‹ค์„ฑ์„ ๊ฐ์•ˆํ•˜์—ฌ ๋ชจ๋ธ์˜ ์ ํ•ฉ๋„๋ฅผ ํ‰๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

Bayesian P-Value๋Š” ๋ถˆํ™•์‹ค์„ฑ์„ ๊ฐ์•ˆํ•˜์—ฌ ๋ชจ๋ธ์˜ ์ ํ•ฉ๋„๋ฅผ ํ‰๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

Bayesian P- Value * Bayesian P-Value๋Š” **๋ชจ๋ธ์˜ ์ ํ•ฉ๋„(goodness-of-fit)**๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. * ์‚ฌํ›„ ๋ถ„ํฌ(posterior distribution)๋ฅผ ์ด์šฉํ•˜์—ฌ ์‹ค์ œ ๋ฐ์ดํ„ฐ์™€ ๋ชจ๋ธ์ด ์ƒ์„ฑํ•œ ์˜ˆ์ƒ ๋ฐ์ดํ„ฐ๋ฅผ ๋น„๊ตํ•จ์œผ๋กœ์จ, ๊ด€์ธก๋œ ๋ฐ์ดํ„ฐ๊ฐ€ ๋ชจ๋ธ์— ์˜ํ•ด ์–ผ๋งˆ๋‚˜ ์ž˜ ์„ค๋ช…๋˜๋Š”์ง€๋ฅผ ํ‰๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. * ๋นˆ๋„์ฃผ์˜ p-๊ฐ’์€ "๊ด€์ฐฐ๋œ ๋ฐ์ดํ„ฐ๋ณด๋‹ค ๊ทน๋‹จ์ ์ธ ๋ฐ์ดํ„ฐ๊ฐ€ ๋‚˜์˜ฌ ํ™•๋ฅ "์„ ๊ณ„์‚ฐํ•˜์ง€๋งŒ, Bayesian P-Value๋Š” "๋ชจ๋ธ์ด ์‹ค์ œ

Non-Identifiability๋Š” Model Parameter๋ฅผ ๊ณ ์œ ํ•˜๊ฒŒ ์‹๋ณ„ํ•  ์ˆ˜ ์—†๋Š” ํ˜„์ƒ์ž…๋‹ˆ๋‹ค.

Non-Identifiability๋Š” Model Parameter๋ฅผ ๊ณ ์œ ํ•˜๊ฒŒ ์‹๋ณ„ํ•  ์ˆ˜ ์—†๋Š” ํ˜„์ƒ์ž…๋‹ˆ๋‹ค.

Non Identifiability * Non-Identifiability๋Š” ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์™€ ๋ชจ๋ธ์— ๋Œ€ํ•ด ํŠน์ • ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ณ ์œ ํ•˜๊ฒŒ ์‹๋ณ„ํ•  ์ˆ˜ ์—†๋Š” ์ƒํ™ฉ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ์—ฌ๋Ÿฌ ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’๋“ค์ด ๋™์ผํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ด๋กœ ์ธํ•ด ํŠน์ • ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’์„ ํ™•์ •์ ์œผ๋กœ ์ถ”์ •ํ•˜๊ธฐ ์–ด๋ ต๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. * ๋ฒ ์ด์ง€์•ˆ ์ถ”๋ก ์—์„œ Non-Identifiability๋Š” ์‚ฌํ›„ ๋ถ„ํฌ๊ฐ€ ํŠน์ • ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’์— ๋Œ€ํ•ด ๋ช…ํ™•ํ•˜๊ฒŒ ์ˆ˜๋ ดํ•˜์ง€ ์•Š๊ณ , ์—ฌ๋Ÿฌ ๊ฐ’๋“ค์— ๋Œ€ํ•ด ๋น„์Šทํ•œ ํ™•๋ฅ ์„

Rootgram์€ ํฐ ๋ถ„์‚ฐ์„ ๊ฐ–๊ฑฐ๋‚˜ ๋น„์ •๊ทœ ํ˜•ํƒœ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์œ„ํ•œ ํžˆ์Šคํ† ๊ทธ๋žจ์ž…๋‹ˆ๋‹ค.

Rootgram์€ ํฐ ๋ถ„์‚ฐ์„ ๊ฐ–๊ฑฐ๋‚˜ ๋น„์ •๊ทœ ํ˜•ํƒœ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์œ„ํ•œ ํžˆ์Šคํ† ๊ทธ๋žจ์ž…๋‹ˆ๋‹ค.

Rootgram * ํžˆ์Šคํ† ๊ทธ๋žจ์˜ ๋ณ€ํ˜•์œผ๋กœ ๋ฐ์ดํ„ฐ๊ฐ€ ๋น„์ •๊ทœ์ ์ด๊ฑฐ๋‚˜ ํฐ ๋ถ„์‚ฐ์„ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ, ์ •ํ™•ํ•œ ๋ถ„ํฌ๋ฅผ ํŒŒ์•…ํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. * ์ผ๋ฐ˜์ ์œผ๋กœ ํžˆ์Šคํ† ๊ทธ๋žจ์€ ๋ฐ์ดํ„ฐ์˜ ๋นˆ๋„๋ฅผ ์ง์ ‘์ ์œผ๋กœ ๋‚˜ํƒ€๋‚ด๊ธฐ ๋•Œ๋ฌธ์—, ํฐ ๊ฐ’์ด ๋นˆ๋ฒˆํ•˜๊ฒŒ ๋ฐœ์ƒํ•˜๋Š” ๊ฒฝ์šฐ ์ƒ๋Œ€์ ์œผ๋กœ ์ž‘์€ ๊ฐ’์„ ์ž˜ ๋“œ๋Ÿฌ๋‚ด์ง€ ๋ชปํ•˜๋Š” ๊ฒฝํ–ฅ์ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋ฐ˜๋ฉด, Rootgram์€ ๋นˆ๋„๋ฅผ ์ œ๊ณฑ๊ทผ ํ˜•ํƒœ๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ, ๋ฐ์ดํ„ฐ ๋ถ„ํฌ์˜ ์ฐจ์ด๋ฅผ ๋” ์ž˜ ์‹œ๊ฐํ™”ํ•  ์ˆ˜ ์žˆ๋„๋ก ๋•์Šต๋‹ˆ๋‹ค * ์—ฌ๊ธฐ์„œ