Patterns around Gnutella Network Nodes

About This Presentation

Slides: 17

Provided by: cseLehig7

Learn more at: http://www.cse.lehigh.edu

Transcript and Presenter's Notes

1
Patterns around Gnutella Network Nodes

2
Introduction

3
Goal

Determine whether frequent patterns exist around nodes
Verify the validity of the model
Use this model to predict patterns around nodes
that are not in the training data

4
Representation of the Network (1)

Undirected graph G = (N, E)
N = {center, depth_1, ..., depth_n}
E = {1, 2, ..., TTL}
The depth of a node other than the center node is
defined as the length of the shortest path from that node to
the center
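
As an illustration of the depth definition above, the following minimal sketch labels nodes by breadth-first search from the center. The function name `depth_labels`, the adjacency-dictionary input, and the toy topology are assumptions made for this example and do not come from the slides.

```python
from collections import deque

def depth_labels(adjacency, center, max_depth):
    """Label every node within max_depth hops of center with its BFS depth."""
    depths = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depths:
                # First visit in BFS order gives the shortest-path length.
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths

# Hypothetical toy topology (not taken from the slides).
toy = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
labels = depth_labels(toy, "a", max_depth=2)
```

Nodes farther than `max_depth` from the center are left out of the result, which mirrors bounding each graph by a maximum depth.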

5
Representation of the Network (2)

6
Frequent Subgraph Discovery

Developed by Michihiro Kuramochi and George Karypis
Mines patterns in a set of transactions, given the
minimum frequency with which a pattern must appear in the
set
Gives the parent-child relations between subgraphs
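
To make the idea of a minimum frequency concrete, here is a rough sketch that counts only single-edge patterns across a set of transactions. It is not the Kuramochi and Karypis algorithm, which goes on to grow and test larger subgraphs; the function name `frequent_edges`, the `(node_label, node_label, edge_label)` edge encoding, and the sample transactions are assumptions for this example.

```python
from collections import Counter

def frequent_edges(transactions, min_support):
    """Return labelled edges that occur in at least min_support transactions."""
    support = Counter()
    for edges in transactions:
        # Normalise each undirected edge so (u, v) and (v, u) are one pattern,
        # and count each pattern at most once per transaction.
        patterns = {tuple(sorted((u, v))) + (label,) for u, v, label in edges}
        support.update(patterns)
    return {p: n for p, n in support.items() if n >= min_support}

# Hypothetical transactions: each is a list of (node_label, node_label, edge_label).
transactions = [
    [("center", "depth_1", 1), ("depth_1", "depth_2", 1)],
    [("depth_1", "center", 1), ("depth_1", "depth_1", 1)],
    [("center", "depth_1", 1), ("depth_1", "depth_2", 1)],
]
result = frequent_edges(transactions, min_support=2)
```

With a minimum support of 2, the edge between two `depth_1` nodes is dropped because it appears in only one transaction.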

7
Power Law

The frequency, f_d, of an out-degree, d, is
proportional to the out-degree to the power of
the constant, O: f_d ∝ d^O
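
One common way to estimate the exponent in such a power law is a least-squares fit on log-log axes. The sketch below assumes that approach; the function name `fit_outdegree_exponent` and the synthetic degree list are illustrative and are not data from the presentation.

```python
import math
from collections import Counter

def fit_outdegree_exponent(degrees):
    """Least-squares slope of log(frequency) against log(out-degree)."""
    freq = Counter(d for d in degrees if d > 0)  # log is undefined for degree 0
    xs = [math.log(d) for d in freq]
    ys = [math.log(freq[d]) for d in freq]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic degrees built so that frequency = 64 * d**-2 for d in {1, 2, 4, 8}.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
exponent = fit_outdegree_exponent(degrees)
```

On this synthetic input the fitted slope is -2 up to floating-point error, since the points lie exactly on a line in log-log space.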

8
Stratified Sampling

Principle of stratification: partitioning is best
performed so that the samples in
each stratum are as similar to each other as possible
The population of nodes is partitioned into strata
Partition by size of transaction
Partition by the power law
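
The partition-by-size option above might look like the following sketch, which treats each distinct transaction size as its own stratum and draws the same fraction from every stratum. The function name `stratified_sample`, the `(id, size)` tuples, and the sampling fraction are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, fraction, seed=0):
    """Sample the given fraction from each stratum, at least one item per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical transactions as (id, size) pairs: 10 of size 3, 4 of size 5, 1 of size 7.
transactions = (
    [(i, 3) for i in range(10)]
    + [(i, 5) for i in range(10, 14)]
    + [(14, 7)]
)
sample = stratified_sample(transactions, stratum_of=lambda t: t[1], fraction=0.5)
```

Partitioning by the power law would only change `stratum_of`, for example by mapping each node to a bucket of its out-degree.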

9
Experiment (1)

Find the frequent patterns in two sets of data
collected at the same time but belonging to
different connected components
The two distributions are compared
by comparing the relations among frequent
subgraphs in each stratum
The maximum depth in each graph is set to 3
The TTL is 1
Data is partitioned by the size of transaction

11
Experiment (1)

One pattern of size 3, two patterns of
size 4, and two patterns of size 5 are missing from data
set 2
A missing parent causes its children to be missing
Grouping based on the power law shows a similar result
Possible reasons for the difference:
Size of data
Classification error
Incomplete observation of the true distribution
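
The observation that a missing parent causes missing children can be checked mechanically once the parent-child relation is known. The sketch below separates patterns that are missing on their own from those missing because a parent is missing. The pattern names and the parent relation are purely hypothetical; the slides do not give the actual lattice.

```python
def explain_missing(frequent_a, frequent_b, parents):
    """Split patterns frequent in A but not in B by whether a parent is also missing."""
    missing = frequent_a - frequent_b
    inherited = {
        p for p in missing
        if any(q in missing for q in parents.get(p, ()))
    }
    return missing - inherited, inherited

# Hypothetical lattice: one size-3 pattern with two size-4 children,
# each of which has one size-5 child.
parents = {
    "p4a": ["p3"],
    "p4b": ["p3"],
    "p5a": ["p4a"],
    "p5b": ["p4b"],
}
frequent_set1 = {"p2", "p3", "p4a", "p4b", "p5a", "p5b"}
frequent_set2 = {"p2"}
roots, inherited = explain_missing(frequent_set1, frequent_set2, parents)
```

In this made-up case a single missing size-3 pattern accounts for all four larger missing patterns.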

12
Experiment (2)

13
          Set1    Set2
Graph1    74.2    71.1
Graph2    100     100
Graph3    73.2    69.9
Graph4    71.5    71.1
Graph5    71.5    71.1
Graph6    73.2    69.9
14
Experiment (3)

15
          Set1    Set2
Graph1    59.5    52.5
Graph2    55.4    59.0
Graph3    55.0    56.8
Graph4    52.8    55.5
16
Prediction Model

Suppose the model has a graph G with two children,
G1 and G2
Their frequencies are f(G), f(G1), and f(G2)
If a node finds that a graph g has a subgraph
isomorphism with G, the chances of finding G1 and
G2 in g are f(G1)/f(G) and f(G2)/f(G),
respectively
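
The ratio rule above can be sketched in a few lines: the chance of each child is its frequency divided by the frequency of the parent graph. The function name `child_probabilities`, the child names G1 and G2, and the counts used here are assumptions for illustration, not values from the experiments.

```python
def child_probabilities(parent_frequency, child_frequencies):
    """Estimate the chance of each child pattern given that the parent was found."""
    if parent_frequency <= 0:
        raise ValueError("parent frequency must be positive")
    return {
        name: frequency / parent_frequency
        for name, frequency in child_frequencies.items()
    }

# Hypothetical counts: the parent G appears 80 times, its children 60 and 40 times.
chances = child_probabilities(80, {"G1": 60, "G2": 40})
```

Because every occurrence of a child contains its parent, each child frequency is at most the parent frequency, so the ratios stay between 0 and 1.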
