# Machine Learning/Kaggle Social Network Contest/Features

From Noisebridge


## Revision as of 22:30, 19 November 2010

## TODO

- Precisely define the listed features

## Possible Features

- nodeid
- nodetofollowid
- median path length
- shortest distance from nodeid to nodetofollowid
- number of inbound edges
- number of outbound edges
- clustering coefficient
- reciprocation probability (number of outbound edges that are returned / number of outbound edges)
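
As a starting point for the TODO above, here is a minimal sketch of how three of these features could be computed from a directed edge list. The helper names (`build_graph`, `reciprocation_probability`, `shortest_distance`) and the toy edge list are hypothetical, and returning 0.0 for a node with no outbound edges is an assumption, not something this page defines.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Return (out_adj, in_adj) as dicts of sets for directed (src, dst) pairs."""
    out_adj, in_adj = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_adj[src].add(dst)
        in_adj[dst].add(src)
    return out_adj, in_adj

def reciprocation_probability(node, out_adj, in_adj):
    """Fraction of the node's outbound edges that the target returns."""
    followed = out_adj.get(node, set())
    if not followed:
        return 0.0  # assumption: defined as 0 when the node follows nobody
    returned = followed & in_adj.get(node, set())
    return len(returned) / len(followed)

def shortest_distance(source, target, out_adj):
    """Breadth-first hop count along directed edges; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in out_adj.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Toy data for illustration only (not contest data).
edges = [(1, 2), (2, 1), (1, 3), (3, 4), (4, 1)]
out_adj, in_adj = build_graph(edges)
inbound_edges = len(in_adj[1])    # number of inbound edges of node 1
outbound_edges = len(out_adj[1])  # number of outbound edges of node 1
```

Here `shortest_distance` follows edge direction; whether the contest feature should instead ignore direction is still an open definition question.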

The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.

From Backstrom and Leskovec, for a node s and a potential target c:

- Network features
  - unweighted random walk score
  - Adamic-Adar score
    - see [original paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.1370&rep=rep1&type=pdf)
  - number of common friends
  - indegrees and outdegrees of s
    - the indegree is the number of edges coming into node s
    - the outdegree is the number of edges leaving node s
  - indegrees and outdegrees of c
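
The network features above could be sketched roughly as follows. This is one possible reading, not a definition taken from the paper: it assumes a "friend" is any node linked in either direction, uses the common Adamic-Adar form (sum of 1 / log(degree) over common friends), and interprets the unweighted random walk score as a random walk with restart at s that picks outgoing edges uniformly. All function names, the `restart` value, and the toy edge list are hypothetical.

```python
import math
from collections import defaultdict

def build_graph(edges):
    """Return (out_adj, in_adj) as dicts of sets for directed (src, dst) pairs."""
    out_adj, in_adj = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_adj[src].add(dst)
        in_adj[dst].add(src)
    return out_adj, in_adj

def friends(node, out_adj, in_adj):
    """Assumption: a friend is any node linked in either direction."""
    return out_adj.get(node, set()) | in_adj.get(node, set())

def common_friends(s, c, out_adj, in_adj):
    return friends(s, out_adj, in_adj) & friends(c, out_adj, in_adj)

def adamic_adar(s, c, out_adj, in_adj):
    """Sum of 1 / log(degree(z)) over common friends z of s and c."""
    score = 0.0
    for z in common_friends(s, c, out_adj, in_adj):
        degree = len(friends(z, out_adj, in_adj))
        if degree > 1:  # skip degree 1 to avoid dividing by log(1) == 0
            score += 1.0 / math.log(degree)
    return score

def random_walk_scores(s, out_adj, restart=0.15, iterations=50):
    """Power iteration for a random walk with restart at s (uniform out-edges)."""
    scores = {s: 1.0}
    for _ in range(iterations):
        nxt = defaultdict(float)
        nxt[s] += restart  # total mass is 1, so this is the restart share
        for node, mass in scores.items():
            targets = out_adj.get(node)
            if not targets:
                nxt[s] += (1 - restart) * mass  # assumption: dead ends jump back to s
                continue
            share = (1 - restart) * mass / len(targets)
            for t in targets:
                nxt[t] += share
        scores = nxt
    return scores

# Toy data for illustration only (not contest data).
edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 1)]
out_adj, in_adj = build_graph(edges)
s, c = 1, 4
indegree_s, outdegree_s = len(in_adj[s]), len(out_adj[s])
indegree_c, outdegree_c = len(in_adj[c]), len(out_adj[c])
walk = random_walk_scores(s, out_adj)
```

The score for the candidate pair would then be `walk[c]`, alongside `len(common_friends(s, c, out_adj, in_adj))` and `adamic_adar(s, c, out_adj, in_adj)`.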