# Machine Learning/Kaggle Social Network Contest/Features

From Noisebridge

## Revision as of 04:25, 20 November 2010

## TODO

- Precisely define the listed features

## Possible Features

- Node Features
  - nodeid
  - outdegree
  - indegree
  - local clustering coefficient
  - probability that an inbound edge is reciprocated (number of inbound edges returned / number of inbound edges)
  - probability that an outbound edge is reciprocated (number of outbound edges returned / number of outbound edges)
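As a starting point for the TODO above, the node features could be computed roughly as follows. This is only a sketch under assumptions not stated on this page: the graph is given as a list of directed `(source, target)` pairs, the local clustering coefficient is taken over the undirected neighbourhood (other directed definitions exist), and all function and key names are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Return (successors, predecessors) as dicts of sets for directed edges."""
    succ, pred = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    return succ, pred


def node_features(node, succ, pred):
    """Per-node features; every definition here is an assumption, not a spec."""
    out_nbrs = succ.get(node, set())
    in_nbrs = pred.get(node, set())
    mutual = out_nbrs & in_nbrs  # edges that exist in both directions

    # Assumed definition: fraction of neighbour pairs (ignoring direction)
    # that are linked by at least one edge.
    nbrs = (out_nbrs | in_nbrs) - {node}
    pairs = list(combinations(nbrs, 2))
    linked = sum(
        1 for a, b in pairs
        if b in succ.get(a, set()) or a in succ.get(b, set())
    )

    return {
        "outdegree": len(out_nbrs),
        "indegree": len(in_nbrs),
        "local_clustering": linked / len(pairs) if pairs else 0.0,
        "inbound_reciprocation": len(mutual) / len(in_nbrs) if in_nbrs else 0.0,
        "outbound_reciprocation": len(mutual) / len(out_nbrs) if out_nbrs else 0.0,
    }


# Toy graph, invented for illustration only.
EXAMPLE_EDGES = [(1, 2), (2, 1), (1, 3), (3, 2), (4, 1)]
SUCC, PRED = build_adjacency(EXAMPLE_EDGES)
FEATURES = node_features(1, SUCC, PRED)
```

Nodes with no inbound (or outbound) edges get a reciprocation value of 0.0 here; whether that or a missing value is the better choice is an open question.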

- Edge Features
  - nodetofollowid
  - shortest distance nodeid to nodetofollowid
  - density? (~~median path length~~)
  - does reverse edge exist? (aka is nodetofollowid following nodeid?)
  - number of common friends
  - indegrees & outdegrees of nodetofollowid
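A possible way to compute most of the edge features for a candidate `(nodeid, nodetofollowid)` pair is sketched below. Assumptions that are not settled on this page: shortest distance follows edge direction and is `None` when the target is unreachable, and "common friends" means nodes adjacent to both endpoints in either direction. The density feature is left out because it has no definition yet. Function and key names are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Return (successors, predecessors) as dicts of sets for directed edges."""
    succ, pred = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    return succ, pred


def shortest_distance(succ, source, target):
    """Breadth-first search along edge direction; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def edge_features(nodeid, nodetofollowid, succ, pred):
    """Features for one candidate edge nodeid -> nodetofollowid."""
    def neighbours(n):
        # Assumed: a "friend" is any node linked in either direction.
        return succ.get(n, set()) | pred.get(n, set())

    common = neighbours(nodeid) & neighbours(nodetofollowid)
    common -= {nodeid, nodetofollowid}
    return {
        "shortest_distance": shortest_distance(succ, nodeid, nodetofollowid),
        "reverse_edge_exists": nodeid in succ.get(nodetofollowid, set()),
        "common_friends": len(common),
        "target_indegree": len(pred.get(nodetofollowid, set())),
        "target_outdegree": len(succ.get(nodetofollowid, set())),
    }


# Toy graph, invented for illustration only.
EXAMPLE_EDGES = [(1, 2), (2, 3), (3, 1), (4, 3)]
SUCC, PRED = build_adjacency(EXAMPLE_EDGES)
CANDIDATE = edge_features(1, 3, SUCC, PRED)
```

On the full contest graph an unbounded breadth-first search per candidate pair may be too slow, so capping the search depth is probably worth considering.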

- Network Features
  - unweighted random walk score
  - global clustering coefficient
  - Adamic-Adar score
    - see original paper
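Two of the features above have standard textbook forms that can be sketched directly. The Adamic-Adar score for a pair is the sum of 1 / log(degree) over their common neighbours, and the global clustering coefficient (transitivity) is the number of closed triplets divided by the number of all connected triplets. The sketch below assumes edge direction is ignored for both, which this page does not specify; the random walk score is omitted because its definition is still open. Names are illustrative.

```python
import math
from collections import defaultdict
from itertools import combinations


def undirected_neighbours(edges):
    """Collapse directed edges into undirected neighbour sets (no self-loops)."""
    nbrs = defaultdict(set)
    for a, b in edges:
        if a != b:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs


def adamic_adar(nbrs, x, y):
    """Sum of 1 / log(degree(z)) over common neighbours z of x and y."""
    common = nbrs.get(x, set()) & nbrs.get(y, set())
    # A common neighbour of two distinct nodes has degree >= 2, so
    # log(degree) > 0; the guard only protects against odd inputs.
    return sum(1.0 / math.log(len(nbrs[z])) for z in common if len(nbrs[z]) > 1)


def global_clustering(nbrs):
    """Transitivity: closed triplets divided by all connected triplets."""
    closed = 0
    triplets = 0
    for adj in nbrs.values():
        # Each pair of neighbours of a node forms a triplet centred on it.
        for a, b in combinations(adj, 2):
            triplets += 1
            if b in nbrs[a]:
                closed += 1
    return closed / triplets if triplets else 0.0


# Toy graph, invented for illustration only.
EXAMPLE_EDGES = [(1, 2), (2, 3), (3, 1), (4, 3)]
NBRS = undirected_neighbours(EXAMPLE_EDGES)
AA_SCORE = adamic_adar(NBRS, 1, 4)
TRANSITIVITY = global_clustering(NBRS)
```

The global clustering coefficient is a single number for the whole graph, so it is constant across rows and only useful when comparing graphs or subgraphs.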

The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.