Machine Learning/Kaggle Social Network Contest/Features

From Noisebridge

Revision as of 22:17, 19 November 2010

TODO

  • Precisely define the listed features

Possible Features

  • nodeid
  • nodetofollowid
  • median path length
  • shortest distance from nodeid to nodetofollowid
  • inbound edges
  • outbound edges
  • clustering coefficient
  • reciprocation probability (number of edges returned / number of outbound edges)

The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.
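
As a starting point for the TODO, some of the features listed above could be computed roughly as follows. This is only a sketch: it assumes the graph is given as a list of directed (source, target) pairs, the function names are made up for illustration, and returning 0 for the reciprocation probability of a node with no outbound edges is an arbitrary choice. Median path length and clustering coefficient are left out because their definitions have not been pinned down yet.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build successor and predecessor maps from directed (source, target) pairs."""
    succ, pred = defaultdict(set), defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred


def reciprocation_probability(succ, node):
    """Number of edges returned / number of outbound edges."""
    out = succ.get(node, set())
    if not out:
        return 0.0  # assumption: treat a node with no outbound edges as 0
    returned = sum(1 for v in out if node in succ.get(v, set()))
    return returned / len(out)


def shortest_distance(succ, source, target):
    """Hop count of the shortest directed path, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def basic_features(succ, pred, nodeid, nodetofollowid):
    """Collect a few of the listed features for one candidate edge."""
    return {
        "inbound_edges": len(pred.get(nodeid, ())),
        "outbound_edges": len(succ.get(nodeid, ())),
        "shortest_distance": shortest_distance(succ, nodeid, nodetofollowid),
        "reciprocation_probability": reciprocation_probability(succ, nodeid),
    }
```

For example, calling `build_graph` on the edge list and then `basic_features(succ, pred, nodeid, nodetofollowid)` gives one feature row per candidate edge.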

From Backstrom and Leskovec, for a node s and a potential target c:

  • Network features
    • unweighted random walk score
    • Adamic-Adar score
    • number of common friends
    • indegrees and outdegrees of s
      • the indegree is the number of edges coming into node s
      • the outdegree is the number of edges leaving node s
    • indegrees and outdegrees of c
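
The network features above might be sketched as below. Several choices here are assumptions, not taken from the paper: "friends" are taken to be neighbors in either direction, the random walk score is implemented as a random walk with restart from s using uniform edge weights, the restart probability of 0.15 and the iteration count are arbitrary, and a walker at a node with no outbound edges jumps back to s. The function names are hypothetical.

```python
import math
from collections import defaultdict


def undirected_neighbors(edges):
    """Neighbor sets ignoring edge direction (assumption: friendship is symmetric)."""
    nbrs = defaultdict(set)
    for u, v in edges:
        if u != v:
            nbrs[u].add(v)
            nbrs[v].add(u)
    return nbrs


def common_friends(nbrs, s, c):
    """Nodes adjacent to both s and c."""
    return (nbrs.get(s, set()) & nbrs.get(c, set())) - {s, c}


def adamic_adar(nbrs, s, c):
    """Sum of 1 / log(degree) over the common friends of s and c."""
    return sum(
        1.0 / math.log(len(nbrs[z]))
        for z in common_friends(nbrs, s, c)
        if len(nbrs[z]) > 1  # guard against log(1) == 0
    )


def degrees(edges, node):
    """(indegree, outdegree) of node; assumes the edge list has no duplicates."""
    indeg = sum(1 for _, v in edges if v == node)
    outdeg = sum(1 for u, _ in edges if u == node)
    return indeg, outdeg


def random_walk_scores(edges, s, restart=0.15, iters=50):
    """Random walk with restart from s over directed edges, uniform weights."""
    succ = defaultdict(set)
    nodes = {s}
    for u, v in edges:
        succ[u].add(v)
        nodes.update((u, v))
    p = {v: 0.0 for v in nodes}
    p[s] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u, mass in p.items():
            out = succ.get(u, ())
            if out:
                share = (1.0 - restart) * mass / len(out)
                for v in out:
                    nxt[v] += share
            else:
                nxt[s] += (1.0 - restart) * mass  # dead end: jump back to s
        nxt[s] += restart  # total mass in p is 1, so the restart mass is `restart`
        p = nxt
    return p
```

The score for a candidate target c is then `random_walk_scores(edges, s)[c]`, and `degrees(edges, s)` and `degrees(edges, c)` give the four degree features.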