Machine Learning/Kaggle Social Network Contest/Features

From Noisebridge

Revision as of 22:17, 19 November 2010

TODO

  • Precisely define the listed features

Possible Features

  • nodeid
  • nodetofollowid
  • median path length
  • shortest distance from nodeid to nodetofollowid
  • inbound edges
  • outbound edges
  • clustering coefficient
  • reciprocation probability (number of outbound edges that are returned / number of outbound edges)
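
A minimal sketch of how a few of the features above could be computed, assuming the graph is held in memory as a dict mapping each node to the set of nodes it follows. The toy graph, the function names, and the choice to return 0.0 for a node with no outbound edges are illustrative assumptions, not definitions agreed on this page.

```python
from collections import deque

# Hypothetical toy graph: edges[u] is the set of nodes that u follows.
edges = {
    1: {2, 3},
    2: {1, 3},
    3: {4},
    4: set(),
}

def outbound_edges(graph, node):
    """Number of edges leaving `node`."""
    return len(graph.get(node, ()))

def inbound_edges(graph, node):
    """Number of edges pointing at `node`."""
    return sum(1 for targets in graph.values() if node in targets)

def reciprocation_probability(graph, node):
    """Fraction of `node`'s outbound edges that are returned."""
    out = graph.get(node, set())
    if not out:
        return 0.0  # assumption: treat "no outbound edges" as 0
    returned = sum(1 for v in out if node in graph.get(v, ()))
    return returned / len(out)

def shortest_distance(graph, source, target):
    """Directed shortest path length via breadth-first search.

    Returns None when `target` cannot be reached from `source`.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```

For a candidate pair, `shortest_distance(edges, nodeid, nodetofollowid)` would be evaluated on the graph with that edge absent. The scan in `inbound_edges` is linear in the graph size, so a real run would probably precompute a reverse adjacency map instead.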

The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.

From Backstrom and Leskovec, for a node s and a potential target c:

  • Network features
    • unweighted random walk score
    • Adamic-Adar score
    • number of common friends
    • indegrees and outdegrees of s
      • the indegree is the number of edges coming into node s
      • the outdegree is the number of edges leaving node s
    • indegrees and outdegrees of c
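
The network features above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: "friends" are taken to be neighbors with edge direction ignored, the random walk is a walk with restart to s using a restart probability of 0.15 and a fixed number of power iterations, and walkers at nodes with no outbound edges jump back to s. The toy graph and all function names are hypothetical.

```python
import math

# Hypothetical toy graph: edges[u] is the set of nodes that u follows.
edges = {
    1: {2, 3},
    2: {1, 3},
    3: {4},
    4: set(),
}

def undirected_neighbors(graph):
    """Neighbor sets with edge direction ignored (an assumption)."""
    nbrs = {}
    for u, targets in graph.items():
        nbrs.setdefault(u, set())
        for v in targets:
            nbrs[u].add(v)
            nbrs.setdefault(v, set()).add(u)
    return nbrs

def common_friends(nbrs, s, c):
    """Nodes adjacent to both s and c, excluding s and c themselves."""
    return (nbrs.get(s, set()) & nbrs.get(c, set())) - {s, c}

def adamic_adar(nbrs, s, c):
    """Sum of 1 / log(degree(z)) over the common friends z of s and c."""
    return sum(
        1.0 / math.log(len(nbrs[z]))
        for z in common_friends(nbrs, s, c)
        if len(nbrs[z]) > 1  # guard against log(1) == 0
    )

def random_walk_scores(graph, s, restart=0.15, iterations=50):
    """Unweighted random walk with restart to s, by power iteration.

    Returns a dict of visiting probabilities; the score for a
    candidate c is the entry for c.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    score = {n: 0.0 for n in nodes}
    score[s] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            out = graph.get(u, ())
            if out:
                share = (1.0 - restart) * score[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # assumption: dangling nodes send their mass back to s
                nxt[s] += (1.0 - restart) * score[u]
        nxt[s] += restart  # total mass is 1, so the restart share is `restart`
        score = nxt
    return score

nbrs = undirected_neighbors(edges)
scores = random_walk_scores(edges, s=1)
```

Here `len(common_friends(nbrs, s, c))` gives the number of common friends, and `scores[c]` gives the random walk score of candidate c as seen from s.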