Machine Learning/Kaggle Social Network Contest/Features

From Noisebridge

TODO

  • Precisely define the listed features

Possible Features

  • Node Features
    • nodeid
    • outdegree
    • indegree
    • local clustering coefficient
    • inbound reciprocation probability (number of inbound edges that are returned / number of inbound edges)
    • outbound reciprocation probability (number of outbound edges that are returned / number of outbound edges)
  • Edge Features
    • nodetofollowid
    • shortest path distance from nodeid to nodetofollowid
    • density? (median path length)
    • does the reverse edge exist? (i.e., is nodetofollowid already following nodeid?)
    • number of common friends
    • indegree and outdegree of nodetofollowid
  • Network features
    • unweighted random walk score
    • global clustering coefficient
    • Adamic-Adar score
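
As a starting point for the TODO, several of the features above can be sketched in plain Python. This is only one possible reading of the list, not an agreed definition: the function names are made up for illustration, "common friends" and the clustering coefficient are assumed to ignore edge direction, and the example edge list is invented.

```python
import math
from collections import defaultdict, deque

def build_graph(edges):
    """Build successor and predecessor maps from directed (follower, followed) pairs."""
    succ, pred = defaultdict(set), defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred

def neighbors(u, succ, pred):
    """Assumption: a 'friend' is any node linked to u in either direction."""
    return (succ.get(u, set()) | pred.get(u, set())) - {u}

def node_features(u, succ, pred):
    out_nbrs, in_nbrs = succ.get(u, set()), pred.get(u, set())
    mutual = len(out_nbrs & in_nbrs)  # edges that are returned
    return {
        "outdegree": len(out_nbrs),
        "indegree": len(in_nbrs),
        "recip_inbound": mutual / len(in_nbrs) if in_nbrs else 0.0,
        "recip_outbound": mutual / len(out_nbrs) if out_nbrs else 0.0,
    }

def local_clustering(u, succ, pred):
    """Fraction of pairs of u's neighbors that are themselves linked (direction ignored)."""
    nbrs = neighbors(u, succ, pred)
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Each linked pair is seen once from each endpoint, hence the division by 2.
    links = sum(1 for v in nbrs for w in neighbors(v, succ, pred) if w in nbrs) // 2
    return 2 * links / (k * (k - 1))

def shortest_distance(src, dst, succ):
    """Breadth-first search along edge direction; None if dst is unreachable."""
    if src == dst:
        return 0
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def edge_features(u, v, succ, pred):
    common = neighbors(u, succ, pred) & neighbors(v, succ, pred)
    # Adamic-Adar: common neighbors with few links count for more.
    adamic_adar = sum(
        1.0 / math.log(len(neighbors(z, succ, pred)))
        for z in common
        if len(neighbors(z, succ, pred)) > 1
    )
    return {
        "shortest_distance": shortest_distance(u, v, succ),
        "reverse_edge_exists": u in succ.get(v, set()),
        "common_friends": len(common),
        "adamic_adar": adamic_adar,
        "target_indegree": len(pred.get(v, set())),
        "target_outdegree": len(succ.get(v, set())),
    }

# Invented toy graph: (follower, followed) pairs.
edges = [(1, 2), (2, 1), (1, 3), (3, 2), (4, 1)]
succ, pred = build_graph(edges)
print(node_features(1, succ, pred))
print(edge_features(4, 2, succ, pred))
```

The breadth-first search is fine for a toy graph, but on the full contest graph it would probably need a depth cutoff to stay tractable.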

The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.
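
Toward the TODO, here is one possible way to write a few of the features and the response formally. The notation is an assumption, not something fixed by the contest: G = (V, E) is the directed follow graph without self-loops, N(u) is the set of nodes linked to u in either direction, and E_future is the (unknown) set of edges that exist at some later time.

```latex
\begin{align*}
d_{\text{out}}(u) &= |\{v : (u,v) \in E\}|, \qquad
d_{\text{in}}(u) = |\{v : (v,u) \in E\}| \\
R(u) &= \{v : (u,v) \in E \text{ and } (v,u) \in E\} \\
r_{\text{in}}(u) &= \frac{|R(u)|}{d_{\text{in}}(u)}, \qquad
r_{\text{out}}(u) = \frac{|R(u)|}{d_{\text{out}}(u)} \\
C(u) &= \frac{2\,\bigl|\{\{v,w\} \subseteq N(u) : w \in N(v)\}\bigr|}{|N(u)|\,\bigl(|N(u)|-1\bigr)} \\
\mathrm{AA}(u,v) &= \sum_{z \in N(u) \cap N(v)} \frac{1}{\log |N(z)|} \\
y_{uv} &= \Pr\bigl[(u,v) \in E_{\text{future}} \mid (u,v) \notin E\bigr]
\end{align*}
```

Here u plays the role of nodeid and v the role of nodetofollowid. The ratios are undefined when the denominator is zero, so a convention (for example, treating them as 0) would still need to be chosen.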