[ml] drama prediction - training set

Wladyslaw Zbikowski embeddedlinuxguy at gmail.com
Thu May 31 18:57:47 PDT 2012


I'm here. Another guy who came for ML is in the library, and Zephyr and
Mischief might come.

On Thu, May 31, 2012 at 9:45 PM, Full Name <imsoexcitd at excite.com> wrote:
> Hey,
> I am planning on coming to the space tonight. Is anyone else planning on coming in?  I'd like to talk about creating a training set from the mbox file so we can create a drama prediction model.  We can consider all sorts of interesting features, but at the bare minimum, we should create a large sparse matrix of word counts for all (or a subset) of the words contained in the message body, the subject line, or both.  Secondly, we need to develop a protocol for labeling each message as drama or not-drama.  I don't know how diligently the [DRAMA] tag was applied to drama messages, but we can start there, and possibly also mark any messages that contain the word drama as "drama."
>
> Anyone want to work on creating the training set?
>
> -Erin
> _______________________________________________
> ml mailing list
> ml at lists.noisebridge.net
> https://www.noisebridge.net/mailman/listinfo/ml
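
To make the proposal above concrete, here is a rough sketch of how the training set might be built. It is only one possible starting point, not an agreed protocol. The function names (message_text, is_drama, build_training_set) and the file name ml.mbox are made up for illustration. It assumes that only the first text/plain part of each message matters, and it labels a message as drama if the subject carries the [DRAMA] tag or the word "drama" appears anywhere in the subject or body, as suggested in the quoted message. The sparse matrix is kept as plain (row, column, count) triples so that only the standard library is needed.

```python
import mailbox
import re
from collections import Counter

# Assumption: a "word" is a run of lowercase letters or apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def message_text(msg):
    """Return (subject, body), using the first text/plain part as the body."""
    subject = str(msg.get("Subject", "") or "")
    body = ""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload:
                body = payload.decode("utf-8", errors="replace")
            break
    return subject, body


def is_drama(subject, body):
    """Heuristic label: [DRAMA] tag in the subject, or the word 'drama' anywhere."""
    text = (subject + " " + body).lower()
    return "[drama]" in subject.lower() or re.search(r"\bdrama\b", text) is not None


def build_training_set(messages):
    """Build word-count triples and labels from an iterable of email messages.

    Returns (vocab, (rows, cols, counts), labels), where vocab maps each word
    to a column index and each triple (rows[k], cols[k], counts[k]) is one
    nonzero entry of the message-by-word count matrix.
    """
    vocab = {}
    rows, cols, counts, labels = [], [], [], []
    for i, msg in enumerate(messages):
        subject, body = message_text(msg)
        labels.append(1 if is_drama(subject, body) else 0)
        words = TOKEN_RE.findall((subject + " " + body).lower())
        for word, n in Counter(words).items():
            j = vocab.setdefault(word, len(vocab))
            rows.append(i)
            cols.append(j)
            counts.append(n)
    return vocab, (rows, cols, counts), labels


if __name__ == "__main__":
    # "ml.mbox" is a placeholder path for the list archive.
    vocab, (rows, cols, counts), labels = build_training_set(mailbox.mbox("ml.mbox"))
    print(len(labels), "messages,", len(vocab), "distinct words,", sum(labels), "labeled drama")
```

If SciPy is available, the triples can be turned into a real sparse matrix with scipy.sparse.coo_matrix((counts, (rows, cols)), shape=(len(labels), len(vocab))). Since the label here depends on the word "drama" itself, it probably makes sense to drop that column before training so the model cannot simply read the label off the features.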

