Machine Learning Meetup Notes: 2010-05-26
Latest revision as of 22:09, 26 May 2010
- Andy gave an overview of where we are with the KDD data
- Mike S gave a presentation:
  - Gaussian Mixture Models
  - k-means clustering
  - very basic expectation-maximization
- Brainstorming session on how to reduce the skill set column
  - Tom tried to quantify opportunity per skill per row as a high-dimensional vector
- Brainstorming on how to reduce other data and compute new features for the KDD Dataset
  - Tom will apply k-means clustering of skills (or steps), for data reduction
  - Andy will compute new features: unique step/problem id, student IQ (avg. correct), step challenge/difficulty (avg. correct), step complexity (# skills required)
  - Mike will use self-organizing maps to reduce skills
  - Paul will visualize/summarize the data, to provide understanding and insight
  - Mike will set up an FTP server for people to transfer their enormous datasets
  - Theo will use some Weka classifiers to produce a classification method for the data
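
The new features listed above for Andy (unique step/problem id, student average correct, step average correct, and number of skills per step) could be sketched roughly as follows. This is only an illustration, not the code anyone at the meetup wrote: the row keys (`student`, `problem`, `step`, `correct`, `skills`), the `~~` skill separator, and the function name `compute_features` are all assumptions made for the example.

```python
from collections import defaultdict


def compute_features(rows):
    """Compute per-row features from a list of dicts (hypothetical schema)."""
    # Running [number correct, number of rows] per student and per step.
    student_totals = defaultdict(lambda: [0, 0])
    step_totals = defaultdict(lambda: [0, 0])

    for row in rows:
        step_id = (row["problem"], row["step"])  # unique step/problem id
        for totals, key in ((student_totals, row["student"]),
                            (step_totals, step_id)):
            totals[key][0] += row["correct"]
            totals[key][1] += 1

    features = []
    for row in rows:
        step_id = (row["problem"], row["step"])
        student_correct, student_n = student_totals[row["student"]]
        step_correct, step_n = step_totals[step_id]
        # Assumed: skills are stored in one string, separated by "~~".
        skills = [s for s in row["skills"].split("~~") if s]
        features.append({
            "step_id": step_id,
            "student_avg_correct": student_correct / student_n,  # "student IQ"
            "step_avg_correct": step_correct / step_n,  # step difficulty
            "step_complexity": len(skills),  # number of skills required
        })
    return features


# Tiny made-up example, not real KDD data.
rows = [
    {"student": "s1", "problem": "p1", "step": "a", "correct": 1, "skills": "add~~carry"},
    {"student": "s1", "problem": "p1", "step": "b", "correct": 0, "skills": "add"},
    {"student": "s2", "problem": "p1", "step": "a", "correct": 0, "skills": "add~~carry"},
    {"student": "s2", "problem": "p1", "step": "b", "correct": 0, "skills": ""},
]
features = compute_features(rows)
```

Note that each average here includes the row's own outcome, so when these features feed a classifier, the averages should probably be computed on training rows only.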