Machine Learning Meetup Notes: 2010-05-19: Difference between revisions

Latest revision as of 22:04, 23 May 2010

Erin provided a list of unique SubSkills and TracedSkills with frequencies, as well as a Python script to normalize the skill values in the challenge sets.
Vikram gave a presentation and demo on Hadoop, EC2 and MapReduce. He created a set of scripts for running MapReduce on EC2. Those tools can be found on GitHub: http://github.com/voberoi/hadoop-mrutils

Here are some MapReduce notes:

Word counts (let the line number be the key):

1 hello how are you
2 how is it going
3 are you happy

def map(key, value):
	# key: line number, value: the text of that line
	words = value.split()
	# e.g. ["hello", "how", "are", "you"]
	for word in words:
		emit(word, 1)

def reduce(key, values):
	# values: every 1 emitted for this word
	emit(key, len(values))

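results (values grouped by key, as passed to reduce):

hello [1]
how [1,1]
are [1,1]
you [1,1]
is [1]
it [1]
going [1]
happy [1]

reduce then emits the length of each list, e.g. how 2.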
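
The word count above can be tried locally without Hadoop. The following is only a minimal single-process sketch: `map_fn`, `reduce_fn` and `run_job` are hypothetical names made up for this illustration (they are not part of Hadoop or of Vikram's scripts), and generators stand in for `emit`.

```python
from collections import defaultdict


def map_fn(key, value):
    """Yield (word, 1) for every word on the line; key is the line number."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """Yield (word, count); every value is 1, so the count is the list length."""
    yield key, len(values)


def run_job(records):
    """Hypothetical local driver: map, group by key, then reduce."""
    # Shuffle step: collect every emitted value under its key.
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)

    # Reduce step: one call per distinct key.
    results = {}
    for key, values in groups.items():
        for out_key, out_value in reduce_fn(key, values):
            results[out_key] = out_value
    return dict(groups), results


lines = [(1, "hello how are you"), (2, "how is it going"), (3, "are you happy")]
grouped, counts = run_job(lines)

for word, count in counts.items():
    print(word, count)
```

Here `grouped` holds the per-word lists shown in the results, and `counts` holds what reduce emits for each word. On a real cluster the grouping step is done by the framework between the map and reduce phases.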

