Machine Learning

From Noisebridge
(Difference between revisions)
(Software Tools)
(48 intermediate revisions by 11 users not shown)
Line 1: Line 1:
 
=== Next Meeting===
 
=== Next Meeting===
  
Note: there is currently a study group for the [http://www.ml-class.org/ Stanford ML Course] that meets up every Wednesday @ 7:30pm in the Church room, Monday meetups will be reserved for special presentations and announced here and on the [https://www.noisebridge.net/mailman/listinfo/ml mailing list].
+
*When:  
 
+
*Where: 2169 Mission St. (back NE corner, Church classroom)
*When: Wednesday, 11/16/2011 @ 7:30-9:00pm
+
*Topic:  
*Where: 2169 Mission St. (back corner, Church classroom)
+
*Details: Currently on hiatus until somebody decides to pick it back up!
*Topic: Working through [http://www.ml-class.org/course/quiz/list?type=quiz review] for [http://www.ml-class.org Stanford's ML Class]
+
*Who:  
*Details:
+
-----------------
+
*When: Monday, 11/21/2011 @ 7:30-9:00pm
+
*Where: 2169 Mission St. (back corner, Church classroom)
+
*Topic: Probability Distributions in Python, Gibb's Sampling for Restricted Boltzmann Machines
+
*Details: A gentle introduction to sampling from probability distributions using SciPy, and loose talk about how to do sampling for RBMs and Ising models
+
*Who: Mike S
+
 
+
  
 
=== Take the Noisebridge ML Survey ===
 
=== Take the Noisebridge ML Survey ===
Line 19: Line 11:
  
 
=== Crowdsourced Q&A ===
 
=== Crowdsourced Q&A ===
Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the [https://www.noisebridge.net/mailman/listinfo/ml mailing list] about it! Also consider setting up a day to come in and talk about the project you're working on and get input from other ML people.
+
Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the [https://www.noisebridge.net/mailman/listinfo/ml mailing list] about it! Also consider setting up a day to come in and talk about the project you're working on and get input from <span class="plainlinks">[http://www.andrewflusche.com/services/spotsylvania-reckless-driving-defense/<span style="color:black;font-weight:normal; text-decoration:none!important; background:none!important; text-decoration:none;">Spotsylvania reckless driving</span>] other ML people.
  
 
=== About Us ===
 
=== About Us ===
We're a loosely-knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's broad field that typically involves training computer models to solve problems. How can you participate? Join the [https://www.noisebridge.net/mailman/listinfo/ml mailing list], send an email and introduce yourself. Show up to the next meeting, share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.
+
We're a loosely-knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's broad field that typically involves training computer models to solve problems. How can you <span class="plainlinks">[http://www.monoloop.com<span style="color:black;font-weight:normal; text-decoration:none!important; background:none!important; text-decoration:none;">website personalization</span>] participate? Join the [https://www.noisebridge.net/mailman/listinfo/ml mailing list], send an email and introduce yourself. Show up to the next meeting, share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.
  
 
=== Talks and Workshops ===
 
=== Talks and Workshops ===
Line 42: Line 34:
  
 
=== Future Talks and Topics, Ideas ===
 
=== Future Talks and Topics, Ideas ===
*Restricted Boltzmann Machines (Mike S, late August)
+
*Random Forests in R
*Deep Nets w/ Stacked Autoencoders (Mike S, September)
+
*Restricted Boltzmann Machines (Mike S, some day)
*Generalized Linear Models (Mike S, September/October)
+
*Analyzing brain cells (Mike S)
*Graphical Models (Tony)
+
*Deep Nets w/ Stacked Autoencoders (Mike S, some day)
 +
*Generalized Linear Models (Mike S, Erin L? some day)
 +
*Graphical Models
 
*Working with the Kinect
 
*Working with the Kinect
 
*Computer Vision with OpenCV
 
*Computer Vision with OpenCV
Line 73: Line 67:
 
*[http://mlcomp.org/ MLcomp]
 
*[http://mlcomp.org/ MLcomp]
 
**Upload your algorithm and objectively compare its performance to other algorithms
 
**Upload your algorithm and objectively compare its performance to other algorithms
 +
*[http://www.ntis.gov/products/ssa-dmf.aspx Social Security Death Master File!]
  
=== [[Machine Learning/Tools | Software Tools]] ===
+
=== Software Tools ===
*[http://opencv.willowgarage.com/documentation/index.html OpenCV]
+
 
**Computer Vision Library
+
==== Generic ML Libraries ====
**Has ML component (SVM, trees, etc)
+
**Online tutorials [http://www.pages.drexel.edu/~nk752/tutorials.html here]
+
*[http://lucene.apache.org/mahout/ Mahout]
+
**Hadoop cluster based ML package.
+
 
*[http://www.cs.waikato.ac.nz/ml/weka/ Weka]
 
*[http://www.cs.waikato.ac.nz/ml/weka/ Weka]
 
**a collection of data mining tools and machine learning algorithms.
 
**a collection of data mining tools and machine learning algorithms.
Line 95: Line 86:
 
*[http://pyml.sourceforge.net PyML]
 
*[http://pyml.sourceforge.net PyML]
 
*[http://mdp-toolkit.sourceforge.net/ MDP]
 
*[http://mdp-toolkit.sourceforge.net/ MDP]
**Does not stand for Markov Decision Process :(
+
**Modular framework, has lots of stuff!
*[http://www.ailab.si/orange/ Orange]
+
**Strong data visualization component
+
*[http://jmlr.csail.mit.edu/mloss/ Journal of Machine Learning Software List]
+
 
*[[Machine Learning/VirtualBox|VirtualBox]] Virtual Box Image with Pre-installed Libraries listed here
 
*[[Machine Learning/VirtualBox|VirtualBox]] Virtual Box Image with Pre-installed Libraries listed here
 
*[http://deeplearning.net/software/theano/ Theano: Symbolic Expressions and Transparent GPU Integration]
 
*[http://deeplearning.net/software/theano/ Theano: Symbolic Expressions and Transparent GPU Integration]
 +
*[http://sympy.org sympy] Does symbolic math
 
*[http://waffles.sourceforge.net/ Waffles]
 
*[http://waffles.sourceforge.net/ Waffles]
 
**Open source C++ set of machine learning command line tools.
 
**Open source C++ set of machine learning command line tools.
 
*[http://rapid-i.com/content/view/181/196/ RapidMiner]
 
*[http://rapid-i.com/content/view/181/196/ RapidMiner]
 
*[http://www.mrpt.org/ Mobile Robotic Programming Toolkit]
 
*[http://www.mrpt.org/ Mobile Robotic Programming Toolkit]
 +
*[http://nipy.sourceforge.net/nitime/ nitime]
 +
**NeuroImaging in Python, has some good time series analysis stuff and multi-variate response fitting.
 +
*[http://pandas.pydata.org/ Pandas]
 +
**Data analysis workflow in python
 +
*[http://www.pytables.org/moin PyTables]
 +
**Adds querying capabilities to HDF5 files
 +
*[http://statsmodels.sourceforge.net/ statsmodels]
 +
**Regression, time series analysis, statistics stuff for python
 +
*[https://github.com/JohnLangford/vowpal_wabbit/wiki Vowpal Wabbit]
 +
**"Intrinsically Fast" implementation of gradient descent for large datasets
 +
*[http://www.shogun-toolbox.org/ Shogun]
 +
**Fast implementations of SVMs
 +
*[http://mc-stan.org/ Stan]
 +
**A graphical model compiler
 +
 +
==== Computer Vision ====
 +
*[http://opencv.willowgarage.com/documentation/index.html OpenCV]
 +
**Computer Vision Library
 +
**Has ML component (SVM, trees, etc)
 +
**Online tutorials [http://www.pages.drexel.edu/~nk752/tutorials.html here]
 +
 +
==== Audio Processing ====
 +
*[http://tlecomte.github.com/friture/ Friture]
 +
**Real-time spectrogram generation
 +
*[http://code.google.com/p/pyo/ pyo]
 +
**Real-time audio signal processing
 +
*[https://github.com/jsawruk/pymir PYMir]
 +
**A library for reading mp3's into python, and doing analysis
 +
*[http://wiki.python.org/moin/PythonInMusic List of Sound Tools for Python]
 +
 +
==== Data Visualization ====
 +
*[http://www.ailab.si/orange/ Orange]
 +
**Strong data visualization component
 
*[http://gephi.org/ Gephi]
 
*[http://gephi.org/ Gephi]
 
**Graph Visualization
 
**Graph Visualization
 
*[http://had.co.nz/ggplot2/ ggplot]
 
*[http://had.co.nz/ggplot2/ ggplot]
 
**Nice plotting package for R
 
**Nice plotting package for R
*[http://nipy.sourceforge.net/nitime/ nitime]
+
*[http://code.enthought.com/projects/mayavi/ MayaVi2]
**NeuroImaging in Python, has some good time series analysis stuff and multi-variate response fitting.
+
**3D Scientific Data Visualization
 +
 
 +
==== Cluster Computing ====
 +
*[http://lucene.apache.org/mahout/ Mahout]
 +
**Hadoop cluster based ML package.
 +
*[http://web.mit.edu/star/cluster/ STAR: Cluster]
 +
**Easily build your own Python computing cluster on Amazon EC2
 +
 
 +
==== Other ====
 +
*[http://jmlr.csail.mit.edu/mloss/ Journal of Machine Learning Software List]
  
 
=== Presentations and other Materials ===
 
=== Presentations and other Materials ===
Line 164: Line 195:
  
 
=== [[Machine Learning/Meeting Notes|Meeting Notes]]===
 
=== [[Machine Learning/Meeting Notes|Meeting Notes]]===
 +
 +
[[Category:Events]]
 +
[[Category:Projects]]

Revision as of 23:37, 5 February 2013

Next Meeting

  • When:
  • Where: 2169 Mission St. (back NE corner, Church classroom)
  • Topic:
  • Details: Currently on hiatus until somebody decides to pick it back up!
  • Who:

Take the Noisebridge ML Survey

Take a survey and vote for what you want to learn!

Crowdsourced Q&A

Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the mailing list about it! Also consider setting up a day to come in and talk about the project you're working on and get input from other ML people.

About Us

We're a loosely-knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's a broad field that typically involves training computer models to solve problems. How can you participate? Join the mailing list, send an email and introduce yourself. Show up to the next meeting, share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.

Talks and Workshops

We've given lots of workshops and talks over the past year or so; here are a few. Many of the workshops we've given previously are recurring and will be given again, especially upon request!

Code and SourceForge Site

    git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge
  • Send an email to the list if you want to become an administrator on the site to get write access to the git repo!

Future Talks and Topics, Ideas

  • Random Forests in R
  • Restricted Boltzmann Machines (Mike S, some day)
  • Analyzing brain cells (Mike S)
  • Deep Nets w/ Stacked Autoencoders (Mike S, some day)
  • Generalized Linear Models (Mike S, Erin L? some day)
  • Graphical Models
  • Working with the Kinect
  • Computer Vision with OpenCV

Mailing List

https://www.noisebridge.net/mailman/listinfo/ml

Projects

Datasets and Websites

Software Tools

Generic ML Libraries

Computer Vision

  • OpenCV
    • Computer Vision Library
    • Has ML component (SVM, trees, etc)
    • Online tutorials here

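Audio Processing

  • Friture
    • Real-time spectrogram generation
  • pyo
    • Real-time audio signal processing
  • PYMir
    • A library for reading MP3s into Python and analyzing them
  • List of Sound Tools for Python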

Data Visualization

  • Orange
    • Strong data visualization component
  • Gephi
    • Graph Visualization
  • ggplot
    • Nice plotting package for R
  • MayaVi2
    • 3D Scientific Data Visualization

Cluster Computing

  • Mahout
    • Hadoop cluster based ML package.
  • STAR: Cluster
    • Easily build your own Python computing cluster on Amazon EC2

Other

  • Journal of Machine Learning Software List

Presentations and other Materials

Topics to Learn and Teach

NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)

CS229 - The Stanford Machine Learning Course @ Noisebridge

  • Supervised Learning
    • Linear Regression
    • Linear Discriminants
    • Neural Nets/Radial Basis Functions
    • Support Vector Machines
    • Classifier Combination [1]
    • A basic decision tree builder, recursive and using entropy metrics
  • Reinforcement Learning
    • Temporal Difference Learning
  • Math, Probability & Statistics
    • Metric spaces and what they mean
    • Fundamentals of probabilities
    • Decision Theory (Bayesian)
    • Maximum Likelihood
    • Bias/Variance Tradeoff, VC Dimension
    • Bagging, Bootstrap, Jackknife [2]
    • Information Theory: Entropy, Mutual Information, Gaussian Channels
    • Estimation of Misclassification [3]
    • No Free Lunch Theorem [4]
  • Machine Learning SDKs
    • OpenCV ML component (SVM, trees, etc.)
    • Mahout, a Hadoop-cluster-based ML package
    • Weka, a collection of data mining tools and machine learning algorithms
  • Applications
    • Collective Intelligence & Recommendation Engines
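
One topic above, "a basic decision tree builder, recursive and using entropy metrics", is compact enough to sketch. The following is only an illustration, not code from the Noisebridge repo: the function names (entropy, best_split, build_tree, predict), the numeric-threshold splits, and the dict-based tree layout are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_split(rows, labels):
    """Return the (feature index, threshold) with the largest information gain.

    Returns None when no split reduces entropy.
    """
    base = entropy(labels)
    best, best_gain = None, 0.0
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put something on both sides
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            gain = base - weighted
            if gain > best_gain:
                best, best_gain = (feature, threshold), gain
    return best


def build_tree(rows, labels):
    """Recursively build a tree: leaves are labels, internal nodes are dicts."""
    split = best_split(rows, labels)
    if split is None:
        # No useful split left: emit a leaf holding the majority label.
        return Counter(labels).most_common(1)[0][0]
    feature, threshold = split
    left_idx = [i for i, row in enumerate(rows) if row[feature] <= threshold]
    right_idx = [i for i, row in enumerate(rows) if row[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left_idx], [labels[i] for i in left_idx]),
        "right": build_tree([rows[i] for i in right_idx], [labels[i] for i in right_idx]),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the label stored there."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


if __name__ == "__main__":
    # Toy data (made up for illustration): one numeric feature, two classes.
    rows = [[1.0], [2.0], [3.0], [4.0]]
    labels = ["a", "a", "b", "b"]
    tree = build_tree(rows, labels)
    print(tree)
    print(predict(tree, [1.5]), predict(tree, [3.5]))
```

The sketch has no depth limit or pruning, so it will fit noise in real data; the toolkits listed under Software Tools (Weka, OpenCV's ML component) provide more complete tree learners.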

Meeting Notes
