[ml] k-means data clustering... Bueller?
David Faden
dfaden at gmail.com
Wed Jun 17 17:42:58 PDT 2009
I've attached some C code for k-means. I think it was partly
machine-translated from Fortran, so it is somewhat ugly. The author
has okayed its use. (You might still want to go with a GPL
implementation just to be perfectly safe, though.) I believe it should
build with no dependencies.
David
On Tue, Jun 16, 2009 at 5:19 PM, Almir Karic<almir at kiberpipa.org> wrote:
> i have toyed with it, scipy.cluster.vq.kmeans2 :D does pretty much everything
> we need/want, you can pass it means or you can tell it how many means you want
> and it randomly picks them (of course in both cases telling you which of the
> means each of your vectors is closest to)
>
> On Tue, Jun 16, 2009 at 03:04:02PM -0700, Josh Myer wrote:
>> Following up on Michael's post about HMM libraries, has anyone started
>> on the k-means part of the wiimote stuff? Any libraries to post
>> about, etc?
>> --
>> Josh Myer 650.248.3796
>> josh at joshisanerd.com
>> _______________________________________________
>> ml mailing list
>> ml at lists.noisebridge.net
>> https://www.noisebridge.net/mailman/listinfo/ml
>
--
David Faden, dfaden at iastate.edu
AIM: pitulx
-------------- next part --------------
#ifndef KMEANS_H
#define KMEANS_H
/**
* Run k-means on one-dimensional data.
* Returns a nonzero value to indicate failure.
*/
int kmeans1dCenters(const double* y, int len,
double* centers, int k);
/**
* Run k-means on one-dimensional data.
* Returns a nonzero value to indicate failure.
*
* classes -- len-length array giving final closest center for ith point
* counts -- k-length array giving number of points assigned to ith center.
*/
int kmeans1d(const double* y, int len, double* centers,
int k, int* classes, int* counts);
/** ALGORITHM AS 136 APPL. STATIST. (1979) VOL.28, NO.1
* Divide M points in N-dimensional space into K clusters so that the within
* cluster sum of squares is minimized.
*
* a -- m x n input array
* m -- number of points/rows in a
* n -- dimension of a point (length of a row in a)
* c -- k x n array of centers
* k -- number of clusters
* ic1 -- m-length workspace (closest center to ith point)
* nc -- k-length workspace (number of points in jth cluster)
* iter -- max num of iterations
* wss -- k-length workspace
* ifault -- a nonzero value indicates an error
*/
void kmeans(const double * const * a, int m, int n,
double **c, int k, int *ic1, int *nc,
int iter, double *wss, int *ifault);
#endif
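
The header documents the argument layout of kmeans1d but not how a caller would drive it, so here is a minimal sketch of one way a routine with that signature could behave. It is not the attached implementation (which follows AS 136); it is a plain Lloyd-style iteration written only for illustration. The name sketch_kmeans1d, the fixed cap of 100 passes, and the assumption that centers[] holds k initial guesses on entry and the final centers on return are all hypothetical and do not come from the attachment.

```c
/* Hypothetical sketch, NOT the attached code: plain Lloyd iteration on
 * one-dimensional data, using the same argument order as kmeans1d.
 * Assumption: centers[] holds k initial guesses on entry and receives
 * the final centers. Returns nonzero to indicate failure. */
int sketch_kmeans1d(const double *y, int len, double *centers,
                    int k, int *classes, int *counts)
{
    if (!y || !centers || !classes || !counts || k < 1 || len < k)
        return 1;

    for (int pass = 0; pass < 100; ++pass) {  /* arbitrary iteration cap */
        int changed = 0;

        /* Assignment step: attach each point to its nearest center. */
        for (int i = 0; i < len; ++i) {
            int best = 0;
            double bestd = (y[i] - centers[0]) * (y[i] - centers[0]);
            for (int j = 1; j < k; ++j) {
                double d = (y[i] - centers[j]) * (y[i] - centers[j]);
                if (d < bestd) { bestd = d; best = j; }
            }
            /* classes[] is uninitialized on the first pass, so the
             * short-circuit avoids reading it there. */
            if (pass == 0 || classes[i] != best) changed = 1;
            classes[i] = best;
        }

        /* Update step: move each center to the mean of its points. */
        for (int j = 0; j < k; ++j) {
            double sum = 0.0;
            counts[j] = 0;
            for (int i = 0; i < len; ++i)
                if (classes[i] == j) { sum += y[i]; ++counts[j]; }
            if (counts[j] > 0)
                centers[j] = sum / counts[j];
            /* An empty cluster keeps its previous center. */
        }

        if (!changed) break;  /* assignments are stable: converged */
    }
    return 0;
}
```

A caller would allocate classes with len entries and counts with k entries, seed centers with k starting values, and check the return value before reading any of the output arrays. The real kmeans1d may choose its own starting centers, so treat the seeding step as part of the sketch only.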