[EEG] Notes on non-trivial data analysis
Anthony Di Franco
di.franco at aya.yale.edu
Wed Jul 1 15:53:06 PDT 2009
Notes on analysis of non-trivial data, quick and dirty.
Orientation: Let's kick this off by looking at the two links below, and
bear in mind that we are in the third and fourth quadrants of Taleb's
taxonomy. We even have problems that would confound us in simpler
settings, like extreme signal degeneracy with respect to what we care
about (cardinality of EEG traces vs. cardinality of brain states) and
low signal-to-noise ratio, so we will need to pull out the heavy metal
on this, but that's what I like best anyway.
http://www.edge.org/3rd_culture/taleb08/taleb08_index.html
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1343042
Approaches:
Clustering / vector quantization
An EEG cap brings in a vector time series: multiple readings per unit
time. As a preprocessing / compression step, or to infer some
unobserved part of the vector from the observed parts, we might want
to group vectors that are alike in some sense together, and represent
each one by, or relative to, the group it is in.
See http://en.wikipedia.org/wiki/LVQ and http://code.google.com/p/lda1pfs/
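To make the grouping idea concrete, here is a minimal sketch of a
vector quantizer. Note that it is plain k-means (Lloyd's algorithm),
an unsupervised cousin of LVQ, not LVQ itself. The function names, the
toy two-channel frames, and the choice of starting codebook are all
made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vector_quantize(vectors, codebook, iters=10):
    """Plain k-means: refine `codebook` and return (codebook, labels).

    `codebook` is the caller's initial guess at the group centers.
    """
    codebook = [list(c) for c in codebook]
    labels = []
    for _ in range(iters):
        # Assignment step: each vector joins its nearest codebook entry.
        labels = [
            min(range(len(codebook)), key=lambda j: sq_dist(v, codebook[j]))
            for v in vectors
        ]
        # Update step: each entry moves to the mean of its members.
        for j in range(len(codebook)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook, labels


# Toy "frames": six readings from a hypothetical two-channel cap.
frames = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
codebook, labels = vector_quantize(frames, [frames[0], frames[-1]])
print(labels)
print(codebook)
```

Each frame can then be replaced by its label (compression), or a
missing channel can be guessed from the matching codebook entry.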
Static function approximation, (artificial) neural nets:
Given a set of input vectors, we might want to learn an arbitrary
function from the input vectors to some other space, for example,
static EEG signal to some other item of interest. Feed-forward neural
networks and support vector (kernel) machines can do this, but they
are meant for stateless mappings so dealing with time series using
them is crappy and ad-hoc.
See http://en.wikipedia.org/wiki/Feedforward_neural_network and
http://en.wikipedia.org/wiki/Support_vector_machine
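One common ad-hoc workaround, assumed here purely for illustration, is
to glue a few consecutive frames together into one long input vector
so that a stateless learner sees a little history. The helper name and
the window width are hypothetical.

```python
def sliding_windows(series, width):
    """Flatten each run of `width` consecutive frames into one vector."""
    return [
        [x for frame in series[t:t + width] for x in frame]
        for t in range(len(series) - width + 1)
    ]


# Three two-channel frames, window of two frames.
series = [[1, 2], [3, 4], [5, 6]]
print(sliding_windows(series, 2))
```

The width has to be picked by hand and the input dimension grows with
it, which is a good part of why this feels hacky.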
Liquid / echo state machine:
An approach to using static function approximators / estimators with
time series data is to pass the input sequence through a chaotic (in
the mathematical sense) mapping as a pre-processing step. I used this
several years ago to learn prosody classes of spoken utterances quite
successfully, and it was used to discriminate among spoken digits
before then. But it's kind of hacky and has been surpassed in
performance on benchmark tasks by advanced recurrent neural networks.
http://www.scholarpedia.org/article/Echo_state_network
http://en.wikipedia.org/wiki/Liquid_state_machine
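A minimal sketch of the pre-processing step only: a fixed random
recurrent layer whose states would then be handed to a static
approximator (the readout is left out). All names and sizes are made
up. The row scaling is a conservative assumption of mine that makes
the reservoir forget its starting state; reservoirs in practice are
usually tuned much closer to instability.

```python
import math
import random


def make_reservoir(n_units, n_inputs, scale=0.9, seed=0):
    """Random fixed weights; nothing in here is ever trained."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) for _ in range(n_units)] for _ in range(n_units)]
    # Rescale every row so its absolute sum is `scale` (< 1). Since tanh
    # never stretches distances, the state update is then a contraction.
    w = [[scale * x / sum(abs(y) for y in row) for x in row] for row in w]
    w_in = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)]
    return w, w_in


def run_reservoir(w, w_in, inputs, state=None):
    """Drive the reservoir with `inputs`; return the state after each step."""
    n = len(w)
    state = list(state) if state is not None else [0.0] * n
    states = []
    for u in inputs:
        state = [
            math.tanh(
                sum(wij * sj for wij, sj in zip(w[i], state))
                + sum(vik * uk for vik, uk in zip(w_in[i], u))
            )
            for i in range(n)
        ]
        states.append(state)
    return states


w, w_in = make_reservoir(n_units=10, n_inputs=1)
signal = [[math.sin(0.1 * t)] for t in range(200)]
states = run_reservoir(w, w_in, signal)
print(len(states), len(states[0]))
```

Each returned state is a fixed-length summary of the recent input
history, so a feed-forward net or SVM can be trained on the states
instead of on the raw series.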
Recursive filters - Kalman - extended - unscented - particle:
Suppose we have a time series that represents the evolution of the
(vector) state of a system according to a known rule, but the
evolution of the state is itself corrupted by noise, and we can only
estimate the state via measurements that are themselves corrupted by
noise and that observe not the state itself but a function of the
state. For example, say we wish to know the true set of EEG traces
given the estimate of the previous trace set and the current
measurements.
If we choose linear functions for the state evolution and the
measurement, assume Gaussian noise, and minimize the sum of squared
errors, all the math works out to nice linear algebra, and solving for
the optimal estimator gives the Kalman filter. Dual filtering
estimates the transition function as well, by interleaving a step
where the roles of the state and transition function are reversed.
The extended, unscented, and particle filters progressively relax
these assumptions, mainly linearity, and form the core of the field of
recursive Bayesian estimation.
Fortunately for us, lots of things can be represented as linear
systems or simplified to look like linear systems, including systems
of linear / continuous differential equations and linear / continuous
partial differential equations, including the wave equation, which is
a good model of electromagnetic signals as well as sound.
http://en.wikipedia.org/wiki/Kalman_filter
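For a feel of the predict / update cycle, here is a minimal sketch of
the scalar (one-dimensional) Kalman filter. The parameter names and
default values are assumptions for illustration: a is the state
transition, h the measurement function, q and r the process and
measurement noise variances. A real EEG setup would use the matrix
form.

```python
import random


def kalman_1d(measurements, a=1.0, h=1.0, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # current estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: push the estimate through the (linear) dynamics.
        x = a * x
        p = a * p * a + q
        # Update: blend the prediction with the new measurement.
        k = p * h / (h * p * h + r)  # Kalman gain
        x = x + k * (z - h * x)
        p = (1.0 - k * h) * p
        estimates.append(x)
    return estimates


# Hypothetical example: a constant level of 1.0 seen through noise.
rng = random.Random(0)
noisy = [1.0 + rng.gauss(0.0, 0.5) for _ in range(200)]
print(kalman_1d(noisy)[-1])
```

The gain k starts large, when we know little about the state, and
shrinks as the estimate's variance p drops, so later measurements
nudge the estimate less.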
Deferred for now: Recurrent neural networks - in terms of theory of
computation, the most general and powerful paradigm, but algorithmic
work is ongoing to realize this potential (including my work).
Control theory.
Anthony
On Jun 25, 2009 1:34 PM, "Anthony Di Franco" <di.franco at aya.yale.edu> wrote:
On Thu, Jun 25, 2009 at 13:18, Kelly<hurtstotouchfire at gmail.com> wrote:
>> H) CRAPPY CODE >> >> heres some crappy code i used. none of the tools are that useful but if >> y...
If you want to use a less rubbing-two-sticks-together-to-make-dust
sort of software toolbox, but still keep the warm fuzzy
Stallman-approving-of-what-you're-doing feeling, and your money, have
a look at these:
http://www.gnu.org/software/octave/ - good MATLAB clone
http://www.scilab.org/ - MATLAB clone with evil widgets from 1991
http://www.sagemath.org/ - everything and everything else,
Mathematica-like feature set, Python
http://scipy.org/ - more numerical focus in a Python package, includes
NumPy for linear algebra
http://acs.lbl.gov/~hoschek/colt/ - Java linear algebra
http://rapid-i.com/ - Java machine learning