Kaggle: Difference between revisions
Revision as of 17:33, 18 September 2016
Noisebridge Kaggle team!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
We use this wiki to archive information, and this Google group to communicate with each other: https://groups.google.com/forum/#!forum/nbkaggle
The Noisebridge neuro hacking dream team has a lot of useful stuff on their reading list: https://noisebridge.net/wiki/DreamTeam/Reading#Seizure_Detection
Here is a link to the competition: https://www.kaggle.com/c/melbourne-university-seizure-prediction/data
Papers
Papers found by searching Google for "machine learning seizure detection"
Application of Machine Learning To Epileptic Seizure Detection
Ali Shoeb, John Guttag. Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts, 02139
https://drive.google.com/open?id=0ByjOj5sb0Oj_SGduRjduNVdfX1k
EEG-based neonatal seizure detection with Support Vector Machines
A. Temko, E. Thomas, W. Marnane, G. Lightbody, G. Boylan. Neonatal Brain Research Group, Department of Electrical and Electronic Engineering, and Department of Paediatrics and Child Health, University College Cork, Ireland
https://drive.google.com/open?id=0ByjOj5sb0Oj_UzVxdGpkcTNPV0E
Code
How should we organize our code? A GitHub organization?
Reading the data
Here is a Python function that loads the sample array from one of the competition's MATLAB (.mat) data files.
    from scipy.io import loadmat

    def load(fn):
        return loadmat(fn, struct_as_record=False)['dataStruct'][0, 0].data
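This helper returns the samples as a pandas DataFrame, one column per channel, together with the file's sequence value (-1 when the file has no sequence field):

    import pandas as pd
    from scipy.io import loadmat

    def mat_to_pandas(path):
        mat = loadmat(path)
        # dataStruct is a 1x1 struct; unwrap each field with [0, 0]
        names = mat['dataStruct'].dtype.names
        ndata = {n: mat['dataStruct'][n][0, 0] for n in names}
        # not every file has a sequence field, so default to -1
        sequence = -1
        if 'sequence' in names:
            sequence = ndata['sequence']
        # rows are samples, columns are labelled with the channel indices
        return pd.DataFrame(ndata['data'], columns=ndata['channelIndices'][0]), sequence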
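To try the two helpers without downloading the competition data, here is a rough round-trip sketch. It writes a small synthetic .mat file and reads it back. The file name, the 10 x 16 array size, and the sequence value 3 are made up for illustration, and the struct layout (fields data, channelIndices, sequence inside dataStruct) is only assumed from what the helpers above access. The helpers are repeated so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy.io import loadmat, savemat


def load(fn):
    """Return the raw sample array stored in dataStruct.data."""
    return loadmat(fn, struct_as_record=False)['dataStruct'][0, 0].data


def mat_to_pandas(path):
    """Return (DataFrame of samples, sequence); sequence is -1 if absent."""
    mat = loadmat(path)
    names = mat['dataStruct'].dtype.names
    ndata = {n: mat['dataStruct'][n][0, 0] for n in names}
    sequence = ndata['sequence'] if 'sequence' in names else -1
    return pd.DataFrame(ndata['data'], columns=ndata['channelIndices'][0]), sequence


# Synthetic stand-in for one data file (assumed layout, made-up values).
fake = {
    'dataStruct': {
        'data': np.random.rand(10, 16),      # 10 samples x 16 channels
        'channelIndices': np.arange(1, 17),  # channel labels 1..16
        'sequence': 3,
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'fake_segment.mat')  # hypothetical file name
    savemat(path, fake)  # a nested dict is written as a MATLAB struct
    raw = load(path)
    df, sequence = mat_to_pandas(path)

print(raw.shape)         # plain NumPy array of samples
print(df.shape)          # same samples, one column per channel
print(list(df.columns))  # channel indices used as column labels
```

Note that the sequence comes back as a 1 x 1 array, because loadmat returns MATLAB scalars as 2-D arrays; use sequence[0, 0] if a plain number is needed.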