Commit: cleaning up preprocessing

jglaser2 committed Feb 14, 2017
1 parent dfda9ce commit 4f9f52b
Showing 5 changed files with 284 additions and 28 deletions.
172 changes: 172 additions & 0 deletions Example_format_data.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,172 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example of correctly formatting data\n",
"\n",
"For use in decoding (see \"Examples_all_decoders\" and \"Examples_kf_decoder\"), we need the following format of inputs:\n",
"- Neural data should be a matrix of size \"number of time bins\" x \"number of neurons\", where each entry is the firing rate of a given neuron in a given time bin\n",
"- The output you are decoding should be a matrix of size \"number of time bins\" x \"number of features you are decoding\"\n",
"\n",
"In this example, we load Matlab data that contains \n",
"- The spike times of all neurons. In Matlab, \"spike_times\" is a cell of size \"number of neurons\" x 1. Within spike_times{i} is a vector containing all the spike times of neuron i.\n",
"- A continuous stream of the output variables. In this example, we are aiming to decode velocity. In Matlab, \"vels\" is a matrix of size \"number of recorded time points\" x 2 (x and y velocities were recorded) that contains the x and y velocity components at all time points. \"vel_times\" is a vector that states the time at all recorded time points. \n",
"\n",
"We will put this data in the format described above, with the help of the functions \"bin_spikes\" and \"bin_output\" that are in the file \"preprocessing_funcs.py\"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import packages and functions"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"###Import standard packages###\n",
"import numpy as np\n",
"from scipy import io\n",
"\n",
"###Import functions for binning data for preprocessing###\n",
"from preprocessing_funcs import bin_spikes\n",
"from preprocessing_funcs import bin_output"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Data"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"###Load Data###\n",
"folder='/Users/jig289/Dropbox/MATLAB/Projects/In_Progress/BMI/Processed_Data/' #ENTER THE FOLDER THAT YOUR DATA IS IN\n",
"data=io.loadmat(folder+'s1_data_raw.mat')\n",
"spike_times=data['spike_times'] #Load spike times of all neurons\n",
"vels=data['vels'] #Load x and y velocities\n",
"vel_times=data['vel_times'] #Load times at which velocities were recorded"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## User Inputs"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"dt=.05 #Size of time bins (in seconds)\n",
"t_start=vel_times[0] #Time to start extracting data - here the first time velocity was recorded\n",
"t_end=vel_times[-1] #Time to finish extracting data - here the last time velocity was recorded\n",
"downsample_factor=1 #Downsampling of output (to make binning go faster). 1 means no downsampling."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Put data in binned format"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#When loading the Matlab cell \"spike_times\", Python puts it in a format with an extra unnecessary dimension\n",
"#First, we will put spike_times in a cleaner format: an array of arrays\n",
"spike_times=np.squeeze(spike_times)\n",
"for i in range(spike_times.shape[0]):\n",
" spike_times[i]=np.squeeze(spike_times[i])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"###Preprocessing to put spikes and output in bins###\n",
"\n",
"#Bin neural data using \"bin_spikes\" function\n",
"neural_data=bin_spikes(spike_times,dt,t_start,t_end)\n",
"\n",
"#Bin output (velocity) data using \"bin_output\" function\n",
"vels_binned=bin_output(vels,vel_times,dt,t_start,t_end,downsample_factor)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save Data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pickle\n",
"\n",
"with open('s1_test_data_p3.pickle','wb') as f:\n",
"    pickle.dump([neural_data,vels_binned],f) #Note: only variables defined in this notebook are saved"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
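The notebook above relies on `bin_spikes` and `bin_output` from `preprocessing_funcs.py`, which are not shown in this commit. A minimal sketch of what such binning functions might look like — the repository's actual implementations may differ in edge handling and downsampling support:

```python
import numpy as np

def bin_spikes_sketch(spike_times, dt, t_start, t_end):
    """Count each neuron's spikes in bins of width dt (a sketch, not the repo's code).

    Returns an array of shape (number of time bins, number of neurons)."""
    edges = np.arange(t_start, t_end + dt, dt)  # bin edges spanning [t_start, t_end]
    n_bins = len(edges) - 1
    n_neurons = len(spike_times)
    neural_data = np.empty((n_bins, n_neurons))
    for i in range(n_neurons):
        # np.histogram counts how many spike times fall in each bin
        neural_data[:, i], _ = np.histogram(spike_times[i], bins=edges)
    return neural_data

def bin_output_sketch(outputs, output_times, dt, t_start, t_end):
    """Average each output variable within bins of width dt (a sketch).

    Returns an array of shape (number of time bins, number of output features)."""
    edges = np.arange(t_start, t_end + dt, dt)
    n_bins = len(edges) - 1
    binned = np.empty((n_bins, outputs.shape[1]))
    for b in range(n_bins):
        mask = (output_times >= edges[b]) & (output_times < edges[b + 1])
        binned[b, :] = outputs[mask].mean(axis=0)
    return binned
```

With these, `bin_spikes_sketch(spike_times, dt, t_start, t_end)` yields the "time bins x neurons" matrix and `bin_output_sketch(vels, vel_times, dt, t_start, t_end)` the "time bins x features" matrix described in the markdown cell.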
8 changes: 4 additions & 4 deletions Examples_all_decoders.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -127,7 +127,7 @@
"source": [
"bins_before=13 #How many bins of neural data prior to the output are used for decoding\n",
"bins_current=1 #Whether to use concurrent time bin of neural data\n",
"bins_after=0 #How many bins of neural data after (and including) the output are used for decoding"
"bins_after=0 #How many bins of neural data after the output are used for decoding"
]
},
{
Expand Down Expand Up @@ -615,9 +615,9 @@
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [Root]",
"display_name": "Python 2",
"language": "python",
"name": "Python [Root]"
"name": "python2"
},
"language_info": {
"codemirror_mode": {
Expand All @@ -629,7 +629,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
"version": "2.7.11"
}
},
"nbformat": 4,
Expand Down
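The `bins_before`/`bins_current`/`bins_after` settings in the hunk above control how many time bins of neural activity feed each prediction. A sketch of how such a lagged design tensor could be built — this is an illustration, not the repository's own helper, whose exact behavior may differ:

```python
import numpy as np

def lagged_design_sketch(neural_data, bins_before, bins_current, bins_after):
    """Stack neural activity from surrounding bins for each output time bin (a sketch).

    neural_data has shape (n_bins, n_neurons). Returns an array of shape
    (n_bins, bins_before + bins_current + bins_after, n_neurons); rows too close
    to the edges to have a full history/future are left as NaN."""
    n_bins, n_neurons = neural_data.shape
    n_surround = bins_before + bins_current + bins_after
    X = np.full((n_bins, n_surround, n_neurons), np.nan)
    for t in range(bins_before, n_bins - bins_after):
        # Window of bins_before past bins, the current bin (if used), and bins_after future bins
        X[t] = neural_data[t - bins_before : t + bins_current + bins_after]
    return X
```

For the settings in the diff (`bins_before=13`, `bins_current=1`, `bins_after=0`), each row of `X` would contain 14 bins of activity per neuron, and the first 13 rows would lack a complete history.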
35 changes: 32 additions & 3 deletions Neural_preprocessing.ipynb
Original file line number Diff line number Diff line change
@@ -1,5 +1,27 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example of correctly formatting data\n",
"\n",
"For use in decoding (see \"Examples_all_decoders\" and \"Examples_kf_decoder\"), we need the following format of inputs:\n",
"- Neural data should be a matrix of size \"number of time bins\" x \"number of neurons\", where each entry is the firing rate of a given neuron in a given time bin\n",
"- The output you are decoding should be a matrix of size \"number of time bins\" x \"number of features you are decoding\"\n",
"\n",
"In this example, we load Matlab data that contains the spike times of all neurons and a continuous stream of the output variables. We will put this data in the format described above, with the help of the functions \"bin_spikes\" and \"bin_output\" that are in the file \"preprocessing_funcs.py\"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import packages and functions"
]
},
{
"cell_type": "code",
"execution_count": 1,
Expand All @@ -17,6 +39,13 @@
"from preprocessing_funcs import bin_output"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Data"
]
},
{
"cell_type": "code",
"execution_count": 2,
Expand Down Expand Up @@ -156,9 +185,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python [Root]",
"display_name": "Python 2",
"language": "python",
"name": "Python [Root]"
"name": "python2"
},
"language_info": {
"codemirror_mode": {
Expand All @@ -170,7 +199,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
"version": "2.7.11"
}
},
"nbformat": 4,
Expand Down
9 changes: 8 additions & 1 deletion decoders.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,14 @@

import numpy as np
from numpy.linalg import inv as inv #Used in kalman filter
from sklearn import linear_model #For linear regression (wiener filter)

#Import scikit-learn (sklearn) if it is installed
try:
    from sklearn import linear_model #For Wiener Filter and Wiener Cascade
except ImportError:
    print("\nWARNING: scikit-learn is not installed. You will be unable to use the Wiener Filter or Wiener Cascade decoders.")

#Import XGBoost if the package is installed
try:
import xgboost as xgb #For xgboost
Expand Down
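The hunk above wraps the scikit-learn import in try/except so the rest of `decoders.py` still imports when the package is absent. A common refinement of this pattern — a sketch, not code from the repository — records availability in a flag and raises a clear error only when the optional decoder is actually constructed:

```python
# Guard an optional dependency: the module imports cleanly either way,
# and only the features that need the package raise if it is missing.
try:
    from sklearn import linear_model
    HAS_SKLEARN = True
except ImportError:
    HAS_SKLEARN = False

class WienerFilterDecoder(object):
    """Linear (Wiener filter) decoder; hypothetical class name for illustration."""

    def __init__(self):
        if not HAS_SKLEARN:
            raise ImportError("scikit-learn is required for the Wiener Filter decoder")
        self.model = linear_model.LinearRegression()

    def fit(self, X, y):
        self.model.fit(X, y)
        return self

    def predict(self, X):
        return self.model.predict(X)
```

This keeps the import-time warning optional and moves the failure to the point of use, which tends to produce a more actionable error message.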
