Tuesday, February 13, 2007

Personal notes: From this point forward...

We finished up a submission to ICML last week. The submission took a general tone and focused on a new technique to solve a generalized eigenvalue problem subject to sparsity constraints. There's still a lot of juice that can be squeezed out of this project, in addition to the other stuff that I wanna/need to do for my thesis.

Here is a list of directions / to-dos from this point forward:

  • Check out probabilistic PCA / probabilistic eigenmethods papers. This could lead to a better formulation of the sparse eigenmethod programs. The only issue is that this is turning into a project unto itself, rather disjoint from the applied paper that I was thinking of doing, and it could eat up valuable time that could be spent elsewhere.
  • A good chunk of my thesis could be spent on explaining sparse methods. If this is the case I need to spend some time combing through the sparse methods papers and organizing how this material will be presented in the thesis. Could probably start writing it in pieces rather quickly.
  • An application paper for ISMIR 2007 must be done as well. This was the original plan for the thesis, though it seems like the thesis could easily focus on sparse eigenmethods instead. The thing is, this project really should get off the ground, as all of us here in the CAL lab want to turn in a submission to the conference.

    There's an interesting lesson that I picked up over this last project, and that is that solvers like Mosek can be used to estimate the eigenvalue decomposition of a large matrix very quickly. When performing PCA or CCA on a large matrix of, say, several thousand (7,000-8,000) dimensions, it takes a sizable amount of time to compute the full set of eigenvalues and eigenvectors. You can instead use an iterative method (like the one we used in our paper), where we solve a sequence of locally linear problems, and these methods converge to the solution in a fraction of the time.

    This will be a useful observation to stress. Eigenvalue problems lie at the core of many important and widely used machine learning algorithms, and any method that can calculate (or approximate) an eigenvalue solution quickly is a boon in our field, particularly in real-time applications.
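
    As a rough illustration of why an iterative scheme can be cheaper than a full decomposition when only the leading eigenvector is needed, here is a minimal sketch of power iteration in Python with NumPy. This is a textbook stand-in and not the Mosek-based procedure described above; the function name `top_eigenpair` and its parameters are hypothetical, chosen only for this example. It assumes a symmetric positive semidefinite input, such as a covariance matrix.

```python
import numpy as np


def top_eigenpair(A, num_iters=1000, tol=1e-10, seed=0):
    """Approximate the dominant eigenpair of a symmetric PSD matrix A.

    Illustrative power iteration (hypothetical helper, not the method
    from the post). Each step costs one matrix-vector product, O(n^2),
    versus roughly O(n^3) for a full dense eigendecomposition.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)

    eigval = 0.0
    for _ in range(num_iters):
        w = A @ v
        new_eigval = float(v @ w)  # Rayleigh quotient for the current unit vector v
        norm_w = np.linalg.norm(w)
        if norm_w == 0.0:
            # v lies in the null space of A; nothing more to iterate on.
            return 0.0, v
        v = w / norm_w
        converged = abs(new_eigval - eigval) < tol * max(1.0, abs(new_eigval))
        eigval = new_eigval
        if converged:
            break
    return eigval, v
```

    For a covariance matrix `C`, `top_eigenpair(C)` returns an estimate of the largest eigenvalue and the first principal direction, which can be checked against `numpy.linalg.eigh(C)` on a small problem. Convergence speed depends on the gap between the two largest eigenvalues, so the savings are not guaranteed for every matrix.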