This course will cover modeling and data analysis for Cognitive scientists, scientists of other disciplines, and Engineers. Emphasis will be given to the Cognitive perspective, with examples drawn from topics relevant to cognitive scientists.
Topics may include programming tools, matlab, linear algebra, linear regression, nonlinear regression (polynomial and exponential fits), basic statistical analysis (mean, standard deviation, mode, median, hypothesis testing), numerical solution of differential equations, optimization, curve fitting, and data visualization. Neural networks will be included as well if time permits.
The objective of this course is to give the student fundamental tools for effectively analyzing data from experiments, extracting information, and understanding the standard analysis tools commonly found in the literature of science and engineering. Students will also be given many resources to draw on in the future, allowing them to expand their knowledge and skills.
Theory lectured on in class will be followed up with readings to expand on the concepts, homeworks to give experience with the techniques, and additional references/readings of research work in the field applying these techniques (demonstrating how these techniques are applied in real life).
Prerequisites
Functional Brain
CogSci 18 or consent of instructor
Grading
Grading will follow the fill-the-bucket principle. You earn points for each homework assignment, the midterm, and the final exam, and these points are added together. Your grade is based on your total score relative to the maximum achievable score. The course average will be scaled if need be (only up, never down!).
Tentatively:
~ <= 7 projects: 50%
<= 1 final project: ??
1 midterm exam: 20%
1 final exam: 30%
Total (around 1000 pts possible, plus bonus): 100% + bonus
Cheating and Academic Honesty Policies
First of all, please DON'T CHEAT!!! It detracts from your learning in this class. When you go into the world you won't have the skills you should have gained here. Our goal is to help you learn, so if you have any problems, please come speak with us and we will help you resolve them to the best of our ability. That being said, cheating must be defined clearly:
Cheating on exams involves any form of copying from another student, giving answers to or getting answers from another student, acquiring information in any way from an external source during the exam, or giving to or receiving from another individual any information which you should not receive during an exam (e.g., theories, data, answers, etc.). You may ask the instructor or TAs questions at any time during an exam. The TAs are not to give answers directly, but may provide hints.
Cheating on homeworks involves duplicating another person's code. You are to write your own code, unless the instructional team provides a starter code, or sample code for you to use. You may not use code from sources other than this course. You may not copy another student's code. However, you ARE encouraged to help each other and discuss the homeworks and material from the course. It is often through explaining something that one learns that concept even better than before. But when it comes to writing the code, you must do the actual writing of your own code. Programming is very much something you must do as well as study to learn it well, very similar to driving.
The Standard academic honesty policies of the university apply during this course as well. Click here for details
Textbooks and References
(required and recommended)
Introduction to Matlab (pdf) There will be several readings from this online book. If you prefer to read from paper rather than on the computer, take this file to Imprints and have a printout made and bound.
Note: No textbooks are required at this time, however there will be weekly PDF handouts, lecture notes, online books and tutorials assigned as reading. There will also be a few recommended texts.
Other References
Many of these books may have newer editions. The most up to date is often useful.
Optimization
Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 3rd edition, 2006
Joel H. Ferziger. Numerical Methods for Engineering Application. John Wiley and Sons, 2nd edition, 1998
Press, W.H., S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, The Art of Scientific Computing, Cambridge University Press, Cambridge, England, 1993.
this book is free online! Worth the download.
Statistics
Arthur Aron and Elaine Aron. Statistics for Psychology. Prentice Hall, 3rd edition, 2003
Graphics and data visualization
Chandrajit Bajaj. Data Visualization Techniques, John Wiley & Sons, 1999.
F. S. Hill. Computer Graphics Using OpenGL, Prentice Hall, 2001.
OpenGL Reference Manual, Third Edition, Addison-Wesley, 2000.
Edward Angel, Interactive Computer Graphics: A Top-down Approach Using OpenGL, Addison-Wesley, 1999. (Second Edition)
Alan Watt, 3D Computer Graphics, Addison-Wesley, 2000. (Third Edition)
Will Schroeder, Ken Martin, and Bill Lorensen, The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics, Prentice Hall PTR, 1998.
Michael Mortenson, Geometric Modeling, John Wiley & Sons, 1985.
Gerald Farin, Curves and Surfaces for Computer Aided Geometric Design, Academic Press, 1990.
David Thompson, Jeff Braun, and Ray Ford, OpenDX: Paths to Visualization, VIS, Inc., 2001.
Regression and data analysis
Draper, N.R and H. Smith, Applied Regression Analysis, 3rd Ed., John Wiley & Sons, New York, 1998.
Bevington, P.R. and D.K. Robinson, Data Reduction and Error Analysis for the Physical Sciences, 2nd Ed., WCB/McGraw-Hill, Boston, 1992.
Daniel, C. and F.S. Wood, Fitting Equations to Data, John Wiley & Sons, New York, 1980.
Branch, M.A., T.F. Coleman, and Y. Li, "A Subspace, Interior, and Conjugate Gradient Method for Large-Scale Bound-Constrained Minimization Problems," SIAM Journal on Scientific Computing, Vol. 21, Number 1, pp. 1-23, 1999.
Levenberg, K., "A Method for the Solution of Certain Problems in Least Squares," Quart. Appl. Math, Vol. 2, pp. 164-168, 1944.
Marquardt, D., "An Algorithm for Least Squares Estimation of Nonlinear Parameters," SIAM J. Appl. Math, Vol. 11, pp. 431-441, 1963.
DuMouchel, W. and F. O'Brien, "Integrating a Robust Option into a Multiple Regression Computing Environment," in Computing Science and Statistics: Proceedings of the 21st Symposium on the Interface, (K. Berk and L. Malone, eds.), American Statistical Association, Alexandria, VA, pp. 297-301, 1989.
DeAngelis, D.J., J.R. Calarco, J.E. Wise, H.J. Emrich, R. Neuhausen, and H. Weyand, "Multipole Strength in 12C from the (e,e') Reaction for Momentum Transfers up to 0.61 fm-1," Phys. Rev. C, Vol. 52, Number 1, pp. 61-75 (1995).
Cleveland, W.S., "Robust Locally Weighted Regression and Smoothing Scatterplots," Journal of the American Statistical Association, Vol. 74, pp. 829-836, 1979.
Cleveland, W.S. and S.J. Devlin, "Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting," Journal of the American Statistical Association, Vol. 83, pp. 596-610, 1988.
Chambers, J., W.S. Cleveland, B. Kleiner, and P. Tukey, Graphical Methods for Data Analysis, Wadsworth International Group, Belmont, CA, 1983.
Press, W.H., S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, The Art of Scientific Computing, Cambridge University Press, Cambridge, England, 1993.
Goodall, C., "A Survey of Smoothing Techniques," Modern Methods of Data Analysis, (J. Fox and J.S. Long, eds.), Sage Publications, Newbury Park, CA, pp. 126-176, 1990.
John R. Taylor, An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements, 2nd edition, University Science Books, 1997
Assignments
Most assignments will be due on Wednesdays (for all sections) by turning in to a WebCT-based turn in page. Typically students will have one week for the assignment.
It is quick reading, so it goes by fast. If you have access to Matlab from home, it helps to follow along with the reading; otherwise you can go to a computer lab with matlab installed (CSB115). We will go over how to run matlab remotely (i.e., from a computer which does not have matlab installed, but has an internet connection and a login account).
Math review (Review and perform as many problems as you can. If you cannot do a group of problems, review material from that section. The review is divided into different areas of math with reading references given for each)
Thurs Oct 05, 2006
50 pts total for turning in efforts for 3 problems from each section (a total of 12 problems must be attempted). Must show work. 5 bonus points possible for completing all problems with over 80% correct.
Math review paper - provides an overview of the mathematics you should know (or at least be familiar with) for this course.
Visual communication, interpolation/extrapolation, least squares, and statistics
Thursday, Nov. 2, 2006
(But since the midterm is Thursday, you can turn it in Friday, Nov. 3 at either of the sections. It is recommended you try to get it done before the midterm because it will help you study for the midterm.)
Read the chapter in Dan Olfe's book on curve fitting (this is fairly long, so focus on Lagrange, then natural cubic spline interpolation. But read it all)
there will be some readings to deepen your understanding of these methods, but not all will be required
Read numerical methods book Ch5, covers minimization and gradient descent (focus on the introduction, and then the gradient descent parts). This chapter covers more than the lecture material, and it's ok if you don't understand every detail fully, but this will be useful in the future, and it will explain in depth the gradient descent algorithm.
Click here for reading assignment (more posted soon)
This assignment consists of a group of readings to balance your understanding and demonstrate common applications of these theories
getting matlab (running remotely), review of programming concepts by demonstration, functions, matlab plots, importing files (ascii, binary, mat) -offer after class review and office hours of programming, math homework, improving remote speed, basic data visualization, fileIO, memory, toolboxes
Interpolation is an important concept for numerical integration and differentiation. There are many forms of interpolation, one group of which we've seen with least squares. Here we see another group, examples, and matlab implementation. How to LERP, BERP, TERP, and SLERP the right way.
Review of course up to now, course organization and progress report (you're doing well!), project discussion, optimization, method of steepest descent, conjugate gradient
Review of homework 4 took a while, so some of this lecture was review. Little dog. Threshold logic units, the history of neural networks, and perceptrons, as well as examples, neural networks for boolean logic, failures of single neurons to solve all possible problems
Modern neural network theory, multi-layer networks, examples, some common structures and modeling methods, function fitting and classification, if time
AI search methods in addition, hypothesis testing, reading results in papers (Heuristics, A*, etc)
single and multi-layer ANN's, review, sorting out homework/midterm issues
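The threshold logic units from the neural network lectures above can be sketched in a few lines. This is only an illustration, written in Python rather than the matlab used in class; the function names, weights, and search grid are assumptions made up for this sketch, not taken from the lecture slides. It shows a single unit computing AND and OR, and a brute-force search over a small weight grid finding no single unit that computes XOR.

```python
from itertools import product


def tlu(weights, bias, inputs):
    """Threshold logic unit: output 1 when the weighted sum plus bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def truth_table(weights, bias):
    """Outputs for the inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [tlu(weights, bias, x) for x in product((0, 1), repeat=2)]


AND_TABLE = [0, 0, 0, 1]
OR_TABLE = [0, 1, 1, 1]
XOR_TABLE = [0, 1, 1, 0]

# Hand-picked (hypothetical) weights: one unit is enough for AND and for OR.
and_out = truth_table((1, 1), -1.5)
or_out = truth_table((1, 1), -0.5)

# Brute-force search over a coarse grid (-3.0 to 3.0 in steps of 0.5).
# No single unit reproduces XOR, because XOR is not linearly separable.
grid = [v / 2 for v in range(-6, 7)]
xor_solutions = [
    (w1, w2, b)
    for w1 in grid
    for w2 in grid
    for b in grid
    if truth_table((w1, w2), b) == XOR_TABLE
]

print("AND:", and_out)
print("OR: ", or_out)
print("single-unit XOR solutions found:", len(xor_solutions))
```

The empty search result is not a proof, since the grid is finite, but it matches the known result that a single neuron cannot separate the XOR cases with one line.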
Final Exam :12/07/2006 Th 3:00p - 5:59p CENTR 105
Handouts
Some of these will be assigned reading, others are recommended. Please see the lectures section for when to read each, but you are encouraged to read ahead.
A text which was written at UCSD by a Professor of computational science and engineering, Thomas R. Bewley. Covers many areas of useful numerical methods. A great reference
Handout for linear and nonlinear least squares: derivation using partial derivatives, and matlab implementation. Also a second file which demonstrates and explains specifics of linear least squares (extendable to nonlinear polynomial fits) is linked here.
An excerpt from Dan Olfe's book 'Computer Graphics for Design: From Algorithms to AutoCAD.' This chapter gives an explanation of various data fitting methods. We'll only be using some of these, but you can read more.
A fairly useful online html book to read about neural networks and applications to learning, automata, pattern recognition, etc. Specific readings are assigned from a few sections of this book
Replacement histogram function for your homework 3. The lab computers have a problem with the hist function. To use, make sure this code is in your matlab path, or in the same directory as your homework code
A brief history from the roots of computational machines and automata to modern times, linking philosophy, mechanical engineering, mathematics, and cognitive science
An example of using matlab's nonlinear function minimization algorithm to fit functions which may be nonlinear in the parameters (e.g., y=a*sin(b*x)+c*x+d)
Practice gradient descent by going through this code, the reading in ch5 of numerical methods, and practicing several problems - e.g., make up A and b, and find the coefficients x
A very good publishing company - they have books on eastern philosophy, psychology, meditation, and self-hypnosis. These books can be helpful in a sense of discipline, concentration, and finding bugs in code.
http://mathworld.wolfram.com/ - have a math question? Don't know or remember some definition, math function, technique? Mathworld may have it listed
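For the gradient descent practice suggested in the handouts above (make up A and b, then find the coefficients x), here is a minimal sketch of the method of steepest descent. It is written in Python rather than matlab and is not the handout code. It assumes A is symmetric positive definite, so that solving A x = b is the same as minimizing f(x) = 0.5 x'Ax - b'x; the matrix, vector, tolerance, and function names are made up for illustration.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    """Minimize 0.5*x'Ax - b'x for symmetric positive definite A.

    The residual r = b - A x is the negative gradient, so each step moves
    along r with the exact line-search step alpha = (r.r) / (r.A r).
    Returns the estimate and the number of steps taken.
    """
    x = list(x0)
    for iteration in range(max_iter):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        rr = dot(r, r)
        if rr < tol * tol:  # stop when the residual norm is below tol
            return x, iteration
        alpha = rr / dot(r, matvec(A, r))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x, max_iter


# A made-up 2x2 example; the exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = steepest_descent(A, b, [0.0, 0.0])
print(x, "after", iterations, "steps")
```

Try a matrix whose eigenvalues are far apart to see the step count grow, which is the usual motivation for conjugate gradient.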
Class picture can be downloaded here or by clicking on the picture below:
Posted Fri, Dec. 8, 1pm
Office hours will start roughly at 2:30pm today at CSB115
Posted Thu, Dec. 7, 2006, 10pm
Since CRB is being locked, possibly due to the upcoming holiday, please turn in your assignment to the CSB115 computer lab box which I've placed there - it is labeled as the turn-in for the final homework.
If CRB is unlocked tomorrow I'll also place a box there so people who don't get this note won't be confused, but it won't be in place before 1pm Friday.
Posted Wed, Dec. 6, 2006, 2pm
SCANTRON FORM TO BUY: purchase the half-sheet green form X-101864-ERI-L at the UCSD Bookstore or General Store. (Sometimes the form says PAR instead of ERI - either is correct.)
Posted Tues, Dec. 5, 2006, 11:26pm
Review session locations - apparently not posted well (sorry about that)- it was on the FAQ page
12/05/06 T 07:30-09:20 pm YORK 4080A seats: 50
12/06/06 W 07:30-09:20 pm CSB 002 seats: 120
if you couldn't find it tonight and can't go tomorrow, I'll arrange some other working time for you. Please email me. Office hours tomorrow will be from 1-3pm (Wed) at Muir Woods coffee shop. If you couldn't make anything Wednesday, I can arrange something for Thursday, but give me warning
Posted Tues, Dec. 5, 2006, 12:20pm
Practice Final Solution can now be downloaded here
Hw6/Final FAQ is up, look there first to see if questions have been answered
Posted Sat, Dec 2, 2006, 3:46pm
The assignment is NOT due at the final; I was only suggesting that as a turn-in time. It was due Friday, but I'm EXTENDING the due date to SATURDAY NIGHT, DEC 9, at midnight. Turn it in at CRB, 2nd floor, outside the code-locked door (there will be a box to slip the assignments into).
Posted Sat, Dec 2, 2006, 3:25pm
FAQ for homework 6/takehome section of final coming up in a few minutes... Remember to read the problems carefully. They will not each take long, since most of them are repeats of things you've been doing all quarter, and most of them are conceptual rather than computational. Only a small number of them make you write anything but really simple code. Also you can use code from your homeworks to help you (problem 2.3 you have already solved in the least squares homework - just change the input data, and the coding part is essentially done).
Posted Sat, Dec 2, 2006, 3pm
there are 15 bonus points possible, I decided to add the 5 points at some point and have now updated everything to reflect that.
Posted Sat, Dec 2, 2006, 12 noon
Homework 6/takehome section of final posted (you will also need the data set here). Please email with questions, but realize that Nick Butko and Dan Liu are away so won't be able to respond to email. Dan is back Tuesday and can answer questions then.
If there are any problems with the data or downloading the document please inform Alex (me) IMMEDIATELY so they can be fixed!
Posted Fri, Dec 1, 2006, 1:22am
FINAL EXAM LOCATION/DATE/TIME:12/07/2006 Th 3:00p - 5:59p CENTR 105
Posted Thurs, Nov 30, 2006, 8:56pm
Readings are up for searches in the assignments section
review sessions will be Tuesday night 7:30-9:30pm and Wednesday night 7:30-9:30pm, location TBA
Posted Thurs, Nov 30, 2006, 8:45pm
slides from today, reading and assignment on the way
Posted Sun, Nov 19, 2006, 11:46pm
Due to a seminar obligation I will have to reschedule office hours for Monday from 1-2 to be instead from 2:50-3:50.
NO SECTION THIS WEEK BECAUSE OF THANKSGIVING!!!
homework 5 is posted here, and on the assignments section as well.
Posted Sun, Nov 19, 2006, 11:20pm
lecture notes, references, and info about next assignment (assignment posted in a few minutes)
Posted Thurs, Nov 16, 2006, 3:45am
I made a few slight typographical updates to the homework assignment. Also see the last question for a slight wording change. I'll discuss these in class
Posted Wed, Nov 15, 2006, 5:45pm
I'll be holding a help session tonight from ~7pm-9pm, though often I end up staying later it seems
Due to popular request, Homework 4 due date is extended to Friday either in section or at the box of the TAs
Posted Wed, Nov 15, 2006, 2:19am
I've created a Frequently Asked Questions page for homework 4 here
Posted Tues, Nov 14, 2006, 7am
Gradient descent worksheet and code zipped into one file - use for section material and practice
Posted Tues, Nov 14, 2006, 2:25am
Added several reading links, one on neural networks, and another on AI. See the handouts section
Posted Sun, Nov 12, 2006, 5 pm
Homework 4 had a typo - it didn't actually affect any computations, but the start and location variables were exchanged on page 5. This has been fixed; please download the description again, and sorry for any confusion
Homework 0.2 solutions are here (I wouldn't worry too much about computations, but more about understanding the concepts)
Posted Wed, Nov 1, 2006, 8:18pm
you are allowed two sheets (double sided) of handwritten notes to bring into the exam
Bring a pencil, your note sheets, and brain
Please take note: there was an unclear statement made in class about covariance. Its range is not limited to 0-1. Normalized data (data from 0-1) has a covariance range below 1, but raw data values may have arbitrarily large covariances. The slide notes have been updated to reflect this.
Solutions to midterm practice will be posted soon, and a 'last minute hints' document
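The covariance note above can be checked numerically. The sketch below is in Python rather than matlab, uses made-up data, and assumes the sample covariance with n-1 in the denominator, since the announcement does not say which convention the course uses. It compares the covariance of raw values with the covariance of the same values rescaled to 0-1.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length lists (n - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def rescale_to_unit(values):
    """Linearly map values onto the interval 0-1 (min -> 0, max -> 1)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up measurements, for illustration only.
raw_x = [10.0, 20.0, 30.0, 40.0, 50.0]
raw_y = [120.0, 180.0, 310.0, 390.0, 500.0]

raw_cov = covariance(raw_x, raw_y)
unit_cov = covariance(rescale_to_unit(raw_x), rescale_to_unit(raw_y))
neg_cov = covariance(raw_x, raw_y[::-1])  # y reversed: x and y now move oppositely

print("raw data covariance:     ", raw_cov)
print("0-1 rescaled covariance: ", unit_cov)
print("reversed-y covariance:   ", neg_cov)
```

The raw covariance is in the thousands, the rescaled one stays below 1, and reversing one variable makes it negative, so covariance is not confined to 0-1 on either side.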
Posted Wed, Nov 1, 2006, 11:15am
Here is the practice midterm. This is approximately (slightly less than) half the length of the exam, so you can time yourself. Solutions will be posted in a little while (I suggest you try the exam first then look at the solutions after) so you work through the problems first. There will be more topics included on the actual exam (one can't cover everything in 1/2 the length) but if you know the material behind these questions you will know a significant quantity of the midterm material.
Posted Wed, Nov 1, 2006, 3:04am
The review session for the midterm will be in PCYNH 106 (Pepper Canyon Hall room 106) from 7-8:50pm
Posted Tues, Oct 31, 2006, 10:23am
Some people had trouble with blank histograms with the cogs109hist.m code. When you get this, it's the axis control code, so ultimately I commented that out in the function. You can download it again, here, or comment out the last line ("axis(.....)")
Posted Mon, Oct 30, 2006, 7:12pm
The built-in matlab function HIST() was returning some empty (size 0) bins when NBINS was too large, which was causing an error. I hadn't written much error handling into the code, so I added that and it should work fine now. Please download the cogs109hist.m code again. The old code works if you use fewer bins, but you should have no problems with this version. Let me know if you do. (-Alex)
One more code update. Now you can get help once you put the cogs109hist.m function in your path by typing help cogs109hist at the command prompt. It gives an example and description in standard matlab format
Posted Sun, Oct 29, 2006, 4:18pm
Slight code update for the cogs109hist.m function. Now it basically duplicates the output of the built-in histogram function in matlab.
Posted Sun Oct 29, 2006, 3:48pm
There may be a problem with the hist function in the lab computers. Here is a replacement function that plots histograms in the same way. It is somewhat simplified in the interface - you only give it data and number of bins (bars) that you want, and it creates a histogram plot
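To show what a histogram function with this kind of interface (data plus a number of bins) has to compute, here is a rough sketch of equal-width binning. It is written in Python and is not the cogs109hist.m code; the function names, the handling of the constant-data case, and the sample data are assumptions made for illustration.

```python
def histogram_counts(data, nbins):
    """Count how many values fall into each of nbins equal-width bins.

    The bins span min(data) to max(data); the maximum value is placed
    in the last bin. Bins with no data simply keep a count of 0.
    """
    if nbins < 1:
        raise ValueError("nbins must be at least 1")
    lo, hi = min(data), max(data)
    counts = [0] * nbins
    if lo == hi:
        # All values identical: put everything in the first bin (a choice
        # made for this sketch, not necessarily what matlab does).
        counts[0] = len(data)
        return counts
    width = (hi - lo) / nbins
    for value in data:
        index = min(int((value - lo) / width), nbins - 1)
        counts[index] += 1
    return counts


def text_bars(counts):
    """Render the counts as rows of '#' characters, one row per bin."""
    return "\n".join("#" * c for c in counts)


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
counts = histogram_counts(sample, 4)
print(counts)
print(text_bars(counts))
```

Asking for many more bins than there are distinct values leaves some counts at 0, which a plotting routine needs to handle gracefully.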
Posted Sat Oct 28, 2006, 10:45pm
There will be a review session Wednesday night at 7pm, location TBA
Here is a list of topics to review for the midterm. Midterm hints will be posted later, along with a practice midterm.
Posted Sat Oct 28, 2006, 6:20pm
To create an email list, please send an email to me at "csimpkin at ucsd dot edu" with the title of the email written as "cogsci109 email list add" exactly - I'll use that title to sort the emails and create the list. This will make it easier to make email announcements
Posted the commented version of the LERP example from class in the handouts section
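As a reminder of the idea behind the LERP example, here is a minimal sketch of linear interpolation between two values and its piecewise use on tabulated data. It is written in Python, not matlab, and is not the commented class example; the function names and sample points are made up for illustration.

```python
def lerp(a, b, t):
    """Linear interpolation: returns a at t = 0, b at t = 1, and a blend in between."""
    return a + t * (b - a)


def interpolate(xs, ys, x):
    """Piecewise linear interpolation through the points (xs[i], ys[i]).

    Assumes xs is sorted in increasing order and x lies inside its range.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range (that would be extrapolation)")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])  # fraction of the way across the segment
            return lerp(ys[i], ys[i + 1], t)


quarter = lerp(0.0, 10.0, 0.25)
midpoint = interpolate([0.0, 1.0, 3.0], [0.0, 2.0, 6.0], 2.0)
print(quarter, midpoint)
```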
Posted Wed, Oct 25, 2006, 2:23pm
Sorry I had some car trouble but am now in office hours until 3:15pm at the muir coffee shop. (-Alex)
Posted Wed, Oct 25, 2006, 2:23am
Posted an update to the handout for least squares - it now includes nonlinear least squares derivation as well and matlab examples demonstrating how to use it. If you find errors please email me (Alex).
Posted Tues, Oct 24, 2006, 12:17pm
Posted a handout for linear least squares. It includes the derivation and matlab examples. A paper for nonlinear, exponential, correlation coefficients etc. will also be posted soon
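For a straight-line fit y = m*x + c, setting the partial derivatives of the summed squared error with respect to m and c to zero gives two normal equations with a closed-form solution. The sketch below shows that result in Python; it is not the handout's matlab code, and the function name and data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line y = m*x + c through the points (xs[i], ys[i]).

    Solves the two normal equations obtained by setting the partial
    derivatives of sum((m*x + c - y)**2) with respect to m and c to zero.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    c = (sum_y - m * sum_x) / n
    return m, c


# Made-up noisy data lying near the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
m, c = fit_line(xs, ys)
print("slope:", m, "intercept:", c)
```

For these four points the sums are 10, 20, 59.7, and 30, giving a slope of 1.94 and an intercept of 0.15.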
Posted Fri, Oct 20, 2006, 2:30pm
Sorry I left out the location - Earl's Place coffee shop in Warren
Posted Fri, Oct 20, 2006, 1:16am
Alex's office hours for friday will be after he gets done with the chancellor's challenge. To be safe in terms of time he's scheduling them for 2:30pm-3:30pm
Posted Wed, Oct 18, 2006, 1:20pm
There was an extra homework help session Tues 7:30-11, and there will be one Wed (tonight) 7-11 in CSB 115
Tons of FAQ hint updates for homework 2. Lots and lots of questions answered! If you have questions look here first.
Posted Sun Oct 15, 2006, 6:41pm
HW2 Due DATE EXTENDED TO THURSDAY!!!
It isn't my intention to make this an unpleasant experience for everyone, so I'm going to extend the homework to Thursday. I've had many many requests and questions this weekend about extending the due date, and I care most about people learning this material rather than rushing through it just for the grade. I may assign the next homework Tuesday and just make them due Thursdays generally, but I'll have to think about that.
I know people are busy so it's important with these assignments not to let all the other demands prevent them from starting the homework until a day or two before (it often happened to me too I have to admit, and that's one thing one can always improve). I hope extending it won't make people just wait longer to get going on it.
Posted a small math review (it still has much that needs to be added, so far it just has limits and derivatives). Go to the handouts section for the review
Posted Fri, Oct. 13, 2006, 11:07pm
updated the homework description. For memory considerations, I reduced the size of the larger data set. I announced this in class, so it is the largeset.mat file. Somehow this file was not present in the original zip uploaded to the site, but it is there now. Please download the data again.
Posted Fri, Oct 13, 2006, 9:10pm
I reposted the data - some people could not load the data files with earlier versions of matlab, so I saved it in a more compatible file version. Please re-download
Homework 2 has a slight correction - I mentioned a 4D matrix as the 'large' data set. It isn't a 4D, so I made a slight adjustment to the pdf assignment. I was going to give you a larger matrix but decided it would bog down matlab running over the network, since it took 356MB of RAM!
Posted Wed, Oct. 10, 2006, 4am
Homework 2 is posted in the homeworks section. The data is in the handouts section. Technically since I haven't slept yet, it's still 'tonight' :) Look for hints soon (but not 'tonight' :) .
Those of you with Intel-based Macs can download a beta version of matlab student edition (it has basic functionality and is licensed until June of '07, but will work for much of what we're doing). Here is the link: http://www.mathworks.com/academia/student_version/intelmacbeta/
the only danger here is that it is a beta and will probably crash here and there, so the other option is using the lab computers or running matlab remotely. Realize also that your accounts give you access to computers all over campus with matlab on them
Posted Fri, Oct 6, 2006, 2:18am
Note for the Low_Pass() function: There was a small bug which I have decided to fix for you, please check the code and see if you can find where it was changed from what you saw in lecture (Hint: check the loop counters)
A new reading assignment has been posted in the assignments section
For those who want to read more about matlab, do read the pdf book from mathworks. It is a complete book. If you don't want to read on your computer screen, take the file to the UCSD Imprints and print there, then have it bound for a few dollars - a complete book for only a few $!
Also, please see the matlab helpwin documentation (type helpwin at the command prompt). It is excellent.
Finally, please go through the matlab demos and examples. This all together is a tremendous amount of information and worth the read
Posted Thurs, Oct 5, 2006, 12:32am
For installing cygwin, please see the following document for instructions:
Hints for homework 0.2 have been posted in the handouts section. Please refer to them if you need help. Also check the handout section for help plotting.
Note for printing in CSB115: The printer name to use with a printing account in CSB 115 is csb115l@csb-115. It is not laserjet4!!!
Posted Tues, Oct 3, 2006, 7:18pm
There will be a math review Tuesday (tonight) and Wednesday both beginning at 8pm, going for about an hour in CSB 115. If you haven't had dinner we'll order pizza!!!
Posted Tues, Oct 3, 2006, 11:11am
Office hours updated
Posted Sun, Oct 1, 2006, 1:39am
There was a slight correction to the homework: problem 3.1.3 was listed with part a and part b on the same line, so it has been reposted. Please download again for clarification.
Posted Sat, Sept 30, 2006, 2:16am
A recording of the entire lecture 3 can be found here in mp3 format. Due to space restrictions I won't be able to leave it up at this point for more than a few days (it's 88MB, so download over a fast connection), but you may download it and listen if there were any confusions about the lecture.
The reading assignment for next Tuesday has been posted. Stay tuned for a small math review homework to be posted soon.
The references section has been updated with some basic math books, numerical methods, statistics, etc. These are a good place to start - make a trip to the library and use those ID cards to check out a few references.
the lecture powerpoint has been posted for Sept 21 in pdf format. If you do not have a pdf viewer, please go to www.adobe.com to download the free reader.
Welcome to CogSci 109 : Modeling and Data Analysis!
please check this site frequently for updates
We now have the lab reserved! So all labs will be in CSB 115 unless otherwise announced (announcements will typically include lecture, email and posting on the web site)
Lab locations: The labs (discussion sections) will all be in a computer lab, unless otherwise announced. We're in process of reserving the lab, hopefully it will be CSB 115
Due to the move to a computer lab, we'll have to add a section. Please see Tritonlink for specifics.