This course covers statistical model building, supervised machine learning, and multivariate function estimation from scattered, noisy, direct, and indirect data, primarily using reproducing kernel Hilbert space (RKHS) methods, regularization, and splines.
1. Background, introduction to the theory of reproducing kernel Hilbert spaces (RKHS). Varieties of splines on various domains. Representer theory. Connections between smoothing splines, Bayes estimates, optimization problems in RKHS, and regularization.
2. Degrees of freedom for signal and the bias-variance tradeoff. Generalized cross validation, generalized approximate cross validation, unbiased risk, maximum likelihood and other model tuning methods.
3. Model selection and model building methods suitable for spline and related models. Bayesian and bootstrap confidence intervals. Penalized likelihood models for risk factor modeling. Two-category and multicategory support vector machines and other large margin classifiers; 'hard' and 'soft' classification.
4. Numerical methods for medium-sized to very large data sets. Randomized trace estimation for the degrees of freedom for signal. Early termination of iterative methods as a form of regularization. Basis function thinning methods.
5. Applications in biostatistics and bioinformatics (risk factor modeling, classification), statistical learning theory (supervised machine learning, support vector machines), meteorology (ill-posed inverse problems, remote sensing, tuning, variable selection and classification), physics (signal detection), and other areas will be discussed according to the interests of the class.
Prerequisites: For statistics majors, mathematical maturity at the level of a year of graduate work and either multivariate analysis or some exposure to Hilbert spaces, or consent of instructor. Those unfamiliar with Hilbert spaces will be asked to read the first 33 pages of Akhiezer and Glazman, Theory of Linear Operators in Hilbert Space, vol. I, at the beginning of the course.
Graduate students in CS, AOS, ECE, Biostatistics, Physics and other physical sciences, Engineering, Math, Economics, and Business may find some of the techniques studied here useful. They are welcome to sit in, or to take the course for credit if they have exposure to linear algebra, sufficient math background to read the Akhiezer and Glazman assignment, and familiarity with the basic properties of the multivariate normal distribution, as found, e.g., in Anderson, Multivariate Analysis, or Wilks, Mathematical Statistics. Otherwise, the development will be self-contained. If in doubt, please contact the instructor by e-mail (firstname.lastname@example.org) or come to the first class.
This will be a seminar-type course. There will be no sit-down exams. Students taking the course for credit will be expected to do one or two computer projects studying the behavior of some of the methods discussed on simulated or experimental data, and one or two projects in an area of application of their choice. One possible project is the presentation of a lecture in class on a recent paper or recent research.
Text: Wahba, Spline Models for Observational Data (1990). Material from selected recent papers, books, and conferences will be discussed, TBA.
Math 801 - Topics in Applied Mathematics (Prof. Amir
Descriptive Title: Biological Computation and Mathematics with
to Learning in Intelligent Systems
Prerequisite: Graduate standing, or consent of instructor for undergraduate students.
(1) Mathematical Biology, 2nd Edition or later, by J.D. Murray. Springer. ISBN 0-387-95228-4
(2) (Recommended) Dana Ballard, Introduction to Natural Computation. MIT Press, ISBN 0-262-02420-9 (www.mitpress.mit.edu). A soft-cover edition is also available, with a different ISBN.
(3) (Recommended) Evolution of Networks, by S.N. Dorogovtsev and J.F.F. Mendes. Oxford University Press. ISBN 0-19-851590-1
DESCRIPTION: This course will treat topics in Biological Computation
and Mathematics. Its objective is to introduce the students to selected
research topics in cross disciplinary mathematics, computational
bioinformatics, and modeling complex biological systems. The lectures
discuss the mathematical foundations and computational methods for
from four biological space/time scales: (a) biomolecular information
(with very small time scales for events to take place), (b) modeling
neurons (micrometer length with millisecond time scale), (c)
learning at system level (millimeters long or higher lengths and
time scale), (d) evolution of biological information processing at
level (large space and large time scales). I will cover the following
the above-mentioned topics: (a) DNA computation, its mathematical and
challenges and promises, as well as selected methods for analysis of
in bioinformatics, such as analysis of the gene-chip and micro-array
(used in the Human Genome Project, for example); (b) modeling neurons
excitable cells, with a brief introduction to dynamical systems related
to such models; (c) Neural networks and biological intelligence, memory
and learning (for example, the sensory systems in the human brain, with
discussion of some concrete applications to one or more of vision,
and pain); (d) evolutionary computation, such as genetic algorithms and
programming, and their applications in optimization and solution of
problems. Undergraduate mathematics as typically covered by science/
students will be assumed. I will review advanced topics as needed for
the selected biological topics. I will also review the related
biology, the basic cell biology of brain cells and basic facts from
biology. There will be a tutorial for students who need to get started
with hands-on computation with MATLAB as needed for term projects. The
course grade is based on a term project. Sample projects can be found
by following the appropriate links on my web page: http://www.math.wisc.edu/~assadi