CSE459 Paper Topics
CSE459 Causal Discovery Paper Topics
Some possible paper topics. Possible programs are described where
appropriate. Each identifies a lecture in CSE459 which
is relevant. Past lecture notes are available for follow-up; see my
Bayesian AI web page.
- (Lecture 1): Is machine learning just computational statistics?
Program: Implement some simple symbolic learning algorithms.
Wipe them out with Excel.
- (Lecture 2): Is the causal interpretation of Bayesian nets
primary? Argue that the true variable order
necessarily produces the sparsest Bnet when using the
Pearl net construction algorithm.
Program: Write a program which, given a true net (true
causal structure) and a total ordering, generate the Bnet
using Pearl's algorithm. Run it N! times on various cases,
illustrating the thesis.
- (Lecture 3): Apply the concepts of Bayesian confirmation
theory to the value of randomizing selection into
experimental and control groups.
Program: A simulation program to generate the
relevant likelihoods (a la Monte Hall, 2 envelopes).
- (Lecture 4): Define causal power for linear and/or
discrete causal models. For linear models this might be in terms of
variance accounted for; in discrete models this might be in terms of
relative entropy between variables. (On last topic: see Cover and
Thomas Elements of Info Theory.) Program: Write a
program to report the causal power of one variable on another, using
Wright's rules to factor out the different lines of influence.
- (Lecture 6): Generalize the concept of Markov (statistical)
equivalence to experimental equivalence, using the concept
of augmented causal models.
- (Lectures 11 & 13, 2001): Compare conditional independence
learning with metric learning, in the form of the PC
algorithm and K2.
Program: Implement PC with an oracle (see Assignment
1, 2001, for details). Implement K2 (see Lect 13). Run them
both against simple problems, both using noiseless data
exactly reflecting the causal model. Compare the
results. K2 requires a total ordering as input, to get
started. PC provides a partial order as output. To what
extent could the two be fused, with PC providing ordering
data to K2? Would this make any sense?
- (Lecture 12, 2001): Use classification tree learning (DTREE)
and CaMML to model missing data both when the missing values
are and are not correlated to the observed values.
Compre the results empirically with techniques that simply
assume the missing values are unrelated to known data.
Describe
a procedure/criterion for identifying when the modeling of
missing data is useful and when not.
Program: Using CaMML, DTREE and possibly other
programs.
- (Lecture 12, 2001): Implement a simple Expectation
Maximization (EM) program to estimate parameters given missing
data. Run it on cases where missing data are uncorrelated
with measured data and also where such data are related
to available data. Describe the relative effectiveness of
EM under the different conditions. Can you come up with
any criteria for identifying when EM filling-in is useful
and when not?
Program: EM algorithm.
- (Lecture 13, 2001): Prove that causal models in a
statistical equivalence class are likelihood equivalent.
Argue against Heckerman that hypothesis equivalence is
in most cases an inappropriate assumption:
argue that the causal interpretation is fundamental
to making sense of Bayesian nets in general and
the automated learning of Bayesian nets in particular.
Program: Proofs are better than programs!
- (Lecture 13, 2001): Improve upon the MDL metric
described there (some defects of which are pointed out
in the lecture). Implement the new scoring metric with
a simple search (e.g., greedy search starting with
known constraints, or a simple stochastic search). Compare
it empirically with the PC algorithm in Tetrad II (I have a copy).
Program: MDL causal search program.