Re: Summary of Robust Regression Algorithms

Weiguo Fan (fanwg@iscs.nus.edu.sg)
Tue, 6 Jan 1998 16:47:29 +0800 (GMT-8)


LTS and other robust regression methods are primarily used to solve
linear regression problems in which outliers make up less than half of
the data. Of course you can always construct examples that make those
robust regressions fail. But what does that prove? Every method has its
own merits and shortcomings, and we have to allow for that. Otherwise,
why bother to do any research? Everything would already be perfect.

The highest breakdown point you can get with a robust estimator is 50%.
If more than half of your data set consists of outliers, even you cannot
tell which points are the outliers (see Rousseeuw and Leroy, Robust
Regression and Outlier Detection, 1987). The points may mean different
things in different contexts, in which case you need to split your data
set.
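
To make the 50% limit concrete, here is a small sketch using the median,
the simplest high-breakdown estimator, on the 0/100 data from the example
quoted below. It is written in Python rather than S-PLUS, and
`contaminate` is a hypothetical helper introduced only for this
illustration.

```python
import statistics


def contaminate(clean, n_bad, bad_value):
    """Return a copy of `clean` with its first n_bad entries replaced."""
    return [bad_value] * n_bad + list(clean[n_bad:])


# 2k+1 = 11 observations (k = 5), all equal to 0 before contamination.
clean = [0.0] * 11

for n_bad in (0, 1, 5, 6):
    sample = contaminate(clean, n_bad, 100.0)
    print(
        f"bad={n_bad:2d}  mean={statistics.mean(sample):7.2f}  "
        f"median={statistics.median(sample):6.1f}"
    )
```

A single bad record already moves the mean, while the median stays at 0
until the bad records become the majority (6 of 11). Past that point no
estimator can tell which group is the "real" one.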

Typical financial databases seem to contain 1 to 10 percent erroneous
records, so robust estimation methods have wide application in this
context. I have been using LTS for both linear and nonlinear problems
in data mining for a long time, and it seems to work fine for our
research.
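
As a rough illustration of that setting, the sketch below fits a straight
line by least trimmed squares to synthetic data in which 10 percent of
the records are corrupted. It is not the S-PLUS ltsreg code: the function
name `lts_fit`, its parameters, and the data are all assumptions made for
this example, and the search (random two-point starts refined by
concentration steps) only approximates the exact LTS minimum.

```python
import numpy as np


def lts_fit(x, y, h=None, n_starts=200, n_csteps=20, seed=0):
    """Approximate LTS fit of y = a + b*x; returns array([a, b]).

    LTS minimises the sum of the h smallest squared residuals. Each random
    two-point start is refined by concentration steps: fit ordinary least
    squares on the current subset, then keep the h points with the
    smallest squared residuals under that fit.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    if h is None:
        h = n // 2 + 1  # just over half the data (assumed default)
    X = np.column_stack([np.ones(n), x])
    best_obj, best_beta = np.inf, None
    for _ in range(n_starts):
        subset = rng.choice(n, size=2, replace=False)
        for _ in range(n_csteps):
            beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
            r2 = (y - X @ beta) ** 2
            new_subset = np.argsort(r2)[:h]
            if np.array_equal(np.sort(new_subset), np.sort(subset)):
                break  # subset no longer changes: converged
            subset = new_subset
        obj = np.sort(r2)[:h].sum()  # trimmed sum of squares for this fit
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta


# Synthetic data: true line y = 1 + 2x, with 10% of the records corrupted.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=100)
bad = np.argsort(x)[-10:]  # corrupt the 10 records with the largest x
y[bad] += 50.0

slope_ols, intercept_ols = np.polyfit(x, y, 1)  # ordinary least squares
beta_lts = lts_fit(x, y)

print("OLS intercept, slope:", intercept_ols, slope_ols)
print("LTS intercept, slope:", beta_lts[0], beta_lts[1])
```

With this contamination the least squares slope is pulled well away from
2, while the trimmed fit stays close to the true line because the
corrupted records never enter the best half of the residuals.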

Weiguo

On Mon, 5 Jan 1998, David Ross wrote:

> > Doug Martin sent me an e-mail suggesting that he may post to s-news some
> > comments about the merit or lack of merit of robust procedures like
> > least trimmed squares. I think that could spark a fun and lively
> > debate. Go for it, Doug.
>
> Packages (like S+), coupled with the ubiquity of remarkable computing
> power, sometimes make it a bit *too* tempting to drop data into robust
> procedures. I'm guilty of this myself (especially since the robustness
> of lmsreg is backed up by a beautiful geometric characterization). In
> anticipation of Doug Martin's response, let me offer (to anyone reading
> this thread) an example I sometimes give my students. Suppose your
> dataset has 2k+1 elements in it, k+1 of them take the value 0, and k of
> them take the value 100. The median is therefore 0. Now, change one of
> the 0s into a 100. The median is now 100. This is robust?
>
> A similar example can be concocted for almost every procedure I
> know with a high breakdown point (including lmsreg).
>
> - David R. (ross@math.hawaii.edu)
>

----------------------------------------------------------------------
Mr. Weiguo Fan
Department of Information Systems and Computer Science
National University of Singapore
Lower Kent Ridge Road
Singapore 119260

Tel: (O)65-8746136 (H)65-6654106
Fax: (O)65-7794580
Personal Web Page: http://www.iscs.nus.edu.sg/~fanwg/
Project Web Page: http://www.iscs.nus.edu.sg/~fanwg/project/context.html