February 08, 2010
News
ESPN article: "You want winners? Here's a formula" reports on work done by ERI Research Scientist Dr. Andrew Fast. In his research, Dr. Fast found valuable information in the social network of NFL coaches.

STATISTICA blogs about partnership with Elder Research. Includes comments from ERI VP Dustin Hux about STATISTICA Data Miner.

Audio Webcast, recorded Thursday, August 27, at 11:00 am, "Top 10 Data Mining Mistakes," John Elder.

15th Annual KDD Conference, Paris, France; ERI Participating Officers:
John Elder, General Chair
Antonia de Medinaceli, Sponsorship Co-Chair
JR Lawhorne, Webmaster

John Elder on "The High ROI of Predictive Analytics for Innovative Organizations," March 25, 2009.  Blog post by James Taylor; Audio interview by predictiveanalytics.org. 

Elder Research Inc. partners with Kansas State University to help develop an Internet tool that will monitor diseases worldwide.
Upcoming Talks
The University of Virginia's Darden Business School, "Everything an MBA Needs to Know About Data Mining," John Elder, Charlottesville, VA, April 20, 2010

2010 Annual Two-Day Course, "Tools for Discovering Patterns in Data: A Survey of Modern Data Mining Algorithms," Dr. John Elder, Charlottesville, September 13-14, 2010.
Selected Past Talks
Two-Day Course presented, "Tools for Discovering Patterns in Data: A Survey of Modern Data Mining," John Elder, Charlottesville, Virginia, September 14-15, 2009.

Aviation Safety Information Analysis and Sharing Technology and Tools Symposium, MITRE, "Text Mining Lessons Learned from Real Applications," Dr. John Elder, and "Breakthroughs Using Ensembles--A Committee of Models," Dr. Cheryl Howard, July 27, 2009.

KDD Conference 2009, "Organizational Traits Leading to High ROI for Data Mining," Antonia de Medinaceli, Paris, France, June 29, 2009.

One-Day Seminar, Predictive Analytics World Conference, "The Best and Worst of Predictive Analytics: Predictive Modeling Methods and Common Data Mining Mistakes," John Elder, San Francisco, CA, February 20, 2009.
  
Handbook of Statistical Analysis and Data Mining Applications

Authors: Robert Nisbet, Ph.D.; John Elder IV, Ph.D.; Gary Miner, Ph.D.
Published June 5, 2009, Elsevier Publishing

Reader comments from Amazon.com:

"Rarely do authors succeed in writing THE comprehensive guide to anything, particularly when the subject matter is as complex, multifaceted, and rapidly changing as the field of data mining.  The Handbook of Statistical Analysis & Data Mining Applications far exceeds that worthy goal.  The text is well-organized, thoughtfully written and intuitive."

"The 'Handbook of Statistical Analysis and Data Mining Applications' is the finest book I have seen on the subject.  It is not only a beautifully crafted book, with numerous color graphs, charts, tables, and screen shots, but the statistical discussion is both clear and comprehensive."

Top 10 Data Mining Mistakes


The following is a portion of Dr. John Elder's well-known talk on the top ten data mining mistakes.  This talk has been presented at many conferences, and continues to be in high demand.


See also:
Part 2: Don't rely on only one technique
Part 3: Don't extrapolate
Part 4: The path to data mining success
Peregrine Case Study
This case study details ERI's involvement in the development of Peregrine's DecisionCenter software product.
Data Mining and Investments
John Elder responds to an article in The Wall Street Journal about the potential for data mining to serve as a tool for investment modeling.  See his comments below:

"Data Mining Isn't a Good Bet For Stock-Market Predictions" (Wall Street Journal, The Intelligent Investor, 8/8/2009) is perfectly true if one does it wrong.  The examples given (of what statisticians call "overfit") are instructive; if you only have a few cases to fit and you look at enough candidate variables then you can model anything--on the known data.  But that model won't hold up on new, "out-of-sample" data, which is the only place that matters.

Such "torturing the data until it confesses" (or "data dredging") is the greatest mistake one can make when trying to learn from the past.  Its effects are worse than ignorance, as it leads you to move, with dangerous confidence, in a random direction.  Breaking the data into pieces and looking for consistency, as the article suggests, is great advice.  (Over a decade ago, the WSJ eulogized Julian Simon--a leader who popularized this powerful idea of "resampling," albeit for different reasons.)  However, insisting that the model "make sense," as often advised, isn't the protection it should be.  We humans are such good sign-seekers and reason-finders that we can make sense out of literally any result presented!  (For instance, who doesn't see something interesting when looking up at fluffy clouds?)

Overfit, and other top data mining mistakes, are exposed in a chapter of our recent book, The Handbook of Statistical Analysis and Data Mining Applications (Nisbet, Elder, Miner, 2009, Academic Press).  When such mistakes are avoided, the predictions of data mining are statistically valid and thus capable of yielding an enormous return on investment.

-John Elder, CEO, Elder Research, Inc.
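
The overfit effect described in the letter above can be illustrated with a small simulation. This is only a sketch and is not taken from the article, the talk, or the book: the function names, the 20 cases, the 500 candidate variables, and the use of simple correlation as the "model" are all assumptions chosen for illustration. Every series is pure random noise, so any relationship found on the known data is spurious by construction.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def dredge(n_cases=20, n_candidates=500, seed=0):
    """Pick the noise variable that best 'explains' a noise target.

    Returns (in_sample_r, out_of_sample_r) for the selected variable.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_total = 2 * n_cases  # first half is known data, second half is new data
    target = [rng.gauss(0, 1) for _ in range(n_total)]
    candidates = [[rng.gauss(0, 1) for _ in range(n_total)]
                  for _ in range(n_candidates)]
    known, new = slice(0, n_cases), slice(n_cases, n_total)

    # "Torture the data": keep whichever candidate fits the known cases best.
    best = max(candidates,
               key=lambda c: abs(pearson(c[known], target[known])))
    return (pearson(best[known], target[known]),
            pearson(best[new], target[new]))


if __name__ == "__main__":
    for seed in range(5):
        in_r, out_r = dredge(seed=seed)
        print(f"seed {seed}: in-sample r = {in_r:+.2f}, "
              f"out-of-sample r = {out_r:+.2f}")
```

With few cases and many candidates, the selected variable typically looks strongly related to the target on the known data, and that apparent relationship tends to vanish on the held-out half. Splitting the data and checking for consistency, as the letter recommends, is what exposes it.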
 
Copyright 2009 by Elder Research, Inc.