Data-mining

Data mining applies statistics and pattern recognition to large amounts of historical data to discover (potentially) interesting information and knowledge that can guide future actions.
Problem: separating signal from noise.
The data mining process is also known as knowledge discovery. It is not enough just to find interesting, previously unknown patterns in the data; the process must also show what a company should do with them.
Data mining bridges the gap between artificial intelligence and databases.
Techniques and tasks
New techniques:
A data mining process can be broken into six major phases (cf. CRISP-DM).

Directed vs undirected

Directed data mining: tries to explain or categorize a target value (temperature, income, etc.).
Techniques used: Classification, Estimation and Prediction.
Undirected data mining: tries to find patterns or similarities without targeting a specific value.
Techniques used: Affinity grouping and Clustering.
Profiling is used for both directed and undirected data mining.
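
The difference between the two modes can be illustrated with a small sketch. It is not a reference implementation: the income figures, the "no"/"yes" labels and all function names are made up for illustration. The directed part learns from a known target (a nearest-centroid classifier), the undirected part groups the same values without any target (a plain 1-D k-means).

```python
# Hypothetical data: yearly income (in thousands) and whether a product was bought.
incomes = [18, 22, 25, 30, 80, 85, 90, 95]
bought = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]


def nearest_centroid_fit(values, labels):
    """Directed: use the known target labels to compute one mean per label."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def nearest_centroid_predict(centroids, value):
    """Assign the label whose mean is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def kmeans_1d(values, k=2, iterations=10):
    """Undirected: group values without any target (requires k >= 2)."""
    ordered = sorted(values)
    # Start with centers spread evenly over the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(centers[i] - value))
            clusters[nearest].append(value)
        # Keep the old center if a cluster ended up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


centroids = nearest_centroid_fit(incomes, bought)      # needs the target
prediction = nearest_centroid_predict(centroids, 40)
centers, clusters = kmeans_1d(incomes, k=2)            # no target involved
print(centroids, prediction)
print(centers, clusters)
```

On this toy data both approaches find the same two groups, but only the directed one can say what the groups mean, because it was given the target.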

Dangers

Finding a pattern doesn't mean there is an underlying rule behind it (cf. the Big Dipper in the sky).
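
A minimal sketch of this danger, assuming purely random data: when enough unrelated features are compared against a target, one of them will correlate noticeably by chance alone. The row and feature counts, the seed and the function names are arbitrary choices for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_rows=20, n_features=200, seed=0):
    """Return the largest |r| between a noise target and many noise features."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best


best = strongest_chance_correlation()
print(f"strongest |r| among 200 pure-noise features: {best:.2f}")
```

Nothing in this data is related to anything else, so a "pattern" found this way would not hold up on fresh data. Checking a finding against data that was not used to discover it is one common safeguard.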

Model set

The (historical) data used to develop data mining models.
Beware of biased samples (customers vs prospects, responders vs non-responders).
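
A toy sketch of why a biased model set misleads, under invented assumptions: the population is 40 prospects aged 20 to 59, and only those aged 45 or older became customers. The ages, the cutoff and the variable names are hypothetical.

```python
# Hypothetical population of prospects, one per age from 20 to 59.
prospect_ages = list(range(20, 60))

# Assumed for illustration: only prospects aged 45+ became customers.
customer_ages = [age for age in prospect_ages if age >= 45]


def mean(values):
    return sum(values) / len(values)


mean_all = mean(prospect_ages)         # what the whole market looks like
mean_customers = mean(customer_ages)   # what a customers-only model set shows
bias = mean_customers - mean_all

print(mean_all, mean_customers, bias)
```

A model built only on the customers would describe the people who already bought, not the prospects it is supposed to score.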

TODO

Do considerations of political correctness thwart the opportunities of data mining?

See also

Compare with data science.
Data mining process (steps)
Data analysis
Data warehouse (DWH)
The virtuous cycle of data mining
A misuse of data mining is data dredging.
Related terms: data archaeology, information harvesting, information discovery, knowledge extraction, etc.
Oracle's dbms_predictive_analytics package.
SQL Server Analysis Services (SSAS)
CRISP-DM
Information silos are a problem for organization-wide data mining.
Oracle Advanced Analytics
Process mining