Data Mining_ Concepts and Techniques - Jiawei Han [419]
usage for information processing 153
view 151
virtual 133
warehouse database server 131
database management systems (DBMSs) 9
database queries. see queries
databases 9
inductive 601
relational. see relational databases
research 26
statistical 148–149
technology evolution 3
transactional 13–15
types of 32
web-based 4
data/pattern analysis. see data mining
DBSCAN 471–473
algorithm illustration 474
core objects 472
density estimation 477
density-based cluster 472
density-connected 472, 473
density-reachable 472, 473
directly density-reachable 472
neighborhood density 471
see also cluster analysis; density-based methods
DDPMine 422
decimal scaling, normalization by 115
decision tree analysis, discretization by 116
decision tree induction 330–350, 385
algorithm differences 336
algorithm illustration 333
attribute selection measures 336–344
attribute subset selection 105
C4.5 332
CART 332
CHAID 343
gain ratio 340–341
Gini index 332, 341–343
ID3 332
incremental versions 336
information gain 336–340
multivariate splits 344
parameters 332
scalability and 347–348
splitting criterion 333
from training tuples 332–333
tree pruning 344–347, 385
visual mining for 348–350
decision trees 18, 330
branches 330
illustrated 331
internal nodes 330
leaf nodes 330
pruning 331, 344–347
root node 330
rule extraction from 357–359
deep web 597
default rules 357
DENCLUE 476–479
advantages 479
clusters 478
density attractor 478
density estimation 476
kernel density estimation 477–478
kernels 478
see also cluster analysis; density-based methods
dendrograms 460
densification power law 592
density estimation 476
DENCLUE 477–478
kernel function 477–478
density-based methods 449, 471–479, 491
DBSCAN 471–473
DENCLUE 476–479
object division 449
OPTICS 473–476
STING as 480
see also cluster analysis
density-based outlier detection 564–567
local outlier factor 566–567
local proximity 564
local reachability density 566
relative density 565
descendant cells 189
descriptive mining tasks 15
DIANA (Divisive Analysis) 459, 460
dice operation 148
differential privacy 622
dimension tables 136
dimensional cells 189
dimensionality reduction 86, 99–100, 120
dimensionality reduction methods 510, 519–522, 538
list of 587
spectral clustering 520–522
dimension/level
application of 297
constraints 294
dimensions 10, 136
association rule 281
cardinality of 159
concept hierarchies and 142–144
in multidimensional view 33
ordering of 210
pattern 281
ranking 225
relevance analysis 175
selection 225
shared 204
see also data warehouses
direct discriminative pattern mining 422
directed acyclic graphs 394–395
discernibility matrix 427
discovery-driven exploration 231–234, 235
discrepancy detection 91–93
discrete attributes 44
discrete Fourier transform (DFT) 101, 587
discrete wavelet transform (DWT) 100–102, 587
discretization 112, 120
by binning 115
by clustering 116
by correlation analysis 117
by decision tree analysis 116
by histogram analysis 115–116
techniques 113
discriminant analysis 600
discriminant rules 16
discriminative frequent pattern-based classification 416, 419–422, 437
basis for 419
feature generation 420
feature selection 420–421
framework 420–421
learning of classification model 421
dispersion of data 44, 48–51
dissimilarity
asymmetric binary 71
between attributes of mixed type 76–77
between binary attributes 71–72
measuring 65–78, 79
between nominal attributes 69
on numeric data 72–74
between ordinal attributes 75
symmetric binary 70–71
dissimilarity matrix 67, 68
data matrix versus 67–68
n-by-n table representation 68
as one-mode matrix 68
distance measures 461–462
Euclidean 72–73
Manhattan 72–73
Minkowski 73
supremum 73–74
types of 72
distance-based cluster analysis 445
distance-based outlier detection 561–562
nested loop algorithm 561, 562
see also outlier detection
distributed data mining 615, 624
distributed privacy preservation 622
distributions
boxplots