Data Mining_ Concepts and Techniques - Jiawei Han [421]
in recommender systems 319
road map 279–283
scalable computation and 319
scope of 319–320
in sequence or structural data analysis 319
in spatiotemporal data analysis 319
for structure and cluster discovery 318
for subspace clustering 318–319
in time-series data analysis 319
top-k 310
in video data analysis 319
see also frequent patterns
frequent pattern-based classification 415–422, 437
associative 415, 416–419
discriminative 416, 419–422
framework 422
frequent patterns 17, 243
abstraction levels 281
association rule mapping 280
basic 280
closed 262–264, 280
concepts 243–244
constraint-based 281
dimensions 281
diversity 280
exploration 313–319
growth 257–259, 272
max 262–264, 280
mining 243–244, 279–325
mining constraints or criteria 281
number of dimensions involved in 281
semantic annotation of 313–317
sequential 243
strong associations 437
structured 243
trees 257–259
types of values in 281
frequent subgraphs 591
front-end client layer 132
full materialization 159, 179, 234
fuzzy clustering 499–501, 538
data set for 506
with EM algorithm 505–507
example 500
expectation step (E-step) 505
flexibility 501
maximization step (M-step) 506–507
partition matrix 499
as soft clusters 501
fuzzy logic 428
fuzzy sets 428–429, 437, 499
evaluation 500–501
example 499
G
gain ratio 340
C4.5 use of 340
formula 341
maximum 341
gateways 131
gene expression 513–514
generalization
attribute 169–170
attribute, control 170
attribute, threshold control 170
in multimedia data mining 596
process 172
results presentation 174
synchronous 175
generalized linear models 599–600
generalized relations
attribute-oriented induction 172
presentation of 174
threshold control 170
generative model 467–469
genetic algorithms 426–427, 437
genomes 15
geodesic distance 525–526, 539
diameter 525
eccentricity 525
measurements based on 526
peripheral vertex 525
radius 525
geographic data warehouses 595
geometric projection visualization 58–60
Gini index 341
binary enforcement 332
binary indexes 341
CART use of 341
decision tree induction using 342–343
minimum 342
partitioning and 342
global constants, for missing values 88
global outliers 545, 581
detection 545
example 545
Google
Flu Trends 2
popularity of 619–620
gradient descent strategy 396–397
algorithms 397
greedy hill-climbing 397
as iterative 396–397
graph and network data clustering 497, 522–532, 539
applications 523–525
bipartite graph 523
challenges 523–525, 530
cuts and clusters 529–530
generic method 530–531
geodesic distance 525–526
methods 528–532
similarity measures 525–528
SimRank 526–528
social network 524–525
web search engines 523–524
see also cluster analysis
graph cuts 539
graph data 14
graph index structures 591
graph pattern mining 591–592, 612–613
graphic displays
data presentation software 44–45
histogram 54, 55
quantile plot 51–52
quantile-quantile plot 52–54
scatter plot 54–56
greedy hill-climbing 397
greedy methods, attribute subset selection 104–105
grid-based methods 450, 479–483, 491
CLIQUE 481–483
STING 479–481
see also cluster analysis
grid-based outlier detection 562–564
CELL method 562, 563
cell properties 562
cell pruning rules 563
see also outlier detection
group-based support 286
group-by clause 231
grouping attributes 231
grouping variables 231
Grubbs' test 555
H
Hamming distance 431
hard constraints 534, 539
example 534
handling 535–536
harmonic mean 369
hash-based technique 255
heterogeneous networks 592
classification of 593
clustering of 593
ranking of 593
heterogeneous transfer learning 436
hidden Markov model (HMM) 590, 591
hierarchical methods 449, 457–470, 491
agglomerative 459–461
algorithmic 459, 461–462
Bayesian 459
BIRCH 458, 462–466
Chameleon 458, 466–467
complete linkages 462, 463
distance measures 461–462
divisive 459–461
drawbacks 449
merge or split points and 458
probabilistic 459, 467–470
single linkages 462, 463
see also cluster analysis