Data Mining_ Concepts and Techniques - Jiawei Han [427]
grid-based, 562–564
types of, 552, 560 see also outlier detection
pruning
cost complexity algorithm, 345
data space, 300–301
decision trees, 331, 344–347
in k-nearest neighbor classification, 425
network, 406–407
pattern space, 295, 297–300
pessimistic, 345
postpruning, 344–345, 346
prepruning, 344, 346
rule, 363
search space, 263, 301
sets, 345
shared dimensions, 205
sub-itemset, 263
pyramid algorithm, 101
Q
quality control, 600
quantile plots, 51–52
quantile-quantile plots, 52
example, 53–54
illustrated, 53 see also graphic displays
quantitative association rules, 281, 283, 288, 320
clustering-based mining, 290–291
data cube-based mining, 289–290
exceptional behavior disclosure, 291
mining, 289
quartiles, 48
first, 49
third, 49
queries, 10
intercuboid expansion, 223–225
intracuboid expansion, 221–223
language, 10
OLAP, 129, 130
point, 216, 217, 220
processing, 163–164, 218–227
range, 220
relational operations, 10
subcube, 216, 217–218
top-k, 225–227
query languages, 31
query models, 149–150
query-driven approach, 128
querying function, 433
R
rag bag criterion, 488
RainForest, 385
random forests, 382–383
random sampling, 370, 386
random subsampling, 370
random walk, 526
similarity based on, 527
randomization methods, 621
range, 48
interquartile, 49
range queries, 220
ranking
cubes, 225–227, 235
dimensions, 225
function, 225
heterogeneous networks, 593
rare patterns, 280, 283, 320
example, 291–292
mining, 291–294
ratio-scaled attributes, 43–44, 79
reachability density, 566
reachability distance, 565
recall measure, 368–369
recognition rate, 366–367
recommender systems, 282, 615
advantages, 616
biclustering for, 514–515
challenges, 617
collaborative, 610, 615, 616, 617, 618
content-based approach, 615, 616
data mining and, 615–618
error types, 617–618
frequent pattern mining for, 319
hybrid approaches, 618
intelligent query answering, 618
memory-based methods, 617
use scenarios, 616
recursive partitioning, 335
reduced support, 285, 286
redundancy
in data integration, 94
detection by correlations analysis, 94–98
redundancy-aware top-k patterns, 281, 311, 320
extracting, 310–312
finding, 312
strategy comparison, 311–312
trade-offs, 312
refresh, in back-end tools/utilities, 134
regression, 19, 90
coefficients, 105–106
example, 19
linear, 90, 105–106
in statistical data mining, 599
regression analysis, 19, 328
in time-series data, 587–588
relational databases, 9
components of, 9
mining, 10
relational schema for, 10
relational OLAP (ROLAP), 132, 164, 165, 179
relative significance, 312
relevance analysis, 19
repetition, 346
replication, 347
illustrated, 346
representative patterns, 309
retail industry, 609–611
RIPPER, 359, 363
robustness, classification, 369
ROC curves, 374, 386
classification models, 377
classifier comparison with, 373–377
illustrated, 376, 377
plotting, 375
roll-up operation, 11, 146
rough set approach, 428–429, 437
row enumeration, 302
rule ordering, 357
rule pruning, 363
rule quality measures, 361–363
rule-based classification, 355–363, 386
IF-THEN rules, 355–357
rule extraction, 357–359
rule induction, 359–363
rule pruning, 363
rule quality measures, 361–363
rules for constraints, 294
S
sales campaign analysis, 610
samples, 218
cluster, 108–109
data, 219
simple random, 108
stratified, 109–110
sampling
in Apriori efficiency, 256
as data reduction technique, 108–110
methods, 108–110
oversampling, 384–385
random, 386
with replacement, 380–381
uncertainty, 433
undersampling, 384–385
sampling cubes, 218–220, 235
confidence interval, 219–220
framework, 219–220
query expansion with, 221
SAS Enterprise Miner, 603, 604
scalability
classification, 369
cluster analysis, 446
cluster methods, 445
data mining algorithms, 31
decision tree induction and, 347–348
dimensionality and, 577
k-means, 454
scalable computation, 319
SCAN. see Structural Clustering Algorithm for Networks
core vertex, 531
illustrated, 532
scatter plots, 54
2-D data set visualization with, 59
3-D data set visualization with, 60
correlations between attributes, 54–56
illustrated, 55
matrix, 56,