Data Mining_ Concepts and Techniques - Jiawei Han [428]
schemas
  integration, 94
  snowflake, 140–141
  star, 139–140
science applications, 611–613
search engines, 28
search space pruning, 263, 301
second guess heuristic, 369
selection dimensions, 225
self-training, 432
semantic annotations
  applications, 317, 313, 320–321
  with context modeling, 316
  from DBLP data set, 316–317
  effectiveness, 317
  example, 314–315
  of frequent patterns, 313–317
  mutual information, 315–316
  task definition, 315
Semantic Web, 597
semi-offline materialization, 226
semi-supervised classification, 432–433, 437
  alternative approaches, 433
  cotraining, 432–433
  self-training, 432
semi-supervised learning, 25
  outlier detection by, 572
semi-supervised outlier detection, 551
sensitivity analysis, 408
sensitivity measure, 367
sentiment classification, 434
sequence data analysis, 319
sequences, 586
  alignment, 590
  biological, 586, 590–591
  classification of, 589–590
  similarity searches, 587
  symbolic, 586, 588–590
  time-series, 586, 587–588
sequential covering algorithm, 359
  general-to-specific search, 360
  greedy search, 361
  illustrated, 359
  rule induction with, 359–361
sequential pattern mining, 589
  constraint-based, 589
  in symbolic sequences, 588–589
shapelets method, 590
shared dimensions, 204
  pruning, 205
shared-sorts, 193
shared-partitions, 193
shell cubes, 160
shell fragments, 192, 235
  approach, 211–212
  computation algorithm, 212, 213
  computation example, 214–215
  precomputing, 210
shrinking diameter, 592
sigmoid function, 402
signature-based detection, 614
significance levels, 373
significance measure, 312
significance tests, 372–373, 386
silhouette coefficient, 489–490
similarity
  asymmetric binary, 71
  cosine, 77–78
  measuring, 65–78, 79
  nominal attributes, 70
similarity measures, 447–448, 525–528
  constraints on, 533
  geodesic distance, 525–526
  SimRank, 526–528
similarity searches, 587
  in information networks, 594
  in multimedia data mining, 596
simple random sample with replacement (SRSWR), 108
simple random sample without replacement (SRSWOR), 108
SimRank, 526–528, 539
  computation, 527–528
  random walk, 526–528
  structural context, 528
simultaneous aggregation, 195
single-dimensional association rules, 17, 287
single-linkage algorithm, 460, 461
singular value decomposition (SVD), 587
skewed data
  balanced, 271
  negatively, 47
  positively, 47
  wavelet transforms on, 102
slice operation, 148
small-world phenomenon, 592
smoothing, 112
  by bin boundaries, 89
  by bin means, 89
  by bin medians, 89
  for data discretization, 90
snowflake schema, 140
  example, 141
  illustrated, 141
  star schema versus, 140
social networks, 524–525, 526–528
  densification power law, 592
  evolution of, 594
  mining, 623
  small-world phenomenon, 592
  See also networks
social science/social studies data mining, 613
soft clustering, 501
soft constraints, 534, 539
  example, 534
  handling, 536–537
space-filling curve, 58
sparse data, 102
sparse data cubes, 190
sparsest cuts, 539
sparsity coefficient, 579
spatial data, 14
spatial data mining, 595
spatiotemporal data analysis, 319
spatiotemporal data mining, 595, 623–624
specialized SQL servers, 165
specificity measure, 367
spectral clustering, 520–522, 539
  effectiveness, 522
  framework, 521
  steps, 520–522
speech recognition, 430
speed, classification, 369
spiral method, 152
split-point, 333, 340, 342
splitting attributes, 333
splitting criterion, 333, 342
splitting rules. See attribute selection measures
splitting subset, 333
SQL, as relational query language, 10
square-error function, 454
squashing function, 403
standard deviation, 51
  example, 51
  function of, 50
star schema, 139
  example, 139–140
  illustrated, 140
  snowflake schema versus, 140
Star-Cubing, 204–210, 235
  algorithm illustration, 209
  bottom-up computation, 205
  example, 207
  for full cube computation, 210
  ordering of dimensions and, 210
  performance, 210
  shared dimensions, 204–205
starnet query model, 149
  example, 149–150
star-nodes, 205
star-trees, 205
  compressed base table, 207
  construction, 205
statistical data mining, 598–600
  analysis of variance, 600
  discriminant analysis, 600
  factor analysis, 600
  generalized linear models, 599–600
  mixed-effect