Data Mining: Concepts and Techniques - Jiawei Han [417]
base 111, 137–138, 158
child 193
individual 190
lattice of 139, 156, 179, 188–189, 234, 290
sparse 190
subset selection 160 see also data cubes
curse of dimensionality 158, 179
customer relationship management (CRM) 619
customer retention analysis 610
CVQE. see Constrained Vector Quantization Error algorithm
cyber-physical systems (CPS) 596, 623–624
D
data
antimonotonicity 300
archeology 6
biological sequence 586, 590–591
complexity 32
conversion to knowledge 2
cyber-physical system 596
for data mining 8
data warehouse 13–15
database 9–10
discrimination 16
dredging 6
generalizing 150
graph 14
growth 2
linearly inseparable 413–415
linearly separated 409
multimedia 14, 596
multiple sources 15, 32
multivariate 556
networked 14
overfitting 330
relational 10
sample 219
similarity and dissimilarity measures 65–78
skewed 47, 271
spatial 14, 595
spatiotemporal 595–596
specializing 150
statistical descriptions 44–56
streams 598
symbolic sequence 586, 588–589
temporal 14
text 14, 596–597
time-series 586, 587
“tombs” 5
training 18
transactional 13–14
types of 33
web 597–598
data auditing tools 92
data characterization 15, 166
attribute-oriented induction 167–172
data mining query 167–168
example 16
methods 16
output 16
data classification. see classification
data cleaning 6, 85, 88–93, 120
in back-end tools/utilities 134
binning 89–90
discrepancy detection 91–93
by information network analysis 592–593
missing values 88–89
noisy data 89
outlier analysis 90
pattern mining for 318
as process 91–93
regression 90 see also data preprocessing
data constraints 294
antimonotonic 300
pruning data space with 300–301
succinct 300 see also constraints
data cube aggregation 110–111
data cube computation 156–160, 214–215
aggregation and 193
average() 215
BUC 200–204, 235
cube operator 157–159
cube shells 211
full 189–190, 195–199
general strategies for 192–194
iceberg 160, 193–194
memory allocation 199
methods 194–218, 235
multiway array aggregation 195–199
one-pass 198
preliminary concepts 188–194
shell fragments 210–218, 235
Star-Cubing 204–210, 235
data cubes 10, 136, 178, 188
3-D 138
4-D 138, 139
apex cuboid 111, 138, 158
base cuboid 111, 137–138, 158
closed 192
cube shell 192
cuboids 137
curse of dimensionality 158
discovery-driven exploration 231–234
example 11–13
full 189–190, 196–197
gradient analysis 321
iceberg 160, 190–191, 201, 235
lattice of cuboids 157, 234, 290
materialization 159–160, 179, 234
measures 145
multidimensional 12, 136–139
multidimensional data mining and 26
multifeature 227, 230–231, 235
multimedia 596
prediction 227–230, 235
qualitative association mining 289–290
queries 230
query processing 218–227
ranking 225–227, 235
sampling 218–220, 235
shell 160, 211
shell fragments 192, 210–218, 235
sparse 190
spatial 595
technology 187–242
data discretization. see discretization
data dispersion 44, 48–51
boxplots 49–50
five-number summary 49
quartiles 48–49
standard deviation 50–51
variance 50–51
data extraction, in back-end tools/utilities 134
data focusing 168
data generalization 179–180
by attribute-oriented induction 166–178
data integration 6, 85–86, 93–99, 120
correlation analysis 94–98
detection/resolution of data value conflicts 99
entity identification problem 94
by information network analysis 592–593
object matching 94
redundancy and 94–98
schema 94
tuple duplication 98–99 see also data preprocessing
data marts 132, 142
data warehouses versus 142
dependent 132
distributed 134
implementation 132
independent 132
data matrix 67–68
dissimilarity matrix versus 67–68
relational table 67–68
rows and columns 68
as two-mode matrix 68
data migration tools 93
data mining 5–8, 33, 598, 623
ad hoc 31
applications 607–618
biological data 624
complex data types 585–598, 625
cyber-physical system data 596
data streams 598
data types for 8
data warehouses for 154