Data Mining: Concepts and Techniques - Jiawei Han [413]
average() 215
B
background knowledge 30–31
backpropagation 393, 398–408, 437
    activation function 402
    algorithm illustration 401
    biases 402, 404
    case updating 404
    efficiency 404
    epoch updating 404
    error 403
    functioning of 400–403
    hidden layers 399
    input layers 399
    input propagation 401–402
    interpretability and 406–408
    learning 400
    learning rate 403–404
    logistic function 402
    multilayer feed-forward neural network 398–399
    network pruning 406–407
    neural network topology definition 400
    output layers 399
    sample learning calculations 404–406
    sensitivity analysis 408
    sigmoid function 402
    squashing function 403
    terminating conditions 404
    unknown tuple classification 406
    weights initialization 401 see also classification
bagging 379–380
    algorithm illustration 380
    boosting versus 381–382
    in building random forests 383
bar charts 54
base cells 189
base cuboids 111, 137–138, 158
Basic Local Alignment Search Tool (BLAST) 591
Baum-Welch algorithm 591
Bayes’ theorem 350–351
Bayesian belief networks 393–397, 436
    algorithms 396
    components of 394
    conditional probability table (CPT) 394, 395
    directed acyclic graph 394–395
    gradient descent strategy 396–397
    illustrated 394
    mechanisms 394–396
    problem modeling 395–396
    topology 396
    training 396–397 see also classification
Bayesian classification
    basis 350
    Bayes’ theorem 350–351
    class conditional independence 350
    naive 351–355, 385
    posterior probability 351
    prior probability 351
BCubed precision metric 488, 489
BCubed recall metric 489
behavioral attributes 546, 573
believability, data 85
BI (business intelligence) 27
biases 402, 404
biclustering 512–519, 538
    application examples 512–515
    enumeration methods 517, 518–519
    gene expression example 513–514
    methods 517–518
    optimization-based methods 517–518
    recommender system example 514–515
    types of 538
biclusters 511
    with coherent values 516
    with coherent values on rows 516
    with constant values 515
    with constant values on columns 515
    with constant values on rows 515
    as submatrix 515
    types of 515–516
bimodal 47
bin boundaries 89
binary attributes 41, 79
    asymmetric 42, 70
    as Boolean 41
    contingency table for 70
    dissimilarity between 71–72
    example 41–42
    proximity measures 70–72
    symmetric 42, 70–71 see also attributes
binning
    discretization by 115
    equal-frequency 89
    smoothing by bin boundaries 89
    smoothing by bin means 89
    smoothing by bin medians 89
biological sequences 586, 624
    alignment of 590–591
    analysis 590
    BLAST 590
    hidden Markov model 591
    as mining trend 624
    multiple sequence alignment 590
    pairwise alignment 590
    phylogenetic tree 590
    substitution matrices 590
bipartite graphs 523
BIRCH 458, 462–466
    CF-trees 462–463, 464, 465–466
    clustering feature 462, 463, 464
    effectiveness 465
    multiphase clustering technique 464–465 see also hierarchical methods
bitmap indexing 160–161, 179
bitmapped join indexing 163, 179
bivariate distribution 40
BLAST. see Basic Local Alignment Search Tool
BOAT.
Boolean association rules 281
Boolean attributes 41
boosting 380
    accuracy 382
    AdaBoost 380–382
    bagging versus 381–382
    weight assignment 381
bootstrap method 371, 386
bottom-up design approach 133, 151–152
bottom-up subspace search 510–511
boxplots 49
    computation 50
    example 50
    five-number summary 49
    illustrated 50
    in outlier visualization 555
BUC 200–204, 235
    for 3-D data cube computation 200
    algorithm 202
    Apriori property 201
    bottom-up construction 201
    iceberg cube construction 201
    partitioning snapshot 203
    performance 204
    top-down processing order 200, 201
business intelligence (BI) 27
business metadata 135
business query view 151
C
C4.5 332, 385
    class-based ordering 358
    gain ratio use 340
    greedy approach 332
    pessimistic pruning 345
    rule extraction 358 see also decision tree induction
cannot-link constraints 533
CART 332, 385
    cost complexity pruning algorithm 345
    Gini index use 341
    greedy approach 332 see also decision tree induction
case updating 404
case-based