January 2002. Volume 3, Issue 2
Editorial by
M. J. Zaki (available in PDF
and Postscript
formats or HTML)
Contributed Articles on Online, Interactive, and Anytime Data Mining
Mining Data Streams under Block Evolution
V. Ganti, J. Gehrke and
R. Ramakrishnan
(available in PDF
and Postscript
formats)
ABSTRACT: In this paper we survey recent
work on incremental data mining model maintenance and change detection
under block evolution. In block evolution, a dataset is updated
periodically through insertions and deletions of blocks of records
at a time. We describe two techniques: (1) a generic
algorithm for model maintenance that takes any traditional incremental
data mining model maintenance algorithm and transforms it into an algorithm
that allows restrictions on a temporal subset of the database, and (2)
a generic framework for change detection that quantifies
the difference between two datasets in terms of the data mining models
they induce.
Towards Effective and Interpretable Data Mining
by Visual Interaction
C. C. Aggarwal
(available in PDF
and Postscript
formats)
ABSTRACT: The primary aim of most
data mining algorithms is to facilitate the discovery of concise and interpretable
information from large amounts of data. However, many of the current
formalizations of data mining algorithms have not quite reached this goal.
One of the reasons for this is that the focus on using purely automated
techniques has imposed several constraints on data mining algorithms.
For example, any data mining problem such as clustering or association
rules requires the specification of particular problem formulations, objective
functions, and parameters. Such systems fail to take the user's needs
into account very effectively. This makes it necessary to keep the
user in the loop in a way which is both efficient and interpretable.
One unique way of achieving this is by leveraging human visual perception
on intermediate data mining results. Such a system combines the computational
power of a computer and the intuitive abilities of a human to provide solutions
which cannot be achieved by either alone. This paper will discuss a number
of recent approaches to several data mining algorithms along these lines.
Requirements for Clustering Data Streams
D. Barbará
(available in PDF
and Postscript
formats)
ABSTRACT: Scientific and industrial
examples of data streams abound in astronomy, telecommunication operations,
banking and stock market applications, e-commerce and other fields.
A challenge imposed by continuously arriving data streams is to analyze
them and to modify the models that explain them as new data arrives.
In this paper, we analyze the requirements needed for clustering data streams.
We review some of the latest algorithms in the literature and assess whether
they meet these requirements.
Interactive Mining and Knowledge Reuse
for the Closed-Itemset Incremental-Mining Problem
L. Dumitriu
(available in PDF
and Postscript
formats)
ABSTRACT: Using
concept lattices as a theoretical background for finding association rules
has led to the design of algorithms like Charm, Close or Closet. While
they are considered extremely appropriate for finding concepts for
association rules, due to the smaller number of results, they do not cover
a certain area of significant results, namely the pseudo-intents that form
the base for global implications. We have proposed an approach that,
besides finding all proper partial implications, also finds pseudo-intents.
The way our algorithm is devised, it allows certain important operations
on concept lattices, like adding or extracting items, meaning we can reuse
previously found results. It is a well-known fact that mining association
rules may lead to a large number of results. Since the mining results
are meant to be understood by the user, we have come to the conclusion
that the user will benefit more from starting small, with some of the items in
the database, understanding a small number of results, and then adding items
and receiving only the extra results. This way the number of human interventions
during the "full" mining process is increased and the process becomes user-driven.
MobiMine: Monitoring the Stock Market
from a PDA
H. Kargupta, B.-H. Park, S. Pittie, L. Liu, D. Kushraj and K. Sarkar
(available in PDF
and Postscript
formats)
ABSTRACT: This paper describes an experimental
mobile data mining system that allows intelligent monitoring of time-critical
financial data from a hand-held PDA. It presents the overall system
architecture and the philosophy behind the design. It explores one
particular aspect of the system -- automated construction of a personalized
focus area that calls for the user's attention. The module works using
data mining techniques. The paper describes the data mining component
of the system that employs a novel Fourier analysis-based approach to efficiently
represent, visualize, and communicate decision trees over limited bandwidth
wireless networks. The paper also discusses a quadratic programming-based
personalization module that runs on the PDAs and the multi-media based
user-interfaces. It reports experimental results using an ad hoc
peer-to-peer IEEE 802.11 wireless network.
Reports from KDD-2001
KDD Cup 2001 Report
J. Cheng, C. Hatzis, H. Hayashi, M.-A. Krogel, S. Morishita, D. Page, J. Sese
(available in PDF
and Postscript
formats)
ABSTRACT: This paper presents results
and lessons from KDD Cup 2001. KDD Cup 2001 focused on mining biological
databases. It involved three cutting-edge tasks related to drug design
and genomics.
MDM/KDD: Multimedia Data Mining for the
Second Time
O. R. Zaïane and
S. J. Simoff
(available in PDF
and Postscript
formats)
ABSTRACT: This brief report summarizes
the presentations, conclusions and directions for future work that were
discussed during the second edition of the International Workshop on Multimedia
Data Mining. The report includes references to resources where one
can find more information about the workshop format, the proceedings and
the workshop participants.
Workshop Report: The Fourth Workshop on
Mining Scientific Datasets, August 2001
C. Kamath
(available in PDF
and Postscript
formats)
Visual Data Mining -- KDD Workshop Report
S. G. Eick and D. A. Keim
(available in PDF
and Postscript
formats)
BIOKDD01: Workshop on Data Mining in Bioinformatics
M. J. Zaki, J. T. L. Wang, H. T. T. Toivonen
(available in PDF
and Postscript
formats)
ABSTRACT: In this report we provide a summary
of the BIOKDD01 Workshop on Data Mining in Bioinformatics, held in conjunction
with the 7th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, August 26, 2001 in San Francisco, California, USA.
When and How to Subsample: Report on
the KDD-2001 Panel
P. Domingos
(available in PDF
and Postscript
formats)
Report on the SIGKDD 2001 Conference Panel "New
Research Directions in KDD"
J. Gehrke
(available in PDF
and Postscript
formats)
Workshop Reports
VDM@ECML/PKDD2001:
The International Workshop on Visual Data Mining at ECML/PKDD 2001
S. J. Simoff
(available in PDF
and Postscript
formats)
ABSTRACT: This brief
report presents an overview of the International Workshop on Visual Data
Mining, conducted on 4 September 2001 in conjunction with the 12th European
Conference on Machine Learning (ECML'01) and the 5th European Conference
on Principles and Practice of Knowledge Discovery in Databases (PKDD'01).
It includes a summary of the presentations and discussions, and provides
pointers to relevant resources in the area.