## An Algorithm for Mining Association Rules with Weighted Minimum Supports

### Artificial Intelligence Applications and Innovations (2005-01-01): 187 , January 01, 2005

Most existing algorithms employ a uniform minimum support for mining association rules. Nevertheless, each item in a publication database, even each set of items, is exhibited in an individual period. A reasonable minimum support threshold has to be adjusted according to the exhibition period of each *k*-itemset. Accordingly, this paper proposes a new algorithm, called WMS, for mining association rules with weighted minimum supports in publication databases. WMS discovers all frequent itemsets which satisfy their individual requirement of minimum support thresholds. WMS applies the group closure property to prune futile itemsets, to reduce the number of candidates generated, and thus to generate the candidate sets efficiently.
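The effect of measuring support over an exhibition period, rather than over the whole database, can be illustrated with a minimal sketch. This is not the WMS algorithm itself: the function `period_support`, the `start_of` mapping, and the convention that an itemset's period begins when its most recently introduced item first appears are all assumptions made for illustration.

```python
def period_support(transactions, itemset, start_of):
    """Support of `itemset` counted only over its exhibition period.

    Assumption: the period starts at the latest first-appearance index
    among the items of the itemset and runs to the end of the database.
    """
    start = max(start_of[item] for item in itemset)
    window = transactions[start:]
    if not window:
        return 0.0
    hits = sum(1 for t in window if itemset <= t)
    return hits / len(window)


# Hypothetical publication database, ordered by time.
transactions = [
    {"a", "b"},
    {"a"},
    {"a", "b"},
    {"a", "c"},
    {"c"},
    {"a", "c"},
]
# Index of the first transaction in which each item was on exhibit.
start_of = {"a": 0, "b": 0, "c": 3}

# Uniform support dilutes item "c" with transactions that predate it.
uniform_c = sum(1 for t in transactions if {"c"} <= t) / len(transactions)
adjusted_c = period_support(transactions, {"c"}, start_of)
adjusted_ac = period_support(transactions, {"a", "c"}, start_of)
print(uniform_c, adjusted_c, adjusted_ac)
```

Under a uniform threshold, a late-arriving item such as `c` is penalized for transactions recorded before it existed; restricting the count to its period removes that bias.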

## Flock by Leader: A Novel Machine Learning Biologically Inspired Clustering Algorithm

### Advances in Swarm Intelligence (2012-01-01) 7332: 117-126 , January 01, 2012

In the April 2010 *Nature* research report, it was announced that biological physicists had only very recently discovered a leadership pattern in pigeon flocks. The most authoritative birds of the flock take the lead, and followers follow the leaders’ directions. Pigeon leaders’ roles vary over time. Following this unprecedented discovery made by zoologists at the University of Oxford and Eötvös University, we extend in this paper the flocking model widely used in computer science. We define a new biologically inspired clustering algorithm entitled “FlockbyLeader” that detects hierarchical leaders, discovers their followers, and enables them to flock based on local proximity in an artificial virtual space to create clusters. We offer empirical evidence that the algorithm outperforms both the existing flocking algorithm and the K-means algorithm. We analyze the performance of the algorithm based on datasets widely used in the literature.

## A Survey of Datamining Methods for Sensor Network Bug Diagnosis

### Managing and Mining Sensor Data (2013-01-01): 429-458 , January 01, 2013

This chapter surveys recent debugging tools for sensor networks that are inspired by data mining algorithms. These tools are motivated by the increased complexity and scale of sensor network applications, which make it harder to identify root causes of system problems. At a high level, debugging solutions in the domain of sensor networks can be classified according to their goal into two distinct categories: (i) solutions that attempt to localize errors to a single node, component, or code snippet, and (ii) solutions that attempt to identify a global pattern that causes misbehavior to occur. The first category inherits the usual wisdom that problems are often localized. It is unlikely for independent failures to coincide. Hence, while many different trouble symptoms may occur simultaneously, they typically arise from a single misbehaving component, such as a failed radio or a crashed node, that may in turn trigger a cascade of other problems. In contrast, the second category of solutions is motivated by interactive complexity problems. They seek to uncover bugs in networked sensing systems that arise due to unexpected interactions between components. The underlying assumption is that individual components are easier to test, which ensures that they work well in isolation. Therefore, practical software systems seldom fail due to a single poorly coded component. Rather, they fail due to an unexpected interaction pattern between *individually well-behaved* components.

## Rough Sets and their Applications

### Computational Intelligence in Theory and Practice (2001-01-01) 8: 73-91 , January 01, 2001

The paper discusses basic concepts of rough set theory. The starting point of the theory is the data table, which is used to define the rudiments of the theory: approximations, dependency and reduction of attributes, decision rules, and others. Various applications of the theory are outlined and future problems are pointed out.
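As a small illustration of the approximations mentioned above, the following sketch computes the standard lower and upper approximations of a set of objects from a data table. The table contents, attribute names, and the function name `approximations` are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target`.

    Objects with identical values on `attributes` are indiscernible and
    form one equivalence class.
    """
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical data table: objects described by two condition attributes.
table = {
    "p1": {"headache": "yes", "temp": "high"},
    "p2": {"headache": "yes", "temp": "high"},
    "p3": {"headache": "no", "temp": "normal"},
    "p4": {"headache": "no", "temp": "high"},
}
flu = {"p1", "p4"}  # the concept to approximate

lower, upper = approximations(table, ["headache", "temp"], flu)
print(sorted(lower), sorted(upper))
```

Here `p1` and `p2` are indiscernible but only `p1` belongs to the concept, so both fall in the boundary region (upper minus lower approximation).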

## Clustering Text Data Streams

### Journal of Computer Science and Technology (2008-01-01) 23: 112-128 , January 01, 2008

Clustering text data streams is an important issue in the data mining community and has a number of applications, such as newsgroup filtering, text crawling, document organization, and topic detection and tracing. However, most methods are similarity-based approaches that use only the TF\*IDF scheme to represent the semantics of text data, which often leads to poor clustering quality. Recently, researchers have argued that the semantic smoothing model is more efficient than the existing TF\*IDF scheme for improving text clustering quality. However, the existing semantic smoothing model is not suitable for a dynamic text data context. In this paper, we first extend the semantic smoothing model to the text data stream context. Based on the extended model, we then present two online clustering algorithms, OCTS and OCTSM, for clustering massive text data streams. In both algorithms, we also present a new cluster statistics structure, named the cluster profile, which can capture the semantics of text data streams dynamically and at the same time speed up the clustering process. Some efficient implementations of our algorithms are also given. Finally, we present a series of experimental results illustrating the effectiveness of our technique.
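For reference, the baseline weighting scheme that the abstract contrasts with semantic smoothing can be sketched as follows. This uses one common variant (relative term frequency times the natural log of inverse document frequency); the abstract does not say which variant the authors use, and the function name `tf_idf` and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical tokenized documents.
docs = [
    ["stream", "cluster", "text"],
    ["stream", "news"],
    ["stream", "cluster"],
]
w = tf_idf(docs)
print(w[1])
```

A term that occurs in every document, such as `stream` here, receives weight zero, which shows how the scheme depends purely on surface term statistics.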

## Clustering in Dynamic Spatial Databases

### Journal of Intelligent Information Systems (2005-01-01) 24: 5-27 , January 01, 2005

Efficient clustering in dynamic spatial databases is currently an open problem with many potential applications. Most traditional spatial clustering algorithms are inadequate because they lack efficient support for incremental clustering. In this paper, we propose *DClust*, a novel clustering technique for dynamic spatial databases. *DClust* is able to provide a multi-resolution view of the clusters, generate arbitrarily shaped clusters in the presence of noise, generate clusters that are insensitive to the ordering of input data, and support incremental clustering efficiently. *DClust* utilizes the density criterion that captures arbitrary cluster shapes and sizes to select a number of representative points, and builds the Minimum Spanning Tree (MST) of these representative points, called R-MST. After the initial clustering, a summary of the cluster structure is built. This summary enables quick localization of the effect of data updates on the current set of clusters. Our experimental results show that *DClust* outperforms existing spatial clustering methods such as DBSCAN, C2P, DENCLUE, Incremental DBSCAN and BIRCH in terms of clustering time and accuracy of clusters found.
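The general idea of deriving clusters from a minimum spanning tree can be sketched as below. This is not *DClust*: it omits density-based selection of representative points, the cluster summary, and incremental updates, and simply builds an MST with Kruskal's algorithm and removes edges longer than a threshold. The function name `mst_clusters` and the `cut` parameter are assumptions.

```python
import math
from itertools import combinations


def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def mst_clusters(points, cut):
    """Cluster points by cutting MST edges longer than `cut`."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    # Kruskal: keep an edge only if it joins two separate components.
    parent = list(range(n))
    mst = []
    for d, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:
            parent[ri] = rj
            mst.append((d, i, j))
    # Re-link the points using only the short MST edges.
    parent = list(range(n))
    for d, i, j in mst:
        if d <= cut:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())


# Hypothetical representative points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
clusters = mst_clusters(points, cut=2.0)
print(clusters)
```

Because clusters follow chains of short edges, this style of method can recover non-convex shapes, which is the property the abstract attributes to the R-MST.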

## The Research of Data Mining Technology of Privacy Preserving in Sharing Platform of Internet of Things

### Internet of Things (2012-01-01) 312: 481-485 , January 01, 2012

The development of the Internet of Things has set off the third wave of the world information industry, after the invention and use of the computer and the Internet. Data mining technology plays a vital role in the development and promotion of the Internet of Things, but it also causes leakage of private information. Drawing on association rule mining and the randomized response method, we propose a new method, the suppressible randomized response method (SRRM), and introduce a privacy-preserving data mining algorithm based on SRR. Finally, this paper evaluates the privacy of the method.
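The abstract does not describe how SRRM works, so the sketch below shows only the classical randomized response technique (Warner's model) on which such methods build: each respondent reports the true binary value with probability `p` and the flipped value otherwise, and the true proportion is recovered from the observed one. The function names and the simulated data are assumptions.

```python
import random


def randomize(value, p, rng):
    """Report the true bit with probability p, otherwise its complement."""
    return value if rng.random() < p else 1 - value


def estimate_true_rate(observed_rate, p):
    """Invert observed = p * pi + (1 - p) * (1 - pi) for pi (needs p != 0.5)."""
    return (observed_rate - (1 - p)) / (2 * p - 1)


rng = random.Random(0)
p = 0.8
# Hypothetical sensitive attribute held by 30% of 20000 respondents.
true_bits = [1] * 6000 + [0] * 14000
reported = [randomize(b, p, rng) for b in true_bits]
observed = sum(reported) / len(reported)
estimate = estimate_true_rate(observed, p)
print(round(estimate, 3))
```

No individual report reveals the respondent's true value with certainty, yet aggregate statistics such as item supports for association rule mining remain estimable.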

## Logic of Association Rules

### Applied Intelligence (2005-01-01) 22: 9-28 , January 01, 2005

Association rules corresponding to a general relation of two Boolean attributes are introduced. Association rules based on statistical hypothesis tests are also included. Several classes of association rules are defined, e.g., classes of implicational and of equivalence rules. Special logical calculi whose formulae correspond to association rules are defined and studied. Practically important deduction rules of these calculi are introduced. It is shown that the question of whether a given association rule logically follows from another given association rule can be converted into the question of whether suitable formulae of propositional calculus are tautologies. Several further theoretical results and research directions are mentioned.
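A relation between two Boolean attributes is commonly summarized by a four-fold contingency table of frequencies, from which an implicational measure such as confidence can be computed. The abstract does not spell out this construction, so the sketch below is an assumption about a typical setup; the function names and the sample rows are hypothetical.

```python
def fourfold(rows, antecedent, succedent):
    """Count (a, b, c, d) for two Boolean attributes.

    a: both true, b: antecedent only, c: succedent only, d: neither.
    """
    a = b = c = d = 0
    for row in rows:
        x, y = bool(row[antecedent]), bool(row[succedent])
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def confidence(a, b):
    """Share of rows satisfying the antecedent that also satisfy the succedent."""
    return a / (a + b) if a + b else 0.0


# Hypothetical data matrix with two Boolean attributes.
rows = [
    {"smoker": 1, "cough": 1},
    {"smoker": 1, "cough": 1},
    {"smoker": 1, "cough": 0},
    {"smoker": 0, "cough": 1},
    {"smoker": 0, "cough": 0},
]
a, b, c, d = fourfold(rows, "smoker", "cough")
print((a, b, c, d), confidence(a, b))
```

An implicational rule depends only on `a` and `b`, whereas an equivalence-style rule would also take `c` and `d` into account.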

## Combining Bagging, Boosting and Dagging for Classification Problems

### Knowledge-Based Intelligent Information and Engineering Systems (2007-01-01): 4693 , January 01, 2007

Bagging, boosting and dagging are well known re-sampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging and dagging on noise-free data. However, there are strong empirical indications that bagging and dagging are much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging, boosting and dagging ensembles with 8 sub-classifiers in each one. We performed a comparison with simple bagging, boosting and dagging ensembles with 25 sub-classifiers, as well as other well known combining methods, on standard benchmark datasets and the proposed technique had better accuracy in most cases.
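The voting step can be sketched as a plain plurality vote over the labels predicted by the three ensembles. The abstract does not specify the exact voting rule or the base learner, so the function `majority_vote` and the hard-coded prediction lists are assumptions standing in for trained bagging, boosting and dagging models.

```python
from collections import Counter


def majority_vote(*prediction_lists):
    """Combine per-sample label predictions by plurality vote."""
    combined = []
    for votes in zip(*prediction_lists):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions of each ensemble on four test samples.
bagging = [1, 0, 1, 1]
boosting = [1, 1, 0, 1]
dagging = [0, 0, 1, 1]

final = majority_vote(bagging, boosting, dagging)
print(final)
```

With three voters and binary labels no ties can occur; a setup with more classes or an even number of voters would need an explicit tie-breaking rule.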

## Bi-Decomposition of Function Sets in Multiple-Valued Logic for Circuit Design and Data Mining

### Artificial Intelligence Review (2003-12-01) 20: 233-267 , December 01, 2003

This article presents a theory for the bi-decomposition of functions in multi-valued logic (MVL). MVL functions are applied in logic design of multi-valued circuits and machine learning applications. Bi-decomposition is a method to decompose a function into two decomposition functions that are connected by a two-input operator called gate. Each of the decomposition functions depends on fewer variables than the original function. Recursive bi-decomposition represents a function as a structure of interconnected gates. For logic synthesis, the type of the gate can be chosen so that it has an efficient hardware representation. For machine learning, gates are selected to represent simple and understandable classification rules.

Algorithms are presented for non-disjoint bi-decomposition, where the decomposition functions may share variables with each other. Bi-decomposition is discussed for the min- and max-operators. To describe the MVL bi-decomposition theory, the notion of incompletely specified functions is generalized to function intervals. The application of MVL differential calculus leads to particularly efficient algorithms. To ensure complete recursive decomposition, separation is introduced as a new concept to simplify non-decomposable functions. Multi-decomposition is presented as an example of separation.
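What a non-disjoint max-bi-decomposition asserts can be checked by brute force on a small example. The sketch below only verifies a candidate decomposition by exhaustive enumeration; it does not implement the decomposition algorithms of the article or YADE. The three-valued domain, the example functions, and the name `is_max_decomposition` are assumptions.

```python
from itertools import product


def is_max_decomposition(f, g, h, levels=3):
    """Check f(a, b, c) == max(g(a, b), h(b, c)) on all inputs.

    g and h share the variable b, so the decomposition is non-disjoint.
    """
    return all(
        f(a, b, c) == max(g(a, b), h(b, c))
        for a, b, c in product(range(levels), repeat=3)
    )


def f(a, b, c):
    """Hypothetical three-valued function of three variables."""
    return max(min(a, b), min(b, c))


def g(a, b):
    return min(a, b)


def h(b, c):
    return min(b, c)


def g_wrong(a, b):
    return a


ok = is_max_decomposition(f, g, h)
bad = is_max_decomposition(f, g_wrong, h)
print(ok, bad)
```

Each decomposition function depends on two variables instead of three, which is the reduction that recursive bi-decomposition repeats until only simple gates remain.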

The decomposition algorithms are implemented in a decomposition system called YADE. MVL test functions from logic synthesis and machine learning applications are decomposed. The results are compared to other decomposers. It is verified that YADE finds decompositions of superior quality by bi-decomposition of MVL function sets.