Intrusion detection a data mining approach nandita. A rough set approach to mining incomplete data is presented in this paper. An overview of useful business applications is provided. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory, and logical analysis of data lad.
For the purposes of analysis and decision support in the business area in many cases data mining using rough set theory is used. Granular computing is an emerging computing paradigm of information processing and an approach for knowledge representation and data mining. Recently, the rough set and fuzzy set theory have generated a great deal of interest among more and more researchers. Rough sets theory is a new mathematical approach used in the intelligent data analysis and data mining if data is uncertain or incomplete. Rough set theory 7 is a new mathematical approach to data analysis and data mining. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical analysis of data. Chapter 2 presents the data mining process in more detail.
Hongmei chen, tianrui li, ieee senior member, chuan luo, shijinn horng, ieee member. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a. Pdf a decisiontheoretic rough set approach for dynamic. A rough set approach for generation and validation of rules. Download data mining tutorial pdf version previous page print page. A rough set approach to attribute generalization in data mining. Data mining technology has emerged as a means for identifying patterns and trends from large quantities o. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are represented. Fuzzyrough data mining with weka aberystwyth university. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach in the context of data analysis and modelbuilding applications. Granular computing is an emerging computing paradigm of information processing.
The customer related data are categorical in nature. The rough set theory, which originated in the early 1980s, provides an alternative approach to the fuzzy set theory, when dealing with uncertainty, vagueness or inconsistence often. The approach is based on the combination of generalization distribution table gdt. Various topics of data mining techniques are identified and described throughout. In this thesis, a rulebased rough set decision system is.
The rough set theory offers a viable approach for decision rule extraction from data. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are. Finally, some description about applications of the data mining system with. Also, this method locates the clusters by clustering the density. Puts forward a xml mining model based on rough set theory. Theory and application on rough set, fuzzy logic, and. Rough set approach to the analysis of the structureactivity relationship of quaternary imidazolium compounds. In its abstract form, it is a new area of uncertainty mathematics closely related to fuzzy theory. Through indepth study on the existing rough set and data mining technologies, for the shortcomings of the existing data mining algorithms based on rough set, this paper presents an improved algorithm. Comparative analysis between rough set theory and data. New directions in rough sets, data mining, and granularsoft computing, lnai. The rough set approach 7 to data analysis has many important advantages that provide efficient algorithms for finding hidden patterns in data, finds minimal sets of data, evaluates significance of data, generates. A rough set approach for generation and validation of rules for missing attribute values of a data set. It is a new mathematical tool to deal with partial information.
It also provides a powerful way to calculate the importance degree of vague and uncertain big data to help in decision making. The chapters in this work cover a range of topics that focus on discovering dependencies among data, and reasoning about vague, uncertain and imprecise information. Chapter 1 gives an overview of data mining, and provides a description of the data mining process. Parallel computing of approximations in dominancebased rough sets approach, knowledgebased systems, 87. The data mining technology instead of classic statistical analysis is developed to help the people to discover the knowledge inside of the data. In this perspective, granular computing has a position of centrality in data mining. Analysis of imprecise data is an edited collection of research chapters on the most recent developments in rough set theory and data mining. Seminal description of datamining approaches with reference. Risk analysis technique on inconsistent interview big data. The approach is based on the combination of generalization distribution table gdt and the rough set methodology.
Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In recent years we witnessed a rapid grow of interest in rough set theory and its application, world wide. Rough set theory, data mining, decision table, decision rule, data representation. A convenient way to present equivalence relations is through partitions. Decision rule induction for service sector using data. A characteristic set, a generalization of the elementary set wellknown in rough set. For the rough set theory, in the process of data mining, there are still a large number of problems need to be discussed, such as large data sets, efficient reduction algorithm, parallel computing, hybrid algorithm, etc. It demonstrates this process with a typical set of data. Chapter 2 rough sets and reasoning from data presents the application of rough set concept to reason from data data mining.
And combining with probability logic, random truth. This overview provides a description of some of the most common data mining algorithms in use today. A rough set approach for the discovery of classification rules in. Approximation can further be applied to data mining related task, e. In this data mining clustering method, a model is hypothesized for each cluster to find the best fit of data for a given model. Visualization of data through data mining software is addressed. A partition of u is a family of mutually disjoint nonempty subsets of u, called blocks, such that the union of all blocks is u. Risk assessment is very important for safe and reliable investment. Book description practical applications of data mining emphasizes both theory and applications of data mining algorithms. A rough set knowledge discovery framework is formulated for the analysis of.
Rule induction from a decision table using rough sets theory. Introduction modern organizations use several types of decision support systems to facilitate decision support. A crucial concept in the rough set approach to machine learning is that of. Summarization providing a more compact representation of the data set, including visualization and report. Additionally, the rough set approach to lower and upper approximations and certain possible rule sets concepts are introduced. Rough set theory fundamental concepts, principals, data. Rough set approach, fuzzy set approachs, prediction, linear and multipleregression. A rough set based method for updating decision rules on attribute values coarsening and refining, ieee transactions on knowledge and data engineering, 2612. Data mining is a process of discovering various models, summaries, and derived values from a given collection of data. Due to the xml document is a kind of semistructured data, using the traditional data mining methods for mining of xml data is not applicable.
It also provides a powerful way to calculate the importance degree of vague and uncertain big data to. The rough set approach 7 to data analysis has many important advantages that. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles. The authors have broken the discussion into two sections, each with a specific theme. The dominancebased rough set approach drsa is an extension of rough set theory for multicriteria decision analysis mcda, introduced by greco, matarazzo and slowinski. Data mining is an emerging powerful tool for analysis and prediction.
A rough set approach to data mining this paper reports our experiences with the application of the hierarchy of probabilistic decision tables to face recognition. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering. Comparative analysis between rough set theory and data mining. Rule generation from raw data is a very effective and most widely used tool of data mining. Rough set can be used as a tool to generate rules form decision table in data mining. Mining incomplete dataa rough set approach springerlink. Relationships exist between rough set theory and dempstershafers theory of evidence. However, the clustering algorithms for categorical data are few and are unable to handle uncertainty. Rough set theory was proposed by pawlak 17 in 1982, which o ers a mathematical approach to data analysis and data mining 11,1821. Rough set theory and zadehs fuzzy set theory are two independent approaches to deal with uncertainty. A rough set is a formal approximation of a crisp set in terms of a pair of sets that give the lower and upper approximation of the original set learn more in.
Discriminant versus rough set approach to vague data analysis. After 15 year of pursuing rough set theory and its application the theory has reached a certain degree of maturity. Clustering in data mining algorithms of cluster analysis. A characteristic set, a generalization of the elementary set wellknown in rough set theory, may be computed using such blocks. This paper introduces a new approach for mining ifthen rules in databases with uncertainty and incompleteness. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Performance analysis and prediction in educational data. Pdf a decisiontheoretic rough set approach for dynamic data. Inhibitory rules in data analysis a rough set approach. This section presents the concepts of rough set theory.
Pawlak as a mathematical ap proach to deal with vagueness and uncertainty in data analysis 50, 51. A survey on rough set theory and its applications sciencedirect. Rough set theory rst is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database. The general experimental procedure adapted to data mining problems involves the following steps. Rough set theory provides a simple and elegant method for analyzing data. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. A decisiontheoretic rough set approach for dynamic data mining. Real life data are frequently imperfect, erroneous, incomplete, uncertain and vague. The monograph can be used under the creation of courses for graduate students and for ph. We can use rough set approach to discover structural relationship within imprecise and noisy data. Sets, fuzzy sets and rough sets our digital library. Another methodology which has high relevance to data mining and plays a central role in this volume is that of. There are so many approaches for handling missing attribute values. Finally, some description about applications of the data mining system with rough set theory is included.
On rough set based approaches to induction of decision. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical. Dominancebased rough set approach for group decisions, european journal of operational research, 2511. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering general computer security to server, network, and cloud security. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test.
Relationships exist between rough set theory and dempstershafers theory of. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases. The chapter is focused on the data mining aspect of the applications of rough set theory. Based on rough sets 4 and the concept of lower and upper boundary sets 5, we introduce a method for updating approximations by considering adding and. It is useful for dealing with indiscernibility of objects caused by incomplete or limited information. It is useful for dealing with indiscernibility of objects caused by. Data mining, decision tables, rough set, rule extraction. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach. And combining with probability logic, random truth degree of rough logic can be studied in the future. For the rough set theory, in the process of data mining, there are still a large number. Based on the rough set theory, the rough logic and its deduction theory system can be established. Chapter 1 basic concepts contains general formulation of basic ideas of rough set theory together with brief discussion of its place in classical set theory.
Rough set theory was introduced by zdzislaw pawlak in 1982. The concept of rough, or approximation, set s was introduced by pawlak, and is based on the single assumption that information is associated with. Pdf this article comments on data mining and rough set theory, regarding the article myths about rough set. Rough set theory is relativly new to area of soft computing to handle the uncertain big data efficiently. A rough set approach for generation and validation of. Mining incomplete dataa rough set approach jerzy w. Data mining, data tables, distributed data mining ddm. Research on data mining algorithm based on rough set. Data mining and knowledge discovery in real life applications 36 outset, rough set theory has been a methodology of database mining or knowledge discovery in relational databases.
597 47 33 1403 127 694 111 449 826 125 1048 569 305 259 1427 1003 207 494 25 845 1253 641 323 328 76 616 433 997 1337 1200 966 1307 1080 731 830 1166 1040 181 99 712 195 673 1256 1102 401 1242