IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,                    VOL. 22,     NO. 9,    SEPTEMBER 2010             ...
1300                                    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,        VOL. 22,   NO. 9,   SE...
CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY                                                        ...
1302                                         IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,          VOL. 22,   NO. ...
ts ðÞ þ  bo ðÞ þ bs ðÞ;                    Besides the measurement, knowledge actionability also                          ...
,  and  are weights respectively.                                                                           actionable pat...
ts ðÞÞ ^ Oð bo ðÞÞ ^ Oðbs ðÞÞ:         ! Oðto          ^                                                                  ...
ts ðpÞÞ ^ Oðbo ðpÞÞ ^ Oðbs ðpÞÞ          ð7Þ           conditions; and                   ! tact ^ tact ^ bact ^ bact ! tac...
CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY                                                        ...
1304                                      IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,                  VOL. 22,  ...
CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY                                                        ...
1306                                      IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,          VOL. 22,   NO. 9, ...
CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY                                                        ...
1308                                    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,            VOL. 22,   NO. 9, ...
Upcoming SlideShare
Loading in …5

Actionable Knowledge Discovery


Published on

Actionable Knowledge Discovery

Published in: Education, Technology
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Actionable Knowledge Discovery

  1. 1. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010 1299 Flexible Frameworks for Actionable Knowledge Discovery Longbing Cao, Senior Member, IEEE, Yanchang Zhao, Member, IEEE, Huaifeng Zhang, Member, IEEE, Dan Luo, Chengqi Zhang, Senior Member, IEEE, and E.K. Park Abstract—Most data mining algorithms and tools stop at the mining and delivery of patterns satisfying expected technical interestingness. There are often many patterns mined but business people either are not interested in them or do not know what follow- up actions to take to support their business decisions. This issue has seriously affected the widespread employment of advanced data mining techniques in greatly promoting enterprise operational quality and productivity. In this paper, we present a formal view of actionable knowledge discovery (AKD) from the system and decision-making perspectives. AKD is a closed optimization problem- solving process from problem definition, framework/model design to actionable pattern discovery, and is designed to deliver operable business rules that can be seamlessly associated or integrated with business processes and systems. To support such processes, we correspondingly propose, formalize, and illustrate four types of generic AKD frameworks: Postanalysis-based AKD, Unified- Interestingness-based AKD, Combined-Mining-based AKD, and Multisource Combined-Mining-based AKD (MSCM-AKD). A real-life case study of MSCM-based AKD is demonstrated to extract debt prevention patterns from social security data. Substantial experiments show that the proposed frameworks are sufficiently general, flexible, and practical to tackle many complex problems and applications by extracting actionable deliverables for instant decision making. Index Terms—Data mining, domain-driven data mining (D3 M), actionable knowledge discovery, decision making. Ç1 INTRODUCTION The above issues inform us that there is a large gap [22],I N general, data mining (or KDD) algorithms and tools only focus on the discovery of patterns satisfying expectedtechnical significance. The identified patterns are then [15], [14], [12] between academic deliverables and business expectations, as well as between data miners and businesshanded over to business people for further employment. analysts. Therefore, it is critical to develop effective methodologies and techniques to narrow down and bridgeSurveys of data mining for business applications following the gap. Clearly, there is a need to develop general,the above paradigm in various domains [10] have shown that effective, and practical methodologies for actionable knowl-business people cannot effectively take over and interpret the edge discovery (AKD).identified patterns for business use. This may result from One essential way is to develop effective approaches forseveral aspects of challenges besides the dynamic environ- discovering patterns that not only are of technical significancement enclosing constraints [4]. 1) There are often many [35], but also satisfy business expectations [14], and furtherpatterns mined but they are not informative and transparent indicate the possible actions that can be explicitly taken byto business people who do not know which are truly business people [1], [6]. Therefore, we need to discoverinteresting and operable for their businesses. 2) A large actionable knowledge that is much more than simply satisfying predefined technical interestingness thresholds. Such action-proportion of the identified patterns may be either common- able knowledge is expected to be delivered in operable formssense or of no particular interest to business needs. Business for transparent business interpretation and action taking.people feel confused by why and how they should care about It has been increasingly recognized that traditional datathose findings. 3) Further, business people often do not mining is facing crucial problems in satisfying userknow, and are also not informed, how to interpret them and preferences and business needs. For example, researchwhat straightforward actions can be taken on them to support work has been reported on developing actionable interest-business decision-making and operation. ingness [14], [1] and subjective interestingness such as profit mining [37] to extract more interesting patterns, and on enhancing the interpretation of findings through explanation [39]. However, the nature of the existing work. L. Cao, Y. Zhao, H. Zhang, D. Luo, and C. Zhang are with the Faculty of Engineering and Information Technology, University of Technology, on actionable interestingness development is mainly Sydney, PO Box 123 Broadway, New South Wales 2007, Australia. technical-significance-oriented, e.g., by developing alterna- E-mail: {lbcao, yczhao, hfzhang, dluo, chengqi} tive and subjective metrics. The critical problem to a great. E.K. Park is with the Department of Computer Science Electrical extent comes from the oversimplification of complex Engineering, University of Missouri at Kansas City, 5100 Rockhill Road, domain factors surrounding business problems, the uni- RFH RM 560B, Kansas City, MO 64110. E-mail: versal focus on algorithm innovation and improvement,Manuscript received 14 July 2008; revised 5 Feb. 2009; accepted 20 May 2009; and the little attention taken of enhancing KDD systempublished online 4 June 2009. infrastructure to tackle organizational and social complex-Recommended for acceptance by C. Ling.For information on obtaining reprints of this article, please send e-mail to: ities in real-world, and reference IEEECS Log Number TKDE-2008-07-0357. Fundamental work on AKD is therefore necessary toDigital Object Identifier no. 10.1109/TKDE.2009.143. cater for critical elements in real-world applications such as 1041-4347/10/$26.00 ß 2010 IEEE Published by the IEEE Computer Society
  2. 2. 1300 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010environment, expert knowledge, and operability. This is TABLE 1related to, but much beyond, algorithm innovation and Key Conceptsperformance improvement. To this end, AKD must cater fordomain knowledge [40] and environmental factors, balancetechnical significance and business expectations from bothobjective and subjective perspectives [14], and supportautomatically converting patterns into deliverables in busi-ness-friendly and operable forms such as actions or rules. Itis expected that the AKD deliverables will be business-friendly enough for business people to interpret, validate,and action, and that they can be seamlessly embedded intobusiness processes and systems. If that is the case, datamining has good potential to lead to productivity gain,smarter operation, and decision making in business intelli-gence. Such efforts actually aim at the KDD paradigm shiftfrom traditionally technical interestingness-oriented and data-centered hidden pattern mining toward business-use-orientedand domain-driven actionable knowledge discovery [12]. 3. CM-AKD: A multistep pattern mining on the data set Relevant preliminary work on AKD mainly addresses in terms of a certain combination strategy. Thespecific algorithms and tools for the filtration, summariza- mined patterns in a step may be fed into anothertion, and postprocessing [42] of learned rules. There is a need mining procedure to guide its feature constructionto develop general AKD frameworks that can cater for and corresponding pattern mining. Individual pat-critical elements in the real world and can also be instantiated terns identified from each step are then merged intointo various approaches for different domain problems. To final deliverables based on merger strategy, domainthe best of our knowledge, very limited research work has knowledge, and/or business needs.been reported in this regard. 4. MSCM-AKD: Handles AKD in either multiple data This paper features the definition and development of sources or large quantities of data. One of the dataseveral general AKD frameworks from the system view- sets is selected for mining initial patterns. Somepoint, which follow the methodology of Domain-Driven learned patterns are then selected to guide featureData Mining (DDDM, or D3 M for short) [12], [14], [15], [6], construction and pattern mining on the next data[10]. Our focus is on introducing their concepts, principles, set(s). The iterative mining stops when all data setsand processes that are new, effective to AKD, flexible, and are mined, and the corresponding patterns are then merged/summarized into actionable deliverables.practical. Such frameworks are necessary and useful forimplementing real-world data mining processes and sys- The above frameworks have been tested in severaltems, but are often ignored in the current KDD research. domains, such as on social security transactional data [13], [43], [18] and exchange orderbook data [9], [8]. Substantial The main contributions of this work are: experiments show that these frameworks are effective and 1. stating the AKD problem from system and micro- flexible for extracting actionable knowledge in complex economy perspectives to define fundamental con- real-world situations, and assist data mining practitioners cepts of actionability and actionable patterns, with catering for their business requirements, needs, and decision-making actions on the findings and deliverables in 2. defining knowledge actionability by highlighting the business environment. both technical significance and business expectations that need to be considered, balanced, and/or aggregated in AKD, 2 RELATED WORK 3. proposing four general frameworks to facilitate Actionable knowledge discovery is critical in promoting AKD, and and releasing the productivity of data mining and knowl- 4. demonstrating the effectiveness and flexibility of the edge discovery for smart business operations and decision proposed frameworks in tackling real-life AKD. making. Both SIGKDD and ICDM panelists pointed it out as The main ideas of D3 M-based AKD and the four one of the great challenges in developing the next-frameworks are as follows: Table 1 lists key concepts and generation KDD methodologies and systems [3], [19]. Intheir abbreviations used in this paper. recent years, some relevant work has been emerging. The term “actionability” measures the ability of a pattern to 1. PA-AKD: A two-step AKD process. First, general suggest a user to take some concrete actions to his/her advantage patterns are mined based on technical significance; in the real world. It mainly measures the ability to suggest the learned patterns are then filtered and summar- business decision-making actions. Existing efforts in the ized in terms of business expectations, and further development of effective interestingness metrics are basi- are converted into opertionalizable business rules for cally on developing and refining objective technical interest- business people’s use. ingness metrics (to ðÞ) [21], [25]. They aim to capture the 2. UI-AKD: AKD develops unified interestingness that complexities of pattern structure and statistical significance. aggregates and balances both technical significance Other work appreciates subjective technical measures (ts ðÞ) and business expectation. The mined patterns are [29], [31], [34], which also recognize to what extent a pattern is further converted into deliverables based on domain of interest to particular user preferences. For example, prob- knowledge and semantics. ability-based belief is used to describe user confidence of
  3. 3. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1301unexpected rules [31]. There is very limited research on Let DB be a database related to business problems É,developing business-oriented interestingness, for instance, X ¼ fx1 ; x2 ; . . . ; xL g be the set of items in DB, where xlprofit mining [37]. (l ¼ 1; . . . ; L) be an item set, and the number of attributes in The main limitations for the existing work on interest- DB be S. Suppose E ¼ fe1 ; e2 ; . . . ; eK g denotes the environ-ingness development lie in a number of aspects. Most work ment set, where ek represents a particular environmentis on developing alternative interest measures focusing on setting for AKD. Further, let M ¼ fm1 ; m2 ; . . . ; mN g be thetechnical interestingness only [30]. Emerging research ongeneral business-oriented interestingness is isolated from data mining method set, where mn (n ¼ 1; . . . ; N) is atechnical significance. A question to be asked is “what method. For method mn , suppose its identified pattern setmakes interesting patterns actionable in the real world?” P mn ¼ fpmn ; pmn ; . . . ; pmn g includes all patterns discovered in 1 2 UFor that, knowledge actionability needs to pay equal DB, where pmn (u ¼ 1; . . . ; U) is a pattern discovered by mn . uattention to both technical and business-oriented interesting- In the real world, data mining is a problem-solving processness from both objective and subjective perspectives [14]. (R) from business problems É (with problem status ) to With regard to AKD approaches, the existing work problem-solving solutions È:mainly focuses on developing postanalysis techniques tofilter/prune rules [27], reduce redundancy [26], and R : Éð1 Þ ! Èð2 Þ: ð1Þsummarize learned rules [27], as well as on matchingagainst expected patterns by similarity/difference [28]. In From the modeling perspective, such an AKD-basedpost analysis, a recent highlight is to extract actions from problem-solving process is a state transformation from thelearned rules [38]. A typical effort on learning action rules is source data DBðÉ ! DB) to the resulting pattern setto split attributes into “hard/soft” [38] or “stable/flexible” P ðÈ ! P ).[36] to extract actions that may improve the loyalty orprofitability of customers. Other work is on action hierarchy Definition 1 (Actionable patterns). Let[1]. Some other approaches include a combination of two ormore methods, for instance, class association rules (or e P mn ¼ fp1 mn ; p2 mn ; . . . ; pU mn g ~ ~ ~associative classifier) that build classifiers on association rules be an Actionable Pattern Set mined by method mn for the(A ! C) [23]. In [23], external databases are input intocharacterizing the item sets. In [32], clustering is used to given problem É (its data DB), in which pu mn is actionable ~reduce the number of learned association rules. Some other for the problem solving if it satisfies the following conditions:work is on the transformation from data mining to knowl- a. ti ðpu Þ ! ti;0 ; “! ” indicates the pattern pu can beat ~ ~edge discovery [20], and developing a general KDD technical interestingness ti with threshold ti;0 ;framework to fit more factors into the KDD process [39]. Regarding the existing work, we have the following b. bi ðpu Þ ! bi;0 ; “! ” indicates the pattern pu can beat ~ ~comments: First, existing work often stops at pattern business interestingness bi with threshold bi;0 ;discovery mainly based on technical significance and c.interestingness. As a result, the summarized “actions” do Aðpu mn Þ ~not reflect the genuine expectations of business needs, and R : 1 À! 2 ;therefore, cannot support decision making. Second, most ofthe existing postanalysis and postmining focuses on the pattern can support business problem solving byassociation rules or their combination with some specificmethods. This limits the actionability of learned actions and taking action A, and correspondingly, transform thethe generalization of proposed approaches for AKD. problem status from the initially nonoptimal status 1 To tackle the challenges in real-world KDD and bridge to the greatly improved 2 .the gap, it is necessary to take a critical view of KDD, suchas from microeconomic [24] and system perspectives, and Definition 2 (Actionable knowledge discovery). AKD is andevelop workable methodologies and frameworks to sup- iterative optimization process toward the actionable pattern setport AKD. To this end, D3 M [15], [12] is proposed to e P , considering surrounding business environment andinvolve ubiquitous intelligence in the AKD process toward problem states.the delivery of operable business rules. With the D3 M, thispaper proposes four types of general frameworks that can e;;mn AKDe;;m2M : DB À P mn !be customized to extract actionable deliverables satisfyingboth technical significance and business expectations. À Oe;;m2M IntðpÞ ! p2P ð2ÞAdditionally, rather than addressing the whole KDD !e À P;process as it did in some of related works, this paper onlyfocuses on method/algorithm frameworks toward AKD. where P ¼ P m1 UP m2 ; . . . ; UP mn , Intð:Þ is the interestingness evaluation function, and Oð:Þ is the optimization function to3 A SYSTEM VIEW OF ACTIONABLE KNOWLEDGE ~ e extract those p 2 P when Intð~Þ can beat a given benchmark. p DISCOVERY For a pattern p, IntðpÞ can be further measured in termsReal-world data mining is a complex problem-solvingsystem. From the view of systems and microeconomy, the of technical interestingness (ti ðpÞ) and business interestingnessendogenous character of AKD determines that it is an (bi ðpÞ) [14].optimization problem with certain objectives under a IntðpÞ ¼ Iðti ðpÞ; bi ðpÞÞ; ð3Þparticular environment.
  4. 4. 1302 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010where Ið:Þ “aggregates” the contributions of all particular combine all types of interestingness metrics to generateaspects of interestingness. Further, IntðpÞ can be described uniform, balanced, and interpretable mechanisms forin terms of objective ðoÞ and subjective ðsÞ factors from both measuring knowledge deliverability and extracting andtechnical ðtÞ and business ðbÞ perspectives. selecting resulting patterns. A reasonable way is to balance both sides toward an acceptable tradeoff. To this end, we IntðpÞ ¼ Iðto ðpÞ; ts ðpÞ; bo ðpÞ; bs ðpÞÞ need to develop interestingness aggregation methods, ð4Þ ! to ðx; pÞ ^ ts ðx; pÞ ^ bo ðx; pÞ ^ bs ðx; pÞ; namely the I-function (or “^”) to aggregate all elements of interestingness. In fact, each of the interestingness cate-where to ðÞ is objective technical interestingness, ts ðÞ is gories may be instantiated into more than one metric. Theirsubjective technical interestingness, bo ðÞ is objective busi- “aggregation” does not mean the essential combination intoness interestingness, and bs ðÞ is subjective business inter- a single supermeasure, rather indicating the satisfaction ofestingness, and I ! ‘ ^0 indicates the “aggregation.” all respective components during the AKD process if In general, to ðÞ, ts ðÞ, bo ðÞ, and bs ðÞ of practical applications possible. They could be checked at the same time or duringcan be regarded as independent of each other. With their the AKD processes. There could be several methods ofnormalization (expressed by ^), we can get: doing the aggregation, for instance, empirical methods such ^^ ^ ^ ^ as business-expert-based voting, or more quantitative IntðpÞ ! Iðto ðÞ; ts ðÞ; bo ðÞ; bs ðÞÞ methods such as multiobjective optimization methods. ð5Þ ^ ^ ^ ^ ¼ to ðÞ þ
  5. 5. ts ðÞ þ bo ðÞ þ bs ðÞ; Besides the measurement, knowledge actionability also needs to cater for the semantic aspect of the identifiedwhere ,
  6. 6. , and are weights respectively. actionable patterns. This is particularly important in So, the AKD optimization problem is as follows: deploying the patterns. Briefly speaking, the conversion AKDe;;m2M À Op2P ðIntðpÞÞ ! from an identified pattern to a business rule can follow a ð6Þ BusinessRule Model [12] defined in OWL-S [5]. To describe ^ ^ ^ ðÞÞ ^ Oð
  7. 7. ts ðÞÞ ^ Oð bo ðÞÞ ^ Oðbs ðÞÞ: ! Oðto ^ a business rule, it is necessary to specify:Definition 3 (Actionability of a pattern). The actionability . Object: On what object(s), the actions are taken, with of a pattern p is measured by actðpÞ: predicates to limit the range; . Condition: Under what situations, the actions can be actðpÞ ¼ Op2P ðIntðpÞÞ taken on the objects, with predicates to specify the ! Oðto ðpÞÞ ^ Oð
  8. 8. ts ðpÞÞ ^ Oðbo ðpÞÞ ^ Oðbs ðpÞÞ ð7Þ conditions; and ! tact ^ tact ^ bact ^ bact ! tact ^ bact ; o s o s i i . Operation: What actions are to be taken on the objects, with predicates to deliver the specific where tact , tact , bact , and bact measure the respective actionable o s o s decision-making activities. performance. Subsequent to the following specification, actionable patterns are converted into business rules as a form of For example, actionable frequent trading pattern mining[16], [8], [9] considers the satisfaction of support, confidence, deliverable, which not only enhances interpretation but alsoas well as business performance like sharpe ratio. Suppose indicates what actions can be taken on what objects underthey are independent. We then expect an actionable what pattern to concurrently satisfy these metrics in a /*BusinessRule Specification*/maximal manner. business rule ::¼ object þ condition à Due to the inconsistency often existing in different operation þaspects, we often find identified patterns only fitting in object ::¼ ðAlljAnyjGivenij . . .Þone of the following subsets: condition ::¼ ðsatisfyjrelatedjandj . . .Þ ÈÈ É È É IntðpÞ ! tact ; bact ; :tact ; bact ; operation ::¼ ðAlertjActionj . . .Þ È act iact É È iact i act ÉÉ i ð8Þ ti ; :bi ; :ti ; :bi ;where “:” indicates unsatisfactory. 4 ACTIONABLE KNOWLEDGE DISCOVERY However, in real-world mining, as we know, it is very FRAMEWORKSchallenging to find the most actionable patterns that are 4.1 Postanalysis-Based AKD: PA-AKDassociated with both “optimal” tact and bact . Quite often, a i i PA-AKD is a two-step pattern extraction and refinementpattern with significant ti ðÞ is associated with unconfident exercise. First, generally interesting patterns (which we callbi ðÞ. Contrarily, patterns with low ti ðÞ are often associated “general patterns”) are mined from data sets in terms ofwith confident bi ðÞ. Clearly, AKD targets patterns confirm- technical interestingness (to ðÞ; ts ðÞ) associated with theing the relationship ftact ; bact g. i i Therefore, it is necessary to deal with such possible algorithms used. Further, the mined general patterns areconflict and uncertainty among respective interestingness pruned, distilled, and summarized into operable businesselements. However, it is something of an art form and needs rules (embedding actions) (which we call “deliverables”) into involve domain knowledge and domain experts to tune terms of domain-specific business interestingness (bo ðÞ; bs ðÞ)thresholds and balance differences between ti ðÞ and bi ðÞ. and involving domain (d ) and meta (m ) knowledge. Fig. 1Another issue is to develop techniques to balance and illustrates the PA-AKD.
  9. 9. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1303 Fig. 2. Unified-interestingness-based AKD approach. 4.2 Unified-Interestingness-Based AKD: UI-AKD As discussed in Section 3, one of the essential jobs inFig. 1. Postanalysis-based AKD (PA-AKD) approach. extracting actionable knowledge is to balance the interest- ingness concerns of the identified patterns from both Based on the system view developed for AKD, PA-AKD technical and business sides. To this end, a straightforwardis a two-step optimization problem that can be expressed idea is to develop unified interestingness metrics capturingas follows: and describing both business and technical concerns, and then to extract patterns based on this unified interestingness e;ti ðÞ;m1 e;bi ðÞ;m2 ;d ;m P A À AKD : DB À! P À! e e P ; R: ð9Þ system. Thus, UI-AKD is based on such a unified interest- ingness system. The following pseudocode describes the PA-AKD: Fig. 2 shows the framework of UI-AKD. It looks just theFRAMEWORK 1: Post Analysis-based AKD (PA-AKD) same as the normal data mining process except for threeINPUT: target data set DB, business problem É, and inherent characteristics. One is the interestingness system, thresholds (to;0 , ts;0 , bo;0 and bs;0 ) which combines technical interestingness (ti ðÞ) with busi- eOUTPUT: actionable patterns P and operable business ness expectations (bi ðÞ) into a unified AKD interestingness e system (iðÞ). This unified interestingness system is then rules R used to extract truly interesting patterns. The second is thatStep 1: Extracting general patterns P ; domain knowledge (d ) and environment (e) must be FOR n ¼ 1 to N considered in the data mining process. Finally, the outputs Develop modeling method mn with technical e e are P and R. interestingness ti ðÞ (i.e., to ðÞ; tb ðÞ); Ideally, UI-AKD can be expressed as follows: Employ method mn on DB and environment e; Extract the general pattern set P mn ; e;iðÞ;m;d ;m e e UI À AKD : DB À! P ; R: ð10Þ ENDFORStep 2: Extracting actionable patterns P ; e Based on the AKD formulas addressed before, iðÞ can be m1 P ¼ P U . . . UP mN further expressed as follows: FOR j ¼ 1 to (countðP Þ) iðÞ ¼ IntðÞ ¼ Iðti ðÞ; bi ðÞÞ: ð11Þ Design post-analysis method m2 by involving domain knowledge d and business interestingness bi ðÞ; Very often ti ðÞ and bi ðÞ are not dependent, thus Employ the method m0 on the pattern set P ^ ^ iðÞ ! ti ðÞ þ $bi ðÞ: ð12Þ as well as data set DB if necessary; Extract the actionable pattern set P ; e Weights and $ reflect the interestingness balance/ ENDFOR tradeoff negotiated between data analysts and domain eStep 3: Converting pattern P to business rules R. e experts in terms of business problem, data, environment, The key point in this framework is to utilize both and deliverable expectation. In some cases, both weightsdomain/metaknowledge and business interestingness in and aggregation can be fuzzy. In other cases, the aggrega-postprocessing the learned patterns. In the real world, this tion may happen in a step-by-step manner. For each step,framework can be further instantiated into varied mutations weights may be differentiated.[28], [27], [38]. In fact, many existing methods, such as Patterns with iðÞ beating given thresholds (again, thispruning redundant patterns, summarizing and aggregating must be mutually determined by stakeholders) come intopatterns to reduce the quantity of patterns, and constructing the actionable pattern list. The pseudocode describing theactions on top of learned patterns, can be further enhanced UI-AKD process is as follows:by expanding the PA-AKD framework and introducing FRAMEWORK 2: Unified Interestingness-based AKDbusiness interestingness and domain/metaknowledge intothe AKD process. Cao [9] presents examples of considering (UI-AKD)domain knowledge and organizational factors in extracting INPUT: target data set DB, business problem É, andactionable trading strategies in stock markets. Zhao et al. thresholds (tO;0 , ts;0 , bO;0 and bs;0 )[42] collect case studies of utilizing the PA-AKD framework e e OUTPUT: actionable patterns P and business rules Rfor extracting effective associations. Step 1: Extracting general patterns P ;
  10. 10. 1304 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010 FOR n ¼ 1 to N Design data mining method mn by involving domain knowledge d and considering environment e against unified interestingness iðÞ; Employ the method mn on DB given e and d ; Generate pattern set P ; ENDFOR eStep 2: Extracting deliverables P ; e eStep 3: Converting P to business rules R. In practice, the combination of technical interestingnesswith business expectations may be implemented by variousmethods. An ideal situation is to generate a single formula iðÞintegrating ti and bi , and then to filter patterns accordingly. Ifsuch a uniform metric is not available, an alternative way isto calculate ti and bi for all patterns, and then rank them interms of them, respectively. A weight-based voting (weightsare determined by stakeholders) can then be taken toaggregate the two ranked lists into a unified pattern set. If Fig. 3. Combined-mining-based AKD (CM-ADK).there is uncertainty in merging the pattern sets, fuzzy-set-based aggregation and ranking may be helpful. Cao and CM-AKD can be formalized as follows:Zhang [16] introduce fuzzy-set-based aggregation of trading e;ti;j ðÞ½ii;j ðފ;mj ;d ;mrules in stock markets. First, trading rules are identified CM À AKD : DB À ! fPj g |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}through Genetic Algorithms. The identified rules are then J ð13Þranked in terms of technical significance and trading e;bi;j ðÞ;]J Pj ;d ;m À! e e P ; R;performance, respectively, and then fuzzified into fivesignificance levels. The two fuzzy sets are then aggregated where ti;j and bi;j are technical and business interestingnessin terms of fuzzy aggregation rules into an integrated fuzzy of model mj , and ½ii;j ðފ indicates the alternative checking ofset. Final top-n rules are selected from this set. unified interestingness, ]J Pj is the merger function, m is As shown in (8), in real life, there may exist potential the metaknowledge consisting of metadata about patterns,incompatibility between technical and business interesting-ness values for a particular pattern. The relationships features, and their relationships.between technical interestingness metrics and business ones The CM-AKD process can be expressed as follows:may be linear and/or nonlinear. In addition, the simplemerger of pattern sets divided on technical and business FRAMEWORK 3: Combined Mining-based AKDsides may cause uncertainty. These make it very challenging (CM-AKD)to develop a unified interestingness system for AKD. INPUT: target data set DB, business problem É, and thresholds (to;0 , ts;0 , bo;0 and bs;0 )4.3 Combined-Mining-Based AKD: CM-AKD e OUTPUT: actionable patterns P and operable businessFor many complex enterprise applications, one-scan mining e rules R;seems unworkable for many reasons. To this end, we Step 1: AKD is split into J steps of mining;propose the Combined-Mining [18]-based AKD framework to Step 2: Step-j mining: Extracting general patterns Pjprogressively extract actionable knowledge. Fig. 3 illus- (j ¼ 1; . . . ; J);trates the CM-AKD. FOR j ¼ 1 to J CM-AKD comprises multisteps of pattern extraction and Develop modeling method mj with technicalrefinement on the whole data set. First, AKD is split into interestingness ti;j ðÞ (i.e., to ðÞ; tb ðÞ) or unified ii;j ðÞJ steps of mining based on business understanding, data Employ method mj on the environment e and dataunderstanding, exploratory analysis, and goal definition. DB engaging meta-knowledge m ;Second, generally interesting patterns are extracted based Extract the general pattern set Pj ;on technical significance (ti ðÞ) (or unified interestingness ENDFOR(iðÞ)) into a pattern subset (Pj ) in step j. Third, knowledge e Step 3: Pattern merging: Extracting actionable patterns P ;obtained in step j is further fed into step j þ 1 or relevantly FOR j ¼ 1 to Jremaining steps to guide corresponding feature construc-tion and pattern mining (Pjþ1 ). Fourth, after the completion Design the pattern merger functions ]J Pj byof all individual mining procedures, all identified pattern involving domain (d ) and meta (m ) knowledge,subsets are merged into a final pattern set (P ) based on and business interestingness bi;j ðÞ;environment (e), domain knowledge (d ), and business Employ the method ]Pj on the pattern set Pj ;expectations (bi ). Finally, the merged patterns are converted Extract the actionable pattern set P ; e e einto business rules as final deliverables (P , R) that reflect ENDFORbusiness preferences and needs. e Step 4: Converting patterns P to rules R. e
  11. 11. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1305Fig. 4. Unsupervised + supervised learning-based CM-AKD (USCM-AKD). Fig. 5. Multisource + combined-mining-based AKD. This framework can be instantiated into a few mutationsby employing technical and business interestingness at costly to scan the whole data set. Mining such complex andvarious stages, and by combining miscellaneous data mining large volumes of data challenges existing data miningmodels in a multistep and iterative manner. One example is approaches. To this end, we propose a Multisource +an unsupervised + supervised learning-based CM-AKD: USCM- combined-mining-based AKD framework. Fig. 5 shows theAKD. As shown in Fig. 4, the USCM-AKD first deploys an idea of MSCM-AKD.unsupervised learning method to mine general patterns in MSCM-AKD discovers actionable knowledge either interms of technical interestingness ti;j ðÞ associated with the multiple data sets or data subsets (DB1 ; . . . ; DBN ) throughmethods m1 . New variables triggered by the unsupervised partition. First, based on domain knowledge, businesslearning process are added into the metaknowledge base understanding, and goal definition, one of the data sets orm . The original data set is then filtered, transformed, and/ certain partial data (say DBn ) is selected for miningor aggregated, guided by knowledge obtained in previous exploration (m1 ). Second, the exploration results are used to guide either data partition or data set managementlearning to generate a transformed data set for further through a data coordinator agent Âdb (coordinating datamining. The learned patterns P1 are then used to guide the e e partition and/or data set/feature selection in terms ofextraction of deliverables P and R by a supervised learning iterative mining processes, see more from AMII-SIG1method m2 on the transformed data set concerning both regarding agents in data mining), and to design strategiestechnical (ti ðÞ) and business (bi ðÞ) interestingness. for managing and conducting parallel pattern mining on each An example is to develop sequential classifiers [41]. First, data set or subset and/or combined mining [18] on relevantwe mine for the most discriminative sequential patterns, in remaining data sets. The deployment of method mn , whichwhich an aggressive strategy is used to select a small set of could be either in parallel or combined, is determined bysequential patterns. Second, pattern pruning and serial data/business understanding and objectives. Third, aftercoverage test are done on the mined patterns. Those the mining of all data sets, patterns Pn identified frompatterns passing the serial test are used to build the individual data sets are merged (]N P ) and extracted intosubclassifiers on the first level of the final classifier. Third, e e final deliverables (P ,R).the training samples that cannot be covered are fed back to MSCM-AKD can be expressed as follows:the sequential pattern mining procedure with updatedparameters. This process continues until the predefined MSCM À AKD :thresholds are reached or all samples are covered. Patterns e;ti;n ðÞ½ui;n ðފ;mn ;mgenerated in each loop form the subclassifier on each level DBn ½DB À DBn Š ! À ! fPn gof the final classifiers. |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ð14Þ N In addition, the CM-AKD framework can be further e;bi;n ðÞ;]N Pn ;d ;mjoined with the PA-AKD approach to generate a more À! e e P ; R;comprehensive framework: Combined Mining + Postanalysis-based AKD (CMPA-AKD). In the CMPA-AKD approach, where ti;n and bi;n are technical and business interestingnessmultistep mining may be conducted by checking technical of model mn on data set/subset n, and ½ii;n ðފ indicates theinterestingness only, and leaving the checking of business alternative checking of unified interestingness as in UI-interests to the postanalysis component. In some other AKD, ]N Pn is the merger function, indicates the datacases, multistep mining is based on unified interestingness partition if the source data needs to be split.while pattern merging is conducted during postanalysis. The MSCM-AKD process is expressed as follows:4.4 Multisource + Combined-Mining-Based AKD: FRAMEWORK 4: Multi-Source + Combined Mining Based MSCM-AKD AKD (MSCM-AKD)Enterprise applications often involve multiple-subsystems- INPUT: target data sets DB, business problem É, andbased and heterogeneous data sources that cannot be thresholds (to;0 , ts;0 , bo;0 and bs;0 )integrated, or are too costly to do so. Another commonsituation is that the data volume is so large that it is too 1.
  12. 12. 1306 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010 eOUTPUT: actionable patterns P and business rules R e First, the proposed AKD frameworks are general andStep 1: Identify or partition whole source data into N data flexible, and can cover many common problems and sets DBn (n ¼ 1; . . . ; N); applications. Basically, they enclose many key features thatStep 2: Data Set-n mining: Extracting general patterns Pn on are critical for offering flexibility and expandability to handle practical challenges in mining complex enterprise data set/subset DBn ; applications. These include catering for the organizational FOR l ¼ n to (N) environment and domain knowledge (all frameworks Develop modeling method mn with technical care about domain knowledge and environment, and the interestingness ti;n ðÞ (i.e., to ðÞ; tb ðÞ) or unified ii;n ðÞ interestingness system has been expanded to facilitate Employ method mn on the environment e and business concerns), mining multiple data sources [32] and data DBn engaging meta-knowledge m ; large volumes of data (see MSCM-AKD), postprocessing the Extract the general pattern set Pn ; learned patterns as per business needs (see PA-AKD), and ENDFOR supporting multistep and combined mining (see CM-AKDStep 3: Pattern merger: Extracting actionable patterns P ; e and MSCM-AKD), as well as closed data mining (by FOR l ¼ n to N delivering operable business rules, see Section 4.4). Design the pattern merger functions ]N Pn to These general features, on one hand, can be instantiated e merge all patterns into P by involving domain into many concrete approaches. As we discussed in introducing each framework, they can be instantiated into and meta knowledge d and m , and business various mutations. For instance, the postanalysis-based interestingness bi ðÞ; approach can be embodied on top of association mining, Employ the method ]Pn on the pattern set Pn ; clustering, and classification to extract actionable associa- Extract the actionable pattern set P ; e tions, clusters, and classes. The unsupervised + supervised ENDFOR learning process can be instantiated into approaches such as eStep 4: Converting patterns P to business rules R. e association + classification and clustering + classification. On the other hand, they can fit into requirements and constraints in The MSCM-AKD framework can also be instantiated into many practical applications, for instance, analyzing rare buta number of mutations. For instance, for a large volume of significant linkages isolated in multiple organizations’ data,data, MSCD-AKD can be instantiated into data partition + and dealing with complex data structure mixing hetero-unsupervised + supervised-based AKD by integrating data geneous and distributed data sources.partition into combined mining. An example is as follows: Second, the AKD frameworks are effective and workable forFirst, the whole data set is partitioned into several data extracting knowledge that can be taken over by businesssubsets based on the data/business understanding and people for instant decision making. There are three keydomain knowledge jointly by data miners and domain factors contributing to effectiveness and workability. 1) Theexperts, say data sets 1 and 2. Second, an unsupervised learning extraction of patterns is based on both technical significancemethod is used to mine one of the preference data sets, say and business expectations, and as a result, they are ofdata set 1. Some of the mined results are then used to design business interest. 2) The frameworks support miningnew variables for processing the other data set. Supervised complex data and knowledge in the real world. 3) Thelearning is further conducted on data set 2 to generate delivery of business rules as the mining deliverables makeactionable patterns by checking both technical and business them operable and bridge the gap between the deliverablesinterestingness. Finally, the individual patterns mined from and business needs.both data subsets are combined into deliverables. In addition, the deep study of D3 M-based AKD has In Section 6, we introduce a real-life case study in disclosed many open issues and broad prospects inextracting deliverables for government debt prevention. developing next-generation KDD, i.e., knowledge proces-The deliverables take forms of either combining arrange- sing and decision-support methodologies, techniques, andment activities initiated by government officers with systems for real-world applications. The issues are:repayment activities conducted by debt-associated custo- . AKD as a closed problem-solving system: current KDDmers, or combining demographics patterns with arrange- is weak in feeding back the resulting solutions toment-repayment activity sequential patterns. Government business problems. The extraction of operable busi-officers who receive such knowledge feel more comfortable ness rules presents a feasible way to achieve such anin applying them to their routine processes and rules to objective. Further work is necessary in definingprevent debts. universal representation and modeling languages for such a purpose.5 DISCUSSIONS . AKD problem-solving environment: KDD researchers 3 increasingly recognize the significance of under-The D M views real-world AKD as a closed optimization standing, involving, and tackling “environment”system. “Closed” indicates the problem solving is a closed factors in AKD modeling and presenting deliver-process starting from business problem definition and ables. Environmental factors refer to the surround-ending with operable business rules fed back into business ings related to human beings, business process,problem solving. “Optimization” means that the AKD policies, rules, workflow, organizational factors, andtargets optimal solutions namely actionable and operable networking factors [7]. Exemplary explanation canpatterns and produces operable business rules. Based on be found in [16], [8], [9]. With the system view, it isthe above principles, the proposed AKD frameworks necessary to develop techniques to describe, repre-present many promising characteristics. sent, and involve environmental elements and
  13. 13. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1307 facilitate the interaction between an AKD system and may include evidence and indicators for recovering, and its environment. detecting, preventing, and predicting debt occurrences. . Ubiquitous intelligence surrounding and assisting in The analysis of social security data needs to concern AKD problem-solving systems: AKD inevitably en- environmental factors. This involves government service gages human intelligence, domain intelligence, net- policies, business rules, debt management processes and work intelligence, and organizational and social rules, and debt arrangement and repayment activities and intelligence. Appropriate metasynthesis [17] of such rules, etc. We cater for them in data extraction such as intelligence can greatly enhance the power of AKD involving arrangement and repayment activity data, data in handling complex data and applications. preparation such as policy-sensitive seasonal effect analysis . Representation and integration of ubiquitous intelligence and activity independence analysis, and pattern structures in AKD: it is necessary to develop effective mechan- such as combined patterns [18]. isms to represent, transform, map, search, coordi- Activity mining [11], [13] has been proposed based on nate, and integrate such ubiquitous intelligence in AKD frameworks to identify patterns of high impact- AKD systems. oriented customer behavior and customer-government . Actionability checking: this involves what actionability officer contacts that are likely to be correlated to debts. system is, and how to evaluate actionability. In AKD, a Some of the methods are based on a UI-AKD framework by critical issue is what metrics and at what stage of AKD developing interestingness metrics measuring both techni- process should actionability be checked. Appro- cal significance and business impact of the patterns, for priated combination strategies may be necessary for instance, in identifying sequential impact-contrasted activity checking actionability from both technical and busi- patterns where P is frequently associated with both patterns ness perspective in terms of objective and subjective P ! T and P ! T in separated data sets, and sequential- aspects. In practice, identifying/pruning generally impact-reversed activity patterns in which both P ! T and technical-interesting patterns may be conducted first P Q ! T are frequent. This section briefly illustrates the followed by checking business interestingness of MSCM-AKD framework in identifying combined associa- identified patterns and accordingly pruning them. tions and combined association clusters for debt prevention, . Status optimization and transferability: the system by following the approach of combined mining [18]. The status transformation by taking a decision-making resulting combined patterns consist of items from two action is subject to many factors and constraints such heterogeneous data sets in terms of domain knowledge, as cost and transferability. It is essential to consider business policies, a new interestingness system, and them in cost-sensitive status optimization and discussions with domain experts. For more details on transformation. For that, for example, the corre- theoretical analysis and implementation, please refer to [18]. sponding mechanisms and metrics for cost-benefit analysis can be helpful. 6.2 Combined Associations and Association Clusters The effectiveness, general capability, flexibility, andadaptability of the proposed AKD frameworks have been Based on the MSCM-AKD framework, combined associa-tested and demonstrated in several problem domains, for tions and association clusters are defined as follows:example, mining social security data for debt recovery and Definition 4 (Combined associations). A combined associa-prevention [13], [43], [18], and identifying actionable tion rule p is in the form of x þ y ! c, where x and y aretrading strategies [9], and discovering exceptional trading different item sets from two heterogeneous data sets, and c is abehavior in capital market microstructure data [16], [8], [9]. target item or class.Due to space limitation, we cannot illustrate these examplesone by one. However, interested readers can access more For example, x is a demographic item set, y is adetails from the references. In the following section, we transactional item set, and c is the class of a customer.illustrate a case study using the MSCM-AKD framework for The combined associations can be organized into rulediscovering actionable combined patterns in social security clusters by putting similar or related rules together, whichdata. It consists of customer demographic components and can provide more useful knowledge than separated rules.customer transactional activities in social security areas for Definition 5. (Combined association cluster). A combinedgovernment officers to prevent customer debt. association cluster P is a set of combined associations with the same x (or y) but different y (or x) 8 86 CASE STUDY: MINING ACTIONABLE COMBINED x þ y1 ! ck1 x1 þ y ! cl1 PATTERNS IN SOCIAL SECURITY DATA x þ y2 ! ck2 x2 þ y ! cl2 P1 : ; P2 : ð15Þ6.1 Problem Overview ÁÁÁ ÁÁÁ : :Social security data are widely seen in welfare states, but x þ ym ! ckm xn þ y ! cln ;they have rarely been analyzed. The data consist ofcustomer demographics, government overpayment (debt) where P1 and P2 are two combined clusters. The rules ininformation, activities such as government arrangements cluster P1 have the same x but different y, which makesfor debtors’ payback agreed by both parties, and debtors’ them associated with various results c. By contrast, the rulesrepayment information. Such data contain important in P2 have the same y but different x.information about the experience and performance of Here “combined” means: 1) each single rule consists ofgovernment service objectives and social security policies, item sets from different data sets and 2) each rule cluster is
  14. 14. 1308 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010 Finally, a combined pattern is interesting only if it satisfies the interestingness of a corresponding combined pattern type, as defined in the following sections: 6.3 Interestingness for Combined Associations and Association Clusters To support the discovery of combined associations and association clusters, we developed corresponding interest- ingness metrics. For a combined association rule x þ y ! c, the traditional interestingness metrics such as support, confidence, and lift contribute little to selecting actionable combined association rules. To measure the interestingness of a combined association, we define two new lifts based on traditional support, confidence and liftFig. 6. MSCM-AKD-based framework for mining combined associationsand association clusters. Confðx þ y ! cÞ Liftx ðx þ y ! cÞ ¼ ; ð16Þcomposed of two or more related local rules identified in Confðy ! cÞindividual data sets. The process of mining combinedassociations and association clusters in social security data Confðx þ y ! cÞ Lifty ðx þ y ! cÞ ¼ : ð17Þis shown in Fig. 6, where DB1 is demographic data, DB2 is Confðx ! cÞtransactional data. Liftx ðx þ y ! cÞ is the lift of x with y as a precondition,Mining Combined Associations and Association Clusters which shows how much x contributes to the rule. Similarly,INPUT: target data sets DB1 and DB2 Lifty ðx þ y ! cÞ gives the contribution of y in the rule. TheOUTPUT: Combined association rules and association following can be derived from the above formulas: clustersStep-1 mining: Mining frequent patterns on the whole Liftðx þ y ! cÞ Liftx ðx þ y ! cÞ ¼ ; ð18Þ population in data set DB1 . Select the top-m frequent Liftðy ! cÞ demographic patterns discovered pi ði ¼ 1; 2; . . . ; mÞ;Step-2 partition: Partitioning the data set DB2 into n sub- Liftðx þ y ! cÞ data sets DB2n (n ¼ 1; . . . ; m) based on the identified Lifty ðx þ y ! cÞ ¼ : ð19Þ Liftðx ! cÞ top demographic patterns.Step-3 mining: Mining associations on data set DB2n in Based on the above lifts, the interestingness (Irule ) of a terms of the groups of people identified in Step-1. single combined association is defined as follows: Identifying fQn ¼ ynj ! cnj g (n ¼ 1; . . . ; N; j ¼ 1; . . . ; J), the top-J frequent patterns discovered Liftx ðx þ y ! cÞ Irule ðx þ y ! cÞ ¼ from DB2n . Liftðx ! cÞ ð20ÞStep-4 merging patterns: Merging results discovered in Lifty ðx þ y ! cÞ the relevant groups into combined patterns like: ¼ : Liftðy ! cÞ 1) x1 þ y1 ! c1 and x1 þ y2 ! c2 , and 2) x1 þ y1 ! c1 and x2 þ y1 ! c3 . Irule indicates whether the contribution of x (or y) to the In utilizing the MSCM-AKD framework, two critical steps occurrence of c increases with y (or x) as a precondition.are data partition and pattern merging. In this exercise, the Therefore, “Irule 1” suggests that x þ y ! c is less inter-partition of debt-related transactional data is driven by esting than x ! c and y ! c. The value of Irule falls indomain knowledge and top-m frequent demographic pat- [0,+1). When Irule 1, the higher Irule is, the more interesting the rule is.terns recognized by domain experts. These top-m frequent Furthermore, we define the interestingness of a combineddemographic patterns actually represent m groups of association cluster. First, the interestingness of a pair ofcustomers. As a result, each sub-data-set in DB2 is associated combined association rules is defined as follows: Suppose p1with a particular group of people in the whole population. and p2 are a pair of combined associations with different Patterns identified in respective data sets are finally consequents within a single rule cluster, say, p1 ¼ ðx1 þ y1 !merged in terms of combined associations and combined c1 Þ and p2 ¼ ðx2 þ y2 ! c2 Þ, where x1 ¼ x2 , y1 6¼ y2 , and c1 6¼association clusters. A combined pattern integrates one c2 (or x1 6¼ x2 , y1 ¼ y2 , and c1 6¼ c2 ), the interestingness (Ipair )demographic component, namely one of the m frequent of the rule pair fp1 ; p2 g is defined as:demographic groups, into one to many items from thetransactional data sets based on inputs of business analysts. Ipair ðp1 ; p2 Þ ¼ Confðp1 Þ Confðp2 Þ: ð21ÞBusiness inputs represent the expectations of domainexperts, including whether a pattern makes sense in Ipair measures the contribution of the two different partsbusiness or not, and whether it indicates significant in antecedents to the occurrence of different classes in abusiness impacts (namely business interestingness). group of customers with the same demographics or the
  15. 15. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1309 TABLE 2 Arrangement Rules “y ! c” TABLE 3 Sample Combined Associations TABLE 4 Sample Combined Association Clusterssame transactions. Such knowledge can help design busi- Learned rules with high support and confidence are furtherness campaigns and improve business process. The value of merged into association clusters ranked by Icluster .Ipair falls in [0, 1]. The larger Ipair is, the more interesting andactionable a pair of rules are. 6.4 Experiments For an association cluster P with J combined associations We test the above methods in government social securityp1 ; p2 ; . . . ; pJ , its interestingness (Icluster ) is: data with debts raised in the calendar year 2006 and the È É corresponding customers and arrangement/repayment Icluster ðP Þ ¼ max Ipair ðpi ; pj Þ : ð22Þ activities. The cleaned sample data contains 355,800 custo- pi ;pj 2R;i6¼j;ci 6¼cj mers with their demographic attributes, arrangements, The above definition of Icluster indicates that interesting and repayments. There are 7,711 traditional associationsclusters are the rule clusters with interesting rule pairs, and mined. The association rules are illustrated in Table 2.the other rules in the cluster provide additional informa-tion. Similar to Ipair , the value of Icluster also falls in [0, 1]. Tables 3 and 4 show samples of combined associations With the above interestingness and traditional metrics: and association clusters, respectively. From the tables, it issupport, confidence, lift, Liftx , Lifty , and Irule , interesting clear that the combined associations cannot be discoveredcombined associations are filtered from the learned rules. by traditional association rule techniques.
  16. 16. 1310 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 9, SEPTEMBER 2010 Compared with the single associations from respective 7 CONCLUSIONdata sets, the combined associations and combined A common problem in mainstream KDD research is itsassociation clusters are much more workable than single dominating focus on algorithm innovation and neglect ofrules presented in the traditional way. They contain muchricher information from multiple aspects rather than from real-life decision-making capability. Consequently, dataa single one, or a collection of separated single rules. For mining applications face the significant problem of work-instance, the following combined association shows that ability of deployed algorithms, tools, and resulting deliver-customers aged 65 or more, whose arrangement method is ables. To fundamentally change such situations, andof “withholding” plus “irregular,” and who actually repay empower the workable capability and performance ofin the approach of “withholding,” can be classified into advanced data mining in real-world production andclass “C” (high risk of payback). Obviously, this pattern economy, there is an urgent need to develop next-genera-combines heterogeneous information regarding the speci- tion data mining methodologies and techniques that targetfic group of the debtor’s demographic, repayment, and the paradigm shift from data-centered hidden patternarrangement method mining to domain-driven actionable knowledge delivery. This paper has formally defined the AKD concepts, fx ¼ age : 65þ; y ¼ withholding irregular processes, actionability of patterns, and operable deliver- ð23Þ ables. With such components, we have proposed four types þ withholding ! c ¼ Cg: of AKD frameworks capable of handling various business Finally, combined patterns can be transformed into problems and applications. These frameworks supportoperable business rules that may indicate direct actions closed-optimization-based problem solving from a businessfor business decision making. For instance, for the above problem/environment definition, to actionable patterncombined association, it actually connects key business discovery, and to operable business rule conversion.elements with segmented customer characteristics, and we Deliverables extracted in this way are not only of technicalcan generate the following business rule by extending the significance but also are capable of smoothly integratingBusinessRule specification: into business processes. Substantial experiments in significant data miningDELIVERING BUSINESS RULES: Customer Demographic- applications such as financial data mining and miningArrangement-Repayment combination business rules social security data have shown that the proposed frame- For All customer i (i 2 I is the number of valid customers) works have the potential to handle the limitations in Condition: existing methodologies and approaches. They are suffi- satisfies S/he is a debtor aged 65 or plus; ciently general, flexible, and workable to be instantiated relates into various approaches for tackling complex data and business applications. S/he is under arrangement of “withholding” and Following the D3 M theory, there are many issues to be “irregularly”, studied, for instance, defining operable business rules by and involving ontological techniques for representing both His/her favorite Repayment method is “withholding”; syntactic and semantic components. Operation: Alert = “S/he has ‘High’ risk of paying off debt in a very long timeframe.” ACKNOWLEDGMENTS Action = “Try other arrangements and repayments in This work is sponsored in part by the Australian Research R2 , such as trying to persuade her/him to repay under Council Discovery Grants (DP0988016, DP0773412, and ‘irregular’ arrangement with ‘cash or post.’” DP0667060) and ARC Linkage Grant (LP0989721 and End-All LP0775041). The converted business rules are deliverables presentedto business people. They are convenient and it is easy for REFERENCESclients to embed them into their routine business processes [1] G. Adomavicius and A. Tuzhilin, “Discovery of Actionableand operational systems for filtering debtors and monitor- Patterns in Databases: The Action Hierarchy Approach,” the debt recovery process. Our clients feel more Int’l Conf. Knowledge Discovery and Data Mining (KDD ’97), pp. 111-comfortable in understanding, interpreting, and actioning 114, 1997.these business rules than those patterns directly mined in [2] C. Aggarwal, “Towards Effective and Interpretable Data Mining by Visual Interaction,” ACM SIGKDD Explorations Newsletter,the data. Therefore, combined patterns are more business- vol. 3, no. 2, pp. 11-22, 2002.friendly and indicate much more straightforward decision- [3] M. Ankerst, “Report on the SIGKDD-2002 Panel the Perfect Datamaking actions to be taken by business analysts in the Mining Tool: Interactive or Automated?” ACM SIGKDD Explora-business world, while this cannot be achieved by patterns tions Newsletter, vol. 4, no. 2, pp. 110-111, 2002.identified by traditional methods. [4] J.F. Boulicaut and B. Jeudy, “Constraint-Based Data Mining,” The Data Mining and Knowledge Discovery Handbook, pp. 399-416, In addition, the use of combined mining leads to Springer, 2005.combined patterns consisting of attributes from different [5] K. Breitman, M. Casanova, and W. Truszkowski, Semantic units or by partitioning into organized segments. Springer, 2007.Through attribute segmentation or merger, it is manageable [6] L. Cao, “Domain-Driven Actionable Knowledge Discovery,” IEEE Intelligent Systems, vol. 22, no. 4, pp. 78-89, July/Aug. differentiate attribute impact on business objectives, and [7] L. Cao, “Domain-Driven Data Mining: Empowering Actionablethus, extract more and more informative patterns and more Knowledge Delivery,” Proc. Pacific-Asia Conf. Knowledge Discoveryoperable decision-making actions. and Data Mining (PAKDD ’09) Tutorial, 2009.
  17. 17. CAO ET AL.: FLEXIBLE FRAMEWORKS FOR ACTIONABLE KNOWLEDGE DISCOVERY 1311[8] L. Cao and Y. Ou, “Market Microstructure Pattern Analysis for [34] A. Silberschatz and A. Tuzhilin, “On Subjective Measures of Powering Trading and Surveillance Agents,” J. Universal Computer Interestingness in Knowledge Discovery,” Proc. Int’l Conf. Knowl- Science, vol. 14, no. 14, pp. 2288-2308, 2008. edge Discovery and Data Mining, pp. 275-281, 1995.[9] L. Cao, “Developing Actionable Trading Strategies,” Knowledge [35] P. Tan, V. Kumar, and J. Srivastava, “Selecting the Right Processing and Decision Making in Agent-Based Systems, N. Nguyen Interestingness Measure for Association Patterns,” Proc. ACM and L. Jain, eds., Springer, 2008. SIGKDD, pp. 32-41, 2002.[10] L. Cao, P. Yu, C. Zhang, and H. Zhang, Data Mining for Business [36] A. Tzacheva and Z. Ras, “Action Rules Mining,” Int’l J. Intelligent Applications. Springer, 2008. Systems, vol. 20, no. 7, pp. 719-736, 2005.[11] L. Cao, Y. Zhao, C. Zhang, and H. Zhang, “Activity Mining: From [37] K. Wang, S. Zhou, and J. Han, “Profit Mining: From Patterns to Activities to Actions,” Int’l J. Information Technology and Decision Actions,” Proc. Int’l Conf. Extending Database Technology (EBDT), Making, vol. 7, no. 2, pp. 259-273, 2008. 2002.[12] L. Cao, P. Yu, C. Zhang, and Y. Zhao, Domain Driven Data Mining. [38] Q. Yang, J. Yin, C. Ling, and R. Pan, “Extracting Actionable Springer, 2009. Knowledge from Decision Trees,” IEEE Trans. Knowledge and Data[13] L. Cao, Y. Zhao, and C. Zhang, “Mining Impact-Targeted Activity Eng., vol. 19, no. 1, pp. 43-56, Jan. 2007. Patterns in Imbalanced Data,” IEEE Trans. Knowledge and Data [39] Y. Yao and Y. Zhao, “Explanation-Oriented Data Mining,” Eng., vol. 20, no. 8, pp. 1053-1066, Aug. 2008. Encyclopedia of Data Warehousing and Mining, J. Wang, ed.,[14] L. Cao and C. Zhang, “Knowledge Actionability: Satisfying pp. 492-497, 2005. Technical and Business Interestingness,” Int’l J. Business Intelli- [40] S. Yoon, L. Henschen, E. Park, and S. Makki, “Using Domain gence and Data Mining, vol. 2, no. 4, pp. 496-514, 2007. Knowledge in Knowledge Discovery,” Proc. Eighth Int’l Conf.[15] L. Cao and C. Zhang, “The Evolution of KDD: Towards Domain- Information and Knowledge Management, pp. 243-250, 1999. Driven Data Mining,” Int’l J. Pattern Recognition and Artificial [41] H. Zhang, Y. Zhao, L. Cao, C. Zhang, and H. Bohlscheid, Intelligence, vol. 21, no. 4, pp. 677-692, 2007. “Customer Activity Sequence Classification for Debt Prevention[16] L. Cao and C. Zhang, “Fuzzy Genetic Algorithms for Pairs in Social Security,” J. Computer Science and Technology, vol. 24, Mining,” Proc. Pacific Rim Int’l Conf. Artificial Intelligence no. 6, pp. 1000-1009, 2009. (PRICAI ’06), pp. 711-720, 2006. [42] Post-Mining of Association Rules: Techniques for Effective Knowledge[17] L. Cao, R. Dai, and M. Zhou, “Metasynthesis: M-Space, M Extraction, Y. Zhao, C. Zhang, and L. Cao, eds. IGI Press, 2008. Interaction and M-Computing for Open Complex Giant Systems,” [43] Y. Zhao, H. Zhang, L. Cao, C. Zhang, and Y. Ou, “Data Mining IEEE Trans. Systems, Man, and Cybernetics-Part A, vol. 39, no. 5, Application in Social Security Data,” Data Mining for Business pp. 1007-1021, Sept. 2009. Applications, L. Cao, P. Yu, C. Zhang, and H. Zhang, eds., Springer,[18] L. Cao, H. Zhang, Y. Zhao, and C. Zhang, “Combined Mining: 2008. Discovering More Informative Knowledge in e-Government Services,” technical report, Univ. of Technology Sydney, 2008. Longbing Cao received the PhD degrees in[19] U. Fayyad, G. Shapiro, and R. Uthurusamy, “Summary from the both intelligence sciences and computing KDD-03 Panel—Data mining: The Next 10 Years,” ACM SIGKDD sciences. He is a professor at the University of Explorations Newsletter, vol. 5, no. 2, pp. 191-196, 2003. Technology, Sydney, and the data mining[20] U. Fayyad and P. Smyth, “From Data Mining to Knowledge research leader of the Australian Capital Markets Discovery: An Overview,” Advances in Knowledge Discovery and Cooperative Research Centre. His research Data Mining, U. Fayyad and P. Smyth, eds., pp. 1-34, AAAI Press/ interests include data mining, multiagent tech- MIT Press, 1996. nology, agent and data mining integration, and[21] A. Freitas, “On Objective Measures of Rule Surprisingness,” Proc. behavior informatics. He is a senior member of Second European Symp. Principles of Data Mining and Knowledge the IEEE. Discovery (PKDD ’98), pp. 1-9, 1998.[22] O.G. Ali and W. Wallace, “Bridging the Gap between Business Yanchang Zhao is a research fellow at the Data Objectives and Parameters of Data Mining Algorithms,” Decision Sciences and Knowledge Discovery Research Support Systems, vol. 21, pp. 3-15, 1997. Lab, Centre for Quantum Computation and[23] H. Kargupta, B. Park, D. Hershbereger, and E. Johnson, Intelligent Systems, University of Technology, “Collective Data Mining: A New Perspective toward Distributed Sydney, Australia. His research interests include Data Mining,” Advances in Distributed Data Mining, H. Kargupta association rules, time series, sequential pat- and P. Chan, eds., AAAI/MIT Press, 1999. terns, clustering, and postmining. He is a[24] J. Kleinberg, C. Papadimitriou, and P. Raghavan, “A Microeco- member of the IEEE. nomic View of Data Mining,” Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 311-324, 1998.[25] R. Hilderman and H. Hamilton, “Applying Objective Interesting- ness Measures in Data Mining Systems,” Proc. Symp. Principles of Data Mining and Knowledge Discovery (PKDD), pp. 432-439, 2000. Huaifeng Zhang is a research fellow at the Data[26] B. Lent, A.N. Swami, and J. Widom, “Clustering Association Sciences and Knowledge Discovery Research Rules,” Proc. 13th Int’l Conf. Data Eng., pp. 220-231, 1997. Lab, Centre for Quantum Computation and[27] B. Liu, W. Hsu, and Y. Ma, “Pruning and Summarizing the Intelligent Systems, University of Technology, Discovered Associations,” Proc. ACM SIGKDD, 1999. Sydney, Australia. His research interests include[28] B. Liu and W. Hsu, “Post-Analysis of Learned Rules,” Proc. Nat’l combined pattern mining, sequence classifica- Conf. Artificial Intelligence/Innovative Applications of Artificial In- tion, pattern recognition, computer vision, beha- telligence Conf. (AAAI/IAAI), 1996. vior analysis and modeling, etc. He is a member[29] B. Liu, W. Hsu, S. Chen, and Y. Ma, “Analyzing Subjective of the IEEE. Interestingness of Association Rules,” IEEE Intelligent Systems, vol. 15, no. 5, pp. 47-55, Sept./Oct. 2000.[30] E. Omiecinski, “Alternative Interest Measures for Mining Associa- tions,” IEEE Trans. Knowledge and Data Eng., vol. 15, no. 1, pp. 57- 69, Jan./Feb. 2003. Dan Luo is a research fellow at the Data Sciences and Knowledge Discovery Research[31] B. Padmanabhan and A. Tuzhilin, “A Belief-Driven Method for Lab, Centre for Quantum Computation and Discovering Unexpected Patterns,” Proc. Int’l Conf. Knowledge Intelligent Systems, University of Technology, Discovery and Data Mining (KDD), pp. 94-100, 1998. Sydney, Australia. Her research focuses on[32] B. Park and H. Kargupta, “Distributed Data Mining: Algorithms domain-driven data mining, actionable knowl- and Systems, Applications,” Data Mining Handbook, pp. 341-358, edge discovery, and business intelligence. 2002.[33] A. Silberschatz and A. Tuzhilin, “What Makes Patterns Interesting in Knowledge Discovery Systems,” IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, pp. 970-974, Dec. 1996.