Distributed Information Sharing



DISTRIBUTED INFORMATION SHARING and PREDICTION

Emel METEOGLU
12200498

Abstract

Effective information sharing is key to the success of a network-centric system. It is also difficult to achieve in large network-centric systems because the network topology is complex and keeps evolving. The aim of this paper is to examine the concepts behind distributed information sharing (DIS) in a network-centric system. We describe the DIS environment together with its challenges and advantages. We also address situation assessment, which is needed to develop an intelligent prediction agent in a DIS environment. Although much related work has been done on efficient situation assessment, most of it rests on assumptions that do not suit network-centric systems. A further section highlights prediction in situation assessment under a DIS environment. The last section discusses newer approaches, such as evolutionary programming, as effective tools for prediction under a DIS environment.

I. Introduction

This paper presents a concept of distributed information sharing for network-centric systems. Distributed information sharing (DIS) is a collection of knowledge about the nodes in a system. Each node can run its own data fusion process and occupies a place in a specific data topology with specific features. All nodes in the system are interconnected, and each has autonomous processing capability serving local applications. A node can execute a job on its own, or it can work together with other nodes on global applications, which require data from more than one site [1].

The first step of DIS is collecting data through intelligence, surveillance, and reconnaissance activities. Intelligence is the gathering of data through observation, investigation, or analysis.
Surveillance is systematic observation to collect available data. Reconnaissance is a specific mission to obtain specific data. The second step is transforming that raw data, in real time, into a form that humans can understand, in support of information superiority. Information superiority is the capability to collect, process, and disseminate an uninterrupted flow of information while exploiting or denying an adversary's ability to do the same. Put differently, DIS delivers the right information to the right person at the right place at the right time. DIS also lets humans gain knowledge and understanding quickly. It deploys decision support tools to all command levels, leading to decision superiority.

Decision superiority means that better decisions are arrived at and implemented faster than an opponent can react or, in a non-combat situation, at a tempo that allows the force to shape the situation or react to changes and accomplish its mission. Decision superiority does not automatically result from information superiority. Organizational and doctrinal adaptation, relevant training and experience, and the proper command and control mechanisms and tools are equally necessary. DIS supports effects-based mission execution and monitoring in pursuit of execution superiority, the ability to execute well-designed plans faster than the opponent, which is crucial for success in any system [2]. As a result, execution superiority rests on decision superiority, and decision superiority rests on information and knowledge superiority.

Figure 1: DIS structure. Labels recoverable from the figure: Human Process; Automated Process; Raw Data; Information; Knowledge; Orders/Plans; Information Superiority; Decision Superiority; Execution Superiority; Network-Centric Capabilities; Fast Strategic Response; Reaction (Change); Complex Environment.
DIS is becoming more important as systems grow more complex and technology-intensive. It has many applications, some of which can be categorized as follows [3]:

• National/Local Warning System
• National/Local Border Management System
• Incident Management System
• Response and Recovery System
• Military Support System

One purpose of DIS is to enable an intelligent real-time prediction agent for situation assessment under a DIS environment. Valuable data is sometimes geographically distributed, and this information can be extremely valuable to decision makers in taking both proactive and reactive measures to keep the system functioning effectively. To perform their responsibilities effectively, decision makers need critical information in a timely fashion. Potential hotspots and major crises can be prevented if decision makers have the right information at the right time and in the right format. However, given the complex dynamics and increasingly interdependent nature of systems, it is almost impossible to avoid unanticipated developments or crises completely. In a crisis, it is also important to deliver data to decision makers without overwhelming them with large amounts of irrelevant data. A well-structured DIS also makes it possible to discover hidden regularities or patterns in specific domains, and the knowledge gathered in this way can be useful in the decision-making process.

II. Advantages and Challenges of DIS Environment

Guaranteeing safe information sharing in a distributed environment is difficult. A traditional centralized information sharing model cannot handle large volumes of data and distributed access control mechanisms [4]. One of the most important advantages of DIS is therefore accessibility.
In a DIS environment, if any node or communication link fails, the result is a gradual degradation in network performance rather than a catastrophic failure. A replicated copy of the data remains accessible from another node even when the original node has failed. In other words, no single element is critical to the operation of the network.

The accessibility and availability of data in a DIS environment also increase the reliability of the system. Since there is no central, critical node in the network, the necessary information can still be delivered to the right person at the right place. Even when the system suffers a failure, its reliability remains high because the data can be accessed from another node.

In a DIS environment, data is stored close to its anticipated point of use, and it can be moved dynamically to where it is needed. Replicated copies of the data are also moved if the local application that uses the data changes. This flexibility increases the efficiency of the network; having the data at the right place at the right time depends on it. Because of this flexibility, information sharing does not become a time-consuming process.

Because a DIS environment evolves, new nodes sometimes have to be added to the network. For example, growth in data-mining tasks may require more nodes. In such a situation, little or no upheaval of the system occurs. The capacity of the DIS environment and its incremental-growth feature let the system adapt to change without rebuilding or redesigning the network.

Besides its many advantages, a DIS environment also has disadvantages. Since information is transmitted between people and programs in a distributed environment, the nodes of the network, whether people or programs, may not trust one another [1]. Common security models for information sharing have two goals. The first is to protect information from adversaries who would destroy it. The second is to prevent information from being disseminated to unauthorized users. The access control mechanism of DIS supports the first goal well. Access control means using the target system to control how information resources are accessed. Access control alone, however, does not meet the second goal. Controlling information flow is therefore important for security models: it prevents malicious propagation of information resources. Most attention should also be directed to the relationships between information resources.

Although privacy and security are handled better than in centralized information sharing, they still carry risk in a DIS environment. Cryptographic protocols can be used to enable privacy-preserving information sharing in a distributed environment. Some are efficient enough for practical use, but constructing them is an extremely difficult task [5]. Other techniques can be too inefficient to be practical.
Correspondingly, each node may identify a large number of candidate anomalous behaviors, which must be compared quickly against those identified by other nodes. When conducted with strong privacy guarantees, this comparison can be so expensive that it renders the network monitor slow and nearly useless. The security definitions involved must be stated, completely understood, and carefully achieved.

As mentioned above, one of the major disadvantages of a DIS environment is the difficulty of controlling and monitoring the system, which also complicates planning. In addition, information may leak as it flows through the distributed environment.

DIS environments are generally complex systems of systems. This requires interoperability between subsystems, so the communication interface in the environment is vital to the success of the system.
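
The graceful degradation described at the start of this section (a failed node lowers performance, while a replicated copy stays readable from another node) can be illustrated with a minimal Python sketch. The `Node` class, the `replicated_read` helper, and the data key are hypothetical names introduced only for illustration; the paper does not prescribe a replication protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical DIS node holding local copies of shared data items."""
    name: str
    store: dict = field(default_factory=dict)
    alive: bool = True

    def read(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.store[key]


def replicated_read(key, replicas):
    """Try each replica in order; fail only if every replica fails."""
    errors = []
    for node in replicas:
        try:
            return node.read(key), node.name
        except (ConnectionError, KeyError) as exc:
            errors.append(f"{node.name}: {exc}")
    raise LookupError(f"no replica could serve {key!r}: {errors}")


nodes = [Node("node1"), Node("node2"), Node("node3")]
for n in nodes[:2]:
    n.store["track-42"] = "contact report"   # replicated on node1 and node2

nodes[0].alive = False                       # simulate a node failure
value, served_by = replicated_read("track-42", nodes)
```

With `node1` down, the read is served by `node2`; the request fails only when no surviving node holds a copy, which is the sense in which no single node is critical.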
Figure 2: Attributes of Distributed Information Sharing. Attributes shown: Security; Flexibility; Reliability & Availability; Privacy; Incremental Growth; Information Leak.

III. Situation and Situation Assessment

Situation assessment is the process of gathering and analyzing the information needed to make an explicit evaluation of a system in its environment. It can be defined as a process of aggregating sensory, non-sensory, and a priori input to construct a representation and evaluation of a situation [1]. It is a necessary step in understanding information that relates to one's goals. It also helps improve the DIS environment through communication with others. Situation assessment can be described in four steps: (1) collecting internal and external data, (2) evaluating the data's impact on the network, (3) analyzing the data, and (4) defining strategies for the data-related situation. At the conclusion of a situation assessment, a strategic planner has a database of quality information that can be used to make decisions and a list of critical issues that demand a response from the system.

A situation is a meaningful abstraction of personal, group, behavioral, and/or environmental information relevant to the goal of a specific data-mining agent [6]. Each agent is one of the DIS nodes and has goals or tasks to carry out. Agents can reach their goals by themselves or share a goal with other members of the group. Each agent holds its own experience in memory and observes different kinds of environmental situations. An agent may, however, hold knowledge that other agents need in order to maintain successful performance [7].
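
The four steps of situation assessment listed in this section can be read as a simple pipeline. The sketch below is only an illustration under stated assumptions: the function names, the numeric readings, the baseline, and the threshold rule are all hypothetical and do not come from the cited work.

```python
def collect(internal, external):
    """Step 1: gather internal and external observations into one list."""
    return list(internal) + list(external)


def evaluate(readings, baseline):
    """Step 2: score each reading by its deviation from an assumed baseline."""
    return [(r, abs(r - baseline)) for r in readings]


def analyze(scored, threshold):
    """Step 3: keep the readings whose impact exceeds the threshold."""
    return [r for r, impact in scored if impact > threshold]


def define_strategy(critical):
    """Step 4: map the analysis result to a toy response strategy."""
    return "respond" if critical else "monitor"


def assess(internal, external, baseline=0.0, threshold=2.0):
    """Run the four steps and return (critical issues, strategy)."""
    scored = evaluate(collect(internal, external), baseline)
    critical = analyze(scored, threshold)
    return critical, define_strategy(critical)


critical_issues, strategy = assess(internal=[0.1, 0.4], external=[3.5, -0.2])
```

The return value mirrors the two outputs named in the text: a list of critical issues and a strategy for responding to them.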
Figure 3: Situation Assessment Process. Flow recoverable from the figure: Is situation assessment necessary? (Yes/No); Data Collection; Data Evaluation; Data Analysis (DM); Prediction; Design strategies for situation; Share the information.

Although the skills involved in situation assessment are not yet well understood, improved situation assessment ability may lead to faster, better planning. Several structures are helpful in situation assessment. First, long-term memory structures are useful in organizing knowledge about an enemy's goals, intentions, strengths, and weaknesses. Second, value/action structures reflect a qualitatively different way of viewing knowledge. In addition, meta-cognitive processes shape and guide the retrieval of knowledge from long-term memory and its synthesis into a model and/or plan for the current situation. However, resolving one kind of problem can sometimes create others. For example, making new assumptions about an enemy's intentions or capabilities in order to reconcile conflicting data renders the situation assessment process unreliable. Detecting such unreliability depends crucially on remembering past assumptions. Situation assessment therefore requires skill in monitoring and regulating one's cognition of the situation [8].

Some key factors also affect how situation assessment is processed within an agent. Temporal relations between perceived sensory data are critical to sensing the situation. Another factor is the agent's expectations and prior experience regarding the meaningfulness of the sensory data, because an agent acts according to its memory. When an agent has a piece of information, it delivers it to the neighbors that need it most, basing that choice on related messages it has previously received. Agent capacity is a further factor because agents are resource bounded.
Therefore, they should not spend too much time on information sharing. The agent's mental attitudes, such as beliefs, wishes, desires, and goals, and the agent's emotions, such as fear, anxiety, and joy, also affect the success of information sharing [1]. For example, an agent's beliefs can determine which information is worth its attention. Including all of these influencing factors requires an extremely high-level DIS architecture.

IV. Data Mining in DIS Environment

In general, data mining (DM) is the process of analyzing data from different perspectives and summarizing it into useful information. It can also be described as extracting previously unknown and potentially useful information from large databases [9]. The process starts with analyzing data to reveal patterns or relationships and then sorts through large amounts of data, picking out relevant pieces of information or patterns. Upper-level data mining methods are prediction, characterization, classification, and clustering. At the upper level, data is randomly separated into training and test sets. The training set is given to the bias optimizer (the lower level), which searches for the best point in the bias space based on an internal train/test evaluation. The best point in bias space is applied to all the training examples, and the resulting function is returned to the upper level for final evaluation against the "unseen" upper-level test examples [10].

The major purpose of DM in a DIS environment is to support situation assessment (SA). Data analysis is one of the steps of SA, and a well-performed DM process makes SA more efficient. After the DM process, strategies for the situation are designed, and this information is shared with other nodes.

The distributed nature of the information in a DIS environment is hidden from the upper-level data-mining agents, and this transparency manifests itself in a number of ways. For the upper-level data-mining agent, the DIS environment must provide a virtual "One-Stop" gateway to ensure sufficient data integration. Integration of the nodes is provided by these gateway agents.
They do not act as a center of information sharing; they function as brokers. They establish a relationship, called a subscription, between the source of the data and the user of the data. The purpose of the subscription is to keep the data current. If the data changes, the source informs the gateway agent, and the gateway agent in turn informs the user of the data. Every communication interface between the source and the user of the data can thus be handled by gateway agents. Since the idea behind DIS is to avoid centralization, communication between distributed nodes is provided through the gateway agent interface. The gateway agent also provides an authenticated, secure path between the nodes of the network and access to the environment outside the network.
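
The subscription relationship described above resembles a publish/subscribe broker. The following Python sketch shows that idea under stated assumptions: the `GatewayAgent` class, its method names, and the topic string are hypothetical, and authentication and transport are left out entirely.

```python
from collections import defaultdict


class GatewayAgent:
    """Hypothetical broker: it keeps subscriptions, not the data itself."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a data user that wants to stay current on a topic."""
        self._subscribers[topic].append(callback)

    def notify_change(self, topic, new_value):
        """Called by the data source when its data changes.

        Forwards the change to every subscriber and returns how many
        users were informed.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, new_value)
        return len(callbacks)


received = []
gateway = GatewayAgent()
gateway.subscribe("node1/weather", lambda t, v: received.append((t, v)))

# The source at node 1 reports a change; the gateway forwards it.
delivered = gateway.notify_change("node1/weather", {"wind_kts": 25})
```

The gateway stores only who is interested in what, so it brokers updates without becoming a central store of the data, which matches the non-centralized intent of DIS.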
Figure 4: DM in DIS Environment (retrieved from [11]). Elements recoverable from the figure: Nodes 1 to 3, each with local users and data; the virtual "One-stop" gateway; the data source for DM; the upper-level DM agent.

V. Prediction under DIS Environment

The ultimate purpose of data mining is prediction. Predictive data mining is the most common and also the most difficult type of data mining, and it has the most direct applications in both business and the military. Prediction is the core activity of SA analysis. It provides the basis for any further action, such as mitigation, preparedness, planning, and emergency response.

The values and patterns of predictive targets in a DIS environment usually exhibit highly non-stationary dynamics because they evolve along with the environment. Since the short-term performance of DM agents in a DIS environment can be measured, it is possible to integrate the reinforcement learning philosophy with the traditional teacher-based prediction framework. Such integration is a breakthrough idea in this domain. Reinforcement provides a history for further predictions, and further predictions depend on the preceding reinforcement learning process [11].

Because prediction is important for the reasons given above, data mining must have certain characteristics. For efficient and effective data mining, the tools used for prediction should provide accurate and realistic solutions, which increases the performance of the data-mining process. Tools should also allow users to pursue customized objectives. Finally, tools should be able to cope with the evolving nature of the DIS environment [12].
The next section presents data-mining tools that can cope with the evolving and complex nature of the DIS environment. These tools can tackle different requirements and challenges of prediction under a DIS environment. Although no concrete solution is proposed, conceptual descriptions of tools for the prediction challenges are given, and a combination of different tools appears to be an efficient route to better data mining.

VI. Proposed Solutions for Prediction under DIS Environment

As the complexity of the DIS environment has increased, researchers have recognized the need for new systematic techniques. Decentralization also makes information sharing difficult to control. More robust and adaptive techniques for prediction under a DIS environment are therefore essential. Useful techniques should address the basic issues in DIS and give effective solutions for prediction. In this sense, evolutionary techniques, which comprise the genetic algorithm (GA), evolutionary programming (EP), evolution strategies (ES), and genetic programming (GP), provide a basis for understanding the problems of prediction under a DIS environment.

Relying on a conventional large-scale engineering process is a serious problem given the distributed nature of information sharing. Creating an environment in which continuous innovation can occur, also called an evolutionary process, is therefore becoming a better solution. The concept of evolution rests on the idea that many different systems can exist at the same time and that changes can be made to them in parallel [13]. Evolutionary techniques obtain their effective solutions through a feedback mechanism. Recent software development strategies such as spiral development, extreme programming, and the open source movement draw on features of evolutionary techniques.
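
To make the evolutionary loop concrete (a population of candidate systems, a utility function that supplies the feedback, selection, and replication with variation), here is a minimal genetic algorithm sketch in Python. It is an illustration only: the bit-string representation, the OneMax utility function, and all parameter values are assumptions chosen for brevity, not a method proposed in the cited work.

```python
import random


def utility(candidate):
    """Assumed utility function: number of 1 bits (the OneMax toy problem)."""
    return sum(candidate)


def mutate(candidate, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in candidate]


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(length=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Feedback: rank candidates by utility and keep the better half.
        population.sort(key=utility, reverse=True)
        history.append(utility(population[0]))
        parents = population[: pop_size // 2]
        # Replication with variation refills the population.
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            children.append(mutate(child, rate, rng))
        population = parents + children
    best = max(population, key=utility)
    return best, history


best, history = evolve()
```

Because the better half of each generation is carried over unchanged, the best utility recorded in `history` never decreases, which is the feedback effect the text attributes to evolutionary techniques.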
In the GA/EA approach, the design process is automated by transferring the whole problem into a computer. Systems engineers must therefore develop the representation of possible systems, identify the utility function, implement selection and replication, and create the system design in the computer. The approach relies on combining the strengths of humans and computers [13].

Creating an environment in which innovation and creative change take place is the fundamental concept of the evolutionary process. The evolutionary process also assumes that even when the same components appear in different parts of the system, the effects of a change do not occur everywhere at the same time. The development environment should therefore be built in a way that lets possibilities be explored quickly [13].

The conventional techniques used in large engineering systems are not entirely abandoned in the evolutionary context. Instead, they should be used to extend the concept of the evolutionary process. Well-known and tested strategies for planning, specification, design, implementation, and testing can be used by the individuals or teams that develop parts of the system [13].

Besides the GA, other computational approaches have emerged to analyze, simulate, and model the characteristics of the DIS environment. One of them is Artificial Life, a program that uses computer-generated agents to simulate the evolutionary process of life, in which survivors are selected. Agents communicate with the environment, receiving information from it and transmitting information to it. The communication and rules in agent-based modeling model the environment of distributed information sharing.

Another approach is the neural network, in which an interconnected group of artificial nerve cells affect one another to reach a result based on their inputs. The simulation continues until the best match for the answer is reached. Neural networks are commonly used for classification problems in data mining. Data mining generally requires a smart architecture, user interaction, and performance. A neural network combined with a GA provides an evolving architecture that selects inputs from the environment and controls its own topology.

In addition, Adaptive Critic Designs (Heuristic Dynamic Programming, Dual Heuristic Programming, and Globalized Dual Heuristic Programming) and Q-learning can be considered as potential reinforcement learning candidates. Whatever the solution is, three key factors should be considered: (1) the system state, (2) the reinforcement reward signal, and (3) the actions that an agent can take. Supervised learning blocks such as Artificial Neural Networks and Fuzzy Association Systems will be used for coarse learning, so that the "normal" system patterns can be generalized from historical data. After sufficient coarse learning, fine learning is applied, which employs reinforcement learning (RL) algorithms. The RL agent will mimic system evolution.
The offset will be generated to show the effects of "unexpected" events [13].

Swarm intelligence can also be a useful tool for understanding the environment of distributed information sharing, because swarm intelligence systems share characteristics with DIS: they are distributed, rely on local interactions, are flexible and robust, and consist of autonomous units.

VII. Conclusion

For network-centric systems, information sharing is crucial to the survivability of the network. Decentralized information sharing has many advantages, such as reliability, efficiency, capacity, and flexibility. On the other hand, the evolving nature of the DIS environment makes prediction much more difficult. In this paper, we described the DIS environment and its advantages and challenges. We then focused on situation assessment and the challenges of prediction under a DIS environment. We highlighted the importance of combining different tools to understand and cope with these challenges. The DIS environment calls for more attention and a stronger focus on robust solutions. New tools should also be added to the systems engineer's toolbox to comprehend and solve DIS problems.
References:

1. http://www.infofusion.buffalo.edu/reports/LLINAS/papers/concepts_strategies_dist_info_sharing.pdf
2. http://www.spawar.navy.mil/sti/publications/pubs/td/3155/4_Sectn3.pdf
3. https://research.maxwell.af.mil/papers/ay2004/affellows/Bontrager.pdf
4. http://www.cs.cmu.edu/~softagents/papers/xu_InfoShare04.pdf
5. N. Anderson, H.S. Abdalla, "Distributed Information Sharing for Collaborative Systems (DISCS)," iv, p. 476, Third International Conference on Information Visualisation (IV'99), 1999.
6. http://www.cs.cornell.edu/andru/papers/iflow-sosp97/paper.html
7. http://www.acfr.usyd.edu.au/publications/downloads/2003/Nettleton203/ICAR2003.pdf
8. http://www.au.af.mil/au/awc/awcgate/army/critical/leav-rpt.pdf
9. http://ml.typepad.com/machine_learning_thoughts/2005/07/data_mining_sta.html
10. http://www.cs.put.poznan.pl/mzakrzewicz/pubs/pakdd01b.pdf
11. Hailin Li and Cihan H. Dagli, "Hybrid Least-Squares Methods for Reinforcement Learning," Proceedings of the 16th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, IEA/AIE 2003, Loughborough, UK, June 2003. Lecture Notes in Computer Science, Springer-Verlag, Vol. 2718, 2003, pp. 471–480.
12. S. Sohn and C. Dagli, "Ensemble of Evolving Neural Networks in Classification," Neural Processing Letters, 19(3):191-203, 2004.
13. Y. Bar-Yam, "When Systems Engineering Fails - Toward Complex Systems Engineering," 2003 IEEE International Conference on Systems, Man & Cybernetics, Washington, D.C., USA, October 5–8, 2003, pp. 2021-2028.