This report is based on our research on the Web and in magazines. First, we give a brief introduction covering the objectives of this report and today's business environment. Then we describe the basic concepts and features of BI technology. Next, we look at the Open-Source concept, its importance, and its effects on BI products and tools. In the fourth section we introduce the best-known BI products, focusing on “Pentaho” as a complete BI suite, to explore how a BI product solves organizational problems. In the final section we conclude our discussion.
There are two approaches to this project. In the first approach, used in the hard-copy report, our focus is on BI products. Products are divided into two categories, Open-Source and proprietary. Each product is discussed in detail, and a side-by-side comparison makes it easy to choose among them based on organizational requirements. In the second approach, used for this presentation because the audience has different backgrounds in IT and because time does not allow evaluating each and every product, our focus is more on basic BI concepts and their importance and on Open-Source advantages. We have selected Pentaho as the most powerful product in the BI environment, and we use it to understand more about a BI product: its features, abilities and architecture.
It should be mentioned that this report does not imply that organizations are guaranteed success by using only these tools to improve overall corporate performance. An assessment of each organization's unique needs is essential to gaining the most benefit from any technology.
Businesses today are faced with a highly competitive marketplace, where technology is moving at an unprecedented pace and customers’ demands are changing just as quickly. Understanding customers, rather than markets, is recognized as the key to success. Industry leaders are quickly moving from a product-centric world into a customer-centric world. Information technology is taking on a new level of importance due to its business intelligence application solutions. The goal is to be more competitive
Imagine being a passenger on an airplane when the pilot suddenly announces that the airplane has lost all communications with air traffic control as well as on-board radar. In other words, the pilot has no way to understand the flight environment, including other airliners and potentially hazardous weather. Would it make you feel better if the pilot assured you that there’s nothing to worry about because he’s an experienced pilot and has flown the same route many times? It shouldn’t, yet many business leaders make decisions daily without an operational business radar: a reliable business intelligence system. It doesn’t matter if the plane is large or small; the pilot must know the environment in which the plane is flying.
Sears, Roebuck and Company is a mid-range chain of international department stores, founded by Richard Sears and Alvah Roebuck. Sears merged with Kmart in early 2005, creating the Sears Holdings Corporation. The company competes at an average price level on par with J.C. Penney. Sears has also recently become a rival of Belk, Dillard's, and Macy's. However, the company competes below Bloomingdale's, Neiman Marcus, Nordstrom and Saks Fifth Avenue.
James H. Thomas Jr. is a market intelligence consultant who served for 26 years as a federal intelligence officer. He is managing director of the J Thomas Group Inc., specializing in the development of strategic business intelligence and counterintelligence systems. He developed the business intelligence system for three Fortune 100 companies and numerous smaller companies. e-Mail: firstname.lastname@example.org; Website: www.mindspring.com/~jt-group/default.htm.
Identifying indicators
Performance indicators differ with business drivers and aims (or goals). A school might consider the graduation rate of its students as a Key Performance Indicator (KPI) that helps it understand its position in the educational community, whereas a business might consider the percentage of income from return customers as a potential KPI. In either case, an organization must at least identify its KPIs. The key conditions for properly identifying KPIs are:
• Having a pre-defined business process.
• Having clear goals/performance requirements for the business processes.
• Having a quantitative/qualitative measurement of the results and comparison with set goals.
• Investigating variances and tweaking processes or resources to achieve long-term goals.
Categorization of indicators
Key Performance Indicators define a set of values to measure against. The raw sets of values fed to systems for summarizing information are called indicators. Indicators identifiable as possible candidates for KPIs can be summarized into the following sub-categories:
• Quantitative indicators, which can be presented as a number.
• Practical indicators, which interface with existing company processes.
• Directional indicators, which specify whether an organization is getting better or not.
• Actionable indicators, which are sufficiently in an organization's control to effect change.
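As an illustration of a quantitative and directional indicator, the sketch below computes the "percentage of income from return customers" mentioned above. The `Order` record, the rule that a return customer is one with more than one order, and the function names are all assumptions made for this example; a real organization would define these terms from its own business process.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order record; field names are assumptions for this sketch.
    customer_id: str
    amount: float


def return_customer_income_share(orders: list[Order]) -> float:
    """Fraction (0..1) of total income that comes from return customers.

    Assumption: a "return customer" is any customer with more than one order.
    """
    order_counts: dict[str, int] = {}
    for order in orders:
        order_counts[order.customer_id] = order_counts.get(order.customer_id, 0) + 1

    total = sum(order.amount for order in orders)
    if total == 0:
        return 0.0  # avoid division by zero when there is no income

    from_returning = sum(
        order.amount for order in orders if order_counts[order.customer_id] > 1
    )
    return from_returning / total


def direction(previous: float, current: float) -> str:
    """Directional reading of a KPI between two measurement periods."""
    if current > previous:
        return "improving"
    if current < previous:
        return "declining"
    return "unchanged"
```

For example, with orders of 100 and 50 from one customer and 50 from another, the share is 150 / 200 = 0.75, and comparing it with an earlier value of 0.5 reports "improving".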
Ackoff indicates that the first four categories relate to the past; they deal with what has been or what is known. Only the fifth category, wisdom, deals with the future because it incorporates vision and design. With wisdom, people can create the future rather than just grasp the present and past. But achieving wisdom isn't easy; people must move successively through the other categories.
Knowledge ... knowledge is the appropriate collection of information, such that its intent is to be useful. Knowledge is a deterministic process. When someone "memorizes" information (as less-aspiring test-bound students often do), then they have amassed knowledge. This knowledge has useful meaning to them, but it does not provide for, in and of itself, an integration such as would infer further knowledge. For example, elementary school children memorize, or amass knowledge of, the "times table". They can tell you that "2 x 2 = 4" because they have amassed that knowledge (it being included in the times table). But when asked what is "1267 x 300", they cannot respond correctly because that entry is not in their times table. To correctly answer such a question requires a true cognitive and analytical ability that is only encompassed in the next level... understanding. In computer parlance, most of the applications we use (modeling, simulation, etc.) exercise some type of stored knowledge.
Understanding ... understanding is an interpolative and probabilistic process. It is cognitive and analytical. It is the process by which I can take knowledge and synthesize new knowledge from the previously held knowledge. The difference between understanding and knowledge is the difference between "learning" and "memorizing". People who have understanding can undertake useful actions because they can synthesize new knowledge, or in some cases, at least new information, from what is previously known (and understood).
That is, understanding can build upon currently held information, knowledge and understanding itself. In computer parlance, AI systems possess understanding in the sense that they are able to synthesize new knowledge from previously stored information and knowledge.
Wisdom ... wisdom is an extrapolative and non-deterministic, non-probabilistic process. It calls upon all the previous levels of consciousness, and specifically upon special types of human programming (moral, ethical codes, etc.). It beckons to give us understanding about which there has previously been no understanding, and in doing so, goes far beyond understanding itself. It is the essence of philosophical probing. Unlike the previous four levels, it asks questions to which there is no (easily-achievable) answer, and in some cases, to which there can be no humanly-known answer, period. Wisdom is, therefore, the process by which we also discern, or judge, between right and wrong, good and bad. I personally believe that computers do not have, and will never have, the ability to possess wisdom. Wisdom is a uniquely human state, or as I see it, wisdom requires one to have a soul, for it resides as much in the heart as in the mind. And a soul is something machines will never possess (or perhaps I should reword that to say, a soul is something that, in general, will never possess a machine).
Data represents a fact or statement of event without relation to other things. Ex: It is raining.
Information embodies the understanding of a relationship of some sort, possibly cause and effect. Ex: The temperature dropped 15 degrees and then it started raining.
Knowledge represents a pattern that connects and generally provides a high level of predictability as to what is described or what will happen next. Ex: If the humidity is very high and the temperature drops substantially, the atmosphere is often unlikely to be able to hold the moisture, so it rains.
Wisdom embodies more of an understanding of fundamental principles embodied within the knowledge that are essentially the basis for the knowledge being what it is. Wisdom is essentially systemic. Ex: It rains because it rains. And this encompasses an understanding of all the interactions that happen between raining, evaporation, air currents, temperature gradients, changes, and raining. Now consider the following: I have a box. The box is 3' wide, 3' deep, and 6' high. The box is very heavy. The box has a door on the front of it. When I open the box it has food in it. It is colder inside the box than it is outside. You usually find the box in the kitchen. There is a smaller compartment inside the box with ice in it. When you open the door the light comes on. What is it? A refrigerator. You knew that, right? At some point in the sequence you connected with the pattern and understood it was a description of a refrigerator. From that point on each statement only added confirmation to your understanding. If you lived in a society that had never seen a refrigerator you might still be scratching your head as to what the sequence of statements referred to. Also, realize that I could have provided you with the above statements in any order and still at some point the pattern would have connected. When the pattern connected the sequence of statements represented knowledge to you. To me all the statements convey nothing as they are simply 100% confirmation of what I already knew as I knew what I was describing even before I started.
Many times, multiple operational systems may have different formats of data. Often, the transactional data does not provide a comprehensive view of the business environment and must be integrated with data from external sources such as industry reports, media data, etc. Existing data in the operational data store is updated to reflect the current status of the source system. Typically, the data is stored in “real time” and used for day-to-day management of business operations.
• Data warehouse
A data warehouse (or an enterprise data warehouse) contains detailed and summarized data extracted from transaction processing systems and possibly other sources. The data is cleansed, transformed, integrated, and loaded into databases separate from the production databases. The data that flows into the data warehouse does not replace existing data; rather, it is accumulated to maintain historical data over a period of time. The historical data facilitates detailed analysis of business trends and can be used for decision making in multiple business units.
• Data mart
A data mart contains a subset of corporate data that is important to a particular business unit or a set of users. A data mart is usually defined by the functional scope of a given business problem within a business unit or set of users. It is created to help solve a particular business problem, such as customer attrition, loyalty, market share, issues with a retail catalog, or issues with suppliers. A data mart, however, does not facilitate analysis across multiple business units.
• Extract, transform and load (ETL) tools
These solutions are concerned with the collection of data from disparate systems (enterprise solutions across the business), the standardization of data, and then population of the data warehouse (DW).
• Data quality (DQ) tools
The usefulness of analysis of data from the DW depends on its quality.
So-called 'dirty' data can significantly reduce the value of a CRM. Problems include duplicate records, incomplete records and issues relating to the formatting of data from different sources. DQ tools are focused on addressing these issues.
• Data warehouses (DW)
Acting as an enterprise-wide data depository, the DW should enable what has become widely referred to as the 'single customer view'. The single customer view represents the full range of information a business holds on its customers and their interactions with the company. It should be held in a standardized format, and refreshed as appropriate for that company's needs.
• Business intelligence tools
Rather than attempt to create an exhaustive list of the different types of tool used to analyze data, Datamonitor defines the broad range as business intelligence tools. These may include online analytical processing (OLAP), data mining, reporting, dashboards, ad-hoc reporting and numerous other tools.
This range of technologies can be simplified further:
• Analytical infrastructure, which includes ETL and DQ tools;
• Data warehousing and data management tools;
• Business intelligence tools: the tools employed to analyse data collected by the first two components.
BI processes and tasks can be summarized as follows:
• Understand the business problem to be addressed
• Design the warehouse
• Learn how to extract source data and transform it for the warehouse
• Implement extract-transform-load (ETL) processes
• Load the warehouse, usually on a scheduled basis
• Connect users and provide them with tools
• Provide users a way to find the data of interest in the warehouse
• Leverage the data (use it) to provide information and business knowledge
• Administer all these processes
• Document all this information in meta-data
BI processes extract the appropriate data from operational systems. Data is then cleansed, transformed and structured for decision making. Then the data is loaded into a data warehouse and/or subsets are loaded into data mart(s) and made available to advanced analytical tools or applications for multidimensional analysis or data mining.
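The extract, cleanse/transform and load steps described above can be sketched in a few lines. This is only a minimal illustration, not how any particular BI product implements ETL: the sample rows, the `sales_fact` table and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def extract() -> list[dict]:
    """Stand-in for reading rows from an operational system (hypothetical data)."""
    return [
        {"customer": " Alice ", "amount": "120.50", "date": "2006-11-01"},
        {"customer": "alice", "amount": "120.50", "date": "2006-11-01"},  # duplicate once normalized
        {"customer": "Bob", "amount": None, "date": "2006-11-02"},  # incomplete record
        {"customer": "Carol", "amount": "80", "date": "2006-11-02"},
    ]


def transform(rows: list[dict]) -> list[tuple]:
    """Cleanse: drop incomplete rows, standardize formats, remove duplicates."""
    seen = set()
    clean = []
    for row in rows:
        if not row["customer"] or row["amount"] is None:
            continue  # incomplete record
        record = (row["customer"].strip().title(), float(row["amount"]), row["date"])
        if record in seen:
            continue  # duplicate record
        seen.add(record)
        clean.append(record)
    return clean


def load(conn: sqlite3.Connection, records: list[tuple]) -> None:
    """Append records to the warehouse table; existing history is never overwritten."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn: sqlite3.Connection) -> int:
    """Run one extract-transform-load cycle and return the number of rows loaded."""
    records = transform(extract())
    load(conn, records)
    return len(records)
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads the two clean rows for this sample. The load step only inserts, which mirrors the point that warehouse data accumulates rather than replacing existing data.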
The business executive views the challenges of implementing effective business intelligence solutions differently than does the CIO, who must build the infrastructure and support the technology.
The business executive wants:
• Application freedom and unlimited access to data - the flexibility and freedom to utilize any application or tool on the desktop, whether developed internally or purchased off the shelf, and access to any and all of the many sources of data that are required to feed the business process, such as operational data, text and html data from e-mail and the internet, flat files from industry consultants, audio and video from the media, without regard to the source or format of that data. And the executive wants that information accessible at all times.
The CIO’s challenge is:
• Connectivity and heterogeneous data sources - building an information infrastructure with a database technology that is accessible from all application servers and can integrate data from all data formats into a single transparent interface to those applications.
The business executive wants:
• Information systems in sync with business processes - information systems that can recognize the executive's business priorities and adjust automatically when those priorities change.
The CIO’s challenge is:
• Dynamic resource management/performance and throughput - a system infrastructure that dynamically allocates system resources to guarantee that business priorities are met, ensuring that service level agreements are constantly honored and that the appropriate volume of work is consistently produced.
The business executive wants:
• Low purchase cost. Today’s BI solutions are most often funded in large part or entirely by the business unit, and the focus at this level is on the cost of purchasing the solution and bringing it on line.
The CIO’s challenge is:
• Total cost of ownership. Skill shortages and the rising cost of the workforce, along with incentives to come in under budget, drive the CIO to leverage the infrastructure and skill investments already made.
The business executive wants:
• Access to all the data all the time/the ability to transform information into actions. Most e-business companies operate across multiple time zones and multiple nations. With decision makers in headquarters and regional offices around the world, BI systems must be on line 24x365. Furthermore, the goal of integrating customer relationship management with real-time transactions makes currency of data in the decision support systems critical.
The CIO’s challenge is:
• Availability/multi-tiered and multi-vendor solutions. Reliability and integrity of the hardware and software technology for decision support systems are as critical as those for transaction systems. Growing in importance is the need to be able to do real-time updates to operational data stores and data warehouses, without interrupting access by the end users.
Implementing BI is a long process and it requires a lot of analysis and investment. A typical BI environment involves business models, data models, data sources, ETL, tools needed to transform and organize the data into useful information, target data warehouse, data marts, OLAP analysis and reporting tools.
Setting up a Business Intelligence environment relies not only on tools, techniques and processes; it also requires skilled business people to carefully drive these in the right direction. Care should be taken in understanding the business requirements, setting the targets, analyzing and defining the various processes associated with them, determining what kind of data needs to be analyzed, determining the source and target for that data, defining how to integrate that data for BI analysis, and determining and gathering the tools and techniques to achieve this goal.
At its base, Open-Source software is software that comes with the source code in a form that customers can modify for their own needs and resell or give away to others under the same terms. Users of the software fund its development directly by either working on the software themselves or contracting someone to do it. This is the key to its success and why it is revolutionizing the software industry. A Linux vs. Windows face-off is the wrong way to think about Open-Source.
1. Free Redistribution
The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
2. Source Code
The program must include source code, and must allow distribution in source code as well as compiled form.
3. Derived Works
The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
5. No Discrimination Against Persons or Groups
The license must not discriminate against any person or group of persons.
6. No Discrimination Against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
8. License Must Not Be Specific to a Product
The rights attached to the program must not depend on the program's being part of a particular software distribution.
9. License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be Open-Source software.
10. License Must Be Technology-Neutral
No provision of the license may be predicated on any individual technology or style of interface.
One of the most exciting things about Open-Source is that it represents a huge shift of power from vendors to end users, who are not left without recourse if the original developer abandons the marketplace
Open-Source projects may fail. Open-Source methods, like all software development methods, do not guarantee success. The key technical factors for success are the skills and dedication of the core developers and interface designers on the project. Open-Source projects can also fail for market reasons if they do not produce results faster than competitive projects. Although no statistics are available, the failure rate for new Open-Source projects is probably similar to the failure rate for new proprietary development projects. Canceled Open-Source projects leave a legacy of source code and ideas that can be merged into more successful efforts or recycled into other projects. Open-Source projects are not deadline driven. With an open ended development team, it is impossible to reliably predict release dates. This is not a problem when deploying finished works, but can be a problem if customers become dependent on anticipated future events. Customers can manage this risk by active participation in the Open-Source project concerned. There are some application areas where the economics don't make sense. Where the number of users is small and they are in strong competition, the value of contributing to an Open-Source project is less clear. Open-Source software is not as well established as proprietary software. Open-Source software has been available and growing in scope for decades, but there are still many application areas where Open-Source solutions are not yet available in final form. There are an extremely large number of active projects working to close this gap. Open-Source software is also unfamiliar to many potential users. Individuals and corporations with UNIX experience have a wide range of Open-Source products that are familiar and available. Users habituated to other platforms have fewer Open-Source products available without changing operating systems and face more of a learning curve. 
In the past, the press and market research organizations have not evaluated Open-Source alternatives to proprietary software. Background information on Open-Source software is only now being written. Open-Source software is unproven for non-technical applications. Not surprisingly, the first successes of Open-Source software have been in areas where the users and developers are one and the same. The origins of Open-Source have been developers with unmet problems, needs or desires who then wrote code for their own use, often in their spare time, and shared the results with other developers. Open-Source is now expanding into new areas and producing products for non-technical users, but this work is in its infancy.
But if we say that Open-Source software isn't reliable enough to use, then the Internet isn't reliable enough either, because the Internet infrastructure relies heavily on Open-Source software. Every single internet address, both web and email, depends on the Domain Name System, or DNS. At the heart of the DNS is an Open-Source program called BIND. BIND (Berkeley Internet Name Domain, previously Berkeley Internet Name Daemon) is the most commonly used DNS server on the Internet, especially on Unix-like systems. It's also well known that the Open-Source Apache web server hosts more than 60% of the world's web sites, including many of the most heavily trafficked, such as Yahoo!, which runs on a network of more than 2000 FreeBSD-based machines running a modified version of the Apache web server.
The Internet is a key enabler for the development and distribution of Open-Source software. The rapid expansion of the Internet into business and the home has extended the reach of Open-Source and more widely publicized its benefits. Open-Source products have been available for years and used extensively in the UNIX world. The creation of Open-Source operating systems such as Linux has "completed the circle", enabling complete systems and networks to be deployed entirely with Open-Source software. The popularity of Linux has generated new revenue for Open-Source vendors that is now being used to expand development efforts. The unbundling of support from products makes the Open-Source business model more attractive to vendors and more familiar to customers. Linux is the only operating system other than the Microsoft Windows family with a growing market share. According to press reports quoting IDC and Datapro studies, Linux is now used by more than 14% of businesses and its market share is expected to overtake the Mac OS before 2010.
Many customers spend massive amounts of money on proprietary BI solutions in the hope that these software products will help them. However, commercial BI solutions are consistently criticized in the following areas:
• Price: The price, maintenance costs, support, and services are too expensive.
• Usability: Too difficult to use for most users.
• Skills: Lack of adequate skills transfer from vendor to customer. Lack of implementation methodologies.
• Customization: Too difficult for customers to develop solutions and integrate business rules.
• Tool-set orientation: The ‘solutions’ are tool sets and not solutions at all.
• Extensibility: The solutions are proprietary and impossible for customers and aftermarket suppliers to extend and direct. Customers did not buy the software; they paid upfront for the right to use it. This is like getting a lease on a car but making all the payments on day one: it’s the worst of both worlds.
• Reporting and analysis focus: The solutions are focused on the reporting and analysis of KPIs, and ignore the performance of the processes that affect the metric.
• Process influence: They cannot ensure that changes are driven in a business process. They assume that the delivery of a report will have the side effect of influencing a business process.
• Tracking and auditing: They are unable to provide complete tracking and auditing. Who got the report? What action did they take? How long did it take? Was a business process initiated as a result? How far along is that process? What is the performance of the process?
• Prototyping: The software pricing models do not support the prototyping phases necessary to ensure the success of Business Intelligence projects. Significant financial outlay is required and contractual agreements must be signed before full evaluation and prototyping can be done.
The Pentaho BI Platform is different from traditional BI products. It is a process-centric, solution-oriented framework with Business Intelligence (BI) components that enable companies to develop complete solutions to Business Intelligence problems.
The BI Platform is process-centric because the central controller is a workflow engine. The workflow engine uses process definitions to define the Business Intelligence processes that execute within the BI Platform. The processes can be easily customized and new processes can be added. The BI Platform includes components and reports for analyzing the performance of these processes.
The BI Platform is solution-oriented because the operations of the Platform are specified in process definitions and action documents that specify every activity. These processes and operations collectively define the solution to a Business Intelligence problem. This BI Solution can be easily integrated into business processes that are external to the Platform. The definition of a Solution can contain any number of processes and operations.
The Platform consists of a BI Framework, BI Components, a BI Workbench, and desktop Inboxes:
• The BI Framework provides logging, auditing, security, scheduling, ETL, web services, attribute repository and rules engines.
• The BI Components include reporting, analysis, workflow, dashboards, and data mining.
• The BI Workbench is a set of design and administration tools that are integrated into the popular Eclipse environment. These tools allow business analysts or developers to create reports, dashboards, analysis models, business rules, and BI processes.
• The desktop Inboxes can be third-party RSS readers or the Pentaho Inbox Alerter. The Inboxes deliver tasks and report / exception notifications.
• The BI Framework and BI Components form the Pentaho Server. BI Solutions are designed using the BI Workbench and deployed to the Pentaho Server.
The Pentaho Server is the runtime engine, driven by the workflow engine, which coordinates the execution and communication between all the BI Components. The architecture is a combination of original source code and mature Open-Source components that have been integrated to form a complete, scalable, sophisticated BI Platform.
The Pentaho BI Platform is built upon a foundation of servers, engines, and components. These provide the J2EE server, security, portal, workflow, rules engines, charting, collaboration, content management, data integration, analysis, and modeling features of the system. Many of these components are standards-based and can be replaced with other products.
To create a truly integrated, single-source solution, Pentaho adds the following:
• Common metadata in the form of solution definition documents
• Common user interfaces and user interface components
• Security
• Email and desktop notifications
• Installation, integration and validation of all components
• Sample solutions
• Application connectors
• Usage and diagnostic tools
• Design tools
• Customization and configuration
• Process Performance analysis reports and ‘what-if’ modeling
The BI Platform is integrated with external applications that provide data to drive the solutions. This data is loaded into a data warehouse or data mart using an Open-Source ETL tool. The Solution Engine is central to the architecture and manages access to the BI components.
The services of the BI Framework:
• Provide web services to external applications
• Have access to the same Solution Engine as the user interface components
• Are called by the workflow engine and scheduler to execute system actions
The Server includes the components and technologies required to build a Business Intelligence solution: reporting, workflow, business rules, dashboards/analysis, web services, scheduling, a mix of convenient web and desktop user interfaces, and auditing. The Pentaho BI Platform provides system monitoring via Simple Network Management Protocol (SNMP).
The repositories are stored inside an RDBMS that is outside of the Pentaho platform. The embedded repositories in the preconfigured installation are stored inside an Open-Source database, either FireBird (preferred) or MySQL. These repositories can be replaced with other relational databases such as Oracle, SQLServer or DB/2 if required.
The Desktop Alerter is an application that provides alerts in RSS format when new workflow tasks are assigned to a user or reports are made available to that user. This application must be installed on the computer of every user who needs to use it.
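To make the "process-centric" idea more concrete, here is a minimal sketch of a workflow engine that executes a process definition step by step and audits every step. It is not Pentaho's API or file format: `ProcessDefinition`, `WorkflowEngine` and the two sample actions are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

# An action takes the current context and returns an updated context.
Action = Callable[[dict], dict]


@dataclass
class ProcessDefinition:
    # Hypothetical stand-in for a process definition: an ordered list of named actions.
    name: str
    steps: list[tuple[str, Action]]


@dataclass
class WorkflowEngine:
    # Every executed step is recorded, so "who ran what, and when" can be answered later.
    audit_log: list[dict] = field(default_factory=list)

    def run(self, process: ProcessDefinition, context: dict) -> dict:
        for step_name, action in process.steps:
            context = action(context)
            self.audit_log.append(
                {
                    "process": process.name,
                    "step": step_name,
                    "at": datetime.now(timezone.utc).isoformat(),
                }
            )
        return context


def late_orders_report(context: dict) -> dict:
    """Reporting step: select the orders that are late."""
    late = [o for o in context["orders"] if o["days_late"] > 0]
    return {**context, "late_orders": late}


def notify(context: dict) -> dict:
    """Notification step: build the alert text a desktop inbox could display."""
    return {**context, "alert": f"{len(context['late_orders'])} late order(s)"}
```

A process is then just data handed to the engine, for example `ProcessDefinition("late-orders", [("report", late_orders_report), ("notify", notify)])`. Adding or reordering steps changes the solution without changing the engine, which is the design point the text attributes to a workflow-driven platform.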
Importance of BI
• As a competitive advantage
• As a binocular which ensures management isn’t blindsided
• Access the right information at the right time
• Single version of the TRUTH
Business Intelligence
• A systematic process that collects, analyzes and organizes the flow of critical information
• Assists all levels of people in the organization in making strategic, tactical and operational DECISIONs
BI features
• Analytical Tools
• Data Warehousing
• OLAP
• ETL Tools
• Dashboarding
• Flexible Reports
• Workflow designing
Open-Source benefits
• Lower software costs
• More flexibility
• More reliable products
• Better standardization and long term stability
• Not reliant on a single vendor
• L.A.M.P: web platform
BI Product
• Process-Centric
• Solution-Oriented
• Architecture: Server, BI Workbench, Inbox Alerter
Aureole InfoTech Co. Hamdard University A Summer Training Report on Department of Management Studies New Delhi-India November 2006 Presented by: Parinaz SarafiGohar, Hamid ShamlouNasab MBA 2nd year
Agenda I. Introduction II. Business Intelligence III. Open-Source IV. BI Products V. Summary
I.1 Introduction Research Methodology Research Mandate: Understanding BI systems, architecture and features of available tools Research Type: Descriptive and Comparative Approaches: Product-Oriented: Hardcopy report Focus on products Two categories Side-by-side comparison Concept-Oriented: PPT Presentation Different audience’s background in IT Focus on concepts Pentaho as a complete BI suite
I.2 Introduction Objectives This report starts from the fact that most organizations have become very good at capturing and storing the data they generate every time they perform a business operation. But the question is: “Is it enough?” Does simply saving detailed data and information on each and every entity guarantee access to useful intelligence?
I.3 Introduction Today’s Business Environment: huge and constantly growing operational databases, but little insight into the driving forces of the business; a move from a product-centric world to a customer-centric world; rapidly advancing technology that delivers new opportunities; reduced time to market; a highly competitive environment; mergers and acquisitions that cause business confusion. The goal is to be more competitive.
II. Business Intelligence II.1 Importance of BI II.2 What is BI? II.3 BI Environment & Business Flow II.4 BI Implementation
II.1 Importance of BI Airplane Scenario Pilot: “The airplane has lost all communications with air traffic control.” There is no way to understand the flight environment (other airliners and potentially hazardous weather). Pilot: “There’s nothing to worry about, because I’m an experienced pilot and have flown the same route many times!” Yet many business leaders make decisions daily without an operational business radar: a reliable BI system.
II.1 Importance of BI Airplane Scenario (Continued) It doesn’t matter if the plane is large or small; the pilot must know the environment in which the plane is flying.
II.1 Importance of BI Facts “The question is not whether your company will lose touch with the competitive arena or not, but when it will lose touch.” Ben Gilad, educator and author of Business Blindspots. In 1980, when Sears (Sears, Roebuck and Co. department stores) was the retail leader, its executives admitted they had never heard of Sam Walton and Wal-Mart; they know them now. More than 200 companies that made up the 1979 ‘Fortune 500’ are now out of business, just 21 years later. Of the companies that made the 1955 list, 70% (350) no longer exist.
II.1 Importance of BI Facts (Continued) Recently, executives of a leading Internet network supplier said they had no need for a business intelligence system because they are the market leader and have no real competition. If they don’t have competition now, they will.
II.1 Importance of BI Facts (Continued) In 1997, the CEO of a Silicon Valley-based software company said he had all the information he needed about his competitors. Several years later, that same CEO used a private investigator to unethically pilfer through the trash of a major competitor. When that CEO needed competitive intelligence to learn about the future strategy and courses of action of that competitor (a large Seattle-based software company), he didn’t have the capability.
II.1 Importance of BI Questions What was the net profit for a particular product last year? What will be the total sales for coming year? What are the key factors to be focused on, in order to increase the sales for this year? How can we analyze our competitors? How fast can we assess the business environment? How can we gain our competitive advantages?
II.2 What is BI? Definition “It’s a systematic process that collects, analyzes, and organizes the flow of critical information, focusing it on important strategic and operational issues” James H. Thomas. Using BI, corporate data can be organized and analyzed in a better way and then converted into useful knowledge.
II.2 What is BI? Definition [Figure: content types (movies, graphics, spreadsheets, web pages, text, video, audio, documents) and systems (CRM, DSS, DM, KM, GIS, EIS, OLAP, ERP, DW) brought together into a single version of the TRUTH]
II.2 What is BI? Features BI applications include: Query and Reporting, Online Analytical Processing (OLAP), Statistical Analysis, Decision Support Systems, Forecasting, Data Mining, Dashboarding
II.2 What is BI? BI Model BI Models are based on: Key Performance Indicators (KPI) Multi dimensional analysis
II.2 What is BI? BI Model KPI: Key performance indicators. A KPI is a statistical measure used to quantify objectives and reflect the strategic performance of an organization. A KPI is used in BI to assess the present state of the business and to prescribe a course of action. KPIs are frequently used to “value” difficult-to-measure activities, such as the benefits of leadership development, engagement, service, and satisfaction. KPIs are typically tied to an organization’s strategy.
II.2 What is BI? BI Model KPI - Key performance indicators (Continued) KPIs differ depending on the nature of the organization and the organization’s strategy. A KPI is a key part of a measurable objective, which is made up of a direction, KPI, benchmark, target and timeframe. For example: "Increase Average Revenue per Customer from $10 to $15 by EOY 2008", where Average Revenue per Customer is the KPI. A KPI should not be confused with a Critical Success Factor. For the example above, a critical success factor would be something that needs to be in place to achieve that objective, e.g. “launching a new product”.
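The five parts of a measurable objective can be sketched as a small data structure. This is a minimal illustration, not part of any BI product: the class name `MeasurableObjective` and its `progress` helper are hypothetical, and only the example values come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurableObjective:
    """Hypothetical container for the five parts named on the slide."""
    direction: str    # e.g. "Increase" or "Decrease"
    kpi: str          # the indicator being measured
    benchmark: float  # starting value
    target: float     # desired value
    timeframe: str    # deadline, kept as free text

    def progress(self, current: float) -> float:
        """Fraction of the benchmark-to-target gap closed so far, clamped to 0..1."""
        span = self.target - self.benchmark
        if span == 0:
            return 1.0
        fraction = (current - self.benchmark) / span
        return max(0.0, min(1.0, fraction))

    def describe(self) -> str:
        return (f"{self.direction} {self.kpi} from ${self.benchmark:g} "
                f"to ${self.target:g} by {self.timeframe}")


objective = MeasurableObjective(
    direction="Increase",
    kpi="Average Revenue per Customer",
    benchmark=10.0,
    target=15.0,
    timeframe="EOY 2008",
)
print(objective.describe())
print(objective.progress(12.5))  # halfway between the benchmark and the target
```

A critical success factor such as “launching a new product” is deliberately left out of the structure: it is a precondition for reaching the target, not something the KPI itself measures.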
II.2 What is BI? BI Model Multi-Dimensional Data (Cube) [Figure: data cube with three dimensions: Product (PC, Laptop, PDA), Branch (India, Australia, USA) and Season (Spring, Summer, Fall, Winter). The faces show Product-Branch, Season-Branch and Product-Season totals, the edges show Product, Branch and Seasonal totals, and one corner holds the GRAND TOTAL. Highlighted cell: sales of Laptop in India during the Fall season]
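A cube of this kind can be sketched in a few lines of Python. The sales figures below are invented purely for illustration, and `roll_up` is a hypothetical helper rather than the API of any OLAP engine; it sums the cube over every dimension that is not kept.

```python
from collections import defaultdict

# Hypothetical unit sales keyed by (product, branch, season).
sales = {
    ("Laptop", "India", "Fall"): 120,
    ("Laptop", "India", "Winter"): 90,
    ("Laptop", "USA", "Fall"): 200,
    ("PC", "India", "Fall"): 75,
    ("PDA", "Australia", "Spring"): 40,
}

DIMENSIONS = ("product", "branch", "season")


def roll_up(cube, keep):
    """Sum the cube over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


# One cell of the cube: sales of Laptop in India during the Fall season.
cell = sales[("Laptop", "India", "Fall")]

# A face of the cube: Product-Branch totals (season summed away).
product_branch = roll_up(sales, ("product", "branch"))

# An edge of the cube: seasonal totals.
season_totals = roll_up(sales, ("season",))

# The corner of the cube: the grand total.
grand_total = roll_up(sales, ())[()]

print(cell, product_branch[("Laptop", "India")], season_totals[("Fall",)], grand_total)
```

Real OLAP servers precompute and store many of these aggregates so that queries do not have to scan every cell.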
II.2 What is BI? Goals BI is a binocular that ensures management isn’t blindsided. The primary goals of BI are: avoid surprises; identify threats and opportunities; understand where your company is vulnerable; decrease reaction time; out-think the competition; protect intellectual capital.
II.3 BI Environment & Business Flow Journey from Data to Wisdom According to Russell Ackoff, a systems theorist and professor of organizational change, the content of the human mind can be classified into five categories. Data: symbols, raw facts. Information: data that are processed to be useful; provides answers to “who”, “what”, “where”, and “when” questions. Knowledge: relevant and actionable data and information; answers “how” questions. Understanding: appreciation of “why”; the difference between understanding and knowledge is like the difference between learning and memorizing. Wisdom: evaluated understanding, by which we can judge between right and wrong, between good and bad. “Wisdom is not a product of schooling, but of the lifelong attempt to acquire it” Albert Einstein
II.3 BI Environment & Business Flow Journey from Data to Wisdom (Continued) BI is comprised of a variety of types of information that can range from being fairly easy to acquire to being very difficult to acquire. Information on the right side is typically only available through primary research interviews. Information on the left side is often available through secondary research using online databases or the Internet. [Figure: information types arranged by difficulty of acquisition. Labels: Division Succession Plans, New Product Intro Plans, Marketing Strategy, What if?, Material Costs, Tooling Decisions, Ultimate Pricing Strategy, Research Programs, Volumes & Capacities, Sales Mix Emphasis, Interviews, Street Pricing, Local Press, Customer Satisfaction, Product Teardown, Trade Press, Dictionary Listings, D&B, ADS, DOW, Annual Reports, Product Literature]
II.3 BI Environment & Business Flow BI Architecture [Figure: BI architecture in five numbered stages. Labels: Enterprise Applications, SQL Server, Oracle, Main Frame, Data Warehouse, Data Mart, Cubes. Captions: Output, OLTP Systems, Analytical Data, BI Tools, Infrastructure Management, Performance Management]
II.3 BI Environment & Business Flow Challenges
Business Executive: Application freedom and unlimited access to data
CIO: Heterogeneous data sources. Building an information infrastructure that is accessible and can integrate data from all application servers. Providing a single transparent interface to those applications.
Business Executive: Information systems in synch with business processes
CIO: A system infrastructure that dynamically allocates system resources to guarantee that business priorities are met. Ensuring that service level agreements are constantly honored. Appropriate volume of work is consistently produced.
II.3 BI Environment & Business Flow Challenges (Continued)
Business Executive: Low purchase cost
CIO: Total cost of ownership. Skill shortages and rising cost of workforce.
Business Executive: Access to all the data all the time. Ability to transform information into actions.
CIO: Availability of data. Multi-tiered and multi-vendor solutions. Real-time updates to operational data stores and data warehouses. No interruption of end user’s access.
II.4 BI Implementation BI Implementation is a large process that involves: business models, data models, data sources, and ETL tools. It then transforms and organizes the data into: useful information, a target data warehouse, data marts, OLAP analysis, and reporting tools.
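The flow from data sources through ETL to a warehouse and a report can be sketched with the Python standard library. Everything specific here is an assumption made for illustration: the CSV layout, the `fact_sales` table name, and the cleaning rules are invented, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical export from an operational (OLTP) source.
SOURCE_CSV = """order_id,product,branch,amount
1,Laptop,India,1200.50
2,PC,USA,800
3,laptop,india,999.50
4,PDA,Australia,
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names to upper case and drop rows with no amount."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # incomplete record, not loaded into the warehouse
        clean.append((
            int(row["order_id"]),
            row["product"].strip().upper(),
            row["branch"].strip().upper(),
            float(row["amount"]),
        ))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(order_id INTEGER PRIMARY KEY, product TEXT, branch TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the target data warehouse
load(conn, transform(extract(SOURCE_CSV)))

# A simple report on top of the warehouse: revenue per product.
report = conn.execute(
    "SELECT product, SUM(amount) FROM fact_sales GROUP BY product ORDER BY product"
).fetchall()
print(report)
```

The transform step is where the “single version of the truth” is enforced: the two spellings of the same product end up as one row in the report.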
II.4 BI Implementation Requirements for setting up a BI environment Intelligence environment relies on Tools Techniques Processes Skilled business people
III.1 Open-Source Definition Open-Source software is customer-constructed software that comes with the source code, is modifiable, and is resalable. Open-Source is like a stone thrown into a pond; the ripples spread outwards, even if you can no longer see the stone that caused them.
III.2 Open-Source Regulations: Free Redistribution; Source Code; Derived Works (modification and redistribution); No Discrimination Against Persons or Groups; No Discrimination Against Fields of Endeavor; License Must Not Be Specific to a Product; License Must Not Restrict Other Software; License Must Be Technology-Neutral
III.3 Open-Source Benefits Lower software costs More flexibility More reliable products Better standardization and long term stability Not reliant on a single vendor Faster pace of innovation New projects can build on the existing base of Open-Source code Peer review increases security for systems exposed to public networks
III.4 Open-Source Risks: Open-Source projects may fail; Open-Source projects are not deadline driven; there are some application areas where the economics don’t make sense; Open-Source software is not as well established as proprietary software; Open-Source software is unproven for non-technical applications
III.5 Open-Source Why should we use it? The main reasons are: the Internet as a key enabler for the development and distribution of open-source; Linux and Apache (popularity and a market share of 15%); changes in proprietary software pricing; shortcomings of proprietary solutions
III.5 Open-Source Shortcomings of Proprietary Solutions: Price; Usability; Lack of skills; Tool-set Orientation; Extensibility; Reporting and analysis focus; Tracking and Auditing; Prototyping. Price: customers did not buy the software, they paid for the contractual right to use it. This is like getting a lease on a car but making all the payments on day one: it’s the worst of both worlds. Tracking and Auditing: Who got the report? What action did they take? How long did it take? Was a business process initiated as a result? How far along is that process? Prototyping: a significant financial outlay upfront, and agreements that must be signed before the evaluation and prototyping can be done.
III.6 Open-Source L.A.M.P The acronym LAMP refers to a set of free software programs commonly used together to run dynamic web sites or servers: Linux, the operating system Apache, the Web server MySQL, the database management system Perl, PHP, Python, the scripting/programming languages To be precise, it is an Open-Source Web platform
IV. BI Products
IV.1 Open-Source BI Products: IV.1.A Pentaho, IV.1.B BEE Project, IV.1.C Bizgres, IV.1.D MARVELit, IV.1.E Open I, IV.1.F SpagoBI, IV.1.G JasperSoft, IV.1.H Firebird, IV.1.I MySQL, IV.1.J PostgreSQL
IV.2 Proprietary BI Products: IV.2.A Microsoft, IV.2.B SAS, IV.2.C Cognos, IV.2.D Hyperion, IV.2.E Panorama, IV.2.F Prophix, IV.2.G Targit, IV.2.H TM1
IV.1 Open-Source BI Products Pentaho Process-Centric: processes can be easily customized and new processes can be added. Solution-Oriented: enables companies to develop complete solutions to business intelligence problems. The platform consists of: BI framework: provides logging, auditing, security, scheduling, ETL, Web services, attribute repository, rules engines. BI component: includes reporting, analysis, workflow, dashboards, data mining. BI workbench: a set of design and administration tools that allows business analysts or developers to create reports, dashboards, analysis, models, business rules. Desktop Inboxes: third-party RSS readers.
IV.1 Open-Source BI Products Pentaho Architecture Briefly, we can say Pentaho includes: Server BI work bench Inbox alerter
IV.1 Open-Source BI Products Pentaho Architecture - Server The server is made up of the BI framework and the BI components. The server runs inside a J2EE Web server such as: Apache, Oracle, WebLogic, JBoss, WebSphere. In Pentaho, component content can be retrieved as XML or HTML.
IV.1 Open-Source BI Products Pentaho Architecture - Server The Pentaho Server includes embedded repositories that store the data necessary to define, execute and audit a solution. Solution repository: the metadata that defines solutions. Runtime repository: items of work that the workflow engine manages. Audit repository: tracking and auditing information.
IV.1 Open-Source BI Products Pentaho Architecture External applications supply data, which is loaded into a data warehouse or data mart using an Open-Source ETL tool. The services of the BI framework include Web services and the workflow engine. The Pentaho Server includes: reporting, workflow, business rules, dashboards/analysis, Web services, scheduling. The Pentaho BI platform provides system monitoring via SNMP (Simple Network Management Protocol). The repositories are stored inside an RDBMS that is outside of the Pentaho platform (FireBird (preferred), MySQL, Oracle, SQLServer or DB/2). The Desktop Alerter is an application that provides alerts in RSS format.
IV.1 Open-Source BI Products Pentaho Architecture: Inbox Alerter The optional inbox alerter is an agent that needs to be installed on the machines of the users who wish to take advantage of its functionality. It has these features: notification of new workflow tasks; notification of report delivery; management of off-line content.
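Because the alerts are delivered in RSS format, any RSS reader can consume them. The sketch below builds and reads back a minimal RSS 2.0 feed with the Python standard library. It is not Pentaho's actual feed: the function `build_alert_feed`, the channel text, the alert fields, and the `bi.example.com` URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_alert_feed(user, alerts):
    """Return an RSS 2.0 document (as a string) with one item per alert."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"BI alerts for {user}"
    ET.SubElement(channel, "link").text = "http://bi.example.com/inbox"
    ET.SubElement(channel, "description").text = (
        "New workflow tasks and delivered reports"
    )
    for alert in alerts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = alert["title"]
        ET.SubElement(item, "category").text = alert["kind"]
        ET.SubElement(item, "link").text = alert["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical alerts: one workflow task and one delivered report.
alerts = [
    {"title": "Approve Q3 budget variance", "kind": "workflow-task",
     "link": "http://bi.example.com/tasks/1"},
    {"title": "Monthly sales report is ready", "kind": "report",
     "link": "http://bi.example.com/reports/2"},
]

feed_xml = build_alert_feed("analyst", alerts)

# What a desktop reader would do: parse the feed and list the item titles.
parsed = ET.fromstring(feed_xml)
titles = [item.findtext("title") for item in parsed.iter("item")]
print(titles)
```

Using a standard feed format is what allows third-party RSS readers to act as desktop inboxes without any product-specific client code.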
IV.1 Open-Source BI Products Pentaho Architecture – Work Bench Pentaho- BI Work Bench (Continued) Analysis Enables ad-hoc, interactive data exploration with the ability to slice-and-dice, drill-down, and pivot information Includes highly graphical front-end to OLAP cubes for automated aggregation and speed-of-thought response times
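The exploration operations named on this slide can be sketched over a handful of fact rows. This is a plain-Python illustration under stated assumptions, not Pentaho's analysis API: the rows are invented, and `dice`, `slice_`, `aggregate` and `pivot` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical fact rows; a real OLAP front end would read these from a cube.
FACTS = [
    {"product": "Laptop", "branch": "India", "season": "Fall", "units": 120},
    {"product": "Laptop", "branch": "India", "season": "Winter", "units": 90},
    {"product": "Laptop", "branch": "USA", "season": "Fall", "units": 200},
    {"product": "PC", "branch": "India", "season": "Fall", "units": 75},
    {"product": "PC", "branch": "USA", "season": "Winter", "units": 60},
]


def dice(rows, **allowed):
    """Dice: keep rows whose dimension values fall in the allowed sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def slice_(rows, dimension, value):
    """Slice: fix a single dimension to a single value."""
    return dice(rows, **{dimension: {value}})


def aggregate(rows, *dimensions):
    """Sum units grouped by the given dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dimensions)] += r["units"]
    return dict(totals)


def pivot(rows, row_dim, col_dim):
    """Pivot: arrange aggregated units as a row-by-column grid."""
    table = defaultdict(dict)
    for (row_key, col_key), units in aggregate(rows, row_dim, col_dim).items():
        table[row_key][col_key] = units
    return dict(table)


fall = slice_(FACTS, "season", "Fall")
by_product = aggregate(FACTS, "product")                   # summary level
by_product_branch = aggregate(FACTS, "product", "branch")  # drill-down adds a dimension
grid = pivot(FACTS, "branch", "season")                    # branches as rows, seasons as columns

print(by_product)
print(by_product_branch)
print(grid)
```

Drill-down here simply means grouping by one more dimension; an OLAP server answers the same question from precomputed aggregates, which is where the fast response times come from.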
IV.1 Open-Source BI Products Pentaho Architecture – Work Bench Pentaho- BI Work Bench It provides easy to use design tools for reports, dashboards, analytic views Reporting From simple reports on a web page to high quality production reporting for applications such as financial statements and other formal reporting needs Enterprise-class features include automated bursting of reports tailored by role, parameter-driven filtering, and a server-based report repository
IV.1 Open-Source BI Products Pentaho Architecture – Work Bench Pentaho- BI Work Bench (Continued) Dashboards Brings together reports, analyses, and other displays into a single graphical place for easy access Can be customized by person, business role, and/or subject matter
IV.1 Open-Source BI Products Pentaho Architecture – Work Bench Pentaho- BI Work Bench (Continued) Data mining console for data preparation Uncovers hidden relationships in data which can be used to optimize business processes and predict future results Provides a full range of advanced data mining algorithms Enables results to be displayed to users in an easy-to-understand format
IV.1 Open-Source BI Products How Pentaho solves the problem It integrates: design and administration tools, workflow, analysis tools, business rules, dashboards, information delivery, data warehouse, notification, data mining, scheduling, inbox alerter, auditing, application integration, content navigation, user interfaces, reporting tools
Summary
Importance of BI: as a competitive advantage; as a binocular which ensures management isn’t blindsided; access to the right information at the right time; a single version of the TRUTH
Business Intelligence: a systematic process that collects, analyzes, and organizes the flow of critical information; assists all levels of people in the organization in making strategic, tactical, and operational DECISIONs
BI features: Analytical Tools, Data Warehousing, OLAP, ETL Tools, Dashboarding, Flexible Reports, Workflow designing
Open-Source benefits: lower software costs; more flexibility; more reliable products; better standardization and long term stability; not reliant on a single vendor; L.A.M.P: web platform
BI Product: Process-Centric; Solution-Oriented; Architecture: Server, BI work bench, Inbox alerter
References
Magazines:
Data Quest Magazine, issued May 31, 2006
World Business Magazine, issued June 5, 2006
Articles:
Business Intelligence, by Elizabeth Vitt, Michael Luckevich, Stacia Misner
Sun Microsystems: Business Intelligence and Data Warehousing, Transform raw data into business results
IBM Systems Journal: The integration of business intelligence and knowledge management, by W. F. Cody, J. T. Kreulen, V. Krishna, and W. S. Spangler
Migration to Open-Source Databases, by Jutta Horstmann
Moving to Strategic Business Intelligence, Butler Group, Mar. 1, 2006
The Business Value of Business Intelligence, by Steve Williams, Nancy Williams
Business Intelligence, why?, by James H. Thomas Jr.
IBM: Business Intelligence Architecture on S/390, by Viviane Anavi-Chaput, Patrick Bossman, Robert Catterall, Kjell Hansson, Vicki Hicks, Ravi Kumar, Jongmin Son
Data, Information, Knowledge, and Wisdom, by Gene Bellinger, Durval Castro, Anthony Mills
Web:
http://www.Bee.insightstrategy.cz
http://www.CaMagazine.com
http://www.180Systems.com
http://www.DestinationCRM.com
http://www.DMReview.com
http://www.LearnBI.com
http://www.Wikipedia.com
http://www.Oreillynet.com
http://www.OpenSource.org
http://www.Openi.sourceforge.net
http://www.pentaho.com
http://www.sas.com
http://www.hyperion.com
http://www.cognos.com
Thank You For Your Time And Attention
Questions?