K6221 Business Intelligence Mini Assignment



                           K6221 Business Intelligence 2011-2012
                                               Mini Assignment
                                 Sesagiri Raamkumar Aravind (G1101761F)
                                  Mane Shivaji Dilip Kumar (G1101841A)

   “Enterprises today have access to large amounts of information from internal as well as external sources.
   The information typically comes in either structured or less structured forms. However, enterprises
   generally do not make the best use of the information they have access to, tending instead to focus on just
   internal structured data generated by core transactional systems.”


Statement Elucidation

As per the problem statement, even though enterprises have access to a plethora of relevant information, they make
good use only of the data coming from traditional OLTP systems, which is restricted to structured content. Internal
and external unstructured data is not leveraged for making business decisions. Wittles (n.d.) asserts that only 20%
of an organization’s data is structured and ready for use in BI data analysis; the remaining 80% is unstructured.
The significance of unstructured data is therefore highly underestimated in most enterprises.

Scenario

The authors opt to critically discuss the problem statement based on a particular scenario. The scenario is
“‘Marketing Director’ of a major pharmaceutical company monitoring the performance of a newly launched
potential blockbuster drug in the Asia Pacific region (excluding Japan).”

Discussion

Large enterprises today rely on enormous and complicated information systems to fuel their growth and to support
their daily operations and sustainability. Spending on such systems reaches billions in certain companies. In our
scenario, pharmaceutical companies inevitably have to rely on unstructured data to lead the race against
competitors, as studies show that the average company makes decisions based on data that is 14 months old. It has
become clear that companies that can make faster decisions will lead their particular market. Strategic adoption of
IT systems is critical, as it has a direct impact on the research, development and sales of drugs (Dave, n.d.).
Enterprises have reached a stable stage in setting up BI infrastructure that can handle internal data extracted from
sources such as ERP and CRM systems. Enterprise data warehouses (EDW) are updated daily with transactional data
coming from different regions. Data from the EDW cascades to region- or domain-specific data marts and operational
data stores (ODS) to meet local reporting needs. Overall, the EDW provides a good canvas for supporting the
transactional and historical reporting needs of MIS, ESS and DSS systems.

A product launch is a major make-or-break event for a pharmaceutical company, which is under pressure to generate
revenue through short-term and long-term strategies so as to fund further R&D activities. A marketing director
cannot afford to rely entirely on transactional data for making sound business decisions. These decisions are made
to increase the visibility and saleability of the new drug in a particular market. As part of the job, the marketing
director expects information on several different aspects of the launch. Table 1.1 provides the details.



No  Information                                   Source     Type          Readily    Remarks
                                                                           available
--  --------------------------------------------  ---------  ------------  ---------  ----------------------------------
1   Sales of drug in each market (split by day,   Internal   Structured    Y          Assumes the internal DSS has data
    region, distributor, etc.)                                                        from all markets at the required
                                                                                      frequency

2   Marketing cost in each market (by media);     Internal   Structured    Y          Assumes the internal DSS is
    this includes free samples                                                        integrated with CRM systems

3   Perception of the drug among doctors, sales   Internal   Unstructured  N          Available only after collation
    personnel, marketing staff, other internal    and                                 from different sources
    staff and the general public                  External

4   Market share of the new drug by value and     External   Structured    N          Available at the end of every
    volume in each market, compared with                                              quarter from market intelligence
    competitor drugs from the same therapeutic                                        firms such as IMS
    area

5   Actuals vs Budget and Actuals vs Forecast     Internal   Structured    Y          Assumes the internal DSS has data
    comparison for each market                                                        from all markets at the required
                                                                                      frequency

6   Details of department-level decisions         Internal   Unstructured  Y and N    'Y' because readily available in
    recorded in documents                                                             a repository; 'N' because not in
                                                                                      an integrated state

                 Table 1.1: Valuable information for pharmaceutical company during drug launch

It is clear that information about some important aspects is unstructured. Examples of unstructured data in an
enterprise are HTML content (e.g. web chat, blogs and web pages), documents (e.g. memos, research papers, minutes of
meetings and articles), forms (e.g. patent applications), emails, SMS content and multimedia content (audio, video,
images) (Ferguson, 2011; McCallum, 2005; SPSS, 2003).

Decision makers in a company have to rely on facts to make sound business decisions, and the availability of
sufficient and timely facts helps in this process. In this case, the Marketing Director should be able to pull the
required data, and the system should also have a mechanism to push specific information. A distinction is made
between data and information because only information should be pushed to users, who will not have time to analyze
plain facts without any context. Typical examples applicable to this case are listed below.

Pull data: Sales & Expenses data, Market share, and Supply chain inventory data.

Push information: Supply chain deficiencies, summarized content from analytics systems on sentiment and opinion
about the new drug gathered from internal and external social media platforms, flash updates on sales, and
libel cases on the new drug from the FDA and other sources.
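
The pull and push modes described above can be contrasted in a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory `store`, the function names, the sample figures and the 20% drop threshold are hypothetical and do not describe any particular BI product.

```python
def pull_sales(store, market):
    """Pull: the user asks for a market and receives the raw daily figures."""
    return store.get(market, [])


def push_alerts(store, drop_threshold=0.2):
    """Push: the system sends a message with context only when a rule fires."""
    alerts = []
    for market, daily_units in store.items():
        # A day-over-day comparison needs two data points and a non-zero base.
        if len(daily_units) < 2 or daily_units[-2] == 0:
            continue
        change = (daily_units[-1] - daily_units[-2]) / daily_units[-2]
        if change <= -drop_threshold:
            alerts.append(
                f"{market}: daily sales fell {abs(change):.0%} versus the previous day"
            )
    return alerts


# Hypothetical daily unit sales per market (illustrative numbers only).
store = {"Singapore": [100, 104, 70], "India": [400, 410, 405]}

raw = pull_sales(store, "India")   # plain data, interpreted by the user
alerts = push_alerts(store)        # information, already put in context
```

The pull call returns plain numbers that the user must interpret, whereas the push call returns a sentence that already carries its context, which mirrors the distinction between data and information made above.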

Push information is mostly unstructured, which underlines the importance of unstructured data. The characteristics of
unstructured data are visibly and intrinsically different from those of transactional data. The differentiating
factors mainly relate to representation, source, context, understandability, timeliness and shelf-life. In general,
the characteristics of unstructured data are:




         - Does not reside in relational database tables.
         - Has no predefined structure or format.
         - Is not arranged in any order.
         - Is difficult to categorize for use in BI.
         - Resides in several documents across multiple sources:
              - Internal (data within the organization)
              - External (data outside the organization)


These characteristics make it difficult for technical personnel to store and catalog unstructured data in an
Enterprise Data Warehouse (EDW), in addition to the inherent difficulty of capturing the required data. The
heterogeneous nature of the sources adds to the complexity. Typical sources of unstructured data include email
archives, call center transcripts, customer feedback databases, enterprise intranets, enterprise content management
systems, file systems, document management systems, social networking sites and RSS newsfeeds (Ferguson, 2011, p. 6).

There are techniques for capturing and utilizing unstructured data. Crawlers can be used to capture relevant
information from the enterprise data ecosystem, social media sites and the WWW. The captured information is then
tagged and indexed for retrieval. The final stage is knowledge discovery, which involves text mining and web mining
(popularly called content analytics) to derive insight for business benefit.
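
The tag, index and mine stages described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production content analytics pipeline: the crawling step is left out, the sample documents are invented, and the `POSITIVE` and `NEGATIVE` word lists are hypothetical stand-ins for a proper lexicon or trained model.

```python
import re
from collections import defaultdict

# Hypothetical word lists, for illustration only.
POSITIVE = {"effective", "relief", "improved", "safe"}
NEGATIVE = {"nausea", "expensive", "rash", "ineffective"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tag(text):
    """Tagging: attach a crude sentiment label based on word counts."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def build_index(docs):
    """Indexing: build an inverted index mapping each token to document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


# Invented sample documents standing in for captured content.
docs = {
    "email-1": "Patients report improved sleep and relief within a week.",
    "forum-7": "The drug is expensive and caused nausea for me.",
    "memo-3": "Distributor meeting scheduled for next quarter.",
}

tags = {doc_id: tag(text) for doc_id, text in docs.items()}
index = build_index(docs)

# Retrieval plus a simple mining step: sentiment of documents mentioning "drug".
drug_sentiment = {doc_id: tags[doc_id] for doc_id in index["drug"]}
```

The inverted index serves the retrieval purpose, while the sentiment labels are a very simple form of the insight that text mining is meant to deliver.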

An ideal BI system should provide the ability to create Enterprise Mashups. Mashups integrate information and
functionality from different sources to create new services. Such applications lend themselves to agile development
projects and therefore suit our scenario, where data from different sources must be viewed together to support
decisions. However, there are a few challenges. Choosing the right information sources amongst unstructured data and
devising content sifting mechanisms are known challenges. Mashups are an emerging trend that is here to stay, as
they provide a one-stop shop for decision makers.
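
As a rough sketch of the mashup idea, the Python snippet below joins a structured sales feed with the output of a content analytics step to give one view per market. The market names, figures, field names and the `build_mashup` function are all assumptions made for illustration; a real enterprise mashup would draw on live services rather than in-memory data.

```python
from collections import Counter

# Hypothetical structured feed (for example from the EDW): units sold per market.
sales = {"Singapore": 1200, "India": 5400, "Australia": 2100}

# Hypothetical content analytics output: (market, sentiment label) per post.
tagged_posts = [
    ("Singapore", "positive"), ("Singapore", "negative"),
    ("India", "negative"), ("India", "negative"), ("India", "positive"),
    ("Australia", "positive"),
]


def build_mashup(sales, tagged_posts):
    """Join sales figures with sentiment counts, keyed on market name."""
    sentiment = {}
    for market, label in tagged_posts:
        sentiment.setdefault(market, Counter())[label] += 1

    view = {}
    for market, units in sales.items():
        counts = sentiment.get(market, Counter())
        total = sum(counts.values())
        view[market] = {
            "units_sold": units,
            "posts": total,
            # None signals that no opinion data was captured for this market.
            "negative_share": counts["negative"] / total if total else None,
        }
    return view


dashboard = build_mashup(sales, tagged_posts)
```

A view of this kind would let the Marketing Director see, for instance, that a market with strong sales also has a high share of negative opinion, which neither source shows on its own.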

Future considerations for handling unstructured data

         - Ensure that user content is accurately tagged.
         - Ensure that content is up-to-date and relevant.
         - Validate content sources.
         - Identify business drivers to get the best solution.
         - Allocate adequate processing power to analytics to address scalability issues.


Figure 1 gives a pictorial representation of the current usage of BI in pharmaceutical companies and the neglected
blue ocean segment of unstructured data BI.




                         Fig 1: Usage of Business Intelligence in a pharmaceutical company

Conclusion

Enterprises are aware of the importance of unstructured data in the current scenario, but they fail to leverage it
due to technical (capturing and storing) and logical (classification and integration) constraints. This situation is
bound to improve with best practices and simpler technical processes. The returns on investment in Content Analytics
and Enterprise Mashups will be realized in the long run.

References

Wittles, G. (n.d.). Unstructured data offers a vast store of untapped BI value. Retrieved from
http://www.themanager.org/strategy/Unstructured_data.htm

Dave, W. (n.d.). Unstructured data in life sciences. Retrieved from
http://blogs.hds.com/storagestat/2011/11/unstructured-data-in-life-sciences.html

Ferguson, M. (2011). Integrating and analyzing unstructured data. Info 360 BI Conference. Washington DC.

McCallum, A. (2005). Information extraction. Retrieved 17 February 2011 from
http://people.cs.umass.edu/~mccallum/papers/acm-queue-ie.pdf

SPSS. (2003). Meeting the challenge for text: Making text ready for predictive analysis. Chicago.

Grimes, S. (n.d.). Nimble intelligence: Enterprise BI mashup best practices. Retrieved from
http://www.jackbe.com/downloads/nimblebi_grimes.pdf

                                                       Page 4 of 4

More Related Content

More from Aravind Sesagiri Raamkumar

A task-based scientific paper recommender system for literature review and ma...
A task-based scientific paper recommender system for literature review and ma...A task-based scientific paper recommender system for literature review and ma...
A task-based scientific paper recommender system for literature review and ma...Aravind Sesagiri Raamkumar
 
Using altmetrics to support research evaluation
Using altmetrics to support research evaluationUsing altmetrics to support research evaluation
Using altmetrics to support research evaluationAravind Sesagiri Raamkumar
 
Evolution and state-of-the art of Altmetric research: Insights from network a...
Evolution and state-of-the art of Altmetric research: Insights from network a...Evolution and state-of-the art of Altmetric research: Insights from network a...
Evolution and state-of-the art of Altmetric research: Insights from network a...Aravind Sesagiri Raamkumar
 
Scientometric Analysis of Research Performance of African Countries in select...
Scientometric Analysis of Research Performance of African Countries in select...Scientometric Analysis of Research Performance of African Countries in select...
Scientometric Analysis of Research Performance of African Countries in select...Aravind Sesagiri Raamkumar
 
New Dialog, New Services with Altmetrics: Lingnan University Library Experience
New Dialog, New Services with Altmetrics: Lingnan University Library ExperienceNew Dialog, New Services with Altmetrics: Lingnan University Library Experience
New Dialog, New Services with Altmetrics: Lingnan University Library ExperienceAravind Sesagiri Raamkumar
 
Field-weighting readership: how does it compare to field-weighting citations?
Field-weighting readership: how does it compare to field-weighting citations?Field-weighting readership: how does it compare to field-weighting citations?
Field-weighting readership: how does it compare to field-weighting citations?Aravind Sesagiri Raamkumar
 
How do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
How do Scholars Evaluate and Promote Research Outputs? An NTU Case StudyHow do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
How do Scholars Evaluate and Promote Research Outputs? An NTU Case StudyAravind Sesagiri Raamkumar
 
Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...Aravind Sesagiri Raamkumar
 
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...A Comparative Investigation on Citation Counts and Altmetrics between Papers ...
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...Aravind Sesagiri Raamkumar
 
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...Aravind Sesagiri Raamkumar
 
Altmetrics for Research Impact Actuation (ARIA)
Altmetrics for Research Impact Actuation (ARIA)Altmetrics for Research Impact Actuation (ARIA)
Altmetrics for Research Impact Actuation (ARIA)Aravind Sesagiri Raamkumar
 
Proposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkProposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkAravind Sesagiri Raamkumar
 
What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...Aravind Sesagiri Raamkumar
 
What’s in a Country Name – Twitter Hashtag Analysis of #singapore
What’s in a Country Name – Twitter Hashtag Analysis of #singaporeWhat’s in a Country Name – Twitter Hashtag Analysis of #singapore
What’s in a Country Name – Twitter Hashtag Analysis of #singaporeAravind Sesagiri Raamkumar
 
More Than Just Black and White: A Case for Grey Literature References in Scie...
More Than Just Black and White: A Case for Grey Literature References in Scie...More Than Just Black and White: A Case for Grey Literature References in Scie...
More Than Just Black and White: A Case for Grey Literature References in Scie...Aravind Sesagiri Raamkumar
 
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...Comparison of Techniques for Measuring Research Coverage of Scientific Papers...
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...Aravind Sesagiri Raamkumar
 
Object Recognition-based Mnemonics Mobile App for Senior Adults Communication
Object Recognition-based Mnemonics Mobile App for Senior Adults CommunicationObject Recognition-based Mnemonics Mobile App for Senior Adults Communication
Object Recognition-based Mnemonics Mobile App for Senior Adults CommunicationAravind Sesagiri Raamkumar
 
Linked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgLinked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgAravind Sesagiri Raamkumar
 
Rec4LRW – Scientific Paper Recommender System for Literature Review and Writing
Rec4LRW – Scientific Paper Recommender System for Literature Review and WritingRec4LRW – Scientific Paper Recommender System for Literature Review and Writing
Rec4LRW – Scientific Paper Recommender System for Literature Review and WritingAravind Sesagiri Raamkumar
 

More from Aravind Sesagiri Raamkumar (20)

A task-based scientific paper recommender system for literature review and ma...
A task-based scientific paper recommender system for literature review and ma...A task-based scientific paper recommender system for literature review and ma...
A task-based scientific paper recommender system for literature review and ma...
 
Using altmetrics to support research evaluation
Using altmetrics to support research evaluationUsing altmetrics to support research evaluation
Using altmetrics to support research evaluation
 
Evolution and state-of-the art of Altmetric research: Insights from network a...
Evolution and state-of-the art of Altmetric research: Insights from network a...Evolution and state-of-the art of Altmetric research: Insights from network a...
Evolution and state-of-the art of Altmetric research: Insights from network a...
 
Feature Analysis of Research Metrics Systems
Feature Analysis of Research Metrics SystemsFeature Analysis of Research Metrics Systems
Feature Analysis of Research Metrics Systems
 
Scientometric Analysis of Research Performance of African Countries in select...
Scientometric Analysis of Research Performance of African Countries in select...Scientometric Analysis of Research Performance of African Countries in select...
Scientometric Analysis of Research Performance of African Countries in select...
 
New Dialog, New Services with Altmetrics: Lingnan University Library Experience
New Dialog, New Services with Altmetrics: Lingnan University Library ExperienceNew Dialog, New Services with Altmetrics: Lingnan University Library Experience
New Dialog, New Services with Altmetrics: Lingnan University Library Experience
 
Field-weighting readership: how does it compare to field-weighting citations?
Field-weighting readership: how does it compare to field-weighting citations?Field-weighting readership: how does it compare to field-weighting citations?
Field-weighting readership: how does it compare to field-weighting citations?
 
How do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
How do Scholars Evaluate and Promote Research Outputs? An NTU Case StudyHow do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
How do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
 
Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...
 
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...A Comparative Investigation on Citation Counts and Altmetrics between Papers ...
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...
 
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...
Database-Centric Guidelines for Building a Scholarly Metrics Information Syst...
 
Altmetrics for Research Impact Actuation (ARIA)
Altmetrics for Research Impact Actuation (ARIA)Altmetrics for Research Impact Actuation (ARIA)
Altmetrics for Research Impact Actuation (ARIA)
 
Proposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkProposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender Framework
 
What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...
 
What’s in a Country Name – Twitter Hashtag Analysis of #singapore
What’s in a Country Name – Twitter Hashtag Analysis of #singaporeWhat’s in a Country Name – Twitter Hashtag Analysis of #singapore
What’s in a Country Name – Twitter Hashtag Analysis of #singapore
 
More Than Just Black and White: A Case for Grey Literature References in Scie...
More Than Just Black and White: A Case for Grey Literature References in Scie...More Than Just Black and White: A Case for Grey Literature References in Scie...
More Than Just Black and White: A Case for Grey Literature References in Scie...
 
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...Comparison of Techniques for Measuring Research Coverage of Scientific Papers...
Comparison of Techniques for Measuring Research Coverage of Scientific Papers...
 
Object Recognition-based Mnemonics Mobile App for Senior Adults Communication
Object Recognition-based Mnemonics Mobile App for Senior Adults CommunicationObject Recognition-based Mnemonics Mobile App for Senior Adults Communication
Object Recognition-based Mnemonics Mobile App for Senior Adults Communication
 
Linked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sgLinked-Data based Data Management for data.gov.sg
Linked-Data based Data Management for data.gov.sg
 
Rec4LRW – Scientific Paper Recommender System for Literature Review and Writing
Rec4LRW – Scientific Paper Recommender System for Literature Review and WritingRec4LRW – Scientific Paper Recommender System for Literature Review and Writing
Rec4LRW – Scientific Paper Recommender System for Literature Review and Writing
 

Recently uploaded

The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosErol GIRAUDY
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfTejal81
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updateadam112203
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechProduct School
 
Planetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile BrochurePlanetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile BrochurePlanetek Italia Srl
 
Graphene Quantum Dots-Based Composites for Biomedical Applications
Graphene Quantum Dots-Based Composites for  Biomedical ApplicationsGraphene Quantum Dots-Based Composites for  Biomedical Applications
Graphene Quantum Dots-Based Composites for Biomedical Applicationsnooralam814309
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveIES VE
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 
3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud DataEric D. Schabell
 
20140402 - Smart house demo kit
20140402 - Smart house demo kit20140402 - Smart house demo kit
20140402 - Smart house demo kitJamie (Taka) Wang
 
IT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingIT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingMAGNIntelligence
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Libraryshyamraj55
 
EMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarEMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarThousandEyes
 
UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2DianaGray10
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameKapil Thakar
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInThousandEyes
 

Recently uploaded (20)

The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenarios
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 update
 
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - TechWebinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
Webinar: The Art of Prioritizing Your Product Roadmap by AWS Sr PM - Tech
 

Unstructured BI in pharmaceutical company

K6221 Business Intelligence Mini Assignment

K6221 Business Intelligence 2011-2012
Mini Assignment
Sesagiri Raamkumar Aravind (G1101761F)
Mane Shivaji Dilip Kumar (G1101841A)

“Enterprises today have access to large amounts of information from internal as well as external sources. The information typically comes in either structured or less structured forms. However, enterprises generally do not make the best use of the information they have access to, tending instead to focus on just internal structured data generated by core transactional systems.”

Statement Elucidation

As per the problem statement, even though enterprises have access to a plethora of relevant information, they make good use only of the data coming from traditional OLTP systems, which is restricted to structured content. Internal and external unstructured data is not leveraged for making business decisions. Wittles (n.d.) asserts that only 20% of an organization’s data is structured and ready for use in BI data analysis. The remaining 80% is unstructured data. Therefore, the significance of unstructured data is highly underestimated in most enterprises.

Scenario

The authors opt to critically discuss the problem statement based on a particular scenario. The scenario is “the Marketing Director of a major pharmaceutical company monitoring the performance of a newly launched potential blockbuster drug in the Asia Pacific region (excluding Japan).”

Discussion

Large enterprises of today rely on enormous and complicated information systems to fuel their growth and help with their daily operations and sustainability. The amount spent on such systems even reaches billions in certain companies. In our scenario, pharmaceutical companies inadvertently rely on unstructured data to lead the race against competitors, as studies show that the average company makes decisions based on data that is 14 months old.
It has become clear that companies that can make faster decisions will lead their particular market. Strategic adoption of IT systems is critical, as it has a direct impact on the research, development and sales of drugs (Dave).

Enterprises have reached a stable stage with respect to the setup of BI infrastructure that can handle internal data extracted from different sources such as ERP and CRM systems. Enterprise data warehouses (EDW) are updated on a daily basis with transactional data coming from different regions. Data from the EDW cascades to region- and domain-specific data marts and ODS so as to meet local reporting needs. In totality, the EDW provides a good canvas for supporting the transactional and historical reporting needs of MIS, ESS and DSS systems.

A product launch is a major make-or-break event for a pharmaceutical company, as it feels the push to generate revenue through short-term and long-term strategies so as to fund further R&D activities. A marketing director cannot afford to rely entirely on transactional data for making sound business decisions. These decisions are made to increase the visibility and saleability of the new drug in a particular market. As a part of the job, the marketing director would expect to get information about different aspects. Table 1.1 provides the details.
Sl. No | Information | Source | Type | Readily Available | Remarks
1 | Sales of drug in each market (split up by day, region, distributor etc.) | Internal | Structured | Y | Assumption that internal DSS has data from all markets at required frequency
2 | Marketing cost in each market (by media); this includes free samples | Internal | Structured | Y | Assumption that internal DSS is integrated with CRM systems
3 | Perception about the drug from doctors, sales personnel, marketing staff, other internal staff and general public | Internal and External | Unstructured | N | Can be obtained only after collation from different sources
4 | Market share of new drug by value and volume in each market, in comparison to competitor drugs from the same therapeutic area | External | Structured | N | Can be obtained at the end of every quarter from market intelligence firms such as IMS
5 | Actuals vs Budget and Actuals vs Forecast comparison by each market | Internal | Structured | Y | Assumption that internal DSS has data from all markets at required frequency
6 | Details about department-level decisions recorded in documents | Internal | Unstructured | Y and N | 'Y' because readily available in repository and 'N' because not in integrated state

Table 1.1: Valuable information for pharmaceutical company during drug launch

It is clear that information about some important aspects is in unstructured format. Examples of unstructured data in an enterprise are HTML content (e.g. web chat, blogs and web pages), documents (e.g. memos, research papers, MoMs and articles), forms (e.g. patent applications), emails, SMS content and multimedia content (audio, video, images) (Ferguson, 2011; McCallum, 2005; SPSS, 2003).

Decision makers in a company have to rely on facts to make sound business decisions. The availability of sufficient and timely facts can help in the process. In this case, the Marketing Director should be able to pull the required data, and the system should have the mechanism to push specific information as well.
A distinction is made between data and information because only information should be pushed to a user, as he/she will not have time to analyze plain facts without any context. Typical examples applicable to this case are listed below.

Pull data: Sales and expenses data, market share, and supply chain inventory data.

Push information: Supply chain deficiencies, summarized content delivery from analytics systems pertaining to sentiment and opinion about the new drug from internal and external social media platforms, flash updates on sales, libel cases on new drug from FDA and other sources.

The push type of information is mostly in unstructured format, which justifies its importance. The characteristics of unstructured data are visibly and intrinsically different from those of transactional data. The differentiating factors are mainly related to representation, source, context, understandability, timeliness and shelf-life. In general, the characteristics of unstructured data are:
- Does not reside in relational database tables.
- Has no predefined structure or format.
- Not arranged in any order.
- Difficult to categorize for use in BI.
- Resides in several documents over multiple sources:
  - Internal (data within an organization)
  - External (data outside the organization)

These characteristics make it difficult for technical personnel to store and catalog unstructured data in an Enterprise Data Warehouse (EDW), apart from the inherent difficulty in capturing the required data. The heterogeneous nature of the sources adds to the complexity. Typical sources for unstructured data include email archives, call center transcripts, customer feedback databases, enterprise intranets, enterprise content management systems, file systems, document management systems, social networking sites and RSS newsfeeds (Ferguson 2011:6).

There are techniques for capturing and utilizing unstructured data. Crawlers can be used for capturing relevant information from the enterprise data ecosystem, social media sites and the WWW. The captured information is then tagged and indexed for retrieval purposes. The final stage is the knowledge discovery stage, which involves text mining and web mining (popularly called content analytics) to derive insight for business benefits.

An ideal BI system should provide the ability to create Enterprise Mashups. Mashups are used to integrate information and functionality from different sources to create new services. These kinds of applications are well suited to agile development projects and therefore to our scenario, where data from different sources must be viewed together to help in making decisions. However, there are a few challenges. Choosing the right information sources amongst unstructured data and devising content sifting mechanisms are some known challenges. Mashups are an emerging trend that is here to stay, as they provide a one-stop shop for decision makers.
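The capture, tag and index stages described in this section can be sketched as follows. This is a minimal illustration in Python and not the authors' implementation: the tag vocabulary `TAG_KEYWORDS`, the document identifiers and the sample texts are all hypothetical, and a real content analytics system would use proper text mining rather than keyword matching.

```python
import re
from collections import defaultdict

# Hypothetical tag vocabulary: each tag is triggered by any of its keywords.
TAG_KEYWORDS = {
    "side_effect": {"nausea", "headache", "rash"},
    "supply": {"stock", "shortage", "inventory"},
    "pricing": {"price", "cost", "expensive"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tag_document(text):
    """Return the set of tags whose keywords occur in the text."""
    tokens = set(tokenize(text))
    return {tag for tag, keywords in TAG_KEYWORDS.items() if tokens & keywords}


def build_index(documents):
    """Map each tag to the set of document ids carrying that tag."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for tag in tag_document(text):
            index[tag].add(doc_id)
    return index


# Hypothetical captured content standing in for crawler output.
documents = {
    "email-001": "Distributor reports a stock shortage in the Singapore market.",
    "forum-014": "Patients mention nausea and headache after the first dose.",
    "call-207": "Caller says the drug is too expensive compared to alternatives.",
}

index = build_index(documents)
for tag in sorted(index):
    print(tag, sorted(index[tag]))
```

Once content is indexed by tag, a retrieval request such as "all documents about side effects" becomes a dictionary lookup, and the tagged subset can then be passed on to the knowledge discovery stage.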

Future considerations for handling unstructured data

- Ensure that user content is accurately tagged.
- Ensure that content is up-to-date and relevant.
- Validate content sources.
- Identify business drivers to get the best solution.
- Allocate adequate processing power to analytics to address scalability issues.

Figure 1 gives a pictorial representation of the current usage of BI in pharmaceutical companies and the neglected blue ocean segment of unstructured data BI.

Fig 1: Usage of Business Intelligence in a pharmaceutical company

Conclusion

Enterprises are aware of the importance of unstructured data in the current day scenario, but they fail to leverage it due to technical (capturing and storing) and logical (classification and integration) constraints. This situation is bound to improve with best practices and simpler technical processes. Investment in Content Analytics and Enterprise Mashups will definitely pay off in the long run.

References

Wittles, G. (n.d.). Unstructured data offers a vast store of untapped BI value. Retrieved from http://www.themanager.org/strategy/Unstructured_data.htm (Wittles)

Dave, W. (n.d.). Unstructured data in life sciences. Retrieved from http://blogs.hds.com/storagestat/2011/11/unstructured-data-in-life-sciences.html (Dave)

Ferguson, M. (n.d.). Integrating and analyzing unstructured data. Info 360 BI Conference. Washington DC. (Ferguson, 2011)

McCallum, A. (2005). Information extraction. Retrieved 17 February 2011 from http://people.cs.umass.edu/~mccallum/papers/acm-queue-ie.pdf (McCallum, 2005)

SPSS. (2003). Meeting the challenge for text: Making text ready for predictive analysis. Chicago. (SPSS, 2003)

Grimes, S. (n.d.). Nimble intelligence: Enterprise BI mashup best practices. Retrieved from http://www.jackbe.com/downloads/nimblebi_grimes.pdf (Grimes)