The document discusses the use of secondary data in marketing analytics. It provides an overview of secondary data, including its purpose and advantages. Secondary data can help define problems, identify relevant variables, and test hypotheses. However, secondary data also has limitations, such as potential lack of relevance, accuracy, or currency for the current research problem. When using secondary data, researchers must evaluate it based on criteria like timeliness, sample characteristics, methodology, and definitions used.
The document discusses big data and predictive analytics. It defines big data as large volumes of diverse data that require new techniques and technologies to analyze. Predictive analytics uses statistical modeling of historical data to predict future outcomes. The document provides examples of how predictive models are used in weather forecasting, customer service, and marketing. It also distinguishes predictive analytics from machine learning and discusses common predictive modeling techniques like decision trees, neural networks, and regression.
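To make the modeling idea concrete, here is a minimal Python sketch (not the document's own example) of the predictive-modeling workflow: fit a decision tree to historical records, then score held-out cases. The churn features, thresholds, and data are invented for illustration.

```python
# A minimal sketch of predictive modeling: learn from "historical" records,
# then predict outcomes for unseen cases. All data here is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 1000
# Hypothetical customer features: monthly spend and support calls.
X = np.column_stack([rng.normal(50, 15, n), rng.poisson(2, n)])
# Hypothetical outcome: low spenders with many calls tend to churn.
y = ((X[:, 0] < 45) & (X[:, 1] > 2)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)                           # learn from history
print(accuracy_score(y_test, model.predict(X_test)))  # score unseen cases
```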
1. The document discusses various advanced data analytics techniques including data mining, online analytical processing (OLAP), pivot tables, Power Pivot, and Power View in Excel, and different types of data mining techniques like classification, clustering, regression, association rules, outlier detection, sequential patterns, and prediction.
2. It provides details on each technique including definitions, applications, and examples.
3. The key data analytics techniques covered are data mining, OLAP, pivot tables, Power Pivot and Power View in Excel, and various classification methods for advanced data analysis; a short pivot-table sketch follows this list.
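As a concrete illustration of the pivot-table technique named in the list above, this pandas sketch builds the same kind of slice-and-aggregate view that Excel pivot tables and OLAP cubes provide; the sales table is invented.

```python
# A minimal pivot-table aggregation with pandas on invented sales data.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "East", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q1"],
    "revenue": [100, 120, 90, 110, 95, 105],
})

# Rows = region, columns = quarter, cells = summed revenue.
pivot = sales.pivot_table(index="region", columns="quarter",
                          values="revenue", aggfunc="sum")
print(pivot)
```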
This document compares several open source tools that can be used for data science. It provides background on key concepts in data science like data mining, machine learning, predictive analytics, and business intelligence. It also discusses techniques commonly used by data scientists, such as clustering, classification, and regression. The document then reviews popular open source data science tools like Orange, RapidMiner, KNIME, Weka, and R, and compares their key features based on techniques covered in the EMC Data Science Associate certification. It finds that these tools provide capabilities for common data science techniques at no cost, making them suitable alternatives to expensive proprietary software, especially for small organizations.
Data Mining of Project Management Data: An Analysis of Applied Research Studies - Gurdal Ertek
Data collected and generated during and after projects, such as data residing in project management software and post-project review documents, can be a major source of actionable insights and competitive advantage. This paper presents a rigorous methodological analysis of the applied research published in the academic literature on the application of data mining (DM) to project management (PM). The objective of the paper is to provide a comprehensive analysis and discussion of where and how data mining is applied to project management data, and to offer practical insights for future research in the field.
https://dl.acm.org/citation.cfm?id=3176714
https://ertekprojects.com/ftp/papers/2017/ertek_et_al_2017_Data_Mining_of_Project_Management_Data.pdf
A SLR of Business Intelligence Technology 2019 - kamilHussain15
This document presents a systematic literature review of business intelligence technology, contributions, and applications for higher education. The methodology used PRISMA to identify relevant research; 12 articles from databases were included based on screening criteria and analyzed to answer the research questions. For technology (Q1), the business intelligence techniques identified were data mining, the viable system model, learning analytics, cloud computing, and behavioral analytics; tools included Hadoop, Gephi, IBM BigData, and web-based tools. Contributions (Q2) were knowledge transfer, innovation, and evaluation. Applications (Q3) included research, curriculum, assessment, behavior analysis, student enrollment, and resource management.
This document summarizes a systematic literature review of 62 articles on business intelligence and analytics (BI&A) in small and medium enterprises (SMEs). The review identified several research topics addressed in the literature, including BI&A components, solutions, mobile and cloud BI&A, applications, adoption, implementation, and benefits. However, the review also found few studies focused specifically on BI&A in SMEs. The review synthesized the literature to understand the current state of research and identify gaps to inform future work on advancing BI&A in SMEs.
This chapter discusses secondary data sources that can be used in marketing research. It describes the relationship of secondary data to other parts of the marketing research process. The chapter also covers the advantages and disadvantages of secondary data, criteria for evaluating secondary data sources, and various types of secondary data sources including internal sources, published sources, databases, and sources for international marketing research.
This document presents a systematic literature review of 38 empirical studies on factors relating to successful business intelligence (BI) system implementation. The review identified 10 key factors that frequently influenced implementation success based on their frequency of occurrence in the literature. These factors included management support, data source systems, organizational resources, IT infrastructure, vision, champion, team skills, project manager, user participation, and change management. The study aims to help researchers better identify relevant studies for literature reviews on factors impacting BI system implementation.
Selection of Articles using Data Analytics for Behavioral Dissertation Resear... - PhD Assistance
Outcomes in health-related domains, including psychological, educational, behavioral, environmental, and social outcomes, are intended to sustain positive change through digital interventions. These interventions may be delivered on any digital device, such as a phone or computer, and made profitable for the provider. Testing a digital intervention can yield complex, large-scale datasets of usage data. Properly analyzed, this data provides invaluable detail about how users interact with these interventions and informs understanding of engagement. This paper recommends an innovative framework for the process of analyzing usage associated with a digital intervention.
PhD Assistance is an academic dissertation writing and consulting support company established in 2001, specializing in PhD assignments, PhD dissertation writing help, statistical analyses, and programming services for students in the USA, UK, Canada, UAE, Australia, New Zealand, Singapore, and many more.
Website Visit: https://bit.ly/3dANXUD
Contact Us:
UK NO: +44-1143520021
India No: +91-8754446690
Email: info@phdassistance.com
This document discusses data architecture and management for data analytics. It begins by defining data architecture and explaining that it is composed of models, policies, and standards that govern how data is collected, stored, integrated, and used. Various factors influence data architecture design, including enterprise requirements, technology drivers, economics, business policies, and data processing needs. The document then outlines three levels of data architecture specification - the logical level, physical level, and implementation level. It also discusses primary and secondary sources of data, with primary sources including observation, surveys, and experiments, and secondary sources including internal sources like sales reports and accounting data as well as external sources.
This document discusses research methods for collecting primary and secondary data. It outlines different types of primary data collection methods like surveys, observation, interviews and focus groups. It also distinguishes between internal and external secondary data sources and discusses how to evaluate data sources based on criteria like accuracy, quality and timeliness.
1. Data science involves applying scientific methods and processes to extract knowledge and insights from data. It includes techniques like machine learning, statistical analysis, and data visualization.
2. Data science has many applications in fields like marketing, healthcare, banking, and government. It helps with tasks like demand forecasting, fraud detection, personalized recommendations, and policymaking.
3. The key characteristics of data science include business understanding, intuition, curiosity, and skills in areas like machine learning algorithms, statistics, programming, and communication. Data scientists help organizations make better decisions using data-driven insights.
International Refereed Journal of Engineering and Science (IRJES) - irjes
Scope: ad hoc & sensor networks, adaptive applications, aeronautical engineering, aerospace engineering, agricultural engineering, AI and image recognition, allied engineering materials, applied mechanics, architecture & planning, artificial intelligence, audio engineering, automation and mobile robots, automotive engineering…
In this advanced business analysis training session, you will learn Data Analytics and Business Intelligence. Topics covered in this session are:
• What is Business Intelligence?
• Data / information / knowledge
• What is Data Analytics?
• What is Business Analytics?
• What is Big Data?
• Types of Data
• Types of Analytics
For more information, click here: https://www.mindsmapped.com/courses/business-analysis/advanced-business-analyst-training/
The document outlines the 6 steps of the marketing research process: 1) defining the problem, 2) developing an approach, 3) designing the research, 4) collecting data, 5) analyzing and preparing the data, and 6) presenting the report. It then provides more details on each step, including secondary data analysis, qualitative research techniques, and research design considerations like sampling and measurement.
THE EFFECTIVENESS OF DATA MINING TECHNIQUES IN BANKING - csijjournal
The aim of this study is to identify the extent of data mining activities practiced by banks. Data mining is the ability to link structured and unstructured information with the changing rules by which people apply it; it is not a technology but a solution that applies information technologies. Currently, several industries, including banking, finance, retail, insurance, publicity, database marketing, and sales prediction, use data mining tools for customer analysis. Leading banks use data mining for customer segmentation and profitability, credit scoring and approval, predicting payment lapses, marketing, and detecting illegal transactions. Banks are realizing that they can gain competitive advantage by deploying data mining. This article assesses the effectiveness of data mining techniques in banking, discusses the standard tasks involved in data mining, and evaluates various data mining applications in different sectors.
The document provides an overview of business analytics (BA) including its history, types, examples, challenges, and relationship to data mining. BA involves exploring past business performance data to gain insights and guide planning. It can focus on specific business segments. Types of BA include descriptive analytics like reporting, affinity grouping, and clustering, as well as predictive analytics. Challenges to BA include acquiring high quality data and rapidly processing large volumes of data. Data mining is an important task within BA that helps handle large datasets and specific problems.
A Review of Data Security Primitives in Data Mining (IJAEMS, Sept 2015) - INFOGAIN PUBLICATION
This document summarizes a review of 30 research papers on data security primitives in data mining. The review identified 9 key issues: spatial data handling, gaps between hidden patterns and business tools, decision making in heterogeneous databases, resource mining, visually interactive data mining, data cluster mining, load balancing and data fittability, privacy preservation, and mining complex patterns. For each issue, the document discusses solution approaches from the papers and identifies the best and worst approaches. Common findings are presented across the issues. The document concludes there is scope for future work integrating optimization techniques with neural networks for improved data mining and increasing system flexibility.
The document discusses tools and techniques for big data analytics, including A/B testing, crowdsourcing, machine learning, and data mining. It provides an overview of the big data analysis pipeline, including data acquisition, information extraction, integration and representation, query processing and analysis, and interpretation. The document also discusses fields where big data is relevant like industry, healthcare, and research. It analyzes tools like A/B testing, machine learning, and data mining techniques in more detail.
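Of the techniques listed above, A/B testing is easy to make concrete: compare two variants' conversion rates and test whether the difference is statistically significant. The visitor and conversion counts below are invented, and the two-proportion z-test from statsmodels is one common way to run the comparison.

```python
# A minimal A/B-test sketch: did variant B convert better than variant A?
from statsmodels.stats.proportion import proportions_ztest

conversions = [120, 150]    # variant A, variant B
visitors = [2400, 2500]

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests the difference is unlikely to be chance alone.
```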
The document discusses setting up a database to track author contracts for a small publishing company. It identifies seven key fields that would be needed for the contract database, including author name, book title, and payment details. It also notes that some of these fields, like author name and book title, could be used in other existing databases at the company, such as an author database or a book title database.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
The document discusses how companies that are leading in analytics use data and analytics to gain competitive advantages and innovate. It profiles "Analytical Innovators" - companies that rely on analytics to compete and innovate. These companies share a belief that data is a core asset, make effective use of more data for faster results, and have senior management support for data-driven decision making. The document provides examples of companies in different industries that are successfully using analytics and a framework for other companies to also become more analytical.
KM tools can be categorized into the following types:
Groupware systems & KM 2.0, intranets & extranets, data warehousing/data mining/OLAP, decision support systems, content management systems, and document management systems. These tools help with knowledge discovery, organization, sharing, and decision making by providing functions like communication, collaboration, data analysis, content/document publishing and retrieval. Selecting the right KM tools is an important step in implementing a successful KM strategy.
Study of Data Mining Methods and Its Applications - IRJET Journal
This document discusses data mining methods and their applications. It begins by defining data mining as the process of extracting useful patterns from large amounts of data. The document then outlines the typical steps in the knowledge discovery process, including data selection, preprocessing, transformation, mining, and evaluation. It classifies data mining techniques into predictive and descriptive methods. Specific techniques discussed include classification, clustering, prediction, and association rule mining. Finally, the document discusses applications of data mining in fields like healthcare, biology, retail, and banking.
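As a concrete example of the descriptive clustering technique discussed above, here is a minimal k-means sketch with scikit-learn; the customer-segmentation framing and the two synthetic groups are illustrative assumptions.

```python
# A minimal k-means clustering sketch on invented customer data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical features: annual spend and visit frequency, with two
# loose groups baked in for illustration.
customers = np.vstack([
    rng.normal([200, 5], [30, 1], size=(50, 2)),
    rng.normal([800, 20], [60, 3], size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.cluster_centers_)  # one centroid per discovered segment
```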
This document discusses data mining applications in the telecommunications industry. It begins with an overview of the data mining process and definitions. It then describes the types of data generated by telecommunications companies, including call detail data, network data, and customer data. The document outlines several common data mining applications for telecommunications companies, including fraud detection, marketing/customer profiling, and network fault isolation. Specific examples within marketing like customer churn and insolvency prediction are also mentioned.
An effective pre-processing algorithm for information retrieval systems - ijdms
The Internet is probably the most successful distributed computing system ever. However, our capabilities for querying and manipulating data on the Internet remain primordial at best, while user expectations have risen over time along with the volume of operational data accumulated over the past few decades. Users expect deeper, more exact, and more detailed results. Result retrieval for a user query is always relative to how the data is stored and indexed. In information retrieval systems, tokenization is an integral part whose prime objective is to identify tokens and their counts. This paper proposes an effective tokenization approach based on a training vector, and the results show the efficiency and effectiveness of the proposed algorithm. Tokenization helps satisfy a user's information need more precisely and sharply reduces the search space. Pre-processing of input documents is an integral part of tokenization: documents are pre-processed to generate their respective tokens, on the basis of which probabilistic IR generates its scoring over a reduced search space. The comparative analysis is based on two parameters: the number of tokens generated and the pre-processing time.
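To make the pre-processing step concrete, here is a generic tokenization sketch in Python: lowercase the document, split it into tokens, drop stop words, and count frequencies. It illustrates the general idea only; the paper's training-vector-based algorithm is not reproduced, and the stop-word list is an arbitrary assumption.

```python
# A generic tokenization/pre-processing sketch for an IR pipeline.
import re
from collections import Counter

STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in"}

def tokenize(document: str) -> Counter:
    tokens = re.findall(r"[a-z0-9]+", document.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

doc = "The Internet is probably the most successful distributed computing system ever."
print(tokenize(doc))  # token -> count, the input to downstream IR scoring
```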
This document discusses data analytics and related concepts. It defines data and information, explaining that data becomes information when it is organized and analyzed to be useful. It then discusses how data is everywhere and the value of data analysis skills. The rest of the document outlines the methodology of data analytics, including data collection, management, cleaning, exploratory analysis, modeling, mining, and visualization. It provides examples of how data analytics is used in healthcare and travel to optimize processes and customer experiences.
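A tiny pandas sketch of the exploratory-analysis step in that methodology: summarize distributions before any modeling. The healthcare-style table is invented for illustration.

```python
# First-look exploratory analysis: overall and per-group summaries.
import pandas as pd

visits = pd.DataFrame({
    "department": ["cardiology", "oncology", "cardiology", "oncology"],
    "wait_minutes": [32, 48, 27, 55],
})

print(visits.describe())                                    # overall spread
print(visits.groupby("department")["wait_minutes"].mean())  # per-group view
```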
The document discusses research design and the use of secondary data in research. It defines research design as a framework that details procedures for collecting, measuring, and analyzing information to structure business research problems. A good research design typically includes selecting a design type, identifying needed information, specifying measurement and scaling, and determining data collection and analysis methods. The document also discusses exploratory research design and the advantages and disadvantages of using secondary data sources.
According to the World Supply Chain Finance Report, the global Supply Chain Finance market size was valued at USD 7298.46 million in 2022 and is expected to expand at a CAGR (Compound Annual Growth Rate) of 9.08% during the forecast period, reaching USD 12290.99 million by 2028.
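The quoted projection can be sanity-checked with compound-growth arithmetic: 7298.46 × 1.0908^6 ≈ 12294, which matches the reported 2028 figure to within rounding of the growth rate.

```python
# Sanity check of the CAGR arithmetic quoted above: USD 7298.46 million
# grown at 9.08% per year over the six years from 2022 to 2028.
start, rate, years = 7298.46, 0.0908, 6
projected = start * (1 + rate) ** years
print(f"{projected:.2f}")  # ~12294, within rounding of the reported 12290.99
```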
This document discusses secondary data - data originally collected by someone other than the user. It defines secondary data and lists common sources like censuses and government/organizational records. The purposes of secondary data are extracting relevant information, fact finding, model building, and data mining. Criteria for evaluating secondary data include specifications, error, currency, objectives, nature, and dependability. Secondary data is advantageous as it is economical, time saving, and helps focus primary data collection. However, disadvantages are that secondary data may not fit the research factors and accuracy is unknown. Secondary data can be used to identify problems, better define problems, develop research approaches, formulate research designs, and help interpret primary data.
Data Analytics in Industry Verticals, Data Analytics Lifecycle, Challenges of... - Sahilakhurana
Banking and securities challenges:
• Early warning for securities fraud and trade visibility
• Card fraud detection and audit trails
• Enterprise credit risk reporting
• Customer data transformation and analytics
The Securities and Exchange Commission (SEC) is using big data to monitor financial market activity through network analytics and natural language processing, which helps catch illegal trading activity in the financial markets.
The Data Analytics Lifecycle is designed specifically for Big Data problems and data science projects. The lifecycle has six phases, and project work can occur in several phases at once. For most phases in the lifecycle, the movement can be either forward or backward. This iterative depiction of the lifecycle is intended to more closely portray a real project, in which aspects of the project move forward and may return to earlier stages as new information is uncovered and team members learn more about various stages of the project. This enables participants to move iteratively through the process and drive toward operationalizing the project work.
Phase 1—Discovery: In Phase 1, the team learns the business domain, including relevant history such as whether the organization or business unit has attempted similar projects in the past from which they can learn. The team assesses the resources available to support the project in terms of people, technology, time, and data. Important activities in this phase include framing the business problem as an analytics challenge that can be addressed in subsequent phases and formulating initial hypotheses (IHs) to test and begin learning the data.
Phase 2—Data preparation: Phase 2 requires the presence of an analytic sandbox, in which the team can work with data and perform analytics for the duration of the project. The team needs to execute extract, load, and transform (ELT) or extract, transform and load (ETL) to get data into the sandbox. The ELT and ETL are sometimes abbreviated as ETLT. Data should be transformed in the ETLT process so the team can work with it and analyze it. In this phase, the team also needs to familiarize itself with the data thoroughly and take steps to condition the data.
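As a concrete sketch of the Phase 2 ETL step, the pandas snippet below extracts raw records, conditions them, and lands them in a sandbox location; the file names, columns, and cleaning rules are illustrative assumptions, not prescriptions from the document.

```python
# A minimal extract-transform-load sketch into an analytic sandbox.
import pandas as pd

# Extract: read raw source data (a hypothetical CSV export).
raw = pd.read_csv("raw_transactions.csv")

# Transform: condition the data so the team can analyze it.
clean = (
    raw.dropna(subset=["customer_id", "amount"])    # drop incomplete rows
       .assign(amount=lambda d: d["amount"].abs())  # normalize sign errors
       .drop_duplicates()
)

# Load: land the conditioned data in the sandbox for the project team.
clean.to_parquet("sandbox/transactions_conditioned.parquet")
```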
Age Friendly Economy - Improving your business with external data - AgeFriendlyEconomy
The objective of this module is to give you an overview of how you can use data available outside your company to improve your business.
Upon completion of this module you will:
- Learn the basics of external data and where to find it
- Be able to recognize there is a lot of Open Data already out there for you to use – especially about Older People
- See the benefits of using the external data in order to improve your business
Discussion forum data, sourced from sites like Reddit and other social media platforms, as well as other sources of textual information, provides tremendous opportunity for insight and innovation. This presentation focuses on how analysis of unstructured data can be used to innovate in life and health science organizations.
The proposal begins with an information paper, covering the importance of data management. This includes important concepts related to data quality (validity). It’s followed by a strategic plan to create a Data Management Section/Directorate.
Data can be quantitative or qualitative values of variables, such as numbers, images, words, or ideas. There are two types of data: primary and secondary. Primary data is originally collected for the specific purpose, while secondary data was previously collected for another purpose. The purpose of data collection is to obtain information to make decisions, keep records, and share information with others. Some advantages of secondary data are that it already exists so it takes less time and effort, and can cover a wider range if the source is reliable. However, the accuracy may be questionable as the researcher did not control its selection, quality or collection methods.
This document provides an introduction to data science concepts. It discusses the components of data science including statistics, visualization, data engineering, advanced computing, and machine learning. It also covers the advantages and disadvantages of data science, as well as common applications. Finally, it outlines the phases of the data science process: framing the problem, collecting and processing data, exploring and analyzing data, communicating results, and measuring effectiveness.
This document discusses the importance of big data and analytics for businesses and SMEs (UKMs). It defines key terms like business intelligence, data mining, business analytics, and data warehousing. It explains that businesses need tools to turn large, complex data sets into meaningful information to support decision making in uncertain economic conditions. The document also lists sources of big data and characteristics of good data for decision making. It provides examples of business intelligence dashboards and discusses common reasons why BI projects fail. Finally, it outlines some big data tools that small businesses can use.
1. The document discusses primary and secondary data, methods of data collection, and data classification.
2. Primary data is original data collected directly by researchers, while secondary data was previously collected by others. Primary data collection methods include surveys, interviews, and experiments, while secondary data comes from existing sources like publications and records.
3. Data classification organizes data into categories to aid usage and protection, with the goals of making data easily retrievable, meeting compliance requirements, and increasing awareness of data sensitivity.
Introduction to Data Science - compiled by huwekineheshete
This document provides an overview of data science and its key components. It discusses that data science uses scientific methods and algorithms to extract knowledge from structured, semi-structured, and unstructured data sources. It also notes that data science involves organizing data, packaging it through visualization and statistics, and delivering insights. The document further outlines the data science lifecycle and workflow, covering understanding the problem, exploring and preprocessing data, developing models, and evaluating results.
This document discusses secondary and primary data collection methods in marketing research. It defines secondary data as data collected previously for other purposes, while primary data is collected specifically for the research project. Some key advantages of secondary data are that it is more economical and saves time compared to primary data collection. However, secondary data may not always fit the research needs or be up-to-date. The document outlines various sources of secondary data, both internal and external to an organization. It also discusses requirements for using secondary data and limitations of the observational primary data collection method.
Data science involves extracting meaningful insights from raw data through scientific methods and algorithms. It is an interdisciplinary field that focuses on analyzing large datasets using skills from computer science, mathematics, and statistics. Python is a commonly used programming language for data science due to its powerful libraries for tasks like data analysis, machine learning, and visualization. Key Python libraries include NumPy, Pandas, Matplotlib, Scikit-learn, and SciPy. The document then discusses tools, applications, and basic concepts in data science and Python.
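To illustrate two of the libraries named above working together, here is a minimal sketch that fits a linear trend with NumPy and plots it with Matplotlib; the data is synthetic.

```python
# Fit and visualize a simple linear trend on synthetic data.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
y = 2.0 * x + np.random.default_rng(1).normal(0, 1.5, size=x.size)

slope, intercept = np.polyfit(x, y, 1)  # least-squares linear fit

plt.scatter(x, y, s=10, label="observations")
plt.plot(x, slope * x + intercept, color="red",
         label=f"fit: y = {slope:.2f}x + {intercept:.2f}")
plt.legend()
plt.show()
```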
This document discusses the relationship between data and information. It defines information as meaningfully interpreted data. An information system is defined as a system that gathers data and disseminates information to provide information to users. The main differences between data and information are discussed, with information being interpreted, organized data. Several techniques for collecting data and characteristics of useful information are outlined. The types of information needed at different management levels and sources of information are also summarized.
DATA VISUALIZATION FOR MANAGERS MODULE 1 | Creating Visual Analysis with Interactive Data Visualization Software Desktop | BUSINESS ANALYTICS PAPER 1 | MBA SEM 3 | RTMNU NAGPUR UNIVERSITY | BY JAYANTI R PANDE
MBA Notes by Jayanti Pande
#JayantiPande
#MBA
#MBAnotes
#BusinessAnalyticsNotes
This document discusses business analytics and data analytics capabilities. It covers key concepts like data warehouses, data marts, ETL processes, business intelligence, data mining techniques, and how organizations can use analytics to gain insights from data to support decision making and gain a competitive advantage. The document provides examples of how companies like IHG and retailers use analytics to improve operations and customer understanding.
Global Situational Awareness of A.I. and Where It's Headed - vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Natural Language Processing (NLP), RAG and its applications
In the realm of Natural Language Processing (NLP), knowledge-intensive tasks such as question answering, fact verification, and open-domain dialogue generation require the integration of vast and up-to-date information. Traditional neural models, though powerful, struggle to encode all necessary knowledge within their parameters, leading to limitations in generalization and scalability. The paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" introduces RAG (Retrieval-Augmented Generation), a framework that combines retrieval mechanisms with generative models, enhancing performance by dynamically incorporating external knowledge during inference.
USE OF SECONDARY DATA IN
MARKETING ANALYTICS
A MARKETING ANALYTICS PROJECT
Prepared By,
Debasis Mohanty (UEBA19001)
XIMB – EMBA BUSINESS ANALYTICS
Table of Contents
• Introduction
• Overview of Secondary Data
• Purpose of Secondary Data
• Primary Versus Secondary Data
• Classification of Secondary Data
• Advantages and Uses of Secondary Data
• Limitations and Disadvantages of Secondary Data
• Criteria for Evaluating Secondary Data
• Usage of Secondary Data Sources in Marketing
• Secondary Data Sources that are Public
• Secondary Data Sources that are Purchased
• How Secondary Data is used
• Who all uses Secondary Data
• Real-Life Application
• Example
• Exploratory Data Analysis
• Cluster Analysis
• Discriminant Analysis
• My Experience
• References
Introduction:
Marketing analytics is the practice of measuring, managing, and analyzing marketing performance to maximize its effectiveness and optimize return on investment. It is a critical part of marketing decision making: it helps improve performance by identifying which variables matter most. Every decision requires relevant information, and strategies can be developed based on data gathered through marketing analytics. Too often, marketing analytics is considered narrowly as the gathering and analyzing of data for someone else to use. However, firms can achieve and sustain a competitive advantage through the creative use of market information generated by marketing analytics. Marketing analytics is an integral part of marketing research and a bridge to decision making. Hence, marketing analytics is defined as an information input to decisions, not simply the evaluation of decisions that have already been made.
Modern marketing analytics is mostly web-based and done over the internet. It was estimated that by 2023 there would be over 650 million internet users in India. Despite this large base, the internet penetration rate in the country stood at around 50 percent in 2020, meaning that roughly half of the 1.37 billion Indians had access to the internet that year. The Marketing Research textbook by Naresh K. Malhotra discusses the Internet as a source of marketing research information. Analysis of secondary data helps define the marketing research problem and develop an approach. Before the research design for collecting primary data is formulated, the researcher should analyze the relevant secondary data. In some projects, particularly those with limited budgets, research may be largely confined to the analysis of secondary data, since some routine problems can be addressed based on secondary data alone.
This project report covers the advantages and disadvantages of secondary data, along with its classification and uses.
Overview of Secondary Data:
Secondary data are data collected by people unrelated to the current research study, for some other purpose of their own. When a researcher uses such data, they become secondary data for the current user. A variety of secondary information sources is available to the researcher gathering data on an industry, potential product applications, and the marketplace. Secondary data is also used to gain initial insight into the research problem. Secondary data is classified in terms of its source as either internal or external. Internal, or in-house, data is secondary information acquired within the organization where the research is being carried out. External secondary data is obtained from outside sources.
Purpose of Secondary Data:
• Model Building: specifying relationships between two or more variables
• Fact Finding: descriptive information to support the research
• Extracting relevant information from other sources and previous studies
• Data Mining: exploring data by computer, i.e., using computer technology to go through volumes of data to discover trends about an organization's sales, customers, and products
• Identifying the relevant sources, to avoid plagiarism
Primary Versus Secondary Data:
Primary data are originated by a researcher for the specific purpose of addressing the problem
at hand. Obtaining primary data can be expensive and time consuming. The collection of
primary data involves all six steps of the marketing research process. They are as follows:
• Defining the Problem
• Developing an Approach to the Problem
• Formulating a Research Design
• Doing Field Work or Data Collection
• Preparing and Analyzing the data
• Preparing and Presenting the Report
Secondary Data are data that have already been collected for purposes other than the problem
at hand. These data can be located quickly and inexpensively. The differences between
primary and secondary data are summarized in the below table.
A Comparison of Primary and Secondary Data

                     Primary Data              Secondary Data
Collection purpose   For the problem at hand   For other problems
Collection process   Very involved             Rapid and easy
Collection cost      High                      Relatively low
Collection time      Long                      Short
Classification of Secondary Data:
Secondary data may be classified as either internal or external. Internal data are those generated within the organization for which the research is being conducted. This information may be available in a ready-to-use format, such as information routinely supplied by the management decision support system. On the other hand, these data may exist within the organization but may require considerable processing before they are useful to the researcher. For example, a variety of information can be found on sales invoices. Yet this information may not be easily accessible; further processing may be required to extract it.
External data are those generated by sources outside the organization. These data may exist
in the form of published material, computerized databases, or information made available by
syndicated services. Before collecting external secondary data, it is useful to analyze internal
secondary data.
5. 3 | P a g e
Advantages and Uses of Secondary Data:
Secondary data offer several advantages over primary data. Secondary data are easily accessible, relatively inexpensive, and quickly obtained. Although it is rare for secondary data to provide all the answers to a nonroutine research problem, such data can be useful in a variety of ways, as follows:
• Identify the problem
• Better define the problem
• Develop an approach to the problem
• Formulate an appropriate research design (for example, by identifying the key
variables)
• Answer certain research questions and test some hypotheses
• Interpret primary data more insightfully
Given these advantages and uses of secondary data, we state the following general rule:
Examination of available secondary data is a prerequisite to the collection of primary data.
Start with secondary data. Proceed to primary data only when the secondary data sources
have been exhausted or yield marginal returns.
Secondary data give the researcher a frame of reference for the direction in which a specific piece of research should go. Marketing researchers should take advantage of the high-quality secondary data that are available and consider the potential value of secondary data analysis in gaining knowledge and insight into a broad range of marketing issues.
Limitations and Disadvantages of Secondary Data:
As secondary data have been collected for purposes other than the problem at hand, their usefulness to the current problem may be limited in several important ways, including relevance and accuracy. The disadvantages of secondary data include the following:
• Data collected in one location may not be suitable for another due to varying environmental factors
[Figure: Classification of Secondary Data — Internal (ready to use; requires further processing) and External (published materials; computerized databases; syndicated services)]
• With the passage of time, the data become obsolete
• Secondary data can distort the results of the research; special care is required to amend or modify them for use
• Secondary data can also raise issues of authenticity and copyright
The objectives, nature, and methods used to collect the secondary data may not be
appropriate to the present situation. Also, secondary data may be lacking in accuracy, or they
may not be completely current or dependable. Before using secondary data, it is important to
evaluate them on these factors. These factors are discussed in more detail in the following
table.
The major limitation of secondary analysis of published data is that, owing to space constraints, only the most important variables and summary measures are reported. There is considerable information in the original data that cannot be recovered from the summary measures reported in a published article. If the original data were available, this limitation of secondary data analysis would disappear.
Criteria for Evaluating Secondary Data:
Criteria for Evaluating Secondary Data

• Specifications/Methodology — Issues: data collection method, response rate, quality of data, sampling technique, sample size, questionnaire design, fieldwork, data analysis. Remarks: data should be reliable, valid, and generalizable to the problem at hand.
• Error/Accuracy — Issues: examine errors in approach, research design, sampling, data collection, data analysis, and reporting. Remarks: assess accuracy by comparing data from different sources.
• Currency — Issues: time lag between collection and publication; frequency of updates. Remarks: census data are periodically updated by syndicated firms.
• Objective — Issues: why were the data collected? Remarks: the objective will determine the relevance of the data.
• Nature — Issues: definition of key variables, units of measurement, categories used, relationships examined. Remarks: reconfigure the data to increase their usefulness, if possible.
• Dependability — Issues: expertise, credibility, reputation, and trustworthiness of the source. Remarks: data should be obtained from an original rather than an acquired source.
7. 5 | P a g e
The quality of secondary data should be routinely evaluated using the criteria presented above. The specifications or the methodology used to collect the data should be critically examined to identify possible sources of bias. Such methodological considerations include the size and nature of the sample, response rate and quality, questionnaire design and administration, procedures used for fieldwork, and data analysis and reporting procedures. These checks provide information on the reliability and validity of the data and help determine whether they can be generalized to the problem at hand. The reliability and validity can be further ascertained by an examination of the error, currency, objectives, nature, and dependability associated with the secondary data.
Usage of Secondary Data Sources in Marketing:
With secondary research in hand, the next step is to review your source materials to pull out
the insights that are most pertinent to your marketing problem. Some secondary research
sources may include data you can analyze and map to your own customer segmentation or
other market analyses. Other secondary research provides analysis and insights you can use
to develop implications and recommendations for your organization and marketing problem.
It is helpful to capture key findings and recommendations from the secondary research review
and analysis, just as you would for a primary research project. The goal is to summarize what
you have learned, making it easier for any primary research activity to build on what has
already been discovered from secondary research.
Secondary Data Sources that are Public:
Several public data sources exist, and most of the data uploaded to them are secondary data sets collected for a specific purpose and analysis. A few top sites with free data sets are listed below:
• Kaggle (https://www.kaggle.com/)
• GitHub (https://github.com/)
• UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php)
• Openml (https://www.openml.org/)
• Reddit (https://www.reddit.com/r/datasets/)
• Quandl (https://www.quandl.com/)
• U.S. Government’s open data (https://www.data.gov/)
• India Government’s open data (https://data.gov.in/)
There are many more websites where dataset repositories have been created by book publishing houses, universities, and authors' private blogs to help readers use the datasets referenced in a book. Moreover, a few authors use GitHub as their repository as well. For example, the data files for the book "Basic Practice of Statistics, 7th edition" are kept at https://www.austincc.edu/mparker/software/data/
Secondary Data Sources that are Purchased:
These are data that are used in industry or are covered by a non-disclosure agreement between two parties. Data that are copyright protected, company classified data, stock market data such as Bloomberg's, and online competition or hackathon data are some of the secondary datasets that must be purchased because they carry high-value insights. To name a few sites:
• Bloomberg (https://www.bloomberg.com/professional/solution/bloomberg-terminal/)
• Harvard Business School Datasets used in Education Institutions
(https://hbsp.harvard.edu/search?N=4294930434+516162&Ns=publication_date_filter
%7C1%7C%7Caggregate_sort%7C0&action=sort)
• Stock Market datasets used by traders and researchers
(https://www.nseindia.com/market-data/analytical-products)
How Secondary Data is used:
For a primary analysis, the investigator must select summary measures to report in the text,
figures, tables, and short appendices. The decision about how to summarize the data is an
important one, because it is irreversible. It is seldom possible to regenerate the original data
from the summaries and, therefore, the published summary measures of performance can
only rarely be used to examine alternative measures of performance. For example, a study of
classical conditioning that reports absolute or relative rates (or probabilities) of responding in
the presence and absence of a stimulus cannot be used to evaluate a real-time theory of
conditioning because the time of responding has been eliminated from the record.
One common use is meta-analysis. Meta-analysis has become an important research strategy because it enables researchers to combine the results of many pieces of research on a topic to determine whether the findings hold generally. This is better than assuming that the findings of a single study have global meaning.
Meta-analysis is an objective and quantitative methodology for synthesizing previous studies and research on a particular topic into an overall finding.
The term meta-analysis means 'an analysis of analyses'. A study on a particular topic may have been replicated in various ways, using, for example, differently sized samples, and conducted in different countries under different environmental, social, and economic conditions. Sometimes results appear reasonably consistent; sometimes less so. Meta-analysis enables a rigorous comparison to be made rather than a subjective 'eyeballing'. However, the technique relies on all relevant information being available for each of the examined studies. If crucial factors like sample size and methodology are missing, then comparison is not feasible.
Meta-analysis allows us to compare or combine results across a set of similar studies. In an individual study, the units of analysis are the individual observations. In a meta-analysis, the units of analysis are the results of individual studies.
9. 7 | P a g e
There are three stages to this:
• Identify the relevant variables:
This sounds easy, but as with defining a hypothesis and determining research questions, you must be specific and clear about what your real focus is. You cannot simply say, 'I want to do a meta-analysis on attitude change research'. You must limit yourself to a much smaller segment, such as the effects of different types of feedback on job performance, or the effects of different levels of autonomy on group achievement of objectives in the service industry.
• Locate relevant research:
One issue that is vital and potentially serious for the meta-analyst is the file drawer problem. The file drawer phenomenon is serious for meta-analysis because it produces a biased sample: only those studies that reported statistically significant results tend to be published. This bias inflates the probability of making a Type I error (concluding that a variable has an effect when it does not). Studies that failed to be published are not available to be included in the meta-analysis. Because meta-analytic techniques ultimately lead to a decision based on the available statistical information, an allowance must be made for the file drawer phenomenon. There are two ways of dealing with it.
First, uncover those studies that never reach print by identifying as many researchers as possible in the area you are researching, then send each a questionnaire asking whether any unpublished research on the issue of interest exists. This may be impracticable in some topics, as identifying researchers is difficult and non-response may be high anyway. Note, though, that most secondary data is generated from structured, published datasets.
A second, more practical approach involves calculating the number of studies averaging null results (i.e., that did not reach significance) that would be required to push the significance level for all studies, retrieved and unretrieved combined, to the 'wrong' side of the significance level (p = 0.05). This is the so-called fail-safe N; a small computational sketch follows this list.
• Conduct the meta-analysis:
When you have located the relevant literature, collected your data, and are reasonably certain that the file drawer phenomenon is not an issue, you are ready to apply one of the many available meta-analytic statistical techniques. The heart of meta-analysis is the statistical combination of results across studies. Therefore, as well as recording the methodology, sample, design, hypotheses, conclusions, etc., you must record information from the results sections of the research papers you are reviewing, such as r's, t's, chi-squares, F's, and p values.
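The fail-safe N calculation described above is easy to automate. Below is a minimal Python sketch, assuming one-tailed p-values have already been extracted from k independent studies; it combines them with Stouffer's Z method and then applies Rosenthal's fail-safe N formula. The p-values here are hypothetical placeholders, not results from any real study set.

from scipy import stats

# Hypothetical one-tailed p-values extracted from k published studies.
p_values = [0.03, 0.01, 0.20, 0.04, 0.08]

# Convert each one-tailed p-value to a standard normal Z score.
z_scores = [stats.norm.isf(p) for p in p_values]
k = len(z_scores)

# Stouffer's method: combined Z is the sum of Z scores divided by sqrt(k).
z_combined = sum(z_scores) / k ** 0.5
p_combined = stats.norm.sf(z_combined)

# Rosenthal's fail-safe N: how many unpublished null-result studies would be
# needed to pull the combined result back to p = 0.05 (one-tailed Z = 1.645).
fail_safe_n = (sum(z_scores) / 1.645) ** 2 - k

print(f"Combined Z = {z_combined:.3f}, combined p = {p_combined:.4f}")
print(f"Fail-safe N = {max(fail_safe_n, 0.0):.1f} studies")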
Apart from meta-analysis, other analyses of secondary data are performed by many practitioners to extract deep insights for decision making. Some of them are:
• Cluster Analysis
• Predictive Analysis
• Discriminant Analysis
• Anomaly Detection
• Business Analysis for Decision Making
Who all uses Secondary Data:
Secondary data is used by many people across domains. Once raw data is generated in an initial analysis or meta-analysis phase, many people engage in data preprocessing and data cleansing activities. Much of the data is normalized and standardized (feature scaling) to make it comparable, and it is then further tested using statistical tools and hypothesis tests; a short sketch of this scaling step follows below. In fact, many researchers, students, data analysts, and data scientists rely on secondary data to prepare exit reports and to publish their research papers. Higher management and stakeholders mostly deal with secondary data, as these datasets are the ones that can provide them with relevant insights.
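As an illustration of the scaling step mentioned above, here is a small Python sketch using scikit-learn; the two columns and four rows are taken from the example dataset introduced in the next section.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Four sample rows from the graduate-admissions dataset used later on.
df = pd.DataFrame({"GRE Score": [337, 324, 316, 322],
                   "CGPA": [9.65, 8.87, 8.00, 8.67]})

# Standardization: rescale each column to mean 0 and unit variance.
standardized = pd.DataFrame(StandardScaler().fit_transform(df), columns=df.columns)

# Normalization: rescale each column to the [0, 1] range.
normalized = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns)

print(standardized.round(2))
print(normalized.round(2))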
Real-Life Application:
Below is an example analysis of a graduate-admission (GRE) dataset. I have taken this dataset from Kaggle, where the chance of admission is predicted based on GRE Score, TOEFL Score, University Rating, Statement of Purpose (SOP) rating, Letter of Recommendation (LOR) rating, CGPA, Research experience, and the percentage Chance of Admit. I chose this dataset because I am curious to know who joins the premier colleges, with what scores, and which variables matter most.
Example:
Below are the top 4 records. Please check the reference section (the last entry) for the link to download the dataset.
Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
1           337        118          4                  4.5  4.5  9.65  1         0.92
2           324        107          4                  4    4.5  8.87  1         0.76
3           316        104          3                  3    3.5  8     1         0.72
4           322        110          3                  3.5  2.5  8.67  1         0.8
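The analysis below was carried out in SPSS, but for readers who prefer code, the dataset can be loaded with a couple of lines of Python. This is a minimal sketch; the file name Admission_Predict.csv comes from the Kaggle reference at the end of this report, and the exact column headers in the CSV may differ slightly (for example, by trailing spaces).

import pandas as pd

df = pd.read_csv("Admission_Predict.csv")
print(df.head(4))      # the four records shown above
print(df.describe())   # quick descriptive statistics for all 400 rows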
Exploratory Data Analysis:
Before we run any analysis (like discriminant or cluster analysis), we need to review its assumptions by exploring the dataset; a Python sketch of these checks follows the list below.
11. 9 | P a g e
• The predictors must be independent
• Group membership must be mutually exclusive; for example, a student cannot be admitted and not admitted at the same time
• Outliers must be absent or negligible
• The predictors should be normally distributed, and the within-group variance-covariance matrices should be equal across groups
• There should be no multicollinearity issues between the variables in the dataset
• The categorical variables in the dataset should be balanced, otherwise their effects may be diluted
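The checks in this report were done through SPSS output; below is a rough Python equivalent for screening the same assumptions (missing values, normality, outliers, multicollinearity). It assumes the DataFrame loaded earlier; adjust the predictor names to match the CSV headers.

import pandas as pd
from scipy import stats

df = pd.read_csv("Admission_Predict.csv")
predictors = ["GRE Score", "TOEFL Score", "University Rating", "SOP", "LOR", "CGPA"]

# Missing-value counts per predictor (should all be zero here).
print(df[predictors].isna().sum())

# Shapiro-Wilk normality test for each predictor.
for col in predictors:
    stat, p = stats.shapiro(df[col])
    print(f"{col}: W = {stat:.3f}, p = {p:.4f}")

# Flag observations more than 3 standard deviations from the column mean.
z = (df[predictors] - df[predictors].mean()) / df[predictors].std()
print((z.abs() > 3).sum())

# Screen for multicollinearity via the pairwise correlation matrix;
# correlations close to 1 signal a problem.
print(df[predictors].corr().round(2))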
Case Processing Summary

                     Valid           Missing         Total
                     N     Percent   N    Percent    N     Percent
GRE Score            400   100.0%    0    0.0%       400   100.0%
TOEFL Score          400   100.0%    0    0.0%       400   100.0%
University Rating    400   100.0%    0    0.0%       400   100.0%
SOP                  400   100.0%    0    0.0%       400   100.0%
LOR                  400   100.0%    0    0.0%       400   100.0%
CGPA                 400   100.0%    0    0.0%       400   100.0%
Research             400   100.0%    0    0.0%       400   100.0%
Chance of Admit      400   100.0%    0    0.0%       400   100.0%
We can see that there are no missing data; descriptive statistics for each variable were generated by SPSS (kindly check the attached SPSS files for the full analysis results). Moreover, many variables are approximately normally distributed, which means we can go ahead with data analysis techniques like clustering and logistic regression. We have also identified outliers in variables like LOR, CGPA, and Chance of Admit.
Cluster Analysis:
We further observe that there is no categorical decision variable present in the dataset. Looking at the dataset and the histograms above, the percentage Chance of Admit needs to be binned into categories to create a decision variable. Before binning, let us run the Two-Step clustering algorithm to understand the model summary and cluster quality.
Cluster Membership by Research

            Research = 0          Research = 1
            Frequency  Percent    Frequency  Percent
Cluster 1   79         43.6%      71         32.4%
Cluster 2   1          0.6%       114        52.1%
Cluster 3   101        55.8%      34         15.5%
Combined    181        100.0%     219        100.0%
Now we can confirm that we have 3 clusters based on the cluster quality. (Please check the descriptive statistics for the other variables in the SPSS output file named Exploratory_Data_Analysis.) Using the Chance of Admit variable, we categorize admission chances into three statuses: High Chance (> 80%), Low Chance (< 60%), and Medium Chance (the rest); a small binning sketch follows below.
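A one-step Python sketch of this binning, assuming the DataFrame loaded earlier (boundary handling at exactly 0.60 and 0.80 is approximated with right-closed intervals):

import pandas as pd

df = pd.read_csv("Admission_Predict.csv")
df["Status"] = pd.cut(df["Chance of Admit"],
                      bins=[0.0, 0.60, 0.80, 1.0],
                      labels=["Low Chance", "Medium Chance", "High Chance"])
print(df["Status"].value_counts())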
Performing K-means clustering, we obtain the following output, with all variables being significant in the ANOVA table.
Cluster analysis is the statistical method of partitioning a sample into homogeneous classes to produce an operational classification.
Final Cluster Centers

                    Cluster 1   Cluster 2   Cluster 3
GRE Score           329         316         301
TOEFL Score         114         106         101
University Rating   4           3           2
SOP                 4.1         3.3         2.6
LOR                 4.0         3.4         2.8
CGPA                9.19        8.48        7.98
Research            1           0           0

Initial Cluster Centers

                    Cluster 1   Cluster 2   Cluster 3
GRE Score           340         318         290
TOEFL Score         120         100         104
University Rating   5           2           4
SOP                 4.5         2.5         2.0
LOR                 4.5         3.5         2.5
CGPA                9.91        8.54        7.46
Research            1           1           0
Because we usually don't know the number of groups or clusters that will emerge in our sample, and because we want an optimum solution, a two-stage sequence of analysis occurs as follows:
• We carry out a Two-Step cluster analysis (hierarchical clustering using Ward's method, applying squared Euclidean distance as the distance or similarity measure). This helps determine the optimum number of clusters we should work with.
• The next stage is to run another clustering technique, such as K-means cluster analysis, with the selected number of clusters, which enables us to allocate every case in our sample to a particular cluster.
This sequence and methodology, as implemented in SPSS, is described in more detail below; a rough Python equivalent follows. There are a variety of clustering procedures, of which hierarchical cluster analysis is the major one.
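A rough Python counterpart to this two-stage sequence, using scipy for Ward's hierarchical clustering and scikit-learn for K-means. The column names and cluster count follow this report; this is a sketch, not a reproduction of the SPSS output.

import pandas as pd
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("Admission_Predict.csv")
X = StandardScaler().fit_transform(
    df[["GRE Score", "TOEFL Score", "University Rating",
        "SOP", "LOR", "CGPA", "Research"]])

# Stage 1: Ward's method; a large jump in the last merge distances
# suggests where to cut the dendrogram, i.e. the number of clusters.
Z = linkage(X, method="ward")
print(Z[-5:, 2])

# Stage 2: K-means with the chosen number of clusters (3, as in the report)
# allocates every case in the sample to a particular cluster.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
df["Cluster"] = kmeans.labels_
print(df.groupby("Cluster").mean(numeric_only=True).round(2))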
Here Cluster 1 corresponds to High Chance, Cluster 2 to Medium Chance, and Cluster 3 to Low Chance. I have plotted a few other cluster graphs in the SPSS output for a better understanding of the data.
Discriminant Analysis:
Discriminant Function Analysis (DA) undertakes the same task as multiple linear regression: predicting an outcome. However, multiple linear regression is limited to cases where the dependent variable is an interval variable, so that the combination of predictors will, through the regression equation, produce estimated mean population numerical Y values for given weighted combinations of X values. But many interesting outcome variables are categorical, such as the chance of getting admitted in this case analysis.
In SPSS, we calculate the within-group statistics with the function coefficients checked for both Fisher's and unstandardized. Moreover, as the group sizes are not the same across the dataset, we set the prior probabilities to be computed from the group sizes (this option was selected in the SPSS dialog, shown in a screenshot in the original slides). A rough scikit-learn counterpart is sketched below.
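This is a sketch under the binning assumptions above, not the exact SPSS computation; priors=None tells scikit-learn to infer class priors from group sizes, mirroring the SPSS option used here.

import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

df = pd.read_csv("Admission_Predict.csv")
df["Status"] = pd.cut(df["Chance of Admit"], bins=[0.0, 0.60, 0.80, 1.0],
                      labels=["Low Chance", "Medium Chance", "High Chance"])

X = df[["GRE Score", "TOEFL Score", "University Rating",
        "SOP", "LOR", "CGPA", "Research"]]
y = df["Status"].astype(str)

# priors=None infers class priors from group sizes, as in the SPSS setting.
lda = LinearDiscriminantAnalysis(priors=None).fit(X, y)
print(lda.priors_.round(3))   # should be close to .185 / .495 / .320
print(lda.score(X, y))        # resubstitution hit ratio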
Discriminant Analysis SPSS Report

Tests of Equality of Group Means

                    Wilks' Lambda   F         df1   df2   Sig.
GRE Score           .438            255.147   2     397   .000
TOEFL Score         .433            259.859   2     397   .000
University Rating   .527            178.399   2     397   .000
SOP                 .572            148.728   2     397   .000
LOR                 .626            118.377   2     397   .000
CGPA                .329            405.413   2     397   .000
Research            .695            87.261    2     397   .000
Pooled Within-Groups Matrices (Correlations)

                    GRE Score  TOEFL Score  University Rating  SOP    LOR    CGPA   Research
GRE Score           1.000      .624         .319               .248   .199   .578   .305
TOEFL Score         .624       1.000        .372               .339   .219   .564   .137
University Rating   .319       .372         1.000              .523   .427   .441   .114
SOP                 .248       .339         .523               1.000  .552   .422   .141
LOR                 .199       .219         .427               .552   1.000  .380   .105
CGPA                .578       .564         .441               .422   .380   1.000  .153
Research            .305       .137         .114               .141   .105   .153   1.000
Box's Test of Equality of Covariance Matrices

Box's M     184.784
F Approx.   3.201
df1         56
df2         175151.554
Sig.        .000

This tests the null hypothesis of equal population covariance matrices. Box's M is 184.784 with F = 3.201, which is significant at p < .001.
Log Determinants

Status                 Rank   Log Determinant
Low Chance             7      -.256
Medium Chance          7      -.256
High Chance            7      -3.519
Pooled within-groups   7      -.834

The ranks and natural logarithms of the determinants printed are those of the group covariance matrices.
Summary of Canonical Discriminant Functions

Eigenvalues

Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          2.496a       98.7            98.7           .845
2          .032a        1.3             100.0          .175
a. First 2 canonical discriminant functions were used in the analysis.
A canonical correlation of .845 suggests the model explains 71.4% (.845² ≈ .714) of the variation in the grouping variable, i.e., a candidate's admission-chance group.
Wilks' Lambda
Test of Function(s) Wilks' Lambda Chi-square df Sig.
1 through 2 .277 505.458 14 .000
2 .969 12.277 6 .056
Standardized Canonical Discriminant Function Coefficients

                    Function 1   Function 2
GRE Score           .109         -.155
TOEFL Score         .214         .497
University Rating   .145         .601
SOP                 .051         -.447
LOR                 .085         -.611
CGPA                .567         -.306
Research            .235         .496

Canonical Discriminant Function Coefficients (Unstandardized)

                    Function 1   Function 2
GRE Score           .014         -.020
TOEFL Score         .053         .124
University Rating   .175         .723
SOP                 .067         -.585
LOR                 .119         -.857
CGPA                1.653        -.891
Research            .565         1.191
(Constant)          -25.987      2.849
Functions at Group Centroids

Status          Function 1   Function 2
Low Chance      -2.253       .272
Medium Chance   -.514        -.169
High Chance     2.097        .105

Unstandardized canonical discriminant functions evaluated at group means.
Structure Matrix

                    Function 1   Function 2
CGPA                .904*        -.194
TOEFL Score         .724*        .234
GRE Score           .717*        .089
University Rating   .600*        .164
SOP                 .546*        -.398
Research            .417*        .411
LOR                 .484         -.586*
Classification Statistics

Prior Probabilities for Groups

Status          Prior   Cases Used in Analysis
                        Unweighted   Weighted
Low Chance      .185    74           74.000
Medium Chance   .495    198          198.000
High Chance     .320    128          128.000
Total           1.000   400          400.000
Classification Function Coefficients (Fisher's linear discriminant functions)

                    Low Chance   Medium Chance   High Chance
GRE Score           5.833        5.867           5.899
TOEFL Score         .519         .557            .731
University Rating   -14.127      -14.143         -13.488
SOP                 -3.695       -3.320          -3.306
LOR                 .776         1.362           1.439
CGPA                12.363       15.632          19.704
Research            -29.497      -29.039         -27.237
(Constant)          -944.747     -987.801        -1057.368
Classification Results (rows = observed status, columns = predicted group membership)

Original counts

                Low Chance   Medium Chance   High Chance   Total
Low Chance      46           28              0             74
Medium Chance   8            175             15            198
High Chance     0            12              116           128

Original percentages

                Low Chance   Medium Chance   High Chance   Total
Low Chance      62.2         37.8            .0            100.0
Medium Chance   4.0          88.4            7.6           100.0
High Chance     .0           9.4             90.6          100.0

Cross-validated counts

                Low Chance   Medium Chance   High Chance   Total
Low Chance      44           30              0             74
Medium Chance   12           169             17            198
High Chance     0            13              115           128

Cross-validated percentages

                Low Chance   Medium Chance   High Chance   Total
Low Chance      59.5         40.5            .0            100.0
Medium Chance   6.1          85.4            8.6           100.0
High Chance     .0           10.2            89.8          100.0
Summary of Discriminant Analysis:
Wilks' lambda indicates the significance of the discriminant function. The table above indicates a highly significant function (p < .001) and provides the proportion of total variability not explained, i.e., the converse of the squared canonical correlation. So we have 27.7% unexplained.
From the Canonical Discriminant Function Coefficients table, the unstandardized coefficients are used to create the discriminant function (equation):

D = (0.014 × GRE Score) + (0.053 × TOEFL Score) + (0.175 × University Rating) + (0.067 × SOP) + (0.119 × LOR) + (1.653 × CGPA) + (0.565 × Research) − 25.987
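As a quick sanity check, evaluating this function for the first record of the example table (GRE 337, TOEFL 118, rating 4, SOP 4.5, LOR 4.5, CGPA 9.65, Research 1) gives a score closest to the High Chance centroid, consistent with that record's 0.92 chance of admission:

# Unstandardized coefficients from the table above.
coef = {"GRE Score": 0.014, "TOEFL Score": 0.053, "University Rating": 0.175,
        "SOP": 0.067, "LOR": 0.119, "CGPA": 1.653, "Research": 0.565}

# First record of the example table shown earlier.
record_1 = {"GRE Score": 337, "TOEFL Score": 118, "University Rating": 4,
            "SOP": 4.5, "LOR": 4.5, "CGPA": 9.65, "Research": 1}

D = sum(coef[k] * record_1[k] for k in coef) - 25.987
print(f"D = {D:.2f}")  # about 3.04; nearest group centroid is High Chance (2.097)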
Finally, there is a classification phase. The classification table above, also called a confusion table, is simply a table in which the rows are the observed categories of the dependent variable and the columns are the predicted categories. When prediction is perfect, all cases lie on the diagonal, and the percentage of cases on the diagonal is the percentage of correct classifications. The cross-validated results are a more honest presentation of the power of the discriminant function than the original classifications and often show a poorer outcome; cross-validation also produces a more reliable function, the argument being that one should not use the case one is trying to predict as part of the categorization process. Here, 82.0% of cross-validated grouped cases were correctly classified.

The classification results reveal that 84.3% of respondents were classified correctly into the 'High', 'Medium', or 'Low' chance groups. This overall predictive accuracy of the discriminant function is called the 'hit ratio'; a quick check is sketched below. High Chance is predicted with better accuracy (90.6%) than Medium (88.4%) and Low (62.2%).
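A quick verification of these hit ratios from the confusion table counts:

import numpy as np

# Original classification counts from the SPSS confusion table
# (rows: observed Low/Medium/High; columns: predicted Low/Medium/High).
confusion = np.array([[46, 28, 0],
                      [8, 175, 15],
                      [0, 12, 116]])

hit_ratio = np.trace(confusion) / confusion.sum()
print(f"Overall hit ratio: {hit_ratio:.2%}")   # 84.25%, quoted as 84.3%

per_group = np.diag(confusion) / confusion.sum(axis=1)
print(per_group.round(3))                      # [0.622, 0.884, 0.906]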
My Experience:
As part of a corporate restructuring, merger, and acquisition, I worked for British Telecom as a contract employee on behalf of my company, TCS. As a Data Analyst I was part of the data cleansing team for asset migration activities. On the floor, I dealt with a deluge of computer-generated data; however, there were a lot of discrepancies and redundancies in the data, which the entire team used to cleanse before sending it to the stakeholders. We also performed analyses similar to those described above, though they were more related to the telecom domain. It was a good experience in dealing with a lot of secondary data, and now I realize its real use and can connect the dots, having figured out where those data originally came from. I cannot discuss much about the data I used due to a non-disclosure agreement, but I can say that it now makes a lot of sense to me and clarifies the difference between primary and secondary data.
References:
The following references were used in preparing this project:
• The Effective Use of Secondary Data, journal article published by Elsevier. https://www.sciencedirect.com/science/article/abs/pii/S0023969001910987
• Accessing Secondary Data: A Literature Review, ResearchGate, by Kalu Alexanda, Larry Chukwuemeka Unachukwu, and Oti Ibiam. https://www.researchgate.net/publication/328067351_Accessing_Secondary_Data_A_Literature_Review
• Discriminant Analysis, Chapter 25 of Business Research Methods and Statistics using SPSS by Robert B. Burns and Richard A. Burns, published by Sage. https://github.com/DragonflyStats/AdvancedStatistics/blob/master/Chapter%2025%20-%20Discriminant%20Analysis.pdf
• Marketing Research by Naresh K. Malhotra, an applied-orientation textbook.
• Class notes and slides from our Marketing Analytics class by Plavini Punyatoya, for discriminant analysis and cluster analysis.
• Dataset from Kaggle: graduate admissions (Admission_Predict.csv). https://www.kaggle.com/mohansacharya/graduate-admissions