Using data mining in e-commerce
1. HARRAN UNIVERSITY DEPARTMENT OF COMPUTER ENGINEERING
USING DATA MINING IN E-COMMERCE
27 MAY 2020
PRESENTED BY
SHAHAB H. KAKA ALI
2. Contents
Introduction
What is data mining? A brief review
What is e-commerce? A brief review
Using data mining techniques in e-commerce
Aims of data mining techniques in e-commerce
DM applied to retail e-commerce
Advantages and disadvantages of using DM in e-commerce
Conclusion
References
3. INTRODUCTION
E-commerce has changed the face of most business functions in competitive
enterprises. Internet technologies have automated the interface processes
between customers and retailers, retailers and distributors, distributors and factories,
and factories and their suppliers. In general, e-commerce and e-business have
enabled online transactions, and generating large-scale real-time data has never
been easier. It is therefore apt to use data mining to make business
sense of these data sets. The dominant goal of data mining (DM) is the
generation of non-obvious yet useful information for decision makers from very
large databases. The mechanisms of this generation include abstractions,
aggregations, summarizations, and characterizations of data.
4. What is data mining?
Data mining is the process of discovering patterns in large data sets using methods at
the intersection of machine learning, statistics, and database systems. It
helps extract information from huge sets of data and predict
behaviors and future trends, allowing businesses to make proactive,
knowledge-driven decisions. Colin Shearer is sometimes called the father of data mining.
The most useful data mining techniques are:
Clustering (descriptive)
Regression (predictive)
Association Rule Discovery (descriptive)
Classification (predictive)
Outlier Detection
5. Data Mining Tasks
1. Classification: determines the class of an object from its attributes. A set of objects is
given as the training set. Each object is represented by a vector of attributes followed
by its class. The classification model is built by analyzing the relationship between the
attributes and classes in the training set.
2. Clustering: the process of segmenting data to extract previously unrecognized groups whose
members share the same characteristics.
3. Association rules: finds relationships between objects, indicating
that the appearance of one set of objects in a database is strongly related to the
appearance of another set of objects.
4. Description: one of the crucial tasks in data mining; it summarizes a complex
database in order to provide explanations. For example, online shopping
can be summarized by the total amount spent, the total number of items bought, and so on.
5. Prediction: like classification, except that its results lie in the future (for example, a
marketing forecast, or predicting the commercial value of a stock three months in the future).
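The classification task in item 1 can be sketched as follows. This is only a minimal illustration, assuming a 1-nearest-neighbour rule and an invented training set of (age, visits per month) vectors. The slides do not prescribe a particular classifier, and every name and value here is hypothetical.

```python
import math

# Hypothetical training set: each object is an attribute vector followed by its class.
# Attributes: (age, visits per month); class: whether the customer made a purchase.
TRAINING_SET = [
    ((22, 1), "no"),
    ((25, 2), "no"),
    ((41, 9), "yes"),
    ((38, 12), "yes"),
]

def classify(attributes, training_set=TRAINING_SET):
    """Return the class of the training object closest to `attributes` (1-NN)."""
    nearest = min(training_set, key=lambda row: math.dist(row[0], attributes))
    return nearest[1]

print(classify((40, 10)))  # nearest training object is (41, 9)
print(classify((23, 1)))   # nearest training object is (22, 1)
```

A real model would be trained on far more objects and validated on held-out data; the point here is only the shape of the input (attribute vector plus class) and of the output (a predicted class).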
6. Some Common Data Mining Tools
1. Weka: a collection of machine learning algorithms for data mining tasks.
It includes all the standard data mining procedures, such as data
preprocessing, clustering, association, classification, regression, and
attribute selection.
2. NLTK: mainly used for language processing tasks. It brings together a pool of
language processing tools along with machine learning, data mining, sentiment
analysis, and data scraping.
3. RapidMiner: a flexible, user-friendly data mining tool offered as
a service. Apart from data mining functions, it supports visualization, prediction,
data preprocessing, deployment, statistical modelling, and
evaluation.
4. KNIME: a powerful tool with a GUI that shows the network of data
nodes. It is used primarily for data preprocessing, i.e. ETL.
7. Basic data mining steps:
1. Data selection: identifying the kind of data to be mined. At the end of this step, the
right input attributes and output information to represent the task have been chosen.
2. Data transformation: organizing the data according to the requirements by
removing noise, converting one type of data to another, normalizing the data if needed,
and defining the strategy for handling missing data.
3. Data mining step per se: the transformed data is mined using any of the techniques to
extract patterns of interest. The miner can help the data mining method by performing the
preceding steps correctly.
4. Result interpretation and validation: to better understand the data and the synthesized
knowledge, together with its validity span, robustness is checked by testing the data mining
application. The information retrieved can also be evaluated by comparing it with earlier
expertise in the application domain.
5. Incorporation of discovered knowledge: presenting the discovered
knowledge to decision makers so that they can compare it, or check it for conflicts, with
earlier extracted knowledge and apply newly discovered patterns.
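The data transformation step (step 2) can be sketched as follows, assuming mean imputation as the missing-data strategy and min-max scaling as the normalization. Both choices, and the sample values, are illustrative assumptions rather than something the slides specify.

```python
from statistics import mean

def fill_missing(values):
    """Replace None with the mean of the observed values (one possible strategy).

    Assumes at least one value is present.
    """
    observed = [v for v in values if v is not None]
    fallback = mean(observed)
    return [fallback if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

order_totals = [20.0, None, 60.0, 100.0]  # hypothetical raw attribute with a gap
cleaned = fill_missing(order_totals)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```

Other strategies, such as dropping incomplete records or imputing with the median, fit the same place in the pipeline; which one is appropriate depends on the data.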
8. What is e-commerce
Electronic commerce, or internet commerce, refers to the buying
and selling of goods or services using the internet, and the transfer
of money and data to execute these transactions.
The history of e-commerce begins with the first ever online sale: on
August 11, 1994, a man sold a CD by Sting to his
friend through his website NetMarket, an American retail platform.
This is the first example of a consumer purchasing a product from
a business through the World Wide Web, or “e-commerce” as we
commonly know it today.
9. Types of Ecommerce Models
Business to Consumer (B2C): When a business sells a good or service to an
individual consumer (e.g. You buy a pair of shoes from an online retailer).
Business to Business (B2B): When a business sells a good or service to another
business (e.g. A business sells software-as-a-service for other businesses to
use).
Consumer to Consumer (C2C): When a consumer sells a good or service to
another consumer (e.g. You sell your old furniture on eBay to another
consumer).
Consumer to Business (C2B): When a consumer sells their own products or
services to a business or organization (e.g. An influencer offers exposure to
their online audience in exchange for a fee, or a photographer licenses their
photo for a business to use).
10. Using DM techniques in e-commerce
This refers to applying data mining techniques in e-commerce to provide
automated prediction of trends and behaviors and to automate
the process of finding predictive information, especially in databases
of sufficient size and quality, from which data mining technology can generate
new business opportunities.
For example, when you shop online at an electronic
market, data mining techniques make your shopping easier
by recommending items you are likely to want based on your age, sex, region,
the season, and so on. They also benefit sellers, because they create
a good market for their items.
11. Aims of DM techniques in e-commerce
Customer Profiling: the collection of information about users' behavior across many different
channels in order to build a profile of their habits and interests.
Personalization of Service: tailoring the service to each visitor using data from many types of
visitor interaction, from reading email to visiting brick-and-mortar stores and visiting your mobile site.
Basket Analysis: identifies the products that customers buy together. Based on this,
recommendations can be displayed on the e-commerce site, which increases sales.
Sales Forecasting: a sales forecast is an estimate of future sales, used to assess how to manage
future cash flow (how money is going to come in and go out).
Merchandise Planning: a data-driven approach to selecting, buying, presenting, and selling
merchandise to maximize your return on investment and satisfy consumer demand by
making the right merchandise available at the right places, times, prices, and quantities.
Market Segmentation: the process of evaluating and classifying customer groups to
target marketing efforts and provide the best products.
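Basket analysis can be illustrated with the two standard association-rule measures, support and confidence. The sketch below only counts item pairs over a handful of invented orders. It is not a full algorithm such as Apriori, and the product names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is the set of products in one order.
TRANSACTIONS = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop", "bag"},
    {"mouse"},
]

def pair_rules(transactions):
    """Return {(a, b): (support, confidence)} for every rule 'a -> b'.

    support    = share of all orders that contain both a and b
    confidence = share of orders containing a that also contain b
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        rules[(a, b)] = (count / n, count / item_counts[a])
        rules[(b, a)] = (count / n, count / item_counts[b])
    return rules

rules = pair_rules(TRANSACTIONS)
support, confidence = rules[("laptop", "mouse")]
print(support, confidence)
```

A shop could recommend `b` to a customer who has `a` in the basket when both measures exceed thresholds chosen for the business; those thresholds are a design decision, not something the data dictates.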
12. DM applied to retail e-commerce
1. Collecting data at the right level of abstraction is very important.
2. User interface forms need to be designed with DM issues in mind, for instance by disabling
default values on important attributes such as gender, marital status, and employment status.
3. Certain important implementation parameters in retail e-commerce sites, such as the automatic
timeout of user sessions due to perceived inactivity at the user end, need to be based not purely on DM
algorithms but also on the relative importance of the users to the organization.
4. Generating logs for several million transactions is a costly exercise. It may be wise to generate
appropriate logs by conducting random sampling, as is done in statistical quality control.
5. Auditing of data procured for mining, from data warehouses, is mandatory. This is due to the fact
that the data warehouse might have collated data from several disparate systems with a high chance
of data being duplicated or lost during the ETL operations.
6. Mining data at the right level of granularity is essential. Otherwise, the results from the DM exercise
may not be correct.
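The audit in item 5 can start with something as simple as checking for duplicated keys after loading. The sketch below assumes that each record should be uniquely identified by an `order_id` field; the field names and rows are hypothetical.

```python
from collections import Counter

def audit_duplicates(records, key_fields):
    """Return {key: count} for keys that occur more than once in `records`."""
    keys = Counter(tuple(r[f] for f in key_fields) for r in records)
    return {k: c for k, c in keys.items() if c > 1}

# Hypothetical rows as they might look after loading from two source systems.
rows = [
    {"order_id": 1001, "customer": "A17", "total": 59.90},
    {"order_id": 1002, "customer": "B03", "total": 12.50},
    {"order_id": 1001, "customer": "A17", "total": 59.90},  # loaded twice
]

duplicates = audit_duplicates(rows, key_fields=("order_id",))
print(duplicates)
```

Detecting lost records needs a second check, for example comparing row counts per source system before and after the ETL run.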
13. Advantages
Marketing/Retail: data mining helps marketing companies build models based on historical data
to predict who will respond to new marketing campaigns such as direct mail and online
marketing campaigns. Using the results, marketers can take an appropriate
approach to selling profitable products to targeted customers.
Finance/Banking: data mining gives financial institutions insight into loans and
credit reporting. By building a model from historical customer data, banks and
financial institutions can distinguish good loans from bad ones. In addition, data mining helps
banks detect fraudulent credit card transactions to protect cardholders.
Manufacturing: by applying data mining to operational engineering data, manufacturers
can detect faulty equipment and determine optimal control parameters.
Governments: data mining helps government agencies dig through and analyze records
of financial transactions to build patterns that can detect money laundering or
criminal activities.
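Detecting suspicious transactions is an instance of outlier detection. As a rough illustration only, the sketch below flags amounts whose z-score exceeds a threshold. Real fraud detection uses far richer models, and the threshold and amounts here are assumptions made for the example.

```python
from statistics import mean, pstdev

def flag_outliers(amounts, threshold=2.0):
    """Return the amounts that lie more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All amounts are identical, so nothing stands out.
        return []
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Hypothetical card transactions for one customer.
amounts = [20, 22, 19, 21, 23, 20, 18, 500]
print(flag_outliers(amounts))
```

With very few observations a single extreme value inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.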
14. Disadvantages
Privacy issues: concerns about personal privacy have been increasing
because people are afraid that their personal information
is collected and used in an unethical way, or that it
may be sold to others or leaked.
Security issues: businesses hold information about their employees and
customers, including social security numbers, birthdays, payroll, etc. However,
how well this information is protected is still in question. There have
been many cases in which hackers accessed and stole large amounts of customer data from
big corporations such as Ford Motor Credit Company, Son...
Inaccurate information: if the users' information is not correct, it may turn
all the advantages into disadvantages.
15. Conclusion
Data mining plays an important role in the development of electronic
commerce applications. E-commerce and the fields related to business
intelligence and analytics have developed greatly due to the maturity of
data mining and other areas. These areas of research have become more
important due to the arrival of big data and the subsequent illumination
of the long tail in electronic commerce.
Data mining brings many benefits to e-commerce. However, privacy,
security, and inaccurate information are big problems if they are not
addressed and resolved properly.
16. References
1. Agrawal R, Srikant R 1994 Fast algorithms for mining association rules. In 20th Int. Conf. on Very Large
Databases (New York: Morgan Kaufmann) pp 487–499
2. Ansari S, Kohavi R, Mason L, Zheng Z 2001 Integrating e-commerce and data mining: architecture and
challenges. In Proc. 2001 IEEE Int. Conf. on Data Mining (New York: IEEE Comput. Soc.) pp 27–34
3. Auguste D M 2001 Customer service in e-business. IEEE Internet Comput. 5(5): 90–91
4. Berendt B, Hotho A, Stumme G 2002 Towards semantic web mining. In Proc. First Int. Semantic Web
Conference, Sardinia, Italy
5. Box G, Jenkins G, Reinsel G 1994 Time series analysis: Forecasting and control 3rd edn (Englewood Cliffs,
NJ: Prentice Hall)
6. Carbone P L 2000 Expanding the meaning of and applications for data mining. In IEEE Int. Conf. on Systems,
Man, and Cybernetics (New York: IEEE) pp 1872–1873
7. Glymour C, Madigan D, Pregibon D, Smyth P 1996 Statistical inference and data mining. Common.
8. Gujarati D 2002 Basic econometrics (New York: McGraw-Hill/Irwin)
9. Haykin S 1998 Neural networks: A comprehensive foundation 2nd edn (Englewood Cliffs, NJ: Prentice-Hall)
10. Hertz J, Krogh A, Palmer R G 1994 Introduction to the theory of neural computation (Reading,
MA: Addison-Wesley)
11. Hu X, Cercone N 2002 An OLAM framework for web usage mining and business intelligence reporting.
In Proc. IEEE Int. Conf. on Fuzzy Systems, FUZZ-IEEE’02 (New York: IEEE Comput. Soc.) pp 950–955
12. Jeng J J, Drissi Y 2000 Pens: a predictive event notification system for e-commerce environment. In The
24th Annu. Int. Computer Software and Applications Conference, COMPSAC 2000, pp 93–98
13. Kimball R, Reeves L, Ross M, Thornthwaite W 1998 The data warehouse lifecycle toolkit: Expert methods for
designing, developing, and deploying data warehouses (New York: Wiley)
14. Kohavi R 2001 Mining e-commerce data: The good, the bad, and the ugly.
In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining (KDD 2001) (New York: ACM Press) pp 8–13
15. Kohavi R, Mason L, Parekh R, Zheng Z 2004 Lessons and challenges from mining retail e-commerce
data. Machine Learning J. (Special Issue on Data Mining Lessons Learned)