The document discusses several use cases for semantic technologies in e-commerce applications including business data integration, content augmentation, master data management, semantic search, and recommendation engines. It proposes using a central data lake to integrate data from various applications and provide a single source of structured, linked data to power these use cases and enable more automated, flexible solutions compared to traditional point-to-point integrations.
2. BACKGROUND
• Christian Opitz
• Head of Business Development and Innovation at Netresearch
• Project manager, consultant, web developer, designer, entrepreneur since 2007
• Netresearch
• Leipzig based E-Commerce-Specialist founded in 1998
• Serves global enterprises in building and maintaining web platforms and shops
• Develops and maintains Shop Integrations for several payment and shipping providers
13 September 2016
3. LEDS
• Linked Enterprise Data Services:
• Integration and Management of background knowledge, enterprise and open data
• Monitoring of the data access and quality
• Data evolution
• Content analysis of unstructured text documents
• Scalable, topic-oriented and personalized search
• Iteratively tested in the domains of e-commerce and e-government.
• 4 industry partners (brox, Ontos, Lecos, Netresearch) and 2 research partners (Universität Leipzig, TU Chemnitz)
• 3-year project funded by the Federal Ministry of Education and Research (BMBF)
5. BUSINESS DATA INTEGRATION: PROBLEM
• (Web) IT infrastructure mostly consists of various applications for specific domains:
• Enterprise Resource Planning (ERP)
Holds basic product information like SKU and stock availability
• Shop Systems
Presentation of products to the customer, checkout, order tracking interface
• Content Management Systems (CMS)
Corporate website, additional information, landing pages
• Customer Relationship Management (CRM)
Management of all customer and lead related activities and information
• Product Information Management (PIM)
Management of product information by channel (website, shop, print catalogues etc.)
• Digital Asset Management (DAM)
Management of files, their conversions and metadata
6. BUSINESS DATA INTEGRATION: PROBLEM
• These applications must exchange data with each other based on business rules, e.g.:
• PIM requires the basic product information (like SKU) from ERP and asset data from DAM
• Shop requires stock information from ERP, product data from PIM, assets from DAM and possibly customer data and price rules from CRM
• ERP must be notified when products are ordered in the shop
• CRM must be notified of customer and lead activities and data, such as signups and orders from shop or CMS
• CMS requires assets from DAM, customer data from CRM and product data from PIM
• DAM should know where in PIM, shop or CMS each asset is used
• Often further complex business rules
• Mostly vendor-specific formats and services
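The exchange rules listed above can be written down as plain data to see how many direct connections they imply. The following is a minimal sketch; the payload labels and the function names are illustrative assumptions, not part of any vendor API.

```python
# Directed data flows (source, target, payload) read off the rules above.
# Payload labels are shortened paraphrases, chosen for illustration only.
FLOWS = [
    ("ERP", "PIM", "basic product information"),
    ("DAM", "PIM", "asset data"),
    ("ERP", "Shop", "stock information"),
    ("PIM", "Shop", "product data"),
    ("DAM", "Shop", "assets"),
    ("CRM", "Shop", "customer data and price rules"),
    ("Shop", "ERP", "orders"),
    ("Shop", "CRM", "signups and orders"),
    ("CMS", "CRM", "signups and orders"),
    ("DAM", "CMS", "assets"),
    ("CRM", "CMS", "customer data"),
    ("PIM", "CMS", "product data"),
    ("PIM", "DAM", "asset usage"),
    ("Shop", "DAM", "asset usage"),
    ("CMS", "DAM", "asset usage"),
]


def point_to_point_links(flows):
    """Distinct directed system pairs that each need their own connector."""
    return {(source, target) for source, target, _ in flows}


def systems(flows):
    """All systems involved; with a central hub each needs one adapter."""
    return {name for source, target, _ in flows for name in (source, target)}


if __name__ == "__main__":
    print(len(point_to_point_links(FLOWS)), "direct connections")
    print(len(systems(FLOWS)), "systems")
```

Counting the pairs makes the integration burden of direct wiring visible: every additional rule between two systems that were not yet connected adds another connector to build and maintain.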
7. BUSINESS DATA INTEGRATION: PROBLEM
• Today's approaches:
• Wiring applications directly:
• With existing or self-developed adapters/connectors for each system
• Costly when no existing adapters are available
• Introduces further dependencies
• Hinders upgrades
• Inflexible: changing business rules often requires changes in several systems
• Using middleware:
• ETL (extract, transform, load) software can handle huge amounts of data
• ESB (enterprise service bus) software orchestrates web services based on concrete business rules
• Affordable existing solutions from vendors like Talend, Pentaho or MuleSoft
• Laborious or expensive integration: steep learning curves, and standard scenarios are well-kept secrets
8. BUSINESS DATA INTEGRATION: SOLUTION
• Enterprise Data Lake:
• Reflects all relevant business data from several applications and domains
• Vendor-specific semi-structured data transformed into structured, linked data using suitable vocabularies
• Structured data stored in triplestore
• Data can be queried from any domain mixed with data from any other domain
• ETL/ESB middleware orchestrates data flow between applications via Data Lake
• Other applications can use and manipulate the data without having to know the actual source
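The idea of transforming vendor-specific records into linked data and querying across domains can be sketched as follows. This is an in-memory stand-in for a triplestore, not a real one; the record field names, the `ex:stockLevel` property and the `example.org` URIs are assumptions made for illustration, while `schema:sku` and `schema:name` refer to the schema.org vocabulary.

```python
# Hypothetical vendor-specific records; the field names are invented.
ERP_RECORD = {"ArtNr": "SKU-1", "Bestand": 12}
PIM_RECORD = {"sku": "SKU-1", "title": "Toy car"}

BASE = "http://example.org/product/"  # placeholder namespace


def erp_to_triples(record):
    """Map one ERP record to (subject, predicate, object) triples."""
    subject = BASE + record["ArtNr"]
    return {
        (subject, "schema:sku", record["ArtNr"]),
        (subject, "ex:stockLevel", record["Bestand"]),
    }


def pim_to_triples(record):
    """Map one PIM record to triples using the same subject URI scheme."""
    subject = BASE + record["sku"]
    return {
        (subject, "schema:sku", record["sku"]),
        (subject, "schema:name", record["title"]),
    }


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None is a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def products_in_stock(lake):
    """Join stock data (from ERP) with names (from PIM) via the subject."""
    names = []
    for subject, _, level in match(lake, p="ex:stockLevel"):
        if level > 0:
            names.extend(o for _, _, o in match(lake, s=subject, p="schema:name"))
    return sorted(names)


# The lake is simply the union of all transformed data.
LAKE = erp_to_triples(ERP_RECORD) | pim_to_triples(PIM_RECORD)

if __name__ == "__main__":
    print(products_in_stock(LAKE))
```

The query function never needs to know which application a triple came from. In a real setup the `match` helper would presumably be replaced by SPARQL queries against the triplestore.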
9. BUSINESS DATA INTEGRATION: SOLUTION
• Benefits
• Vendor and application independence:
• Because the lake mirrors each application's vendor-specific data as structured data, a system in the stack can be replaced by implementing only the data transformation for the new one
• Flexibility:
• Any application can work with the data lake without having to care about the sources and targets
• Easy integration of other linked data sources and applications
• Insights:
• Whole business data universe available to Business Intelligence applications
• Business critical questions can be answered quickly by reports based on any data from the lake
11. CONTENT AUGMENTATION: PROBLEM
• Writing, updating and linking editorial content with further or related information is a time-consuming process
• Crucial, especially for e-commerce companies:
• Time to publication
• Quality
• Quantity
… all influence visibility on the web
• Publishing regularly to social networks and reacting promptly to trending topics is vital but usually requires a dedicated social media manager
12. CONTENT AUGMENTATION: SOLUTION
• Using background knowledge to enrich and link contents
• Editor assistance:
• Editor's input is mined for ontologies
• Editor is presented with the ontologies along with the available background knowledge
• Editor can choose to include the background knowledge, possibly paraphrased (into title or longdesc attributes, footnotes, parentheses, inserted sentences, blocks, asides or even new landing pages)
• Automated augmentation:
• Include background knowledge for ontologies mined from existing contents
• Use background knowledge to link with other, suitable contents
• Automated publishing:
• Post suitable contents to social networks for trending topics based on background knowledge
• Enrich existing content with trending keywords
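The editor-assistance flow above (find known terms in the editor's input, present background knowledge, insert what the editor accepts) could look roughly like this. The `BACKGROUND` dictionary is a hypothetical stand-in for a real knowledge source, and inserting the note in parentheses is only one of the placements mentioned above.

```python
import re

# Hypothetical background knowledge: term -> short note.
BACKGROUND = {
    "Matchbox": "toy car brand",
    "Corgi": "toy car brand",
}


def _term_pattern(term):
    """Whole-word, case-insensitive pattern for a term."""
    return re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)


def suggest_annotations(text, background):
    """Return (term, note) pairs for every known term found in the text."""
    return [
        (term, note)
        for term, note in background.items()
        if _term_pattern(term).search(text)
    ]


def annotate(text, accepted):
    """Insert each accepted note in parentheses after the first mention."""
    for term, note in accepted:
        text = _term_pattern(term).sub(
            lambda m, note=note: f"{m.group(0)} ({note})", text, count=1
        )
    return text


if __name__ == "__main__":
    draft = "New Corgi models arrived"
    suggestions = suggest_annotations(draft, BACKGROUND)
    # In an editor UI the user would pick from the suggestions;
    # here all of them are accepted.
    print(annotate(draft, suggestions))
```

Keeping suggestion and insertion as separate steps mirrors the slide: the editor decides what is included, while the automated variant would simply accept every suggestion.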
13. CONTENT AUGMENTATION: BENEFITS
• Benefits
• Easier editing workflow
• Lower user churn by keeping visitors reading on the site
• Increased visibility in search engines
• Reduced social media management effort
• Quicker and wider social network coverage
15. MASTER DATA MANAGEMENT: PROBLEM
• Designing and modelling product data is a laborious process
• Product categorization and linking
• Defining attributes:
• Decide on type
• Configure enumerations and validations
• Modelling common attributes by product classes (attribute sets)
• Requires shop and content management, marketing and editorial knowledge, plus knowledge of the particular field of the products
• Mistakes can lead to poor visibility in search engines and higher bounce rates in the shop
16. MASTER DATA MANAGEMENT: SOLUTION
• Use existing semantic product information on the web:
• Find semantic product data on existing websites using the information already available (e.g. title, product class, SKU)
• The Web Data Commons dataset could be used to find the websites providing appropriate data
• Suggest product class, attributes, attribute sets and related products
• Product manager can then choose to adopt them selectively
• Optionally recrawl the semantic web regularly for updated information and notify the product manager
• Benefits:
• Reduced product information management effort
• Reduced time to market for resellers
• Eye on the market / up-to-date product information
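Suggesting attributes from product data found on the web can be sketched as a simple frequency count. The sample records, the attribute names and the `min_share` threshold are assumptions for illustration; how the records are retrieved (for example via the Web Data Commons dataset) is not shown.

```python
from collections import Counter

# Hypothetical structured product records found on other websites
# for the same product class.
FOUND_PRODUCTS = [
    {"name": "Toy car A", "brand": "X", "color": "red"},
    {"name": "Toy car B", "brand": "Y", "material": "metal"},
    {"name": "Toy car C", "color": "blue"},
]


def suggest_attributes(found_products, min_share=0.5):
    """Suggest attributes used by at least `min_share` of the records."""
    total = len(found_products)
    if total == 0:
        return []
    counts = Counter()
    for product in found_products:
        counts.update(product.keys())  # each attribute counts once per record
    return sorted(name for name, n in counts.items() if n / total >= min_share)


if __name__ == "__main__":
    # The product manager would review these and adopt them selectively.
    print(suggest_attributes(FOUND_PRODUCTS))
```

The output is only a list of candidates; as the slide notes, the product manager decides which of them to adopt.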
18. SEMANTIC SEARCH: PROBLEM
• Search queries for terms that are not in the index return no results, even when the index contains something related
• Example:
• A toy retailer sells Corgi toy cars in their web shop
• A user of the web shop searches for “Matchbox”
• Unless the retailer explicitly mentioned “Matchbox” in the product descriptions, the search returns no results
19. SEMANTIC SEARCH: SOLUTION
• Draw on background knowledge from linked open data sources at indexing time or at search time
• Match indexed content against the search term or against the background knowledge for that term
• Applied to the example:
• The search engine can find out that “Matchbox” relates to toy cars and can then find the Corgi cars (provided it previously indexed “toy cars” along with “Corgi”)
• Benefits:
• Better search results, or any results at all
• No need to manually provide index keywords under which items should be found
• When using the data lake, linked data beyond open data is also available to search against
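The Matchbox/Corgi example can be reproduced with a small sketch that enriches the index with background concepts and expands the query the same way. The `BACKGROUND` mapping is a hypothetical stand-in for knowledge that would come from linked open data sources; product ids and titles are invented.

```python
# Hypothetical background knowledge: term -> related concepts.
BACKGROUND = {
    "matchbox": {"toy cars"},
    "corgi": {"toy cars"},
}

# Hypothetical shop catalogue: product id -> title.
PRODUCTS = {
    "P1": "Corgi Mini Cooper",
    "P2": "Wooden puzzle",
}


def build_index(products, background):
    """Index each product under its title words plus related concepts."""
    index = {}
    for product_id, title in products.items():
        words = set(title.lower().split())
        concepts = set()
        for word in words:
            concepts |= background.get(word, set())
        index[product_id] = words | concepts
    return index


def search(term, index, background):
    """Match the term itself or any concept the term is related to."""
    term = term.lower()
    wanted = {term} | background.get(term, set())
    return sorted(pid for pid, keys in index.items() if keys & wanted)


if __name__ == "__main__":
    plain = search("Matchbox", build_index(PRODUCTS, {}), {})
    semantic = search("Matchbox", build_index(PRODUCTS, BACKGROUND), BACKGROUND)
    print(plain, semantic)
```

Without background knowledge the query finds nothing; with it, the shared concept "toy cars" connects the query to the Corgi product even though "Matchbox" never appears in the catalogue.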
21. RECOMMENDATION ENGINE: PROBLEM
• Providing web shop visitors with related products (up-/cross-selling) is usually done by:
• Manually linking the related products
• Time-consuming
• Error-prone
• Inflexible: changes are usually also time-consuming
• Using more or less elaborate and successful algorithms (e.g. “show products with the same category which are more expensive”)
• Either they do not give satisfying results
• Or they require extensive work to implement
• Or those of specialized vendors are expensive to use
22. RECOMMENDATION ENGINE: SOLUTION
• Automatically link related products based on background knowledge
• Semantic search can be utilized
• Linking rules could/should also draw on data from domains other than the product information (e.g. the purchase history of customers who bought this product from CRM, stock data from ERP)
• Benefits:
• No need to manually link products, develop custom algorithms or pay for costly implementations of existing ones
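A linking rule that combines data from several domains might look like the sketch below: it ranks products bought by the same customers (purchase history, as it would come from CRM) and drops those that are out of stock (as it would come from ERP). The data structures, SKUs and the function name are assumptions for illustration.

```python
from collections import Counter

# Hypothetical CRM data: one set of purchased SKUs per customer.
ORDERS = [
    {"car", "track"},
    {"car", "track", "garage"},
    {"car", "garage"},
    {"puzzle"},
]

# Hypothetical ERP data: SKU -> units in stock.
STOCK = {"car": 3, "track": 5, "garage": 0}


def recommend(product, orders, stock, limit=3):
    """Rank in-stock products by how often they were bought with `product`."""
    bought_with = Counter()
    for basket in orders:
        if product in basket:
            bought_with.update(basket - {product})
    candidates = [
        (sku, count) for sku, count in bought_with.items()
        if stock.get(sku, 0) > 0
    ]
    # Most frequently co-purchased first; ties broken alphabetically.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [sku for sku, _ in candidates[:limit]]


if __name__ == "__main__":
    print(recommend("car", ORDERS, STOCK))
```

With all domains available in the data lake, such a rule needs no direct connection to CRM or ERP; it only reads the data it requires.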
24. SUMMARY
• Business data integration is the most fundamental use case; for e-commerce companies with multiple applications it is what enables the other use cases in the first place
• The LEDS technology stack is designed to work with the data lake and to support adjacent applications such as those from the other use cases