The document discusses developing transparent and serendipitous content-based adaptive systems using big data and open knowledge sources. It identifies three drawbacks of current recommendation technologies: training is a bottleneck, they are black boxes, and suggestions lack surprise. The authors propose a solution exploiting social media to model user preferences, entity linking to make profiles transparent and linked open data aware, and open knowledge sources to increase serendipity. Challenges include data representation and filtering while recommendations include promoting linked open data and interconnecting data silos.
Mining Big Data and Open Knowledge Sources to develop transparent and serendipitous content-based adaptive systems
1. Mining Big Data and Open Knowledge Sources to develop transparent and serendipitous content-based adaptive systems
Cataldo Musto, Giovanni Semeraro, Fedelucio Narducci
2. state of the art.
C.Musto, G.Semeraro - Mining Big Data and Open Knowledge Sources to develop transparent and serendipitous
content-based adaptive systems - World Summit on Big Data and Organization Design, Paris, 16-17 May 2013
3. our research: personalization
4. Recommender Systems
Relevant items (movies, news, books, etc.) are pushed to the
user according to her preferences or her needs.
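A minimal sketch of this idea, assuming a purely content-based setting: the user's preferences and each item are plain-text descriptions, and items are ranked by cosine similarity over word counts. The item titles, descriptions, and the `recommend` helper are hypothetical and only illustrate the principle; they are not the authors' system.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(profile_text, items, n=2):
    """Return up to n item titles whose description overlaps the profile."""
    profile = Counter(profile_text.lower().split())
    scored = [
        (cosine(profile, Counter(desc.lower().split())), title)
        for title, desc in items.items()
    ]
    scored.sort(reverse=True)
    # Items with no overlap at all are never pushed to the user.
    return [title for score, title in scored[:n] if score > 0]


# Hypothetical catalogue and user profile.
items = {
    "Movie A": "space adventure with robots",
    "Movie B": "romantic comedy in paris",
    "Book C": "robots and space exploration",
}
top = recommend("space robots", items)
```

Here the two items that mention both "space" and "robots" are pushed to the user, while the unrelated one is filtered out.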
5. Amazon.com
Recommendations
6. current recommendation technologies share three
important drawbacks.
7. (1) training is a bottleneck.
8. need for explicit information about user interests.
9. (2) recsys are black boxes.
10. (3) suggestions are not surprising.
11. solution
exploiting big data to build a novel generation of content-based adaptive systems
12. current work.
near future work.
14. big data.
15. Information Overload
we can handle 126 bits of information
we deal with 393 bits of information
ratio: more than 3x (Source: Adrian C. Ott, The 24-Hour Customer)
consequence:
16. Information Overload
17. Big Data: obstacle or opportunity?
18. cornerstone 1
exploit social media to model user preferences.
19. social media are an opportunity
provide information about user preferences
20. example
user preferences in music from Facebook
21. implicit preferences
example
22. Play.me
playlist
Most popular songs of the artists extracted from Last.fm (as well as
those added through the enrichment) are proposed to the user.
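The playlist step can be sketched roughly as follows, assuming the artists from the user profile and the artists added by the enrichment are already available as lists, and that per-artist song popularity is given as play counts. The function name `build_playlist` and all data are hypothetical; no real Last.fm or Play.me API is called.

```python
def build_playlist(profile_artists, enriched_artists, top_songs, per_artist=2):
    """Propose the most popular songs of profile artists and enrichment artists."""
    playlist = []
    seen = set()
    for artist in list(profile_artists) + list(enriched_artists):
        if artist in seen:  # an artist may appear in both lists
            continue
        seen.add(artist)
        # Rank the artist's songs by play count, most popular first.
        ranked = sorted(top_songs.get(artist, []), key=lambda s: s[1], reverse=True)
        playlist.extend((artist, title) for title, _plays in ranked[:per_artist])
    return playlist


# Hypothetical popularity data: artist -> [(song title, play count), ...]
top_songs = {
    "Artist A": [("Song 1", 120), ("Song 2", 300), ("Song 3", 50)],
    "Artist B": [("Song 4", 10)],
}
playlist = build_playlist(["Artist A"], ["Artist B", "Artist A"], top_songs)
```

Artists coming from the profile are listed first, so the enrichment only extends the playlist and never displaces the user's explicit preferences.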
23. Myusic
recommendations
24. cornerstone 2
exploit entity linking algorithms to make user profiles more transparent and LOD-aware
25. MyFeeds
RSS recommendations
26. MyFeeds
transparent user preferences
extracted from Facebook.
27. MyFeeds
transparent user preferences
further processing
28. MyFeeds
entity linking algorithms
• They map free text to structured information
• Wikipedia pages or DBpedia nodes
• examples
• Tag.me, Wikipedia Miner, DBpedia Spotlight, etc.
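The mapping from free text to Wikipedia pages can be illustrated with a minimal sketch. It is not how Tag.me, Wikipedia Miner or DBpedia Spotlight work internally; it only shows the basic idea with a greedy longest-match lookup. The `ANCHORS` dictionary and the `link_entities` function are hypothetical names introduced for illustration, and the toy dictionary stands in for the anchor-text statistics a real system would mine from Wikipedia.

```python
import re

# Hypothetical toy dictionary: surface form -> Wikipedia page title.
# A real entity linker derives this from Wikipedia anchor texts and
# also disambiguates between candidate pages; this sketch does not.
ANCHORS = {
    "valentino rossi": "Valentino_Rossi",
    "motogp": "Grand_Prix_motorcycle_racing",
    "ferrari": "Scuderia_Ferrari",
}


def link_entities(text, anchors=ANCHORS, max_len=3):
    """Return (surface form, page title) pairs found in the text.

    Scans the tokens left to right and, at each position, tries the
    longest word n-gram first (greedy longest match).
    """
    tokens = re.findall(r"\w+", text.lower())
    links = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in anchors:
                links.append((phrase, anchors[phrase]))
                i += n  # skip the tokens covered by the match
                break
        else:
            i += 1  # no n-gram starting here matched
    return links


if __name__ == "__main__":
    print(link_entities("Valentino Rossi wins the MotoGP race"))
```

The output is a list of human-readable page titles rather than opaque keywords, which is what makes a profile built this way inspectable by the user.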
29. Tag.me
extracts the Wikipedia pages the content refers to.
30. Linked Open Data Cloud
Structured (RDF) representation of the information stored in Wikipedia.
31. Linked Open Data Cloud
Profiles based on Tag.me are LOD-aware
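A minimal sketch of why such profiles are LOD-aware: each Wikipedia page in the profile corresponds to a DBpedia resource, so the profile can be rewritten as a set of URIs in the Linked Open Data cloud. The function names `to_dbpedia_uri` and `build_lod_profile` are hypothetical, and the sketch assumes the common DBpedia convention of deriving resource names from English Wikipedia titles by replacing spaces with underscores; the exact character encoding is simplified here.

```python
from urllib.parse import quote

# Namespace used by DBpedia for resources derived from English Wikipedia.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def to_dbpedia_uri(wikipedia_title):
    """Map a Wikipedia page title to the corresponding DBpedia URI.

    Assumption: spaces become underscores and other unusual characters
    are percent-encoded; this is a simplification of the real rules.
    """
    name = wikipedia_title.strip().replace(" ", "_")
    return DBPEDIA_RESOURCE + quote(name, safe="_(),'")


def build_lod_profile(wikipedia_titles):
    """Turn the pages found by an entity linker into a sorted URI list."""
    return sorted({to_dbpedia_uri(title) for title in wikipedia_titles})


if __name__ == "__main__":
    for uri in build_lod_profile(["Valentino Rossi", "Scuderia Ferrari"]):
        print(uri)
```

Once preferences are expressed as URIs, properties of those resources can be followed to related nodes, which is the starting point for reasoning over the profile.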
32. cornerstone 3
exploit open knowledge sources to make recommendation techniques more serendipitous.
33. ‘in vitro’ experiments
Watchmi plug-in
developed by Aprico.tv
34. From BOW to eBOW
Given a description of a TV show, we exploit ESA (Explicit Semantic Analysis) to obtain an enhanced representation.
The original set of features (bag of words, BOW) is enriched with the set of Wikipedia articles most related to the TV show, giving the enhanced bag of words (eBOW).
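The enrichment step can be sketched as follows. This is a simplified illustration, not the implementation used in the experiments: ESA normally relies on TF-IDF weights computed over the whole of Wikipedia, while this sketch uses raw term counts over a tiny hypothetical corpus. The names `ARTICLES`, `esa_concepts` and `to_ebow`, and the `wiki:` feature prefix, are assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy "Wikipedia": article title -> article text.
ARTICLES = {
    "motogp": "motogp motorcycle racing grand prix championship",
    "valentino rossi": "valentino rossi italian motorcycle racer motogp champion",
    "scuderia ferrari": "ferrari formula one racing team",
    "pasta": "pasta italian food wheat",
}


def esa_concepts(bow, articles=ARTICLES, top_k=2):
    """Rank articles by term overlap with the bag of words.

    Simplification: scores are products of raw term counts instead of
    the TF-IDF weights that ESA uses.
    """
    scores = {}
    for title, text in articles.items():
        tf = Counter(text.split())
        score = sum(count * tf[term] for term, count in bow.items())
        if score > 0:
            scores[title] = score
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def to_ebow(description, articles=ARTICLES, top_k=2):
    """Build the eBOW: original words plus the top related articles."""
    bow = Counter(description.lower().split())
    ebow = Counter(bow)
    for title in esa_concepts(bow, articles, top_k):
        ebow["wiki:" + title] += 1  # concept feature added to the words
    return ebow


if __name__ == "__main__":
    print(to_ebow("the best duels in the motogp motorcycle championship"))
```

The added concept features let two shows match even when their descriptions share no words, which is one way such enrichment can surface less obvious suggestions.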
35. From BOW to eBOW: example
TV SHOW
Rad an Rad
Die besten Duelle der MotoGP
(Wheel to wheel
The best duels in the MotoGP)
Wikipedia Articles
• großer preis von italien (motorrad)
• großer preis von malaysia (motorrad)
• großer preis von tschechien (motorrad)
• scuderia ferrari
• valentino rossi
• motorrad-wm-saison 2005
• motorrad-wm-saison 2006
• max biaggi
• großer preis der usa (motorrad)
• motorrad-wm-saison 2008
• rad (heraldik)
• loris capirossi
• shin’ya nakano
• motogp
36. challenges. issues. recommendations.
37. Challenges and Issues
• Main challenge and issue:
• data representation and data filtering
• How to exploit these novel data silos?
• What information is relevant for personalization?
• What kind of processing do the data need?
• Which representation is the best?
• Do reasoning techniques improve profile transparency and personalization accuracy?
• Do people accept the exploitation of these data?
• How to model the context?
38. Recommendations
• Cornerstones
• Social media-based user profiling
• LOD-aware user profiles
• Open knowledge sources for serendipitous encounters
• Recommendations
• Promote the LOD initiative: publish data in a structured form to enable reasoning on the information
• Interconnect data silos
• Design applications able to properly model, manage and exploit the large amount of data coming from social media