Presentation given by Atif Khan, VP AI and Data Science, Messagepoint at the August meetup event of the Waterloo Data Science and Data Engineering group.
In:Confidence 2019 - Balancing the conflicting objectives of data access and ... | Privitar
Shane Lamont, Chief Technology Officer - Big Data and Cloud at HSBC Data Services, talks about how to balance conflicting objectives of data access and data privacy on the In:Confidence 2019 main stage (April 4th at Printworks, London).
Knowledge graphs for knowing more and knowing for sure | Steffen Staab
Knowledge graphs have been conceived to collect heterogeneous data and knowledge about large domains, e.g. medical or engineering domains, and to allow versatile access to such collections by means of querying and logical reasoning. A surge of methods has responded to additional requirements in recent years. (i) Knowledge graph embeddings use similarity and analogy of structures to speculatively add to the collected data and knowledge. (ii) Queries with shapes and schema information can be typed to provide certainty about results. We survey both developments and find that the development of techniques happens in disjoint communities that mostly do not understand each other, thus limiting the proper and most versatile use of knowledge graphs.
Tackling GenAI Challenges with Knowledge Graphs, Graph Data Science and LLMs | Neo4j
These are the presentation materials from our lunch and learn: Tackling GenAI Challenges with Knowledge Graphs, Graph Data Science and LLMs. Watch the full recording here: https://www.youtube.com/watch?v=Dlz3bAssKSU
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge | Andre Freitas
The Challenge in a Nutshell
To create a query mechanism that semantically matches schema-agnostic user queries to knowledge base elements
The Goal
To support easy querying over complex databases with large schemata, relieving users from the need to understand the formal representation of the data
Relevance
The increase in the size and semantic heterogeneity of database schemas is bringing new requirements for users querying and searching structured data. At this scale it can become infeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more central as the scale and complexity of the data grow. Addressing this gap is a fundamental part of the Semantic Web vision.
Schema-agnostic query mechanisms aim to abstract users from the representation of the data by supporting automatic matching between queries and databases. This challenge aims to emphasize the role of schema-agnosticism as a key requirement for contemporary database management by providing a test collection for evaluating flexible query and search systems over structured data in terms of their level of schema-agnosticism (i.e. their ability to map a query issued with the user's terminology and structure to the dataset vocabulary). The challenge is instantiated in the context of Semantic Web datasets.
Slides from my lightning talk at the Boston Predictive Analytics Meetup hosted at Predictive Analytics World, Boston, October 1, 2012.
Full code and data are available on GitHub: http://bit.ly/pawdata
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic... | Neo4j
In this talk we'll explore powerful analytic techniques for graph data. Firstly we'll discover some of the innate properties of (social) graphs from fields like anthropology and sociology. By understanding the forces and tensions within the graph structure and applying some graph theory, we'll be able to predict how the graph will evolve over time. To test just how powerful and accurate graph theory is, we'll also be able to (retrospectively) predict World War 1 based on a social graph and a few simple mechanical rules.
Then we'll see how graph matching can be used to extract online business intelligence (for powerful retail recommendations). In turn we'll apply these powerful techniques to modelling domains in Neo4j (a graph database) and show how Neo4j can be used to drive business intelligence.
Don't worry, there won't be much maths :-)
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-... | Steffen Staab
Data spaces in distributed environments should be allowed to evolve in agile ways, giving data space owners great flexibility about which data they store. Agility and heterogeneity, however, jeopardize data exchanges because representations may build on varying ontologies, and data consumers may not be able to rely on the semantic correctness of their queries in the context of semantically heterogeneous, evolving data spaces. Graph data spaces are one example of a powerful model for representing and querying data whose semantics may change over time. To assert and enforce conditions on individual graph data spaces, shape languages (e.g. SHACL) have been developed. We investigate the question of how querying and programming can be guarded by reasoning over SHACL constraints in a distributed setting, and we sketch a picture of what a future landscape based on semantically heterogeneous data spaces might look like.
Easily Identify Sources of Supply Chain Gridlock | Neo4j
Join us for this 20-minute webinar to hear from Nick Johnson, Product Marketing Manager for Graph Data Science, as he explains the fundamentals of Neo4j Graph Data Science and its applications in optimizing supply chain management. Discover how leveraging graph analytics can help you identify bottlenecks, reduce costs, and streamline your supply chain operations more efficiently.
Connecting the Dots for Information Discovery.pdf | Neo4j
In this presentation, delivered by ABK Andreas Kollegger at QCon London 2024, the focus was on Connecting the Dots for Information Discovery. The classic RAG application extends an LLM with private information, able to fetch answers to questions that are contained in a single chunk of text. What if the answer requires connecting the dots across multiple chunks that may not be directly similar to the question? That is information discovery with GraphRAG.
You'll learn how to:
- reconstruct chunks into the original context
- meaningfully connect disparate chunks
- expand unstructured text data with structured data
- combine all this into a RAG workflow
GraphSummit Toronto: Leveraging Graphs for AI and ML | Neo4j
Phani Dathar, Ph.D., Data Science Solution Architect, Neo4j
Relationships are highly predictive of behavior. Graph technology abstracts connections in our data so businesses can apply relationships and network structures to make better predictions. Hear about the journey from graph analytics and machine learning to graph-enhanced AI. We’ll also cover how enterprises are using graph data science in areas such as fraud, targeted marketing, healthcare, and recommendations.
Knowledge Graphs (KG) are becoming core components of most artificial intelligence applications. Linked Data, as a method of publishing KGs, allows applications to traverse within, and even out of, the graph thanks to global dereferenceable identifiers denoting entities, in the form of IRIs. However, as we show in this work, after analyzing several popular datasets (namely DBpedia, LOD Cache, and Web Data Commons JSON-LD data) many entities are being represented using literal strings where IRIs should be used, diminishing the advantages of using Linked Data. To remedy this, we propose an approach for identifying such strings and replacing them with their corresponding entity IRIs. The proposed approach is based on identifying relations between entities based on both ontological axioms as well as data profiling information and converting strings to entity IRIs based on the types of entities linked by each relation. Our approach showed 98% recall and 76% precision in identifying such strings and 97% precision in converting them to their corresponding IRI in the considered KG. Further, we analyzed how the connectivity of the KG is increased when new relevant links are added to the entities as a result of our method. Our experiments on a subset of the Spanish DBpedia data show that it could add 25% more links to the KG and improve the overall connectivity by 17%.
Luna Dong, Principal Scientist, Amazon at MLconf Seattle 2017 | MLconf
Xin Luna Dong is a Principal Scientist at Amazon, leading the efforts to construct the Amazon Product Graph. She was one of the major contributors to the Knowledge Vault project, and has led the Knowledge-Based Trust project, which was called the “Google Truth Machine” by The Washington Post. She has won the VLDB Early Career Research Contribution Award for “advancing the state of the art of knowledge fusion”, and the Best Demo award at SIGMOD 2005. She has co-authored the book “Big Data Integration”, published 65+ papers in top conferences and journals, and given 20+ keynotes, invited talks, and tutorials. She is the PC co-chair for SIGMOD 2018 and WAIM 2015, and serves as an area chair for SIGMOD 2017, CIKM 2017, SIGMOD 2015, ICDE 2013, and CIKM 2011.
Abstract summary
Leave No Valuable Data Behind: the Crazy Ideas and the Business:
With the mission “leave no valuable data behind”, we developed techniques for knowledge fusion to guarantee the correctness of the knowledge. This talk starts with describing a few crazy ideas we have tested. The first, known as “Knowledge Vault”, used 15 extractors to automatically extract knowledge from 1B+ Webpages, obtaining 3B+ distinct (subject, predicate, object) knowledge triples and predicting well-calibrated probabilities for extracted triples. The second, known as “Knowledge-Based Trust”, estimated the trustworthiness of 119M webpages and 5.6M websites based on the correctness of their factual information. We then present how we bring the ideas to business in filling the gap between the knowledge at existing knowledge bases and the knowledge in the world.
Storytelling is key to a successful presentation or to selling an idea, but telling a story with simple words alone won't create a visual picture. Telling stories with proper data and visuals gives them much greater meaning.
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at... | Connected Data World
What is the key to the holistic success of the fastest growing and most successful companies of our time globally? Well, often the key is the rapid increase in collected and analysed data. Graph databases provide a way to organise semantically by classes, not tables, are web-aware, and are superior for handling deep, complex relationships than traditional relational or NoSQL data stores.
It is these deep, complex relationships that can provide the rich context for hyper-personalising your product offering, inspiring consumers to purchase. In this talk, we describe how we are using artificial intelligence at Farfetch to not only help build a knowledge graph but also to evolve our insights with state-of-the-art graph-based AI.
Leveraging Data for the Internet of Things | CLEVER°FRANKE
In an informal workshop hosted by Bob Corporaal, the group looked at the connection between IoT, Digital Transformation and Data Visualization. They explored a step-by-step process that can help create more value with the data from your IoT application.
Social networks play a fundamental role as a medium for the spread of influence among their members. As part of this research, estimates of influence between individuals were presented.
More Related Content
Similar to Waterloo Data Science and Data Engineering Meetup - 2018-08-29
Lykaio Wang (https://www.linkedin.com/in/lykaiowang/) discusses visualizing data in a web browser environment, first surveying a few popular data visualization libraries and then diving further into the pros and cons of D3. Lykaio will show you why D3 is so powerful and how you can leverage it to visualize tabular, graph, geospatial or any other type of data.
Daria Voronova - The Art of Telling a Story | Zia Babar
Daria Voronova (https://www.linkedin.com/in/daria-voronova-76b724b5/) takes us on a journey through the stories that can be uncovered using Tableau as a discovery tool. In this presentation, she describes what storytelling is and why it is so important in enterprise contexts, followed by how to build a storytelling dashboard in Tableau.
Discussion on cloud-based data storage and databases. Presentation given by Zia Babar at the July event of the Waterloo Data Science and Data Engineering meetup.
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Opendatabay - Open Data Marketplace.pptx | Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
2. Motivation
Real world entities are complex
❖ multi-dimensional
❖ rich interactions
[Figure: example relationships between entities: fan_of, like, colleague, reports_to, wife]
3. Motivation
Entities are described using multiple heterogeneous datasets
❖ entity attributes/properties
❖ entity-to-entity interactions
Understanding an entity demands linking/joining data across datasets
[Figure: scale of raw information]
8. Graph Representation of Data
An entity is modelled as a node
[Figure: nodes A and B]
9. Graph Representation of Data
An entity is modelled as a node
An entity-to-entity relationship is modelled as an edge
Edges can be directed (son_of) or non-directed (married_to)
[Figure: nodes A and B]
10. Graph Representation of Data
In a property graph, nodes (entities) & edges (relationships) can define their own properties
[Figure: nodes A and B joined by a married_to edge; properties shown: name: john, age: 24; name: mary, age: 22, gender: F, profession: singer; start: 1/1/1974]
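A minimal Python sketch of the property graph on this slide, using only the standard library. The `Node` and `Edge` classes and the `neighbours` helper are illustrative names, not part of the talk, and the assignment of properties to A, B and the edge is an assumption, since the figure layout did not survive in the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    directed: bool = True
    properties: dict = field(default_factory=dict)


# Assumed assignment of the slide's properties to nodes A and B.
nodes = {
    "A": Node("A", {"name": "john", "age": 24}),
    "B": Node("B", {"name": "mary", "age": 22, "gender": "F", "profession": "singer"}),
}

# married_to is non-directed; the start date is a property of the edge itself.
edges = [Edge("A", "B", "married_to", directed=False, properties={"start": "1/1/1974"})]


def neighbours(node_id, label):
    """Return ids of nodes reachable from node_id over edges with the given label."""
    found = []
    for edge in edges:
        if edge.label != label:
            continue
        if edge.source == node_id:
            found.append(edge.target)
        elif not edge.directed and edge.target == node_id:
            # Non-directed edges can be followed from either end.
            found.append(edge.source)
    return found


print(neighbours("B", "married_to"))
```

Because the edge is marked non-directed, the lookup works from either end; a directed edge such as son_of would only be followed from its source.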
11. Graph Representation of Data
An information schema is also a (sub)graph
[Figure: nodes A and B joined by a married_to edge]
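12. Graph Representation of Data
An information schema is also a (sub)graph
[Figure: instance nodes A and B joined by a married_to edge; each is linked by an instance_of edge to a schema node P; the schema-level married_to edge has domain: P, range: P and a start: property]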
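One way to read this slide is that the schema sits in the same graph as the data, so an instance edge can be checked against it by following instance_of links. The sketch below assumes that reading; `schema_edges`, `instance_of` and `conforms` are hypothetical names chosen for illustration.

```python
# Schema-level subgraph: the edge type married_to connects type P to type P
# and may carry a "start" property.
schema_edges = {
    "married_to": {"domain": "P", "range": "P", "properties": {"start"}},
}

# Instance-level subgraph: A and B are both instances of P.
instance_of = {"A": "P", "B": "P"}
instance_edges = [("A", "married_to", "B", {"start": "1/1/1974"})]


def conforms(edge):
    """Check one instance edge against the schema subgraph."""
    source, label, target, props = edge
    rule = schema_edges.get(label)
    if rule is None:
        return False  # edge type is not described by the schema
    return (
        instance_of.get(source) == rule["domain"]
        and instance_of.get(target) == rule["range"]
        and set(props) <= rule["properties"]
    )


print(all(conforms(edge) for edge in instance_edges))
```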
23. Graph Traversal (Query)
How many relatives are there?
A: 3
[Figure: graph of john, dave, april, arthur, mary and alice with edges labelled brother (x2), wife and friend (x2); three relative edges are overlaid]
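The query on this slide can be sketched as a traversal that treats brother and wife edges as relative edges. The edge list below is hypothetical: the transcript keeps the node and edge labels but not which nodes each edge connects, so the topology is assumed here for illustration.

```python
# Hypothetical topology: the transcript does not record the edge endpoints.
edges = [
    ("john", "brother", "dave"),
    ("john", "brother", "arthur"),
    ("john", "wife", "mary"),
    ("john", "friend", "april"),
    ("john", "friend", "alice"),
]

# Edge labels that imply a "relative" relationship.
RELATIVE_LABELS = {"brother", "wife"}


def relatives(person):
    """Follow only relative-type edges out of the given person."""
    return sorted(
        target
        for source, label, target in edges
        if source == person and label in RELATIVE_LABELS
    )


print(relatives("john"))       # ['arthur', 'dave', 'mary']
print(len(relatives("john")))  # 3
```

Under this assumed topology the count matches the slide's answer of 3; friend edges are traversable but do not contribute to the result.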
26. Information >> Knowledge
Will Smith @wsmith
Finished registering for KDD 2018 in London http://www.kdd.org/kdd2018/
William Smith shared a link
I am off to Knowledge Discovery and Data mining conference in London. Looking forward to Michael Jordan’s keynote address.
27. Information >> Knowledge
Will Smith @wsmith
Finished registering for KDD 2018 in London http://www.kdd.org/kdd2018/
William Smith shared a link
I am off to Knowledge Discovery and Data mining conference in London. Looking forward to Michael Jordan’s keynote address next week.
35. Information >> Knowledge
Insights
❏ William Smith is interested in data mining (academic/data scientist)
[Figure: node "wsmith, william smith" linked by a registered_for edge to node "KDD 2018, Knowledge Discovery & Data Mining 2018"]
37. Information >> Knowledge
Insights from the combined graph
❏ KDD 2018 is a data mining conference
❏ KDD 2018 requires registration to attend
❏ Michael Jordan will be giving a keynote address at KDD 2018
❏ KDD 2018 URL (http://www.kdd.org/kdd2018/)
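The combined graph can be sketched as two small triple sets, one per post, merged after aliases are mapped to a canonical name. The triples are hand-written for illustration; `registered_for` and `keynote_speaker` come from the slides, while `attending`, `located_in`, `url` and the `same_as` alias table are assumptions.

```python
# Triples extracted by hand from the short post.
tweet_graph = [
    ("wsmith", "registered_for", "KDD 2018"),
    ("KDD 2018", "located_in", "London"),
    ("KDD 2018", "url", "http://www.kdd.org/kdd2018/"),
]

# Triples extracted by hand from the longer post.
post_graph = [
    ("william smith", "attending", "Knowledge Discovery & Data Mining 2018"),
    ("Knowledge Discovery & Data Mining 2018", "located_in", "London"),
    ("Michael Jordan", "keynote_speaker", "Knowledge Discovery & Data Mining 2018"),
]

# Assumed alias table: each alias maps to one canonical entity name.
same_as = {
    "wsmith": "william smith",
    "KDD 2018": "Knowledge Discovery & Data Mining 2018",
}


def canonical(name):
    return same_as.get(name, name)


def merge(*graphs):
    """Union the graphs after rewriting every node to its canonical name."""
    merged = set()
    for graph in graphs:
        for subject, predicate, obj in graph:
            merged.add((canonical(subject), predicate, canonical(obj)))
    return merged


combined = merge(tweet_graph, post_graph)
for triple in sorted(combined):
    print(triple)
```

Once the aliases collapse, facts that were split across the two posts (registration, keynote speaker, URL) hang off the same conference node, and the duplicated location fact appears only once.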
38. Information >> Knowledge
Inferred insights from the combined graph
❏ Michael Jordan is an influencer in the data mining research community
❏ Both Michael Jordan and William Smith will be in London during KDD 2018
39. Information >> Knowledge
Insights applied to other problems
❏ Which Michael Jordan and why?
Michael Jordan is an American former professional basketball player. He played 15 seasons in the National Basketball Association for the Chicago Bulls and Washington Wizards.
Michael Irwin Jordan is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence.
40. Information >> Knowledge
Insights applied to other problems
❏ Which Michael Jordan and why?
Michael Jordan is an American former professional basketball player. He played 15 seasons in the National Basketball Association for the Chicago Bulls and Washington Wizards.
Michael Irwin Jordan is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence.
[Figure: node "Michael Jordan" linked by a keynote_speaker edge to node "Knowledge Discovery & Data Mining 2018"]
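A rough sketch of how graph context could answer "which Michael Jordan": score each candidate description against terms associated with the mention's neighbourhood in the graph. This is a simple term-overlap heuristic, not the method used in the talk, and the `context_terms` list is an assumption about what a keynote at an academic data mining conference implies.

```python
# Candidate descriptions, lower-cased from the slide text.
candidates = {
    "Michael Jordan (basketball)": (
        "american former professional basketball player. he played 15 seasons "
        "in the national basketball association for the chicago bulls and "
        "washington wizards."
    ),
    "Michael Irwin Jordan (scientist)": (
        "american scientist, professor at the university of california, "
        "berkeley and researcher in machine learning, statistics, and "
        "artificial intelligence."
    ),
}

# Assumed terms tied to "keynote_speaker at a data mining conference".
context_terms = ["scientist", "professor", "researcher", "machine learning", "statistics"]


def score(description):
    """Count how many context terms occur in the description."""
    return sum(term in description for term in context_terms)


scores = {name: score(text) for name, text in candidates.items()}
best = max(scores, key=scores.get)
print(best)
```

The "why" on the slide corresponds to the matched terms: the keynote_speaker edge places the mention in a research setting, which only one description fits.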
43. General Recipe
1. Represent information as graphs
a. represent not model (~schema on read)
b. entities are nodes in the graph
entity attributes = node properties
c. entity-to-entity relationships are edges in the graph
relationship attributes = edge properties
44. General Recipe
1. Represent information as graphs
a. represent not model (~schema on read)
b. entities are nodes in the graph
entity attributes = node properties
c. entity-to-entity relationships are edges in the graph
relationship attributes = edge properties
d. repeat for each domain (highly parallelizable)
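Step 1 of the recipe can be sketched as a function that turns the records of one domain into nodes and edges without fixing a schema up front. The `rows_to_graph` helper, the column names and the sample rows are hypothetical; only the mapping (attributes become node properties, relationships become edges) comes from the slide.

```python
def rows_to_graph(rows, id_key, relation_keys):
    """Represent records as a graph: one node per row, one edge per relation column."""
    nodes, edges = {}, []
    for row in rows:
        node_id = row[id_key]
        # Entity attributes become node properties.
        nodes[node_id] = {
            key: value
            for key, value in row.items()
            if key != id_key and key not in relation_keys
        }
        # Entity-to-entity relationships become edges.
        for key in relation_keys:
            if row.get(key) is not None:
                edges.append((node_id, key, row[key]))
    return nodes, edges


# Hypothetical records from a single domain (an HR table).
hr_rows = [
    {"id": "john", "age": 24, "reports_to": "dave"},
    {"id": "dave", "age": 40, "reports_to": None},
]

nodes, edges = rows_to_graph(hr_rows, "id", ["reports_to"])
print(nodes)
print(edges)
```

Each domain's records can be converted by a separate call with no shared state, which is what makes the "repeat for each domain" step easy to run in parallel.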
45. General Recipe
1. Represent information as graphs
2. Define “cross-over” traversals/edges
a. algorithms (similarity, clustering, classification)
46. General Recipe
1. Represent information as graphs
2. Define “cross-over” traversals/edges
a. algorithms (similarity, clustering, classification)
b. edges can be broadly described as
i. same_as : a measure of closeness
ii. member_of: entity-to-entity associations
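A same_as edge as "a measure of closeness" can be sketched with a string-similarity score that is kept on the edge. The example uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for whichever similarity algorithm is actually chosen; the 0.8 threshold and the `link` helper are assumptions.

```python
from difflib import SequenceMatcher


def same_as_score(a, b):
    """Similarity in [0, 1] between two entity names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(names_a, names_b, threshold=0.8):
    """Emit a same_as edge, carrying its score, for each pair above the threshold."""
    edges = []
    for a in names_a:
        for b in names_b:
            score = same_as_score(a, b)
            if score >= threshold:
                edges.append((a, "same_as", b, round(score, 2)))
    return edges


# Names from one graph are compared against names from another.
cross_over = link(["Will Smith"], ["William Smith", "Michael Jordan"])
print(cross_over)
```

Keeping the score on the edge rather than discarding it lets a later query apply its own, stricter or looser, cut-off.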
47. General Recipe
1. Represent information as graphs
2. Define “cross-over” traversals/edges
a. algorithms (similarity, clustering, classification)
b. edges can be broadly described as
c. cross-over traversals can be
i. probabilistic in nature
ii. query specific
48. General Recipe
1. Represent information as graphs
2. Define “cross-over” traversals/edges
3. Find the best projections
[Figure: alternative graph projections joined by OR]
49. General Recipe
1. Represent information as graphs
2. Define “cross-over” traversals/edges
3. Find the best projection
4. Automate graph construction
a. Use ML/IR/KE
b. probabilistic linkages
54. Acme Investment
Demographics
⅓ of all adults are
average age is 64,
❖ median-women: 50
❖ median-men: 54
❖ first time grandparent avg age: 47
77% are married
55. Acme Investment
Demographics
⅓ of all adults are
average age is 64,
❖ median-women: 50
❖ median-men: 54
❖ first time grandparent avg age: 47
77% are married
Financial
control about ⅓ of the nation's assets
spend about $52 billion yearly on grandchildren (education: $32 billion, infant-apparel: $3 billion)
give grandkids over $5 billion yearly in stocks & securities
56. Use Cases
❖ Data wrangling | ETL, ELT
❖ Scalable Ingestion
❖ Data Cleansing
❖ Knowledge Inference
❖ Contextual Parsing
❖ Data Imputation
❖ Preprocessing
57. Use Cases
❖ Data wrangling | ETL, ELT
❖ Scalable Ingestion
❖ Data Cleansing
❖ Knowledge Inference
❖ Contextual Parsing
❖ Data Imputation
❖ Data Governance
❖ Entity DeDup
❖ Community Discovery
❖ Fraud Detection
❖ Recommendation Systems
❖ Preprocessing
❖ Customer 360
❖ Customer Journey
58. In Summary
❖ Graphs are a flexible representation of real world entities and their relationships.
❖ Graphs facilitate transforming data into insights.
❖ Graph creation & inference can be augmented & automated using AI.