The document discusses big graph data and data visualization. Some key points:
- Graphs are useful for solving problems across many domains like disease spread, recommendations, and power grids.
- Graphs range from small to medium to large, and can be explicit (the network exists in nature) or implicit (the network is derived from other data).
- Visualization is a natural tool for exploring and understanding complex graph data through interaction and iteration.
- Open source tools like Gephi and Sigma.js can be used to visualize and analyze graph data.
1. MATHIEU BASTIAN
DATA VISUALIZATION SUMMIT, SAN FRANCISCO, APRIL 11-12, 2013
4. BIG GRAPH DATA
• The story of big graph data is just starting
• BIG GRAPH DATA
[Word cloud: BIG DATA, GRAPHS, DISTRIBUTED SYSTEMS, COMPLEX, STORAGE, DATABASES, INDEXATION, LARGE DATASETS, ALGORITHM, CLOUD COMPUTING, HADOOP, ANALYTICS, REAL-TIME, VISUALIZATION]
6. BIG DATA
• “The Petabyte age”
• All industries and domains can leverage big data: Health, Government, Finance, Technology
• Big Data => Big Problems
• Focusing on building the technology to handle big data and big graph data (e.g. graph databases)
• Seeking efficient analysis of ever more complex systems
7. GRAPHS
• Graphs are everywhere, and it’s easy to collect graph data
• The world is more complex and interconnected than we thought
Source: Collective Dynamics of Small-World Networks, D Watts, S Strogatz, Nature 393, 440-442
8. NETWORK SCIENCE
• The study of graphs has been exploding in the last 15 years
• Networks have properties and patterns one can study
• Robustness – How resistant is a network to random attacks?
• Contagion – How fast does a disease or gossip spread in a network?
• Communities – How many communities exist in a network?
• Centrality – Who is the most central individual in a network?
• If you read one of these books, you understand Network Science
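The four questions above can be sketched in Python on a hypothetical toy network. The edge list and function names are illustrative assumptions, not from the talk. Degree centrality is only one of several centrality measures, connected components stand in for communities as the crudest possible grouping, the contagion estimate assumes every contact transmits, and robustness is probed by removing a single node.

```python
from collections import deque

# Hypothetical toy network (undirected), given as an edge list.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e"), ("f", "g")]


def build_adjacency(edges):
    """Turn an edge list into {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Centrality: fraction of the other nodes each node is linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def connected_components(adj):
    """Crude stand-in for communities: groups of mutually reachable nodes."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), {start}
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    component.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


def spread_rounds(adj, seed):
    """Contagion: rounds needed to reach every node reachable from seed,
    assuming each infected node infects all its neighbours every round."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return max(dist.values())


def without_node(adj, removed):
    """Robustness probe: the network after one node fails."""
    return {n: nbrs - {removed} for n, nbrs in adj.items() if n != removed}


adj = build_adjacency(EDGES)
centrality = degree_centrality(adj)
most_central = max(centrality, key=centrality.get)
print("most central node:", most_central)
print("components:", len(connected_components(adj)))
print("rounds to spread from 'a':", spread_rounds(adj, "a"))
print("components after removing it:",
      len(connected_components(without_node(adj, most_central))))
```

Removing the most central node and recounting the components is a targeted attack rather than a random one; picking the removed node at random would match the robustness question more literally.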
9. GRAPHS HELP SOLVE PROBLEMS
• Saddam Hussein Network (2003)
C. Wilson. Searching for Saddam: a five-part series on how the US military used social networking to capture the Iraqi dictator. 2010. www.slate.com/id/2245228/.
10. GRAPHS HELP SOLVE PROBLEMS
• Predicting and controlling infectious disease
Naoki Masuda, Petter Holme - Predicting and controlling infectious disease epidemics using temporal networks. http://f1000.com/prime/reports/b/5/6/
Haraldsdottir S, Gupta S, Anderson RM: Preliminary studies of sexual networks in a male homosexual community in Iceland. J Acquir Immune Defic Syndr. 1992, 5:374–81.
11. GRAPHS HELP SOLVE PROBLEMS
• Recommendation systems
Credit: http://markorodriguez.com/2011/09/22/a-graph-based-movie-recommender-engine/
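As a rough illustration of how a graph can drive recommendations, here is a minimal Python sketch of one common traversal: start at a movie, step to the users who liked it, step to the other movies those users liked, and rank by the number of paths. The data, the `recommend` function and the unweighted "liked" edges are assumptions made for this sketch; it is not the implementation from the credited post.

```python
from collections import Counter

# Hypothetical "user liked movie" edges of a bipartite graph.
LIKES = {
    "ann": {"Alien", "Blade Runner", "Brazil"},
    "bob": {"Alien", "Blade Runner"},
    "cat": {"Alien", "Amelie"},
    "dan": {"Amelie", "Brazil"},
}


def recommend(movie, likes, top_n=3):
    """Walk movie -> users who liked it -> other movies they liked,
    and rank those movies by how many such paths reach them."""
    counts = Counter()
    for liked in likes.values():
        if movie in liked:
            counts.update(liked - {movie})
    return counts.most_common(top_n)


for title, paths in recommend("Alien", LIKES):
    print(title, paths)
```

Counting paths favours popular movies; a real system would usually normalise the counts or weight edges by rating.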
12. GRAPHS HELP SOLVE PROBLEMS
• Recipe recommendation using ingredient networks
Credit: http://www.ladamic.com/wordpress/?p=294
13. GRAPHS HELP SOLVE PROBLEMS
• Power grid
Credit: http://www.npr.org/templates/story/story.php?storyId=110997398
14. SMALL GRAPHS
• The famous “Zachary’s Karate Club” study (1977) involved only 34
nodes.
• It could be drawn by hand on paper
Zachary’s Karate Club (1977) W. W. Zachary, An information flow model for conflict and fission in small
groups, Journal of Anthropological Research 33, 452-473 (1977).
15. MEDIUM GRAPHS
• Your own Facebook or LinkedIn social network
• The Harlem Shake: Anatomy of a Viral Meme
Gilad Lotan. http://www.huffingtonpost.com/gilad-lotan/the-harlem-shake_b_2804799.html
16. LARGE GRAPHS
• The Internet Map (~350 000 domains)
• DBPedia (~290M relationships)
• Friendster Social Network dataset* (1.8B edges)
Internet Map (http://internet-map.net)
* http://snap.stanford.edu/data/index.html
17. IMPLICIT GRAPHS
• Graphs can be explicit or implicit
• Explicit: The network exists in nature (social networks, food webs,
airline networks)
• Implicit: The network is derived from other data (word networks,
co-authorship)
• Example of an implicit graph:
• A set of documents have a set of tags
• One can create a link when two tags are on the same document
• Aggregate all links across all documents
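The tag example above can be sketched as a co-occurrence count. The documents and tags below are invented for illustration; the resulting weight on each link is simply the number of documents that carry both tags.

```python
from collections import Counter
from itertools import combinations

# Hypothetical documents, each with a set of tags.
documents = {
    "doc1": {"graph", "visualization", "gephi"},
    "doc2": {"graph", "visualization"},
    "doc3": {"graph", "hadoop"},
}

def cooccurrence_edges(docs):
    """Count, for every pair of tags, how many documents carry both."""
    weights = Counter()
    for tags in docs.values():
        # sorted() gives each undirected pair one canonical (a, b) form
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return weights

edges = cooccurrence_edges(documents)
for (a, b), w in sorted(edges.items()):
    print(f"{a} -- {b} (weight {w})")
```

The weighted edge list is the implicit graph: tags are nodes, and a heavier link means two tags appear together more often.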
18. SIMILARITY GRAPHS
• Graphs of all the co-occurrences between LinkedIn Skills (2011)
19. VISUALIZATION
• Visualization and statistics are the two basic toolkits one can use
on graphs
• Complex questions are asked when studying graphs
• Easy (Excel can do this!)
• Min, max, average, quartiles
• Exact queries, search
• Harder (Visualization can do this!)
• Patterns, trends, correlations
• Changes over time, context
• Anomalies, data errors
• Geographical representation
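As a rough illustration of the "easy" side, summary statistics over node degrees take only a few lines with Python's standard library. The degree values are hypothetical; note that the summary alone does not say which node is the outlier or why, which is where visualization helps.

```python
import statistics

# Hypothetical node degrees (number of connections per node).
degrees = [1, 2, 2, 3, 3, 3, 4, 9]

# quantiles(..., n=4) returns the three quartile cut points
q1, median, q3 = statistics.quantiles(degrees, n=4)
summary = {
    "min": min(degrees),
    "max": max(degrees),
    "mean": statistics.mean(degrees),
    "q1": q1,
    "median": median,
    "q3": q3,
}
print(summary)
```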
20. GRAPH VISUALIZATION
• Due to the size of graphs and the complexity of questions,
visualization is the natural tool to understand what’s going on
“We are more easily persuaded by the reasons we
ourselves discover than by those which are given to us by
others.” Blaise Pascal
Let me play with the data!
Direct manipulation
21. DATA EXPLORATION AND INTERACTION
• Use visualization and statistics to discover new hypotheses
• Exploratory data analysis
“The greatest value of a picture is when it forces us
to notice what we never expected to see.”
John Tukey
• The user interface is centered around the human
• Empowers the user to understand the structure and patterns in
the data
• The machine augments the human
• How?
• Overview and details, zoom and pan interface
• Interactive, direct-manipulation
22. MAP YOUR DATA
• Iterative process to transform relational data into a map
• Use color, size and position to highlight, group and set up a
hierarchy
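A minimal sketch of such a mapping, assuming size encodes degree, color encodes a category, and position comes from a simple circular layout that places nodes of the same category next to each other. The node names, palette, and size range are arbitrary choices for illustration; a tool like Gephi would typically use a force-directed layout instead of a circle.

```python
import math

# Hypothetical nodes with a category, plus undirected edges between them.
categories = {"ann": "sales", "bob": "sales", "cat": "eng", "dan": "eng"}
edges = [("ann", "bob"), ("ann", "cat"), ("cat", "dan"), ("ann", "dan")]

PALETTE = {"sales": "#1f77b4", "eng": "#ff7f0e"}  # arbitrary colors
MIN_SIZE, MAX_SIZE = 5.0, 20.0                    # arbitrary size range

def visual_mapping(categories, edges):
    """Map degree to size, category to color, and category order to position."""
    degree = {node: 0 for node in categories}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    lo, hi = min(degree.values()), max(degree.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all degrees are equal
    # Sorting by category keeps each group contiguous on the circle.
    nodes = sorted(categories, key=lambda n: (categories[n], n))
    styled = {}
    for i, node in enumerate(nodes):
        angle = 2 * math.pi * i / len(nodes)
        styled[node] = {
            "size": MIN_SIZE + (degree[node] - lo) / span * (MAX_SIZE - MIN_SIZE),
            "color": PALETTE[categories[node]],
            "x": math.cos(angle),
            "y": math.sin(angle),
        }
    return styled

styled = visual_mapping(categories, edges)
for node, attrs in styled.items():
    print(node, attrs)
```

Changing which attribute drives size or color and re-running is the iterative part of the process.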
23. FROM INFORMATION TO KNOWLEDGE
• Exploring networks interactively & iterating often provide
“Eureka” moments for domain experts
24. BIG GRAPH DATA
• Big graph data doesn’t necessarily mean you’re visualizing or
analyzing a large graph
• Small graphs can be extracted from large graphs and analyzed
• Small graphs can be extracted from non-graph data as well
• Graphs are just nodes and relationships after all
• Example: Adverse Drug Event Analysis with Hadoop, R, and Gephi
(Josh Wills, Cloudera, 2012)
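One common way to extract a small graph from a large one is to keep only the neighborhood of a node of interest (an "ego network"). The sketch below assumes the edge list fits in memory as a Python list; the edge data and the function name are hypothetical, and this is not the method used in the Cloudera example.

```python
def ego_network(edges, center, radius=1):
    """Return the edges among nodes within `radius` hops of `center`.

    `edges` must be a list (it is traversed twice), not a one-shot iterator.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    kept = {center}
    frontier = {center}
    for _ in range(radius):
        # Expand one hop outward, skipping nodes already collected.
        frontier = {n for node in frontier for n in neighbors.get(node, ())} - kept
        kept |= frontier
    return [(a, b) for a, b in edges if a in kept and b in kept]

# Hypothetical "large" edge list.
big_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c"), ("e", "f")]
small = ego_network(big_edges, "a", radius=1)
print(small)
```

The extracted edge list is small enough to load into a visualization tool, even when the source graph is not.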
25. GEPHI
• Built to solve large graph visualization problems.
• Open source tool for Windows, Mac OS X and Linux
• Large international community involved
• The latest version has been downloaded > 100,000 times
• Extensible with plug-ins
• Available at http://gephi.org
26. GEPHI
[Figure: Gephi features: Data Edition, Visual Mapping, Filter, Visualization, Statistics, Layout, Timeline]
27. SIGMA.JS
• Open-source lightweight JavaScript library to draw graphs
• Uses HTML5 Canvas
• Dynamically displays graphs that can be generated on the fly
• Available at http://sigmajs.org
Sigma.js v0.1
28. SUMMARY
• Big graph data = Relational Big Data
• Graphs are everywhere!
• Graphs have fascinating structure and patterns one can analyze
• Visualization is a natural tool for such complex data and complex
questions
• On graphs, visualization done right allows interaction and
iteration. Play.
• The hard part is to extract a small or medium graph from big data
• Open source tools like Gephi or Sigma.js are a good start
29. Become a graph evangelist!
QUESTIONS?
Mathieu Bastian (@mathieubastian)
30. REFERENCES & LINKS
Join the Social Network Analysis class by Lada Adamic on Coursera
https://www.coursera.org/course/sna
Support the Gephi Consortium
http://consortium.gephi.org
Computational Information Design, Ben Fry (2004)
http://benfry.com/phd/
The Atlas of Economic Complexity, Harvard's Center for International Development (CID) and the MIT Media Lab
http://atlas.media.mit.edu/
The Mesh of Civilizations and International Email Flows, Bogdan State, Patrick Park, Ingmar Weber, Yelena Mejova, Michael Macy
http://arxiv.org/abs/1303.0045
The Human Disease Network, Goh K-I, Cusick ME, Valle D, Childs B, Vidal M, Barabási A-L (2007)
http://www.pnas.org/content/104/21/8685.full
What does your intranet look like?
http://intranetdiary.blogspot.co.uk/2012/11/network-visualisation.html
Recipe recommendation using ingredient networks, Chun-Yuen Teng, Yu-Ru Lin, Lada A. Adamic
http://arxiv.org/abs/1111.3919
US Presidents Inaugural Speeches 1969-2013 Text Network Analysis
http://noduslabs.com/cases/presidents-inaugural-speeches-text-network-analysis/
10 Reasons Why We Visualise Data
http://www.slideshare.net/Facegroup/10-reasons-why-we-visualise-data
Sigma.js, Alexis Jacomy et al.
http://sigmajs.org
Linked: How Everything Is Connected to Everything Else and What It Means, Albert-Laszlo Barabasi
http://www.amazon.com/gp/product/0452284392/
Six Degrees: The Science of a Connected Age, Duncan J. Watts
http://www.amazon.com/gp/product/0393325423/
Nexus: Small Worlds and the Groundbreaking Science of Networks, Mark Buchanan
http://www.amazon.com/gp/product/0393324427
Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives, Nicholas A. Christakis and James H. Fowler
http://www.amazon.com/dp/product/0316036137
Atelier Iceberg – Gephi
http://www.slideshare.net/ateliericeberg/gephi-17680699
Adding Value through graph analysis using Titan and Faunus, Matthias Broecheler
http://www.slideshare.net/knowfrominfo/titan-talk-ebaymarch2013
Network Maps Board on Pinterest, Mathieu Bastian
http://pinterest.com/mathieubastian/network-maps/
Network Science Book, Albert-László Barabási
http://barabasilab.neu.edu/networksciencebook
Adverse Drug Event Analysis with Hadoop, R, and Gephi, Cloudera
https://github.com/cloudera/ades