This document discusses analyzing Bitcoin transaction data as a graph using Oracle technologies. It provides an overview of modeling Bitcoin transactions as a graph with transactions and addresses as vertices and relationships between them as edges. It then describes the workflow of preparing the data, loading it into a graph database, and analyzing the graph using PGX and PGQL. Examples are given of graph queries and algorithms like PageRank and betweenness centrality that can be run on the Bitcoin transaction graph to identify important transactions and addresses.
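As a rough illustration of the modeling described above, the following sketch builds a small graph with transactions and addresses as vertices and runs a plain power-iteration PageRank over it. It is standalone Python, not PGX or PGQL, and the sample transfers, the function names `build_graph` and `pagerank`, and the damping factor are assumptions made for illustration only.

```python
from collections import defaultdict


def build_graph(transfers):
    """Build a directed graph from (input_address, tx_id, output_address) rows.

    Addresses and transactions are both vertices; an edge runs from the
    funding address to the transaction and from the transaction to the
    receiving address.
    """
    edges = defaultdict(set)
    for src, tx, dst in transfers:
        edges[src].add(tx)
        edges[tx].add(dst)
        edges[dst]  # touch the key so receive-only addresses become vertices
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; rank of dangling vertices is spread evenly."""
    nodes = list(edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not edges[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in edges[v]:
                new[w] += damping * rank[v] / len(edges[v])
        rank = new
    return rank


# Hypothetical sample data: two addresses pay addr_B, which then pays addr_D.
transfers = [
    ("addr_A", "tx_1", "addr_B"),
    ("addr_C", "tx_2", "addr_B"),
    ("addr_B", "tx_3", "addr_D"),
]
ranks = pagerank(build_graph(transfers))
for vertex, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{vertex}: {score:.4f}")
```

In this toy graph, addresses that receive funds from several transactions score higher than addresses that only send, which is the kind of signal the summary refers to when it mentions identifying important transactions and addresses.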
This document summarizes an introductory webinar on building an enterprise knowledge graph from RDF data using TigerGraph. It introduces RDF and knowledge graphs, demonstrates loading DBpedia data into a TigerGraph graph database using a universal schema, and provides examples of queries to extract information from the graph such as related people, publishers by location, and related topics for a given predicate. The webinar encourages attendees to learn more about graph databases and TigerGraph through additional resources and future webinar episodes.
A very high-level overview of graph analytics concepts and techniques, including structural analytics, connectivity analytics, community analytics, and path analytics, as well as pattern matching.
Extending Spark Graph for the Enterprise with Morpheus and Neo4j Databricks
Spark 3.0 introduces a new module: Spark Graph. Spark Graph adds the popular query language Cypher, its accompanying Property Graph Model and graph algorithms to the data science toolbox. Graphs have a plethora of useful applications in recommendation, fraud detection and research.
Morpheus is an open-source library that is API compatible with Spark Graph and extends its functionality by:
A Property Graph catalog to manage multiple Property Graphs and Views
Property Graph Data Sources that connect Spark Graph to Neo4j and SQL databases
Extended Cypher capabilities including multiple graph support and graph construction
Built-in support for the Neo4j Graph Algorithms library
In this talk, we will walk you through the new Spark Graph module and demonstrate how we extend it with Morpheus to help enterprise users integrate Spark Graph into their existing Spark and Neo4j installations.
We will demonstrate how to explore data in Spark, use Morpheus to transform data into a Property Graph, and then build a Graph Solution in Neo4j.
The document discusses how to visualize graphs created with Oracle Database. It provides examples of graph visualization libraries like D3.js, Cytoscape, and Linkurious that can be used. The document demonstrates how to use Cytoscape to connect to an Oracle database, retrieve and visualize graph data, perform graph analytics like shortest path queries, and save/load graph data. Resources for learning more about Oracle Spatial and Graph are also listed.
3rd in the AskTOM Office Hours series on graph database technologies. https://devgym.oracle.com/pls/apex/dg/office_hours/3084
See the magic of graphs in this session. Graph analysis can answer questions like detecting patterns of fraud or identifying influential customers - and do it quickly and efficiently. We’ll show you the APIs for accessing graphs and running analytics such as finding influencers, communities, and anomalies, and how to use them from various languages including Groovy, Python, and JavaScript, with Jupyter and Zeppelin notebooks.
Albert Godfrind (EMEA Solutions Architect), Zhe Wu (Architect), and Jean Ihm (Product Manager) walk you through, and take your questions.
Learn how graph technologies can be applied to real-world use cases, using medical, network security, and financial data. By combining graph models and machine learning techniques, we can discover relationships, classify information, and identify patterns and anomalies in data. We can answer questions such as “How did other investigators approach similar cases?” and “Do these symptoms seem similar to ones we’ve seen in other diseases?” Presented by Sungpack Hong, Research Director, Oracle Labs.
Neo4j-Databridge: Enterprise-scale ETL for Neo4j GraphAware
Neo4j - London User Group Meetup - 28th March, 2018
If your data ingestion requirements have grown beyond importing occasional CSV files then this talk is for you. Neo4j-Databridge from GraphAware is a comprehensive ETL tool specifically built for Neo4j. It has been designed for usability, expressive power and high performance to address the most common issues faced when importing data into Neo4j: multiple data sources and types, very large data sets, bespoke data conversions, non-tabular formats, filtering, merging and de-duplication, as well as bulk imports and incremental updates.
In this talk, we'll take a quick tour of some of the main features, loading data from Kafka, Redis, JDBC and various other data sources along the way, to understand how Neo4j-Databridge solves these problems and how it can help you import your data quickly and easily into Neo4j.
Vince Bickers is a Principal Consultant at GraphAware and the main author of Spring Data Neo4j (v4). He has been writing software and leading software development teams for over 30 years at organisations like Vodafone, Deutsche Bank, HSBC, Network Rail, UBS, VMWare, ConocoPhillips, Aviva and British Gas.
Transforming AI with Graphs: Real World Examples using Spark and Neo4j Fred Madrid
Graphs – or information about the relationships, connection, and topology of data points – are transforming machine learning. We’ll walk through real world examples of how to transform your tabular data into a graph and how to get started with graph AI. This talk will provide an overview of how to incorporate graph-based features into traditional machine learning pipelines, create graph embeddings to better describe your graph topology, and give you a preview of approaches for graph native learning using graph neural networks. We’ll talk about relevant, real world case studies in financial crime detection, recommendations, and drug discovery. This talk is intended to introduce the concept of graph-based AI to beginners, as well as help practitioners understand new techniques and applications. Key takeaways: how graph data can improve machine learning, when graphs are relevant to data science applications, what graph native learning is and how to get started.
Introduction to Property Graph Features (AskTOM Office Hours part 1) Jean Ihm
1st in the AskTOM Office Hours series on graph database technologies. https://devgym.oracle.com/pls/apex/dg/office_hours/3084
Xavier Lopez (PM Senior Director) and Zhe Wu (Graph Architect) will share a brief intro to what property graphs can do for you, and take your questions - on property graphs or any other aspect of Oracle Database Spatial and Graph features. With property graphs, you can analyze relationships in Big Data like social networks, financial transactions, or IoT sensor networks; identify influencers; discover patterns of fraudulent behavior; recommend products, and much more -- right inside Oracle Database.
GraphX: Graph analytics for insights about developer communities Paco Nathan
The document provides an overview of Graph Analytics in Spark. It discusses Spark components and key distinctions from MapReduce. It also covers GraphX terminology and examples of composing node and edge RDDs into a graph. The document provides examples of simple traversals and routing problems on graphs. It discusses using GraphX for topic modeling with LDA and provides further reading resources on GraphX, algebraic graph theory, and graph analysis tools and frameworks.
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case Mo Patel
This document summarizes a presentation about analyzing graphs using Apache Spark's GraphFrames and GraphX libraries. It begins with an introduction of the speaker and their interests. It then discusses what graphs are and provides examples of graph analytics like node scoring and community detection. It introduces GraphX and GraphFrames, how they allow working with property graphs and integrating graph operations with DataFrames. It also provides an example of how financial institutions can use graph analytics to detect synthetic identity fraud by analyzing relationships between customer addresses.
Graphs are everywhere! Distributed graph computing with Spark GraphX Andrea Iacono
This document discusses GraphX, a graph processing system built on Apache Spark. It defines what graphs are, including vertices and edges. It explains that GraphX uses Resilient Distributed Datasets (RDDs) to keep data in memory for iterative graph algorithms. GraphX implements the Pregel computational model where each vertex can modify its state, receive and send messages to neighbors each superstep until halting. The document provides examples of graph algorithms and notes when GraphX is well-suited versus a graph database.
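The superstep loop described above can be sketched in a few lines. This is a single-machine illustration of the Pregel idea in plain Python, not the GraphX API: each vertex keeps a state (here, its hop distance from a source), reads the messages sent to it in the previous superstep, and sends new messages to its neighbors only if its state improved. The function name `pregel_hop_distance` and the sample graph are assumptions.

```python
import math


def pregel_hop_distance(neighbors, source, max_supersteps=100):
    """Compute hop distances from source using Pregel-style supersteps.

    neighbors maps each vertex to the list of vertices it points to.
    """
    state = {v: math.inf for v in neighbors}
    inbox = {source: [0]}  # initial message that activates the source
    for _ in range(max_supersteps):
        if not inbox:
            break  # no messages in flight, so every vertex has halted
        outbox = {}
        for v, messages in inbox.items():
            best = min(messages)
            if best < state[v]:
                # The vertex updates its own state and notifies its neighbors.
                state[v] = best
                for w in neighbors[v]:
                    outbox.setdefault(w, []).append(best + 1)
        inbox = outbox  # messages are delivered at the next superstep
    return state


# Hypothetical sample graph; vertex "e" is unreachable from "a".
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": []}
distances = pregel_hop_distance(graph, "a")
print(distances)
```

A vertex that receives no improving message stays silent, which is what lets the computation halt once distances stop changing.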
Graph Gurus Episode 4: Detecting Fraud and Money Laundering in Real-Time Part 2 TigerGraph
This document discusses using TigerGraph and machine learning to detect money laundering. It describes money laundering techniques like layering and layering loops. It then outlines an AML workflow with TigerGraph, and goes into depth on how TigerGraph can detect layering loops through a bi-directional graph search approach in multiple phases. It provides pseudocode and examples to illustrate the loop detection approach. Finally, it discusses implementing loop detection as a GSQL query in TigerGraph.
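To make the idea of a bi-directional search for layering loops more concrete, here is a simplified sketch. It is not the GSQL query or the multi-phase procedure from the presentation: it only expands a bounded number of hops forward (who received money, directly or indirectly) and backward (who sent money, directly or indirectly) from one account and reports a loop when the two sets meet. The function names, the hop limit, and the sample transfers are assumptions.

```python
from collections import defaultdict


def within_hops(adjacency, start, max_hops):
    """Return the vertices reachable from start in 1..max_hops steps."""
    frontier, seen = {start}, set()
    for _ in range(max_hops):
        frontier = {w for v in frontier for w in adjacency[v]} - seen
        seen |= frontier
    return seen


def on_transfer_loop(transfers, account, max_hops=3):
    """Check whether account lies on a transfer loop of at most 2 * max_hops edges.

    transfers is an iterable of (sender, receiver) pairs.
    """
    forward, backward = defaultdict(set), defaultdict(set)
    for sender, receiver in transfers:
        forward[sender].add(receiver)
        backward[receiver].add(sender)
    downstream = within_hops(forward, account, max_hops)
    upstream = within_hops(backward, account, max_hops)
    # A vertex that is both downstream and upstream closes a loop through account.
    return bool(downstream & upstream)


# Hypothetical transfers: A -> B -> C -> A forms a loop; D only receives.
transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(on_transfer_loop(transfers, "A"))
print(on_transfer_loop(transfers, "D"))
```

Searching from both ends keeps each expansion shallow, which is the usual motivation for a bi-directional approach on large graphs.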
An Introduction to Graph: Database, Analytics, and Cloud Services Jean Ihm
Graph analysis employs powerful algorithms to explore and discover relationships in social network, IoT, big data, and complex transaction data. Learn how graph technologies are used in applications such as fraud detection for banking, customer 360, public safety, and manufacturing. This session will provide an overview and demos of graph technologies for Oracle Cloud Services, Oracle Database, NoSQL, Spark and Hadoop, including PGX analytics and PGQL property graph query language.
Presented at Analytics and Data Summit, March 20, 2018
ScalaTo July 2019 - No more struggles with Apache Spark workloads in production Chetan Khatri
Scala Toronto July 2019 event at 500px.
Pure Functional API Integration
Apache Spark Internals tuning
Performance tuning
Query execution plan optimisation
Cats Effect for switching the execution model at runtime.
Discovery / experience with Monix, Scala Future.
No REST till Production – Building and Deploying 9 Models to Production in 3 ... Databricks
Charmee Patel from Syntasa discusses building and deploying 9 models to production in 3 weeks to support media buying decisions for certain product segments using clickstream and enterprise data from ~2M visitors and ~100K SKUs. Key challenges included high data volume, complexity, non-stationarity, and reliability in production. Syntasa addressed this through an experiment process template, feature store, and ensemble modeling approach. Results showed significant lift over rule-based approaches, with the bespoke algorithmic models driving much higher conversion rates and marketing activity.
The document provides an introduction to Apache Spark and Scala. It discusses that Apache Spark is a fast and general-purpose cluster computing system that provides high-level APIs for Scala, Java, Python and R. It supports structured data processing using Spark SQL, graph processing with GraphX, and machine learning using MLlib. Scala is a modern programming language that is object-oriented, functional, and type-safe. The document then discusses Resilient Distributed Datasets (RDDs), DataFrames, and Datasets in Spark and how they provide different levels of abstraction and functionality. It also covers Spark operations and transformations, and how the Spark logical query plan is optimized into a physical execution plan.
Graph Databases and Machine Learning | November 2018 TigerGraph
Graph Database and Machine Learning: Finding a Happy Marriage. Graph databases and machine learning both represent powerful tools for getting more value from data. Learn how they can form a harmonious marriage to up-level machine learning.
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Anal...Databricks
This document discusses using TigerGraph for real-time fraud detection at scale by integrating real-time deep-link graph analytics with Spark AI. It provides examples of common TigerGraph use cases including recommendation engines, fraud detection, and risk assessment. It then discusses how TigerGraph can power explainable AI by extracting over 100 graph-based features from entities and their relationships to feed machine learning models. Finally, it shares a case study of how China Mobile used TigerGraph for real-time phone-based fraud detection by analyzing over 600 million phone numbers and 15 billion call connections as a graph to detect various types of fraud in real-time.
This document provides a summary of an event on optimized graph algorithms in Neo4j. It includes an introduction to graph analytics and algorithms, examples of analyzing real-world networks, and a demonstration of Neo4j's native graph database capabilities for graph analytics and algorithms. The presentation discusses preprocessing data from multiple sources into a graph, running algorithms like PageRank and community detection, and visualizing results.
Oracle Spatial Studio: Fast and Easy Spatial Analytics and Maps Jean Ihm
Learn about a new tool, Spatial Studio, that lets you quickly and easily do spatial analytics and create maps, even if you don't have GIS or Spatial knowledge. Now business users and non-GIS developers have a simple user interface to access the spatial features in Oracle Database.
Spatial Studio lets you prepare your data for spatial analysis, perform spatial analysis operations, and publish and share the results – as well as access spatial analysis results via REST and incorporate them in applications and workflows. Presented by Carol Palmer, Sr. Principal Product Manager, and David Lapp, Sr. Principal Product Manager, Oracle Spatial and Graph.
Presentation video including demo and resources available here: https://devgym.oracle.com/pls/apex/dg/office_hours/3084 .
Massively Scalable Computational Finance with SciDB Paradigm4Inc
Hedge funds, investment managers and prop shops need to keep pace with rapidly growing data volumes from many sources.
SciDB—an advanced computational database programmable from R and Python—scales out to petabyte volumes and facilitates rapid integration of diverse data sources. Open source and running on commodity hardware, SciDB is extensible and scales cost effectively.
Attend this webinar to learn how quants and system developers harness SciDB’s massively scalable complex analytics to solve hard problems faster. SciDB’s native array storage is optimized for time-series data, delivering fast windowed aggregates and complex analytics, without time-consuming data extraction.
Webinar presenters will demonstrate real world use cases, including the ability to quickly:
1. Generate aggregated order books across multiple exchanges
2. Create adjusted continuous futures contracts
3. Analyze complex financial networks to detect anomalous behavior
5th in the AskTOM Office Hours series on graph database technologies. https://devgym.oracle.com/pls/apex/dg/office_hours/3084
PGQL: A Query Language for Graphs
Learn how to query graphs using PGQL, an expressive and intuitive graph query language that's a lot like SQL. With PGQL, it's easy to get going writing graph analysis queries to the database in a very short time. Albert and Oskar show what you can do with PGQL, and how to write and execute PGQL code.
Applying graph analytics on data stored in relational databases can provide tremendous value in many application domains. We discuss the importance of leveraging these analyses, and the challenges in enabling them. We present a tool, called GraphGen, that allows users to visually explore, and rapidly analyze (using NetworkX) different graph structures present in their databases.
Graph Analytics on Data from Meetup.com Karin Patenge
This document contains an agenda and slides from a presentation on analyzing data using graph analytics. The presentation discusses retrieving meetup data via API, transforming it into nodes and edges files, loading the data into a graph database, and analyzing the graph data using PGX and PGQL. Key topics analyzed include influential meetup groups, connections between groups in different locations, and popular topics.
This document discusses analyzing social media data from Meetup.com using graph technologies. It describes retrieving data via the Meetup API, modeling the data as a graph, analyzing the graph using algorithms and tools like PGX and PGQL, and visualizing results in Cytoscape. Potential questions that could be answered include identifying influential people and groups, relationships between groups, and hot topics. The demo environment uses Oracle Big Data Lite with Oracle NoSQL Database to store the graph and analyze it.
At Data-centric Architecture Forum 2020 Thomas Cook, our Sales Director of AnzoGraph DB, gave his presentation "Knowledge Graph for Machine Learning and Data Science". These are his slides.
Graph analytics can be used to analyze a social graph constructed from email messages on the Spark user mailing list. Key metrics like PageRank, in-degrees, and strongly connected components can be computed using the GraphX API in Spark. For example, PageRank was computed on the 4Q2014 email graph, identifying the top contributors to the mailing list.
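To make the PageRank step above more concrete, here is a minimal power-iteration sketch in plain Python. It is not the GraphX API, and GraphX normalizes ranks differently. The `pagerank` helper, the edge direction (an edge from A to B meaning that A replied to B) and the sample names are assumptions for illustration only.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical reply graph: three people reply to alice, alice replies to bob.
replies = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("alice", "bob")]
ranks = pagerank(replies)
top_contributor = max(ranks, key=ranks.get)
```

In this toy graph the person who receives the most replies ends up with the highest rank, which is the intuition behind using PageRank to find top contributors.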
Continuous Intelligence - Intersecting Event-Based Business Logic and MLParis Carbone
Continuous intelligence involves integrating real-time analytics within business operations to prescribe actions in response to events based on current and historical data. It represents a paradigm shift from retrospective querying of data to providing real-time answers using stream processing as a 24/7 execution model. Technologies like Apache Flink enable this through scalable, fault-tolerant stream processing with stream SQL, complex event processing, and other abstractions.
Applying large scale text analytics with graph databasesData Ninja API
Data Ninja Services collaborated with Oracle to reach a major milestone in the integration of text analytics with Oracle Spatial and Graph. The Data Ninja Services client in Java can be used to analyze free texts, extract entities, generate RDF semantic graphs, and choose from a number of graph analytics to infer entity relationships. We demonstrated two case studies involving mining health news and detecting anomalies in product reviews.
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Amazon Web Services
Companies have valuable data that they might not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. With the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with an Amazon Redshift lead engineer to ask questions and learn more about how you can extend your analytics beyond your data warehouse.
Hadoop application architectures - using Customer 360 as an examplehadooparchbook
Hadoop application architectures - using Customer 360 (more generally, Entity 360) as an example. By Ted Malaska, Jonathan Seidman and Mark Grover at Strata + Hadoop World 2016 in NYC.
Multiplaform Solution for Graph DatasourcesStratio
One of the top banks in Europe needed a system that would provide better performance, scale almost linearly with the volume of information to be analyzed, and allow processes currently running on the host to be moved to a Big Data infrastructure. Over the course of a year, we worked on a system that gives users greater agility, flexibility and simplicity when viewing information during profiling, and that can now analyze the structure of profile data. It's a powerful way to make online queries to a graph database, which is integrated with Apache Spark and different graph libraries. Basically, we get all the necessary information through Cypher queries which are sent to a Neo4j database.
Using the latest Big Data technologies such as Spark DataFrame, HDFS, Stratio Intelligence and Stratio Crossdata, we have developed a solution that can obtain critical information from multiple data sources such as text files or graph databases.
Make your data fly - Building data platform in AWSKimmo Kantojärvi
This document summarizes a presentation on building a data platform in AWS. It discusses the architectural evolution from on-premise data warehouses to cloud-based data lakes and platforms. It provides examples of using AWS services like EMR, Redshift, Airflow and visualization tools. It also covers best practices for data modeling, performance optimization, security and DevOps approaches.
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...Databricks
Spark has established itself as the most popular platform for advanced scale-out analytical applications. It is deeply integrated with the Hadoop ecosystem, offers a set of powerful libraries and supports both Python and R. For these reasons, data scientists have started to adopt Spark to train and deploy their models. When Spark 1.4 was released back in 2015, it included the new SparkR library: this API gave R users the exciting new option to run R code on Spark.
And while the initial promise to provide a full R environment in Spark has been kept, it takes a deeper understanding of SparkR’s inner workings to make optimal use of its capabilities. This talk will give a comprehensive update on where we stand with Data Science applications in R based on the latest Spark releases. We will share insights from both a startup solution and a Fortune 100 company where SparkR does Machine Learning in the Cloud on a scale that would not have been feasible previously: its parallel execution model runs in minutes and hours whereas conventional sequential approaches would take days and months.
Suggested Topics:
• An update on the SparkR architecture in the latest Spark release: using R with SparkSQL, MLlib and Spark’s Structured Streaming
• How to handle practical challenges, e.g. running R on the cluster without a local installation, storing non-tabular results, such as Data Science models or plots, mixing Scala and R.
• Scaling Big Compute Applications with SparkR: Parallelizing SparkR applications with User-Defined Functions (UDFs) and elastic scaling of resources in the Cloud
• An Outlook on Machine Learning with SparkR and its ecosystem, frameworks and tools.
• Plus: “Do I need to learn Python?”
Every year the financial industry loses billions because of fraud while in the meantime fraudsters are coming up with more and more sophisticated patterns.
Financial institutions have to find the balance between fraud protection and negative customer experience. Fraudsters bury their patterns in lots of data, but the traditional technologies are not designed to detect fraud in real-time or to see patterns beyond the individual account.
Analyzing relations with graph databases helps uncover these larger complex patterns and speeds up suspicious behavior identification.
Furthermore, graph databases enable fast and effective real-time link queries and passing context to machine learning models.
The earlier a fraud pattern or network is identified, the faster the activity is blocked. As a result, losses and fines are minimized.
A talk about how we at Expedia are trying to get greater observability into our stack using Haystack, our open-source distributed tracing and analysis system.
Scaling graph investigations with Math, GPUs, & Expertsgraphistry
Investigating logs is getting more and more important as more of our lives get recorded, and graph techniques promise to help us to reveal the connections in our data. However, scale challenges forensics in many enterprise and federal settings. By focusing on the fundamentals around the pure math, GPU accelerated implementation, and the experts performing the process, we can go quite far.
Demos span security, fraud, & crime, and cover concepts such as UMAP/K-NN/DL, hypergraphs, and low-code investigation automation via visual graph-based record & replay.
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...Jean Ihm
2nd in the AskTOM Office Hours series on graph database technologies. https://devgym.oracle.com/pls/apex/dg/office_hours/3084
With property graphs in Oracle Database, you can perform powerful analysis on big data such as social networks, financial transactions, sensor networks, and more.
To use property graphs, first, you’ll need a graph model. For a new user, modeling and generating a suitable graph for an application domain can be a challenge. This month, we’ll describe key steps required to construct a meaningful graph, and offer a few tips on validating the generated graph.
Albert Godfrind (EMEA Solutions Architect), Zhe Wu (Architect), and Jean Ihm (Product Manager) walk you through, and take your questions.
Architecting next generation big data platformhadooparchbook
A tutorial on architecting next generation big data platform by the authors of O'Reilly's Hadoop Application Architectures book. This tutorial discusses how to build a customer 360 (or entity 360) big data application.
Audience: Technical.
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardParis Data Engineers !
Delta Lake is an open source framework that sits on top of Parquet in your data lake to provide reliability and performance. It was open-sourced by Databricks this year and is gaining traction to become the de facto data lake format.
We’ll see all the good Delta Lake can do for your data: ACID transactions, DDL operations, schema enforcement, batch and stream support, and more.
The document discusses Oracle Spatial and GeoRaster capabilities for processing and analyzing raster imagery data within the Oracle database. It provides an overview of raster data concepts like cell size and resolution, multi-band images, blocking, and compression techniques for imagery and grid data. It also demonstrates how Oracle Spatial can be used to load, store, query, manipulate, analyze and process large volumes of raster data within the database for applications like remote sensing, mapping, and gridded data analysis.
The document discusses weather data stored in Oracle Spatial GeoRaster format. It provides an overview of weather forecast data from the German Weather Service (DWD) which is available in GRIB2 format from their open data server. It also describes how the weather data from DWD numerical models like ICON can be accessed and processed using Oracle Spatial GeoRaster.
Big Data Community Webinar vom 16. Mai 2019: Oracle NoSQL DB im ÜberblickKarin Patenge
A key-value store with native JSON support that can also handle graphs and SQL. The slide deck contains detailed information on using Oracle NoSQL DB from the perspective of application development as well as administration and operations.
This document discusses using Pandas and Python to access and analyze data in an Oracle database. It begins with an introduction to Python and Pandas for data analysis. It then discusses how to connect Python to an Oracle database using the cx_Oracle library. It provides examples of querying and manipulating spatial vector data stored in Oracle using GeoPandas. The document aims to help developers get started with leveraging Python and Pandas for data work with an Oracle backend.
The document provides an overview of various emerging technologies and trends that are influencing customers, including chatbots, blockchain, internet of things, and artificial intelligence. It discusses these technologies and how Oracle is addressing them through products and services like its blockchain cloud service, IoT cloud service, and intelligent bots platform.
Datenbank-gestützte Validierung und Geokodierung von AdressdatenbeständenKarin Patenge
This slide deck covers how to validate address data sets from various sources and convert the address information into coordinates (the process of geocoding). Geocoded address data can be displayed on maps and used for all kinds of spatially enabled analysis and mining.
Raster Algebra mit Oracle Spatial und uDigKarin Patenge
The slide deck describes the integration of Oracle Spatial with open source technologies. Using uDig as an example, it shows step by step how uDig can be used together with Oracle Spatial for raster data analysis. As an example, a vegetation index (NDVI) is calculated.
If you are interested, you can read more on the Oracle Spatial blog (http://oracle-spatial.blogspot.com).
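The vegetation index named in the raster algebra deck above, NDVI, is conventionally computed per cell as (NIR - Red) / (NIR + Red). The following plain Python sketch shows that cell-by-cell arithmetic on two tiny grids. It does not use Oracle Spatial or uDig. The `ndvi` helper, the sample band values and the choice to report empty cells as 0.0 are assumptions.

```python
def ndvi(nir, red):
    """Per-cell NDVI = (NIR - Red) / (NIR + Red) for two equally sized grids."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Assumption: cells with no signal in either band are reported as 0.0.
            row.append((n - r) / total if total else 0.0)
        result.append(row)
    return result


# Hypothetical reflectance values for a 2 x 2 raster.
nir_band = [[0.8, 0.6], [0.0, 0.5]]
red_band = [[0.2, 0.2], [0.0, 0.5]]
index = ndvi(nir_band, red_band)
```

Values near 1 indicate dense vegetation, while values near 0 or below indicate bare ground, water or built-up areas.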
Geodatenmanagement und -Visualisierung mit Oracle Spatial TechnologiesKarin Patenge
The slide deck gives an overview of which spatial data can be maintained and analyzed in the Oracle Database, and how. It also shows how Oracle Maps can be used to visualize spatial data in the form of maps.
If you are interested, you can read more on the German-language Oracle Spatial blog (http://oracle-spatial.blogspot.com).
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their share of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.