This document discusses graph databases and analyzing relationships at scale. It provides an overview of graph databases, how they are used to represent complex relationships between entities as nodes and edges in a graph, and how graph queries and analytics can reveal useful insights by traversing the graph along relationships. It also briefly introduces Aurelius, an open source graph database platform, and some of its features for working with large graph datasets.
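As a rough illustration of the idea above, the following minimal Python sketch stores a graph as labeled edges between nodes and answers a question by traversing relationships breadth-first. The node names, the edge labels, and the helper functions `build_adjacency` and `within_hops` are hypothetical and chosen only for this example; this is not the API of any particular graph database.

```python
from collections import deque

# Hypothetical toy data: each edge is (source node, label, target node).
EDGES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "likes", "graphs"),
]


def build_adjacency(edges):
    """Index outgoing edges by source node: node -> list of (label, target)."""
    adjacency = {}
    for source, label, target in edges:
        adjacency.setdefault(source, []).append((label, target))
    return adjacency


def within_hops(adjacency, start, label, max_hops):
    """Breadth-first traversal along edges carrying the given label.

    Returns a dict mapping each reached node to its hop distance from start.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for edge_label, neighbor in adjacency.get(node, []):
            if edge_label == label and neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


adjacency = build_adjacency(EDGES)
reached = within_hops(adjacency, "alice", "knows", max_hops=2)
print(reached)  # {'alice': 0, 'bob': 1, 'carol': 2}
```

The traversal touches only the edges reachable from the start node, which is why relationship-centric questions such as "who is within two hops of alice?" map naturally onto a graph model.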
Adding Value through graph analysis using Titan and Faunus (Matthias Broecheler)
In this presentation we discuss how graph analysis can add value to your data and how to use open source tools like Titan and Faunus to build scalable graph processing systems.
This presentation gives an update on the development status of Titan and Faunus with a preview of what is to come.
The eighth annual Future of Open Source Survey results, presented by Black Duck and North Bridge, point toward the increased strategic role that open source plays in today’s enterprises, its crucial function within new technology development, and the growth of both first-time developers within the OSS community and the impact open source has in daily life.
This live webinar demonstrates how using an integrated customer acquisition solution can help to close the loop between marketing and sales. We show you examples of how this process has worked for other companies, giving them a better understanding of where their leads are coming from and how to best spend their marketing dollars for the highest return. See more at: https://www.osscube.com/webinar/sales-and-marketing-together-at-last
Using PIM to maximize revenue and improve customer satisfaction (OSSCube)
This live webinar shows how Pimcore, an open source PIM (Product Information Management) solution, can be used to quickly update and append your product catalog across all channels, effectively reducing data management costs.
This webinar goes through how the commerce industry today has changed, causing customers to interact differently, expect more from retailers and demand unique shopping experiences. Rakesh Kumar and John Bernard dive into what makes Magento the world’s leading eCommerce platform and how it puts the retailer back in control.
Legacy to industry leader: a modernization case study (OSSCube)
This live webinar goes through the steps of how MakeMyTrip.com engaged OSSCube to completely modernize their website and help them become one of the top online travel companies in the world. Zend Server and Zend Studio were used to expedite the entire project for what has now become arguably the largest Drupal implementation to date.
This webinar explores the process of how OSSCube developed a Talend solution--for a global provider of digital marketing and client reporting tools--that aggregates and converts information from a variety of resources into well-defined data formats.
Watch on YouTube: https://www.youtube.com/watch?v=gyZiiG7mjx8
OSSCube EVP John Bernard and Talend Alliances and Channels Manager Rich Searle provide an in-depth explanation of the benefits of Talend as well as the usefulness of data organization in today's business world.
Key Discussion Points:
- Talend ETL tools capabilities
- Implementing Talend in your organization
For more information please visit OSSCube.com
For more webinars please visit OSSCube.com/upcoming-webinars
Using MySQL Fabric for High Availability and Scaling Out (OSSCube)
MySQL Fabric is an extensible framework for managing farms of MySQL servers. In this webinar, you will learn what MySQL Fabric is, what it can achieve, and how it is used by database administrators and developers. You will also learn how MySQL Fabric can help with sharding and high availability. See more at http://www.osscube.com/
This presentation introduces Titan, Faunus, and scalable graph computing in general. We present a case study of how Pearson builds an education social network on top of Titan, Faunus, and Cassandra to support learning in the 21st century.
Titan is an open source distributed graph database built on top of Cassandra that can power real-time applications with thousands of concurrent users over graphs with billions of edges. Faunus is an open source global graph processing engine built on top of Hadoop and compatible with Cassandra that can analyze graphs, compute graph statistics, and execute global traversals. Titan and Faunus are components of the Aurelius Graph Cluster, which enables scalable graph computation and powers applications in social networking, recommendation engines, advertisement optimization, knowledge representation, health care, education, and security.
DataStax: Titan 1.0: Scalable real time and analytic graph queries (DataStax Academy)
Titan is a scalable graph database built on top of Apache Cassandra that supports both real-time and analytic graph queries across distributed clusters. This talk focuses on the recently released Titan 1.0 and the new features it introduces. Titan implements the most recent version of the popular Apache Gremlin graph query language through a custom rewrite engine and query optimizer to efficiently execute deep traversal queries. Graph analytics and global breadth-first execution of Gremlin queries are executed by Apache Spark through the Cassandra-Spark connector. These features are demonstrated on a social use case to highlight how Titan can deliver relationship value with little development effort.
Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger (DataStax)
Apache Cassandra 2.0 is out - now there's no reason not to ditch that ol' legacy relational system for your important online applications. Cassandra 2.0 includes big-impact features like lightweight transactions and triggers. Do you know about the other new enhancements that got lost in the noise? Let's put the spotlight on all the things: changes in memory management, file handling, and internals. They are low on hype but pack a big punch. While we were at it, we also did a bit of house cleaning.
Cassandra Community Webinar | Introduction to Apache Cassandra 1.2 (DataStax)
Title: Introduction to Apache Cassandra 1.2
Details: Join Aaron Morton, DataStax MVP for Apache Cassandra, and learn the basics of the massively scalable NoSQL database. This webinar will examine C*’s architecture and its strengths for powering mission-critical applications. Aaron will introduce you to core concepts such as Cassandra’s data model, multi-datacenter replication, and tunable consistency. He’ll also cover new features in Cassandra version 1.2, including virtual nodes, the CQL 3 language, and query tracing.
Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010, he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.
Discovering User's Topics of Interest in Recommender Systems (Gabriel Moreira)
This talk introduces the main techniques of Recommender Systems and Topic Modeling.
Then, we present a case of how we've combined those techniques to build Smart Canvas (www.smartcanvas.com), a service that allows people to bring, create and curate content relevant to their organization, and also helps to tear down knowledge silos.
We present some of the Smart Canvas features powered by its recommender system, such as:
- Highlighting relevant content, explaining to users which of their topics of interest generated each recommendation.
- Associating tags with users’ profiles based on topics discovered from content they have contributed. These tags become searchable, allowing users to find experts or people with specific interests.
- Recommending people with similar interests, explaining which topics bring them together.
We give a deep dive into the design of our large-scale recommendation algorithms, giving special attention to our content-based approach that uses topic modeling techniques (like LDA and NMF) to discover people’s topics of interest from unstructured text, and social-based algorithms using a graph database connecting content, people and teams around topics.
We also describe our typical data pipeline, which includes the ingestion of millions of user events (using Google PubSub and BigQuery), the batch processing of the models (with PySpark, MLlib, and Scikit-learn), the online recommendations (with Google App Engine, Titan Graph Database and Elasticsearch), and the data-driven evaluation of UX and algorithms through A/B testing experimentation. We also touch on the non-functional requirements of a software-as-a-service product, such as scalability, performance, availability, reliability and multi-tenancy, and how we addressed them in a robust architecture deployed on Google Cloud Platform.
Webinar: Five Ways a Technology Refresh Strategy Can Help Make Your Digital T... (OSSCube)
You’ve realized that in order to create new revenue streams, increase efficiency and improve customer engagement your organization may need a digital transformation. But what exactly is a digital transformation, how do you start one, and how does technology play a role? Join experts Dietmar Rietsch, co-founder and CEO of Pimcore, and John Bernard, EVP of North America at OSSCube, as they discuss how Pimcore is disrupting the digital transformation market.
We’ll cover:
- What digital transformation is and why it’s important for your organization
- The role technology plays in the digital transformation process
- How choosing the right technology gives you a competitive advantage
- Outcomes of a successful digital transformation project
C*ollege Credit: CEP Distributed Processing on Cassandra with Storm (DataStax)
Cassandra provides facilities to integrate with Hadoop. This is sufficient for distributed batch processing, but doesn’t address CEP distributed processing. This webinar will demonstrate the use of Cassandra in Storm. Storm provides a data flow and processing layer that can be used to integrate Cassandra with other external persistence mechanisms (e.g. Elasticsearch) or to calculate dimensional counts for reporting and dashboards. We’ll dive into a sample Storm topology that reads from and writes to Cassandra using storm-cassandra bolts.
Problem solving in the 21st century increasingly depends on the analysis of complex systems. Developing new drugs, understanding risk in financial networks, searching for answers in knowledge graphs, personalization and recommendation in social networks all require the analysis of systems composed of interconnected entities that exhibit complex behavior as a whole. Graph computing provides a conceptual model and practical platform for developing such analyses.
This talk presents graph computing as an important component of every developer’s toolbox. We introduce the Aurelius graph cluster, an open-source stack that enables graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop. This stack addresses challenging problems in graph partitioning, graph query language design and graph algorithm development with solutions inspired by physics, biology and neuroscience.
Accelerate Your Digital Transformation Journey with Pimcore (OSSCube)
A key priority for businesses today is to successfully transform their enterprise into a digital business. Digital transformation offers enormous opportunities to enterprises to refine their business models and to win in this digital era. How is your organization placed in this digital world?
In the video, we discuss how Pimcore delivers on this promise by consolidating PIM, CMS, DAM, and commerce within one platform, with faster time-to-market.
We will also go through some recent digital transformation experiences driven through Pimcore that helped clients achieve market differentiation and customer value.
Key Points:
* Understanding the need for digital transformation and its strategies
* Transforming digital strategies through Pimcore
* Gaining insight into Pimcore and its features
* Identifying and correlating end-customer needs based on our digital transformation experiences
Cassandra Community Webinar: Apache Cassandra Internals (DataStax)
Apache Cassandra solves many interesting problems to provide a scalable, distributed, fault-tolerant database. Cluster-wide operations track node membership, direct requests, and implement consistency guarantees. At the node level, the log-structured storage engine provides high-performance reads and writes. All of this is implemented in a Java code base that has greatly matured over the past few years.
In this webinar Aaron Morton will step through read and write requests, automatic processes and manual maintenance tasks. He will also discuss the general approach to solving the problem and drill down to the code responsible for implementation.
Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010 he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.
Cassandra Community Webinar: Back to Basics with CQL3 (DataStax)
Cassandra is a distributed, massively scalable, fault tolerant, columnar data store, and if you need the ability to make fast writes, the only thing faster than Cassandra is /dev/null! In this fast-paced presentation, we'll briefly describe big data, and the area of big data that Cassandra is designed to fill. We will cover Cassandra's unique, every-node-the-same architecture. We will reveal Cassandra's internal data structure and explain just why Cassandra is so darned fast. Finally, we'll wrap up with a discussion of data modeling using the new standard protocol: CQL (Cassandra Query Language).
Titan is a scalable graph database that can distribute and query graph data across multiple machines. This presentation provides a general introduction to graph computing and Titan in particular. It also focuses on recent developments for Titan 0.9 and TinkerPop 3.
The problems we are faced with in the 21st century require efficient analysis of ever more complex systems. This presentation outlines how such problems can be better understood and effectively solved if they are modeled as graphs or networks. We present two tools to help solve such problems at scale: Titan, a real-time distributed graph database based on Apache Cassandra and HBase, and Faunus, a batch analytics framework for graphs based on Apache Hadoop. We discuss their current development status as of November 2012 and illustrate an example application for the GitHub coding network.
Titan is an open source distributed graph database built on top of Cassandra that can power real-time applications with thousands of concurrent users over graphs with billions of edges. Graphs are a versatile data model for capturing and analyzing rich relational structures. Graphs are an increasingly popular way to represent data in a wide range of domains such as social networking, recommendation engines, advertisement optimization, knowledge representation, health care, education, and security.
This presentation discusses Titan's data model, query language, and novel techniques in edge compression, data layout, and vertex-centric indices which facilitate the representation and processing of Big Graph Data across a Cassandra cluster. We demonstrate Titan's performance on a large scale benchmark evaluation using Twitter data.
Presented at the Cassandra 2012 Summit.
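To make the idea of a vertex-centric index more concrete, here is a minimal in-memory Python sketch. It assumes that such an index keeps each vertex's incident edges sorted by a key (here a label and a timestamp), so that a query at one vertex can seek directly to the relevant edges instead of scanning all of them. The class `VertexCentricIndex`, its methods, and the sample data are hypothetical; this is an illustration of the general technique, not Titan's actual implementation or storage layout.

```python
import bisect
from collections import defaultdict


class VertexCentricIndex:
    """Toy index: per-vertex edge lists kept sorted by (label, timestamp)."""

    def __init__(self):
        # vertex -> sorted list of (label, timestamp, neighbor)
        self._edges = defaultdict(list)

    def add_edge(self, vertex, label, timestamp, neighbor):
        # insort keeps the per-vertex list ordered as edges arrive.
        bisect.insort(self._edges[vertex], (label, timestamp, neighbor))

    def edges_between(self, vertex, label, start, end):
        """Neighbors over `label` edges with start <= timestamp < end."""
        entries = self._edges.get(vertex, [])
        # A 2-tuple sorts before any 3-tuple sharing its prefix, so these
        # bounds select exactly the half-open timestamp range.
        lo = bisect.bisect_left(entries, (label, start))
        hi = bisect.bisect_left(entries, (label, end))
        return [neighbor for _, _, neighbor in entries[lo:hi]]


# Hypothetical sample data for one heavily connected vertex.
index = VertexCentricIndex()
index.add_edge("alice", "tweets", 100, "t1")
index.add_edge("alice", "tweets", 250, "t2")
index.add_edge("alice", "follows", 120, "bob")
index.add_edge("alice", "tweets", 300, "t3")

recent = index.edges_between("alice", "tweets", 200, 300)
print(recent)  # ['t2']
```

The lookup costs two binary searches plus the size of the result, which matters most for vertices with very many incident edges.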
PMatch: Probabilistic Subgraph Matching on Huge Social Networks (Matthias Broecheler)
Users querying massive social networks or RDF databases are often not 100% certain about what they are looking for due to the complexity of the query or heterogeneity of the data. In this paper, we propose “probabilistic subgraph” (PS) queries over a graph/network database, which afford users great flexibility in specifying “approximately” what they are looking for. We formally define the probability that a substitution satisfies a PS-query with respect to a graph database. We then present the PMATCH algorithm to answer such queries and prove its correctness. Our experimental evaluation demonstrates that PMATCH is efficient and scales to massive social networks with over a billion edges.
Continuous Markov random fields are a general formalism to model joint probability distributions over events with continuous outcomes. We prove that marginal computation for constrained continuous MRFs is #P-hard in general and present a polynomial-time approximation scheme under mild assumptions on the structure of the random field. Moreover, we introduce a sampling algorithm to compute marginal distributions and develop novel techniques to increase its efficiency. Continuous MRFs are a general-purpose probabilistic modeling tool, and we demonstrate how they can be applied to statistical relational learning. On the problem of collective classification, we evaluate our algorithm and show that the standard deviation of marginals serves as a useful measure of confidence.
COSI: Cloud Oriented Subgraph Identification in Massive Social Networks (Matthias Broecheler)
Slides presenting our work on COSI at the ASONAM conference 2010
Note: The images used in this presentation are copyright by the respective owners as indicated with the picture. Pictures used are either CC or fair use. Please notify the author if you feel that your images are unfairly used in this presentation.
23. Aurelius Graph Cluster (Apache 2)
TITAN: Stores a massive-scale property graph allowing real-time traversals and updates
FAUNUS: Batch processing of large graphs with Hadoop
FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs
Arrows between components: Map/Reduce, Load, Bulk Load, Analysis results back into Titan
24. Titan Features
Numerous Concurrent Users
Many Short Transactions (read/write)
Real-time Traversals (OLTP)
High Availability
Dynamic Scalability
Variable Consistency Model (ACID or eventual consistency)
Real-time Big Graph Data
26. $ ./titan-0.2.0/bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> g = TitanFactory.open('/tmp/titan')
==>titangraph[local:/tmp/titan]
gremlin> v = g.V('name','Hercules').next()
==>v[4]
gremlin> v.out('father').out('brother').name
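The two-hop traversal above (father, then brother) can be mimicked on a tiny in-memory graph. This is only a plain-Python sketch of the idea, not Titan or Gremlin code; the `OUT_EDGES` table and the `out` helper are hypothetical, and the edge endpoints are assumed from the example graph shown later in the deck.

```python
# Toy adjacency lists: vertex name -> edge label -> list of neighbour names.
# Endpoints are an assumption based on the deck's example graph.
OUT_EDGES = {
    "Hercules": {"father": ["Jupiter"], "mother": ["Alcmene"]},
    "Jupiter": {"father": ["Saturn"], "brother": ["Neptune", "Pluto"]},
}


def out(vertices, label):
    """Follow outgoing edges with the given label from every input vertex."""
    result = []
    for vertex in vertices:
        result.extend(OUT_EDGES.get(vertex, {}).get(label, []))
    return result


# Equivalent in spirit to v.out('father').out('brother').name
uncles = out(out(["Hercules"], "father"), "brother")
print(uncles)
```

Each call to `out` corresponds to one traversal step, so chaining two calls walks two relationships away from the start vertex.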
27. Vertex-Centric Indices
Sort and index edges per
vertex by primary key
Primary key can be composite
Enables efficient focused
traversals
Only retrieve edges that matter
Uses push down predicates for
quick, index-driven retrieval
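A rough way to picture a vertex-centric index is a per-vertex edge list kept sorted by a composite key, so that a constrained query becomes a binary search instead of a scan over all incident edges. The sketch below is plain Python under that assumption; the `Vertex` class, its methods, and the target names are hypothetical and do not reflect Titan's storage layout or API.

```python
from bisect import bisect_left, insort


class Vertex:
    """Vertex whose edges are kept sorted by the composite key (label, time)."""

    def __init__(self, name):
        self.name = name
        self._edges = []  # sorted list of (label, time, target) tuples

    def add_edge(self, label, time, target):
        # insort keeps the list ordered, so no re-sorting is needed later.
        insort(self._edges, (label, time, target))

    def edges(self, label, t_min, t_max):
        """Return edges with this label and t_min <= time < t_max."""
        lo = bisect_left(self._edges, (label, t_min))
        hi = bisect_left(self._edges, (label, t_max))
        return self._edges[lo:hi]


v = Vertex("v")
for time, target in [(1, "monster-a"), (3, "monster-b"), (5, "monster-c"), (9, "monster-d")]:
    v.add_edge("battled", time, target)
v.add_edge("mother", 0, "parent-a")  # time 0 is a placeholder for untimed edges
v.add_edge("father", 0, "parent-b")

hits = v.edges("battled", 1, 5)
print(hits)
```

Only the slice between the two binary-search positions is touched, which is the sense in which such an index retrieves "only the edges that matter".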
28. [Figure: vertex v with incident edges labeled battled (time: 1, 3, 5, 9), mother, father, and fought (two edges); query shown: v.query()]
29. [Figure: vertex v with edges labeled battled (time: 1, 3, 5, 9), mother, and father; query shown: v.query().direction(OUT)]
32. Titan Server
REST
REXPRO
$ wget http://s3.thinkaurelius.com/downloads/titan/titan-cassandra-0.3.0.zip
$ unzip titan-cassandra-0.3.0.zip
$ cd titan-cassandra-0.3.0
$ sudo bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties
33. Graph Indexing
Vertex and Edge indexing
Pluggable index provider
ElasticSearch
Lucene
Full-text search
Numeric range search
Geographic search
34. [Figure: example property graph]
Vertices:
name: Saturn, type: titan, age: 10000
name: Jupiter, age: 4800, title: God of the heaven and skies
name: Neptune, age: 4500, title: God of the earth and ocean
name: Pluto, age: 4000, title: God of the underworld
name: Alcmene, type: human, age: 45
name: Hercules, title: Divine hero
name: Hydra, type: monster
name: Cerberus, title: Ugly beast of the underworld
Edges:
father (two edges)
mother
brother (two edges)
pet
battled, time: 2, location: [37.7,23.9]
battled, time: 12, location: [39,22]
35. [Figure: the example property graph from slide 34, with a full-text query]
g.query().has('title',Txt.CONTAINS,'god').vertices()
36. [Figure: the example property graph from slide 34, with a combined numeric range and full-text query]
g.query().has('age',GREATER_THAN,4500).has('title',CONTAINS,'god').vertices()
37. [Figure: the example property graph from slide 34, with a geographic query on edges]
g.query().has('location',WITHIN,Geoshape.circle(38,24,50)).edges()
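To make the three query types concrete, the sketch below evaluates them over the example data with plain Python filters. It is a linear scan, not an index, and it does not use Titan, ElasticSearch, or Lucene. It assumes that the circle arguments mean latitude, longitude, and radius in kilometres, and that full-text matching is on whole lowercase words; the helper names are hypothetical.

```python
import math

# Vertex and edge properties as read from the example graph in the slides.
VERTICES = [
    {"name": "Neptune", "age": 4500, "title": "God of the earth and ocean"},
    {"name": "Jupiter", "age": 4800, "title": "God of the heaven and skies"},
    {"name": "Pluto", "age": 4000, "title": "God of the underworld"},
    {"name": "Hercules", "title": "Divine hero"},
    {"name": "Cerberus", "title": "Ugly beast of the underworld"},
]
BATTLES = [
    {"time": 2, "location": (37.7, 23.9)},
    {"time": 12, "location": (39.0, 22.0)},
]


def contains(text, term):
    """Whole-word, case-insensitive match (assumed full-text semantics)."""
    return term.lower() in text.lower().split()


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Full-text search: title contains "god".
gods = [v["name"] for v in VERTICES if contains(v["title"], "god")]

# Numeric range plus full-text: age > 4500 and title contains "god".
old_gods = [v["name"] for v in VERTICES
            if v.get("age", 0) > 4500 and contains(v["title"], "god")]

# Geographic search: battles within 50 km of (38, 24).
nearby = [e["time"] for e in BATTLES
          if haversine_km(e["location"], (38.0, 24.0)) <= 50]

print(gods, old_gods, nearby)
```

A real index provider answers the same predicates without visiting every element, which is what makes these queries practical on large graphs.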
38. Faunus Features
Hadoop-based Graph
Computing Framework
Graph Analytics
Breadth-first Traversals
Global Graph Computations
Batch Big Graph Data
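One way to think about a breadth-first traversal step in a batch setting is as a map phase that sends a counter along every outgoing edge, followed by a reduce phase that sums the counters arriving at each vertex. The sketch below runs that idea in a single Python process; it is not Faunus or Hadoop code, and the edge list and function names are hypothetical.

```python
from collections import defaultdict

# Directed toy edge list (source, target); endpoints are illustrative.
EDGES = [
    ("Hercules", "Jupiter"), ("Hercules", "Alcmene"),
    ("Hercules", "Hydra"), ("Hercules", "Cerberus"),
    ("Jupiter", "Saturn"), ("Jupiter", "Neptune"), ("Jupiter", "Pluto"),
    ("Pluto", "Cerberus"),
]


def map_phase(counts):
    """Emit (target, count) for every edge leaving a vertex in the frontier."""
    for src, dst in EDGES:
        if counts.get(src, 0):
            yield dst, counts[src]


def reduce_phase(pairs):
    """Sum the counts that arrive at each vertex."""
    totals = defaultdict(int)
    for vertex, count in pairs:
        totals[vertex] += count
    return dict(totals)


def step(counts):
    """One breadth-first step: a map phase followed by a reduce phase."""
    return reduce_phase(map_phase(counts))


one_hop = step({"Hercules": 1})
two_hops = step(one_hop)
print(one_hop, two_hops)
```

Because every step touches all edges at once, this style suits global computations over the whole graph more than low-latency lookups from a single vertex.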
41. Faunus Setup
$ bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> g = FaunusFactory.open('bin/titan-hbase.properties')
==>faunusgraph[titanhbaseinputformat]
gremlin> g.getProperties()
==>faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseInputFormat
==>faunus.graph.output.format=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
==>faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
==>faunus.output.location=dbpedia
==>faunus.output.location.overwrite=true
gremlin> g._()
12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Compiled to 1 MapReduce job(s)
12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Executing job 1 out of 1: MapSequence[com.thinkaurelius.faunus.mapreduce.transform.IdentityMap.Map]
12/11/09 15:17:50 INFO mapred.JobClient: Running job: job_201211081058_0003
42. Build a Knowledge Graph
Based on DBPedia
Graph version of Wikipedia
~290 million edges (~1B triples)
1. Bulk load RDF into Faunus
6 m1.xlarge
2. Convert to property graph
3. Bulk load into Titan
3 m1.xlarge with Cassandra
4. OLTP+OLAP
Total Time: ~ 2 hours
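Step 2 of the pipeline above, converting RDF to a property graph, might look roughly like the sketch below. The conversion rule is an assumption and is not taken from the slides: triples whose object is a URI become edges, and triples whose object is a literal become vertex properties. Under that rule the edge count ends up lower than the triple count. The URIs and the `to_property_graph` function are hypothetical.

```python
def to_property_graph(triples):
    """Split (subject, predicate, object) triples into vertices and edges.

    Assumed rule: URI objects become edges, literal objects become
    properties on the subject vertex.
    """
    vertices = {}
    edges = []
    for subject, predicate, obj in triples:
        props = vertices.setdefault(subject, {})
        if obj.startswith("http://"):
            vertices.setdefault(obj, {})  # make sure the target vertex exists
            edges.append((subject, predicate, obj))
        else:
            props[predicate] = obj
    return vertices, edges


# Hypothetical triples using example.org URIs.
TRIPLES = [
    ("http://example.org/Hercules", "name", "Hercules"),
    ("http://example.org/Hercules", "title", "Divine hero"),
    ("http://example.org/Hercules", "father", "http://example.org/Jupiter"),
    ("http://example.org/Jupiter", "name", "Jupiter"),
]

vertices, edges = to_property_graph(TRIPLES)
print(len(TRIPLES), "triples ->", len(vertices), "vertices,", len(edges), "edges")
```

Here four triples collapse into two vertices and one edge, with the remaining three triples stored as properties.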
49. Aurelius Graph Cluster (Apache 2)
TITAN: Stores a massive-scale property graph allowing real-time traversals and updates
FAUNUS: Batch processing of large graphs with Hadoop
FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs
Arrows between components: Map/Reduce, Load, Bulk Load, Analysis results back into Titan
Contact: aureliusgraphs@googlegroups.com