Neo4J and Weka 2

Vasko Yordanov

Combining a recommendation engine with a graph database, as a sample of the potential of emerging technologies.

Combining the Neo4j graph database with Weka

A basic “toy” example drawn from mining SEC Form D filings.

Experiment: Find the intersection among VC firms related to Google and its latest acquisitions (i.e. the “Dataset”), and play with “predicting” the chance of a newly funded startup being acquired by Google by examining proximity.

Weka: A machine learning toolkit containing classification and clustering algorithms. Used here to create recommendations based on input.

Neo4j: A graph database, well suited to social network data. Used here to find the “shortest path” between two nodes.
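To make the “shortest path” idea concrete, here is a minimal sketch of a breadth-first search over a small in-memory graph. It uses only the Java standard library and is not the Neo4j API; the class name `ShortestPathSketch` and the node `StartupX` are hypothetical, chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Breadth-first shortest path on an undirected in-memory graph (illustration only). */
final class ShortestPathSketch {
    private final Map<String, Set<String>> neighbours = new HashMap<>();

    /** Adds an undirected link between two named nodes. */
    void link(String a, String b) {
        neighbours.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        neighbours.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** Returns the nodes on one shortest path, or an empty list if goal is unreachable. */
    List<String> shortestPath(String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>(); // node -> predecessor
        Queue<String> frontier = new ArrayDeque<>();
        cameFrom.put(start, null);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String current = frontier.remove();
            if (current.equals(goal)) {
                // Walk the predecessor links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String n = goal; n != null; n = cameFrom.get(n)) {
                    path.add(n);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : neighbours.getOrDefault(current, Collections.<String>emptySet())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    frontier.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        ShortestPathSketch graph = new ShortestPathSketch();
        graph.link("Sequoia Capital", "Google");
        graph.link("Sequoia Capital", "StartupX"); // hypothetical startup
        System.out.println(graph.shortestPath("StartupX", "Google"));
    }
}
```

The number of hops on the returned path is one simple way to express the “proximity” the experiment plays with: the fewer hops between a startup and Google, the closer they are in the funding graph.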

Neo4j can handle large sets of unstructured linked data:

RDF: Subject - Property - Object
Neo4j: Node 1 - Relationship - Node 2
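The correspondence between the two models can be sketched with a tiny value class: one statement is a start node, a relationship type, and an end node. This is a plain-Java illustration, not an RDF library or the Neo4j API; `TripleSketch`, `objectsOf`, and the `StartupX` node are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** One statement modelled as node - relationship - node (illustration only). */
final class TripleSketch {
    final String subject;      // RDF subject, or the start node
    final String relationship; // RDF property, or the relationship type
    final String object;       // RDF object, or the end node

    TripleSketch(String subject, String relationship, String object) {
        this.subject = subject;
        this.relationship = relationship;
        this.object = object;
    }

    /** Returns the objects of every statement with the given subject and relationship. */
    static List<String> objectsOf(List<TripleSketch> statements, String subject, String relationship) {
        List<String> result = new ArrayList<>();
        for (TripleSketch t : statements) {
            if (t.subject.equals(subject) && t.relationship.equals(relationship)) {
                result.add(t.object);
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return subject + " -[" + relationship + "]-> " + object;
    }

    public static void main(String[] args) {
        List<TripleSketch> statements = new ArrayList<>();
        statements.add(new TripleSketch("Sequoia Capital", "FUNDED", "Google"));
        statements.add(new TripleSketch("Google", "ACQUIRED", "StartupX")); // hypothetical
        System.out.println(objectsOf(statements, "Sequoia Capital", "FUNDED"));
    }
}
```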

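Statement: “Sequoia Capital funded Google”

Initialize the database:
    GraphDatabaseService graphDb = new EmbeddedGraphDatabase("SEC");
    IndexService index = new LuceneIndexService(graphDb);

Create the nodes:
    // In Neo4j 1.x, writes must run inside a transaction
    // (graphDb.beginTx() ... tx.success(); tx.finish()).
    Node sequoia = graphDb.createNode();
    sequoia.setProperty("name", "Sequoia Capital");
    Node google = graphDb.createNode();
    google.setProperty("name", "Google");
    // Index the node by name so it can be looked up later.
    index.index(sequoia, "name", "Sequoia Capital");

Create the relationship:
    // FUNDED is a constant of the InvestorRelationTypes enum used in the traversal code.
    Relationship rel = sequoia.createRelationshipTo(google, InvestorRelationTypes.FUNDED);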

Traverse the graph:
    Traverser traverser = node.traverse(
        Order.DEPTH_FIRST,
        StopEvaluator.END_OF_NETWORK,
        new ReturnableEvaluator() {
            public boolean isReturnableNode(TraversalPosition currentPosition) {
                Relationship last = currentPosition.lastRelationshipTraversed();
                // The start node has no last relationship, so guard against null.
                // isType() is used because getType() may not return the enum constant itself.
                return last != null && last.isType(InvestorRelationTypes.FUNDED);
            }
        },
        InvestorRelationTypes.BOARD, Direction.INCOMING,
        InvestorRelationTypes.FUNDED, Direction.INCOMING,
        InvestorRelationTypes.ACQUIRED, Direction.OUTGOING);
    return traverser.getAllNodes();
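The logic of that traversal can be sketched without Neo4j: walk depth-first, follow BOARD and FUNDED edges against their direction and ACQUIRED edges along it, and return only the nodes that were reached through a FUNDED edge. This is a stdlib-only illustration under those assumptions; `TraversalSketch`, `fundersReachableFrom`, and the nodes `StartupX`, `FundY` and `PersonZ` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Depth-first traversal with a "returnable" filter (illustration only, not the Neo4j API). */
final class TraversalSketch {
    enum RelType { BOARD, FUNDED, ACQUIRED }

    /** A directed, typed edge: from -[type]-> to. */
    static final class Edge {
        final String from;
        final RelType type;
        final String to;

        Edge(String from, RelType type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    void relate(String from, RelType type, String to) {
        edges.add(new Edge(from, type, to));
    }

    /** Returns the nodes reached through a FUNDED edge, in visiting order. */
    Set<String> fundersReachableFrom(String start) {
        Set<String> visited = new HashSet<>();
        Set<String> returned = new LinkedHashSet<>();
        visit(start, null, visited, returned);
        return returned;
    }

    private void visit(String node, RelType arrivedBy, Set<String> visited, Set<String> returned) {
        if (!visited.add(node)) {
            return; // each node is visited at most once
        }
        if (arrivedBy == RelType.FUNDED) {
            returned.add(node); // the "returnable" test
        }
        for (Edge e : edges) {
            if (e.type == RelType.ACQUIRED && e.from.equals(node)) {
                visit(e.to, e.type, visited, returned);   // outgoing ACQUIRED
            } else if (e.type != RelType.ACQUIRED && e.to.equals(node)) {
                visit(e.from, e.type, visited, returned); // incoming BOARD or FUNDED
            }
        }
    }

    public static void main(String[] args) {
        TraversalSketch graph = new TraversalSketch();
        graph.relate("Sequoia Capital", RelType.FUNDED, "Google");
        graph.relate("Google", RelType.ACQUIRED, "StartupX"); // hypothetical
        graph.relate("FundY", RelType.FUNDED, "StartupX");    // hypothetical
        graph.relate("PersonZ", RelType.BOARD, "Google");     // hypothetical
        System.out.println(graph.fundersReachableFrom("Google"));
    }
}
```

Starting from Google, the walk returns both Google's own funders and the funders of the companies it acquired, which is the set of VC firms the experiment intersects.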

Weka
- Create attributes (table input)
- Create a dataset for learning
- Build a predictive model
- Evaluate the quality of the model
- Predict the rank based on input
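The five steps above can be sketched end to end in plain Java. Note the swap: this sketch uses a 1-nearest-neighbour predictor as a stand-in, not Weka and not the classifier used in the slides, and every number in it is made up for illustration. The class `WorkflowSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** The workflow steps with a 1-nearest-neighbour stand-in (illustration only, not Weka). */
final class WorkflowSketch {
    // Step 1: attributes are the column names; the last one is the value to predict.
    static final String[] ATTRIBUTES = {"path", "similarity", "probability"};

    // Step 2: the dataset is a list of rows, one value per attribute.
    private final List<double[]> trainingRows = new ArrayList<>();

    void addTrainingRow(double path, double similarity, double probability) {
        trainingRows.add(new double[] {path, similarity, probability});
    }

    // Steps 3 and 5: for nearest neighbour, "building" the model is just keeping
    // the rows; prediction returns the target of the closest training row.
    double predict(double path, double similarity) {
        if (trainingRows.isEmpty()) {
            throw new IllegalStateException("no training data");
        }
        double bestDistance = Double.POSITIVE_INFINITY;
        double bestTarget = Double.NaN;
        for (double[] row : trainingRows) {
            double dp = row[0] - path;
            double ds = row[1] - similarity;
            double distance = dp * dp + ds * ds; // squared Euclidean distance
            if (distance < bestDistance) {
                bestDistance = distance;
                bestTarget = row[2];
            }
        }
        return bestTarget;
    }

    // Step 4: mean absolute error of the predictions on labelled rows.
    double meanAbsoluteError(List<double[]> labelledRows) {
        if (labelledRows.isEmpty()) {
            throw new IllegalArgumentException("no rows to evaluate");
        }
        double total = 0.0;
        for (double[] row : labelledRows) {
            total += Math.abs(predict(row[0], row[1]) - row[2]);
        }
        return total / labelledRows.size();
    }

    public static void main(String[] args) {
        WorkflowSketch workflow = new WorkflowSketch();
        workflow.addTrainingRow(1, 0.9, 0.8); // made-up values
        workflow.addTrainingRow(4, 0.2, 0.1); // made-up values
        System.out.println(workflow.predict(2, 0.8));
    }
}
```

Evaluating on rows that were not used for training gives a more honest error estimate than evaluating on the training rows themselves.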

Basic terms in Weka

Dataset – A set of data items, the dataset, is a very basic concept of machine learning. A dataset is roughly equivalent to a two-dimensional spreadsheet or database table. In Weka a dataset is a collection of Instances.

Attribute – Each instance consists of attributes.

Figure: “Path to Google”

Instance – A dataset consists of instances.
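The spreadsheet analogy can be sketched as a small table class: attribute names are the columns and each instance is a row with one value per attribute. This is a stdlib-only illustration, not Weka's own classes; `DatasetSketch` and its method names are hypothetical, and the values in `main` are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A dataset as a table: attributes are columns, instances are rows (illustration only). */
final class DatasetSketch {
    private final String name;
    private final List<String> attributeNames;
    private final List<double[]> instances = new ArrayList<>();

    DatasetSketch(String name, String... attributeNames) {
        this.name = name;
        this.attributeNames = Arrays.asList(attributeNames.clone());
    }

    String name() {
        return name;
    }

    /** Adds one instance; it must supply exactly one value per attribute. */
    void add(double... values) {
        if (values.length != attributeNames.size()) {
            throw new IllegalArgumentException(
                "expected " + attributeNames.size() + " values, got " + values.length);
        }
        instances.add(values.clone());
    }

    int numInstances() {
        return instances.size();
    }

    int numAttributes() {
        return attributeNames.size();
    }

    /** Looks up one cell by row index and attribute name, like a spreadsheet. */
    double value(int instanceIndex, String attributeName) {
        int column = attributeNames.indexOf(attributeName);
        if (column < 0) {
            throw new IllegalArgumentException("unknown attribute: " + attributeName);
        }
        return instances.get(instanceIndex)[column];
    }

    public static void main(String[] args) {
        DatasetSketch dataset = new DatasetSketch("VC", "path", "category", "similarity", "probability");
        dataset.add(2, 1, 0.7, 0.4); // made-up values
        System.out.println(dataset.numInstances() + " instance(s), " + dataset.numAttributes() + " attributes");
    }
}
```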

Example: Attributes

1) Create attributes:
    Attribute pathAttribute = new Attribute("path");
    Attribute categoryAttribute = new Attribute("category");
    Attribute similarityAttribute = new Attribute("similarity");
    Attribute probabilityAttribute = new Attribute("probability");

In Weka a vector is the container for attributes:
    FastVector allAttributes = new FastVector(4);
    allAttributes.addElement(pathAttribute);
    allAttributes.addElement(categoryAttribute);
    // All four attributes must be added, since each instance carries four values.
    allAttributes.addElement(similarityAttribute);
    allAttributes.addElement(probabilityAttribute);

2) Create the dataset: an Instance is a “container” of attribute values, and the dataset is a container of Instances.
    Instances trainingDataSet = new Instances("VC", allAttributes, 17);
    // The last attribute (probability) is the value to be predicted.
    trainingDataSet.setClassIndex(3);

For each instance we set the values to be trained upon:
    Instance instance = new Instance(4);
    instance.setDataset(trainingDataSet);
    instance.setValue(0, path);
    instance.setValue(1, category);
    instance.setValue(2, similarity);
    instance.setValue(3, rank);
    trainingDataSet.add(instance);

3) Train the classifier and evaluate:
    RBFNetwork rbfLearner = new RBFNetwork();
    rbfLearner.setNumClusters(17);
    rbfLearner.buildClassifier(trainingDataSet);
    // This evaluates on the training data itself, so the result is optimistic.
    Evaluation learningSetEvaluation = new Evaluation(trainingDataSet);
    learningSetEvaluation.evaluateModel(rbfLearner, trainingDataSet);

4) Predict unknown cases:
    Instance testInstance = new Instance(4);
    testInstance.setDataset(trainingDataSet);
    testInstance.setValue(0, path);
    testInstance.setValue(1, category);
    testInstance.setValue(2, similarity);
    // Placeholder: this is the value the model predicts.
    testInstance.setValue(3, 0);
    double prediction = rbfLearner.classifyInstance(testInstance);
