ECWAY TECHNOLOGIES
IEEE PROJECTS & SOFTWARE DEVELOPMENTS
OUR OFFICES @ CHENNAI / TRICHY / KARUR / ERODE / MADURAI / SALEM / COIMBATORE
CELL: +91 98949 17187, +91 875487 2111 / 3111 / 4111 / 5111 / 6111
VISIT: www.ecwayprojects.com MAIL TO: ecwaytechnologies@gmail.com

CLUSTERING LARGE PROBABILISTIC GRAPHS

ABSTRACT:

We study the problem of clustering probabilistic graphs. Similar to the problem of clustering
standard graphs, probabilistic graph clustering has numerous applications, such as finding
complexes in probabilistic protein-protein interaction (PPI) networks and discovering groups of
users in affiliation networks.

We extend the edit-distance-based definition of graph clustering to probabilistic graphs. We
establish a connection between our objective function and correlation clustering to propose
practical approximation algorithms for our problem. A benefit of our approach is that our
objective function is parameter-free. Therefore, the number of clusters is part of the output.
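As a rough illustration of the objective described above, the sketch below scores a clustering by its expected edit distance to the probabilistic graph: a pair placed in the same cluster costs 1 - p (the edge may be absent and would have to be added), and a pair split across clusters costs p (the edge may be present and would have to be removed). It then runs a randomized pivot heuristic of the kind commonly used for correlation clustering. The abstract does not say which approximation algorithm the authors use, so the pivot rule, the 0.5 threshold, and the names expected_edit_distance, pivot_clustering, and the toy graph are all assumptions made for illustration.

```python
import itertools
import random


def expected_edit_distance(nodes, prob, cluster_of):
    """Expected number of edge edits between the probabilistic graph and
    the cluster graph induced by cluster_of (hypothetical helper)."""
    cost = 0.0
    for u, v in itertools.combinations(nodes, 2):
        p = prob.get(frozenset((u, v)), 0.0)  # missing pairs have probability 0
        if cluster_of[u] == cluster_of[v]:
            cost += 1.0 - p  # edge absent with probability 1 - p: must be added
        else:
            cost += p        # edge present with probability p: must be removed
    return cost


def pivot_clustering(nodes, prob, rng):
    """Randomized pivot heuristic (an assumed stand-in, not necessarily the
    authors' algorithm): pick a random pivot and group it with every
    remaining node whose edge to the pivot has probability above 0.5."""
    remaining = list(nodes)
    cluster_of = {}
    next_id = 0
    while remaining:
        pivot = rng.choice(remaining)
        for v in remaining:
            if v == pivot or prob.get(frozenset((pivot, v)), 0.0) > 0.5:
                cluster_of[v] = next_id
        remaining = [v for v in remaining if v not in cluster_of]
        next_id += 1
    return cluster_of


# Toy probabilistic graph (made-up probabilities, for illustration only).
nodes = ["a", "b", "c", "d", "e"]
prob = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.8,
    frozenset(("b", "c")): 0.9,
    frozenset(("d", "e")): 0.7,
    frozenset(("c", "d")): 0.1,
}

clusters = pivot_clustering(nodes, prob, random.Random(0))
print("clusters:", clusters)
print("number of clusters:", len(set(clusters.values())))
print("expected edit distance:", expected_edit_distance(nodes, prob, clusters))
```

No cluster count is passed in anywhere: the number of clusters falls out of the pivot rounds, which is the sense in which a parameter-free objective makes the number of clusters part of the output. On this toy graph every choice of pivot yields the same two groups, {a, b, c} and {d, e}.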

We develop methods for testing the statistical significance of the output clustering and study the
case of noisy clusterings. Using a real protein-protein interaction network and ground-truth data,
we show that our methods discover the correct number of clusters and identify established
protein relationships. Finally, we show the practicality of our techniques using a large social
network of Yahoo! users consisting of one billion edges.
