The document describes a scalable multi-threaded algorithm for community detection in social networks. It presents a parallel agglomerative method that avoids performance issues with sequential approaches. The method uses a maximal matching to merge communities in parallel while balancing computation. It can optimize various metrics like modularity or conductance. The implementation uses basic compressed data structures and primitives for scoring, matching, and contracting communities. Testing on moderate and large graphs showed the method achieving peak processing rates of up to 6.54 billion edges/second on an Intel server.
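The score, match, and contract primitives mentioned above can be illustrated with a small sequential sketch. It is not the authors' multi-threaded implementation. The function name `agglomerate_once`, the dictionary-based graph layout, and the greedy sequential matching are assumptions made for illustration, and modularity is used as the example metric.

```python
from collections import defaultdict

def agglomerate_once(edges, internal):
    """One score/match/contract pass (illustrative, sequential).

    edges:    {(u, v): weight} with u < v, edges between communities
    internal: {c: weight of edges already inside community c}
    """
    m = sum(edges.values()) + sum(internal.values())  # total edge weight
    vol = defaultdict(float)                          # community volumes
    for c, w in internal.items():
        vol[c] += 2.0 * w
    for (u, v), w in edges.items():
        vol[u] += w
        vol[v] += w

    # Score: change in modularity if the two endpoint communities merge.
    scores = {(u, v): w / m - vol[u] * vol[v] / (2.0 * m * m)
              for (u, v), w in edges.items()}

    # Match: greedy maximal matching over positively scored edges
    # (a sequential stand-in for a parallel matching).
    partner = {}
    for (u, v), s in sorted(scores.items(), key=lambda kv: -kv[1]):
        if s > 0 and u not in partner and v not in partner:
            partner[u] = v
            partner[v] = u

    def label(c):
        # A merged pair takes the smaller of its two ids.
        return min(c, partner.get(c, c))

    # Contract: relabel communities and accumulate edge weights.
    new_edges = defaultdict(float)
    new_internal = defaultdict(float)
    for c, w in internal.items():
        new_internal[label(c)] += w
    for (u, v), w in edges.items():
        a, b = label(u), label(v)
        if a == b:
            new_internal[a] += w
        else:
            new_edges[(min(a, b), max(a, b))] += w
    return dict(new_edges), dict(new_internal)

# Two triangles joined by one bridge edge, every vertex its own community.
toy_edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
             (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0, (2, 3): 1.0}
toy_internal = {v: 0.0 for v in range(6)}
contracted_edges, contracted_internal = agglomerate_once(toy_edges, toy_internal)
```

Repeating the pass on its own output until no edge has a positive score gives a simple agglomerative loop. Total edge weight is preserved by each contraction.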
Graph Community Detection Algorithm for Distributed Memory Parallel Computing... (Alexander Pozdneev)
Community detection is an important problem that spans many research areas, such as social networks, systems biology, and power grid optimization. Fine-grained communication and irregular access patterns to memory and interconnect limit the overall scalability and performance of existing algorithms. This talk presents a highly scalable parallel algorithm for distributed memory systems. The method employs a novel implementation strategy to store and process dynamic graphs. The scalability analysis of the algorithm was performed using two massively parallel machines, Blue Gene/Q and Power7-IH, for graphs with up to hundreds of billions of edges. Leveraging the convergence properties of the algorithm and the efficient implementation, it is possible to analyze communities of large-scale graphs in just a few seconds. The talk is based on a paper accepted for publication in the IPDPS-2015 conference proceedings that was kindly provided by Dr. Fabrizio Petrini (IBM Research).
Physics Inspired Approaches to Community Detection (fuzzysphere)
Community structure is one of the most relevant features of graphs in sociology, biology, computer science and so on. These slides review the following methods for community detection: (1) synchronization, and (2) spinglass.
References
[1] A. Arenas, A. D. Guilera, C. J. P. Vicente, Phys. Rev. Lett. 96, 114102 (2006) [arXiv:cond-mat/0511730]
[2] P. Ronhovde, Z. Nussinov, Phys. Rev. E 81, 046114 (2010) [arXiv:0803.2548]
[3] S.Fortunato, Phys. Rep. 486, 74 (2010) [arXiv:0906.0612]
Human mobility, urban structure analysis, and spatial community detection from ... (Song Gao)
In the age of Big Data, the widespread use of location-aware devices has made it possible to collect spatio-temporal individual trajectory datasets for analyzing human activity patterns in both physical space and cyberspace. Aggregation of such data can also support urban computing studies and the understanding of urban dynamics and spatial networks. Urban managers can use the research results to understand the dynamic spatial interaction patterns between different parts of the city in real time, and the results may guide them in planning optimized transportation infrastructure based on projected demand.
This work studies the large-scale ontology FOAF (Friend of a Friend). It was carried out as part of the postgraduate programme in "Web Science" of the Department of Mathematics of the Aristotle University of Thessaloniki.
Big data matrix factorizations and Overlapping community detection in graphs (David Gleich)
In a talk at the Chinese Academy of Sciences Institute for Automation, I discuss some of the MapReduce and community detection methods I've worked on.
Finding Emerging Topics Using Chaos and Community Detection in Social Media G... (Paragon_Science_Inc)
In this talk, we describe our recent work in the analysis of Twitter-based network graphs, including the Ebola crisis in 2014 and the stock market in 2015.
Community detection in graphs with NetworKit (Benj Pettit)
This is a "lightning talk" I gave at the 22nd PyData London meetup on 5 April 2016. The accompanying demonstration code is at https://github.com/benjpettit/networkit-demo
Clustering Methods and Community Detection with NetworkX. A slide deck for the NTU Complexity Science Winter School.
For the accompanying iPython Notebook, visit: http://github.com/eflegara/NetStruc
Community detection from a computational social science perspective (Davide Bennato)
This is the talk I gave at the Lipari Summer School on Computational Social Science, 2014. What are the sociological strategies to detect communities in social media? How can we define a community from a computational social science point of view?
In many scientific areas, systems can be described as interaction networks where elements correspond to vertices and interactions to edges. A variety of problems in those fields can deal with network comparison and characterization.
The problem of comparing and characterizing networks is the task of measuring their structural similarity and finding characteristics which capture structural information. In order to analyze complex networks, several methods can be combined, such as graph theory, information theory, and statistics.
In this project, we present methods for measuring Shannon’s entropy of graphs.
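The summary does not say which entropy measures the project uses. As one common and simple possibility, the sketch below computes the Shannon entropy of a graph's degree distribution. The function name `degree_entropy` and the edge-list input format are assumptions for illustration.

```python
import math
from collections import Counter

def degree_entropy(edges):
    """Shannon entropy (in bits) of the degree distribution of an
    undirected graph given as a list of (u, v) edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                      # vertices that appear in some edge
    histogram = Counter(degree.values()) # how many vertices have each degree
    return -sum((count / n) * math.log2(count / n)
                for count in histogram.values())

# A 4-cycle is regular (every degree is 2), so its degree entropy is zero.
cycle_entropy = degree_entropy([(0, 1), (1, 2), (2, 3), (3, 0)])
# A star with three leaves has degrees {3, 1, 1, 1}.
star_entropy = degree_entropy([(0, 1), (0, 2), (0, 3)])
```

A regular graph scores zero because every vertex has the same degree. Graphs with more varied degrees score higher.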
Quick introduction to community detection.
Structural properties of real world networks, definition of "communities", fundamental techniques and evaluation measures.
2013 NodeXL Social Media Network Analysis (Marc Smith)
Social media network analysis and visualization with NodeXL - the network overview discovery and exploration add-in for Excel. Map Twitter, Facebook, email, blogs, and the web with a point and click interface within the familiar spreadsheet.
High-performance graph analysis is unlocking knowledge in computer security, bioinformatics, social networks, and many other data integration areas. Graphs provide a convenient abstraction for many data problems beyond linear algebra. Some problems map directly to linear algebra. Others, like community detection, look eerily similar to sparse linear algebra techniques. And then there are algorithms that strongly resist attempts at making them look like linear algebra. This talk will cover recent results with an emphasis on streaming graph problems where the graph changes and results need to be updated with minimal latency. We'll also touch on issues of sensitivity and reliability where graph analysis needs to learn from numerical analysis and linear algebra.
LSS'11: Charting Collections Of Connections In Social Media (Local Social Summit)
Keynote Title: Charting Collections of Connections in Social Media: Creating Maps and Measures with NodeXL
Abstract: Networks are a data structure commonly found across all social media services that allow populations to author collections of connections. The Social Media Research Foundation's NodeXL project makes analysis of social media networks accessible to most users of the Excel spreadsheet application. With NodeXL, networks become as easy to create as pie charts. Applying the tool to a range of social media networks has already revealed the variations present in online social spaces. A review of the tool and images of Twitter, flickr, YouTube, and email networks will be presented.
Slides for a talk at ConTech 2011, the International Symposium on Convergence Technology (Smart & Humane World), on November 3rd in Seoul, South Korea.
Date: 2011 November 3 (Thurs)
Place: COEX Grand Ballroom, Seoul, Korea
Organized by Advanced Institutes of Convergence Technologies (AICT), Seoul National University (SNU)
In Cooperation with Ministry of Knowledge Economy, Ministry of Education, Science and Technology, National Research Foundation of Korea, Graduate School of Convergence Science and Technology (GSCST)
2010 june - personal democracy forum - marc smith - mapping political socia... (Marc Smith)
Marc Smith's presentation to the Personal Democracy Forum 2010 in New York City on June 4th, 2010 about the use of NodeXL, a social media network analysis tool, to map political topics in services like Twitter.
NodeXL is available from http://nodexl.codeplex.com
1. Basics of Social Networks
2. Real-world problem
3. How to construct a graph from a real-world problem?
4. Which graph theory problem do we get from the real-world problem?
5. Graph type of Social Networks
6. Special properties in social graph
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret graph solution back to real-world problem?
Community Detection in Networks Using Page Rank Vectors (ijbbjournal)
Nodes in real-world networks organize into network communities. A community (also referred to as a module or cluster) is a group of nodes in which links are denser among the group's nodes and sparser to the rest of the network. Communities can also overlap, because a node may belong to several clusters at once. Detecting communities in networks remains an open problem because reliable algorithms are lacking. In practice, all the existing community detection methods work well for non-overlapping communities but fail to detect communities with dense overlaps. We developed a novel method for detecting communities starting from a single seed node. This method successfully captures overlapping communities in networks ranging from social to information and from biological to citation networks. We believe that the proposed system works well for overlapping communities.
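The abstract does not spell out the algorithm, so the following is only a generic sketch of seed-based detection with PageRank vectors, not the authors' method. It computes a personalized PageRank vector from one seed by power iteration, then performs a sweep over degree-normalized scores and keeps the prefix with the lowest conductance. The function names, the adjacency-dictionary input, and the parameters `alpha` and `iters` are assumptions. The graph is assumed undirected, without self-loops or isolated vertices.

```python
import math

def personalized_pagerank(adj, seed, alpha=0.85, iters=100):
    """Power iteration for PageRank with all teleportation to `seed`.
    adj maps each vertex to a list of its neighbours (undirected)."""
    p = {v: 0.0 for v in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += 1.0 - alpha              # teleport back to the seed
        for u, nbrs in adj.items():
            share = alpha * p[u] / len(nbrs)  # spread mass along edges
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p

def sweep_cut(adj, p):
    """Order vertices by p[v] / degree(v) and return the prefix set
    with the smallest conductance, together with that conductance."""
    order = sorted(adj, key=lambda v: p[v] / len(adj[v]), reverse=True)
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    in_set, vol, cut = set(), 0, 0
    best_set, best_phi = None, math.inf
    for v in order[:-1]:                      # skip the trivial full set
        in_set.add(v)
        vol += len(adj[v])
        inside = sum(1 for w in adj[v] if w in in_set)
        cut += len(adj[v]) - 2 * inside       # edges leaving the set
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_set, best_phi = set(in_set), phi
    return best_set, best_phi

# Two triangles joined by the bridge 2-3; the seed is vertex 0.
toy_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
scores = personalized_pagerank(toy_adj, seed=0)
community, conductance = sweep_cut(toy_adj, scores)
```

Running the procedure from several seeds yields sets that may share vertices. That is one way a seed-based approach can accommodate overlapping communities.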
Similar to MTAAP12: Scalable Community Detection (20)
Reproducible Linear Algebra from Application to Architecture (Jason Riedy)
All computing must be parallel to take advantage of modern systems like multicore processors, GPUs, and distributed systems. Results that are not bit-wise reproducible introduce doubt on many levels. Sometimes that is appropriate. Reproducibility limitations occur because underlying libraries do not specify their reproducibility requirements. New advances in interfaces, algorithms, and architectures allow selecting among those requirements in the future. This talk covers many of the upcoming options and their trade-offs.
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A... (Jason Riedy)
The Rogues Gallery is a new experimental testbed that is focused on tackling "rogue" architectures for the Post-Moore era of computing. While some of these devices have roots in the embedded and high-performance computing spaces, managing current and emerging technologies poses challenges for system administration that are not always foreseen in traditional data center environments.
We present an overview of the motivations and design of the initial Rogues Gallery testbed and cover some of the unique challenges that we have seen and foresee with upcoming hardware prototypes for future post-Moore research. Specifically, we cover the networking, identity management, scheduling of resources, and tools and sensor access aspects of the Rogues Gallery and techniques we have developed to manage these new platforms. We argue that current tools like the Slurm resource manager can support new rogues without major infrastructure changes.
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture (Jason Riedy)
All computing must be parallel to take advantage of modern systems like multicore processors, GPUs, and distributed systems. Results that are not bit-wise reproducible introduce doubt on many levels. Sometimes that is appropriate. Reproducibility limitations occur because underlying libraries do not specify their reproducibility requirements. New advances in interfaces, algorithms, and architectures allow selecting among those requirements in the future. This talk covers many of the upcoming options and their trade-offs.
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis (Jason Riedy)
Applications in many areas analyze an ever-changing environment. On billion-vertex graphs, providing snapshots imposes a large performance cost. We propose the first formal model for graph analysis running concurrently with streaming data updates. We consider an algorithm valid if its output is correct for the initial graph plus some implicit subset of concurrent changes. We show theoretical properties of the model, demonstrate the model on various algorithms, and extend it to updating results incrementally.
In one classic sense a rogue is someone who goes their own way, who breaks away from the crowd. The CRNCH Rogues Gallery aims to support computer architecture rogues by being a physical and virtual space providing access to novel computing architectures. Researchers find applications, and architects discover what happens when their prototypes hit reality. Our goals are to help kick-start software ecosystems, train students in novel system evaluation and use, and provide rapid feedback to architects. By exposing students and researchers to this set of unique hardware, we foster cross-cutting discussions about hardware designs that will drive future performance improvements in computing long after the Moore’s Law era of “cheap transistors” ends. We provide a brief description of the current Rogues Gallery along with successes and research highlights over the last year.
Augmented Arithmetic Operations Proposed for IEEE-754 2018 (Jason Riedy)
Algorithms for extending arithmetic precision through compensated summation or arithmetics like double-double rely on operations commonly called twoSum and twoProduct. The current draft of the IEEE 754 standard specifies these operations under the names augmentedAddition and augmentedMultiplication. These operations were included after three decades of experience because of a motivating new use: bitwise reproducible arithmetic. Standardizing the operations provides a hardware acceleration target that can provide at least a 33% speed improvement in reproducible dot product, placing reproducible dot product almost within a factor of two of common dot product. This paper provides history and motivation for standardizing these operations. We also define the operations, explain the rationale for all the specific choices, and provide parameterized test cases for new boundary behaviors.
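For readers unfamiliar with twoSum, here is a minimal software sketch of the classic branch-free version (Knuth's algorithm), together with the compensated summation it enables. It assumes Python floats are IEEE binary64 with round-to-nearest and that no overflow occurs. The standardized augmentedAddition differs in details such as tie handling, so this illustrates the idea rather than the standard's operation. The names `two_sum` and `compensated_sum` are chosen here for illustration.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and e the exact rounding error,
    so that a + b == s + e holds exactly (barring overflow)."""
    s = a + b
    b_virtual = s - a            # the part of b that made it into s
    a_virtual = s - b_virtual    # the part of a that made it into s
    e = (a - a_virtual) + (b - b_virtual)
    return s, e

def compensated_sum(values):
    """Sum floats while accumulating the rounding errors separately."""
    total, error = 0.0, 0.0
    for x in values:
        total, e = two_sum(total, x)
        error += e
    return total + error

# 1e-16 is below half an ulp of 1.0, so a plain addition drops it;
# two_sum returns it as the error term instead.
s, e = two_sum(1.0, 1e-16)
naive = sum([1.0] + [1e-16] * 10)
compensated = compensated_sum([1.0] + [1e-16] * 10)
```

In this example the plain sum stays at 1.0, while the compensated sum recovers the accumulated small terms.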
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms (Jason Riedy)
The Rogues Gallery is a new concept focused on developing our understanding of next-generation hardware with a focus on unorthodox and uncommon technologies. This project, initiated by Georgia Tech's Center for Research into Novel Computing Hierarchies (CRNCH), will acquire new and unique hardware (i.e., the aforementioned "rogues") from vendors, research labs, and startups and make this hardware available to students, faculty, and industry collaborators within a managed data center environment. By exposing students and researchers to this set of unique hardware, we hope to foster cross-cutting discussions about hardware designs that will drive future performance improvements in computing long after the Moore's Law era of "cheap transistors" ends.
A New Algorithm Model for Massive-Scale Streaming Graph Analysis (Jason Riedy)
Applications in computer network security, social media analysis, and other areas rely on analyzing a changing environment. The data is rich in relationships and lends itself to graph analysis. Traditional static graph analysis cannot keep pace with network security applications analyzing nearly one million events per second and social networks like Facebook collecting 500 thousand comments per second. Streaming frameworks like STINGER support ingesting up to three million edge changes per second, but there are few streaming analysis kernels that keep up with these rates. Here we present a new algorithm model for applying complex metrics to a changing graph. In this model, many more algorithms can be applied without having to stop the world.
High-Performance Analysis of Streaming Graphs (Jason Riedy)
Graph-structured data in social networks, finance, network security, and others not only are massive but also under continual change. These changes often are scattered across the graph. Stopping the world to run a single, static query is infeasible. Repeating complex global analyses on massive snapshots to capture only what has changed is inefficient. We discuss requirements for single-shot queries on changing graphs as well as recent high-performance algorithms that update rather than recompute results. These algorithms are incorporated into our software framework for streaming graph analysis, STINGER.
High-Performance Analysis of Streaming Graphs (Jason Riedy)
Graph-structured data in social networks, finance, network security, and others not only are massive but also under continual change. These changes often are scattered across the graph. Stopping the world to run a single, static query is infeasible. Repeating complex global analyses on massive snapshots to capture only what has changed is inefficient. We discuss requirements for single-shot queries on changing graphs as well as recent high-performance algorithms that update rather than recompute results. These algorithms are incorporated into our software framework for streaming graph analysis, STING (Spatio-Temporal Interaction Networks and Graphs).
Algorithm for efficiently and accurately updating PageRank as the graph changes from a stream of updates. Also includes needs from the upcoming GraphBLAS to support high-performance streaming graph analysis.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! (SOFTTECHHUB)
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI (Vladimir Iglovikov, Ph.D.)
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
UiPath Test Automation using UiPath Test Suite series, part 5 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
GridMate - End to end testing is a critical piece to ensure quality and avoid... (ThomasParaiso2)
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
UiPath Test Automation using UiPath Test Suite series, part 6 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
National Security Agency - NSA mobile device best practices
MTAAP12: Scalable Community Detection
1. Scalable Multi-threaded Community Detection in Social Networks
E. Jason Riedy (1), David A. Bader (1), and Henning Meyerhenke (2)
(1) School of Comp. Science and Engineering, Georgia Inst. of Technology
(2) Inst. of Theoretical Informatics, Karlsruhe Inst. of Technology (KIT)
25 May 2012
2. Exascale data analysis
Health care: Finding outbreaks, population epidemiology
Social networks: Advertising, searching, grouping
Intelligence: Decisions at scale, regulating algorithms
Systems biology: Understanding interactions, drug design
Power grid: Disruptions, conservation
Simulation: Discrete events, cracking meshes
• Graph clustering is common in all application areas.
MTAAP 2012—Scalable Community Detection—Jason Riedy 2/35
3. These are not easy graphs.
Yifan Hu’s (AT&T) visualization of the in-2004 data set
http://www2.research.att.com/~yifanhu/gallery.html
4. But no shortage of structure...
Protein interactions, Giot et al., “A Protein Interaction Map of Drosophila melanogaster”, Science 302, 1722-1736, 2003.
Jason’s network via LinkedIn Labs
• Locally, there are clusters or communities.
• First pass over a massive social graph:
• Find smaller communities of interest.
• Analyze / visualize top-ranked communities.
• Our part: Community detection at massive scale. (Or kinda
large, given available data.)
5. Outline
Motivation
Defining community detection and metrics
Shooting for massive graphs
Our parallel method
Implementation and platform details
Performance
Conclusions and plans
6. Community detection
What do we mean?
• Partition a graph’s vertices into disjoint communities.
• A community locally maximizes some metric.
• Modularity, conductance, ...
• Trying to capture that vertices are more similar within one community than between communities.
Image: Jason’s network via LinkedIn Labs
7. Community detection
Assumptions
• Disjoint partitioning of vertices.
• There is no one unique answer.
• Many metrics are NP-complete to optimize (Brandes, et al. [1]).
• Graph is lossy representation.
• Want an adaptable detection method.
Image: Jason’s network via LinkedIn Labs
8. Common community metric: Modularity
• Modularity: Deviation of connectivity in the community
induced by a vertex set S from some expected background
model of connectivity.
• We take Newman [2]’s basic uniform model.
• Let $m$ count all edges in graph $G$, $m_S$ count the edges with both endpoints in $S$, and $x_S$ count the edges with any endpoint in $S$. Modularity $Q_S$:
$Q_S = (m_S - x_S^2 / (4m)) / m$
• Total modularity: sum of modularities of disjoint subsets.
• A sufficiently positive modularity implies some structure.
• Known issues: Resolution limit, NP-complete opt. prob.
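The modularity formula above can be checked on a toy graph. The sketch below is illustrative only: the function names are hypothetical, the graph is unweighted, and it assumes that x_S counts edge endpoints falling in S (the total degree of S), which is the reading under which the formula matches Newman's standard modularity.

```python
def community_modularity(edges, community):
    """Q_S = (m_S - x_S^2 / (4 m)) / m for one vertex set S."""
    members = set(community)
    m = len(edges)  # m: all edges in the graph
    # m_S: edges with both endpoints inside S.
    m_s = sum(1 for u, v in edges if u in members and v in members)
    # x_S (assumed reading): edge endpoints that fall in S, i.e. total degree of S.
    x_s = sum((u in members) + (v in members) for u, v in edges)
    return (m_s - x_s * x_s / (4.0 * m)) / m


def total_modularity(edges, communities):
    """Total modularity: sum of the modularities of disjoint vertex sets."""
    return sum(community_modularity(edges, s) for s in communities)


# Two triangles joined by a single bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(total_modularity(EDGES, [{0, 1, 2}, {3, 4, 5}]))  # 5/14, about 0.357
print(total_modularity(EDGES, [{0, 1, 2, 3, 4, 5}]))    # 0.0
```

Splitting the graph into its two triangles scores about 0.357, while lumping everything into one community scores zero, which matches the intuition that positive modularity signals structure.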
9. Can we tackle massive graphs now?
Parallel, of course...
• Massive needs distributed memory, right?
• Well... Not really. Can buy a 2 TiB Intel-based Dell server
on-line for around $200k USD, a 1.5 TiB from IBM, etc.
Image: dell.com.
Not an endorsement, just evidence!
• Publicly available “real-world” data fits...
• Start with shared memory to see what needs done.
• Specialized architectures provide larger shared-memory views
over distributed implementations (e.g. Cray XMT).
10. Multi-threaded algorithm design points
A scalable multi-threaded graph analysis algorithm
• ... avoids global locks and frequent global synchronization.
• ... distributes computation over edges rather than only vertices.
• ... works with data as local to an edge as possible.
• ... uses compact data structures that agglomerate memory
references.
11. Sequential agglomerative method
[Figure: example graph on vertices A to G]
• A common method (e.g. Clauset, et al. [3]) agglomerates vertices into communities.
• Each vertex begins in its own community.
• An edge is chosen to contract.
• Merging maximally increases modularity.
• Priority queue.
• Known often to fall into an O(n^2) performance trap with modularity (Wakita & Tsurumi [4]).
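A minimal sketch of the sequential scheme, under stated assumptions: it is not the Clauset et al. implementation, the function name is hypothetical, and a plain scan over candidate edges stands in for the priority queue. The gain of merging communities a and b with w edges between them is taken as w/m - x_a x_b / (2 m^2), which follows from the modularity definition when x is read as total degree.

```python
def sequential_agglomerate(edges, num_vertices):
    """Greedy one-edge-at-a-time agglomeration; returns a community label per vertex."""
    m = len(edges)
    label = list(range(num_vertices))   # each vertex begins in its own community
    volume = [0] * num_vertices         # total degree per community
    between = {}                        # (a, b) with a < b -> edges between a and b
    for u, v in edges:
        volume[u] += 1
        volume[v] += 1
        if u != v:
            key = (min(u, v), max(u, v))
            between[key] = between.get(key, 0) + 1

    def gain(item):
        (a, b), w = item
        return w / m - volume[a] * volume[b] / (2.0 * m * m)

    while between:
        # Stand-in for the priority queue: scan for the best edge to contract.
        best = max(between.items(), key=gain)
        if gain(best) <= 0:
            break  # no remaining merge increases modularity
        (a, b), _ = best
        # Contract community b into community a.
        volume[a] += volume[b]
        label = [a if c == b else c for c in label]
        merged = {}
        for (p, q), w in between.items():
            p = a if p == b else p
            q = a if q == b else q
            if p != q:
                key = (min(p, q), max(p, q))
                merged[key] = merged.get(key, 0) + w
        between = merged
    return label


# Two triangles joined by a bridge: the triangles come out as the two communities.
labels = sequential_agglomerate(
    [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)], 6)
print(labels)
```

Each iteration contracts exactly one edge, so the loop is inherently serial; that is the property the matching-based method replaces.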
15. Parallel agglomerative method
[Figure: example graph on vertices A to G]
• We use a matching to avoid the queue.
• Compute a heavy weight, large matching.
• Simple greedy algorithm.
• Maximal matching.
• Within factor of 2 in weight.
• Merge all matched communities at once.
• Maintains some balance.
• Produces different results.
• Agnostic to weighting, matching...
• Can maximize modularity, minimize conductance.
• Modifying matching permits easy exploration.
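To make the "within factor of 2" point concrete, here is a sequential sketch of the classic sorted greedy matching, which is maximal and has weight at least half the optimum. It is a textbook stand-in and not the parallel matcher used in the talk; the function name and the `(score, u, v)` tuple layout are assumptions for illustration.

```python
def greedy_matching(scored_edges):
    """scored_edges: iterable of (score, u, v). Returns the matched (u, v) pairs."""
    matched = set()
    matching = []
    # Visit edges from heaviest to lightest; keep an edge if both ends are free.
    for score, u, v in sorted(scored_edges, reverse=True):
        if u != v and u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching


# Path 0-1-2-3 with scores 3, 4, 3: greedy takes the middle edge (weight 4),
# while the best matching takes the two outer edges (weight 6).
print(greedy_matching([(3, 0, 1), (4, 1, 2), (3, 2, 3)]))
```

The example shows the trade-off: the greedy choice is not optimal, but every matched pair can then be merged in the same step.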
18. Platform: Cray XMT2
Tolerates latency by massive multithreading.
• Hardware: 128 threads per processor
• Context switch on every cycle (500 MHz)
• Many outstanding memory requests (180/proc)
• “No” caches...
• Flexibly supports dynamic load balancing
• Globally hashed address space, no data cache
• Support for fine-grained, word-level synchronization
• Full/empty bit on every memory word
• 64 processor XMT2 at CSCS, the Swiss National Supercomputing Centre.
• 500 MHz processors, 8192 threads, 2 TiB of shared memory
Image: cray.com
19. Platform: Intel® E7-8870-based server
Tolerates some latency by hyperthreading.
• “Westmere:” 2 threads / core, 10 cores / socket, four sockets.
• Fast cores (2.4 GHz), fast memory (1 066 MHz).
• Not so many outstanding memory requests (60/socket), but
large caches (30 MiB L3 per socket).
• Good system support
• Transparent hugepages reduce TLB costs.
• Fast, user-level locking. (HLE would be better...)
• OpenMP, although I didn’t tune it...
• mirasol, #17 on Graph500 (thanks to UCB)
• Four processors (80 threads), 256 GiB memory
• gcc 4.6.1, Linux kernel 3.2.0-rc5
Image: Intel® press kit
20. Platform: Other Intel®-based servers
Different design points
• “Nehalem” X5570: 2.93 GHz, 2 threads/core, 4 cores/socket,
2 sockets, 8 MiB cache/socket
• “Westmere” X5650: 2.66 GHz, 2 threads/core, 6 cores/socket,
2 sockets, 12 MiB cache/socket
• All with 1 066 MHz memory.
• Does the Westmere E7-8870’s scale affect performance?
• Nodes in Georgia Tech CSE cluster jinx
• 24-48 GiB memory, small tests
Image: Intel® press kit
21. Implementation: Data structures
Extremely basic for graph G = (V, E)
• An array of (i, j; w) weighted edge pairs, each i, j stored only once and packed, uses 3|E| space
• An array to store self-edges, d(i) = w, |V|
• A temporary floating-point array for scores, |E|
• Additional temporary arrays using 4|V| + 2|E| to store degrees, matching choices, offsets...
• Weights count number of agglomerated vertices or edges.
• Scoring methods (modularity, conductance) need only
vertex-local counts.
• Storing an undirected graph in a symmetric manner reduces
memory usage drastically and works with our simple matcher.
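As a sketch of why scoring needs only vertex-local counts, the function below scores every stored edge by the change in total modularity if its two endpoint communities merged. It assumes edge weights count agglomerated edges, that d(i) counts edges already inside community i, and that x is the total degree; the function name and list-based layout are hypothetical, not the talk's code.

```python
def score_edges_modularity(edges, self_weight):
    """edges: list of (i, j, w), each pair stored once; self_weight[i] = d(i)."""
    # m: total edge weight, inside communities plus between them.
    m = sum(w for _, _, w in edges) + sum(self_weight)
    # Vertex-local count: total degree (volume) of each community.
    volume = [2 * d for d in self_weight]
    for i, j, w in edges:
        volume[i] += w
        volume[j] += w
    # Merging i and j changes total modularity by w/m - x_i * x_j / (2 m^2).
    return [w / m - volume[i] * volume[j] / (2.0 * m * m) for i, j, w in edges]


# Two contracted triangles (3 internal edges each) joined by one edge:
# merging them would lower modularity, so the score is negative.
print(score_edges_modularity([(0, 1, 1)], [3, 3]))
```

Every edge score depends only on that edge's weight and its two endpoint volumes, so the loop over edges is independent per edge once the volumes are known.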
22. Implementation: Data structures
• Original ignored order in edge array, killed OpenMP.
• New: Roughly bucket edge array by first stored index.
Non-adjacent CSR-like structure.
• New: Hash i, j to determine order. Scatter among buckets.
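One way to picture the bucketed layout is a counting pass followed by a prefix sum, as sketched below. The parity rule in `order_pair` is an assumed stand-in for whatever hash the real code uses, the function names are hypothetical, and the buckets here are packed contiguously for simplicity, whereas the slide describes a non-adjacent CSR-like structure.

```python
from itertools import accumulate


def order_pair(i, j):
    """Pick which endpoint is stored first (assumed parity-based hash)."""
    lo, hi = min(i, j), max(i, j)
    return (lo, hi) if (lo + hi) % 2 == 0 else (hi, lo)


def bucket_edges(edges, num_vertices):
    """Return (offsets, bucketed); bucket v is bucketed[offsets[v]:offsets[v + 1]]."""
    ordered = [order_pair(i, j) + (w,) for i, j, w in edges]
    # Count edges per first stored index, then prefix-sum into bucket offsets.
    counts = [0] * num_vertices
    for first, _, _ in ordered:
        counts[first] += 1
    offsets = [0] + list(accumulate(counts))
    # Copy each edge into the next free slot of its bucket.
    cursor = offsets[:-1].copy()
    bucketed = [None] * len(ordered)
    for edge in ordered:
        bucketed[cursor[edge[0]]] = edge
        cursor[edge[0]] += 1
    return offsets, bucketed


offsets, bucketed = bucket_edges([(0, 1, 1), (0, 2, 1), (1, 2, 1), (2, 3, 1)], 4)
print(offsets)
print(bucketed)
```

Hashing the stored order spreads edges across buckets, so a high-degree vertex does not end up owning every one of its edges.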
23. Implementation: Routines
Three primitives: Scoring, matching, contracting
Scoring: Trivial.
Matching: Repeat until no ready, unmatched vertex:
1. For each unmatched vertex in parallel, find the best unmatched neighbor in its bucket.
2. Try to point remote match at that edge (lock, check if best, unlock).
3. If pointing succeeded, try to point self-match at that edge.
4. If both succeeded, yeah! If not and there was some eligible neighbor, re-add self to ready, unmatched list.
(Possibly too simple, but...)
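The matching loop above can be simulated sequentially in rounds. This sketch is a simplification under stated assumptions: it replaces the lock, check, unlock step with a test that both endpoints chose each other, it scans all neighbors of a vertex instead of only its bucket, and the names are hypothetical.

```python
def match_rounds(scored_edges, num_vertices):
    """scored_edges: list of (score, u, v). Returns partner[v], or None if unmatched."""
    partner = [None] * num_vertices
    neighbors = [[] for _ in range(num_vertices)]
    for score, u, v in scored_edges:
        if u != v:
            key = (score, min(u, v), max(u, v))  # same ranking seen from both ends
            neighbors[u].append((key, v))
            neighbors[v].append((key, u))
    ready = set(range(num_vertices))
    while ready:
        # Step 1: every ready vertex picks its best still-unmatched neighbor.
        choice = {}
        for v in ready:
            candidates = [(key, n) for key, n in neighbors[v] if partner[n] is None]
            if candidates:
                choice[v] = max(candidates)[1]
        # Steps 2-3: a match is made only when both ends point at the same edge.
        for v, n in choice.items():
            if choice.get(n) == v:
                partner[v] = n
        # Step 4: vertices that had an eligible neighbor but lost out try again.
        ready = {v for v in choice if partner[v] is None}
    return partner


# Path 0-1-2-3 with scores 3, 4, 3: vertices 1 and 2 match, 0 and 3 stay alone.
print(match_rounds([(3, 0, 1), (4, 1, 2), (3, 2, 3)], 4))
```

Because ties are broken the same way from both endpoints, the heaviest remaining edge is always chosen mutually, so each round matches at least one pair and the loop terminates.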
24. Implementation: Routines
Contracting:
1. Map each i, j to new vertices, re-order by hashing.
2. Accumulate counts for new i bins, prefix-sum for offset.
3. Copy into new bins.
• Only synchronizing in the prefix-sum. That could be removed if I don’t re-order the i, j pair; haven’t timed the difference.
• Actually, the current code copies twice... On short list for fixing.
• Binning as opposed to original list-chasing enabled Intel/OpenMP support with reasonable performance.
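A compact sketch of contraction, assuming a `partner` array like the one a matching step would produce. For brevity it accumulates duplicate edges in a dictionary and orders each pair by min/max, where the slides describe hashing, counting bins and a prefix sum; the function name and return layout are hypothetical.

```python
def contract(edges, self_weight, partner):
    """Merge matched pairs; return (new_edges, new_self_weight, new_id)."""
    # Step 1: map old vertices to new ones; a matched pair shares one new id.
    new_id = [None] * len(partner)
    count = 0
    for v, p in enumerate(partner):
        if new_id[v] is None:
            new_id[v] = count
            if p is not None:
                new_id[p] = count
            count += 1
    new_self = [0] * count
    for v, d in enumerate(self_weight):
        new_self[new_id[v]] += d
    # Steps 2-3, simplified: a dict accumulates weights instead of binning.
    combined = {}
    for i, j, w in edges:
        a, b = new_id[i], new_id[j]
        if a == b:
            new_self[a] += w  # a contracted edge becomes a self-edge
        else:
            key = (min(a, b), max(a, b))
            combined[key] = combined.get(key, 0) + w
    new_edges = [(a, b, w) for (a, b), w in sorted(combined.items())]
    return new_edges, new_self, new_id


# Merge pairs (0, 1) and (2, 3) of a small graph with unit weights.
print(contract([(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1)], [0, 0, 0, 0], [1, 0, 3, 2]))
```

The total weight is preserved: two edges collapse into self-edges and the remaining two combine into one edge of weight 2 between the new communities.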
32. Conclusions and plans
• Code:
http://www.cc.gatech.edu/~jriedy/community-detection/
• Some low-hanging fruit remains:
• Eliminate one unnecessary copy during contraction.
• Deal with stars.
• Then... Practical experiments.
• How volatile are modularity and conductance to perturbations?
• What matching schemes work well?
• How do different metrics compare in applications?
• Extending to streaming graph data!
• Includes developing parallel refinement...
• And possibly de-clustering or manipulating the dendrogram....
• Very much WIP, more tricky than anticipated.
34. Bibliography I
[1] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner, “On modularity clustering,” IEEE Trans. Knowledge and Data Engineering, vol. 20, no. 2, pp. 172–188, 2008.
[2] M. Newman, “Modularity and community structure in networks,” Proc. of the National Academy of Sciences, vol. 103, no. 23, pp. 8577–8582, 2006.
[3] A. Clauset, M. Newman, and C. Moore, “Finding community structure in very large networks,” Physical Review E, vol. 70, no. 6, p. 66111, 2004.
[4] K. Wakita and T. Tsurumi, “Finding community structure in mega-scale social networks,” CoRR, vol. abs/cs/0702048, 2007.
35. Bibliography II
[5] D. Chakrabarti, Y. Zhan, and C. Faloutsos, “R-MAT: A recursive model for graph mining,” in Proc. 4th SIAM Intl. Conf. on Data Mining (SDM). Orlando, FL: SIAM, Apr. 2004.
[6] D. Bader, J. Gilbert, J. Kepner, D. Koester, E. Loh, K. Madduri, W. Mann, and T. Meuse, HPCS SSCA#2 Graph Analysis Benchmark Specifications v1.1, Jul. 2005.
[7] J. Leskovec, “Stanford large network dataset collection,” At http://snap.stanford.edu/data/, Oct. 2011.
[8] P. Boldi, B. Codenotti, M. Santini, and S. Vigna, “Ubicrawler: A scalable fully distributed web crawler,” Software: Practice & Experience, vol. 34, no. 8, pp. 711–726, 2004.