This document provides an overview of an introduction to NoSQL webinar. It discusses why NoSQL databases were created, the different types of NoSQL databases including key-value stores, column stores, graph stores, multi-model databases and document stores. It provides details on MongoDB, describing how MongoDB stores data as JSON-like documents with dynamic schemas and supports features like indexing, aggregation and geospatial queries. The webinar agenda is also outlined.
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
Part one an Introduction to MongoDB. Learn how easy it is to start building applications with MongoDB. This session covers key features and functionality of MongoDB and sets out the course of building an application.
Beyond the Basics 2: Aggregation Framework MongoDB
The aggregation framework is one of the most powerful analytical tools available with MongoDB.
Learn how to create a pipeline of operations that can reshape and transform your data and apply a range of analytics functions and calculations to produce summary results across a data set.
This presentation was given at the LDS Tech SORT Conference 2011 in Salt Lake City. The slides are quite comprehensive covering many topics on MongoDB. Rather than a traditional presentation, this was presented as more of a Q & A session. Topics covered include. Introduction to MongoDB, Use Cases, Schema design, High availability (replication) and Horizontal Scaling (sharding).
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
This is the first webinar of a Back to Basics series that will introduce you to the MongoDB database, what it is, why you would use it, and what you would use it for.
Back to Basics: My First MongoDB ApplicationMongoDB
This Back to Basics webinar series will introduce you to NoSQL and the MongoDB database. You will find out what MongoDB is, why you would use it, and what you would use it for.
Webinar: Working with Graph Data in MongoDBMongoDB
With the release of MongoDB 3.4, the number of applications that can take advantage of MongoDB has expanded. In this session we will look at using MongoDB for representing graphs and how graph relationships can be modeled in MongoDB.
We will also look at a new aggregation operation that we recently implemented for graph traversal and computing transitive closure. We will include an overview of the new operator and provide examples of how you can exploit this new feature in your MongoDB applications.
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
Part one an Introduction to MongoDB. Learn how easy it is to start building applications with MongoDB. This session covers key features and functionality of MongoDB and sets out the course of building an application.
Beyond the Basics 2: Aggregation Framework MongoDB
The aggregation framework is one of the most powerful analytical tools available with MongoDB.
Learn how to create a pipeline of operations that can reshape and transform your data and apply a range of analytics functions and calculations to produce summary results across a data set.
This presentation was given at the LDS Tech SORT Conference 2011 in Salt Lake City. The slides are quite comprehensive covering many topics on MongoDB. Rather than a traditional presentation, this was presented as more of a Q & A session. Topics covered include. Introduction to MongoDB, Use Cases, Schema design, High availability (replication) and Horizontal Scaling (sharding).
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
This is the first webinar of a Back to Basics series that will introduce you to the MongoDB database, what it is, why you would use it, and what you would use it for.
Back to Basics: My First MongoDB ApplicationMongoDB
This Back to Basics webinar series will introduce you to NoSQL and the MongoDB database. You will find out what MongoDB is, why you would use it, and what you would use it for.
Webinar: Working with Graph Data in MongoDBMongoDB
With the release of MongoDB 3.4, the number of applications that can take advantage of MongoDB has expanded. In this session we will look at using MongoDB for representing graphs and how graph relationships can be modeled in MongoDB.
We will also look at a new aggregation operation that we recently implemented for graph traversal and computing transitive closure. We will include an overview of the new operator and provide examples of how you can exploit this new feature in your MongoDB applications.
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
Questo è il secondo webinar della serie Back to Basics che ti offrirà un'introduzione al database MongoDB. In questo webinar ti dimostreremo come creare un'applicazione base per il blogging in MongoDB.
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
Webinar: Back to Basics: Thinking in DocumentsMongoDB
New applications, users and inputs demand new types of data, like unstructured, semi-structured and polymorphic data. Adopting MongoDB means adopting to a new, document-based data model.
While most developers have internalized the rules of thumb for designing schemas for relational databases, these rules don't apply to MongoDB. Documents can represent rich data structures, providing lots of viable alternatives to the standard, normalized, relational model. In addition, MongoDB has several unique features, such as atomic updates and indexed array keys, that greatly influence the kinds of schemas that make sense.
In this session, Buzz Moschetti explores how you can take advantage of MongoDB's document model to build modern applications.
To understand how to make your application fast, it's important to understand what makes the database fast. We will take a detailed look at how to think about performance, and how different choices in schema design affect your cluster performances depending on storage engines used and physical resources available.
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDBMongoDB
Este es el segundo seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web mostraremos cómo construir una aplicación de creación de blogs en MongoDB.
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkMongoDB
This is the fifth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
MongoDB is the most famous and loved NoSQL database. It has many features that are easy to handle when compared to conventional RDBMS. These slides contain the basics of MongoDB.
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
Este es el cuarto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. Este seminario se ve en la compatibilidad con índices de texto libre y geoespaciales.
Back to Basics Webinar 3: Schema Design Thinking in DocumentsMongoDB
This is the third webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will explain the architecture of document databases.
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
Este es el quinto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web, se analizan los aspectos básicos de Aggregation Framework.
MongoDB is an open source document database, and the leading NoSQL database. MongoDB is a document oriented database that provides high performance, high availability, and easy scalability. It is Maintained and supported by 10gen.
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
Questo è il secondo webinar della serie Back to Basics che ti offrirà un'introduzione al database MongoDB. In questo webinar ti dimostreremo come creare un'applicazione base per il blogging in MongoDB.
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
Webinar: Back to Basics: Thinking in DocumentsMongoDB
New applications, users and inputs demand new types of data, like unstructured, semi-structured and polymorphic data. Adopting MongoDB means adopting to a new, document-based data model.
While most developers have internalized the rules of thumb for designing schemas for relational databases, these rules don't apply to MongoDB. Documents can represent rich data structures, providing lots of viable alternatives to the standard, normalized, relational model. In addition, MongoDB has several unique features, such as atomic updates and indexed array keys, that greatly influence the kinds of schemas that make sense.
In this session, Buzz Moschetti explores how you can take advantage of MongoDB's document model to build modern applications.
To understand how to make your application fast, it's important to understand what makes the database fast. We will take a detailed look at how to think about performance, and how different choices in schema design affect your cluster performances depending on storage engines used and physical resources available.
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDBMongoDB
Este es el segundo seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web mostraremos cómo construir una aplicación de creación de blogs en MongoDB.
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkMongoDB
This is the fifth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.
MongoDB is the most famous and loved NoSQL database. It has many features that are easy to handle when compared to conventional RDBMS. These slides contain the basics of MongoDB.
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
Este es el cuarto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. Este seminario se ve en la compatibilidad con índices de texto libre y geoespaciales.
Back to Basics Webinar 3: Schema Design Thinking in DocumentsMongoDB
This is the third webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will explain the architecture of document databases.
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
Este es el quinto seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web, se analizan los aspectos básicos de Aggregation Framework.
MongoDB is an open source document database, and the leading NoSQL database. MongoDB is a document oriented database that provides high performance, high availability, and easy scalability. It is Maintained and supported by 10gen.
Design, Scale and Performance of MapR's Distribution for Hadoopmcsrivas
Details the first ever Exabyte-scale system that can hold a Trillion large files. Describes MapR's Distributed NameNode (tm) architecture, and how it scales very easily and seamlessly. Shows map-reduce performance across a variety of benchmarks like dfsio, pig-mix, nnbench, terasort and YCSB.
MapR is an amazing new distributed filesystem modeled after Hadoop. It maintains API compatibility with Hadoop, but far exceeds it in performance, manageability, and more.
/* Ted's MapR meeting slides incorporated here */
Creating a Modern Data Architecture for Digital TransformationMongoDB
By managing Data in Motion, Data at Rest, and Data in Use differently, modern Information Management Solutions are enabling a whole range of architecture and design patterns that allow enterprises to fully harness the value in data flowing through their systems. In this session we explored some of the patterns (e.g. operational data lakes, CQRS, microservices and containerisation) that enable CIOs, CDOs and senior architects to tame the data challenge, and start to use data as a cross-enterprise asset.
Webinar: 10-Step Guide to Creating a Single View of your BusinessMongoDB
Organizations have long seen the value in aggregating data from multiple systems into a single, holistic, real-time representation of a business entity. That entity is often a customer. But the benefits of a single view in enhancing business visibility and operational intelligence can apply equally to other business contexts. Think products, supply chains, industrial machinery, cities, financial asset classes, and many more.
However, for many organizations, delivering a single view to the business has been elusive, impeded by a combination of technology and governance limitations.
MongoDB has been used in many single view projects across enterprises of all sizes and industries. In this session, we will share the best practices we have observed and institutionalized over the years. By attending the webinar, you will learn:
- A repeatable, 10-step methodology to successfully delivering a single view
- The required technology capabilities and tools to accelerate project delivery
- Case studies from customers who have built transformational single view applications on MongoDB.
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...MongoDB
The United States will be deploying 16,000 traffic speed monitoring sensors - 1 on every mile of US interstate in urban centers. These sensors update the speed, weather, and pavement conditions once per minute. MongoDB will collect and aggregate live sensor data feeds from roadways around the country, support real-time queries from cars on traffic conditions on their route as well as be the platform for real-time dashboards displaying traffic conditions and more complex analytical queries used to identify traffic trends. In this session, we’ll implement a few different data aggregation techniques to query and dashboard the metrics gathered from the US interstate.
No se pierda esta oportunidad de conocer las ventajas de NoSQL. Participe en nuestro seminario web y descubra:
Qué significa el término NoSQL
Qué diferencias hay entre los almacenes clave-valor, columna ancha, grafo y de documentos
Qué significa el término «multimodelo»
MongoDb is a document oriented database and very flexible one as it gives horizontal scalability.
In this presentation basic study about mongodb with installation steps and basic commands are described.
Slides from workshop held on 12/14 in Asbury Park, NJ
http://www.meetup.com/Jersey-Shore-Tech/events/148118762/?gj=ro2_e&a=ro2_gnl&rv=ro2_e&_af_eid=148118762&_af=event
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
During this talk we'll navigate through a customer's journey as they migrate an existing MongoDB deployment to MongoDB Atlas. While the migration itself can be as simple as a few clicks, the prep/post effort requires due diligence to ensure a smooth transfer. We'll cover these steps in detail and provide best practices. In addition, we’ll provide an overview of what to consider when migrating other cloud data stores, traditional databases and MongoDB imitations to MongoDB Atlas.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combined traditional batch approaches with streaming technologies to provide continues alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
Common components of an IoT solution
The challenges involved with managing time-series data in IoT applications
Different schema designs, and how these affect memory and disk utilization – two critical factors in application performance.
How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
MongoDB Kubernetes operator is ready for prime-time. Learn about how MongoDB can be used with most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
Query performance should be the unsung hero of an application, but without proper configuration, can become a constant headache. When used properly, MongoDB provides extremely powerful querying capabilities. In this session, we'll discuss concepts like equality, sort, range, managing query predicates versus sequential predicates, and best practices to building multikey indexes.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
4. 4
Course Agenda
Date Time Webinar
19-Jan-2017 15:00 GMT Introduction to NoSQL
26-Jan-2017 15:00 GMT Your First MongoDB Application
02-Feb-2017 11:00 GMT Introduction to Replica Sets
09-Feb-2017 11:00 GMT Introduction to Sharding
5. 5
Agenda for Today
• Why NoSQL
• The different types of NoSQL database
• Detailed overview of MongoDB
• Q&A
11. 11
Key Value Stores
• An associative array
• Single key lookup
• Very fast single key lookup
• Not so hot for “reverse lookups”
Key Value
12345 4567.3456787
12346 { addr1 : “The Grange”, addr2: “Dublin” }
12347 “top secret password”
12358 “Shopping basket value : 24560”
12787 12345
12. 12
Revision : Row Stores (RDBMS)
• Store data aligned by rows (traditional RDBMS, e.g MySQL)
• Reads retrieve a complete row everytime
• Reads requiring only one or two columns are wasteful
ID Name Salary Start Date
1 Joe D $24000 1/Jun/1970
2 Peter J $28000 1/Feb/1972
3 Phil G $23000 1/Jan/1973
1 Joe D $24000 1/Jun/1970 2 Peter J $28000 1/Feb/1972 3 Phil G $23000 1/Jan/1973
13. 13
How a Column Store Does it
1 2 3
ID Name Salary Start Date
1 Joe D $24000 1/Jun/1970
2 Peter J $28000 1/Feb/1972
3 Phil G $23000 1/Jan/1973
Joe D Peter J Phil G $24000 $28000 $23000 1/Jun/1970 1/Feb/1972 1/Jan/1973
14. 14
Why is this Attractive?
• A series of consecutive seeks can retrieve a column efficiently
• Compressing similar data is super efficient
• How do I align my rows? By order or by inserting a row ID
• IF you just need a small number of columns you don’t need to
read all the rows
• But:
– Updating and deleting by row is expensive
• Append only is preferred
• Better for OLAP than OLTP
15. 15
Graph Stores
• Store graphs (edges and vertexes)
• E.g. social networks
• Designed to allow efficient traversal
• Optimised for representing connections
• Can be implemented as a key value stored with the ability to store
links
• MongoDB 3.4 supports graph queries
16. 16
Multi-Model Databases
• Combine multiple storage/access models
• Often Graph plus “something else”
• Fixes the “polyglot persistence” issue of keeping multiple
independent databases consistent
• The “new new thing” in NoSQL Land
• MongoDB is a "multi-modal" document store
– Graph
– Geo-Spatial
– B-tree
– Full Text
17. 17
Document Store
• Not PDFs, Microsoft Word or HTML
• Documents are nested structures created using Javascript Object Notation (JSON)
{
name : “Joe Drumgoole”,
title : “Director of Developer Advocacy”,
Address : {
address1 : “Latin Hall”,
address2 : “Golden Lane”,
eircode : “D09 N623”,
}
expertise: [ “MongoDB”, “Python”, “Javascript” ],
employee_number : 320,
location : [ 53.34, -6.26 ]
}
19. 19
MongoDB Understands JSON Documents
• From the very first version it was a native JSON database
• Understands and can index the sub-structures
• Stores JSON as a binary format called BSON
• Efficient for encoding and decoding for network transmission
• MongoDB can create indexes on any document field
• (We will cover these areas in detail later on in the course)
20. 20
Why Documents?
• Dynamic Schema
• Elimination of Object/Relational Mapping Layer
• Implicit denormalisation of the data for performance
21. 21
Why Documents?
• Dynamic Schema
• Elimination of Object/Relational Mapping Layer
• Implicit denormalisation of the data for performance
25. 25
Next Webinar – Your First MongoDB Application
• 26th Jan 2017 – 15:00 GMT.
• Learn how to build your first MongoDB application
– Create databases and collections
– Look at queries
– Build indexes
– Start to understand performance
• Register at https://www.mongodb.com/webinar/back-to-basics-
webinar-series
• Send feedback to back-to-basics@mongodb.com
26. 26
What's Next?
• Sign up for an online course: https://university.mongodb.com/
• Join a MUG: https://www.meetup.com/pro/mongodb/
Delighted to have you here. Hope you can make it to all the sessions. Sessions will be recorded so we can send them out afterwards so don’t worry if you miss one.
If you have questions please pop them in the sidebar.
A lot of people expect us to come in and bash relational database or say we don’t think they’re good. And that’s simply not true.
Relational databases has laid the foundation for what you’d want out of a database, and we absolutely think there are capabilities that remain critical today
Expressive query language & secondary Indexes. Users should be able to access and manipulate their data in sophisticated ways – and you need a query language that let’s you do all that out of the box. Indexes are a critical part of providing efficient access to data. We believe these are table stakes for a database.
Strong consistency. Strong consistency has become second nature for how we think about building applications, and for good reason. The database should always provide access to the most up-to-date copy of the data. Strong consistency is the right way to design a database.
Enterprise Management and Integrations. Finally, databases are just one piece of the puzzle, and they need to fit into the enterprise IT stack. Organizations need a database that can be secured, monitored, automated, and integrated with their existing IT infrastructure and staff, such as operations teams, DBAs, and data analysts.
But of course the world has changed a lot since the 1980s when the relational database first came about.
First of all, data and risk are significantly up.
In terms of data
90% data created in last 2 years - think about that for a moment, of all the data ever created, 90% of it was in the last 2 years
80% of enterprise data is unstructured - this is data that doesn’t fit into the neat tables of a relational database
Unstructured data is growing 2X rate of structured data
At the same time, risks of running a database are higher than ever before. You are now faced with:
More users - Apps have shifted from small internal departmental system with thousands of users to large external audiences with millions of users
No downtime - It’s no longer the case that apps only need to be available during standard business hours. They must be up 24/7.
All across the globe - your users are everywhere, and they are always connected
On the other hand, time and costs are way down.
There’s less time to build apps than ever before. You’re being asked to:
Ship apps in a few months not years - Development methods have shifted from a waterfall process to an iterative process that ships new functionality in weeks and in some cases multiple times per day at companies like Facebook and Amazon.
And costs are way down too. Companies want to:
Pay for value over time - Companies have shifted to open-source business and SaaS models that allow them to pay for value over time
Use cloud and commodity resources - to reduce the time to provision their infrastructure, and to lower their total cost of ownership
Because the relational database was not designed for modern applications, starting about 10 years ago a number of companies began to build their own databases that are fundamentally different. The market calls these NoSQL.
NoSQL databases were designed for this new world…
Flexibility. All of them have some kind of flexible data model to allow for faster iteration and to accommodate the data we see dominating modern applications. While they all have different approaches, what they have in common is they want to be more flexible.
Scalability + Performance. Similarly, they were all built with a focus on scalability, so they all include some form of sharding or partitioning. And they're all designed to deliver great performance. Some are better at reads, some are better at writes, but more or less they all strive to have better performance than a relational database.
Always-On Global Deployments. Lastly, NoSQL databases are designed for highly available systems that provide a consistent, high quality experience for users all over the world. They are designed to run on many computers, and they include replication to automatically synchronize the data across servers, racks, and data centers.
However, when you take a closer look at these NoSQL systems, it turns out they have thrown out the baby with the bathwater. They have sacrificed the core database capabilities you’ve come to expect and rely on in order to build fully functional apps, like rich querying and secondary indexes, strong consistency, and enterprise management.
MongoDB was built to address the way the world has changed while preserving the core database capabilities required to build modern applications.
Our vision is to leverage the work that Oracle and others have done over the last 40 years to make relational databases what they are today, and to take the reins from here. We pick up where they left off, incorporating the work that internet pioneers like Google and Amazon did to address the requirements of modern applications.
MongoDB is the only database that harnesses the innovations of NoSQL and maintains the foundation of relational databases – and we call this our Nexus Architecture.
Think redis, memcached or Couchbase.
Column stores you know and love, HP Vertica, Cassandra.