The document provides an overview of NoSQL databases and MongoDB. It discusses:
- What NoSQL is and why it was created
- The different categories of NoSQL databases, including key-value stores, document databases, column family stores, and graph databases
- MongoDB specifically, including its flexible schema, horizontal scalability, replication support, and data modeling approach
- Comparisons between relational and NoSQL databases
Information technology has led us into an era in which the production, sharing, and use of information are part of everyday life, and in which we are often unwitting actors: it is now almost impossible not to leave a digital trail of many of the actions we perform every day, for example through digital content such as photos, videos, blog posts, and everything that revolves around social networks (Facebook and Twitter in particular). In addition, with the "internet of things" we see a growing number of devices such as watches, bracelets, thermostats, and many other items that can connect to the network and therefore generate large data streams. This explosion of data gave rise to the term Big Data: data produced in large quantities, at remarkable speed, and in different formats, which requires processing technologies and resources that go far beyond conventional data management and storage systems. It is immediately clear that 1) data storage models based on the relational model and 2) processing systems based on stored procedures and grid computing are not applicable in these contexts. Regarding point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and implementation cost are only part of the disadvantages: very often, when dealing with big data, variability, that is, the lack of a fixed structure, is also a significant problem. This has given a boost to the development of NoSQL databases. The website NoSQL Databases defines NoSQL databases as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable."
These databases are distributed, open source, horizontally scalable, schema-free (key-value, column-oriented, document-based, and graph-based), easily replicable, do not guarantee ACID properties, and can handle large amounts of data. They are often integrated with processing tools based on the MapReduce paradigm, which Google published in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model taught in basic database design courses has many limitations compared with the demands of new Big Data applications, which use NoSQL databases to store data and MapReduce to process it.
Course Website http://pbdmng.datatoknowledge.it/
Contact me for further information and to download the slides.
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING (ijiert bestjournal)
Unstructured data poses challenges to storing data. Experts estimate that 80 to 90 percent of the data in any organization is unstructured, and the amount of unstructured data in enterprises is growing significantly, often many times faster than structured databases are growing. Structured data exists in table format, i.e. it has a proper schema, whereas unstructured data is schema-less, which directly signifies the importance of the NoSQL storage model and the MapReduce platform. In the existing system, unstructured data is processed using a Cassandra dataset. In the present system, MongoDB is to be implemented alongside the Cassandra dataset, as MongoDB provides a flexible data model and a large number of options for querying unstructured data, whereas Cassandra models its data in such a way as to minimize the total number of queries through more careful planning and denormalization. It offers basic secondary indexes, but for the best performance it is recommended to model data so as to use them infrequently. So to process
As more businesses realise that data, in all forms and sizes, is critical to making the best possible decisions, we see continued growth in systems that support massive volumes of non-relational or unstructured data. Nothing shows the picture more starkly than the Gartner Magic Quadrant for operational database management systems, which assumes that, by 2017, all leading operational DBMSs will offer multiple data models, relational and NoSQL, in a single DBMS platform. Having a single data platform for managing both well-structured data and NoSQL data is beneficial to users; this approach significantly reduces integration, migration, development, maintenance, and operational issues. A challenging research question is therefore how to develop an efficient, consolidated, single data management platform covering both relational and NoSQL data, in order to reduce integration issues, simplify operations, and eliminate migration issues.
In this tutorial, we review previous work on multi-model data management and provide insights into the research challenges and directions for future work.
Papers and more materials on this tutorial can be found at: http://udbms.cs.helsinki.fi/?tutorials
A Study on Graph Storage Database of NOSQL (IJSCAI Journal)
Big Data refers to huge volumes of both structured and unstructured data that are so large they are hard to process using traditional database tools and software technologies. The goal of Big Data storage management is to ensure a high level of data quality and availability for business intelligence and big data analytics applications. Graph databases are not yet as popular as relational databases, but they are a powerful kind of NoSQL database that can handle large volumes of data very efficiently. It is very difficult to manage large volumes of data using traditional technology, and data retrieval time may increase as the database grows. NoSQL databases are available as a solution. This paper describes what Big Data storage management is, the dimensions of Big Data, types of data, structured and unstructured data, what a NoSQL database is, the types of NoSQL databases, the basic structure of graph databases, their advantages, disadvantages, and application areas, and a comparison of various graph databases.
Why do we need database awareness?
Document vs Relational
Row-based vs Column-based
In-memory Database vs In-memory Data grids
Graph
Time-series
Solr vs ElasticSearch
Event Store
This presentation covers NoSQL databases and the types of NoSQL databases. It also discusses how MongoDB works, MongoDB security, and MongoDB sharding.
SQL vs NoSQL | MySQL vs MongoDB Tutorial | Edureka (Edureka!)
This Edureka PPT on SQL vs NoSQL will discuss the differences between SQL and NoSQL. It also discusses the differences between MySQL and MongoDB.
The following topics will be covered in this PPT:
What is SQL?
What is NoSQL?
SQL vs NoSQL
Type of database
Schema
Database Categories
Complex Queries
Hierarchical Data Storage
Scalability
Language
Online Processing
Base Properties
External Support
What is MySQL?
What is MongoDB?
MySQL vs MongoDB:
Query Language
Flexibility of Schema
Relationships
Security
Performance
Support
Key Features
Replication
Usage
Active Community
What is NoSQL? How did it come into the picture? What are the types of NoSQL? Some basics of the different NoSQL types. Differences between RDBMS and NoSQL. Pros and cons of NoSQL.
What is MongoDB? What are the features of MongoDB? Nexus architecture of MongoDB. Data model and query model of MongoDB. Various MongoDB data management techniques. Indexing in MongoDB. A working example using the MongoDB Java driver on Mac OS X.
Object Relational Mapping with LINQ To SQL (Shahriar Hyder)
OR Impedance Mismatch
Object Relational Mapping
The LINQ Project
Data Access In APIs Today
Data Access with DLINQ
DLinq For Relational Data
Architecture
Key Takeaways
Querying For Objects
When to Use LINQ to SQL?
Backbone using Extensible Database APIs over HTTP (Max Neunhöffer)
These days, more and more software applications are designed using a microservices architecture, that is, as suites of independently deployable services that talk to each other through well-defined interfaces. This approach is helped by the fact that many NoSQL databases expose their API through HTTP, which makes it particularly easy to define the interfaces.
The multi-model NoSQL database ArangoDB embeds Google's V8 JavaScript engine and features the Foxx framework, which allows the developer to extend ArangoDB's API with user-defined JavaScript code that runs on the database server.
In this talk I will explain the benefits of this approach for the software architecture and development process. I will keep the presentation practice-oriented by showing concrete examples in ArangoDB and JavaScript, using Backbone.js.
This deck gives a basic overview of NoSQL technologies, implementation vendors/products, case studies, and some of the core implementation algorithms. It also gives a quick overview of emerging trends such as "Polyglot Persistency" and "NewSQL".
The deck is targeted at beginners who want an overview of NoSQL databases.
3. Implementation with NOSQL databases Document Databases (Mongodb).pptx (RushikeshChikane2)
This chapter gives information about document-based and graph-based databases. It covers their basic structures, features, applications, limitations, and use cases.
Comparative study of no sql document, column store databases and evaluation o... (ijdms)
In the last decade, rapid growth in mobile applications, web technologies, and social media generating unstructured data has led to the advent of various NoSQL data stores. The demands of web scale are increasing every day, and NoSQL databases are evolving to meet stringent big data requirements. The purpose of this paper is to explore NoSQL technologies and present a comparative study of document and column store NoSQL databases such as Cassandra, MongoDB, and HBase across various attributes of relational and distributed database system principles. The architecture and internal workings of Cassandra, MongoDB, and HBase are studied and analysed theoretically, and the core concepts are depicted. The paper also presents an evaluation of Cassandra for an industry-specific use case, with published results.
What is NoSQL? NoSQL describes a family of approaches to managing data at an enterprise level that have key similarities, but - at the same time - are very different from classic SQL based relational databases.
NoSQL has emerged as a 'movement' over the last 5 years, and many specific NoSQL datastores (Mongo, Redis, HBase, Cassandra, Neo4J) are being used for mission-critical systems by many organizations, including Facebook, LinkedIn, Dropbox, American Express, the NSA, and the CIA. Does NoSQL spell the end of SQL-based relational datastores like Oracle, MySQL, SQL Server, and Sybase? Definitely not, but the world is moving in the direction of "Polyglot Persistence" and away from the "Relational Persistence" hegemony. In my presentation I will explain why this shift is occurring and will speculate about what the future will hold.
This Presentation is about NoSQL which means Not Only SQL. This presentation covers the aspects of using NoSQL for Big Data and the differences from RDBMS.
The rising interest in NoSQL technology over the last few years has resulted in an increasing number of evaluations and comparisons among competing NoSQL technologies. From this survey we create a concise and up-to-date comparison of NoSQL engines, identifying their most beneficial uses from the software engineer's point of view.
As technology and needs evolve and the need for scalable, highly available solutions increases, there is a need to evaluate new databases. The lack of clarity in the market makes it difficult for IT stakeholders to understand the differences between the available solutions and which choice to make. The key areas to consider when evaluating NoSQL databases are the data model, query model, consistency model, APIs, support, and community strength.
3. What is NOSQL?
Stands for "Not Only SQL".
Built in the early 2000s for large-scale database clustering in cloud and web applications.
A class of non-relational data storage systems.
Usually requires no fixed table schema and does not use the concept of joins.
https://db-engines.com/en/ranking
4. Why NOSQL
Relational database management systems (RDBMS) lack the capacity to handle:
The number of concurrent users
The volume of new data types such as unstructured, streaming, and semi-structured data
The continual increase in the amount of data produced by mobile devices, social media, and geo-location services.
5. Why NOSQL
Relational databases are founded on normalization and on multiple join operations per query. As the number of users and the volume of data increase, this leads to a large number of tables that become hard to control and manage.
Normalization?!
Join?!
The relational model is not sufficient
A new generation of databases
6. How should NoSQL databases store/manage this storm?
Next Generation Databases mostly addressing some of the points:
Being non-relational
Horizontally scalable: can dynamically support rapid growth
Distributed
Open-source
Schema-free
Agile development methods
Easy replication support
Simple API
High Availability
Simple to install and operate
Follow the BASE principle (not ACID: Atomicity, Consistency, Isolation, Durability).
BASE
Basically Available: availability is more important than consistency.
Soft State: higher availability results in an eventually consistent state.
Eventually Consistent: if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.
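The "eventually consistent" behavior can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not how any particular database replicates: the `Cluster` and `Replica` classes and the `sync()` step are hypothetical names, and replication is reduced to replaying a queue of writes.

```javascript
// Toy model of eventual consistency: one primary and several replicas.
// Writes are acknowledged by the primary at once and reach the replicas
// only when sync() runs, so a replica read can be stale in between.
// All class and method names are illustrative.
class Replica {
  constructor() {
    this.data = new Map();
  }
  read(key) {
    return this.data.get(key);
  }
}

class Cluster {
  constructor(replicaCount) {
    this.primary = new Replica();
    this.replicas = Array.from({ length: replicaCount }, () => new Replica());
    this.pending = []; // writes not yet propagated to the replicas
  }
  write(key, value) {
    this.primary.data.set(key, value); // available immediately on the primary
    this.pending.push([key, value]);
  }
  sync() {
    // Replay the queued writes, in order, on every replica.
    for (const [key, value] of this.pending) {
      for (const replica of this.replicas) replica.data.set(key, value);
    }
    this.pending = [];
  }
}

const cluster = new Cluster(2);
cluster.write("bed:145", "occupied");
const staleRead = cluster.replicas[0].read("bed:145"); // not propagated yet
cluster.sync();
const freshRead = cluster.replicas[0].read("bed:145"); // now the last written value
console.log(staleRead, freshRead);
```

Once no new writes arrive and `sync()` has run, every replica returns the last written value, which is the guarantee the BASE definition describes.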
7. RDBMS vs NoSQL
RDBMS
Structured and organized data
Structured Query Language (SQL)
Data and its relationships are stored in separate tables.
Data Manipulation Language, Data Definition Language
Joins and normalization
Tight consistency
ACID transactions
NoSQL
Unstructured, semi-structured, and real-time data
No declarative query language
Flexible schema
Many models: key-value stores, graph, etc.
Prioritizes high performance, high availability, and scalability
Distributed
Horizontally scalable
Lower cost
Open source
No complicated relationships (joins, normalization)
Eventual consistency rather than ACID properties
CAP theorem
8. CAP Theorem
Consistency: all nodes see the same data at the same time.
Availability: every request receives a response.
Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.
It is impossible to meet all three requirements at once: a system can provide at most two of consistency, availability, and partition tolerance.
12. Document-Oriented Databases
Documents are the main concept.
A document-oriented database stores and retrieves documents in some standard format(s):
JSON
XML
BSON
YAML
Binary forms (like PDF and MS Word).
A document is similar to a row or record in a relational DB, but more flexible (documents can differ in their attributes).
Documents are indexed.
Document databases store documents in the value part of a key-value store.
{
  name: "Robert",                              // key: value
  Age: 55,                                     // key: value
  Department: ["Emergency", "Heart Center"]    // key: value
}
15. MongoDB Overview
MongoDB is a scalable, high-performance, open-source database.
MongoDB can store complex documents containing arrays, hash tables, integers, objects, and everything else supported by JSON.
Written in C++
Has drivers for almost every popular programming language
Full index support
Built-in replication and cluster management
Data redundancy
Fault tolerant (automatic failover and recovery)
Consistency (wait-for-propagate or write-and-forget)
Simplified maintenance
Distributed storage (sharding)
Based on a defined shard key.
Enables horizontal scaling across multiple nodes.
Auto-balances as shard servers are added or removed.
Failover is handled through replica sets.
Map-reduce queries run in parallel across shards.
In many ways MongoDB "feels" like an RDBMS.
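The idea of routing documents by a shard key can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: it hashes the key and takes a modulo, whereas MongoDB assigns ranges of shard-key values (chunks) to shards and rebalances them. The function names `hashKey` and `shardFor` and the sample documents are hypothetical.

```javascript
// Simplified shard routing: hash the shard key, then pick a shard by modulo.
// This shows only the routing idea; it is not MongoDB's chunk-based scheme.
function hashKey(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the hash as an unsigned 32-bit integer
  }
  return h;
}

function shardFor(doc, shardKey, shardCount) {
  return hashKey(doc[shardKey]) % shardCount;
}

// Three shards, each modeled as a plain array of documents.
const shards = [[], [], []];
const doctors = [
  { SSN: "534-10-4534", lastname: "Thomsen" },
  { SSN: "123-98-4534", lastname: "Sahat" },
  { SSN: "199-98-2365", lastname: "Zhxiao" },
];

for (const doc of doctors) {
  shards[shardFor(doc, "SSN", shards.length)].push(doc);
}

// A query on the shard key only needs to visit one shard.
const target = shardFor({ SSN: "123-98-4534" }, "SSN", shards.length);
const found = shards[target].find((d) => d.SSN === "123-98-4534");
console.log(target, found.lastname);
```

One design consequence worth noting: with plain modulo routing, changing the number of shards moves most documents, which is one reason real systems use chunk ranges or consistent hashing instead.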
23. MongoDB Advantages
To reduce complexity
Get rid of migrations
No create table
No alter column
No add column
No change column
Get rid of relationships
Many to one/ One to Many/ Many to Many
Reduce number of database requests
Joined
Rich queries
In-place updates
JSON
MongoDB knows JSON
Don’t have to convert data from / to JSON
Adapt to changes
Changes in schema
Changes in data & algorithms: $set, $unset, $push, $rename
Changes for performance & scaling
24. Data Types
String: The most commonly used datatype for storing data. Strings in MongoDB must be valid UTF-8.
Integer: Stores a numerical value. Integers can be 32-bit or 64-bit depending on your server.
Boolean: Stores a boolean (true/false) value.
Double: Stores floating-point values.
Min/Max keys: Used to compare a value against the lowest and highest BSON elements.
Arrays: Stores arrays, lists, or multiple values under one key.
Timestamp: Handy for recording when a document has been modified or added.
Object: Used for embedded documents.
Null: Stores a null value.
Symbol: Used identically to a string; generally reserved for languages that use a specific symbol type.
Date: Stores the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the day, month, and year into it.
Object ID: Stores the document's ID.
Binary data: Stores binary data.
Code: Stores JavaScript code in a document.
Regular expression: Stores a regular expression.
25. Data Modeling
Among NoSQL databases, MongoDB is the closest to a relational database with regard to data modeling, but there are some significant differences between the document-oriented model and the relational model:
Flexible schema
Lack of join in MongoDB
Arrays & Embedded-Documents
De-Normalization
26. Flexible schema
The schema does not have to be determined before inserting data.
MongoDB supports a flexible schema that follows the needs of the application, such as its updates, data processing, and queries.
CREATE TABLE DOCTOR(
  ID         VARCHAR2(8)  NOT NULL PRIMARY KEY,
  SSN        VARCHAR2(11) NOT NULL UNIQUE,
  F_NAME     VARCHAR2(25) NOT NULL,
  L_NAME     VARCHAR2(25) NOT NULL,
  Department VARCHAR(10),
  Address    VARCHAR2(50) NOT NULL,
  ZIP_Code   NUMBER(5)    NOT NULL
)
ID        SSN          F_NAME  L_NAME  Department  HOMEPHONE       CELLPHONE     BIRTH_DATE
11111112  123-98-4534  Hamid   Sahat   Emergency   1-555-729-2345  693-258-1968  15-JUN-63
11111113  199-98-2365  Chang   Zhxiao  CCU         1-552-729-5236  332-258-1456  13-APR-55
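By contrast with the fixed columns of the table above, documents in one collection do not have to share the same fields. The following is a minimal sketch that models a collection as a plain JavaScript array (an assumption made so the example runs without a server); the field values are taken from the table, and the choice of which fields each document carries is illustrative.

```javascript
// A collection modeled as a plain array of objects (illustrative only).
const DOCTOR = [];

// Two documents in the same collection with different sets of fields.
DOCTOR.push({ F_NAME: "Hamid", L_NAME: "Sahat", Department: "Emergency" });
DOCTOR.push({
  F_NAME: "Chang",
  L_NAME: "Zhxiao",
  Department: ["CCU"],          // an array here, a single string above
  CELLPHONE: "332-258-1456",    // a field the first document lacks
});

// Adding a field later needs no ALTER TABLE: only this one document changes.
DOCTOR[0].BIRTH_DATE = new Date(1963, 5, 15); // months are zero-based: 5 = June

// Collect the distinct field names actually present in the collection.
const fieldNames = [...new Set(DOCTOR.flatMap((doc) => Object.keys(doc)))];
console.log(fieldNames);
```

The price of this flexibility is that the application, not the database, has to cope with documents that lack a field or store it in a different shape.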
27. Lack of join in MongoDB
Data is split horizontally across different machines, so performing many joins across different application servers is unrealistic.
MongoDB supports a left outer join to an unsharded collection in the same database ($lookup).
It partially supports the relational model (primary key and foreign key).
28. Arrays & Embedded-Documents
• MongoDB supports the cardinality ratio by
using:
• Arrays
• Embedded documents (document with a
nested document)
db.DOCTOR.find({},{"Department":1,_id:0})
db.DOCTOR.find({},{"Degree":1,_id:0}).limit(1)
32. Create/Drop Database
'use Hospital' selects the 'Hospital' database. MongoDB creates the database if it does not exist (once data is first stored in it); otherwise the command switches to the existing database.
>db.dropDatabase()
To delete a database, first select it (the shell reports "switched to db Hospital") and then drop it.
33. Create/Drop Collection
db.createCollection()
db.createCollection("DOCTOR")
MongoDB also creates a collection automatically when a document is first inserted into it.
db.PATIENT.drop()
drop() returns 'true' if the selected collection is removed successfully; otherwise it returns 'false'.
db.PATIENT.remove({})
To remove all documents from a collection, pass an empty query document '{}' to the remove() method.
The remove() method does not remove the indexes.
34. Insert
db.collection.insert()
db.DOCTOR.insert({
  name: { firstname: 'John', lastname: 'Thomsen' },                    // embedded document
  SSN: '534-10-4534',
  DOB: new Date('June 30,1948'),
  Salary: 310000,
  phone: { HomePhone: '1-880-529-234', CellPhone: '443-258-1968' },    // embedded document
  address: { street: '589 Linden st', city: 'Frostburg', Zip: '21532', state: 'Maryland' },  // embedded document
  Maritalstatus: 'M',
  Specialties: 'Oncology',
  Degree: { DegreeTitle: 'Doctor of Medicine', Degreefrom: 'Maryland University',
            Date_of_Degree: new Date('Feb 22,1988') },                 // embedded document
  Certificate: { CertificateName: 'Cardic Surgery',
                 Date_of_Certificate: new Date('Sep 24, 1987') },      // embedded document
  Department: ['Laboratory and Blood Bank', 'Internal Diseases', 'Medical'],  // array
  Mentor: 'Sara Konei'
})
36. Insert
db.PATIENT.insert({
  name: { firstname: 'Ahmad', lastname: 'Abdolahe', middelname: 'Saeed' },
  SSN: '33-90-1134',
  DOB: new Date('April 20,1995'),
  phone: { HomePhone: '1-301-338-5986', CellPhone: '240-501-1968' },
  Address: { street: '181 Ormand Street', city: 'Frostburg', Zip: '21532', state: 'Maryland' },
  Maritalstatus: 'M',
  Gender: 'M',
  Race: 'White',
  Smoker: 'N',
  DrivingLicense: { DRIVERSLICENSENUM: 72585432, DRIVERSLICENSESTATE: 'MD' },
  INFECTION_HISTORY: { DISEASE: 'Alopecia' },
  MEDICINE_HISTORY: { MEDICINE_USED: 'Ripernol', DOSAGE: '120 ml',
                      START_DATE: new Date('JAN 18, 2000'), END_DATE: new Date('JUNE 03, 2001') },
  ADMISSION: { DEPARTMENT_NAME: ['Emergency', 'Surgery'],
               ADMISSION_DATE: new Date('Feb 14,1998'), DISCHARGE_DATE: new Date('Feb 19,1998') },
  ROOM_ASSIGNED: [145, 289],
  BED: { TYPEOFBED: 'GeneralCare', Price: 90 },
  INSURANCE: { POLICYNUM: '12359867', COMPANYNAME: 'Royal sun',
               DATEISSUED: new Date('JAN 14,2003'), EXPIRATIONDATE: new Date('JAN 30,2004'),
               INSURANCETYPE: 'Private' },
  // References to the first two documents of the DOCTOR collection
  Doctor_Info: [ { Doctor_ID: db.DOCTOR.find()[0]._id }, { Doctor_ID: db.DOCTOR.find()[1]._id } ]
})
37. Update
• db.collection.update()
To increase the salaries of doctors who specialize in Anesthesia by 1000, use $inc. Note that update() modifies only the first matching document unless {multi: true} is passed.
db.DOCTOR.update({"Specialties" : "Anesthesia"},{$inc:{Salary:1000}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
To update a field within an embedded document, use $set with dot notation:
db.PATIENT.update({"BED.Price":135},{$set:{"BED.Price":200}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
To delete a particular field, use the $unset operator:
db.PATIENT.update({"name.firstname":"Nancy"},{$unset:{"INSURANCE":""}})
To change the value of a top-level field, use $set:
db.DOCTOR.update({"Maritalstatus" : "S"},{$set:{"Maritalstatus" : "M"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
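What the $set, $inc, and $unset operators do to a single document, including dotted paths such as "BED.Price", can be approximated in memory. This is a rough sketch, not MongoDB's implementation: `locate` and `applyUpdate` are hypothetical helpers, and the sketch assumes every intermediate value on a dotted path is an object.

```javascript
// Walk a dotted path such as "BED.Price" and return the parent object plus
// the final key. With create = true, missing intermediate objects are created.
// Assumes intermediate values, when present, are objects.
function locate(doc, path, create) {
  const keys = path.split(".");
  const last = keys.pop();
  let parent = doc;
  for (const key of keys) {
    if (parent[key] === undefined) {
      if (!create) return null;
      parent[key] = {};
    }
    parent = parent[key];
  }
  return { parent, key: last };
}

// Apply a small subset of update operators to one document, in place.
function applyUpdate(doc, update) {
  for (const [path, value] of Object.entries(update.$set || {})) {
    const { parent, key } = locate(doc, path, true);
    parent[key] = value;
  }
  for (const [path, amount] of Object.entries(update.$inc || {})) {
    const { parent, key } = locate(doc, path, true);
    parent[key] = (parent[key] || 0) + amount; // a missing field starts at 0
  }
  for (const path of Object.keys(update.$unset || {})) {
    const target = locate(doc, path, false);
    if (target) delete target.parent[target.key];
  }
  return doc;
}

const patient = {
  name: { firstname: "Nancy" },
  BED: { Price: 135 },
  INSURANCE: { POLICYNUM: "12359867" },
};
applyUpdate(patient, { $set: { "BED.Price": 200 } });
applyUpdate(patient, { $unset: { INSURANCE: "" } });

const doctor = { Specialties: "Anesthesia", Salary: 310000 };
applyUpdate(doctor, { $inc: { Salary: 1000 } });
console.log(patient, doctor);
```

The value given to $unset is ignored, which matches the empty string used in the shell example above.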
38. Upsert
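An upsert inserts a document when no document matches the query and updates the matching document when one exists. In MongoDB it is an option of update(), {upsert: true}, rather than a separate command.
No match: insert
Match: update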
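The insert-or-update decision can be shown with an in-memory sketch. It assumes a collection is a plain array and that queries are simple equality matches on top-level fields; `matches` and `upsert` are hypothetical helpers, and the second SSN is made up for the example.

```javascript
// True when every field in the query equals the same field in the document.
function matches(doc, query) {
  return Object.entries(query).every(([field, value]) => doc[field] === value);
}

// Update the first matching document, or insert a new one built from the
// query fields plus the new fields. Returns counters shaped like WriteResult.
function upsert(collection, query, fields) {
  const existing = collection.find((doc) => matches(doc, query));
  if (existing) {
    Object.assign(existing, fields);
    return { nMatched: 1, nUpserted: 0 };
  }
  collection.push({ ...query, ...fields });
  return { nMatched: 0, nUpserted: 1 };
}

const doctors = [{ SSN: "534-10-4534", Salary: 310000 }];

// The document exists, so it is updated.
const updated = upsert(doctors, { SSN: "534-10-4534" }, { Salary: 320000 });

// No document has this (hypothetical) SSN, so a new one is inserted.
const inserted = upsert(doctors, { SSN: "111-22-3333" }, { Salary: 150000 });

console.log(updated, inserted, doctors.length);
```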
39. Aggregation & Queries
MongoDB has a rich query framework
find(): returns a cursor over the matching documents; an optional projection lists the fields to retrieve.
41. Aggregation & Queries
Display information about the beds
db.PATIENT.find({},{_id:0,BED:1})
Select the names of the patients who do not have insurance
db.PATIENT.find( { INSURANCE: { $exists:false } }, {"name.firstname":1, "name.lastname":1, _id:0} )
Display the names of doctors who graduated between March 1, 2000 and March 1, 2010
> start = new Date("March/01/2000")
ISODate("2000-02-29T23:00:00Z")
> end = new Date("March/01/2010")
ISODate("2010-02-28T23:00:00Z")
db.DOCTOR.find({"Degree.Date_of_Degree":{"$gte":start,"$lt":end}},{_id:0,"name.firstname":1,"name.lastname":1,"Degree.Date_of_Degree":1}).pretty()
new Date("<YYYY-mm-dd>") returns the given date string as a Date object.
The shell displays the Date wrapped in the ISODate helper and converted to UTC, which is why local midnight on March 1 appears above as 23:00 on the previous day.
42. Aggregation Pipeline
The aggregation pipeline is a way to transform and combine documents in a collection.
MongoDB aggregation operator    SQL term, function, or concept
$match                          WHERE
$group                          GROUP BY
$match                          HAVING
$project                        SELECT
$lookup                         LEFT OUTER JOIN
db.collection.aggregate([ { $<operator>: <expression> }, ... ])
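The mapping between pipeline stages and SQL clauses can be made concrete with a small in-memory pipeline. This is a sketch of the idea only: the stage helpers `match`, `group`, `project`, and `aggregate` are hypothetical stand-ins built on array methods, and the salary figures other than 310000 are invented for the example.

```javascript
// Each stage is a function from an array of documents to a new array.
const match = (predicate) => (docs) => docs.filter(predicate);

// Group by a key and sum one numeric field, like GROUP BY with SUM().
const group = (keyOf, field) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    totals.set(key, (totals.get(key) || 0) + doc[field]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

const project = (shape) => (docs) => docs.map(shape);

// Run the stages in order, feeding each stage the previous stage's output.
const aggregate = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

const doctors = [
  { Specialties: "Oncology", Salary: 310000 },
  { Specialties: "Anesthesia", Salary: 200000 },
  { Specialties: "Anesthesia", Salary: 250000 },
];

const result = aggregate(doctors, [
  match((d) => d.Salary >= 200000),                        // WHERE
  group((d) => d.Specialties, "Salary"),                   // GROUP BY
  match((g) => g.total > 400000),                          // HAVING
  project((g) => ({ specialty: g._id, total: g.total })),  // SELECT
]);
console.log(result); // one group remains: Anesthesia with a total of 450000
```

The same `match` helper serves as both WHERE and HAVING; the only difference is whether it runs before or after the grouping stage, which mirrors the two $match rows in the table.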
45. Aggregation & Queries
Display patient names and the names of the doctors who have visited them
$lookup: new in version 3.2.
The $lookup stage is used to join collections.
It performs a left outer join to an unsharded collection in the same database.
db.PATIENT.aggregate([
  { $lookup: {
      from: "DOCTOR",
      localField: "_id",
      foreignField: "Doctor_ID",
      as: "test_look_up" } },
  { $project: { _id: 0, "name.firstname": 1, "name.lastname": 1,
                "Doctor_Info.Doctor_firstname": 1, "Doctor_Info.Doctor_lastname": 1 } }
]).pretty()
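The left outer join semantics of $lookup can be sketched over two plain arrays. This is an illustration under assumptions, not the server's algorithm: the `lookup` helper is hypothetical, and the field names `doctorIds` and `doctors` are simplified stand-ins rather than the field names used in the slides.

```javascript
// Left outer join: every local document is kept, and the matching foreign
// documents are collected into an array stored under the "as" field.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => {
    // The local field may hold one value or an array of values.
    const localValues = [].concat(doc[localField] ?? []);
    const joined = foreignDocs.filter((f) => localValues.includes(f[foreignField]));
    return { ...doc, [as]: joined };
  });
}

const doctors = [
  { _id: 1, lastname: "Thomsen" },
  { _id: 2, lastname: "Sahat" },
];
const patients = [
  { _id: 10, lastname: "Abdolahe", doctorIds: [1, 2] },
  { _id: 11, lastname: "Diohman", doctorIds: [] }, // no doctor yet
];

const joined = lookup(patients, doctors, "doctorIds", "_id", "doctors");
for (const p of joined) {
  console.log(p.lastname, p.doctors.map((d) => d.lastname));
}
```

A patient without any matching doctor still appears in the output, with an empty array, which is what distinguishes a left outer join from an inner join.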
48. Aggregation & Queries
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. MongoDB provides the mapReduce command for map-reduce operations, which are generally used for processing large data sets.
A map-reduce operation:
Query: selects the input documents from the collection
Map: maps a value to a key and emits the key-value pair
Reduce: all values emitted with the same key end up in an array, which is reduced to a single result
Options:
out specifies the location of the map-reduce result
query specifies the optional selection criteria for selecting documents
sort specifies the optional sort criteria
limit specifies the optional maximum number of documents to be returned
49. Aggregation & Queries
Display the total salary of each unmarried doctor (Maritalstatus "S").
Select the unmarried doctors (query)
Group them by SSN (map)
Sum the salaries for each doctor (reduce)
db.DOCTOR.mapReduce(
  function() { emit(this.SSN, this.Salary); },
  function(SSN, Salary) { return Array.sum(Salary); },
  { query: { Maritalstatus: "S" }, out: "total_Salary_S" }
)
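The query, map, and reduce phases can be traced with an in-memory sketch. It makes several assumptions: the collection is a plain array, `emit` is passed to the map function as an argument (in the shell it is a global), and reduce is called for every key, even one with a single value. The `mapReduce` helper and the sample salaries other than 310000 are illustrative.

```javascript
// Minimal in-memory map-reduce: filter, map each document to key-value
// pairs, collect the values per key, then reduce each array to one value.
function mapReduce(docs, mapFn, reduceFn, { query = () => true } = {}) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs.filter(query)) {
    mapFn.call(doc, emit); // "this" is the current document, as in the shell
  }
  return [...groups].map(([key, values]) => ({ _id: key, value: reduceFn(key, values) }));
}

const doctors = [
  { SSN: "534-10-4534", Maritalstatus: "M", Salary: 310000 },
  { SSN: "123-98-4534", Maritalstatus: "S", Salary: 200000 },
  { SSN: "199-98-2365", Maritalstatus: "S", Salary: 180000 },
];

const totals = mapReduce(
  doctors,
  function (emit) { emit(this.SSN, this.Salary); },        // map
  (ssn, salaries) => salaries.reduce((a, b) => a + b, 0),  // reduce
  { query: (d) => d.Maritalstatus === "S" }                // query
);
console.log(totals);
```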
50. MongoDB in Practice
Personalization creates customized online experiences for customers in real time, based on analysis of behavioral and demographic profiles, historical interactions, and preferences.
Expedia's Scratchpad automatically saves things you view while you shop across devices.
Scratchpad:
intelligently remembers searches
hunts for the lowest prices
makes it easy to shop across any device
makes the travel search process fast, easy, and personalized
It keeps people from jumping to other travel sites.
It dramatically increases conversion rates for Expedia.
Suppliers constantly change inventory and pricing information behind the scenes, which creates a huge volume of highly variable data.
52. MongoDB in Practice
Flexible document store and simple horizontal scale-out: make it possible for Expedia to create a feature that gives every user a relevant, seamless shopping experience, one that collects highly dynamic customer information in real time and presents personalized offers on the fly.
MongoDB's flexible data model: makes it easy to store any combination of city pairs, dates, and destinations. Expedia can even continue shopping for someone after that customer has closed out a session. When the customer returns, all the latest pricing and availability for their searches are displayed side by side on their Scratchpad.
54. Graph Databases
A graph is a set of nodes and the relationships that connect those nodes.
Nodes and relationships contain properties to represent data.
Properties are key-value pairs.
Nodes can be labeled.
A graph database stores data in a graph.
Each node knows its adjacent nodes.
As the number of nodes increases, the cost of a local step (or hop) remains the same.
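The statement that each node knows its adjacent nodes can be sketched with a small property-graph structure. This is an illustrative model only, not how any graph database lays out its storage: the `Node` class is hypothetical, and the relationship between the sample names is assumed for the example.

```javascript
// A node carries labels, properties (key-value pairs), and its own list of
// outgoing relationships, so a hop never consults a global index.
class Node {
  constructor(labels, properties) {
    this.labels = labels;
    this.properties = properties;
    this.relationships = []; // each entry: { type, target, properties }
  }
  relate(type, target, properties = {}) {
    this.relationships.push({ type, target, properties });
    return this;
  }
  // One hop: follow this node's relationships of the given type.
  neighbors(type) {
    return this.relationships.filter((r) => r.type === type).map((r) => r.target);
  }
}

const simon = new Node(["Patient"], { name: "Simon Diohman" });
const john = new Node(["Doctor"], { name: "John Thomsen" });
const sara = new Node(["Doctor"], { name: "Sara Konei" });

simon.relate("VisitBY", john);
john.relate("Mentor", sara);

// Two hops: the mentors of the doctors who visited Simon.
const mentors = simon
  .neighbors("VisitBY")
  .flatMap((doctor) => doctor.neighbors("Mentor"))
  .map((node) => node.properties.name);
console.log(mentors);
```

The cost of each hop depends on how many relationships the current node has, not on how many nodes the whole graph contains.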
57. Why use a graph database?
Greater performance: compared to other NoSQL stores or relational databases, graph databases offer much faster access to complex connected data, mainly because they avoid expensive join operations. In one example, a graph database was 1000x faster than a relational database when working with a query depth of four.
Lower latency: users of graph databases experience lower levels of latency. As the nodes and links point to one another, millions of related records can be traversed per second, and query response time remains constant irrespective of the overall database size.
Flexible and agile: a graph database should closely match the structure of the data it uses. This allows developers to start work sooner without the added complexity of mapping data across tables.
Good for semi-structured data: graph databases are schema-free, meaning patchy data, or data with exceptional attributes, does not pose a structural problem.
59. Neo4j Overview
Neo4j is the world's leading graph database
Created by Neo Technology
Follows the property graph data model
Written in Java
Uses CQL (Cypher Query Language) as its query language
Open source
Schema-free
Full ACID
High availability
Supports highly connected data
Provides a REST API that can be accessed from any programming language, such as Java.
61. Advantages of Neo4j
Easy to model and store relationships
CQL query language commands are easy to learn.
Connected data is easy to represent.
Retrieving, traversing, and navigating data is easy and fast.
Semi-structured data is easy to represent.
Intuitiveness: human-readable format
Speed
Uses a simple and powerful data model.
Performance of relationship traversal remains constant as data size grows
Fast, agile development and evolution: it has a naturally adaptive model
Scalability: Neo4j scales up and out, supporting tens of billions of nodes and relationships, and hundreds of thousands of ACID transactions per second
62. Property Value Types
Properties are named values, where the name is a string.
Property values can be either a primitive or an array of one primitive type. For example, String, int, and int[] values are valid for properties.
NULL is not a valid property value.
NULLs can instead be modeled by the absence of a key.
63. Converting relational model into graph model
#1 Drop foreign keys
#2 Join tables become relationships
#3 Attributes become properties
Relationships in the example: AdmitTO, VisitBY, WorkIN, Mentor
66. Create a node
CREATE (p:Patient { name: 'Simon Diohman', SSN: '51-90-1134', CellPhone: '443-501-1968', DOB: '1959-06-20' }) RETURN p
MATCH (p:Patient) WHERE p.name = 'Simon Diohman' RETURN p
68. Delete
Delete all nodes and relationships
Delete single node
Delete a specific node
Delete relationship
Delete a node with all its relationships
69. Remove
The REMOVE clause is used to remove properties
and labels from graph elements.
70. Update
The SET clause is used to update properties on nodes and relationships.
Set a property: sets a property on a node or relationship
Copying properties between nodes
The Simon node has had all its properties replaced by the properties of the source node
Update a relationship
73. Collection functions
Collection functions return collections of nodes, relationships, etc. in a path.
NODES: returns all nodes in a path.
RELATIONSHIPS: returns all relationships in a path.
74. Neo4j in Practice
Insurance Fraud
Insurance fraud is the filing of a false claim to life, health, automobile, property, workers' compensation, or other types of insurance benefits.
Insurance fraud is a significant and costly problem for both policyholders and insurance companies.
Insurance companies lose millions of dollars each year through fraudulent claims, largely because they do not have a way to easily determine which claims are legitimate and which may be fraudulent.
Insurance companies shift the risk of loss to their customers.
75. Neo4j in Practice
Typical Scenario
Rings of fraudsters work together to stage fake accidents.
Such rings normally include a number of roles.
1. Providers. Collusions typically involve participation from professionals in several categories:
a. Doctors, who diagnose false injuries
b. Lawyers, who file fraudulent claims, and
c. Body shops, which misrepresent damage to cars
2. Participants. These are the people involved in the (false) accident, and normally include:
a. Drivers
b. Passengers
c. Pedestrians
d. Witnesses
76. Neo4j in Practice
Simple six-person collusion
$20,000 x 6 x 3 + $5,000 x 6 = $390,000
The ring can claim $390,000
Simple ten-person collusion
$40,000 x 10 x 4 + $5,000 x 10 = $1.65M
The ring can claim $1.65M
Fraudsters often create and manage rings by "recycling" participants so as to stage many accidents.
77. Neo4j in Practice
Rings are detected by walking the graph, in real time.
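One way to see how recycled participants show up as rings is to model people and accidents as an undirected graph and look for loops. The sketch below uses union-find cycle detection, a swapped-in technique chosen for brevity rather than the traversal a graph database would run; `findRingEdges` and the sample participants are hypothetical.

```javascript
// Union-find over an undirected person-accident graph. An edge whose two
// endpoints are already connected closes a loop, which here means the same
// people are linked through more than one accident.
function findRingEdges(edges) {
  const parent = new Map();
  const find = (x) => {
    if (!parent.has(x)) parent.set(x, x);
    while (parent.get(x) !== x) x = parent.get(x);
    return x;
  };
  const closing = [];
  for (const [a, b] of edges) {
    const rootA = find(a);
    const rootB = find(b);
    if (rootA === rootB) closing.push([a, b]); // this edge closes a loop
    else parent.set(rootA, rootB);             // otherwise merge the two groups
  }
  return closing;
}

const involvedIn = [
  ["Driver A", "Accident 1"],
  ["Passenger B", "Accident 1"],
  ["Driver A", "Accident 2"],
  ["Passenger B", "Accident 2"], // the same pair appears together again
  ["Witness C", "Accident 3"],   // unrelated accident, no loop
];

const suspicious = findRingEdges(involvedIn);
console.log(suspicious);
```

A flagged edge is only a signal that the same group is connected through several accidents; deciding whether that is fraud still needs the claim details.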