This presentation is a review of the NoSQL space that I gave at the X Jornades de Programari Lliure in Barcelona.
It covers the NoSQL movement, its use cases, and the main technologies, with a special look at graph databases, and more.
Special thanks to @Hagenburger, @sbitxu, and @jannis, and to the great @jimwebber and the amazing community for the inspiration.
Try NoSQL: it doesn't hurt and it's fun
1. NoSQL
Pere Urbón-Bayes
Moviepilot GmbH
@purbon
purbon@purbon.com
Thursday, 30 June 2011
2. What are we going to talk about?
Where are we, and where do we come from?
NoSQL. { “motivation” : “use cases” }
Graph databases.
....
3. {
"if_you":{
"are_the_master_of": [ "movies", "data analytics", "ruby", "git", "nosql" ],
"love":"recommendation systems",
"would_love_to_know_about":"graph_databases",
"believe_in":"open source"
},
"join_us":"true",
"contact_with":"jobs at moviepilot.com"
}
Moviepilot is a leading provider and discovery service for movies and series based in Berlin!
4. Coming and Going
History
1960s: Navigational databases.
1970: Relational databases. Edgar Codd's relational algebra.
Late 1970s: SQL DBMSs.
SQL, DB2, Ingres, PostgreSQL, Sybase,
7. Where are we now?
Is everything related?
Timeline figure (axis: 1990, 2000, 2010, 2030) with the labels: Semantic Web, Business Intelligence, Tagging, Folksonomies, Social networks, Linked Data, RDF, RDBMS, Blogs, Text Files.
8. Where are we now?
Timeline figure (axis: 1980, 1990, 2000, 2010, 2020).
9. How are our apps...?
Data warehousing and Business Intelligence.
Stream processing.
Text search.
Scientific processing.
Semi-(un)-structured data.
10. How are our apps...?
Need to scale horizontally.
Partition and replication.
OLTP and OLAP.
Web 2.0.
Performance, performance, performance.
Flexibility.
Big, even huge, datasets.
.....?
11. NoSQL
select fun, profit from real_world where relational=false and barcelona=true;
Carlo Strozzi, 1998.
Eric Evans (Rackspace) and Johan Oskarsson (last.fm), early 2009.
no:sql(east) 2009, no:sql(eu) 2010.
12. NoSQL
Ability to scale horizontally.
Replication and distribution.
Weaker concurrency model.
Smart use of resources.
Access through different end points.
Dynamic schema environment.
Leave more business logic to the app side.
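To make "scale horizontally" and "replication and distribution" concrete, here is a minimal sketch of hash partitioning with replicas. The node names, the `owners` helper, and the choice of MD5 modulo the node count are illustrative assumptions, not how any particular NoSQL product places data.

```python
import hashlib

def owners(key, nodes, replicas=2):
    """Return the nodes that store `key`: a primary plus the following replicas."""
    # A stable hash keeps the placement identical across runs and machines.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]

# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]
for key in ["user:1", "user:2", "movie:42"]:
    print(key, "->", owners(key, nodes))
```

Plain modulo placement moves most keys whenever a node is added or removed, which is why real systems often use consistent hashing instead.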
14. Dismantle
Store
Rebuild
Enjoy
Figure labels: Brick, Window, Roof; Unstructured, Structured, Unstructured?
15. ACID
Atomicity: All operations are executed or none is.
Consistency: Data is consistent after the transaction.
Isolation: Transactions are independent.
Durability: Changes persist, even if there are failures.
Helps
Understanding data.
Guaranteed persistence.
Hurts
Horizontal scale.
High availability.
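Atomicity is the easiest of the four properties to show in a few lines. The sketch below uses Python's standard `sqlite3` module; the `accounts` table, the account names, and the `transfer` helper are hypothetical and only serve to show "all operations are executed or none is".

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical two-account setup in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds, both rows change
failed = transfer(conn, "alice", "bob", 500)  # fails, both updates are rolled back
print(ok, failed, balances(conn))
```

The second transfer leaves no trace: the debit and the credit were both applied inside the transaction and both undone by the rollback.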
16. “There is a magic bullet!
It's called relaxing the requirements.”
- Evan Weaver, @evan
17. CAP
Consistency: Each client has the same view.
Availability: All clients can read and write.
Partition tolerance: Works well across different network partitions.
Only two!!!!
Figure: C, A, P diagram showing mysql, redis, and riak.
18. “You have database problem. You
research blog and HN. You start use
NoSQL product. Now you not
know anymore if you have problem.”
- Devops BORAT, @devops_borat
19. NoSQL systems.
Most common
Column DBs.
Document DBs.
Key-Value DBs.
Graph DBs.
Object DBs.
Other systems
XML databases.
Grid databases.
RDF.
....
20. Column Databases
A DBMS that stores its content by column rather than by row. This has advantages for data warehouses.
More efficient for aggregates and when access is column oriented.
Suited to OLAP, less so to OLTP.
First implementations: early 1970s.
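The difference between the two layouts can be sketched with plain Python structures. The sample rows and the `to_columns` helper are made up for illustration; a real column store adds compression and on-disk layout on top of this idea.

```python
def to_columns(rows):
    """Pivot a list of row dicts into one list per column (assumes identical keys)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

# Hypothetical sample data, stored row by row.
rows = [
    {"title": "Alien", "year": 1979, "votes": 120},
    {"title": "Brazil", "year": 1985, "votes": 80},
    {"title": "Heat", "year": 1995, "votes": 100},
]
columns = to_columns(rows)

# Row layout: the aggregate has to visit every row record.
total_from_rows = sum(row["votes"] for row in rows)
# Column layout: the aggregate reads a single list of values.
total_from_columns = sum(columns["votes"])
print(total_from_rows, total_from_columns)
```

Reading one whole record is the opposite case: the row layout returns it in one step, while the column layout has to pick one value out of every list, which is why the slide rates it below row stores for OLTP.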
21. Apache Cassandra
Designed to handle very large amounts of data spread across multiple commodity servers.
High availability with no single point of failure (SPOF).
Born at Facebook, to power Inbox Search.
Hybrid system, between column-oriented and row-oriented.
Initial release 2008. Version 0.8.1, 28/06/11.
22. Key-Value Databases
Allow the user to store key-value pairs, where the key usually consists of a string and the value is a simple primitive.
Suited to use cases where properties and values are enough, e.g. profiles, logs, etc.
Eventually consistent, hierarchical, multivalued, etc.
First implementations: around 1980.
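The key-value model is small enough to sketch as a toy in-memory class. `KeyValueStore` and the `profile:...` key naming are assumptions for illustration; this is not the API of any real product.

```python
class KeyValueStore:
    """Toy in-memory key-value store: string keys, simple values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove `key`; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False

# Hypothetical profile data: one key per property.
store = KeyValueStore()
store.put("profile:purbon:city", "Berlin")
store.put("profile:purbon:logins", 42)
print(store.get("profile:purbon:city"), store.get("profile:purbon:age", "unknown"))
```

The store can only look values up by exact key; any richer query has to be encoded in the key names or done on the application side.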
23. Redis.io
Open-source, networked, in-memory, persistent, journaled, key-value datastore.
Bindings for the major languages.
A data structure store.
Master-slave replication. High performance.
Initial release 2009. Version 2.2.7, 11/05/11.
24. Document Databases
A DBMS where the default unit of storage is a document: XML, JSON, YAML, ...
More complex than a key-value store.
Suited to multi-document apps: news, CVs, ...
Eventual consistency, limited atomicity and isolation.
One of the first: Lotus Notes, 1989.
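As a rough sketch of the document model, the toy store below keeps JSON-style documents with different shapes side by side and queries them by field. `DocumentStore`, its methods, and the sample documents are hypothetical.

```python
import json

class DocumentStore:
    """Toy document store: schemaless JSON documents, queried by field equality."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        # The JSON round trip rejects non-serializable values and stores a copy.
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = json.loads(json.dumps(document))
        return doc_id

    def find(self, **criteria):
        """Return every document whose fields match all the given values."""
        return [
            doc for doc in self._docs.values()
            if all(doc.get(name) == value for name, value in criteria.items())
        ]

# Hypothetical documents: a news item and a CV do not share a schema.
store = DocumentStore()
store.insert({"type": "news", "title": "NoSQL meetup", "tags": ["nosql", "barcelona"]})
store.insert({"type": "cv", "name": "John Smith", "skills": ["ruby", "git"]})
print(store.find(type="cv"))
```

Unlike the key-value case, the store can look inside the value, so queries on document fields are possible without a fixed schema.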
25. OrientDB
Open source database written in Java.
Schema-[full,less,mix] modes.
Supports SQL, HTTP, REST, and JSON. ACID compliant. Distributed and scalable.
Light and embeddable. Bindings for most languages.
Initial release 2010. Version 1.0rc2, 17/06/11.
26. Graph Databases
A database that uses graph structures with nodes, edges, and properties.
Suited to associative datasets; maps well to object-oriented application structure. Avoids expensive joins.
Powerful for graph-like operations such as shortest paths, communities, etc.
First implementations: around 2007.
28. What is a graph?
Graph G(V,E) where V = {v1,v2,...,vN} and E = {e1,e2,...,eM}
Directed / Undirected
Mixed
Multigraph
Weighted
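The definition G(V,E) maps directly onto an adjacency structure. The sketch below is one possible encoding, with made-up vertex names and weights; `build_graph` is a hypothetical helper.

```python
def build_graph(vertices, edges, directed=True):
    """Build {vertex: {neighbor: weight}} from (source, target, weight) triples."""
    adjacency = {vertex: {} for vertex in vertices}
    for source, target, weight in edges:
        adjacency[source][target] = weight
        if not directed:
            # An undirected edge is stored once in each direction.
            adjacency[target][source] = weight
    return adjacency

# Hypothetical weighted graph with three vertices and two edges.
V = ["v1", "v2", "v3"]
E = [("v1", "v2", 1.0), ("v2", "v3", 2.5)]

directed = build_graph(V, E, directed=True)
undirected = build_graph(V, E, directed=False)
print(directed["v2"], undirected["v2"])
```

A dict of dicts holds at most one edge per pair of vertices, so a multigraph would need a list of edges per pair instead.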
30. Graph Databases
The Property Graph
Abstractions
Nodes and Relationships.
Properties on both.
John Smith liked http://www.example.com at 01/10/11
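The example sentence on this slide can be written down as a tiny property graph. The `Node` and `Relationship` classes, the `LIKED` type name, and the `outgoing` helper are illustrative assumptions, not the API of any graph database.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: int
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)

def outgoing(node, relationships, rel_type):
    """Follow relationships of one type that start at `node`."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]

# "John Smith liked http://www.example.com at 01/10/11"
john = Node(1, {"name": "John Smith"})
page = Node(2, {"url": "http://www.example.com"})
liked = Relationship(john, "LIKED", page, {"at": "01/10/11"})

for target in outgoing(john, [liked], "LIKED"):
    print(john.properties["name"], "liked", target.properties["url"])
```

The date lives on the relationship, not on either node, which is the "properties on both" point of the slide.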
31. Graph Databases
Applications
Task planning
Scheduling
Process assignment
Routing
Logistics
League planning
Pattern recognition
Dependency analysis
Impact analysis
Network flow
Traffic analysis and optimization
Delivery optimization
Optimization of tasks
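Scheduling, dependency analysis, and impact analysis from the list above can be sketched on one small dependency graph. The task names and the `impacted_by` helper are hypothetical; `graphlib.TopologicalSorter` is part of the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical build tasks: each key depends on the tasks in its set.
dependencies = {
    "deploy": {"test", "package"},
    "package": {"compile"},
    "test": {"compile"},
    "compile": set(),
}

# Scheduling: an order in which every task comes after its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

def impacted_by(task, dependencies):
    """Impact analysis: every task that directly or indirectly depends on `task`."""
    impacted = set()
    frontier = [task]
    while frontier:
        current = frontier.pop()
        for candidate, needs in dependencies.items():
            if current in needs and candidate not in impacted:
                impacted.add(candidate)
                frontier.append(candidate)
    return impacted

print(order)
print(impacted_by("compile", dependencies))
```

Both questions are traversals of the same graph, one following the dependency edges forward and the other backward.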
32. Graph Databases
Applications
Recommendations
Heuristics (PageRank)
Local
K-nearest neighbors
Walks
Search algorithms
Shooting stars
Shortest paths
Hammock functions
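Shortest paths are the classic example from this list. The sketch below uses breadth-first search, which finds a path with the fewest hops in an unweighted graph; the friendship graph is made up, and weighted graphs would need a different algorithm such as Dijkstra's.

```python
from collections import deque

def shortest_path(adjacency, start, goal):
    """Breadth-first search: a fewest-hops path from start to goal, or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the predecessor links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in adjacency.get(current, ()):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None

# Hypothetical directed "follows" graph.
follows = {
    "ana": ["bob", "carl"],
    "bob": ["dora"],
    "carl": ["dora"],
    "dora": ["eve"],
    "eve": [],
}
print(shortest_path(follows, "ana", "eve"))
print(shortest_path(follows, "eve", "ana"))
```

Each step only follows the direct references of the current vertex, which is the traversal style that graph databases are built around.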
33. Graph Databases
Applications
Semantic web
RDF (OWL) Store
RDF-Sail
SPARQL
Linked data (Open Data)
Link analysis
Structure mining
34. Graph Databases
Vendors
Neo4J: Open source NoSQL graph database.
HyperGraphDB: An AI and semantic web graph database.
Infogrid: The Internet Graph database.
Sones: SaaS .NET graph database.
OrientDB: The Document-GraphDB.
FlockDB: The Twitter graph DB.
Pregel: Graph processing at Google.