LDCache - a cache for linked data-driven web applications (MetaSolutions AB)
Presentation of LDCache at the Developers Workshop at the International Semantic Web Conference 2014. See http://ceur-ws.org/Vol-1268/paper12.pdf for the full paper and http://entrystore.org/ldcache/ for the project's website.
Connections that work: Linked Open Data demystified (Jakob)
Keynote given 2014-10-22 at the National Library of Finland at Kirjastoverkkopäivät 2014 (https://www.kiwi.fi/pages/viewpage.action?pageId=16767828) #kivepa2014
Repositories are systems for safely storing and publishing digital objects and their descriptive metadata. Repositories mainly serve their data through web interfaces that are primarily oriented towards human consumption. They either hide their data behind non-generic interfaces or do not publish them at all in a way a computer can easily process. At the same time, the data stored in repositories are particularly well suited for use in the Semantic Web, as metadata are already available: they do not have to be generated or entered manually for publication as Linked Data. In my talk I will present a concept of how metadata and digital objects stored in repositories can be woven into the Linked (Open) Data Cloud, and which characteristics of repositories have to be considered in doing so. One problem it targets is the use of existing metadata to publish Linked Data. The concept can be applied to almost every repository software. At the end of my talk I will present an implementation for DSpace, one of the most widely used repository software solutions. With this implementation, every institution using DSpace should be able to export its repository content as Linked Data.
Performance of graph query languages: analysis of the performance of graph query languages, a comparative study of Cypher, Gremlin and native access in Neo4j
This presentation discusses the value of inferred knowledge over LOD and presents a new version of FactForge, a reason-able view, the biggest body of heterogeneous generic knowledge on which inference is performed, showing examples of inferred statements across LOD datasets.
Using FME to Compile, Validate and Maintain a 4 Million Oil and Gas Well Data... (Safe Software)
More than four million well bores have been drilled across approximately 32 states in the United States. For its well data, each state agency must deal with an uncoordinated, autonomous data collection process, data model, and distribution methods. This session discusses how Whitestar uses FME to build an extensive set of dataflow models that regularly ingest the raw data, compute locations, verify elevations, perform data validation checks, and standardize the schema nationwide. We'll also highlight how FME is used to output data to a series of open-source PostgreSQL 8.4 database structures.
Data carving using artificial headers - info sec conference (Robert Daniel)
Digital forensic tools are an essential requirement in criminal, and increasingly civil, cases in order to process electronic evidence. Investigators rely upon the functionality of these tools to identify and extract relevant artifacts. One of these key processes is data carving: an approach that ignores the file system and analyses the drive for files that match a particular signature. Unfortunately, however, beyond simple files, data carving has many limitations that result in either missing files or producing high numbers of false alarms. The core of a data carver's detection is largely based upon a signature appearing in the header of the file. For files that have corrupted or missing headers, modern data carvers are therefore unable to recover the file successfully. This paper proposes a new approach to data carving that inserts an artificial header onto the file, thereby circumventing the header issue. Experiments have demonstrated that this approach is able to successfully recover files that no current data-carving tool can.
This deck gives a basic overview of NoSQL technologies, implementation vendors/products, case studies, and some of the core implementation algorithms. The presentation also gives a quick overview of emerging trends such as "Polyglot Persistence" and "NewSQL".
The deck is targeted at beginners who want to get an overview of NoSQL databases.
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes (MongoDB)
With so much talk of how Big Data is revolutionizing the world and how a data lake with Hadoop and/or Spark will solve all your data problems, it is hard to tell what is hype, reality, or somewhere in-between.
In working with dozens of enterprises in varying stages of their enterprise data management (EDM) strategy, MongoDB enterprise architect Matt Kalan sees the same challenges and misunderstandings arise again and again.
In this session, he will explain common challenges in data management, what capabilities are necessary, and what the future state of architecture looks like. MongoDB is uniquely capable of filling common gaps in the data lake strategy.
This session also includes a live Q&A portion during which you are encouraged to ask questions of our team.
Analyzing Semi-Structured Data At Volume In The Cloud (Robert Dempsey)
Presentation from Snowflake Computing at the November 2015 Data Wranglers DC meetup.
Cloud, mobile, and web applications are producing semi-structured data at an unprecedented rate. IT professionals continue to struggle to capture, transform, and analyze these complex data structures, mixed with traditional relational-style datasets, using conventional MPP and/or Hadoop infrastructures. Public cloud infrastructures such as Amazon and Azure provide almost unlimited resources and scalability to handle both structured and semi-structured data (XML, JSON, AVRO) at petabyte scale. These new capabilities, coupled with traditional data access methods such as SQL, give organizations and businesses new opportunities to leverage analytics at an unprecedented scale while greatly simplifying data pipeline architectures and providing an alternative to the "data lake".
What is NoSQL? How did it come into the picture? What are the types of NoSQL? Some basics of the different NoSQL types. Differences between RDBMS and NoSQL. Pros and cons of NoSQL.
What is MongoDB? What are the features of MongoDB? The Nexus architecture of MongoDB. The data model and query model of MongoDB. Various MongoDB data management techniques. Indexing in MongoDB. A working example using the MongoDB Java driver on Mac OS X.
Key aspects of big data storage and its architecture (Rahul Chaturvedi)
This paper helps readers understand the tools and technologies involved in a classic Big Data setting. Readers, especially enterprise architects, will find it helpful when choosing among Big Data database technologies in a Hadoop architecture.
The next-generation enterprise-class architecture - Massimo Brignoli (Data Driven Innovation)
The birth of data lakes - Companies are by now drowning in data, and the classic data warehouse struggles to churn through these data, given their volume and variety. Many have started looking at architectures called data lakes, with Hadoop as the reference technology. But is this solution right for everything? Come and learn how to operationalise data lakes to build modern data management architectures.
Understanding Metadata: Why it's essential to your big data solution and how ... (Zaloni)
In this O'Reilly webcast, Ben Sharma (cofounder and CEO of Zaloni) and Vikram Sreekanti (software engineer in the AMPLab at UC Berkeley) discuss the value of collecting and analyzing metadata, and its potential to impact your big data solution and your business.
Watch the replay here: http://oreil.ly/28LO7IW
Deep-dive into Microservices Patterns with Replication and Stream Analytics
Target Audience: Microservices and Data Architects
This is an informational presentation about microservices event patterns, GoldenGate event replication, and event stream processing with Oracle Stream Analytics. This session will discuss some of the challenges of working with data in a microservices architecture (MA), and how the emerging concept of a “Data Mesh” can go hand-in-hand to improve microservices-based data management patterns. You may have already heard about common microservices patterns like CQRS, Saga, Event Sourcing and Transaction Outbox; we’ll share how GoldenGate can simplify these patterns while also bringing stronger data consistency to your microservice integrations. We will also discuss how complex event processing (CEP) and stream processing can be used with event-driven MA for operational and analytical use cases.
Business pressures for modernization and digital transformation drive demand for rapid, flexible DevOps, which microservices address, but also for data-driven Analytics, Machine Learning and Data Lakes which is where data management tech really shines. Join us for this presentation where we take a deep look at the intersection of microservice design patterns and modern data integration tech.
Reference Architectures for Layered CPS System of Systems using Data Hubs and... (Bob Marcus)
Describes extensions of current NIST Reference Architectures and Frameworks that are needed to handle CPS System of Systems Use Cases (e.g. Smart Grid, Smart City). These extensions include Data Hubs and CPS Hubs.
Best Practices for Building and Deploying Data Pipelines in Apache Spark (Databricks)
Many data pipelines share common characteristics and are often built in similar but bespoke ways, even within a single organisation. In this talk, we will outline the key considerations which need to be applied when building data pipelines, such as performance, idempotency, reproducibility, and tackling the small file problem. We’ll work towards describing a common Data Engineering toolkit which separates these concerns from business logic code, allowing non-Data-Engineers (e.g. Business Analysts and Data Scientists) to define data pipelines without worrying about the nitty-gritty production considerations.
We’ll then introduce an implementation of such a toolkit in the form of Waimak, our open-source library for Apache Spark (https://github.com/CoxAutomotiveDataSolutions/waimak), which has massively shortened our route from prototype to production. Finally, we’ll define new approaches and best practices about what we believe is the most overlooked aspect of Data Engineering: deploying data pipelines.
Government GraphSummit: Leveraging Knowledge Graphs for Foundational Intellig... (Neo4j)
Jim McHugh, VP, National Intelligence Solutions, BigBear.ai
Today’s Intelligence Analysts need the ability to ingest, filter, and conflate large volumes of data from disparate intelligence feeds from controlled and publicly available sources for quick and accurate decision-making. In this session, I will describe how BigBear.ai leverages knowledge graphs to extend existing data warehouses and analysis platforms to extract meaningful and actionable insights in decreased time as data volumes continue to increase in both size and complexity.
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D... (Databricks)
Many have dubbed the 2020s the decade of data; this is indeed an era of data zeitgeist.
From code-centric software development 1.0, we are entering software development 2.0, a data-centric and data-driven approach, where data plays a central theme in our everyday lives.
As the volume and variety of data garnered from myriad sources continue to grow at an astronomical scale, and as cloud computing offers cheap compute and storage at scale, data platforms have to match in their abilities to process, analyze, and visualize data at scale, at speed, and with ease. This involves paradigm shifts in how data is processed and stored, and in the programming frameworks offered to developers for accessing and working with these platforms.
In this talk, we will survey some emerging technologies that address the challenges of data at scale, how these tools help data scientists and machine learning developers with their data tasks, why they scale, and how they facilitate the future data scientists to start quickly.
In particular, we will examine in detail two open-source tools MLflow (for machine learning life cycle development) and Delta Lake (for reliable storage for structured and unstructured data).
Other emerging tools, such as Koalas, help data scientists do exploratory data analysis at scale in a language and framework they are already familiar with; we will also touch on emerging data + AI trends in 2021.
You will understand the challenges of machine learning model development at scale, why you need reliable and scalable storage, and what other open source tools are at your disposal to do data science and machine learning at scale.
Talk given as part of the training sessions offered by BibliEst in early 2011. Online services, with a separate specific point on rights issues presented by Me Martin.
A general presentation of what metadata are and some of the questions they raise, followed by a proposed typology of metadata standards.
The animations are missing.
Version 1.1
Ready to Unlock the Power of Blockchain! (Toptal Tech)
Imagine a world where data flows freely, yet remains secure. A world where trust is built into the fabric of every transaction. This is the promise of blockchain, a revolutionary technology poised to reshape our digital landscape.
Toptal Tech is at the forefront of this innovation, connecting you with the brightest minds in blockchain development. Together, we can unlock the potential of this transformative technology, building a future of transparency, security, and endless possibilities.
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024 (APNIC)
Ellisha Heppner, Grant Management Lead, presented an update on APNIC Foundation to the PNG DNS Forum held from 6 to 10 May, 2024 in Port Moresby, Papua New Guinea.
Understanding User Behavior with Google Analytics (SEO Article Boost)
Unlocking the full potential of Google Analytics is crucial for understanding and optimizing your website’s performance. This guide dives deep into the essential aspects of Google Analytics, from analyzing traffic sources to understanding user demographics and tracking user engagement.
Traffic Sources Analysis:
Discover where your website traffic originates. By examining the Acquisition section, you can identify whether visitors come from organic search, paid campaigns, direct visits, social media, or referral links. This knowledge helps in refining marketing strategies and optimizing resource allocation.
User Demographics Insights:
Gain a comprehensive view of your audience by exploring demographic data in the Audience section. Understand age, gender, and interests to tailor your marketing strategies effectively. Leverage this information to create personalized content and improve user engagement and conversion rates.
Tracking User Engagement:
Learn how to measure user interaction with your site through key metrics like bounce rate, average session duration, and pages per session. Enhance user experience by analyzing engagement metrics and implementing strategies to keep visitors engaged.
Conversion Rate Optimization:
Understand the importance of conversion rates and how to track them using Google Analytics. Set up Goals, analyze conversion funnels, segment your audience, and employ A/B testing to optimize your website for higher conversions. Utilize ecommerce tracking and multi-channel funnels for a detailed view of your sales performance and marketing channel contributions.
Custom Reports and Dashboards:
Create custom reports and dashboards to visualize and interpret data relevant to your business goals. Use advanced filters, segments, and visualization options to gain deeper insights. Incorporate custom dimensions and metrics for tailored data analysis. Integrate external data sources to enrich your analytics and make well-informed decisions.
This guide is designed to help you harness the power of Google Analytics for making data-driven decisions that enhance website performance and achieve your digital marketing objectives. Whether you are looking to improve SEO, refine your social media strategy, or boost conversion rates, understanding and utilizing Google Analytics is essential for your success.
Bridging the Digital Gap: Brad Spiegel Macon, GA Initiative (Brad Spiegel Macon GA)
Brad Spiegel Macon GA’s journey exemplifies the profound impact that one individual can have on their community. Through his unwavering dedication to digital inclusion, he’s not only bridging the gap in Macon but also setting an example for others to follow.
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to... (Florence Consulting)
The fourteenth Milan meetup, held in Milan on 23 May 2024 from 5:00 PM to 6:30 PM, both in person and remotely.
We discussed how Axpo Italia S.p.A. reduced its technical debt by migrating its APIs from Mule 3.9 to Mule 4.4, also moving from on-premises to CloudHub 1.0.
Italy Agriculture Equipment Market Outlook to 2027 (harveenkaur52)
Agriculture and Animal Care
Ken Research has expertise in the Agriculture and Animal Care sector and offers a vast collection of information related to all major aspects, such as agriculture equipment, crop protection, seed, agriculture chemicals, fertilizers, protected cultivators, palm oil, hybrid seed, animal feed additives and many more.
Our continuous study and findings in the agriculture sector provide better insights to companies dealing with related products and services, government and agriculture associations, researchers and students, helping them understand the present and expected scenario.
Our Animal Care category provides solutions on animal healthcare and related products and services, including animal feed additives and vaccination
Open data models and description languages 6 - 2021-2022
1. Open data models and description languages - 6
Licence DIST
2021-2022
2. Reminder: overall programme
• Understand the notion of metadata; an approach to markup languages (s1)
• Understand XML and its applications (s2-4)
• Open data, processing and computational materialisation (s5-6)
3. Programme for this session
1. Final assessment
2. Reminder: the web of data
3. Data models and infrastructure
4. Questions and answers on the assignments
5. Course wrap-up
5. 2) Reminder: the web of data
• Databases in the web of data are made up of a set of subject - predicate - object statements: triples
• From a logic of data tables with binary key/value relations, we thus move to a data-graph system (the structure of the links woven by the triples)
6. The web of data
Example from the RDF 1.1 Primer
<Bob> <is a> <person>.
<Bob> <is a friend of> <Alice>.
<Bob> <is born on> <the 4th of July 1990>.
<Bob> <is interested in> <the Mona Lisa>.
<the Mona Lisa> <was created by> <Leonardo da Vinci>.
<the video 'La Joconde à Washington'> <is about> <the Mona Lisa>.
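The plain-English triples above can be sketched as a tiny in-memory graph. A minimal Python sketch, assuming nothing beyond the standard library (the tuple representation and the `match` helper are illustrative, not part of any RDF tooling):

```python
# Each statement is a (subject, predicate, object) triple, as in the slide.
triples = [
    ("Bob", "is a", "person"),
    ("Bob", "is a friend of", "Alice"),
    ("Bob", "is born on", "the 4th of July 1990"),
    ("Bob", "is interested in", "the Mona Lisa"),
    ("the Mona Lisa", "was created by", "Leonardo da Vinci"),
    ("the video 'La Joconde à Washington'", "is about", "the Mona Lisa"),
]

def match(s=None, p=None, o=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# The graph structure emerges from shared nodes: "the Mona Lisa" is both
# the object of Bob's interest and the subject of its own statement.
print(match(s="the Mona Lisa"))  # statements about the Mona Lisa as subject
print(match(o="the Mona Lisa"))  # statements pointing to it as object
```

Wildcard matching of this kind is exactly what SPARQL triple patterns do over real RDF stores, where the plain labels above would be URIs.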
8. The web of data
• A fundamentally open data model
– Stable access points (via URIs)
– A logic of reuse allowing endless enrichment
– A design shared without restriction, a "commons"
9. The web of data
Photo by Michael Dziedzic on Unsplash
Photo by DevVrat Jadon on Unsplash
https://commons.wikimedia.org/wiki/File:Super_glue.jpg
However powerful this data model may be, it is ONE technical option among others, with its preferred uses. Would anyone imagine:
- Nailing together a matchstick Eiffel Tower?
- Gluing down the rails of the TGV lines?
10. The web of data
After the "hype" phase and then the pioneers' frustration, a progressive rise in usage
Gartner Hype Cycle, Jérémy Kemp, CC-BY-SA
11. The web of data
On very different levels: the situation of XML (a metadata format), the web of data (a data model, hence a metadata model) and JSON (JavaScript Object Notation - a data encoding)
(Figure: Gartner Hype Cycle, Jérémy Kemp, CC-BY-SA, with JSON, Web3 and XML positioned along the curve)
12. 3) Data models and infrastructure
Data models rest on a technical infrastructure that must not be neglected
(from https://medium.com/sogetiblogsnl/the-role-of-the-dba-on-modern-data-platforms-da40f5ad8f78)
Building on open data models eases the portability and interoperability of services / applications
Low-level storage and computation are more or less suited to (= optimised for) these data models
13. Data models and infrastructure
• The services enabled by a data model only work if the technical infrastructure also makes them possible / feasible
14. Data models and infrastructure
• The typology of databases is not very fixed; here is one by way of example.
(Figure from https://iaobs.com/blog/bases-de-donnees-relationnelles-et-nosql-dans-gcp/, with annotations: OLTP; XML? web of data?)
15. Data models and infrastructure
• Most of the time, an infrastructure can handle every data model, but not with the same performance (here, RDF)
From https://www.sciencedirect.com/science/article/pii/S0268401219306097
16. Data models and infrastructure
• Performance varies above all with the operations run on the data (1/2)
From https://medium.com/profil-software-blog/database-compare-sql-vs-nosql-mysql-vs-postgresql-vs-redis-vs-mongodb-3da5f41c31b5
17. Data models and infrastructure
• Performance varies above all with the operations run on the data (2/2)
From https://medium.com/profil-software-blog/database-compare-sql-vs-nosql-mysql-vs-postgresql-vs-redis-vs-mongodb-3da5f41c31b5
18. Data models and infrastructure
Strengths and weaknesses of the different database systems:
• Relational databases: heavy read/write loads; highly structured data; transaction guarantees
• Document databases: few or no updates; semi-structured data; safe scaling with data volume
• Graph databases: highly structured data; inference over the data; respect for the logic of the data
• Search engines: full-text queries; fast responses; scaling with the number of users
There is no ideal database; each has its strengths and weaknesses.
CC-BY Gautier Poupeau - 2019
19. Data models and infrastructure
• A "data-centric" logic -> increasingly, separate data stores serving multiple services
Excerpt from Data Lake for Enterprises - Tomcy John and Pankaj Misra - Packt - ISBN 9781787281349
20. Data models and infrastructure
• The destruction of the one format 💍 (its destiny, no doubt...)
• Why care about choosing open formats for storing or expressing metadata (XML, RDF, JSON...)?
21. Data models and infrastructure
• Why choose open formats for storing or expressing metadata?
– To regain control over public data (or to profit from it!)
– To open up to the unknown
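To make the idea of "open formats of expression" concrete, the same metadata record can be written out in two of the formats named above using only Python's standard library. A minimal sketch; the record and its fields are invented for illustration:

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical metadata record for a cultural object.
record = {"title": "Mona Lisa", "creator": "Leonardo da Vinci", "type": "painting"}

# JSON expression: a direct, text-based encoding of the key/value pairs.
as_json = json.dumps(record, ensure_ascii=False)

# XML expression: the same fields become child elements of a <record> element.
root = ET.Element("record")
for field, value in record.items():
    ET.SubElement(root, field).text = value
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
```

Because both serialisations are openly specified, any other tool or service can read them back without depending on the software that produced them, which is exactly the portability argument made on the slide.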
24. And for what comes next...
Good luck!
Practise spotting XML and its equivalents; you now know that this can be mastered with some knowledge and common sense.
Finally, thank you for your efforts in handling this technical side of the profession...