The document describes an Entity Registry System (ERS) that allows for decentralized, linked data storage in a document store. It was designed to work in environments with poor network connectivity. The ERS uses contributors to write data, bridges to connect isolated parts of the system, and an optional aggregator for high-performance read-only data retrieval. Testing showed the ERS could tolerate disconnects and poor networks as long as connections lasted at least half a second. It was tested with up to 40 nodes and was able to reliably synchronize data in real-world simulation scenarios like a conference social network and remote merchants updating prices between villages via a mobile bridge.
2. Linked data
- In computing, linked data (often capitalized as Linked Data) describes a method of publishing structured data so that it can be interlinked and become more useful through semantic queries (Wikipedia)
- Tim Berners-Lee
- DBpedia alone: 3.4 million concepts described by 1 billion triples
3.
4. Example solutions - centralised
- rdf4j
- neo4j
- gremlin (apache tinkerpop)
- virtuoso
- many others
5. Entity Registry System
- decentralised
- linked data in a document store:
(s, p1, o1), (s, p2, o2) -> {"id": s, "p1": o1, "p2": o2, ...}
- designed for poor network connectivity
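
The triple-to-document mapping on this slide can be sketched in Python. This is a minimal illustration, not the ERS implementation: the function name `triples_to_documents` is hypothetical, and collecting a repeated predicate into a list is an assumption the slides do not confirm.

```python
from collections import defaultdict


def triples_to_documents(triples):
    """Group (subject, predicate, object) triples into one document per subject."""
    docs = defaultdict(dict)
    for s, p, o in triples:
        doc = docs[s]
        doc["id"] = s
        if p not in doc:
            # First value seen for this predicate: store it directly.
            doc[p] = o
        elif isinstance(doc[p], list):
            # Predicate already holds several values: append (assumed behavior).
            doc[p].append(o)
        else:
            # Second value for this predicate: promote to a list (assumed behavior).
            doc[p] = [doc[p], o]
    return list(docs.values())


documents = triples_to_documents([("s", "p1", "o1"), ("s", "p2", "o2")])
print(documents)
```

Keeping all properties of a subject in a single document means one entity can be read or shipped to another node as one unit, which fits the document-store design named on the slide.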
8. Contributor
- read and write content
- private, public and cache
- private never shared
- public can be searched by others and distributed
9. Bridge
- can connect isolated parts of the system
- similar to a cache (if the network goes down, data can still be read from the bridge)
- reduces the number of links from O(n^2) to O(n)
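
The link reduction can be checked with a short calculation. This sketch assumes undirected links and a single bridge that every node connects to; the function names are hypothetical.

```python
def full_mesh_links(n):
    """Links needed when each of n nodes connects directly to every other node."""
    return n * (n - 1) // 2


def bridge_links(n):
    """Links needed when each of n nodes connects only to one shared bridge."""
    return n


for n in (4, 40):
    print(n, full_mesh_links(n), bridge_links(n))
```

For 40 nodes this gives 780 direct links against 40 links through the bridge.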
10. Aggregator
- optional component
- high-performance node/cluster
- read-only entry retrieval of data
- contributors and bridges push public data to it
14. Similar solutions
Nintendo StreetPass and SpotPass (3DS only, no aggregator)
Transparent Inter-process Communication (TIPC)
- nodes, zones, clusters (only logical grouping)
- IP layer, bridges reduce links
Sugar Network (OLPC)
- in Sugar Network, clients can communicate with nodes (equivalent of bridges) or the master node (equivalent of the aggregator)
- ERS has more flexibility in the data format
15. Initial status
- not working
- very limited testing
- a bit frustrating to install
- not much investigation (real-world tests)
Thus far, the highest number of concurrent users of ERS has been 4 XO laptops, with one bridge, all in the same geographical location
16. Research Questions
- Can the Entity Registry System perform reliably in a real-world scenario (i.e. provide the required functionality in a robust manner)?
- Does the ERS scale to a large number of users?
- How does the ERS cope with poor network connectivity?
17. “local” tests
- unit tests for storage, daemon and API communication
- basic operations performance tests
- entity creation 5/s
- property edits 20/s
- value edits 20/s
18. Docker
- much larger number of nodes
- faster to start
- Docker Hub image -> ERS can be run within seconds on any x86 device (at the moment there is no Docker image for ARM)
19. “Real world” simulation
- simulations so far have had a more or less ideal network
- reality is a bit different
20. Simian army
- http://techblog.netflix.com/2011/07/netflix-simian-army.html
- December 2012: an Amazon employee launched a maintenance process against the running production system, which deleted the state information needed by load balancers
- problems began on Christmas Eve at 1:45 p.m.
- lasted until 9:41 a.m. on Christmas Day, an outage of about 20 hours
- no Netflix on Christmas - unhappy customers
21. Simian army - cont
- clunky to integrate
- only works on AWS
24. Real world case - Conference deployment
- Simulate conference social network
- Think LinkedIn without central server
- Profile, skills
- Endorsements
- Fixed bridge, mobile contributors
26. Remote merchants
- vendors in different remote villages
- no network connectivity
- box on a truck that visits every farmer, provides up-to-date information on the prices in the other villages, and picks up any new information from the current one
- Fixed contributors, mobile bridge
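
The mobile-bridge scenario can be sketched as a small simulation. This is an illustration under stated assumptions, not ERS code: the `sync` function, the village names and the prices are made up, and each village is assumed to write only its own prices, so no conflict resolution is needed.

```python
def sync(bridge, village):
    """Exchange entries in both directions so both sides hold the union."""
    bridge.update(village)
    village.update(bridge)


# Each village (a fixed contributor) starts out knowing only its own prices.
# Keys are (village, product); all values are made-up example data.
villages = {
    "A": {("A", "millet"): 120},
    "B": {("B", "millet"): 135},
    "C": {("C", "millet"): 110},
}
bridge = {}  # the box on the truck (mobile bridge)

# Two rounds: the first collects every price, the second delivers the
# prices that villages visited early in the first round had not yet seen.
for _ in range(2):
    for name in ("A", "B", "C"):
        sync(bridge, villages[name])

for name, prices in villages.items():
    print(name, sorted(prices))
```

After the first round only the last village visited holds every price; the second round brings the remaining villages up to date.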
27. Behavior
- As long as the network isn't too bad (detailed below), we achieve synchronization if the truck stops for a couple of seconds
28. Network tolerance
Fixed 5-second wait time, contributor writing as fast as it can
- 100 ms delay each way
- 15% loss/corruption each way
- duplication is fine
- mostly binary progress
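
A rough calculation shows what these figures imply for a single request/reply exchange. This sketch assumes that losses in the two directions are independent, that a corrupted packet counts as lost, and that an exchange needs one packet each way; none of this is stated on the slides, and the function names are hypothetical.

```python
def round_trip_success(loss_each_way):
    """Probability that a request and its reply both arrive."""
    return (1.0 - loss_each_way) ** 2


def expected_attempts(loss_each_way):
    """Mean number of tries until one round trip succeeds (geometric distribution)."""
    return 1.0 / round_trip_success(loss_each_way)


LOSS = 0.15      # loss/corruption each way, from the slide
DELAY_S = 0.100  # one-way delay in seconds, from the slide

print(f"round-trip success: {round_trip_success(LOSS):.2f}")
print(f"expected attempts: {expected_attempts(LOSS):.2f}")
print(f"minimum round-trip time: {2 * DELAY_S:.1f} s")
```

Under these assumptions about 72% of exchanges succeed on the first try, so on average roughly 1.4 attempts are needed, each taking at least 0.2 s.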
29. Research Questions Revisited
- Can the Entity Registry System perform reliably in a real-world scenario (i.e. provide the required functionality in a robust manner)? Suite deployed; tests indicate yes
- Does the ERS scale to a large number of users? Tested with 40 nodes
- How does the ERS cope with poor network connectivity?
- If the connection lasts 0.5 seconds or longer: yes