The document discusses various data exchange formats and their pros and cons for serializing and transmitting structured data. It begins with an overview of JSON and its limitations before examining alternatives like Protocol Buffers, XML, MessagePack, and ASN.1. The presenter provides details on ASN.1's advanced features like parametrized types and string ranges. While ASN.1 is very rich, the quality of open source libraries is variable. In summary, the document analyzes different data serialization options beyond JSON and their tradeoffs.
16. Can be serialized in only one way
{"username": "MrPresident", "firstName": "Frank", "lastName": "Underwood", "age": 50, "email": "president@whitehouse.gov", "badges": ["caring", "loving"]}
17. Schema is sent with every message
{
  "email": "president@whitehouse.gov",
  "username": "MrPresident",
  "firstName": "Frank",
  "lastName": "Underwood",
  "age": 50,
  "badges": ["caring", "loving"]
}
18. Schema can change at any time
{
  "email": "president@whitehouse.gov",
  "username": "MrPresident",
  "first_name": "Frank",
  "last_name": "Underwood",
  "age": 50,
  "badges": ["caring", "loving"]
}
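The rename above (firstName → first_name) is exactly the kind of schema drift that silently breaks JSON consumers, since nothing on the wire announces the change. A minimal sketch of the failure mode (the consumer function is hypothetical):

```python
import json

# Hypothetical consumer written against the slide-17 schema,
# which still expects camelCase field names.
def full_name(message: str) -> str:
    user = json.loads(message)
    return f"{user['firstName']} {user['lastName']}"

old = '{"firstName": "Frank", "lastName": "Underwood", "age": 50}'
new = '{"first_name": "Frank", "last_name": "Underwood", "age": 50}'

print(full_name(old))        # works: Frank Underwood
try:
    full_name(new)           # renamed fields break the consumer at runtime
except KeyError as e:
    print(f"missing field: {e}")
```

With a schema-checked binary format, this mismatch would surface at build or schema-registration time instead of as a production KeyError.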
23. Erik Naggum
"(...) those who think it is rational to require parsing of character data at each and every application interface are literally retarded and willfully blind. (...) But, alas, people prefer buggy text formats that they can approximate rather than precise binary formats that follow general rules that are make them as easy to use as text formats."
http://www.xach.com/naggum/articles/3242274237190594@naggum.no.txt
26. About ASN.1
● Created in 1984 by the ITU (International Telecommunication Union)
● Revised and changed many times later
● There is no ASN.2 :)
● Heavily used by the telecommunications industry (UMTS, LTE, SIM cards)
● Cryptography: PKCS, PKI (X.509)
● LDAP, RFID
33. ASN.1 - quality of libraries
Because the language is quite rich, the quality of libraries varies greatly.
The most featureful appears to be OSS Nokalva (oss.com), but their product is proprietary and not free.
34. ASN.1 - my opinion
Opinions about ASN.1 vary. IMHO it hasn't reached the critical mass required for the open-source ecosystem to develop and for great tools to appear that support all of its advanced features.
Bandwagon effect: once a product becomes popular, more people tend to "get on the bandwagon" and buy it, too [Wikipedia]
35. ASN.1 demo
And let’s not forget about
openssl asn1parse -inform DER -in server-message-received
openssl asn1parse -in id_rsa.asn1-test
https://lapo.it/asn1js/
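What asn1parse is walking through is DER's tag-length-value (TLV) encoding. A stdlib-only Python sketch of one TLV record (assuming short-form lengths, i.e. under 128 bytes; real DER also has long-form lengths and constructed types):

```python
# Parse a single DER tag-length-value triple.
# Assumption: short-form length only (first length byte < 0x80).
def parse_tlv(data: bytes):
    tag = data[0]             # identifier octet, e.g. 0x02 = INTEGER
    length = data[1]          # short-form length: one byte
    value = data[2:2 + length]
    return tag, length, value

# DER encoding of the INTEGER 5: tag 0x02, length 1, value 0x05
tag, length, value = parse_tlv(bytes([0x02, 0x01, 0x05]))
print(hex(tag), length, value.hex())  # 0x2 1 05
```

This is the same structure lapo.it/asn1js renders as a tree, applied recursively to constructed types like SEQUENCE.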
37. protobuf - general remarks
● Only one official serialization protocol
● 3rd-party JSON encoder: https://github.com/dpp-name/protobuf-json
● ASN.1 with PER encoding is more compact: http://stackoverflow.com/a/4441622
● Tags are required (contrary to ASN.1)
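The "tags" in the last bullet are literal bytes on the wire: each protobuf field starts with a key computed as (field_number << 3) | wire_type, and integers are varint-encoded. A stdlib-only sketch (varint fields only; the field number 4 for "age" is a hypothetical schema choice):

```python
# Encode an unsigned int as a protobuf base-128 varint:
# 7 payload bits per byte, high bit set on all but the last byte.
def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        n, byte = n >> 7, n & 0x7F
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)

def encode_field(field_number: int, value: int) -> bytes:
    key = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(key) + encode_varint(value)

# hypothetical field 4 ("age") set to 50
print(encode_field(4, 50).hex())  # 2032
```

Because the key carries the field number rather than the field name, renaming a field is free on the wire, but renumbering one is a breaking change.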
38. protobuf - who uses it?
● Google
● Hadoop
● OpenStreetMap for their PBF Format
● Ubuntu One did use it for storage (before it was shut down)
● … know more?
39. protobuf - officially supported languages
● C++
● Java
● Python
● Objective-C (3.0+)
● C# (3.0+)
40. protobuf - unofficially supported languages
● Clojure
● Common Lisp
● Erlang
● Go
● Haskell
● JavaScript
● Lua
● Ruby
● Scala
● and many others
41. protobuf - RPC
Similarly to ASN.1, protobuf doesn't have built-in RPC, but take a look at http://www.grpc.io/
45. Thrift - general remarks
● Started at Facebook, now part of the Apache Software Foundation
● RPC is built-in
● Supports multiple protocols (Binary, Compact, JSON, multiplexed, simple JSON, tuple, ...) and transports (Empty, Framed, HttpClient, ...) out of the box
● Tags are required
● http://diwakergupta.github.io/thrift-missing-guide/
46. Thrift - officially supported languages
● C++
● C#
● Erlang
● Haskell
● Java
● JavaScript
● Python
● Ruby
● and many others with varying protocol/transport support
47. Thrift - who uses it?
● Facebook
● Hadoop
● Cassandra DID use it but switched to CQL
● Evernote
● LastFM
52. Cap'n Proto
● Similar to the previous ones
● Tags are obligatory
● RPC is built-in
● One encoding protocol, with optional "packing" for data compression
● No encoding/decoding step; data is already kept encoded in memory
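The "no decoding step" idea can be illustrated with fixed-offset field access (this is a sketch of the concept only, not Cap'n Proto's actual pointer-based layout; the message layout here is hypothetical):

```python
import struct

# With a fixed layout, the in-memory form IS the encoded form:
# fields are read straight out of the wire bytes at known offsets.
LAYOUT = struct.Struct("<IB")   # hypothetical message: uint32 id, uint8 age

buf = LAYOUT.pack(7, 50)        # "already encoded in memory"

def read_age(message: bytes) -> int:
    # access one field without decoding the rest of the message
    return struct.unpack_from("<B", message, 4)[0]

print(read_age(buf))  # 50
```

The payoff is that passing a message along (or memory-mapping it from disk) costs nothing, because there is no parse/serialize round trip.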
53. Cap'n Proto - supported languages
● C++
● Erlang
● JavaScript (Node.js only)
● Python
● Ruby
54. Cap'n Proto - supported languages (no RPC)
● C
● C#
● Go
● Java
● JavaScript
● Lua
● OCaml
61. Avro - general remarks
● Developed by the Apache Software Foundation
● Primarily used in Hadoop
● Dynamic schema, which can be specified either in JSON or a custom IDL (Interface Definition Language)
● Serialization to binary or JSON
● No tags! Every field is optional
63. Avro - dynamic schema
Because the schema is dynamic, every message must be sent along with its schema definition.
This resembles JSON Schema packed together with the JSON that holds the actual data.
64. Avro - dynamic schema ctd.
To decode a message you always need the schema it was encoded with.
But: Avro has smart schema-resolution rules, so if the reader expects the message in a different format, Avro will translate the decoded data to that format.
66. Avro - schema resolution for records
● the ordering of fields may be different: fields are matched by name.
● schemas for fields with the same name in both records are resolved recursively.
● if the writer's record contains a field with a name not present in the reader's record, the writer's value for that field is ignored.
● if the reader's record schema has a field that contains a default value, and the writer's schema does not have a field with the same name, then the reader should use the default value from its field.
● if the reader's record schema has a field with no default value, and the writer's schema does not have a field with the same name, an error is signalled.
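The resolution rules above can be sketched in a few lines of plain Python (assumption: flat records only, with fields given as dicts; real Avro also resolves nested schemas recursively):

```python
# Resolve a writer's record against a reader's field list,
# following the name-matching / default / error rules above.
def resolve(writer_record: dict, reader_fields: list) -> dict:
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in writer_record:      # fields are matched by name, any order
            resolved[name] = writer_record[name]
        elif "default" in field:       # fall back to the reader's default
            resolved[name] = field["default"]
        else:                          # no value and no default: error
            raise ValueError(f"no value for field {name!r}")
    return resolved                    # writer-only fields are simply ignored

writer = {"username": "MrPresident", "age": 50, "legacy_flag": True}
reader = [{"name": "age"},
          {"name": "username"},
          {"name": "badges", "default": []}]

print(resolve(writer, reader))
# {'age': 50, 'username': 'MrPresident', 'badges': []}
```

Note how "legacy_flag" is dropped and "badges" is filled from the reader's default, which is what lets old readers and new writers (and vice versa) coexist without tags.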
67. Avro
Avro seems like a good way to start out when you're still used to JSON and are not convinced by other protocols.
69. Final remarks
● Required fields are forever (except for Avro, where the "tag" is the field name)
● Can be used not only for exchange but for storage too: protobuf in ActiveMQ and Ubuntu One (just think in terms of Thrift and take "transport" to be the database)
● For HTTP, use the Content-Encoding and Content-Type headers
● Performance & size comparisons:
○ https://github.com/sidshetye/SerializersCompare (C#)
○ https://github.com/eishay/jvm-serializers (Java)
● Everyone creates their own:
○ Protocol Buffers (Google)
○ Thrift (Facebook)
○ Bond (Microsoft)