The document argues that human-readable text formats for storing structured data are wasteful for machines to process. While easy for humans to read and edit, text formats require extra processing for machines to parse and handle errors. Instead, the document advocates storing data in its internal machine-readable representation directly for improved efficiency, and providing specialized editors and viewers when human readability is desired. This can result in smaller file sizes, faster load times, and more responsible development practices.
2. What is it?
• Usually, structured text files
• XML, HTML, JSON, YAML, ...
• Log Files, Source Code, ...
• Structure is for machine readability
• Text is for human readability
• Humans and machines process information differently
• A compromise
Dietmar Hauser | roborodent e.U. | 2023
3. Why is it?
• Easy to view and edit with text editor
• Easy to log and debug
• Easy to read?
• The "lazy" solution
4. Why is it bad?
• Not easy to write
• Error prone
• Encoding sensitive
• Difficult to parse
• Parser needs to account for human error
• Humans will make errors
• Wasteful!!!
• CPU, memory, bandwidth, and storage aren't free
5. How is it wasteful?
• Text has low entropy
• "little information per byte"
• Compress to check
• Google's start page compresses to ~30% of its original size
• For consumption, a file needs to be:
• Downloaded
• Uncompressed
• Parsed into an internal representation
• Processed
• Essentially the opposite happens on the creation side
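The "compress to check" idea can be tried in a few lines. This is only a rough sketch: the sample records and the `compression_ratio` helper are made up for illustration, and zlib stands in for whatever compressor a real pipeline would use. A low ratio suggests the text carried little information per byte.

```python
import json
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical sample data: 1000 small records serialized as indented JSON text.
records = [{"id": i, "x": i * 0.5, "visible": i % 2 == 0} for i in range(1000)]
text = json.dumps(records, indent=2).encode("utf-8")

ratio = compression_ratio(text)
print(f"{len(text)} bytes of JSON compress to {ratio:.0%} of their size")
```

The exact number depends on the data and the compressor, so treat it as an indicator rather than a measurement.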
6. How can we improve?
• Skip human readability!
• Ideally, store the internal representation
• Minimal processing / parsing required
• Fix up references
• Maybe fix endianness
• Self-describing formats are available
• Protobufs, MsgPack, ...
• Compression can be optional
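As a minimal sketch of storing something close to the internal representation, the example below writes fixed-size binary records with Python's `struct` module. The record layout (`uint32` id plus two `float32` values) and the `save` / `load` helpers are assumptions chosen for illustration, not a format from the talk. The `<` in the format string pins the byte order to little-endian, which is one way to handle the endianness point above.

```python
import json
import struct

# Hypothetical record layout: uint32 id, float32 x, float32 y.
# "<" means little-endian with no padding, so the file is portable across hosts.
RECORD = struct.Struct("<Iff")


def save(records):
    """Pack (id, x, y) tuples into one contiguous binary blob."""
    return b"".join(RECORD.pack(i, x, y) for i, x, y in records)


def load(blob):
    """Unpack the blob back into a list of (id, x, y) tuples; no text parsing involved."""
    return list(RECORD.iter_unpack(blob))


# Values are chosen to be exactly representable as float32 so the round trip is lossless.
records = [(i, i * 0.5, i * 0.25) for i in range(1000)]
blob = save(records)
as_json = json.dumps(records).encode("utf-8")

print(f"binary: {len(blob)} bytes, JSON: {len(as_json)} bytes")
```

Loading is a straight unpack of 12-byte records, with no tokenizing, number parsing, or tolerance for hand-editing mistakes. A real format would still want a header with a magic number and version, which this sketch leaves out.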
7. Do we lose anything?
• Easy to edit with text editor
• Provide bespoke editor / viewer
• Convert to / from human readable
• Easy to log and debug
• No change if logging supports binary data
• Base64 or similar otherwise
• See above
• Easy to read?
• It is with viewer / editor support!
• See above again!
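The two fallbacks above, Base64 for text-only logs and conversion to a human-readable view, might look roughly like this. The record layout and all function names here are hypothetical; a real viewer or editor would be considerably richer.

```python
import base64
import struct

# Hypothetical record layout: uint32 id, float32 x, float32 y, little-endian.
RECORD = struct.Struct("<Iff")


def to_log_line(blob: bytes) -> str:
    """Encode binary data as Base64 so it fits into a text-only log."""
    return base64.b64encode(blob).decode("ascii")


def from_log_line(line: str) -> bytes:
    """Recover the original bytes from a Base64 log line."""
    return base64.b64decode(line)


def to_readable(blob: bytes) -> str:
    """A tiny 'viewer': render each binary record as one line of text."""
    return "\n".join(f"id={i} x={x} y={y}" for i, x, y in RECORD.iter_unpack(blob))


blob = RECORD.pack(7, 1.5, -2.0)
line = to_log_line(blob)
print(line)
print(to_readable(from_log_line(line)))
```

The human-readable form is produced on demand by the tool, so it never has to be the stored or transmitted format.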
8. Why doesn't everyone do that?
• Hardware is still scaling fast
• Bad user experience
• Wasteful upgrade cycle
• Global heating
• "Move fast and break things" attitude
• Non-human readable formats can be broken just as easily
• With decent tools, less incentive for "manual" editing
• With decent tools, one can move faster
9. What would we gain?
• Smaller files everywhere
• Faster load times
• Locally and over network
• Editors and viewers
• Respect your customers
• Don't be lazy
• Think of the children!
10. Dietmar Hauser | roborodent e.U. | 2023
roborodent
Dietmar Hauser
Programmer
Software Solutions | Creative Consulting
https://www.roborodent.com
@rattenhirn
dietmar.hauser@roborodent.com
https://slideshare.net/DietmarHauser
https://fb.me/roborodent
https://github.com/rattenhirn/
https://www.linkedin.com/in/rattenhirn/
Editor's Notes
As some of you may know, I work from home
So I don't spend much time in public transport to rant at random people
But you folks will do, so thank you for coming!
This talk is about one of the many injustices that I perceive in the world
I hope my wild mix of facts and opinion will bedazzle some of you to take action
What am I even talking about when I say "human readability"?
It's about file formats, yay!
Why are these so common?
In my opinion the lazy solution
Most importantly wasteful!
Let's dive deeper into this
Can we do better?
What about all those great advantages the human readable files have?
By now, you are certainly fully convinced, so why doesn't everyone do that?
Why should everyone do that?
Thank you for listening to me, I feel relieved!