How to Become an Encoding Champion

Character encoding happens every time we interact with a computer or any digital device. There is no such thing as plain text. Understanding how encoding works, how Ruby handles encoding issues, and how to debug encoding snafus strategically can be a great asset in your developer toolbox. We will cover a bit of history, dive deep into Ruby's encoding methods, and learn some tricks for managing encoding in your application.

Published in: Technology

How to Become an Encoding Champion

  1. RubyConf 2019 @ddlavinder Encoding How to Become an Encoding Champion DeeDee Lavinder @ddlavinder
  2. world's Télécom tilg??ngelig d??gnet
  3. Encode: To convert into a coded form.
  4. Encoding != Encryption
  5.
  6. 01100010 01101001 01110100 01110011
  7. Binary to Decimal
       Base/exponent:  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
       Place value:    128   64   32   16    8    4    2    1
       46 =              0    0    1    0    1    1    1    0
  8.
  9.
  10. b i t s 01100010 01101001 01110100 01110011
  11.
  12. http://kunststube.net/encoding/ GB18030
  13.
  14.
  15. https://www.unicode.org/charts/PDF/U10480.pdf
  16. Unicode Transformation Format (UTF) http://www.drdobbs.com/web-development/unicode-and-java-web-applications/213201510 https://www.w3.org/International/articles/definitions-characters/ http://kunststube.net/encoding/
  17. https://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html
  18. www.utf-8.com
  19. Endianness
  20. https://en.wikipedia.org/wiki/Han_unification#Examples_of_language-dependent_glyphs
  21.
  22. .encode() .force_encoding()
  23. .encode
        string1 = 'Norrlandsvägen'.encode('iso-8859-1')  # => "Norrlandsv\xE4gen"
        string2 = string1.encode('utf-8')                # => "Norrlandsvägen"
        string1.bytesize                                 # => 14
        string2.bytesize                                 # => 15
  24.   string2.bytes                      # => [78, 111, 114, 114, 108, 97, 110, 100, 115, 118, 195, 164, 103, 101, 110]
        string2.bytes.first                # => 78
        char = string2.bytes.first.chr     # => "N"
        char.ord                           # => 78
        string2.encoding                   # => #<Encoding:UTF-8>
        string2.encoding.ascii_compatible? # => true
        string2.valid_encoding?            # => true
  25. Encoding.default_external
      Encoding.default_internal
      Magic comment: # encoding: UTF-8
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      Email header: Content-Type: text/plain; charset="UTF-8"
  26. Base64 https://en.wikipedia.org/wiki/Base64
  27. Base64
        require 'base64'                              # => true
        text = 'hello world'                          # => "hello world"
        encoded_text = Base64.encode64(text)          # => "aGVsbG8gd29ybGQ=\n"
        decoded_text = Base64.decode64(encoded_text)  # => "hello world"
  28. Things to Remember: Binary is super cool. There is no such thing as ‘plain text’. Specify which encodings you accept from your users. Please use UTF-8! ‘Characters’ are abstract entities. To decode, you have to know what was used to encode.
  29. Thank you for listening! DeeDee Lavinder
  30. Resources
        https://www.learncisco.net/courses/icnd-1/lan-connections/binary-basics.html
        https://home.unicode.org/basic-info/faq/
        https://deliciousbrains.com/how-unicode-works/
        http://kunststube.net/encoding/
        https://w3techs.com/technologies/overview/character_encoding
        http://utf8everywhere.org/
        https://developer.mozilla.org/en-US/docs/Glossary/Endianness
        https://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/
        https://mivehind.net/2017/04/23/why-unicode-is-not-my-favorite/
        https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
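
The binary arithmetic on slides 6, 7 and 10 can be checked directly in Ruby. What follows is a minimal sketch and not part of the talk: the variable names are made up for illustration, and it relies only on core methods (String#to_i, Integer#to_s, Array#pack, Kernel#format).

```ruby
# Slide 7: the byte 00101110 is 32 + 8 + 4 + 2 = 46.
byte = '00101110'
decimal = byte.to_i(2)                        # parse the digits as base 2
binary_again = decimal.to_s(2).rjust(8, '0')  # back to a zero-padded 8-bit string

# Slides 6 and 10: four 8-bit groups that spell "bits" in ASCII (and UTF-8).
groups = %w[01100010 01101001 01110100 01110011]
word = groups.map { |g| g.to_i(2) }           # [98, 105, 116, 115]
             .pack('C*')                      # raw bytes, tagged ASCII-8BIT
             .force_encoding('UTF-8')         # relabel the same bytes as UTF-8

# The other direction: text to 8-bit groups.
bits_of_word = 'bits'.bytes.map { |b| format('%08b', b) }

puts decimal       # 46
puts binary_again  # 00101110
puts word          # bits
puts bits_of_word.join(' ')
```

Note that pack('C*') returns a binary-tagged string, which is why the sketch relabels it before treating it as text.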
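
Slide 22 names .encode() and .force_encoding() side by side. The sketch below is an illustration rather than material from the slides, and shows one way to see the difference: encode converts the bytes, while force_encoding only changes the label attached to them. It reuses the example string from slide 23 (written with a \u escape so the snippet does not depend on the source file's encoding). The variable names and the choice of '?' as a replacement character are arbitrary.

```ruby
utf8   = "Norrlandsv\u00E4gen"         # the slide 23 string, as UTF-8
latin1 = utf8.encode('ISO-8859-1')     # transcoded: the a-umlaut is the single byte 0xE4

# force_encoding relabels the string; the bytes stay exactly the same.
mislabeled = latin1.dup.force_encoding('UTF-8')
puts mislabeled.bytes == latin1.bytes  # true
puts mislabeled.valid_encoding?        # false, a lone 0xE4 is not valid UTF-8

# encode rewrites the bytes so they mean the same characters in the new encoding.
converted = latin1.encode('UTF-8')
puts converted.bytesize                # 15, versus 14 for latin1
puts converted.valid_encoding?         # true

# If the real source encoding is known, relabel first, then transcode.
repaired = mislabeled.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts repaired == utf8                  # true

# If it is not known, String#scrub replaces the invalid bytes instead of raising later.
scrubbed = mislabeled.scrub('?')
puts scrubbed
```

This mirrors the last point on slide 28: the repair only works because the bytes were known to be ISO-8859-1. Scrubbing keeps the string usable but loses the original character.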
