What is it?

In computing, backwards or downwards compatibility is a general term referring to the ability to read, write, and/or execute input with a certain technology, where that input was designed to be read, written, and/or executed by an older version of the same technology.

Example 1: Java

In the Java programming language, code written according to the Java 1.0 specification will still run with identical results on the current version (*), and it is generally understood that this will be so indefinitely. Innovations have been made in the succeeding major versions of the Java language, mostly by expanding the available syntax, but the original specification is still 100% valid.

Example 2: PDF

PDFs created in an old PDF format (e.g. PDF 1.0) should be rendered correctly by later viewers. Newer versions of PDF add new functionality. They don’t remove existing functionality (*). An older PDF viewer might not be able to use that new functionality (e.g. OCG), or even fail to render the document (e.g. when the cross-reference table is compressed).

Important: Backward compatibility obviously doesn’t mean you can run newer code in an older environment. For instance: code compiled with Java 8 can cause a java.lang.UnsupportedClassVersionError (Unsupported major.minor version) when you run it on a Java 6 JVM, even if you only use functionality that was already available in Java 6.
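The UnsupportedClassVersionError mentioned above comes from a version number stored in the header of every compiled class file. The following is a minimal sketch, not the JVM's actual loading code: the class name ClassFileVersion and the helper canLoad are made up for illustration, and the 8-byte header is hand-made rather than read from a real .class file. The major versions 50 (Java 6) and 52 (Java 8) are the values defined for those releases.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileVersion {

    /** Reads the major version from the header of a compiled class file. */
    public static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();        // every class file starts with 0xCAFEBABE
        if (magic != 0xCAFEBABE) {
            throw new IOException("Not a class file");
        }
        data.readUnsignedShort();          // minor version, ignored in this sketch
        return data.readUnsignedShort();   // major version: 50 = Java 6, 52 = Java 8
    }

    /** Hypothetical helper: a JVM rejects classes whose major version is newer than its own. */
    public static boolean canLoad(int jvmMajor, int classMajor) {
        return classMajor <= jvmMajor;
    }

    public static void main(String[] args) throws IOException {
        // Hand-made header standing in for a class compiled by Java 8 (major 52).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = majorVersion(new ByteArrayInputStream(header));
        System.out.println("major version: " + major);
        System.out.println("loadable on a Java 6 JVM: " + canLoad(50, major));
        System.out.println("loadable on a Java 8 JVM: " + canLoad(52, major));
    }
}
```

The check depends only on the version stamp, not on which language features or library calls the class uses, which is why code that sticks to Java 6 functionality can still fail on a Java 6 JVM.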
In context of a library, the “input” is the application code calling the API.
Mention “Bug compatibility”: emulation of bugs may be necessary if legacy code depends on that behavior
BouncyCastle The Android operating system, as of early 2014, includes a customized version of Bouncy Castle. Due to class name conflicts, this prevents Android applications from including and using the official release of Bouncy Castle as-is. A third-party project called Spongy Castle distributes a renamed version of the library to work around this issue.
iText 4 A third party created a fork of iText 2.1.7 and named it iText 4. That’s fine, but unfortunately they also released this unofficial version on Maven using the official iText groupId. That’s not OK: according to the Apache FAQ, this is in violation of the rules. Many users upgraded to iText 4 without realizing they were using an unofficial iText version. Nevertheless, they expected the original iText developers to support this version. iText Group reclaimed its groupId, and this broke the Maven builds of many iText 4 users. Some developers blamed iText for this, instead of blaming the real culprits.
I removed this example:
OpenSSL At 11 PM on New Year’s Eve 2011, Stephen Henson received the code from Robin Seggelmann, a respected academic who’s an expert in internet protocols. Henson reviewed the code and added it to the OpenSSL repository. More than two years later, in April 2014, this patch was discovered to cause a serious security vulnerability known as “Heartbleed”, impacting almost every company using OpenSSL.
* Try to fix problems with internal re-implementations
** Do not change the API that is exposed to the developers.
** Sometimes parts of the API were not intended to be exposed.
*** Java 9 will make it possible to encapsulate internal APIs.
* Use two methods for the same function, deprecate the old method
** Main disadvantage: redundancy
** E.g. in PDF: the F operator is equivalent to the f operator. It’s there for historical reasons and should no longer be used.
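The "two methods for the same function" approach can be sketched as follows. This is an illustration only: the class Canvas and its methods fill, fillPath and getContent are hypothetical and are not part of the iText API. The old method stays in place, is marked as deprecated, and delegates to the new one so that both names always behave identically.

```java
public class Canvas {
    private final StringBuilder content = new StringBuilder();

    /** Preferred method: writes the PDF "f" (fill path) operator. */
    public Canvas fill() {
        content.append("f\n");
        return this;
    }

    /**
     * Old name, kept so that existing callers still compile and run.
     *
     * @deprecated use {@link #fill()} instead.
     */
    @Deprecated
    public Canvas fillPath() {
        return fill(); // delegate, so there is only one implementation to maintain
    }

    /** Returns everything written so far. */
    public String getContent() {
        return content.toString();
    }

    public static void main(String[] args) {
        String oldWay = new Canvas().fillPath().getContent();
        String newWay = new Canvas().fill().getContent();
        System.out.println("same output: " + oldWay.equals(newWay));
    }
}
```

Callers of fillPath get a compiler warning but no behavior change, which gives them time to migrate. The redundancy mentioned above remains: the public API now carries two names for one operation until the old one is removed.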
XFA

There were dozens of ways to construct a similar form. Even Adobe didn’t succeed in following its own spec. Almost no third party vendor supported XFA because it was too complex. As a result, XFA was deprecated in ISO 32000-2. A PDF 2.0 file shall not contain any XFA.

Python

An official goal of the Python 3 redesign was to "reduce feature duplication by removing old ways of doing things". An example is the use of placeholders in string templates.

iText

E.g. in iText: the HTMLWorker class was deprecated in favor of the XML Worker framework. HTMLWorker was written for a specific, limited purpose, but people started to use it in a broader way. We get plenty of questions from people who use HTMLWorker and want us to extend it. We can’t and won’t do that.
In iText 5, there were several rendering APIs that had a lot of functional overlap but also showed (sometimes subtle) differences in behavior In iText 7, we introduced a renderer framework that is much easier to maintain
* Digital signatures depend on hashing and encryption algorithms
** Algorithms are subject to flaws and vulnerabilities: e.g. collision attacks in MD5, SHA-1...
** Processing power increases, reducing the time an attacker needs to decrypt a message
* Internationalization: ASCII isn’t sufficient anymore, we need Unicode
** Python 2 depended on ASCII; the API of Python 3 was changed in favor of Unicode
* Standards and specifications change
** For instance: the maximum size of a PDF file used to be 10 gigabytes. This was increased to 1 terabyte, but to achieve this, the PDF needs to have a compressed cross-reference table with a different structure than before. Old viewers can’t read such a cross-reference table.
** Before iText 5.2.0, all byte positions were expressed in int. This limited the maximum file size of a PDF to 2 gigabytes. We changed int to long to support files up to 1 terabyte. No API change was required, but we broke the 5.2.x series by accident. We removed all 5.2.x releases from our repositories.
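The 2 gigabyte limit follows directly from the range of a Java int, whose largest value is 2^31 - 1. The sketch below is not iText code: the class OffsetOverflow and its two methods are hypothetical and only simulate a writer that tracks its byte position while appending blocks of data.

```java
public class OffsetOverflow {

    /** Tracks the byte position in an int: silently wraps around past 2^31 - 1. */
    public static int positionAsInt(int blocks, int blockSize) {
        int position = 0;
        for (int i = 0; i < blocks; i++) {
            position += blockSize; // overflow is not reported, the value just wraps
        }
        return position;
    }

    /** Tracks the byte position in a long: large enough for terabyte-sized files. */
    public static long positionAsLong(int blocks, int blockSize) {
        long position = 0L;
        for (int i = 0; i < blocks; i++) {
            position += blockSize;
        }
        return position;
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30; // 1,073,741,824 bytes
        System.out.println("int position after 3 GiB:  " + positionAsInt(3, oneGiB));
        System.out.println("long position after 3 GiB: " + positionAsLong(3, oneGiB));
    }
}
```

After three blocks of 1 GiB the int position is -1073741824, while the long position is the expected 3221225472. A negative byte offset written into a cross-reference table would make the file unreadable, which is why the positions had to become long.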
* Twitter
** The easy API provided by Twitter caused a couple of problems
*** It resulted in huge server loads (we all know the “fail whale”)
*** It was easy to use the API in ways that weren’t acceptable (spam, abuse)
** In 2013, the API was changed to restrict access to authenticated, registered applications
* SharePoint
** SharePoint 2010: IFilter architecture, for instance to search PDF files in an intelligent way
*** Adobe (free), PDFLib, and Foxit were better than MS at searching PDF files in SharePoint
** SharePoint 2013: IFilter support for PDF is broken
*** PDF “natively” supported in SharePoint by MS; competing products no longer work, no longer needed
iText 7 versus iText 5 The core design hadn’t changed since 2000. The original design and first versions were written by a self-taught developer Not enough refactoring along the way. Breaking compatibility gradually from time to time could have prevented the complete iText 7 rewrite. Younger code (e.g. digital signatures, parsing PDF) can be reused; the older code needed a rewrite We gained new insights thanks to people using iText in ways we didn’t expect E.g. the original developer never imagined that one day he’d have to support Hindi iText grew organically; many different developers contributed E.g. form fields: different classes for form creation vs. form filling E.g. different layout systems: Document.add() vs. ColumnText vs. writing to PdfContentByte The world has changed since 2000: People started deploying applications on AWS and GAE: different file system. People started using iText on Android: monolithic iText vs. limitation of the number of classes
BouncyCastle: Source code incompatibilities were introduced in version 1.47. This led to numerous problems for all libraries that had a dependency on BouncyCastle 1.x. The communication wasn’t ideal: “The next release of BC will be version 2.0. For this reason a lot of things in 1.46 that relate to CMS have been deprecated” (Bouncy Castle Release Notes 1.46, 2011) “Okay, so we have had to do another release. The issue we have run into is that we probably didn’t go far enough in 1.46, but we are now confident that moving from this release to 2.0 should be largely just getting rid of deprecated methods.” (Bouncy Castle Release Notes 1.47, 2012) “There has been further clean out of deprecated methods in this release. If your code has previously been flagged as using a deprecated method you may need to change it. The OpenPGP API is the most heavily affected.” (Bouncy Castle Release Notes 1.51, 2014) The most recent version is currently 1.54, dating from 2015. BouncyCastle 2.0 never happened.
iText 7 also doesn’t strictly use semantic versioning, but we intend to do something similar.
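For reference, the core promise of semantic versioning (MAJOR.MINOR.PATCH, where only a MAJOR bump may break the API) can be sketched as a simple check. This is an illustration under assumptions: the class SemVer and the method isCompatibleUpgrade are hypothetical, the version numbers in main are arbitrary examples, and the sketch ignores pre-release tags and the special rules for 0.x versions.

```java
public class SemVer {
    private final int major;
    private final int minor;
    private final int patch;

    private SemVer(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    /** Parses a plain "MAJOR.MINOR.PATCH" string, e.g. "7.1.0". */
    public static SemVer parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        return new SemVer(Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    /** True if moving from this version to target should not break existing callers. */
    public boolean isCompatibleUpgrade(SemVer target) {
        if (target.major != major) {
            return false;               // a major bump may remove or change API
        }
        if (target.minor != minor) {
            return target.minor > minor; // a minor bump only adds functionality
        }
        return target.patch >= patch;    // a patch bump only fixes bugs
    }

    public static void main(String[] args) {
        SemVer current = SemVer.parse("7.0.1");
        System.out.println("7.0.1 -> 7.1.0: " + current.isCompatibleUpgrade(SemVer.parse("7.1.0")));
        System.out.println("7.0.1 -> 8.0.0: " + current.isCompatibleUpgrade(SemVer.parse("8.0.0")));
    }
}
```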
Spring 2 evolving into Spring 3 Configuration in Spring 2 was done using XML XML used to be popular, but its popularity faded over the years Spring 3 added configuration through annotations without breaking the API The original design of the Spring Framework was future-proof Python 3 versus Python 2 There were several integer sizes in memory, one of which was dependent on the architecture of the underlying processor The exception framework was inconsistent and idiosyncratic Comparison of object types was complex and non-intuitive
But also… there is a trade-off between being future-proof / well designed and being pragmatic / performant.
We kept project Arya closed source until we were confident there wouldn’t be any substantial changes to the API Writing the jump-start tutorial was a pain, because the examples and the text of the tutorial had to be updated on a regular basis
Commons-imaging: our mistake to depend on a snapshot version
* Clirr (http://clirr.sourceforge.net/)
** Clirr Maven plugin
*** console: mvn clirr:check
*** we run it in a separate profile
** Clirr SonarQube plugin
*** browser
*** Tip: create separate SQ dashboard
* JDiff (http://javadiff.sourceforge.net/)
** Javadoc doclet
** generates HTML report of API differences
** published for users
* Provide good documentation
* Provide conversion tools
** Python: to move from Python 2 to Python 3, a tool called 2to3 was developed to make the transition easier, essentially rewriting Python 2 source code to fit the Python 3 specification
** Python: there is also a user-contributed package called six, which users can add to their Python project so that the same code runs on either version
** Microsoft Word: the .doc and newer .docx document formats for Microsoft Word are not mutually compatible. However, the versions of MS Word released since the introduction of .docx are still able to open and edit .doc files, usually suggesting that the user save the document in the newer format to make the file more future-proof.
Python

A number of Python 3 innovations have been back-ported to Python 2. This takes away the incentive for developers to make the switch at all, because the new features are no longer exclusive to Python 3. As a consequence of the stability of and continued development on Python 2, uptake of Python 3 has been relatively slow.

iText

No new development in iText 5, only bug fixing. All new functionality will be developed in iText 7.
Maintaining multiple versions means fewer resources for new development in the new version.
In open source you cannot prevent or prohibit back-porting. Users may fork. The main way to discourage them from doing so is to convince them with the new version.
Python 3 was released in 2008. The announcement that Python 2 would be supported until 2020 was made in 2014. That EOL date has changed many times.
* UWP: the Universal Windows Platform
** Released in 2015
** The announcement of UWP felt like a huge marketing event
*** There were talks and articles on “How to monetize your app”
*** There was very little technical info
** UWP is based on .NET Core, but the API of .NET Core wasn’t backward compatible with the .NET Framework
** Moreover: it wasn’t ready when UWP was announced. It was a moving target.
** A general porting guide was released in February 2016, but it steers clear of giving exact advice on some of the more fundamental changes in the .NET Core reimplementation, like the Cryptography implementation.
* This is problematic for vendors confronted with questions such as “Is there UWP support for iText?”
** We know what breaks, but it’s not clear how to fix it.
*** Stream::Close() doesn’t exist anymore: replaced by Stream::Dispose() (from IDisposable)
*** System.Text.Encoding has changed the names of several properties
*** ICloneable doesn’t exist anymore
*** MethodImplOptions.Synchronized doesn’t exist anymore
*** Serializable and SerializationInfo don’t exist anymore
*** The entire System.util namespace doesn’t exist anymore
*** TimeZone doesn’t exist anymore
*** System.Xml.XmlTextReader and XPath APIs don’t exist anymore
*** …
* Deprecate before deleting
** The user thus receives a fair warning that it is unwise to reference that code in their client application, usually with the implication that the faulty code may be deleted at any time
* Warn that the code is subject to change
** Sun packages: example PKCS#11
*** The class sun.security.pkcs11.SunPKCS11 was available in the 32-bit version of Java 6, but not in the 64-bit version
* Beware of non-technical API changes
** iText 2: the MPL/LGPL made it hard to find a working business model
** iText 5: the AGPL allowed us to create a business based on a dual licensing model
** We had to make sure that users didn’t accidentally upgrade, so we broke the API
*** The package names com.lowagie were changed into com.itextpdf
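The SunPKCS11 case shows why client code should not assume that an internal class is present. One defensive pattern is to look the class up by name at run time and fall back gracefully if it is missing. The sketch below uses only the standard Class.forName call; the class InternalApiGuard and the method isClassAvailable are hypothetical names chosen for this illustration, and the result for SunPKCS11 depends on the JVM it runs on.

```java
public class InternalApiGuard {

    /** Returns true if the named class can be found, without initializing it. */
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, InternalApiGuard.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // class is missing or cannot be linked on this JVM
        }
    }

    public static void main(String[] args) {
        String internalClass = "sun.security.pkcs11.SunPKCS11";
        if (isClassAvailable(internalClass)) {
            System.out.println(internalClass + " is available on this JVM");
        } else {
            System.out.println(internalClass + " is missing, PKCS#11 support disabled");
        }
    }
}
```

A check like this only detects that the class is absent. It does not make the internal API any more stable, so the warning that such code is subject to change still applies.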
New insights that led to a complete rewrite of iText 7 Make it modular, make it extensible Some users complained that they only needed a small part of the functionality, but still had to ship the full, monolithic iText jar with their applications. The original design was extensible, otherwise it wouldn’t have lasted for 16 years, but there was room for improvement. Allow the introduction of a new business model: iText 7 as a platform. Remove functional overlap to avoid maintenance hell You change something in one place, but forget to change it in another place