Json Processing API (JSR 353) review

The evolution and missing parts of JSON Processing API
January 29, 2013

My thoughts on the current API design and how to improve it, focusing on type hierarchy clarification, why JSON model objects should not extend Java collections directly, configuration, naming conventions, API shortcomings related to usability, minor improvements, and the missing parts of the JSON Processing API.

Martin Škurla
crazyjavahacking@gmail.com
v. 1.01
Table of Contents

- Introduction
- API Design
  - Iterables, Iterators, Factories, Parsers, Streams, what the ... ?
  - Why JSON model objects should not extend collections directly
    - Composition over inheritance
    - Violating LSP
    - "Method confusion"
    - Revealing too much from the implementation
    - Too many unsupported methods / the feel of mutability
    - Premature design decisions
    - Unfriendly for non-Java implementations
  - Do we really need java.io.Flushable?
  - Configuration through JsonConfiguration
    - Thoughts on Properties
    - Thoughts on passing Map instance
    - Thoughts on JsonFeature
    - How configuration fits into immutable model
  - Naming conventions
  - API shortcomings
    - Inconsistent & misleading method names
    - The definition of equals() & hashCode() semantics
    - The semantics of toString()
    - JsonValue.NULL vs. null literal mismatch
  - Minor improvements
    - Exception handling
    - Mixed / Voting for
- The missing parts
  - Date support
  - Possible configuration options
  - More explicit thread model
  - More explicit immutability
Introduction

Hi, my name is Martin Škurla, and during the last few weeks I have been working on an API review of JSR (Java Specification Request) 353: Java API for JSON Processing. I had been observing the mailing lists almost from the start of the JSR, but for lack of time I was not able to focus on the review until now.

This document summarizes all the important aspects that I think need to be further discussed, changed, or left as they are. I am a big fan of the API design process, so some parts are described as if we lived in an ideal world. I describe my ideas even where the resolution of the associated issue was Closed or Fixed, as I think some parts of the JSR need to be further clarified or improved.

The vast majority of chapters include a description of the current state of the JSON Processing API, a proposed API change, a conclusion, and "influences" and "relates to" parts. As a lot of the discussion took place on the Expert Group mailing list and the Users mailing list, I reference both mailing lists where appropriate. I also refer to the JSR 353 JIRA bug tracking system wherever there is an issue related to the described part.

All source code examples were tested and run against the current version of the JSON Processing API and its reference implementation (Maven dependency version 1.0-b02). All code snippets used are part of a single Maven project that can be downloaded from my jsonProcessingReview BitBucket repository.
API Design

Iterables, Iterators, Factories, Parsers, Streams, what the ... ?

I think there was a bit of a misunderstanding over whether or not to implement Iterable and Iterator. It started with the mailing list discussion on "JsonArray shouldn't extend with Iterable", and was then also spotted by the Toronto JUG in their hack day feedback. From my point of view the decision process is quite easy. Here is how I would define the terms (in Java programming language jargon).

An Iterator is an abstraction for iterating over items; it is neither random-access nor does it provide the number of all elements, and it can thus take advantage of lazy loading. An Iterator is a single-use (not reusable) object, always iterating in one direction, with the possibility of returning items in a well-defined order. Remember, we are talking about general terms here, not special implementations like ListIterator. Last but not least, an Iterator is not tied to any particular collection type or data structure. A stream, then, is a special type of Iterator. An Iterable is nothing more than a factory for Iterators, allowing you to iterate over and over again.

JsonParser is not a reusable object; you can only parse once per instance. So JsonParser cannot be an Iterable. Now, because JsonParser really is an event stream, and an event stream is a special type of Iterator, JsonParser must logically be an Iterator. However weird it may look, JsonParser really is an Iterator. What is interesting is that JsonParser mentally already implements Iterator, as it contains next() and hasNext() methods, although it does not declare Iterator as its formal parent. JsonArray, on the other hand, is an Iterable, as you are able to iterate over its items multiple times. It is questionable whether it would be beneficial to declare Iterable as a parent of JsonArray, as it already has a List view.

Proposed API change

I would propose that JsonParser extend Iterator<JsonParser.Event>.

Conclusion

By extending JsonParser with Iterator<JsonParser.Event> it would be a little more obvious that JsonParser is not a reusable object.

Influences

JsonParser

Relates to

JSON_PROCESSING_SPEC-7 "Don't extend JsonArray with Iterable<JsonValue>" {Fixed}
JSON_PROCESSING_SPEC-17 "JsonParser shouldn't extend Iterable" {Fixed}
JSON_PROCESSING_SPEC-27 "Remove JsonParser#iterator() method" {Fixed}
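The relationship argued for above can be illustrated with a self-contained toy sketch. The ToyParser name and its canned event list are mine, not JSR 353 types; the point is only that an event stream declaring Iterator as its formal parent is, like JsonParser, naturally single-use.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: an event-stream "parser" that formalizes its
// existing next()/hasNext() pair by declaring Iterator as its parent,
// as proposed above for JsonParser extending Iterator<JsonParser.Event>.
public class ToyParser implements Iterator<ToyParser.Event> {

    public enum Event { START_OBJECT, KEY_NAME, VALUE_STRING, END_OBJECT }

    // A canned event stream standing in for real lexing of {"name":"value"}.
    private final Iterator<Event> events = Arrays.asList(
            Event.START_OBJECT, Event.KEY_NAME,
            Event.VALUE_STRING, Event.END_OBJECT).iterator();

    @Override public boolean hasNext() { return events.hasNext(); }
    @Override public Event next()      { return events.next(); }

    // Drains the stream and counts events.
    public static int countEvents(Iterator<Event> parser) {
        int n = 0;
        while (parser.hasNext()) { parser.next(); n++; }
        return n;
    }

    public static void main(String[] args) {
        ToyParser parser = new ToyParser();
        System.out.println(countEvents(parser)); // first pass drains the stream
        // Single-use: a second pass yields nothing, which is exactly the
        // "not reusable" property the text attributes to JsonParser.
        System.out.println(countEvents(parser));
    }
}
```

Because the type is an Iterator but not an Iterable, there is no iterator() factory method to suggest a fresh, restartable traversal.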
Why JSON model objects should not extend collections directly

I am pretty sure that JSON model objects should definitely not extend collections directly. If I were able to change only one decision in the JSON Processing API, this would definitely be the one. There are many hints that direct collection inheritance is an example of a badly designed API.

Composition over inheritance

Did we forget about basic OO (object-oriented) principles? If type A wants only some subset of the behavior exposed by type B, that clearly indicates the need for composition, not inheritance. Do we want to make the same mistake as the core JDK architects did with java.util.Properties? Now that we know Properties should never have extended Hashtable, we could apply the lesson learnt and not extend JSON model objects with Java collections directly. I think we need to ask whether we want to expose the whole parent API or just reuse/copy some functionality. I don't think we want to expose it, as some methods are not applicable and some method pairs could cause the "method confusion" described later.

Violating LSP

One of the practical ways to determine whether a type hierarchy is correct is to use the LSP (Liskov Substitution Principle). The LSP can be interpreted as "Derived types must be completely substitutable for their base types." So is JsonObject completely substitutable for java.util.Map<String, JsonValue>? Of course not.

The first hint is mutability: JsonObject cannot be used as an argument to methods that mutate/modify the given Map. It would be a different story if Java introduced a distinction between mutable and immutable collections at the type system level (e.g. MutableMap & ImmutableMap, ...).

The second hint is null literal handling: java.util.Map does not define whether null literals are allowed (as keys and values); that depends on the implementation. In practice the vast majority of cases allow null literals (all hash-based Map implementations except the concurrent ones), but this is not allowed in JSON objects. So in the majority of cases JsonObject cannot be used as an argument to methods adding null keys. In fact JsonObject is immutable, so the whole discussion around null literal handling is more theoretical, and should just highlight that there is something wrong with the current abstraction.

"Method confusion"

Let's discuss the distinction between the following methods that are part of JsonObject (get() defined in Map and getValue() defined in JsonObject):

    V get(Object key);
    <T extends JsonValue> T getValue(String name, Class<T> clazz);
The user could be confused about which method to use. Even though the return type could be at least JsonValue in both cases, one method allows querying with an Object key and the other with a String key. So the possibility of querying a JsonObject with an arbitrary object as the key depends on which method is used. Using the first method sacrifices type safety, and if JsonObject is to be substitutable, type safety will always be sacrificed. There is an even more important reason, explained more deeply in [API shortcomings / JsonValue.NULL vs. null literal mismatch].

Revealing too much from the implementation

I'm sure this is a little controversial, but saying that JsonObject is a java.util.Map and JsonArray is a java.util.List reveals too much of the implementation. Programmers tend to stick with the types and abstractions they are familiar with, and usually substitute terms when the underlying implementation is obvious. Even if the internal representation of a JSON object and a JSON array is obvious, wouldn't such a substitution reveal more than the JSON spec is saying? According to the JSON spec, "An object is an unordered set of name/value pairs." That's it: no mention of a map or dictionary-like structure, no mention of hashing or constant lookup time. If we wanted to be precise, the type should be something like Set<Pair<String, JsonValue>> instead of Map<String, JsonValue>.

Something similar applies to the JSON array. According to the JSON spec, "An array is an ordered collection of values." Again, no mention of a random-access structure. I think the JSON array was created as a general container-like object (just to hold multiple objects). We would get the random-access characteristic from pretty much every implementation as a bonus, but to be precise, the type should be something like Iterable<JsonValue> instead of List<JsonValue>.

Even if you disagree with these types, and even though such a design would make the API less usable, I see this as a hint for not extending collections directly.

Too many unsupported methods / the feel of mutability

The overall idea of "optional methods" is just fundamentally wrong. If an interface is the basic contract of which messages an object can reply to (modeled as methods in OO languages), then "optional methods" basically represent an object that can only sometimes respond reasonably. This is very bad from a usability point of view. IDEs will offer a lot of methods in code completion, but all of the mutation methods (almost half of all methods) are in practice not applicable, and throwing a java.lang.RuntimeException as the default/required implementation does not seem like clean design. The user could also be led to believe that the JSON model objects are mutable, even though they are not. This would be confusing.

Premature design decisions

One of the arguments for extending the collection interfaces directly was cooperation with JDK 8 and lambda expressions. Let's be realistic: not that long ago there was a discussion about whether JDK 6 or JDK 7 should be the minimal Java requirement for the JSON Processing API, and now we are making API decisions because of JDK 8? According to the Java EE 7 and JDK 8 release
schedule, Java EE 7 should be released before JDK 8. This really looks like premature design. Anyway, I don't really see any important difference between the following two method calls:

    jsonArray.stream();
    jsonArray.asList().stream();

And even if this were considered important, JDK 8 will introduce extension methods, so there would always be an easy way (for a Maintenance Release) to add a stream() method with whatever the proper return type turns out to be for JDK 8 and later. So do we really want to design the API for a JDK that is not finished yet (and will still not be finished at the time the JSON Processing API becomes final), when the real-world codebase migration to that JDK will take even longer?

Unfriendly for non-Java implementations

The JSON Processing API is a Java API, but the implementation is not tied to the Java programming language. Basically any implementation passing the TCK (Technology Compatibility Kit) should be considered a valid JSON Processing API implementation. This is especially bad for functional programming languages on the JVM (e.g. Clojure, Scala, ...). As functional languages tend to use persistent data structures, the internal implementation in such languages will almost certainly not be something directly implementing the Java collections. Do we really want the JSON Processing API for non-Java language implementations to be unnatural, and to leak the "feel of Java collections" into every implementation? I don't think so.

Proposed API change

I would propose changing the API so that JsonObject does not extend java.util.Map directly and JsonArray does not extend java.util.List directly. We would then need to introduce some additional methods (e.g. size() for JsonArray and hasProperty() for JsonObject). Some additional thoughts on the type hierarchy were mentioned in the previous chapter [Iterables, Iterators, Factories, Parsers, Streams, what the ... ?].

Conclusion

I believe there are enough reasons not to extend the Java collection interfaces directly.

Influences

JsonObject, JsonArray

Relates to

JSON_PROCESSING_SPEC-24 & JSON_PROCESSING_SPEC-26 "Provide write()/read() methods for Map<String, Object> and List in JsonWriter/JsonReader" {Won't Fix & Won't Fix}
JSON_PROCESSING_SPEC-38 "Extend JsonArray with List<JsonValue> and JsonObject with Map<String, JsonValue>" {Fixed}
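A minimal sketch of what the composition-based direction could look like. ObjectValue, hasProperty(), getProperty(), and asMap() are hypothetical names, not the JSR 353 draft API; the point is that the model object does not pretend to be a Map, while still offering a typed accessor and an explicit, unmodifiable collection view.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a JSON-object-like type built by composition, with a
// small always-applicable API instead of inherited "optional methods".
public final class ObjectValue {

    public interface Value { }
    public static final class StringValue implements Value {
        public final String s;
        public StringValue(String s) { this.s = s; }
    }

    private final Map<String, Value> properties;

    public ObjectValue(Map<String, Value> properties) {
        // Defensive copy keeps the model immutable without unsupported methods.
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }

    public boolean hasProperty(String name) { return properties.containsKey(name); }

    // Typed accessor in the spirit of getValue(String, Class): the key must
    // be a String (unlike Map.get(Object)) and the cast is checked in one place.
    public <T extends Value> T getProperty(String name, Class<T> clazz) {
        return clazz.cast(properties.get(name));
    }

    // Interop for callers who really want a Map: an explicit view, so
    // ObjectValue never claims to *be* a Map.
    public Map<String, Value> asMap() { return properties; }

    public static void main(String[] args) {
        ObjectValue obj = new ObjectValue(
                Map.of("prop", (Value) new StringValue("hello")));
        System.out.println(obj.hasProperty("prop"));
        System.out.println(obj.getProperty("prop", StringValue.class).s);
        try {
            obj.asMap().put("x", null); // mutators fail only on the explicit view
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable view");
        }
    }
}
```

Note how the "method confusion" disappears: there is no inherited get(Object) sitting next to the typed accessor, and code completion shows only methods that actually work.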
Do we really need java.io.Flushable?

Although the need to flush content to a Writer or OutputStream from time to time is a completely reasonable requirement (especially when writing big JSON structures), it looks too low-level to me. In an ideal world, an API should be consistent and use the same level of abstraction across the whole API. Here we have a bunch of methods working at the JSON domain level, but also a single method that sneaks in from a lower level (dealing with IO-related functionality). I think this should be as transparent to the user as possible.

The flushing characteristics naturally depend on the Writer/OutputStream instance passed as a method argument. If the user of the API passes a buffered instance, the buffering happens automatically and transparently. The implementation could still be powerful, as it could check the instance type at runtime and wrap the given Writer/OutputStream object in a buffered adapter if it is not already wrapped. Another implementation could use some implementation-specific buffering and flushing mechanism that could be further configured.

Proposed API change

I would propose removing the Flushable interface as a parent of JsonGenerator. We would also need to clarify that close() does the flushing automatically.

Conclusion

By not extending Flushable and letting the implementation or the passed object determine the buffering and flushing characteristics, the API would be easier to use but still equally powerful.

Influences

JsonGenerator

Relates to

JSON_PROCESSING_SPEC-11 "Flushable for JsonGenerator" {Fixed}
chapter The missing parts / Possible configuration options

Configuration through JsonConfiguration

As in pretty much every non-trivial system, we need a way to represent configuration in some abstract and, ideally, type-safe manner. Lately there was a mailing list discussion on replacing JsonFeature/JsonConfiguration with properties.
I disagree with that idea, and here is why.

Thoughts on Properties

Let's define a non-trivial use case to compare the possible solutions. We want to create a configuration that consists of multiple logically grouped settings. For instance, custom pretty printing could consist of the newline character (as it could be useful to set the character independently of the running platform) and the possibility to use either the tab character ('\t') or a defined number of spaces for indentation. But it only makes sense to set one of them (tab vs. spaces indentation), not both.

We could achieve the described requirement with a Properties-like configuration using the following imaginary API:

    public class JsonConfiguration {
        ...
        JsonConfiguration setFeature(String featureKey, Object feature) { ... }
    }

Then the user would need to write code like:

    jsonConfiguration.setFeature("json.prettyPrinting.newLineChar", "\n")
                     .setFeature("json.prettyPrinting.tabIndentation", true)
                     .setFeature("json.prettyPrinting.spacesIndentation", 4);

There are a few problems with that API. It is easy to introduce a typo in the feature key unless there is a constant for it. It is not really type-safe: there is no compile-time check that the feature value is of the proper type. But most importantly, each feature key and feature type pair effectively becomes part of the API: they have to be documented, tested, summarized, and added to the TCK. You have to check at runtime that the given feature key is a valid String, and also that the passed argument type matches the required feature type. By the way, did you notice that I accidentally set the indentation to both tabs and spaces? That is another problem: it is not easy to define rules between multiple features (without throwing exceptions at runtime).

There is also the possibility of using a slightly modified version of the previous API. Instead of using three independent key/value pairs to set the pretty printing, one could use an aggregation object. Such an aggregation object would hold all related configuration options together. That would make configuration a little easier, at the cost of introducing aggregation objects for every more complicated feature. That approach would in fact mimic some of the JsonFeature advantages (described later), but the more verbose API and the runtime checks would still be necessary. Using a Properties-like structure is not a perfect solution, and it also looks like too much burden to do the configuration the right way.
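The pitfalls above are easy to reproduce with a minimal string-keyed configuration modeled on the imaginary setFeature() API (all names here are hypothetical): typos in keys and wrong value types survive compilation and either fail at runtime or, worse, get silently ignored.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the Properties-style weaknesses discussed above.
public class StringKeyConfig {

    private final Map<String, Object> features = new HashMap<>();

    public StringKeyConfig setFeature(String key, Object value) {
        features.put(key, value); // no compile-time key or type checking
        return this;
    }

    public Object getFeature(String key) { return features.get(key); }

    public static void main(String[] args) {
        StringKeyConfig config = new StringKeyConfig()
                .setFeature("json.prettyPrinting.newLineChar", "\n")
                // Typo in the key: compiles fine, silently configures nothing.
                .setFeature("json.prettyPrinting.tabIndentatoin", true)
                // Wrong value type: also compiles fine.
                .setFeature("json.prettyPrinting.spacesIndentation", "four");

        // Lookup under the intended (correctly spelled) key finds nothing.
        System.out.println(config.getFeature("json.prettyPrinting.tabIndentation"));
        // The implementation is left to discover the String-vs-int mismatch.
        Object spaces = config.getFeature("json.prettyPrinting.spacesIndentation");
        System.out.println(spaces instanceof Integer);
    }
}
```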
Let's look at the other options, as there are better ways to do the configuration.

Thoughts on passing a Map instance

The previously mentioned mailing list discussion later proposed replacing JsonFeature with Map<String, ?>. I don't think it would be much better than the previous approach. Generally speaking, in programming languages where mutability is the default (including Java), immutable collections are more of a dream than something real. Yes, you can use the unmodifiable wrappers, but that only makes the collection immutable, not the items of the collection. So it is still the responsibility of the API designers and implementers to make sure no Map value exposes any mutating methods.

Thoughts on JsonFeature

The mailing list discussion "JsonFeature interface - is it meant as a marker or constant?" reveals that the usage of JsonFeature, without more examples and a better description, could be misleading. So it could be beneficial to describe and show the benefits of the JsonFeature abstraction over the previously mentioned approaches. Let's discuss the JsonFeature contract:

    public interface JsonFeature {
        static final JsonFeature PRETTY_PRINTING = new JsonFeature() {};
    }

Yes, it seems kind of weird that JsonFeature is a marker interface, as marker interfaces usually represent a code smell (a sign the design was not done properly). But in this case I would personally make an exception. Does it look suspicious that JsonFeature has no methods? Because you just really need a way to access the JsonFeature configuration data, right? Seriously, do you? JsonFeature's internal data is an implementation detail, and there should be no need to read the data after creation (except by the JSON Processing API implementation accepting such a feature, of course). As the configuration is created once (by the given API) and only processed internally by the JSON Processing API implementation, the implementation naturally knows the internal representation of custom features (as its objects were asked to create those features), and so there should really be no need to expose the internal representation of JsonFeature anywhere else.

Then the only remaining question is how we should handle standard features with arguments (if there will be any). The practical reason not to add any methods to JsonFeature is that there is no way to abstract an undefined number (zero or more) of possible arguments (possibly of heterogeneous types) under the same API.

Finally, let's see some code to demonstrate how elegant the JsonFeature abstraction is. Let's define a custom pretty printer feature implementation:

    public class CoolJsonLibraryFeatures {
        public static JsonFeature prettyPrintingWithTabIndentation(String newline) { ... }
        public static JsonFeature prettyPrintingWithSpacesIndentation(String newline, int numberOfSpaces) { ... }
    }

The user would need to write code like:

    JsonFeature feature1 = CJLF.prettyPrintingWithTabIndentation("\n");
    JsonFeature feature2 = CJLF.prettyPrintingWithSpacesIndentation("\n", 4);

    new JsonConfiguration().with(feature1);

The nice thing about this approach is that all internal representation is hidden; all necessary type conversion, transformation, and validation is done by the CoolJsonLibraryFeatures implementation. You cannot create invalid pretty printing (e.g. including both tab and spaces indentation), and subsequent calls of the same type of feature should just overwrite the previous one or throw an exception. Invalid data passed through arguments will throw an exception during JsonFeature creation, instead of during JsonConfiguration or factory/parser/generator creation.

There are a few disadvantages with this design, though. First, we would introduce a compile-time and runtime dependency on the concrete implementation. But let's be realistic: changing an implementation is never as easy as it might initially look. It is not just about changing a few lines in pom.xml, right? In a traditional enterprise environment, custom feature compatibility, performance, possible side effects, and maybe a few other things need to be clarified first before a real "implementation switch" can happen. I would
rather accept a compile and runtime dependency than silently ignore possibly non-obvious bugs caused by unsupported features.

The second is not really a disadvantage but more of a consequence. The JSON Processing API will hopefully be used by Java EE containers internally, and so it needs to be OSGi (Open Services Gateway initiative) compliant. In order to create an implementation-specific configuration, the "entry object" responsible for creating JsonFeature instances has to be part of the publicly exposed packages (but the JsonFeature implementation objects should not). However evil it may look, there already are JSON libraries forcing you to define such compile and runtime dependencies on specific implementation classes just to support implementation-specific configuration (e.g. by type-casting API interfaces to their implementation classes and calling the appropriate mutating methods on them).

How configuration fits into the immutable model

JsonConfiguration and all its implementation details need to fit into the already defined immutable model. If we stay with JsonFeature, everything integrates seamlessly into the immutability model: JsonFeature is immutable (which every marker interface is by definition), and JsonConfiguration would also be immutable (after refactoring to a builder-like pattern). If there were a requirement to modify configuration at runtime, appropriate methods could achieve this (e.g. setFeature(JsonFeature) on factories, parsers, and generators). So the user is still able to modify the configuration by passing an immutable JsonFeature object. However, additional clarification would be needed for factories, including what happens to instances created by a given factory after its configuration changes: a) the change in configuration propagates to existing instances (probably not a good idea), or b) the change in configuration is used only for newly created instances (makes more sense). Such an immutability guarantee would make reasoning about thread-safety easier.

Proposed API change

I would not propose any API change, as I like the JsonFeature abstraction. On the other hand, to avoid confusion, I would propose adding some clarification to the JsonFeature javadoc contract, together with hints in the specification (for implementers) not to forget to publicly expose the package with custom features.

Conclusion

I propose not to change the way the configuration is done. The compile-time and runtime dependency does not seem that bad in exchange for the advantages of type safety and an easier, more usable API.

Influences

JsonConfiguration, JsonFeature

Relates to

JSON_PROCESSING_SPEC-44 "Replace JsonFeature/JsonConfiguration with Properties" {Open}
chapter The missing parts / More explicit immutability
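The chapter's argument can be sketched end-to-end in one self-contained example. All names are hypothetical (this is not the JSR 353 draft): a marker JsonFeature whose data stays implementation-private, eagerly validating static factories in the spirit of CoolJsonLibraryFeatures, and a builder-like immutable configuration whose with() returns a new instance instead of mutating.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the marker-interface feature pattern plus immutable configuration.
public final class ConfigDemo {

    public interface JsonFeature { } // marker: no accessors on purpose

    // Implementation-private carrier of the pretty-printing settings.
    private static final class PrettyPrinting implements JsonFeature {
        final String newline;
        final int spaces; // -1 means tab indentation
        PrettyPrinting(String newline, int spaces) {
            this.newline = newline;
            this.spaces = spaces;
        }
    }

    // Invalid data fails here, at feature creation time, not later at
    // factory/parser/generator creation.
    public static JsonFeature prettyPrintingWithSpacesIndentation(String newline, int n) {
        if (n <= 0) throw new IllegalArgumentException("spaces must be > 0");
        return new PrettyPrinting(newline, n);
    }

    public static JsonFeature prettyPrintingWithTabIndentation(String newline) {
        return new PrettyPrinting(newline, -1);
    }

    public static final class Config {
        private final List<JsonFeature> features;
        public Config() { this(Collections.<JsonFeature>emptyList()); }
        private Config(List<JsonFeature> features) { this.features = features; }

        // Immutable builder step: existing factories keep their old snapshot,
        // matching option b) above (changes affect only new instances).
        public Config with(JsonFeature feature) {
            List<JsonFeature> copy = new ArrayList<>(features);
            copy.add(feature);
            return new Config(Collections.unmodifiableList(copy));
        }
        public int featureCount() { return features.size(); }
    }

    public static void main(String[] args) {
        Config base = new Config();
        Config pretty = base.with(prettyPrintingWithTabIndentation("\n"));
        System.out.println(base.featureCount());   // base is untouched
        System.out.println(pretty.featureCount());
        try {
            prettyPrintingWithSpacesIndentation("\n", 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at feature creation");
        }
    }
}
```

Note that the caller never reads a feature's data back; only the (here, same-class) implementation that created the feature ever looks inside it.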
Naming conventions

I think some people misunderstand the importance of naming conventions. According to the mailing list discussion "JSON API naming JsonObject/JsonMap", there was a proposal to rename JsonObject to JsonMap and JsonArray to JsonList. Well, that is really a bad idea. In the same way that you cannot build a sophisticated finance library without using the domain vocabulary (financial terms), and should not try to name financial domain model objects after Java types, it is just delusional to try to name JSON domain model objects after the Java collections. People using an API need to understand the underlying domain to use the API correctly.

Relates to

JSON_PROCESSING_SPEC-39 "JSON API renaming: s/JsonArray/JsonList and s/JsonObject/JsonMap" {Won't Fix}

API shortcomings

The following part focuses exclusively on the user experience of the current JSON Processing API. Some of the examples may be taken out of context, or a special method chain may be demonstrated just to show some of the shortcomings of the current design. Try to look at the code as if you had not been discussing it for the last months.

Inconsistent & misleading method names

Consider the following code:

    String someText         = jsonArray.getValue(0, JsonString.class).getValue();
    String somePropertyText = jsonObject.getValue("prop", JsonString.class)
                                        .getValue();

In order to get a String item value from the given jsonArray, or a String property value from the given jsonObject, it seems we need to call getValue() twice. This seems wrong. The first problem is that JsonString contains a method getValue(), which is self-descriptive in usual use cases, but not in the previous context when used with method chaining. The second problem is the slightly misleading method names of JsonObject and JsonArray. Let's look at the current method declarations of JsonObject (similar observations apply to JsonArray):

    int    getIntValue   (String name);
    String getStringValue(String name);
    <T extends JsonValue> T getValue(String name, Class<T> clazz);

The problem is that while the get...Value() methods return the corresponding ... type, getValue() is not a clear name, as it does not return the "final" value but rather an object wrapper over it.

Proposed API change

I would propose renaming the getValue() method in JsonString to getStringValue(), so the method naming would be more consistent throughout the API. I would also propose to rename
the getValue() methods in both JsonObject and JsonArray to getProperty() and getItem() respectively.

The API comparison

    String someText         = jsonArray.getItem(0, JsonString.class).getStringValue();
    String somePropertyText = jsonObject.getProperty("prop", JsonString.class)
                                        .getStringValue();

Conclusion

By renaming these methods, the API would be more consistent and more readable, especially in chained method invocations.

Influences

JsonObject, JsonArray

The definition of equals() & hashCode() semantics

Consider the following imaginary test, as if it were part of the TCK:

    @Test
    public void jsonObjectEqualsMethod_shouldHandleFloatingNumbersCorrectly() {
        JsonReader jsonReader = new JsonReader(
                new StringReader("{ \"number\": 10.0 }"));
        JsonObject readJsonObject = jsonReader.readObject();

        JsonObject builtJsonObject = new JsonObjectBuilder()
                .add("number", BigDecimal.TEN)
                .build();

        assertThat(readJsonObject).isEqualTo(builtJsonObject);
    }

The problem is that this test, as of the current specification and RI (reference implementation), fails. It looks like the current definition of the equals() & hashCode() semantics violates the principle of least astonishment; the user would most likely expect this test to pass. Well, I am cheating here a little, as this test was specially prepared to raise the question of whether the current equality semantics are defined correctly. The trick is that java.math.BigDecimal's equals() method respects the number's scale (1.0 is not equal to 1.00). However absurd it may look (it is in fact understandable after studying floating-point arithmetic), the question is whether that is the behavior we want. Do we really want to specify that 10 is not equal to 10.0? I guess regular API users would be surprised and upset after finding the cause of the problem. Remember, the JSON language has nothing to do with the BigDecimal class and its equals() or hashCode() semantics.
BigDecimal was chosen as the obvious solution for handling arbitrary-precision floating-point numbers in Java, but it seems wrong to me to let such an internal design decision influence the API usability. This is even more important as every JsonNumber instance will internally use BigDecimal, even if the value could fit into a double precisely.
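The scale-sensitive behavior can be demonstrated with plain JDK code, no JSON API involved (a minimal sketch; the class name is mine):

```java
import java.math.BigDecimal;

public class BigDecimalEquality {
    public static void main(String[] args) {
        BigDecimal ten       = new BigDecimal("10");
        BigDecimal tenScaled = new BigDecimal("10.0");

        // equals() respects the scale, so 10 is not equal to 10.0 ...
        System.out.println(ten.equals(tenScaled));             // false
        // ... although numerically the two values are the same
        System.out.println(ten.compareTo(tenScaled) == 0);     // true
        // stripTrailingZeros() normalizes both to 1E+1
        System.out.println(ten.stripTrailingZeros()
                .equals(tenScaled.stripTrailingZeros()));      // true
    }
}
```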
I didn't find more information about this specific issue anywhere on the mailing list (whether this was intentional, further discussed, or just picked as the obvious solution). Whatever resolution we come up with, I think this is an important part of the specification and has to be explicitly mentioned in the API contract with examples (just saying that BigDecimal is used is probably not descriptive enough for users not familiar with BigDecimal).

The ideas on API change

For simplicity of use I would propose to change the equality semantics so that 10 is equal to 10.0 and 10.0 is equal to 10.00. The implementation is, however, not trivial. The naive approach would be to use the BigDecimal.stripTrailingZeros() method, but considering that the mentioned method creates a new BigDecimal object during every call, and that both objects would need to be converted in such a way (as '10.0' converts to '1E+1'), that would create 2 unnecessary objects every time equals() or hashCode() is called. It looks easier and more efficient to use the compareTo() method instead. But that still solves only half of the problem, as both the equals() and hashCode() implementations need to stay in sync. That "synchronization" is not necessarily obvious, and even libraries like Commons Lang got it wrong (see issues LANG-393 and LANG-467). What about combining both solutions? That way the contract of equals() & hashCode() would not be violated and the API would have the expected behavior. We could even cache the stripped BigDecimal instance for performance reasons.
The API comparison

    @Override
    public int hashCode() {
        return bigDecimal.stripTrailingZeros().hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null)
            return false;
        if (obj == this)
            return true;
        if (obj.getClass() != JsonNumberImpl.class)
            return false;

        JsonNumber other = (JsonNumber) obj;
        return bigDecimal.compareTo(other.getBigDecimalValue()) == 0;
    }

Conclusion

By changing the equals() & hashCode() semantics, the equality of JsonNumber instances would behave in the way most users would probably expect.

Influences

JsonNumberImpl

Relates to

JSON_PROCESSING_SPEC-14 "equals()/hashCode() for object model" {Fixed}
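A minimal standalone sketch verifies that the combined approach keeps equals() and hashCode() in sync across different scales (the class name NumberSketch is hypothetical and its field stands in for the internal BigDecimal of the RI; it also demonstrates the caching idea mentioned above):

```java
import java.math.BigDecimal;
import java.util.Objects;

public final class NumberSketch {
    private final BigDecimal value;
    // cached normalized form, so stripTrailingZeros() runs only once
    private final BigDecimal normalized;

    public NumberSketch(BigDecimal value) {
        this.value      = Objects.requireNonNull(value);
        this.normalized = value.stripTrailingZeros();
    }

    @Override
    public int hashCode() {
        // defined on the normalized value, so it agrees with equals()
        return normalized.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this)
            return true;
        if (!(obj instanceof NumberSketch))
            return false;

        // compareTo() ignores the scale, so 10.0 equals 10.00
        return value.compareTo(((NumberSketch) obj).value) == 0;
    }

    public static void main(String[] args) {
        NumberSketch a = new NumberSketch(new BigDecimal("10.0"));
        NumberSketch b = new NumberSketch(new BigDecimal("10.00"));

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```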
The semantics of toString()

The mailing list discussion "JsonObject look like dom api, why use for stream api?" unveils an important question: whether the behavior of toString() for JSON model objects should be defined in the specification. Let me first tell you a story learned from writing a javac AST (Abstract Syntax Tree) visualizer. I was writing a visualizer connecting a graphical representation of the AST produced by the javac compiler with textual matching of the selected AST sub-tree within the source code. I was using the Tree.toString() instance methods of the Compiler Tree API basically across the whole codebase. And it was great, it worked correctly, but most importantly it was easy to use (writing a custom pretty printer for the Java language is not trivial without consulting the JLS, the Java Language Specification). Well, it really worked correctly, but only until I started visualizing some special cases (inner classes, synthetic methods, ...). Then it turned out that some of the special cases produce invalid (non-compilable) Java source representations. And that was really bad, as the visualizer was suddenly not able to visualize every valid Java source file! I wrote to the compiler-dev mailing list about the bug and the answer was interesting. They simply said that toString() should be used for debugging purposes only, so it's not really a bug and I should write my own pretty printer anyway. I was not happy with that, but in the same way as I was convinced at the time that they were just wrong, now I am convinced that they were right. I was relying on something not specified, and even though there was no direct replacement for it (no notion of pretty printers or specific writers), I should just not have done that. Now, what can be learned from that story is the fact that users will almost certainly first try the toString() method for serialization (and will not care whether it is specified or not). If it works, they will not bother to find a replacement.
Seriously, why would they if it just worked? From a dependency management point of view, JSON model objects should really not depend on JsonWriter internally. In the current state it is not such a problem, as there is only a single module for the whole JSON Processing API specification, but if there were 3 modules (model, low-level streaming API and tree API), such a class usage would create a circular dependency. Circular dependencies are the worst situation you could end up with. Still, even if the majority of people will misuse the API, that's kind of their problem, and we need to think about usability (for easy cases it should just work). We could, however, minimize the misuse of such method calls by returning a String representation without pretty printing. That's correct, as it should be used only for debugging purposes, but it quickly becomes less useful for bigger JSON structures (because of readability).

Conclusion

I would propose not to specify the toString() semantics as part of the specification. The toString() implementation should be independent of JsonWriter and the pretty printing configuration, and we should explicitly say that the pretty printing feature only applies to JsonWriter.

Influences

JsonArray, JsonObject

Relates to

JSON_PROCESSING_SPEC-21 "Add toString() to JsonObject/JsonArray" {Fixed}
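The idea of a debugging-only, non-pretty-printed representation can be sketched with a purely hypothetical model class (the names and structure here are mine, not part of the API):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CompactToString {
    // hypothetical stand-in for a JSON model object
    static final class ObjectSketch {
        private final Map<String, Object> properties = new LinkedHashMap<>();

        ObjectSketch put(String name, Object value) {
            properties.put(name, value);
            return this;
        }

        // compact, single-line output: good enough for debugging,
        // deliberately unattractive for serializing big structures
        @Override
        public String toString() {
            return properties.entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                    .collect(Collectors.joining(",", "{", "}"));
        }
    }

    public static void main(String[] args) {
        System.out.println(new ObjectSketch().put("number", 10).put("flag", true));
        // → {"number":10,"flag":true}
    }
}
```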
JsonValue.NULL vs. null literal mismatch

This subchapter is the most important part of the whole API shortcomings chapter. The mailing list discussion "Json Null type: friend or foe?" first recognized the problem with the JSON Null value. When I dug a little bit deeper, it turned out to be a much more serious problem with consistency.

Consistency

Let's discuss the following code written against the current version of the JSON Processing API:

    JsonObject jsonObject = new JsonReader(
            new StringReader("{ \"optVal\" : null }"))
        .readObject();

    jsonObject.getStringValue("optVal");             // throws ClassCastException
    jsonObject.getStringValue("optVal2");            // throws NullPointerException

    jsonObject.getValue("optVal",  JsonValue.class); // returns JsonValue.NULL
    jsonObject.getValue("optVal2", JsonValue.class); // returns null

    jsonObject.get("optVal");                        // returns null
    jsonObject.get("optVal2");                       // returns null

You did spot that there is something wrong, didn't you? Asking JsonObject for a String value that is set to null throws a ClassCastException, but asking it for a non-existing String value throws a NullPointerException instead. That doesn't make much sense. Asking JsonObject for a JsonValue that is set to null returns the JsonValue.NULL instance, but asking it for a non-existing JsonValue returns the null literal, which is not the same as JsonValue.NULL. Again, something suspicious here. Last but not least, using the get() methods inherited from java.util.Map returns the null literal in both cases. This is a consistency problem, because trying to get the value by using 3 different API calls behaves differently in every case! This is important, as using a Map view over the JsonObject requires different code to behave the same way as using direct methods on JsonObject and vice versa, and I can bet that this would confuse users. Seriously, this is an API design disaster and a usability nightmare.
An API should not throw exceptions for regular actions, as exceptions should only signal exceptional states (not the case of the getStringValue() method). An API should be consistent in general method usage, and if it is not, there should be a very good reason for it (again not the case of JsonObject). I think the most important point in solving the inconsistency is the observation that it does not really make any difference whether there is a property with a null value or no property at all. The only thing that matters is that there is no value for such a property. By generalizing those 2 states (but still allowing you to determine which is the real state), the JSON Processing API would be more consistent and easier to use.

Proposed API change

I would propose to get rid of JsonValue.NULL completely. The ability to recognize a null value on the streaming API level (JsonParser & JsonParser.Event) is still necessary and that part should not be changed. However, the tree API (JsonObject) should no longer use JsonValue.NULL, as it only causes confusion. To be able to distinguish between an existing property with a null value and a non-existing property, a method like boolean hasProperty(String) should be added to JsonObject.
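The distinction the proposed hasProperty(String) method would make is exactly the one java.util.Map already needs containsKey() for, since get() alone cannot tell a null value from a missing key:

```java
import java.util.HashMap;
import java.util.Map;

public class NullVsMissing {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("optVal", null);          // existing key with a null value

        // get() cannot distinguish the two states ...
        System.out.println(map.get("optVal"));           // null
        System.out.println(map.get("optVal2"));          // null

        // ... but containsKey() can, just like the proposed hasProperty()
        System.out.println(map.containsKey("optVal"));   // true
        System.out.println(map.containsKey("optVal2"));  // false
    }
}
```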
The only remaining problem is then the handling of non-existing properties within the convenience methods returning primitive types (e.g. JsonArray#getIntValue(int)), because null cannot be returned as a primitive type. I see only 2 options here. Either throw an exception (I don't like this) or provide a method with a required default value argument (seems better, but still). Maybe both kinds of methods could be added and the user will pick the error handling policy he prefers.

Conclusion

By removing JsonValue.NULL and adding some additional methods, the whole part of the API responsible for getting JsonObject property values or JsonArray items would be consistent. Please, don't provide just another "EJB 2 disaster"-like API.

Influences

JsonArray, JsonObject, JsonValue.NULL

Relates to

JSON_PROCESSING_SPEC-43 "JsonObject#getIntValue() javadoc says it would return null" {Open}
JSON_PROCESSING_SPEC-45 "JsonObject/JsonArray accessor methods that take default values" {Open}
chapter Why JSON model objects should not extend collections directly

Minor improvements

Exception handling

I think the internal exception handling could be done a little bit more simply. According to the mailing list discussion on JSON exceptions, JsonException should wrap all of ClassNotFoundException, InstantiationException and IllegalAccessException. In Java 7 there is a better way to handle the mentioned exceptions: java.lang.ReflectiveOperationException. Using this exception could simplify things a little bit. The current exception type hierarchy also incorrectly provides a zero-argument constructor (and a constructor with only a Throwable). I would personally not use any of those constructors, as just saying that there is something wrong with the generating/parsing/processing is not very useful. The message should always be available, together with an optional Throwable.
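The Java 7 approach mentioned above can be sketched as follows: one catch clause covers all three reflective exceptions, and the wrapper always carries a descriptive message (ProcessingException is a hypothetical stand-in for JsonException, and the class name being looked up is made up):

```java
public class WrapReflectiveFailure {
    // hypothetical stand-in for JsonException
    static class ProcessingException extends RuntimeException {
        ProcessingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static Object instantiate(String className) {
        try {
            return Class.forName(className).newInstance();
        } catch (ReflectiveOperationException e) {
            // covers ClassNotFoundException, InstantiationException
            // and IllegalAccessException in a single clause
            throw new ProcessingException("Cannot instantiate provider " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            instantiate("com.example.MissingProvider");
        } catch (ProcessingException e) {
            System.out.println(e.getMessage());
            System.out.println(e.getCause().getClass().getSimpleName());
        }
    }
}
```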
An exception constructor without a message is only applicable when the exception name exactly describes the cause of the problem, which is not the case in the JSON Processing API (e.g. JsonGenerationException vs. DuplicateKeysDuringGenerationException).

Mixed / Voting for

- +1 for <T extends JsonValue> List<T> JsonArray#getValuesAs(Class<T>)
- +1 for JsonObject#getValueWithDefault(...)
- +1 for JsonObject#getBooleanValue(...)
- +1 for JsonParser#skipValue() (JSON_PROCESSING_SPEC-29)
- +1 for the ability to use single quoted strings (very practical for Java unit testing)
- +1 for adding methods to JsonParser to get objects (JSON_PROCESSING_SPEC-9)
The missing parts

Date support

Possible configuration options

This part summarizes all the configuration options I think could be at least at some level useful. I am not proposing to add all of them as standard configuration options. I am also convinced that some should rather be implementation-specific than built-in (at least we can inspire specification implementers).

- more control over flushing (using a buffered adapter, using a custom buffer, defining the buffer size, ...)

More explicit thread model

I think the spec needs to be more explicit about the thread model. Especially in the context of Java EE and the upcoming JSR 236: Concurrency Utilities for Java EE, we need to be clear about which methods and objects are thread-safe and which objects can be safely cached. The definition of the thread model has to be consistent with immutability. Every object and method should be explicitly marked as thread-safe or not thread-safe (as part of its javadoc) to avoid confusion and possible misinterpretation.

More explicit immutability

In the same way as a more explicit thread model needs to be defined, more explicit immutability needs to be documented. As most of the objects of the JSON Processing API are immutable (and the others should be converted), this creates a contract that should not be violated. If objects are immutable and the semantics of the equals() and hashCode() methods are part of the API (which applies to all JSON model objects), immutability is not optional, but required (as the mentioned methods can only be defined in terms of immutable fields). Every object should be explicitly marked as mutable or immutable (as part of its javadoc) to avoid confusion and possible misinterpretation.
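The immutability contract described above can be sketched with a hypothetical model object: a final class with only final fields and no setters, so its equals()/hashCode() results are stable for its whole lifetime and the instance is safe to share between threads without synchronization (the class name and field are my own, not part of the API):

```java
import java.util.Objects;

// hypothetical JSON-string-like model object
public final class ImmutableStringValue {
    private final String value;

    public ImmutableStringValue(String value) {
        this.value = Objects.requireNonNull(value);
    }

    public String getStringValue() {
        return value;
    }

    // defined purely in terms of the immutable field, so the
    // hash code can never change during the object's lifetime
    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ImmutableStringValue
                && value.equals(((ImmutableStringValue) obj).value);
    }
}
```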
