A Tool for Optimizing Java 8 Stream Software via Automated Refactoring

Streaming APIs are pervasive in mainstream Object-Oriented languages and platforms. For example, the Java 8 Stream API allows for functional-like, MapReduce-style operations in processing both finite, e.g., collections, and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can actually be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we describe the engineering aspects of an open source automated refactoring tool called Optimize Streams that assists developers in writing optimal stream software in a semantics-preserving fashion. Based on a novel ordering and typestate analysis, the tool is implemented as a plug-in to the popular Eclipse IDE, using both the WALA and SAFE frameworks. The tool was evaluated on 11 Java projects consisting of ~642 thousand lines of code, where we found that 36.31% of candidate streams were refactorable, and an average speedup of 1.55 on a performance suite was observed. We also describe experiences gained from integrating three very different static analysis frameworks to provide developers with an easy-to-use interface for optimizing their stream code to its full potential.

  1. 1. A Tool for Optimizing Java 8 Stream Software via Automated Refactoring
     Raffi Khatchadourian [1, 2], Yiming Tang [2], Mehdi Bagherzadeh [3], Syed Ahmed [3]
     IEEE International Working Conference on Source Code Analysis and Manipulation, September 2018, Madrid, Spain
     [1] Computer Science, City University of New York (CUNY) Hunter College, USA
     [2] Computer Science, City University of New York (CUNY) Graduate Center, USA
     [3] Computer Science & Engineering, Oakland University, USA
  2. 2. Introduction
  3. 3. Streaming APIs
     • Streaming APIs are widely available in today’s mainstream, Object-Oriented programming languages [Biboudis et al., 2015].
     • Incorporate MapReduce-like operations on native data structures like collections.
     • Can make writing parallel code easier and less error-prone (avoiding data races and thread contention).
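To make "MapReduce-like operations on native data structures" concrete, here is a minimal sketch using the Java 8 Stream API. The class and method names are hypothetical and do not come from the slides; the pipeline maps each word to its length and reduces the lengths to a sum, first sequentially and then in parallel.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; not part of the Optimize Streams tool.
class StreamMapReduceSketch {

    // Sequential pipeline: filter, then "map" words to lengths, then "reduce" to a sum.
    static int totalLength(List<String> words) {
        return words.stream()
                .filter(w -> !w.isEmpty())  // drop empty strings
                .mapToInt(String::length)   // map step
                .sum();                     // reduce step
    }

    // Same pipeline run in parallel. This is safe here because the lambdas
    // are stateless and integer addition is associative.
    static int totalLengthParallel(List<String> words) {
        return words.parallelStream()
                .filter(w -> !w.isEmpty())
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "", "api");
        System.out.println(totalLength(words));         // prints 9
        System.out.println(totalLengthParallel(words)); // prints 9
    }
}
```

The only difference between the two methods is `stream()` versus `parallelStream()`, which is why the choice looks cheap to make by hand even when it is not obviously correct or profitable.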
  4. 4. Problem
     • MapReduce traditionally runs in highly-distributed environments with no shared memory.
     • Streaming APIs typically execute on a single node under multiple threads or cores in a shared memory space.
     • Collections reside in local memory.
     • Issues may arise from close ties between shared memory and the operations.
     • Developers must manually determine whether running stream code in parallel is efficient and interference-free.
     • Requires thorough understanding of the API.
     • Error-prone, possibly requiring complex analysis.
     • Omission-prone, optimization opportunities may be missed.
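The interference issue described on this slide can be sketched as follows. This is an illustrative example only, with hypothetical class and method names: the first method has a lambda that mutates a shared, non-thread-safe list, so running it in parallel is unsafe; the second expresses the same computation without shared mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; not part of the Optimize Streams tool.
class SideEffectSketch {

    // Unsafe in parallel: the lambda has a side effect on a shared ArrayList,
    // which is not thread-safe. Elements may be lost or reordered, or an
    // exception may be thrown. Shown for illustration; never called in main.
    static List<Integer> squaresUnsafe(int n) {
        List<Integer> out = new ArrayList<>();
        IntStream.range(0, n).parallel().forEach(i -> out.add(i * i));
        return out;
    }

    // Interference-free: no shared mutable state. The collector builds
    // per-thread partial results and merges them in encounter order.
    static List<Integer> squaresSafe(int n) {
        return IntStream.range(0, n).parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squaresSafe(4)); // prints [0, 1, 4, 9]
    }
}
```

Both methods compile and look similar, which illustrates why deciding by hand whether a stream is safe to parallelize requires understanding what each lambda touches.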
  5. 5. Solution
     • Fully-automated refactoring tool named Optimize Streams.
     • Transforms Java 8 stream code for improved performance.
     • Publicly available as an open source Eclipse IDE (http://eclipse.org) plug-in (available at http://git.io/vpTLk).
     • Includes fully-functional UI, preview pane, and unit tests.
     • Based on:
       • Novel ordering analysis.
         • Infers when maintaining ordering is necessary for semantics preservation.
       • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
         • Augments the type system with “state.”
         • Traditionally used for preventing resource usage errors.
  6. 6. Solution (continued)
     • First to integrate automated refactoring with typestate analysis, to the best of our knowledge.
     • Uses the WALA static analysis framework (http://wala.sf.net) and the SAFE typestate analysis engine (http://git.io/vxwBs).
     • Combines analysis results from varying intermediate representations (SSA, AST).
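The question an ordering analysis has to answer, namely whether encounter order must be maintained to preserve semantics, can be illustrated with a hand-written sketch. The code below is an assumption about the kind of before/after pair involved, not output of Optimize Streams, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class; not output of the Optimize Streams tool.
class OrderingSketch {

    // "Before": a sequential, ordered stream. The result is a Set, so the
    // order in which elements are processed is not observable by callers.
    static Set<String> distinctUpperSequential(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toSet());
    }

    // "After" (hand-written): since ordering need not be maintained, the
    // stream can be marked unordered and run in parallel with the same result.
    static Set<String> distinctUpperParallel(List<String> words) {
        return words.stream()
                .unordered()
                .parallel()
                .map(String::toUpperCase)
                .collect(Collectors.toSet());
    }

    // Counter-example: here ordering matters, because sorted() establishes an
    // order that limit() and the resulting List both depend on.
    static List<String> firstTwoSorted(List<String> words) {
        return words.stream()
                .sorted()
                .limit(2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("c", "a", "b", "a");
        System.out.println(distinctUpperSequential(words).equals(distinctUpperParallel(words))); // prints true
        System.out.println(firstTwoSorted(words)); // prints [a, a]
    }
}
```

In the first pair, dropping the ordering constraint preserves semantics; in `firstTwoSorted`, doing so could change which elements are returned, so a semantics-preserving refactoring would have to leave the ordering in place.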
  7. 7. Demonstration
  8. 8. Also available at http://youtu.be/YaSYH7n6y5s. Detailed video entry point links:
     • Demo start.
     • Refactoring start.
     • Refactoring end.
  9. 9. Evaluation
  10. 10. Preliminary Results
      • Applied to 11 Java projects of varying size and domain with a total of ∼642 KSLOC.
      • 36.31% of candidate streams were refactorable.
      • Observed an initial average speedup of 1.55 during performance testing.
      • See paper for more details, including user feedback, as well as tool and data set engineering challenges.
  11. 11. Conclusion
  12. 12.
      • Optimize Streams is an open source, automated refactoring tool that assists developers with writing optimal Java 8 Stream code.
      • Integrates an Eclipse refactoring with the advanced static analyses offered by WALA and SAFE.
      • 11 Java projects totaling ∼642 thousand lines of code were used in the tool’s assessment.
      • A speedup of 1.55 on the refactored code was observed as part of a preliminary study.
  13. 13. For Further Reading
      • Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis (2015). “Streams à la carte: Extensible Pipelines with Object Algebras”. In: ECOOP, pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
      • Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (2008). “Effective Typestate Verification in the Presence of Aliasing”. In: ACM TOSEM 17.2, pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
      • Strom, Robert E. and Shaula Yemini (1986). “Typestate: A programming language concept for enhancing software reliability”. In: IEEE TSE SE-12.1, pp. 157–171. doi: 10.1109/tse.1986.6312929.
  14. 14. Provocative Statements
      1. Streaming API usage does not match how the API designers envisioned it.
         Question: What are the consequences for future versions of such APIs?
      2. Using streaming APIs in mainstream, Object-Oriented languages has many benefits, such as conciseness and succinct parallelism, but hinders code reuse, thus promoting clones.
         Question: Is writing multiple, similar lambda expressions easier than writing reusable functions?