IBM
Languages:
For the Man or the Machine?
Gireesh Punathil
JavaOne | Sep 18 – 22 | San Francisco
Agenda
❊ Machine and the Machine Code
❊ Language Classification
❊ Abstraction types and their implications
❊ Major Language paradigms
❊ Java Perspectives
❊ Stories from Scripts
❊ Expressiveness plus Efficiency
Introduction to the speaker
❊ 14 years of experience: Developing, Porting, and Debugging large and
complex System software modules
❊ Virtual machines, Language Runtimes, Compilers, Web Servers
❊ Active Contributor to Open source Projects
❊ Interests: Language semantics, Subroutine linkage, Code optimization,
Virtual machines, Process runtime, PaaS, Core file debugging
❊ Focus area: PaaS
linkedin: gireeshpunathil
Twitter : @gireeshpunam
Github : gireeshpunathil
Email : gpunathi@in.ibm.com
Machine and the Machine Code
❊ Logic implemented by Circuits
❊ Behavior specified by Architecture
❊ Capability abstracted by Instructions
❊ Instructions encoded in bits
❊ Code and Data referred to by address
💻
Non-abstracted Capabilities
❊ Arithmetic: ADD
❊ Copy: MOV
❊ Compare: CMP
❊ Control: JMP, CALL, RET
❊ Port access: IN, OUT
❊ CAS: CMPXCHG
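A minimal sketch of how the CAS capability looks from Java. `AtomicInteger.compareAndSet` is the library-level form of an instruction such as CMPXCHG; how the JVM maps it to the hardware is platform dependent. The names `CasDemo` and `incrementWithCas` are illustrative and not from the talk.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasDemo {
    // Retry loop: read the current value, then try to swap in value + 1.
    // compareAndSet succeeds only if no other thread changed the value
    // between the read and the swap.
    static int incrementWithCas(AtomicInteger counter) {
        while (true) {
            int seen = counter.get();
            if (counter.compareAndSet(seen, seen + 1)) {
                return seen + 1;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(41);
        System.out.println(incrementWithCas(counter));
    }
}
```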
Benefits
❊ Fast and Powerful
❊ Direct access to devices
❊ Little code transformation
❊ Low resource consumption
Drawbacks
♨ Lacks portability
♨ Code maintenance difficult
♨ Hard to read
♨ Un-named data
♨ Hard to debug issues
♨ Very few runtime checks
💉 🔌
C – A thin wrapper around Assembly
❊ Arithmetic: +, -, *, /, +=, ++
❊ Copy: =, memset(), strcpy()
❊ Compare: ==, !=, <, >, >=
❊ Control: if, for, switch, (), return
❊ Port access: read(), write()
❊ CAS: mutex, semaphore, condition variables
🚀
C – Often as powerful as Assembly
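unsigned long long mytime(void)
{
    unsigned int lo, hi;
    /* x86 only: rdtsc returns the 64-bit time-stamp counter in EDX:EAX */
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((unsigned long long)hi << 32) | lo;
}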
http://www.tldp.org/HOWTO/text/IO-Port-Programming ⏱
Programming Language Classification
Paradigm based
❊ General purpose
❊ Focus on S/W domain
❊ Rules on code & data
❊ 1st level of Abstraction
❊ C, C++, C#, Java
Script based
❊ Discrete commands strung into a coherent whole
❊ Automate repeatable tasks
❊ 2nd level of Abstraction
❊ Py, PHP, JS, Ruby, Bash
Domain based
❊ Focus on problem domain
❊ Validation at business level
❊ Used in limited scope
❊ 3rd level of Abstraction
❊ HTML, SQL, SED, AWK
Implications of Abstraction
❊ Compilers: Optimization, Transformation
❊ [ GCC, MSVC, Clang, Javac ]
❊ Transpilers: Source-to-Source Transformation
❊ [ CoffeeScript, Jython, JRuby ]
❊ Interpreters: Translation, Command-to-Action
❊ [ Bash, JVM, V8, Python ]
❊ Virtual machines: Virtualization and Simulation
❊ [ JVM, LLVM, CLR, ART ]
❊ Runtime Environments: Execution Support Subsystem
❊ [ GLIBC, JRE, PTHREAD, KERNEL32.DLL ]
Types of Abstraction
User space
❊ High level semantics (instructions)
❊ Typed variables (raw memory)
❊ Virtual machine (CPU)
Kernel space
❊ System calls (devices)
❊ Threading (Scheduling)
❊ APIs (I/O, net, FS, resources)
⚙ ⚓️
Major Language Paradigms
Object Orientation
Expressiveness
❊ Data organization
❊ Data Modelling
❊ Behavior specialization
❊ Composition, Delegation
❊ Polymorphism
❊ Re-usability
❊ Modularity
Compiler Pressure
❊ Data organization cost
❊ Data access cost
❊ Data optimization cost
❊ Code optimization cost
❊ Code bloating
❊ Weak Spatial Locality
Runtime Pressure
❊ Runtime code verification
❊ Runtime type verification
❊ Runtime linking
❊ Dynamic dispatch
❊ Method de-virtualization
❊ Dynamic memory
❊ Synchronization
❊ Serialization
Functional Programming
Expressiveness
❊ Functions as variables
❊ Continuation passing
❊ Higher order functions
❊ Code loosely bound to data, applied as custom agents
Compiler Pressure
❊ Data access validation
❊ State definition
❊ Contextualization
Runtime Pressure
❊ State Creation
❊ Context management
❊ Context Synchronization
❊ Context lifecycle
❊ Disambiguation
❊ Runtime Code generation
❊ Memory management
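The functional features listed on this slide can be sketched in Java as follows. This is an illustration only: `FunctionalDemo`, `compose`, and `counter` are hypothetical names. It shows a function held in a variable, a higher order function, and a closure whose captured state the runtime has to create and keep alive.

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class FunctionalDemo {
    // Higher order function: takes two functions, returns their composition.
    static <A, B, C> Function<A, C> compose(Function<A, B> f, Function<B, C> g) {
        return a -> g.apply(f.apply(a));
    }

    // Closure: the returned supplier keeps 'count' alive after this method
    // returns, so the captured state outlives the stack frame that made it.
    static Supplier<Integer> counter() {
        int[] count = {0};
        return () -> ++count[0];
    }

    public static void main(String[] args) {
        Function<Integer, Integer> addOne = x -> x + 1;  // function in a variable
        Function<Integer, Integer> twice = x -> x * 2;
        System.out.println(compose(addOne, twice).apply(3));  // (3 + 1) * 2

        Supplier<Integer> next = counter();
        next.get();
        System.out.println(next.get());  // second call on the same closure
    }
}
```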
Java Perspectives
Virtual Methods
Expressiveness
❊ Enable specialization
❊ Runtime polymorphism
❊ Mimic real-world inheritance models
Compiler Pressure
❊ Hierarchy validation
❊ Virtual method table creation
❊ Class Hierarchy Analysis
Runtime Pressure
❊ Method lookup
❊ Dynamic binding
❊ Code aggregation
❊ Virtual guarding
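A small sketch of what the virtual method slide describes, using hypothetical classes (`Shape`, `Circle`, `Square`). The call site in `describe` is compiled once against `Shape`, and the target method is chosen at run time from the receiver's actual class.

```java
final class VirtualDemo {
    static class Shape {
        String name() { return "shape"; }  // instance methods are virtual by default
    }

    static class Circle extends Shape {
        @Override String name() { return "circle"; }
    }

    static class Square extends Shape {
        @Override String name() { return "square"; }
    }

    // One call site, several possible targets: this is the method lookup
    // and dynamic binding the runtime has to perform or optimize away.
    static String describe(Shape s) {
        return s.name();
    }

    public static void main(String[] args) {
        for (Shape s : new Shape[] { new Shape(), new Circle(), new Square() }) {
            System.out.println(describe(s));
        }
    }
}
```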
Synchronization
Expressiveness
❊ Synchronization intrinsic to language
❊ Locks intrinsic to Objects
❊ Granular at function and block level
Compiler Pressure
❊ Syntax and Semantics validation
Runtime Pressure
❊ Lock word management
❊ Implement sync. primitives
❊ Fast path sync.
❊ Slow path sync.
❊ Exception handling
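The two granularities named on this slide, sketched under the assumption of a simple shared counter. `SyncDemo` and its methods are illustrative names; both forms lock the monitor of the same object, so the updates never race.

```java
final class SyncDemo {
    private int hits = 0;

    // Method-level granularity: locks the monitor of 'this' for the whole body.
    synchronized void hitMethod() { hits++; }

    // Block-level granularity: locks the same monitor, but only around
    // the critical section.
    void hitBlock() {
        synchronized (this) { hits++; }
    }

    synchronized int hits() { return hits; }

    // Runs 'threads' workers, each doing 'perThread' rounds of both updates.
    static int run(int threads, int perThread) {
        SyncDemo demo = new SyncDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.hitMethod();
                    demo.hitBlock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.hits();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));  // 4 threads * 1000 rounds * 2 updates
    }
}
```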
Threading
Expressiveness
❊ Abstracts execution sequence
❊ Flexible creation models
❊ Lifecycle management
❊ Backbone of concurrency
❊ Backbone of Multicore exploitation
Compiler Pressure
❊ (none listed)
Runtime Pressure
❊ Cost of Native threading
❊ Cost of stack management
❊ Cost of context switching
❊ Cost of synchronization
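A hedged sketch of thread creation and lifecycle in Java: a sum split across several threads. `ThreadDemo` and `parallelSum` are hypothetical names. Each started thread carries the native thread, stack, and context switch costs listed on the slide.

```java
final class ThreadDemo {
    // Splits the sum 1..n over 'parts' threads. Each thread writes only its
    // own slot, so no locking is needed; join() makes the results visible.
    static long parallelSum(int n, int parts) {
        long[] partial = new long[parts];
        Thread[] workers = new Thread[parts];
        for (int p = 0; p < parts; p++) {
            final int id = p;
            workers[p] = new Thread(() -> {
                for (int i = id + 1; i <= n; i += parts) {
                    partial[id] += i;
                }
            });
            workers[p].start();  // NEW -> RUNNABLE
        }
        long total = 0;
        for (int p = 0; p < parts; p++) {
            try {
                workers[p].join();  // wait until the worker is TERMINATED
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += partial[p];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100, 4));  // sum of 1..100
    }
}
```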
Garbage Collection
Features
❊ Automatic Object memory management
Compiler Pressure
❊ (none listed)
Runtime Pressure
❊ Cost of the Stopped World
❊ Cost of Copy Collection
❊ Cost of Stack walk
❊ Cost of Marking
❊ Cost of Sweeping
❊ Cost of Compaction
Native Interfacing
Expressiveness
❊ Special cases to descend into a low level language
❊ Fill the gap in platform abstraction
Compiler Pressure
❊ Syntax validation
❊ Type verification
❊ Call semantics validation
❊ Stub creation
Runtime Pressure
❊ Dynamic loading
❊ Dynamic linking
❊ Type conversion/validation
❊ Environment management
❊ Stack management
❊ Context switching
❊ Memory management
this
Expressiveness
❊ Anchor Java Object
❊ Disambiguate inheritance
Compiler Pressure
❊ Syntax validation
❊ Access verification
Runtime Pressure
❊ Instance check cost
❊ Field access cost
❊ Method access cost
❊ Invocation cost
❊ Locking cost
Class
Expressiveness
❊ Custom Types
❊ Glues Code with Data
❊ Implements OO
❊ Models real world entities with attributes and behaviors
Compiler Pressure
❊ Syntax validation
❊ Hierarchy validation
❊ Access validation
❊ Semantic validation
❊ Constant pool creation
❊ Bytecode generation
❊ Unitization
Runtime Pressure
❊ Class loading cost
❊ Class loader cost
❊ Class initialization cost
❊ Reflection cost
❊ Object header cost
❊ Field access cost
❊ Method access cost
❊ Invocation cost
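To make the reflection cost concrete, here is a small sketch with a hypothetical `Greeter` class. The method is looked up by name at run time and then invoked. The lookup, access check, and argument handling are extra work that a direct call avoids.

```java
import java.lang.reflect.Method;

final class ReflectionDemo {
    static class Greeter {
        public String greet(String who) { return "hello " + who; }
    }

    // Resolves "greet" by name on the target's class, then invokes it.
    static String greetReflectively(Object target, String who) {
        try {
            Method m = target.getClass().getMethod("greet", String.class);
            return (String) m.invoke(target, who);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Greeter g = new Greeter();
        System.out.println(g.greet("duke"));                // direct call
        System.out.println(greetReflectively(g, "duke"));   // same result via reflection
    }
}
```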
Bytecode, a.k.a. Portability
Expressiveness
❊ Write Once, Run Everywhere
❊ Forget the real machine, learn only language spec. and virtual machine spec.
Compiler Pressure
❊ Syntax validation
❊ Hierarchy validation
❊ Access validation
❊ Semantic validation
❊ Constant pool creation
❊ Bytecode generation
❊ Unitization
Runtime Pressure
❊ Interpretation cost
❊ Dynamic Compilation cost
❊ Classloading cost
❊ Runtime verification cost
❊ Exception handling cost
Stories from Scripts
Dynamic Typing
Features
❊ Model data closer to the real world
❊ Data bound to the Object, not to the Class
Compiler Pressure
❊ Data access cost
❊ Type inference cost
Runtime Pressure
❊ Object Lookup cost
❊ Data access cost
❊ Type inference cost
❊ Heterogeneous type management cost
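Java is statically typed, so the following is only an emulation of what a dynamic language runtime does, assuming a map-backed object. `DynamicDemo`, `newObject`, and `describe` are hypothetical names. Fields belong to the instance rather than to a class layout, and every operation has to inspect the type of the value first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DynamicDemo {
    // A "dynamic object": fields live in the instance, not in a class layout.
    static Map<String, Object> newObject() {
        return new LinkedHashMap<>();
    }

    // The runtime must check the value's type on every operation.
    static String describe(Object value) {
        if (value instanceof Integer) return "int:" + value;
        if (value instanceof String) return "string:" + value;
        return "other:" + value;
    }

    public static void main(String[] args) {
        Map<String, Object> point = newObject();
        point.put("x", 3);
        point.put("label", "origin");  // field added to this object only
        point.put("x", "three");       // same field, now a different type
        for (Map.Entry<String, Object> field : point.entrySet()) {
            System.out.println(field.getKey() + " -> " + describe(field.getValue()));
        }
    }
}
```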
Runtime Evaluation
Features
❊ Executable in a String
❊ Run arbitrary, unprepared code
Compiler Pressure
❊ Code verification
❊ Data verification
❊ Consistency check
Runtime Pressure
❊ Entire process of parsing, compilation, transformation, interpretation initiated at a call site
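A toy illustration of "executable in a string", not a real language runtime: a hypothetical `EvalDemo.eval` that parses and evaluates integer expressions with `+` and `*`. Every call repeats the whole parse-and-interpret pipeline at the call site, which is the runtime pressure the slide points to.

```java
final class EvalDemo {
    private final String src;
    private int pos = 0;

    private EvalDemo(String src) {
        this.src = src.replace(" ", "");
    }

    // Entry point: parsing and evaluation both happen here, on every call.
    static int eval(String code) {
        EvalDemo e = new EvalDemo(code);
        int value = e.sum();
        if (e.pos != e.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + e.pos);
        }
        return value;
    }

    // sum := product ('+' product)*
    private int sum() {
        int value = product();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;
            value += product();
        }
        return value;
    }

    // product := number ('*' number)*
    private int product() {
        int value = number();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            value *= number();
        }
        return value;
    }

    // number := digit+
    private int number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("digit expected at " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * 3"));  // '*' binds tighter than '+'
    }
}
```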
When Expressiveness Balances with Efficiency
Python: Analytics
Packing and Zipping
Generator expressions
Tuples, Sets and Queues
OS module: thinnest wrapper
around platforms
• Beautiful is better than ugly
• Explicit is better than implicit
• Simple is better than complex
• Complex is better than complicated
• Readability counts
…
• Practicality beats purity
Deep learning Semantics Zen of Python
Swift: Concurrency and Parallelism
// Swift 3 form; 'queue' is assumed to be a DispatchQueue
queue.async {
    parseOneTBData()
}
Concurrency Semantics Multi-core exploitation
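For comparison, a rough Java counterpart of dispatching work to a queue, sketched with an `ExecutorService`. `DispatchDemo` and `parseData` are stand-ins invented for this example; `parseData` only mimics a long-running job such as the slide's `parseOneTBData()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DispatchDemo {
    // Stand-in for a long-running job: sums 1..records.
    static long parseData(int records) {
        long checksum = 0;
        for (int i = 1; i <= records; i++) {
            checksum += i;
        }
        return checksum;
    }

    // Hands the job to a pool sized to the available cores and waits
    // for the result only when it is needed.
    static long dispatch(int records) {
        ExecutorService queue = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            Future<Long> result = queue.submit(() -> parseData(records));
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            queue.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(100));
    }
}
```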
Node.js: Interactive Systems
Event driven Semantics Asynchronous Callbacks
const http = require('http');
http.get('http://www.google.com', function(res) {
    console.log('net io');
});
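The same callback style can be approximated in Java with `CompletableFuture`. This is a sketch under assumptions: `CallbackDemo` and `fakeFetch` are hypothetical, and no network request is made. The callback is registered up front and runs once the asynchronous task completes.

```java
import java.util.concurrent.CompletableFuture;

final class CallbackDemo {
    // Stand-in for a network call; the URL is never actually contacted.
    static String fakeFetch(String url) {
        return "response from " + url;
    }

    // Starts the task asynchronously and chains a callback that maps the
    // response to its length; the caller gets a future back at once.
    static CompletableFuture<Integer> fetchLength(String url) {
        return CompletableFuture
                .supplyAsync(() -> fakeFetch(url))
                .thenApply(String::length);
    }

    public static void main(String[] args) {
        fetchLength("http://example.invalid")
                .thenAccept(n -> System.out.println("net io: " + n))
                .join();  // keep the JVM alive until the callback has run
    }
}
```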
Summary
❊ An ideal feature balances expressiveness with computability
❊ A Seamless, Silky route from the feature to the platform
❊ It is OK to be Polyglot
❊ Each language specializes around a central theme
❊ Keep one eye on the intended workload, and the other on the
underlying system
❊ Find the right tool for each job, and fuse them
Want to build a new Language?
❊ Obvious Challenge: Huge Initial Investment
❊ Build Language Runtime before building a Language:
  • Platform Abstraction
  • Memory management
  • Dynamic Compiler
  • Diagnostic support
❊ Eclipse OMR (Open Managed Runtime): (https://developer.ibm.com/open/omr)
  • Create and supply all common infrastructure components
  • Effort is better spent on Language features
  • Reduces (Relegates) the complexity
References
Java Virtual Machine Specification
https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf
Intel Architecture Specification
http://www.intel.in/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
Programming Language Classification
https://en.wikipedia.org/wiki/Category:Programming_language_classification
Python Language Reference
https://docs.python.org/3/reference/index.html
Swift Language Reference
https://swift.org/documentation/TheSwiftProgrammingLanguage(Swift3).epub
Node.js API reference
https://nodejs.org/api
Eclipse OMR
https://developer.ibm.com/open/omr/
Thank You!
Gireesh Punathil | gpunathi@in.ibm.com | @gireeshpunam
