Wordware 2011: Lingoport i18n Planning & Static Analysis

The business case for internationalization, character encoding, a Java internationalization example and an overview of Globalyzer’s static analysis.


Transcript

  • 1. Successful I18n Project Planning using Static Analysis
    • Olivier Libouban, G11n Lead
    • Adam Asnes, Grand Poisson
    • Lingoport, Inc., 3985 Wonderland Hill Ave., Boulder, Colorado, USA 80304, +1 303 444 8020, www.lingoport.com
    • Copyright: March 2011. Please do not reproduce without authorized permission
  • 2. Lingoport
    • Internationalization Services
      – Assessment
      – Project planning
      – I18n development
      – I18n testing
      – Localization integration
    • Globalyzer
      – Internationalization software
        • Find and fix i18n issues in code
  • 3. Agenda
    • Business Case
    • I18n issues
    • Static Analysis Background
    • Requirements Gathering
    • Static Analysis Detail
    • Project Plan Example
    • Agile planning
    • Continuous Integration for i18n
  • 4. Engineering for Locale Support
    • Globalization (g11n) has two components:
      – Internationalization (i18n): software engineering to enable localization
      – Localization (L10n): culture-specific resources (translation, etc.)
  • 5. Business Case: Nothing gets internationalized or localized just because it would be cool
  • 6. I18n Needs – Biz vs. Tech
    • Business: Our software must be in Japanese, French, German, Chinese, and Spanish by November
    • Engineering thinks about…
      1. Multi-tiered web application?
      2. Complex Interface?
      3. Database components?
      4. Embedded Strings?
      5. Locale aware application?
      6. Can it manage multiple data formats?
      7. I18n testing plan?
      8. Tactics to get it done
  • 7. I18n is Business Driven
    • Global initiatives
      – Expanding opportunities, New customers
    • Competitive pressure
    • Lost time to market
    • Iterative code fixing, problems keep slipping through
    • Development costs in the hundreds of thousands to millions of dollars
  • 8. You Need a Plan – Scope 1st, design later
    • Project becomes real with $$$
    • CFO thinking in terms of ROI
      – Deal Based
        • Revenue – Costs = Profit
      – Strategic
        • Revenue over X years – Costs + effect on equity – risk
        • Leverage global investment of organization
      – Cost of Time to Market
        • If you’re late or lousy, that has significant opportunity cost
  • 9. Engineering: Localization is a Downstream Concern
    • “Somebody else’s problem” in the world of many developers
    • Creates an opportunity to educate and shepherd teams through globalization
  • 10. Is It Internationalized?
    • Typically underestimate i18n requirements
    • Most don’t know the answer
    • Agile or other feature and release requirements often overrun less formally measured i18n requirements
    • There is a Management Value in being able to confirm global readiness
  • 11. Example: Hard-Coded English Text
    • 1 million lines of source code
    • Found: 20,000 Embedded Strings which cannot be efficiently translated
    • String orderStatus = "Your order has been processed. A confirmation e-mail will be sent to you shortly.";
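The embedded string on slide 11 could be externalized roughly as follows. This is a minimal sketch, not the presenters' code: the class name `OrderMessages`, the key `order.status`, and the French wording are assumptions made for illustration. The two string constants stand in for hypothetical `Messages.properties` and `Messages_fr.properties` files so that the example runs on its own; a real project would more likely load such files from the classpath with `ResourceBundle.getBundle`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class OrderMessages {
    // Stand-ins for two hypothetical files: Messages.properties and Messages_fr.properties.
    private static final String EN =
        "order.status=Your order has been processed. "
        + "A confirmation e-mail will be sent to you shortly.\n";
    private static final String FR =
        "order.status=Votre commande a \u00e9t\u00e9 trait\u00e9e. "
        + "Un e-mail de confirmation vous sera envoy\u00e9 sous peu.\n";

    // Picks the bundle by language and falls back to English for anything else.
    static ResourceBundle bundleFor(Locale locale) {
        String text = "fr".equals(locale.getLanguage()) ? FR : EN;
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The code now refers to a key; the translatable text lives outside the logic.
    static String orderStatus(Locale locale) {
        return bundleFor(locale).getString("order.status");
    }

    public static void main(String[] args) {
        System.out.println(orderStatus(Locale.ENGLISH));
        System.out.println(orderStatus(Locale.FRENCH));
    }
}
```

With the text behind a key, translators work on the resource files and the Java code does not change when a language is added.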
  • 12. Character Sets/Encodings
    • Character set (e.g. Unicode)
      – A set of characters used to support a given language or series of languages; a coded character set assigns each character a numeric code point
    • Character encoding (e.g. UTF-16, UTF-8)
      – A scheme that maps each code point of a coded character set to a sequence of bytes
  • 13. Character Sets and Encoding
    • This is broken:
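The slide image for the broken case is not part of the transcript. As a stand-in, the sketch below shows one common way text ends up broken: bytes written in one encoding and read back in another. The class and method names are hypothetical; the charsets come from the standard `java.nio.charset.StandardCharsets`.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encodes as UTF-8, then wrongly decodes the same bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // "café": same four characters, different byte counts
        System.out.println(word.getBytes(StandardCharsets.UTF_8).length);    // 5
        System.out.println(word.getBytes(StandardCharsets.UTF_16BE).length); // 8
        // The two UTF-8 bytes of the accented letter become two separate characters.
        System.out.println(misdecode(word));
    }
}
```

The character set is the same in every line of this example; only the encoding used to write or read the bytes differs, which is why the data path has to agree on one encoding end to end.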
  • 14. Sample Code (Java) – i18n Examples
  • 15. I18n Engineering Considerations
    • Locale Handling
    • Character encoding
    • Strings
      – External, Grammar, Segments, Plurals, Wrapping
    • String Handling (char *, etc.)
      – Tabs, spaces, delimiters, etc.
    • Resource management
      – centralized, normalized, re-usable
    • Dates - Calendar
    • Times
    • Sorting & searching
    • Currency
    • Transaction process
    • Character set conversions
    • On line help
    • Sounds
    • Honorific titles
    • Telephone formats
    • Postal formats
    • Region-specific functions
    • Shipping conditions
    • Numerical formats
    • Page layout, LTR, RTL
    • Fonts and attributes
    • Icons, colors
    • Reporting, workflow
    • Database support
    • Multi-byte enabling
    • Business logic
    • Measurements, units
    • Input Methods
    • Data exchange
  • 16. Internationalization Challenge
    • Software Data Path - it’s not just the display
    • Diagram labels: Display, Input, Transform, Store, Transform, Retrieve
  • 17. New Internationalization Project!
    • What to do?
      – Large amount of code
      – Change in requirements
      – Change in architecture
      – Change in development practices
      – Change in testing requirements
  • 18. Practical Challenges
    • Sift through hundreds of thousands or millions of lines of code
    • Manage the fixing of complex problems
    • Create a product that looks, feels and behaves natively to its worldwide users
    • Source code must be reworked to adapt seamlessly to any language, streamlining support and updates
  • 19. Code Review
    • What to Identify
      – Embedded strings
      – Locale-Sensitive methods/functions/classes
      – Image references
      – Unsafe programming constructs (ex: regular expressions needing US Alphabetical Order, Pointer arithmetic and more)
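One instance of the unsafe constructs named on slide 19 is a regular expression that assumes the US alphabet. The sketch below contrasts such a pattern with a Unicode-aware one; the class and method names are hypothetical, and `\p{L}` is the standard Unicode letter category supported by `java.util.regex.Pattern`.

```java
import java.util.regex.Pattern;

class NameValidator {
    // Accepts only the 26 unaccented Latin letters: rejects most of the world's names.
    private static final Pattern ASCII_ONLY = Pattern.compile("[a-zA-Z]+");
    // Accepts any character in the Unicode "letter" category.
    private static final Pattern ANY_LETTERS = Pattern.compile("\\p{L}+");

    static boolean isAsciiName(String s) {
        return ASCII_ONLY.matcher(s).matches();
    }

    static boolean isLetterName(String s) {
        return ANY_LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String name = "M\u00fcller"; // "Müller"
        System.out.println(isAsciiName(name));  // false
        System.out.println(isLetterName(name)); // true
    }
}
```

A code review, manual or tool-assisted, would flag the first pattern as a candidate for replacement when the input can contain non-English text.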
  • 20. Code Analysis
    • How to Identify Issues
      – “Brute force”
        • Engineers search for and resolve known issues
        • Count display pages
        • Pseudo-localization
        • Scripts and page by page analysis
      – Globalyzer-assisted review, static analysis
        • An I18n code analysis tool is employed to examine source code for a large range of potential and known issues
        • Issues can be identified and resolved in a more systematic fashion
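Pseudo-localization, listed above among the brute-force techniques, can be sketched as a transformation applied to externalized strings before display. This is an illustrative assumption about how such a pass might look, not a description of any particular tool; the class name, the bracket markers, and the vowel mapping are hypothetical choices.

```java
class PseudoLocalizer {
    private static final String PLAIN = "aeiouAEIOU";
    // Accented counterparts, position by position: à é î õ ü À É Î Õ Ü
    private static final String ACCENTED =
        "\u00e0\u00e9\u00ee\u00f5\u00fc\u00c0\u00c9\u00ce\u00d5\u00dc";

    // Swaps vowels for accented ones and wraps the result in brackets.
    static String pseudo(String s) {
        StringBuilder out = new StringBuilder("[");
        for (char c : s.toCharArray()) {
            int i = PLAIN.indexOf(c);
            out.append(i >= 0 ? ACCENTED.charAt(i) : c);
        }
        return out.append("]").toString();
    }

    public static void main(String[] args) {
        System.out.println(pseudo("Save")); // [Sàvé]
    }
}
```

On a test build, any text that still appears without brackets and accents was not routed through the resource layer, and any text with a missing closing bracket was truncated by the layout.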
  • 21. Traditional Approach - repeat, and repeat, and repeat, and repeat
    • Localize and see what you’re missing
    • GREP, overwhelm developers
    • Test, Pseudo-Localize
    • View pages. Pore through code for strings, methods, etc.
    • Externalize and refactor one by one
  • 22. Globalyzer Server and Clients: Static Analysis on the Source Code
    • Diagram labels: Server, Client, Command Line
    • Globalyzer is methodology agnostic. Project Managers may use it in a ‘traditional’ approach or an Agile approach.
  • 23. Globalyzer Principles - Customization
    • Globalyzer Server manages Rule Sets Configuration
      – Globalyzer Rule Sets are used to identify i18n issues in the code base
      – Rules embody the i18n issue detection logic
      – One rule set targets one programming language (& variant)
      – Default rule sets are based on research and years of experience
      – Rules must be tailored to a specific project
      – Rules can be shared amongst team members
  • 24. Globalyzer Principles – Desktop Analysis
    • Globalyzer desktop client:
      – Scan source code using Globalyzer Rule Sets
      – Detect and report potential i18n issues
      – Manage i18n issues
      – Assist in fixing the code so that it becomes i18n compliant
  • 25. Globalyzer Principles - Automation
    • Globalyzer Command Line
      – For integration in the overall software process to run at given frequencies
      – Generate reports once a setup has been established
      – Different strategies
        • Segment the code base into small scan projects that reflect the i18n effort
        • Focus on i18n scope
  • 26. I18n Processes
    • Planning
      – Market Requirements Analysis
      – Architectural Requirements Analysis
      – Code Review
    • I18n Design
    • I18n Implementation
    • Testing
    • And beyond…
      – Localization
      – Support
  • 27. Merging Requirements and
    • Architectural Changes: What’s not in the code
      – Locale support
      – Changes to how data is passed around
      – Discuss and Analyze technical requirements
    • Code Analysis: What’s in the code
      – Strings
      – Refactoring Locale-limiting methods/functions
      – Find and count issues
  • 28. I18n Architectural Challenge – what’s not in the code
    • Diagram labels: Marketing Requirements; Locale behavior; Database; Application Code (e.g. Java, C++, VB); Character encoding support; U/I (e.g. JSP, ASP, ASPX); 3rd Party Products; Business Logic; Platforms, Browser Support; Requirements
  • 29. COMPLICATIONS
  • 30. Operational Challenges
    • Ongoing development
      – Agile?
      – Code Branching?
      – Multiple teams?
  • 31. Release Path
    • Internationalization, 1st Time
      – Most of U/I
      – Breaks the DB
      – Data I/O
      – Test entire product
    • Feature Release
      – 3 week sprint?
      – Focus on code subset
      – Concentrated testing
      – Static analysis with Globalyzer
    • Code branch, merge, testing strategy
  • 32. Factors to Plan On
    • Programming languages
    • How many tiers, what do they do
    • Database support
    • Locale Requirements
    • 3rd Party Products – support for Unicode?
    • Size of Application – Lines of Code
    • Amount of Embedded Strings to be Externalized
    • Estimate of concatenation
    • DB refactoring
    • Methods/Functions/Classes replacement
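String concatenation appears in the planning factors above because a sentence assembled from fragments fixes the English word order. The sketch below shows the usual Java alternative, a single pattern with numbered placeholders handled by `java.text.MessageFormat`. The class name, the patterns, and the German wording are illustrative assumptions.

```java
import java.text.MessageFormat;
import java.util.Locale;

class ConcatenationDemo {
    // Problem case: word order is baked into the code and cannot be translated as a unit.
    static String concatenated(String user, int count) {
        return user + " has " + count + " new messages";
    }

    // One translatable pattern; translators may reorder {0} and {1} freely.
    static String formatted(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(concatenated("Anna", 3));
        System.out.println(formatted("{0} has {1} new messages", Locale.ENGLISH, "Anna", 3));
        System.out.println(formatted("{1} neue Nachrichten f\u00fcr {0}", Locale.GERMAN, "Anna", 3));
    }
}
```

Each concatenation site found during analysis typically becomes one pattern like this, which is why the count of concatenations feeds into the effort estimate.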
  • 33. Tiers and Technologies
    • Number of tiers: 1, 2, 3
    • Technologies: Java, C#, JavaScript, VB, C++, older languages (e.g. RPG)
    • Time and effort increase
  • 34. Other Issues
    • Stability of the build
    • Quality of the code – History
    • Focus of the developers
    • Source code management approach
    • New concurrent development introducing new i18n problems
  • 35. Questions & Answers
    • Adam Asnes, adam@lingoport.com
    • Olivier Libouban, olivier@lingoport.com
    • Resources: http://www.lingoport.com
    • Globalyzer: http://www.globalyzer.com
    • Blog: http://i18nblog.com
  • 36. Lingoport: Requirements and Planning
    • Adam Asnes, President & CEO, Lingoport
    • Olivier Libouban, Globalization Lead, Lingoport
  • 37. Why go through requirements?
    • I18n work is software engineering
    • To determine the scope of the i18n work, the i18n team cannot simply look at the code and come up with an i18n project
    • Scope also leads to planning, cost, resources
    • How to describe i18n requirements?
  • 38. Focus on one requirement: Locale
    • One product instance per locale?
    • Multi-locale support
    • Locale detection?
    • User account support?
  • 39. Ex: WebSphere Portal Locale Determination
    – User logged in: display the user’s preferred language
    – No preferred user language: look for the user’s browser language
      • If the portal supports that language, it displays the content in that language.
      • If the browser has more than one language defined, the portal uses the first language in the list to display the content.
    – If no browser language can be found, for example if the browser used does not send a language, the portal resorts to its own default language.
    – If the user has a portlet that does not support the language that was determined by the previous steps, that portlet is shown in its own default language.
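The fallback chain on slide 39 can be sketched as two small functions. This is not WebSphere Portal code or its API: the class, method, and parameter names are hypothetical, and the sketch reads "the first language in the list" as the first browser entry only. A variant could instead walk the list until it finds a supported language.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

class LocaleResolver {
    // Portal-level decision: user preference, then browser language, then portal default.
    static Locale resolve(Locale userPreferred, List<Locale> browserLocales,
                          Set<String> supportedLanguages, Locale portalDefault) {
        if (userPreferred != null) {
            return userPreferred;
        }
        if (browserLocales != null && !browserLocales.isEmpty()) {
            Locale first = browserLocales.get(0);
            if (supportedLanguages.contains(first.getLanguage())) {
                return first;
            }
        }
        return portalDefault;
    }

    // Portlet-level decision: a portlet lacking the resolved language uses its own default.
    static Locale forPortlet(Locale resolved, Set<String> portletLanguages,
                             Locale portletDefault) {
        return portletLanguages.contains(resolved.getLanguage()) ? resolved : portletDefault;
    }

    public static void main(String[] args) {
        Set<String> supported = Set.of("en", "fr");
        Locale page = resolve(null, List.of(Locale.FRENCH, Locale.GERMAN), supported, Locale.ENGLISH);
        System.out.println(page);                                          // fr
        System.out.println(forPortlet(page, Set.of("en"), Locale.ENGLISH)); // en
    }
}
```

Writing the rule down this explicitly is a useful requirements exercise: each branch is a question the product owner has to answer before implementation starts.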
  • 40. One-Time Locale Selection
  • 41. System based Locale Detection
  • 42. More of the typical i18n requirements
    • Target date(s)
    • System requirements
    • Existing & potential use cases for UI text entry
    • Text display
    • Text processing
    • Collation
    • Handling of locale-sensitive data (dates, numbers, currencies, etc.)
    • Client Installer considerations
  • 43. Architectural Discussion
    • Thorough Product Demo
    • Walk through major architecture components
  • 44. Conceptual illustrative architecture
    • Diagram labels: Specific development and integration; CODE (UI, Business, Persistence); RDBMS; LDAP; CMS; Workflow Engine; Web Services; Rules Engine; JMS; 3rd Parties
    • April 19, 2011 – p 45
  • 45. Specific i18n software engineering focus (architecture diagram as on slide 44)
    • UI: html, server side, JavaScript, input forms, css, content presentation, etc.
    • Business logic, searches, comparisons, data exchange with external systems
    • Persistence: exchanges with RDBMS, Content Management, LDAP, file based persistence (xml, etc.)
  • 46. Specific development i18n issues (architecture diagram as on slide 44)
    • String externalization (outside of code) and i18n resource bundles
    • Locale sensitive methods: searching, retrieving, sorting, date and time, string operations, character operations, etc.
    • Code resources (images, etc.)
    • Overall programming language specifics
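Two of the locale-sensitive operations named on slide 46, number formatting and sorting, can be illustrated with the standard `java.text` classes. The class and method names below are hypothetical; `NumberFormat` and `Collator` are part of the JDK.

```java
import java.text.Collator;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LocaleSensitiveOps {
    // Grouping and decimal separators depend on the locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Sorts with the locale's collation rules instead of raw code-point order.
    static List<String> sortedFor(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(1234567.89, Locale.US));      // 1,234,567.89
        System.out.println(formatNumber(1234567.89, Locale.GERMANY)); // 1.234.567,89

        List<String> words = List.of("zebra", "\u00e9clair", "apple");
        // Plain String ordering places the accented word after "zebra".
        System.out.println("\u00e9clair".compareTo("zebra") > 0);      // true
        System.out.println(sortedFor(words, Locale.FRENCH));           // [apple, éclair, zebra]
    }
}
```

Calls such as `String.compareTo` or hand-built number strings are the kind of locale-limiting code that a review would mark for replacement with these locale-aware equivalents.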
  • 47. Data stores i18n issues (architecture diagram as on slide 44)
    • PL/SQL
    • Encoding
    • Locale files (xml, xls, csv, etc.)
    • Database specific issues, date/time, conversion, sorting, soundex, etc.
    • Storing and retrieving local data in local language (vs. a “generic” schema)
    • User entered data
    • Columns requiring translation
    • Attributes, user names, postal addresses, etc.
    • Database design
  • 48. Content Management i18n issues (architecture diagram as on slide 44)
    • Accessing the proper locale
    • Encoding of content
  • 49. External system i18n issues (architecture diagram as on slide 44)
    • Modality of data exchange / data loss
    • Accessing the proper locale
    • Encoding/persistence of content on external system
  • 50. I18n Engineering Considerations
    • Locale Handling
    • Character encoding
    • Strings
      – External, Grammar, Segments, Plurals, Wrapping
    • String Handling (char *, etc.)
      – Tabs, spaces, delimiters, etc.
    • Resource management
      – centralized, normalized, re-usable
    • Dates - Calendar
    • Times
    • Sorting & searching
    • Currency
    • Transaction process
    • Character set conversions
    • On line help
    • Sounds
    • Honorific titles
    • Telephone formats
    • Postal formats
    • Region-specific functions
    • Shipping conditions
    • Numerical formats
    • Page layout, LTR, RTL
    • Fonts and attributes
    • Icons, colors
    • Reporting, workflow
    • Database support
    • Multi-byte enabling
    • Business logic
    • Measurements, units
    • Input Methods
    • Data exchange
  • 51. Process requirements: how to fit into an existing environment
    • Lifecycle
    • Documentation
    • Integration
    • QA
    • Type of meetings
    • Build
    • Source control
    • Branching
    • Reporting structure
    • Review boards
    • JUnit
    • Globalyzer
    • Bug Reporting
  • 52. Questions & Answers
    • Adam Asnes, adam@lingoport.com
    • Olivier Libouban, olivier@lingoport.com
    • Resources: http://www.lingoport.com
    • Globalyzer: http://www.globalyzer.com
    • Blog: http://i18nblog.com
  • 53. Static Analysis Detail: Globalyzer example – Running and Reporting
    • Adam Asnes, President & CEO, Lingoport
    • Olivier Libouban, Globalization Lead, Lingoport
  • 54. Example Project Plan Looking at a plan from a service project
  • 55. Example Project Plan
    • Combine:
      – 1 Part Architecture
      – 1 Part Code Metrics
      – 1 Part Experience
  • 56. Lingoport: Agile & Internationalization
    • Adam Asnes, President & CEO, Lingoport
    • Olivier Libouban, Globalization Lead, Lingoport
  • 57. Agile in one slide (smallest nutshell)
    • Roles (Product Owner, Scrum Master, Team)
    • Product Backlog
    • Sprints (user stories are designed, implemented, tested in a ‘short’ timeframe, e.g. 3 weeks)
    • Sprint Backlog
    • Daily Scrums
    • Demonstrable
    • ‘Shippable’
  • 58. i18n and Agile Challenges
    • Traditionally, legacy i18n has followed a waterfall model:
      – i18n cuts across the code, for instance:
        • Encoding problems … in all the code
        • Formatting issues … in all the code
        • Externalize strings …
      – i18n needs a systemic approach
      – i18n projects tend to have long life cycles
      – (L10n: must get an entire locale done)
    • From a methodology perspective, Agile:
      – is feature driven
      – runs in “short” Sprints
    • Sometimes a Hybrid approach works best
  • 59. Agile & i18n Process Challenges
  • 60. Lingoport Project Assessment - Legacy
    • Uncover potential i18n issues from 2 perspectives:
      – Code perspective: Globalyzer reporting/metrics
      – Architectural: Locale/technical i18n requirements
    • Allows creating the initial ‘i18n product backlog’
    • Can, but does not need to, be part of a Sprint
    • Provides an overall scope and effort estimate
    • Can feed into a number of processes
      – TDD, ADD, Waterfall, … Agile
    • Involve the Product Owner: communication resource
  • 61. Lingoport Project Organization: Backlog identification and Scoping
    • The i18n product backlog is a prioritized list of requirements, stories, features, etc.
    • What the customer wants, described using the (Product Owner’s) customer’s terminology
    • Example backlog table, columns: ID, Name, Imp, Est, How to demo, Notes
      – ID 1, Locale Setting and Tracking, Imp 30, Est 5
      – ID 2, Locale for languages, Imp 10, Est 8
      – “How to demo” and “Notes” cell text as captured: If no login before, Log in, default locale Splash screen for If first time, otherwise Locale remembers … Log in for an en US user Locale is default Go to page www. Check pseudo Change Locale localization …
  • 62. Lingoport Project Organization: Sprint Management
    • i18n code branching
    • Agile typically uses development build, CI environments
    • Must pass ‘regular’ dev criteria
    • Must be able to push i18n code branching easily and vice versa
    • I18n tests must be available to other teams in CI
    • Some items are more sensitive than others
      – Database schema changes and implications on all source
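One kind of i18n test that could run in CI alongside the regular build is a check that every key in the base resource file also exists in each localized file. The sketch below is an assumption about what such a test might look like, not something shown in the deck; the class and method names are hypothetical, and a real test would load the `Properties` from the project's resource files.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class BundleParityCheck {
    // Returns the keys present in the base bundle but absent from the localized one.
    static Set<String> missingKeys(Properties base, Properties localized) {
        Set<String> missing = new TreeSet<>(base.stringPropertyNames());
        missing.removeAll(localized.stringPropertyNames());
        return missing;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("order.status", "Your order has been processed.");
        base.setProperty("order.cancel", "Cancel order");

        Properties french = new Properties();
        french.setProperty("order.status", "Votre commande a \u00e9t\u00e9 trait\u00e9e.");

        // A CI job could fail the build when this set is not empty.
        System.out.println(missingKeys(base, french)); // [order.cancel]
    }
}
```

Because the check needs only the resource files, other teams can run it on their own branches without the i18n branch being merged first.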
  • 63. Continuous Integration - Basics
    • Diagram labels: Team 1, Team 2, Team 3, Team 4, Team 5
  • 64. CI & Scan Results Summary
  • 65. CI & Scan Details Results
  • 66. Questions & Answers
    • Adam Asnes, adam@lingoport.com
    • Olivier Libouban, olivier@lingoport.com
    • Resources: http://www.lingoport.com
    • Globalyzer: http://www.globalyzer.com
    • Blog: http://i18nblog.com