Large Java EAI Training

    Large Java EAI Training - Presentation Transcript

    1. Design Guidelines for Large Java Message-based EAI Systems: An EAI Case Study
      Intertech
      March 2006
    2. This Talk
      Presents an EAI case study
      A very large EAI system for a retail chain.
      Identifies issues and challenges encountered in the project.
      Identifies lessons learned and recommendations for your EAI projects.
      Lets you know others do have it as bad as you.
      The story does have a happy ending – maybe providing hope to the hopeless.
    3. Large Java EAI
      Messaging/EAI development ≠ Web or other distributed app development
      Especially when very large
      Many new or significantly altered considerations
      Requirement differences
      Time and space needs
      Process control/orchestration
      Failure handling
      Monitoring
      Proprietary nature of vendor solutions
      Support turnover
      Staffing needs
    4. The Case Study Situation
      A major retail chain has dozens of distribution centers
      Each distribution center or warehouse services hundreds of stores (>1200 total stores).
      Each distribution center is moving thousands of “cartons” (i.e. boxes) around the warehouse each day
      Receiving them from trucks through dock doors.
      Moving them with fork lifts to storage areas in the warehouse
      Conveying them to “break down” areas for distribution to stores.
      Conveying them down belts to storage areas or outbound trucks.
      Moving them onto trucks that depart the warehouse.
    5. The Case Study Situation
      A box is tracked via labels and bar code readers.
      Some “reads” are manual and some are automated.
      Generating literally hundreds of “events” per second per warehouse.
      RFID was about to create more events.
      More “reads” from more points in the warehouse.
      Potentially adding store “reads” to the event list.
    6. “The System”
      The retail chain wanted all the data on events regarding the movement of cartons sent to HQ
      Providing them with unparalleled real time information on inventory levels and product status.
      Providing more accurate information for merchandise analysts and productivity monitoring for warehouse managers.
      Providing a Java Web application to nearly 10,000 users to access the data company wide.
      Reports galore.
      Some limited ad hoc query reporting.
    7. Let’s do the math…
      >25 warehouses
      Each generating ~15-20 carton events per second
      Averaging 400 messages a second incoming at HQ
      Peak around 1300 messages a second incoming at HQ
      Data around an event ~200bytes/msg
      24x7x52 (31,449,600 seconds for those not counting)
      = ~ 4-7GB a day
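The arithmetic on this slide can be sketched in a few lines of Java. The class and method names are hypothetical; the inputs (about 400 messages a second on average, about 200 bytes per message) come from the slide, and the calculation assumes a sustained rate and ignores any envelope or protocol overhead.

```java
/** Back-of-the-envelope volume estimate; names are illustrative, not from the project. */
final class VolumeEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Bytes arriving per day at a sustained message rate. */
    static long bytesPerDay(long messagesPerSecond, long bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage * SECONDS_PER_DAY;
    }

    /** Converts bytes to decimal gigabytes (1 GB = 10^9 bytes). */
    static double toGigabytes(long bytes) {
        return bytes / 1_000_000_000.0;
    }
}
```

Calling `VolumeEstimate.bytesPerDay(400, 200)` gives 6,912,000,000 bytes, roughly 6.9 GB a day, which matches the upper end of the range on the slide.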
    8. …and the math wasn’t getting any better
      During Christmas time things were worse – much worse.
      The organization wants to double its current size by 2010!
      Oh yeah… did I mention RFID was coming?
      Tripling or quadrupling the number of events
    9. My Challenge
      Design and implement a system to get the data from the warehouses to HQ
      In near real time to support the reporting needs
      Use whatever makes sense (to some degree – more later)
      With a good size team (20-25 people in various roles)
    10. My Background
      15 year “grizzled” veteran of software development.
      6 years of Java experience.
      Author of a Java book.
      Experienced architect, manager, mentor, trainer.
      Eager to take on any software system challenge.
      No experience in EAI!
      An organization with limited EAI experience.
    11. “The Perfect Storm”
      The size of the EAI project + abilities of the development team =
    12. The Solution
      Significant company resources and investment in SeeBeyond EAI product.
      Put SeeBeyond at all the endpoints (warehouses and HQ).
      All data would move through SeeBeyond.
      SeeBeyond is Java based (also a company technology direction).
      Write routing/minor processing code in Java in SeeBeyond.
      Significant company resources and investment in Oracle RDBMS.
      Oracle already at the warehouses
      Obtain a “honking” big Oracle DB at HQ.
      Use Oracle stored procedures for heavy lifting (data processing – report data preparation).
    13. Solution diagram
    14. Problem 1 – We weren’t ready
      As an architect, I was not aware how different an EAI/Java messaging system is.
      Asynchronous-everywhere nature
      Had no patterns to follow (No – I had not read Hohpe/Woolf EAI book)
      Did not have an awareness of the vendor landscape
      Was easily talked into solutions by others.
      My organization didn’t see how big it was
      Had only implemented smaller EAI solutions
      Finding good help was hard – and a critical step
      Internally – lots of support but no experience
      Contractors – lots of desire, but little implementation experience to the scale/level of effort
    15. Getting Yourself Ready
      Get yourself ready
      Understand your options – all the three letter E’s (EAI, ETL, EII, etc.)
      Read EAI patterns
      Know the products (WBI, Vitria, Tibco, WebMethods, SeeBeyond, etc.)
      Find people with real EAI experience
      Experienced with systems matching the size of your app
      Find people with product expertise
      Find people with design/pattern expertise
    16. EAI Patterns
      Enterprise Integration Patterns: Hohpe/Woolf
      Next Generation Application Integration: Linthicum
      IT Architectures and Middleware: Britton
    17. Getting Resources Ready
      Let the network engineers know of your plans
      You are going to be using a significant amount of pipe.
      Have you considered failover/load balancing? (comm lines around warehouses get cut on occasion)
      Let the database engineers know of your plans
      Terabytes of data to be stored and processed – where will it go?
      Consider backup/recovery systems
      Database logs/archiving
      Performance tuning
    18. Getting Support Ready
      Support staffs will be lost at turnover
      How many of your support shops really know …
      How to manage application servers?
      How to manage web applications effectively?
      Can you expect them to be able to operate, maintain and support component based messaging systems?
      Do they know what a message server or bus is?
      Across a very distributed environment?
      Get them trained early (in messaging infrastructure).
      Have them help you design the monitoring tools and alert systems.
      Work together to develop proactive systems checks and troubleshooting procedures.
    19. Get Others Ready
      If your development team isn’t ready, what about…
      Testing/QA teams?
      Analysts?
      Managers?
      For example, finding experienced testers for asynchronous messaging systems is difficult.
      They usually need intricate knowledge of the messaging subsystem monitors and admin capabilities.
    20. Problem 2 – Proprietary EAI
      EAI Products/Solutions are many.
      EAI Standards are few.
      EAI/ETL/EII/… market place is tumultuous
      Sun has purchased SeeBeyond
      IBM bought Ascential
      Everyone calling their product an ESB
      Products/Solutions have scale limits
      Some they know about
      Others they do not
      Java alone does not make you platform independent.
    21. Examine your solution options
      See if what you have would work.
      There is a reason MQ has been around a long time.
      Where possible consider tried, true and already deployed platforms (but again do the math and see if they can support the extra load).
      In house support is probably equipped (more in a bit)
      Not everything has to travel by message.
      Consider multiple/alternate technologies for parts of your solution.
      ETL is great for certain parts of a large solution
      There is a reason why products like Oracle are expensive (technologies like Oracle Replication – more in a bit).
      Does, however, create more issues of timing.
    22. Not everything has to travel by message
      Consider multiple/alternate technologies for parts of your solution.
      Replication of reference data
      Bulk/batch transfers
      Non-real time needs
      ETL is great for certain parts of a large solution
      Examine features in your DB/App Servers
      There is a reason why products like Oracle are expensive (technologies like Oracle Replication – more in a bit).
      How about those Message Beans in the app server?
      This can, however, create more issues of timing.
    23. Reference Data
      In many applications, you need reference data on both ends of the messaging systems.
      You can build a “replicating” message engine to treat this like other message data (not recommended).
      Referential integrity becomes a real problem.
      Consider issues of message timing (PR becomes the 51st state but messages with PR references start to arrive before the new state data does)
      Use simple replication technologies where possible
      ETL tools if reference data changes only happen at certain times.
      Technologies like Oracle Replication for real time (it can operate over a WAN).
    24. Java = interoperability (not always)
      Even when you use Java, how is it being applied?
      Java running inside of proprietary components (like SeeBeyond eWays) does not make you portable.
      Write component code that can be used by or incorporated by proprietary systems.
      Under the covers, is the vendor using
      JMS
      JMX
      JAX-RPC
      Etc…
    25. Process Outside the Bus
      Process outside the message bus/subsystem if you can
      Let the bus focus on delivering the goods.
      Too much processing time in the bus will create
      Scalability problems
      Monitoring problems
      Possibly interoperability problems (especially when using proprietary technology/components)
      Process with components that are
      Flexible
      easy to get at (and change)
      interoperable (if possible)
      and contain reusable business logic (if possible)
    26. Problem 3 – We didn’t figure or figured poorly
      We didn’t do enough “math” up front.
      We didn’t plan for failure/growth.
      The messages moved slower than anticipated.
      The message processing took more time than expected.
      The amount of data was larger than expected.
    27. Do the math…
      How much time is it going to take to get a message from A to B?
      Test that estimate early.
      Work with the business analysts to figure out how many messages need to be moved.
      Make volume estimates part of the non-functional requirements gathering process.
      Check that against the existing databases if possible.
      How much data needs to be packaged, shipped, processed, stored?
      Design the messages and calculate the size of the overall message (XML and all).
      Calculate the rate and add up the total volume.
    28. …and pad your answer
      Do you have room to spare??
      Can the messaging system handle that (on both ends)?
      Can the consuming database handle that?
      Can the hardware and network handle that?
      Anticipate failure
      What happens if something/anything goes down for an hour?
      What happens if you go down for a day?
      What happens if you have unexpected growth?
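The "what happens if something goes down for an hour?" question can be turned into numbers. The following is a minimal sketch under stated assumptions: the class name, method names, and the 500 messages/second consumer rate used in the note below are hypothetical, and the model assumes constant arrival and consumption rates.

```java
/** Rough outage sizing; all names and example figures are assumptions. */
final class OutageBacklog {
    /** Bytes queued while the consumer is down. */
    static long backlogBytes(long msgsPerSecond, long bytesPerMsg, long outageSeconds) {
        return msgsPerSecond * bytesPerMsg * outageSeconds;
    }

    /**
     * Seconds needed to drain the backlog once the consumer is back,
     * given that new messages keep arriving at arrivalRate.
     * Returns -1 when the consumer has no spare capacity and never catches up.
     */
    static long drainSeconds(long backlogMsgs, long arrivalRate, long consumeRate) {
        long spare = consumeRate - arrivalRate;
        if (spare <= 0) {
            return -1;
        }
        return (backlogMsgs + spare - 1) / spare; // ceiling division
    }
}
```

With the case-study average of 400 messages a second at 200 bytes each, a one-hour outage queues 1,440,000 messages (288,000,000 bytes). If the consumer can only process 500 messages a second, the spare capacity is 100 a second and catching up takes 14,400 seconds, or four hours. That is the kind of padding question this slide is asking.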
    29. Problem 4 – Exception Handling Wasn’t
      More considerations for failover and redundancy
      Versus Web application
      We did not plan on downtime
      Unplanned system issues
      Planned outages
      We didn’t build in enough redundancy
      Load balancing and
      Failover were both afterthoughts
      All messages always correct all the time (NOT)
      At first, we had no proper dead letter queuing
      No proper exception processing
      No means to properly see and react to issues
      Many more points of failure and potential issues
      More widely distributed
    30. Design Load Balancing and Failover Upfront
      Load balancing and failover must be accommodated
      Like security, you need a multi-layered approach
      Hardware (like Big IP)
      Redundant message bus/message servers
      Processing components
      Database
      EAI system throttling
      How are you going to kick over to the failover systems (and return to regular systems)?
      Without losing messages
      Without causing timing problems in message delivery/receipt
    31. Space, space and more space
      Plan on extra space for failure
      A place for queued messages to sit if something goes down
      Space in the DB or space in the message channels – or both
      Plan on extra space for logs
      You are going to want to keep log files around for a while.
      Some problems take time to manifest to a point of awareness.
      Devise an automated archive/clean up for logs.
      No…not all EAI systems provide log clean up utilities.
    32. Anticipate Bad Messages
      Build a Dead Letter Queue (see EAI Patterns book).
      Unless you have a simple system, you will have messages the system can’t handle
      Improper format, wrong data, etc…
      Build a means to capture and handle these
      Lest they clog your process.
      Where do you put them? DB, other queue?
      Who checks them (do you have a “one-off” issue or a systemic problem?)
    33. Message Repair
      If possible, build a message triage mechanism to inspect, fix, resend DLQ’ed messages
      This can be built/improved over time
      More manual at first
      Automated as you learn more.
      Considerations
      How are you going to clean up the error “droppings” (messages that are truly dead)
      Consider a “retry” queue with varied strategies to retry messages that have failed.
      Failure may be due to row locks or reference updates that are just microseconds away from completion.
      Be cautious of when/why messages end up in the “dead letter” queue.
      You don’t want it flooded because the DB is down.
    34. Dead Letter Queue (DLQ)
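The retry-then-dead-letter idea from the previous two slides could look roughly like this. It is an in-memory sketch, not SeeBeyond or JMS code: `RetryingConsumer`, `TransientFailure`, and the string messages are all hypothetical stand-ins, and a real system would re-queue with a delay rather than retry in a tight loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory sketch of retry-then-dead-letter handling; not tied to any EAI product. */
final class RetryingConsumer {
    /** Thrown by a handler for failures worth retrying (for example, a row lock). */
    static final class TransientFailure extends RuntimeException {
        TransientFailure(String reason) { super(reason); }
    }

    private final Consumer<String> handler;
    private final int maxAttempts;
    /** Messages that could not be processed; stands in for the dead letter queue. */
    final List<String> deadLetters = new ArrayList<>();

    RetryingConsumer(Consumer<String> handler, int maxAttempts) {
        this.handler = handler;
        this.maxAttempts = maxAttempts;
    }

    /** Tries the handler up to maxAttempts times, then dead-letters the message. */
    void process(String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return; // processed successfully
            } catch (TransientFailure e) {
                // Retry; a real system would back off or move to a retry queue here.
            } catch (RuntimeException e) {
                break; // bad format or bad data: retrying will not help
            }
        }
        deadLetters.add(message);
    }
}
```

Separating transient failures from bad messages keeps the dead letter queue for messages that are truly unprocessable. The sketch does not cover the "DB is down" case from the previous slide; there, pausing consumption is usually a better fit than dead-lettering everything.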
    35. Manage it/monitor it
      The multiple points of failure and the range of possible issues make these systems complicated to manage and support.
      Build in automated monitoring facilities and system health dashboards.
      You need a one stop shop for what’s up, what’s down, what’s queuing properly, what’s queuing too much, etc.
      Consider the use of JMX (it is probably already built into some of your infrastructure components).
      Calculate system thresholds and provide automated alerts to the dashboard and email/page/etc. systems when they start to get close (not once they have been reached).
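The "alert when it gets close, not once it is reached" rule can be sketched as a simple threshold check. The class name, the `Level` values, and the 80% warning fraction used in the note below are assumptions for illustration; in a JMX-based setup the depth would typically be read from an attribute exposed by the messaging infrastructure, and that wiring is omitted here.

```java
/** Sketch of early-warning thresholds for a queue; names and percentages are assumptions. */
final class QueueHealth {
    enum Level { OK, WARN, CRITICAL }

    /**
     * Classifies queue depth against capacity. WARN fires at warnFraction of
     * capacity so operators hear about a problem before the limit is hit.
     */
    static Level check(long depth, long capacity, double warnFraction) {
        if (capacity <= 0 || warnFraction <= 0 || warnFraction > 1) {
            throw new IllegalArgumentException("capacity must be > 0 and warnFraction in (0, 1]");
        }
        if (depth >= capacity) {
            return Level.CRITICAL;
        }
        if (depth >= capacity * warnFraction) {
            return Level.WARN;
        }
        return Level.OK;
    }
}
```

With a capacity of 1,000 messages and a warning fraction of 0.8, a depth of 850 is reported as `WARN`, which is the point at which the dashboard and paging systems should already be notified.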
    36. Problem 5 – Change is inevitable
      The size and shape of our messages changed over time.
      We had no way to deal effectively with change.
      Consequently, new system versions/updates caused
      Shutdown
      Replace (sometimes transforming data to a new structure)
      Restart
      The real world was the only time we saw some situations
      We had no effective test harness
      Typically leading to ugly back outs
    37. Version Strategy
      EAI system stability/life span depends on its message structure.
      Message structure is the hardest part to get exactly right up front.
      When message formats need to change, this creates a real problem. The entire system must be down, queues emptied, etc.
      Consider version information in the message and routing/processing instructions in the bus.
      More complicated system
      Can also affect performance
      Allows for dual operation (old and new systems) without failure and major down time.
      It’s going to happen – especially early – plan for it.
    38. Version Routing
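One way to picture version routing is a dispatcher keyed on a version field carried in each message. This is a minimal sketch: `VersionRouter`, `Message`, and the string payloads are hypothetical, and a real bus would apply the same idea in its own routing configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch of routing by a version field carried in the message; all names are hypothetical. */
final class VersionRouter {
    /** Minimal message: a version tag plus an opaque payload. */
    static final class Message {
        final int version;
        final String payload;

        Message(int version, String payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    private final Map<Integer, Function<String, String>> handlers = new HashMap<>();

    /** Registers the processor for one message-format version. */
    void register(int version, Function<String, String> handler) {
        handlers.put(version, handler);
    }

    /** Dispatches to the handler for the message's version; unknown versions are rejected. */
    String route(Message m) {
        Function<String, String> handler = handlers.get(m.version);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for version " + m.version);
        }
        return handler.apply(m.payload);
    }
}
```

Because old and new handlers are registered side by side, messages in the old format that are still in flight keep being processed while the new format is rolled out, which is the dual operation the previous slide describes.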
    39. Testing EAI is a b*&%#
      Consider collecting days’ worth of messages or message generating data and using it for replay scenarios.
      Problem - even if you have all the data, you don’t have the same timing issues you will see in the real world.
      Testing all the potential message scenarios is impossible with any significant sized system.
      Consider developing a message “replicator” subsystem.
      Send replicated messages to a test harness.
      A “live” test faucet of messages ready whenever you need them.
      Critical to be able to test new/updated processes, performance, etc.
      Requires a fair amount of hardware and some switch to turn it on/off.
      Will impact performance
      Consider putting the “faucet” on just one of the servers in a farm
    40. Test “Faucet”
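The "faucet" can be sketched as a wire tap with an on/off switch: every message still goes to production, and a copy goes to the test harness only while the faucet is open. The class and method names are hypothetical, and the consumers stand in for whatever channels the real system uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Wire-tap style "faucet": always delivers to production, copies to a test sink when open. */
final class TestFaucet implements Consumer<String> {
    private final Consumer<String> production;
    private final Consumer<String> testHarness;
    private final AtomicBoolean open = new AtomicBoolean(false);

    TestFaucet(Consumer<String> production, Consumer<String> testHarness) {
        this.production = production;
        this.testHarness = testHarness;
    }

    /** Turns the copy stream on or off without touching production delivery. */
    void setOpen(boolean on) {
        open.set(on);
    }

    @Override
    public void accept(String message) {
        production.accept(message); // production delivery always happens first
        if (open.get()) {
            try {
                testHarness.accept(message);
            } catch (RuntimeException e) {
                // A failing test harness must never disturb production traffic.
            }
        }
    }
}
```

The copy still costs CPU and bandwidth, which is why the slide suggests putting the faucet on just one server in a farm.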
    41. Wrap up
      Despite the issues – the system is up and running today.
      Extremely useful to the business – providing unparalleled distribution information.
      Like most things in software system development, the lessons learned are more about organization, architecture and design than implementation.
      Thank you – for your time and attention.

    Intertech Training, 4 months ago

    http://www.Intertech.com