Multithreaded XML Import (San Francisco Magento Meetup)
Author: Fabrizio Branca

Date: 2013-10-23

Usage Rights

© All Rights Reserved


    Multithreaded XML Import (San Francisco Magento Meetup): Presentation Transcript

    • Multithreaded XML Import for Magento. San Francisco Magento Meetup Group, October 23, 2013
    • Fabrizio Branca Lead System Developer at
    • E-Commerce: Magento. CMS: TYPO3. Global Enterprise Projects. Portals: ZF, FLOW, … High Performance/Scale. Mobile. Searchperience: SOLR. 120 people in 7 offices worldwide.
    • Aoe_Import: github.com/AOEmedia/Aoe_Import (git clone --recursive …)
    • Will Aoe_Import be the fastest product importer around? YES, of course! Well, maybe… Actually, Aoe_Import is only an XML importer “framework”. It’s up to you to decide how to handle the XML snippets…
    • Aoe_Import: for large XML files (XML, not CSV!); stream processing (XMLReader); “event”-driven: subscribe your “processors” to XPaths; full flexibility in processor implementation; multi-thread support.
    • Problem [chart: memory over time; labels: single product, memory limit]
    • Trivial Solution [charts: memory over time, with the memory limit marked]
    • Beat the memory leak by forking [chart legend: process import; threading overhead; waiting for other thread to terminate]
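    • Forking? In PHP?
        $pid = pcntl_fork();
        if ($pid == -1) {
            // fork failed; no child process was created
            die("could not fork\n");
        } elseif ($pid) {
            // parent process runs what is here ($pid is the child's process ID)
            echo "parent\n";
        } else {
            // child process runs what is here ($pid is 0)
            echo "child\n";
        }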
    • Threadi: github.com/AOEmedia/Threadi
    • Threadi: a clean OOP interface for forking and process management in PHP
    • Batch Processor: collect a bunch of imports … fork … and process them in a child process.
    • No imports are processed in the main thread, so there’s no memory leak happening there. Every fork starts with the low memory footprint of the main thread. Find the number of imports that can be processed at a time without hitting the memory limit. [chart: memory over time for the main thread and its forks, with the memory limit marked; legend: create process collection; process imports in process collection; threading overhead; waiting for other thread to terminate]
    • Multi-threading? Sure! Configurable: number of threads processed in parallel; number of items in a batch.
    • Problems? Database Connection. The database connection doesn’t like to be cloned!
        Mage::getSingleton('core/resource')
            ->getConnection('core_write')
            ->closeConnection();
    • Problems? Thread Safety
    • Problems? Thread Safety
        --- a/app/code/core/Enterprise/Catalog/Model/Index/Action/Catalog/Category/Product/Refresh.php
        +++ b/app/code/core/Enterprise/Catalog/Model/Index/Action/Catalog/Category/Product/Refresh.php
        @@ -326,7 +326,7 @@ class Enterprise_Catalog_Model_Index_Action_Catalog_Category_Product_Refresh
                     ->setComment('Catalog Category Product Index Tmp');
                 $this->_connection->dropTable($this->_getMainTmpTable());
        -        $this->_connection->createTable($table);
        +        $this->_connection->createTemporaryTable($table);
             }

             /**
    • Other use cases? Scheduler; queue processing; indexes; everything that’s batchable.
    • Thank you! Any questions? http://www.aoemedia.com. My blog: http://www.fabrizio-branca.de. Follow me on Twitter: @fbrnc
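
The batch-and-fork idea from the slides (collect a bunch of imports, fork, process them in a child, keep the main process free of import work) can be sketched roughly as follows. This is a minimal illustration built on PHP's plain pcntl functions, not Threadi's or Aoe_Import's actual API, which the transcript does not show. The function name runInBatches, its parameters, and the marker-file "import" in the usage part are hypothetical. It assumes the pcntl extension on a CLI, non-Windows PHP. In a Magento context the slides additionally close the core_write connection so that it is not shared across the fork; that step is left out here.

```php
<?php
/**
 * Hypothetical helper (not part of Aoe_Import or Threadi): processes $items
 * in forked batches and returns how many batches finished with exit code 0.
 */
function runInBatches(array $items, int $batchSize, int $maxParallel, callable $processBatch): int
{
    if (!function_exists('pcntl_fork')) {
        throw new RuntimeException('The pcntl extension is required.');
    }
    if ($batchSize < 1 || $maxParallel < 1) {
        throw new InvalidArgumentException('batchSize and maxParallel must be >= 1');
    }

    $running = [];   // pid => true for children that have not been reaped yet
    $succeeded = 0;

    // Block until one child terminates, then record whether it succeeded.
    $reapOne = function () use (&$running, &$succeeded): void {
        $pid = pcntl_wait($status);
        if ($pid > 0) {
            unset($running[$pid]);
            if (pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0) {
                $succeeded++;
            }
        } else {
            $running = []; // no children left to wait for
        }
    };

    foreach (array_chunk($items, $batchSize) as $batch) {
        // Keep at most $maxParallel children alive at any time.
        while (count($running) >= $maxParallel) {
            $reapOne();
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Could not fork');
        }
        if ($pid === 0) {
            // Child: starts with the parent's small footprint, does the
            // memory-hungry work, and releases everything when it exits.
            try {
                $processBatch($batch);
                exit(0);
            } catch (Throwable $e) {
                error_log($e->getMessage());
                exit(1);
            }
        }
        // Parent: bookkeeping only, no import work happens here.
        $running[$pid] = true;
    }

    while ($running) {
        $reapOne();
    }
    return $succeeded;
}

// Usage: each child writes one marker file per item, standing in for a real import.
$dir = sys_get_temp_dir() . '/batch_demo_' . getmypid();
mkdir($dir);
$ok = runInBatches(range(1, 10), 3, 2, function (array $batch) use ($dir): void {
    foreach ($batch as $id) {
        file_put_contents("$dir/$id.done", (string) getmypid());
    }
});
$processed = count(glob("$dir/*.done"));
echo "$ok batches succeeded, $processed items processed\n";
array_map('unlink', glob("$dir/*.done"));
rmdir($dir);
```

The two numeric arguments correspond to the two knobs named in the slides: the number of items in a batch and the number of children processed in parallel. Because children cannot hand PHP values back to the parent, the sketch reports success through exit codes only; anything richer would need files, a database, or another IPC channel.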