Synchronize
applications with
akeneo/batch
Grégory Planchat
@gplanchat
Kiboko SAS, France
You may know me from
akeneopim-ug.slack.com
Data integration
It is all about importing and
exporting data
akeneo/batch
• Has existed since Akeneo’s inception, in the
AkeneoBatchBundle
• Framework-agnostic
Used to…
• Import data
• Export
• Mass actions
• …
$batchJob = new Job(
    'inventory_job',
    $eventDispatcher,
    $jobRepository,
    [
        new ItemStep(
            'inventory_step',
            $eventDispatcher,
            $jobRepository,
            new InventoryFileReader(),
            new InventoryImportProcessor(),
            new InventoryDatabaseWriter()
        ),
    ]
);
What does a job look like?
$execution = new JobExecution();
$execution->setJobParameters(
    new JobParameters(
        [
            'cachePath' => $filename,
        ]
    )
);
$batchJob->execute($execution);
How to run a job?
Under the hood
Run Tier, Job Tier, Step Tier, Data Tier
$batchJob = new Job(
    'foo_job',
    $eventDispatcher,
    new JobRepository(),
    [
        new ItemStep(
            ...
        ),
        new ItemStep(
            ...
        ),
        new MyAwesomeNotificationStep(
            ...
        ),
        new RandomStep(
            ...
        ),
    ]
);
Unlimited number of Steps
Principles
• Every Job has at least one Step
• Akeneo provides 2 types of steps:
• Triggers
• Item processing
• The Job provides a JobParameters object for
execution-specific configuration
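The principles above can be pictured with a deliberately tiny model. This is only a sketch and not the akeneo/batch API: MiniStep, CallbackStep and MiniJob are hypothetical names, the parameters are a plain array rather than a JobParameters object, and stopping at the first failed step is a choice made for this example.

```php
<?php

// Hypothetical miniature of the Job/Step idea; these are not akeneo/batch classes.
interface MiniStep
{
    /** Returns true on success, false on failure. */
    public function execute(array $parameters): bool;
}

final class CallbackStep implements MiniStep
{
    private $callback;

    public function __construct(callable $callback)
    {
        $this->callback = $callback;
    }

    public function execute(array $parameters): bool
    {
        return (bool) ($this->callback)($parameters);
    }
}

final class MiniJob
{
    /** @var MiniStep[] */
    private $steps;

    public function __construct(MiniStep ...$steps)
    {
        // Every job has at least one step.
        if (count($steps) === 0) {
            throw new InvalidArgumentException('A job needs at least one step.');
        }
        $this->steps = $steps;
    }

    /** Runs the steps in order; in this sketch the job stops at the first failure. */
    public function execute(array $parameters): bool
    {
        foreach ($this->steps as $step) {
            if (!$step->execute($parameters)) {
                return false;
            }
        }

        return true;
    }
}

$log = [];
$job = new MiniJob(
    new CallbackStep(function (array $p) use (&$log) {
        $log[] = 'fetch ' . $p['cachePath'];
        return true;
    }),
    new CallbackStep(function (array $p) use (&$log) {
        $log[] = 'import ' . $p['cachePath'];
        return true;
    })
);

// The same parameters are handed to every step of this execution.
$succeeded = $job->execute(['cachePath' => '/tmp/inventory.xml']);
```

Both steps receive the same execution-specific parameters, which is the role the JobParameters object plays in the slides.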
Trigger
• Runs a single action, which should be fast, on an
element external to the batch (e.g. a command line)
• Only provides a status (success / failure)
• Examples: download files, clear a cache, send an
e-mail, run a command line, etc.
class ApiCacheFetching extends AbstractStep
{
    public function doExecute(StepExecution $stepExecution)
    {
        $file = fopen(
            $stepExecution->getJobParameters()->get('cachePath'),
            'w'
        );
        $api = fopen('http://api.example.com/foo?bar=3', 'r');
        stream_copy_to_stream($api, $file);
        fclose($api);
        fclose($file);
    }
}
A trigger Step: download a
file and put it in local storage
class ApiCacheCleanup extends AbstractStep
{
    public function doExecute(StepExecution $stepExecution)
    {
        $file = $stepExecution->getJobParameters()
            ->get('cachePath');
        unlink($file);
    }
}
A trigger Step: remove a file
from local storage
public function doExecute(StepExecution $stepExecution)
{
    $builder = new ProcessBuilder(
        [
            '/usr/bin/env',
            'php',
            $this->indexerCommandPath,
            '--reindex',
            $this->index,
        ]
    );
    $process = $builder->getProcess();
    $process->setTimeout(3600);
    $process->run();
    if (!$process->isSuccessful()) {
        $stepExecution->addFailureException(
            new ProcessFailedException($process));
    }
    $stepExecution->addSummaryInfo('index', $process->getOutput());
}
A trigger step: run Magento
reindex
$batchJob = new Job(
    'inventory_job',
    $eventDispatcher,
    $jobRepository,
    [
        new ApiCacheFetching(...),
        new ItemStep(
            'inventory_step',
            $eventDispatcher,
            $jobRepository,
            new InventoryFileReader(),
            new InventoryDefaultProcessor(),
            new InventoryDatabaseWriter()
        ),
        new ApiCacheCleanup(...),
    ]
);
How to integrate it with the Job
We have added our 2 Steps into the Job
Item processing
• Handles a data source one line at a time
• Input and output can be two different storage types
• Can reject or ignore certain items
• Provides a status (success / failure)
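The behaviour listed above can be sketched as a plain read/process/write loop. This is an illustration under stated assumptions and not the akeneo/batch ItemStep: runItemLoop is a hypothetical function, and the convention used here (return null to ignore an item, throw InvalidArgumentException to reject it) is chosen for the example only.

```php
<?php

/**
 * Hypothetical read/process/write loop, one item at a time.
 * The processor returns the transformed item, returns null to ignore it,
 * or throws InvalidArgumentException to reject it.
 */
function runItemLoop(iterable $reader, callable $processor, callable $writer): array
{
    $summary = ['read' => 0, 'written' => 0, 'ignored' => 0, 'rejected' => 0];

    foreach ($reader as $item) {
        $summary['read']++;

        try {
            $processed = $processor($item);
        } catch (InvalidArgumentException $e) {
            $summary['rejected']++;
            continue;
        }

        if ($processed === null) {
            $summary['ignored']++;
            continue;
        }

        $writer($processed);
        $summary['written']++;
    }

    return $summary;
}

// Input is an in-memory array; output is a different "storage" (another array).
$rows = [
    ['sku' => 'A-1', 'qty' => '12'],
    ['sku' => 'B-2', 'qty' => 'n/a'], // rejected: quantity is not numeric
    ['sku' => '',    'qty' => '3'],   // ignored: no SKU
];

$stored = [];
$summary = runItemLoop(
    $rows,
    function (array $row) {
        if ($row['sku'] === '') {
            return null;
        }
        if (!is_numeric($row['qty'])) {
            throw new InvalidArgumentException('Invalid quantity for ' . $row['sku']);
        }

        return ['sku' => $row['sku'], 'qty' => (int) $row['qty']];
    },
    function (array $item) use (&$stored) {
        $stored[$item['sku']] = $item['qty'];
    }
);
```

The summary array stands in for the counters a step execution would report alongside its final status.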
Item processing, Read
Item processing, Writing
Using Iterators
• Make reading data easier
• Many classes in the SPL and third-party libraries
provide iteration features
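As a small illustration of what the SPL offers, the sketch below chains three standard iterators. The sample rows and variable names are made up for the example; only ArrayIterator, CallbackFilterIterator and LimitIterator come from PHP itself.

```php
<?php

// Any Traversable can feed a reader. The iterators below are chained lazily:
// nothing is filtered or sliced until the foreach pulls items.
$rows = new ArrayIterator([
    ['sku' => 'A-1', 'qty' => 12],
    ['sku' => 'B-2', 'qty' => 0],
    ['sku' => 'C-3', 'qty' => 7],
    ['sku' => 'D-4', 'qty' => 3],
]);

// Keep only the rows that are in stock.
$inStock = new CallbackFilterIterator($rows, function (array $row): bool {
    return $row['qty'] > 0;
});

// Take at most the first two matching rows.
$firstTwo = new LimitIterator($inStock, 0, 2);

$skus = [];
foreach ($firstTwo as $row) {
    $skus[] = $row['sku'];
}
```

Because each wrapper is itself an Iterator, a reader can call valid(), current() and next() on the outermost one without knowing how the data is produced.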
class IteratorReader implements
    ItemReaderInterface,
    StepExecutionAwareInterface,
    InitializableInterface
{
    private $stepExecution;
    private $iterator;

    public function setStepExecution(StepExecution $stepExecution)
    {
        $this->stepExecution = $stepExecution;
    }

    public function initialize()
    {
        $this->iterator = new FooIterator(...);
        $this->iterator->rewind();
    }

    public function read()
    {
        ...
    }
}
Using an Iterator
public function initialize()
{
    $filePath = $this->stepExecution
        ->getJobParameters()->get('cachePath');
    $xml = simplexml_load_file($filePath);
    $this->iterator = new ArrayIterator(
        $xml->xpath('//inventory/stock/virtual')
    );
    $this->iterator->rewind();
}

public function read()
{
    // Returning null signals that there is nothing left to read.
    if (!$this->iterator->valid()) {
        return null;
    }
    $this->stepExecution->incrementReadCount();
    $item = $this->iterator->current();
    $this->iterator->next();
    return $item;
}
Using SimpleXML
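To try the same idea outside a batch, here is a standalone sketch. The XML document is hypothetical: its element names only mirror the XPath used in the slide, and the sku attribute is an assumption made for the example.

```php
<?php

// Hypothetical inventory document; the real file layout is not shown in the slides.
$source = <<<XML
<inventory>
  <stock>
    <virtual sku="A-1">12</virtual>
    <virtual sku="B-2">0</virtual>
  </stock>
</inventory>
XML;

$xml = simplexml_load_string($source);
if ($xml === false) {
    throw new RuntimeException('The inventory document could not be parsed.');
}

// xpath() may return false or null on failure, so fall back to an empty array.
$nodes = $xml->xpath('//inventory/stock/virtual') ?: [];
$iterator = new ArrayIterator($nodes);

$items = [];
foreach ($iterator as $node) {
    // Attribute access and casts are part of the SimpleXMLElement API.
    $items[(string) $node['sku']] = (int) $node;
}
```

Note that simplexml_load_file() and simplexml_load_string() read the whole document into memory, so this approach suits files of moderate size.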
Limits
• No conditional execution
• No parallel execution
• Processors are unpleasant and complex to write
How to go further?
• Create a new type of Step
• Use the Extract-Transform-Load pattern
• Ease data mapping
Use case #1
• 145,000 product SKUs
• 70 websites
• Inventories and prices come from 46 ERPs
The pipeline
• Extract data from the sources
• Transform the data (type and contents)
• Load the data into the destination
• Open-sourced as php-etl/pipeline
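The three stages can be sketched with plain PHP generators. This is not the php-etl/pipeline API: extractRows, transformRows and loadRows are hypothetical functions, and the destination is an in-memory array standing in for a real data store.

```php
<?php

// Hypothetical generator-based ETL; php-etl/pipeline defines its own interfaces.

/** Extract: yield raw rows one at a time. */
function extractRows(array $source): Generator
{
    foreach ($source as $row) {
        yield $row;
    }
}

/** Transform: change the type and contents of each row. */
function transformRows(iterable $rows): Generator
{
    foreach ($rows as $row) {
        yield [
            'sku'   => strtoupper(trim($row['sku'])),
            'price' => (float) $row['price'],
        ];
    }
}

/** Load: write rows to the destination in batches and return the number loaded. */
function loadRows(iterable $rows, array &$destination, int $batchSize): int
{
    $batch = [];
    $count = 0;

    foreach ($rows as $row) {
        $batch[] = $row;
        $count++;
        if (count($batch) === $batchSize) {
            array_push($destination, ...$batch);
            $batch = [];
        }
    }

    // Flush the last, possibly incomplete, batch.
    if ($batch !== []) {
        array_push($destination, ...$batch);
    }

    return $count;
}

$destination = [];
$loaded = loadRows(
    transformRows(extractRows([
        ['sku' => ' a-1 ', 'price' => '9.90'],
        ['sku' => 'b-2',   'price' => '15'],
        ['sku' => 'c-3',   'price' => '0.5'],
    ])),
    $destination,
    2
);
```

Since every stage is lazy, each row travels through extract, transform and load before the next one is read, which keeps memory use flat on large sources.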
$pipeline = (new Pipeline\Pipeline(new Pipeline\PipelineRunner()))
    ->extract(new ProductExtractor($productEndpoint, $start, $limit))
    ->transform(new ProductTransformer($mapper, $logger))
    ->transform(new ProductMappingToDoctrineEntityBySKUTransformer($om, $mapper))
    ->load(new DoctrineBatchLoader($batchSize, $om))
;
Declare a pipeline
$batchJob = new Job(
    'product_job',
    $eventDispatcher,
    $jobRepository,
    [
        new PipelineStep(
            'product_step',
            $eventDispatcher,
            $jobRepository,
            $pipeline
        ),
    ]
);
How to integrate it with the Job
Use case #2
• 8,000 product SKUs per tenant
• High concurrency
• A batch of 25 products imported in 20 seconds
How do we go fast?
• Optimizing code for speed makes it hard to read
• Giving up on code readability makes
maintenance hard
• Code compilers are great, let’s use:
• nikic/php-parser
• symfony/expression-language
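The idea of compiling a declarative mapping into straight-line PHP can be sketched without either library. The function compileMapper below is hypothetical and far simpler than a real compiler: it builds source code by string concatenation instead of a syntax tree, and it only supports copying one key to another.

```php
<?php

/**
 * Hypothetical mini "compiler": turns a field map (output key => input key)
 * into PHP source for a closure with one straight-line assignment per field.
 */
function compileMapper(array $fieldMap): string
{
    $lines = [];
    foreach ($fieldMap as $outputKey => $inputKey) {
        // var_export() produces safely quoted PHP literals for the keys.
        $lines[] = sprintf(
            '    $output[%s] = $input[%s] ?? null;',
            var_export($outputKey, true),
            var_export($inputKey, true)
        );
    }

    return "return function (array \$input): array {\n"
        . "    \$output = [];\n"
        . implode("\n", $lines) . "\n"
        . "    return \$output;\n"
        . "};";
}

$source = compileMapper(['code' => 'ean', 'name' => 'label']);

// eval() keeps the sketch short; generated code would normally be written to a file.
$mapper = eval($source);

$result = $mapper(['ean' => '3700000000001', 'label' => 'Blue mug', 'extra' => 'dropped']);
```

The generated closure contains no loop and no lookup of the mapping at run time, which is where the speed-up of a compiled mapper comes from.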
$mapper = (new ArrayBuilder(null, $interpreter))
    ->children()
        ->constant('[type]', 'SIMPLE')
        ->copy('[ak_applications.default.value]', '[applications]')
        ->copy('[ak_tree.default.value][last_name]', '[tree]')
        ->list('[products]', 'merge( input["units"], input["shippings"] )')
            ->children()
                ->copy('[code]', '[ean]')
                ->expression(
                    '[name]',
                    'filter( input["label"], locale("fr_FR"), scope("web") )')
            ->end()
        ->end()
    ->end()
    ->getMapper();
Declare your mapping
$compiler = new Compiler\Compiler(new Compiler\Strategy\Spaghetti());
$mapper = $compiler->compile(
    Compiler\StandardCompilationContext::build(
        new EmptyPropertyPath(), __DIR__, 'FooArraySpaghettiMapper'
    ),
    $mapper
);
Compile your mapper
$output['ak_applications.default.value'] = array_values((function (array $input) {
    return array_slice($input, 0, 1, true);
})((function (array $input) : array {
    $output = array_filter($input, function (array $item) {
        return in_array($item['locale'], ["fr_FR", null]);
    });
    return $output;
})($input["values"]["applications"] ?? [])))[0]["data"] ?? null;
$output['ak_arbre.name'] = implode(",", (array) (array_values((function (array $input) {
    return array_slice($input, 0, 1, true);
})((function (array $input) : array {
    $output = array_filter($input, function (array $item) {
        return in_array($item['locale'], ["fr_FR", null]);
    });
    return $output;
})($input["values"]["arbre"] ?? [])))[0]["data"] ?? null));
Spaghetti-generated code
Use case #2
• Now imports 6,000 products in 20 seconds
Is it a problem?
• Code is auto-generated by the library
• Code is optimized to be very fast
• You do not need to maintain it
• Xdebug and Blackfire will be fully functional
Thank you

More Related Content

What's hot

Using Actions and Filters in WordPress to Make a Plugin Your Own
Using Actions and Filters in WordPress to Make a Plugin Your OwnUsing Actions and Filters in WordPress to Make a Plugin Your Own
Using Actions and Filters in WordPress to Make a Plugin Your Own
Brian Hogg
 
Testowanie JavaScript
Testowanie JavaScriptTestowanie JavaScript
Testowanie JavaScript
Tomasz Bak
 
Incremental Type Safety in React Apollo
Incremental Type Safety in React Apollo Incremental Type Safety in React Apollo
Incremental Type Safety in React Apollo
Evans Hauser
 
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
Ron Reiter
 
Building data flows with Celery and SQLAlchemy
Building data flows with Celery and SQLAlchemyBuilding data flows with Celery and SQLAlchemy
Building data flows with Celery and SQLAlchemy
Roger Barnes
 
Using Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App EngineUsing Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App Engine
River of Talent
 
SPFx working with SharePoint data
SPFx working with SharePoint dataSPFx working with SharePoint data
SPFx working with SharePoint data
Vladimir Medina
 
SPFx: Working with SharePoint Content
SPFx: Working with SharePoint ContentSPFx: Working with SharePoint Content
SPFx: Working with SharePoint Content
Vladimir Medina
 
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
dantleech
 
Integrating React.js with PHP projects
Integrating React.js with PHP projectsIntegrating React.js with PHP projects
Integrating React.js with PHP projects
Ignacio Martín
 
Intro to Redux | DreamLab Academy #3
Intro to Redux | DreamLab Academy #3 Intro to Redux | DreamLab Academy #3
Intro to Redux | DreamLab Academy #3
DreamLab
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance Issues
Odoo
 
Automation in angular js
Automation in angular jsAutomation in angular js
Automation in angular js
Marcin Wosinek
 
Practical Google App Engine Applications In Py
Practical Google App Engine Applications In PyPractical Google App Engine Applications In Py
Practical Google App Engine Applications In Py
Eric ShangKuan
 
Appengine Java Night #2b
Appengine Java Night #2bAppengine Java Night #2b
Appengine Java Night #2b
Shinichi Ogawa
 
What's new for developers in Dynamics 365 v9: Client API enhancement
What's new for developers in Dynamics 365 v9: Client API enhancementWhat's new for developers in Dynamics 365 v9: Client API enhancement
What's new for developers in Dynamics 365 v9: Client API enhancement
Kenichiro Nakamura
 
Polyglot parallelism
Polyglot parallelismPolyglot parallelism
Polyglot parallelism
Phillip Toland
 
Yang Tools Quick Memo
Yang Tools Quick MemoYang Tools Quick Memo
Yang Tools Quick Memo
Kentaro Ebisawa
 
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
Ben Teese
 
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
Amazon Web Services
 

What's hot (20)

Using Actions and Filters in WordPress to Make a Plugin Your Own
Using Actions and Filters in WordPress to Make a Plugin Your OwnUsing Actions and Filters in WordPress to Make a Plugin Your Own
Using Actions and Filters in WordPress to Make a Plugin Your Own
 
Testowanie JavaScript
Testowanie JavaScriptTestowanie JavaScript
Testowanie JavaScript
 
Incremental Type Safety in React Apollo
Incremental Type Safety in React Apollo Incremental Type Safety in React Apollo
Incremental Type Safety in React Apollo
 
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
 
Building data flows with Celery and SQLAlchemy
Building data flows with Celery and SQLAlchemyBuilding data flows with Celery and SQLAlchemy
Building data flows with Celery and SQLAlchemy
 
Using Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App EngineUsing Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App Engine
 
SPFx working with SharePoint data
SPFx working with SharePoint dataSPFx working with SharePoint data
SPFx working with SharePoint data
 
SPFx: Working with SharePoint Content
SPFx: Working with SharePoint ContentSPFx: Working with SharePoint Content
SPFx: Working with SharePoint Content
 
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
Building and Incredible Machine with Pipelines and Generators in PHP (IPC Ber...
 
Integrating React.js with PHP projects
Integrating React.js with PHP projectsIntegrating React.js with PHP projects
Integrating React.js with PHP projects
 
Intro to Redux | DreamLab Academy #3
Intro to Redux | DreamLab Academy #3 Intro to Redux | DreamLab Academy #3
Intro to Redux | DreamLab Academy #3
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance Issues
 
Automation in angular js
Automation in angular jsAutomation in angular js
Automation in angular js
 
Practical Google App Engine Applications In Py
Practical Google App Engine Applications In PyPractical Google App Engine Applications In Py
Practical Google App Engine Applications In Py
 
Appengine Java Night #2b
Appengine Java Night #2bAppengine Java Night #2b
Appengine Java Night #2b
 
What's new for developers in Dynamics 365 v9: Client API enhancement
What's new for developers in Dynamics 365 v9: Client API enhancementWhat's new for developers in Dynamics 365 v9: Client API enhancement
What's new for developers in Dynamics 365 v9: Client API enhancement
 
Polyglot parallelism
Polyglot parallelismPolyglot parallelism
Polyglot parallelism
 
Yang Tools Quick Memo
Yang Tools Quick MemoYang Tools Quick Memo
Yang Tools Quick Memo
 
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
The Return of JavaScript: 3 Open-Source Projects that are driving JavaScript'...
 
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
MBL301 Data Persistence to Amazon Dynamodb for Mobile Apps - AWS re: Invent 2012
 

Similar to Synchronize applications with akeneo/batch

Salesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUGSalesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUG
vraopolisetti
 
Workshop quality assurance for php projects - phpbelfast
Workshop quality assurance for php projects - phpbelfastWorkshop quality assurance for php projects - phpbelfast
Workshop quality assurance for php projects - phpbelfast
Michelangelo van Dam
 
Workshop quality assurance for php projects - ZendCon 2013
Workshop quality assurance for php projects - ZendCon 2013Workshop quality assurance for php projects - ZendCon 2013
Workshop quality assurance for php projects - ZendCon 2013
Michelangelo van Dam
 
CakePHP
CakePHPCakePHP
CakePHP
Walther Lalk
 
Rails is not just Ruby
Rails is not just RubyRails is not just Ruby
Rails is not just Ruby
Marco Otte-Witte
 
Sql storeprocedure
Sql storeprocedureSql storeprocedure
Sql storeprocedure
ftz 420
 
Understanding backbonejs
Understanding backbonejsUnderstanding backbonejs
Understanding backbonejs
Nick Lee
 
Lean Php Presentation
Lean Php PresentationLean Php Presentation
Lean Php Presentation
Alan Pinstein
 
Zend framework service
Zend framework serviceZend framework service
Zend framework service
Michelangelo van Dam
 
Zend framework service
Zend framework serviceZend framework service
Zend framework service
Michelangelo van Dam
 
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job QueueTask Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Sam Hennessy
 
Javascript Everywhere
Javascript EverywhereJavascript Everywhere
Javascript Everywhere
Pascal Rettig
 
Javascript first-class citizenery
Javascript first-class citizeneryJavascript first-class citizenery
Javascript first-class citizenery
toddbr
 
Doctrine For Beginners
Doctrine For BeginnersDoctrine For Beginners
Doctrine For Beginners
Jonathan Wage
 
Reactive programming with RxJava
Reactive programming with RxJavaReactive programming with RxJava
Reactive programming with RxJava
Jobaer Chowdhury
 
Adding a modern twist to legacy web applications
Adding a modern twist to legacy web applicationsAdding a modern twist to legacy web applications
Adding a modern twist to legacy web applications
Jeff Durta
 
Meet Magento Belarus debug Pavel Novitsky (eng)
Meet Magento Belarus debug Pavel Novitsky (eng)Meet Magento Belarus debug Pavel Novitsky (eng)
Meet Magento Belarus debug Pavel Novitsky (eng)
Pavel Novitsky
 
Refresh Austin - Intro to Dexy
Refresh Austin - Intro to DexyRefresh Austin - Intro to Dexy
Refresh Austin - Intro to Dexy
ananelson
 
Celery with python
Celery with pythonCelery with python
Celery with python
Alexandre González Rodríguez
 
Accessible Ajax on Rails
Accessible Ajax on RailsAccessible Ajax on Rails
Accessible Ajax on Rails
supervillain
 

Similar to Synchronize applications with akeneo/batch (20)

Salesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUGSalesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUG
 
Workshop quality assurance for php projects - phpbelfast
Workshop quality assurance for php projects - phpbelfastWorkshop quality assurance for php projects - phpbelfast
Workshop quality assurance for php projects - phpbelfast
 
Workshop quality assurance for php projects - ZendCon 2013
Workshop quality assurance for php projects - ZendCon 2013Workshop quality assurance for php projects - ZendCon 2013
Workshop quality assurance for php projects - ZendCon 2013
 
CakePHP
CakePHPCakePHP
CakePHP
 
Rails is not just Ruby
Rails is not just RubyRails is not just Ruby
Rails is not just Ruby
 
Sql storeprocedure
Sql storeprocedureSql storeprocedure
Sql storeprocedure
 
Understanding backbonejs
Understanding backbonejsUnderstanding backbonejs
Understanding backbonejs
 
Lean Php Presentation
Lean Php PresentationLean Php Presentation
Lean Php Presentation
 
Zend framework service
Zend framework serviceZend framework service
Zend framework service
 
Zend framework service
Zend framework serviceZend framework service
Zend framework service
 
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job QueueTask Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
 
Javascript Everywhere
Javascript EverywhereJavascript Everywhere
Javascript Everywhere
 
Javascript first-class citizenery
Javascript first-class citizeneryJavascript first-class citizenery
Javascript first-class citizenery
 
Doctrine For Beginners
Doctrine For BeginnersDoctrine For Beginners
Doctrine For Beginners
 
Reactive programming with RxJava
Reactive programming with RxJavaReactive programming with RxJava
Reactive programming with RxJava
 
Adding a modern twist to legacy web applications
Adding a modern twist to legacy web applicationsAdding a modern twist to legacy web applications
Adding a modern twist to legacy web applications
 
Meet Magento Belarus debug Pavel Novitsky (eng)
Meet Magento Belarus debug Pavel Novitsky (eng)Meet Magento Belarus debug Pavel Novitsky (eng)
Meet Magento Belarus debug Pavel Novitsky (eng)
 
Refresh Austin - Intro to Dexy
Refresh Austin - Intro to DexyRefresh Austin - Intro to Dexy
Refresh Austin - Intro to Dexy
 
Celery with python
Celery with pythonCelery with python
Celery with python
 
Accessible Ajax on Rails
Accessible Ajax on RailsAccessible Ajax on Rails
Accessible Ajax on Rails
 

Recently uploaded

Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 

Recently uploaded (20)

Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 

Synchronize applications with akeneo/batch

  • 3. Grégory Planchat @gplanchat Kiboko SAS, France You may know me from akeneopim-ug.slack.com
  • 4. Data integration It is all about importing and exporting data
  • 5. akeneo/batch • Exists since Akeneo’s inception in the AkeneoBatchBundle • framework-agnostic
  • 6. Used to… • Import data • Export • Mass actions • …
  • 7. $batchJob = new Job( ‘inventory_job', $eventDispatcher, $jobRepository, [ new ItemStep( ‘inventory_step', $eventDispatcher, $jobRepository, new InventoryFileReader(), new InventoryImportProcessor(), new InventoryDatabaseWriter() ), ] ); What does a job look like?
  • 8. $execution = new JobExecution(); $execution->setJobParameters( new JobParameters( [ 'cachePath' => $filename, ] ) ); $batchJob->execute($execution); How to run a job ?
  • 9. Under the hood Run Tier Job Tier Step Tier Data Tier
  • 10. $batchJob = new Job( ‘foo_job', $eventDispatcher, new JobRepository(), [ new ItemStep( ... ), new ItemStep( ... ), new MyAwesomeNotificationStep( ... ), new RandomStep( ... ), ] ); Unlimited number of Steps
  • 11. Principles • Every Job has at least one Step • Akeneo provides 2 types of steps: • Triggers • Item processing • The Job provides a JobParameters object for execution-specific configuration
  • 12. Trigger • Runs a single action, which should be fast, on an element external to the batch (e.g. a command line) • Only provides a status (success / failed) • Examples: download files, clear a cache, send an e-mail, run a command line, etc.
  • 13. class ApiCacheFetching extends AbstractStep { public function doExecute(StepExecution $stepExecution) { $file = fopen( $stepExecution->getJobParameters()->get('cachePath'), 'w' ); $api = fopen('http://api.example.com/foo?bar=3', 'r'); stream_copy_to_stream($api, $file); fclose($api); fclose($file); } } A trigger Step: download a file and put it in local storage
  • 14. class ApiCacheCleanup extends AbstractStep { public function doExecute(StepExecution $stepExecution) { $file = $stepExecution->getJobParameters() ->get('cachePath'); unlink($file); } } A trigger Step: remove a file from local storage
  • 15. public function doExecute(StepExecution $stepExecution) { $builder = new ProcessBuilder( [ '/usr/bin/env', 'php', $this->indexerCommandPath, '--reindex', $this->index, ] ); $process = $builder->getProcess(); $process->setTimeout(3600); $process->run(); if (!$process->isSuccessful()) { $stepExecution->addFailureException( new ProcessFailedException($process)); } $stepExecution->addSummaryInfo('index', $process->getOutput()); } A trigger step: run Magento reindex
  • 16. $batchJob = new Job( 'inventory_job', $eventDispatcher, $jobRepository, [ new ApiCacheFetching(...), new ItemStep( 'inventory_step', $eventDispatcher, $jobRepository, new InventoryFileReader(), new InventoryDefaultProcessor(), new InventoryDatabaseWriter() ), new ApiCacheCleanup(...), ] ); How to integrate it with the Job: we have added our 2 Steps into the Job
  • 17. Item processing • Handles a data source one line at a time • Input and output can be two different storage types • Can reject or ignore certain items • Provides a status (success / failure)
  • 21. Using Iterators • Makes reading data easier • Many classes in the SPL and in third-party libraries provide iteration features
  • 22. class IteratorReader implements ItemReaderInterface, StepExecutionAwareInterface, InitializableInterface { public function setStepExecution(StepExecution $stepExecution) { $this->stepExecution = $stepExecution; } public function initialize() { $this->iterator = new FooIterator(...); $this->iterator->rewind(); } public function read() { ... } } Using an Iterator
  • 23. public function initialize() { $filePath = $this->stepExecution ->getJobParameters()->get('cachePath'); $xml = simplexml_load_file($filePath); $this->iterator = new ArrayIterator( $xml->xpath('//inventory/stock/virtual') ); $this->iterator->rewind(); } public function read() { if (!$this->iterator->valid()) { return null; } $this->stepExecution->incrementReadCount(); $item = $this->iterator->current(); $this->iterator->next(); return $item; } Using SimpleXML
  • 24. Limits • No conditional execution • No parallel execution • Processors are unpleasant and complex to write
  • 25. How to go further? • Create a new type of Step • Use the Extract-Transform-Load pattern • Ease data mapping
  • 26. Use case #1 • 145,000 product SKUs • 70 websites • Inventories and prices come from 46 ERPs
  • 27. The pipeline • Extract data from the sources • Transform the data (type and contents) • Load the data into the destination • Open-sourced as php-etl/pipeline
  • 28. $pipeline = (new Pipeline\Pipeline(new Pipeline\PipelineRunner())) ->extract(new ProductExtractor($productEndpoint, $start, $limit)) ->transform(new ProductTransformer($mapper, $logger)) ->transform(new ProductMappingToDoctrineEntityBySKUTransformer($om, $mapper)) ->load(new DoctrineBatchLoader($batchSize, $om)) ; Declare a pipeline
  • 29. $batchJob = new Job( 'product_job', $eventDispatcher, $jobRepository, [ new PipelineStep( 'product_step', $eventDispatcher, $jobRepository, $pipeline ), ] ); How to integrate it with the Job
  • 31. Use case #2 • 8,000 product SKUs per tenant • High concurrency • A batch of 25 products imported in 20 seconds
  • 32. How do we go fast? • Optimizing code for speed makes it hard to read • Ceasing to care about code readability makes maintenance hard • Code compilers are great, so let’s use: • nikic/php-parser • symfony/expression-language
  • 33. $mapper = (new ArrayBuilder(null, $interpreter)) ->children() ->constant('[type]', 'SIMPLE') ->copy('[ak_applications.default.value]', '[applications]') ->copy('[ak_tree.default.value][last_name]', '[tree]') ->list('[products]', 'merge( input["units"], input["shippings"] )') ->children() ->copy('[code]', '[ean]') ->expression( '[name]', 'filter( input["label"], locale("fr_FR"), scope("web") )') ->end() ->end() ->end() ->getMapper(); Declare your mapping
  • 34. $compiler = new Compiler\Compiler(new Compiler\Strategy\Spaghetti()); $mapper = $compiler->compile( Compiler\StandardCompilationContext::build( new EmptyPropertyPath(), __DIR__, 'FooArraySpaghettiMapper' ), $mapper ); Compile your mapper
  • 35. $output['ak_applications.default.value'] = array_values((function (array $input) { return array_slice($input, 0, 1, true); })((function (array $input) : array { $output = array_filter($input, function (array $item) { return in_array($item['locale'], ["fr_FR", null]); }); return $output; })($input["values"]["applications"] ?? [])))[0]["data"] ?? null; $output['ak_arbre.name'] = implode(",", (array) (array_values((function (array $input) { return array_slice($input, 0, 1, true); })((function (array $input) : array { $output = array_filter($input, function (array $item) { return in_array($item['locale'], ["fr_FR", null]); }); return $output; })($input["values"]["arbre"] ?? [])))[0]["data"] ?? null)); Spaghetti-generated code
  • 36. Use case #2 • Now imports 6,000 products in 20 seconds
  • 37. Is it a problem? • Code is auto-generated by the library • Code is optimized to be very fast • You do not need to maintain it • Xdebug and Blackfire will be fully functional
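
The reader on slides 22 and 23 relies on a simple contract: read() hands back the next item and returns null once the source is exhausted. The following is a minimal sketch of that contract that runs without akeneo/batch. The class name IteratorBackedReader, the getReadCount() helper and the sample XML are hypothetical; a real reader would implement ItemReaderInterface and report counts through the StepExecution, as the slides show.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the slide's IteratorReader. It does not depend on
// akeneo/batch, so the read-until-null contract can be tried in isolation.
final class IteratorBackedReader
{
    private Iterator $iterator;
    private int $readCount = 0;

    public function __construct(Iterator $iterator)
    {
        $this->iterator = $iterator;
        $this->iterator->rewind();
    }

    /** Returns the next item, or null once the source is exhausted. */
    public function read()
    {
        if (!$this->iterator->valid()) {
            return null;
        }
        $this->readCount++;
        $item = $this->iterator->current();
        $this->iterator->next();

        return $item;
    }

    public function getReadCount(): int
    {
        return $this->readCount;
    }
}

// Sample document shaped to match the XPath used on slide 23 (contents made up).
$xml = simplexml_load_string(
    '<inventory><stock>'
    . '<virtual sku="A" qty="3"/><virtual sku="B" qty="0"/>'
    . '</stock></inventory>'
);
$reader = new IteratorBackedReader(
    new ArrayIterator($xml->xpath('//inventory/stock/virtual'))
);

// Drain the reader the way an item step would: loop until read() returns null.
$skus = [];
while (($item = $reader->read()) !== null) {
    $skus[] = (string) $item['sku'];
}
```

Because null doubles as the end-of-data signal, a source that can legitimately contain null items would need a different sentinel; that trade-off is inherent to this style of reader.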
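
The Extract-Transform-Load stages listed on slide 27 can be illustrated with plain PHP generators. This sketch is not the php-etl/pipeline API: the functions extractRows(), transformRows() and loadRows(), as well as the sample rows, are hypothetical and only show how items can stream through the three stages one at a time while the loader flushes in batches.

```php
<?php
declare(strict_types=1);

// Extract: yield raw rows one at a time (hypothetical in-memory source).
function extractRows(array $source): Generator
{
    foreach ($source as $row) {
        yield $row;
    }
}

// Transform: normalise the type and contents of each row, lazily.
function transformRows(iterable $rows): Generator
{
    foreach ($rows as $row) {
        yield [
            'sku' => strtoupper(trim($row['sku'])),
            'qty' => (int) $row['qty'],
        ];
    }
}

// Load: hand rows to the destination in batches and return the row count.
function loadRows(iterable $rows, int $batchSize, callable $flush): int
{
    $batch = [];
    $count = 0;
    foreach ($rows as $row) {
        $batch[] = $row;
        $count++;
        if (count($batch) === $batchSize) {
            $flush($batch);
            $batch = [];
        }
    }
    if ($batch !== []) {
        $flush($batch); // flush the final, partially filled batch
    }

    return $count;
}

$source = [
    ['sku' => ' ab-1 ', 'qty' => '3'],
    ['sku' => 'cd-2', 'qty' => '0'],
    ['sku' => 'ef-3', 'qty' => '12'],
];

// The "destination" here is just an array that records each flushed batch.
$batches = [];
$loaded = loadRows(
    transformRows(extractRows($source)),
    2,
    function (array $batch) use (&$batches): void {
        $batches[] = $batch;
    }
);
```

Chaining generators keeps only one row (plus the current batch) in memory at a time, which is the property that matters when a source holds far more rows than this sample.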