Integrating and Analyzing Data from Multiple Manufacturing Sites Using Apache NiFi and Apache Zeppelin with a Central Hadoop Data Lake at CSL Behring

In this talk, Mark Baker (CSL) shows how CSL Behring integrates and analyzes data from multiple manufacturing sites, using Apache NiFi to move that data into a central Hadoop data lake.

The challenge of merging data from disparate systems has been a leading driver behind investments in data warehousing systems as well as in Hadoop. While data warehousing solutions are ready-built for RDBMS integration, Hadoop adds economical, highly scalable storage and can handle a variety of structured and unstructured formats. Whether using a data warehouse, Hadoop, or both, physical data movement and consolidation is the primary method of integration.
Synchronizing rapidly changing data from a system of record to a consolidated Hadoop platform can also be a challenge.
This introduces the need for "data federation", where data is integrated without being copied between systems.
For historical/batch use cases, data is replicated from remote data hubs into a central data lake using Apache NiFi.
We will demo Apache Zeppelin for analyzing data with Apache Spark and Apache Hive.

Published in: Technology

  1. INGELA VIKSTROM, ANABEL SILVA, SANDRO PRATO, CSL Bio21 Research Scientists, Australia
     MARK BAKER, Head of Big Data Infrastructure, CSL Behring
     ANALYZING DATA FROM MULTIPLE MANUFACTURING SITES USING A CENTRAL HADOOP DATA LAKE
  2. Outline
     • CSL Behring
       – Introduction of CSL Behring
     • CSL Behring’s products and focus
     • Growth and global placement of manufacturing facilities
     • Current PACE globalization initiative
     • Streamlining global processes to improve efficiency
     • Partnership with Hortonworks to create our Big Data Platform
     • HDP for Data lake and analytics using Zeppelin
     • HDF for secure data movement from global manufacturing sites to our central data repository SAP HANA & HDP
     • Q & A
  3. CSL Behring’s Products and Focus
     • CSL Behring
       – CSL Behring is a global biotherapeutics leader
       – Focused on serving patients’ needs by using the latest technologies
     • Deliver innovative therapies that are used to treat rare and serious conditions.
       – One of our “super orphan” therapies treats a condition affecting approximately 300 patients in the U.S. and only one million worldwide.
     To meet growing demand and bring more therapies to more patients, we continue to invest in the expansion of all our manufacturing facilities.
  4. Business Driver
     PACE globalization initiative
     • PACE is a global transformation initiative that fulfills our promise to patients by aligning our processes and enhancing collaboration to achieve sustainable business excellence
     • Provide advanced analytics capabilities to exploit existing and new data assets, support decision-making, and provide predictive models
     • Build a user community with the right skills and the right tools
  5. Global Manufacturing Facilities
     • Manufacturing Sites
     • United States
       – Kankakee
     • Germany
       – Marburg
     • Switzerland
       – Bern
     • Australia
       – Melbourne
     • Historically separated by region and operated independently
  6. Manufacturing & Analytical Silos (13/06/2017)
  7. Future Manufacturing Data Flows (13/06/2017)
  8. Challenges
     • Each manufacturing system uses a different backend database and schema to log the batch execution steps
       – 12 x SCADA and MES systems
     • Edge servers must not impact MES system performance
       – Sensitive systems required an impact assessment prior to direct data extracts
     • Data must be encrypted in motion and at rest
       – HIPAA compliance and EU privacy requirements
     • Data must be compressed over the WAN
       – Due to bandwidth constraints on the intranet
     • Multiple time zones and string encodings
  9. NiFi
     • Allows the creation of custom processors for each MES system (Python).
     • Uses back pressure to eliminate any full database pulls after network/hardware outages.
     • Encrypts data over the wire.
     • Compresses data over the wire.
     • Allows data enrichment, such as the addition of a UTC column.
     • ETL functionality transforms special characters (e.g. ṏ) into data that analytical tools can process.
  10. Thank You
      CSL Limited
      45 Poplar Road
      Parkville, Victoria, 3056
      Australia
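
The Challenges and NiFi slides mention three per-record steps: adding a UTC column, transforming special characters, and compressing data before it crosses the WAN. The slides do not show the actual processors, so the following is only a minimal sketch in plain Python, independent of any NiFi API. The record layout, the field names (`timestamp_local`, `timestamp_utc`, `operator`) and the function names are hypothetical, and the sketch assumes each source timestamp already carries its UTC offset.

```python
import gzip
import json
import unicodedata
from datetime import datetime, timezone


def add_utc_column(record: dict) -> dict:
    """Return a copy of the record with an added 'timestamp_utc' field.

    Assumes 'timestamp_local' is an ISO 8601 string that includes a UTC
    offset, e.g. '2017-06-13T09:30:00+02:00' (hypothetical field name).
    """
    local_time = datetime.fromisoformat(record["timestamp_local"])
    enriched = dict(record)
    enriched["timestamp_utc"] = local_time.astimezone(timezone.utc).isoformat()
    return enriched


def strip_diacritics(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def compress_batch(records: list) -> bytes:
    """Serialize records as newline-delimited JSON and gzip the result."""
    payload = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
    return gzip.compress(payload.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical record as it might arrive from one site's MES.
    record = {
        "site": "Marburg",
        "timestamp_local": "2017-06-13T09:30:00+02:00",
        "operator": "Jörg",
    }
    enriched = add_utc_column(record)
    enriched["operator"] = strip_diacritics(enriched["operator"])
    blob = compress_batch([enriched])
    print(enriched["timestamp_utc"], enriched["operator"], len(blob))
```

Stripping combining marks is only one possible transformation; whether it is acceptable depends on how the downstream analytical tools treat names and free-text fields. Encryption in motion is left out here because it is normally handled by the transport layer rather than per record.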
