
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—Industry 4.0 and Logistics Success Examples with Francisco J. Lacueva and Rosa Montañés


In many cases, Big Data becomes just another buzzword because tools are lacking that support both the technological requirements of developing and deploying projects and fluent communication among the different profiles of people involved in them.

In this talk, we will present Moriarty, a set of tools for fast prototyping of Big Data applications that can be deployed in an Apache Spark environment. These tools support building Big Data workflows from existing functional blocks and creating new functional blocks. The resulting workflow can then be deployed on a Spark infrastructure and used through a REST API.

To give a better understanding of Moriarty, the prototyping process, and the way it hides the Spark environment from Big Data users and developers, we will present it together with examples based on an Industry 4.0 success case and a logistics success case.


  1. Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications. Francisco J. Lacueva, ITAINNOVA; Rosa Montañés, ITAINNOVA. #EUent6
  2. Index • everisMoriarty – Win-win dialog: Developer & Data Scientist – everisMoriarty platform • Spark in everisMoriarty – Basic example – Real cases • TT • FACTS4WORKERS • Streaming
  3. everisMoriarty: a platform for end-to-end development of data analytics models.
     • everis (www.everis.es) belongs to NTT Data Group. It offers IT solutions, services and outsourcing to sectors such as telcos, financial entities, public administration, industry, utilities, energy providers or health companies. Revenues: 1031M€. Sites in 16 countries; 19000 professionals work for everis. NTT Data Group has sites in 50 countries and 110000 professionals.
     • ITAINNOVA (www.itainnova.es) is a non-profit R&D centre owned by the Aragon Regional Government. Sites: Zaragoza and Huesca. 15000 m² facilities. 15M€ budget: 59% private projects, 30% CPF, 11% NCPF. 1.5M€ investments. Knowledge areas: Materials, Mechatronics, Electrical Power Systems, Industrial Processes, ICT-Big Data, Laboratories – Quality.
  4. everisMoriarty: Win-win dialog
  5. everisMoriarty: Win-win dialog. Aspect “WorkTeam” (solution). Developer: “I use your models and develop tools more quickly and effectively.” Data Scientist (collaborates with the developer): “I build models, patterns, and cognitive systems ready to use.”
  6. everisMoriarty. More than 100 workitems (WI) covering ML and DL models, DB connectors, crawlers, text mining algorithms, Spark components… WFs can be used as a WI within another WF. Supported programming languages: Java, Python. Automatic provision of REST APIs for published WFs. Easy integration with third-party APIs. Screenshot labels: List of Workitems, eM WorkFlow, eM WorkItem, WorkFlow Parameters, Deploy Button.
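
The statement that a WF can be used as a WI within another WF can be pictured as a composite pattern. The following is only a sketch in plain Python: the class names `WorkItem` and `WorkFlow`, the `run` method, and the dictionary payload are hypothetical and are not taken from the everisMoriarty API.

```python
from typing import Any, Callable, Dict, List

Payload = Dict[str, Any]  # hypothetical data format passed between work items


class WorkItem:
    """Hypothetical functional block: wraps one function over the payload."""

    def __init__(self, name: str, func: Callable[[Payload], Payload]) -> None:
        self.name = name
        self.func = func

    def run(self, payload: Payload) -> Payload:
        return self.func(payload)


class WorkFlow(WorkItem):
    """Hypothetical workflow: a sequence of work items that is itself a work item."""

    def __init__(self, name: str, items: List[WorkItem]) -> None:
        self.name = name
        self.items = items

    def run(self, payload: Payload) -> Payload:
        # Run each item in order, feeding the output of one into the next.
        for item in self.items:
            payload = item.run(payload)
        return payload


# An inner workflow built from two work items.
tokenise = WorkItem("tokenise", lambda p: {**p, "tokens": p["text"].lower().split()})
count = WorkItem("count", lambda p: {**p, "n_tokens": len(p["tokens"])})
text_stats = WorkFlow("text_stats", [tokenise, count])

# The inner workflow is reused as a single work item inside an outer workflow.
flag_long = WorkItem("flag_long", lambda p: {**p, "long": p["n_tokens"] > 3})
outer = WorkFlow("outer", [text_stats, flag_long])

result = outer.run({"text": "Hiding Apache Spark complexity"})
```

Because `WorkFlow` exposes the same `run` interface as `WorkItem`, nesting needs no special handling in the outer workflow.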
  7. Spark in everisMoriarty. Spark Cluster (Cloudera CDH). eM Spark WIs: Generic, SparkStreaming, SparkML.
  8. everisMoriarty: Spark WIs overview
     • Spark Basic WIs
       – Spark Context: SparkContextCreator, SparkContextStop
       – Data Sources: SparkCSVLoader, SparkCSVWriter, MongoSpark, …
     • Spark Streaming WIs
       – Spark Streaming Context: SparkStreamingContextCreator, SparkStreamingContextStartAndWait, SparkStreamingContextStop
       – DStream transformations: SparkStreamingInputDStreamWindower
     • Spark Machine Learning WIs: SparkKMeans, SparkLinearRegression, SparkDecisionTreeClassification, SparkPCA, SparkClassifier, SparkTokenizer, SparkTfIdf
  9. Spark in everisMoriarty. Integration with Spark: – Spark WFs (using Spark WIs) • Native Spark operations & Spark data types – Other WFs • No native Spark operations & no Spark data types • Reused within Spark WFs, i.e. as mapFunction • Distributed execution over Spark environment
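
The idea of reusing a non-Spark WF as a mapFunction can be sketched as follows. This is an illustration in plain Python rather than Spark code: a thread pool stands in for the distributed Spark environment, and the names `record_wf` and `run_as_map_function`, as well as the record fields, are hypothetical. With PySpark, the corresponding step would typically be `rdd.map(record_wf)`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

Record = Dict[str, float]


def record_wf(record: Record) -> Record:
    """Hypothetical non-Spark workflow: handles ONE record and uses no Spark types."""
    return {**record, "kg_per_unit": record["kg"] / record["units"]}


def run_as_map_function(
    wf: Callable[[Record], Record], records: Iterable[Record], workers: int = 4
) -> List[Record]:
    """Apply the workflow to every record in parallel (a stand-in for rdd.map)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(wf, records))


deliveries = [{"kg": 6.0, "units": 1}, {"kg": 9.0, "units": 3}]
processed = run_as_map_function(record_wf, deliveries)
```

The workflow itself stays free of any parallelism concerns; only the wrapper knows how records are distributed.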
  10. Spark in everisMoriarty. (Very) Basic Example. Workflow: Start Event → SparkLoadData → ProcessData → SparkStoreData → End Event. ProcessData encapsulates mapFunction. Record processing WF, Data Scientist-based.
  11. Spark in everisMoriarty. (Very) Basic Example: classify data. Same workflow. Record processing WF, Data Scientist-based: - Classification
  12. Spark in everisMoriarty. (Very) Basic Example: classify by opinion. Same workflow. Record processing WF, Data Scientist-based: - Classification - Opinion Mining
  13. Spark in everisMoriarty. (Very) Basic Example: use external resource. Same workflow. Record processing WF, Data Scientist-based: - Classification - Opinion Mining - Invoke REST APIs
  14. Spark in everisMoriarty. (Very) Basic Example. Same workflow. Record processing WF, Data Scientist-based: - Classification - Opinion Mining - Invoke REST APIs - ...
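
The load, process, store shape of the basic example can be sketched in a few lines. This is a toy version in plain Python, not Spark or everisMoriarty code: in-memory CSV text replaces the Spark data sources, and the keyword lists and the functions `load_data`, `process_data`, `store_data` and `classify_opinion` are hypothetical stand-ins for SparkLoadData, ProcessData, SparkStoreData and a record-processing WF.

```python
import csv
import io
from typing import Callable, Dict, List

Record = Dict[str, str]

# Toy opinion lexicon, assumed purely for illustration.
POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "broken"}


def classify_opinion(record: Record) -> Record:
    """Record-processing step: label one record by counting lexicon hits."""
    words = set(record["text"].lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {**record, "opinion": label}


def load_data(csv_text: str) -> List[Record]:
    """Stand-in for SparkLoadData: read CSV text into a list of records."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def process_data(records: List[Record], map_function: Callable[[Record], Record]) -> List[Record]:
    """Stand-in for ProcessData: apply the encapsulated map function to each record."""
    return [map_function(r) for r in records]


def store_data(records: List[Record]) -> str:
    """Stand-in for SparkStoreData: serialise the records back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


raw = "id,text\n1,great and fast delivery\n2,slow and broken parcel\n3,arrived on Tuesday\n"
stored = store_data(process_data(load_data(raw), classify_opinion))
```

Swapping `classify_opinion` for another per-record function (for example, one that calls a REST API) changes the behaviour without touching the load and store steps, which is the point the slide sequence makes.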
  15. Real case: Spark in everisMoriarty • PILOT DOMAIN: Dynamic Supply Networks / eCommerce • DOMAIN LEAD: Athens University of Economics and Business (AUEB) • USE CASES: • Route Planning • Forecasting • https://transformingtransport.eu/
  16. Spark in everisMoriarty. Data Transformation & Route Planning Service Invocation • Data processed over Spark platform • Parallelized execution. Diagram labels: Company A, Company B, Company C. Sample service record:
      <Service>
        <ID>261680</ID>
        <Name>Cliente261680</Name>
        <Duration>PT10M</Duration>
        <Location>
          <Address>No Address</Address>
          <HouseNumber />
          <City />
          <PostalCode />
          <Region />
          <Country>ES</Country>
          <Coord srs="EPSG:4326" x="37.984332" y="23.726466" />
          <GeocodeLevel>XY</GeocodeLevel>
        </Location>
        <Priority>1</Priority>
        <Windows>
          <Window start="2017-04-28T08:00:00" end="2017-04-28T16:00:00" />
        </Windows>
        <UnloadUnits>1</UnloadUnits>
        <UnloadKg>6.000</UnloadKg>
        <UnloadM3>0.036</UnloadM3>
        <Comments />
      </Service>
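
As an illustration of the data transformation step, the sketch below parses a service record like the one on the slide into a flat Python dictionary. It uses only the standard library; the function names, the output field names and the choice of fields are assumptions, and the duration parser covers only the simple `PT#H#M` form seen in the sample.

```python
import re
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Sample record copied from the slide.
SERVICE_XML = """<Service>
  <ID>261680</ID>
  <Name>Cliente261680</Name>
  <Duration>PT10M</Duration>
  <Location>
    <Address>No Address</Address>
    <Country>ES</Country>
    <Coord srs="EPSG:4326" x="37.984332" y="23.726466" />
    <GeocodeLevel>XY</GeocodeLevel>
  </Location>
  <Priority>1</Priority>
  <Windows>
    <Window start="2017-04-28T08:00:00" end="2017-04-28T16:00:00" />
  </Windows>
  <UnloadUnits>1</UnloadUnits>
  <UnloadKg>6.000</UnloadKg>
  <UnloadM3>0.036</UnloadM3>
</Service>"""


def parse_duration_minutes(iso: str) -> int:
    """Convert a duration such as 'PT10M' or 'PT1H30M' to minutes (hours/minutes only)."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", iso)
    if match is None:
        raise ValueError(f"unsupported duration: {iso!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


def service_to_record(xml_text: str) -> Dict[str, Any]:
    """Flatten one <Service> element into a dictionary (field names are hypothetical)."""
    root = ET.fromstring(xml_text)
    coord = root.find("Location/Coord")
    window = root.find("Windows/Window")
    return {
        "id": root.findtext("ID"),
        "duration_min": parse_duration_minutes(root.findtext("Duration")),
        "x": float(coord.get("x")),
        "y": float(coord.get("y")),
        "window_start": window.get("start"),
        "window_end": window.get("end"),
        "unload_kg": float(root.findtext("UnloadKg")),
    }


record = service_to_record(SERVICE_XML)
```

A per-record function of this kind is the sort of step that could be applied in parallel across many service records.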
  17. Real case: Spark in everisMoriarty. FACTS4WORKERS: Worker-Centric Workplaces in Smart Factories. PILOT DOMAIN: H2020-FoF 4. 2014. DOMAIN LEAD: Virtual Vehicle. USE CASES: Assessing the Impact on Workers Based on System-Logged Information. Partners: 4 large industry partners, 5 non-profit research centers, 2 SMEs. This project has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement n° 636778. http://facts4workers.eu
  18. Spark in everisMoriarty. Architecture diagram (labels only):
      • Method: Abstract World, Real World; As-is Situation, Problem Scenario, Should-be Situation, Activity Scenario, Instance Problem, Solution Artefact, Problem Artefact Evaluation; Iterative, Agile Development
      • HMI/HCI: Mobile Devices, Wearables (e.g. Smart Watches), Smart Glasses, Desktop
      • Workflow Engine: BBs and services provided by the WFE, which provides standard APIs: REST JSON API
      • F4W BBs: Authentication, Control Charts, Logbook, ERP Interface, Multimedia Management, Semantic Search, Training Module, Chat Module, Video Chat Module, Alarm Warning Manager, User Content Rating, Machine Status, Operator Skill Profiling, Task Manager, planned BBs …
      • Backend Systems: Work Flow Engine, Docs Repository, PMS, KMS, Social Big Data, Industrial IoT Big Data, 3D Models, Transformers, Environment Sensors, Data Repositories, Social Software, Security System, ERP …
  19. Spark in everisMoriarty. Architecture diagram (labels only):
      • HMI/HCI: Mobile Devices, Wearables (e.g. Smart Watches), Smart Glasses, Desktop
      • Workflow Engine
      • F4W BBs: Authentication, Control Charts, Logbook, ERP Interface, Multimedia Management, Semantic Search, Training Module, Chat Module, Video Chat Module, Alarm Warning Manager, User Content Rating, Machine Status, Operator Skill Profiling, Task Manager, planned BBs …
      • Backend Systems: Docs Repository, PMS, KMS, Social Big Data, Industrial IoT Big Data, 3D Models, Transformers, Environment Sensors, Data Repositories, Social Software, Security System, ERP …
      • Dashboard, Alarms, Reports
  20. Real case: Spark Streaming. Spark in everisMoriarty • USE CASES: • Urban mobility events processing • Classification of individuals
  21. Spark in everisMoriarty. Urban mobility events processing & classification. Diagram labels: HDFS, DBs, Dashboards
  22. Spark in everisMoriarty. Urban mobility events processing & classification. Diagram labels: Spark Context, Spark Streaming connector, Spark Engine, Spark Streaming Context Start, HDFS, DBs, Dashboards
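
The streaming use case combines windowed event processing with classification of individuals. The slides do not give the actual logic, so the sketch below is an assumed, simplified version in plain Python: micro-batches are lists of hypothetical individual ids, a `deque` plays the role of a sliding window over the most recent batches, and the `frequent`/`occasional` labels and the threshold are invented for illustration. In Spark Streaming, the windowing would typically be handled by a windowed DStream.

```python
from collections import Counter, deque
from typing import Deque, Dict, Iterable, Iterator, List

Batch = List[str]  # one micro-batch: ids of individuals observed in that interval


def windowed_counts(batches: Iterable[Batch], window: int) -> Iterator[Counter]:
    """Yield, after each micro-batch, event counts over the last `window` batches."""
    recent: Deque[Batch] = deque(maxlen=window)  # oldest batch drops out automatically
    for batch in batches:
        recent.append(batch)
        counts: Counter = Counter()
        for past_batch in recent:
            counts.update(past_batch)
        yield counts


def classify(counts: Counter, threshold: int = 3) -> Dict[str, str]:
    """Label each individual by activity within the window (hypothetical rule)."""
    return {
        person: "frequent" if n >= threshold else "occasional"
        for person, n in counts.items()
    }


stream = [["a", "b"], ["a"], ["a", "c"], ["c"]]
labels = [classify(c) for c in windowed_counts(stream, window=3)]
```

Recomputing the counts from the whole window on each batch keeps the sketch simple; an incremental update would be preferable for large windows.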
  23. Moving faster with Spark • Easy Spark integration – Increase computation capabilities – Reduce execution time – Easy deployment of non-Spark WFs over Spark infrastructure • Fast Prototyping – Basic Spark applications – Spark ML applications – Spark Streaming applications
  24. Moving faster with Spark: Fast Prototyping & Fast Deployment. Stages: Design and Development; Testing and Deployment; Production. Diagram labels: Code control, Integration Engine, Development environment, Continuous Integration and Delivery, QA (Functional Tests, Performance Tests, Unit & Integration Tests), Production Environment. Source: Gartner, “Market Trends: Top Five Buyer Expectations of Intelligent Automation in Data and Analytics Services”, published 17 March 2017, ID G00322319.
  25. Questions?
  26. Contact us. Rosa Montañés (rmontanes@itainnova.es): Master IA, Data Scientist and BD Architect, Telecom Engineer. Francisco J. Lacueva (fjlacueva@itainnova.es): Data Scientist and BD Architect, Master Software Engineer.
