Big Data – Tap into Cloud Infrastructure with FME

See how to easily migrate to the cloud with FME, and take advantage of the corresponding benefits including unlimited resources, scalability, and zero hardware to maintain. You'll see how you can use new FME 2014 support to move data to Big Data handling tools and services such as Amazon RDS, Amazon S3, Amazon DynamoDB, Amazon RedShift, and Google BigQuery. Plus, learn about the benefits of being close to the data, and how FME Server and FME Cloud can help power the flow of data, whether it's hosted, on-site, or somewhere in between.

Published in: Technology

  • Video plays here: what is big data? It is a fuzzy term, sort of like “cloud”. What does big data look like? As a catch-all term, “big data” can be pretty nebulous, in the same way that the term “cloud” covers diverse technologies. Input data to big data systems could be chatter from social networks, web server logs, traffic flow sensors, satellite imagery, broadcast audio streams, banking transactions, MP3s of rock music, the content of web pages, scans of government documents, GPS trails, telemetry from automobiles, financial market data; the list goes on. Are these all really the same thing? To clarify matters, the three Vs of volume, velocity and variety are commonly used to characterize different aspects of big data. They’re a helpful lens through which to view and understand the nature of the data and the software platforms available to exploit them. Most probably you will contend with each of the Vs to one degree or another.
  • Big data holds all of it
  • - on premise - cloud (amazon web services) - cloud (google) - cloud (other) - not currently using Big Data
  • Loading data. Conversion: big data is not spatial friendly (CAD, GIS); expensive to upload / download. Georeferencing and spatial indexing: most big data repositories have limited geospatial support. Big data analysis. Querying and exporting data: tricky to find and access stored data; need to generate appropriate keys on load.
  • Big data repository: scale as big as you want. NoSQL database, optimized for XML / GML. Powerful search and analysis (BI, semantic queries). Stores location, not just geohash. XML-based data model: rapid XML export. Store any documents: GML, XML (metadata). Deploy on Hadoop HDFS.
  • * As applicable (e.g. can't convert raster to GML!). FME 2014's new schema-based GML writer allows FME to convert almost any CAD / GIS or even BIM data to GML or CityGML. This makes FME a very powerful loader tool for MarkLogic. FME, a natural fit to support MarkLogic: converts almost any spatial data to GML; writes almost any XML with XMLTemplater; loading XML into MarkLogic is a simple HTTP PUT operation, easily done with HTTPUploader; query, process and reconvert XML results.
  • Converting features to GML/XML usually involves a GeometryExtractor transformer or some combination of CoordinateExtractor and XMLTemplater. Key fields can be captured from the source data, or UUIDGenerator can be used to generate unique IDs for URIs etc. Build the insert message with XMLTemplater. Execute the REST PUT call with HTTPUploader. XMLTemplater template for the insert message: <?xml version="1.0" encoding="UTF-8"?><xml><docID>{fme:get-attribute("_uuid")}</docID><docAuthor>{fme:get-attribute("user")}</docAuthor><modType>{fme:get-attribute("updateType")}</modType><UpdateDate>{fme:get-attribute("_timestamp")}</UpdateDate><filePath>{fme:get-attribute("filePath")}</filePath><comment>{fme:get-attribute("comment")}</comment><doc_xml>{fme:get-xml-attribute("_file_contents")}</doc_xml></xml>
  • As simple as 1,2,3,4!
  • * need bubble here for XML/WFS – maybe a circle with something like this in it: <gml:featureMember> <gn:NamedPlace gml:id="abc.123"> <gn:geometry> <gml:Point gml:id="p.abc.123" srsName="EPSG:4258"><gml:pos>15.2 36.7</gml:pos> </gml:Point> </gn:geometry>…
  • This workspace can support the retrieval of any type of XML/GML regardless of schema. The same query workspace can be used to retrieve AIXM, INSPIRE or any other type of XML/GML. StringConcatenator composes the search GET request based on input parameters. HTTPFetcher sends the search GET request to MarkLogic. XMLFlattener flattens the response so result.uri can be exposed. A second StringConcatenator composes the document GET request based on the matching URI. A second HTTPFetcher sends the document retrieval GET request to MarkLogic. XMLFragmenter pulls out the doc_xml from the MarkLogic response. The XML writer outputs the XML as a file, or streams it to the FME Server client once the workspace is published. Search GET request to find the URI based on a query: http://localhost:8003/v1/keyvalue?element=comment&value=AIXM.Chicago Document retrieval GET request based on the URI: http://localhost:8003/v1/documents?uri=/docs/myXML_653c46c3-fdfb-4837-ae1c-49735dd29356.xml
  • For this demo the previous workspace was published to FME Server to make a feature service hosted by FME Server on top of MarkLogic. The example here supports a simple REST-based XML data stream. We could easily use this approach to build an FME Server hosted WFS on top of MarkLogic.
  • This demo shows Inspector reading AIXM5 GML directly from the GET query: http://UHURA/fmedatastreaming/Demos/QueryMarkLogicDB.fmw?Element=airportCode&Value=CYVR The query goes to FME Server’s data streaming service. FME Server uses the URL parameters to run the published QueryMarkLogicDB.fmw workspace. QueryMarkLogicDB.fmw uses the values of Element and Value to build a search request and sends that to MarkLogic. QueryMarkLogicDB.fmw uses the URI from MarkLogic’s search result to compose and submit a document request to MarkLogic. QueryMarkLogicDB.fmw extracts the feature XML from MarkLogic’s document response and streams it back to the FME Server client.
  • This shows how FME can read XML from MarkLogic and use the GeometryReplacer to convert it to virtually any format FME supports.
  • Shows how FME can be used to integrate MarkLogic and ArcGIS Server. These are the steps to move data from MarkLogic to an ArcGIS Server feature service.
  • Shows how FME can be used to integrate MarkLogic and ArcGIS Server. These are the steps to move data from an ArcGIS Server feature service to MarkLogic. Note this workflow could be event driven, real time or run as a scheduled update.
  • Workspace showing data flow from ArcGIS Server to MarkLogic. A REST call to the feature service retrieves the feature of interest. JSON is extracted and GeometryReplacer generates an FME geometry from it. GeometryExtractor renders the FME geometry as GML. The GML is added to an XML update message and posted to MarkLogic.
  • Demo #2, Limitless Spatial Indexed Database: geohash spatial index; store vector data; store raster data; store lidar data; store geotagged images by location; store and associate any document with a location.
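
The four loading steps in the notes (convert to GML, generate a key, build the insert message, PUT it to MarkLogic) can be sketched outside FME in Python. This is only an illustration under assumptions: the host, port and `/docs/myXML_<uuid>.xml` URI pattern are copied from the demo URLs, the helper names `build_insert_message` and `build_put_request` are hypothetical, and the envelope keeps just a few fields of the XMLTemplater template. The request is prepared but not sent, since sending needs a running server and credentials.

```python
import uuid
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape

# Assumption: REST endpoint taken from the localhost URLs shown in the demo.
MARKLOGIC_DOCS = "http://localhost:8003/v1/documents"


def build_insert_message(doc_id, author, comment, gml_fragment):
    """Wrap a GML fragment in an envelope modelled on the XMLTemplater template."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<xml>"
        f"<docID>{escape(doc_id)}</docID>"
        f"<docAuthor>{escape(author)}</docAuthor>"
        f"<comment>{escape(comment)}</comment>"
        f"<doc_xml>{gml_fragment}</doc_xml>"  # already XML, so inserted unescaped
        "</xml>"
    )


def build_put_request(message, doc_id):
    """Prepare (but do not send) the PUT request that stores one document."""
    uri = f"/docs/myXML_{doc_id}.xml"  # URI pattern copied from the demo
    url = MARKLOGIC_DOCS + "?" + urllib.parse.urlencode({"uri": uri})
    return urllib.request.Request(
        url,
        data=message.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


doc_id = str(uuid.uuid4())  # step 2: generate a key, as UUIDGenerator does in FME
gml = (
    '<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2" srsName="EPSG:4258">'
    "<gml:pos>15.2 36.7</gml:pos></gml:Point>"
)
message = build_insert_message(doc_id, "demo", "AIXM.Chicago", gml)
request = build_put_request(message, doc_id)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```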

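The two-step export described in the notes (a search GET to find matching URIs, then a document GET per URI) can be sketched the same way. The URL shapes come from the demo; the function names and the simplified sample search response are assumptions made for illustration, and a real MarkLogic response carries more detail than shown here.

```python
import urllib.parse
import xml.etree.ElementTree as ET

BASE = "http://localhost:8003/v1"  # host and port as in the demo URLs


def search_url(element, value):
    """Compose the key/value search request (the first StringConcatenator)."""
    query = urllib.parse.urlencode({"element": element, "value": value})
    return f"{BASE}/keyvalue?{query}"


def document_url(uri):
    """Compose the document retrieval request (the second StringConcatenator)."""
    query = urllib.parse.urlencode({"uri": uri}, safe="/")
    return f"{BASE}/documents?{query}"


def result_uris(search_response_xml):
    """Collect the uri attribute of every element that has one."""
    root = ET.fromstring(search_response_xml)
    return [el.attrib["uri"] for el in root.iter() if "uri" in el.attrib]


# Hypothetical, heavily simplified search response used only for this sketch.
sample_response = (
    '<search:response xmlns:search="http://marklogic.com/appservices/search">'
    '<search:result uri="/docs/myXML_653c46c3-fdfb-4837-ae1c-49735dd29356.xml"/>'
    "</search:response>"
)

print(search_url("comment", "AIXM.Chicago"))
for uri in result_uris(sample_response):
    print(document_url(uri))
```
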
    1. Big Data – Tap into Cloud Infrastructure with FME. March 18, 2014
    2. Meet the presenters. Don Murray, President and Co-Founder (@DonAtSafe); Dean Hintz, Senior Product Specialist (@DeanAtSafe)
    3. Ask us. And join the discussion. Please submit using the GoToWebinar panel. We will follow up on unanswered questions.
    4. Agenda. What is Big Data; Big Data Challenges; FME and Big Data; FME Demos: Loading and Extracting from MarkLogic, Spatial Indexing and Loading to DynamoDB
    5. What we do. www.safe.com
    6. Poll: Which version of FME are you using?
    7. New to FME? Get your bearings from our Getting Started page (www.safe.com/fme/getting-started). Learn from our crew in one of the weekly FME Overview webinars (safe.com/WeeklyIntro).
    8. What is Big Data?
    9. Big Data and Cloud. Big Data needs big resources: big datastores, big processing power, big bandwidth. Cloud technology gives you this for a fraction of the traditional cost!
    10. Big Data and FME. Big Data is a new data “classification” for FME. Big Data is no different than other data to FME. FME Cloud is a natural fit for data in the Cloud. FME makes it easy to leverage the power of Big Data.
    11. Big Data and FME Support. Amazon S3: limitless internet-based storage. Amazon RDS: see blog article on Amazon RDS (PostGIS). Amazon DynamoDB: limitless NoSQL database service. Amazon RedShift: petabyte-scale data warehouse service. Google BigQuery: superfast append-only tables. MarkLogic: large XML-based database.
    12. Poll: How are you currently working with Big Data?
    13. Big Data Challenges: Loading Data; Lacks spatial support; Big Data Analysis; Querying and Exporting Data
    14. Demo #1: MarkLogic. Demo #2: Limitless Spatial Database
    15. Why Demo FME with MarkLogic and DynamoDB? Different from other databases supported by FME.
    16. What is MarkLogic? NoSQL database, XML optimized; powerful search and analysis; native spatial support; XML-based data model (GML, XML, etc.); deploy on Hadoop HDFS
    17. FME and MarkLogic – A Natural Fit. Convert data to XML/GML*; easily load XML into MarkLogic with FME; process and convert XML results; FME 2014: new schema-based GML Writer
    18. Demo #1a: Loading MarkLogic. Convert GIS / CAD data to GML (XML). Compose REST request to PUT to MarkLogic database.
    19. Loading GIS to MarkLogic. 1. Convert GIS / CAD data into valid GML 2. Generate key fields 3. Build insert message 4. Execute PUT REST call. MarkLogic accepts any valid XML – just PUT it!
    20. Loading GIS to MarkLogic with FME
    21. What Big Data technology are you most interested in?
    22. Demo #1b: Exporting from MarkLogic. GET query to find URIs for features of interest. GET query using URIs to get feature XML/GML, then conversion to format of choice (CAD, GIS …) /WFS
    23. Exporting XML from MarkLogic. 1. Query database via GET request 2. Parse search result and compose GET feature request 3. Extract attributes and geometry from result 4. Validate and write XML result
    24. Exporting XML from MarkLogic. Search GET request: http://localhost:8003/v1/keyvalue?element=comment&value=AIXM.Chicago Retrieval GET request: http://localhost:8003/v1/documents?uri=/docs/myXML_653c46c3-fdfb-4837-ae1c-49735dd29356.xml
    25. AIXM from MarkLogic via FME Server: http://UHURA/fmedatastreaming/Demos/QueryMarkLogicDB.fmw?Element=airportCode&Value=CYVR /AIXM
    26. AIXM from MarkLogic via FME Server
    27. MarkLogic -> Anything (JSON, KML, GML …)
    28. MarkLogic / ArcGIS Integration. MarkLogic to ArcGIS via FME Server: 1. Submit search to MarkLogic as described earlier 2. Extract attributes and geometry from result 3. Generate update ESRIJSON message from feature 4. Post update ESRIJSON to ArcGIS Server
    29. ArcGIS Server to MarkLogic via FME Server: 1. Retrieve JSON data from ArcGIS Server 2. Generate output GML 3. Write data to MarkLogic via PUT REST call
    30. ArcGIS Server to MarkLogic
    31. Demo #2 – Limitless Spatial Database
    32. DynamoDB: NoSQL SSD-based database service; no limit on size of database; specify the needed performance; autoscale through Dynamic DynamoDB; Amazon EMR (Hadoop) integration
    33. Demo #2 – Index Strategy. Generate GeoHash index for each feature and write to GeoHashSpatialIndex
    34. Demo #2a – Vector, Raster, Lidar. Write small features to DynamoDB. Write large features to Amazon S3, link to DynamoDB.
    35. Demo #2b – Geocoded Images. Generate Geohash record of picture location. Write image to S3, link to DynamoDB.
    36. Demo #2c – Spatially Store Anything. Generate Geohash index. Write document to S3 and link to DynamoDB location.
    37. Demo #2d – Spatially Locate any internet resource. Write URI link to DynamoDB. Generate Geohash index location.
    38. What data types are you planning to store in Big Data?
    39. Save the date. Webinar: How to Automate Practically Anything with FME Server (March 25th). Webinar: How to Load Data into Google Maps Engine (April 16th). FME World Tour 2014 (April – June 2014). FME International User Conference 2014 (20th Anniversary Celebration): June 10 – 13, 2014 in Vancouver, Canada
    40. Free and fun to learn. Online Courses - Live & Hands-On: Feb 18-19: FME Desktop Tutorials & Recorded Courses
    41. Stay informed. fmepedia.com/community fmepedia.com/knowledge @SafeSoftware youtube.com/FMEChannel blog.safe.com
    42. Summary. Big Data = big new opportunities. FME is great for working with Big Data. The Cloud model is a natural fit for Big Data. This is just the beginning - more to come!
    43. Hand raising has now been enabled. If you’d like to ask a question over the air, please click the hand icon and ensure your audio input is set up.
    44. Thank you! Sales: info@safe.com. Support: www.safe.com/support, (604) 501-9985 ext. 278. Don Murray: Don.murray@safe.com. Dean Hintz: dean@safe.com
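
The Demo #2 index strategy (slide 33) relies on a geohash per feature. As a rough illustration of what such an index value is, here is a minimal Python sketch of the standard public geohash encoding: longitude and latitude ranges are halved alternately, and every five bits become one base-32 character. The function name and default precision are assumptions; the slides do not show how FME computes the value.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

Because nearby locations share a common prefix, a shorter geohash identifies a larger cell that contains the longer ones, which is what makes the value usable as a key for location lookups.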

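Slides 34 to 36 describe a split: small features go straight into DynamoDB, while large ones go to Amazon S3 with a link stored in DynamoDB. The routing decision can be sketched in plain Python without any AWS calls. Everything named here is hypothetical: the size cut-off, the bucket name, the key layout and the `plan_storage` helper are assumptions chosen for illustration, since the slides only say "small" and "large".

```python
# Assumption: size cut-off in bytes; the slides do not state one.
INLINE_LIMIT_BYTES = 64 * 1024


def plan_storage(feature_id, geohash, payload, bucket="example-bucket"):
    """Return (index_item, s3_object); s3_object is None for inline storage."""
    item = {"geohash": geohash, "feature_id": feature_id}
    if len(payload) <= INLINE_LIMIT_BYTES:
        item["payload"] = payload  # small feature: keep it in the index record
        return item, None
    key = f"features/{geohash}/{feature_id}"  # hypothetical key layout
    item["s3_uri"] = f"s3://{bucket}/{key}"  # large feature: store only a link
    return item, {"Bucket": bucket, "Key": key, "Body": payload}


small_item, small_obj = plan_storage("pt-1", "ezs42", b"<gml:Point/>")
big_item, big_obj = plan_storage("tile-9", "ezs42", b"\x00" * (INLINE_LIMIT_BYTES + 1))
print(small_obj is None, big_item["s3_uri"])
```

The returned dictionaries are shaped so that they could be handed to whichever client library performs the actual writes.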