Open Platform

Published in: Technology

  1. 1. Palantir Open Platform Brian Schimpf Forward Deployed Engineer © 2008 Palantir Technologies Inc. All rights reserved.
  2. 2. Presentation Overview  Palantir is an open platform – Designed from the ground up to be open and extensible – Rich set of APIs spanning the product  Palantir works with your IT infrastructure  In this talk – Integrating with existing software ecosystem – Palantir extensibility
  3. 3. Existing IT Ecosystem  Your existing IT infrastructure – Authentication – Information Extractors – Legacy data stores – Rapidly changing data sources
  4. 4. Existing IT Ecosystem  Your existing IT infrastructure – Authentication – Information Extractors – Legacy data stores – Rapidly changing data sources
  5. 5. Authentication  Already have an existing authentication and authorization infrastructure  May have multiple authentication sources  Want to provide a unified access control solution across information sources  [Diagram labels: LDAP Credentials, Public Key Credentials]
  6. 6. Authentication Web Service  Authentication WS provides a common interface  Provides users, groups and group memberships  Allows multiple sources to be registered  [Diagram: Dispatch Server with two registered sources, an LDAP Auth Source (jdoe@source1) and a PKI Auth Source (jsmith@source2)]  Sample User A – Username: jdoe@source1, Name: Jane Doe, UID: ABCD-EFGH-IJKL, Groups: 1234, 5678  Sample User B – Username: jsmith@source2, Name: John Smith, UID: ZXYW-VUTS-RQPO, Groups: 9876, 5432
  7. 7. Authentication Web Service  Prebuilt implementation for LDAP – Compatible with Microsoft Active Directory  Implemented via SOAP-RPC – Can be arbitrarily complex  Works seamlessly with Palantir Access Control Model – ACLs can span authentication sources  Can be leveraged by other applications for authentication
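
The idea of registering several authentication sources behind one interface can be sketched as follows. This is a minimal illustration, not the Palantir Authentication Web Service: the class names `User`, `StaticAuthSource` and `AuthDispatcher` are hypothetical, and routing on the part after `@` is an assumption based on the sample usernames in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    name: str
    uid: str
    groups: tuple


class StaticAuthSource:
    """Hypothetical stand-in for an LDAP or PKI backend."""

    def __init__(self, users):
        self._users = {u.username: u for u in users}

    def lookup(self, username):
        return self._users.get(username)


class AuthDispatcher:
    """Routes a lookup to the source named after the '@' in the username."""

    def __init__(self):
        self._sources = {}

    def register(self, source_name, source):
        self._sources[source_name] = source

    def lookup(self, username):
        _, _, source_name = username.rpartition("@")
        source = self._sources.get(source_name)
        return source.lookup(username) if source else None


# Sample users taken from the slide; the source names are assumed.
dispatcher = AuthDispatcher()
dispatcher.register("source1", StaticAuthSource([
    User("jdoe@source1", "Jane Doe", "ABCD-EFGH-IJKL", ("1234", "5678")),
]))
dispatcher.register("source2", StaticAuthSource([
    User("jsmith@source2", "John Smith", "ZXYW-VUTS-RQPO", ("9876", "5432")),
]))
```

Callers see one `lookup` call regardless of which backend holds the user, which is what lets access control lists span sources.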
  8. 8. Existing IT Ecosystem  Your existing IT infrastructure – Authentication – Information Extractors – Legacy data stores – Rapidly changing data sources
  9. 9. Information Extractors  Large repositories of unstructured text  Multiple information extractors have been run across the text  Provide different types of extraction – Entities – Relationships – Metadata – Geotagging  Siloed view of each entity extractor's output  Want to combine these views alongside structured data into one interface
  10. 10. Entity Extractor SDK  Palantir provides excellent visualization and integration of entity extracted documents  Entity Extractor SDK provides common interface to all major extractors – Command line interface – SOA Web Service
  11. 11. Entity Extractor SDK  Leverages DocXML format to represent data – Can combine multiple extractor outputs into one representation – See Palantir XML Formats Presentation for more information  Standard SOAP-RPC and XML allow custom implementations in any language on any platform  Open interface and format allow the platform to be leveraged by other applications
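
Combining several extractor outputs into one representation might look like the sketch below. It does not use or reproduce the DocXML format, which the slides do not specify; the tuple layout `(start, end, entity_type)` and the function name `merge_extractions` are assumptions made for illustration.

```python
def merge_extractions(*extractor_outputs):
    """Merge per-extractor entity lists into one list keyed by text span.

    Each output is (extractor_name, [(start, end, entity_type), ...]).
    """
    merged = {}
    for extractor_name, entities in extractor_outputs:
        for start, end, entity_type in entities:
            entry = merged.setdefault((start, end), {"types": set(), "sources": set()})
            entry["types"].add(entity_type)
            entry["sources"].add(extractor_name)
    # Sort by span so the merged view follows document order.
    return [
        {"start": s, "end": e, "types": sorted(v["types"]), "sources": sorted(v["sources"])}
        for (s, e), v in sorted(merged.items())
    ]


# Spans refer to the text "Jane Doe flew to Vienna".
output_a = ("extractor_a", [(0, 8, "Person")])
output_b = ("extractor_b", [(0, 8, "Person"), (17, 23, "Location")])
combined = merge_extractions(output_a, output_b)
```

Each merged entry records which extractors agreed on a span, so the siloed views collapse into one list.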
  12. 12. Existing IT Ecosystem  Your existing IT infrastructure – Authentication – Information Extractors – Legacy data stores – Rapidly changing data sources
  13. 13. Legacy Data Systems  Multiple stovepiped sources of information  No common schema  No common interface  No common access control  Want to provide common interface for analysis and data access
  14. 14. XML APIs  Palantir XML provides a serialized form of the Palantir Object Model  Can exactly control the representation of data in Palantir – Fine grained access control – Tracking of pedigree and lineage  See Palantir XML Formats Presentation for more information
  15. 15. Existing IT Ecosystem  Your existing IT infrastructure – Authentication – Information Extractors – Legacy data stores – Rapidly changing data sources
  16. 16. Rapidly Changing Data Sources  New data sources come on line all the time  Want to easily integrate this content with existing data to discover new information  Palantir has flexible user interface and backend data import utilities – Easy to quickly map new datasets – Handles popular unstructured document formats – Rapidly transforms structured import sources • Flat files • Excel spreadsheets • Relational databases
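
Mapping a new flat file onto existing property names can be sketched in a few lines. This is an assumed illustration of the idea, not the Palantir import utility: the column names, the `COLUMN_TO_PROPERTY` mapping and the `map_rows` helper are all hypothetical.

```python
import csv
import io

# Hypothetical mapping from source columns to object property names.
COLUMN_TO_PROPERTY = {"full_name": "Name", "nat": "Nationality", "org": "Organization Name"}


def map_rows(flat_file, mapping):
    """Yield one property dict per row, renaming mapped columns and skipping blanks."""
    for row in csv.DictReader(flat_file):
        yield {
            prop: row[col].strip()
            for col, prop in mapping.items()
            if row.get(col) and row[col].strip()
        }


sample = io.StringIO("full_name,nat,org\nJane Doe,Austrian,\nJohn Smith,,Acme\n")
objects = list(map_rows(sample, COLUMN_TO_PROPERTY))
```

Onboarding another dataset then means writing a new mapping rather than new import code.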
  17. 17. Data Quality  This looks great, but…  The quality of analysis is only as good as the data that goes into it  Palantir handles dirty data – Attempts to parse and validate attribute values – Unparseable, incomplete or invalid data is still allowed and indexed but does not clutter the system  Data parsing and validation framework is extensible through the Palantir Ontology APIs
  18. 18. Object Model
  19. 19. Property Ontology APIs  The Palantir Ontology APIs enable developers to extend the functionality of the Property Ontology
  20. 20. Structure of a Property  Two types of property values are supported in Palantir – Simple • Used for single, unparsed values • e.g. Nationality, Organization Name – Composite • Used for values composed of discrete, semantic units • e.g. Name (first & last), Address (city, state, zip, etc.)
  21. 21. Lifecycle of a Property  Extract – Extracted Value: raw property value  Transform – Maker: transforms raw string – Validator: validates components – Approx Gen: generates approxes  Load – Palantir Data Store: store to database and index
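
The extract, transform, load lifecycle can be sketched as a small pipeline of pluggable stages. Everything here is hypothetical: `run_lifecycle` and the three name-handling stages are illustrative stand-ins, not the Palantir Ontology APIs, and a plain list stands in for the data store. Following the Data Quality slide, values that fail to parse are assumed to be stored anyway and flagged.

```python
def run_lifecycle(raw_value, maker, validator, approx_generator, store):
    """Transform a raw string, validate it, generate approxes, then store it.

    Values that fail to parse or validate are still stored, flagged invalid.
    """
    try:
        components = maker(raw_value)
        valid = validator(components)
    except ValueError:
        components, valid = {}, False
    approxes = approx_generator(components) if valid else []
    record = {"raw": raw_value, "components": components, "valid": valid, "approxes": approxes}
    store.append(record)
    return record


def name_maker(raw):
    """Split 'First Last' into components; raise ValueError if there is no last name."""
    first, _, last = raw.strip().partition(" ")
    if not last:
        raise ValueError("expected 'First Last'")
    return {"first": first, "last": last}


def name_validator(components):
    """Accept only purely alphabetic name parts."""
    return all(part.isalpha() for part in components.values())


def name_approxes(components):
    """Index the lowercased last name as a single approx."""
    return [components["last"].lower()]


store = []
good = run_lifecycle("John Smith", name_maker, name_validator, name_approxes, store)
bad = run_lifecycle("Smith", name_maker, name_validator, name_approxes, store)
```

Because each stage is passed in as a function, a new property type only needs its own maker, validator and approx generator.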
  22. 22. Property Maker  Data parsing interface  Transform tool that can be leveraged by both XML APIs and standard import interface  In: String with value “John Smith”  Out: Name Property with  First Name: John  Last Name: Smith
  23. 23. Property Parser API  In: Lindengasse 24-9, A-1020 Vienna  Out: Address Property with – Address 1: Lindengasse 24-9 – City: Vienna – Postal Code: 1020 – Country: AT
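
A parser producing the output on this slide could be sketched with one regular expression. This is an assumption-laden illustration, not the Property Parser API: it handles only the `street, CC-postal city` layout of the example, and the prefix table maps just `A` to `AT`.

```python
import re

# Assumed prefix-to-ISO mapping; only the entry needed for the slide's example.
COUNTRY_PREFIX_TO_ISO = {"A": "AT"}

ADDRESS_PATTERN = re.compile(
    r"^(?P<address1>[^,]+),\s*(?P<prefix>[A-Z]{1,3})-(?P<postal>\d{4,5})\s+(?P<city>.+)$"
)


def parse_address(raw):
    """Split 'street, CC-postal city' into components, or return None if it does not match."""
    match = ADDRESS_PATTERN.match(raw.strip())
    if match is None:
        return None
    return {
        "Address 1": match["address1"].strip(),
        "City": match["city"].strip(),
        "Postal Code": match["postal"],
        "Country": COUNTRY_PREFIX_TO_ISO.get(match["prefix"]),
    }


parsed = parse_address("Lindengasse 24-9, A-1020 Vienna")
```

Returning `None` for unmatched input leaves the caller free to keep the raw string, in line with the deck's handling of unparseable data.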
  24. 24. Property Validator  Data validation interface  Presents notification to the user if the property does not pass validation  Passport Machine Readable Zone – Encoding of passport information in standardized form – Includes checksum after each field P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<< L898902C<3UTO6908061F9406236ZE184226B<<<<<14  Validator can verify checksum digits
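
Checksum verification for the sample Machine Readable Zone can be sketched as below. The check digit rule (weights 7, 3, 1 repeating, letters valued 10 to 35, `<` valued 0, sum modulo 10) is the one defined in ICAO Doc 9303; the function names and the field offsets for a 44-character passport line are written out here as an illustration and are not the Palantir validator interface.

```python
def mrz_check_digit(field):
    """Compute the ICAO 9303 check digit: weights 7, 3, 1 repeating, sum mod 10."""
    weights = (7, 3, 1)
    total = 0
    for index, char in enumerate(field):
        if char in "0123456789":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[index % 3]
    return total % 10


def validate_td3_line2(line):
    """Check each field of the second line of a passport (TD3) MRZ.

    Does not handle the case where an empty personal number carries '<'
    in place of its check digit.
    """
    if len(line) != 44:
        raise ValueError("TD3 MRZ lines are 44 characters long")
    checks = {
        "document_number": (line[0:9], line[9]),
        "date_of_birth": (line[13:19], line[19]),
        "date_of_expiry": (line[21:27], line[27]),
        "personal_number": (line[28:42], line[42]),
        "composite": (line[0:10] + line[13:20] + line[21:43], line[43]),
    }
    return {name: str(mrz_check_digit(field)) == digit for name, (field, digit) in checks.items()}


# Second line of the sample MRZ shown on the slide.
result = validate_td3_line2("L898902C<3UTO6908061F9406236ZE184226B<<<<<14")
```

A per-field result makes it possible to tell the user which part of the passport data failed validation.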
  25. 25. Approx Generator  Fuzzy searching interface  Property can support multiple types of approxes  Approxes are indexed for fast searching  Example: Arabic Name Normalization – Transliterate name to Arabic
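
Generating more than one approx per value can be sketched as follows. This is not the Arabic name normalization from the slide, whose details are not given in the transcript; it swaps in two simple Latin-script techniques (diacritic stripping and a consonant skeleton) purely to show how several approxes of one name could be produced for indexing. All function names are hypothetical.

```python
import unicodedata


def normalize(name):
    """Lowercase and strip diacritics so accented variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def consonant_skeleton(name):
    """Drop vowels, then collapse consecutive duplicate letters that remain."""
    skeleton = []
    for ch in normalize(name):
        if ch.isalpha() and ch not in "aeiou" and (not skeleton or skeleton[-1] != ch):
            skeleton.append(ch)
    return "".join(skeleton)


def generate_approxes(name):
    """Return every approx to index for one name value."""
    return {"normalized": normalize(name), "skeleton": consonant_skeleton(name)}


first = generate_approxes("Mohammed")
second = generate_approxes("Muhammad")
```

Two spellings that differ as exact strings share a skeleton, so an index on that approx would return both for a fuzzy search.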
  26. 26. ETL Tools  The Palantir Ontology APIs allow you to customize Palantir’s data handling  Extensions are leveraged across all imports – Data integration without a complex ETL toolchain  Works with manually entered or tagged data as well
  27. 27. Palantir Extensibility  I have all this data in Palantir, now what?  I need to extract the information for some other tool  I need to present the information to the user in a different way  Client Connection API allows for all these operations
  28. 28. Client Connection API  Used by Palantir Workspace  Proxies all requests to Dispatch  Written in Java  Get started coding in 5 minutes  Provides abstraction for – Object Model – Revisioning DB – Access Control Model  [Diagram labels: Client, Spring HTTP RPC, Dispatch Server]
  29. 29. SOA Data Interface  Client Connection API used to provide SOA WS interface to Palantir  Examples – Searching • Works across all data sources • SearchQuery and SearchAroundQuery classes – Entity Extractor tuning • Retrieve manually edited tagging to train entity extractor • getAppEventObjectInfo for begin and end date – Revisioning database • Extract history of changes to objects • DBEvent class
  30. 30. Custom Presentation  Simple access to data for custom presentation  Searching and storing objects requires a few API calls  Examples – Data entry forms • Standard border crossing forms • createBlankObject, Property.attemptToCreate – Report generation • Report on changes in activity • SearchQuery and HGBin classes – Thin client graph presentation • Transform graph to HTML • Graph class
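
The thin client idea of transforming a graph to HTML can be sketched generically. This does not use the Palantir `Graph` class, whose interface is not shown in the slides; the node dictionary, the edge triples and the `graph_to_html` function are assumptions for illustration only.

```python
from html import escape


def graph_to_html(nodes, edges):
    """Render nodes and their outgoing edges as a nested HTML list, escaping all labels."""
    lines = ["<ul>"]
    for node_id, label in nodes.items():
        lines.append(f"  <li>{escape(label)}")
        neighbours = [(rel, dst) for src, rel, dst in edges if src == node_id]
        if neighbours:
            lines.append("    <ul>")
            for rel, dst in neighbours:
                lines.append(f"      <li>{escape(rel)}: {escape(nodes[dst])}</li>")
            lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)


# Hypothetical two-node graph: node id -> label, and (source, relation, target) edges.
nodes = {"p1": "Jane Doe", "o1": "Acme & Co"}
edges = [("p1", "employee of", "o1")]
page = graph_to_html(nodes, edges)
```

Escaping every label keeps object names containing characters such as `&` or `<` from breaking the generated page.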
  31. 31. Client Connection API  Provides simple and powerful access to Palantir data  Functionality of application plus more  Complete web-based viewer application written in under 6 hours
  32. 32. Summary  Palantir integrates with and becomes a part of your infrastructure  Can unify your authentication, information extraction and data resources in one environment  Provides a rich platform that can be leveraged in other projects
  33. 33. Palantir Open Platform Brian Schimpf Forward Deployed Engineer © 2008 Palantir Technologies Inc. All rights reserved.
