Introduction to Lantea
.NET Open Source Big Data Solution
What is Lantea
• Open source big data platform
• Rich ETL (Extract-Transform-Load) features
• A platform that helps data scientists collect and process data easily
• Importing data from different sources is extremely easy
Highlighted features of Lantea
• Supports many data sources across different media
• Query aggregated data via SQL
• Very easy to collect data from websites, local file systems, emails and databases
• Export data in many formats and through APIs
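The idea of collecting from several sources and then querying the aggregate via SQL can be sketched as follows. This is only an illustration in Python using the standard library's sqlite3, not Lantea's API: the sample data, the `sales` table, and every function name here are assumptions made for the example.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample inputs standing in for two different data sources.
CSV_SOURCE = "region,amount\nnorth,120\nsouth,80\n"
JSON_SOURCE = '[{"region": "north", "amount": 30}, {"region": "east", "amount": 50}]'


def extract_csv(text):
    """Yield (region, amount) pairs from CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["region"], float(row["amount"])


def extract_json(text):
    """Yield (region, amount) pairs from a JSON array of objects."""
    for item in json.loads(text):
        yield item["region"], float(item["amount"])


def load(conn, rows, source):
    """Insert extracted rows into the shared table, tagged with their source."""
    conn.executemany(
        "INSERT INTO sales (source, region, amount) VALUES (?, ?, ?)",
        [(source, region, amount) for region, amount in rows],
    )


def build_database():
    """Create an in-memory database and load both sources into one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (source TEXT, region TEXT, amount REAL)")
    load(conn, extract_csv(CSV_SOURCE), "csv")
    load(conn, extract_json(JSON_SOURCE), "json")
    conn.commit()
    return conn


def totals_by_region(conn):
    """Aggregate across all sources with a plain SQL query."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(totals_by_region(build_database()))
```

Once every source lands in one table, the analysis step needs nothing beyond ordinary SQL, whatever format the data arrived in.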
Target Users of Lantea
• Data Scientists
• Marketing Analysts
• Managers who need BI
• Researchers
• Big data/BI Developers
• Deep Machine Learning Developers
[Diagram: target users grouped as non-commercial and commercial. Labels: Researchers, Data Scientists, Big data/BI Developers, Marketing Analysts, Open source developers, Managers who need BI]
Essential Elements of Big Data Platform
• Data/File Extraction
• Data Cleaning and Filtering
• Multiple Ways of Analyzing Data
• Real-time Processing
• Data Collection from Different Sources
• Connect to Different Database Types
• Analysis Result Rendering
• Advanced Parameter Adjustment
[Diagram: Big Data, surrounded by Extraction, Cleaning, Analysis, Data Processing, Data Collection, Parameter Adjustment]
Introduction to Lantea
Architecture Design and Use Case
Third-party Projects Included
• Toxy – Data Extraction framework
• Spidey – Web Spider framework
• EQueue – Queue Implementation
• CacheAdapter – Cache Provider
• Irony – Compiler Implementation
• ServiceStack.Redis – Redis Client
• ScrapySharp – HTML Parser and Selector
• Autofac – IoC Container
• log4net – Configurable Logging System
• DataTables – Web Spreadsheet
• Thinktecture Identity Server – Social account integration
• Nepy – Parsers for Natural Language Processing
License Candidates
• LGPL
• Apache 2.0
• MIT
• Custom Open Source license
Architecture Design v1
Key Features
• Web Crawling Service
• Data Extraction Service
• Queue Service
• CQLR (Common Query Language Runtime)
• Rich Output Formats and APIs
• RESTful and OData support
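How a crawling service, a queue service, and an extraction service might fit together can be sketched roughly as below. This is an assumption-laden illustration in Python, not the Lantea design itself: the in-memory `PAGES` dictionary stands in for real web requests, and `TitleParser`, `crawl`, `extract`, and `run_pipeline` are hypothetical names.

```python
import queue
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML) so the sketch runs offline.
PAGES = {
    "http://example.test/a": "<html><head><title>Page A</title></head></html>",
    "http://example.test/b": "<html><head><title>Page B</title></head></html>",
}


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def crawl(urls, out_queue):
    """Crawling stage: fetch each page and hand it to the queue."""
    for url in urls:
        out_queue.put((url, PAGES[url]))
    out_queue.put(None)  # sentinel: no more pages


def extract(in_queue):
    """Extraction stage: read pages from the queue and emit records."""
    records = []
    while True:
        item = in_queue.get()
        if item is None:
            break
        url, html = item
        parser = TitleParser()
        parser.feed(html)
        records.append({"url": url, "title": parser.title.strip()})
    return records


def run_pipeline():
    """Wire the two stages together through a queue."""
    q = queue.Queue()
    crawl(sorted(PAGES), q)
    return extract(q)


if __name__ == "__main__":
    for record in run_pipeline():
        print(record)
```

The queue decouples the two stages, so in a real deployment each could run and scale as a separate service; here they run one after the other for simplicity.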
Schedule for Lantea
Use Case 1 – Regional Manager Report Collection
Use Case 1 – Lantea Solution
Use Case 2 – Data Aggregation from Websites
Use Case 2 – Lantea Solution
– the Studio behind Lantea
Our Mission
• Re-create the .NET ecosystem
• Provide .NET-based solutions for clients
• Create something that does not yet exist for the .NET community
• Contribute to the global open source community
• Change the way people live
