Newstin Real-time Web
 Content Categorization

Presentation to WebExpo 2008



          October 18, 2008
Company Background
 Newstin a.s. founded in 1998 as I2S in Prague
 Team of 30 employees
   26 engineers
   14 nations
 Since 2005
   Real-time semantic content
     categorization
   Multiple patent filings
     on cross-language solution
 Past activities
   Business & government projects in
    information management and security
 Partnership with Business Objects/SAP
 RedHerring Europe 100 Winner Award
What is Newstin?

 Patented technology
 Largest news database, catalog of news in the world
   150,000+ information sources in 11 languages
   250,000+ articles daily fully processed into 1,000,000+ categories
   US, UK, Indian, French, German, Italian, Spanish, Mexican, Portuguese,
    Brazilian, Czech, Russian, Arabic, Chinese
   Japanese, Korean, Turkish coming in Q4 2008

 Newstin.com

 Popular user applications

 Business Intelligence

 Enterprise content organization
What is Newstin? (Details)
 Newstin is an innovative technology that takes a completely new approach to content
  organization. Newstin technology and its service-oriented architecture are the foundation of a unique
  system that features fully scalable, real-time semantic, multi-language and cross-language document
  categorization. Newstin's patented technology has the potential to become the core platform for
  organizing any unstructured textual data, including data from all sources on the Internet and potentially
  the hidden Web.

 Newstin is a powerful engine that harnesses a variety of cutting-edge technologies and implements
  linguistic processing with semantic analysis, multilevel content categorization and cross-language
  taxonomy structures. Applications of Newstin technology use its inherent ability to take context
  into account in addition to conventional keyword approaches.

 Newstin is the largest news database/catalogue in the world, currently comprising 40 million documents
  and 2.2 billion metadata items, and it is constantly growing. The Newstin article collection is continuously
  updated from over 160,000 global, weighted sources selected from a pool of over 3 million preprocessed
  sources in 12 languages. Over 200,000 articles are fully processed daily into 1.1 million categories in 15
  supported editions: US, UK, Indian, French, German, Italian, Spanish, Mexican, Portuguese, Brazilian,
  Czech, Russian, Arabic, Chinese and Korean, with more languages and editions coming soon.

 Newstin is a complex system incorporating content retrieval, metadata processing, analysis and
  visualization. The extensive operation behind Newstin makes it a perfect platform for SaaS solutions.

 Newstin is a bi-directional application in its own right. By imposing order on unstructured data, Newstin
  leverages its own extensive metadata collection for business intelligence and enterprise performance
  management. Content must be organized first to maximize knowledge-mining capability.
Web Content Chaos
 An inspiration for Newstin to develop a solution for organizing web content
Semantic Web 2.0 Organization
 A portion of Newstin’s taxonomy structure – a step toward organizing web content
Live Demonstration – Newstin.com
Live Demonstration – NewstinMap
Live Demonstration - Connecting VIP
Live Demonstration – BI Example
Live Demonstration – BI Example
Live Demonstration - EmergingStories
B2B: Online Categorization
 [Diagram: Newstin Categorization Engine (SaaS) on one side of a firewall, Enterprise Intranet on the other, connected by flows labeled "Unstructured Data" and "Metadata"]
 Enterprise Intranet
   Semantic Organization
   Contextual Search
   Visual Navigation
   Cross-language
   Mash up internal/external
 Semantic / Web 2.0 Capability to Enterprise Market
 Standard for Tagging
 Product synergy / enhancement
 Competitive advantage
Cross-language Information Retrieval
 Newstin makes it possible to reach a particular topic in all supported languages through original definitions
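
The slide does not describe how this is done internally, so the following is only a minimal sketch of the general idea: a topic gets one language-independent ID, each language has its own natively written definition for that topic, and an article is matched against the definition in its own language, so no translation is needed. Everything here is an assumption made for illustration: the `TAXONOMY` layout, the `categorize` and `build_index` helpers, the sample articles, and the use of plain keyword sets as "definitions" (a stand-in for whatever semantic and linguistic processing Newstin actually applies).

```python
from collections import defaultdict

# Hypothetical taxonomy: one language-independent category ID, with a
# definition (here just a keyword set) written natively in each language.
TAXONOMY = {
    "topic/elections": {
        "en": {"election", "ballot", "vote"},
        "de": {"wahl", "stimmzettel", "abstimmung"},
        "cs": {"volby", "hlasování"},
    },
}


def categorize(text, lang):
    """Return the IDs of categories whose definition in `lang` matches the text."""
    words = set(text.lower().split())
    return [
        cat_id
        for cat_id, definitions in TAXONOMY.items()
        if words & definitions.get(lang, set())
    ]


def build_index(articles):
    """Map each category ID to the articles filed under it, whatever their language."""
    index = defaultdict(list)
    for title, lang, text in articles:
        for cat_id in categorize(text, lang):
            index[cat_id].append((lang, title))
    return index


# Illustrative sample data: (title, language code, body text).
ARTICLES = [
    ("Poll results", "en", "Early vote counts arrive tonight"),
    ("Wahlabend", "de", "Die Wahl endet um 18 Uhr"),
    ("Počasí", "cs", "Zítra bude slunečno"),
]

index = build_index(ARTICLES)
```

Looking up `index["topic/elections"]` then returns the English and German articles together, while the Czech weather article is left out. The point of the design is that the category ID, not the article language, is the retrieval key.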
Life Cycle

 Newstin is a comprehensive information system
Presentation Summary (translated from Czech)

Main topic: Real-time web content categorization

Newstin a.s. is a Czech technology company based in Prague,
employing 30 engineers from 15 countries. Over 3.5 years it has built
a unique technology for organizing text documents in real time
using semantic and linguistic technologies. The core, patented
component of the Newstin technology is its cross-lingual
solution, which links Internet content in different languages without
the use of translation.

Newstin has built the largest up-to-date database of online news
articles in 11 world languages, including Czech, containing 37
million articles from the past 9 months and 2 billion metadata items.
Newstin servers currently process 250,000 unique articles a day
from the 160,000 most important sources around the world.

Further uses of the Newstin technology lie in media analysis and
the organization of enterprise data.
Real-time Web Content Categorization


Thank you.

Julius Rusnak
CTO

Newstin a.s.
Lomnickeho 9
140 00 Prague
Czech Republic

WebExpo 2008 Newstin

  • 1. Newstin Real-time Web Content Categorization Presentation to WebExpo 2008 October 18, 2008
  • 2. Company Background
      • Newstin a.s. founded in 1998 as I2S in Prague
      • Team of 30 employees
          • 26 engineers
          • 14 nations
      • Since 2005
          • Real-time semantic content categorization
          • Multiple patent filings on cross-language solution
      • Past activities
          • Business & government projects in information management and security
      • Partnership with Business Objects/SAP
      • RedHerring Europe 100 Winner Award
  • 3. What is Newstin?
      • Patented technology
      • Largest news database, catalog of news in the world
          • 150,000+ information sources in 11 languages
          • 250,000+ articles daily fully processed into 1,000,000+ categories
          • US, UK, Indian, French, German, Italian, Spanish, Mexican, Portuguese, Brazilian, Czech, Russian, Arabic, Chinese
          • Japanese, Korean, Turkish coming in Q4 2008
      • Newstin.com
      • Popular user applications
      • Business Intelligence
      • Enterprise content organization
  • 4. What is Newstin? (Details)
      • Newstin is an innovative technology that takes a completely new approach to content organization. Newstin technology and its service-oriented architecture are the foundation of a unique system that features fully scalable, real-time semantic, multi-language and cross-language document categorization. Newstin's patented technology has the potential to become the core platform for organizing any unstructured textual data, including data from all sources on the Internet and potentially the hidden Web.
      • Newstin is a powerful engine that harnesses a variety of cutting-edge technologies and implements linguistic processing with semantic analysis, multilevel content categorization and cross-language taxonomy structures. Applications of Newstin technology use its inherent capability to exploit context in addition to conventional keyword approaches.
      • Newstin is the largest news database/catalogue in the world, currently comprising 40 million documents and 2.2 billion metadata items, and it is constantly growing. The Newstin article collection is continuously updated from over 160,000 global and weighted sources selected from a pool of over 3 million preprocessed sources in 12 languages. Up to 200,000+ articles are fully processed daily into 1.1 million categories in 15 supported editions: US, UK, Indian, French, German, Italian, Spanish, Mexican, Portuguese, Brazilian, Czech, Russian, Arabic, Chinese and Korean; more languages and editions are coming soon.
      • Newstin is a complex system incorporating content retrieval, metadata processing, analysis and visualization. The extensive operation behind Newstin makes it a perfect platform for SaaS solutions.
      • Newstin is a bi-directional application of its own. By imposing order on unstructured data, Newstin leverages its own extensive metadata collection for business intelligence and enterprise performance management.
      • Content must be organized first to maximize knowledge-mining capability.
  • 5. Web Content Chaos: An inspiration for Newstin to develop a solution for organizing web content
  • 6. Semantic Web 2.0 Organization: A portion of Newstin’s taxonomy structure, a step toward organizing web content
  • 9. Live Demonstration - Connecting VIP
  • 12. Live Demonstration - EmergingStories
  • 13. B2B: Online Categorization
      • Diagram labels: Firewall; Enterprise Intranet; Unstructured Data; Newstin Categorization Engine
      • Semantic Organization; Contextual Search; Visual Navigation; Metadata; Cross-language; Mash up internal/external
      • Semantic / Web 2.0 Capability; SaaS to Enterprise; Market Standard for Tagging; Product synergy / enhancement; Competitive advantage
  • 14. Cross-language Information Retrieval: Newstin makes it possible to reach a particular topic in all supported languages through its original definitions
  • 15. Life Cycle: Newstin is a comprehensive information system
  • 16. Presentation Summary (CZ). Main topic: Real-time web content categorization. Newstin a.s. is a Czech technology company based in Prague, employing 30 engineers from 15 countries. Over 3.5 years it has built a unique technology for organizing text documents in real time using semantic and linguistic technologies. The key, patented component of Newstin technology is its so-called cross-lingual solution, which links Internet content in different languages without the use of translation. Newstin has built the largest current database of online news articles in 11 world languages, including Czech, containing 37 million articles from the past 9 months and 2 billion metadata items. Newstin servers currently process 250,000 unique articles a day from the 160,000 most important sources worldwide. Further uses of Newstin technology lie in media analysis and the organization of enterprise data.
  • 17. Real-time Web Content Categorization. Thank you. Julius Rusnak, CTO, Newstin a.s., Lomnickeho 9, 140 00 Prague, Czech Republic
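
The cross-language retrieval idea on slide 14 (reaching one topic in every supported language through per-language definitions, without translating the articles) could be sketched roughly as follows. This is only an illustrative guess at the general technique, not Newstin's patented method: the `TAXONOMY` data, the keyword-matching rule, and the function names `categorize` and `build_index` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical taxonomy: each language-independent category ID carries
# its own keyword definitions per language (illustrative data only).
TAXONOMY = {
    "climate_change": {
        "en": {"climate", "emissions"},
        "de": {"klimawandel", "emissionen"},
        "cs": {"klima", "emise"},
    },
    "football": {
        "en": {"football", "goal"},
        "de": {"fussball", "tor"},
        "cs": {"fotbal", "gól"},
    },
}


def categorize(text, lang):
    """Return the category IDs whose definitions for `lang` match the text."""
    words = set(text.lower().split())
    return {
        category
        for category, definitions in TAXONOMY.items()
        if words & definitions.get(lang, set())
    }


def build_index(documents):
    """Map each category ID to the documents (in any language) assigned to it."""
    index = defaultdict(list)
    for doc_id, lang, text in documents:
        for category in categorize(text, lang):
            index[category].append((doc_id, lang))
    return index


# Articles stay in their original language; only the category ID is shared.
documents = [
    ("a1", "en", "New emissions targets announced"),
    ("a2", "de", "Klimawandel bedroht die Alpen"),
    ("a3", "cs", "Fotbal rozhodl pozdní gól"),
]
index = build_index(documents)
```

Looking up `index["climate_change"]` then returns both the English and the German article, because each was matched against the definition written in its own language. A real system would presumably use linguistic and semantic analysis rather than plain keyword overlap.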