THE STAT PROJECT
Requirement Analysis
Milestone 1 Report
QUESTIONS
To design a framework, how many variations do we need to protect against? How many functionalities do we need to provide to support all these variations?
Variations for importing datasets (file sources)
Variations for importing datasets (file formats)
Variations for importing datasets (schemas)
Even if we only consider datasets in XML, each dataset may have its own schema.
Reuters dataset example
Simplified approach
Observation: for the sake of comparison, researchers usually work with a few well-known datasets (e.g., Reuters, RCV-1).
Able to persist and read back in-memory objects
Able to visualize in-memory objects
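The persistence requirement above can be illustrated with a minimal sketch. It assumes Java's built-in object serialization; the slides do not say which mechanism STAT uses, and the names SavedDocument and ObjectStore are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical in-memory object; the field names are illustrative only.
class SavedDocument implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final String body;

    SavedDocument(String title, String body) {
        this.title = title;
        this.body = body;
    }
}

// Hypothetical helper that persists an object to bytes and reads it back.
class ObjectStore {
    // Serializes the object into a byte array (a file stream would work the same way).
    static byte[] persist(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object from previously persisted bytes.
    static Object readBack(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ObjectStore.readBack(ObjectStore.persist(doc)) should yield a copy of doc with the same field values.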
STAT (brief) Domain Model
Note: Connector labels are omitted for brevity. Some connections are not drawn because of space limitations.
STAT framework sample code (conceptual)
 
Domain Concept: RawCorpus
A collection of RawDocument objects that supports collection operations:
- Add a new RawDocument element
- Remove an existing RawDocument element
- Access elements in the collection
- …
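Domain Concept: RawCorpus
import java.util.List;

abstract class RawCorpus {
    // Backing collection of unprocessed documents.
    List<RawDocument> rawDocuments;

    abstract RawDocument getDocument(int index);
    abstract void setDocument(int index, RawDocument doc);
    abstract void removeDocument(int index);
}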
Domain Concept: RawDocument
An object with one or more string fields that serves as an unprocessed, in-memory representation of a document unit.
- Like a JavaBean, with getters and setters
- All fields must be of string type, even numeric ones
Domain Concept: RawDocument
abstract class RawDocument {
    public RawDocument() {}
}

class MyRawDocument extends RawDocument {
    String title;
    String author;
    String body;
    String date;
    String numOfClicks;  // numeric values are also kept as strings
    String topicType;
    …
}
Domain Concept: Processor
An object that processes a RawCorpus and produces a Corpus.
- Linguistic: Tokenizer, Stemmer, StopRemover, PosTagger, …
- Machine learning: feature-specific, document-specific
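As a rough illustration of how linguistic processors could be chained, here is a minimal sketch. The TokenProcessor interface, the ProcessorChain class, and the list-of-strings representation are assumptions made for this example, not the STAT API; only the names Tokenizer and StopRemover echo the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical step interface: each processor maps a list of strings to a new list.
interface TokenProcessor {
    List<String> apply(List<String> input);
}

// Splits each input text on whitespace and lowercases the pieces.
class SimpleTokenizer implements TokenProcessor {
    public List<String> apply(List<String> texts) {
        List<String> tokens = new ArrayList<>();
        for (String text : texts) {
            for (String piece : text.trim().split("\\s+")) {
                if (!piece.isEmpty()) {
                    tokens.add(piece.toLowerCase(Locale.ROOT));
                }
            }
        }
        return tokens;
    }
}

// Drops tokens that appear in a caller-supplied stop-word set.
class StopRemover implements TokenProcessor {
    private final Set<String> stopWords;

    StopRemover(String... stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    public List<String> apply(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}

// Runs several processors in order, feeding each output into the next step.
class ProcessorChain implements TokenProcessor {
    private final List<TokenProcessor> steps;

    ProcessorChain(TokenProcessor... steps) {
        this.steps = Arrays.asList(steps);
    }

    public List<String> apply(List<String> input) {
        List<String> current = input;
        for (TokenProcessor step : steps) {
            current = step.apply(current);
        }
        return current;
    }
}
```

Treating the chain itself as a TokenProcessor keeps composition uniform: a Stemmer or PosTagger step could be slotted in without changing the calling code.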
Domain Concept: Corpus
An object representing a collection of Document objects for use by the machine learning side of the framework. It provides a notion of splits (e.g., train, test), which is commonly used.
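The notion of splits could be modeled as named groups of documents. The following is a sketch under that assumption; the fields of Document and the method names add, getSplit, and size are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical processed document: tokens plus a target label.
class Document {
    final List<String> tokens;
    final String label;

    Document(List<String> tokens, String label) {
        this.tokens = tokens;
        this.label = label;
    }
}

// Sketch of a Corpus that groups Document objects into named splits.
class Corpus {
    private final Map<String, List<Document>> splits = new LinkedHashMap<>();

    // Adds a document to the named split, creating the split on first use.
    void add(String split, Document doc) {
        splits.computeIfAbsent(split, name -> new ArrayList<>()).add(doc);
    }

    // Returns a read-only view of one split (empty if the split is unknown).
    List<Document> getSplit(String split) {
        List<Document> docs = splits.get(split);
        if (docs == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(docs);
    }

    // Total number of documents across all splits.
    int size() {
        int total = 0;
        for (List<Document> docs : splits.values()) {
            total += docs.size();
        }
        return total;
    }
}
```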
Domain Concept: Trainer
A representation of a machine learning algorithm that can learn from a Corpus and produce a Model.
Domain Concept: Model
An object that a machine learning algorithm (i.e., a Trainer) creates to store the parameters "learned" from the data (i.e., a Corpus).
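To make the Trainer and Model relationship concrete, here is a deliberately trivial sketch: a trainer that "learns" the most frequent label. The majority-label algorithm, the class names, and the use of a plain label list in place of a Corpus are all assumptions chosen for brevity, not part of the STAT design.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the parameters a Trainer learns (hypothetical fields).
class MajorityModel {
    final String majorityLabel;
    final Map<String, Integer> labelCounts;

    MajorityModel(String majorityLabel, Map<String, Integer> labelCounts) {
        this.majorityLabel = majorityLabel;
        this.labelCounts = labelCounts;
    }
}

// Hypothetical Trainer interface; a list of labels stands in for a Corpus.
interface Trainer {
    MajorityModel train(List<String> trainingLabels);
}

// Learns the most frequent label; ties go to the label seen first.
class MajorityTrainer implements Trainer {
    public MajorityModel train(List<String> trainingLabels) {
        // LinkedHashMap keeps first-seen order, which makes tie-breaking deterministic.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : trainingLabels) {
            counts.merge(label, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return new MajorityModel(best, counts);
    }
}
```

The point of the separation is that the Model is plain data: it can be stored or inspected without re-running the Trainer.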
Domain Concept: Classifier
An object that maps Documents to target values (label, number, probability). It takes a Corpus and a Model as inputs and produces a Prediction for the Corpus according to the Model.
Domain Concept: Prediction
A collection of target values (label, number, probability) associated with a Corpus, i.e., with a collection of Document objects.
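The Classifier and Prediction concepts might fit together as in the sketch below. It assumes a toy keyword-lookup model and represents the corpus as a list of token lists; KeywordModel, KeywordClassifier, and the Prediction methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Toy model (an assumption for this sketch): keyword-to-label rules plus a fallback label.
class KeywordModel {
    final Map<String, String> keywordToLabel;
    final String defaultLabel;

    KeywordModel(Map<String, String> keywordToLabel, String defaultLabel) {
        this.keywordToLabel = keywordToLabel;
        this.defaultLabel = defaultLabel;
    }
}

// Target values aligned by index with the documents of the classified corpus.
class Prediction {
    private final List<String> labels;

    Prediction(List<String> labels) {
        this.labels = Collections.unmodifiableList(new ArrayList<>(labels));
    }

    String labelAt(int index) {
        return labels.get(index);
    }

    int size() {
        return labels.size();
    }
}

// Takes a corpus (here: one token list per document) and a model, and produces a Prediction.
class KeywordClassifier {
    Prediction classify(List<List<String>> corpus, KeywordModel model) {
        List<String> labels = new ArrayList<>();
        for (List<String> tokens : corpus) {
            String label = model.defaultLabel;
            for (String token : tokens) {
                String hit = model.keywordToLabel.get(token);
                if (hit != null) {
                    label = hit;  // the first matching keyword decides the label
                    break;
                }
            }
            labels.add(label);
        }
        return new Prediction(labels);
    }
}
```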
Domain Concept: Evaluator
An object that compares a Prediction against its associated Corpus and generates an Evaluation.
Domain Concept: Evaluation
A summarized representation of the evaluation result produced by an Evaluator.
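One possible Evaluator compares predicted labels with gold labels and summarizes the result as accuracy. The sketch below assumes that metric and uses plain label lists in place of Prediction and Corpus objects; AccuracyEvaluator and the Evaluation fields are hypothetical.

```java
import java.util.List;

// Summarized result: counts plus a derived accuracy.
class Evaluation {
    final int correct;
    final int total;

    Evaluation(int correct, int total) {
        this.correct = correct;
        this.total = total;
    }

    // Fraction of documents whose predicted label matches the gold label.
    double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }
}

// Compares predicted labels against the gold labels of the associated corpus.
class AccuracyEvaluator {
    Evaluation evaluate(List<String> predicted, List<String> gold) {
        if (predicted.size() != gold.size()) {
            throw new IllegalArgumentException("prediction and corpus sizes differ");
        }
        int correct = 0;
        for (int i = 0; i < gold.size(); i++) {
            if (predicted.get(i).equals(gold.get(i))) {
                correct++;
            }
        }
        return new Evaluation(correct, gold.size());
    }
}
```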
THE STAT PROJECT Thanks
STAT (brief) Domain Model (diagram)
Note: Connector labels are omitted for brevity. Some connections are not drawn because of space limitations.
Entities: Corpus, Reader, Processor, RawCorpus, Trainer, Model, Classifier, Prediction, Evaluator, Evaluation, Writer, Vocabulary
STAT Domain Model (diagram)
Note: Connector labels are omitted for brevity.
Entities: Corpus, Reader, Processor, RawCorpus, Trainer, Model, Classifier, Prediction, Evaluator, Evaluation, Writer
STAT Domain Model (diagram)
Note: Connector labels are omitted for brevity.
Entities: Corpus, Reader, Processor, RawCorpus, Trainer, Model, Classifier, Prediction, Evaluator, Evaluation, Document, RawDocument
