Mathieu d'Aquin - State and future of linked data in learning analytics
The document outlines the agenda for a tutorial on using linked data in learning analytics. It discusses the current and potential uses of linked data in learning analytics, including as a data modeling approach, as a data source, and for ontological models and integration. It also summarizes results from the LAK Data Challenge, which involved using linked data for tasks like statistical analysis, network analysis, recommendation, and visualization. While linked data is not yet widely used in learning analytics, the document argues it will become more standard as the benefits become better understood and more education and resources are provided to the community.
The document summarizes information about the AIST 2014 conference, including that it received 74 submissions, mostly from Russian authors. Key highlights included the selection process, publication of accepted papers in Springer's Communications in Computer and Information Science series, and organization by a committee from Russia, UK, and other countries. It also lists sponsors and invited speakers for the conference.
Alexander Mikov - Program Tools for Dynamic Investigation of Social Networks (AIST)
This document describes a simulation software tool called Triad.Net for investigating dynamic social networks. It allows modeling social networks and simulating information diffusion. Triad.Net includes components for model design, debugging, output analysis, security, and load balancing in distributed simulations. The software represents models using layers for structure, behavior, and messaging. It supports graph operations and standard network topologies. Triad.Net aims to help analyze hidden dependencies, structural properties, and conditions that impact simulation runs. Experiments show it can reduce rollback costs compared to optimistic simulation algorithms.
Artem Lukanin - Normalization of Non-Standard Words with Finite State Transd... (AIST)
This document discusses text normalization for Russian speech synthesis. It introduces Normatex, an open-source Russian text normalization system using finite state transducers. Normatex expands non-standard words like numbers, abbreviations, and acronyms. It achieved 84.33% recall and 93.95% precision on a test corpus. The document provides details on Normatex's normalization of numbers, acronyms, abbreviations, and its finite state transducers. Further improvements to Normatex are still underway.
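Normatex implements this with finite state transducers over Russian text; as a rough illustration of the same idea, a dictionary- and rule-based expander for English might look like the sketch below. The dictionaries and rules here are invented for the example and are not Normatex's actual rules.

```python
import re

# Toy stand-in for transducer-based normalization, in English rather than
# Russian. The dictionaries and rules are invented for illustration.
ABBREVIATIONS = {"Dr.": "doctor", "St.": "street"}
DIGIT_WORDS = {"0": "zero", "1": "one", "2": "two", "3": "three",
               "4": "four", "5": "five", "6": "six", "7": "seven",
               "8": "eight", "9": "nine"}

def normalize(text: str) -> str:
    # Expand known abbreviations first.
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    # Spell out single digits; a real system handles full cardinals,
    # ordinals, dates, phone numbers, etc.
    text = re.sub(r"\d", lambda m: DIGIT_WORDS[m.group(0)] + " ", text)
    return re.sub(r"\s+", " ", text).strip()

print(normalize("Dr. Ivanov lives at 5 Main St."))
```

A real transducer composes such rewrite rules into a single finite-state machine, which is what makes systems like Normatex fast and analyzable.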
Sergey Nikolenko (Steklov Mathematical Institute at St. Petersburg, Laboratory for Internet Studies, National Research University Higher School of Economics, St. Petersburg)
Probabilistic rating systems
Iosif Itkin - Network models for exchange trade analysis (AIST)
The document discusses software testing tools from Exactpro Systems for validating trading systems and ensuring data reconciliation. It introduces several tools the company offers: ClearTH for post-trade testing; MiniRobots for multi-threaded Java testing; Dolphin for market surveillance testing; Shsha for post-transactional analysis; Load Injector for load testing; and Sailfish for end-to-end testing. It also provides background on software quality assurance processes and examples of financial technology failures like the 2012 Knight Capital incident and issues with Facebook's NASDAQ IPO cross.
Verichev, Fedoseev - Robust Image Watermarking on Triangle Grid of Feature Points (AIST)
Alexander Verichev, Viktor Fedoseev (Samara State Aerospace University; Image Processing Systems Institute of the RAS, Samara)
Robust Image Watermarking on Triangle Grid of Feature Points
AIST Conference 2015 http://aistconf.org
Sergey Zaika and Andrew Toporkov - Semantic Web on Duty of E-Learning: Ontol... (AIST)
The document discusses an ontological approach for educating programmers using semantic web technologies. The authors developed an ontological model with 128 nodes and part-of relations. They also generated test questions using techniques like homonyms, quasi-synonyms, paronyms, and antonyms to differentiate meanings based on context. The authors thank the audience for their attention and provide their contact information.
Alexandra Barysheva - Building Profiles of Blog Users Based on Comment Graph ... (AIST)
The document presents a method for building profiles of blog users based on analyzing comment graphs. The goal is to develop a language-independent tool to retrieve user profiles from online communities. The method studies user interactions in comment graphs, identifies attributes that can be retrieved from the graphs, and designs a profiling technique. It was tested on a dataset from Habrahabr involving over 2000 users. The results identified 5 types of user profiles based on clustering attributes like comments posted, received, and average distance in the graph. Further work could experiment on larger datasets and incorporate text from posts and comments.
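The graph attributes mentioned above (comments posted, comments received, average distance in the graph) can be sketched as follows; the comment edges are hypothetical and stand in for the Habrahabr data:

```python
from collections import defaultdict, deque

# Hypothetical comment edges: (commenter, author of the post commented on).
edges = [("ann", "bob"), ("bob", "ann"), ("carl", "ann"),
         ("ann", "carl"), ("bob", "carl")]

posted = defaultdict(int)    # comments the user wrote
received = defaultdict(int)  # comments the user received
graph = defaultdict(set)     # undirected interaction graph
for src, dst in edges:
    posted[src] += 1
    received[dst] += 1
    graph[src].add(dst)
    graph[dst].add(src)

def avg_distance(user):
    # Average BFS distance from `user` to every other reachable user.
    dist = {user: 0}
    queue = deque([user])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for node, d in dist.items() if node != user]
    return sum(others) / len(others) if others else 0.0

profiles = {u: (posted[u], received[u], avg_distance(u)) for u in list(graph)}
```

Clustering users on such attribute tuples (as the paper does, yielding five profile types) would then be a separate step on top of these vectors.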
Nikolay Karpov - Evolvable Semantic Platform for Facilitating Knowledge Exchange (AIST)
This document describes an evolvable semantic platform called EXPERTIZE that was developed to facilitate knowledge exchange between experts at a university. EXPERTIZE analyzes unstructured text from news and matches it to the skills of university experts, as defined in their personal ontologies, in order to recommend relevant experts. It uses a latent Dirichlet allocation algorithm to perform the semantic matching. The system was implemented and evaluated, showing its ability to successfully recommend experts and categories for news items.
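A minimal sketch of the matching step, assuming the LDA topic distributions have already been inferred (the expert names and topic mixtures below are invented): a news item is assigned to the expert whose topic profile is most cosine-similar.

```python
import math

# Toy stand-in for EXPERTIZE's matching: the system infers LDA topic
# distributions from text; here the distributions are assumed given.
experts = {
    "expert_a": [0.7, 0.2, 0.1],   # hypothetical topic mixtures
    "expert_b": [0.1, 0.1, 0.8],
}
news_topics = [0.2, 0.1, 0.7]      # topic mixture of an incoming news item

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

best = max(experts, key=lambda name: cosine(experts[name], news_topics))
```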
Dmitry Berg, Olga Zvereva - Identification Of Autopoietic Communication Patte... (AIST)
Dmitry Berg, Olga Zvereva (Ural Federal University)
Identification Of Autopoietic Communication Patterns In Social And Economic Networks
Alexander Panchenko - Human and Machine Judgements about Russian Semantic Re... (AIST)
This document describes several datasets for evaluating semantic relatedness measures in Russian, including:
1) Human judgement datasets containing word pairs translated from English benchmarks and rated by humans on similarity.
2) The RuThes dataset containing synonyms and relations from a Russian thesaurus.
3) Machine judgement datasets created by combining results from shared tasks evaluating systems' ability to determine semantic relatedness of Russian word pairs.
4) An open Russian distributional thesaurus created using a skip-gram model on a large Russian text corpus.
Elena Bolshakova and Natalia Efremova - A Heuristic Strategy for Extracting T... (AIST)
The document describes a heuristic strategy for extracting terms from scientific texts. It discusses approaches to term extraction, including using statistical and linguistic criteria from large corpora or single texts. It also outlines developing term extraction procedures based on analyzing term types, structures, and contexts through linguistic patterns. The strategy was tested on Russian computer science and physics texts and compared to term dictionaries.
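Pattern-based extraction of term candidates can be illustrated with a toy rule over POS-tagged tokens; the ADJ* NOUN+ pattern and the tag set here are simplifications of the linguistic patterns the paper develops, and the tags are assumed to come from an upstream tagger.

```python
# Collect maximal ADJ* NOUN+ sequences as multi-word term candidates.
tagged = [("efficient", "ADJ"), ("term", "NOUN"), ("extraction", "NOUN"),
          ("is", "VERB"), ("a", "DET"), ("hard", "ADJ"), ("task", "NOUN")]

def extract_terms(tokens):
    terms, current, seen_noun = [], [], False
    for word, tag in tokens + [("", "END")]:   # sentinel flushes the last run
        if tag == "ADJ" and not seen_noun:
            current.append(word)               # adjectives may open a term
        elif tag == "NOUN":
            current.append(word)               # nouns extend and close it
            seen_noun = True
        else:
            if seen_noun:                      # flush only if a noun was seen
                terms.append(" ".join(current))
            current, seen_noun = [], False
    return terms

print(extract_terms(tagged))
```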
Dmitrii Stepanov, Aleksandr Bakhshiev, D. Gromoshinsky, N. Kirpan, F. Gundelakh - ... (AIST)
Dmitrii Stepanov, Aleksandr Bakhshiev, D. Gromoshinsky, N. Kirpan, F. Gundelakh (Central Research and Development Institute of Robotics and Technical Cybernetics)
Determination Of The Relative Position Of Space Vehicles By Detection And Tracking Of Natural Visual Features With The Existing Tv-Cameras
Alexander Panchenko, Dmitry Babaev and Sergey Objedkov - Large-Scale Parallel... (AIST)
This document summarizes a study on matching social network profiles across different platforms at large scale. The researchers matched profiles from VKontakte (VK), a Russian social network, to Facebook profiles. They collected a training set of over 92,000 manually matched VK-Facebook profile pairs. An algorithm was developed to generate candidates, rank them by friend overlap between profiles, and select the best match. The algorithm achieved a precision of 0.98 and a recall of 0.54 when matching over 600,000 additional VK profiles to Facebook. The method was computationally efficient and easily parallelizable across 100 Amazon Web Services nodes.
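The candidate-ranking step can be sketched as follows, assuming friend lists are available on both sides. The profiles, friend sets, and the use of Jaccard similarity are illustrative assumptions, not the paper's exact scoring function.

```python
# Rank Facebook candidates for one VK profile by friend-set overlap.
vk_friends = {"anna_vk", "boris_vk", "dina_vk"}  # friends of the VK profile
candidates = {                                   # FB candidates, with friends
    "fb_user_1": {"anna_vk", "boris_vk"},        # already mapped to VK ids
    "fb_user_2": {"dina_vk"},
    "fb_user_3": set(),
}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

ranked = sorted(candidates,
                key=lambda c: jaccard(vk_friends, candidates[c]),
                reverse=True)
best_match = ranked[0]
```

Because each VK profile is scored independently, this step parallelizes trivially across worker nodes, which matches the paper's deployment on 100 AWS nodes.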
Marina Danshina - The methodology of automated decryption of Znamenny chants (AIST)
1. The researchers created an automated system called "Computer Semiography" to decode Znamenny chants. The system consists of 5 modules: inputting chants into a database, reviewing manuscripts, forming linguistic and translation models, decoding chants, and a music editor.
2. The system can decode chants using either linear or Znamenny notation by applying rules from dictionaries and books. Researchers first build dictionaries from sources then use them to transform manuscripts into a linear notation.
3. The methodology produces three main components for decoding chants: a dictionary with translation rules, a translated manuscript version, and language and translation models. The work also developed software to input, edit, and view chants in the database.
Valeri Labunets - Fast multiparametric wavelet transforms and packets for ima... (AIST)
This document discusses multiparametric wavelet transforms (WDT). It describes the structure of WDT, which is characterized by two sets of coefficients (h-coefficients and g-coefficients) related by a matrix equation. It also discusses arbitrary cyclic wavelet transforms (AWT) and their representation using Jacobi-Givens rotations and stairs-like structures. An example of an 8-level AWT is presented using these concepts.
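The Jacobi-Givens rotation underlying these stairs-like structures can be shown on a single pair of samples; this sketch uses one common parameterization, and for the angle pi/4 the step coincides (up to sign conventions) with the Haar wavelet step.

```python
import math

# One Givens rotation as the elementary building block of parametric
# wavelet transforms: rotating a pair of samples by angle t mixes them
# into a low-pass / high-pass pair.
def givens_step(x0, x1, t):
    c, s = math.cos(t), math.sin(t)
    return (c * x0 + s * x1, -s * x0 + c * x1)

# For t = pi/4, a constant pair maps entirely to the low-pass output
# (the high-pass coefficient vanishes), as in the Haar transform.
low, high = givens_step(1.0, 1.0, math.pi / 4)
```

A full multiparametric transform cascades many such rotations, each with its own free angle, which is where the "multiparametric" family of transforms comes from.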
Symyx Notebook by Accelrys and the Enterprise R&D Architecture (BIOVIA)
The document describes Accelrys' Symyx Notebook electronic lab notebook (ELN) and how it integrates with Pipeline Pilot and Isentris. The ELN allows scientists to capture data from experiments, while Pipeline Pilot enables analysis, processing and reporting of the data. Isentris provides a way to explore and share knowledge from the ELN. The integrated system allows researchers to capture data, access and analyze it using tools like Pipeline Pilot, and generate reports and share insights using Isentris to improve R&D productivity.
The document discusses the Semantic Web and declarative knowledge representation in information technology. It provides an introduction to key concepts including semantics, ontologies, rules, and logic-based knowledge representation. It also outlines technologies that make up the Semantic Web such as RDF, RDF Schema, OWL, and SPARQL. The goal of these technologies is to represent information on the web in a structured, machine-readable format in order to enable automated processing of data.
Vishnu Gowthem is currently pursuing a Master of Computing degree at the National University of Singapore, with a focus on data analytics, text mining, and statistical modeling. He has over 3 years of work experience as a senior software engineer and healthcare analytics intern. His areas of expertise include programming languages like R, Python, and Java, as well as databases, data visualization tools, and machine learning algorithms. Some of his projects involve network robustness analysis, drug recommendation based on patient similarity, and basketball salary prediction based on player statistics.
Nina Jeliazkova - eNanoMapper database, search tools and templates
A webinar given at the NCIP Hub https://nciphub.org/resources/1925
Nanomaterial safety assessment has become an important task following the growth in production of engineered nanomaterials (ENMs) and the increased interest in ENMs from various academic, industry, and regulatory parties. A number of challenges exist in nanomaterials data representation and integration, mainly due to data complexity and the origination of ENM information from diverse sources. We have recently described the eNanoMapper database [1] as part of the computational infrastructure for toxicological data management of engineered materials, developed within the eNanoMapper project [2].
The eNanoMapper prototype database is publicly available at http://data.enanomapper.net, demonstrating the integration of data from multiple sources using a common data model and Application Programming Interface (API). The supported import formats are IUCLID5 files (OECD HT), a semantic format (RDF), and custom spreadsheet templates. The latter accommodates the preferred approach to data gathering for the majority of the NanoSafety Cluster projects and is enabled by a configurable parser that maps the custom spreadsheet organization onto the internal eNanoMapper storage components through an external configuration file. Import of spreadsheet data and other data formats generated by a number of NanoSafety Cluster projects is currently ongoing. The export formats have been extended with the new ISA JSON format, following the most recent ISA specification.
Defining templates for data gathering is a common activity for most of the NanoSafety Cluster projects, usually resulting in modified Excel spreadsheets. To help avoid incompatibility issues, we present a tool for template generation, based on templates released under an open license by the JRC within the framework of the NANoREG project [3]. A number of physicochemical, in-vitro, and in-vivo assays are supported, and based on user feedback we have added and extended information about different aspects of nanosafety, e.g. environmental exposure, cell culture assays, cellular and animal models, nanomaterial production features, and nanomaterial ageing.
Finally, the data can be accessed programmatically via the API as well as through a user-friendly search interface at https://search.data.enanomapper.net. The search application is powered by a free-text search engine and the eNanoMapper ontology and was improved over the last year based on user feedback. The search function now supports multiple filters, which can be stacked, e.g. by nanomaterial type, cell model, and assay.
eNanoMapper is supported by European Commission 7th Framework Programme for Research and Technological Development Grant (Grant agreement no: 604134).
Hong-Linh Truong (TUW) - Quality of data-aware data analytics workflows
The document discusses quality-of-data-aware data analytics workflows. It begins by outlining the topics to be covered: the structure of data analytics workflows and supporting systems, issues with quality-of-data-aware workflows, and quality-of-data-aware simulation workflows. It then surveys several workflow systems and frameworks for data analytics. Key points include the need to understand hierarchical workflow structures, addressing data and service concerns, the importance of data quality for analytics workflows, and approaches to modeling quality-of-data metrics and optimizing workflows based on them.
Tanya Cashorali gave a presentation on using R for data science applications across industries. She discussed how R can be used for data manipulation, dashboards, machine learning, migrating from other tools to R-based workflows, integrating with APIs, rapid prototyping, and training. She highlighted examples of using R in industries like pharmaceuticals, hospitals, telecommunications, and more. Cashorali concluded by discussing future trends in R adoption and suggestions for organizations looking to use R.
This document is a resume for Kavinya Rajendran summarizing her education and experience. She has a Master's in Computer Science specializing in Big Data Systems from Arizona State University and is seeking opportunities in Data Science or Engineering. Her experience includes internships at Bank of America developing analytical applications and at Global Analytics delivering analytical results using Python.
Karwan Jacksi - A Survey of Exploratory Search Systems Based on LOD Resources
The document summarizes Karwan Jacksi's presentation on exploratory search systems based on Linked Open Data (LOD) resources at the International Conference on Computing and Informatics in Istanbul, 2015. The presentation discusses search strategies, the semantic web, linked data, existing linked data browsers and recommenders. It then summarizes several existing exploratory search systems that utilize LOD resources, including Yovisto, Semantic Wonder Cloud, Lookup Explore Discover, Aemoo, Seevl, Linked Jazz, Discovery Hub, and inWalk. The presentation also covers computing semantic similarity, linked data techniques, and references.
This document outlines the curriculum for the course "Elective Theory II - Data Science and Big Data" for the VI semester of the Diploma in Computer Engineering program. The course covers 5 units over 80 hours on data science fundamentals, data modeling, and big data concepts including storage and processing. The objectives are to understand data science techniques, apply data analysis in Python and Excel, learn about big data characteristics and technologies like Hadoop, and explore applications of big data. Topics include linear regression, classification models, MapReduce, and using big data in fields such as marketing, healthcare, and advertising.
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector... — IRJET Journal
The document discusses techniques for detecting similarity and deduplication in document analysis using vector analysis. It proposes analyzing documents by extracting abstract content, separating words and combining them in a word cloud to determine frequency. This approach aims to identify whether documents are duplicates by analyzing word vectors at the word, sentence and paragraph level while also applying techniques like stemming, stopping words and semantic similarity.
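The pipeline above (word extraction, stemming, stop-word removal, frequency vectors, similarity scoring) can be sketched in a few lines. The toy suffix stemmer, stop-word list, and tokenizer below are illustrative assumptions, not the paper's actual implementation:

```python
# Minimal sketch: word-frequency vectors + cosine similarity for
# near-duplicate detection. Stemmer and stop-word list are toy stand-ins.
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}

def stem(word):
    # Toy stemmer: strip a few common English suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def vectorize(text):
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(stem(w) for w in words if w not in STOPWORDS)

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in set(u) & set(v))
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

a = "The cats are chasing the mice."
b = "A cat chased a mouse."
print(cosine(vectorize(a), vectorize(b)))  # high score despite different inflections
```

The same vectorize/cosine pair can be applied at sentence and paragraph granularity, as the summary describes, by feeding it smaller spans of text.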
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S... — Gezim Sejdiu
Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies. A major and yet unsolved challenge that research faces today is to perform scalable analysis of large-scale knowledge graphs in order to facilitate applications like link prediction, knowledge base completion, and question answering. Most machine learning approaches that scale horizontally (i.e., can be executed in a distributed environment) work on simpler feature-vector-based input rather than more expressive knowledge structures. On the other hand, the learning methods that exploit expressive structures, e.g., Statistical Relational Learning and Inductive Logic Programming approaches, usually do not scale well to very large knowledge bases owing to their complexity. This talk gives an overview of the ongoing Semantic Analytics Stack (SANSA) project, which aims to bridge this research gap by creating an out-of-the-box library for scalable, in-memory, structured learning.
The document describes the eTRIKS Data Harmonization Service Platform, which aims to provide a common infrastructure and services to support cross-institutional translational research. It discusses challenges around data integration and harmonization. The platform utilizes standards and controlled vocabularies to syntactically and semantically harmonize data from various sources. It employs a metadata framework and modular workflow to structure, standardize, and integrate observational data into a harmonized repository for exploration and analysis. A demo of the platform's capabilities for project setup, data staging, exploration, export, and integration with tranSMART is also provided.
Introduction to Data Science: A Practical Approach to Big Data Analytics — Ivan Khvostishkov
On 3 March 2016, the Moscow Big Systems/Big Data meetup invited Ivan Khvostishkov, an engineer from EMC Corporation, to speak on key technologies and tools used in Big Data analytics, explain the differences between Data Science and Business Intelligence, and take a closer look at a real use case from the industry. The materials are useful for engineers and analysts who want to contribute to Big Data projects, database professionals, college graduates, and anyone who wants to learn about Data Science as a career field.
In this video from the 2017 Argonne Training Program on Extreme-Scale Computing, Phil Carns from Argonne presents: HPC I/O for Computational Scientists.
"Darshan is a scalable HPC I/O characterization tool. It captures an accurate but concise picture of application I/O behavior with minimum overhead."
Darshan was originally developed on the IBM Blue Gene series of computers deployed at the Argonne Leadership Computing Facility, but it is portable across a wide variety of platforms including the Cray XE6, Cray XC30, and Linux clusters. Darshan routinely instruments jobs using up to 786,432 compute cores on the Mira system at ALCF.
Watch the video: https://wp.me/p3RLHQ-hv9
Learn more: https://extremecomputingtraining.anl.gov/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Resource Description Framework Approach to Data Publication and Federation — Pistoia Alliance
Bob Stanley, CEO, IO Informatics, explains the utility of RDF as a standard way of defining and redefining data in managing life science information.
“Semantic Technologies for Smart Services” — diannepatricia
Rudi Studer, Full Professor in Applied Informatics at the Karlsruhe Institute of Technology (KIT), Institute AIFB, presentation “Semantic Technologies for Smart Services” as part of the Cognitive Systems Institute Speaker Series, December 15, 2016.
TUW-ASE Summer 2015: Advanced service-based data analytics: Models, Elasticit... — Hong-Linh Truong
This is a lecture from the advanced service engineering course from the Vienna University of Technology. See http://dsg.tuwien.ac.at/teaching/courses/ase
Alexey Mikhaylichenko - Automatic Detection of Bone Contours in X-Ray Images — AIST
This document summarizes an algorithm for automatically segmenting and detecting joints in x-ray images. The algorithm involves several steps: (1) computing an edge map of the image, (2) determining binarization thresholds, (3) thinning edges, (4) binarizing the image, and (5) chaining edges together. The algorithm is compared to the Canny edge detector and is shown to achieve a 74% success rate on joint images. Processing time is improved by implementing a multi-threaded version. Challenges include false edge detection and discontinuities between edge fragments.
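The first steps of such a pipeline (edge map, then threshold-based binarization) can be sketched with NumPy. The central-difference gradient and the mean-plus-std threshold below are illustrative assumptions, not the authors' exact operators:

```python
# Sketch of edge-map computation and binarization on a synthetic image.
import numpy as np

def edge_map(img):
    # Gradient magnitude of a 2-D image via central differences.
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def binarize(edges, k=1.0):
    # Global threshold: mean + k * std of the edge magnitudes.
    thr = edges.mean() + k * edges.std()
    return (edges > thr).astype(np.uint8)

# Synthetic test image: a bright square on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 255.0
mask = binarize(edge_map(img))
print(mask.sum())  # nonzero only near the square's border
```

The remaining steps in the summary (edge thinning and chaining) would operate on `mask`, linking adjacent nonzero pixels into contours.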
Evgeny Tsymbalov, Webgames - Machine Learning Methods for Game Analytics... — AIST
Machine learning helps WebGames, a large Russian game developer and publisher, analyze data from its free-to-play games. The company collects over 80 million records daily from its 400k daily players across various platforms. ML is used for tasks like churn prediction, revenue prediction, user classification, A/B testing, game balancing, and recommendations. Specifically, 30 different models predict LTV for users based on their behavior in the first 30 days. kNN and cohort-based approaches serve user classification, and Bayesian A/B testing dynamically adjusts tests over time. Rule-based modeling and midgame support based on classification help balance games, while content recommendations rely on static and dynamic clustering.
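As an illustration of the Bayesian A/B testing mentioned above, here is a minimal sketch using Beta posteriors over conversion rates. The uniform Beta(1, 1) priors and the Monte Carlo estimate of P(B > A) are standard textbook choices assumed for illustration, not WebGames' actual setup:

```python
# Bayesian A/B test on conversion counts via Beta posteriors.
import numpy as np

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=0):
    rng = np.random.default_rng(seed)
    # Posterior over each variant's rate: Beta(1 + conversions, 1 + misses).
    samples_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, draws)
    samples_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, draws)
    # Monte Carlo estimate of P(rate_B > rate_A).
    return (samples_b > samples_a).mean()

# 40/1000 vs 55/1000 conversions: is variant B likely better?
p = prob_b_beats_a(40, 1000, 55, 1000)
print(round(p, 3))
```

Because the posterior tightens as data arrives, such a test can be re-evaluated continuously, which is what makes "dynamically adjusting testing over time" possible.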
Alexander Moskvichev, EveResearch - Data Analysis Algorithms in Marketing and... — AIST
The document discusses various economic models related to utility and consumer choice. It introduces concepts like cardinal and ordinal utility and discusses modeling utility as a probability of success. It also discusses simple models of consumer choice that involve choosing a product based on type and brand or type and price. Additionally, it discusses models where neighbors' choices or prices can influence individual choices and mentions discounts and bundles. Graphs are shown comparing product turnover before and after implementing a neighbors effect model.
1) Exactpro is a specialist QA firm focused on testing financial systems that was acquired by the London Stock Exchange Group in 2015.
2) The London Stock Exchange Group is a leading international exchange group that traces its history back to 1698 and has over 5,500 employees.
3) Exactpro uses automated testing tools like Sailfish and ClearTH to test systems, as well as techniques like formal verification, crowd-sourced testing, and machine learning.
George Moiseev - Classification of E-commerce Websites by Product Categories — AIST
This document describes a study that classified e-commerce websites by the products they sell. It discusses preprocessing web pages, extracting features using TF-IDF with additional weighting for tags, and classifying pages using a support vector machine. The results show that considering information from other pages in addition to the main page improved classification accuracy, with an average F-score of 0.81 for product type classification when using all page information with the tag weighting.
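A minimal sketch of this kind of classifier follows, with the tag weighting crudely approximated by repeating title-tag tokens. The toy corpus, labels, and repeat factor are illustrative assumptions, not the study's actual feature scheme:

```python
# TF-IDF features plus a linear SVM for page classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def weight_tags(body, title, repeat=3):
    # Words from emphasized tags (e.g. <title>) are repeated so they get
    # extra weight in the term frequencies.
    return body + (" " + title) * repeat

pages = [
    weight_tags("buy cheap laptops and notebooks online", "laptop store"),
    weight_tags("gaming laptops with fast delivery", "notebook shop"),
    weight_tags("fresh organic apples and vegetables", "grocery store"),
    weight_tags("order fruit and vegetables to your door", "online grocery"),
]
labels = ["electronics", "electronics", "food", "food"]

vec = TfidfVectorizer()
X = vec.fit_transform(pages)
clf = LinearSVC().fit(X, labels)

test = vec.transform([weight_tags("discount notebooks", "laptop deals")])
print(clf.predict(test)[0])
```

Aggregating features from several pages of a site, as the study does, would simply mean concatenating their weighted texts before vectorizing.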
Elena Bruches - The Hybrid Approach to Part-of-Speech Disambiguation — AIST
The document describes a hybrid approach to part-of-speech disambiguation that combines neural networks and manually crafted rules. The algorithm uses neural networks to generate a set of possible part-of-speech tags for each word, and rule-based tagging to generate another set. The final set of tags is the intersection of these two sets, or their union if the intersection is empty. The approach achieved 96.11% precision on one corpus and 86.39% precision on another larger corpus.
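The combination rule is simple enough to state directly in code: intersect the neural and rule-based tag sets, falling back to their union when the intersection is empty. The tag names are illustrative:

```python
# Hybrid tag combination: intersection, or union if the intersection is empty.
def combine_tags(nn_tags, rule_tags):
    common = set(nn_tags) & set(rule_tags)
    return common if common else set(nn_tags) | set(rule_tags)

print(combine_tags({"NOUN", "VERB"}, {"VERB"}))  # intersection: {'VERB'}
print(combine_tags({"NOUN"}, {"ADJ"}))           # empty intersection -> union
```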
Edward Klyshinsky - The Corpus of Syntactic Co-occurrences: the First Glance — AIST
The document discusses the Corpus of Syntactic Co-occurrences, which aims to provide a corpus for students learning Russian that contains correct word combinations. It notes existing corpora like the Russian National Corpus are too large and technical for beginners. The CoSyCo extracts unambiguous syntactic phrases from Russian texts that could help learners. It uses various news and technical texts totaling over 15 billion words. Examples of extracted phrases are provided. The CoSyCo site is mentioned and future plans outlined, such as enlarging the phrase list, filtering repeats and strange combinations, improving the design, and making the co-occurrence database clearer.
Galina Lavrentyeva - Anti-spoofing Methods for Automatic Speaker Verification... — AIST
This document discusses anti-spoofing methods for automatic speaker verification systems. It summarizes various spoofing methods like replay attacks, voice conversion, and text-to-speech synthesis that attempt to manipulate biometric systems. The document then outlines the ASVspoof 2015 challenge on spoofing detection and the system submitted by the authors that achieved 2nd place. It details the authors' system including front-end preprocessing, feature extraction using magnitude, phase and high-level features, and back-end classifiers like SVM and neural networks. The system fused multiple feature types to achieve robust spoofing detection.
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori... — AIST
This document describes parallel algorithms for topic modeling, including synchronous, asynchronous, and deterministic asynchronous algorithms. The synchronous offline algorithm splits the document collection into batches and has each thread process one batch at a time. The asynchronous online algorithm has processor threads process batches concurrently while a merger thread accumulates and merges results to recalculate model parameters. To make the algorithm deterministic, the deterministic asynchronous approach has each thread process batches and write results directly without a merger thread.
Kaytoue Mehdi - Finding duplicate labels in behavioral data: an application f... — AIST
The document discusses identifying duplicate labels, or aliases, in behavioral data from e-sports games. It presents a method to analyze confusion matrices from predictive models to identify pairs of labels that concentrate confusion, indicating they may belong to the same player using different aliases. The method extracts fuzzy concepts from the confusion matrix and scores candidate pairs based on their cosine similarity to rank and filter the most likely alias pairs. Experimental settings on real e-sports datasets are also discussed.
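The core idea, scoring label pairs by how much confusion they concentrate and by the similarity of their confusion-matrix rows, can be sketched as follows. The scoring formula here (mutual confusion weighted by row cosine similarity) is an illustrative simplification of the paper's fuzzy-concept method:

```python
# Rank label pairs in a confusion matrix as alias candidates.
import numpy as np

def alias_candidates(conf, labels):
    conf = np.asarray(conf, dtype=float)
    scores = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            mutual = conf[i, j] + conf[j, i]  # mass confused between i and j
            cos = conf[i] @ conf[j] / (
                np.linalg.norm(conf[i]) * np.linalg.norm(conf[j]))
            scores.append(((labels[i], labels[j]), mutual * cos))
    return sorted(scores, key=lambda s: -s[1])

# Players A and B confuse the model with each other; C is recognized cleanly.
conf = [[5, 5, 0],
        [5, 5, 0],
        [0, 0, 10]]
print(alias_candidates(conf, ["A", "B", "C"])[0][0])
```

On such a matrix the top-ranked pair is the one whose two rows are both mutually confused and confused in the same way, exactly the signature of one player behind two aliases.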
Valeri Labunets - The bichromatic excitable Schrodinger metamedium — AIST
This document describes research into modeling wave phenomena like particle motion and interference using a cellular automata approach called an excitable metamedium. It can simulate the Schrodinger equation by representing diffusion as complex numbers across cells. The researchers extended this to use triplet "color" numbers for diffusion coefficients, allowing visualization of properties like hue, saturation and lightness. Experiments demonstrated particle motion, interference and blending effects using different color diffusion values. Unusual geometries were also explored by changing the definition of the imaginary unit, affecting the behavior of color wave propagation in interesting ways.
Alexander Karkishchenko - Threefold Symmetry Detection in Hexagonal Images Ba... — AIST
This document discusses a method for detecting threefold symmetry in hexagonal images using finite Eisenstein fields. It begins with an introduction to symmetry detection and issues that arise when applying existing continuous techniques to digital images. It then describes finite Eisenstein fields, which are constructed as finite fields analogous to the complex integers. Elements of these fields correspond naturally to hexagons, allowing hexagonal images to be represented as functions over the fields. Polar coordinate transformations are introduced to represent field elements in exponential form, enabling the transfer of continuous symmetry detection methods to digital hexagonal images. In summary, the document proposes a novel approach for symmetry detection in hexagonal images based on the algebraic structure of finite Eisenstein fields.
Artyom Makovetskii - An Efficient Algorithm for Total Variation Denoising — AIST
This document summarizes a research paper that analyzes the total variation denoising algorithm. It presents the following key points:
1. The total variation denoising model aims to minimize the sum of a fidelity term measuring noise and a regularization term measuring total variation.
2. The solution space can be reduced from bounded variation functions to piecewise constant functions on a given partition.
3. Explicit solutions are described for small values of the regularization parameter λ using Strong-Chan formulas, and these solutions are used to iteratively reduce the problem size and λ value.
4. The properties of extremal functions are proved, including uniqueness and behavior at discontinuity points depending on the sign of neighboring
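Written out, the model in point 1 is the Rudin–Osher–Fatemi objective; in discrete 1-D form (notation assumed here, with f the noisy signal, u the reconstruction, and λ the regularization weight):

```latex
% Discrete 1-D total variation denoising objective:
% fidelity term (noise) + lambda * regularization term (total variation).
\min_{u \in \mathbb{R}^n} \; \frac{1}{2} \sum_{k=1}^{n} (u_k - f_k)^2
  \;+\; \lambda \sum_{k=1}^{n-1} \lvert u_{k+1} - u_k \rvert
```

Restricting u to piecewise constant functions, as in point 2, leaves this objective unchanged on the remaining solution space while making it finite-dimensional.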
Olesia Kushnir - Reflection Symmetry of Shapes Based on Skeleton Primitive Ch... — AIST
1) The document proposes a method for detecting approximate reflection symmetry in shapes based on representing the skeleton of the shape as a chain of primitives.
2) It describes representing the skeleton as a chain of primitives containing the length and angle of each skeleton edge. This chain can then be divided into two sub-chains and aligned to evaluate symmetry.
3) The algorithm involves choosing start and end points on the skeleton to divide it into left and right sub-chains, adjusting the right sub-chain to be reflected, and calculating the dissimilarity between the sub-chains to determine the symmetry.
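Steps 2 and 3 can be sketched directly. The (length, angle) primitive encoding follows the description above, while the dissimilarity measure (mean absolute difference) is an illustrative stand-in for the paper's actual metric:

```python
# Compare a skeleton's two sub-chains of (length, angle) primitives.
def reflect(chain):
    # Reverse the chain and negate its angles ("mirror" the sub-chain).
    return [(length, -angle) for (length, angle) in reversed(chain)]

def dissimilarity(left, right):
    # Mean absolute difference over aligned primitives; incomparable
    # lengths count as maximally dissimilar.
    if len(left) != len(right):
        return float("inf")
    return sum(abs(l1 - l2) + abs(a1 - a2)
               for (l1, a1), (l2, a2) in zip(left, right)) / len(left)

# A perfectly symmetric shape: the right half mirrors the left.
left = [(2.0, 30.0), (1.0, -10.0)]
right = [(1.0, 10.0), (2.0, -30.0)]
print(dissimilarity(left, reflect(right)))  # 0.0 for exact mirror symmetry
```

Sweeping the start/end split points and keeping the split with the lowest dissimilarity would recover the algorithm's search for the best symmetry axis.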
Oxana Logunova - The Results Of Sulfur Print Image Classification Of Section ... — AIST
This presentation discusses the results of classifying sulfur print images of steel billet sections into three classes (A, B, C) using fuzzy logic. Membership functions were formed for the linguistic variables that describe image characteristics like brightness threshold, maximum brightness on each side of the threshold. Decision rules were developed to classify images when characteristics fall into ambiguous overlapping regions between different classes. The presentation concludes with thanks for attention.
The debris of the ‘last major merger’ is dynamically young — Sérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the ‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space, because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago. We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data 1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’ did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within the last few Gyr, consistent with the body of work surrounding the VRM.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub... — Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx — MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poorquality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Immersive Learning That Works: Research Grounding and Paths Forward — Leonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
The thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. It helps evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
ESPP presentation to the EU Waste Water Network, 4th June 2024: “EU policies driving nutrient removal and recycling and the revised UWWTD (Urban Waste Water Treatment Directive)”
The binding of cosmological structures by massless topological defects — Sérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is mitigated, at least in part.
This MS Word-generated PowerPoint presentation covers the major details of the micronucleus test: its significance and the assays used to conduct it. The test is used to detect the formation of micronuclei inside the cells of nearly every multicellular organism; micronuclei form during chromosomal separation at metaphase.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
Authoring a personal GPT for your research and practice: How we created the Q... — Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... — University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Alexander Vodyaho & Nataly Zhukova — Implementation of Agile Concepts in Recommender Systems for Data Processing and Analyses
1. Implementation of Agile Concepts in Recommender Systems for Data Processing and Analyses
Alexander Vodyaho, Nataly Zhukova
St. Petersburg Electrotechnical University “LETI”
E-mail: nazhukova@mail.ru
AIST-2015, April 9-11, Yekaterinburg
6. Information and Knowledge of the Subject Domain of Measurements Processing
• Providing information and knowledge
• Creation and assessment of information and knowledge
• Using results of data processing
• New data processing and analyses
• Retrieving new knowledge from historical data
• Solving complicated tasks of data processing
• Solving complicated specialized tasks
• Extending available knowledge and information
• Building, improving, and estimating ontologies
• Applying knowledge to solve data processing tasks
• Exchanging information and knowledge in standard formats
• Receiving information and knowledge
SPbETU «LETI» www.eltech.ru
7. Features of the subject domain of measurements processing
• Initial measurements: structured binary streams, time series, or separate measurements
• Measurements features: huge volume, bad quality, heterogeneity, distribution in time and space, non-stationary behavior of time series, multiple complicated relations
• Requirements to the results: formalized knowledge about measurements
• Consumers: results of data processing are used for solving tasks at the level of objects and situations, and for decision-making support
9. Agile Concepts for DPAS
[Diagram: the DPAS and DPA RS life cycles (design; development, in its methodological and implementation aspects; support; execution) mapped to levels of agile features support: a base level (IT level), a first level (industrial level), a second level (research-oriented level), and a third level (research level). The life cycle stages draw on ready methodological solutions, ready technological solutions, information systems (IT sphere), scientific prototypes, ready implementations of algorithms, and suggestions for new algorithms; system agility support spans the whole life cycle.]
10. Common and Agile Features of DPA RS
Features are based on technologies; agile features and DPA RS technologies lead to new problems.
Technologies for RS: content-based approach, collaborative filtering, hybrids, knowledge-based approach.
Technologies for DPA RS: logical inference, experience-based approach, exploration analyses.
RS features: capability to process huge amounts of data, capability to make suggestions, ranging capabilities.
Agile features: easy integration of new methods and algorithms; easy development of new methods and algorithms; easy extension of data processing and analyses systems' business logic.
DPA RS problems: easy integration of new methods and algorithms; easy development of new methods and algorithms; easy extension of data processing and analyses systems' business logic; low cost of design, development, and support; short time of design, development, and support; convenient working space.
11. DPA RS Information Model
[Diagram: the RS DPAS information model inherits from the DPAS information model, which splits into static and dynamic parts (DPAS static and dynamic information models; DPAS RS static and dynamic information models). These models use the subject domain models (the model of the applied subject domain and the model of the applied subject domain of data processing and analyses) and the information model of the environment.]
12. DPA RS Architecture
Frontend: RS GUI tools, GUI, processes management tool, ontologies and knowledge bases editors, RS tasks manager, RS content manager.
Services: service of mathematical and modeling libraries; service for external connections management and support; data, information and knowledge visualization service; processes management and execution service; administrative service; data processing and analyses service; research-oriented services.
Backend: inference machine; ontologies and RS ontologies; knowledge bases and RS knowledge bases; data, information and knowledge server; RS data, information and knowledge manager; databases; file storage; data processing and analyses tools.
The components are divided into DPAS components and DPA RS components and communicate over the network.
13. Case study
The system aims to analyze and control the structure and contents of the binary streams received from space objects.

[Figure: example of the binary streams structure]
14. General procedure for the binary streams processing and analyses
[Flowchart: from binary streams to streams descriptions.]

1. Apply methods based on calculation of the frequency distribution of the streams, producing preliminary streams descriptions.
2. Compare the descriptions with the descriptions of the earlier received streams. If similar streams are found, reuse their descriptions.
3. Otherwise, restore the length of the cards and words, and the subcommutation and supercommutation of the parameters, using methods of correlation analyses.
4. If the structure needs improvements, build and analyze the graphs that represent the structure of the streams, then repeat the restoration step.
5. Output the streams descriptions.
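The first two steps of the procedure above (describe a stream by its frequency distribution, then compare with descriptions of earlier received streams) can be sketched as follows. The byte-level histogram, cosine similarity, and the 0.95 threshold are illustrative assumptions; the slides do not specify the concrete measures.

```python
# Sketch: describe a binary stream by its byte-frequency distribution
# and look it up among descriptions of earlier received streams.
# Similarity measure and threshold are illustrative assumptions.
from collections import Counter
import math

def describe(stream: bytes) -> list:
    # Normalized frequency of each of the 256 possible byte values.
    counts = Counter(stream)
    n = len(stream)
    return [counts.get(b, 0) / n for b in range(256)]

def similarity(d1, d2) -> float:
    # Cosine similarity between two frequency descriptions.
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def find_similar(desc, known, threshold=0.95):
    # Return the name of the first earlier stream whose description
    # is close enough, or None if the stream looks new.
    for name, d in known.items():
        if similarity(desc, d) >= threshold:
            return name
    return None
```

A new stream that matches a stored description can reuse it directly; otherwise the procedure falls through to the correlation-based restoration of card and word lengths.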
15. GUI of program complexes for processing stream structure