This document describes FreebaseViz, a tool for interactively exploring the schema of Freebase, a large structured database, using query-driven visualization. It discusses the challenges of visualizing large-scale data and presents an architecture that uses a graph database to flexibly model Freebase's schema-less data and support complex graph queries and traversals needed to navigate the schema. The live demo of FreebaseViz allows filtering the Freebase schema graph by type or category to understand its structure. Analysis found the schema resembled a scale-free network with a few super-connected types dominating connections and many isolated types. Graph databases were concluded to be a promising approach for visualization due to their flexible modeling, powerful querying
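The scale-free observation above can be illustrated with a minimal sketch. It assumes a toy schema graph with made-up type names and edges (not real Freebase data, and not the FreebaseViz implementation), counts how many schema edges touch each type, and then picks out the most connected type and the isolated ones.

```python
from collections import Counter

# Hypothetical toy schema: each pair (a, b) means type `a` has a property
# whose expected type is `b`. Names are illustrative, not real Freebase data.
schema_edges = [
    ("film", "person"),
    ("book", "person"),
    ("album", "person"),
    ("company", "person"),
    ("film", "location"),
    ("company", "location"),
]
all_types = {"film", "book", "album", "company", "person", "location",
             "unit_of_length", "chess_opening"}


def degree_by_type(edges, types):
    """Count how many schema edges touch each type (in-degree plus out-degree)."""
    degree = Counter({t: 0 for t in types})
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return degree


degree = degree_by_type(schema_edges, all_types)

# The single most connected type acts as a "hub" in this toy graph.
hub, hub_degree = degree.most_common(1)[0]

# Types that no schema edge touches are isolated.
isolated = sorted(t for t, d in degree.items() if d == 0)

# Degree histogram: degree value -> number of types with that degree.
histogram = Counter(degree.values())
```

In a scale-free graph the histogram is heavily skewed: most types sit at low degrees while a handful of hubs account for a large share of the edges.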
Advanced Topics in OpenAPI: Added Value Services and Protection in the OpenTr... | 🧑💻 Manuel Coppotelli
The objectives of this work were to study a series of advanced aspects that an organization can consider when exposing data through an OpenService.
I studied the problems related to implementing Added Value Services using the information exposed through an OpenAPI, in particular a complex route planner that combines both timetables and real-time public transport data.
The exposed information can also be used by a byzantine user to infer whether a service provider is respecting the terms of its SLA.
Obviously, an organization does not want to expose data that would allow this kind of inference; the problem is therefore to find the right tradeoff that provides some protection while, at the same time, maintaining the openness of the data.
The solutions studied in this work have been applied to the real case of OpenTrasporti (a project by the Italian Ministry for Transportation and Infrastructures).
Exploration of how the University of Toronto's Mellon project integrated open source tools (Omeka, Mirador, Viscoll), UX design and IIIF in the field of medieval studies.
An analysis of deposit opportunities for the ingest of research papers into repositories, by the Sonex workteam, presented at the 2nd DL.org workshop held at the University of Glasgow, Sep 9-10, 2010.
Presentation about http://worldwidesemanticweb.org/ given at SugarCamp#3 in Paris on April 12-13. The slides introduce the activities of the WWSW group centred around adapting Semantic Web technologies to be usable in challenging conditions.
An Emerging Standard for Research-Quality Images: What IIIF Means for Digital... | tseneca
Presentation at the 2018 Chicago Colloquium on Digital Humanities and Computer Science. Includes video demos on several slides as well as notes, so may not make sense unless downloaded.
Classifying malicious websites using an ensemble weighted features | Dharmendra Vishwakarma
Research Project - Master's in Data Analytics
Different statistical and machine learning techniques learned as part of the Data Analytics coursework are applied in the thesis project to detect malicious web pages.
This presentation was provided by Mackenzie Smith of MIT Libraries, during the NISO event, "Library Resource Management Systems: New Challenges, New Opportunities," held October 8 - 9, 2009.
One click to connect | Lizzy Jongma | PHAROS / RKD on 8 October 2018 | Netwerk Oorlogsbronnen
The closing lecture 'One click to connect' by Lizzy Jongma, given on 8 October 2018 at the RKD-Nederlands Instituut voor Kunstgeschiedenis (Netherlands Institute for Art History) on the occasion of a PHAROS workshop day.
What can «blue» do for you: overcoming ICES challenges with BlueBRIDGE tools | Blue BRIDGE
Scott Large, ICES, at the BlueBRIDGE workshop on "Data Management services to support stock assessment", held during the Annual ICES Science conference 2016
DataEngConf: Building Satori, a Hadoop tool for Data Extraction at LinkedIn | Hakka Labs
By Nikolai Avteniev (Sr Software Engineer, LinkedIn)
LinkedIn is the professional profile of record for our 370M+ members globally, but many people don't realize the full potential of their LinkedIn profile – especially on mobile. Adding blogs, photos and other rich content to your profile on a small screen device can get tedious. That's why LinkedIn created Satori, a Hadoop tool that crawls the web and extracts data to discover members' professional content online. Satori uses machine learning techniques and leverages other open source tools like Nutch and Gobblin in order to help match members with relevant content in order to maximize their professional profile. In this talk, Nikolai will share his experience in building the product and discuss the challenges and opportunities encountered along the way.
"Semantic Integration Is What You Do Before The Deep Learning". dev.bg Machine Learning seminar, 13 May 2019.
It's well known that 80% of the effort of a data scientist is spent on data preparation. Semantic integration is arguably the best way to spend this effort more efficiently and to reuse it between tasks, projects and organizations. Knowledge Graphs (KG) and Linked Open Data (LOD) have become very popular recently. They are used by Google, Amazon, Bing, Samsung, Springer Nature, Microsoft Academic, AirBnb… and any large enterprise that would like to have a holistic (360 degree) view of its business. The Semantic Web (web 3.0) is a way to build a Giant Global Graph, just like the normal web is a Global Web of Documents. IEEE already talks about Big Data Semantics. We review the topic of KGs and their applicability to Machine Learning.
Slides from the DataEngConf 2015 event.
LinkedIn is the professional profile of record for our 400M+ members globally, but many people don't realize the full potential of their LinkedIn profile – especially on mobile. Adding blogs, photos and other rich content to your profile on a small screen device can get tedious. That's why LinkedIn created Satori, a Hadoop tool that crawls the web and extracts data to discover members' professional content online. Satori uses machine learning techniques and leverages other open source tools like Nutch and Gobblin in order to help match members with relevant content in order to maximize their professional profile. In this talk, Nikolai Avteniev, Sr. Staff Engineer and Agile Software Developer at LinkedIn, will share his experience in building the product and discuss the challenges and opportunities encountered along the way.
AI-Driven Science and Engineering with the Global AI and Modeling Supercomput... | Geoffrey Fox
Most things are dominated by Artificial Intelligence (AI). Technology Companies like Amazon, Google, Facebook, and Microsoft are AI First organizations.
Engineering achievement today is highlighted by the AI buried in a vehicle or machine. Industry (Manufacturing) 4.0 focusses on the AI-Driven future of the Industrial Internet of Things.
Software is eating the world.
We can describe much computer systems work as designing, building and using the Global AI and Modelling supercomputer which itself is autonomously tuned by AI. We suggest that this is not just a bunch of buzzwords but has profound significance and examine consequences of this for education and research.
Naively high-performance computing should be relevant for the AI supercomputer but somehow the corporate juggernaut is not making so much use of it. We discuss how to change this.
cloud computing - concepts and technologies and mechanisms of tackling problems in cloud
Please ignore who created it and focus on the problem-oriented points.
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,... | Mihai Criveti
Automate your Data Science pipeline with Ansible, Python and Kubernetes - ODSC Talk
What is Data Science and the Data Science Landscape
Process and Flow
Understanding Data
The Data Science Toolkit
The Big Data Challenge
Cloud Computing Solutions
The rise of DevOps in Data Science
Automate your data pipeline with Ansible
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre | HPCC Systems
Data Centric Approach: Our platform is built on the premise of absorbing data from multiple data sources and transforming them into highly intelligent social network graphs that can be processed to reveal non-obvious relationships.
AI, Knowledge Representation and Graph Databases - Key Trends in Data Science | Optum
Knowledge Representation is a key focus for most modern AI texts. Many AI experts feel that over half of their work is understanding how to find the right knowledge structures to build intelligent agents that can continuously learn and respond to changing events in their world. In 2012, a paper published by Google started a consolidation of the many diverse forms of knowledge representation into a single general-purpose structure called a labeled property graph.
This talk will describe the key events behind this movement and show how a new generation of data scientists will be needed to build and maintain corporate knowledge graphs that contain uniform, normalized and highly connected data sets for use by researchers and intelligent agents. We will also discuss the challenges of transferring siloed project knowledge to reusable structures.
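As a rough illustration of the labeled property graph structure mentioned in this abstract, the sketch below models nodes that carry labels and key-value properties, and typed edges that carry their own properties. All class, field, and example names are hypothetical and chosen for illustration; they do not come from the talk or from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                      # node_id of the start node
    target: str                      # node_id of the end node
    rel_type: str                    # relationship type, e.g. "WORKS_ON"
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)   # node_id -> Node
    edges: list = field(default_factory=list)

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_edge(self, source, target, rel_type, **properties):
        self.edges.append(Edge(source, target, rel_type, properties))

    def neighbours(self, node_id, rel_type):
        """Return ids of nodes reached from node_id via edges of rel_type."""
        return [e.target for e in self.edges
                if e.source == node_id and e.rel_type == rel_type]


# Hypothetical example data.
graph = PropertyGraph()
graph.add_node("alice", ["Person"], role="data scientist")
graph.add_node("kg", ["KnowledgeGraph"], name="corporate knowledge graph")
graph.add_edge("alice", "kg", "WORKS_ON", since=2019)
worked_on = graph.neighbours("alice", "WORKS_ON")
```

Keeping properties on both nodes and edges is what distinguishes this model from a plain adjacency list: the relationship itself can carry data such as a start date.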
Similar to FreebaseViz - A Tool for Exploring Freebase Using Query-Driven Visualisation (20)
Learning Embeddings from Free-text Triage Notes using Pretrained Transformer ... | Mahmoud Elbattah
Presented at 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022)
Émilien Arnaud, Mahmoud Elbattah, Maxime Gingon, Gilles Dequen
Abstract
The advent of transformer models has allowed for tremendous progress in the Natural Language Processing (NLP) domain. Pretrained transformers could successfully deliver the state-of-the-art performance in a myriad of NLP tasks. This study presents an application of transformers to learn contextual embeddings from free-text triage notes, widely recorded at the emergency department. A large-scale retrospective cohort of triage notes of more than 260K records was provided by the University Hospital of Amiens-Picardy in France. We utilize a set of Bidirectional Encoder Representations from Transformers (BERT) for the French language. The quality of embeddings is empirically examined based on a set of clustering models. In this regard, we provide a comparative analysis of popular models including CamemBERT, FlauBERT, and mBART. The study could be generally regarded as an addition to the ongoing contributions of applying the BERT approach in the healthcare context.
Vision-Based Approach for Autism Diagnosis Using Transfer Learning and Eye-Tr... | Mahmoud Elbattah
Presented at 2022 15th International Conference on Health Informatics (HEALTHINF)
Mahmoud Elbattah, Jean-Luc Guérin, Romuald Carette, Federica Cilia, Gilles Dequen
Abstract
The potentials of Transfer Learning (TL) have been well-researched in areas such as Computer Vision and Natural Language Processing. This study aims to explore a novel application of TL to detect Autism Spectrum Disorder. We seek to develop an approach that combines TL and eye-tracking, which is commonly used for analyzing autistic features. The key idea is to transform eye-tracking scanpaths into a visual representation, which could facilitate using pretrained vision models. Our experiments implemented a set of widely used models including VGG-16, ResNet, and DenseNet. Our results showed that the TL approach could realize a promising accuracy of classification (ROC-AUC up to 0.78). The proposed approach is not claimed to provide superior performance compared to earlier work. However, the study is primarily thought to convey an interesting aspect regarding the use of (synthetic) visual representations of eye-tracking output as a means to transfer representations from models pretrained on large-scale datasets such as ImageNet.
NLP-Based Prediction of Medical Specialties at Hospital Admission Using Triag... | Mahmoud Elbattah
Presented at 2021 IEEE International Conference on Healthcare Informatics (ICHI)
Émilien Arnaud, Mahmoud Elbattah, Maxime Gingon, Gilles Dequen
Abstract
Data Analytics is rapidly expanding within the healthcare domain to help develop strategies for improving the quality of care and curbing costs as well. Natural Language Processing (NLP) solutions have received particular attention, since a large part of clinical data is stockpiled in unstructured physician or nursing notes. In this respect, we attempt to employ NLP to provide an early prediction of the medical specialties at hospital admission. The study uses a large-scale dataset including more than 260K ED records provided by the Amiens-Picardy University Hospital in France. Our approach aims to integrate structured data with unstructured textual notes recorded at the triage stage. On one hand, a standard MLP model is used against the typical set of features. On the other hand, a Convolutional Neural Network is used to operate over the textual data. Both learning components are run independently in parallel. The empirical results demonstrated a promising accuracy in general. It is conceived that the study could be an additional contribution to the mounting efforts of applying NLP methods in the healthcare domain.
NLP-Based Approach to Detect Autism Spectrum Disorder in Saccadic Eye Movement | Mahmoud Elbattah
Presented at 2020 IEEE Symposium Series on Computational Intelligence (SSCI)
Mahmoud Elbattah, Jean-Luc Guérin, Romuald Carette, Federica Cilia, Gilles Dequen
Abstract
Autism Spectrum Disorder (ASD) is a lifelong condition generally characterized by social and communication impairments. The early diagnosis of ASD is highly desirable, yet it could be complicated by several factors. Standard tests typically require intensive efforts and experience, which calls for developing assistive tools. In this respect, this study aims to develop a Machine Learning-based approach to assist the diagnosis process. Our approach is based on learning the sequence-based patterns in the saccadic eye movements. The key idea is to represent eye-tracking records as textual strings describing the sequences of fixations and saccades. As such, the study could borrow Natural Language Processing (NLP) methods for transforming the raw eye-tracking data. The NLP-based transformation could yield interesting features for training classification models. The experimental results demonstrated that such representation could be beneficial in this regard. With standard ConvNet models, our approach could realize a promising accuracy of classification (ROC-AUC up to 0.84).
Generative Modeling of Synthetic Eye-Tracking Data: NLP-Based Approach with R... | Mahmoud Elbattah
Presented at 12th International Conference on Neural Computation Theory and Applications (NCTA)
Authors:
Mahmoud Elbattah, Jean-Luc Guérin, Romuald Carette, Federica Cilia, Gilles Dequen
Abstract
This study explores a Machine Learning-based approach for generating synthetic eye-tracking data. In this respect, a novel application of Recurrent Neural Networks is investigated. Our approach is based on learning the sequence patterns of eye-tracking data. The key idea is to represent eye-tracking records as textual strings, which describe the sequences of fixations and saccades. The study could therefore borrow methods from the Natural Language Processing (NLP) domain for transforming the raw eye-tracking data. The NLP-based transformation is utilised to convert the high-dimensional eye-tracking data into an amenable representation for learning. Furthermore, the generative modeling could be implemented as a task of text generation. Our empirical experiments support further exploration and development of such NLP-driven approaches for the purpose of producing synthetic eye-tracking datasets for a variety of potential applications.
Multi-Channel ConvNet Approach to Predict the Risk of In-Hospital Mortality f... | Mahmoud Elbattah
Presented at International Conference on Deep Learning Theory and Applications (DeLTA) 2020
https://www.scitepress.org/PublicationsDetail.aspx?ID=1HSktBRmyxE=
Authors:
Fabien Viton, Mahmoud Elbattah, Jean-Luc Guérin, Gilles Dequen
Université de Picardie Jules Verne (UPJV), France
mahmoud.elbattah@u-picardie.fr
Learning Clusters in Autism Spectrum Disorder: Image-Based Clustering of Eye-... | Mahmoud Elbattah
Presented at 41st Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
https://ieeexplore.ieee.org/document/8856904
Authors:
Mahmoud Elbattah, Romuald Carette, Gilles Dequen, Jean-Luc Guérin, Federica Cilia
Université de Picardie Jules Verne, France
mahmoud.elbattah@u-picardie.fr
Learning to Predict Autism Spectrum Disorder Based on the Visual Patterns of ... | Mahmoud Elbattah
Presented at 12th International Conference on Health Informatics (HEALTHINF 2019)
http://www.insticc.org/Primoris/Resources/PaperPdf.ashx?idPaper=74026
Authors:
Romuald Carette, Mahmoud Elbattah, Federica Cilia, Gilles Dequen, Jean-Luc Guérin
Université de Picardie Jules Verne, France
mahmoud.elbattah@u-picardie.fr
Designing Care Pathways Using Simulation Modeling and Machine Learning | Mahmoud Elbattah
Presented at Winter Simulation Conference 2018, Gothenburg, Sweden
Authors:
Mahmoud Elbattah, Owen Molloy, Bernard P. Zeigler
Summary:
The paper presents a framework that incorporates Simulation Modeling along with Machine Learning (ML) for the purpose of designing pathways and evaluating the return on investment of implementation. The study goes through a use case in relation to elderly healthcare in Ireland, with a particular focus on the hip-fracture care scheme. Initially, unsupervised ML is utilised to extract knowledge from the Irish Hip Fracture Database. Data clustering is specifically applied to learn potential insights pertaining to patient characteristics, care-related factors, and outcomes. Subsequently, the data-driven knowledge is utilised within the process of simulation model development. Generally, the framework is conceived to provide a systematic approach for developing healthcare policies that help optimise the quality and cost of care.
Clustering-Aided Approach for Predicting Patient Outcomes with Application to... | Mahmoud Elbattah
Presented at Workshop on Health Intelligence (W3PHIAI) - AAAI 2017 Conference
https://aaai.org/ocs/index.php/WS/AAAIW17/paper/view/15188
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Using Machine Learning to Predict Length of Stay and Discharge Destination fo... | Mahmoud Elbattah
Using Machine Learning to Predict Length of Stay and Discharge Destination for Hip-Fracture Patients
Paper presented at Intelligent Systems Conference, London, 2016.
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
https://link.springer.com/chapter/10.1007/978-3-319-56994-9_15
https://www.researchgate.net/publication/319198340_Using_Machine_Learning_to_Predict_Length_of_Stay_and_Discharge_Destination_for_Hip-Fracture_Patients
Presenting a diversity of criteria that can be used to guide the selection of ERP systems. The criteria cover seven main groups: i) Cost-Related, ii) Implementation Time, iii) Vendor-Related, iv) User-Related, v) Technology-Related, vi) System-Related, and vii) Organizational Requirements.
Please cite the paper below if you use the criteria.
Hegazy, A. E. F. A., ElBattah, M., & Kadry, M. (2012, October). Fuzzy-Based Framework for Enterprise Resource Planning System Selection. In Proceedings of the 22nd International Conference on Computer Theory and Applications (ICCTA), (pp. 139-147). IEEE.
https://ieeexplore.ieee.org/document/6523560/
ML-Aided Simulation: A Conceptual Framework for Integrating Simulation Models... | Mahmoud Elbattah
ML-Aided Simulation: A Conceptual Framework for Integrating Simulation Models with Machine Learning
Paper presented at ACM 2018 Conference on Principles of Advanced Discrete Simulation (PADS)
https://dl.acm.org/citation.cfm?id=3200933
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
The Economic Burden of Hip Fractures among Elderly Patients in Ireland: A Com... | Mahmoud Elbattah
Paper presented at the 34th International Conference of the System Dynamics Society, Delft, Netherlands - July 17-21, 2016
Full-Text available at:
https://www.systemdynamics.org/assets/conferences/2016/proceed/papers/P1332.pdf
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Using Simulation Modeling to Design Value-Based Healthcare Systems | Mahmoud Elbattah
Paper presented at OR58 Annual Conference, Portsmouth, England
Full-text available at:
https://www.researchgate.net/publication/308138628_Using_Simulation_Modeling_to_Design_Value-Based_Healthcare_Systems
Authors:
Bernard P. Zeigler, Ernest L. Carter, Owen Molloy, Mahmoud Elbattah
The University of Arizona, Prince George's Health Department, National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Large-Scale Ontology Storage and Query Using Graph Database-Oriented Approach | Mahmoud Elbattah
Paper presented at 2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS).
Full-Text available at:
http://ieeexplore.ieee.org/document/7397191/
https://www.researchgate.net/publication/304414637_Large-Scale_Ontology_Storage_and_Query_Using_Graph_Database-Oriented_Approach
First Author:
Mahmoud Elbattah
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Towards Improving Modeling and Simulation of Clinical Pathways: Lessons Learn... | Mahmoud Elbattah
Towards Improving Modeling and Simulation of Clinical Pathways: Lessons Learned and Future Insights
Paper presented at International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SimulTech) 2015
Full-Text available at:
https://www.researchgate.net/publication/284284807_Towards_Improving_Modeling_and_Simulation_of_Clinical_Pathways_Lessons_Learned_and_Future_Insights
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Supply Chains Modelling and Simulation Framework: Graph-Driven Approach Using ... | Mahmoud Elbattah
Supply Chains Modelling and Simulation Framework: Graph-Driven Approach Using Ontology-Based Semantic Networks and Graph Database
Paper presented at International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SimulTech) 2014
Full-Text available at:
https://www.researchgate.net/profile/Mahmoud_Elbattah2/publication/294087825_Supply_Chains_Modelling_and_Simulation_Framework_Graph-Driven_Approach_Using_Ontology-Based_Semantic_Networks_and_Graph_Database/links/56bdd57308ae373cf1aaa930.pdf
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Coupling Simulation with Machine Learning: A Hybrid Approach for Elderly Disch... | Mahmoud Elbattah
Coupling Simulation with Machine Learning: A Hybrid Approach for Elderly Discharge Planning
Paper presented at SIGSIM PADS 2016
https://dl.acm.org/citation.cfm?id=2901381
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
mahmoud.elbattah@nuigalway.ie
Learning about Systems Using Machine Learning: Towards More Data-Driven Feedba... | Mahmoud Elbattah
Learning about Systems Using Machine Learning - Paper presented at Winter Simulation Conference 2017
http://ieeexplore.ieee.org/document/8247895/
Authors:
Mahmoud Elbattah and Owen Molloy
National University of Ireland Galway
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... | pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply facilitated through institutional investment rotation out of offices and into work from home (“WFH”), alongside an ever-expanding need for data storage as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... Subhajit Sahu
Abstract: Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with the precondition that the input graph has no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Adjusting primitives for graph : SHORT REPORT / NOTES Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices that have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) reduces duplicate computation and thus could also reduce iteration time. Road networks often have chains that can be short-circuited before PageRank computation to improve performance; the final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
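The idea of skipping work on vertices that have stopped changing can be illustrated with a minimal sketch. This is not the STICD implementation. It is a conservative variant chosen here for illustration, in which a vertex is recomputed only when one of its in-neighbours changed by more than `tol` in the previous iteration. The function name `pagerank_skip_stable` and the `out_links` dictionary format are assumptions, and the graph is assumed to have no dangling nodes.

```python
def pagerank_skip_stable(out_links, damping=0.85, tol=1e-12, max_iter=1000):
    """Pull-based PageRank that recomputes a vertex only when its inputs changed.

    out_links maps every vertex to a list of its out-neighbours (an assumed
    input format). Every vertex must have at least one out-link.
    Returns (ranks, vertex_updates, iterations).
    """
    vertices = list(out_links)
    n = len(vertices)

    # Build in-link lists so each vertex can pull rank from its in-neighbours.
    in_links = {v: [] for v in vertices}
    for u, targets in out_links.items():
        if not targets:
            raise ValueError(f"dangling node {u!r}: this sketch assumes none")
        for v in targets:
            in_links[v].append(u)

    rank = {v: 1.0 / n for v in vertices}
    dirty = set(vertices)  # vertices whose in-neighbours changed last round
    updates = 0
    iterations = 0
    while dirty and iterations < max_iter:
        iterations += 1
        new_rank = dict(rank)
        next_dirty = set()
        for v in dirty:
            pulled = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1.0 - damping) / n + damping * pulled
            updates += 1
            # Only a noticeable change forces the out-neighbours to be redone.
            if abs(new_rank[v] - rank[v]) > tol:
                next_dirty.update(out_links[v])
        rank = new_rank
        dirty = next_dirty
    return rank, updates, iterations


if __name__ == "__main__":
    # Toy graph, made up for illustration; vertex 3 has no in-links.
    toy = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
    ranks, updates, iterations = pagerank_skip_stable(toy)
    print(ranks)
    print(updates, "vertex updates in", iterations, "iterations")
```

In the toy graph, vertex 3 has no in-links, so it is computed once and never revisited. A standard iteration would recompute it every round.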
FreebaseViz: A Tool for Exploring Freebase Using Query-Driven Visualisation
1. FreebaseViz:
Interactive Exploration of Freebase Schema
Using Query-driven Visualisation
Mahmoud Elbattah
College of Engineering and Informatics
National University of Ireland
m.elbattah1@nuigalway.ie
The International Conference on Communication, Management and
Information Technology (ICCMIT 2016)
2. ICCMIT 2016
Outline
• What is Visualisation, and why is it important?
• Related Challenges in the Context of Big Data
• Dataset of Interest: Freebase
• Architecture of FreebaseViz Tool
• Live Demo and Visualisation Scenarios
• Conclusions
4. ICCMIT 2016
What is Visualisation?
• The transformation of the symbolic into the
geometric (McCormick et al., 1987).
• The use of computer-generated, interactive, visual
representations of data to amplify cognition
(S. K. Card et al., 1999).
Sources:
McCormick, Bruce Howard, Thomas A. DeFanti, and Maxine D. Brown. "Visualization in scientific computing." IEEE Computer Graphics and Applications 7, no. 10 (1987): 69-69.
S. K. Card, J. D. Mackinlay, et al. Readings in Information Visualization: Using Vision to Think. Los Altos, CA: Morgan Kaufmann, 1999.
5. ICCMIT 2016
What is Different about Visualisation?
• The interpretation of visual formats happens
immediately in a “pre-attentive” manner.
• Higher bandwidth for perception than text-based means.
• The pictorial representation of data can help answer or
discover questions.
• Of particular significance in the era of Big Data.
19. ICCMIT 2016
Conclusions and Observations
• The Freebase schema resembled the structure of a scale-free network.
• The degree distribution followed a power law distribution.
• A few super-connected nodes dominated the schema graph connections.
• In contrast, a considerable proportion of the schema Types seemed
isolated with no connections in the schema graph.
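The observations above (a few highly connected Types, many isolated ones, a power-law-like degree distribution) could be checked on a schema graph along the following lines. This is a minimal sketch, not the analysis used in the paper: the function names, the toy node and edge lists, and the log-log line fit are all assumptions made for illustration.

```python
import math
from collections import Counter


def degree_summary(nodes, edges):
    """Return (degree per node, histogram of degree -> node count, isolated nodes)."""
    degree = {node: 0 for node in nodes}
    for source, target in edges:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    histogram = Counter(degree.values())
    isolated = sorted(node for node, d in degree.items() if d == 0)
    return degree, histogram, isolated


def loglog_slope(histogram):
    """Least-squares slope of log(count) against log(degree), ignoring degree 0.

    A roughly straight line with negative slope is a common informal hint of
    power-law behaviour; it is not a rigorous statistical test.
    """
    points = [(math.log(k), math.log(c))
              for k, c in histogram.items() if k > 0 and c > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct non-zero degrees")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Toy schema graph, made up for illustration only.
    nodes = ["person", "film", "location", "book", "isolated_a", "isolated_b"]
    edges = [("person", "film"), ("person", "location"),
             ("person", "book"), ("film", "location")]
    degree, histogram, isolated = degree_summary(nodes, edges)
    print("hub:", max(degree, key=degree.get))
    print("isolated share:", len(isolated) / len(nodes))
    print("degree histogram:", dict(histogram))
```

On a real schema graph, `loglog_slope(histogram)` gives only a rough indication, since a line fit on a log-log histogram is a heuristic.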
20. ICCMIT 2016
Conclusions and Observations (cont’d)
Graph databases show promising potential for visualisation
environments, as follows:
• Flexible schema-less modeling.
• Powerful query capabilities.
• Complex graph traversal can answer queries requiring extensive
navigation around a graph.
• Advantageous scalability compared to traditional relational models.
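The kind of query that requires navigating around a graph can be illustrated with a breadth-first traversal that collects everything within a given number of hops of a starting Type. This is a plain-Python sketch of the idea only. It does not use any graph database API, and the function name `neighbours_within` and the toy adjacency data are assumptions.

```python
from collections import deque


def neighbours_within(adjacency, start, max_hops):
    """Breadth-first search returning {node: hop distance} within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


if __name__ == "__main__":
    # Toy schema graph, made up for illustration only.
    schema = {
        "person": ["film", "location"],
        "film": ["person", "location", "award"],
        "location": ["person", "film"],
        "award": ["film", "organisation"],
        "organisation": ["award"],
    }
    print(neighbours_within(schema, "person", 2))
```

In a relational model the same question typically needs one self-join per hop, which is part of why traversal-oriented stores suit this kind of exploration.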
21. ICCMIT 2016
Original Paper
The original paper can be accessed from:
• https://books.google.ie/books?id=0YSKDQAAQBAJ&pg=PT130
• https://www.researchgate.net/publication/321716603_FreebaseViz_Interactive_Exploration_of_Freebase_Schema_Using_Query-Driven_Visualisation