This document discusses open data sharing and standards. It provides examples of open data sets available through Bristol's open data portal, including government, community, education, energy, environment, and health data. It also discusses open research data storage at the University of Bristol, as well as challenges with sharing sensitive or not-so-open data, and tools for mediated or remote data sharing like the Bristol Online Surveys system and DataSHIELD library. It concludes with questions around who should pay for data sharing, the pros and cons of mediated sharing, what standards exist for "data diplomacy", and whether that last question even makes sense.
A presentation about UK PubMed Central, given at the Research Information Network / Repositories Support Project event in London, 29th May, 2009.
The presentation outlines the benefits of using the UKPMC service to the UK's biomedical and health research community, which include increasing visibility. It also provides an overview of some of the development activities being undertaken by the UKPMC development team.
Presentation at the Workshop on Open Citations, University of Bologna, Bologna, Italy, September 4, 2018.
I will demonstrate the use of the VOSviewer software (www.vosviewer.com), of which I am one of the developers, for creating bibliometric visualizations of science based on openly available bibliographic data sources. Both the use of Crossref data and the use of data from the OpenCitations Corpus will be demonstrated. In addition, I will show how data from Dimensions can be used. The possibilities and limitations of the currently available open data sources will be discussed, also in comparison with more established data sources such as Web of Science and Scopus. Finally, I will provide my perspective on future developments, focusing especially on the integration of open data sources and visual analysis tools.
The Digital Academia Power Struggle: Mark Hahnel, Figshare Founder | CASRAI
According to the Scholarly Kitchen Chefs, one of the things to have the biggest impact on scholarly publishing in 2015 is the publication of data and objects (like multimedia and application code). While we have seen the launch of ‘data journals’ from the likes of Elsevier and Nature in the past 12 months, we have also seen pressure from funders for institutions to better manage the digital products of research carried out within their walls. Funders are increasingly requiring grantees to deposit their raw research data in appropriate public archives or stores in order to facilitate the validation of results and further work by other researchers. According to the JISC- and RLUK-funded Sherpa Juliet site, globally there are now 34 funders who require data archiving and 16 who encourage it. So are we on course for a collision between publishers and institutions over who has control over the digital products of research? Previous attempts by institutions to retake control of printed scholarly output through institutional repositories have been beneficial, but have not curbed the profit margins or reach of the big publishers. This is mainly due to the culture of academia, where for 350 years papers have been the currency and for the last 50, impact factor has been the value. The recent influx of digital-based data and other outputs is, however, creating a culture shift. This session will explore how the web-enabled world of multiple digital outputs is playing out and predict what could happen in the next 12-60 months. Either way, it’ll be an interesting journey!
The Needs of stakeholders in the RDM process - the role of LEARN | LEARN Project
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Martin Moyle/Paul Ayris, UCL Library Services
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe... | EUDAT
The European Open Science Cloud (EOSC) has become a driving force behind the current evolution of e-Infrastructure to support research. The EOSC offers the vision of an integrated ecosystem of data, services and expertise providing a common platform for open cross-community research in Europe and beyond. In this session, I shall consider the aims of the EOSC and discuss some of the opportunities it offers and the barriers it needs to overcome to realise that vision. I shall introduce the EOSC-Pilot project, which aims to pave the way towards the EOSC by exploring the opportunities and barriers, and by proposing how the EOSC should evolve, both technically, including its architecture, and organisationally, including how it should be managed. Participants will be invited to consider what the issues of the EOSC are and how it might affect their own domain.
Visit: https://www.eudat.eu/eudat-summer-school
Purdue University Receives Grant to Study Alternative Energy | Firminy Capital Sarl
Firminy Capital Sarl manages the Firminy Equity Fund, a securitization fund, as well as four sub-funds. Firminy Capital Sarl leverages one such sub-fund, Alternative Energy Series 1, to invest in research and development of alternative energy resources.
Presentation from RIN hosted event on 'The future of scholarly publishing - where do we go from here?'
Part one of a series of events on the theme 'Research information in transition'.
Online databases are increasingly being used for structure identification. In many cases a compound that is unknown to an investigator is already known in the chemical literature or in an online database, and these “known unknowns” are commonly available in aggregated internet resources. Compounds of this type in commercial, environmental, forensic, and natural product samples can be identified by searching these large aggregated databases by either elemental composition or monoisotopic mass. We will report on the search approaches that we offer on aggregated compound databases hosted by the Royal Society of Chemistry and how these resources can be used for structure identification.
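As a rough illustration of searching by elemental composition or monoisotopic mass, the sketch below computes monoisotopic masses from molecular formulas and filters a tiny in-memory compound list within a parts-per-million tolerance. It is not the Royal Society of Chemistry's search interface: the `COMPOUNDS` list, the function names, and the 10 ppm default tolerance are all assumptions made for this example.

```python
import re

# Monoisotopic masses (in daltons) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}

# Hypothetical stand-in for an aggregated compound database.
COMPOUNDS = [
    ("caffeine", "C8H10N4O2"),
    ("aspirin", "C9H8O4"),
    ("glucose", "C6H12O6"),
]


def parse_formula(formula):
    """Turn a simple formula such as 'C9H8O4' into {'C': 9, 'H': 8, 'O': 4}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def search_by_mass(target_mass, ppm=10.0):
    """Return names of compounds whose mass lies within `ppm` of the target."""
    hits = []
    for name, formula in COMPOUNDS:
        error_ppm = abs(monoisotopic_mass(formula) - target_mass) / target_mass * 1e6
        if error_ppm <= ppm:
            hits.append(name)
    return hits


def search_by_formula(formula):
    """Return names of compounds with the same elemental composition."""
    wanted = parse_formula(formula)
    return [name for name, f in COMPOUNDS if parse_formula(f) == wanted]
```

Aspirin and glucose share a nominal mass of 180 but differ by about 0.02 Da in monoisotopic mass, so a tolerance of a few ppm is enough to tell them apart in this toy list.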
Presentation at the Colloquium Research Information Systems and Science Classifications: Revisiting the NARCIS Classification, Museum Meermanno, The Hague, The Netherlands, September 28, 2018.
Open Data in a Big Data World: easy to say, but hard to do? | LEARN Project
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Sarah Callaghan, STFC Rutherford Appleton Laboratory
Historical Photographs of China - the journey towards sustainability and utility | Simon Price
Presentation about the University of Bristol's 'Historical Photographs of China' collection at the GW4 Remediating the Archive digital humanities workshop in Cardiff, November 2016. The 'Historical Photographs of China' project began in 2006 as part of an AHRC-funded project on the 'History of the Chinese Maritime Customs Service' and has grown into an initiative that locates, digitises, and publishes online photographs of China held, largely, in private hands outside the country. Although some of the 10,000 photographs now online - a quarter of the total - originate from UK institutional repositories, our materials are principally 'crowdsourced' from families living outside China. This presentation introduces the collection and discusses the technical challenges of growing and sustaining free access to this virtual photographic archive of modern China.
Paper presentation at EUNIS 2016 conference, Thessaloniki, Greece. Globally, over 500 universities now offer data science courses at undergraduate or postgraduate level and, in research-intensive universities, these courses are typically underpinned by academic research in statistics, machine learning and computer science departments and, increasingly, in multidisciplinary data science institutes. Much has been written about the academic challenges of data science from the perspective of its core academic disciplines and from its application domains, ranging from sciences and engineering through to arts and humanities. However, relatively little has been written about the institutional information technology (IT) support challenges entailed by this rapid growth in data science. This paper sets out some of these IT challenges and examines competing support strategies, service design and financial models through the lens of academic IT support services.
Presentation about the University of Bristol naturelocator suite of crowdsourcing and citizen science applications at the Mobile Apps for Research Summit, December 2014 at Birmingham University.
SubSift web services and workflows for profiling and comparing scientists and... | Simon Price
Paper presentation at IEEE eScience 2010 conference, December 2010, Brisbane, Australia. Scientific researchers, laboratories and organisations can be profiled and compared by analysing their published works, including documents ranging from academic papers to web sites, blog posts and Twitter feeds. This paper describes how the vector space model from information retrieval, more normally associated with full text search, has been employed in the open source SubSift software to support workflows to profile and compare such collections of documents. SubSift was originally designed to match submitted conference or journal papers to potential peer reviewers based on the similarity between the paper's abstract and the reviewer's publications as found in online bibliographic databases. The software is implemented as a family of RESTful web services that, composed into a re-usable workflow, have already been used to support several major data mining conferences. Alternative workflows and service compositions are now enabling other interesting applications.
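The vector space model mentioned above can be sketched in a few lines: each document becomes a TF-IDF weighted term vector, and a submitted abstract is compared with each reviewer's publication text by cosine similarity. This is a minimal illustration, not SubSift's actual web service API; the function names, the tokenizer, and the `__paper__` key are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Map each document name to a {term: tf * idf} vector."""
    term_counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    return {
        name: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        for name, counts in term_counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_reviewers(abstract, reviewers):
    """Rank reviewers by similarity between their text and the abstract."""
    documents = dict(reviewers)
    documents["__paper__"] = abstract  # hypothetical key for the submission
    vectors = tfidf_vectors(documents)
    paper = vectors.pop("__paper__")
    scores = {name: cosine(paper, vec) for name, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, `rank_reviewers("graph kernels for structured data mining", {"alice": "...", "bob": "..."})` returns a list of `(name, score)` pairs with the closest reviewer first.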
Code Club - a Fight Club inspired approach to software inspection and review | Simon Price
Public version of original 2002 internal presentation at the Institute for Learning and Research Technology (ILRT), University of Bristol, about improving software quality through semi-formal code inspections and reviews. Made public as a result of discussions at the April 2016 Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research, Oxford University.
A Higher-Order Data Flow Model for Heterogeneous Big Data | Simon Price
Paper presentation at IEEE Big Data 2013 conference, Santa Clara, California. We introduce a data flow model that supports highly parallelisable design patterns and also has useful properties for analysing data serially over extended time periods without requiring traditional Big Data computing facilities.
The model ranges over a class of higher-order relations which are sufficiently expressive to represent a wide variety of unstructured, semi-structured and structured data. Using JSONMatch, our web service implementation of the model, we show that the combination of this model and higher-order representation provides a powerful and extensible framework that is particularly well suited to analysing Big Variety data in a web application context.
Co-designing Research IT and Research Data Services | Simon Price
Invited talk about evolving plans for Research IT and Data support at the University of Bristol, given at the UCISA Research IT International Symposium, UCISA 2014 Conference, Brighton.
NewsPatterns - visualisation layer of news feed mining | Simon Price
Presentation to ILRT staff about the data visualisation layer for a multilingual news feed (blogs) data analytics pipeline developed in the Artificial Intelligence group in Engineering Maths and Computer Science at the University of Bristol.
Cost of Migrating Large-Scale Computer Assisted Learning (CAL) Software to We... | Simon Price
Paper presentation at the Interactive Computer aided Learning conference (ICL 2001), Villach, Austria, September 2001. This paper presents an initial analysis of the duration and effort data collected during the migration of WinEcon, a large-scale CAL package, to WWW-based delivery. The paper reviews the data collected during the initial development of WinEcon before presenting the preliminary data collected during the recent migration. The data presented represents a snapshot of the data collection process towards the end of the migration project, including an overview of the raw project data in terms of cost, duration and effort. Initial lessons learned from the project will be presented along with the potentially controversial view that conversion of large-scale projects, while feasible, may not represent the most efficient use of resources. The authors will argue that in many cases it may be more efficient to start again rather than migrate and reuse existing tools and technology.
Managing Large-scale Multimedia Development Projects | Simon Price
Keynote presentation at IEEE International Conference on Multimedia in Engineering Education 1998, Hong Kong. This paper presents generally applicable techniques drawn from the experience of managing the UK's Teaching and Learning Technology Programme (TLTP) Economics Consortium project to develop WinEcon - a computer based package covering an entire first year introductory economics degree course. The WinEcon project has been a highly successful, large scale multimedia project. It has received multiple international awards, is site licensed by over 80% of UK universities and over 200 organisations world wide. However, what really happens when you set out to develop the world's largest computer based training package for economics with a team of 35 content experts and 17 programmers distributed across eight geographically separate sites is a far cry from the typical case study found in a 'software project management' textbook. There are inherent characteristics of multimedia software which make its development difficult. Consequently any multimedia project carries a high risk of failing to deliver on time, quality or budget and the nature of large scale development projects only serves to amplify the risk to such a degree that many such projects fail to deliver satisfactorily in any of these three areas. These management challenges encountered by the WinEcon project are independent of subject matter and must be addressed when managing any large scale multimedia development.
Presentation reporting the current situation and projected requirements for the University of Bristol, delivered at the Jisc, Janet and the Digital Curation Centre (DCC) workshop on universities' Research Data Management Storage Requirements, February 2013, London.
Research IT at the University of Bristol | Simon Price
Invited talk at the UCISA Community of Practice Workshop on IT Provisions in Support of Research in July 2015 on Research IT support at the University of Bristol. Topics include specialist IT staff skills requirements, addressing scarcity of data science and advanced IT skills amongst IT staff, and the challenges of costing specialist support.
A review of the state of the art in Machine Learning on the Semantic Web | Simon Price
Paper presentation at UK Computational Intelligence workshop 2003, Bristol. This paper reviews the current state of the art of machine learning applied to the Semantic Web. It looks at the Semantic Web and its languages, including RDF and OWL, from a machine learning perspective. Trends in the Semantic Web are mentioned throughout and the relationship with Web Services is examined. Applications are discussed with recent examples and pointers to data sets. Finally, the emerging field of Semantic Web Mining is introduced.
Best of Bristol Media City - MyMobileBristol, NatureLocator, Visualising China | Simon Price
Presentation about the work of the Institute for Learning and Research Technology at the NextGen 2011 conference in Bristol. Describes three projects developed by the Web Futures team in ILRT at the University of Bristol.
Querying and Merging Heterogeneous Data by Approximate Joins on Higher-Order ... | Simon Price
This paper addresses the important problem of integrating heterogeneous data from sources as diverse as web pages, digital libraries, knowledge bases and databases. The ultimate aim of this work is to be able to query such heterogeneous data sources as if their data were conveniently held in a single relational database. Pursuant of this aim, we propose a generalisation of relational joins from the relational database model to enable joins on arbitrarily complex structured data in a higher-order representation. By incorporating kernels and distances for structured data, we further extend this model to support approximate joins of data originating from heterogeneous sources. We have implemented these higher-order relational operators and their associated kernels in Prolog and applied this framework on the CORA data sets. We demonstrate the flexibility of our approach in the publications domain by evaluating example approximate queries on structured data, joining on types ranging from sets of co-authors through to entire publications.
Presentation at JISC Research Tools Workshop, Birmingham, 2013. The data.bris project has published a lightweight adaptation of the DCC's comprehensive Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO) survey tool. The lightweight survey is released as a customisable template through Bristol Online Surveys (BOS) and is freely available to the majority of UK Higher Education institutions with an existing BOS account.
Recording student clinical experiences as potential future learning resources to support practical skills in veterinary science and dentistry education at the University of Bristol. A talk about software developed by the Institute for Learning and Research Technology, presented at the Higher Education Academy (HEA) science, technology, engineering, and mathematics (STEM) workshop on Crowdsourcing in Higher Education in February 2014, in Bristol.
Presentation by Sarah Rodgers, Professor of Health Informatics, University of Liverpool: Digital opportunities within academia at ECO: Digital Health in the North on Wednesday 27 September at Kings House Conference Centre, Manchester
Accessing data for research: data publishing pathways and the Five Safes | Louise Corti
Presented at: Assessing Disclosure Risk in Population Research Data and Outputs, Children of the 90s (ALSPAC)
Bristol Medical School, 24 January 2020.
In this half-day session, we introduce the concept of a Safe Health Researcher, where both data producers and users are aware not only of the key legal, ethical and security measures surrounding the management and publication of biomedical research data, but also of any risk in the outputs they are creating.
The practical training session, aimed at data managers, looks at key elements of disclosure risk and trust in sharing biomedical data. We will cover the principles and practicalities of reviewing disclosure risk in numeric data sources and in research outputs.
UK HE Research Data Management Survey Results - Presentation to EPSRC | Martin Hamilton
We recently surveyed UK Higher Education Institutions on their plans for Research Data Management (RDM) to inform our own RDM project - the results can be found on my blog at http://martinh.net. These slides are a summary of the results which we presented to EPSRC in November 2013.
SciDataCon - How to increase accessibility and reuse for clinical and persona... | Fiona Nielsen
Presented by Fiona Nielsen in session 48 (Sharing of sensitive data) on September 12, 2016 at #SciDataCon http://scidatacon.org
We have addressed the most pressing problem for public genomic data, that of data discoverability, by indexing worldwide resources for genomic research data on an online platform (repositive.io) providing a single point of entry to find and access available genomic research data.
http://www.scidatacon.org/2016/sessions/48/paper/26/
http://www.scidatacon.org/2016/sessions/48/
International data week - #RDAPlenary #IDW2016
Why we care about research data? Why we share? | Richard Ferrers
An introduction to why ANDS cares about research data. ANDS, the Australian National Data Service, encourages researchers to share data. This presentation explains why.
Why the food sector needs a research infrastructure on Food and Health Consum... | e-ROSA
Bent Egberg Mikkelsen and Karin Zimmermann's presentation at the eROSA Workshop “Towards Open Science in Agriculture & Food”, a side event to High Level conference on FOOD 2030, Plovdiv, Bulgaria (13/6/2018)
Briefing on US EPA Open Data Strategy using a Linked Data Approach | 3 Round Stones
An overview presented by Ms. Bernadette Hyland on 18 November 2014 on the US EPA Open Data strategy, focusing on the Resource Conservation & Recovery Act (RCRA) dataset to be published as linked data. This work is in support of Presidential Memorandum M13-13 - Open Data Policy and Managing Information as an Asset.
Research Integrity Advisor and Data Management | ARDC
Dr Paul Wong from the Australian Research Data Commons presented at the University of Technology Sydney's RIA Data Management Workshop on 21 June 2018. In partnership with the Australian Research Council, the National Health and Medical Research Council, the Australian Research Data Commons, and RMIT University, this is part of a national workshop series in data management for research integrity advisors.
Funding agencies are instituting requirements for data management and sharing as a condition of receiving research funds. This presentation addresses why researchers should care about research data management, what libraries have to do with it, and a case study of what one research specialist at the University of Colorado Anschutz Medical Campus is doing in this area.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... | John Andrews
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... | pchutichetpong
M Capital Group (“MCG”) expects demand and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home (“WFH”), alongside an ever-expanding need for data storage as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment, will drive market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to be over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. The marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to reduce the work per iteration, and the other is to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices (vertices with the same in-links) helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance, since the final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
5. 140+ datasets live on opendata.bristol.gov.uk
Some real time data
Transport API repository now available
Examples
Government: Elections since 2007
Community: Quality of Life survey
Education: School Results
Energy: Installed PV, Energy Use in Council Buildings
Environment: Real time & Historic Air Quality, Flood Alerts (EA)
Land use: 2013 Planning applications
Health: Life expectancy/ Mortality, Obesity, NHS Spend
Bristol is Open - datasets
13. Bristol Online Surveys
• Launched in 2003 and redeveloped in 2013-15
• Used by over 85% of UK HEIs
• Used for benchmarking data sharing "clubs"
www.onlinesurveys.ac.uk
15. Sharing "un-sharable" data
DataSHIELD is an R library that enables the remote and non-disclosive analysis of sensitive research data. Users are not required to have prior knowledge of R.
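The general idea of remote, non-disclosive analysis can be illustrated with a small sketch. This is not DataSHIELD's API or implementation (DataSHIELD itself is an R library); the `Site` class, the `pooled_mean` function and the `MIN_RECORDS` threshold are hypothetical names and values invented for this example. The assumption is simply that individual-level records stay at each site, and only aggregate summaries are released to the analyst.

```python
MIN_RECORDS = 5  # hypothetical disclosure threshold, chosen for illustration


class Site:
    """A data holder that keeps individual-level records private."""

    def __init__(self, name, values):
        self.name = name
        self._values = list(values)  # never returned to the analyst

    def summary(self):
        """Release only aggregates, and refuse if the group is too small."""
        if len(self._values) < MIN_RECORDS:
            raise PermissionError(
                f"{self.name}: too few records to release a summary"
            )
        return {"n": len(self._values), "total": sum(self._values)}


def pooled_mean(sites):
    """Combine per-site aggregates into one mean without seeing raw data."""
    summaries = [site.summary() for site in sites]
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


if __name__ == "__main__":
    sites = [
        Site("site_a", [1, 2, 3, 4, 5]),
        Site("site_b", [10, 10, 10, 10, 10]),
    ]
    print(pooled_mean(sites))
```

The analyst obtains the pooled statistic while each site decides what it is willing to release, which is one way to read "mediated" sharing of data that cannot be made fully open.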
16. Data Sharing & Standards
Some questions for discussion...
• Freely available data isn't free, so who should pay for sharing the data?
• What are the pros and cons of mediated data sharing?
• What are the "standards" for data diplomacy?
• And does that question even make sense?