For the digital preservation panel at CUWL 2014. This talk covers the background for research data curation, the challenges of preserving research data, and some strategies for building curation services.
Responsible Conduct of Research: Data Management | Kristin Briney
I gave this presentation with Brad Houston (http://www.slideshare.net/herodotusjr) for UWM's Responsible Conduct of Research (RCR) series in fall 2013. It covers data management plans and practical data management tips. The corresponding handout is also available on Slideshare: http://www.slideshare.net/kbriney/rcr-data-management-handout
This is Twitter 101 for academic researchers. Learn why to tweet, the anatomy of a tweet, what to tweet about, how to build a network, and the basics of live tweeting.
This document discusses best practices for storing and backing up data. It recommends having multiple copies of data stored in different locations (the "rule of 3" with 2 copies onsite and 1 offsite). Acceptable storage locations include computer hard drives, external hard drives, shared network drives, magnetic tapes, CDs/DVDs, cloud storage, and USB flash drives. The document also stresses the importance of regularly backing up data and testing backups to ensure the ability to recover files. Cloud storage services are mentioned but users are advised to carefully read the terms of service which may give the provider broad rights over uploaded content.
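The advice above to keep several copies and to test backups can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the folder names (`onsite_external`, `offsite`), the sample file name, and the helper functions are hypothetical and do not come from the original document, and temporary folders stand in for real storage locations.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy the source file into each destination folder and return the copies."""
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        copies.append(Path(shutil.copy2(source, folder / source.name)))
    return copies


def verify(source: Path, copies: list[Path]) -> bool:
    """A backup passes only if every copy matches the original's checksum."""
    expected = sha256_of(source)
    return all(sha256_of(copy) == expected for copy in copies)


def demo() -> bool:
    """Hypothetical layout: temp folders stand in for a second onsite drive
    and an offsite location, giving three copies in total."""
    with tempfile.TemporaryDirectory() as root:
        base = Path(root)
        original = base / "working" / "2014-05-01_results.csv"
        original.parent.mkdir()
        original.write_text("sample,value\nA,1\n")
        copies = back_up(original, [base / "onsite_external", base / "offsite"])
        return verify(original, copies)


if __name__ == "__main__":
    print(demo())
```

Comparing checksums confirms that a copy is byte-identical to the original, but it is not a full restore test; opening the restored file in its usual software is still worth doing.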
This document discusses the importance of lab notebooks for scientific data management, both currently and in the future. It identifies that lab notebooks are a critical tool for organizing pre-publication research data but practices vary widely. Ideal notebooks would contain all raw data, metadata, analyses, and citations in an electronic, searchable format. The document outlines how librarians can help by developing resources on best practices for organizing digital data and recording this in notebooks, as well as instruction on electronic notebook software. It recognizes that notebooks are shifting to fully digital formats and this will further impact data management.
Slides from the 2016 Aug 1 Digital Science webinar. I spoke about how data management does not need to be a barrier and gave my top 5 tips for managing your data better.
NIH Data Policy or: How I Learned to Stop Worrying and Love the Data Manageme... | Kristin Briney
This document summarizes the key points from a presentation about NIH data management and sharing plan requirements. It discusses why these plans are now required for grants over $500,000, how to write an effective plan including what data to share, when, where, who will access it, and how it will be prepared. It also provides tips for effective long-term data management practices like file organization, documentation, backup plans, and security. Resources for creating data management plans and getting help from librarians and tools are also mentioned.
This document discusses organizing data files through proper file organization and naming conventions. It recommends keeping files organized by project, analysis type, date or other logical scheme. Consistent naming conventions make files easier to find and avoid duplicates. The date format YYYY-MM-DD is suggested. Examples show files organized by site and sample number or author and title. Maintaining an organized filing system from the start helps ensure data remains usable over time.
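A naming convention like the one described above can be generated rather than typed by hand. The following is a minimal sketch, assuming a date, site, and sample-number scheme; the function name, the `sample` prefix, and the three-digit padding are illustrative choices, not taken from the original document.

```python
import re
from datetime import date


def data_file_name(day: date, site: str, sample: int, extension: str = "csv") -> str:
    """Build a name such as 2014-05-01_SiteA_sample003.csv."""
    # Keep only letters and digits in the site label so the name has no spaces
    # or punctuation that some systems handle badly.
    clean_site = re.sub(r"[^A-Za-z0-9]+", "", site)
    # date.isoformat() yields YYYY-MM-DD; zero-padding keeps sample numbers aligned.
    return f"{day.isoformat()}_{clean_site}_sample{sample:03d}.{extension}"


if __name__ == "__main__":
    names = [
        data_file_name(date(2014, 1, 2), "Site A", 12),
        data_file_name(date(2013, 12, 31), "Site A", 3),
    ]
    # With YYYY-MM-DD first, plain alphabetical sorting is also chronological.
    for name in sorted(names):
        print(name)
```

Putting the date first in YYYY-MM-DD form means an ordinary alphabetical file listing is already in chronological order.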
This document provides guidance on creating a data management plan (DMP). It explains that DMPs are required by many funders to help researchers better organize, document, and preserve their data. The key parts of a DMP include describing the data, metadata standards, data security, archiving and preservation, and access. The presenter provides tips for addressing each part, such as using open formats and partnering with repositories. Resources for creating a DMP at the University of Wisconsin-Milwaukee are also listed.
Practical Data Management - ACRL DCIG Webinar | Kristin Briney
This document summarizes a webinar on practical data management. It discusses best practices for file organization, naming conventions, documentation, storage, backups, and ensuring future usability. Key recommendations include organizing files logically by project or type, using consistent naming conventions, thoroughly documenting data collection and analysis methods, storing data in multiple locations both on and off-site, backing up data regularly including testing backups, and future-proofing data through file format conversion and migration to new media. Resources for further information on data management best practices are also provided.
Lab Notebooks as Data Management (SLA Winter Virtual Conference 2012) | Kristin Briney
This talk, aimed at librarians, describes the data management issues surrounding paper and electronic lab notebooks. It offers several ways for librarians to support good practices and the transition from paper to electronic.
This document discusses issues with reproducibility and data availability in scientific research. It notes that published research is merely an advertisement of the underlying scholarship and data, and that data availability declines rapidly as articles age. Several studies are cited showing limited ability to translate preclinical findings to the clinic due to reproducibility issues, and examples of academic fraud are provided that undermine trust in published results without available data. Overall, the document argues for the importance of data availability to verify and build upon published research findings.
This talk reviews tips and tools for leveling up your data management skills. Areas covered include: storage, file naming conventions, version control, documentation, and data clean up.
An overview of the current state of electronic laboratory notebooks (ELNs), pros and cons of using an ELN, and important considerations for adopting an ELN.
NISO Webinar on data curation services at the CDL | Carly Strasser
"Building communities and Services in Support of Data-Intensive Research". Webinar on 18 Sept 2013 for the NISO Webinar Series. This was part 2 of 2 for Data Curation
A presentation given at the Coalition for Networked Information describing efforts undertaken by 3 partnered organizations (UCSF CTSI, UCSF Library, California Digital Library) to support sharing of research data by UCSF investigators
This document discusses sharing research data. It describes the Data Services Center, which provides data services including finding and providing access to datasets. It notes that funders and publishers require data sharing, and that shared data receives more citations. It recommends sharing the minimum data needed to reproduce results, and considering timing, usability and granularity of data sharing. For sharing methods, it recommends using disciplinary or general repositories like UR Research, Dryad and REACTUR, which provide long-term preservation and access. Workshops and help are available for data management and sharing.
Applying research methods: Investigating the Many Faces of Digital Visitors &... | Lynn Connaway
Connaway, L. S. (2018). Applying research methods: Investigating the Many Faces of Digital Visitors & Residents. Presented at the American University, March 29, 2018, Rome, Italy.
Applying research methods: Investigating the Many Faces of Digital Visitors &... | OCLC
Connaway, L. S. (2018). Applying research methods: Investigating the Many Faces of Digital Visitors & Residents. Presented at the American University, March 29, 2018, Rome, Italy.
This document discusses how data is driving decisions in research. It notes that the amount of data being generated is growing exponentially and researchers are now in the data business. It outlines four transformations needed - from unmanaged to managed data, disconnected to connected data, invisible to findable data, and single-use to reusable data. National strategies in Australia are aiming to support these transformations through initiatives like the Australian National Data Service which provides resources and expertise to help researchers manage, connect, and enable reuse of research data.
HIBERLINK: Reference Rot and Linked Data: Threat and Remedy | PRELIDA Project
This document discusses reference rot in linked data and proposes remedies. It defines reference rot as occurring when links to web resources no longer point to the original content. Empirical evidence from analyses of journal articles and e-theses shows that over one third of references experience rot. Proposed remedies include a Hiberlink plug-in to enable proactive archiving, augmenting links with temporal context using the Missing Link approach, and a HiberActive system for repositories to actively archive references. The goal is to increase the chances of accessing referenced content over time by embedding archiving solutions into existing authoring and publishing workflows.
Digitization Process by Audra Eagle Yun | Craig FANSLER
The document provides an overview of getting started with digitization for an organization. It discusses organizing a digitization project, including prioritizing projects and creating a team. It also covers choosing equipment like hardware and software, and setting up a digital production station. The document outlines the scanning process, file organization, and storage. Finally, it discusses options for publishing and sharing digitized content both locally and through free online platforms.
The document provides an overview of getting started with digitization for an organization. It discusses organizing a digitization project, including prioritizing projects and creating a team. It also covers choosing equipment like hardware and software, setting up a scanning station, and scanning processes. Finally, it discusses options for publishing and storing digital files, including free services like Flickr and preserving content through the North Carolina Digital Heritage Center.
The document outlines a 23 Things program for research data management training, which releases weekly activities and has monthly webinars, and provides a calendar of events and list of coordinators for the program at UWA.
This document discusses challenges with curating and sharing research data to support reuse. It notes that while the amount of digital research data being created is growing rapidly, current systems for preserving data are not optimally designed with input from researchers. Researchers have various concerns about openly sharing their data that need to be addressed. Studies found that engaging researchers early and building trusted relationships is important for developing effective data curation solutions tailored to different research practices and disciplines.
Delivered by Peter Burnhill, Director of EDINA, at the PRELIDA Consolidation and Dissemination workshop on 17/18 October 2014 (http://prelida.eu/consolidation-workshop).
Summary: The web changes over time, and significant reference rot inevitably occurs. Web archiving delivers only a 50% chance of success. So in addition to the original URI, the link should be augmented with temporal context to increase robustness.
This document provides an overview of resources for librarians to self-educate on data science basics, software, and the library's role in data management. It recommends introductory readings on cyberinfrastructure, data challenges, and evolving library services. More advanced readings include syllabi on digital curation. The document also lists blogs, conferences, and organizations for continuing education, as well as tools for tasks like data curation, metadata, and visualization.
This document summarizes a presentation about meeting federal data sharing requirements. It discusses the history of these requirements and defines good practices for data sharing and stewardship. It also reviews some public data sharing services and provides tips for evaluating them. Key aspects of good data sharing include maximizing access, protecting privacy, ensuring proper attribution, and having long-term preservation and sustainability plans. The presenter emphasizes that restricted-use or sensitive data can be effectively shared through secure virtual environments.
This document provides an overview of DataShare, a system being developed to facilitate research data sharing across the University of California campuses. It begins with background on goals of catalyzing data sharing and lowering barriers. There is then a demo of the UCSF DataShare instance, along with technical details of the system components and interactions for depositing and downloading data. Other details covered include branding, customization, costs, and governance agreements. The document concludes with discussion of next steps, including potential additional features, communication plans, and timelines for getting initial instances set up and customized at each campus.
Data Management Solutions from Libraries at NSF Large Facilities Workshop | Carly Strasser
This document discusses data management solutions that libraries can provide. It notes that data management has become a hot topic due to the growth of digital data and requirements for data sharing and reproducibility. Libraries are well-suited to help with data management tasks across the data lifecycle such as creating data management plans, describing and sharing data, and preserving and citing data. The document provides examples of specific tools and services libraries can offer.
Similar to Research Data & Digital Preservation - CUWL Conference 2014 (20)
Slides from NCURA's webinar "Part I: Public Access: Practical Ways To Assist Faculty To Comply With Public Access Policies". This is the last section of the webinar, on open data.
The document discusses basic strategies for protecting internet privacy. It recommends patching programs, using antivirus software, choosing strong passwords, and using privacy-focused search engines and ad blockers like DuckDuckGo, Privacy Badger, and uBlock Origin. The document also suggests using HTTPS Everywhere and a VPN to encrypt traffic, and mentions the Tor network. It notes that internet service providers can track browsing activity like a "doorman" tracks visitors, and that traffic patterns alone can reveal information, so additional privacy measures may be needed.
This presentation is an updated version of my Data Management 101 talk, which covers the basics of research data management in the categories of: storage and backup, documentation, organization, and making files usable for the future.
Learn the basics of managing your research well, covering the topics of: file organization and naming, documentation, storage and backups, and future file usability.
Talk given for UW-Madison Ebling Library and School of Medicine and Public Health on 3 Dec 2013. It covers electronic laboratory notebooks and what to look for in the software.
This presentation is a crash course on practical data management. It is actually a portion of this talk (http://www.slideshare.net/kbriney/responsible-conduct-of-research-data-management) on data management and management plans, but I think the slides are useful enough to stand on their own.
Basic tips for managing research data. This is the accompanying handout for the RCR presentation here: http://www.slideshare.net/kbriney/responsible-conduct-of-research-data-management
This document provides a checklist for developing a data management plan. It addresses what data will be created, how it will be documented, protected, archived, and shared. Key questions cover the size and growth of data, storage methods, standards, metadata, security, file formats, long-term responsibility, and access policies. Best practices emphasized include prioritizing unique data, automated backups, community standards, preserving documentation, consulting security experts, using open formats, and archiving data in disciplinary repositories.
This presentation covers a number of best practices for managing research data. The main topics include: file naming and organization conventions, data documentation, and data storage and backups.
GraphRAG for Life Science to increase LLM accuracy | Tomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers.
Taking AI to the Next Level in Manufacturing.pdf | ssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
TrustArc Webinar - 2024 Global Privacy Survey | TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Driving Business Innovation: Latest Generative AI Advancements & Success Story | Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf | Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Project Management Semester Long Project - Acuity | jpupo2018
Acuity is an innovative learning app designed to transform the way you engage with knowledge. Powered by AI technology, Acuity takes complex topics and distills them into concise, interactive summaries that are easy to read & understand. Whether you're exploring the depths of quantum mechanics or seeking insight into historical events, Acuity provides the key information you need without the burden of lengthy texts.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers | akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Main news related to the CCS TSI 2023 (2023/1695) | Jakub Marek
An English 🇬🇧 translation of the slides for the talk I gave on the main changes brought by CCS TSI 2023 at the biggest Czech conference on communications and signalling systems on railways, held at the Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). It was attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Webinar: Designing a schema for a Data WarehouseFederico Razzoli
Are you new to data warehouses (DWH)? Do you need to check whether your data warehouse follows the best practices for a good design? In both cases, this webinar is for you.
A data warehouse is a central relational database that contains all measurements about a business or an organisation. This data comes from a variety of heterogeneous data sources, which includes databases of any type that back the applications used by the company, data files exported by some applications, or APIs provided by internal or external services.
But designing a data warehouse correctly is a hard task, which requires gathering information about the business processes that need to be analysed in the first place. These processes must be translated into so-called star schemas, which means, denormalised databases where each table represents a dimension or facts.
We will discuss these topics:
- How to gather information about a business;
- Understanding dictionaries and how to identify business entities;
- Dimensions and facts;
- Setting a table granularity;
- Types of facts;
- Types of dimensions;
- Snowflakes and how to avoid them;
- Expanding existing dimensions and facts.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
5. Retention
Vines et al., The Availability of Research Data Declines Rapidly with Article Age, Current Biology (2014), http://dx.doi.org/10.1016/j.cub.2013.11.014
6. Requirements
• Data management
• Data retention
• Data sharing
Data curation
Steven Depolo, https://www.flickr.com/photos/stevendepolo/3242308007 (CC BY)
18. Data Management
• At minimum, work on data management
• Be part of the conversation earlier
• Have data in a better format when it comes to you
Todd Chandler, https://www.flickr.com/photos/trchandler/10120260443 (CC BY-SA)
19. Data Management is an Easier Sell
Josh, https://www.flickr.com/photos/ncindc/9633818260 (CC BY-ND)
20. Need Resources to Curate Data
• People
• Time
• Infrastructure
Toni Verdu Carbo, https://www.flickr.com/photos/tonivc/2283676770 (CC BY-NC-ND)
21. Run a Pilot
liz west, https://www.flickr.com/photos/calliope/2760112757 (CC BY)
22. More Than Just Preservation
• Data management
• Data retention
• Data sharing
Susy Morris, https://www.flickr.com/photos/chiotsrun/4115054476 (CC BY-NC)
23. Thanks!
• This presentation is CC-BY except images as noted
• Slides will be available at http://www.slideshare.net/kbriney