A talk, given as part of the FIU CIS invited lecture series, on selected InferLink Corp. R&D and commercialization work in AI and data-driven solutions.
AI and Data, for Good
1. AI and Data, for Good
Naveen Ashish
InferLink Corporation
March 28th, 2019
Florida International University (FIU), Computing & Information Sciences Lecture
6. ActiveSearch: Background
• ISS Example: Find recent documents mentioning Canada and Islamist Extremist Groups (e.g., Report Desk documents)
• AFIA Example: Find Russian aircraft mentioned in Central Africa (e.g., on Jetphotos.com)
• Cyber Intelligence example: Find reports of web browsers with cross-site scripting vulnerabilities
• ISS TopicBuilder example: Find articles on “Al-Qaeda in the Arabian Peninsula” (“Ansar al-Sharia”)
• Information retrieval: broad coverage, too general, no notion of entities, relations, events, etc.
• Information extraction: notion of entities and concepts, but too specific, needs customization, hard failures
7. ActiveSearch
• A research engine
• Not one-size-fits-all (Google)
• Take advantage of current natural language technology
• Plug and play
• Works out-of-the-box
• Immediate value, rapid response
• Easy to customize to a domain
15. ActiveSearch Use Case: Cytenna
• Analysts want to identify patterns in the vulnerabilities and exploits
• What type of software is being exploited?
• Web servers, browsers, operating systems, etc.
• What types of attacks are being committed?
• Denial of service, buffer overflow, XSS, etc.
• More powerful search technologies are needed to collect the data for pattern analysis
“ransomware” as a concept
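The idea of searching by concept rather than by keyword, and then counting what turns up, might be sketched as follows. This is a minimal illustration only, not the ActiveSearch implementation: the lexicons, the `tag` and `pattern_counts` helpers, and the sample reports are all hypothetical and made up for this example.

```python
import re
from collections import Counter

# Hypothetical concept lexicons: each concept label maps to surface terms.
SOFTWARE = {
    "web server": ["web server", "apache", "nginx"],
    "browser": ["browser", "chrome", "firefox"],
    "operating system": ["operating system", "windows", "linux"],
}
ATTACK = {
    "denial of service": ["denial of service", "dos"],
    "buffer overflow": ["buffer overflow"],
    "XSS": ["xss", "cross-site scripting"],
}


def tag(text, lexicon):
    """Return the sorted concept labels whose terms occur in the text."""
    lowered = text.lower()
    return sorted(
        label
        for label, terms in lexicon.items()
        if any(re.search(rf"\b{re.escape(term)}\b", lowered) for term in terms)
    )


def pattern_counts(reports):
    """Count (software concept, attack concept) pairs across reports."""
    counts = Counter()
    for report in reports:
        for software in tag(report, SOFTWARE):
            for attack in tag(report, ATTACK):
                counts[(software, attack)] += 1
    return counts


# Made-up report snippets, for illustration only.
REPORTS = [
    "Cross-site scripting flaw in Firefox allows script injection",
    "Buffer overflow in Apache module crashes the web server",
    "Linux kernel bug enables denial of service",
    "XSS vulnerability reported in Chrome browser",
]

for pair, n in pattern_counts(REPORTS).most_common():
    print(pair, n)
```

A fixed word list like this is exactly the kind of hand customization the slides describe as a weakness of information extraction; it is used here only to make the notion of a concept concrete.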
21. Human Trafficking in the US
Profits per Year: $32 Billion
Average Age of Entry to Prostitution in the US: 14
Pimp’s Profit per Victim per Year: $150,000
Advertising Budget on the Web: $45 Million
Find the locations where a potential victim of human trafficking was advertised
28. “YOU don't wanna miss out on ME :) Perfect lil booty Green eyes Long curly black hair Im a Irish,Armenian and Filipino mixed princess :) ❤ Kim ❤ 7○7~7two7~7four77 ❤ HH 80 roses ❤ Hour 120 roses ❤ 15 mins 60 roses”
Text Extraction
name: Kim
eye-color: green
hair-color: black
phone: 707-727-7477
rate: $60/15min $80/30min $120/60min
41. PRODUCT
Volumes of Data … Per Day
10,000 papers published per day
Data sources:
Example congress abstracts:
Just to keep up, assuming 5 minutes/paper = 24 scientists reading 24 hours per day
42. PRODUCT
Guselkumab was also superior (P < .001) to adalimumab for Investigator Global Assessment 0/1 and PASI 90 at week 16 (85.1% vs 65.9% and 73.3% vs 49.7%), week 24 (84.2% vs 61.7% and 80.2% vs 53.0%), and week 48 (80.5% vs 55.4% and 76.3% vs 47.9%).
What do these refer to?
To our AI, it looks like this… but, instantly (25,000,000 articles per hour)
Intervention | Outcome | Measurement
guselkumab | Investigator Global Assessment 0/1 at week 48 | 80.5%
adalimumab | Investigator Global Assessment 0/1 at week 48 | 55.4%
AI: Automated Reading
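To make the target structure concrete, the sentence on this slide can be unpacked into (intervention, outcome, week, value) rows with two hand-written patterns. This is only a sketch of the output shape: the `HEAD` and `WEEK` patterns and the `read_results` helper are hypothetical, fit this one sentence, and say nothing about how the automated reader in the talk actually works.

```python
import re

SENTENCE = (
    "Guselkumab was also superior (P < .001) to adalimumab for Investigator "
    "Global Assessment 0/1 and PASI 90 at week 16 (85.1% vs 65.9% and 73.3% "
    "vs 49.7%), week 24 (84.2% vs 61.7% and 80.2% vs 53.0%), and week 48 "
    "(80.5% vs 55.4% and 76.3% vs 47.9%)."
)

# Sentence head: two interventions and two outcomes (pattern fits this phrasing only).
HEAD = re.compile(
    r"(?P<a>\w+) was also superior .*? to (?P<b>\w+) "
    r"for (?P<o1>.+?) and (?P<o2>.+?) at week"
)
# Each time point: "week N (x% vs y% and z% vs w%)"
WEEK = re.compile(
    r"week (\d+) \(([\d.]+)% vs ([\d.]+)% and ([\d.]+)% vs ([\d.]+)%\)"
)


def read_results(sentence):
    """Return (intervention, outcome, week, percent) rows from the sentence."""
    head = HEAD.search(sentence)
    if head is None:
        return []
    a, b = head["a"].lower(), head["b"].lower()
    o1, o2 = head["o1"], head["o2"]
    rows = []
    for week, a1, b1, a2, b2 in WEEK.findall(sentence):
        week = int(week)
        rows.append((a, o1, week, float(a1)))
        rows.append((b, o1, week, float(b1)))
        rows.append((a, o2, week, float(a2)))
        rows.append((b, o2, week, float(b2)))
    return rows


for row in read_results(SENTENCE):
    print(row)
```

The difficulty the slide points at is resolving which percentage belongs to which drug, outcome, and time point; here that pairing is hard-coded by position inside the parentheses.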
50. AI Community
• Journal of Artificial Intelligence Research
• https://jair.org
• AI Access
• AI Resources
51. We did not cover :)
R&D
Tools
Solutions
Active Search
Complex Information Linkage
RSX
Spin-offs
MachineReading
AutoScience
Evid Science
OpenAI
AI Resources
EntityBase
ConnectD
OpenWatch
CodeFault
TBN
52. Active Interests in AI & Data Science
• Health informatics and Biomedical research
• Cybersecurity
• Homeland security, emergency and disaster response
• Data driven engineering design
• Precision agriculture, E-Governance
• Transportation safety
• ….
53. Acknowledgements
• Dr. Steven Minton, President and CEO, InferLink
• Dr. Greg Barish, CTO, InferLink & CEO, Cytenna
• Dr. Matt Michelson, CEO, Evid Science
• Dr. Pedro Szekely, Research Associate Professor, USC/ISI