This is the second edition of Machine Learning and Language. If it seems to be almost identical to the initial version, which focused on a different area of science, that's the point...
Machine learning and language v2
1.
Object-Oriented Data Governance
Overview
Global IT Solutions
Intuitive, Cost-Effective, Data-Centric, Scalable Solutions
Global IT Solutions (GITS) Presents:
Machine Learning and Language v2
4. Answer is - Untapped Potential!
As in other industries, 80% of information is Unstructured and buried in Artifacts:
o Journals
o Websites
o Publications
Scientists create Artifacts to share knowledge
If Knowledge is power and Information fuels Knowledge - then
logic dictates that vast opportunities are being missed
So you’ve digitized/scanned your documents…
You’ve provided document-to-document links on the Web…
Information stored in documents (unstructured data) is still
‘buried’ – you can’t link it to structured/geospatial data
Thus, you’re still not getting the results you were looking for…
The Problem
5. It doesn’t have to be this way…
You can build a Roadmap that leverages both
Structured and Unstructured Data Strategies
You can forge Interoperability/Collaboration, Globally
You can enlist Machines to exceed the limits of humans
The Future is here
The contents of Unstructured documents from various
scientists can be shared and linked in a meaningful way
Can you measure the benefits of your improvements?
Do your plans include building and implementing the
framework that is necessary to sustain 'Machine
Learning'?
Let's talk a little bit about Machine Learning benefits and
hurdles...
Slide 5 of 13
6. The Experts say that ‘Machine Learning’ can achieve your objectives…
You may not know that much about Machine Learning (ML)…
But you know enough to know you don’t want what's behind Doors #2 and #3…
You also know that nothing is as easy as they say it is
Question: So, who’s right?
7. The Answer: You both are (you and the experts)
You can ‘teach the Machine’ to learn and help:
Discover patterns and similarities across millions of Artifacts
Impart Knowledge contained in Unstructured Text and Structured Data
Make Inferences and Extrapolations on what you provide
Aid in making decisions
Exceed the limits of humans
But you are also right: there will be hurdles…
The hurdles are rooted in both the Machine and Humans
GITS uses the term ‘hurdles’ deliberately – the following items are not ‘problems’; they are just realities that have to be addressed
Machine Learning
8. Hurdle #1: Machine Learning is a gradual process…
Reality #1 – When teaching new concepts to the Machine, assume it thinks like a Child
Reality #2 – You also must think like a Child, to understand the ML process
Reality #3 – You can’t assume the Machine has grasped a concept; you have to prove it
Reality #4 – Machine Learning Maturity is obtained through trial-and-error – you need to conduct ‘experiments’
Reality #5 – You don’t need to be a genius to conduct trial-and-error ML experiments
Reality #6 – You do need to keep track of your experiments to determine how the Machine has matured
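Reality #6 can be illustrated with a very small experiment log. The sketch below is only an illustration: the `Experiment` record, its field names, the sample values, and the choice of accuracy as the measure of maturity are all assumptions and are not taken from the GITS Methodology.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Experiment:
    """One trial-and-error run (hypothetical record layout)."""
    name: str      # label for the concept being taught
    correct: int   # test items the Machine answered correctly
    total: int     # test items presented

    @property
    def accuracy(self) -> float:
        # Guard against an empty test set.
        return self.correct / self.total if self.total else 0.0


def maturity_trend(log: List[Experiment]) -> List[float]:
    """Change in accuracy between each pair of consecutive experiments."""
    scores = [e.accuracy for e in log]
    return [round(later - earlier, 4) for earlier, later in zip(scores, scores[1:])]


# Made-up sample values, for illustration only.
log = [
    Experiment("duff-v1", correct=6, total=10),
    Experiment("duff-v2", correct=7, total=10),
    Experiment("duff-v3", correct=9, total=10),
]
trend = maturity_trend(log)
```

A positive entry in `trend` suggests the Machine improved between two experiments; a negative one suggests the latest change set it back. Either way, the log is what turns "I think it has learned" into something that can be checked (Reality #3).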
9. Hurdle #2: The Human Language is fluid…
Reality #7 – There is a ‘Vernacular’, collectively among Colleagues and independently among Authors
When individuals speak, it is common to use Synonyms, Homonyms and Homographs (e.g., ‘Duff’ instead of ‘Coal Duff’; ‘Duff’ instead of ‘Soil Duff’)
Reality #8 – People work in Silos
This is a fact of life. You can't change it. People like their Silos.
Within a given Silo, as Unstructured/Structured Data is captured, Reality #7 is not a problem
In an Integrated Environment, Reality #7 is a problem
Reality #9 – Enterprises rarely understand the importance of having Ontologies/Taxonomies
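The ‘Duff’ example shows why the same word is harmless inside one Silo and ambiguous in an Integrated Environment. A minimal sketch of that idea follows; the silo names (`mining`, `forestry`), the `VERNACULAR` table, and the `resolve` function are hypothetical and exist only to illustrate the point.

```python
from typing import Dict, Optional, Tuple

# Hypothetical vocabulary: (silo, vernacular term) -> canonical taxonomy term.
# The silo names are assumptions chosen for illustration.
VERNACULAR: Dict[Tuple[str, str], str] = {
    ("mining", "duff"): "Coal Duff",
    ("forestry", "duff"): "Soil Duff",
}


def resolve(term: str, silo: Optional[str]) -> Optional[str]:
    """Map a vernacular term to its canonical term using the author's silo.

    Returns None when the silo is unknown (the term cannot be disambiguated)
    or when the vocabulary has no entry for that silo and term.
    """
    if silo is None:
        return None
    return VERNACULAR.get((silo.strip().lower(), term.strip().lower()))


in_mining = resolve("Duff", "mining")      # resolved inside one Silo
in_forestry = resolve("Duff", "Forestry")  # same word, different Silo
integrated = resolve("Duff", None)         # no Silo context: ambiguous
```

Within a Silo the context is implicit, so ‘Duff’ resolves cleanly. Once documents from several Silos are pooled, that context must be recorded explicitly, which is the role an Ontology/Taxonomy plays.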
10. Taxonomy Example
The GITS Methodology:
Provides visual representations of Taxonomies (e.g.,
Venn, Hierarchy) specific to the language of the business
Stores Taxonomies as Meta-Data
Provides the ability to link
Unstructured Data
Structured Data
Geospatial Data
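One way to picture a Taxonomy stored as Meta-Data is as a set of nodes, each carrying links to the three kinds of data listed above. The sketch below is an assumption about what such a structure could look like, not a description of the GITS implementation; the `TaxonomyNode` layout, the identifiers, and the coordinates are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TaxonomyNode:
    """A taxonomy term plus links to data tagged with it (hypothetical layout)."""
    term: str
    parent: Optional[str] = None                         # broader term, if any
    documents: List[str] = field(default_factory=list)   # unstructured: document ids
    records: List[str] = field(default_factory=list)     # structured: record keys
    locations: List[Tuple[float, float]] = field(default_factory=list)  # (lat, lon)


def ancestors(taxonomy: Dict[str, TaxonomyNode], term: str) -> List[str]:
    """Walk up the hierarchy from a term to its root, nearest parent first."""
    path: List[str] = []
    current = taxonomy[term].parent
    while current is not None:
        path.append(current)
        current = taxonomy[current].parent
    return path


# Placeholder identifiers and coordinates, for illustration only.
taxonomy = {
    "Duff": TaxonomyNode("Duff"),
    "Coal Duff": TaxonomyNode(
        "Coal Duff",
        parent="Duff",
        documents=["journal-001"],
        records=["sample-42"],
        locations=[(0.0, 0.0)],
    ),
}
node = taxonomy["Coal Duff"]
broader_terms = ancestors(taxonomy, "Coal Duff")
```

Because all three link lists hang off the same term, a query for ‘Coal Duff’ can return the journal passage, the structured record, and the map location together, instead of leaving the document content ‘buried’.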
11. GITS is realistic about the hurdles…
GITS doesn’t attempt to change these Realities; our Methodology accommodates them
Before you teach the machine, GITS can show you how to manage the language
GITS will develop a Framework that can sustain Machine Learning
GITS will help you to ‘Practice what you teach’ the Machines
If you manage the language properly, you can exceed your expectations
12. GITS provides The Solution
The GITS Methodology:
– Mitigates ‘Untapped Potential’
– Uses Ontologies/Taxonomies (as diagrams and Metadata)
– Links meaningful content from Unstructured Documents to Structured Data/Geospatial Information
– Creates an environment amenable to efficient Machine Learning
– Facilitates Machine Learning
GITS understands how to:
– Use Machines to exceed the limits of humans
– Provide Cost-Effective, Data-Centric Solutions
13.
Are the following part of your Solutions Framework?
Bi-Temporal Time Series Solutions
Interoperability, Collaboration and Operational Efficiency
Unstructured/Structured Data Analytics
Multifaceted Business Intelligence (i.e., Unstructured/Structured Data, Geospatial)
Leveraging Social Media and Big Data
Ontology/Taxonomy Management and Implementation
Data Architecture/Data Science
Cost-Based Data Governance
Preparation for and usage of Machine Learning
If not, discover why they should be – contact us for a free Consultation Session