Data Mining and Data Visualization – Tools to Allow Students to do BIG STUFF with BIG DATA - Course Technology Computing Conference

Presenter: Dan Matthews, Trine University

When beginners first hear the term “data mining,” they wonder, “What kind of mining could a computer possibly do? It must be awfully hard. What would the end product of data mining look like?” Data mining (analytics) is becoming a core skill for an unprecedented number of professions. Software environments exist that make the process efficient for the data miner. Tableau is one of the systems I use in my data mining class. The software helps accelerate the process of converting data not just to information but to knowledge, with intuitive drag-and-drop technology that lets you stop worrying about how to connect to data and spend your time answering questions and forming relationships (knowledge) using critical thinking and creative association. With Tableau's speed and ease of use, students find themselves doing more complex analyses in less time.

Tableau has an academic program that gives full-time students professional-grade analytics software, in the form of Tableau Desktop, to help prepare them for working in an increasingly data-driven world. Students use Tableau Desktop for class work and extracurricular projects. Tableau also offers instructors free access to Tableau Desktop to equip them to teach the next generation of data scientists (miners) and analysts. In addition to software, Tableau recognizes that materials and support are essential to teaching with a tool, and to that end it offers a variety of solutions for different classrooms. Dozens of universities are using Tableau in data mining classes.

I want to share how I use the resources available to me to deliver quality instruction in this very important new technology discipline. I will define data mining (as best I can), discuss why the subject is so important, and discuss a variety of applications. Most of all, I will demonstrate some fun things students can do by mining the big data sets available in the cloud.

Published in: Education

  1. Students Doing BIG STUFF with BIG DATA. Dan Matthews – Trine University
  2. Trine University – Angola, Indiana
  3. INFORMATICS – OUR WAY: “The success of computing is in the resolution of problems, found in areas that are predominately outside of computing.”
  4. Data Mining AKA: Information Harvesting, Knowledge Mining, Knowledge Discovery, Data Dredging, Data Pattern Processing, Data Archaeology, Database Mining, Siftware, Analytics, Business Intelligence, and more…
  5. A DECENT DEFINITION
     • The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of stored data, using pattern recognition technologies and statistical and mathematical techniques.
  6. A number of technology skills are needed: Data Mining Database Management Machine Learning Artificial Intelligence Analysis of Algorithms Statistics Visualization Data Warehousing Security Technology Ethics
  7. “In order to discover anything, you must be looking for something.” Laws of Serendipity
  8. I had to mine this data the hard way.
  9. What I won’t talk about today, although these concepts are important to learn in a class on data mining.
  10. Having fun “playing” with and mining data!
  11. Visualization to gain insight and knowledge: David McCandless Data Visualization TED Talk
  12. WEKA: the software
     • Machine learning/data mining software written in Java (distributed under the GNU General Public License)
     • Used for research, education, and applications
     • Complements “Data Mining” by Witten & Frank
     • Main features:
       – Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
       – Graphical user interfaces (incl. data visualization)
       – Environment for comparing learning algorithms
  13. WEKA only deals with “flat” files:
       @relation heart-disease-simplified
       @attribute age numeric
       @attribute sex { female, male}
       @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
       @attribute cholesterol numeric
       @attribute exercise_induced_angina { no, yes}
       @attribute class { present, not_present}
       @data
       63,male,typ_angina,233,no,not_present
       67,male,asympt,286,yes,present
       67,male,asympt,229,yes,present
       38,female,non_anginal,?,no,not_present
       ...
  14. Tableau 8: Visual Analytics, Business Integration, Any Data, Fast Performance, Web & Mobile Authoring
  15. Tableau 8: Visual Analytics, Business Integration, Any Data, Fast Performance, Web & Mobile Authoring. Features listed on the slide: Forecasting; Sets and visual groups; Shared Filters; Treemaps, bubble charts, word clouds; New marks card; Freeform dashboards; Data Blending improvements; Parallelized dashboards; Faster quick filters; Data Engine & Extract performance; Fast graphics and calculations; Performance recorder; Salesforce.com; Google Analytics & Google BigQuery; Cloudera Impala, Cassandra, HortonWorks, Hadapt, Karmasphere; SAP HANA; Data Extract API; JavaScript API; Data Server Security; Server Auditing; Distributed Data Engine; Web Authoring; iPad and Android authoring; Local rendering; Subscriptions
  16. Tableau for Academia
  17. Time to play!
  18. Dan Matthews – Associate Professor – Trine U. matthewsd@trine.edu
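
The “flat” file on slide 13 is in WEKA's ARFF format: a header of `@relation` and `@attribute` declarations followed by comma-separated rows after `@data`, with `?` marking a missing value. As a rough illustration of how little structure that involves, here is a minimal Python sketch that reads the four rows visible on the slide and tallies the class label against one attribute. It is not WEKA's own loader; the names `parse_arff` and `ARFF_TEXT` are hypothetical, and the sketch assumes only numeric and nominal attributes with no quoted values or sparse rows. The rows elided by `...` on the slide are left out.

```python
from collections import Counter

# The four data rows shown on slide 13; the elided rows ("...") are omitted.
ARFF_TEXT = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Parse a simple ARFF document into (relation, attributes, rows).

    Hypothetical helper: handles numeric and nominal attributes only.
    """
    relation = None
    attributes = []  # (name, "numeric") or (name, [nominal labels])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and ARFF comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                if value == "?":          # missing value marker
                    row[name] = None
                elif kind == "numeric":
                    row[name] = float(value)
                else:
                    row[name] = value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind = [label.strip() for label in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)

# Cross-tabulate one attribute against the class label.
by_angina = Counter((r["exercise_induced_angina"], r["class"]) for r in rows)
for (angina, label), count in sorted(by_angina.items()):
    print(f"exercise_induced_angina={angina:3s} class={label:11s} n={count}")
```

With only four rows the tally proves nothing, but it shows the kind of question a student can start asking once the file is loaded: does one column move with another?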
