Introduction to Advanced Analytics Course


Part of the Advanced Analytics course.

Published in: Education, Technology
  • Facebook friend connections worldwide, a network diagram of the Enron email set, a comparison of similar gene sequences between humans, chimps, and macaques
  • HW, FW, MW, SW: hardware, firmware, middleware, software

    1. Advanced Data Analytics: Introduction. Jeffrey Stanton, School of Information Studies, Syracuse University
    2. Kilo, Mega, Giga, Tera, Peta, Exa, Zetta = 10^21 bytes…
       • An organization employing 1,000 knowledge workers loses $5.7 million annually just in time wasted having to reformat information as they move among applications. Not finding information costs that same organization an additional $5.3 million a year. Source: IDC
       • Over 95% of the digital universe is "unstructured data", meaning its content can't be truly represented by its field in a record, such as name, address, or date of last transaction. In organizations, unstructured data accounts for more than 80% of all information. Source: IDC
    3. Major sources of data
       • Health-related services, e.g. benefits, medical analyses
       • Business:
         – Walmart: 20 million transactions/day, 10-terabyte database
       • Science:
         – NASA: 0.5+ terabytes per day per satellite
       • Society and everyone: news, digital cameras, YouTube
       • DOD and intelligence
    4. Analytics: Multiple Disciplines (diagram with Analytics at the center)
       • Database Technology
       • Statistics
       • Machine Learning
       • Visualization
       • Pattern Recognition
       • Social Science
       • Computer Science
    5. Analytics: Multiple Skills
       • Curiosity
         – Interest and intrinsic motivation to figure things out, ask why, and pursue solutions
       • Skepticism
         – Seek simplicity and distrust it, go below the surface explanation of things, question all assumptions
       • Writing
         – Communicate results, tell stories, convince others of the merits of your case
       • Visual Reasoning
         – Develop and present visualizations that support your conclusions
       • Statistics
         – Draw inferences from and summarize data to develop a case and a story
       • Programming
         – Manipulate software tools to create a chain of provenance for data and analysis
    6. Knowledge Development for Industry, Education, Government, Research
       • Domain Experts
         – Expertise in specific subject areas
         – Limited opportunity to master technology skills
         – Proliferation of big data & new technology
         – Need for knowledge and information managers
       • Information Scientists
         – Information Organization & Visualization
         – Data Analysis
         – Solution Integration
         – Digital Curation
       • Infrastructure Professionals
         – Rapid pace of IT development
         – Limited expertise in domain areas
         – Specialized knowledge of HW, FW, MW, SW
         – Communication challenges
       • Transforming Data Into Decisions
    7. Analytics: Key Steps
       • Learn the application domain
       • Locate or develop a data source or data set
       • Clean and preprocess data: may take 60% of effort!
       • Data reduction and transformation
         – Find useful pieces, squeeze out redundancies
       • Choose analytical approaches
         – Summarize, visualize, organize, describe, explore, find patterns, predict, test, infer
       • Communicate the results and implications to data users
       • Deploy discovered knowledge in a system
       • Monitor and evaluate the effectiveness of the system
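
A few of the key steps on the last slide (clean and preprocess, reduce, choose an analytical approach, communicate) can be sketched in a few lines of Python. This is only an illustration, not material from the course: the records, field names, and helper functions below are all hypothetical, and treating exact duplicate rows as redundant is an assumption that would not hold for every data set.

```python
from statistics import mean, median

# Hypothetical raw records; the field names and values are invented for illustration.
RAW_RECORDS = [
    {"store": "A", "amount": "19.99"},
    {"store": "A", "amount": "5.00"},
    {"store": "B", "amount": ""},        # missing value
    {"store": "B", "amount": "12.50"},
    {"store": "B", "amount": "12.50"},   # exact duplicate of the previous row
    {"store": "C", "amount": "n/a"},     # unparseable value
]


def clean(records):
    """Clean and preprocess: drop rows whose amount is missing or not a number."""
    cleaned = []
    for row in records:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append({"store": row["store"], "amount": amount})
    return cleaned


def reduce_rows(records):
    """Data reduction: drop exact duplicate rows (assumed redundant), keeping the first."""
    seen = set()
    unique = []
    for row in records:
        key = (row["store"], row["amount"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def summarize(records):
    """Analytical approach: simple per-store summary statistics."""
    by_store = {}
    for row in records:
        by_store.setdefault(row["store"], []).append(row["amount"])
    return {
        store: {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}
        for store, amounts in by_store.items()
    }


def report(summary):
    """Communicate the results: one plain-text line per store, sorted by store name."""
    return [
        f"Store {store}: {s['count']} sales, mean {s['mean']:.2f}, median {s['median']:.2f}"
        for store, s in sorted(summary.items())
    ]


if __name__ == "__main__":
    for line in report(summarize(reduce_rows(clean(RAW_RECORDS)))):
        print(line)
```

Keeping each step as its own function leaves a traceable path from raw records to reported numbers, which is the kind of "chain of provenance" the Programming skill on slide 5 refers to. The deploy and monitor steps are outside the scope of this sketch.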