Attempt to inspire some kids to pay attention in Math and Science classes so they can get a good job and help fill the skills gap in the years to come.
11. The Hive Mind Map shows popular Twitter hashtags
for the last 7 days and how they are connected
http://hivemindmap.com/?#
12. HIVE MIND MAP
A mind-map of what’s happening on Twitter
Thanks to Mark Harwood for these slides and the Hive Mind Map
http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common
13. Connections
The thickness of a line between hashtags is
based on the strength of connection
Tip: Strength of connection is the number of tweets with both tags relative to the number with either tag (see “Jaccard similarity coefficient”)
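As a rough illustration of the tip above, the Jaccard similarity coefficient for two hashtags can be computed from the sets of tweets that carry each tag. The tweet IDs and tag names below are made up for the example, and this is only a sketch of the coefficient itself, not the Hive Mind Map's actual code.

```python
def jaccard(tweets_a, tweets_b):
    """Jaccard similarity: tweets with both tags / tweets with either tag."""
    a, b = set(tweets_a), set(tweets_b)
    union = a | b
    if not union:
        return 0.0  # neither tag was used, so there is no connection
    return len(a & b) / len(union)


# Hypothetical tweet IDs per hashtag (illustration only)
tagged = {
    "#bigdata": {1, 2, 3, 4},
    "#datascience": {3, 4, 5},
}

strength = jaccard(tagged["#bigdata"], tagged["#datascience"])
print(round(strength, 2))  # 2 shared tweets out of 5 distinct tweets -> 0.4
```

A value near 1 would be drawn as a thick line, a value near 0 as a thin one.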
14. Top tweets
The most popular tweets for a tag are sorted
based on the number of “retweets”
15. When?
The rise and fall of each hashtag’s popularity
can be shown over time
16. Calendar summary
Tags that “peak” together are grouped into
events on a calendar
Tip: Peaks are detected using standard deviations. Only tags with a single peak are chosen as events
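One way to read "peaks are detected using standard deviations" is to flag any day whose tweet count sits well above the tag's average. The sketch below assumes a threshold of two standard deviations and treats a run of consecutive peak days as one peak. Both choices, and the daily counts, are assumptions for illustration rather than the Hive Mind Map's real settings.

```python
from statistics import mean, pstdev


def find_peaks(counts, k=2.0):
    """Return indices of days whose count exceeds mean + k * standard deviation."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return []  # a perfectly flat series has no peaks
    return [i for i, c in enumerate(counts) if c > mu + k * sigma]


def is_single_peak(counts, k=2.0):
    """True when the peak days form exactly one contiguous run."""
    peaks = find_peaks(counts, k)
    if not peaks:
        return False
    return peaks[-1] - peaks[0] == len(peaks) - 1


# Hypothetical daily tweet counts for one hashtag
daily = [5, 6, 4, 5, 60, 7, 5, 6, 4, 5]
print(find_peaks(daily))      # [4]
print(is_single_peak(daily))  # True
```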
Tip: Tags that rise and fall in popularity at the same time are detected using Pearson’s correlation
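Pearson's correlation can be sketched in a few lines of plain Python. The function below compares two daily-count series and returns a value between -1 and 1, where values close to 1 suggest the tags rise and fall together. The tag names and counts are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no trend to compare
    return cov / sqrt(var_x * var_y)


# Hypothetical daily counts for two hashtags over one week
superbowl = [2, 3, 5, 9, 40, 12, 4]
halftime = [1, 1, 2, 4, 25, 6, 2]
print(round(pearson(superbowl, halftime), 2))
```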
17. What makes this possible?
• Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…)
• Free data (millions of users’ tweets from Twitter’s 1% sample feed)
• “Cloud” computing (rented server)
• Smarter web browsers (visualizations using HTML5’s SVG/Canvas)
• All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842)
• Some imagination…
18. Opportunities in Data Science
• We are all generating volumes of data never seen before
• You can recycle the behaviors of billions of people into more intelligent systems:
  • Customer purchases can be used for product recommendations
  • User searches can be used for spelling corrections
  • Reader clicks can influence the trending news
  • Spotify activity is used to make music recommendations
• The tools have never been cheaper
• It has never been easier to find help in developing systems
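To make the "customer purchases can be used for product recommendations" idea concrete, here is a minimal sketch that counts how often items are bought together and recommends the most frequent companions. The baskets and item names are hypothetical, and real recommender systems are considerably more involved than this.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    pairs = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pairs[(a, b)] += 1
    return pairs


def recommend(item, baskets, top_n=3):
    """Items most often bought in the same basket as `item`."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


# Hypothetical purchase history
baskets = [
    ["snowboard", "boots", "wax"],
    ["snowboard", "boots"],
    ["snowboard", "boots", "socks"],
    ["boots", "socks"],
]
print(recommend("snowboard", baskets, top_n=1))  # ['boots']
```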
19. …one more thing..
I’m writing these slides for you
while on my annual snowboarding
trip to Canada.
Data science pays well ;-)
Wish you were here…
31. SCORES 2004-2012
Elementary - 4th Grade, Middle School - 8th Grade, High School
About half of high school students in California are proficient at Math and Science
33. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
34. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
Maybe the Math tests were harder for everyone that year?
35. CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
4th Grade “cohort” in 2004 was 8th Grade in 2008
Maybe the Math tests were harder for everyone that year?
36. DATA SCIENCE WITH EXCEL
Pivot tables let you rearrange data and trend lines measure the slope
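The same two steps can be sketched outside Excel. Below, a simplified pivot rearranges rows of (year, grade, score) into one series per grade, and a least-squares slope plays the role of the trend line. The scores are placeholder numbers chosen to make the arithmetic easy to follow, not the California results shown on the earlier slides.

```python
from collections import defaultdict


def pivot(rows):
    """Rearrange (year, grade, score) rows into {grade: {year: score}}."""
    table = defaultdict(dict)
    for year, grade, score in rows:
        table[grade][year] = score
    return dict(table)


def slope(points):
    """Least-squares slope of y against x, as a linear trend line reports it."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


# Placeholder data: percent proficient by year and grade (not real scores)
rows = [
    (2004, "4th Grade", 40), (2006, "4th Grade", 44), (2008, "4th Grade", 48),
    (2004, "8th Grade", 30), (2006, "8th Grade", 31), (2008, "8th Grade", 32),
]

for grade, by_year in pivot(rows).items():
    print(grade, slope(sorted(by_year.items())))
# 4th Grade 2.0
# 8th Grade 0.5
```

A positive slope means scores are rising by that many points per year on average.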
37. LEARN TO BE A DATA SCIENTIST FOR $1
• Everything is being measured
• The latest data science tools are available to anyone for pennies
• There is lots of freely available data
• Pay attention in math and science class, play around with EMR and BigQuery, and get an interesting and well-paid job as a data scientist!