Presentation at the Data Science and Engineering Club on ways to create a Data Analytics Portfolio that demonstrates the skills that add direct value to customers and organisations.
2. Agenda
• Data Analytics Life Cycle
• Why build a Data Analytics Portfolio
• Building a Data Analytics Portfolio
• Data Projects
• Events and Networking
• Sharing Knowledge
• Data Analytics Certifications
• Soft Skills
• Peer-led Training
4. Why build a Portfolio
• Demonstrate skills that add value
How to add value
• Extracting insights from raw data, and presenting those insights to others
• Building systems that offer direct value to the customer
• Building systems that offer direct value to others in the organization
• Sharing your expertise with others in the organization
Build a Portfolio to demonstrate these Data Analytics skills
• Ability to communicate
• Ability to collaborate with others
• Technical competence
• Ability to reason about data
• Motivation and ability to take initiative
6. Building a Data Analytics Portfolio
• Data Projects (Kaggle Challenge, LinkedIn Economic Graph Research)
• Hackathons (Microsoft OpenHack Dublin, Hackerday, Girls in Tech Dublin, Ulster Bank)
• Data Charity Initiatives (Data Kind, Viz for Social Good)
• Data Visualization Portfolio (Tableau Public, Makeover Monday)
• Social Media (LinkedIn Accomplishments such as Projects, Courses & Certs,
Twitter, Acclaim)
11. Data Projects
• Data sets (Messy vs Clean, demonstrate data cleansing, transformation or
Visualization)
• Avoid projects that tackle common problems
• Picking an angle
• Create a well-structured project (high-performance, modular code, documentation)
• GitHub Pages (turn repositories into websites to showcase portfolio,
projects and documentation)
12. Events and Networking
• Data Events
• Meetups (Meetup.com, Eventbrite, MLDublin)
• Conferences (Analytics Institute)
• Workshops (Data Science and Engineering Club)
• Webinars
14. Share Knowledge
• Establish User Groups (Internal, meetup.com)
• Teach and deliver presentations (SlideShare)
• Distribute relevant content (Social Media, Blogging, publications, GitHub)
15. Data Analytics Certifications
• Web Analytics (Google Analytics IQ)
• Data Engineering (Microsoft, Oracle, Teradata)
• Data Visualization (Tableau Certified Professional, Power BI, Qlik)
• Advanced Analytics (R, Python)
• Cloud (Azure, AWS, Google)
• Data Science Platforms (Cloudera Data Science Workbench, SAS, Azure ML,
Dataiku)
• Graph Analytics (Neo4j)
16. Soft Skills
• Problem Solving (find the root cause of a problem)
• Critical Thinking (approach problem or task from multiple directions)
• Storytelling (build a compelling and digestible story)
• Communicating (present the results to engineers and senior executives)
• Curiosity (never stop asking “Why?”, explore, investigate, gain knowledge)
17. Peer-led Training
• Teradata Learning & Certification Group
• Tableau Masterclass Series
• Kaggle Santander Customer Transaction Prediction Challenge
• Machine Learning Algorithm Masterclass Series
• Analytics Institute Certification
• Soft skills development for Graduate Programme & Analytics teams
18. Data & Analytics COP Self-Assessment Survey
• Categorising Survey Respondents into 6 Analytics Personas
• Identify Skill Gaps and develop a training approach to address them
(Machine Learning / AI, Scrum, Predictive Modelling, Design Thinking)
• Finding Optimal Mentor-Mentee Matches
1. Discovery: the team learns the business domain, assesses the available resources, and frames the business problem as an analytics challenge.
2. Data preparation: the team transforms and analyses the data.
3. Model planning: the team determines the methods, techniques, and workflow, explores relationships between variables, and selects the key variables and the most suitable models.
4. Model building: the team develops datasets for testing, training, and production purposes.
5. Communicate results: the team collaborates with major stakeholders to identify key findings, quantify the business value, and develop a narrative that summarizes and conveys the findings.
6. Operationalize: the team delivers final reports, briefings, code, and technical documents, and implements the models in a production environment.
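As one illustration of the model building step above, here is a minimal Python sketch of splitting a dataset into training and testing subsets. The function name `train_test_split_rows`, the 80/20 split, and the fixed seed are assumptions made for this example and are not taken from the presentation.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Hypothetical helper: the default fraction and seed are illustrative.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = train_test_split_rows(range(10))
    print(len(train), len(test))
```

A fixed seed keeps the split reproducible, which makes a portfolio project easier for others to re-run and check.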
Kaggle
https://www.kaggle.com/
LinkedIn Economic Graph Research
https://engineering.linkedin.com/teams/data/projects/economic-graph-research/economic-graph-details
DataDriven
https://www.datadriven.org/
Microsoft OpenHack Dublin for AI/Machine Learning
Tuesday 3rd - Thursday 5th July, 2018, 8:30am – 5:00pm
Hackerday
https://www.dezyre.com/hackerday
Outlay 2018 banking hackathon, 7th and 8th of April
https://www.outlayhackathon.com/
DataKind Dublin
https://www.meetup.com/DataKind-DUB/
Viz for Social Good
https://www.vizforsocialgood.com/
Tableau Public
https://public.tableau.com/
Makeover Monday (weekly social data project)
www.makeovermonday.co.uk/
Meetup
https://www.meetup.com/
Eventbrite
https://www.eventbrite.ie/
Analytics Institute
https://analyticsinstitute.org/
Data Science and Engineering Club
https://www.meetup.com/Data-Science-and-Engineering-Club/
GitHub
https://github.com/
Data.gov
https://www.data.gov/
Acclaim (enterprise-class Open Badge platform)
https://www.youracclaim.com/
Neo4j Certification (Graph Database Platform)
https://neo4j.com/graphacademy/neo4j-certification/
Google Analytics Individual Qualification (IQ)
https://analytics.google.com/analytics/academy/
Cognitive Class - Free Data Science and Cognitive Computing Courses
https://cognitiveclass.ai/
Coursera
https://www.coursera.org/
EdX
https://www.edx.org/
data.world (host and share your data, collaborate with your team)
https://data.world/
GitHub Pages is a static site hosting service designed to host your personal, organization, or project pages directly from a GitHub repository.
Showcase your work on GitHub
Add your projects to your LinkedIn profile
Twitter/Social Media: Twitter is one of the most popular social media sites for Data Scientists.
Demonstrate communication and knowledge sharing
Content
Blogs
SlideShare
List any papers or publications
Online Presence: Kaggle, GitHub & LinkedIn profiles (try to fill out as many sections as possible)
An end to end project
Finding good datasets
+Find a messy dataset => demonstrate data cleansing and transformation
+Data sets for Data Visualization Projects shouldn't be messy, because you don't want to spend a lot of time cleaning data.
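To make the data cleansing idea above concrete, here is a minimal Python sketch that tidies a small messy dataset using only the standard library. The sample data, the column names, and the `clean_rows` helper are all hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical messy input: stray whitespace, inconsistent case,
# a duplicate record, and a missing amount.
RAW_CSV = """name,city,amount
 Alice ,dublin,10.5
BOB,Dublin ,
alice,Dublin,10.5
Carol,cork,7
"""


def clean_rows(raw_text):
    """Return de-duplicated rows with tidy text and numeric amounts."""
    cleaned, seen = [], set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Normalise whitespace and capitalisation in the text columns.
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        # Convert the amount to a float; keep missing values as None.
        amount_text = (row["amount"] or "").strip()
        amount = float(amount_text) if amount_text else None
        # Skip records that are identical after cleaning.
        key = (name, city, amount)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "amount": amount})
    return cleaned


if __name__ == "__main__":
    for record in clean_rows(RAW_CSV):
        print(record)
```

In a portfolio project, showing each cleaning decision explicitly (what counts as a duplicate, how missing values are kept) is part of demonstrating the skill.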
Try to avoid projects that tackle common problems, pick something topical and relevant
Clean up and document your code
Split Code Into Modules
Modular code is code which is separated into independent modules.
Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality.
Creating a well-structured project, so it's easy to integrate into operational flows
Writing high-performance code that runs quickly and uses minimal system resources
Documenting the installation and usage of your code well, so others can use it
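The points above about modular, documented code can be sketched as follows. This is a minimal illustration, assuming a toy task of summarising comma-separated numbers; the function names `parse_values`, `summarize`, `format_report`, and `run` are hypothetical.

```python
from statistics import mean


def parse_values(text):
    """Input step: turn a comma-separated string into a list of floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def summarize(values):
    """Analysis step: compute simple summary statistics.

    Assumes values is non-empty; mean() and max() raise on empty input.
    """
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def format_report(summary):
    """Presentation step: render the summary as one line of text."""
    return (
        f"n={summary['count']} "
        f"mean={summary['mean']:.2f} "
        f"max={summary['max']:.2f}"
    )


def run(text):
    """Compose the independent steps into one pipeline."""
    return format_report(summarize(parse_values(text)))


if __name__ == "__main__":
    print(run("1, 2, 3, 6"))
```

Because each function handles a single aspect of the task, any one of them can be tested, documented, or replaced without touching the others.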
Picking an angle
The important thing is to stick to a single angle. Trying to focus on too many things at once makes it hard to build an effective project. It’s also important to pick an angle that has sufficient nuance.
Data Science and Engineering Club
Saturday, July 7, 2018
Applied Math, Probability and Statistics for Data Science
Frank Friedman Oppenheimer (August 14, 1912 – February 3, 1985) was an American particle physicist, cattle rancher, professor of physics at the University of Colorado, and the founder of the Exploratorium in San Francisco.
Docendo discimus (Latin for "by teaching, we learn") is a Latin proverb.
Analytics Capability Self-Assessment Survey
Tableau Demo (Overall Dashboard to highlight Gaps, Role Personas with aspiring candidates/gaps, Mentor/Mentee Matching)