ECON-UH 2020 - Data Analysis
Spring 2022
Lecture 1: Course Logistics
Betul Arda
arda@nyu.edu
Jan 25, 2022
Outline
• What is data analysis?
• Why and how data science became popular
• Doesn’t data just speak for itself?!
• Course logistics:
• Attendance and participation
• Topics and schedule
• The textbooks and other learning resources
• Data resources
• Office hours
• Assignments and group projects
• Exams
• Academic honesty
25 January 2022 Lecture 1: Course logistics
What is data analysis?
Data analysis is extracting information from data.
We will learn how to organize and analyze data and interpret the
results (information).
The difference between inference and speculation is data.
Questions we can investigate with suitable data
• Effects of a newly implemented traffic law on traffic accident fatalities.
• Bank loans and risk assessments: can you predict whether a loan will be accepted?
• Inventory prediction before holiday sales.
• The effects of Brexit on the UK economy.
• The effect of COVID-19 on gas prices.
• Any other examples?
Some examples from https://www.analyticsinhr.com/blog/advanced-data-analysis-techniques-applied-to-people-analytics/
Data does not speak for itself!
A common misconception: if we have enough data and good statistical software, we can
learn everything we need from that data. WRONG!
• A first simple look at data can be misleading.
• One can miss underlying phenomena that are not readily visible.
• One can mistake artifacts for meaningful relationships.
• One can run the wrong tests and/or misinterpret the results and make wrong inferences.
Why data analysis?
The short and oversimplified answer:
In an argument, would you rather have evidence or opinion?
How it became so popular – the exponential growth in computing power.
http://content.time.com/time/interactive/0,31813,2048601,00.html
Why data analysis?
Why and how did data analysis become popular?
Data is “renewable”:
• We will never run out of data.
• The Internet! The ease of access to global data.
• Many questions to be answered with readily available or collectible data.
• Data collection is also easier than ever.
Why data analysis?
Why and how did data analysis become popular?
Limited and/or irreplaceable resources:
• Time and money.
• The cost of educated decisions vs speculation.
• Time cost.
• Irreplaceable resources? Any ideas?
Why data analysis?
Still not convinced?
• The careers you want to pursue will most likely require some level of data analysis
knowledge.
• My child learned about mean, median, quartiles, histograms and the concept of
probability distributions as part of his school curriculum in 3rd grade. By the time
these kids reach college age in 10 years, they’ll already be ahead of your current
position. Kids are learning coding and statistics at young ages. You must stay
relevant!*
• *This is anecdotal evidence and an opinion. My argument that you will need data analysis skills
shouldn’t be very convincing without any data and supporting evidence. Keep on reading.
You will be left behind otherwise.
Some evidence
• This is just one study on the demand for data analysis skills. Do your own research.
• American Statistical Association: New Report Highlights Growing Demand for Data Science,
Analytics Talent. https://www.amstat.org/ASA/News/New-Report-Highlights-Growing-Demand-for-Data-Science-Analytics-Talent.aspx
• Jobs of the Future: Data Analysis Skills: https://www.shrm.org/hr-today/trends-and-forecasting/research-and-surveys/Pages/data-analysis-skills.aspx
Why organizations do not use big data:
• 51%: lack of knowledge/expertise
• 30%: not enough data collected/available
Objectives: What will you learn?
The ability to answer questions with evidence
Good data analysis is never just dumping raw data
into software and seeing what comes out.
It is finding the right methods and properly
interpreting the results.
The ability to ask (meaningful) questions
We will turn real-world problems into
mathematical models.
Questions before and after the
analyses.
This class will teach you critical thinking in a quantitative way.
Recitations
• Sunday,
• REC1: 3:45pm-5:00pm (A2-012),
• REC2: 2:20pm-3:35pm (A2-012),
• REC3: Fri 7:55am-9:10am (A2-007),
• REC4: Fri 9:20am-10:35am (A2-007).
• The recitations will help tremendously.
• The instructor (Jon) will cover some Stata and stats basics next week.
• He will go over extra examples and you will practice on data.
• Practice will help you in exams, projects, later in other classes and
your career.
Assignments
• Practice is essential when learning data analysis.
• The assignments will help with thorough comprehension of the
material.
• Read the questions carefully and be mindful of the analyses you
are carrying out.
• Explain your steps and the results so you and I both know what
you are doing and why you are doing it.
• Recitations give you practice, which will make completing the
assignments easier.
Assignment reports
• There is an example report on Brightspace!
• Add comments to your code. Explain everything.
• Your code should run all at once.
• This is important as we won’t keep selecting and running partial
code to see if it’s working partially.
• When adding a comment, make sure to follow proper comment
notation so it does not stop your code from running as a whole.
• Clearly label all your plots, axes, etc.
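The comment and labeling rules above could look like this in a do-file. This is only a minimal sketch, not the example report from Brightspace: it assumes the auto data set that ships with Stata as a stand-in for homework data, and the plot titles are placeholders.

```stata
* Full-line comment: the line starts with an asterisk.
clear all
sysuse auto, clear    // inline comment: two slashes after a space (do-files only)

/* Block comment: everything between the delimiters
   is ignored, even across several lines. */

* Three slashes continue one command on the next line.
scatter price mpg,                            ///
    title("Car price versus mileage")         ///
    xtitle("Mileage (miles per gallon)")      ///
    ytitle("Price (US dollars)")
```

Text that is meant as a comment but lacks one of these markers is read as a command, which stops the do-file with an error when it is run all at once.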
Assignment reports
• Type your assignments. NO handwriting, please.
• Create a PDF with all your answers, plots, explanations, etc.
ONLY PDF. No docx.
• Code in a separate .do file.
• Do not add code into your report, and vice versa.
• Submit to Brightspace.
• Be mindful of math notations (subscripts, superscripts, etc.)
• No X1 if it’s $x_1$.
• No theta^2 if it’s $\theta^2$.
Assignment reports
• Label your code as: LastnameFirstInitial_HW1.do (ArdaB_HW1.do)
• Label your pdf as: LastnameFirstInitial_HW1.pdf (ArdaB_HW1.pdf)
• Label your data as: LastnameFirstInitial_HW1.dta (ArdaB_HW1.dta)
• Label your log as: LastnameFirstInitial_HW1.log (ArdaB_HW1.log)
• We receive about 40-50 reports per week. With each report, we also
have the code, the data and the log. Multiply that by 10: we’ll have 10 HWs this
semester. Can you imagine the mess if you don’t label your work properly?
• Submit the data as is. This might seem unnecessary, but it’ll help us confirm that the
problems in your results are not due to data download problems.
• Submit your log file too.
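One possible way to produce the four labeled files from a single do-file is sketched below. The file names follow the ArdaB_HW1 pattern above; the auto data set and the summarize command are placeholders for the assigned data and the actual analysis, not part of any real assignment.

```stata
* Hypothetical skeleton for ArdaB_HW1.do; replace ArdaB with your own name.
clear all
capture log close                          // close any log left open by an earlier run
log using "ArdaB_HW1.log", text replace    // start the log that you will submit

sysuse auto, clear                         // placeholder for the assigned data
save "ArdaB_HW1.dta", replace              // data file submitted with the report

summarize price mpg                        // placeholder analysis

log close                                  // finish the log
```

Because the log is opened at the top and closed at the bottom, running the whole do-file once records every command and result in the log file.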
Assignment reports
• The first homework will be posted today (January 25).
• Due January 31, Monday, 11:59pm, sharp.
• 10 points will be subtracted for each additional hour starting
midnight.
• Ex: submitted Feb 1, 12:00:01 am (basically right after midnight):
your grade ≤ 90; submitted 1:30 am: your grade ≤ 80, etc.
• If circumstances make it difficult to finish the HW on time that
week, let me know and ask for an extension before the last minute.
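The late-penalty rule can be written as a one-line calculation. The sketch below assumes that every hour that has started after midnight costs 10 points, which matches the two examples on the slide; how a submission exactly on the hour is treated is not stated, so that case is an assumption here. The scalar names are made up for illustration.

```stata
* Hypothetical input: hours elapsed after midnight at submission time.
scalar hours_late = 1.5                                // 1:30 am
* Every hour that has started costs 10 points; the cap cannot go below 0.
scalar grade_cap = max(0, 100 - 10*ceil(hours_late))
display "Maximum possible grade: " grade_cap
```

With hours_late set to 1/3600 (one second after midnight), the same formula gives a cap of 90.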
Assignment reports
• Over the years, the instructors of this course and I have been
providing students with code that helps them not just
in this course but in upcoming econometrics courses as well.
DO NOT ever present those code snippets as your own.
• For example, if you ever use a code snippet of Emir Hidovic or
Jonathan Rogers from recitations, cite them in the upcoming
econometrics courses (and elsewhere). It is their intellectual
property. The professors and instructors will recognize their own
code.
Academic honesty
• There are many ways to write a piece of code that does the
same thing.
• Write your own code. Do NOT copy someone else’s. Seriously, don’t
risk your whole academic future.
• Write your own reports in your own words.
• You aren’t allowed to use “homework” websites, solution
manuals, etc. If you can find it online, so can we.
• If you are stuck, ask for help from me or Jon.
• No posting course materials elsewhere. These are considered
intellectual property.
• Lecture notes, assignments, exams, solutions, recitation notes,
codes, meeting videos, etc. are all considered intellectual property.
The software
• Follow the instructions you received in your email to
download Stata 17.
• Earlier versions of Stata are fine to use for this course, but you may
need some of the newer features for the next econometrics course.
• Why Stata? Not R or any other free software?
• The short answer: easier to use (more user-friendly) and still very
versatile.
• This is not a coding class. The less time you spend trying to figure
out how to code something, the more time you’ll have for data
analysis topics.
The software
• Why Stata? Not R or any other free software?
• User-friendly, interactive interface.
• Lets us focus on learning data analysis without trying to figure out
more complicated software.
• Comes with detailed help files full of real examples.
• Most of the more advanced econometrics classes at NYUAD require Stata
for similar reasons.
• Commonly used in social sciences research.
• Comparison: https://www.r-bloggers.com/whats-the-best-statistical-software-a-comparison-of-r-python-sas-spss-and-stata/
The software
• There is a third section of this class using R.
• Other statistical software/programming languages you may
want to explore: R, Python, Java, C++, C#, SAS, SPSS, MATLAB,
Mathematica…
• Once you learn to code in one language, you will find it easier to learn
any other language compared to starting blind.
• Python is probably the most popular programming language for
machine learning today.
• But we won’t go into machine learning topics this semester.
The textbooks
• Cross-sectional analysis:
• Introductory Econometrics: A Modern Approach. Wooldridge. 7th ed.
• Focuses on why and how to in general.
• Many examples to strengthen your understanding of the concepts.
• Introductory level. Easy to read. We won’t go over all the proofs.
• Time-series analysis:
• Introduction to Time Series Using Stata. Becketti.
• Doesn’t include many proofs anyway. Many examples.
• Why not Wooldridge for time-series: because it overcomplicates the
time-series topics. Becketti is easier to read and learn TS from.
Data: the books’ resources – Wooldridge
Wooldridge data
• Manual download:
https://www.cengage.com/aise/economics/wooldridge_3e_datasets/statafiles.zip
• You can also use Stata’s bcuse: run findit bcuse and install it.
• The data list:
http://fmwww.bc.edu/ec-p/data/wooldridge/datasets.list.html
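As a rough illustration of the bcuse route, the lines below install the command from the SSC archive and load one data set. Installing through ssc install instead of findit is an assumption made here for convenience, and wage1 is just one name taken from the data list. An internet connection is required.

```stata
ssc install bcuse, replace    // one-time installation from the SSC archive
bcuse wage1, clear            // load the WAGE1 data set from the Boston College server
describe                      // list the variables and their labels
```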
Data: the books’ resources – ITSUS
• Download the data from the book’s website:
• https://www.stata-press.com/data/itsus.html
• Check out the errata for the list of typos:
• https://www.stata-press.com/books/errata/itsus.html
• Install directly into Stata:
net from "http://www.stata-press.com/data/itsus/"
net get itsus_files
net get itsus_data
net install itsus_files
Data: public databases
• Always cite your source!
• Data, papers, code, etc.
https://scholar.google.com
• Failing to do so is called
plagiarism! Don’t be that
person.
Data: public databases
• Always cite your source!
• Data, papers, code, etc.
• You can install the
Google Scholar button in
your browser for ease of
access to several citation
formats.
Data: public databases
• You can use many data types
in Stata:
• The data set does not have to
be in Stata format (.dta)
• You can import other text
formats such as csv, txt, xlsx,
etc.
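A short sketch of reading non-Stata formats is given below. To keep it self-contained, it first writes a CSV file and an Excel file from Stata's built-in auto data and then reads them back; the file names auto_copy.csv and auto_copy.xlsx are made up for this example.

```stata
* Create small CSV and Excel files from Stata's built-in auto data
* so the example runs without any external file.
sysuse auto, clear
export delimited using "auto_copy.csv", replace
export excel using "auto_copy.xlsx", firstrow(variables) replace

* Comma-separated text file.
import delimited using "auto_copy.csv", clear
describe

* Excel workbook: firstrow tells Stata the first row holds variable names.
import excel using "auto_copy.xlsx", firstrow clear
describe
```

For downloaded data, only the import line is needed, with the file name replaced by the path to the downloaded file.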
Data: FRED
• Federal Reserve Economic Data (FRED)
https://fred.stlouisfed.org/
Data: FRED
• Federal Reserve Economic Data (FRED):
https://fred.stlouisfed.org/
• Check out the video about how
to use freduse
• Install freduse either with the
link or the code
findit freduse
ssc install freduse
Data: FRED
• Federal Reserve Economic Data (FRED):
https://fred.stlouisfed.org/
• Use the search box to
search for data used in
class or assignments
• Try GDP US.
• Load it directly into Stata.
Data: FRED
• You’ll need the variable
names.
• Variable name for Real GDP
of the US: GDPC1
• Stata 17 allows you to
directly search in the
program.
• For versions earlier than
15, you’ll need to peruse
the website.
Data: FRED
• Direct access in Stata
• File > Import > Federal
Reserve Economic Data
(FRED)
• You can search for
keywords, add dates and
import the data directly
into Stata.
• You still have to add the
code in your assignments!
• e.g. freduse GDPC1
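The menu import has a command-line counterpart, import fred, available in Stata 15 and later. The sketch below is an assumption about how one might use it, not the course's required code: it presumes a free FRED API key has already been registered with set fredkey, and the date range is arbitrary.

```stata
* Assumes a FRED API key was registered beforehand with: set fredkey <your key>, permanently
* Download real GDP; import fred creates datestr, daten and GDPC1.
import fred GDPC1, daterange(2000-01-01 2021-12-31) clear

* GDPC1 is quarterly, so convert the daily date to a quarterly one.
generate qdate = qofd(daten)
format qdate %tq
tsset qdate
tsline GDPC1
```

Copying a command like this into the do-file keeps the download step reproducible, which the menu alone does not.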
Data: Yahoo! Finance
• Data can be
downloaded from the
website or directly
into Stata using
fetchyahooquotes
Yahoo! Finance
• Use the code below to
install:
net install http://researchata.com/stata/203/fetchyahooquotes.pkg, force
Data: Yahoo! Finance
• Alphabet Inc. (GOOG)
• Once installed, try this code to check if it’s
working properly:
clear all
fetchyahooquotes GOOG, freq(m) start(1jan2007) end(31dec2018)
line adjclose_GOOG date
Exams
• Exams will be similar to the homework assignments:
• Analyzing the data sets assigned to you.
• Interpreting the results.
• Making and discussing the inferences.
• Midterm will be during lecture time
• Tentative: March 10. Open-book. On paper.
• Final exam date and time will be determined by the registrar.
• Cumulative. Double check the date and time closer to the exam
date. In the computer lab.
Group projects
• Data does not speak for itself!
• Not even after you have carried out every analysis under the sun
and extracted every bit of information the data may contain, and that
assumes you chose the correct methods.
• You still have to find a neat and efficient way to present all
your analyses and inferences.
• Working on a project will give you experience in presenting
your work in front of an audience.
• You will also provide feedback for other projects assigned to
you.
Group projects
• Groups of 3-4. You can work with people in the other section.
• You will:
• Come up with a research question.
• Find relevant (publicly available) data.
• Write a proposal on how you are planning to tackle your research
question. Present your proposal in 8 minutes in class.
• Tackle your research question using the methods we learned,
analyze the data. Present your results in 10 minutes in class. This
will be the finished project.
• Write a final report.
Group projects
• Submissions: Project proposal (1st and 2nd drafts), updated
project report with results, final report.
• Due dates on the syllabus.
• About the research question: find something interesting.
• Every year I see at least 2 projects about the same old topics. Be original. If you
repeat any topic that has been done before (i.e., if older cohorts gave you a
topic “idea”), I’ll ask for originality.
• This is a good time to think about your capstone. Try a few things now so you
can improve those or move on to new things for your capstone.
Group projects
• About the group project data:
• Start looking for the data the moment you have a research
question idea.
• Every year at least a couple of groups realize that they can’t access
data relevant to their topic.
• Either the data isn’t publicly available or simply does not exist.
• Or the data is publicly available, but it is too messy and complicated to
work on. E.g., requires extensive clean up, deciphering the variables, etc.
• You are allowed to change your topic and your research question, but you
can’t get the wasted time back. Find and check the data before it is too
late.
Group projects
• About the group project data:
• You can work with any publicly available data.
• Except for data created for teaching purposes, such as data that accompany books
or software. E.g., the Stata datasets cited in the user manuals or the datasets
included with Wooldridge (or any other book) may have been altered to fit the
teaching narrative.
• There are many publicly available trusted data resources.
• Do your own research to find them. It is a part of the learning experience.
• Ex: interested in global warming? NASA might have the data you're
looking for. COVID-19? Check out WHO. Macroeconomics data? FRED has
it covered.
Group projects
• About the group project data:
• Work with financial data at your own risk.
• NYUAD has access to the WRDS database via the library computers, but our
course hasn’t been given personal (home) access since it is not a finance course.
Rethink your decision to work with financial data if you have no previous
experience (deep personal interest, previous courses, etc.) with finance theory
and accounting terms. The instructor will not be able to help you decipher the
WRDS database.
• Want to collect your own survey data? Consider these first:
• Have you checked the regulations? Do you have the IRB approval?
• Do you know how to properly do this? (population specification, sample frame,
non-responses, nonleading words, mutually exclusive categories, etc.)
• Are you sure you can collect a large enough sample on time?
Attendance and participation
Data analysis is a VAST area.
We will focus on practice and only
enough theory to make sense of
what we are doing and why.
But we cannot completely ignore
theory since practice won’t make
sense otherwise.
This is a lecture class.
Take notes!
You will need your notes along
with the slides and the books.
Your learning experience will
improve significantly if you pay
attention to the lectures.
Attendance and participation
• This course is at about your level as long as you put in the
necessary effort.
• Attend: pay attention to the lectures.
• Participate: ask and answer questions.
• Read: study the material.
• Do: do the weekly homework assignments*.
*This font on the slides indicates code you can type into Stata. (Yes, even this!)
Attendance and participation
• Read the assigned material as we cover it so you can participate in
class.
• Beyond your grade, your learning experience highly depends on how much
attention you pay to the lectures. You’ll have an easier time navigating the
topics throughout the semester.
• Do not hesitate to give a “wrong” answer to my questions during lectures.*
*except in exams, assignments and project presentations, of course.
• If you won’t be able to attend a lecture, let me know beforehand with
your reasoning.
• If you can’t let me know beforehand, definitely let me know afterwards.
• We can discuss how to make sure you’re not behind.
Attendance and participation
• As per the university’s decision, the lectures will remain online
at least until February 4.
• Cameras will stay on for the duration of the lectures.
• This is comparable to physically showing up to a class.
• You are expected to attend lectures on campus once the
university reverts to face-to-face teaching.
Attendance and participation
• Participation means asking and answering questions.
• The class discussions will help you discover the areas you didn’t fully
understand.
• It will give me the ability to reinforce those areas.
• It will also earn you 10% of your grade. You can’t get an A if you are not
participating. A starts at 93.
• Do not hesitate to give a “wrong” answer to my questions during lectures.*
These are a part of the learning process.
• We will use the chat window on Zoom to ask and answer questions.
• You can raise your hand (button on Zoom) or directly write questions in the
chat. I’ll keep track of that area while teaching.
*obviously, except exams and project presentations, etc.
Participation grade rubric
• 50 for attending all lectures + 50 for actively answering questions and participating in the
discussions
• 100 – student does not miss a lecture without an excuse and consistently participates in the
discussion 15+ active participation (out of 22 lectures)
• 90 – student attends all lectures and participates sometimes but not always (10+ active)
• 85 – student attends but participates only a few times out of 22 lectures (5+ active)
• 80 – attends but never participates (<5 active participation)
• 75 – student attends the lectures but consistently late and/or preoccupied but 5+ participation
• 70 - student misses several (3+) lectures without an excuse or never participates
• 65 - student misses several (3+) lectures without an excuse and never participates
• 50 – 4-6 missing lectures
• 30 – 7+ missing lectures
• 0 – 10+ missing lectures
Attendance and participation
• Please attend the section you are registered in.
• Unless you have permission from me. Let me know beforehand.
• This is to ensure virtual class manageability and the sanitization
requirements when we go back to face-to-face teaching.
• Please be present on time.
• It won’t be easy to allow people in from the waiting room after the
first few minutes.
• I’ll be online a few minutes early if you want to ask any questions or
discuss a topic.
• I start lectures at exactly 9am and 10:25am in-class too.
Schedule
• First few classes: a fast review of necessary statistical
concepts and Stata code.
• You have already covered these topics in previous classes. This
will only be to refresh your memory.
• These will be the building blocks to understanding the rest of the
class and data analysis in general.
• Pay close attention to what you can easily remember and what
you need to review on your own.
• Technically, I will assume you already know these concepts, but a
refresher never hurts.
• Ask me if you need additional resources for recap.
Schedule
• First part of the semester: cross-sectional data
• Simple linear regression
• Multiple linear regression
• Common mistakes and how to avoid them
• Common problems and possible solutions
• Multiple regression analysis with qualitative information
• Binary (dummy) variables
• Logit and probit models
• Instrumental variables
• Will take about 60-70% of the semester (8-9 weeks)
Schedule
• Second part of the semester: time-series data
• Components of time-series data
• Basic regression analysis with time-series data
• Time-series filters
• Parametric time-series models
• Common problems and possible solutions
• Forecasting
• A brief intro to panel data
• Will take about 30-40% of the semester (4-5 weeks)
• Most methods covered in the first 8 weeks will be applicable to time-series
and panel data. The second part will also include methods which can only be
applied to time-series or panel data.
Office hours
• Monday, 8:30am-10:00am or by appointment.
• Always on Zoom. Links on Brightspace.
• You can join in any time until 9:50am. You don’t have to arrive at 8:30
and stay the whole time. Stop by, ask your question and either stay and
listen to other people’s questions or leave. No obligation either way.
• Please do not hesitate to ask for help. I’m always happy to assist you
about this class. If you do not ask, I cannot help you.
• If you are having difficulty following the material because of gaps in
your background, let me know so I can recommend extra resources.
• If you would like to dive deeper into these topics, let me know so I can
assign you extra optional reading material.
Grading
Requirement                              Percentage
Weekly homework assignments              20%
Group project presentation and report    20%
Participation                            10%
Midterm exam                             20%
Final exam                               30%
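The weights above combine into a course grade as a weighted average. The sketch below uses made-up component scores (each out of 100) purely for illustration; only the weights come from the grading table.

```stata
* Hypothetical component scores, each out of 100.
scalar homework      = 90
scalar project       = 85
scalar participation = 100
scalar midterm       = 80
scalar final_exam    = 88

* Weights from the grading table; they sum to 1.
scalar course_grade = 0.20*homework + 0.20*project + 0.10*participation ///
                    + 0.20*midterm + 0.30*final_exam
display "Course grade: " course_grade
```

With full marks on everything else and a participation score of 0, the same formula gives 90, which is below the A cutoff of 93 mentioned on the participation slide.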
Before you go…
• Things to do before Thursday:
• Read Wooldridge Chapter 1, and Appendices A and B
• Install Stata.
• Download Wooldridge’s data sets.
• Download ITSUS data sets.
• Install freduse.
• Complete the installation and try the code on the slides. If it doesn’t work, it
means you didn’t install it properly.
• Peruse FRED’s website.
• Install fetchyahooquotes.
• Try the code included on the slides.
• Be mindful of the time zones. Everything is listed in GST (GMT+4).

More Related Content

Similar to Lecture 1 v3.pdf

Waymaker Economics Courses: Personalized Learning in 5 Simple Steps
Waymaker Economics Courses: Personalized Learning in 5 Simple StepsWaymaker Economics Courses: Personalized Learning in 5 Simple Steps
Waymaker Economics Courses: Personalized Learning in 5 Simple StepsLumen Learning
 
Personalized Learning in 5 Simple Steps: Waymaker Economics
Personalized Learning in 5 Simple Steps: Waymaker EconomicsPersonalized Learning in 5 Simple Steps: Waymaker Economics
Personalized Learning in 5 Simple Steps: Waymaker EconomicsLumen Learning
 
BITA introduction slides
BITA introduction slidesBITA introduction slides
BITA introduction slidesMark Kor
 
BITA Introduction Slides
BITA Introduction SlidesBITA Introduction Slides
BITA Introduction SlidesMark Kor
 
Glfes summer institute2013_raleigh_final
Glfes summer institute2013_raleigh_finalGlfes summer institute2013_raleigh_final
Glfes summer institute2013_raleigh_finalTricia Townsend
 
AQT February PLC 2012
AQT February PLC 2012AQT February PLC 2012
AQT February PLC 2012Mona Toncheff
 
Data Quality: Are Your Data Suitable For Answering Your Questions? - Experfy ...
Data Quality: Are Your Data Suitable For Answering Your Questions? - Experfy ...Data Quality: Are Your Data Suitable For Answering Your Questions? - Experfy ...
Data Quality: Are Your Data Suitable For Answering Your Questions? - Experfy ...Experfy
 
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...Damian R. Mingle, MBA
 
Personalized Learning in 5 Simple Steps: Waymaker Economics Courses
Personalized Learning in 5 Simple Steps: Waymaker Economics CoursesPersonalized Learning in 5 Simple Steps: Waymaker Economics Courses
Personalized Learning in 5 Simple Steps: Waymaker Economics CoursesLumen Learning
 
BITA Introduction Slides.pdf
BITA Introduction Slides.pdfBITA Introduction Slides.pdf
BITA Introduction Slides.pdfMark Kor
 
Tips for Choosing A New College Planning Technology
Tips for Choosing A New College Planning TechnologyTips for Choosing A New College Planning Technology
Tips for Choosing A New College Planning TechnologyCyndy McDonald
 
Digital apprenticeships community event
Digital apprenticeships community eventDigital apprenticeships community event
Digital apprenticeships community eventJames Clay
 
BITA Induction Slides oct 2021
BITA Induction Slides oct 2021BITA Induction Slides oct 2021
BITA Induction Slides oct 2021Mark Kor
 
activities of integration.pptx
activities of integration.pptxactivities of integration.pptx
activities of integration.pptxBagalanaSteven
 
Project2 schedule steps_engl313_summer2021
Project2 schedule steps_engl313_summer2021Project2 schedule steps_engl313_summer2021
Project2 schedule steps_engl313_summer2021KatieKrahn
 
Artur Suchwalko “What are common mistakes in Data Science projects and how to...
Artur Suchwalko “What are common mistakes in Data Science projects and how to...Artur Suchwalko “What are common mistakes in Data Science projects and how to...
Artur Suchwalko “What are common mistakes in Data Science projects and how to...Lviv Startup Club
 
Lecture_1_Intro.pdf
Lecture_1_Intro.pdfLecture_1_Intro.pdf
Lecture_1_Intro.pdfpaijitk
 

Similar to Lecture 1 v3.pdf (20)

Waymaker Economics Courses: Personalized Learning in 5 Simple Steps
Waymaker Economics Courses: Personalized Learning in 5 Simple StepsWaymaker Economics Courses: Personalized Learning in 5 Simple Steps
Waymaker Economics Courses: Personalized Learning in 5 Simple Steps
 
Lecture 1 v3.pdf

  • 1. ECON-UH 2020 - Data Analysis Spring 2022 Lecture 1: Course Logistics Betul Arda arda@nyu.edu Jan 25, 2022
  • 2. Outline • What is data analysis? • Why and how data science became popular • Doesn’t data just speak for itself?! • Course logistics: • Attendance and participation • Topics and schedule • The textbooks and other learning resources • Data resources • Office hours • Assignments and group projects • Exams • Academic honesty 2 25 January 2022 Lecture 1: Course logistics
  • 3. What is data analysis? Data analysis is extracting information from data. We will learn how to organize and analyze data and interpret the results (information). The difference between inference and speculation is data.
  • 4. Questions we can investigate with suitable data • Effects of a newly implemented traffic law on traffic accident fatalities. • Bank loans and risk assessments – can you predict if a loan will get accepted or not? • Inventory prediction before holiday sales. • The effects of Brexit on the UK economy. • The effect of COVID-19 on gas prices. • Any other examples?
  • 5. Some examples from https://www.analyticsinhr.com/blog/advanced-data-analysis-techniques-applied-to-people-analytics/
  • 6. Data does not speak for itself! A common misconception: if we have enough data and good statistical software, we can learn everything we need from that data. WRONG! • A first simple look at data can be misleading. • One can miss underlying phenomena that are not readily visible. • One can mistake artifacts for meaningful relationships. • One can run the wrong tests and/or misinterpret the results and make wrong inferences.
  • 7. Why data analysis? The short and oversimplified answer: In an argument, would you rather have evidence or opinion?
  • 8. How it became so popular – the exponential growth in computing power. http://content.time.com/time/interactive/0,31813,2048601,00.html
  • 10. Why data analysis? Why and how did data analysis become popular? Data is “renewable”: • We will never run out of data. • The Internet! The ease of access to global data. • Many questions to be answered with readily available or collectible data. • Data collection is also easier than ever.
  • 11. Why data analysis? Why and how did data analysis become popular? Limited and/or irreplaceable resources: • Time and money. • The cost of educated decisions vs speculation. • Time cost. • Irreplaceable resources? Any ideas?
  • 12. Why data analysis? Still not convinced? • The careers you want to pursue will most likely require some level of data analysis knowledge. • My children learned about mean, median, quartiles, histograms and the concept of probability distributions as part of their school curriculum in 3rd grade. By the time these kids reach college age in 10 years, they’ll already be ahead of your current position. Kids are learning coding and statistics at young ages. You must stay relevant!* You will be left behind otherwise. • *This is anecdotal evidence and an opinion. My argument that you will need data analysis skills shouldn’t be very convincing without any data and supporting evidence. Keep on reading.
  • 13. Some evidence • This is just one study on the demand for data analysis skills. Do your own research. • American Statistical Association: New Report Highlights Growing Demand for Data Science, Analytics Talent. https://www.amstat.org/ASA/News/New-Report-Highlights-Growing-Demand-for-Data-Science-Analytics-Talent.aspx • Jobs of the Future: Data Analysis Skills: https://www.shrm.org/hr-today/trends-and-forecasting/research-and-surveys/Pages/data-analysis-skills.aspx • Why organizations do not use big data: • 51% lack of knowledge/expertise • 30% not enough data collected/available
  • 14. Objectives: What will you learn? • The ability to answer questions with evidence: good data analysis is never dumping raw data into software and seeing what comes out. It is finding the right methods and properly interpreting the results. • The ability to ask (meaningful) questions: we will turn real-world problems into mathematical models. Questions before and after the analyses. • This class will teach you critical thinking in a quantitative way.
  • 15. Recitations • Sunday, • REC1: 3:45pm-5:00pm (A2-012), • REC2: 2:20pm-3:35pm (A2-012), • REC3: Fri 7:55am-9:10am (A2-007), • REC4: Fri 9:20am-10:35am (A2-007). • The recitations will help tremendously. • The instructor (Jon) will cover some Stata and stats basics next week. • He will go over extra examples and you will practice on data. • Practice will help you in exams, projects, later in other classes and your career.
  • 16. Assignments • Practice is essential when learning data analysis. • The assignments will help with thorough comprehension of the material. • Read the questions carefully and be mindful of the analyses you are carrying out. • Explain your steps and the results so you and I both know what you are doing and why you are doing it. • Recitations will help with practicing, so it will be easier to complete the assignments.
  • 17. Assignment reports • There is an example report on Brightspace! • Add comments to your code. Explain everything. • Your code should run all at once. • This is important because we won’t select and run pieces of your code to see whether parts of it work. • When adding a comment, make sure to follow proper comment notation so it does not stop your code from running as a whole. • Clearly label all your plots, axes, etc.
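The requirements on this slide (valid comment notation, code that runs in one go, labeled plots) can be sketched as a short do-file. This is only an illustration, not the example report from Brightspace: it uses the `auto` dataset that ships with Stata as a stand-in for assignment data, and the titles are made up.

```stata
* Illustrative do-file skeleton: every line is either a command or a valid comment,
* so the whole file runs from top to bottom with a single "do".
clear all
set more off

// Load a dataset that ships with Stata (stand-in for the assignment data)
sysuse auto, clear

/* Block comments can span
   several lines. */
summarize price mpg      // a trailing comment needs a space before the //

* Label the plot title and both axes
scatter price mpg, ///
    title("Price vs. mileage, 1978 automobiles") ///
    xtitle("Mileage (miles per gallon)") ///
    ytitle("Price (US dollars)")
```

The `//` and `///` forms are meant for do-files rather than the interactive Command window, which is one more reason to run the file as a whole.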
  • 18. Assignment reports • Type your assignments. NO handwriting, please. • Create a PDF with all your answers, plots, explanations, etc. ONLY PDF. No docx. • Code in a separate .do file. • Do not add code into your report, and vice versa. • Submit to Brightspace. • Be mindful of math notations (subscripts, superscripts, etc.) • No X1 if it’s $x_1$. • No theta^2 if it’s $\theta^2$.
  • 19. Assignment reports • Label your code as: LastnameFirstInitial_HW1.do (ArdaB_HW1.do) • Label your pdf as: LastnameFirstInitial_HW1.pdf (ArdaB_HW1.pdf) • Label your data as: LastnameFirstInitial_HW1.dta (ArdaB_HW1.dta) • Label your log as: LastnameFirstInitial_HW1.log (ArdaB_HW1.log) • We receive about 40-50 reports per week per semester. With each report, we also have the code, the data and the log. Multiply that by 10 – we’ll have 10 HWs this semester. Can you imagine the mess if you don’t label your work properly? • Submit the data as is. This might seem unnecessary, but it’ll help us confirm that the problems in your results are not due to data download problems. • Submit your log file too.
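As a minimal sketch of how the naming pattern above might look inside a do-file: the file names follow the slide's ArdaB_HW1 example, while the `auto` dataset is only a stand-in for whatever data the assignment actually uses.

```stata
* ArdaB_HW1.do  (names follow the LastnameFirstInitial_HW1 pattern)
clear all
capture log close                          // close any log left open by an earlier run
log using "ArdaB_HW1.log", replace text    // plain-text log file to submit

sysuse auto, clear                         // stand-in for the assignment data
save "ArdaB_HW1.dta", replace              // the data file to submit, as is

summarize price                            // the analysis itself goes here
log close
```

Opening the log at the top and closing it at the bottom means the .log file records the whole run.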
  • 20. Assignment reports • The first homework will be posted today (January 25). • Due January 31, Monday, 11:59pm, sharp. • 10 points will be subtracted for each additional hour starting midnight. • Ex: submitted Feb 1, 12:00:01 am (basically right after midnight): your grade ≤ 90; 1:30am: your grade ≤ 80, etc. • If your circumstances make it difficult to finish the HW on time that week, let me know and ask for an extension before the last minute.
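One way to read the two examples on this slide is that every started hour after midnight lowers the maximum grade by 10 points. The lines below check that reading with Stata's `display`; the formula and the floor at zero are assumptions made for illustration, not the official grading rule.

```stata
* Hours late are measured from midnight; cap = 100 - 10 per started hour (floor at 0 assumed)
display max(0, 100 - 10*ceil(1/3600))   // submitted 12:00:01 am -> 90
display max(0, 100 - 10*ceil(1.5))      // submitted 1:30 am     -> 80
```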
  • 21. Assignment reports • Over the years, the instructors of this course and I have been providing students with code that helps them not just in this course but in upcoming econometrics courses as well. DO NOT ever present those code snippets as your own. • For example, if you ever use a code snippet of Emir Hidovic or Jonathan Rogers from recitations, cite them in the upcoming econometrics courses (and elsewhere). It is their intellectual property. The professors and instructors will recognize their own code.
  • 22. Academic honesty • There are many ways to write a piece of code that does the same thing. • Write your own code. Do NOT copy someone else’s. Seriously, don’t risk your whole academic future. • Write your own reports in your own words. • You aren’t allowed to use “homework” websites, solution manuals, etc. If you can find it online, so can we. • If you are stuck, ask for help from me or Jon. • No posting course materials elsewhere. • Lecture notes, assignments, exams, solutions, recitation notes, code, meeting videos, etc. are all considered intellectual property.
  • 23. The software • Follow the instructions you received in your email to download Stata 17. • Earlier versions of Stata are fine to use for this course, but you may need some of the newer features for the next econometrics course. • Why Stata? Not R or any other free software? • The short answer: easier to use (more user-friendly) and still very versatile. • This is not a coding class. The less time you spend trying to figure out how to code something, the more time you’ll have for data analysis topics.
  • 24. The software • Why Stata? Not R or any other free software? • User-friendly, interactive interface. • Lets us focus on learning data analysis without having to figure out more complicated software. • Comes with detailed help files full of real examples. • Most of the more advanced econometrics classes at NYUAD require Stata for similar reasons. • Commonly used in social sciences research. • Comparison: https://www.r-bloggers.com/whats-the-best-statistical-software-a-comparison-of-r-python-sas-spss-and-stata/
  • 25. The software • There is a third section of this class using R. • Other statistical software/programming languages you may want to explore: R, Python, Java, C++, C#, SAS, SPSS, MATLAB, Mathematica… • Once you learn to code in one language, you will find it easier to learn any other language compared to starting blind. • Python is probably the most popular programming language for machine learning today. • But we won’t go into machine learning topics this semester.
  • 26. The textbooks • Cross-sectional analysis: • Introductory Econometrics: A Modern Approach. Wooldridge. 7th ed. • Focuses on the why and the how-to in general. • Many examples to strengthen your understanding of the concepts. • Introductory level. Easy to read. We won’t go over all the proofs. • Time-series analysis: • Introduction to Time Series Using Stata. Becketti. • Doesn’t include many proofs anyway. Many examples. • Why not Wooldridge for time series: because it overcomplicates the time-series topics. Becketti is easier to read and learn TS from.
  • 27. Data: the books’ resources – Wooldridge • Manual download: https://www.cengage.com/aise/economics/wooldridge_3e_datasets/statafiles.zip • You can also use the bcuse command in Stata (findit bcuse, then install). • The data list: http://fmwww.bc.edu/ec-p/data/wooldridge/datasets.list.html
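A minimal sketch of the bcuse route mentioned above, assuming an internet connection and that the user-written `bcuse` package is installed from SSC. The dataset `wage1` is just one entry from the data list, picked for illustration.

```stata
ssc install bcuse, replace   // one-time install of the user-written command
bcuse wage1, clear           // loads wage1 from the Boston College archive
describe
summarize wage educ exper
```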
  • 28. Data: the books’ resources – ITSUS • Download the data from the book’s website: • https://www.stata-press.com/data/itsus.html • Check out the errata for the list of typos: • https://www.stata-press.com/books/errata/itsus.html • Install directly into Stata:
      net from "http://www.stata-press.com/data/itsus/"
      net get itsus_files
      net get itsus_data
      net install itsus_files
  • 29. Data: public databases • Always cite your source! • Data, papers, code, etc. https://scholar.google.com • Failing to do so is called plagiarism! Don’t be that person.
  • 30. Data: public databases • Always cite your source! • Data, papers, code, etc. • You can install the Google Scholar button in your browser for ease of access to several citation formats.
  • 31. Data: public databases • You can use many data types in Stata: • The data set does not have to be in Stata format (.dta) • You can import other formats such as csv, txt, xlsx, etc.
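To illustrate importing non-.dta files, the sketch below first writes a CSV and an Excel file from Stata's built-in `auto` dataset and then reads them back, so it runs without any external files. The file names `auto_demo.csv` and `auto_demo.xlsx` are made up for this example.

```stata
* Create small demo files from a dataset that ships with Stata
sysuse auto, clear
export delimited using "auto_demo.csv", replace
export excel using "auto_demo.xlsx", firstrow(variables) replace

* Read the CSV back in; the first row holds the variable names
import delimited using "auto_demo.csv", varnames(1) clear
describe

* Read the Excel file back in; firstrow treats row 1 as variable names
import excel using "auto_demo.xlsx", firstrow clear
describe
```

With real data, only the two `import` commands are needed, pointed at the downloaded file.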
  • 32. Data: FRED • Federal Reserve Economic Data (FRED) https://fred.stlouisfed.org/
  • 33. Data: FRED • Federal Reserve Economic Data (FRED) – https://fred.stlouisfed.org/ • Check out the video about how to use freduse • Install freduse either with the link or the code:
      findit freduse
      ssc install freduse
  • 34. Data: FRED • Federal Reserve Economic Data (FRED) – https://fred.stlouisfed.org/ • Use the search box to search for data used in class or assignments • Try GDP US. • Load it directly into Stata.
  • 35. Data: FRED • Federal Reserve Economic Data (FRED) – https://fred.stlouisfed.org/ • Use the search box to search for data used in class or assignments • Try GDP US. • Load it directly into Stata.
  • 36. Data: FRED • You’ll need the variable names. • Variable name for Real GDP of the US: GDPC1 • Stata 17 allows you to directly search in the program. • For versions earlier than 15, you’ll need to peruse the website.
  • 37. Data: FRED • Direct access in Stata • File > Import > Federal Reserve Economic Data (FRED) • You can search for keywords, add dates and import the data directly into Stata. • You still have to add the code in your assignments! • e.g. freduse GDPC1
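The menu import above has a command-line counterpart in Stata 15 and later, `import fred`, which is a different command from the user-written `freduse` on the slide. The sketch below is an assumption-laden illustration: it needs an internet connection and a personal FRED API key (YOUR_API_KEY is a placeholder), and the date range and plot labels were chosen arbitrarily.

```stata
* One-time setup: replace YOUR_API_KEY with your own key from the FRED website
set fredkey YOUR_API_KEY, permanently

* Real GDP of the US (series GDPC1), quarterly
import fred GDPC1, daterange(2000-01-01 2021-12-31) clear

* import fred returns a daily date variable (daten); convert it to quarterly for tsset
generate qdate = qofd(daten)
format qdate %tq
tsset qdate
tsline GDPC1, title("US real GDP") ytitle("Billions of chained dollars") xtitle("Quarter")
```

Whichever command you use, the line that loads the data belongs in the do-file so the analysis can be reproduced.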
  • 38. Data: Yahoo! Finance • Data can be downloaded from the website or directly into Stata using fetchyahooquotes
  • 39. Yahoo! Finance • Use the code below to install:
      net install http://researchata.com/stata/203/fetchyahooquotes.pkg, force
  • 40. Data: Yahoo! Finance • Alphabet Inc. (GOOG) • Once installed, try this code to check if it’s working properly:
      clear all
      fetchyahooquotes GOOG, freq(m) start(1jan2007) end(31dec2018)
      line adjclose_GOOG date
  • 41. Exams • Exams will be similar to the homework assignments: • Analyzing the data sets assigned to you. • Interpreting the results. • Making and discussing the inferences. • Midterm will be during lecture time. • Tentative: March 10. Open-book. On paper. • Final exam date and time will be determined by the registrar. • Cumulative. In the computer lab. Double check the date and time closer to the exam date.
  • 42. Group projects • Data does not speak for itself! • Even after you have carried out every analysis under the sun and found every bit of information the data may contain (assuming you chose the correct methods), you still have to find a neat and efficient way to present all your analyses and inferences. • Working on a project will give you experience in presenting your work in front of an audience. • You will also provide feedback for other projects assigned to you.
  • 43. Group projects • Groups of 3-4. You can work with people in the other section. • You will: • Come up with a research question. • Find relevant (publicly available) data. • Write a proposal on how you are planning to tackle your research question. Present your proposal in 8 minutes in class. • Tackle your research question using the methods we learned, analyze the data. Present your results in 10 minutes in class. This will be the finished project. • Write a final report.
  • 44. Group projects • Submissions: Project proposal (1st and 2nd drafts), updated project report with results, final report. • Due dates on the syllabus. • About the research question: find something interesting. • Every year I see at least 2 projects about the same old topics. Be original. If you repeat any topic that has been done before (i.e., if older cohorts gave you a topic “idea”), I’ll ask for originality. • This is a good time to think about your capstone. Try a few things now so you can improve those or move on to new things for your capstone.
  • 45. Group projects • About the group project data: • Start looking for the data the moment you have a research question idea. • Every year at least a couple of groups realize that they can’t access data relevant to their topic. • Either the data isn’t publicly available or simply does not exist. • Or the data is publicly available, but it is too messy and complicated to work on. E.g., it requires extensive clean up, deciphering the variables, etc. • You are allowed to change your topic and your research question, but you can’t get the wasted time back. Find and check the data before it is too late.
  • 46. Group projects • About the group project data: • You can work with any publicly available data. • Except for data created for teaching purposes, such as data for books or software. E.g., Stata datasets cited in the user manuals, or the datasets included in Wooldridge (or any other book), may be altered to fit the teaching narrative. • There are many publicly available trusted data resources. • Do your own research to find them. It is a part of the learning experience. • Ex: interested in global warming? NASA might have the data you're looking for. COVID-19? Check out WHO. Macroeconomics data? FRED has it covered.
  • 47. Group projects • About the group project data: • Work with financial data at your own risk. • NYUAD has access to the WRDS database via the library computers, but our course hasn’t been given personal (home) access since it is not a finance course. Rethink your decision to work with financial data if you have no previous experience (deep personal interest, previous courses, etc.) with finance theory and accounting terms. The instructor will not be able to help you decipher the WRDS database. • Want to collect your own survey data? Consider these first: • Have you checked the regulations? Do you have the IRB approval? • Do you know how to properly do this? (population specification, sample frame, non-responses, nonleading words, mutually exclusive categories, etc.) • Are you sure you can collect a large enough sample on time?
  • 48. Attendance and participation • Data analysis is a VAST area. We will focus on practice and only enough theory to make sense of what we are doing and why. But we cannot completely ignore theory since practice won’t make sense otherwise. • This is a lecture class. Take notes! You will need your notes along with the slides and the books. Your learning experience will improve significantly if you pay attention to the lectures.
  • 49. Attendance and participation • This course is at your level as long as you put in the necessary effort. • Attend: Pay attention to the lectures. • Participate: Ask and answer questions. • Read: Study the material. • Do: Do the weekly homework assignments*. *This font on the slides indicates code you can type into Stata. (Yes, even this!)
  • 50. Attendance and participation • Read the assigned material as we cover it so you can participate in class. • Beyond your grade, your learning experience depends heavily on how much attention you pay to the lectures. You’ll have an easier time navigating the topics throughout the semester. • Do not hesitate to give a “wrong” answer to my questions during lectures.* *except in exams, assignments and project presentations, of course. • If you won’t be able to attend a lecture, let me know beforehand and tell me the reason. • If you can’t let me know before, definitely let me know after. • We can discuss how to make sure you don’t fall behind.
  • 51. Attendance and participation • As per the university’s decision, the lectures will remain online at least until February 4. • Cameras will stay on for the duration of the lectures. • This is comparable to physically showing up to a class. • You are expected to attend lectures on campus once the university reverts to face-to-face teaching.
  • 52. Attendance and participation • Participation means asking and answering questions. • The class discussions will help you discover the areas you didn’t fully understand. • It will give me the ability to reinforce those areas. • It will also earn you 10% of your grade. You can’t get an A if you are not participating. A starts at 93. • Do not hesitate to give a “wrong” answer to my questions during lectures.* These are a part of the learning process. • We will use the chat window on Zoom to ask and answer questions. • You can raise your hand (button on Zoom) or directly write questions in the chat. I’ll keep track of that area while teaching. *obviously, except exams and project presentations, etc.
  • 53. Participation grade rubric • 50 for attending all lectures + 50 for actively answering questions and participating in the discussions • 100 – student does not miss a lecture without an excuse and consistently participates in the discussion (15+ active participations out of 22 lectures) • 90 – student attends all lectures and participates sometimes but not always (10+ active) • 85 – student attends but participates only a few times out of 22 lectures (5+ active) • 80 – student attends but never participates (<5 active participations) • 75 – student attends the lectures but is consistently late and/or preoccupied, with 5+ active participations • 70 – student misses several (3+) lectures without an excuse or never participates • 65 – student misses several (3+) lectures without an excuse and never participates • 50 – 4-6 missing lectures • 30 – 7+ missing lectures • 0 – 10+ missing lectures
  • 54. Attendance and participation • Please attend the section you are registered in. • Unless you have permission from me. Let me know beforehand. • This is to ensure virtual class manageability and the sanitization requirements when we go back to face-to-face teaching. • Please be present on time. • It won’t be easy to allow people in from the waiting room after the first few minutes. • I’ll be online a few minutes early if you want to ask any questions or discuss a topic. • In-class lectures will also start at exactly 9am and 10:25am.
  • 55. Schedule • First few classes: a fast review of necessary statistical concepts and Stata code. • You have already covered these topics in previous classes. This will only be to refresh your memory. • These will be the building blocks to understanding the rest of the class and data analysis in general. • Pay close attention to what you can easily remember and what you need to review on your own. • Technically, I will assume you already know these concepts, but a refresher never hurts. • Ask me if you need additional resources for recap.
  • 56. Schedule • First part of the semester: cross-sectional data • Simple linear regression • Multiple linear regression • Common mistakes and how to avoid them • Common problems and possible solutions • Multiple regression analysis with qualitative information • Binary (dummy) variables • Logit and probit models • Instrumental variables • Will take about 60–70% of the semester (8–9 weeks)
  • 57. Schedule • Second part of the semester: time-series data • Components of time-series data • Basic regression analysis with time-series data • Time-series filters • Parametric time-series models • Common problems and possible solutions • Forecasting • A brief intro to panel data • Will take about 30–40% of the semester (4–5 weeks) • Most methods covered in the first 8 weeks will be applicable to time-series and panel data. The second part will also include methods that can only be applied to time-series or panel data.
  • 58. Office hours • Monday, 8:30am-10:00am or by appointment. • Always on Zoom. Links on Brightspace. • You can join in any time until 9:50am. You don’t have to arrive at 8:30 and stay the whole time. Stop by, ask your question and either stay and listen to other people’s questions or leave. No obligation either way. • Please do not hesitate to ask for help. I’m always happy to assist you with this class. If you do not ask, I cannot help you. • If you are having difficulty following the material because of gaps in your background, let me know so I can recommend extra resources. • If you would like to dive deeper into these topics, let me know so I can assign you extra optional reading material.
  • 59. Grading • Requirement: Percentage • Weekly homework assignments: 20% • Group project presentation and report: 20% • Participation: 10% • Midterm exam: 20% • Final exam: 30%
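As a rough illustration of how the grading weights combine, here is a minimal Python sketch. The component scores are hypothetical, and the sketch assumes each component is scored out of 100 and the course grade is a plain weighted average; the slides do not specify rounding or curving.

```python
# Weights (in percent) taken from the grading slide; they sum to 100.
WEIGHTS = {
    "homework": 20,
    "project": 20,
    "participation": 10,
    "midterm": 20,
    "final": 30,
}


def course_grade(scores):
    """Weighted average of component scores, each assumed to be out of 100."""
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "homework": 95,
    "project": 90,
    "participation": 80,
    "midterm": 92,
    "final": 94,
}

grade = course_grade(example_scores)
print(grade)
```

With these made-up scores the weighted average comes to 91.6, just short of the 93 that the participation slide gives as the start of the A range.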
  • 60. Before you go… • Things to do before Thursday: • Read Wooldridge Chapter 1, and Appendices A and B. • Install Stata. • Download Wooldridge’s data sets. • Download ITSUS data sets. • Install freduse. • Complete the installation and try the code on the slides. If it doesn’t work, the installation was not completed properly. • Peruse FRED’s website. • Install fetchyahooquotes. • Try the code included on the slides. • Be mindful of the time zones. Everything is listed in GST (GMT+4).