The Emerging Role of Data Scientists
on Software Development Teams
Miryung Kim
UCLA
Los Angeles, CA, USA
[email protected]
Thomas Zimmermann Robert DeLine Andrew Begel
Microsoft Research
Redmond, WA, USA
{tzimmer, rdeline, andrew.begel}@microsoft.com
ABSTRACT
Creating and running software produces large amounts of raw data about the development process and the customer usage, which can be turned into actionable insight with the help of skilled data scientists. Unfortunately, data scientists with the analytical and software engineering skills to analyze these large data sets have been hard to come by; only recently have software companies started to develop competencies in software-oriented data analytics. To understand this emerging role, we interviewed data scientists across several product groups at Microsoft. In this paper, we describe their education and training background, their missions in software engineering contexts, and the type of problems on which they work. We identify five distinct working styles of data scientists: (1) Insight Providers, who work with engineers to collect the data needed to inform decisions that managers make; (2) Modeling Specialists, who use their machine learning expertise to build predictive models; (3) Platform Builders, who create data platforms, balancing both engineering and data analysis concerns; (4) Polymaths, who do all data science activities themselves; and (5) Team Leaders, who run teams of data scientists and spread best practices. We further describe a set of strategies that they employ to increase the impact and actionability of their work.
Categories and Subject Descriptors:
D.2.9 [Management]
General Terms:
Management, Measurement, Human Factors.
1. INTRODUCTION
Software teams are increasingly using data analysis to inform their engineering and business decisions [1] and to build data solutions that utilize data in software products [2]. The people who do collection and analysis are called data scientists, a term coined by DJ Patil and Jeff Hammerbacher in 2008 to define their jobs at LinkedIn and Facebook [3]. The mission of a data scientist is to transform data into insight, providing guidance for leaders to take action [4]. One example is the use of user telemetry data to redesign Windows Explorer (a tool for file management) for Windows 8. Data scientists on the Windows team discovered that the top ten most frequent commands accounted for 81.2% of all invoked commands, but only two of these were easily accessible from the command bar in the user interface [5]. Based on this insight, the team redesigned the user experience to make these hidden commands more prominent.
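The kind of figure quoted in the Windows Explorer example (the share of all invocations covered by the most frequent commands) can be sketched in a few lines. This is a minimal illustration only, not the Windows team's actual analysis: the function name `top_k_share`, the command names, and the counts are all hypothetical and invented for the example.

```python
from collections import Counter


def top_k_share(command_counts, k=10):
    """Return the fraction of all invocations covered by the k most frequent commands."""
    total = sum(command_counts.values())
    if total == 0:
        return 0.0  # no telemetry recorded, so no share to report
    top = Counter(command_counts).most_common(k)
    return sum(count for _, count in top) / total


# Hypothetical telemetry counts, invented for illustration only.
sample_counts = {
    "paste": 500,
    "properties": 300,
    "copy": 120,
    "delete": 50,
    "rename": 30,
}

share = top_k_share(sample_counts, k=2)
print(f"Top 2 commands cover {share:.1%} of invocations")
```

Comparing the commands returned by `most_common` against the set of commands exposed in the user interface would then show which frequently used commands are hard to reach.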
Until recently, data scientists were found mostly on software teams whose products were data-intensive, like internet search and advertising. Today, we have reached an inflection point where many.
MAT 111 – Spring 2020 Name ___________________________________________
Quiz 4.1/4.2 – Rational Functions & Their Graphs
Identify any vertical and/or horizontal asymptotes. If one does not exist, write DNE.
Circle/Box final answer. Label each answer. You MUST show work to support your answer to receive
full credit (i.e. factor polynomials, identify how you determined horizontal asymptote, etc.)
1. $f(x) = \dfrac{x^3}{3x+2}$
2. $g(x) = \dfrac{x^2-4}{x^2+2x-8}$
3. $f(x) = \dfrac{x-3}{x^2-16}$
4. Identify any vertical or horizontal asymptotes. If one does not exist, write DNE.
Vertical Asymptote(s):
Horizontal Asymptote(s):
Identify the interval over which the function is increasing, decreasing or constant.
#5 & 6 Find the domain/range, vertical/horizontal asymptotes, the x- and y-intercepts, and at least 3
additional points on the graph. If a value does not exist, label it as DNE. Graph the function. Be neat!
Label all answers. Show work when possible to receive full credit.
5. $f(x) = \dfrac{-2x+9}{x-3}$
6. $g(x) = \dfrac{x+1}{x^2-4}$
I PROMISE I AM COMPLETING THIS QUIZ ON MY OWN AND I WILL NOT SHARE
THE CONTENTS OF THIS QUIZ OR MY ANSWERS WITH ANYONE.
SIGN: _____________________________________________________ DATE: __________________
Assessment 2:
Description/Focus: Essay
Value: 50%
Due Date: Midnight Sunday 2 (Week 12)
Length: 2500 words
Task: Human services practitioners work across many domains of practice including direct work with individuals, groups and communities.
1. Critically examine the policy or policies that you consider impact upon a client group and suggest ways that policy could be changed to improve the life outcomes for those with whom you are working.
2. Develop a framework that you would adopt for influencing policy change that aligns with your professional values, standards and ethics.
Presentation: The document will be typed in a word document, 12 pt. Font, 1½ or Double spacing
Assessment criteria:
· Critical analysis of social policy
· Application of theory to practice
· Adherence to academic conventions of writing (e.g. referencing; writing style)
· At least 8 references. Format APA 6th referencing.
Running head: NETWORK AND WORKFLOW FOR A DATA ANALYTICS COMPANY
Network and Workflow for a Data Analytics Company on Sports
Student Name Nezar Al Massad
Institution Name Dr. Mark O'Connell
Network and Workflow for a Data Analytics Company on Sports
A company’s network and workflow play a major role in its performance and growth. Different companies rely on different networks and workflows depending on the services/tasks they are providing and the number of workers and members of staff. A network tends to connect workers and members of staff at different levels of the company. This network tends to create a good and effective workflow within the company, hence a company network and workflow go hand in hand. When creating a network and a workflow of a company, the workers and members of staff working duration must be considered in order to achieve a company objective (Moretti, 2017). Also, the mode of employment, which may be permanent or temporary/laying down of workers within a short period of time, to a large extent determines a company’s network and workflow. The change of an organizational requirement due to growth and expansion creates a need for a company to adopt a new network and workflow. A network in a company plays a vital role in guiding how the company should run its operations.
Comment by Mark O'Connell: Duration??
Comment by Mark O'Connell: What? Laying down??
Comment by Mark O'Connell: OK so stop educating us about the factors that determine a company’s network and tell us about YOUR network
Comment by Mark O'Connell: Too obvious
My company requires data analysts to perform data analysis, make important strategic decisions, and identify opportunities in the market, and therefore data analysts are becoming vital to our company. Despite this, there are many companies coming u.
Ludmila Orlova
HOW USE OF AGILE METHODOLOGY IN SOFTWARE DEVELOPMENT INFLUENCE AGILITY OF THE BUSINESS
Agile methodology is a widely used tool for software development. This article explores research data about the use of these tools and their influence on the quality of the end product, the performance of development, and the overall agility of businesses and companies.
KEYWORDS:
Agile, software development, agile business
CONTENT
1 INTRODUCTION
2 AGILE SOFTWARE DEVELOPMENT
3 SCALING AGILE
4 AGILE BUSINESS
5 CONCLUSION
REFERENCES
1 INTRODUCTION
The fast pace of scientific progress in solid-state electronics led to incredible progress in computing devices, which in turn demanded software to control and manage the power of computer calculation and usage.
Software engineering emerged in the second half of the 20th century and by the end of it had become a separate, state-of-the-art science, an activity, and a profession for millions. There are about 18.2 million software developers worldwide, a number that is due to rise to 26.4 million by 2019, a 45% increase, says Evans Data Corp. in its latest Global Developer Population and Demographic Study (P. Thibodeau, 2013). Along with the growing number of software developers (software development firms, projects and people involved), the need for effective management of the software development process increased. This demanded new approaches and methodologies from business researchers and managers. In the last several decades a large body of research, in both the IT field and business management, has been dedicated to this area.
The popularity of agile software development methods started about a decade ago, and at present these methods are employed by many big, medium-size and small companies. The still-growing attention to agile methods from software development specialists confirms that these methods filled a gap in management techniques for software development, a field that emerged and developed extremely fast along with the speedy advancement of hardware in the IT area. A great deal of research has been done in areas such as changes in the performance of software development using agile methods, or scaling agile for large companies and teams. Another modern trend is the attempt to apply agile methodology to project management, marketing, sales and other activities. The goal of this article is to explore the influence of applying agile methods in software development on the agility of the whole company and business. The work is based on secondary data taken from multiple sources and was performed as an exploratory study and a review of existing research in the area.
2 AGILE SOFTWARE DEVELOPMENT
The definition of the adjective agile in English is: able to move quickly and easily, or able to think and understand quickly (Oxford Dictionary, 2015). Its most common contemporary use is captured by the following definition: relating to or denoting a method of project management, used especially for software development, that is characterized by the division of tasks into ...
ERP usability is such a problem that according to this study by IFS North America, many users would use Microsoft Excel, online tools like Google Docs or even Dropbox instead of their system of record. Some may even consider changing jobs to get away from bad ERP.
Data science is in high demand; the melting pot of complex skills required of a qualified data scientist has made them the unicorns of today's data-driven landscape.
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
CSCJournals
The management of unstructured data is acknowledged as one of the most critical unsolved problems in the data management and business intelligence fields today. The major reason the problem remains unresolved is that the methods, systems and related tools that have so successfully converted structured information into business intelligence are simply ineffective when applied to unstructured information. New methods and approaches are needed. It is well known that a huge amount of information is shared by organizations across the world over the web. It is, however, significant to observe that this global information explosion has opened many new avenues for creating data management and business intelligence tools that focus primarily on unstructured data. In this paper, we explore the challenges faced by information system developers when mining unstructured data in the context of the semantic web and web mining. Opportunities in the wake of these challenges are discussed towards the end of the paper.
From Card Sort to Redesigned Intranet Site: A Success Story
Bob Thomas
This is a case study describing how we turned an intranet that was not working for its employees into a successful website.
The objectives were to "lead with the need," to determine how current employees want their information organized, and to decide what should be emphasized first. Our goals were to redesign the intranet site, focusing on the navigation and organization (overall information architecture); to make the site usable, especially when it comes to finding things; and to make the site attractive and understandable to employees.
A card sort seemed the best solution because we had 2 sites that needed to be unified and that had grown too small for the additional information that kept getting added year in and year out. We had an IA that wasn’t working, and we wanted to discover how employees – the principal users of the site – wanted their information organized.
We chose an open card sort because we wanted participants to tell us how they wanted the main categories of the site named (as opposed to us naming the categories for them).
Running head: CS688 – Data Analytics with R
CS688 – Data Analytics with R
Surendra Parimi
CS688 – Introduction to CRISP-DM and the R platform IP 1
Colorado Technical University
07/10/2019
Table of Contents
Introduction to CRISP-DM and the R Platform Organizational Background
Organizational Background:
CRISP-DM (Cross-industry standard process for data mining):
Data Maturity:
Role of Data Analyst:
How Do we Implement the R Platform:
R Modeling With Regressions and Classifications (TBD)
Model Performance Evaluation (TBD)
Visualizations With R (TBD)
Machine Learning (TBD)
References
Introduction to CRISP-DM and the R Platform Organizational Background
Organizational Background:
The organization I currently work for, and where I plan to implement the techniques of the data analytics course, is T-Mobile USA, which offers wireless mobile phone services to over 80 million customers in the United States. It is a huge enterprise with large-scale information technology systems that support T-Mobile's business. The company is seeing significant business growth, and the IT systems supporting the business are growing with it. As a DevOps engineer, I work on deploying code to these mission-critical systems, hosting them, and operating them to make sure they work as expected. As the landscape of our IT systems grows, we want to be able to identify issues in our systems in advance so that we can prevent them before they cause any outage to the business. To achieve this, our IT system logs need to be analyzed in depth to uncover critical insights about system performance, and the feedback applied to improve our systems.
CRISP-DM (Cross-industry standard process for data mining):
CRISP-DM helps us ensure that our data analysis adheres to certain standards, and it is a proven approach worldwide. Corporations like IBM have further enhanced or customized the standard and come up with their own methodology, known as ‘Analytics Solutions Unified Method for Data Mining/Predictive Analytics (ASUM-DM)’.
The CRISP-DM methodology involves 6 different steps:
Business Understanding: Building knowledge about the business requirements and objectives from a functional perspective and transforming this knowledge into a data mining objective with an implementation plan.
Data Understanding: Collecting data from diverse sources, then reviewing and understanding it in order to identify problems that compromise data quality and to gain an initial understanding of what the data can deliver.
Data Preparation: The data preparation phase covers all activities to build the final dataset from the initial raw data collected.
Modeling: Modeling techniques are chosen based on the objective of the problem being addressed. So, based on the problem, the model is decided, and based on the model, data is collected.
Evaluation: The evaluation phase is taken up once.
Productivity Factors in Software Development for PC Platform
IJERA Editor
Identifying the most relevant factors influencing project performance is essential for implementing business strategies by selecting and adjusting proper improvement activities. The two major classification algorithms, CRT and ANN, recommended by the Auto Classifier tool in SPSS Modeler were used to determine the most important variables (attributes) of software development in the PC environment. While their accuracies in classifying productive versus non-productive cases are relatively close (72% vs 69%), their rankings of important variables differ. CRT ranks Programming Language as the most important variable and Function Points as the least important. ANN, on the other hand, ranks Function Points as the most important, followed by team size and Programming Language.
Rafael Oliveira Bitcoin says that, in practice, data scientists work on a variety of tasks, such as data cleaning, data exploration, feature engineering, modeling, and evaluation.
The employee life cycle is a foundational framework for a robust and healthy employee experience and is a major contributor to the success of the organization. It is also a powerful mechanism that can, when well designed and properly used, make a company a workplace where employees want to be every day of the week and where creativity and innovation show up even when leaders are just hoping for them. Learners are asked to respond to the following question for this last discussion in the course: Which parts of the employment life cycle do you consider most important and why?
Resources
Employee Life Cycle Impact on Engagement
(2018, Feb 28). Report details how moments that matter & employee value propositions impact worker engagement. PR Newswire.
"Among the most critical components shaping (the organization's engagement) ecosystem is the employee value proposition, the tangible and intangible deal that organizations provide in exchange for employee effort, commitment and performance."
Bradison, P. (2019). HR Matters: From recruiting to onboarding the importance of quality new hire work flows. Alaska Business Monthly, 35(4), 83.
This article describes how "employees from multiple generations are seeking employment with a consumer’s approach" when they consider more than the pay structure before applying for a position.
Working in HRM
Justin, T. C. (2018). Addressing the top HR challenges in 2019. HR Strategy and Planning Excellence Essentials.
This preview to the year in HRM in Canada considers these hot topics: "catering to a multi-generational workforce, employee engagement, increasing feedback, attracting and keeping the right employees, and now marijuana in the workplace."
Sato, Y., Kobayashi, N., & Shirasaka, S. (2020). An analysis of human resource management for knowledge workers: Using the three axes of target employee, lifecycle stage, and human resource flow. Review of Integrative Business and Economics Research, 9(1), 140–156.
This study considers human resource flow management and how to foster that along with two other HRM initiatives with knowledge workers.
Tyler, K. (2019). 10 steps to unlocking innovation at your organization. HRMagazine, 64(1), 1.
Innovation is a key component for the longevity of an organization and "HR can't expect to foster an innovative company culture if it does not have an innovative culture within its own function." This resource is inspiring to help HR professionals find a purpose for their efforts to improve all steps in the employee life cycle and embrace the HR platforms and tools that will help them towards this goal.
Case Study
Saurombe, M., Barkhuizen, E. N., & Schutte, N. E. (2017). Management perceptions of a higher educational brand for the attraction of talented academic staff. SA Journal of Human Resource Management, 15.
This study gives a great example of how managers think about branding in higher education and how a.
The economy is driven by data ~ Data sustains an organization’s business processes and enables it to deliver products and services. Stop the flow of data, and for many companies, business comes quickly to a halt. Those who understand its value and have the ability to manage related risks will have a competitive advantage. If the loss of data lasts long enough, the viability of an organization to survive may come into question.
What is the significant difference between quality assurance and quality control? Explain.
Why is there a relationship between QA/QC and risk management? Explain.
Why are policies needed to govern data both in transit and at rest (not being used or accessed)? Explain.
The Emerging Role of Data Scientists
on Software Development Teams
Miryung Kim
UCLA
Los Angeles, CA, USA
[email protected]
Thomas Zimmermann Robert DeLine Andrew Begel
Microsoft Research
Redmond, WA, USA
{tzimmer, rdeline, andrew.begel}@microsoft.com
ABSTRACT
Creating and running software produces large amounts of raw data about the development process and the customer usage, which can be turned into actionable insight with the help of skilled data scientists. Unfortunately, data scientists with the analytical and software engineering skills to analyze these large data sets have been hard to come by; only recently have software companies started to develop competencies in software-oriented data analytics. To understand this emerging role, we interviewed data scientists across several product groups at Microsoft. In this paper, we describe their education and training background, their missions in software engineering contexts, and the type of problems on which they work. We identify five distinct working styles of data scientists: (1) Insight Providers, who work with engineers to collect the data needed to inform decisions that managers make; (2) Modeling Specialists, who use their machine learning expertise to build predictive models; (3) Platform Builders, who create data platforms, balancing both engineering and data analysis concerns; (4) Polymaths, who do all data science activities themselves; and (5) Team Leaders, who run teams of data scientists and spread best practices. We further describe a set of strategies that they employ to increase the impact and actionability of their work.
Categories and Subject Descriptors:
D.2.9 [Management]
General Terms:
Management, Measurement, Human Factors.
1. INTRODUCTION
Software teams are increasingly using data analysis to inform their engineering and business decisions [1] and to build data solutions that utilize data in software products [2]. The people who do collection and analysis are called data scientists, a term coined by DJ Patil and Jeff Hammerbacher in 2008 to define their jobs at LinkedIn and Facebook [3]. The mission of a data scientist is to transform data into insight, providing guidance for leaders to take action [4]. One example is the use of user telemetry data to redesign Windows Explorer (a tool for file management) for Windows 8. Data scientists on the Windows team discovered that the top ten most frequent commands accounted for 81.2% of all invoked commands, but only two of these were easily accessible from the command bar in the user interface [5]. Based on this insight, the team redesigned the user experience to make these hidden commands more prominent.
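The kind of analysis behind the Windows Explorer finding can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `top_k_share` and the sample log are hypothetical, and the paper does not describe how the Windows team actually computed its figure.

```python
from collections import Counter


def top_k_share(invocations, k=10):
    """Return the fraction of all invocations covered by the k most frequent commands."""
    counts = Counter(invocations)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty log covers nothing
    return sum(n for _, n in counts.most_common(k)) / total


# Hypothetical telemetry log with one entry per invoked command.
log = ["paste"] * 5 + ["copy"] * 3 + ["delete"] + ["rename"]
print(top_k_share(log, k=2))  # share of invocations covered by the two most frequent commands
```

Comparing this share against the set of commands that are actually exposed in the user interface is what would surface "hidden" but heavily used commands.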
Until recently, data scientists were found mostly on software teams whose products were data-intensive, like internet search and advertising. Today, we have reached an inflection point where many software teams are starting to adopt data-driven decision making. The role of data scientist is becoming standard on development teams, alongside existing roles like developers, testers, and program managers. Online service-oriented businesses such as Bing or Azure often require that software quality be assessed in the field (testing in production); as a result, Microsoft changed the test discipline and now hires data scientists to help with analyzing the large amount of usage data. With more rapid and continuous releases of software [6], software development teams also need effective ways to operationalize data analytics by iteratively updating the software to gather new data and automatically produce new analysis results.
So far, there have been only a few studies about data scientists, which focused on the limitations of big data cloud computing tools and the pain points that data scientists face, based on the experiences of participants from several types of businesses [7, 8]. However, these studies have not investigated the emerging roles that data scientists play within software development teams.
To investigate this emerging role, we interviewed 16 data scientists from eight different product organizations within Microsoft. During the period of our interviews, Microsoft was in the process of defining an official “career path” for employees in the role of data scientist, that is, defining the knowledge and skills expected of the role at different career stages. This process made Microsoft a particularly fruitful location to conduct our research, and several of our participants took part in this process. We investigated the following research questions:
Q1 Why are data scientists needed in software development teams and what competencies are important?
Q2 What are the educational and training backgrounds of data scientists in software development teams?
Q3 What kinds of problems and activities do data scientists work on in software development teams?
Q4 What are the working styles of data scientists in software development teams?
This paper makes the following contributions:
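• We characterize how and why data scientists are employed in a large software company. (Section 4)
• We describe the working styles of data scientists and their strategies to increase the impact of their work. (Section 5)
The paper concludes with a discussion of implications (Section 6).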
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
ICSE’16, May 14–22, 2016, Austin, TX, USA.
Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-3900-1/16/05 …$15.00.
DOI: http://dx.doi.org/10.1145/2884781.2884783
2016 IEEE/ACM 38th IEEE International Conference on Software Engineering
2. RELATED WORK
The work related to this paper falls into general data science and software analytics.
Data Science has become popular over the past few years as companies have recognized the value of data, for example, as data products, to optimize operations, and to support decision making. Not only did Davenport and Patil [9] proclaim that data scientist would be “the sexiest job of the 21st century,” many authors have published data science books based on their own experiences, e.g., O’Neill and Schutt [10], Foreman [11], or May [12]. Patil summarized strategies to hire and build effective data science teams based on his experience in building the data science team at LinkedIn [3].
We found a small number of studies which systematically focused on how data scientists work inside a company. Fisher et al. interviewed sixteen Microsoft data analysts working with large datasets, with the goal of identifying pain points from a tooling perspective [7]. They uncovered tooling challenges in big data computing platforms such as data integration, cloud computing cost estimation, difficulties shaping data to the computing platform, and the need for fast iteration on the analysis results. However, they did not describe the roles that data scientists play within software development teams.
In a survey, Harris et al. asked 250+ data science practitioners how they viewed their skills, careers, and experiences with prospective employers [13]. Then, they clustered the survey respondents into four roles: Data Businesspeople, Data Creatives, Data Developers, and Data Researchers. They also observed evidence for so-called “T-shaped” data scientists, who have a wide breadth of skills with depth in a single skill area. Harris et al. focus on general business intelligence analysts rather than data scientists in a software development organization. Due to the nature of a survey research method, Harris et al. also do not provide contextual, deeper findings on the types of problems that data scientists work on or the strategies that they use to increase the impact of their work.
Kandel et al. conducted interviews with 35 enterprise analysts in healthcare, retail, marketing, and finance [8]. Companies of all kinds have long employed business intelligence analysts to improve sales and marketing strategies. However, the data scientists we study are different in that they are an integral part of the software engineering team and focus their attention on software-oriented data and applications. Unlike our work, the Kandel et al. study does not investigate how data scientists contribute to software debugging, defect prediction, and software usage data (telemetry) collection in software development contexts.
Software Analytics is a subfield of analytics with the focus on software data. Software data can take many forms such as source code, changes, bug reports, code reviews, execution data, user feedback, and telemetry information. Davenport, Harris, and Morison [4] define analytics “as the use of analysis, data, and systematic reasoning to make decisions.” According to an Accenture survey of 254 US managers in industry, however, up to 40 percent of major decisions are based on gut feel rather than facts [14]. Due to the recent boom in big data, several research groups have pushed for greater use of data for decision making [15, 16, 17] and have shared their experiences collaborating with industry on analytics projects [18, 16, 19].
Analysis of software data has a long tradition in the research communities of empirical software engineering, software reliability, and mining software repositories [1]. Software analytics has been the dedicated topic of tutorials and panels at the ICSE conference [20, 21], as well as special issues of IEEE Software (July 2013 and September 2013). Zhang et al. [22] emphasized the trinity of software analytics in the form of three research topics (development process, system, users) as well as three technology pillars (information visualization, analysis algorithms, large-scale computing). Buse and Zimmermann argued for a dedicated data science role in software projects [17] and presented an empirical survey with software professionals on guidelines for analytics in software development [23]. They identified typical scenarios and ranked popular indicators among software professionals. Begel and Zimmermann collected 145 questions that software engineers would like to ask data scientists to investigate [24]. None of this work has focused on the characterization of data scientists on software teams, which is one of the contributions of this paper.
Many software companies such as LinkedIn, Twitter, and Facebook employ data scientists to analyze user behavior and user-provided content data [25, 26, 27]. However, the authors of these published reports concentrate mainly on their “big data” pipeline architectures and implementations, and ignore the organizational architecture and work activities of the data scientists themselves. According to our study, data scientists in software teams have a unique focus in analyzing their own software teams’ engineering processes to improve software correctness and developer productivity.
It is common to expect that action and insight should drive the collection of data. Goal-oriented approaches use goals, objectives, strategies, and other mechanisms to guide the choice of data to be collected and analyzed. For example, the Goal/Question/Metric (GQM) paradigm [28] proposes a top-down approach to define measurement; goals lead to questions, which are then answered with metrics. Other well-known approaches are GQM+ (which adds business alignment to GQM) [29], Balanced Scorecard (BSC) [30], and Practical Software Measurement [31].
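The top-down structure of GQM (goals lead to questions, which are answered with metrics) can be made concrete with a small Python sketch. The class names, the example goal, and the metric names below are all hypothetical illustrations and do not come from the GQM literature or from the study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    text: str
    metrics: List[str] = field(default_factory=list)


@dataclass
class Goal:
    statement: str
    questions: List[Question] = field(default_factory=list)

    def metrics_to_collect(self) -> List[str]:
        """Walk top-down from the goal to the distinct metrics that answer its questions."""
        ordered: List[str] = []
        for question in self.questions:
            for metric in question.metrics:
                if metric not in ordered:
                    ordered.append(metric)
        return ordered


# Hypothetical goal with two questions that share one metric.
goal = Goal(
    statement="Reduce crashes seen by customers",
    questions=[
        Question("How often does the product crash in the field?",
                 ["crashes per 1,000 sessions"]),
        Question("Which components crash most?",
                 ["crashes per component", "crashes per 1,000 sessions"]),
    ],
)
print(goal.metrics_to_collect())
```

The point of the design is that no metric is collected unless some question, and therefore some goal, asks for it.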
Basili et al. [32] proposed the Experience Factory, which is an independent organization to support a software development organization in collecting experiences from their projects. The Experience Factory packages these experiences (for example, in models) and validates and reuses experiences in future projects. Some of the team structures that we observed in the interviews were similar to an Experience Factory in spirit; however, many data scientists were also directly embedded in the development organizations. While some experiences can be reused across different products, not all insight is valid and actionable in different contexts.
3. METHODOLOGY
We interviewed people who acted in the role of data scientists, then formed a theory of the roles that data scientists play in software development organizations.
Protocol. We conducted one-hour, semi-structured interviews, giving us the advantage of allowing unanticipated information to be mentioned [33]. All interviews were conducted by two people. Each was led by the first author, who was accompanied by one of the other three authors (as schedules permitted) who took notes and asked additional questions. Interviews were audio-taped and later transcribed for analysis. The interview format started with an introduction, a short explanation of the research being conducted, and demographic questions. Participants were then asked about the role they played on their team, their data science-related background, their current project(s), and their interactions with other employees. We also asked for stories about successes, pitfalls, and the changes that data is having on their team’s practices. Our interview guide is in Appendix A.
Participants. In total, we interviewed 16 participants (5 women, 11 men) from eight different organizations at Microsoft: Advanced Technology Lab (1 participant), Advertisement and Monetization (1), Azure (2), Bing (1), Engineering Excellence (1), Exchange (1), Office (1), Skype (2), Windows (4), and Xbox (2).
We selected participants by snowball sampling [34]:
• We first identified data science thought leaders at data-driven engineering meet-ups and technical community meetings, since these have been responsible internally for sharing best practices.
• We then found further participants by word-of-mouth, asking each participant to introduce us to other data scientists or other key stakeholders whom they knew.
At the time of this study in Summer 2014, there was no easy way to identify those who do data science work at Microsoft. In fact, Microsoft was in the process of creating a new job discipline called “data and applied science.” Therefore, we used snowball sampling because it helped us locate hidden populations of data science practitioners, such as those employees working on data science tasks who do not have “data” or “data science” in their job title (see Table 1). As mentioned by P15, “a lot of people kind of moonlighted as data scientists besides their normal day job.” Our sampling method may have caused us to miss some data scientists; however, to mitigate this threat, we seeded our sample with data science thought leaders from various product teams identified through company-wide engineering meetups and technical community talks.
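The sampling procedure described above (start from seed participants, then follow referrals) can be sketched as a breadth-first traversal of a referral graph. This is a minimal illustration under stated assumptions: the function `snowball_sample`, the referral mapping, and the participant labels are hypothetical, and in the study itself referrals were gathered through interviews rather than from a known graph.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Collect participants by following referrals outward from the seed participants.

    seeds: people identified up front (for example, meetup speakers).
    referrals: mapping from a person to the people they would introduce.
    max_size: stop once this many participants have been sampled.
    """
    queue = deque(dict.fromkeys(seeds))  # de-duplicate seeds, keep their order
    seen = set(queue)
    sampled = []
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sampled


# Hypothetical referral graph.
referrals = {"A": ["B", "C"], "B": ["C", "D"], "C": [], "D": ["A"]}
print(snowball_sample(["A"], referrals, max_size=10))
```

Anyone who is never referred and is not a seed stays out of the sample, which is the threat that seeding with well-connected thought leaders is meant to reduce.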
Our findings reached saturation after interviewing 16 people. There was enough diversity in the participants’ responses to enable us to find meaningful themes and draw useful interpretations. Stopping after saturation is standard procedure in qualitative studies.
Data Analysis. The authors individually used the Atlas.TI qualitative coding tool (http://atlasti.com/) to code emerging themes from the transcripts; together, we discussed the taxonomies derived from the interview transcripts. In concert, we employed affinity diagramming [35] and card sorting [36] to make sense of our data. Figure 1 shows a screen snapshot of Atlas.TI with an interview transcript excerpt and corresponding code describing emerging themes. In order to further help with traceability and to provide the details of our data analysis process, our technical report lists the codes we derived, along with supporting quotes [37].
To infer the working styles of data scientists (Q4), we performed two card sorts based on the roles data scientists played. One was done by the first author, another by the second and third authors. When participants shared experiences from multiple roles, we put each role on a separate card. This happened when participants shared experiences from previous jobs on different teams (P2 and P12) or had multiple responsibilities (P15 manages one team of engineers building a machine learning platform and another team of data scientists using the platform to create models). Both card sorts led to similar groupings of the participants, which are discussed later in the paper.
TABLE 1. PARTICIPANT INFORMATION
Participant | Title | Education
P1 | Data Scientist II | BS in CS / Statistics, MS in SE, currently pursuing PhD in Informatics
P2 | Director, App Statistics Engineer | MS in Physics
P3 | Principal Data Scientist | MBA, BS in Physics / CS, currently pursuing PhD in Statistics
P4 | Principal Quality Manager | BS in CS
P5 | Partner Data Science Architect | PhD in Applied Mathematics
P6 | Principal Data Scientist | PhD in Physics
P7 | Research Software Design Engineer II | MS in Computer Science, MS in Statistics
P8 | Program Manager | BS in Cognitive Science
P9 | Senior Program Manager | BSE in CS and BAS in Economics/Finance
P10 | Director of Test | BS in CS
P11 | Principal Dev Manager | MS in CS
P12 | Data Scientist | PhD in CS / Machine Learning
P13 | Applied Scientist | PhD in CS / Machine Learning and Database
P14 | Principal Group Program Manager | BS in business
P15 | Director of Data Science | PhD in CS / Machine Learning
P16 | Senior Data Scientist | PhD in CS / Machine Learning
Figure 1. An interview transcript excerpt in Atlas.TI. Using this tool, we added codes describing emerging themes.
We categorized our results in terms of how and why data scientists are employed in a large software company (Section 4), and in terms of the working styles that data scientists use with software development teams and their strategies to increase the impact of their work (Section 5).
Limitations. This is a qualitative study based on 16 interviews. In this paper, we share the observations and themes that emerged in the interviews. Because experts in qualitative studies specifically warn of the danger of quantifying inherently qualitative data [38, 39], we do not make any quantitative statements about how frequently the themes occur in broader populations. We follow this guideline to focus on providing insights that contextualize our empirical findings, especially when they can serve as the basis for further studies, such as surveys.
Although this study was conducted only in one company, the participants came from eight different organizations, each working on different kinds of products. Several of our participants spoke of data science experiences they had prior to joining Microsoft. We believe that the nature of data science work in this context is meaningful, given that very few companies of a similar scale exist in the software industry. In addition, the software engineering research community also uses large-scale analysis of various types of software artifacts.
4. DATA SCIENTISTS IN SOFTWARE
DEVELOPMENT TEAMS
Data science is not a new field, but interest in it at Microsoft has grown rapidly in the last few years. In 2015, one year after this study, over six hundred people are now in the new “data and applied science” discipline, and an additional 1600+ employees are interested in data science work and have signed up to mailing lists related to data science topics.
We observed an evolution of data science in the company, both in terms of technology and people. Product leaders across the company are eager to be empowered to make data-driven engineering decisions, rather than relying on gut feel. Some study participants who initially started as vendors or individual contributors have moved into management roles. Participants also reported that infrastructure that was initially developed to support a single data-driven task, for example, the Windows Error Reporting tool used for collecting crash data from in-field software deployments [40], was later extended to support multiple tasks and multiple stakeholders. Separate organizations within the company merged their data engineering efforts to build common infrastructure. The term “data scientist” in this paper refers to a logical role, rather than the title of a position. Our goal is to understand and define this data scientist role by studying the participants in terms of their training and education background, what they work on, and how they fit in their organization, rather than restricting our study to those with the title “data scientists.”
4.1 Why are Data Scientists Needed in
Software Development Teams?
Data-driven decision making has increased the demand for data
scientists with statistical knowledge and skills. Specifically,
participants described the increasing need for knowledge about
experimental design, statistical reasoning, and data collection.
Demand for Experimentation. As the test-in-production paradigm for
on-line services has taken off, our participants recognized the
opportunity and need for designing experiments with real user data
[41]. Real customer usage data is easier to obtain and more
authentic than the simulated data test engineers create for
anticipated usage scenarios.
… generate a bunch of data, that data's already here. It's more
authentic because it’s real customers on real machines, real
networks. You no longer have to simulate and anticipate what the
customer’s going to do. [P10]
Participants mentioned an increase in the demand for experimenting
with alternative software implementations, in order to assess the
requirements and utility of new software features. Over the last
decade, randomized two-variant experiments (called A/B testing)
have been used to assess the utility of software prototypes and
features, particularly for online services like web search. Because
there are endless possibilities for alternative software designs,
data scientists and engineering teams build software systems with
an inherent capability to inject changes, called flighting.
… where I can actually experiment based on a mockup, if you will,
of the idea. I can actually come up with a set of ideas, broad
ideas about my work, and I can actually deploy them in some easy
way. [P5]
… many things you could do… We’re trying to flight things. It has
capability to inject changes. [P11]
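To make the A/B testing idea above concrete, the following is a minimal sketch of how the outcome of a two-variant flight might be compared. It is not taken from the study; the function name, the counts, and the choice of a two-proportion z-test are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two success rates."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants behave the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical flight: control (A) versus a new feature (B).
z, p = two_proportion_z_test(success_a=480, total_a=10_000,
                             success_b=560, total_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The sketch assumes large, independently randomized groups; a production experimentation system would also need to handle small samples, repeated looks at the data, and multiple metrics.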
Several participants took it upon themselves both to design
incentive systems that get users to adopt a product feature and to
create user telemetry and surveys that measure whether the systems
worked.
…ely use the feature. And then we watch what happens when we take
the game away. Did it stick or did it not stick? [P13]
Demand for Statistical Rigor. In the analysis of data, participants
told us that there is an increasing demand for statistical rigor.
Data scientists and their teams conduct formal hypothesis testing,
report confidence intervals, and determine baselines through
normalization.
For example, when participant P2 (who worked on estimating future
failures) reported her estimate to her manager, the manager asked
how confident she was. She gave him a hard number, surprising him
because whenever he had asked previous employees the same question,
he had been told only “highly confident” or “not very confident.”
…e “So, you are giving me some predictions. How confident are you
that this is what we get?” And I’m looking and go, “What do you
mean? It’s 95 percent! It’s implied in all the testing. This is how
we define this whole stuff.” And he goes, “Wow, this is the first
time I’m getting this answer.” [P2]
There has been a similar increase in the demand for conducting
formal hypothesis testing. For example, instead of reasoning about
averages or means, engineering teams want to see how different
their observation is from random chance:
… an alternative hypothesis. [P3]
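One way to ask how different an observation is from random chance, as described above, is a permutation test. The sketch below is illustrative only: the function name, the sample values, and the choice of a permutation test over other hypothesis tests are assumptions, not something the participants reported using.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often random relabeling gives a mean gap at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical page load times (ms) before and after a change.
before = [212, 198, 225, 240, 205, 219, 231, 208]
after = [190, 185, 201, 178, 196, 188, 204, 182]
print(f"estimated p-value: {permutation_p_value(before, after):.4f}")
```

A small p-value suggests the observed gap would rarely arise from random labeling alone, which is the null hypothesis in this sketch.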
Data scientists also have to determine a baseline of usual behavior
so they can normalize incoming data about system behavior and
telemetry data that are collected from a large set of machines
under many different conditions.
… on all these different servers. I want to get a general sense of
what things feel like on a normal Monday. [P8]
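As a rough illustration of normalizing against a baseline, the sketch below expresses a new measurement in standard deviations from historical readings. The function name, the metric, and the numbers are hypothetical, and a z-score is only one of several possible normalization schemes.

```python
from statistics import mean, stdev


def zscore_against_baseline(baseline, observation):
    """Express an observation in standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least 2 points
    if sigma == 0:
        raise ValueError("baseline has no variation; z-score is undefined")
    return (observation - mu) / sigma


# Hypothetical requests-per-minute readings from previous Mondays.
normal_mondays = [1180, 1225, 1190, 1240, 1205, 1215]
today = 1490
z = zscore_against_baseline(normal_mondays, today)
print(f"today is {z:.1f} standard deviations from a normal Monday")
```

In practice a baseline would likely be kept per server class, per weekday, or per time of day, since the data come from many machines under many different conditions.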
Demand for Data Collection Rigor. When it comes to collecting data,
data scientists discussed how much data quality matters and how
many data cleaning issues they have to manage. Many participants
mentioned that a large portion of their work required the cleaning
and shaping of data just to enable data analysis. This aligns with
a recent New York Times article reporting that 80% of data science
work is “janitor work” [42].
… cleanse the data, because there are all sorts of data quality
issues [often, due to] imperfect instrumentation. [P11]
Furthermore, data collection itself requires a sophisticated
engineering system that tries to satisfy many engineering,
organizational, and legal requirements.
… what about privacy? There is an entire gamut of things that you
need to jump through hoops to collect the instrumentation. [P1]
4.2 Background of Data Scientists
One column in Table 1 shows the educational background of the
study participants. Data scientists often do not have a typical
four-year degree in Computer Science [12]. In our study, 11 of 16
participants have degrees in Computer Science; however, many also
have joint degrees from other fields such as statistics, physics,
math, bio-informatics, applied math, business, economics, and
finance. Their interdisciplinary backgrounds contribute to their
strong numerical reasoning skills for data analysis. Eleven
participants have graduate degrees (PhD or MS), and many have prior
job experience dealing with big data.
Several non-CS participants expressed a strong passion for data.
…ta. [P2]
I’m very focused on how you can organize and make sense of
data and being able to find patterns. I love patterns. [P14]
When data scientists hire other data scientists, they sometimes
look for skill sets that mirror how they were themselves trained.
When one team manager with a PhD in machine learning spoke about
hiring new employees for his data science tools team, he said that
he looks for “machine learning hackers.”
… quantitative field with machine learning background and the
ability to code. They have to manipulate data. The other very
essential skill is [that] we want programming. It's almost like ...
a hacker-type skill set. [P15]
Another data science team manager with a strong statistics
background demanded the same from everyone on his team:
… answer sample size questions, design experiment questions, know
standard deviations, p-value, confidence intervals, etc. [P2]
Our participants’ background in higher education also contributes
to how they view the work of data science. Usually, the problems
and questions are not given in advance. A large portion of their
responsibility is to identify important questions that could lead
to impact. Then they iteratively refine questions and approaches to
the analyses. Participants from a variety of product teams
discussed how their training in a PhD program contributed to the
working style they use to identify important questions and
iteratively refine questions and approaches.
… said, “Can you answer this question?” I mostly sit around
thinking, “How can I be helpful?” Probably that part of your PhD is
you are figuring out what is the most important questions. [P13]
… used to designing experiments. …
Homework on Ch 11 Reliability
30 points
1. (5 points) A transistor has an exponential time to failure with
a failure rate of 0.00005 per hour. Find its reliability for 5000
hours. If the time to repair is also exponential with rate 0.006
per hour, what is its availability?
2. (5 points) A remote control has 30 components in series.
Each has a reliability of 0.995. What is the reliability of the
remote? Suppose that each component is replaced by a set of
two components in parallel. What is the reliability of the new
design?
3. (5 points) Four components A, B, C, and D are in parallel in a
subsystem and the respective reliabilities are: 0.98, 0.96, 0.94
and 0.99. What is the reliability of the subsystem?
4. (15 points)
D
A
0.0007
F
0.95
0.00035
a. Find the reliability of the above system using the fixed
reliability model.
b. Find the MTTF of the above system using exponential
distribution assumption (data in column 3). (For subsystems
with parallel components, find the mean time to failure and then
calculate the failure rate.) Once you establish the MTTF for
the two subsystems and calculate the rate, you can find the
overall rate with series systems. What is the probability that the
system will run for 2000 hours?
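The formulas behind problems 1 to 3 can be sketched in Python as follows. This is a minimal sketch under standard textbook assumptions that the homework does not state in full: a constant failure rate, steady-state availability defined as MTTF / (MTTF + MTTR), and components that fail independently. All function names are hypothetical.

```python
from math import exp, prod


def exponential_reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return exp(-failure_rate * hours)


def steady_state_availability(failure_rate, repair_rate):
    """MTTF / (MTTF + MTTR), with MTTF = 1/lambda and MTTR = 1/mu."""
    return repair_rate / (failure_rate + repair_rate)


def series_reliability(reliabilities):
    """A series system works only if every component works."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """A parallel system fails only if every component fails."""
    return 1 - prod(1 - r for r in reliabilities)


# Inputs taken from problems 1 to 3 above.
print(exponential_reliability(0.00005, 5000))
print(steady_state_availability(0.00005, 0.006))
print(series_reliability([0.995] * 30))
print(series_reliability([parallel_reliability([0.995, 0.995])] * 30))
print(parallel_reliability([0.98, 0.96, 0.94, 0.99]))
```

Problem 4 is not covered, because the system diagram and data table it refers to are incomplete in the text above.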