German_Final_Project
1. German1
Bassdrive Sessions SXSW 2015
Demographic Survey Analysis and Flyer Distribution Network
Johnnie German
Introduction
Bassdrive Radio and SUBstance Productions are both drum and bass music event
production companies. They put together a joint musical production this year during SXSW at
Dozen Street Bar in Austin, Texas on March 19, 2015. During the event, SUBstance Productions
conducted a survey to collect demographic and marketing information from their attendees at the
Bassdrive Sessions event. This analysis will focus on the results of this survey and determine
SUBstance Productions’ marketing distribution network. The goal of this analysis is to reject the
null hypothesis of a homogeneous distribution of attendees and an even distribution of
marketing/advertising reach. The results of this analysis will help Bassdrive Radio and
SUBstance Productions assess their demographic distribution in Texas. Furthermore, it will
determine their current marketing/advertising effectiveness and create a flyer distribution
network.
Geographic Concepts
The following geographic concepts will be utilized in this analysis. A cartographic model
will be used to join the survey data to geographic data to display the survey results
geographically. The cartographic model will allow SUBstance Productions to continuously add
data to the survey and display them geographically. Moreover, it will also join census data to
geographic data to display the target marketing demographics geographically. The demographic
details will be driven by the survey results and the selected attributes will be reflected in the
cartographic model. The purpose of this model is to streamline the data management and
manipulation flow while allowing for adjustments.
In addition to the cartographic model, this analysis will provide a marketing/advertising
distribution network for the geographic region that possesses the highest marketing potential for
the selected demographics. Distribution networks can be adjusted to evaluate effectiveness and
efficiency. Identified distribution networks will serve as a baseline for additional
marketing/advertising locations.
Data and Methodology
The dataset consists of survey data collected from event attendees on March 19,
2015 at a Bassdrive Sessions music event in Austin, Texas by SUBstance Productions. The
survey contained questions on age, gender, zip code, how respondents heard about the event, and
whether they listen to Bassdrive Radio. The event had a total of 266 attendees and 37 respondents, so
survey respondents comprised 14 percent of the total attendee population. The survey was administered
by SUBstance Productions. The respondents were randomly selected and were given the
opportunity to win a free crew-only t-shirt for their participation.
The selection process changed during the course of the survey. The initial placement of
the survey by the front door was not successful as evidenced by limited responses. Therefore, the
survey was relocated to a more optimal location. Surveys were moved to an outside table located
by the pizza truck. This allowed the surveyor to approach candidates in a less intrusive scenario.
There were no duplicate responses. The demographic results for the survey are shown in the
graphs below.
[Figure: Male Age Distribution. Axes: Age, Number of Males Surveyed. Mean = 31, Range = 23-37]
[Figure: Female Age Distribution. Axes: Age, Number of Females Surveyed. Mean = 33, Range = 25-44]
Results demonstrate a male majority at 78 percent with a mean age of 31. Female
attendees were in the minority at 22 percent with a mean age of 33. These results support
rejecting the null hypothesis of a homogeneous attendee population, since the survey respondent
population is stratified by gender. A male majority
was expected before the survey. However, the mean ages and age ranges were not as expected:
for both genders they were higher than anticipated. These
results indicate a target age range of 21 to 44 years for both males and females. This range is
determined by the mean ages of male and female respondents and the lack of representation in
the 21-25 age range (two females and five males).
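The strength of the male majority can be checked with a simple significance test. The sketch below is not part of the original analysis. It assumes that the null hypothesis means an even male/female split, which the report does not state, and it uses 29 males and 8 females, the counts implied by the reported 78 and 22 percent of 37 respondents. The function name is illustrative.

```python
from math import comb

def two_sided_binomial_p(successes: int, n: int) -> float:
    """Exact two-sided p-value for a fair-coin (p = 0.5) null hypothesis."""
    # Under p = 0.5 the distribution is symmetric, so double the smaller tail.
    k = min(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Assumed counts: 78% and 22% of the 37 respondents.
males, females = 29, 8
p_value = two_sided_binomial_p(males, males + females)
print(f"male share = {males / (males + females):.0%}, p = {p_value:.5f}")
```

Under these assumptions the p-value falls below 0.001, so a split this uneven would be very unlikely if attendees were evenly divided by gender.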
After determining the demographic results, the geographic data was analyzed to
determine the geographic location of respondents. The survey requested each respondent’s zip
code. This data will define the project area and give a geographical indication of where and how
far customers are willing to travel for an event. The survey asked for a zip code because it is easy
to collect from respondents. Each respondent’s zip code was associated with the county in
which it lies. One respondent was eliminated from the study due to an outlying geographic
location (Oklahoma). This brings the total respondent population to 36. The geographic data
results are shown in the graph below.
[Figure: Survey Respondents by County (pie chart). Legend: Dallas, Fort Bend, Harris, Hays, Nueces, Travis, Waller, Williamson. Slice labels as extracted: 8%, 3%, 3%, 5%, 6%, 3%, 56%, 8%, 8%]
The majority of respondents reside in the Austin area. Travis, Hays, and Williamson
Counties account for seventy percent of respondents’ counties of residence. These are
the first areas to investigate for further market value. For the purpose of this analysis, however, Travis
County will be the main focus of study.
The survey results will be compared to the 2013 Census five-year population estimate
data on gender and age for Travis, Hays, Bexar and Williamson Counties. The age range for both
males and females will be 21-44 years. This comparison will determine the areas within Travis
County that reflect the target market. The data will be joined with a county shape file to display
it geographically. The cartographic model will also incorporate these tasks to streamline the
data manipulation flow and augment the geographic displays of these data sets. The census data
was collected at the sub-county level. This level was chosen because it provides detail beyond the county
level. The census tract and block levels were determined to be too detailed for the purpose
of this analysis.
The last section of the survey data to be analyzed is the advertising reach. Each
respondent was asked how they heard about the event. They were given the option of circling
one of these: Facebook, flyer, word of mouth, or other. The other response gave room for
respondents to write in their submissions. The other responses were mostly a form of word of
mouth or the Bassdrive Chatroom on Bassdrive.com. These responses were adjusted in the
totals for analysis. The results are shown in the graph below.
[Figure: Advertising Reach (pie chart). Facebook 47%, Flyer 3%, Bassdrive Chat Room 25%, Word of Mouth 25%]
The results show that Facebook has the most effective reach for advertising, and flyers
are the least effective. This indicates that flyer-based advertising needs adjusting. The flyers for
this event were distributed person to person and at flyer-friendly locations around the Austin
area. However, this distribution method showed marginal results. Therefore, a new distribution
tactic is needed.
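One way to test whether the advertising reach is uneven is a chi-square goodness-of-fit test. This is a sketch under assumptions, not the author's method. The counts 17, 1, 9 and 9 are back-calculated from the reported 47, 3, 25 and 25 percent of the 36 retained respondents, and the null hypothesis is taken to be an equal share for each of the four channels. The closed-form p-value used here is valid only for three degrees of freedom.

```python
from math import erfc, exp, pi, sqrt

def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts in every category."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def chi_square_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

# Assumed counts, back-calculated from the reported percentages of 36 respondents.
reach = {"Facebook": 17, "Flyer": 1, "Bassdrive Chat Room": 9, "Word of Mouth": 9}
statistic = chi_square_uniform(list(reach.values()))
p_value = chi_square_sf_3df(statistic)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

With four channels and 36 respondents the expected count is 9 per channel. Facebook and flyers account for the whole statistic, since the other two channels sit exactly at the expected count.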
Analysis
This analysis has rejected the null hypothesis of a homogeneous distribution of attendees
and an even distribution of marketing/advertising reach. It has defined the demographic areas in
Travis County that match the target market of SUBstance Productions. Furthermore, a
cartographic model was created to streamline the workflow of combining the census data, survey
data, and shape files into a workable format for further analysis. In addition to the model, a flyer
distribution network was created to facilitate the flyer distribution in the target market area.
The cartographic model automates the creation of the necessary geographic data elements
for this analysis. This model joins the census data to a sub-county shape file, joins the survey data
to a county shape file, and then makes a series of attribute selections to generate the data needed
for analysis. After selections and joins are made, new feature layers are produced for display in
GIS software. The model is structured for ease of use and requires only four data files for
execution: a county shape file, a county sub-division shape file,
a survey Excel spreadsheet, and a census data Excel spreadsheet. The two Excel files should be
preformatted for use in ArcGIS. It is highly recommended that all projections on shape files be
done prior to executing the model, because performing the projection function in the model causes join
errors. If no projection is needed for analysis, use the coordinate system associated
with the county shape files. This model uses the North American Datum 1983 coordinate system.
(See model below)
The top section of the model creates the cartographic layers from the survey data and the
county shape file. This section of the model generates the survey areas and study areas. The
lower section of the model creates the cartographic layers from the census data and the county
sub-division shape files. This portion of the model generates the census survey areas and the
target market area. (See maps on pages 7 and 8)
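The join-then-select flow of the model can be illustrated outside ArcGIS. The following is a minimal sketch in plain Python, not the ArcGIS model itself. The field names (`name`, `pop_21_44`) and both helper functions are hypothetical, and every record except the Austin CCD figure of 353,136 reported in this analysis is a made-up placeholder.

```python
def join_by_key(features, table, key):
    """Attribute join: attach matching table fields to each feature."""
    lookup = {row[key]: row for row in table}
    joined = []
    for feature in features:
        match = lookup.get(feature[key])
        if match is not None:  # keep only features with a matching table row
            joined.append({**feature, **match})
    return joined

def select_max(features, field):
    """Attribute selection: return the feature with the largest value of `field`."""
    return max(features, key=lambda f: f[field])

# Hypothetical stand-ins for the sub-division shape file and census spreadsheet.
subdivisions = [
    {"name": "Austin CCD"},
    {"name": "Placeholder CCD A"},
    {"name": "Placeholder CCD B"},
]
census_rows = [
    {"name": "Austin CCD", "pop_21_44": 353136},       # figure quoted in the report
    {"name": "Placeholder CCD A", "pop_21_44": 1000},  # made-up value
    {"name": "Placeholder CCD B", "pop_21_44": 2000},  # made-up value
]

target = select_max(join_by_key(subdivisions, census_rows, "name"), "pop_21_44")
print(target["name"])
```

Keeping the join and the selection as separate steps mirrors the model's structure: new survey or census rows can be added to the tables without changing the selection logic.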
Travis County was chosen as the target area for marketing, specifically the lower
southwestern quadrant of the Austin CCD census sub-division. This census sub-division
has the highest 21-44 year old population at 353,136. It is the focus of the flyer
distribution network due to the high number of target demographic candidates. (See maps on
pages 9 and 10) The flyer distribution network consists of the Amy’s Ice Cream locations within
the target market area. There are nine locations within this area. Locations two and three on the
network are less than two miles from each other and overlap on the map. Amy’s Ice
Cream locations were chosen for factors other than location. This business allows
advertising of musical events in the Austin area, and has been used in the past by SUBstance
Productions. The distribution network will be tested this summer for the next SUBstance event,
which will show its effectiveness and marketing impact. The route
covers 35.2 miles and takes 2 hours and 40 minutes. This estimate does not include impedances
such as traffic and speed limits. Each stop allows ten minutes to park and replenish flyers.
But at some point the delivery driver will stop for ice cream, so three hours is more realistic!
(See map page 11)
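The route timing can be broken down with simple arithmetic. This sketch assumes that the stated 2 hours and 40 minutes already includes the ten-minute allowance at each of the nine stops, which the report does not say explicitly. Under that assumption it derives the implied driving time and average speed. Variable names are illustrative.

```python
ROUTE_MILES = 35.2
TOTAL_MINUTES = 2 * 60 + 40   # stated route time
STOPS = 9                     # Amy's Ice Cream locations on the network
MINUTES_PER_STOP = 10         # time to park and replenish flyers

# Assumption: the stated total already includes the time spent at stops.
stop_minutes = STOPS * MINUTES_PER_STOP
driving_minutes = TOTAL_MINUTES - stop_minutes
average_mph = ROUTE_MILES / (driving_minutes / 60)

print(f"stops: {stop_minutes} min, driving: {driving_minutes} min, "
      f"implied average speed: {average_mph:.1f} mph")
```

Read this way, the stops take 90 minutes and the driving 70 minutes, which is consistent with the rounded estimate of about three hours once traffic is considered.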
Conclusion
This analysis rejects the null hypothesis of a homogeneous event population and an
even advertising reach. It defines the target market area of Travis County. This conclusion is
based on the Bassdrive Sessions SXSW 2015 Survey results. Furthermore, it includes a model to
facilitate the additional analysis of future surveys by SUBstance Productions. In addition to the
model, this analysis provides a flyer distribution network for the Austin CCD census sub-
division area. These tools will help SUBstance Productions and Bassdrive Radio make better
market predictions for future events. Moreover, it will allow them to track the effectiveness of
flyer distribution within the target market area. This analysis illustrates the areas in which SUBstance
Productions needs improvement. For example, the age range of 21-25 is underrepresented in
their market share. This is an area that needs further investigation to maximize attendance at their
events. Further surveys of their market share will produce better analysis and allow for more
precise population enumeration and market predictions.