Coding for
Marketers
@robinlord8
If we handle what robots can't do, we can take advantage of what they do well
One idea to make your life easier
First steps to put that into practice
BIG
Repeated Tasks
Analytics
Hell
Policy Breach Notice
Dear Customer,
Some of your webpages include users' personally identifiable information
(PII) in the URL. As a result, users' PII is accessible to any third-party with a
tracking tag on these pages.
We have further identified that these pages contain Google tagging products
(such as AdWords conversion tracking, remarketing, DoubleClick for
Publishers, Floodlight and Google Analytics).
https://www.distilled.net/resources/how-to-keep-personally-identifiable-information-out-of-google-analytics/
Tools
Regular expressions (regex) - find me this thing
Text to search: some words and some numbers: 1234
some - finds the literal word "some"
[a-zA-Z]+ - finds each run of letters
\d+ - finds each run of digits (here, 1234)
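The patterns on these slides can be tried out with Python's built-in re module. This is a minimal sketch: the variable names are illustrative rather than from the talk, and the digit pattern is written with its backslash, \d+.

```python
import re

text = "some words and some numbers: 1234"

# A literal pattern finds every exact occurrence of the word.
literal_matches = re.findall(r"some", text)

# [a-zA-Z]+ finds each unbroken run of letters.
word_matches = re.findall(r"[a-zA-Z]+", text)

# \d+ finds each unbroken run of digits.
digit_matches = re.findall(r"\d+", text)

print(literal_matches)
print(word_matches)
print(digit_matches)
```

Regex101 (listed in the resources) is a convenient place to test a pattern like these before putting it into code.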
Tools
JavaScript - a coding language that runs in the browser - do this thing
https://websitesetup.org/javascript-cheat-sheet/
https://htmlcheatsheet.com/js/
// emailRegex is a regular expression that matches email addresses, defined elsewhere
var newTitle = document.title.replace(emailRegex, 'PII');
document.title = newTitle;
First line: var newTitle = document.title.replace(emailRegex, 'PII');
newTitle - a name, whatever I want
document - the web page
title - the title
replace(emailRegex, 'PII') - replace this (emailRegex) with this ('PII')
Before: document.title = Amazon.co.uk: archie.lord@test.com
After: newTitle = Amazon.co.uk: PII
Second line: document.title = newTitle;
newTitle - what we made
document.title - the title of the web page
Result: document.title = Amazon.co.uk: PII
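The same find-and-replace idea can be sketched in Python, which makes it easy to test outside a browser. The slides do not show how emailRegex is defined, so EMAIL_REGEX below is a deliberately simple, hypothetical pattern, and scrub_pii is an illustrative name; a production pattern would probably need to be stricter.

```python
import re

# Hypothetical pattern (an assumption, not the one used in the talk):
# some characters, an @ sign, a domain, a dot, and a top-level domain.
EMAIL_REGEX = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_pii(title: str) -> str:
    """Replace anything that looks like an email address with 'PII'."""
    return EMAIL_REGEX.sub("PII", title)


new_title = scrub_pii("Amazon.co.uk: archie.lord@test.com")
print(new_title)
```

In the browser, the equivalent JavaScript would typically run through Google Tag Manager, as described in the PII blog post linked in the resources.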
BIG
Repeated Tasks
Analytics
Hell
BIG
Repeated Tasks
JavaScript and
RegEx
1 month later - new data
Scheduled work
Shifting sands
Us using Excel
Our problems
Tools
Jupyter turns this into this
Regex - find me this thing
Python - a versatile and relatively friendly coding language, good for data processing
Logfiles - an example of big repeated analysis
URL | User Agent | Status | Hits
https://surgery.biz/ | Mozilla | 200 | 1,000,003
Import
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
Resource
https://realpython.com/python-first-steps/
import pandas
log_data = pandas.read_csv('robin/logs.csv')
log_data - whatever name I want
pandas.read_csv - open csv file
'robin/logs.csv' - csv file location
log_data = pandas.read_csv('logsFinal05.csv')
log_data = pandas.read_csv('logsFinal10.csv')
The file name changes; the rest of the code stays the same
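One way to make the "only the file name changes" point concrete is to wrap the loading step in a small function. This is a sketch under assumptions: load_logs and SAMPLE_CSV are hypothetical names, the sample stands in for a real file so the example runs anywhere, and thousands=',' is only needed if the export writes hit counts like 1,000,003.

```python
import io

import pandas

# Stand-in for a real export; in practice you would pass a file path instead.
SAMPLE_CSV = '''URL,User Agent,Status,Hits
https://surgery.biz/,Mozilla,200,"1,000,003"
'''


def load_logs(source):
    """Read a log export into a DataFrame.

    thousands=',' asks pandas to read "1,000,003" as the number 1000003,
    so that later numeric checks on the Hits column work.
    """
    return pandas.read_csv(source, thousands=',')


log_data = load_logs(io.StringIO(SAMPLE_CSV))
print(log_data)
```

With real files, next month's run would be load_logs('logsFinal10.csv') in place of load_logs('logsFinal05.csv'), with nothing else edited.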
Repeatability
Testing for numbers
URL | Status | Hits
https://surgery.biz/ | 200 | 1,000,003
https://surgery.biz/ | 200 | 500,005
http://surgery.biz/ | 301 | 300,000
log_data['Active'] = log_data['Status'] == 200
log_data (left of =) - our data we named
['Active'] - column we're filling
log_data (right of =) - our data
['Status'] - column we're checking
== 200 - our check
URL | Status | Hits | Active
https://surgery.biz/ | 200 | 1,000,003 | True
https://surgery.biz/ | 200 | 500,005 | True
http://surgery.biz/ | 301 | 300,000 | False
log_data['High Hits'] = log_data['Hits'] > 500000
['High Hits'] - column we're filling
['Hits'] - column we're checking
> 500000 - our check
URL | Status | Hits | High Hits
https://surgery.biz/ | 200 | 1,000,003 | True
https://surgery.biz/ | 200 | 500,005 | True
http://surgery.biz/ | 301 | 300,000 | False
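The two number checks can be run end to end on the three example rows. This is a minimal sketch: the DataFrame is typed in by hand here, whereas in practice it would come from pandas.read_csv.

```python
import pandas

# The three example rows from the slides, with Hits as plain numbers.
log_data = pandas.DataFrame({
    'URL': ['https://surgery.biz/', 'https://surgery.biz/', 'http://surgery.biz/'],
    'Status': [200, 200, 301],
    'Hits': [1000003, 500005, 300000],
})

# Each comparison is applied to every row at once and gives True or False.
log_data['Active'] = log_data['Status'] == 200
log_data['High Hits'] = log_data['Hits'] > 500000

print(log_data)
```

The comparison works on the whole column in one go, which is what replaces dragging a formula down a spreadsheet.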
Testing for words
log_data['Http'] = log_data['URL'].str.contains('http:')
['Http'] - column we're filling
['URL'] - column we're checking
.str.contains('http:') - our check
URL | Status | Hits | Http
https://surgery.biz/ | 200 | 1,000,003 | False
https://surgery.biz/ | 200 | 500,005 | False
http://surgery.biz/ | 301 | 300,000 | True
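A short sketch of why the check looks for 'http:' with the colon: without it, https URLs would match too. The regex=False argument is an addition here, not from the slides; it tells pandas to treat the text as plain characters rather than as a regular expression, which is a safe default when the search text contains punctuation.

```python
import pandas

log_data = pandas.DataFrame({
    'URL': ['https://surgery.biz/', 'https://surgery.biz/', 'http://surgery.biz/'],
})

# 'http:' only appears in non-secure URLs, because https has an 's' before the colon.
log_data['Http'] = log_data['URL'].str.contains('http:', regex=False)

# For comparison: 'http' without the colon matches every row.
matches_any_http = log_data['URL'].str.contains('http', regex=False)

print(log_data)
print(matches_any_http.tolist())
```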
URL | Active | 301 | High Hits | Http
https://surgery.biz/ | True | False | True | False
https://surgery.biz/ | True | False | True | False
http://surgery.biz/ | False | True | False | True
Eeeeeeeverything
Our output
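To show how the individual checks add up to one output table, here is a sketch that bundles them into a single function that can be rerun on each new export. add_checks is a hypothetical name, and the '301' line is an assumed definition: the slides show a 301 column but not the code that fills it.

```python
import pandas


def add_checks(log_data):
    """Return a copy of log_data with one True/False column per check."""
    checked = log_data.copy()
    checked['Active'] = checked['Status'] == 200
    # Assumption: the 301 column flags rows whose status code is 301.
    checked['301'] = checked['Status'] == 301
    checked['High Hits'] = checked['Hits'] > 500000
    checked['Http'] = checked['URL'].str.contains('http:', regex=False)
    return checked


log_data = pandas.DataFrame({
    'URL': ['https://surgery.biz/', 'https://surgery.biz/', 'http://surgery.biz/'],
    'Status': [200, 200, 301],
    'Hits': [1000003, 500005, 300000],
})

# Keep only the columns shown in the summary table.
output = add_checks(log_data)[['URL', 'Active', '301', 'High Hits', 'Http']]
print(output)
```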
Repeatability
For loop
checklist = ['plastic', 'legal', 'dress', 'sue']
for item in checklist:
    log_data[item] = log_data['URL'].str.contains(item)
Each pass through the loop swaps item for the next word in the list:
log_data['plastic'] = log_data['URL'].str.contains('plastic')
log_data['legal'] = log_data['URL'].str.contains('legal')
log_data['dress'] = log_data['URL'].str.contains('dress')
log_data['sue'] = log_data['URL'].str.contains('sue')
URL | plastic | legal | dress | sue
http://…sue-us | False | False | False | True
checklist = ['plastic', 'legal', 'dress', 'sue', 'dresses', 'sale',
             'product', 'free-trial', 'best-medical-decision-of-my-life',
             'dr-nick', 'many', 'more', 'checks', 'its', 'so', 'easy', 'to', 'do',
             'and', 'you', 'just', 'update', 'and', 'run', 'your', 'list',
             'whenever', 'you', 'want', 'no', 'kidding', 'there', 'is', 'time',
             'to', 'mess', 'around', 'with', 'stuff', 'like', 'this', 'because', 'we',
             "aren't", 'repeating', 'work', 'in', 'excel']
for item in checklist:
    log_data[item] = log_data['URL'].str.contains(item)
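A runnable sketch of the loop on a few rows. The example URLs are made up for illustration, the column is assumed to be called 'URL' as in the tables, and regex=False is an addition so that each word is searched for as plain text.

```python
import pandas

# Hypothetical URLs, invented for this example.
log_data = pandas.DataFrame({
    'URL': [
        'https://surgery.biz/plastic-surgery',
        'https://surgery.biz/legal/sue-us',
        'https://surgery.biz/dresses',
    ],
})

checklist = ['plastic', 'legal', 'dress', 'sue']

# One new True/False column per word; adding a check means adding a word.
for item in checklist:
    log_data[item] = log_data['URL'].str.contains(item, regex=False)

print(log_data)
```

Note that 'dress' also matches 'dresses', since the check looks for the word anywhere in the URL.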
The same checks as one nested Excel formula (the start is cut off on the slide):
H('contact',A2)),'contact',IF(ISNUMBER(SEARCH('dresses',A2)),'dresses',IF(ISNU
MBER(SEARCH('sale',A2)),'sale',IF(ISNUMBER(SEARCH('product',A2)),'product',
IF(ISNUMBER(SEARCH('free-trial',A2)),'free-trial',IF(ISNUMBER(SEARCH('best-
medical-decision-of-my-life',A2)),'best-medical-decision-of-my-
life',IF(ISNUMBER(SEARCH('dr-nick',A2)),'dr-
nick',IF(ISNUMBER(SEARCH('many',A2)),'many',IF(ISNUMBER(SEARCH('more',
A2)),'more',IF(ISNUMBER(SEARCH('checks',A2)),'checks',IF(ISNUMBER(SEARC
H('its',A2)),'its',IF(ISNUMBER(SEARCH('so',A2)),'so',IF(ISNUMBER(SEARCH('eas
y',A2)),'easy',IF(ISNUMBER(SEARCH('to',A2)),'to',IF(ISNUMBER(SEARCH('do',A
2)),'do',IF(ISNUMBER(SEARCH('and',A2)),'and',IF(ISNUMBER(SEARCH('you',A2)
),'you',IF(ISNUMBER(SEARCH('just',A2)),'just',IF(ISNUMBER(SEARCH('update',A
2)),'update',IF(ISNUMBER(SEARCH('and',A2)),'and',IF(ISNUMBER(SEARCH('run'
,A2)),'run',IF(ISNUMBER(SEARCH('your',A2)),'your',IF(ISNUMBER(SEARCH('list',
A2)),'list',IF(ISNUMBER(SEARCH('whenever',A2)),'whenever',IF(ISNUMBER(SEA
RCH('you',A2)),'you',IF(ISNUMBER(SEARCH('want',A2)),'want',IF(ISNUMBER(SE
ARCH('no',A2)),'no',IF(ISNUMBER(SEARCH('kidding',A2)),'kidding',IF(ISNUMBE
R(SEARCH('there',A2)),'there',IF(ISNUMBER(SEARCH('is',A2)),'is',IF(ISNUMBER(
SEARCH('time',A2)),'time',IF(ISNUMBER(SEARCH('to',A2)),'to',IF(ISNUMBER(SE
ARCH('mess',A2)),'mess',IF(ISNUMBER(SEARCH('around',A2)),'around',IF(ISNU
MBER(SEARCH('with',A2)),'with',IF(ISNUMBER(SEARCH('stuff',A2)),'stuff',IF(ISN
UMBER(SEARCH('like',A2)),'like',IF(ISNUMBER(SEARCH('this',A2)),'this',IF(ISNU
MBER(SEARCH('because',A2)),'because',IF(ISNUMBER(SEARCH('we',A2)),'we',IF
(ISNUMBER(SEARCH('aren't',A2)),'aren't',IF(ISNUMBER(SEARCH('repeating',A2
)),'repeating',IF(ISNUMBER(SEARCH('work',A2)),'work',IF(ISNUMBER(SEARCH(
'in',A2)),'in',IF(ISNUMBER(SEARCH('excel',A2)),'excel','other')))))))))))))))))))))))))
)))))))))))))))))))))))
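For comparison, the nested formula above returns the first word it finds in the cell, or 'other' if none match. A rough Python equivalent is sketched below; first_match is a hypothetical helper, and lowercasing both sides is an assumption made to mirror Excel's SEARCH, which ignores case.

```python
def first_match(url, checklist, default='other'):
    """Return the first checklist word found in url, or default if none is."""
    lowered = url.lower()
    for item in checklist:
        if item.lower() in lowered:
            return item
    return default


checklist = ['contact', 'dresses', 'sale', 'product']

print(first_match('https://surgery.biz/dresses-sale', checklist))
print(first_match('https://surgery.biz/', checklist))
```

Adding another category means adding one word to the list, with no brackets to count.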
Eeeeeeeverything
Our output
Repeatability
BIG
Repeated Tasks
JavaScript and
RegEx
Python
JavaScript and
RegEx
Some first steps
Specific
1 month later - new data
General
It's boring
Mindless
Repetitive
Fiddly
Time consuming
Code is good at
Accuracy
Speed
Comfort with data
Focus
Transparency
We need to handle:
Conclusions
Communication
Direction
Plan your approach:
Choose your tools
Thinking time and sense checks
Structure
@robinlord8
Credits
Credits by slide number
26 – The Iron Giant, 1999, Warner Bros. Pictures
45 – Sherlock, 2010-2017, Hartswood Films, BBC Wales, WGBH
75 – Final Space, 2018, Conaco
Resources
1. Regex101 - https://regex101.com/
2. Google Tag Manager (a way to run JavaScript, will need to be added to your site, like Google Analytics) - https://www.google.com/analytics/tag-manager/
3. Simo Ahava blog - https://www.simoahava.com/
4. Sam's GTM PII fix blog post - https://www.distilled.net/resources/how-to-keep-personally-identifiable-information-out-of-google-analytics/
5. JavaScript cheat sheets - https://websitesetup.org/javascript-cheat-sheet/, https://htmlcheatsheet.com/js/
6. Anaconda - download everything you need to start with Python - including Jupyter - https://www.anaconda.com/download/
7. Pandas documentation - https://pandas.pydata.org/pandas-docs/stable/
8. Python beginner resource - https://realpython.com/python-first-steps/
9. Stack Overflow (if you Google your problem, this will probably come up anyway) - https://stackoverflow.com/
