Faster, Smaller
A core feature proposal for improving file
synchronization between Drupal environments
Wednesday, 17 February 16
How it works now
Stream wrappers
• public://
• private://
• temporary://
Locations stored in
variables:
• file_public_path
• file_private_path
• file_temporary_path
Core save functions
[D7]
file_save_data($data, 'public://foo');
file_unmanaged_save_data($data, 'public://bar');
Configuring Color [D7]
$id = $theme . '-' . substr(hash('sha256', serialize($palette) . microtime()), 0, 8);
$paths['color'] = 'public://color';
$paths['target'] = $paths['color'] . '/' . $id;
foreach ($paths as $path) {
  file_prepare_directory($path, FILE_CREATE_DIRECTORY);
}
Aggregated CSS [D7]
$filename = 'css_' . drupal_hash_base64($data) . '.css';
// Create the css/ within the files folder.
$csspath = 'public://css';
$uri = $csspath . '/' . $filename;
// Create the CSS file.
file_prepare_directory($csspath, FILE_CREATE_DIRECTORY);
if (!file_exists($uri) && !file_unmanaged_save_data($data, $uri, FILE_EXISTS_REPLACE)) {
  return FALSE;
}
Hoarder's Paradise
As with a hoarder who keeps everything, it all
ends up in one big bucket of stuff:
sites/default/files
Sync all the things!
# Run on stage: rsync cannot copy between two remote hosts.
rsync -rltp \
  live.example.com:/var/www/sites/default/files/ \
  /var/www/sites/default/files/
[Diagram: the big bucket of stuff on Live is copied to a big bucket of stuff on Stage]
Several hours later…
Big bucket of stuff
Here is the 'stuff' on my blog site:
css
ctools
document_uploads
.htaccess
js
static
xmlsitemap
Big bucket of stuff
Here is the 'stuff' that I actually need to sync:
document_uploads
.htaccess
static
The rest (css, ctools, js, xmlsitemap) can be regenerated.
Excluding caches
Some files are auto-generated caches, such
as:
• Aggregated CSS/JS
• Image-style thumbnails
• Sitemaps
Cost
For sites that are image-heavy, and/or have a
large number of image-styles, the
'regenerable content' can be many times the
size of the original source.
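
To gauge this cost on a particular site, the size of a regenerable directory can be compared with the size of a directory of original uploads. The following is a rough sketch only: the function name `bytes_under` is hypothetical, and it counts file contents rather than disk blocks.

```shell
# Print the total size in bytes of all regular files beneath a directory.
# Prints 0 if the directory does not exist.
bytes_under() {
  if [ -d "$1" ]; then
    # Concatenate every file and count the bytes; tr strips the padding
    # that some wc implementations add.
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
  else
    echo 0
  fi
}
```

For example, comparing `bytes_under sites/default/files/styles` with `bytes_under sites/default/files/document_uploads` gives a first impression of how much of a transfer is regenerable.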
More efficient rsync
# Run on stage: rsync cannot copy between two remote hosts.
rsync -rltp \
  --exclude css \
  --exclude ctools \
  --exclude js \
  --exclude styles \
  --exclude xmlsitemap \
  live.example.com:/var/www/sites/default/files/ \
  /var/www/sites/default/files/
This exclude list becomes confusing and needs ongoing maintenance.
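
Until the buckets are separated, one way to contain the maintenance burden is to keep the exclude list in a single place and generate the flags from it. The sketch below is an illustration, not part of the proposal: `REGENERABLE_DIRS` and `build_sync_command` are hypothetical names, and the command is printed rather than executed so that it can be reviewed first. It assumes the command will be run on the stage host, and it anchors each pattern with a leading slash so that only top-level directories are excluded (a bare `--exclude css` would also skip any nested file or directory named css).

```shell
#!/bin/sh
# Hypothetical list of regenerable top-level directories.
REGENERABLE_DIRS="css ctools js styles xmlsitemap"

# Print an rsync command that copies a files directory from src to dest,
# skipping every directory named in REGENERABLE_DIRS.
build_sync_command() {
  src=$1
  dest=$2
  cmd="rsync -rltp"
  for dir in $REGENERABLE_DIRS; do
    # The leading slash anchors the pattern to the root of the transfer;
    # the trailing slash restricts it to directories.
    cmd="$cmd --exclude=/$dir/"
  done
  echo "$cmd $src $dest"
}
```

Calling `build_sync_command live.example.com:/var/www/sites/default/files/ /var/www/sites/default/files/` prints the full command; adding a new regenerable directory then means editing one variable.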
What if there were
TWO buckets?
[Diagram: Live has a smaller bucket of stuff and a big bucket of stuff I can rebuild; Stage has a smaller bucket of stuff]
Additional stream-wrappers?
Stream wrappers
• public://
• private://
• temporary://
• cache-public://
• cache-private://
Locations stored in
variables:
• file_public_path
• file_private_path
• file_temporary_path
• file_cache_public_path
• file_cache_private_path
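
The pairing of schemes and variables above can be written down as a simple lookup. This is only a sketch of the mapping on the slide: the function name `variable_for_uri` is hypothetical, and the two cache-* schemes are the ones proposed here, not wrappers that exist in core.

```shell
# Print the name of the variable that stores the location for a
# stream-wrapper URI. Returns non-zero for an unrecognised scheme.
variable_for_uri() {
  case "$1" in
    public://*)        echo file_public_path ;;
    private://*)       echo file_private_path ;;
    temporary://*)     echo file_temporary_path ;;
    # Proposed wrappers (not in core):
    cache-public://*)  echo file_cache_public_path ;;
    cache-private://*) echo file_cache_private_path ;;
    *)                 echo "unknown scheme: $1" >&2; return 1 ;;
  esac
}
```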
Precedents
• Drupal data-cache API.
• By default, uses DB tables
• Abstracted via cache-bins
• Cache tables identified via
hook_flush_caches()
DX: where is safe?
When I first started Drupalling, I had a client
who requested the ability to add custom CSS.
So I created a quick UI in the admin area,
thought about how to store the data, and
decided that it would be sensible to reuse the
sites/default/files/css path.
It was a shock a couple of days after launch,
when the client asked "Where has my custom
CSS gone?"
DX: where is safe?
function drupal_clear_css_cache() {
  file_scan_directory(file_create_path('css'), '.*', array('.', '..', 'CVS'), 'file_delete', TRUE);
  // Clear the page cache, so cached pages do not reference nonexistent CSS.
  cache_clear_all();
}
DX: where is safe?
Yes, everything beneath sites/default/files/css
was deleted.
This was back in D5, and it is a little better in
D8: it only deletes files that haven't been
modified in 30 days.
Be careful where you put your assets!
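
To see which files an age-based clean-up of this kind might affect, the candidates can be listed from the command line. This is an approximation and not how Drupal implements the check: it relies on find's `-mtime` test, which counts whole 24-hour periods, and the function name `list_stale_files` is hypothetical. Nothing is deleted.

```shell
# List files beneath a directory that were last modified more than N days
# ago (default 30). Nothing is deleted; the output is a list of candidates.
list_stale_files() {
  dir=$1
  days=${2:-30}
  find "$dir" -type f -mtime +"$days"
}
```

Running `list_stale_files sites/default/files/css` before storing anything under that path shows what is already old enough to be at risk.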
DX
Separating persistent storage from regenerable
cache storage will make it easier for
developers to recognise and adopt
good directory-structure habits, and will flag
dangerous locations: cache-public://css is
more obviously a risky place to store
persistent files than public://css.
Backward Compatibility
If the variable for cache-public:// doesn't
exist, it could inherit the setting used by
public://.
Reusing the same location as public:// would
mean that for most users, there wouldn't be
any noticeable change, or any break in their
configuration.
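
The inheritance rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `resolve_cache_public_path` is hypothetical, and the two arguments stand in for the values of file_public_path and file_cache_public_path.

```shell
# Resolve the directory behind cache-public://.
# $1: value of file_public_path
# $2: value of file_cache_public_path (may be empty or omitted)
resolve_cache_public_path() {
  public_path=$1
  cache_public_path=${2:-}
  # Fall back to the public:// location when no cache-specific value is set.
  echo "${cache_public_path:-$public_path}"
}
```

With no second value, `resolve_cache_public_path sites/default/files` prints `sites/default/files`, so an existing site keeps writing to the same place until the site owner chooses a separate location.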
Risky synchronization?
In some cases, running rsync on the entirety of
sites/default/files can be harmful.
Some autogenerated content - such as XML
sitemaps - may be specific to an environment: for
example, the base URL is often different between
stage and live.
This could cause all sorts of unwanted side-effects:
duplicate notifications and inaccurate test results
are just two that immediately spring to mind.
Edge-cases
There may be custom or contrib code
expecting assets such as image thumbnails to
belong under public:// - e.g. looking up
information such as the size of the image.
If the site were upgraded, and the developer
also moved the location of cache-public://, this
could cause failures such as recursive lookups,
and the cause may not be immediately
apparent to the developer.
Edge-cases
On the whole, I think the edge-cases are
minimal, and can be addressed by good
communication of the implications of the
change.
Potential use-cases
• Synchronization of files between
environments (e.g. live to staging)
• Backups
• Proxies/CDN delivery
• Garbage collection: scanning for orphaned/removable files
Goals of the change
1. All data in public:// should be persistent and
necessary.
2. All data in cache-public:// should be disposable,
and regenerable from other sources.
3. All data in public:// should be tracked in the
file-usage API; untracked files indicate
orphaned/deletable content.
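
If the third goal holds, finding orphans reduces to comparing what is on disk with what is tracked. The sketch below assumes the tracked paths have already been exported to a text file, one path per line, relative to the files directory; how that export is produced is outside the sketch, and the function name `list_untracked_files` is hypothetical.

```shell
# List files beneath a directory whose relative paths are not in a tracked list.
# $1: directory to scan
# $2: text file with one tracked relative path per line
list_untracked_files() {
  # grep exits non-zero when every file is tracked; that is not an error here.
  (cd "$1" && find . -type f | sed 's|^\./||' | sort) | grep -Fxv -f "$2" || true
}
```

Anything printed is a candidate for review rather than immediate deletion, since the list is only as accurate as the export it is compared against.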
Summary
Adding two stream-wrappers to core would allow
regenerable content to be stored separately from
persistent content, simplifying a number of tasks
such as backups and synchronization between
environments.
This change would be backwards-compatible, would
not affect existing sites without action from the
site-owner, and would improve developers'
understanding of directory structures created by
modules.