Automatic
Code Fixes
PHP User Group Munich
2017-05-24
Stuff we do to our code
automatically beyond fixing*
* which still is a manual process, we won't lose our jobs soon
Having fun doing automation
and learning new things
The situation 2 years ago
● around 120 repositories with internal packages
● around 10 applications using internal and external packages
● hosted on a local Atlassian Stash instance (now Bitbucket)
● local Satis instance to access all packages
● agreed to use PSR-2 coding standard
● had a template repo for new packages (copy&paste ready)
The situation 2 years ago
● someone installed the “Job DSL” plugin on Jenkins
– uses Groovy to create/update job configurations and views
– first use for a new PHP mobile app project
– found useful and applied to some legacy software
– allows creating jobs to test every branch
Problems slowly manifesting
● creating new internal packages not straightforward
– requires manual addition to Satis
● correctly editing the config file
● on the correct server
● in an SVN repository with gigabytes of data
● hopefully not destroying the internal order of data structures
● BASICALLY: “Ask Sven, he'll do it for you.”
– may require adding the repo under "repositories"
● good for local development with "dev-branch" dependency
● bad everywhere else when merged to prod version, including:
● bad for deployment because it will clone the repo, not download ZIP
● bad for intermediate packages ("repositories" is root-only setting)
– metadata was not really maintained
● template has placeholders for “homepage”, “source”, “wiki”, “issues”
● unclear what to fill in
● what to do if not yet available
● how to fill the list of authors
– resulting in a Satis overview page with several dead links
Problems slowly manifesting
● PSR-2 standard agreed upon but not enforced
– not ideal for code reviews
● wasting people's brains on complaints about whitespace
● people become insensitive to the issue and stop caring about it
– pre-commit hooks are opt-in
● no general way to share them
● require additional execution time
– pre-push hooks keep people from doing their work
● also require execution time
● complain about style at inappropriate step in the workflow
Problems slowly manifesting
● other aspects
– maintaining and updating dependencies
– checking for insecure versions
● What if we needed one global aspect of the code checked or changed everywhere?
Beware the tech feedback loop
● people are creative in finding solutions for their problems
● if the tech stack does not allow something to be easy, you'll see
workarounds popping up
● adding packages should be easy because isolating common
code into a lib is generally better
● likewise, making it impossible to commit/push without “clean” code will create workarounds like “only changed files must be clean”
ANNOYING PROCESSES WILL MAKE
PEOPLE GET ANGRY AND PUSH BACK
Automation to the rescue!
Hold on for a second!
We do not want yet another
Jenkins job complaining
about our code.
greenkeeper.io
● maintainer registers a repository with npm dependencies
● greenkeeper detects if dependencies release an update
● sends a pull request with the new version
● Travis CI will automatically test each pull request
● continuous integration reports whether the new version works
● maintainer merges the pull request
● PROFIT
Sending pull requests
● normal work goes on
● automatic update in branch
● code with automatic update
● repeat regularly
● for all repositories
Stuff you'll need
● a continuous integration environment
– Jenkins
● access to your repository server: list repos, create pull requests
– Atlassian Stash, Bitbucket Server API
● a way to automatically create jobs on your CI server
– Jenkins JobDSL plugin
● ideas for what to fix in your code
Creator job and fix jobs
● creator job
– list of repos
– one fix job per repo
● fix job
– clone one repo
– branch and fix
– push fix branch
– create PR
● run nightly
Anatomy of a fix job
● clone/update the codefixer repository
– contains all the scripts
● run a shell script
– clone/update the repository to fix into a subdirectory
– apply code fixes and send pull request
– run checks and make job red/green
Cloning/updating the repo
● fresh clone
– git clone -b "$branch" "$repo" "$WORKSPACE/repo"
● existing clone
– git checkout "$branch"
– git reset --hard
– git fetch --prune
– git rebase
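The two cases above can be folded into one shell function. This is a minimal sketch, not the script used in the talk: the name sync_repo and its three arguments are assumptions, while the git commands are the ones listed on the slide.

```shell
# Clone the repository on the first run, refresh the existing clone afterwards.
# sync_repo is a hypothetical helper: $1 = repo URL, $2 = branch, $3 = target dir.
# The body runs in a subshell, so the "cd" does not leak into the caller.
sync_repo() (
    repo=$1 branch=$2 target=$3
    if [ -d "$target/.git" ]; then
        # existing clone: drop leftovers of the previous run, then catch up
        cd "$target" || exit 1
        git checkout "$branch" &&
        git reset --hard &&
        git fetch --prune &&
        git rebase
    else
        # fresh clone
        git clone -b "$branch" "$repo" "$target"
    fi
)
```

A fix job could then call something like sync_repo "$repo" "$branch" "$WORKSPACE/repo" before applying any fixes.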
Apply code fixes
● first run: create a new codefix branch from master
● later runs: continue the existing codefix branch?
– does it still merge into master?
● start the branch new each time and avoid dealing with conflicts
Apply code fixes
● each code fix assumes an unchanged branch
● change the source according to its task
● then commit the change with a meaningful message
echo "Removing autoload paths outside of the package"
php "$WORKSPACE/fixes/remove_autoload_outside_of_package.php"
git commit -am "Remove autoloading paths outside of the package"
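The echo / php / git commit triple above repeats for every fix, so it can be wrapped in a helper. The sketch below is an assumption about how that might look, not the talk's actual script: start_fix_branch and apply_fix are hypothetical names, and "codefix" is an assumed branch name. It uses git checkout -B to recreate the branch from master on every run, and git diff --quiet to skip the commit when a fix changed nothing.

```shell
# Recreate the fix branch from the freshly updated master on every run, so a
# stale branch never has to be merged or rebased.
# Assumption: the fix branch is called "codefix".
start_fix_branch() {
    git checkout -B codefix master
}

# Run one fix command and commit only if it changed a tracked file.
# apply_fix is a hypothetical helper: $1 = commit message, rest = fix command.
# Note: "git commit -a" ignores new untracked files; a fix that creates files
# would need an explicit "git add".
apply_fix() {
    message=$1
    shift
    echo "Running fix: $message"
    "$@" || return 1
    if git diff --quiet; then
        echo "Nothing to fix"
    else
        git commit -am "$message"
    fi
}
```

With this in place, the slide's example would become a single call: apply_fix "Remove autoloading paths outside of the package" php "$WORKSPACE/fixes/remove_autoload_outside_of_package.php".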
Send changes to central repo
● only send if the repository allows it
– check presence of a file
● only push if something changed
– force-push because the branch might have been recreated on a changed master
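The push rules above might be sketched as follows. Several details are assumptions because the slides do not state them: the marker file is called ".codefix" here and its presence is taken to mean "fixes allowed", the fix branch is "codefix", and the remote is "origin".

```shell
# Push the fix branch only when the repository opted in and the branch
# contains commits that master lacks. Run from inside the cloned repository.
# Assumptions: marker file ".codefix", branch "codefix", remote "origin".
push_fix_branch() {
    if [ ! -f .codefix ]; then
        echo "Repository has not opted in, not pushing"
        return 0
    fi
    if [ "$(git rev-list --count master..codefix)" -eq 0 ]; then
        echo "No new commits, not pushing"
        return 0
    fi
    # The branch was recreated from the current master, so its history can
    # differ from the copy on the server: overwrite the remote branch.
    git push --force origin codefix
}
```

Counting commits with git rev-list keeps the "did anything change" decision independent of how many fixes ran before.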
Create the pull request
● source branch is the codefix branch just pushed
● target branch is master
● reviewers are taken from two sources
– Git: the last committer on master
– composer.json: one randomly selected entry from the author list, plus all authors with a maintainer role
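As an illustration of the pull request step, the sketch below builds a JSON payload in the shape the Bitbucket Server REST API expects for creating a pull request, as far as I know it (fromRef, toRef, reviewers). The helper names, the title text and the branch name "codefix" are assumptions, and user names are assumed to contain no characters that need JSON escaping.

```shell
# Turn a list of user names (one per line on stdin) into the "reviewers"
# array of a pull request payload, dropping blank lines and duplicates.
reviewers_json() {
    sort -u | awk '
        BEGIN { printf "[" }
        NF    { printf "%s{\"user\":{\"name\":\"%s\"}}", sep, $0; sep = "," }
        END   { printf "]" }'
}

# Build the payload for one repository.
# pr_payload is a hypothetical helper: $1 = project key, $2 = repository slug,
# reviewer names arrive on stdin.
pr_payload() {
    project=$1 slug=$2
    repo_json="{\"slug\":\"$slug\",\"project\":{\"key\":\"$project\"}}"
    printf '{"title":"Automatic code fixes",'
    printf '"fromRef":{"id":"refs/heads/codefix","repository":%s},' "$repo_json"
    printf '"toRef":{"id":"refs/heads/master","repository":%s},' "$repo_json"
    printf '"reviewers":%s}\n' "$(reviewers_json)"
}
```

The output could be posted with curl to the repository's pull-requests resource. Mapping a committer's mail address to a Bitbucket user name is a separate lookup that this sketch leaves out.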
Code checks
● check out the master branch
● run checks
● fail the build if one check fails
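A check runner along these lines could look like the sketch below. run_checks is a hypothetical name; the idea shown is to run every check even after one has failed, so a single red job lists all problems, and to return a non-zero status if any check failed.

```shell
# Run every check and remember whether any of them failed.
# run_checks is a hypothetical helper; each argument is one check command line.
run_checks() {
    failed=0
    for check in "$@"; do
        echo "Running check: $check"
        if ! sh -c "$check"; then
            echo "FAILED: $check"
            failed=1
        fi
    done
    return "$failed"
}
```

In the job script this might be called as run_checks "composer validate" followed by further check commands; a non-zero return value is what turns the Jenkins job red.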
CI and human interaction
● Bitbucket informs reviewers about the new pull request
● also notifies Jenkins about the push
● CI system runs tests for the codefix branch
● test results are reported back to Bitbucket
● failing tests will prevent merging the changes via UI
● changes are reviewed and merged manually
Checks vs. fixes
● checks only alert someone about something
– that person has to have time to deal with it
– read the report, check out the repo, fix the problem
– push, create pull request, find reviewer
● fixes work on their own, all the time
● IF POSSIBLE, WRITE A FIX SCRIPT
What to fix
● composer.json
– add authors from git
– add missing license tag
– fix mail addresses
– fix platform config
– fix source repository link
– remove autoload outside package
– remove empty arrays
– remove git repo entries
– add unwanted files/folders to exclude list
– update dependencies
– update dev-dependencies
– "composer update"
What to fix
● add or update default CI script
● add .mailmap file
● fix build.xml script
● fix coding style
– using php-cs-fixer
What to check
● stuff that's usually too complex to fix automatically
● but is important enough to know about
● works on master
– validate composer.json
– check for insecure versions
– check for minimum versions
– check for abandoned packages
– check for PHP 7 compatibility
Adding authors and fixing mail addresses
● Bitbucket API allows access to user db with search feature
● wrong addresses had been added previously
– which miraculously had a pattern
– also, due to a merger, the mail domain changed for everyone
● if composer.json has fewer than 2 authors, add more from Git
– top committers get added
– a blacklist removes people who have left and mailing lists
– affects the outcome of pull request creation (reviewer selection)
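Picking the top committers can be sketched with git shortlog, which lists authors sorted by commit count when called with -s -n, and includes mail addresses with -e. The helper name top_committers, the blacklist file format (one address per line) and all addresses used with it are assumptions for illustration.

```shell
# Read "git shortlog -sne" output on stdin and print the mail addresses of
# the top committers, skipping blacklisted ones.
# top_committers is a hypothetical helper:
# $1 = how many addresses to print, $2 = blacklist file, one address per line.
top_committers() {
    count=$1 blacklist=$2
    sed -n 's/.*<\(.*\)>.*/\1/p' |
        grep -v -x -i -F -f "$blacklist" |
        head -n "$count"
}
```

A call might look like git shortlog -sne master | top_committers 2 blacklist.txt. Naming the revision matters in a CI job: without one, git shortlog reads the log from standard input when that is not a terminal.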
Update dependencies
● affects the “require” section
● a central configuration file has the minimum versions we should use for some packages
– example: minimum version of a SOAP lib because of a new service version
● script will detect the already locked version and update the requirement if possible
– previously required: 1.1.0, minimum configured: 1.3.0
– if 1.3.0 or higher is used, script will update composer.json
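The decision in the example above boils down to two version comparisons. The sketch below is a simplification under stated assumptions: it treats all three values as plain version numbers rather than Composer constraints, relies on sort -V (version sort, as in GNU coreutils), and uses the hypothetical helper names version_ge and new_requirement.

```shell
# version_ge A B: succeed if version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print the version to require in composer.json, or print nothing when the
# requirement should stay untouched.
# $1 = currently required, $2 = configured minimum, $3 = locked version.
new_requirement() {
    required=$1 minimum=$2 locked=$3
    if version_ge "$required" "$minimum"; then
        return 0    # requirement already satisfies the minimum
    fi
    if version_ge "$locked" "$minimum"; then
        echo "$minimum"    # the locked version already complies, safe to raise
    fi
}
```

For the slide's numbers, new_requirement 1.1.0 1.3.0 1.3.2 prints 1.3.0, while a locked version of 1.2.9 prints nothing and leaves composer.json alone.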
Composer update
● add pyrech/composer-changelogs
● remove vendor folder
● run composer update --lock --prefer-dist
● pipe changelog output to file
● commit: git commit -a -F ../changelog-summary.txt
● let continuous integration find any problems
Effects
● committing coding style fixes will annoy everyone for a while
● possibly turn around your whole code base
● continuously updating dependencies creates pull requests
– we limited updates to Mondays :)
● after a while some fixes seem useless
– but why remove them if the problem can come back in new code?
● automatic merges are the next step
– CI should be able to test and greenlight the merge
Thank you!
Any questions?
Find me on twitter: @SvenRtbg
Find me on StackOverflow: SvenRtbg
Find me on GitHub: SvenRtbg
Find me on Slideshare: SvenRtbg
