Faster Secure Software Development with Continuous Deployment - PH Days 2013
Faster Secure Software Development with Continuous Deployment. Nick Galbreath, IPONWEB, nickg@iponweb.net. PHDays, May 24, 2013, Moscow, Russia
Follow Along! Latest version posted online: http://slidesha.re/199wo7Q http://client9.com/20130524
Nick Galbreath (@ngalbreath)! Spoken at Black Hat, DEFCON, OWASP! Book on cryptography! but really...! Engineering Management and Software Development for high-growth startups.! Personal site http://www.client9.com/
libinjection! Author of libinjection! A very different way of doing SQLi detection! Right now in another room, Vladimir Vorontsov is showing how to bypass it (to be fixed shortly)! Check it out and file bugs on GitHub: http://libinjection.client9.com/ ! Found a bypass? What database and query is needed to exploit it?! RU ♥ SQLi
IPONWEB! customized online advertising infrastructure and exchanges! engineering offices in Moscow, with business offices in London, New York and Tokyo.! YEAH IN MOSCOW! YEAH WE ARE HIRING! Send email to email@example.com
Well, that's a bold statement...! Fixing Security by Fixing Development Using Continuous Deployment
and here's another! For web applications, our release-based software development lifecycle is still based on a pre-Internet model and is harmful to organizations and particularly harmful for security.
What needs fixing?! SQLi dropped from #8 to #14 in the latest WhiteHat "The State of Web Security" report. Good news, right?! This means SQLi is found on only 7% of websites. That's 1 in 15. And this is the #14 vulnerability!! And time to fix was on average 196 days. That's embarrassing.! Veracode claims 32% of incoming web applications have SQLi. https://info.veracode.com/state-of-software-security-report-volume5.html https://reg.whitehatsec.com/WPstats0513
Even worse...! Number 1 driver to fix security problems... compliance.! Number 1 reason to not fix security problems... compliance.! Not..! keeping our employees and customers safe! protecting corporate interests! improving quality! being good at what we do.
Security Products: #1... in security bugs! Veracode: State of Software Security, V4, December 2011! Security Product: 74% Fail Rate
Let's Just Give Up! “You could spend all your resources chasing such things as this,” William Ribich, the former president of Technology Solutions Group [QinetiQ], said in an interview in January. Ribich, who retired in November 2009, shortly after the discovery of a major data theft, said he needed to balance the uncertain risk that the hackers could use what they stole against a growing shopping list of security products and consulting fees.! “You finally have to reach a point where you say ‘let's move on,’” he said. http://www.bloomberg.com/news/2013-05-01/china-cyberspies-outwit-u-s-stealing-military-secrets.html ! ^^^ The Russian and Chinese hackers did not move on ^^^
I would call that broken! But preventing SQLi isn't a technically hard problem.! And most security patches are very small.! How did we get here?
High Distribution Cost! The Software Product Model is designed for software where the cost of distribution is high. "High" might be financial, risk, time, resources, customer annoyance.! Retail, physical product, CD/DVD! Embedded or Exotic Hardware! Safety, Medical or Defense Systems! Operating Systems (desktop or phone)! Homework (1-time deploy)
Web Applications Year 2000! Mostly followed the Software Product Model since that's all we knew.! High barrier to entry! Specialized Hardware, Software and People needed to get started.! Lots of engineering needed to keep things running.! (side note: CERT/CVE started in 1999)
True Story #1! "Can't push out the spelling error fix – it's too risky"! "That code has already been through QA, it's locked down."! "Product has to prioritize that change, else we aren't touching it."
True Story #2! We'll do an iteration, where we try to fix as many things as possible.! This won't be a scheduled iteration; it will be done because so many things are piled up.! So the spelling error will get fixed... uhh, who knows when.
Web Applications 2013 ! Almost no barrier to entry! Commodity hardware! Programming not that hard! Scaling problems can be mostly outsourced (mostly)
Cost of Distribution 2013! Frequently no compile step, or it's very fast.! Moving to production means a few kilobytes or megabytes of code over a 1Gbps or 10Gbps link.! In other words... free
Failure is very different however! Most web applications are data-driven.! Frequently have social features, APIs, user-generated content.! Failures might be due to algorithmic problems... but...! Most likely due to user input, bad data in the database or operational load.! This means data from the past can cause problems in the future.
Releases and Problems! When a web release goes out and has problems...! Next week is spent tracking down who changed what, where.! Re-QA! Re-Push! Meanwhile new code is piling up.
When SPM meets Web Apps! A long time between code being written and code being released.! Might be weeks or months! Feedback loop between code-in-dev and code-in-production is broken! When security or bug reports come in, the author is likely on a different project.
Hypothesis! It is impossible to simulate the production environment in development, either due to operational differences or data differences.! No amount of QA or Security Testing can prove you don't have bugs or vulnerabilities, or that you won't cause severe operational problems.! You have bugs and vulnerabilities, right now, in your application.
Impedance Mismatch! Easy to write code +! Long release cycles +! Security as an end-of-line or out-of-band process ==! no one cares! Something is going to break, and most people don't care.
So the Answer is...! Going slower? I'm sure your boss will love that suggestion! More steps and process? In other words, slower.! Asking for more people? Sure, but good luck hiring them. Doesn't scale.! Asking for more products? Since the others have worked so well.
Continuous Deployment! Also known as Continuous Delivery.! A System of Software Production Characterized by Numerous Small Changes to the Production Environment, initiated by the author of the change.! You change it, you push it to prod.
Deployment != Feature Release! [Figure: change over time]! New code goes out all the time.! New features get turned on in a separate process.
"Writing Software"! Software developers think their job is writing software.! And so, they love to make things perfect before anyone else sees it.! Impolite: "data hiding"! code is hiding on the developer's computer! or on some branch! in other words, invisible until it's ready.
Actually! The software engineer's job is actually writing running software that works well.! This idea is so alien that companies have to remind the engineers of this.
Rackspace Haiku! writing code is hard / if you cannot deploy / it does not matter! @paulvx from DevOpsDays Austin 2013
Facebook's Analog Labs Poster! "Move Fast and Break Things"... Except "Push" (deployment system)! via http://mitadmissions.org/blogs/entry/move_fast_and_break_things
Today's goal! But for today the goal is getting the developer to care about their code in production.! If you don't have that, I don't think you can really solve security problems.
How does this work?! Really? Developers push their own code out?! How is this not a disaster?! How is this not a security disaster?
The Deploy Button! What if you had a button that said "DEPLOY"! That pushed to production whatever is current in your source control system.! And took about a minute! The change and who pressed the button are logged, but that's it.
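The "logged, but that's it" part can be very small. A minimal sketch, assuming an append-only text file is good enough as the deploy log; the function names, the JSON layout and the `deploy.log` path are all hypothetical and not from the talk:

```python
import getpass
import json
import time


def format_deploy_entry(revision, user=None, now=None):
    """Build one JSON log line: who deployed which revision, and when (UTC)."""
    entry = {
        "revision": revision,
        # time.gmtime(None) means "now"; pass a Unix timestamp to override.
        "time": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
        "user": user or getpass.getuser(),
    }
    return json.dumps(entry, sort_keys=True)


def record_deploy(revision, log_path="deploy.log", user=None, now=None):
    """Append the entry to a log file (the path is an assumption)."""
    line = format_deploy_entry(revision, user=user, now=now)
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

The button's handler would call `record_deploy(current_revision)` just before pushing, so every deploy leaves exactly one line behind.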
Part 1: Fear! No one is going to push it ;-)! Meanwhile code is piling up! Real example: A new hire I had at Etsy was afraid of deploying an HTML change that they made. "But I don't want to break the site!"
Part 2: First Push! Someone brave will press the button! And very likely the site will explode, and a rollback will need to be done.! They'll know since someone else will have told them.
Part 3: With Graphs! Let's get all those operational graphs out in the open. And put them right next to the button. http://codeascraft.com/2011/02/15/measure-anything-measure-everything/
Part 4: Push #2! Repush! Site might still explode! But the developer is aware and can roll back.
Take 5: Isolation! Hmmm, the developer notices that in the change set, a million things are going out.! Maybe just pushing out a smaller change will help isolate the issue.
Take 6: Success!! Yes, the developer just pushed out some code and made the site better.! The secret about continuous deployment is small changes that can be easily understood.
Take 7: Dark Pushes! Now that we've got some bugs fixed, let's push a feature.! First let's push out all the supporting files. Since they aren't being called, they do nothing and are safe to push out.! Now everyone can see them
Take 8: Getting the feature live! Instead of "all at once", we slowly ramp up a feature.! if user_id % 20 == 0: do_new_feature()! we can change the percentage easily with another code push.! or turn it down. Much nicer change log.! While the site didn't explode, it's hard to see if the feature is being used or not.
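The `user_id % 20 == 0` check turns the feature on for 1 user in 20, i.e. 5%. A minimal sketch of the same idea with the percentage as a parameter, so ramping up or down is a one-number change; the function name `in_rollout` is hypothetical, and it assumes user IDs are integers spread roughly evenly:

```python
def in_rollout(user_id, percent):
    """Return True if this user is inside the first `percent` of 100 buckets.

    The same user always lands in the same bucket, so raising the
    percentage only ever adds users; nobody flips back and forth.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return user_id % 100 < percent


ROLLOUT_PERCENT = 5  # changed with an ordinary code push


def render_page(user_id):
    # Both code paths are deployed; only the ramp decides which one runs.
    if in_rollout(user_id, ROLLOUT_PERCENT):
        return "new feature"
    return "old behaviour"
```

Setting `ROLLOUT_PERCENT` to 0 turns the feature off without removing any code, which is what keeps deployment separate from feature release.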
Take 9: Application Level Graphs! Allow developers to instrument their code so they can see what is happening in production.! Enter StatsD and other UDP-based tools! Enter centralized logging and an in-application method to make it easy to log problems.
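Instrumenting code for StatsD can be one line per event. A minimal sketch that sends a counter in the plain-text StatsD format (`name:value|c`) over UDP, using only the standard library; the helper names are hypothetical, and the host and port (8125 is the usual StatsD default) are assumptions about your setup:

```python
import socket


def format_counter(name, value=1):
    """Encode a StatsD counter increment, e.g. b"logins.failed:1|c"."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP needs no connection and never blocks the request."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    except OSError:
        pass  # a metrics outage must not break the application
    finally:
        sock.close()
```

A developer then drops `send_counter("checkout.new_flow.used")` into the new code path, and the graph answers whether the feature is being used.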
Take 10: Communication! So far, good for one developer.! To scale up, you'll need a system that allows developers to communicate.! IRC-like tools work well (e.g. "the push channel") – Skype, Jabber, Hangouts, etc.
Along the way! Expose production logs to developers! Add a staging step where the code goes to a fraction of the cluster, so developers can test with real traffic! Try to make development closer to prod.! Make "smoke tests" to catch basic errors! Add syntax checkers to eliminate obvious issues.! Use static analysis to find bugs
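A syntax checker is the cheapest of these gates. A minimal sketch for a Python code base, assuming the deploy tool can hand over the changed files as a mapping of filename to source text; the function name is hypothetical, and other languages would call their own parser or linter instead:

```python
import ast


def syntax_errors(sources):
    """Return {filename: message} for every source string that fails to parse."""
    errors = {}
    for filename, text in sources.items():
        try:
            ast.parse(text, filename=filename)
        except SyntaxError as exc:
            errors[filename] = f"line {exc.lineno}: {exc.msg}"
    return errors
```

If the returned dict is non-empty, the push stops before anything reaches production.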
Mistakes will happen! Do postmortem analysis! Everyone thought they were doing the right thing at the time.! Ask "How can the environment be changed to prevent this?" and build tools to enforce it.! (Rarely can you truly change people)
That guy who pushes at 3am! Courtesy and convention will converge very quickly when the site goes down at 3am and the developer starts getting calls ;-)! Off-hours pushes can of course happen, as long as the developer notifies operations.
What About Code Reviews?! Yes, please do them.! Nothing here prevents code reviews.! In fact code reviews are easier since! they are small! they are in mainline not some branch
What about Security Reviews?! Please do them.! Nothing here eliminates architectural planning or review.! This actually doesn't change the SDLC very much.
What about Agile Methods?! (everyone seems to have a different idea of what Agile is, but...)! Agile methodologies typically work to improve the business spec / development cycle (are you building what the customer wants?).! But they don't address code deployment.! They are complementary practices.
What about Customer Service?! "Don't they freak out with all the changes?"! Remember: deployment != feature release! Most deployments do very little from the customer's point of view! Feature releases (frequently controlled by ramp-ups or flags) always need to be coordinated with product and customer service.
What about Compliance? PCI?! Let me tell you about compliance...! mechanism not policy! compliance is a lot easier when it's done every day instead of as a once-a-year audit.
Obvious Benefit to Security! Security patches can go out quickly! You know this since they are now just part of a normal development cycle and code goes out regularly.! Why not clear out those low-priority security problems?
More Importantly! That engineer who previously did not push code is now sensitized to the fact that their code has consequences, and is responsive and empowered to fix it.! It’s amazing how interested engineers become in security when you find problems with their code and they are able to fix them quickly themselves.
New Security Math! Instead of focusing only on increasing MTTF,which will never be infinite! more firewalls, more process, more magic! You can focus on how fast can you detect faults,and how fast can you fix them.! How low can you go?! MTTD - Mean time to Detect! MTTR - Mean time to Repair
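Both numbers are easy to track once incidents carry timestamps. A minimal sketch, assuming MTTD is measured from when a fault started to when it was detected, and MTTR from detection to repair; the field names and the two incidents below are made up purely for illustration:

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average gap, in minutes, between two timestamps across incidents."""
    return mean(
        (i[end_key] - i[start_key]).total_seconds() / 60 for i in incidents
    )


# Hypothetical incident records, not real data.
incidents = [
    {
        "started": datetime(2013, 5, 1, 10, 0),
        "detected": datetime(2013, 5, 1, 10, 30),
        "repaired": datetime(2013, 5, 1, 10, 45),
    },
    {
        "started": datetime(2013, 5, 2, 14, 0),
        "detected": datetime(2013, 5, 2, 14, 10),
        "repaired": datetime(2013, 5, 2, 14, 55),
    },
]

mttd = mean_minutes(incidents, "started", "detected")   # 20.0 minutes
mttr = mean_minutes(incidents, "detected", "repaired")  # 30.0 minutes
```

Faster deploys attack MTTR directly; the graphs and logs discussed in this talk attack MTTD.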
Hack The Stack! A side effect of this is that you now have tools to repurpose for security and monitoring of production! Note that most changes are not security problems.
Logging! Because developers can now see application logs, it's very easy to instrument the application to log security events.! Or add logging for the times when you are under attack.
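Logging a security event can reuse whatever logger the application already has. A minimal sketch with Python's standard `logging` module; the logger name `security`, the helper name and the `key=value` layout are assumptions, not something the talk prescribes:

```python
import logging

security_log = logging.getLogger("security")


def log_security_event(event, **fields):
    """Log one security event as sorted key=value pairs and return the message.

    repr() is used for values so attacker-controlled input (newlines,
    quotes) cannot forge extra log lines.
    """
    details = " ".join(f"{k}={v!r}" for k, v in sorted(fields.items()))
    message = f"event={event} {details}".rstrip()
    security_log.warning(message)
    return message
```

For example, a failed login would call `log_security_event("login_failed", user=username, ip=remote_ip)`, and the centralized logging already in place picks it up.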
Graphing! Make dashboards of! SQLi and XSS attacks! Every type of log-in failure! Core Dumps! Database Syntax Errors
Static Analysis! You now have a place to insert these checks.! Work with the QA group to add more code quality tests.
Post-Commit Checks! Alert when sensitive areas of the code are changed (auth, login)! Alert on crypto usage (why is the developer using MD5... hmmm)! Alert on your programming language's "dangerous functions"! This allows you to engage the developer at the start of the cycle.
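These alerts can start as a few pattern matches over each commit. A minimal sketch, assuming the commit hook can supply a mapping of changed path to the text of the added lines; the path prefixes, the patterns and the function name are hypothetical examples to adapt to your own code base:

```python
import re

# Hypothetical: directories considered sensitive in this code base.
SENSITIVE_PATHS = ("auth/", "login/")

# Hypothetical: patterns worth a second look, not proof of a bug.
RISKY_PATTERNS = {
    "weak hash (MD5)": re.compile(r"\bmd5\b", re.IGNORECASE),
    "dynamic code execution": re.compile(r"\b(eval|exec)\s*\("),
}


def review_alerts(changed_files):
    """Return human-readable alerts for a commit.

    `changed_files` maps each changed path to the text of its added lines.
    """
    alerts = []
    for path, added in sorted(changed_files.items()):
        if path.startswith(SENSITIVE_PATHS):
            alerts.append(f"{path}: sensitive area changed")
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(added):
                alerts.append(f"{path}: {label}")
    return alerts
```

The alerts go to the security team as a prompt to talk to the author, not as a gate that blocks the push.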
Faster is Better! You could do most of this in a normal release-cycle software lifecycle.! The difference is you are finding problems at the start instead of 10 minutes before the launch and telling everyone to stop.! The feedback loop works.
New Roles, Fewer Silos! Developers: work with operations! QA: works on building systems for testing, to empower others to write better tests! Release Engineering: tools to enable code to flow faster! Security: in-house consultancy, secure-by-default architecture, monitoring
Goal: 50% reduction in deploy time! Whatever your state of deployment is, no matter how many people are involved, no matter how long it currently takes, make a goal of cutting it in half.! This is an easy sell to management just on a cost basis.! Everything else flows from this.
Mechanism not Policy! Strive for the fastest deployment mechanism possible! But you define the "continuous" in Continuous Deployment! Yes, Etsy was 60+ deploys per day, with each having multiple authors.! Current gig? We have a rule of no more than 3 per week, since our customers have asked for that, and we only deploy at "low tide"
In other contexts: Operations! How fast can you deploy OS changes to your production environment?! How fast can you deploy router changes?! How fast can you deploy patches to the desktop?! You probably don't do it that often since it's really painful and time consuming! That's exactly the problem.
In other contexts: software product! Here "production" might be getting code into the main branch and running automated build / test.! It's the flow of code: little changes vs big.
In other contexts: silicon! Continuous deployment is already done for silicon! wut?! Only small changes, with tests, are allowed to be committed!! Big changes are rejected.! Learned the hard way that big changes are completely unmanageable