Learning Scrapy: How to write a book about your favourite Python framework

Senior Software Developer at Bloomberg LP
Jul. 15, 2016

  1. "Learning Scrapy" How to write a book about your favourite Python framework Dimitrios Kouzis-Loukas Watch the presentation: https://youtu.be/vqqUjQbwypM
  2. What you will get today • Explain the process • Give some shortcuts & tips • Share my experience
  3. How does it help the community? • Access to a wider audience • What about documentation? (Never enough!) – Controlled? – Less structured (Reference? Hacks?)
  4. How does it help the author? • Money – NOT! – "How many books did you sell?" – "Did you negotiate your contract?" – Do you have a consultancy pipeline? • Feels good – Contribution, Connection, Mastery • Networking
  5. How does it start? There will be an e-mail, an outline, a contract and a plan "100 pages 6 months"
  6. "100 pages 6 months" Yeah right!
  7. About the author
  8. When are you going to start writing a book? Time "Accomplishment" Your first circuit Applied Mathematics & Physics MSc Microelectronics Your first program Working for ARM
  9. Time "Accomplishment" ?
  10. Time "Accomplishment" When are you going to start writing a book?
  11. Every book is a product Probably a bit like me! The audience, the reader The customer!
  12. The outline
  13. Title, cover, Table Of Contents & Chapter 1 "Live documents" Ask your marketing friends (SEO etc.)
  14. The editor
  15. The editor • Responsible for the book (a bit like a project manager) • Might or might not have a clue When will you send me Chapter 7?
  16. Tell them what a URL is! The customer!
  17. The customer! Beginners are a much larger audience
  18. • Introduction • Terminology • Installation • Background knowledge
  19. Book - Web partitioning
  20. Done! What?
  21. The reviewer
  22. The reviewer • Hard to find • Is good only if (s)he is bad! • Reader "proxy". Invaluable! Your book is worthless!
  23. Rackspace woke up my wife! The customer! On a scale of 1-10, in Python, I would give myself an 8. Scrapy installation gave 32 warnings. Do you have permissions to crawl that site?
  24. The customer! Don’t piss off the more advanced ones!
  25. I'm actually on my own!
  26. If at first you don't succeed... Brand new Editor!
  27. The customer!
  28. Chapters 5-9
  29. Give some credit, if possible
  30. Gartner Hype Cycle On every chapter... Pain! Skip the pain!
  31. Do you have permissions to crawl that site? This book isn’t about MySQL, nor Redis Reproducible research 9 servers, realistic, almost 0 support, works at the airport!
  32. Maybe we can cut some corners! Actually no, my reputation is at stake!
  33. The customer!
  34. The customer! My manager, my friend, myself!
  35. Watch the presentation: https://youtu.be/vqqUjQbwypM
  36. 667 pages/second Micro-batches
  37. Appendix
  38. Production • Check the PDFs: – Text – Diagrams – Code, code, code (especially in Python)
  39. Look Mum, I'm on Amazon!
  40. Your Friends Get the vision, no problem! You start here You end here Do not offend these guys You really "sell" to those guys Your main audience
  41. Contribute A book on your favorite open source framework • Author or Co-author • Review (but be polite) • Support authors
  42. Make the World a Better Place!

Editor's Notes

  1. * 15k stars, 4k forks The community shouts “give me a book!” Wider audience Teacher / manager “go read a book” Supplements documentation A book has a specific aim (in contrast to documentation, which has many)
  2. [2:40] Best seller => Write about “MS office” Why? Feels good
  3. [3:40] How does it start? Very simple – it all starts with an e-mail The plan
  4. Why might you not stick to the plan?
  5. You have to look a bit into the life of the author...
  6. I was a normal Geek
  7. Jump into the unknown Quit my job, moved to Poland, Budapest, Seville Ran my own business based in the UK Helped startups and many, many people Tons of work, tons of software, great stories!
  8. In the middle of all this you start writing a book A strange time in your life The plan might not work But there are further reasons... Like that... Every book is a product
  9. [6:00] Every book is a product... And the reader is the customer And you’ve heard how difficult customer discovery is in startups You start with the hypothesis that the reader is somewhat like you
  10. Based on this idea, an outline A small, fun book for a little startup that needs an MVP Using Scrapy – MVP Faster than fast And you start writing your book...
  11. Now some things you have to know about your book... It all starts as an ugly draft Will the table of contents change? The editor doesn’t care if you change any material Sell! Everyone is going to read those, pass your message maximum reach, convince people to read more
  12. Introduce you to the editor
  13. The guy who sends you the annoying email With a broad area of expertise, like “Java”, but otherwise clueless
  14. What?! The first pivot of the book
  15. If you’re about to write only one book, write one with the widest audience. Aim at beginners
  16. Beginner friendly stuff Let’s help the reader, in a tutorial-like fashion get from the website to the data
  17. But data doesn’t sell itself Imagine if I was telling you, “you will do 10 hours of work and then you will be able to see the data in Excel” – not motivating Chapter 4 is a nice chapter, just 13 pages – a mobile app using the data.
  18. [10:20] I haven’t shown any advanced techniques But I need advanced techniques while keeping the code nice, clear and understandable Upload to PyPI Put hacks and boilerplate code there, make your book’s code look like poetry Update after release, when e.g. the Scrapy version changes
  19. I’ve given him a few broken drafts... And as soon as we hit a hundred pages – “we are done!”
  20. And here comes the reviewer
  21. [11:30] Has limited time Has domain expertise If he tells you everything is fine, he doesn’t help It’s a persona, not a person – likely you will have many
  22. [11:50] And my first reviewer comes with comments like “...” He happened to be a Python expert – not my target audience Awesome to have a good reviewer The aspect of the customer who is demanding and knows his stuff really well
  23. Maybe you can’t help this customer a lot, but don’t annoy him Examples: “array” instead of “list” -> how do I trust you? “models” Talk against the management
  24. [13:11] I clearly realized a few things about the process: The editor brought the wrong reviewer Said it’s *your book* that doesn’t work “we need more code” It’s actually his ideas that didn’t but I can’t blame him. -> I’m the author <-
  25. [13:30] At that point I think we were both a little bit tired of each other And I needed a new editor So simple! One e-mail. We have a brand new editor
  26. [13:50] New understanding of the customer – Review all the material, add chapters expand – a major pivot
  27. * Drop the old title... And here we are
  28. * Added new chapters
  29. [14:20] Hard-working people No affiliation But they did an amazing job – and they have a little start-up I wanted to give them some credit
  30. [14:35] In the industry people get excited/disillusioned/enlightened/productive - I felt the same with each chapter “The most important chapter” => “Disillusionment” => writer's block, after the first sentence, empty paper Configuration is necessary, not exciting and covered in the documentation As soon as I had the idea, I hired people to draft sections for me: “write me a tutorial” When they came back, I was able to put in “soul” – what the real author's work is about Accurate, fun, examples. It really helped me finish this chapter very quickly and I feel sorry I didn’t think about it earlier.
  31. [16:10] Chapter 9 needs ES, Redis, MySQL If you have it, you need to tell them how to install it I don’t want to do that and I can hear the reviewer shouting “this book isn’t on MySQL” Vagrant + Docker 9 virtual servers with ssh inside the VM, even on Windows They can get the real feel Almost 0 support Works offline No need to hit any external website to do the crawls Of course there was the need to optimize those boxes both for CPU and memory to allow for a wider audience Use it, copy-paste
  32. [17:14] In my communication with the editor I was a different person “Let’s cut some corners and release before Christmas” The 2 stars on Amazon will be next to my name, not yours!
  33. [17:38] Because there’s a very very important part of the audience that I’ve finally seen
  34. [18:05] This is how we get to Chapter 10, where I develop a complete performance model of Scrapy Influenced by physics “If you imagine the URLs like water going from the top” – All the important settings are here. If they aren’t aligned you get inferior performance The troubleshooting guide tells you what metrics to look at and what to do The most common of them is actually that you don’t have enough to do!
  35. [18:26] Stack Overflow does it wrong – with queues per item Scrapy wants something close to Spark Micro-batching Push to S3 Put the reference to the S3 object on the queue Great performance 667 pages/second
  36. [19:22] All the reference material Windows guy: how to enable ssh? How to install everything. You are able to tell them – when they come with support questions – this is on page 250
  37. [19:44] You have the final drafts but it’s not over! Production -> Go check everything You speak way better English than your proofreaders Diagram resolution Code/spaces formatting
  38. [20:15] After all this, the day comes that you see your book on Amazon! You are happy and proud!!!
  39. [20:20] Body of customers Don’t offend. Don’t be vague. Write a bit defensively, excite but don’t annoy. The “art of writing” It applies at every level of expertise But mainly help your main audience to become better
  40. [22:00] Go - contribute Multiple authors... Don’t try to be fair, just do it Be polite reviewers If you hear that someone writes a book, ask them, how can I help
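
The micro-batching idea in the note at [18:26] (push a batch of items to S3, then put only a reference to it on the queue) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the book or the talk: an in-memory dictionary stands in for S3, Python's `queue.Queue` stands in for the message queue, and `InMemoryBlobStore`, `publish_in_batches`, `consume` and the batch size of 1000 are hypothetical names and values chosen for the example.

```python
import json
import queue
import uuid


class InMemoryBlobStore:
    """Hypothetical stand-in for S3: maps a generated key to a stored blob."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Store the blob under a fresh key and return the key as its reference.
        key = f"batches/{uuid.uuid4().hex}.json"
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def publish_in_batches(items, store, refs, batch_size=1000):
    """Write items to the store in batches; enqueue one reference per batch."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            refs.put(store.put(json.dumps(batch).encode("utf-8")))
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        refs.put(store.put(json.dumps(batch).encode("utf-8")))


def consume(store, refs):
    """Drain the queue of references and yield the items of each batch."""
    while True:
        try:
            key = refs.get_nowait()
        except queue.Empty:
            return
        yield from json.loads(store.get(key))


store = InMemoryBlobStore()
refs = queue.Queue()
items = [{"url": f"http://example.com/{i}"} for i in range(2500)]

publish_in_batches(items, store, refs, batch_size=1000)
queued = refs.qsize()  # one queue message per batch, not per item
received = list(consume(store, refs))
```

With a queue message per batch rather than per item, the queue carries far fewer messages, which is the contrast the note draws with "queues per item".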