Learning Scrapy: How to write a book about your favourite Python framework

Watch the presentation: https://youtu.be/vqqUjQbwypM

How much code do you need to write to make a book that is both easy to read and where every example runs fine now and in the future?

Writing a book is creating a product. A properly engineered book will work hard for you delivering excellent learning experiences to people all over the world for years after its release.

For some people, writing a book might be one of the best and most impactful ways to contribute to their favourite open source projects. For communities, supporting authors and helping them get their books right might be a brilliant investment.

In this presentation, I will share my experience writing "Learning Scrapy", shed some light on the process and hopefully inspire you to get more involved in the writing initiatives of the projects you support.

Learning Scrapy: How to write a book about your favourite Python framework

  1. "Learning Scrapy" How to write a book about your favourite Python framework Dimitrios Kouzis-Loukas
  2. What you will get today • Explain the process • Give some shortcuts & tips • Share my experience
  3. How does it help the community? • Access to a wider audience • What about documentation? (Never enough!) – Controlled? – Less structured (Reference? Hacks?)
  4. How does it help the author? • Money – NOT! – "How many books did you sell?" – "Did you negotiate your contract?" – Do you have a consultancy pipeline? • Feels good – Contribution, Connection, Mastery • Networking
  5. How does it start? There will be an e-mail, an outline, a contract and a plan "100 pages 6 months"
  6. "100 pages 6 months" Yeah right!
  7. About the author
  8. When are you going to start writing a book? Time "Accomplishment" Your first circuit Applied Mathematics & Physics MSc Microelectronics Your first program Working for ARM
  9. Time "Accomplishment" ?
  10. Time "Accomplishment" When are you going to start writing a book?
  11. Every book is a product Probably a bit like me! The audience, the reader The customer!
  12. The outline
  13. Title, cover, Table Of Contents & Chapter 1 "Live documents" Ask your marketing friends (SEO etc.)
  14. The editor
  15. The editor • Responsible for the book (a bit like a project manager) • Might or might not have a clue When will you send me Chapter 7?
  16. Tell them what a URL is! The customer!
  17. The customer! Beginners is a much larger audience
  18. • Introduction • Terminology • Installation • Background knowledge
  19. Book - Web partitioning
  20. Done! What?
  21. The reviewer
  22. The reviewer • Hard to find • Is good only if (s)he is bad! • Reader "proxy". Invaluable! Your book is worthless!
  23. Rackspace woke up my wife! The customer! On a scale of 1-10, in Python, I would give myself an 8. Scrapy installation gave 32 warnings. Do you have permissions to crawl that site?
  24. The customer! Don’t piss off the more advanced ones!
  25. I'm actually on my own!
  26. If at first you don't succeed... Brand new Editor!
  27. The customer!
  28. Chapters 5-9
  29. Pay some credit, if possible
  30. Gartner Hype Cycle On every chapter... Pain! Skip the pain!
  31. Do you have permissions to crawl that site? This book isn’t about MySQL, nor Redis Reproducible research 9 servers, realistic, almost 0 support, works on the airport!
  32. Maybe we can cut some edges! Actually no, my reputation is at stake!
  33. The customer!
  34. The customer! My manager, my friend, myself!
  35. Watch the presentation: https://youtu.be/vqqUjQbwypM
  36. 667 pages/second Micro-batches
  37. Appendix
  38. Production • Check the PDFs: – Text – Diagrams – Code, code code (especially in Python)
  39. Look Mum, I'm on Amazon!
  40. Your Friends Get the vision, no problem! You start here You end here Do not offend these guys You really "sell" to those guys Your main audience
  41. Contribute A book on your favorite open source framework • Author or Co-author • Review (but be polite) • Support authors
  42. Make the World a Better Place!

Editor's Notes

  • * 15k stars, 4k forks
    The community shouts “give me a book!”
    Wider audience
    Teacher / manager “go read a book”
    Supplements documentation
    Book has specific aim (in contrast to doc which has many)
  • [2:40]
    Best seller => Write about “MS Office”
    Why? Feels good
  • [3:40]
    How does it start? Very simple – it all starts with an e-mail
    The plan
  • Why might you not stick to the plan?
  • You have to look a bit into the life of the author...
  • I was a normal Geek
  • Jump into the unknown
    Quit my job, moved to Poland, Budapest, Seville
    Ran my own business based in the UK
    Helped startups and many, many people
    tons of work, tons of software, great stories!
  • In the middle of all this you start writing a book
    A strange time in your life
    The plan might not work
    But there are further reasons... Like that... Every book is a product
  • [6:00]
    Every book is a product... And the reader is the customer
    And you’ve heard how difficult customer discovery is in startups
    You start with the hypothesis that the reader is somewhat like you
  • Based on this idea, an outline
    A small, fun book for a little startup that needs an MVP
    Using Scrapy – an MVP faster than fast
    And you start writing your book...
  • Now some things you have to know about your book...
    It all starts as an ugly draft
    Will the table of contents change?
    The editor doesn’t care if you change any material
    Sell! Everyone is going to read those, pass your message
    maximum reach, convince people to read more
  • Introduce you to the editor
  • The guy who sends you the annoying email
    With a broad area of expertise, like “Java”, but otherwise clueless
  • What?!
    The first pivot of the book
  • If you’re about to write only one book, write one with the widest audience.
    Aim at beginners
  • Beginner friendly stuff
    Let’s help the reader, in a tutorial-like fashion, get from the website to the data
  • But data doesn’t sell itself
    Imagine if I told you, “you will do 10 hours of work and then you will be able to see the data in Excel” – not motivating
    Chapter 4 is a nice chapter, just 13 pages – a mobile app using the data.

  • [10:20]
    I haven’t shown any advanced techniques
    But I need advanced techniques while keeping the code nice, clear and understandable
    Upload to PyPI
    Put hacks and boilerplate code there, make your book’s code look like poetry
    Update after release, when e.g. the Scrapy version changes
  • I’ve given him a few broken drafts...
    And as soon as we hit a hundred pages – “we are done!”
  • And here comes the reviewer
  • [11:30]
    Has limited time
    Has domain expertise
    If he tells you everything is fine, he doesn’t help
    It’s a persona, not a person – likely you will have many
  • [11:50]
    and my first reviewer comes with comments like “...”
    He happened to be a Python expert – not my target audience
    Awesome to have a good reviewer
    The aspect of the customer who is demanding and knows his stuff really well
  • Maybe you can’t help this customer a lot, but don’t annoy him
    Examples: “array” instead of “list” -> how do I trust you?
    “models”
    Talk against the management
  • [13:11]
    I clearly realized a few things about the process:
    The editor brought the wrong reviewer
    Said it’s *your book* that doesn’t work: “we need more code”
    It’s actually his ideas that didn’t work, but I can’t blame him.
    -> I’m the author <-
  • [13:30]
    At that point I think we were both a little bit tired of each other
    And I needed a new editor
    So simple! One e-mail.
    We have a brand new editor
  • [13:50]
    New understanding of the customer –
    Review all the material, add chapters, expand – a major pivot
  • * Drop the old title... And here we are
  • * Added new chapters
  • [14:20]
    Hard working people
    No affiliation
    But they did an amazing job – and they have a little start-up
    I wanted to give them some credit
  • [14:35]
    In the industry people get excited/disillusioned/enlightened/productive - I felt the same with each chapter
    “The most important chapter” => “Disillusionment” => writer's block, after first sentence, empty paper
    Configuration is necessary, not exciting and covered in the documentation
    As soon as I had the idea, I hired people to draft sections for me: “write me a tutorial”
    When they came back, I was able to put in the “soul” – what the real author’s work is about
    Accurate, fun, examples.
    It really helped me finish this chapter very quickly and I feel sorry I didn’t think about it earlier.
  • [16:10]
    Chapter 9 needs Elasticsearch, Redis, MySQL
    If you have it, you need to tell them how to install it
    I don’t want to do that and I can hear the reviewer shouting “this book isn’t on MySQL”
    Vagrant + Docker: 9 virtual servers with SSH inside the VM, even on Windows
    They can get the real feel
    Almost 0 support
    Works offline
    No need to hit any external website to do the crawls
    Of course there was the need to optimize those boxes both for CPU and memory to allow for a wider audience
    Use it, copy-paste
  • [17:14]
    In my communication with the editor I was a different person
    “Let’s cut some edges and release before Christmas”
    The 2 stars on Amazon will be next to my name, not yours!
  • [17:38]
    Because there’s a very very important part of the audience that I’ve finally seen
  • [18:05]
    This is how we get to Chapter 10, where I develop a complete performance model of Scrapy
    Influenced by physics
    “If you imagine the URLs like water going from the top” – All the important settings are here. If they aren’t aligned you get inferior performance
    The troubleshooting guide tells you what metrics to look at and what to do
    The most common of them is actually that you don’t have enough to do!

  • [18:26]
    Stack Overflow does it wrong – with queues per item
    Scrapy wants something close to Spark micro-batching
    Push to S3
    Put the reference to the S3 object on the queue
    Great performance: 667 pages/second
  • [19:22]
    All the reference material
    Windows guy: how to enable SSH? How to install everything.
    You are able to tell them – when they come with support questions – this is on page 250
  • [19:44]
    You have the final drafts but it’s not over!
    Production -> Go check everything
    You speak way better English than your proofreaders
    Diagram resolution
    Code/spaces formatting
  • [20:15]
    After all this, the day comes when you see your book on Amazon!
    You are happy and proud!!!
  • [20:20]
    Body of customers
    Don’t offend. Don’t be vague. Write a bit defensively, excite but don’t annoy. The “art of writing”
    It applies at every level of expertise
    But mainly help your main audience to become better
  • [22:00]
    Go - contribute
    Multiple authors... Don’t try to be fair, just do it
    Be polite reviewers
    If you hear that someone is writing a book, ask them: how can I help?
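
The micro-batching notes at [18:26] (push to S3, put the reference on the queue) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the book's code: `InMemoryBlobStore` is a hypothetical stand-in for S3, Python's `queue.Queue` stands in for the message queue, and the key layout and batch size are made up for the example.

```python
import json
import queue
from itertools import islice


class InMemoryBlobStore:
    """Hypothetical stand-in for an object store such as S3."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data

    def get(self, key):
        return self.blobs[key]


def batched(items, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def publish_in_micro_batches(items, store, refs, batch_size=100):
    """Upload each batch as one blob and enqueue only the blob's key."""
    for number, batch in enumerate(batched(items, batch_size)):
        key = f"batches/{number:06d}.json"  # assumed key layout
        store.put(key, json.dumps(batch))
        refs.put(key)  # one small queue message per batch, not per item


def consume(store, refs):
    """Drain the queue, resolving each reference back into its items."""
    items = []
    while not refs.empty():
        items.extend(json.loads(store.get(refs.get())))
    return items


store = InMemoryBlobStore()
refs = queue.Queue()
pages = [{"url": f"http://example.com/{i}"} for i in range(250)]
publish_in_micro_batches(pages, store, refs, batch_size=100)
print(refs.qsize())  # queue messages needed for 250 items
```

The point of the design is that the queue carries a handful of small references while the bulk data travels through the object store, so the per-message overhead is paid once per batch instead of once per item.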