
A Journey through New Languages - Rancho Dev 2017

A journey learning Crystal and Elixir and comparing them with Ruby, and why raw performance alone is not enough to compare languages.


  1. LANGUAGES: A JOURNEY @akitaonrails
  2. LANGUAGES: A JOURNEY RANCHO DEV 2017 @akitaonrails
  4. www.theconf.club
  5. Language syntax is EASY. Architectures (PATTERNS) are HARD.
  6. git checkout -b old_version remotes/origin/old_version
  7. time bin/manga-downloadr -t
  8. #!/usr/bin/env ruby
     $LOAD_PATH.unshift File.join(File.dirname(__FILE__), '..', 'lib')
     require 'optparse'

     options = { test: false }
     option_parser = OptionParser.new do |opts|
       opts.banner = "Usage: manga-downloadr [options]"
       opts.on("-t", "--test", "Test routine") do |t|
         options[:url]       = "http://www.mangareader.net/onepunch-man"
         options[:name]      = "one-punch-man"
         options[:directory] = "/tmp/manga-downloadr/one-punch-man"
         options[:test]      = true
       end
       opts.on("-u URL", "--url URL", "Full MangaReader.net manga homepage URL - required") do |v|
         options[:url] = v
       end
       opts.on("-n NAME", "--name NAME", "slug to be used for the sub-folder to store all manga files - required") do |n|
         options[:name] = n
       end
       opts.on("-d DIRECTORY", "--directory DIRECTORY", "main folder where all mangas will be stored - required") do |d|
         options[:directory] = d
       end
       opts.on("-h", "--help", "Show this message") do
         puts opts
         exit
       end
     end
  9. require 'manga-downloadr'

     generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])
     generator.fetch_chapter_urls!
     generator.fetch_page_urls!
     generator.fetch_image_urls!
     generator.fetch_images!
     generator.compile_ebooks!
  10. require 'manga-downloadr'

      generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])
      puts "Massive parallel scanning of all chapters "
      generator.fetch_chapter_urls!
      puts "\nMassive parallel scanning of all pages "
      generator.fetch_page_urls!
      puts "\nMassive parallel scanning of all images "
      generator.fetch_image_urls!
      puts "\nTotal page links found: #{generator.chapter_pages_count}"
      puts "\nMassive parallel download of all page images "
      generator.fetch_images!
      puts "\nCompiling all images into PDF volumes "
      generator.compile_ebooks!
      puts "\nProcess finished."
  11. require 'manga-downloadr'

      generator = MangaDownloadr::Workflow.create(options[:url], options[:name], options[:directory])
      unless generator.state?(:chapter_urls)
        puts "Massive parallel scanning of all chapters "
        generator.fetch_chapter_urls!
      end
      unless generator.state?(:page_urls)
        puts "\nMassive parallel scanning of all pages "
        generator.fetch_page_urls!
      end
      unless generator.state?(:image_urls)
        puts "\nMassive parallel scanning of all images "
        generator.fetch_image_urls!
        puts "\nTotal page links found: #{generator.chapter_pages_count}"
      end
      unless generator.state?(:images)
        puts "\nMassive parallel download of all page images "
        generator.fetch_images!
      end
      unless options[:test]
        puts "\nCompiling all images into PDF volumes "
        generator.compile_ebooks!
      end
      puts "\nProcess finished."
  12. module MangaDownloadr
        ImageData = Struct.new(:folder, :filename, :url)

        class Workflow
          def initialize(root_url = nil, manga_name = nil, manga_root = nil, options = {}); end

          def fetch_chapter_urls!; end
          def fetch_page_urls!;    end
          def fetch_image_urls!;   end
          def fetch_images!;       end
          def compile_ebooks!;     end
          def state?(state);       end

          private

          def current_state(state); end
        end
      end
  14. def fetch_chapter_urls!
        doc = Nokogiri::HTML(open(manga_root_url))
        self.chapter_list = doc.css("#listing a").map { |l| l['href'] }
        self.manga_title  = doc.css("#mangaproperties h1").first.text
        current_state :chapter_urls
      end
  16. def fetch_page_urls!
        chapter_list.each do |chapter_link|
          response = Typhoeus.get "http://www.mangareader.net#{chapter_link}"
          chapter_doc = Nokogiri::HTML(response.body)
          pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
          chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
          print '.'
        end
        self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
        current_state :page_urls
      end
  17. def fetch_page_urls!
        chapter_list.each do |chapter_link|
          begin
            response = Typhoeus.get "http://www.mangareader.net#{chapter_link}"
            begin
              chapter_doc = Nokogiri::HTML(response.body)
              pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
              chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
              print '.'
            rescue => e
              self.fetch_page_urls_errors << { url: chapter_link, error: e, body: response.body }
              print 'x'
            end
          rescue => e
            puts e
          end
        end
        unless fetch_page_urls_errors.empty?
          puts "\nErrors fetching page urls:"
          puts fetch_page_urls_errors
        end
        self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
        current_state :page_urls
      end
  18. def fetch_page_urls!
        hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
        chapter_list.each do |chapter_link|
          begin
            request = Typhoeus::Request.new "http://www.mangareader.net#{chapter_link}"
            request.on_complete do |response|
              begin
                chapter_doc = Nokogiri::HTML(response.body)
                pages = chapter_doc.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
                chapter_pages.merge!(chapter_link => pages.map { |p| p['value'] })
                print '.'
              rescue => e
                self.fetch_page_urls_errors << { url: chapter_link, error: e, body: response.body }
                print 'x'
              end
            end
            hydra.queue request
          rescue => e
            puts e
          end
        end
        hydra.run
        unless fetch_page_urls_errors.empty?
          puts "\nErrors fetching page urls:"
          puts fetch_page_urls_errors
        end
        self.chapter_pages_count = chapter_pages.values.inject(0) { |total, list| total += list.size }
        current_state :page_urls
      end
  27. def fetch_image_urls!
        hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
        chapter_list.each do |chapter_key|
          chapter_pages[chapter_key].each do |page_link|
            begin
              request = Typhoeus::Request.new "http://www.mangareader.net#{page_link}"
              request.on_complete do |response|
                begin
                  chapter_doc = Nokogiri::HTML(response.body)
                  image = chapter_doc.css('#img').first
                  tokens = image['alt'].match(/^(.*?)\s-\s(.*?)$/)
                  extension = File.extname(URI.parse(image['src']).path)
                  chapter_images.merge!(chapter_key => []) if chapter_images[chapter_key].nil?
                  chapter_images[chapter_key] << ImageData.new(tokens[1], "#{tokens[2]}#{extension}", image['src'])
                  print '.'
                rescue => e
                  self.fetch_image_urls_errors << { url: page_link, error: e }
                  print 'x'
                end
              end
              hydra.queue request
            rescue => e
              puts e
            end
          end
        end
        hydra.run
        unless fetch_image_urls_errors.empty?
          puts "\nErrors fetching image urls:"
          puts fetch_image_urls_errors
        end
        current_state :image_urls
      end
  29. def fetch_images!
        hydra = Typhoeus::Hydra.new(max_concurrency: hydra_concurrency)
        chapter_list.each_with_index do |chapter_key, chapter_index|
          chapter_images[chapter_key].each do |file|
            downloaded_filename = File.join(manga_root_folder, file.folder, file.filename)
            next if File.exists?(downloaded_filename) # effectively resumes the download list without re-downloading everything
            request = Typhoeus::Request.new file.url
            request.on_complete do |response|
              begin
                # download
                FileUtils.mkdir_p(File.join(manga_root_folder, file.folder))
                File.open(downloaded_filename, "wb+") { |f| f.write response.body }
                unless is_test
                  # resize
                  image = Magick::Image.read(downloaded_filename).first
                  resized = image.resize_to_fit(600, 800)
                  resized.write(downloaded_filename) { self.quality = 50 }
                  GC.start # to avoid a leak too big (ImageMagick is notorious for that, specially on resizes)
                end
                print '.'
              rescue => e
                self.fetch_images_errors << { url: file.url, error: e }
                print '#'
              end
            end
            hydra.queue request
          end
        end
        hydra.run
        unless fetch_images_errors.empty?
          puts "\nErrors downloading images:"
          puts fetch_images_errors
        end
        current_state :images
      end
  30. def compile_ebooks!
        folders = Dir[manga_root_folder + "/*/"].sort_by { |element| element.split(" ").last.to_i }
        self.download_links = folders.inject([]) do |list, folder|
          list += Dir[folder + "*.*"].sort_by { |element| element.split(" ").last.to_i }
        end

        # concatenating PDF files (250 pages per volume)
        chapter_number = 0
        while !download_links.empty?
          chapter_number += 1
          pdf_file = File.join(manga_root_folder, "#{manga_title} #{chapter_number}.pdf")
          list = download_links.slice!(0..pages_per_volume)
          Prawn::Document.generate(pdf_file, page_size: page_size) do |pdf|
            list.each do |image_file|
              begin
                pdf.image image_file, position: :center, vposition: :center
              rescue => e
                puts "Error in #{image_file} - #{e}"
              end
            end
          end
          print '.'
        end
        current_state :ebooks
      end
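An aside on the chunking above: slice!(0..pages_per_volume) is destructive and, because the range is inclusive, actually takes pages_per_volume + 1 entries per volume. A non-destructive sketch of the same chunking with Enumerable#each_slice (variable names here are illustrative, not from the original code):

      # Chunk a flat, sorted list of image paths into 250-page volumes
      # without mutating the source array.
      download_links.each_slice(250).with_index(1) do |volume_pages, volume_number|
        puts "Volume #{volume_number}: #{volume_pages.size} pages"
        # each volume_pages array would feed one Prawn::Document.generate call
      end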
  31. time bin/manga-downloadr -t
      17.18s user 17.62s system 41% cpu 1:24.04 total
  33. 33. . !"" _build # $"" ... !"" config # $"" config.exs !"" deps # !"" ... !"" ex_manga_downloadr !"" lib # !"" ex_manga_downloadr # # !"" cli.ex # # !"" mangafox # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" mangareader # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" manga_wrapper.ex # # $"" workflow.ex # $"" ex_manga_downloadr.ex !"" mix.exs !"" mix.lock !"" README.md $"" test !"" ex_manga_downloadr # !"" mangafox_test.exs # $"" mangareader_test.exs !"" ex_manga_downloadr_test.exs $"" test_helper.exs 61 directories, 281 files mix.exs
  34. # mix.exs
      defmodule ExMangaDownloadr.Mixfile do
        use Mix.Project

        def project do
          [app: :ex_manga_downloadr,
           version: "1.0.2",
           elixir: "~> 1.4",
           build_embedded: Mix.env == :prod,
           start_permanent: Mix.env == :prod,
           escript: [main_module: ExMangaDownloadr.CLI],
           deps: deps()]
        end

        def application do
          [applications: [:logger, :httpoison, :porcelain, :observer]]
        end

        defp deps do
          [
            {:httpoison, "~> 0.11"},
            {:floki, "~> 0.17"},
            {:porcelain, "~> 2.0.3"},
            {:mock, "~> 0.2", only: :test}
          ]
        end
      end
  37. # workflow.ex
      defmodule ExMangaDownloadr.Workflow do
        def determine_source(url) do
        end

        def chapters({url, source}) do
          {:ok, {_manga_title, chapter_list}} = MangaWrapper.index_page(url, source)
          {chapter_list, source}
        end

        def pages({chapter_list, source}) do
          pages_list = chapter_list
            |> Task.async_stream(MangaWrapper, :chapter_page, [source], max_concurrency: @max_demand)
            |> Enum.to_list()
            |> Enum.reduce([], fn {:ok, {:ok, list}}, acc -> acc ++ list end)
          {pages_list, source}
        end

        def images_sources({pages_list, source}) do
          pages_list
          |> Task.async_stream(MangaWrapper, :page_image, [source], max_concurrency: @max_demand)
          |> Enum.to_list()
          |> Enum.map(fn {:ok, {:ok, image}} -> image end)
        end

        def process_downloads(images_list, directory) do
          images_list
          |> Task.async_stream(MangaWrapper, :page_download_image, [directory],
               max_concurrency: @max_demand / 2, timeout: @download_timeout)
          |> Enum.to_list()
          directory
        end

        def optimize_images(directory) do
          …
        end

        def compile_pdfs(directory, manga_name) do
          …
        end

        defp compile_volume(manga_name, directory, {chunk, index}) do
          …
        end

        defp prepare_volume(manga_name, directory, chunk, index) do
          …
        end

        defp chunk(collection, default_size) do
          …
        end
      end
  41. POOL
  43. # manga_wrapper.ex
      defmodule MangaWrapper do
        require Logger

        def index_page(url, source) do
          source |> manga_source("IndexPage") |> apply(:chapters, [url])
        end

        def chapter_page(chapter_link, source) do
          source |> manga_source("ChapterPage") |> apply(:pages, [chapter_link])
        end

        def page_image(page_link, source) do
          source |> manga_source("Page") |> apply(:image, [page_link])
        end

        def page_download_image(image_data, directory) do
          download_image(image_data, directory)
        end

        defp manga_source(source, module) do
          case source do
            "mangareader" -> :"Elixir.ExMangaDownloadr.MangaReader.#{module}"
            "mangafox"    -> :"Elixir.ExMangaDownloadr.Mangafox.#{module}"
          end
        end

        defp download_image({image_src, image_filename}, directory) do
        end
      end
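The manga_source/2 trick above builds a module name from a string at runtime (any atom prefixed with Elixir. names a module). The Ruby counterpart of that dynamic dispatch would be constant lookup; a minimal sketch, assuming hypothetical MangaDownloadr::MangaReader::IndexPage and MangaDownloadr::Mangafox::IndexPage modules:

      # Resolve "mangareader"/"mangafox" plus a page-class name to a module,
      # mirroring the Elixir manga_source/2 dispatch.
      def manga_source(source, page_class)
        namespace = { "mangareader" => "MangaReader", "mangafox" => "Mangafox" }.fetch(source)
        Object.const_get("MangaDownloadr::#{namespace}::#{page_class}")
      end

      # manga_source("mangareader", "IndexPage").chapters(url)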
  48. defmodule ExMangaDownloadr.Mangafox.ChapterPage do
        require Logger
        require ExMangaDownloadr

        def pages(chapter_link) do
          ExMangaDownloadr.fetch chapter_link, do: fetch_pages(chapter_link)
        end

        defp fetch_pages(html, chapter_link) do
          [_page | link_template] = chapter_link |> String.split("/") |> Enum.reverse

          html
          |> Floki.find("div[id='top_center_bar'] option")
          |> Floki.attribute("value")
          |> Enum.reject(fn page_number -> page_number == "0" end)
          |> Enum.map(fn page_number ->
            ["#{page_number}.html" | link_template] |> Enum.reverse |> Enum.join("/")
          end)
        end
      end
  50. # cli.ex
      defmodule ExMangaDownloadr.CLI do
        alias ExMangaDownloadr.Workflow
        require ExMangaDownloadr

        def main(args) do
          args |> parse_args |> process
        end

        ...

        defp parse_args(args) do
        end

        defp process(:help) do
        end

        defp process(directory, url) do
          File.mkdir_p!(directory)
          File.mkdir_p!("/tmp/ex_manga_downloadr_cache")
          manga_name = directory |> String.split("/") |> Enum.reverse |> Enum.at(0)

          url
          |> Workflow.determine_source
          |> Workflow.chapters
          |> Workflow.pages
          |> Workflow.images_sources
          |> Workflow.process_downloads(directory)
          |> Workflow.optimize_images
          |> Workflow.compile_pdfs(manga_name)
          |> finish_process
        end

        defp process_test(directory, url) do
        end

        defp finish_process(directory) do
        end
      end
  51. mix deps.get
      mix test
      mix escript.build

      ex_manga_downloadr - 4.6M
  53. time ./ex_manga_downloadr --test
      32.03s user 57.97s system 120% cpu 1:14.45 total
  55. 55. . !"" _build # $"" ... !"" config # $"" config.exs !"" deps # !"" ... !"" ex_manga_downloadr !"" lib # !"" ex_manga_downloadr # # !"" cli.ex # # !"" mangafox # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" mangareader # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" manga_wrapper.ex # # $"" workflow.ex # $"" ex_manga_downloadr.ex !"" mix.exs !"" mix.lock !"" README.md $"" test !"" ex_manga_downloadr # !"" mangafox_test.exs # $"" mangareader_test.exs !"" ex_manga_downloadr_test.exs $"" test_helper.exs 61 directories, 281 files . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr
  56. 56. . !"" _build # $"" ... !"" config # $"" config.exs !"" deps # !"" ... !"" ex_manga_downloadr !"" lib # !"" ex_manga_downloadr # # !"" cli.ex # # !"" mangafox # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" mangareader # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" manga_wrapper.ex # # $"" workflow.ex # $"" ex_manga_downloadr.ex !"" mix.exs !"" mix.lock !"" README.md $"" test !"" ex_manga_downloadr # !"" mangafox_test.exs # $"" mangareader_test.exs !"" ex_manga_downloadr_test.exs $"" test_helper.exs 61 directories, 281 files . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr
  57. # Elixir
      File.mkdir_p!(directory)
      File.mkdir_p!("/tmp/ex_manga_downloadr_cache")
      manga_name = directory |> String.split("/") |> Enum.reverse |> Enum.at(0)

      url
      |> Workflow.determine_source
      |> Workflow.chapters
      |> Workflow.pages
      |> Workflow.images_sources
      |> Workflow.process_downloads(directory)
      |> Workflow.optimize_images
      |> Workflow.compile_pdfs(manga_name)
      |> finish_process

      # Crystal
      def run
        Dir.mkdir_p @config.download_directory
        pipe Steps.fetch_chapters(@config)
          .>> Steps.fetch_pages(@config)
          .>> Steps.fetch_images(@config)
          .>> Steps.download_images(@config)
          .>> Steps.optimize_images(@config)
          .>> Steps.prepare_volumes(@config)
          .>> unwrap
        puts "Done!"
      end
  59. # 1
      y = c(b(a))

      # 2
      x = b(a)
      y = c(x)

      # Elixir Pipes
      y = a |> b |> c

      # Crystal Macro Pipes
      y = pipe a .>> b .>> c .>> unwrap
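Ruby itself has no pipe operator, but later Ruby versions (2.6+) offer Object#then (an alias of yield_self), which reads left-to-right in much the same way; a minimal sketch with placeholder b and c:

      def b(x); x * 2; end  # placeholder steps, for illustration only
      def c(x); x + 1; end

      a = 10
      y = a.then { |x| b(x) }.then { |x| c(x) }
      # => 21, same as y = c(b(a))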
  60. defmodule ExMangaDownloadr.MangaReader.IndexPage do
        require Logger
        require ExMangaDownloadr

        def chapters(manga_root_url) do
          ExMangaDownloadr.fetch manga_root_url, do: collect
        end

        defp collect(html) do
          {fetch_manga_title(html), fetch_chapters(html)}
        end

        defp fetch_manga_title(html) do
          html |> Floki.find("#mangaproperties h1") |> Floki.text
        end

        defp fetch_chapters(html) do
          html |> Floki.find("#listing a") |> Floki.attribute("href")
        end
      end
  63. require "./downloadr_client"
      require "xml"

      module CrMangaDownloadr
        class Chapters < DownloadrClient
          def fetch
            html = get(@config.root_uri).as(XML::Node)
            nodes = html.xpath_nodes("//table[contains(@id, 'listing')]//td//a/@href")
            nodes.map { |node| node.text.as(String) }
          end
        end
      end
  65. module CrMangaDownloadr
        class DownloadrClient
          ...
          def get(uri : String, binary = false)
            Dir.mkdir_p(@config.cache_directory) unless Dir.exists?(@config.cache_directory)
            cache_path = File.join(@config.cache_directory, cache_filename(uri))
            while true
              begin
                response = if @cache_http && File.exists?(cache_path)
                  body = File.read(cache_path)
                  HTTP::Client::Response.new(200, body)
                else
                  @http_client.get(uri, headers: HTTP::Headers{
                    "User-Agent" => CrMangaDownloadr::USER_AGENT })
                end
                case response.status_code
                when 301
                  uri = response.headers["Location"]
                when 200
                  if (binary || @cache_http) && !File.exists?(cache_path)
                    File.open(cache_path, "w") do |f|
                      f.print response.body
                    end
                  end
                  if binary
                    return cache_path
                  else
                    return XML.parse_html(response.body)
                  end
                end
              rescue IO::Timeout
                puts "Sleeping over #{uri}"
                sleep 1
              end
            end
          end
          ...
        end
      end
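The Ruby port has a downloadr_client.rb counterpart that the slides do not show; a minimal sketch of the same cache-then-fetch-with-redirects idea in plain Ruby, using Net::HTTP (names are illustrative assumptions, not the gem's actual API):

      require "net/http"
      require "digest"
      require "fileutils"

      # Cached GET with redirect-following, mirroring the Crystal
      # DownloadrClient#get above.
      def cached_get(uri, cache_dir)
        FileUtils.mkdir_p(cache_dir)
        cache_path = File.join(cache_dir, Digest::MD5.hexdigest(uri))
        return File.read(cache_path) if File.exist?(cache_path)

        response = Net::HTTP.get_response(URI(uri))
        case response
        when Net::HTTPRedirection
          cached_get(response["Location"], cache_dir)
        when Net::HTTPSuccess
          File.write(cache_path, response.body)
          response.body
        end
      end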
  71. 71. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  72. 72. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  73. 73. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  74. 74. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  75. 75. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  76. 76. require "fiberpool" module CrMangaDownloadr struct Concurrency(A, B) def initialize(@config : Config, @engine_class : DownloadrClient.class) end def fetch(collection : Array(A)?, &block : A, DownloadrClient -> Array(B)?) : Array(B) results = [] of B if collection pool = Fiberpool.new(collection, @config.download_batch_size) pool.run do |item| engine = @engine_class.new(@config) if reply = block.call(item, engine) results.concat(reply) end end end results end end end fetch Concurrency
  77. module CrMangaDownloadr
        class Workflow
        end

        module Steps
          def self.fetch_chapters(config : Config)
          end

          def self.fetch_pages(chapters : Array(String)?, config : Config)
            puts "Fetching pages from all chapters ..."
            reactor = Concurrency(String, String).new(config, Pages)
            reactor.fetch(chapters) do |link, engine|
              engine.try(&.fetch(link)).as(Array(String))
            end
          end

          def self.fetch_images(pages : Array(String)?, config : Config)
          end

          def self.download_images(images : Array(Image)?, config : Config)
          end

          def self.optimize_images(downloads : Array(String), config : Config)
          end

          def self.prepare_volumes(downloads : Array(String), config : Config)
          end
        end
      end
  79. crystal deps
      crystal spec
      crystal build src/cr_manga_downloadr.cr --release

      cr_manga_downloadr 752K
  81. time ./cr_manga_downloadr -t
      5.57s user 6.79s system 14% cpu 1:26.76 total
  83. 83. . !"" _build # $"" ... !"" config # $"" config.exs !"" deps # !"" ... !"" ex_manga_downloadr !"" lib # !"" ex_manga_downloadr # # !"" cli.ex # # !"" mangafox # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" mangareader # # # !"" chapter_page.ex # # # !"" index_page.ex # # # $"" page.ex # # !"" manga_wrapper.ex # # $"" workflow.ex # $"" ex_manga_downloadr.ex !"" mix.exs !"" mix.lock !"" README.md $"" test !"" ex_manga_downloadr # !"" mangafox_test.exs # $"" mangareader_test.exs !"" ex_manga_downloadr_test.exs $"" test_helper.exs 61 directories, 281 files . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr
  84. 84. . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr . !"" bin # $"" manga-downloadr !"" Gemfile !"" Gemfile.lock !"" lib # !"" manga-downloadr # # !"" chapters.rb # # !"" concurrency.rb # # !"" downloadr_client.rb # # !"" image_downloader.rb # # !"" page_image.rb # # !"" pages.rb # # !"" records.rb # # !"" version.rb # # $"" workflow.rb # $"" manga-downloadr.rb !"" LICENSE.txt !"" manga-downloadr.gemspec !"" Rakefile !"" README.md $"" spec !"" fixtures # !"" ... !"" manga-downloadr # !"" chapters_spec.rb # !"" concurrency_spec.rb # !"" image_downloader_spec.rb # !"" page_image_spec.rb # $"" pages_spec.rb $"" spec_helper.rb
  85. 85. . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr . !"" bin # $"" manga-downloadr !"" Gemfile !"" Gemfile.lock !"" lib # !"" manga-downloadr # # !"" chapters.rb # # !"" concurrency.rb # # !"" downloadr_client.rb # # !"" image_downloader.rb # # !"" page_image.rb # # !"" pages.rb # # !"" records.rb # # !"" version.rb # # $"" workflow.rb # $"" manga-downloadr.rb !"" LICENSE.txt !"" manga-downloadr.gemspec !"" Rakefile !"" README.md $"" spec !"" fixtures # !"" ... !"" manga-downloadr # !"" chapters_spec.rb # !"" concurrency_spec.rb # !"" image_downloader_spec.rb # !"" page_image_spec.rb # $"" pages_spec.rb $"" spec_helper.rb
  86. 86. . !"" cr_manga_downloadr !"" libs # !"" ... !"" LICENSE !"" README.md !"" shard.lock !"" shard.yml !"" spec # !"" cr_manga_downloadr # # !"" chapters_spec.cr # # !"" concurrency_spec.cr # # !"" image_downloader_spec.cr # # !"" page_image_spec.cr # # $"" pages_spec.cr # !"" fixtures # # !"" ... # $"" spec_helper.cr $"" src !"" cr_manga_downloadr # !"" chapters.cr # !"" concurrency.cr # !"" downloadr_client.cr # !"" image_downloader.cr # !"" page_image.cr # !"" pages.cr # !"" records.cr # !"" version.cr # $"" workflow.cr $"" cr_manga_downloadr.cr . !"" bin # $"" manga-downloadr !"" Gemfile !"" Gemfile.lock !"" lib # !"" manga-downloadr # # !"" chapters.rb # # !"" concurrency.rb # # !"" downloadr_client.rb # # !"" image_downloader.rb # # !"" page_image.rb # # !"" pages.rb # # !"" records.rb # # !"" version.rb # # $"" workflow.rb # $"" manga-downloadr.rb !"" LICENSE.txt !"" manga-downloadr.gemspec !"" Rakefile !"" README.md $"" spec !"" fixtures # !"" ... !"" manga-downloadr # !"" chapters_spec.rb # !"" concurrency_spec.rb # !"" image_downloader_spec.rb # !"" page_image_spec.rb # $"" pages_spec.rb $"" spec_helper.rb
  87. # Crystal
      def run
        Dir.mkdir_p @config.download_directory
        pipe Steps.fetch_chapters(@config)
          .>> Steps.fetch_pages(@config)
          .>> Steps.fetch_images(@config)
          .>> Steps.download_images(@config)
          .>> Steps.optimize_images(@config)
          .>> Steps.prepare_volumes(@config)
          .>> unwrap
        puts "Done!"
      end

      # Ruby
      def self.run(config = Config.new)
        FileUtils.mkdir_p config.download_directory
        CM(config, Workflow)
          .fetch_chapters
          .fetch_pages(config)
          .fetch_images(config)
          .download_images(config)
          .optimize_images(config)
          .prepare_volumes(config)
          .unwrap
        puts "Done!"
      end
  89. # concurrency.cr (Crystal, fiberpool)
      pool = Fiberpool.new(collection, @config.download_batch_size)
      pool.run do |item|
        engine = @engine_class.new(@config)
        if reply = block.call(item, engine)
          results.concat(reply)
        end
      end

      # concurrency.rb (Ruby, thread pool)
      pool = Thread.pool(@config.download_batch_size)
      mutex = Mutex.new
      results = []
      collection.each do |item|
        pool.process {
          engine = @turn_on_engine ? @engine_klass.new(@config.domain, @config.cache_http) : nil
          reply = block.call(item, engine)&.flatten
          mutex.synchronize do
            results += (reply || [])
          end
        }
      end
      pool.shutdown
  91. Fibers vs Threads
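The contrast this slide draws: Crystal fibers are cheap, cooperatively scheduled coroutines, while Ruby threads are OS-backed and preemptive (serialized by the GVL for pure-Ruby code, though the GVL is released during blocking IO such as these downloads). Both primitives also exist in plain Ruby; a minimal sketch:

      # Thread: scheduled preemptively by the VM/OS.
      t = Thread.new { :downloaded }
      t.value  # => :downloaded (blocks until the thread finishes)

      # Fiber: runs only when explicitly resumed, yielding control back.
      f = Fiber.new do
        Fiber.yield :first_chunk
        :second_chunk
      end
      f.resume  # => :first_chunk
      f.resume  # => :second_chunk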
  92. # Crystal
      module CrMangaDownloadr
        class Pages < DownloadrClient
          def fetch(chapter_link : String)
            html = get(chapter_link)
            nodes = html.xpath_nodes("//div[@id='selectpage']//select[@id='pageMenu']//option")
            nodes.map { |node| "#{chapter_link}/#{node.text}" }
          end
        end
      end

      # Ruby
      module MangaDownloadr
        class Pages < DownloadrClient
          def fetch(chapter_link)
            get chapter_link do |html|
              nodes = html.xpath("//div[@id='selectpage']//select[@id='pageMenu']//option")
              nodes.map { |node| [chapter_link, node.children.to_s].join("/") }
            end
          end
        end
      end
  94. time bin/manga-downloadr -t
      19.77s user 10.65s system 33% cpu 1:31.69 total
  96. Ruby/Typhoeus   (hydra_concurrency = 50)   41% CPU  1:24 min
      Elixir 1.4.5    (@max_demand = 50)        120% CPU  1:14 min
      Crystal 0.23.0  (opt_batch_size = 50)      14% CPU  1:26 min
      Ruby 2.4.1      (opt_batch_size = 50)      33% CPU  1:31 min
  100. Ruby     Typhoeus  libcurl
       Elixir   OTP       Poolboy
       Crystal  Fibers    Fiberpool
       Ruby     Thread    Thread/Pool
  104. manga-downloadr, ex_manga_downloadr, cr_manga_downloadr
       fiberpool, cr_chainable_methods, chainable_methods
  106. PREMATURE OPTIMIZATION: The Root of ALL Evil
  107. THANKS @akitaonrails slideshare.net/akitaonrails
  108. www.theconf.club
