1. Hacking the Browser
With Puppeteer-Sharp
Darío Kondratiuk
.NET Senior Developer - MultiTracks.com
Author of Puppeteer-Sharp
@kblok - @hardkoded
www.hardkoded.com
https://developer.mozilla.org/en-US/docs/Web/WebDriver October 4th, 5th & 6th 2018 .NET Conf AR v2018
7. Headless Browsers
● July 4, 2017 => Google Chrome 59
○ chrome --headless --disable-gpu --print-to-pdf https://www.chromestatus.com/
● August 8, 2017 => Firefox 55
● August 16, 2017 => Puppeteer v0.9
● January 12, 2018 => Puppeteer v1.0
● March 1, 2018 => Puppeteer Sharp v0.1
● April 27, 2018 => Edge DevTools Protocol v0.1
9. WebDriver vs Headless Browsers
10. WebDriver
WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers.
https://developer.mozilla.org/en-US/docs/Web/WebDriver
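To make "wire protocol" concrete: in the W3C WebDriver specification, each command is an HTTP request with a JSON body. The sketch below only builds two such requests (New Session and Navigate To) and sends nothing. The names `WireRequest` and `WebDriverWire` are hypothetical helpers invented for this illustration; they are not part of Puppeteer-Sharp, Selenium, or any driver.

```csharp
using System;
using System.Text.Json;

// Hypothetical type for illustration: one WebDriver command as it would
// travel over HTTP (method, path relative to the driver URL, JSON body).
public record WireRequest(string Method, string Path, string Body);

// Hypothetical helper: builds requests only, it never opens a connection.
public static class WebDriverWire
{
    // W3C WebDriver "New Session" command: POST /session
    public static WireRequest NewSession(string browserName)
    {
        var body = JsonSerializer.Serialize(new
        {
            capabilities = new { alwaysMatch = new { browserName } }
        });
        return new WireRequest("POST", "/session", body);
    }

    // W3C WebDriver "Navigate To" command: POST /session/{session id}/url
    public static WireRequest NavigateTo(string sessionId, string url)
    {
        var body = JsonSerializer.Serialize(new { url });
        return new WireRequest("POST", $"/session/{sessionId}/url", body);
    }
}
```

Because the protocol is plain HTTP and JSON, any language with an HTTP client can drive a browser this way, which is what "platform- and language-neutral" means in the definition above. A real client would send `NewSession("chrome")` to a running driver, read the session id from the response, and pass it to `NavigateTo`.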
17.
// Requires: using Microsoft.AspNetCore.Mvc; using PuppeteerSharp;
[HttpGet("{owner}/{repo}")]
public async Task<FileContentResult> Get(string owner, string repo)
{
    var contributorsPage = $"https://github.com/{owner}/{repo}/graphs/contributors";

    // Headless = false shows the browser window, which is handy for a live demo.
    using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions
    {
        Headless = false
    }))
    using (var page = await browser.NewPageAsync())
    {
        await page.GoToAsync(contributorsPage);
        // Wait until the contributor cards exist before taking the screenshot.
        await page.WaitForSelectorAsync(".contrib-person");
        // Capture only the contributors element, not the whole page.
        var element = await page.QuerySelectorAsync("#contributors");
        var image = await element.ScreenshotDataAsync();
        return File(image, "image/png");
    }
}
19.
// Route parameter names must match the method parameters (author, post).
[HttpGet("{author}/{post}")]
public async Task<FileContentResult> Get(string author, string post)
{
    var postUrl = $"https://medium.com/{author}/{post}";

    using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions
    {
        Headless = true
    }))
    using (var page = await browser.NewPageAsync())
    {
        await page.GoToAsync(postUrl);
        await page.WaitForSelectorAsync("HEADER");
        // Remove the site header so it does not end up in the PDF.
        await page.EvaluateExpressionAsync("document.querySelector('HEADER').remove();");
        var pdf = await page.PdfDataAsync();
        return File(pdf, "application/pdf");
    }
}
23.
// "options" is a LaunchOptions instance defined outside this snippet.
var url = "https://www.despegar.com.ar/shop/flights/results/roundtrip/BUE/MDZ/2018-12-01/2018-12-08/1";

using (var browser = await Puppeteer.LaunchAsync(options))
using (var page = await browser.NewPageAsync())
{
    // Networkidle0: wait until there are no in-flight network requests.
    await page.GoToAsync(url, WaitUntilNavigation.Networkidle0);
    await page.WaitForSelectorAsync("buy-button");

    // Read the first listed price, or "0" if no results were rendered.
    var bestPrice = await page.EvaluateFunctionAsync<string>(@"() => {
        var elements = document.querySelectorAll('.main-content .price-amount');
        return elements.length ? elements[0].innerText : '0';
    }");

    Console.WriteLine($"Best price for Mendoza: {bestPrice}");
    // Keep the browser open for a minute (demo only).
    await Task.Delay(60000);
}
25.
[Fact]
public async Task ShouldHonorThePrice()
{
    //Previous Code

    // Click the link inside the first buy button to go to checkout.
    var clickElement = await page.EvaluateExpressionHandleAsync(@"
        document.querySelectorAll('.main-content buy-button:first-child A')[0]")
        as ElementHandle;
    await clickElement.ClickAsync();

    await page.WaitForSelectorAsync(".price-container .amount");
    var checkoutPrice = await page.EvaluateExpressionAsync<string>(@"
        document.querySelectorAll('.price-container .amount')[0].innerText
    ");

    // The checkout price must match the price shown in the results list.
    Assert.Equal(bestPrice, checkoutPrice);
}
31. Don’ts
● DDoS Attacks
● Unethical Web Scraping
● Fake page loads
● Credential Stuffing
32. Puppeteer the world!
● Puppeteer Recorder
● Rendertron
● Checkly
● Contributors!
34. The power of a Star
Editor's Notes
● Contributors images
● Chrome tells you when it runs in automation mode