This is a bug bounty hunter presentation given at Nullcon 2016 by Bugcrowd's Faraz Khan.
Learn more about Bugcrowd here: https://bugcrowd.com/join-the-crowd
Reconnaissance denotes the work of information gathering before any real attacks are planned. The idea is to collect as much interesting information as possible about the target. The methodology described here increases the number of assets available for testing, and thus the scope for finding vulnerabilities.
Fun with Ruby and Redis, ArrrrCamp edition — Javier Ramirez (teowaki)
In this talk I give an introduction to Redis, explain how some big names (Twitter, Pinterest...) are using it, describe some pitfalls, and explain how we use Redis at teowaki.
Progressive Enhancement is one of the most important and useful software engineering tools in our web development toolbox, but in practice it's largely ignored. We'll dive into the basics of PE, the common pitfalls (think <noscript> and the newer class="no-js"), how to support Blackberry 4.x and IE6 without a ton of extra work, tools to avoid that violate PE best practices, and how to apply PE pragmatically.
Google I/O 2012 - Protecting your user experience while integrating 3rd party... — Patrick Meenan
The amount of 3rd-party content included on websites is exploding (social sharing buttons, user tracking, advertising, code libraries, etc.). Learn tips and techniques for how best to integrate it into your sites without risking a slower user experience or even your sites becoming unavailable.
Video is available here: http://www.youtube.com/watch?v=JB4ulhFFdH4&feature=plcp
Want to know the REAL reason people are scared of spiders? Here we look at some of the silly myths about spiders and show you some of the amazing things you would be able to do if you were a spider.
Spiders: do they scare you, or do you scare them?
Actually, it doesn't matter, because some spiders scare you and you scare some spiders. The only difference lies in their venom and their power. Here we share the top five most venomous spiders in the world, which often scare us, if not kill us.
Yes, these spiders can easily kill humans and domestic animals by injecting venom into the body.
This presentation gives a detailed insight into spiders, including what they are, how they live, how they hunt, how they defend themselves etc. Please do enjoy!
Scraping the web with Laravel, Dusk, Docker, and PHP — Paul Redmond
Jumpstart your web scraping automation in the cloud with Laravel Dusk, Docker, and friends. We will discuss the types of web scraping tools, the best tools for the job, and how to deal with running Selenium in Docker.
Code examples @ https://github.com/paulredmond/scraping-with-laravel-dusk
Stefan Judis "Did we(b development) lose the right direction?" — Fwdays
Keeping up with the state of web technology is one of the biggest challenges for us developers today. We invent new tools; we define new best practices, everything’s new, always... And we do all that for good user experience! We do all that to build the best possible web – it’s all about our users.
But is it, really? Or do developers like to play with technology secretly loving the new and shiny? Or do we only pretend that it’s about users, and behind closed doors, it’s developer experience that matters to us? Did we lose direction? Is it time for a critical look at the state of the web and the role JavaScript plays in it?
In this guide, we will go over the core concepts of large-scale web scraping, from challenges to best practices. Large-scale web scraping means scraping many web pages and extracting data from them, either manually or with automated tools. The extracted data can then be used to build charts and graphs, create reports, and perform other analyses, such as measuring traffic on a website or the number of visitors it receives. It can also be used to test different versions of a website, so that you know which version gets more traffic.
Large-scale web scraping is an essential tool for businesses, as it allows them to analyze their audience's behavior across different websites and compare which performs better. It is a task that requires a lot of time, knowledge, and experience; it is not easy, and there are many challenges to overcome in order to succeed. Performance is one of the most significant challenges.
The main reasons for this are the size of web pages and the number of links resulting from the increased use of AJAX, which make it difficult to scrape data from many web pages accurately and quickly. Web structure is the most crucial challenge in scraping: the structure of a web page is complex, and it is hard to extract information from it automatically. This problem can be solved with a web crawler developed specifically for the task.
Anti-Scraping Techniques
Another major challenge when scraping websites at large scale is anti-scraping: techniques for blocking scraping scripts from accessing a site. If a site's server detects that it is being accessed by an automated client, it can respond by blocking that client and preventing the scraping script from reaching the content. Large-scale web scraping also involves a lot of data and is challenging to manage; it is not a one-time process but a continuous one requiring regular updates. Here are some of the best practices for large-scale web scraping:
1. Create a Crawling Path
The first step in scraping data at scale is to create a crawling path. Crawling is the systematic exploration of a website and its content to gather information.
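As a minimal sketch of this step, a crawling path can be built breadth-first from a set of seed URLs. The toy link graph below stands in for real HTTP fetches, and all URLs are hypothetical:

```python
from collections import deque

def build_crawl_path(seeds, get_links, max_pages=100):
    """Breadth-first exploration: visit each URL once, queueing newly
    discovered links. get_links(url) stands in for fetching a page
    and extracting its outgoing links."""
    to_visit = deque(seeds)
    seen = set(seeds)
    visited = []
    while to_visit and len(visited) < max_pages:
        url = to_visit.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                to_visit.append(link)
    return visited

# Toy link graph used in place of real HTTP fetches.
graph = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b"],
    "https://example.com/b": [],
}
path = build_crawl_path(["https://example.com/"], lambda u: graph.get(u, []))
```

In a real crawler, `get_links` would download the page and parse anchors out of the HTML; the queue-and-seen-set structure stays the same.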
2. Data Warehouse
The data warehouse is a central store of enterprise data that is consolidated and analyzed to provide the business with valuable information.
3. Proxy Service
A proxy service is a great way to scrape data at large scale. It can be used when scraping images, blog posts, and other types of data from the Internet.
4. Detecting Bots & Blocking
Being detected and blocked as a bot is a real problem for scraping. Scraping bots extract data from websites and make it available for human consumption. To avoid blocks, they use software designed to mimic a human user, so that when the bot does something on a website, it looks like a real human user doing it.
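To illustrate what scrapers are up against, here is a toy sketch of the kind of server-side heuristic a site might use: flag clients whose User-Agent names a bot, or that request faster than a human plausibly could. The `BotDetector` name and the thresholds are invented for this example, not a real product:

```python
import time

KNOWN_BOT_TOKENS = ("bot", "crawler", "spider", "scraper")

class BotDetector:
    """Toy server-side heuristic. Thresholds are illustrative,
    not tuned production values."""

    def __init__(self, max_requests=20, window_seconds=10.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = {}  # client ip -> list of request timestamps

    def is_bot(self, ip, user_agent, now=None):
        now = time.monotonic() if now is None else now
        # Rule 1: the User-Agent openly admits to being a bot.
        ua = user_agent.lower()
        if any(token in ua for token in KNOWN_BOT_TOKENS):
            return True
        # Rule 2: too many requests from one IP inside the window.
        times = [t for t in self.history.get(ip, []) if now - t < self.window]
        times.append(now)
        self.history[ip] = times
        return len(times) > self.max_requests

detector = BotDetector(max_requests=3, window_seconds=10.0)
detector.is_bot("198.51.100.7", "Googlebot/2.1", now=0.0)  # True: UA names a bot
```

Real anti-bot systems add many more signals (TLS fingerprints, JavaScript challenges, mouse movement), which is exactly why human-mimicking scrapers exist.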
GraphRAG is All You Need? LLM & Knowledge Graph — Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf — 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Neuro-symbolic is not enough, we need neuro-*semantic* — Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... — BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Key Trends Shaping the Future of Infrastructure.pdf — Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... — UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
State of ICS and IoT Cyber Threat Landscape Report 2024 preview — Prayukth K V
The IoT and OT threat landscape report was prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence gathering facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Elevating Tactical DDD Patterns Through Object Calisthenics — Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Essentials of Automations: Optimizing FME Workflows with Parameters — Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
2. Spiders are programs which scan the web in a methodical and automated way. They copy all the pages they visit and leave them to the search engine for indexing. Not all spiders have the same job, though: some check links, collect email addresses, or validate code, for example. Some people call them crawlers, bots, and even ants or worms. (“Spidering” means to request every page on a site.)
3. A spider's architecture:
- Downloads web pages
- Stores content
- Queues URLs
- Co-ordinates the processes
5. The crawl list would look like this (although it would be much, much bigger than this small sample):
http://www.techcrunch.com/
http://www.crunchgear.com/
http://www.mobilecrunch.com/
http://www.techcrunchit.com/
http://www.crunchbase.com/
http://www.techcrunch.com/#
http://www.inviteshare.com/
http://pitches.techcrunch.com/
http://gillmorgang.techcrunch.com/
http://www.talkcrunch.com/
http://www.techcrunch50.com/
http://uk.techcrunch.com/
http://fr.techcrunch.com/
http://jp.techcrunch.com/
The spider will also save a copy of each page it visits in a database; the search engine will then index those pages. The first URLs given to the spider as a starting point are called “seeds”. The list gets bigger and bigger, and to make sure the search engine index stays current, the spider will need to re-visit those links often to track any changes. There are two lists: a list of URLs visited and a list of URLs to visit. The latter is known as “the crawl frontier”.
8. A re-visit policy that states when to check for changes to the pages.
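A re-visit policy can be as simple as a per-page interval. The sketch below assumes hypothetical per-URL intervals; real crawlers estimate change frequency from observed updates:

```python
def due_for_revisit(pages, now, default_interval=86400.0):
    """Uniform re-visit policy sketch: a page is due again once
    `interval` seconds have passed since its last crawl.
    `pages` maps URL -> (last_crawled, interval); values are illustrative."""
    return sorted(
        url for url, (last_crawled, interval) in pages.items()
        if now - last_crawled >= (interval or default_interval)
    )

pages = {
    "http://example.com/news": (0.0, 3600.0),     # assume it changes hourly
    "http://example.com/about": (0.0, 604800.0),  # assume it changes weekly
}
```

At `now=7200.0` only the news page is due; by `now=700000.0` both are, so a scheduler polling this function naturally re-crawls fast-changing pages more often.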
11. Build a spider
You can use any programming language that you feel comfortable with, although Java, Perl and C# are the most popular. You can also use these tutorials:
Java sun spider - http://tiny.cc/e2KAy
Chilkat in Python - http://tiny.cc/WH7eh
Swish-e in Perl - http://tiny.cc/nNF5Q
Remember that a poorly designed spider can impact overall network and server performance.
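One common way to keep a spider from hurting server performance is per-host rate limiting. A minimal Python sketch (the one-second default delay is an illustrative choice, not a standard):

```python
import time
from urllib.parse import urlparse

class PoliteThrottle:
    """Enforce a minimum delay between requests to the same host,
    so the spider does not hammer any one server."""

    def __init__(self, delay_seconds=1.0):
        self.delay = delay_seconds
        self.last_request = {}  # host -> monotonic timestamp of last request

    def wait(self, url):
        """Block until it is polite to request `url`, then record the time."""
        host = urlparse(url).netloc
        last = self.last_request.get(host)
        if last is not None:
            remaining = self.delay - (time.monotonic() - last)
            if remaining > 0:
                time.sleep(remaining)
        self.last_request[host] = time.monotonic()
```

A spider would call `throttle.wait(url)` immediately before each fetch; requests to different hosts are not delayed, so overall throughput stays high.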
12. Open-source spiders
You can use one of these for free (some programming knowledge helps in setting them up):
OpenWebSpider in C# - http://www.openwebspider.org
Arachnid in Java - http://arachnid.sourceforge.net/
Java-web-spider - http://code.google.com/p/java-web-spider/
MOMSpider in Perl - http://tiny.cc/36XQA
13. Robots.txt
This is a file that allows webmasters to give instructions to visiting spiders, which are expected to respect it. Some areas are off-limits.
Disallow all spiders from everything:
User-agent: *
Disallow: /
Disallow all, except Googlebot and BackRub, which may access everything but /private:
User-agent: Googlebot
User-agent: BackRub
Disallow: /private
...and churl, which may access everything:
User-agent: churl
Disallow:
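Python's standard library can evaluate rules like these. The sketch below feeds the slide's example rules to `urllib.robotparser` directly, with no network access; `example.com` is a placeholder host:

```python
from urllib.robotparser import RobotFileParser

# The example rules from the slide, parsed in-memory instead of
# fetched from a live /robots.txt.
rules = """\
User-agent: *
Disallow: /

User-agent: Googlebot
User-agent: BackRub
Disallow: /private

User-agent: churl
Disallow:
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

rp.can_fetch("SomeSpider", "http://example.com/page")    # False: * is barred
rp.can_fetch("Googlebot", "http://example.com/private")  # False
rp.can_fetch("Googlebot", "http://example.com/page")     # True
rp.can_fetch("churl", "http://example.com/private")      # True
```

A well-behaved spider calls `can_fetch` with its own user-agent string before queueing any URL.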
20. Share your results
List your spider in the database: http://www.robotstxt.org/db.html
21. Spider traps
Intentionally or not, traps sometimes crop up on the spider's path and stop it functioning properly. Dynamic pages, deep directories that never end, pages with special links and commands pointing the spider to other directories... anything that can put the spider into an infinite loop is an issue. You might, however, want to deploy a spider trap deliberately, for example if you know a spider is visiting your site and not respecting your robots.txt, or because it is a spambot.
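A spider can defend itself against such loops with cheap URL heuristics applied before queueing a link. A sketch, with illustrative thresholds:

```python
from urllib.parse import urlparse

def looks_like_trap(url, max_depth=8, max_repeats=2):
    """Heuristic trap filter: reject URLs whose paths are suspiciously
    deep or repeat the same segment, both common signs of a crawler
    stuck in a loop. Thresholds are illustrative, not standards."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Endless nested directories: /a/b/c/d/e/f/g/h/i/...
    if len(segments) > max_depth:
        return True
    # The same directory name appearing over and over: /a/b/a/b/a/b/...
    for seg in set(segments):
        if segments.count(seg) > max_repeats:
            return True
    return False

looks_like_trap("http://example.test/a/b/a/b/a/b/c")   # True: "a" repeats 3 times
looks_like_trap("http://example.test/blog/2024/post")  # False: shallow, no repeats
```

Filters like this drop some legitimate deep pages, so real crawlers usually combine them with per-site page budgets rather than relying on URL shape alone.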
22. Fleiner's spider trap
<html><head><title> You are a bad netizen if you are a web bot! </title>
<body><h1><b> You are a bad netizen if you are a web bot! </h1></b>
<!--#config timefmt="%y%j%H%M%S" --> <!-- of date string -->
<!--#exec cmd="sleep 20" --> <!-- make this page sloooow to load -->
To give robots some work here some special links: these are
<a href=a<!--#echo var="DATE_GMT" -->.html> some links </a>
to this <a href=b<!--#echo var="DATE_GMT" -->.html> very page </a>
but with <a href=c<!--#echo var="DATE_GMT" -->.html> different names </a>
You can download spider traps and find out more at Fleiner's page: http://www.fleiner.com/bots/#trap