WebRobot is a London-based company that operates in the web scraping and mining industry. It aims to become a leader by building a scalable infrastructure for data acquisition as a web service, exploiting cloud computing and big data technologies. WebRobot's goal is to become a complete ETL service involving data extraction, web mining, machine learning, and big data analytics to help companies acquire and manage online data. The company is currently finalizing its first version and needs funding to further develop its algorithms and tools.
2. Innovative Big-Data Web Scraping Tech Company 2
HIGHLIGHTS
• What is WebRobot?
• The Problem
• How We Can Solve It
• Team
• Track Record
• Business Model
• Trends & Opportunities
• Main Competitors
• Target
• SWOT Analysis
• Some Numbers (sales, profit, clients)
• Investment Plan
3. 3
1. THE PROJECT
Description
WebRobot Ltd is a London-based company operating in the web scraping and web
mining industry, in which it aims to become the leader.
At WebRobot we are building a highly scalable data-acquisition infrastructure that
customers can use as a web service. It exploits cloud computing and big-data technologies,
as well as data-extraction and information-extraction algorithms.
WebRobot will be a strong ally to every company that needs to acquire heterogeneous
information from the web and wants to reduce its internal management costs. WebRobot’s
services will be a strategic resource essential to those companies' business success.
4. 4
1. THE PROJECT
The problem
Every company wishing to achieve, maintain, and improve its business success needs information (data)
on the market, its customers, and its competitors, but obtaining it is challenging.
The data must be good, reliable, and well organized, and they must then be managed properly.
The World Wide Web is made up of a huge amount of semi-structured and unstructured data.
Furthermore, its structure changes constantly.
Collecting all of these data is often very expensive.
For all these reasons, we need robust and scalable algorithms that can reduce this onerous
maintenance activity.
5. 5
1. THE PROJECT
How We Can Solve the Problem
We can guarantee algorithmic and structural scalability with automatic extraction features.
We offer a powerful solution in the form of a web service.
We integrate cloud computing with big-data technologies applied in the more general web mining
context.
We provide visual support tools and an SDK for connecting to our stack.
WebRobot’s goal is to become a complete ETL service involving data extraction,
web mining, machine learning, and big-data analytics.
6. 6
2. THE TEAM
• Roger Giuffrè, CEO and CTO: 71% of equity
• Mediterraneo Capital Ltd: 25% of equity
• Denis Giuffrè, CCO and CMO: 4% of equity
• Antonio Censabella, CFO
7. 7
3. TRACK RECORD
We are finalizing the first version of the web service, which will include the serverless version on
AWS Lambda and Amazon EMR.
We need to integrate the wrapper induction algorithms directly into the Spark context. This will help us
refine them with the latest academic findings.
The API implementation is essentially finished. We still have to complete the usability studies of the current
interface.
We need to complete the dashboard, which will be released under an open-source license.
We have to design visual tools to support generating the ETL.
We have a new query grammar to set up.
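The deck does not describe its wrapper induction algorithms, so the following is only a minimal illustrative sketch of the general idea, not WebRobot's method: learn an extraction rule (here, a tag path) from one labelled example page, then reuse it on other pages built from the same template. The function names `induce_wrapper` and `apply_wrapper`, the path notation, and the sample HTML are all hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not stay on the path stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Record a (tag path, text) pair for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open elements, e.g. ["html", "body", "div.item"]
        self.nodes = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        css_class = dict(attrs).get("class")
        self.stack.append(f"{tag}.{css_class}" if css_class else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element (tolerates unclosed tags).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("/".join(self.stack), text))


def text_nodes(html):
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return parser.nodes


def induce_wrapper(example_html, target_text):
    """Learn the tag path that leads to the labelled value in the example page."""
    for path, text in text_nodes(example_html):
        if text == target_text:
            return path
    raise ValueError("target text not found in example page")


def apply_wrapper(path, html):
    """Extract every text node that sits at the learned tag path."""
    return [text for node_path, text in text_nodes(html) if node_path == path]


EXAMPLE = ('<html><body><div class="item"><h2>Red mug</h2>'
           '<span class="price">4.50</span></div></body></html>')
OTHER = ('<html><body><div class="item"><h2>Blue mug</h2>'
         '<span class="price">5.25</span></div>'
         '<div class="item"><h2>Cup</h2>'
         '<span class="price">3.00</span></div></body></html>')

if __name__ == "__main__":
    wrapper = induce_wrapper(EXAMPLE, "4.50")
    print(wrapper)
    print(apply_wrapper(wrapper, OTHER))
```

A rule this simple breaks as soon as the page template changes, which is the maintenance burden the deck refers to; real wrapper induction generalizes over several labelled examples rather than a single one.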
8. 8
4. THE BUSINESS MODEL
The Strategy
We will release the service on the Amazon marketplace in three commercial packages:
Entry-Level, Professional, and Enterprise.
Our average selling price could be around €0.0008 per page scraped, but we will
distinguish between static pages and dynamic pages, which need more complex algorithms.
We have verified that the execution costs in a serverless environment and on an EMR cluster can
guarantee us a margin of at least 50%. This margin acts as a cost constraint in our pricing policy.
In the future, we will integrate a web agents marketplace and adopt a B2B2C paradigm to close the gap
with end users and with actual use cases.
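The pricing constraint above can be worked through as a short calculation. This is a sketch that uses only the two figures stated on the slide (an average price of €0.0008 per page and a minimum margin of 50%); the function names are hypothetical, and the static versus dynamic price split is left out because the deck gives no figures for it.

```python
from decimal import Decimal

PRICE_PER_PAGE_EUR = Decimal("0.0008")  # average selling price stated on the slide
MIN_MARGIN = Decimal("0.50")            # minimum margin stated on the slide


def max_cost_per_page(price=PRICE_PER_PAGE_EUR, margin=MIN_MARGIN):
    """Highest execution cost per page that still leaves the required margin."""
    return price * (1 - margin)


def revenue(pages, price=PRICE_PER_PAGE_EUR):
    """Revenue in EUR for a given number of scraped pages."""
    return price * pages


if __name__ == "__main__":
    print("Cost ceiling per page (EUR):", max_cost_per_page())
    print("Revenue for 1 million pages (EUR):", revenue(1_000_000))
```

`Decimal` is used instead of floats so that prices this small are not distorted by binary rounding.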
9. 9
5. THE MARKET AND COMPETITORS
Trends and opportunities
Markets: web scraping, web mining, data analytics.
Size: an estimated value of $2 billion in 2020 alone.
Growth: based on market research, we expect further growth in the
coming years, driven by (1) the ever-greater centrality of data in the entire
business process and (2) companies' increasing tendency to outsource
the above-mentioned activities.
10. 10
5. THE MARKET AND COMPETITORS
Main Competitors
Diffbot: an API for data extraction that uses machine-learning heuristics and features to crawl
pages. Unfortunately, the results are not 100% precise.
Scrapinghub: a cloud service focused on the Scrapy framework. It offers each service
separately, plus automatic extraction functions that are still in beta. However, the results are
not always compliant.
Import.io: visual tools that customers can use to configure the extractors. However, it is particularly
expensive.
11. 11
6. TARGET
E-commerce companies that require algorithmic pricing and competition monitoring.
Big companies that produce press reviews and carry out social media analysis, opinion mining, and
sentiment analysis.
Hedge funds and financial institutions, for which information such as financial data and sentiment
indicators is extremely important.
Marketing agencies that need web scraping for SEO and web marketing automation purposes.
Established and startup companies that run or are developing any kind of vertical search engine.
Startups and small businesses that can benefit from building dedicated applications on our stack.
12. 12
7. SWOT ANALYSIS
STRENGTHS
• Scalability.
• Self-service fast big-data extraction solution.
WEAKNESSES
• We need PhD resources to reinforce the algorithmic extraction.
• Very specialized high-tech service that requires an effort to make it user-friendly
(for non-technical users).
OPPORTUNITIES
• Global market with big expansion opportunities.
• Profitable niche with low competition.
RISKS
• Restrictive regulations on the use of personal data (in Europe), on data
collection (in Asia), on data referring to minors (worldwide).
13. 13
8. THE NUMBERS
We assume a medium-to-large customer that requires at least 1 million pages per day
at a price of €800.00 per day (potential global demand is 100 billion pages per day).

EUR (in thousands)   Year 2021   Year 2022   Year 2023   Year 2024
Sales                    2,880       7,200      13,248      20,160
Gross margin             1,440       3,600       6,624      10,080
Net margin               1,440       3,600       6,624      10,080
Num. customers              10          25          46          70
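The sales and gross-margin rows can be reproduced from the customer counts. The sketch below assumes a 360-day billing year and a gross margin of exactly 50%; neither is stated on this slide, but together with the €800 daily price per customer they match the published figures. All names are hypothetical.

```python
DAILY_PRICE_EUR = 800        # per customer, for 1 million pages per day (from the slide)
BILLING_DAYS_PER_YEAR = 360  # assumption: not stated in the deck, but it reproduces the table
CUSTOMERS_BY_YEAR = {2021: 10, 2022: 25, 2023: 46, 2024: 70}  # from the slide


def yearly_sales_keur(customers):
    """Yearly sales in thousands of EUR for a given number of customers."""
    return customers * DAILY_PRICE_EUR * BILLING_DAYS_PER_YEAR // 1000


def projection():
    """Sales and gross margin (assumed to be 50% of sales) per year, in thousands of EUR."""
    rows = {}
    for year, customers in CUSTOMERS_BY_YEAR.items():
        sales = yearly_sales_keur(customers)
        rows[year] = {"sales": sales, "gross_margin": sales // 2}
    return rows


if __name__ == "__main__":
    for year, row in projection().items():
        print(year, row["sales"], row["gross_margin"])
```

The net-margin row in the table is identical to the gross-margin row, so the sketch does not model it separately.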
14. 14
9. INVESTMENT PLAN
The investment strategy
First round: 9% of equity for €300k, at a pre-money valuation of €3 million.
Second round: 9% of equity for €2 million.
Third round: 9% of equity for €10 million.
We plan to eventually go public on the stock exchange.
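The relationship between investment, equity share, and valuation in these rounds can be checked with the standard post-money formula (share = investment / (pre-money + investment)). This is a sketch under that assumption; the deck only states a valuation for the first round, so the implied valuations for the later rounds are derived here, not taken from the deck, and the function names are hypothetical.

```python
from fractions import Fraction


def equity_share(pre_money, investment):
    """Fraction of the company sold: investment divided by post-money valuation."""
    return Fraction(investment, pre_money + investment)


def implied_pre_money(investment, share):
    """Pre-money valuation implied by selling `share` of the company for `investment`."""
    post_money = investment / share
    return post_money - investment


if __name__ == "__main__":
    nine_percent = Fraction(9, 100)
    # First round: figures stated in the deck.
    print("Round 1 share:", float(equity_share(3_000_000, 300_000)))
    # Later rounds: implied valuations, not stated in the deck.
    print("Round 2 implied pre-money:", float(implied_pre_money(2_000_000, nine_percent)))
    print("Round 3 implied pre-money:", float(implied_pre_money(10_000_000, nine_percent)))
```

With the stated first-round figures, the share works out to 1/11, slightly above the rounded 9% on the slide.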