The document summarizes a project on web scraping and exploratory data analysis (EDA) of TVs listed on Flipkart. Key points:
1. Data on various TVs and their features, such as price, display size and resolution, was scraped from Flipkart and analyzed.
2. The data was cleaned and explored using visualizations such as box plots and heatmaps to find correlations between features.
3. Pixel, display size and pixel rows were found to have a strong positive correlation with price. Ultra HD TVs had higher ratings and were available only on certain operating systems.
4. The conclusion was that the best Ultra HD TV could be purchased for Rs. 25,000 on Android.
1. By: Anil Khare
Submitted to: DATA FOLKZ
Project: Web Scraping with EDA
Web Scraping with EDA
Capstone Project 1
Data Analytics
2. Introduction
• There are many ways to access a website, for example through a browser, which can view only one page at a time.
• Web scrapers are excellent at gathering and processing large amounts of data.
• Rather than viewing one page at a time, we can view databases spanning thousands of pages at once.
• Web scraping is the practice of gathering data through any means other than a program interacting with an API.
• Web scraping encompasses a wide variety of programming techniques and technologies, and pairs naturally with exploratory data analysis (EDA).
3. Uses of Web Scraping
Web scraping is used for:
• Social media sentiment analysis
• E-commerce pricing
• Investment opportunities
• Machine learning
• Email address gathering
• Research and development, and many more applications
4. The Project: Web Scraping with EDA
The topic given for the data analysis project was 'Web Scraping with EDA'.
I chose to scrape the Flipkart website for the various TVs available.
The problem statement was 'Getting the best proposal of TV with best available features in lowest price.'
The steps followed for the project:
1. Scrape the Flipkart website for the various TVs available on the site.
2. Request was used to open the URL.
3. The Beautiful Soup library was used to parse the HTML and get the raw data.
4. The data was then cleaned using various techniques in Python.
5. The Data
A structured dataset of 24 rows and 12 columns was saved in the CSV file 'Capstone 111.csv'. The features (columns) of the data are:
Column_Name | Data type
TV Name | object
OS | object
HD | object
Speaker(in_W) | int64
refresh_rate(in_Hz) | int64
USB(in_Nos) | int64
Price(in_Rs) | int64
Rating | float64
HDMI(in_Nos) | int64
pixel | float64
Display size (in inch) | int64
pixel_rows | int64
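As a rough sketch of how scraped strings might be turned into the numeric columns listed above, the helpers below strip non-digit characters and split a resolution string. The raw input formats ("₹40,999", "20 W", "3840 x 2160") and the idea that `pixel` and `pixel_rows` come from a single resolution string are assumptions for illustration; the transcript does not show the raw scraped values.

```python
import re


def to_int(text):
    """Keep only the digits of a scraped string and convert them to int."""
    return int(re.sub(r"[^\d]", "", text))


def split_resolution(text):
    """Split a resolution string such as '3840 x 2160' into (pixel, pixel_rows)."""
    width, height = re.findall(r"\d+", text)[:2]
    return int(width), int(height)


# Hypothetical raw record; the string formats are assumed, not taken from the source.
raw = {"Price(in_Rs)": "₹40,999", "Speaker(in_W)": "20 W", "resolution": "3840 x 2160"}

clean = {
    "Price(in_Rs)": to_int(raw["Price(in_Rs)"]),
    "Speaker(in_W)": to_int(raw["Speaker(in_W)"]),
}
clean["pixel"], clean["pixel_rows"] = split_resolution(raw["resolution"])
```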
6. The Data
TV Name | OS | HD | Speaker(in_W) | refresh_rate(in_Hz) | USB(in_Nos) | Price(in_Rs) | Rating | HDMI(in_Nos) | pixel | Display size (in inch) | pixel_rows
Mi 4X 13 | Android Based | Ultra HD (4K) | 20 | 60 | 2 | 40999 | 4.4 | 3 | 3840 | 55 | 2160
LG | WebOS | Ultra HD (4K) | 20 | 50 | 2 | 34999 | 4.4 | 3 | 3840 | 43 | 2160
Mi 4X 12 | Android | Ultra HD (4K) | 20 | 60 | 2 | 34999 | 4.4 | 3 | 3840 | 50 | 2160
SAMSUNG | Tizen | Full HD | 20 | 60 | 1 | 31999 | 4.4 | 2 | 1920 | 43 | 1080
LG | WebOS | Full HD | 20 | 50 | 1 | 29999 | 4.4 | 2 | 1920 | 43 | 1080
Mi 4X | Android | Ultra HD (4K) | 20 | 60 | 2 | 27999 | 4.4 | 3 | 3840 | 43 | 2160
Mi 4X | Android | Ultra HD (4K) | 20 | 60 | 2 | 27999 | 4.4 | 3 | 3840 | 43 | 2160
iFFALCON by TCL | Android | Ultra HD (4K) | 24 | 60 | 1 | 27999 | 4.3 | 3 | 3840 | 50 | 2160
iFFALCON by TCL 107 | Android | Full HD | 20 | 60 | 1 | 26999 | 4.2 | 2 | 1920 | 43 | 1080
OnePlus Y Series | Android | Full HD | 20 | 60 | 2 | 25499 | 4.3 | 2 | 1920 | 43 | 1080
Mi 4A Pro | Android | Full HD | 20 | 60 | 3 | 24999 | 4.4 | 3 | 1920 | 43 | 1080
Vu Premium | Android | Full HD | 24 | 60 | 2 | 24999 | 4.3 | 2 | 1920 | 43 | 1080
iFFALCON by TCL | Android | Ultra HD (4K) | 24 | 60 | 1 | 24999 | 4.3 | 2 | 3840 | 43 | 2160
realme | Android | Full HD | 24 | 60 | 2 | 23999 | 4.3 | 3 | 1920 | 43 | 1080
Mi 4A | Android | Full HD | 20 | 60 | 2 | 21999 | 4.4 | 3 | 1920 | 40 | 1080
iFFALCON by TCL 10 | Android | Full HD | 20 | 60 | 1 | 19999 | 4.3 | 2 | 1920 | 40 | 1080
SAMSUNG | Tizen | HD Ready | 20 | 60 | 1 | 17999 | 4.3 | 2 | 1366 | 32 | 768
SAMSUNG | Tizen | HD Ready | 20 | 60 | 1 | 17490 | 4.4 | 2 | 1366 | 32 | 768
7. Box Plot for Outliers
As per the box plot, there are no outliers in price in the data.
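A box plot conventionally flags points lying more than 1.5 × IQR beyond the quartiles, so the same check can be done numerically. The sketch below assumes that rule. The helper name `iqr_outliers` and the sample prices are illustrative only, not values from the project's dataset.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Illustrative prices (not from the dataset): first a tight cluster, then one extreme value.
sample_prices = [20000, 22000, 24000, 25000, 27000, 30000]
no_outliers = iqr_outliers(sample_prices)
with_outlier = iqr_outliers(sample_prices + [90000])
```

An empty result corresponds to a box plot with no points drawn beyond the whiskers.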
8. Minimum & Maximum Price Analysis
Feature | Max_Price | Min_Price
TV Name | Mi 4X 13 | Mi 4A PRO
OS | Android Based | Android
HD | Ultra HD (4K) | HD Ready
Speaker(in_W) | 20 | 20
refresh_rate(in_Hz) | 60 | 60
USB(in_Nos) | 2 | 2
Price(in_Rs) | 40999 | 14499
Rating | 4.4 | 4.4
HDMI(in_Nos) | 3 | 3
pixel | 3840 | 1366
Display size (in inch) | 55 | 32
pixel_rows | 2160 | 768
The observations from the data are:
The minimum price is Rs. 14999/- and the maximum price is Rs. 40999/-.
Comparing the minimum-price and maximum-price rows, only the features OS, HD, pixel, pixel rows and display size differ.
9. Heatmap for the Data
From the heatmap we can infer that only pixel, display size and pixel rows have a strong positive correlation with price.
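The heatmap's reading can be cross-checked by computing Pearson correlation coefficients directly. The sketch below uses only the 18 rows visible in the data table of this transcript (the full CSV has 24 rows), so the coefficients may differ from those in the original heatmap. The `pearson` helper is written out by hand for illustration.

```python
from math import sqrt

# Columns taken from the 18 rows visible in the transcript table (full CSV: 24 rows).
price = [40999, 34999, 34999, 31999, 29999, 27999, 27999, 27999, 26999,
         25499, 24999, 24999, 24999, 23999, 21999, 19999, 17999, 17490]
pixel = [3840, 3840, 3840, 1920, 1920, 3840, 3840, 3840, 1920,
         1920, 1920, 1920, 3840, 1920, 1920, 1920, 1366, 1366]
display = [55, 43, 50, 43, 43, 43, 43, 50, 43,
           43, 43, 43, 43, 43, 40, 40, 32, 32]
# pixel_rows follows from pixel in the table: 3840 -> 2160, 1920 -> 1080, 1366 -> 768.
pixel_rows = [{3840: 2160, 1920: 1080, 1366: 768}[p] for p in pixel]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


corr_with_price = {
    "pixel": pearson(pixel, price),
    "Display size (in inch)": pearson(display, price),
    "pixel_rows": pearson(pixel_rows, price),
}
```

On these rows all three coefficients come out positive, in line with the inference drawn from the heatmap.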
10. Comparison HD vs Price vs OS
Ultra HD (4K) TVs have a rating of 4.4, higher than Full HD and HD Ready TVs.
Android TVs have a rating of 4.3, compared with 4.4 for Android Based, WebOS and Tizen.
11. Comparison HD and OS
Ultra HD (4K) TVs are available only on the Android, Android Based and WebOS operating systems.
Tizen is available in Full HD and HD Ready only.
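This availability comparison amounts to grouping the OS and HD columns against each other. A minimal sketch follows, using the (OS, HD) pairs from the 18 rows visible in the transcript table. The variable names are illustrative.

```python
from collections import defaultdict

# (OS, HD) pairs from the 18 rows visible in the transcript table.
rows = [
    ("Android Based", "Ultra HD (4K)"), ("WebOS", "Ultra HD (4K)"),
    ("Android", "Ultra HD (4K)"), ("Tizen", "Full HD"), ("WebOS", "Full HD"),
    ("Android", "Ultra HD (4K)"), ("Android", "Ultra HD (4K)"),
    ("Android", "Ultra HD (4K)"), ("Android", "Full HD"), ("Android", "Full HD"),
    ("Android", "Full HD"), ("Android", "Full HD"), ("Android", "Ultra HD (4K)"),
    ("Android", "Full HD"), ("Android", "Full HD"), ("Android", "Full HD"),
    ("Tizen", "HD Ready"), ("Tizen", "HD Ready"),
]

os_by_hd = defaultdict(set)  # HD type -> operating systems offering it
hd_by_os = defaultdict(set)  # operating system -> HD types it is sold in
for os_name, hd in rows:
    os_by_hd[hd].add(os_name)
    hd_by_os[os_name].add(hd)
```

Looking up `os_by_hd["Ultra HD (4K)"]` and `hd_by_os["Tizen"]` gives the two groupings described on this slide.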
12. Comparison of HD vs Price
Price range for Ultra HD (4K):
25000-30000: 2 nos.
35000: 1 no.
Above 40000: 1 no.
Price range for Full HD:
20000-25000: 4 nos.
25000-30000: 3 nos.
30001-35000: 1 no.
Price range for HD Ready:
Below 15000: 1 no.
15000-20000: 5 nos.
13. Comparison Price vs OS
Price range: Rs. 25000-30000
Android Based: Nil
Tizen: Nil
WebOS: 1 no.
Android: 5 nos.
Price range: Rs. 35000
Android Based: Nil
Tizen: Nil
WebOS: 1 no.
Android: 1 no.
Price range: above Rs. 40000
Android Based: Nil
Tizen: Nil
WebOS: 1 no.
Android: 1 no.
14. The Conclusion
An Ultra HD (4K) TV can be purchased on the Android operating system for Rs. 25000/-.
For WebOS the price is Rs. 35000/-.
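The conclusion can be reproduced by taking the cheapest Ultra HD (4K) listing per operating system. The sketch below uses only the Ultra HD rows visible in the transcript table, so it is a check on those rows rather than on the full 24-row dataset. The variable names are illustrative.

```python
# (OS, price in Rs) for the Ultra HD (4K) rows visible in the transcript table.
ultra_hd = [
    ("Android Based", 40999), ("WebOS", 34999), ("Android", 34999),
    ("Android", 27999), ("Android", 27999), ("Android", 27999),
    ("Android", 24999),
]

# Keep the lowest price seen for each operating system.
cheapest_by_os = {}
for os_name, price in ultra_hd:
    if os_name not in cheapest_by_os or price < cheapest_by_os[os_name]:
        cheapest_by_os[os_name] = price
```

On these rows the lowest Ultra HD prices are Rs. 24999 on Android and Rs. 34999 on WebOS, which round to the figures quoted in the conclusion.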