This is a presentation I gave at Fog World Congress 2018 in San Francisco. It distills my findings from over 100 data sources and 40 years of historical data to show how data generation outpaces data movement (telecom) by 20% year over year. It will soon be published in a white paper.
The 20% Rule: How the seismic growth of data has always outgrown, and will always outgrow, telecom
1.
2. 40 Years of Data Generation and Communication in a Nutshell
or
Why the Edge/Fog is inevitable
or
The 20% Rule.
Perry Lea
Principal Technologist at Microsoft
Co-Founder and Distinguished Technologist at Rumble
3. Topics
Why the edge is required
Worldwide capacity to move data
Cost of data creation and transportation
The economics of data generation
Data generation and proliferation
4. Take Away
Executive Summary
• Edge computing is the natural state of computing and is inevitable.
• Economics and historical trends suggest that the gap between the amount of data generated and the ability to move that data grows by 20% YoY
5. Sources of Data
• Cisco VNI Study
• Hilbert and Lopez Global Data Generation
• Ethernet Alliance Telecom Capacity
• IDC Data Generation Study
• Public information from Intel, ARM, Nvidia, AMD, Seagate, Western Digital, and Micron.
• Telecom Transit Database (historical and projected) by DrPeering.
• FCC Federal Databases
• Securities and Exchange Commission
• IBM Research
• Federal Reserve Bank of St. Louis - Economic Database
• World Economic Forum Database
• Internet Trends Report (Code Conference)
• The Economist
• TSMC
• NCTA
• Dozens of other sources…
9. Factors Improving Data Generation
Gordon Moore
Intel - 1975
Chip density doubles every 18 months
Robert Dennard
IBM - 1974
As transistors get smaller, their power density stays constant.
Each generation scales transistors by 0.7x in size and increases speed by ~1.4x
Jonathan Koomey
Stanford - 2010
Performance per watt doubles every 1.57 years
10. There are exceptions…
• Dennard scaling has slowed down since 2016
• Moore’s law is now doubling transistor count every 24 months.
• Software Bloat – Wirth’s Law
|                        | 1998                   | 2008               | 2018            |
|------------------------|------------------------|--------------------|-----------------|
| Max transistor density | 10M (Intel Pentium II) | 2B (Intel Itanium) | 20B (AMD Ryzen) |
| Manufacturing steps    | 200                    | 400                | 1000            |
| Node size              | 130nm                  | 28nm               | 7nm             |
| Max frequency          | 450 MHz                | 1.8 GHz            | 3.6 GHz         |
| CAPEX                  | $2B                    | $29B               | $80B            |

• $10B to build a new fab
• Density growth has decreased, resulting in decreased yields and increased costs
• 5nm requires significant litho investment and quantum mitigation
• Clock frequency growth has flattened
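The scaling laws on the previous slide can be compared by converting each "doubles every N months" claim into an annual growth rate. A minimal sketch; the doubling periods are the figures quoted on the slides:

```python
# Convert a fixed doubling period into an annualized growth rate.
def annual_growth(doubling_months: float) -> float:
    """Annual growth rate implied by doubling every `doubling_months` months."""
    return 2 ** (12 / doubling_months) - 1

moore_classic = annual_growth(18)         # Moore's law, 18-month doubling
moore_today   = annual_growth(24)         # ~24-month doubling in recent years
koomey        = annual_growth(1.57 * 12)  # Koomey: perf/watt doubles every 1.57 years

print(f"Moore (18 mo):   {moore_classic:.0%}/yr")  # ~59%/yr
print(f"Moore (24 mo):   {moore_today:.0%}/yr")    # ~41%/yr
print(f"Koomey (1.57 y): {koomey:.0%}/yr")         # ~56%/yr
```

The 18-month vs 24-month shift matters: it drops the implied annual rate from roughly 59% to roughly 41%, which compounds into a large difference over a decade.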
16. Limits of Data Movement
Shannon Limit: the maximum data rate physically allowable
17. Costs of Data Motion
Wired Wireless
• Wired and wireless communication service and equipment costs show greater variability
• The reduction in hardware and service cost has not followed Moore’s law
(copper: -0.05%)
(fiber: -1.39%)
(wireless hardware: -3.4%)
(wireless services: -4.8%)
18. Why Does Data Motion Lag Moore's Law?
Telecom CAPEX
• Overhaul towers
• Dig up roads
• Lay wire
Reluctance to buy bandwidth
• Buying a faster computer or more flash storage pays off directly; buying more bandwidth may or may not have an effect
Service cost
• Bandwidth has always come with some form of recurring cost
19. Summary so Far
• The cost to generate a bit of data << the cost to transport that bit
• Telecom services and hardware have not followed Moore's law in reducing overall cost per transmitted bit
24. IoT Still Reaps the Rewards
[Image: printed sensors (~$0.01), ARM Cortex-M0, polymeric fabrication]
25. IoT Still Reaps the Rewards
|                   | 1998                        | 2018                                  |
|-------------------|-----------------------------|---------------------------------------|
| Product           | Intel 80C51                 | Microchip SAMA5D2                     |
| CPU               | Single 8-bit ALU @ 33 MHz   | 32-bit ARM A5 @ 500 MHz with NEON media processing FPU |
| Memory            | 64K addressable             | 4 GB DDR3/LPDDR3                      |
| IO                | Full-duplex UART, 32 GPIO   | 1024x768 LCD, 10/100 Ethernet, two USB 2.0, SPI, CAN, UART, 2-wire, I2C, ADC, 128 GPIO |
| Operating systems | None                        | Linux and modern RTOSes               |
| Die size          | ~3,000 transistors in 0.33 mm² | ~20,000,000 transistors in 0.4 mm² |
| Power             | 0.11 mW/MHz                 | 0.09 mW/MHz                           |
| Performance       | ~3 MIPS                     | 785 DMIPS                             |
| Cost              | Roughly $4.00 in 1998       | $3.97 in 2018                         |
• Edge devices have room to absorb Moore's advantages
• Edge devices still enjoy Koomey's scaling of power
26. The “New” Edge Use Cases
• Data aggregation
• Situational awareness
• Geolocation tracking
• Content caching
• Gateway services
• Edge rules and AI
• Edge security & health
• Denaturing & filtering
• Local control systems
27. Edge Data in a 2010 World
Courtesy Sandvine Corp.
Edge devices are data producers more so than data consumers:
• Device status
• Unstructured data generation
• Time-series correlated data
• 1 billion nodes
|                         | Average downstream | Average upstream |
|-------------------------|--------------------|------------------|
| 4G LTE                  | 27.33 Mbps         | 8.63 Mbps        |
| 5G theoretical (mmWave) | 20 Gbps            | 10 Gbps          |
| 5G realistic (mmWave)   | ~186 Mbps          | ~50 Mbps         |
| DOCSIS                  | 100.07 Mbps        | 33.58 Mbps       |
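To see why upstream rates matter for data-producing edge devices, consider how long a day of edge data takes to push up at the average upstream rates above. A minimal sketch; the 10 GB/day workload is a hypothetical figure, not from the study:

```python
BITS_PER_GB = 8e9  # decimal gigabyte, since network rates are decimal

def upload_hours(gigabytes: float, upstream_mbps: float) -> float:
    """Hours to upload `gigabytes` of data at a sustained upstream rate."""
    return gigabytes * BITS_PER_GB / (upstream_mbps * 1e6) / 3600

daily_gb = 10  # hypothetical per-node daily data production
for name, up_mbps in [("4G LTE", 8.63), ("DOCSIS", 33.58), ("5G realistic", 50)]:
    print(f"{name:13s}: {upload_hours(daily_gb, up_mbps):5.2f} h")
```

Even at the "realistic" 5G upstream rate, a single node spends a meaningful fraction of the day transmitting, which is one reason filtering at the edge becomes attractive.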
29. Edge Opportunities
• The next wave of data is the next killer app
• A single spigot to the cloud won't work; the cloud must be localized
• Latency effects need to be tempered for real-time control and streaming applications
• SLA costs need to be mitigated
30. The 20% Rule
[Chart: World Data Generated vs World Network Capacity, 2003-2020, log scale in EB/month.
Data growth (EMC, UNECE & IDC projection) vs network capacity projected growth (Cisco VNI).
The gap widens from 126x to 231x.]
31. About this Work
Perry Lea
Founder and Technologist for Rumble: www.rumblenow.com
Advisor and Technologist for SimplyLEDs: www.simplyleds.com
Author: “Internet of Things for Architects”
Principal Technologist for Microsoft: www.microsoft.com
Founder of Computational Vision: www.computationalvision.com
Editor's Notes
The Hilbert and Lopez study, for example, draws on 40 years of data in various formats and media. It considers not only digital data but analog data as well, such as radio in the 1980s and newsprint.
Today most data is digital, but an accurate study must account for all types of data to measure data growth year over year.
Much of the data reflects 40 years of aggregate data generation and communication.
Printed material
Radio
Television
Video
Data communication
Internet Traffic and Web Traffic
Mobile Traffic
Digitization and Digital Video
Streaming
We separate data generation from data transport. This distinction is important.
Data generation represents the local creation of data. It could be human-created or machine-created, structured (databases and time-correlated data) or unstructured (images and audio), and even analog or digital.
Data movement is the telecommunication and networking component of the equation. It considers the local and wide-area movement of data, from legacy wireline transport to MPLS systems to Internet and cloud data. It may or may not be Internet based.
Data motion requires physical components such as copper or fiber cables, as well as the energy of transport. For the volumes of data we are considering, the costs and energy are overwhelming.
As an anecdote, in my past life I worked on resolving data motion in CPU design. Making CPUs and processors faster has reached its limits, as we will see. The bulk of the time and energy goes to moving data: not just between a sensor and a cloud datacenter, but even between an ALU and DRAM. Data movement has impact at all levels of computing.
Dennard Scaling has slowed. Transistor counts still increase, but performance improvements are minimal. We are unable to increase clock frequencies any more without significant thermal issues.
The trend is to add more silicon, hardware assist, and CPUs to a single die without increasing frequency. This leads to dark silicon.
Moore's law has also slowed. It used to be a doubling every 18 months. However, expense and complexity in lithography and fabrication steps have made it prohibitively difficult to produce higher-density chips at good yields for logic and memory components.
Moore’s law has translated directly into end user cost.
Price for storage, computing, and semiconductors has tracked Moore’s law well through 2010
Even as the cost of a single die has increased YoY, overall cost has followed the law.
It is interesting to note that communication equipment follows a different curve in cost reductions YoY.
Whereas data storage and logic costs have followed Moore’s law as transistor density increases, actual communication equipment has not followed the same curve.
Nielsen’s Law of Broadband connectivity (Jakob Nielsen) is a good measure of overall communication bandwidth over time.
It states that high end bandwidth grows at approximately 50% per year.
It should be noted this is only downstream bandwidth. Upstream bandwidth has only increased by 30% YoY.
This is 10% less than Moore’s law.
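Compounding makes that "10% less" significant over time. A minimal sketch using the rates quoted in these notes (Nielsen's ~50% downstream, ~30% upstream, and the ~59%/yr equivalent of an 18-month Moore doubling):

```python
# Compare compounded growth of bandwidth (Nielsen) vs transistors (Moore).
MOORE = 0.59         # ~18-month doubling, annualized
NIELSEN_DOWN = 0.50  # high-end downstream bandwidth growth per Nielsen's law
NIELSEN_UP = 0.30    # upstream bandwidth growth (these notes' estimate)

def compound(rate: float, years: int) -> float:
    """Total growth multiple after `years` at a fixed annual `rate`."""
    return (1 + rate) ** years

years = 10
print(f"Moore over {years}y:      {compound(MOORE, years):6.0f}x")
print(f"Downstream over {years}y: {compound(NIELSEN_DOWN, years):6.0f}x")
print(f"Upstream over {years}y:   {compound(NIELSEN_UP, years):6.0f}x")
```

Over a decade, a 30%/yr upstream curve delivers roughly a 14x improvement while transistor counts grow by roughly 100x, which is the structural lag this talk is about.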
Bandwidth costs and speed follow nearly the same slopes.
Average bandwidth growth has slowed. This is partly due to technical limitations and hardware speeds, but also due to limited available spectrum and bandwidth; availability will improve with 5G spectrum space.
There are limits.
The upper limit on data rate is found with Shannon’s limit. Any communications medium is characterized by signal and noise.
Industry crams more data into a narrow slice of spectrum. Techniques like Raman amplification, larger OFDM constellations, and better fiber-optic or wireless quality can improve SNR, but these techniques are expensive to incorporate into a global telecommunication network.
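The Shannon-Hartley limit mentioned above is easy to compute directly. A minimal sketch; the 20 MHz / 30 dB channel below is an illustrative example, not a figure from the study:

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley limit: C = B * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# A 20 MHz channel at 30 dB SNR (linear SNR = 10**(30/10) = 1000):
capacity = shannon_capacity(20e6, 10 ** (30 / 10))
print(f"{capacity / 1e6:.0f} Mbit/s")  # ~199 Mbit/s
```

Note that capacity grows only logarithmically with SNR: doubling the signal power buys far less than doubling the spectrum, which is why spectrum is the scarce resource.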
Wired costs follow material and commodity pricing of copper, metals, and fiber
For example, between 2002 and 2008 the cost of copper soared from $0.50 per pound to $4.00 per pound.
Data Movement has always lagged the growth of the rest of the IT industry.
5G for example will require new MIMO based towers, femto cell and small cell infrastructure, as well as new silicon.
About every 5 years we have a set of technologies that allow us to create and store and transmit more content
Web and Broadband
MP3 and MP4 video compression and streaming
Social media and digital photos
HD, 4k and 8K resolution
The current path puts us on course to produce 2.6 zettabytes of data per year.
IoT and edge computing will nearly double that to 4.3 ZB per year
What we see is that during the early 2000s telecom capacity grew as broadband became prevalent and affordable and technologies like cellular became mainstream. In a sense, telecom could finally catch up to the IT revolution.
Now telecom capacity is leveling off and growth rates are no longer 50% YoY. We anticipate telecom capacity to grow at roughly 20% YoY. Even with the advent of mmWave spectrum, the growth will be capped.
Data generation, however, continues to grow and to find new use cases. We anticipate between 30% and 50% YoY growth in data generation.
This leads to the gap…
Sensor prices have fallen at a steeper slope than Moore's law. When sensor devices approach $0.20, consumers can support a "throwaway" model. This will happen with polymeric fabrication as well as printable electronics.
New use cases are generating the plethora of "new" data.
In particular, we have seen concentrated growth in:
Filtering: removing redundant or temporal data. To alleviate network pressure and bandwidth and service costs, edge devices are used to remove significant amounts of data; in most cases 98% of edge data is removed.
Control systems aren't new. In fact, they were the first connected things, dating back to 1935 when the early US power grid managed power stations using "pilot wires." Industry 4.0 has produced significant value by connecting brownfield devices.
Situational awareness edge devices combine AI and geolocation to provide systems that can respond to events depending on time, space, distance, or cause.
Data aggregation and caching provide the answer to latency issues at the edge. Content is cached and managed locally to avoid long hops.
Whereas internet bandwidth for the last 20 years has been dominated by downstream data, IoT and edge devices will be content producers. They will demand greater upstream bandwidth and better latency.
Cat 0, Cat 1, and Cat NB will help close some of the disparity between uplink and downlink speeds, but they are focused on low-bandwidth use cases. Unstructured edge data still faces significant hurdles.
The bulk of modern Internet data is unstructured: Netflix, FaceTime, YouTube, Skype.
A sizable amount remains structured or partially structured: Facebook and HTTP.
This data will continue to grow at a steady rate, but 1.7 ZB of it will be edge created.
Not all of it will (or can) move through our networks.
YoY there continues to be a disparity between the amount of content being generated and the ability, economics, or need to move that data. More data is produced than absorbed.
The average growth rate of this gap over the last 15 years has been 20% year over year.
We predict this will continue, as Moore's law still has room to play at the edge and in sensors, but service costs and data transmission will not follow the same curve.
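The 20% rule itself can be sketched as compound growth. A minimal sketch, assuming a ~40% YoY data-generation rate (the midpoint of the talk's 30-50% range), the projected 20% YoY capacity growth, and the chart's 126x starting gap as an illustrative baseline:

```python
# The "20% rule": data generation outgrows network capacity, so the
# generated-vs-movable gap widens by roughly 20% every year.
DATA_GROWTH = 0.40     # assumed YoY data generation growth (30-50% range)
NETWORK_GROWTH = 0.20  # projected YoY telecom capacity growth

def gap_after(years: int, start_gap: float = 126.0) -> float:
    """Multiple of data generated vs. network capacity after `years`."""
    ratio = (1 + DATA_GROWTH) / (1 + NETWORK_GROWTH)  # ~1.17x widening per year
    return start_gap * ratio ** years

for y in (0, 2, 4):
    print(f"year {y}: {gap_after(y):6.1f}x")
```

With these assumed rates, a 126x gap widens to roughly 230x within four years, which is in the same range as the chart's 126x-to-231x progression.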