This document provides an overview of advancements in microprocessor technology from 1965 to 2015. It traces how the focus has shifted from increasing clock speeds to improving efficiency through methods like multi-core designs that enable parallel processing, outlines key developments such as Intel's move toward multi-core architectures, and discusses the growing importance of software tools for optimizing performance, including how the Itanium processor family was designed to exploit instruction-level parallelism.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
I introduce developments in multi-core computers along with their architectural evolution, explain high-performance computing, where these systems are used, and conclude by introducing OpenMP with many ready-to-run parallel programs.
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI (inside-BigData.com)
Satoshi Matsuoka from RIKEN gave this talk at the HPC User Forum in Santa Fe.
"With the rapid rise of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and thus the need for R&D on HW and SW infrastructures where traditional simulation-based HPC and BD/AI would converge, in a BYTES-oriented fashion. Post-K is the flagship next-generation national supercomputer being developed by Riken and Fujitsu in collaboration. Post-K will have hyperscale-class resources in one exascale machine, with well more than 100,000 nodes of server-class A64fx many-core Arm CPUs, realized through an extensive co-design process involving the entire Japanese HPC community.
Rather than focusing on double-precision flops, which are of lesser utility, Post-K, especially its A64fx processor and the Tofu-D network, is designed to sustain extreme bandwidth on realistic applications, including those for oil and gas, such as seismic wave propagation, CFD, and structural codes, besting its rivals by several factors in measured performance. Post-K is slated to perform 100 times faster on some key applications compared to its predecessor, the K computer, and will also likely be the premier big data and AI/ML infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node."
Watch the video: https://wp.me/p3RLHQ-k6G
Learn more: https://en.wikichip.org/wiki/supercomputers/post-k
and
http://hpcuserforum.com
Tianhe-2 is the world's fastest supercomputer according to top500.org. This presentation is based on Tianhe-2: its software architecture, hardware architecture, specifications, motivating factors, and several other key aspects are highlighted.
This presentation was prepared by me with four other Computer Science undergraduates from the University of Colombo School of Computing, Sri Lanka.
In this video from the Argonne Training Program on Extreme-Scale Computing 2019, Jeffrey Vetter from ORNL presents: The Coming Age of Extreme Heterogeneity.
"In this talk, I'm going to talk about the high-level trends guiding our industry. Moore’s Law as we know it is definitely ending for either economic or technical reasons by 2025. Our community must aggressively explore emerging technologies now!"
Watch the video: https://wp.me/p3RLHQ-lic
Learn more: https://ft.ornl.gov/~vetter/
and
https://extremecomputingtraining.anl.gov/archive/atpesc-2019/agenda-2019/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...) (EUDAT)
Giuseppe will present the differences between high-performance and high-throughput applications. High-throughput computing (HTC) refers to computations where individual tasks do not need to interact while running. It differs from high-performance computing (HPC), where frequent and rapid exchange of intermediate results is required to perform the computations. HPC codes are based on tightly coupled MPI, OpenMP, GPGPU, and hybrid programs and require low-latency interconnected nodes. HTC can make use of unreliable components, distributing the work out to every node and collecting results at the end of all parallel tasks.
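The HTC model above can be sketched in a few lines of Python. This is an illustrative toy, not material from the talk: the `simulate` function and its inputs are invented stand-ins for independent work units, and a thread pool stands in for a pool of distributed worker nodes.

```python
# Toy sketch of the HTC model: tasks run with no interaction, and
# results are only gathered once every parallel task has finished.
# (An HPC code would instead exchange intermediate results while
# running, e.g. over MPI on a low-latency interconnect.)
from concurrent.futures import ThreadPoolExecutor

def simulate(seed):
    # Invented stand-in for one independent work unit,
    # e.g. a single point of a parameter sweep.
    return seed * seed

def run_htc(seeds):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(simulate, seeds))  # collect at the end

print(run_htc(range(5)))  # [0, 1, 4, 9, 16]
```

Because the tasks never communicate, losing a worker only means re-running its task, which is why HTC tolerates unreliable components.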
Visit: https://www.eudat.eu/eudat-summer-school
Moore's law is the observation that the number of transistors in a dense integrated circuit doubles approximately every two years. The observation is named after Gordon Moore, the co-founder of Fairchild Semiconductor and Intel, whose 1965 paper described a doubling every year in the number of components per integrated circuit, and projected this rate of growth would continue for at least another decade. In 1975, looking forward to the next decade, he revised the forecast to doubling every two years. The period is often quoted as 18 months because of Intel executive David House, who predicted that chip performance would double every 18 months (being a combination of the effect of more transistors and the transistors being faster).
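The different doubling periods quoted above lead to very different growth curves; a quick back-of-envelope calculation (illustrative only) makes the gap concrete.

```python
# Project transistor-count growth under Moore's-law-style doubling.
def projected_growth(years, doubling_months):
    # Number of doublings that fit in the span, as a growth factor.
    return 2 ** (years * 12 / doubling_months)

# Fifty years of doubling every 24 months: 25 doublings.
print(projected_growth(50, 24))  # 33554432.0 (2**25)
# The same span at House's 18-month pace grows roughly 320x further.
print(projected_growth(50, 18) / projected_growth(50, 24))
```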
IBM NeXtScale - the next generation of dense computing
transtec HPC solutions with IBM NeXtScale are highly dense systems for the workloads that are currently growing fastest, such as social media, analytics, technical computing and cloud applications. NeXtScale has been developed with standard components and provides up to three times as many cores in one rack unit (1U) when compared to previous versions.
The increasing use of this workload and delivery model generates increased demands on data centres. Operators are on the look-out for new technologies that can deal with the current demands with the highest possible level of performance and the lowest possible level of power consumption. NeXtScale is the latest addition to the transtec IBM x86 portfolio: Developed to allow applications with the power of a "supercomputer" to run in data centres – via a simple, flexible and open architecture.
Session four of my series on many cores turns to data, both big and small. It looks at MapReduce, but approaches it sideways, from a classic computer science perspective.
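That classic computer-science view of MapReduce (pure map and reduce functions joined by a shuffle) can be sketched in-process; the word-count example below is the usual textbook illustration, not code from the session.

```python
# Single-process sketch of the MapReduce pattern: map emits
# (key, value) pairs, shuffle groups them by key, and reduce folds
# each group. Real frameworks run each phase across many nodes.
from collections import defaultdict

def map_phase(doc):
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

def word_count(docs):
    pairs = [pair for doc in docs for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))

print(word_count(["big data", "big cores"]))  # {'big': 2, 'data': 1, 'cores': 1}
```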
Unum Computing: An Energy Efficient and Massively Parallel Approach to Valid ... (inside-BigData.com)
In this deck, John Gustafson presents: An Energy Efficient and Massively Parallel Approach to Valid Numerics.
"Written by one of the foremost experts in high-performance computing and the inventor of Gustafson’s Law, The End of Error: Unum Computing explains a new approach to computer arithmetic: the universal number (unum). The unum encompasses all IEEE floating-point formats as well as fixed-point and exact integer arithmetic. This new number type obtains more accurate answers than floating-point arithmetic yet uses fewer bits in many cases, saving memory, bandwidth, energy, and power."
Watch the video presentation: http://wp.me/p3RLHQ-dTk
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
History and Future Trends of Multicore Computer Architecture (ijcga)
The multicore technology concept is centered on the possibility of parallel computing, which can boost computer efficiency and speed by integrating two or more CPUs (central processing units) on a single chip. A multicore architecture places multiple processor cores together and groups them as one physical processor. The primary goal is to develop a system that can handle and complete more than one task at the same time, thereby achieving better system performance in general. This paper describes the history and future trends of multicore computer architecture.
The rush to the edge and new applications around AI are causing a shift in design strategies toward the highest performance per watt, rather than the highest performance or lowest power.
Moore’s Law is slowing, but more importantly the world is changing from PCs to smartphones and cloud computing, where improvements continue to occur. Improvements are still occurring in other types of ICs, such as wireless, GPUs, and 3D camera chips, because they lag microprocessors and parallel processing is easier on them than on microprocessors. Data centers are also experiencing rapid improvements as changes in architecture are made, particularly for analyzing unstructured data, i.e., Big Data. These slides discuss the implications for new services in areas such as smartphones, software, and Big Data. The last one-third of the slides summarize alternatives to silicon and to the von Neumann architecture.
Moore’s Law Effect on Transistors Evolution (Editor IJCATR)
Over time, the increase in the number of transistors has had a great effect on the performance and speed of processors. In this paper we compare the evolution of transistors against Moore’s Law, according to which the number of transistors should double every 24 months. Increasing processor design complexity also increases power consumption and the cost of design efforts. In this paper we discuss methods and procedures to scale the hardware complexity of processors.
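One standard way to see the power-consumption trade-off mentioned above is the dynamic-power relation for CMOS logic, P ≈ a·C·V²·f (activity, capacitance, voltage, frequency). The numbers below are invented for illustration and are not from the paper.

```python
# Dynamic CMOS power scales as activity * capacitance * V^2 * frequency.
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    return activity * capacitance * voltage ** 2 * frequency

# One fast core at 1.2 V and 3 GHz (illustrative values):
single = dynamic_power(1.0, 1.2, 3.0e9)
# Two slower cores at 1.0 V and 1.8 GHz, 3.6 GHz of aggregate clock:
dual = 2 * dynamic_power(1.0, 1.0, 1.8e9)
# More aggregate cycles for less total power, thanks to the V^2 term.
print(dual < single)  # True
```

The quadratic voltage term is why scaling out to more, slower cores can beat pushing one core's clock higher, which is the design shift this document describes.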
Moore’s Law is a computing term which originated around 1970 and states that overall processing power for computers will double every two years. The size reduction enables product engineering to innovate and produce gadgets which are smaller in size and yet powerful, fast, and consume less power. What used to fit in a building now fits in your pocket.
Peter Biantes | Computing and The Future.pdf (Peter Biantes)
The future of computing that was once envisaged is now here. Technology, and by extension the internet and the computer, have taken over the world. This generation takes the presence of advanced technology and devices for granted, unlike in the past, when such things didn't exist.
According to corporate advisor Peter Biantes, technology has helped humanity by introducing computers, programming languages, and so on.
Micro Server Design - Open Compute Project (Hitesh Jani)
The Micro Server is a cluster of low-power, high-density servers which can be applied to growing workloads such as distributed computing, cloud computing, the Internet of Things (IoT) and Big Data.
Fifty Years of Microprocessor Technology Advancements
Page 1
ISA White Paper
Fifty Years of Microprocessor Technology Advancements: 1965 to 2015
Page 2
Table of Contents
Executive Summary............................................................................. 3
Changing the Focus: More Cores Instead of More Transistors .......... 4
Replacing the x86 Paradigm ............................................................... 6
Advancing to Multi-Core Designs ........................................................ 7
Leveraging Software Tools for Optimal Performance ......................... 9
Optimizing Microarchitecture through Parallelism............................... 9
Combining Performance with Security .............................................. 11
Further Advancing the Microprocessor: 2015 and Beyond............... 11
Summary Conclusion ........................................................................ 12
Page 3
Executive Summary
Everyone who works in the computer industry is familiar with Moore's Law and the doubling of the number of transistors (an approximate measure of computer processing power) every 18 to 24 months. Until recently, overall microprocessor performance was often described in terms of processor clock speeds, expressed in megahertz (MHz) or gigahertz (GHz).
Today there's far more than clock speed to consider when you're evaluating how a given processor will perform for a given application and where it fits on the performance scale. Microprocessor designers today are more focused on methods that leverage the latest silicon production processes and designs that minimize microprocessor footprint size, power consumption and heat generation.
Designers are also concerned with microarchitecture optimization, multi-processing parallelism, reliability, designed-in security features, memory structure efficiency and better synergy between the hardware and accompanying software tools, such as compilers. The more attention that a designer devotes to refining the efficiency of the software code rather than making the hardware responsible for dynamic optimization, the higher the ultimate system performance will be.
As an example, the Intel® Itanium® processor family has been designed around small-footprint cores that are remarkably compact in terms of transistor count, especially when one considers the amount of processing work that they accomplish. Itanium has taken instruction-level parallelism to a new level, and this can be used in conjunction with thread-level parallelism to leverage more processor cores and more threads per core to produce higher performance.
Some microprocessor designs of the past have been overly complex and have
relied on out-of-order logic to reshuffle and optimize software instructions. Going
forward, microprocessor designers will continue to deliver better and better
software tools, higher software optimization and better compilers.
Because it is so efficient and so small and doesn't depend on out-of-order logic,
the latest generation Itanium processor can deliver higher performance without
creating thermal generation problems. This makes Itanium a very simple yet
efficient and refined engine that enables more consistent long-term improvement
in code execution via small improvements in software, thus reducing the need for
significant advancements in hardware. Such hardware advancements are becoming more and more difficult to accomplish; even Gordon Moore believes that the exponential upward curve in microprocessor hardware advancements “can’t continue forever.”
Changing the Focus: More Cores Instead of More Transistors
Microprocessor advancement accelerated rapidly in 1968 when Fairchild Semiconductor veterans Robert Noyce and Gordon Moore – joined at once by Andy Grove – founded Intel Corporation in Mountain View, California, to develop new technologies for silicon-based chips.
In 1971, Intel developers successfully embedded a central processing unit,
memory, input and output controls on the world's first single chip microprocessor,
the Intel 4004 (U.S. Patent #3,821,715) (“Inventors of the Modern Computer,” by
Mary Bellis, About.com, undated).
Moore was well known in the industry for a 1965 article where he predicted the
doubling of the number of transistors on microprocessors (an approximate
measure of processing power) every 18 to 24 months. For more than 40 years,
what became known as Moore’s Law has accurately described the pace of
advancements that have driven the semiconductor industry to reach over $200
billion in annual revenue today and serve as the foundation of the trillion-dollar
electronics industry. (“Moore Optimistic on Moore’s Law,” Intel Corporation,
http://www.intel.com/technology/silicon/mooreslaw/eml02031.htm)
But even Gordon Moore, as quoted recently by Techworld, says Moore’s Law
“can't continue forever.” (“Moore's Law is dead, says Gordon Moore” by Manek
Dubash, Techworld, April 13, 2005). Heat generation and circuit power
requirements become more and more of a barrier as transistor density and
processor clock speeds increase.
Either directly or indirectly, processor clock speed, expressed in Megahertz
(MHz) or Gigahertz (GHz), was once the common reference point used to predict
how a given processor would perform for a given application and where a
processor ranked on performance comparisons. However, microprocessor
designers today are more focused on other properties that enable processors to
deliver higher performance. Among these are methods that leverage the latest
silicon production processes and designs that minimize microprocessor footprint
size, power consumption and heat generation.
Designers are also concerned with microarchitecture optimization, multi-
processing parallelism, reliability, designed-in security features, memory
structure efficiency and better synergy between the hardware and accompanying
software tools, such as compilers and system and applications libraries. The
more attention that a designer devotes to refining the efficiency of the software
code rather than making the hardware responsible for dynamic optimization, the
higher the ultimate system performance will be.
Widely available microprocessors in 1993 had around three million transistors
while the Intel® Itanium® processor currently has nearly one billion transistors. “If this rate continued,” writes science and technology journalist Geoff Koch, “Intel
processors would soon be producing more heat per square centimeter than the
surface of the sun—which is why the problem of heat is already setting hard
limits to frequency (clock speed) increases.” (“Discovering Multi-Core: Extending
the Benefits of Moore's Law,” by Geoff Koch, Technology@Intel Magazine
(online), undated)
Increasing processor performance without producing excessive heat is a
challenge that can be solved, in part, by dual- and multi-core processor
architecture, according to researchers at IDC (“The Next Evolution in Enterprise
Computing,” by Kelly Quinn, Jessica Yang and
Vernon Turner, IDC, April 2005).
Multi-core chips produce higher performance without a proportionate increase in
power consumption and only a minimal increase in heat generation. By
increasing the number of cores rather than the number of transistors on a single
core, performance advancements can continue indefinitely by leveraging the
operational benefits of microprocessor parallelism and process concurrency.
“If we were to continue down the Gigahertz path, the power requirements and
heat problems of processors would get out of hand,” says Paul Barr, a
technology manager at Intel. Looking ahead over the next three to four years,
Barr expects to see as much as a 10x boost in microprocessor performance due
to multi-core processors and multi-threaded applications. “Multi-core is the next
generation,” Barr explains. “It’s just a natural progression of Moore’s law.” (Paul
Barr, Intel, interviewed by the Itanium Solutions Alliance.)
Today there's far more than clock speed to consider when you're evaluating how
a given processor will perform for a given application and where it fits on the
performance scale.
“There is no doubt that the whole industry has shifted the focus away from ramping
clock speed and improving ILP (instruction level parallelism) to increasing
performance by exploiting TLP (thread level parallelism),” says popular European
technology writer Johan De Gelas. (“Itanium - is there light at the end of the tunnel?”
by Johan De Gelas, Ace’s Hardware, Nov 9, 2005)
Writing in Dr. Dobb's Journal, developer Herb Sutter observes: “Like all
exponential progressions, Moore’s Law must end someday, but it does not seem
to be in danger for a few more years yet. Despite the wall that chip engineers
have hit in juicing up raw clock cycles, transistor counts continue to explode and
it seems CPUs will continue to follow Moore’s Law-like throughput gains for some
years to come.” (“The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software,” by Herb Sutter, Dr. Dobb's Journal, March 2005.)
“As you move to multiple-core devices, scaling the frequency higher isn't as
important as the ability to put multiple cores on a chip,” says Dean McCarron,
principal analyst at Mercury Research Inc., quoted in eWeek. (“Intel Swaps Clock Speed for Power Efficiency,” by John G. Spooner, eWeek, August 15, 2005)
This is not to say that single-core clock speeds won’t increase – they will – but future advancements will happen at a slower pace. At the same time, dual-core processors are expected to offer substantial performance improvements over single-core designs by running somewhat faster and by taking full advantage of dual-core benefits such as parallelism, along with support for the Message Passing Interface (MPI), a standardized API created by the MPI Forum (www.mpi-forum.org) and typically used for parallel and/or distributed computing.
“The immutable laws of physics don't necessarily lead to hard limits for computer
users,” Geoff Koch adds. “New chip architectures built for scaling out instead of
scaling up will offer enhanced performance, reduced power consumption and
more efficient simultaneous processing of multiple tasks.” (“Discovering Multi-
Core: Extending the Benefits of Moore's Law,” by Geoff Koch, Technology@Intel
Magazine (online), undated).
Replacing the x86 Paradigm
As Thomas Kuhn once observed (“The Structure of Scientific Revolutions,” Thomas Kuhn, 1962.), scientific advancement is not evolutionary, but is rather a
“series of peaceful interludes punctuated by intellectually violent revolutions”, and
in those revolutions “one conceptual world view is replaced by another.”
The conceptual framework of the Intel x86 or 80x86 microprocessor architecture
was first introduced by Intel in 1978, and has since followed an unwavering path
of advancement. Starting with the 16-bit instruction set of the 8086 (whose lineage traces back to Intel's earlier 8-bit designs), the x86 ISA grew to a 32-bit instruction set. These labels (8-bit, 16-bit, 32-bit)
designate the number of bits that each of the microprocessor's general-purpose
registers (GPRs) can hold. The term “32-bit processor” translates to “a processor
with GPRs that store 32-bit numbers.” Similarly, a “32-bit instruction” is an
instruction that operates on 32-bit numbers.
A 32-bit address space allows the CPU to directly address 4 GB of data. Though
4 GB once seemed gargantuan, the size requirements of memory-intensive
applications such as multimedia programs or database query engines are often
much higher. In response to this shortcoming, the 64-bit microarchitecture has
increased RAM addressability from 4 GB to a theoretical 18 million terabytes (roughly 1.8 × 10^19 bytes). However, since the virtual address space of x86-64 is 48 bits and the physical address space is 40 bits, in practice the addressable space is only about 256 terabytes. The 64-bit processor also has registers and arithmetic logic units
that can manipulate larger instructions (64-bits worth) at each processing step.
Since 64-bit processors can handle chunks of data and instructions twice as
large as 32-bit processors, the 64-bit microarchitecture should theoretically be
able to process twice as much data per clock cycle as the 32-bit microprocessor.
Unfortunately, that's not quite true. As Jonathan Stokes writes, “only applications
that require and use 64-bit integers will see a performance increase on 64-bit
hardware that is due solely to a 64-bit processor's wider registers and increased
dynamic range.” (“An Introduction to 64-bit Computing and x64,” by Jonathan
Stokes, arstechnica.com, undated.)
Advancing to Multi-Core Designs
In order to advance beyond current x86 capabilities without adding more
transistors, microprocessor designs are now headed down two paths. One path
represents an embellishment of the traditional x86 design – multiple cores
packaged on one board running parallel threads on each.
The other path – Intel Itanium – represents one of those paradigm shifts
described by Kuhn as a “revolution in which one conceptual world view is
replaced by another.”
Conceptually, multi-core architecture refers to a single processor package
containing two or more processor “execution cores,” or computational engines
that deliver fully parallel execution of multiple software threads. The operating
system treats each of its execution cores as a discrete processor, with all
associated execution resources.
Multi-core processors thus deliver higher performance and greater efficiency
without the heat problems and other disadvantages experienced by single core
processors run at higher frequencies to squeeze out more performance. By
multiplying the number of cores in the processor, it is possible to dramatically
increase computing resources, deliver higher multithreaded throughput, and realize the benefits of parallel computing. (“Intel Multi-core Platforms,” Intel Corporation, www.intel.com/technology/computing/multi-core/index.htm, undated)
Although multi-core technology was first discussed by Intel in 1989
(“Microprocessors Circa 2000” by Intel VP Pat Gelsinger and others, IEEE
Spectrum, October 1989.), the company released its first dual-core processor in
April 2005 to mark the first step in its transition to multi-core computing. The
company is now engaged in research on architectures that could include dozens
or even hundreds of processors on a single die. (“Intel Multi-core Platforms,” Intel Corporation, www.intel.com/technology/computing/multi-core/index.htm, undated)
Intel has publicly committed itself to a vision of “moving beyond gigahertz” to
deliver greater value, performance and functionality with multi-core architectures and a platform-centric approach.
Central to this strategy, Intel and its industry partners allied through the Itanium
Solutions Alliance (founded in September, 2005) are making billions of dollars in
strategic technology investments to assure that the Intel Itanium processor
becomes the platform of choice for mission critical enterprise systems and
technical computing within four years. Alliance founding sponsors include Bull,
Fujitsu, Fujitsu Siemens Computers, Hitachi, HP, Intel, NEC, SGI and Unisys.
Charter members include BEA, Microsoft, Novell, Oracle, Red Hat, SAP, SAS
and Sybase. Over a dozen additional technology organizations have also joined
the alliance.
According to Lisa Graff, general manager of Intel's high-end server group,
Itanium is already being well accepted in the mission-critical systems
marketplace, with half of the world's 100 largest enterprises now deploying the
Itanium platform. Graff told CNET News “I think Itanium is the architecture for the
next 20 years. It's the newest architecture that has come out. It has the
headroom. I think the RISC architectures will run out of steam.” (“Itanium: A
cautionary tale,” By Stephen Shankland, CNET News.com, December 7, 2005).
Researchers at Gartner, Inc., quoted by CNET, estimate that the Itanium-based
server market is currently about 42,000 units, or about $2.6 billion in sales. By
2010, Itanium is projected by Gartner to expand to 234,000 servers, or about $7.7
billion in market value.
Leveraging Software Tools for Optimal Performance
Some microprocessor designs of the past have been overly complex and have
relied on out-of-order logic to reshuffle and optimize software instructions. Going
forward, designers will continue to deliver better and better software tools, higher
software optimization and better compilers.
Among semiconductor companies, Intel provides one of the strongest suites of
software tools to support and enhance the performance of microprocessors such
as the Intel Itanium. This includes an 18-month roadmap that Intel shares with its
major customers showing how its software tools will advance and evolve over
time. This enables Itanium developers to deliver higher performance to their
users and customers.
Among the most important software tools from Intel are a family of compilers
optimized for each of its microprocessor platforms, including Xeon and Itanium.
According to Intel, some of these compilers work cross platform, often enabling
one set of source code to be delivered for multiple platforms, occasionally
requiring a simple recompile for each platform. (John McHugh, Intel, interviewed
by the Itanium Solutions Alliance.)
Leveraging a compiler to enhance code performance for Itanium applications is a
key benefit for Itanium users, as explained by technology writer Johan De Gelas.
“The main philosophy behind Itanium is… that a compiler can statically schedule
instructions much better than a hardware scheduler, which has to decide this
dynamically in a few clock cycles… the compiler can search through thousands
of instructions ahead while the hardware scheduler can check only a few tens of
instructions for independent instructions. The compiler will make groups of
instructions that can be issued simultaneously without dependencies or
interlocks. These groups can be one or tens of instructions.” (“Itanium - is there
light at the end of the tunnel?” by Johan De Gelas, Ace’s Hardware, Nov 9, 2005)
In addition to Itanium compilers, Intel offers a math kernel library (MKL) and an integrated performance primitives library (IPP) that can significantly enhance
Itanium code performance for scientific cluster applications, video image
processing, audio processing and image recognition.
Optimizing Microarchitecture through Parallelism
As microprocessor technology continues to advance with new architectures such
as the Intel Itanium, efficiency and performance optimization become even more
critical. One approach to achieving this is multi-processing parallelism, together with a shift from instruction level parallelism to thread level parallelism.
The Intel Itanium processor family is designed around small footprint cores that
are remarkably compact in terms of transistor count, especially when one
considers the amount of processing work that they accomplish. Itanium has
taken instruction level parallelism to a new level, and this can be used to
leverage more processor cores and more threads per core to produce higher
performance.
When Intel introduced Itanium in 2001, the company made a commitment to a
design that takes a quantum leap in instruction level parallelism.
Instruction level parallelism is a process in which independent instructions
(instructions not dependent on the outcome of one another) execute concurrently
to utilize more of the available resources of a processor core and increase
instruction throughput. The ability of the processor to work on more than one instruction at a time improves the number of instructions completed per cycle compared with previous architectures. (“A Recent History of Intel Architecture,”
by Sara Sarmiento, Intel Corporation, undated.)
Contrast this with a processor equipped with thread-level parallelism that
executes separate threads of code. This could be one thread running from an
application and a second thread running from an operating system, or parallel
threads running from within a single application.
In the past, microprocessor advancements have been based on improving the
performance of a single thread. Since there was only one core per processor,
advancements were achieved by adding more transistors and increasing the
speed of each transistor to improve program performance. This, along with
various pipelining and code strategies, made it possible for multiple instructions
to be issued in parallel. (John Crawford, Intel, interviewed by the Itanium
Solutions Alliance.)
Moving forward, Itanium will leverage multiple cores and thread level parallelism to produce greater performance enhancements than would be possible through single cores and single-thread performance boosts.
This move toward chip-level multiprocessing architectures with a large number of
cores continues a decades-long trend at Intel, offering dramatically increased
performance and power characteristics. (“Platform 2015 Software: Enabling
Innovation in Parallelism for the Next Decade,” by David J. Kuck,
Technology@Intel Magazine, (online) undated). This also presents significant
challenges, including a need to make multi-core processors easy to program,
which is accomplished, in part, using the software tools described above.
Combining Performance with Security
The dramatic performance advancements delivered by Itanium make it possible
for additional security features to be incorporated into applications, enabling
developers to avoid the trade-offs often made between performance and
security.
Itanium provides four privilege levels that the operating system can leverage to
provide a clean separation between what the user can access versus what the
virtual machine can access. More importantly, Itanium provides a protection key
scheme based on container logic. With the appropriate application support, a
developer can compartmentalize everything contained in vast memory stores,
and, in effect, put a security wrapper around each process. (Dave Myron, Intel,
interviewed by the Itanium Solutions Alliance.)
Itanium’s ability to execute a high number of instructions per cycle means that developers can create security layers that run very efficiently to protect code stacks against a variety of attacks, and applications can exploit those same instructions per cycle to build efficient security layers of their own. In addition, floating point units unique to Itanium provide very fast encryption.
The combination of all of these features – parallelism, floating point units, privilege levels, and the protection key scheme – when leveraged by the right application, provides microarchitecture security that is second to none.
Further Advancing the Microprocessor: 2015 and Beyond
Considering the dramatic progress made in microprocessor design and
architecture since 1965, it’s risky to project what new technologies could become
available in ten to fifteen years. And yet, a group of future-minded researchers
are expressing optimism about the potential of tiny nanoelectronic components,
organic molecules, carbon nanotubes and individual electrons that could serve
as the underlying technology for a new generation of microprocessors emerging
around 2015.
There are limits to what can be accomplished with silicon, says Philip J. Kuekes,
a physics researcher at Hewlett-Packard Laboratories quoted in the New York
Times (“Chip Industry Sets a Plan For Life After Silicon,” by John Markoff, New
York Times, December 29, 2005). Kuekes says H-P is currently working on
molecular-scale nanotechnology switches that, it is hoped, will be able to
overcome some of the present-day technological challenges discussed
throughout this paper.
Kuekes, along with electrical engineers, chemists and physicists from around the world, is collaborating with various semiconductor manufacturers and suppliers,
government organizations, consortia and universities to promote advancements
in the performance of microprocessors and solve some of the challenges that
cast doubt on the continuation of the advancements described by Moore's Law.
The mid-term future of microprocessor advancements could very well be based
on nanotechnology designs that overcome the physical and quantum problems
associated with conventional silicon transistors and processor cores.
Summary Conclusion
The latest advancements in microprocessor technology are well represented
within the Intel Itanium Processor, delivering reliability, scalability, security,
massive resources, parallelism and a new memory model on a sound
microarchitectural foundation.
Because it is so efficient and so small and doesn't depend on out-of-order logic,
the latest generation Itanium processor delivers higher performance without
creating thermal generation problems. This makes Itanium a simple yet efficient
and refined engine that enables more consistent long-term improvement in code
execution via small improvements in software, thus reducing the need for
significant new advancements in hardware.
Microprocessor hardware improvements are becoming more and more difficult to accomplish; even Gordon Moore believes that the exponential upward curve in microprocessor hardware advancements “can’t continue forever.”
* * * * *