2 A.C.S. Beck et al.

these conflicting design constraints in a sustainable fashion, and still allow huge fabrication volumes. Each challenge is developed in detail throughout the next chapters, providing an extensive literature review as well as setting out a promising research agenda for adaptability.

1.1 Performance Gap

The ability to increase the number of transistors inside an integrated circuit year after year, according to Moore’s Law, has sustained performance growth over time. However, this law, as known today, will no longer hold in the near future. The reason is very simple: the physical limits of silicon [11, 19]. Because of that, new technologies that will completely or partially replace silicon are arising. However, according to the ITRS roadmap , these technologies either have higher density levels but are slower than traditional scaled CMOS, or entirely the opposite: new devices can achieve higher speeds but with a huge area and power overhead, even if one considers future CMOS technologies.

Additionally, high performance architectures such as the widespread superscalar machines are reaching their limits. According to what is discussed in [3, 7], and , there are no novel research results in such systems regarding performance improvements. The advances in ILP (Instruction Level Parallelism) exploitation are stagnating: considering Intel’s family of processors, the overall efficiency (a comparison of processor performance at the same clock frequency) has not significantly increased since the Pentium Pro in 1995. The newest Intel architectures follow the same trend: the Core2 microarchitecture has not presented a significant increase in its IPC (Instructions per Cycle) rate, as demonstrated in .

Performance stagnation occurs because these architectures are pushing against some well-known limits of ILP . Therefore, even small increases in ILP have become extremely costly. One of the techniques used to increase ILP is the careful choice of the dispatch width.
However, the dispatch width has a serious impact on the overall circuit area. For example, the register bank area grows cubically with the dispatch width, considering a typical superscalar processor such as the MIPS R10000 .

In , the so-called “Mobile Supercomputers” are discussed, which are those embedded devices that will need to perform several computationally intensive tasks, such as real-time speech recognition, cryptography, and augmented reality, besides conventional ones such as word and e-mail processing. Even considering desktop computer processors, new architectures may not meet the requirements of future, more computationally demanding embedded systems, giving rise to a performance gap.
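To make the cubic growth of the register bank area quoted earlier in this section more tangible, the following minimal sketch computes the area relative to a baseline dispatch width. It assumes an idealized model in which area is exactly proportional to the cube of the dispatch width; the function name and the sample widths are hypothetical and serve only as an illustration.

```python
def relative_register_bank_area(dispatch_width: int, baseline_width: int = 1) -> float:
    """Return the register bank area relative to a baseline dispatch width.

    Assumption: area scales exactly with the cube of the dispatch width,
    an idealization of the cubic growth quoted in the text.
    """
    if dispatch_width < 1 or baseline_width < 1:
        raise ValueError("dispatch widths must be positive integers")
    return (dispatch_width / baseline_width) ** 3


if __name__ == "__main__":
    # Hypothetical dispatch widths, chosen only for illustration.
    for width in (1, 2, 4, 8):
        ratio = relative_register_bank_area(width)
        print(f"dispatch width {width}: {ratio:.0f}x the baseline area")
```

Under this assumption, doubling the dispatch width multiplies the register bank area by eight, which is why even a modest gain in ILP can be very expensive in silicon.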
1 Adaptability: The Key for Future Embedded Systems 3

1.2 Power and Energy Constraints

In addition to performance, one should take into account that potentially the largest problem in embedded systems design is excessive power consumption. Future embedded systems are expected not to exceed 75 mW, since batteries do not have an equivalent Moore’s law . Furthermore, leakage power is becoming more important and, while a system is in standby mode, leakage will be the dominant source of power consumption. Nowadays, in general purpose microprocessors, the leakage power dissipation is between 20 and 30 W (considering a total power budget of 100 W) .

One can observe that, in order to meet the power constraints, companies are migrating to chip multiprocessors to take advantage of the extra area available, even though there is still a huge potential to speed up single-threaded software. In essence, the stagnation in clock frequency increases, excessive power consumption, and the higher hardware costs of ILP exploitation, together with the foreseen slower technologies, are new architectural challenges that must be dealt with.

1.3 Reuse of Existing Binary Code

Among the thousands of products launched by consumer electronics companies, one can observe those which become a great success and those which completely fail. The explanation perhaps lies not just in their quality, but also in their standardization in the industry and in the final user’s concern about how long the product being acquired will be subject to updates.

The x86 architecture is one of these major examples. By today’s standards, the x86 ISA (Instruction Set Architecture) itself does not follow the latest trends in processor architectures. It was developed at a time when memory was considered very expensive and developers used to compete on who would implement more and different instructions in their architectures. The x86 ISA is a typical example of a traditional CISC machine.
Nowadays, the newest x86-compatible architectures spend extra pipeline stages plus a considerable area in control logic and microprogrammable ROM just to decode these CISC instructions into RISC-like ones. This way, it is possible to implement deep pipelining and all other high performance RISC techniques while maintaining the x86 instruction set and, consequently, backward software compatibility.

Although new instructions have been added to the original x86 instruction set, such as the MMX and SSE SIMD extensions , targeting multimedia applications, there is still support for the original 80 instructions implemented in the very first x86 processor. This means that any software written for any x86 in the past, even software launched at the end of the 1970s, can be executed on the latest Intel processor. This is one of the keys to the success of this family: the possibility of reusing the existing binary code, without any kind of modification. This was one of the main reasons
why this product became the leader in its market segment. Intel could guarantee to its consumers that their programs would not become obsolete for a long period of time and that, even when changing to a faster system, they would still be able to reuse and execute the same software.

Therefore, companies such as Intel and AMD keep implementing more power-consuming superscalar techniques, trying to push the increase in operating frequency to the extreme. Branch predictors with higher accuracy, more advanced algorithms for parallelism detection, or the use of Simultaneous Multithreading (SMT) architectures, like the Intel Hyperthreading , are some of the known strategies. However, the basic principle used for high performance architectures is still the same: superscalarity. As embedded products are more and more based on a huge amount of software development, the cost of sustaining legacy code will most likely have to be taken into consideration when new platforms come to the market.

1.4 Yield and Manufacturing Costs

In , the future of fabrication processes using new technologies is discussed. According to the authors, standard cells, as they are today, will no longer exist. As the manufacturing interface is changing, regular fabrics will soon become a necessity. How much regularity versus how much configurability is necessary (as well as the granularity of these regular circuits) is still an open question. Regularity can be understood as the replication of equal parts, or blocks, to compose a whole. These blocks can be composed of gates, standard cells, or standard blocks, to name a few. What is almost a consensus is that the freedom of the designers, represented by the irregularity of the project, will be more expensive in the future.
By using regular circuits, the design company can decrease costs, as well as the likelihood of manufacturing defects, since the reliability of printing the geometries employed today at 65 nm and below is a big issue. In  it is claimed that the main focus for researchers when developing a new system may become reliability, instead of performance.

Nowadays, the amount of resources needed to create an ASIC design of moderately high volume, complexity and low power is considered very high. Some design companies can still succeed in doing it because they have experienced designers, infrastructure and expertise. However, for the very same reasons, there are companies that just cannot afford it. For these companies, a more regular fabric seems the best way to go as a compromise for using an advanced process. As an example, in 1997 there were 11,000 ASIC design startups. This number dropped to 1,400 in 2003 . The mask cost seems to be the primary problem. For example, mask costs for a typical system-on-chip have gone from $800,000 at 65 nm to $2.8 million at 28 nm . This way, to maintain the same number of ASIC designs, their costs need to return to tens of thousands of dollars.

The costs of the lithography tool chain used to fabricate CMOS transistors are also a major source of high expenses. According to , the costs related to
lithography steppers increased from $10 to $35 million in this decade. Therefore, the cost of a modern factory varies between $2 and $3 billion. On the other hand, the cost per transistor decreases, because even though it is more expensive to build a circuit nowadays, more transistors are integrated onto one die.

Moreover, it is very likely that the design and verification costs are growing in the same proportion, impacting the final cost even more. For the 0.8 μm technology, the non-recurring engineering (NRE) costs were only about $40,000. With each advance in IC technology, the NRE costs have dramatically increased. NRE costs for a 0.18 μm design are around $350,000, and at 0.13 μm, the costs are over $1 million . This trend is expected to continue at each subsequent technology node, making it more difficult for designers to justify producing an ASIC using today’s technologies.

The time it takes for a design to be manufactured at a fabrication facility and returned to the designers in the form of an initial IC (turnaround time) is also increasing. Longer turnaround times lead to higher design costs, which may imply a loss of revenue if the design is late to the market.

Because of all the issues discussed above, there is a limit to the number of situations that can justify producing designs using the latest IC technology. Already in 2003, fewer than 1,000 out of every 10,000 ASIC designs had high enough volumes to justify fabrication at 0.13 μm . Therefore, if design costs and times for producing a high-end IC keep growing, just a few designs will justify their production in the future. The problems of increasing design costs and long turnaround times become even more noticeable due to increasing market pressures. The time available for a company to introduce a product into the market is shrinking. This way, the design of new ICs is increasingly being driven by time-to-market concerns.
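The volume argument can be illustrated with a small calculation. The sketch below amortizes the NRE figures quoted in this section over a production volume and estimates a break-even volume against a regular fabric. It rests on a deliberately simple cost model that is an assumption of this example, not something stated in the text: total cost equals NRE plus unit cost times volume, and the regular fabric is assumed to carry no NRE but a higher unit cost. The unit costs and volumes are hypothetical, and the 0.13 μm entry uses $1 million as a lower bound.

```python
import math

# NRE figures quoted in the text, in US dollars. The 0.13 um entry is a lower
# bound, since the text only states "over $1 million".
NRE_COST_USD = {"0.8 um": 40_000, "0.18 um": 350_000, "0.13 um": 1_000_000}


def nre_per_unit(nre_usd: float, volume: int) -> float:
    """Share of the NRE cost carried by each manufactured unit."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return nre_usd / volume


def break_even_volume(nre_usd: float, asic_unit_cost: float, fabric_unit_cost: float) -> int:
    """Smallest volume at which the ASIC is no more expensive in total.

    Assumed model: ASIC total = NRE + asic_unit_cost * volume,
    regular fabric total = fabric_unit_cost * volume (no NRE).
    """
    if fabric_unit_cost <= asic_unit_cost:
        raise ValueError("the regular fabric must have the higher unit cost")
    return math.ceil(nre_usd / (fabric_unit_cost - asic_unit_cost))


if __name__ == "__main__":
    volume = 100_000  # hypothetical production volume
    for node, nre in NRE_COST_USD.items():
        print(f"{node}: ${nre_per_unit(nre, volume):.2f} of NRE per unit")
    # Hypothetical unit costs of $5 (ASIC) and $25 (regular fabric).
    print(break_even_volume(NRE_COST_USD["0.13 um"], 5.0, 25.0), "units to break even")
```

Under these assumptions, a design that sells fewer units than the break-even volume is cheaper on the regular fabric, which matches the observation that only high-volume designs justify fabrication at the newest nodes.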
Nevertheless, there will be a crossover point where, if the company needs a more customized silicon implementation, it will have to bear the mask and production costs. However, economics are clearly pushing designers toward more regular structures that can be manufactured in larger quantities. A regular fabric would solve the mask cost and many other issues such as printability, extraction, power integrity, testing, and yield. Customization of a product, however, cannot rely solely on software programming, mostly for energy efficiency reasons. This way, some form of hardware adaptability must be present to ensure that low cost, mass produced devices can still be tuned for different application needs, without redesign and fabrication costs.

1.5 Memory

Memories have been a concern since the early years of computing systems. Whether due to size, manufacturing cost, bandwidth, reliability or energy consumption, special care has always been taken when designing the memory structure of a system. The historical and ever growing gap between the access time of memories and the throughput of processors has also driven the development of very advanced
and large cache memories, with complex allocation and replacement schemes. Moreover, the growing integration capacity of manufacturing processes has further fueled the use of large on-chip caches, which occupy a significant fraction of the silicon area in most current IC designs. Thus, memories nowadays represent a significant component of the overall cost, performance and power consumption of most systems, creating the need for careful design and dimensioning of the memory-related subsystems.

The development of memories for current embedded systems is supported mainly by the scaling of transistors. Thus, the same basic SRAM, DRAM and Flash cells have been used generation after generation with smaller transistors. While this approach improves latency and density, it also brings several new challenges. As leakage current does not decrease at the same pace as density increases, static power dissipation is already a major concern for memory architectures, leading to joint efforts at all design levels. While research at the device level tries to provide low leakage cells , research at the architecture level tries to power off memory banks whenever possible [13, 24]. Moreover, the reduced critical charge increases the soft error rates and places greater pressure on efficient error correction techniques, especially for safety-critical applications. The reduced feature sizes also increase process variability, leading to increased losses in yield. Thus, extensive research is required to maintain the performance and energy consumption improvements expected from the next generations of embedded systems, while not jeopardizing yield and reliability.

Another great challenge arises with the growing difficulties found in CMOS scaling. New memory technologies are expected to replace both the volatile and the non-volatile fabrics used nowadays.
These technologies should provide low power consumption, low access latency, high reliability, high density, and, most importantly, ultra-low cost per bit . As combining the required features in new technologies is a highly demanding task, several contenders arise as possible solutions, such as ferroelectric, nanoelectromechanical, and organic cells . Each memory type has specific tasks within an MPSoC. Since memory is a large part of any system nowadays, bringing obvious costs and energy dissipation problems, the challenge is to make its usage as efficient as possible, possibly using run-time or application-based information not available at design time.

1.6 Communication

With the increasing limitations in power consumption and the growing complexity of improving the current levels of ILP exploitation, the trend towards embedding multiple processing cores in a single chip has become a reality. While the use of multiple processors provides more manageable resources, which can be turned off independently to save power, for instance , it is crucial that they are able to communicate among themselves in an efficient manner, in order to allow actual acceleration with thread level parallelism. From the communication infrastructure one
expects high bandwidth, low latency, low power consumption, low manufacturing costs, and high reliability, with more or less relevance given to each feature depending on the application. Even though this may be a simple task for a small set of processors, it becomes increasingly complex for a larger set. Furthermore, aside from processors, embedded SoCs include heterogeneous components, such as dedicated accelerators and off-chip communication interfaces, which must also be interconnected. The number of processing components integrated within a single SoC is expected to grow quickly in the next years, exceeding 1,000 components in 2019 . Thus, the need for highly scalable communication systems is one of the most prominent challenges found when creating a multi-processor system-on-chip (MPSoC).

As classical approaches such as buses or shared multi-port memories have poor scalability, new communication techniques and topologies are required to meet the demands of the new MPSoCs with many cores and stringent area and power limitations. Among such techniques, networks-on-chip (NoCs) have received extensive attention over the past years, since they bring high scalability and high bandwidth as significant assets . With the rise of NoCs as a promising interconnection for MPSoCs, several related issues have to be addressed, such as the optimum memory organization, routing mechanism, thread scheduling and placement, and so on. Additionally, as all these design choices are highly application-dependent, there is great room for adaptability in the communication infrastructure as well, not only for NoCs but for any chosen scheme covering the communication fabric.

1.7 Fault Tolerance

Fault tolerance has gained more attention in the past years due to the intrinsic vulnerability that deep-submicron technologies have imposed.
As one gets closer to the physical limits of current CMOS technology, the impact of physical effects on system reliability is magnified. This is a consequence of the susceptibility of a very fragile circuit when exposed to many different types of extreme conditions, such as elevated temperatures and voltages, radioactive particles coming from outer space, or impurities present in the materials used for packaging or manufacturing the circuit. Regardless of the agent that causes the fault, the predictions about future nanoscale circuits indicate a major need for fault tolerance solutions to cope with the expected high fault rates .

Fault-tolerant solutions have existed since 1950, first for the purpose of working in the hostile and remote environments of military and space missions, and later to meet the demand for highly reliable mission-critical systems, such as banking systems, car braking, airplanes, telecommunication, etc. . The main problem of these solutions is that they aim to prevent a fault from affecting the system at any cost, since any problem could have catastrophic consequences. For this reason, in many cases, there is no concern with the area/power/performance overhead that the fault-tolerant solution may add to the system.
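The overhead of such "at any cost" solutions can be seen in a classical scheme. The text does not name a specific technique; triple modular redundancy (TMR) is used here only as a well-known example, chosen as an assumption of this sketch. Three replicas compute the same result and a bitwise majority voter masks a fault in any single replica, at the price of roughly three times the hardware plus the voter. The function names and bit patterns are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three replica outputs.

    Each output bit takes the value held by at least two of the replicas.
    """
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a single-bit upset by inverting one bit of a replica output."""
    return value ^ (1 << bit)


if __name__ == "__main__":
    golden = 0b1011                 # hypothetical fault-free result
    faulty = flip_bit(golden, 3)    # one replica suffers a single-bit fault
    voted = majority_vote(golden, golden, faulty)
    print("fault masked:", voted == golden)
```

The voter hides the fault only while at most one replica is wrong in a given bit position; if two replicas fail in the same bit, the wrong value wins the vote. This is one reason why high fault rates call for approaches that balance protection against area and power overhead.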
In this sense, the main challenge is to allow the development of high performance embedded systems, considering all the aspects mentioned before, such as power and energy consumption, applications with heterogeneous behavior, memory, etc., while still providing a highly reliable system that can cope with a large assortment of faults. Therefore, this ever-increasing need for fault-tolerant, high performance, low cost, low energy systems leads to an essential question: which is the best fault-tolerant approach targeted to embedded systems, one that is robust enough to handle high fault rates and has a low impact on all the other aspects of embedded system design? The answer changes with the application, the type of task and the underlying hardware platform. Once again, the key to solving this problem in its different instances lies in adaptive techniques to reduce cost and sustain performance.

1.8 Software Engineering and Development for Adaptive Platforms

Adaptive hardware imposes real challenges for software engineering, from requirements elicitation to the software development phases. The difficulties for software engineering arise from the high flexibility and large design space of adaptive hardware platforms. Besides the main behavior that the software implements, i.e. the functional requirements, an adaptive hardware platform unveils a wide range of non-functional requirements that must be met by the software under execution and supported by the software engineering process. Non-functional requirements are a burden to software development even nowadays. While it is somewhat known how to control some of the classical ones, such as performance or latency, the proper handling of those specifically important to the embedded domain, such as energy and power, is still an open research problem.
Embedded software has changed radically and at a fast pace within just a few years. Once highly specialized to perform just a few tasks, one at a time, such as decoding voice or organizing a simple phone book in the case of mobile phones, the software we find today in any mainstream smartphone contains several pieces of interconnected APIs and frameworks working together to deliver a completely different experience to the user. Embedded software is now multitasking and runs in parallel, since even mobile devices contain a distinct set of microprocessors, each one dedicated to a certain task, such as speech processing and graphics. These distinct architectures exist and are necessary to save energy. Wasting computational and energy resources is a luxury that resource-constrained devices cannot afford. However, this intricate and heterogeneous hardware, which supports more than one instruction set architecture (ISA), was designed to be resource-efficient, and not to ease software design and production. In addition, since there are potentially many computing nodes, parallel software designed to efficiently occupy the heterogeneous hardware is also mandatory to save energy. Needless to say, parallel software design is difficult. If the software is not well designed to take advantage of and efficiently use all the available ISAs, the software designer
will probably miss an optimal point of resource utilization, yielding energy-hungry applications. One can easily imagine several of them running concurrently, coming from unknown and distinct software publishers and implementing unforeseen functionalities, to get the whole picture of how challenging software design and development for these devices can be.

If adaptive hardware platforms are meant to be programmable commodity devices in the near future, the software engineering for them must transparently handle their intrinsic complexity, removing this burden from the code. In the adaptive embedded systems arena, software will continue to be the actual source of differentiation between competing products and of innovation for consumer electronics companies. A whole new environment of programming languages, software development tools, and compilers may be necessary to support the development of adaptive software or, at least, a deep rethink of the existing technologies. Industry uses a myriad of programming and modeling languages, versioning systems, and software design and development tools, just to name a few of the key technologies, to keep delivering innovation in its software products. The big question is how to make those technologies scale in terms of productivity, reliability, and complexity for the new and exciting software engineering scenario created by adaptive systems.

1.9 This Book

Industry faces a great number of challenges, at different levels, when designing embedded systems: designers need to boost performance while keeping energy consumption as low as possible, they must be able to reuse existing software code, and at the same time they need to take advantage of the extra logic available in the chip, represented by multiple processors working together. In this book we present and discuss several strategies to achieve such conflicting and interrelated goals, through the use of adaptability.
We start by discussing the main challenges designers must handle today and in the future. Then, we show different hardware solutions that can cope with some of the aforementioned problems: reconfigurable systems; dynamic optimization techniques, such as Binary Translation and Trace Reuse; new memory architectures; homogeneous and heterogeneous multiprocessor systems and MPSoCs; communication issues and NoCs; fault tolerance against fabrication defects and soft errors; and, finally, how to employ specialized software to improve this new scenario for embedded systems design, and how this new kind of software must be designed and programmed.

In Chap. 2, we show, with the help of examples, how the behavior of even a single thread execution is heterogeneous, and how difficult it is to distribute heterogeneous tasks among the components in a SoC environment, reinforcing the need for adaptability.

Chapter 3 gives an overview of adaptive and reconfigurable systems and their basic functioning. It starts with a classification of reconfigurable architectures,
including coupling, granularity, etc. Then, several reconfigurable systems are shown, and for those which are the most used, the chapter discusses their advantages and drawbacks.

Chapter 4 discusses the importance of memory hierarchies in modern embedded systems. The importance of carefully dimensioning the size or associativity of cache memories is presented by means of its impact on access latency and energy consumption. Moreover, simple benchmark applications show that the optimum memory architecture varies greatly according to software behavior. Hence, there is no universal memory hierarchy that will present maximum performance with minimum energy consumption for every application. This property creates room for adaptable memory architectures that aim at getting as close as possible to this optimum configuration for the application at hand. The final part of Chap. 4 discusses relevant works that propose such architectures.

In Chap. 5, Networks-on-Chip are shown, and several adaptive techniques that can be applied to them are discussed. Chapter 6 shows how dynamic techniques, such as binary translation and trace reuse, work to sustain adaptability and still maintain binary compatibility. We will also discuss architectures that present some level of dynamic adaptability, as well as the price to pay for such adaptability, and the kinds of applications for which it is well suited.

Chapter 7, about Fault Tolerance, starts with a brief review of some of the most used concepts concerning this subject, such as reliability, maintainability, and dependability, and discusses their impact on the yield rate and costs of manufacturing. Then, several techniques that employ fault tolerance at some level are demonstrated, with a critical analysis.

In Chap.
8 we discuss how important the communication infrastructure is for future embedded systems, which will have more heterogeneous applications being executed, and how the communication pattern might change aggressively from application to application, even with the same set of heterogeneous cores.

Chapter 9 puts adaptive embedded systems at the center of the software engineering process, making them programmable devices. This chapter presents techniques ranging from software inception, through functional and non-functional requirements elicitation and programming language paradigms, to automatic design space exploration. Adaptive embedded systems impose harsh burdens on software design and development, requiring us to devise novel techniques and methodologies for software engineering. At the end of the chapter, a proposed software design flow is presented, which helps to connect the techniques and methods discussed in the previous chapters and to put on technological grounds a research agenda for adaptive embedded software and systems.

References

1. Austin, T., Blaauw, D., Mahlke, S., Mudge, T., Chakrabarti, C., Wolf, W.: Mobile supercomputers. Computer 37(5), 81–83 (2004). doi:http://dx.doi.org/10.1109/MC.2004.1297253
2. Bjerregaard, T., Mahadevan, S.: A survey of research and practices of network-on-chip. ACM Comput. Surv. 38(1) (2006). doi:http://doi.acm.org/10.1145/1132952.1132953
3. Borkar, S., Chien, A.A.: The future of microprocessors. Commun. ACM 54(5), 67–77 (2011). doi:10.1145/1941487.1941507. http://doi.acm.org/10.1145/1941487.1941507
4. Burger, D., Goodman, J.R.: Billion-transistor architectures: there and back again. Computer 37(3), 22–28 (2004). doi:http://dx.doi.org/10.1109/MC.2004.1273999
5. Burns, J., Gaudiot, J.L.: SMT layout overhead and scalability. IEEE Trans. Parallel Distrib. Syst. 13(2), 142–155 (2002). doi:http://dx.doi.org/10.1109/71.983942
6. Conte, G., Tommesani, S., Zanichelli, F.: The long and winding road to high-performance image processing with MMX/SSE. In: CAMP ’00: Proceedings of the Fifth IEEE International Workshop on Computer Architectures for Machine Perception (CAMP’00), p. 302. IEEE Computer Society, Washington, DC (2000)
7. Flynn, M.J., Hung, P.: Microprocessor design issues: Thoughts on the road ahead. IEEE Micro. 25(3), 16–31 (2005). doi:http://dx.doi.org/10.1109/MM.2005.56
8. Fujimura, A.: All lithography roads ahead lead to more e-beam innovation. In: Future Fab. Int. (37), http://www.future-fab.com (2011)
9. Isci, C., Buyuktosunoglu, A., Cher, C., Bose, P., Martonosi, M.: An analysis of efficient multi-core global power management policies: maximizing performance for a given power budget. In: Proceedings of the 39th annual IEEE/ACM International Symposium on Microarchitecture, MICRO 39, pp. 347–358. IEEE Computer Society, Washington, DC (2006). doi:10.1109/MICRO.2006.8
10. ITRS: ITRS 2011 Roadmap. Tech. rep., International Technology Roadmap for Semiconductors (2011)
11. Kim, N.S., Austin, T., Blaauw, D., Mudge, T., Flautner, K., Hu, J.S., Irwin, M.J., Kandemir, M., Narayanan, V.: Leakage current: Moore’s law meets static power. Computer 36(12), 68–75 (2003). doi:http://dx.doi.org/10.1109/MC.2003.1250885
12. Koufaty, D., Marr, D.T.: Hyperthreading technology in the NetBurst microarchitecture. IEEE Micro. 23(2), 56–65 (2003)
13. Powell, M., Yang, S.H., Falsafi, B., Roy, K., Vijaykumar, T.N.: Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories. In: Proceedings of the 2000 International Symposium on Low Power Electronics and Design, ISLPED ’00, pp. 90–95. ACM, New York (2000). doi:10.1145/344166.344526. http://doi.acm.org/10.1145/344166.344526
14. Pradhan, D.K.: Fault-Tolerant Computer System Design. Prentice Hall, Upper Saddle River (1996)
15. Prakash, T.K., Peng, L.: Performance characterization of SPEC CPU2006 benchmarks on Intel Core 2 Duo processor. ISAST Trans. Comput. Softw. Eng. 2(1), 36–41 (2008)
16. Rutenbar, R.A., Baron, M., Daniel, T., Jayaraman, R., Or-Bach, Z., Rose, J., Sechen, C.: (When) will FPGAs kill ASICs? (panel session). In: DAC ’01: Proceedings of the 38th Annual Design Automation Conference, pp. 321–322. ACM, New York (2001). doi:http://doi.acm.org/10.1145/378239.378499
17. Sima, D.: Decisive aspects in the evolution of microprocessors. Proc. IEEE 92(12), 1896–1926 (2004)
18. Thompson, S., Parthasarathy, S.: Moore’s law: The future of Si microelectronics. Mater. Today 9(6), 20–25 (2006)
19. Thompson, S.E., Chau, R.S., Ghani, T., Mistry, K., Tyagi, S., Bohr, M.T.: In search of “forever,” continued transistor scaling one new material at a time. IEEE Trans. Semicond. Manuf. 18(1), 26–36 (2005). doi:10.1109/TSM.2004.841816. http://dx.doi.org/10.1109/TSM.2004.841816
20. Vahid, F., Lysecky, R.L., Zhang, C., Stitt, G.: Highly configurable platforms for embedded computing systems. Microelectron. J. 34(11), 1025–1029 (2003)
21. Wall, D.W.: Limits of instruction-level parallelism. In: ASPLOS-IV: Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 176–188. ACM, New York (1991). doi:http://doi.acm.org/10.1145/106972.106991
22. White, M., Chen, Y.: Scaled CMOS technology reliability users guide. Tech. rep., Jet Propulsion Laboratory, National Aeronautics and Space Administration (2008)
23. Yang, S., et al: 28nm metal-gate high-k CMOS SoC technology for high-performance mobile applications. In: Custom Integrated Circuits Conference (CICC), 2011 IEEE, pp. 1–5 (2011). doi:10.1109/CICC.2011.6055355
24. Zhang, C., Vahid, F., Najjar, W.: A highly configurable cache architecture for embedded systems. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, ISCA ’03, pp. 136–146. ACM, New York (2003). doi:10.1145/859618.859635. http://doi.acm.org/10.1145/859618.859635