This document contains 3 summaries of research papers from the IEEE Transactions on Emerging Topics in Computing from May and June 2016.
The first paper proposes a software toolchain that introduces variability awareness from high-level modeling down to runtime management on heterogeneous multicore platforms. It demonstrates the toolchain on 2 platforms.
The second paper proposes a method to jointly tune on-chip lasers and microring resonators in nanophotonic interconnects to improve energy efficiency under thermal variations. It shows up to 53% energy reduction is possible.
The third paper introduces a new multiple-access single-charge associative memory architecture called MASC TCAM that can search contents multiple times with a single precharge, achieving
IEEE Emerging topic in computing Title and Abstract 2016
1. For Details, Contact TSYS Academic Projects.
Ph: 9841103123, 044-42607879, Website: http://www.tsys.co.in/
Mail Id: tsysglobalsolutions2014@gmail.com.
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING 2016
A Software Toolchain for Variability Awareness on Heterogeneous Multicore Platforms
Abstract - Workload allocation in embedded multicore platforms is an increasingly challenging
issue due to heterogeneity of components and their parallelism. Additionally, the impact of
process variations in current and next generation technology nodes is becoming relevant and
cannot be compensated at the device or architectural level. Intra-die process variations arising at
the core level and platform level make parallel multicore platforms intrinsically heterogeneous,
because the various cores are clocked at different operational frequencies. Power consumption
becomes heterogeneous too, both considering dynamic and leakage consumption. In this context,
to fully exploit the computational capability of the platform parallelism, variability aware task
allocation strategies must be adopted. Despite the consistent research performed to design
variability-aware task allocation policies, little effort has been devoted to making available to
programmers a software toolchain enabling the exploitation of these policies. Such a toolchain
needs to exploit fabrication-level information about core clock speed and power consumption. In
this work, we present a methodology and the associated toolchain to program in the
presence of process variability, integrating power and performance variability information in all
the steps of the toolchain. To this purpose, the proposed approach is vertically integrated, from
high level modelling down to runtime management. Variability information is introduced
through an XML configuration file that is exploited by toolchain components to make the
appropriate runtime allocation decision. We demonstrate the proposed toolchain using state-of-the-art
variability-aware task allocation policies on two multicore platforms: i) The MIPS-based
GENEPY simulator with 4 and 8 parallel homogeneous cores and ii) The Tegra2-based Zynq
platform, where the on-board FPGA has been used to map 10 microblaze slave cores.
Experiments show that the proposed toolchain supports the integration of variability awareness
in a simple yet effective programming environment.
IEEE Transactions on Emerging Topics in Computing (May 2016)
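To illustrate how variability information in an XML file could drive an allocation decision, here is a minimal Python sketch. The XML schema (platform, core, freq_mhz, power_mw), the function names and the greedy earliest-finish policy are all assumptions made for this example; the abstract does not describe the toolchain's actual file format or policies.

```python
import xml.etree.ElementTree as ET

# Hypothetical variability description; element and attribute names are assumptions.
VARIABILITY_XML = """
<platform>
  <core id="0" freq_mhz="800" power_mw="120"/>
  <core id="1" freq_mhz="650" power_mw="95"/>
  <core id="2" freq_mhz="720" power_mw="140"/>
</platform>
"""

def load_cores(xml_text):
    """Parse per-core clock speed and power consumption from the XML text."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": int(core.get("id")),
            "freq_mhz": float(core.get("freq_mhz")),
            "power_mw": float(core.get("power_mw")),
        }
        for core in root.findall("core")
    ]

def allocate(tasks, cores):
    """Greedy policy: each task (name, mega-cycles) goes to the core that would finish it earliest."""
    busy_until = {core["id"]: 0.0 for core in cores}  # seconds of work already queued per core
    mapping = {}
    # Place the largest tasks first so they land on the fastest cores.
    for name, mcycles in sorted(tasks, key=lambda t: -t[1]):
        def finish_time(core):
            # mega-cycles divided by MHz gives seconds
            return busy_until[core["id"]] + mcycles / core["freq_mhz"]
        best = min(cores, key=finish_time)
        busy_until[best["id"]] = finish_time(best)
        mapping[name] = best["id"]
    return mapping, busy_until

cores = load_cores(VARIABILITY_XML)
tasks = [("fft", 1600.0), ("filter", 650.0), ("encode", 720.0)]
mapping, busy_until = allocate(tasks, cores)
print(mapping)
```

This policy only uses the per-core clock speed. The power_mw field is parsed but unused here; a power-aware policy could weigh it against the finish time.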
Towards Maximum Energy Efficiency in Nanophotonic Interconnects with Thermal-
Aware On-Chip Laser Tuning
Abstract - Nanophotonics is an emerging technology considered as one of the key solutions for
future generation on-chip interconnects. However, silicon photonic devices are highly sensitive
to temperature variation. Under a given chip activity, this leads to a lower laser efficiency and a
drift of wavelengths of optical devices (on-chip lasers and microring resonators (MRs)), which
results in a higher Bit Error Ratio (BER). In this paper, we propose to jointly tune the on-chip
lasers and MRs in order to align the wavelengths of the emitted signals with the resonant
wavelengths of the MRs. Our method allows significant improvements of the power
consumption with regard to the related methods, while meeting the BER requirement. Compared
to methods for which laser tuning is not possible, results show that a combined tuning of laser
and MRs leads to 53% energy reduction when the uniform chip activity decreases from 20% to
5%. BER-energy tradeoffs have been explored and allow strategies to be defined to minimize
either the energy, or the BER. As a key result, we have shown that, under specific chip activities,
increasing the laser power consumption allows both energy and BER to be improved. This trend
has been observed for an MWSR channel interconnecting 12 interfaces.
IEEE Transactions on Emerging Topics in Computing (May 2016)
Approximate Computing using Multiple-Access Single-Charge Associative Memory
Abstract - Memory-based computing using associative memory is a promising way to reduce
the energy consumption of important classes of streaming applications by avoiding redundant
computations. A set of frequent patterns that represent basic functions are pre-stored in Ternary
Content Addressable Memory (TCAM) and reused. The primary limitation to using associative
memory in modern parallel processors is the large search energy required by TCAMs. In
TCAMs, all rows that match, except hit rows, precharge and discharge for every search
operation, resulting in high energy consumption. In this paper, we propose a new Multiple-
Access Single-Charge (MASC) TCAM architecture which is capable of searching TCAM
contents multiple times with only a single precharge cycle. In contrast to previous designs, the
MASC TCAM keeps the match-line voltage of all miss-rows high and uses their charge for the
next search operation, while only the hit rows discharge. We use periodic refresh to control the
accuracy of the search. We also implement a new type of approximate associative memory by
setting longer refresh times for MASC TCAMs, which yields search results within 1-2 bit
Hamming distances of the exact value. To further decrease the energy consumption of MASC
TCAM and reduce the area, we implement MASC with crossbar TCAMs. Our evaluation on
AMD Southern Islands GPU shows that using MASC (crossbar MASC) associative memory can
improve the average floating point units energy efficiency by 33.4%, 38.1%, and 36.7% (37.7%,
42.6%, and 43.1%) for exact matching, selective 1-HD and 2-HD approximations respectively,
providing an acceptable quality of service (PSNR>30dB and average relative error<10%). This
shows that MASC (crossbar MASC) can achieve 1.77X (1.93X) higher energy savings as
compared to the state of the art implementation of GPGPU that uses voltage overscaling on
TCAM.
IEEE Transactions on Emerging Topics in Computing (May 2016)
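The idea of returning stored results for inputs within a small Hamming distance can be sketched functionally in Python. This models only the lookup behaviour, not the circuit (precharge, match-line voltage and refresh are not modelled). The table contents, the function names and the use of 'x' for don't-care bits are assumptions made for illustration.

```python
def hamming_distance(pattern, key):
    """Count mismatching bit positions; 'x' in the stored pattern is a don't-care."""
    if len(pattern) != len(key):
        raise ValueError("pattern and key must have the same width")
    return sum(1 for p, k in zip(pattern, key) if p != "x" and p != k)

def tcam_lookup(table, key, max_hd=0):
    """Return (value, distance) for the closest row within max_hd, or None on a miss."""
    best = None
    for pattern, value in table:
        d = hamming_distance(pattern, key)
        if d <= max_hd and (best is None or d < best[1]):
            best = (value, d)
    return best

# Hypothetical pre-stored patterns and their precomputed results.
table = [
    ("0000xxxx", "result_A"),
    ("10101100", "result_B"),
    ("1111000x", "result_C"),
]

exact = tcam_lookup(table, "10101100")           # exact matching
near = tcam_lookup(table, "10101110", max_hd=1)  # 1-HD approximation
miss = tcam_lookup(table, "10101110", max_hd=0)  # same key, exact matching only
print(exact, near, miss)
```

Raising max_hd trades accuracy for a higher hit rate, which is the kind of trade-off the 1-HD and 2-HD approximations in the abstract refer to.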
Demographic Information Prediction: A Portrait of Smartphone Application Users
Abstract - Demographic information (e.g., gender and age) is usually treated as private data, but
has shown great value in personalized services, advertisement, behavior study and other
aspects. In this paper, we propose a novel approach to make efficient demographic prediction
based on smartphone application usage. Specifically, we first characterize the data
set by building a matrix to correlate users with types of categories from the log file of
smartphone applications. Then, by considering the category-unbalance problem, we make use of
the correlation between users’ demographic information and their requested Internet resources to
make the prediction, and propose an optimal method to further smooth the obtained results with
category neighbors and user neighbors. The evaluation is supported by a dataset from a
real-world workload. The results show advantages of the proposed prediction approach compared
with baseline prediction. In particular, the proposed approach achieves 81.21% accuracy
in gender prediction. Even on the more challenging multi-class problems, the
proposed approach still achieves good performance (e.g., 73.84% accuracy in the
prediction of age group and 66.42% accuracy in the prediction of phone level).
IEEE Transactions on Emerging Topics in Computing (May 2016)
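The first step described above, building a matrix that relates users to application categories from a log, might look like the following Python sketch. The log format (user, category pairs), the function name and the row normalization are assumptions for illustration; the abstract does not specify them, and the prediction and smoothing steps are not shown.

```python
from collections import defaultdict

def build_user_category_matrix(log):
    """Turn (user, category) log entries into a row-normalized usage matrix."""
    users = sorted({user for user, _ in log})
    categories = sorted({category for _, category in log})
    counts = defaultdict(int)
    for user, category in log:
        counts[(user, category)] += 1
    matrix = []
    for user in users:
        total = sum(counts[(user, c)] for c in categories)
        # Each row holds the fraction of the user's requests per category.
        matrix.append([counts[(user, c)] / total for c in categories])
    return users, categories, matrix

# Hypothetical application log entries.
log = [
    ("u1", "games"), ("u1", "games"), ("u1", "news"),
    ("u2", "shopping"), ("u2", "news"),
]
users, categories, matrix = build_user_category_matrix(log)
print(categories)
print(matrix)
```

Normalizing each row is one simple way to make heavy and light users comparable; other weightings are equally plausible.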
An Energy-Efficient Heterogeneous Memory Architecture for Future Dark Silicon
Embedded Chip-Multiprocessors
Abstract - Main memories play an important role in overall energy consumption of embedded
systems. Using conventional memory technologies in future designs in the nanoscale era causes a
drastic increase in leakage power consumption and temperature-related problems. Emerging
non-volatile memory (NVM) technologies offer many desirable characteristics such as near-zero
leakage power, high density and non-volatility. They can significantly mitigate the issue of
memory leakage power in future embedded chip-multiprocessor (eCMP) systems. However, they
suffer from challenges such as limited write endurance and high write energy consumption,
which restrict their adoption in modern memory systems. In this article, we present a convex
optimization model to design a 3D stacked hybrid memory architecture in order to minimize the
energy consumption of future embedded systems in the dark silicon era. The proposed approach
satisfies the endurance constraint in order to design a reliable memory system. Our convex model
optimizes numbers and placement of eDRAM and STT-RAM memory banks on the memory
layer to exploit the advantages of both technologies in future eCMPs. Energy consumption, the
main challenge in the dark silicon era, is represented as a major target in this work and it is
minimized by the detailed optimization model in order to design a dark silicon aware 3D
Chip-Multiprocessor. Experimental results show that, in comparison with the baseline memory design,
the proposed architecture improves the energy consumption and performance of the 3D CMP by
about 61.33% and 9% on average, respectively.
IEEE Transactions on Emerging Topics in Computing (May 2016)
Univariate Power Analysis Attacks Exploiting Static Dissipation of Nanometer CMOS
VLSI Circuits for Cryptographic Applications
Abstract - In this work we focus on Power Analysis Attacks (PAAs) which exploit the
dependence of the static current of sub-50nm CMOS integrated circuits on the internally
processed data. SPICE simulations of static power have been carried out to show that the
coefficient of variation of nanometer logic gates is increasing with the scaling of CMOS
technology. We demonstrate that it is possible to recover the secret key of a cryptographic core
by exploiting this data dependence by means of different statistical distinguishers. For the first
time in the literature we formulate the Attack Exploiting Static Power (AESP) as a univariate
attack by using the mutual information approach to quantify the information that leaks through
the static power side channel independently from the adopted leakage model. This analysis
shows that countermeasures conceived to protect cryptographic hardware from attacks based on
dynamic power consumption (e.g. WDDL, MDPL, SABL) still exhibit a leakage through the
static power side channel. Finally, we show that the Time Enclosed Logic (TEL) concept does
not leak information through the static power and is suitable to be used as a countermeasure
against both attacks exploiting dynamic power and attacks exploiting static power.
IEEE Transactions on Emerging Topics in Computing (May 2016)
Energy-Aware Bio-signal Compressed Sensing Reconstruction on the WBSN-gateway
Abstract - Technology scaling today enables the design of ultra-low power wearable bio-sensors
for continuous vital signs monitoring or wellness applications. Such bio-sensing nodes are
typically integrated in a Wireless Body Sensor Network (WBSN) to acquire and process
biomedical signals, e.g. Electrocardiogram (ECG), and transmit them to the WBSN gateway, e.g. a
smartphone, for online reconstruction or feature extraction. Both bio-sensing node and gateway
are battery powered devices, although they show very different autonomy requirements (weeks
vs. days). The rakeness-based Compressed Sensing (CS) proved to outperform standard CS,
achieving a higher compression for the same quality level, therefore reducing the transmission
costs in the node. However, most of the research focus has been on the efficiency of the node,
neglecting the energy cost of the CS decoder. In this work, we evaluate the energy cost and real-
time reconstruction feasibility on the gateway, considering different signal reconstruction
algorithms running on a heterogeneous mobile SoC based on the ARM big.LITTLE
architecture. The experimental results show that it is not always possible to obtain the theoretical
QoS under real-time constraints. Moreover, the standard CS does not satisfy real-time
constraints, while the rakeness enables different QoS-energy trade-offs. Finally, we show that in
the optimal setup (OMP, n = 128) heterogeneous architectures make the CS decoding task
suitable for wearable devices oriented to long-term ECG monitoring.
IEEE Transactions on Emerging Topics in Computing (May 2016)
Fast Object Detection at Constrained Energy
Abstract - Visual computing, e.g., automatic object detection, in mobile devices has recently attracted
increasing attention, and building fast models at constrained energy cost is a critical problem.
In this paper, we introduce our work on designing deep learning models for 200-class
object detection on mobile devices, as well as exploring the trade-off between accuracy and energy
cost. In particular, we investigate several methods of extracting object proposals and integrate
them into the Fast R-CNN framework for object detection. Extensive experiments are conducted
using the Jetson TK1 SoC platform and the Alienware-15 laptop, including detailed parameter
evaluation with respect to accuracy, energy cost and speed. From these experiments, we
conclude how to obtain a good balance between accuracy and energy cost, which might provide
guidance to design effective and efficient object detection models on mobile devices.
IEEE Transactions on Emerging Topics in Computing (June 2016)
Performance Enhancement of a Time-Delay PUF Design by Utilizing Integrated Nanoscale
ReRAM Devices
Abstract - Currently the semiconductor industry is in search of a Physically-Unclonable-
Function (PUF) implementation, which combines high reliability and uniqueness with low area
and power consumption. The characteristics of emerging nanoscale Resistive Random Access
Memory (ReRAM) devices fulfill most of these properties, as they exhibit inherent variability
with low area consumption. Of particular interest is that the resistive states of ReRAM devices
show a strong dependence on the distribution of grain boundaries within the device, which leads
to variability in total device resistance. In this work we transform the classic CMOS time-delay
PUF (TD-PUF) utilizing integrated nanoscale ReRAM devices to achieve better performance
metrics including uniqueness and reliability. The enhanced design exploits the property of high
resistance variability of ReRAMs for the design of a ReRAM based delay stage that exhibits
excellent uniqueness. Accurate simulation and characterization of the proposed PUF were
achieved by extracting resistance values, temperature dependence and usage stress of ReRAM
devices fabricated in-house, and their application in the proposed TD-PUF is discussed. A
24-stage time-delay PUF utilizing 48 ReRAM devices was simulated and results show excellent
reliability with respect to environmental parameters. A temperature range of 0 to 125 °C was
simulated and an optimum reliability was observed at 0.79V. A supply voltage noise of ±30mV
had no impact on the uniqueness and reliability. The proposed design was compared against two
pure CMOS implementations of a TD-PUF. The comparison was performed with respect to the
aforementioned metrics and under the same environmental conditions, showing up to 5 times
increase in performance.
IEEE Transactions on Emerging Topics in Computing (June 2016)
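Uniqueness and reliability of a PUF are commonly quantified with Hamming distances between response bit strings: uniqueness as the average distance between different chips (ideally 50%) and reliability as 100% minus the average distance between repeated readings of the same chip. The Python sketch below uses these common definitions, which the abstract does not spell out, so treat them as an assumption. The response strings are made-up illustration data, not results from the paper.

```python
from itertools import combinations

def fractional_hd(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def uniqueness(responses):
    """Average inter-chip Hamming distance in percent (ideal value: 50)."""
    pairs = list(combinations(responses, 2))
    return 100.0 * sum(fractional_hd(a, b) for a, b in pairs) / len(pairs)

def reliability(reference, remeasured):
    """100 minus the average intra-chip Hamming distance in percent (ideal value: 100)."""
    avg = sum(fractional_hd(reference, r) for r in remeasured) / len(remeasured)
    return 100.0 * (1.0 - avg)

# Made-up responses of three chips to the same challenge.
chips = ["10110010", "01100110", "11010001"]
# Made-up repeated readings of the first chip, e.g. at different temperatures.
readings = ["10110010", "10110011", "10110010", "10110010"]

u = uniqueness(chips)
r = reliability(chips[0], readings)
print(round(u, 2), round(r, 3))
```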
Optimizing Network Traffic for Spiking Neural Network Simulations on Densely
Interconnected Many-Core Neuromorphic Platforms
Abstract - In this paper we present a new Partitioning and Placement methodology able to map
Spiking Neural Networks onto parallel neuromorphic platforms. This methodology improves
scalability/reliability of Spiking Neural Network (SNN) simulations on many-core and densely
interconnected platforms. SNNs mimic brain activity by emulating spikes sent between neuron
populations. Many-core platforms are emerging computing targets that aim to achieve real-time
SNN simulations. Neurons are mapped to parallel cores, and spikes are sent in the form of
packets over the on-chip and off-chip network. However, the activity of neuron populations is
heterogeneous and complex. Thus, achieving an efficient exploitation of platform resources is a
challenge that often affects simulation scalability/reliability. To address this challenge, the
proposed methodology uses customised SNNs to profile the board bottlenecks and implements an
SNN partitioning and placement (SNN-PP) algorithm for improving on-chip and off-chip
communication efficiency. The cortical microcircuit SNN was simulated and the performance of
the developed SNN-PP algorithm was compared with the performance of standard methods. These
comparisons showed a significant traffic reduction produced by the new method, which for some
configurations reached up to 96X. Results demonstrate that it is possible to consistently reduce
packet traffic and improve simulation scalability/reliability with an effective neuron placement.
IEEE Transactions on Emerging Topics in Computing (June 2016)
Non-Intrusive Anomaly Detection With Streaming Performance Metrics and Logs for
DevOps in Public Clouds: A Case Study in AWS
Abstract - Public clouds are a style of computing platforms, where scalable and elastic
Information Technology-enabled capabilities are provided as a service to external customers
using Internet technologies. Using public cloud services can reduce costs and increase the
choices of technologies, but it also implies limited system information for users. Thus, anomaly
detection at the user end has to be non-intrusive and hence difficult, particularly during DevOps
operations because the impacts from both anomalies and these operations are often
indistinguishable, and hence, it is hard to detect the anomalies. In this paper, our work is specific
to a successful public cloud, Amazon Web Services (AWS), and a representative DevOps operation,
rolling upgrade, for which we report an anomaly detection approach that can effectively detect anomalies.
Our anomaly detection requires only metrics data and logs officially supplied by most public
clouds. We use support vector machines to train multiple classifiers from monitored data for
different system environments, and the log information indicates the most suitable
classifier. Moreover, our detection aims at finding anomalies over every time interval, called a
window, such that the features include not only some indicative performance metrics but also the
entropy and the moving average of metrics data in each window. Our experimental evaluation
systematically demonstrates the effectiveness of our approach.
IEEE Transactions on Emerging Topics in Computing (June 2016)
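The per-window features mentioned above (entropy and average of the metrics data in each window) can be sketched in Python as follows. The window length, the fixed-width binning used for the entropy, the per-window mean standing in for the moving average, and the sample CPU values are all assumptions for illustration. Classifier training is not shown.

```python
import math
from collections import Counter

def shannon_entropy(values, bin_width):
    """Entropy in bits of the values after bucketing them into bins of width bin_width."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())

def window_features(series, window, bin_width=10.0):
    """Split a metric series into consecutive windows and compute mean and entropy for each."""
    features = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        features.append({
            "start": start,
            "mean": sum(chunk) / window,
            "entropy": shannon_entropy(chunk, bin_width),
        })
    return features

# Made-up CPU utilisation samples (percent); the second window contains spikes.
cpu = [20, 22, 21, 23, 20, 95, 21, 60]
features = window_features(cpu, window=4)
for f in features:
    print(f)
```

A steady window yields zero entropy, while a window with spikes yields a higher value, so the entropy captures variation that the mean alone can hide.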
Content Retrieval At the Edge: A Social-Aware and Named Data Cooperative Framework
Abstract - With the popularity of mobile devices, recent years have witnessed an explosive
growth of mobile multimedia content, which accounts for more than 50% of mobile data traffic.
This significant growth poses a severe challenge for future cellular networks. As a promising
approach to overcome the challenge, we advocate Content Retrieval At the Edge, a content-
centric cooperative service paradigm via device-to-device (D2D) communications to reduce
cellular traffic volume in mobile networks. By leveraging the Named Data Networking (NDN)
principle, we propose sNDN, a social-aware named data framework to achieve efficient
cooperative content retrieval. Specifically, sNDN introduces Friendship Circle by grouping a
user with her close friends of both high mobility similarity and high content similarity. We
construct NDN routing tables conditioned on Friendship Circle encounter frequency to navigate
a content request and a content reply packet between Friendship Circles, and leverage social
properties in Friendship Circle to search for the final target as inner-Friendship Circle routing.
The evaluation results demonstrate that sNDN can save cellular capacity greatly and outperform
other content retrieval schemes significantly.
IEEE Transactions on Emerging Topics in Computing (June 2016)
Pricing and Repurchasing for Big Data Processing in Multi-Clouds
Abstract - Processing streaming big data becomes critical as new diverse Internet of Things
applications begin to emerge. The existing cloud pricing strategy is unfriendly for processing
streaming big data with varying loads. Multiple cloud environments are a potential solution with
an efficient pay-on-demand pricing strategy for processing streaming big data. In this paper, we
propose an intermediary framework with multiple cloud environments to provide streaming big
data computing service with lower cost per load, in which a cloud service intermediary rents the
cloud service from multiple cloud providers and provides streaming processing service to the
users with multiple service interfaces. In this framework, we also propose a pricing strategy to
maximize the revenue of the multiple cloud intermediaries. Extensive simulations show that our
pricing strategy brings higher revenue than other pricing methods.
IEEE Transactions on Emerging Topics in Computing (June 2016)
SUPPORT OFFERED TO REGISTERED STUDENTS:
1. IEEE Base paper.
2. Review material as per individuals’ university guidelines
3. Future Enhancement
4. Assistance in answering all critical questions
5. Training on programming language
6. Complete Source Code.
7. Final Report / Document
8. International Conference / International Journal Publication on your Project.
FOLLOW US ON FACEBOOK @ TSYS Academic Projects