The amount of digital information being created and stored is increasing at an alarming rate. This data is classified and processed to distill and deliver information to users across various industries, for example finance, social media, and gaming. This class of workloads is referred to as throughput computing applications. Multi-core CPUs have long been viewed as suitable for processing data in such workloads. However, driven by high computational throughput and energy efficiency, there has been rapid adoption of Graphics Processing Units (GPUs) as computing engines in recent years. GPU computing has emerged as a viable execution platform for throughput-oriented applications or regions of code. GPUs began as standalone units for program execution, but there are clear trends toward tightly knit CPU-GPU integration. In this paper, we seek to understand the state-of-the-art Heterogeneous System Architecture (HSA) and examine several key components that make it stand out from other architecture designs by analyzing existing research, articles, and reports, along with directions and future opportunities for HSA systems.
HSA creates a foundation for the inclusion of additional processing elements beyond the CPU and GPU. HSA enables exploitation of the abundant data parallelism in the computational workloads of today and of the future in a power-efficient way, while providing continued support for conventional programming models and computer architectures (George Kyriazis, 2012).
We have reached the point where we can no longer power all of the transistors we have on chip at once. Ways around this have generally involved heterogeneous architectures, in which a system has multiple distinct computation units, each optimized in terms of performance and power for specific tasks, and computation is directed to the unit best suited to it.
The main objective of this paper is to present an overview and descriptive study of Heterogeneous System Architecture by reviewing articles and journal papers on the HSA paradigm, as well as related work done to improve HSA.
II. LITERATURE REVIEW
2.1 History of HSA
Computers and related technology initially began with single-core processors; in the early 2000s, the history of computing was changed forever by the push to multi-core processors. Single-core processors were reaching their limits and could not physically be improved further without reworking the whole production process, which led to processors being designed in a multi-core format. Over time, the multi-core processor advanced from dual-core to tri-, quad-, hex-, and octa-core designs, and some processors now hold many dozens of cores.
Development also continues in terms of innovative design refinement. Today's multi-core processors are designed with multi-threading features and performance improvements such as memory-on-chip and heterogeneous cores intended for special purposes. We can attribute these developments to several emerging needs: on one hand, contemporary technologies must become increasingly capable in networking, multimedia processing, and device recognition; on the other hand, energy efficiency must also increase.
Since the very beginning of integrated-circuit development, processors were designed with ever-increasing clock frequencies and sophisticated built-in optimization strategies. As individual CPU gates grew smaller through manufacturing breakthroughs, semiconductor microelectronics likewise became increasingly refined in their physical properties.
Because of physical constraints, this speedup has come to an end. Physical limits place numerous barriers in the way of achieving high-performance computing within power budgets. Computer architects are attempting to cross these performance-limiting walls with a number of innovative processor architectures that exploit on-chip concurrency (Paul, 2014).
By around the year 2000, the single-chip graphics processor incorporated practically every detail of the traditional high-end workstation graphics pipeline and therefore deserved a new name beyond "VGA controller". The term GPU was coined to indicate that the graphics device had become a processor. Over time, GPUs became more programmable, as programmable processors replaced fixed-function dedicated logic while maintaining the fundamental 3D graphics pipeline organization. In addition, computations became more precise over time, progressing from indexed arithmetic, to integer and fixed point, to single-precision floating point, and recently to double-precision floating point. GPUs have become massively parallel programmable processors with hundreds of cores and thousands of threads.
More recently, processor instructions and memory hardware were added to support general-purpose programming languages, and a programming environment was created that allows GPUs to be programmed in familiar languages, including C and C++. This evolution makes a GPU a fully general-purpose, programmable, many-core processor, though still with some special advantages and limitations (Nickolls & Kirk, 2012).
Multicore processors have already become the prominent way to improve aggregate performance. Considering power and temperature constraints, they may be the sole practical solution. A considerable number of studies have been conducted to determine the best multicore configuration, and the heterogeneous multicore is believed to offer the best power/performance trade-off (Sato, Mori, Yano, & Hayashida, 2012).
2.2 Features of HSA
The HSA architecture defines two types of compute units:
A latency compute unit (LCU) is a generalization of a CPU. An LCU supports both its native CPU instruction set and the HSA Intermediate Language (HSAIL) instruction set.
A throughput compute unit (TCU) is a generalization of a GPU. A TCU supports only the HSAIL instruction set. TCUs perform highly efficient parallel execution.
An HSA application can run on a wide range of platforms consisting of both LCUs and TCUs. The HSA framework allows the application to execute at the best possible performance and power point on a given platform, without sacrificing flexibility. At the same time, HSA improves programmability, portability, and compatibility. George Kyriazis (2012) highlighted some notable architectural features of HSA, including:
Shared page table support: To simplify operating-system and user software, HSA allows a single set of page-table entries to be shared between LCUs and TCUs. This enables units of both types to access memory through the same virtual address. In simplified terms, the operating system only needs to manage one set of page tables, thereby enabling Shared Virtual Memory (SVM) semantics between LCU and TCU.
Page faulting: Operating systems allow user processes to access more memory than is physically addressable by paging memory to and from disk. Previously, the operating system and driver also had to create and manage a separate virtual address space for the TCU to use. HSA removes the burdens of pinned memory and separate virtual-address management by allowing compute units to page fault and to use the same large address space as the CPU.
User-level command queuing: Time spent waiting for operating-system kernel services was often a major performance bottleneck in earlier throughput compute systems. HSA drastically reduces the time needed to dispatch work to the TCU by providing a dispatch queue per application and by allowing user-mode processes to dispatch directly into those queues, requiring no operating-system kernel transition or services. This makes the full performance of the platform available to the developer, minimizing software driver overheads.
Hardware scheduling: HSA provides a mechanism whereby the TCU engine hardware can switch between application dispatch queues automatically, without requiring operating-system intervention on each switch. The operating-system scheduler can still define every aspect of the switching sequence and retains control. Hardware scheduling is faster and consumes less power.
Coherent memory region: HSA embraces a fully coherent shared-memory model with unified addressing. This provides developers with the same coherent memory model that they enjoy on SMP CPU systems, and it enables them to write applications that closely couple LCU and TCU code in popular design patterns such as producer-consumer.
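The shared-virtual-memory idea described above can be illustrated with a short sketch. HSA exposes this through its own runtime; CUDA's managed memory is used here purely as an accessible analogue, not as HSA's API: one allocation yields one pointer that both the CPU and the GPU dereference directly, with no explicit copies.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *data, int n, float k) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= k;
    }

    int main() {
        const int n = 1024;
        float *data = nullptr;
        // One allocation, one pointer, visible to both CPU and GPU --
        // an analogue of the single-address-space semantics HSA standardizes.
        cudaMallocManaged(&data, n * sizeof(float));
        for (int i = 0; i < n; ++i) data[i] = 1.0f;     // CPU writes
        scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // GPU uses the same pointer
        cudaDeviceSynchronize();                        // wait before CPU reads again
        printf("data[0] = %f\n", data[0]);              // prints 2.000000
        cudaFree(data);
        return 0;
    }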
HSA Support Libraries
The HSA platform is designed to support high-level parallel programming languages and models, including C++ AMP, C++, C#, CUDA, OpenCL, OpenMP, Java, and Python, to give some examples. HSA-aware tools generate program binaries that can execute on HSA-enabled systems supporting multiple instruction sets (typically, one for the LCU and one for the TCU) and can also run on existing architectures without HSA support (George Kyriazis, 2012).
Program binaries that can run on both LCUs and TCUs contain CPU ISA (Instruction Set Architecture) code for the LCU and HSA Intermediate Language (HSAIL) code for the TCU. A finalizer converts HSAIL to the TCU ISA. The finalizer is typically lightweight and may be run at install time, compile time, or program execution time, depending on choices made by the platform implementation.
HSA API Level
This level provides insight into the current tools and APIs used in heterogeneous software development.
2.3 HSA INTERMEDIATE LANGUAGE (HSAIL)
HSAIL is a low-level instruction set designed for parallel compute in a shared virtual memory environment. HSAIL is SIMT (Single-Instruction, Multiple-Thread) in form and does not dictate a hardware microarchitecture. It is designed for fast compile times, moving most optimization to the high-level compiler, and it sits at the same level as PTX: an intermediate assembly language, or virtual machine target. It is represented as bit-code in the BRIG file format, which supports late binding of libraries (Hedge, 2013).
HSAIL is the intermediate language for parallel compute in HSA. It is generated by a high-level compiler (LLVM, gcc, the Java VM, and so on) and compiled down to a GPU ISA or another parallel-processor ISA by an IHV finalizer. The finalizer may execute at run time, install time, or build time, depending on the platform type.
COMPUTE UNIFIED DEVICE ARCHITECTURE (CUDA)
At a high level, CUDA is a proprietary tool for executing general-purpose programs on NVIDIA graphics cards. To use it, you must have NVIDIA hardware and NVIDIA's compiler. CUDA is a scalable parallel programming model and software platform for the GPU and other parallel processors that allows the programmer to bypass the graphics API and graphics interfaces of the GPU and simply program in C or C++. The CUDA programming model has an SPMD (single-program, multiple-data) software style, in which the programmer writes a program for one thread that is instanced and executed by many threads in parallel on the multiple processors of the GPU. In fact, CUDA also provides a facility for programming multiple CPU cores, so CUDA is an environment for writing parallel programs for the entire heterogeneous computer system (Nickolls & Kirk, 2012).
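The SPMD style described above is easiest to see in a minimal example. The sketch below (illustrative only, not drawn from the paper) writes a vector-add kernel from the perspective of a single thread; one launch call then instances it across roughly a million threads, each selecting its element by its global thread ID.

    #include <cstdio>
    #include <cuda_runtime.h>

    // One thread's view of the computation: pick an element by global ID.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique thread ID
        if (i < n)                                      // grid may overshoot n
            c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *h_a = new float[n], *h_b = new float[n], *h_c = new float[n];
        for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

        // The device has its own address space: allocate and copy in explicitly.
        float *d_a, *d_b, *d_c;
        cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Batched launch: a single call instances the kernel across ~n threads,
        // grouped into blocks of 256.
        vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);

        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);  // copy result out
        printf("h_c[0] = %f\n", h_c[0]);                      // prints 3.000000
        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        delete[] h_a; delete[] h_b; delete[] h_c;
        return 0;
    }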
OPEN COMPUTE LANGUAGE (OpenCL)
OpenCL is a recent and open heterogeneous programming standard maintained by the Khronos Compute Working Group. OpenCL is an industry-standard programming language for parallel computing that provides a unified programming model for CPUs, GPUs, smartphones, tablets, and servers (the cloud), enabling programmers to write programs once and run them across platforms. It is also supported by all major hardware and software vendors (Nickolls & Kirk, 2012).
HSA also offers several advantages for those who choose a lower-level programming interface for ultimate control and performance. Some of the benefits of using OpenCL on HSA include:
Avoidance of inefficient copies
Low-latency dispatch
An improved memory model
Pointers shared between CPU and GPU
CUDA and OpenCL can be compared by studying their platform models, memory models, and execution models (Grossman, 2013).
Platform Model
A platform model specifies how the hardware available to a programmer on a particular system is presented, both conceptually and through the API. Both the OpenCL and CUDA platform models describe discrete devices that are managed through an API from a host program. These devices have address spaces separate from the host program and use explicit transfers to receive input from, and return output to, the host program. OpenCL and CUDA provide techniques for accessing metadata on every device in a platform (for example, memory size, computational units available, and so on).
At the highest granularity, an OpenCL installation can contain one or more platforms, each of which contains one or more devices; inside each OpenCL device there are multiple compute units. Access to the platforms and devices in an OpenCL program is more explicit and verbose than in CUDA and requires the creation of contexts and command queues. An OpenCL context is a grouping of one or more OpenCL devices. OpenCL command queues are used to issue commands to devices, and each command queue is explicitly associated with a single OpenCL device. CUDA is by default less explicit than OpenCL, but it still supports a considerable number of similar operations on a CUDA platform. At any moment, CUDA has a selected device to which CUDA operations are implicitly issued, though CUDA allows you to explicitly set the currently active device. While there is a model of "streams" of work in CUDA similar to OpenCL's command queues, CUDA streams are not required to be explicitly supplied by the developer for each device operation.
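The contrast is visible in a short sketch of the CUDA side of this comparison (illustrative only): device metadata such as memory size and compute-unit count is read through one runtime call per device, and a current device is selected once, after which operations are implicitly issued to it; no context or command-queue objects need to be created as they would in OpenCL.

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);             // enumerate devices on this platform
        for (int d = 0; d < count; ++d) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, d);  // per-device metadata
            printf("device %d: %s, %zu bytes global mem, %d multiprocessors\n",
                   d, prop.name, prop.totalGlobalMem, prop.multiProcessorCount);
        }
        cudaSetDevice(0);  // explicitly select the current device; without this,
                           // CUDA implicitly issues all work to device 0
        return 0;
    }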
Execution Model
The execution model of a heterogeneous programming model describes the conceptual model for executing user-written computation. Both OpenCL and CUDA are SIMD (Single-Instruction, Multiple-Data) programming models. The programmer writes a kernel for the device and explicitly marks it for device execution using language keywords. Threads executing these kernels use a special thread ID to select their data sources. Both OpenCL and CUDA use a batched kernel invocation model in which large numbers of kernel instances, or threads, are launched in a single API call. Both CUDA and OpenCL group individual threads into small collections, and a kernel invocation dispatches many thread collections at once. One area in which OpenCL's and CUDA's execution models diverge is in preparing a user-written kernel for execution on a device. CUDA compiles kernels for execution at compile time using NVIDIA's compiler, whereas an OpenCL program and its kernels are compiled or loaded at runtime. An OpenCL program represents a collection of executable functions; an OpenCL kernel object is associated with a program object and identifies a single entry point into that program. This makes OpenCL both a more flexible and a more explicit programming model than CUDA when it comes to executing computation on various kinds of devices. On the other hand, every OpenCL program must set up executable objects before executing them on a device, whereas CUDA prepares them for the user implicitly.
Memory Model
Both CUDA and OpenCL use discrete address spaces to represent the memory available from a device, even in situations where an OpenCL host application is executing in the same memory as an OpenCL device (as is often the case when multi-core CPUs are accessed through OpenCL). In order for computation executing on a device to have access to input values from the host program, those values must previously have been explicitly copied into global buffers associated with that device. For the host application to obtain output values from device computation, those values must be copied out of global device buffers and into the host program's address space. Both CUDA and OpenCL provide API calls to copy into and out of global memory, as well as ways to use special-purpose memory (for example, texture memory) that may improve performance for the right access patterns on certain hardware. The CUDA and OpenCL kernel languages also include special keywords for declaring local scratchpad memory accessible from a kernel. The contents of scratchpad memory have the same lifetime as a thread group on a compute unit and exhibit lower latency.
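The scratchpad keyword mentioned above is __shared__ in CUDA (OpenCL's counterpart is __local). The sketch below (illustrative only) stages one value per thread into a per-block scratchpad and reduces it there, so only one global-memory write per block remains.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Block-level sum using scratchpad ("shared") memory. __shared__ data lives
    // for the lifetime of one thread block on one compute unit and has much
    // lower latency than global memory.
    __global__ void blockSum(const float *in, float *out, int n) {
        __shared__ float scratch[256];          // per-block scratchpad
        int tid = threadIdx.x;
        int i = blockIdx.x * blockDim.x + tid;
        scratch[tid] = (i < n) ? in[i] : 0.0f;  // stage one value per thread
        __syncthreads();                        // wait for the whole block

        // Tree reduction performed entirely inside the scratchpad.
        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (tid < stride) scratch[tid] += scratch[tid + stride];
            __syncthreads();
        }
        if (tid == 0) out[blockIdx.x] = scratch[0];  // one partial sum per block
    }

    int main() {
        const int n = 1024, threads = 256, blocks = n / threads;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, blocks * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = 1.0f;
        blockSum<<<blocks, threads>>>(in, out, n);
        cudaDeviceSynchronize();
        float total = 0.0f;
        for (int b = 0; b < blocks; ++b) total += out[b];
        printf("sum = %f\n", total);            // prints 1024.000000
        cudaFree(in); cudaFree(out);
        return 0;
    }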
Limitations of HSA
The limitations and bottlenecks of HSA are considered temporary, as development is still ongoing. These limitations include, but are not limited to (Hedge, 2013):
Programmability
Communication overhead
Related Works
Hsu, Chen, & Chen (2015) introduced a virtual platform conforming to the HSA programming model and the HSA Intermediate Language (HSAIL) specification. The platform includes an advanced simulator modeling a state-of-the-art GPU microarchitecture designed for Single-Instruction Multiple-Data (SIMD) processing. It also provides a simulation framework, including the OpenCL and OpenGL APIs, the driver for the simulator, and a compilation flow from OpenCL kernels and OpenGL shader programs to HSAIL and finally to a custom instruction set. The platform was evaluated with several benchmarks: the OpenCL benchmarks are mostly the AMD sample programs, and the OpenGL benchmarks are programs implementing classic shading algorithms. On this platform, performance issues in different GPU microarchitectures, ISA configurations, task-scheduling algorithms, and SIMD control-divergence handling mechanisms were analyzed.
Arora (2012) explored the architecture and evolution of general-purpose CPU-GPU systems, beginning with a description of state-of-the-art GPGPU (General-Purpose Graphics Processing Unit) designs and of proposed solutions to key GPGPU issues: performance loss due to control-flow divergence and poor scheduling. As a first step, chip integration offers better performance; moreover, reduced latencies and increased bandwidth are enabling optimizations that were previously not possible. Comprehensive CPU-GPU system enhancement methods, such as CPU core optimizations, redundancy elimination, and the optimized design of shared components, were described, and opportunistic enhancement of the CPU-GPU system by means of collaborative execution was considered. Finally, the work recommended future research directions for CPU-GPU systems.
Grossman (2013) presented four heterogeneous programming frameworks, each with the high-level objective of improving the programmability of heterogeneous platforms while either maintaining or improving performance. This objective was achieved by balancing architectural transparency with programming abstractions. Each of the programming models or runtime systems introduced positions itself at a different point between emphasizing programmability and emphasizing performance; arguably, all of them lie somewhere between low-level heterogeneous programming models (like CUDA and OpenCL) and high-level models (like OpenACC or the CUDA libraries). By striking a better balance between abstraction and transparency, these programming models enable programmers to be productive and to create high-performance applications on heterogeneous platforms. Nevertheless, despite the research effort invested in heterogeneous programming models, it is easy to argue that the problem of efficiently developing realistically complex applications on real-world, distributed, heterogeneous systems remains largely unsolved.
Power et al. (2013) developed Heterogeneous System Coherence (HSC) for CPU-GPU systems to mitigate the coherence-bandwidth effects of GPU memory requests. HSC replaces a standard directory with a region directory and adds a region buffer to the L2 cache. These structures allow the system to move bandwidth from the coherence network to the high-bandwidth direct-access bus without sacrificing coherence. The results were evaluated with a subset of the Rodinia benchmarks and the AMD APP SDK, and they demonstrated that HSC can improve performance compared with a conventional directory protocol by an average of more than 2x and a maximum of more than 4.5x. Furthermore, HSC reduces the bandwidth to the directory by an average of 94%, and by over 99% for four of the analyzed benchmarks.
III. METHODOLOGY
Various works on HSA were examined; the possible routes by which HSA can be enhanced were identified, together with the existing tools and techniques used. The related works reviewed concentrate on mitigating the bottlenecks of HSA, as summarized in Table 1.
Table 1. HSA bottlenecks, the procedures applied to them, and the results

Bottleneck: Programming model
Procedure: Balancing architectural transparency with programming abstractions.
Result: By striking a better balance, the programmability of the Heterogeneous System Architecture can be achieved and maintained, thus enhancing performance.

Bottleneck: Programming model
Procedure: A virtual platform conforming to the HSA programming model and the HSA Intermediate Language (HSAIL) specification.
Result: Explores the microarchitecture design and evaluates the performance issues for both OpenCL and OpenGL applications.

Bottleneck: Performance
Procedure: Chip integration.
Result: Better performance.

Bottleneck: Communication overhead
Procedure: Heterogeneous System Coherence (HSC) replaces a standard directory with a region directory and adds a region buffer to the L2 cache.
Result: Moves bandwidth from the coherence network to the high-bandwidth direct-access bus without sacrificing coherence.
These procedures can be used individually or combined in a number of ways to better mitigate the bottlenecks identified and to improve, and fast-track the acceptance of, Heterogeneous System Architecture in day-to-day computation. However effective these approaches are, and in spite of the research effort invested in heterogeneous system architecture, it is easy to argue that the problem of efficiently developing realistically complex applications on real-world, distributed, heterogeneous systems remains largely unsolved.
IV. RESULT
The architectural path for the future is clear: an open design, with published specifications and an open-source execution software stack, permitting programming models established on Symmetric Multi-Processor (SMP) systems to migrate to the heterogeneous world. Heterogeneous cores cooperate seamlessly in coherent memory, permitting low-latency dispatch with no software fault lines. This has produced a game-changing HSA: parallel-compute adoption at a tipping point, a strong and growing software ecosystem, and the winning of developers' hearts and minds.
V. CONCLUSION
Heterogeneity is an increasingly essential trend, and the market is finally beginning to create and adopt the necessary open standards. HSA is a unified computing framework. It provides a single address space accessible to both CPU and GPU (to avoid data copying), user-space queuing (to minimize communication overhead), and preemptive context switching (for better quality of service) across all computing elements in the system. HSA unifies CPUs and GPUs into a single system with common computing concepts, allowing the developer to solve a greater variety of complex problems more easily. However, the current state of the art of GPU high-performance computing is not flexible enough for many of today's computational problems.
VI. REFERENCES
[1]. Arora, M. (2012). The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing. San Diego.
[2]. George Kyriazis (2012). Heterogeneous System Architecture: A Technical Review.
[3]. Grossman, M. (2013). Programming Models and Runtimes for Heterogeneous Systems. Houston, Texas.
[4]. Hedge, M. (2013). Heterogeneous System Architecture and the Software Ecosystem.
[5]. Hsu, Y., Chen, H.-Y., & Chen, C.-H. (2015). A Heterogeneous System Architecture Conformed GPU Platform Supporting OpenCL and OpenGL.
[6]. Nickolls, J., & Kirk, D. (2009). Appendix A: Graphics and Computing GPUs.
[7]. Nickolls, J., & Kirk, D. (2012). Appendix A: Graphics and Computing GPUs.
[8]. Paul. (2014, August 24). The History of the Multi-Core Processor. Retrieved from BurnWorld.com: www.burnworld.com/the-history-of-the-multi-core-processor/
[9]. Power, J., Basu, A., Gu, J., Puthoor, S., Beckmann, B. M., Hill, M. D., & Wood, D. A. (2013). Heterogeneous System Coherence for Integrated CPU-GPU Systems.
[10]. Sato, T., Mori, H., Yano, R., & Hayashida, T. (2012). Importance of Single-Core Performance in the Multicore Era. Thirty-Fifth Australasian Computer Science Conference (ACSC 2012), Melbourne, Australia.