Frank Ableson presented on unlocking Android. He discussed his background and experience with Android development. He reviewed common Android resources for developers such as the Android developer site. He then demonstrated a sample field service application built for Android, walking through the architecture and user flows. He also showed how to access Twitter from an Android device using the Android Scripting Environment and Python code. Finally, he briefly mentioned building native Android applications using C code.
[Harvard CS264] 11a - Programming the Memory Hierarchy with Sequoia (Mike Bau... npinto
This document discusses performance optimization of GPU kernels. It outlines how to analyze a kernel to determine whether it is limited by memory bandwidth, instruction throughput, or latency. The profiler can identify the limiting factor by comparing memory transactions with instructions issued. Modifying the source to produce memory-only and math-only versions of a kernel helps assess the balance between memory access and computation, and how well latency is being hidden. The goal is to optimize kernels by addressing their most significant performance limiters first.
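The two analyses described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10% tolerance, and the decision rules are hypothetical simplifications, not taken from the summarized slides or from any profiler's API.

```python
def classify_by_ratio(instructions, bytes_moved, peak_instr_per_s, peak_bytes_per_s):
    """Compare a kernel's instruction-to-byte ratio with the device's ratio.

    All four inputs are assumed to come from a profiler and a device data sheet.
    """
    kernel_ratio = instructions / bytes_moved            # work issued per byte moved
    device_ratio = peak_instr_per_s / peak_bytes_per_s   # what the hardware can sustain
    return "instruction-bound" if kernel_ratio > device_ratio else "memory-bound"


def diagnose_by_timing(t_full, t_mem_only, t_math_only, tol=0.1):
    """Interpret timings of the full, memory-only, and math-only kernel versions.

    If the full kernel takes about as long as both parts run back to back,
    memory and math are not overlapping, which points at latency.
    Otherwise the slower of the two parts is taken as the limiter.
    """
    if t_full >= (1 - tol) * (t_mem_only + t_math_only):
        return "latency-bound"
    if t_mem_only >= t_math_only:
        return "memory-bound"
    return "instruction-bound"
```

For example, with made-up timings of 10 ms (full), 9 ms (memory-only), and 3 ms (math-only), `diagnose_by_timing(10, 9, 3)` returns `"memory-bound"`, since the memory-only version accounts for most of the runtime and overlaps well with the math.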
High-Performance Computing Needs Machine Learning... And Vice Versa (NIPS 201... npinto
This document discusses using high-performance computing for machine learning tasks such as evaluating large convolutional neural networks for visual object recognition. It proposes running hundreds of thousands of large neural network models in parallel on GPUs to search the parameter space more efficiently, covering far more of it than a single graduate student tuning one model at a time normally could. This high-throughput screening approach aims to identify better-performing network architectures by exploring a vast number of parameter combinations.
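The screening idea can be sketched as a random search over a parameter space. Everything below is an illustrative assumption: the parameter names, their ranges, and `toy_score` (a stand-in for actually training and evaluating a network) are hypothetical and not taken from the summarized talk. In a real setting each candidate would be evaluated independently, which is what makes the search easy to run in parallel.

```python
import random


def sample_config(rng):
    """Draw one candidate architecture from a small, made-up parameter space."""
    return {
        "num_layers": rng.choice([1, 2, 3]),
        "filter_size": rng.choice([3, 5, 7, 9]),
        "num_filters": rng.choice([16, 32, 64, 128]),
    }


def toy_score(config):
    """Placeholder for 'train and evaluate this model'; not a real metric."""
    return (
        0.1 * config["num_layers"]
        + config["num_filters"] / 1000
        - 0.01 * abs(config["filter_size"] - 5)
    )


def screen(num_candidates, seed=0):
    """Score many random candidates and return the best (score, config) pair."""
    rng = random.Random(seed)
    candidates = [sample_config(rng) for _ in range(num_candidates)]
    scored = [(toy_score(c), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    best_score, best_config = screen(1000)
    print(best_score, best_config)
```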
This document summarizes a student's MASc research on developing an area-efficient FPGA architecture for datapath circuits. It proposes combining bus-based and bit-based routing to make better use of multi-bit computing elements. Simulation results show that the multi-bit logic block approach reduces routing area by 14% compared to conventional FPGAs. Future work involves exploring directional single-driver wires, which could further reduce area by 25% and delay by 9% on average. The student seeks feedback on the modifications to the CAD flow needed to support the new architectural features.
An FPGA is a programmable logic device containing an array of configurable logic blocks and interconnects that can be programmed to implement different logic functions. It can be reprogrammed to take on a new function, reportedly within microseconds. The key parts of an FPGA are I/O blocks around the edge to interface with other components, logic blocks in the interior to implement logic functions, and interconnects to connect the blocks. FPGAs are programmed by configuring electronic switches that define the logic functions and connect the blocks as required.
The document discusses the evolution of programmable logic from TTL to FPGAs. It describes how early programmable logic arrays (PLAs) combined logic gates and registers into single devices with programmable connections. Modern FPGAs arrange logic blocks in an array with programmable interconnect to implement complex digital designs with high density, performance and reprogrammability. The document outlines FPGA architecture including look-up tables, routing resources and specialized blocks to efficiently implement applications like high-speed data processing.
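To make the look-up table idea concrete, here is a minimal Python sketch of how a k-input LUT can realize any logic function of its inputs: the stored truth table plays the role of the configuration bits, and the inputs select one entry. The helper name `make_lut` and the bit ordering (first input is the most significant index bit) are assumptions made for illustration, not a description of any particular FPGA family.

```python
def make_lut(truth_table):
    """Build a k-input look-up table from a list of 2**k output bits."""
    k = len(truth_table).bit_length() - 1
    if len(truth_table) != 2 ** k:
        raise ValueError("truth table length must be a power of two")

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs, got {len(inputs)}")
        # Pack the input bits into an index, first input as the top bit.
        index = 0
        for bit in inputs:
            index = (index << 1) | (1 if bit else 0)
        return truth_table[index]

    return lut


# "Reprogramming" amounts to loading a different truth table.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])
```

Here `xor2(1, 0)` returns 1 and `and2(1, 0)` returns 0; the same structure implements both functions, and only the stored table differs.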
The document provides instructions for creating a new Eclipse project to build the CarryDrop simulation by importing existing code from an online tutorial. It describes how to create a new Java project in Eclipse, add the RePast library to the build path, and then create and populate new classes - CarryDropModel, CarryDropAgent, and CarryDropSpace - by copying code from the tutorial website. Once all the classes are created, the simulation can be run by selecting the CarryDropModel main class.
This document summarizes steps to build an agent-based simulation model called CarryDrop in the RePast modeling framework. The key steps include:
1. Creating a SimModel object that represents the overall model and controls initialization and scheduling.
2. Adding user-settable parameters for number of agents, world size, and total money.
3. Creating an Agent class to represent individual carriers and defining their attributes.
4. Building a Space class to represent the 2D grid world and initialize the distribution of money across locations.
5. Integrating the Space class into the model and adding visualization of money amounts using colors.
6. Implementing agent behaviors like moving between locations and
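The structure outlined in these steps can be sketched in plain Python. This is not RePast code (RePast is a Java framework) and not the tutorial's implementation: the class names `MoneyGrid` and `Carrier`, the wrap-around edges, and the one-unit-at-a-time money placement are all assumptions chosen for illustration. Only movement is modeled for the agent, since the summary above does not spell out the rest of its behavior.

```python
import random


class MoneyGrid:
    """A 2D grid world where each cell holds some amount of money."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.money = [[0] * width for _ in range(height)]

    def spread_money(self, total, rng):
        """Drop `total` units of money on randomly chosen cells, one at a time."""
        for _ in range(total):
            x = rng.randrange(self.width)
            y = rng.randrange(self.height)
            self.money[y][x] += 1

    def total_money(self):
        return sum(sum(row) for row in self.money)


class Carrier:
    """An agent with a position on the grid."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def move(self, grid, rng):
        """Step to a random neighboring cell; edges wrap around (an assumption)."""
        dx, dy = rng.choice([(0, 1), (1, 0), (0, -1), (-1, 0)])
        self.x = (self.x + dx) % grid.width
        self.y = (self.y + dy) % grid.height


if __name__ == "__main__":
    rng = random.Random(42)
    # User-settable parameters from step 2: agent count, world size, total money.
    num_agents, world_width, world_height, total_money = 5, 10, 10, 200
    grid = MoneyGrid(world_width, world_height)
    grid.spread_money(total_money, rng)
    agents = [
        Carrier(rng.randrange(world_width), rng.randrange(world_height))
        for _ in range(num_agents)
    ]
    for _ in range(20):  # a very simple schedule: every agent moves each tick
        for agent in agents:
            agent.move(grid, rng)
    print(grid.total_money(), [(a.x, a.y) for a in agents])
```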