In this deck from the HPC User Forum, David Wolpert from the Santa Fe Institute presents: Thermodynamics of Computation: Far More Than Counting Bit Erasure.
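The "bit erasure" in the title is usually quantified with Landauer's bound, which puts the minimum heat dissipated when erasing one bit at k_B * T * ln 2. The deck description does not state the formula itself, so the following is only a minimal sketch for scale. The function name and the 300 K room temperature are assumptions chosen for illustration.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_k: float, bits: float = 1.0) -> float:
    """Lower bound on heat dissipated when erasing `bits` bits at temperature_k."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return bits * BOLTZMANN_J_PER_K * temperature_k * math.log(2)


if __name__ == "__main__":
    # Assumed room temperature of 300 K; purely illustrative.
    per_bit = landauer_limit_joules(300.0)
    print(f"Landauer bound at 300 K: {per_bit:.3e} J per erased bit")
    # Erasing one gigabyte (8e9 bits) at the bound.
    print(f"One gigabyte erased:     {landauer_limit_joules(300.0, 8e9):.3e} J")
```

The bound is many orders of magnitude below what real hardware dissipates per operation, which is part of why the talk argues that bit erasure alone is far from the whole picture.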
"The thermodynamic restrictions on all systems that perform computation provide major challenges to modern design of computers. The time is ripe to pursue a new field of science and engineering: a modern thermodynamics of computation. This would combine the resource/time tradeoffs of concern in conventional CS with the thermodynamic tradeoffs in computation that are now being revealed. In this way we should be able to develop the tools necessary both for analyzing thermodynamic costs in biological systems and for engineering next-generation computers."
Watch the video: https://wp.me/p3RLHQ-k4h
Learn more: https://www.santafe.edu/research/projects/thermodynamics-computation and http://hpcuserforum.com
1. The document provides engineering formulas and equations for statistics, mechanics, electricity, fluid mechanics, thermodynamics, structural analysis, and simple machines.
2. Key formulas include those for mean, median, mode, standard deviation, and probability. Mechanics formulas include those for force, torque, energy, power, and kinematics.
3. Formulas are also provided for stress, strain, modulus of elasticity, beam deflection, truss analysis, and mechanical advantage of simple machines like levers, inclined planes, and gears.
A Simple Communication System Design Lab #4 with MATLAB Simulink - Jaewook Kang
This document outlines a communication systems design lab using MATLAB Simulink. It discusses implementing various components of a communication system including channels, phase splitters, up/down conversion, and more. The lab covers how to build subsystems, use MATLAB functions in Simulink, and bring variables from the workspace. The goal is to complete a target communication system by implementing a channel model using Simulink blocks, MATLAB functions, and variables from the workspace.
A Simple Communication System Design Lab #3 with MATLAB Simulink - Jaewook Kang
This document outlines the schedule and topics for a series of labs on communication system design using MATLAB Simulink. The upcoming Lab #3 will cover phase splitting, which extracts the real and imaginary components from a complex baseband signal, and up/down conversion, which shifts signals between baseband and intermediate frequencies. The lab is scheduled for April 1st from 1-4pm and will be instructed by Jaewook Kang. Previous and future labs will cover topics like OFDM, S-function design, channel modeling, and subsystem implementation.
This document provides an overview of RF transceiver systems and related concepts. It begins with definitions of dB, phasors, and modulation techniques. It then discusses transmitter and receiver architectures, moving from basics to more advanced concepts. Key topics covered include I/Q modulation, linear modulation, transmitter architectures using either I/Q or polar modulation, and the use of phasors in various applications from circuit analysis to communications systems.
The document discusses block ciphers and stream ciphers. It defines block ciphers as encrypting data in fixed-size blocks using the same key for each block. Stream ciphers encrypt individual bits or characters, generating a unique key for each bit using a pseudorandom number generator. The document then focuses on stream ciphers, describing synchronous and self-synchronizing stream ciphers, linear feedback shift registers (LFSRs) used to generate keystreams, and how to determine a stream cipher's characteristic polynomial from its keystream bits.
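A small LFSR keystream generator makes the stream-cipher idea concrete. This is a minimal sketch, not the construction from the summarized document: the 4-bit register, the seed, and the tap positions are assumptions chosen so that the recurrence is s[n+4] = s[n] XOR s[n+1], whose characteristic polynomial is x^4 + x + 1. A bare LFSR like this is not secure and is shown only for illustration.

```python
from typing import List, Sequence


def lfsr_keystream(seed_bits: Sequence[int], taps: Sequence[int], count: int) -> List[int]:
    """Generate `count` keystream bits from a Fibonacci-style LFSR.

    The output bit is state[0]; the new bit shifted in at the end is the
    XOR of the state bits at the tap positions.
    """
    state = list(seed_bits)
    if not any(state):
        raise ValueError("an all-zero seed produces an all-zero keystream")
    out = []
    for _ in range(count):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]
    return out


def xor_bits(data_bits: Sequence[int], keystream: Sequence[int]) -> List[int]:
    """Encrypt or decrypt by XORing data with the keystream (same operation)."""
    return [d ^ k for d, k in zip(data_bits, keystream)]


if __name__ == "__main__":
    # Assumed example: 4-bit register, taps at positions 0 and 1.
    stream = lfsr_keystream([1, 0, 0, 0], taps=(0, 1), count=30)
    print("keystream:", "".join(map(str, stream)))
    message = [1, 1, 0, 1, 0, 0, 1, 0]
    cipher = xor_bits(message, stream)
    print("round trip ok:", xor_bits(cipher, stream) == message)
```

Because x^4 + x + 1 is primitive, the keystream repeats every 2^4 - 1 = 15 bits for any nonzero seed, which is the longest period a 4-bit LFSR can reach.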
The document describes optimizing a lighting calculation for the SPU by analyzing memory requirements, partitioning data, and rearranging data for a streaming model. It then provides an example of optimizing a lighting calculation function, including vectorizing the calculation by hand to process 4 vertices simultaneously. Compiler hints reduced the calculation time from 231.6 to 208.5 cycles per vertex per light, and manual vectorization is estimated to improve performance further.
(1) The document discusses obtaining first-order low-pass, high-pass, band-reject, and band-pass filters from a first-order all-pass filter.
(2) It shows how to derive the transfer functions for each filter type by adding or subtracting the input and output of one or two cascaded all-pass filter sections.
(3) Key specifications of each filter like cutoff frequencies, gain, and bandwidth are calculated and verified through SPICE simulation, showing good agreement between calculated and simulated responses.
The document describes an experiment to verify the Nyquist sampling theorem using MATLAB. It discusses sampling a continuous time signal at frequencies below, equal to, and above twice the maximum frequency of the signal. The results show aliasing when sampling below the Nyquist rate, no aliasing when sampling at the Nyquist rate, and perfect reconstruction when sampling above the Nyquist rate. The experiment generates a sinusoidal signal, samples it at different rates, and plots the discrete and reconstructed continuous signals to demonstrate the sampling theorem.
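The aliasing effect described above can be reproduced without MATLAB. The sketch below is an illustration under assumed values (a 5 Hz cosine sampled at 8 Hz and at 12 Hz), not the experiment from the summarized document. It computes the frequency a sampled tone appears to have, and shows that an undersampled tone yields exactly the same samples as its alias.

```python
import math
from typing import List


def apparent_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Frequency that a sampled sinusoid appears to have after sampling.

    The tone is folded to the nearest multiple of the sample rate, so the
    result always lies between 0 and sample_rate_hz / 2.
    """
    k = round(signal_hz / sample_rate_hz)
    return abs(signal_hz - k * sample_rate_hz)


def sample_cosine(freq_hz: float, sample_rate_hz: float, count: int) -> List[float]:
    """Return `count` samples of cos(2*pi*f*t) taken at t = n / sample_rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


if __name__ == "__main__":
    tone = 5.0  # Hz, assumed test signal
    for rate in (8.0, 12.0):  # below and above 2 * tone
        print(f"fs = {rate:4.1f} Hz -> appears as {apparent_frequency(tone, rate):.1f} Hz")
    # Undersampled at 8 Hz, the 5 Hz tone is indistinguishable from a 3 Hz tone.
    a = sample_cosine(5.0, 8.0, 16)
    b = sample_cosine(3.0, 8.0, 16)
    print("samples identical:", all(abs(x - y) < 1e-9 for x, y in zip(a, b)))
```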
This document provides an introduction to SPU optimizations by summarizing the SPU assembly instructions. It begins by explaining the SPU execution environment and memory model. It then categorizes the instruction set into classes based on arity and latency. The majority of the document details the various instructions in the Single Precision Floating Point (SP), Fixed precision (FX), and other classes; explaining their syntax, latency, and examples of use. The goal is to familiarize programmers with the SPU hardware and instruction set to enable improved performance through optimization techniques.
The document summarizes network flow problems and the Ford-Fulkerson algorithm for finding the maximum flow in a network. It introduces network representations using graphs, the concepts of capacity, flow, and defines the maximum flow problem. It then describes the Ford-Fulkerson method using residual networks and augmenting paths to iteratively increase the flow. It proves that this process results in a valid flow and terminates when no augmenting path remains, at which point the maximum flow has been found.
This document presents an overview of the Ford-Fulkerson algorithm for computing maximum flow in a flow network. It defines the Ford-Fulkerson algorithm and provides pseudocode. The algorithm works by finding augmenting paths in the residual graph to incrementally increase the flow from the source to the sink. Examples are given to demonstrate computing the maximum flow and residual capacities. Applications of maximum flow problems include modeling traffic networks, fluid flows, electrical circuits, and computer networks.
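The augmenting-path loop described in these summaries can be sketched as follows. This is not the pseudocode from either document: it picks augmenting paths with breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson), and the dictionary-based graph format and the small example network are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Hashable

Graph = Dict[Hashable, Dict[Hashable, int]]


def max_flow(capacity: Graph, source: Hashable, sink: Hashable) -> int:
    """Maximum flow from source to sink using BFS augmenting paths."""
    if source == sink:
        raise ValueError("source and sink must differ")
    # Build the residual graph, adding zero-capacity reverse edges.
    residual: Graph = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    total = 0
    while True:
        # Breadth-first search for a path with spare residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path remains

        # Find the bottleneck capacity along the path.
        bottleneck = None
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap = residual[u][v]
            bottleneck = cap if bottleneck is None else min(bottleneck, cap)
            v = u

        # Push flow: reduce forward edges, increase reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


if __name__ == "__main__":
    # Assumed example network; edge values are capacities.
    network = {
        "s": {"a": 3, "b": 2},
        "a": {"b": 1, "t": 2},
        "b": {"t": 3},
    }
    print("max flow:", max_flow(network, "s", "t"))
```

Using breadth-first search is a design choice: it guarantees termination in polynomial time, whereas plain Ford-Fulkerson leaves the path-selection rule open.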
Learning Erlang (from a Prolog dropout's perspective) - elliando dias
This document is a presentation by Kenji Rikitake given at the 1000speakers:4 conference on 26-APR-2008 about learning Erlang from the perspective of a former Prolog programmer. It discusses why Rikitake didn't like Prolog in the 1980s, why he is now interested in Erlang, some challenges in learning Erlang, examples of IPv6 string manipulation and parallel processing he will demonstrate, results from concurrency testing using Erlang's built-in parallel map function, and conclusions about Erlang and parallel programming.
TF 2.0 is designed to improve usability and productivity. As an enthusiastic TF user, I am very excited. Personally, I think the most important question about usability is "how does TF provide a user-friendly API?" Setting aside the other aspects of TF 2.0, this post is a quick review from an API usage perspective.
This document discusses pulse amplitude modulation (PAM) and matched filtering. It begins with an outline of topics to be covered, including PAM, matched filtering, PAM systems, intersymbol interference, and eye diagrams. It then provides definitions and illustrations of digital and analog PAM. The key aspects of matched filtering are introduced, including its use for pulse detection in additive noise. Derivations show that the optimal matched filter is a time-reversed and scaled version of the transmitted pulse shape. Intersymbol interference is discussed and methods to eliminate it are presented. Bit error probability calculations for binary PAM signals are also covered.
The document describes a beamforming system with five transmitter antennas and five receiver antennas. It aims to find the optimal location of five scattering devices to maximize the received power at each receiver, allowing five data streams to be transmitted simultaneously to five users. It outlines the process of calculating the channel matrix H which characterizes the propagation environment and incorporates the effects of scattering devices located between transmitters and receivers. Key parameters like path loss, phase angle, and channel state information are considered in determining the elements of the H matrix.
Search and optimization on quantum accelerators - 2019-05-23 - Aritra Sarkar
The document discusses search and optimization on quantum accelerators. It begins with an overview of the big picture of quantum search and optimization. It then discusses quantum search algorithms like Grover's algorithm and conditional oracle algorithms. It describes how these algorithms can be used for problems like sub-sequence index search. The document next discusses quantum optimization algorithms like the quantum approximate optimization algorithm (QAOA). It notes that QAOA is a hybrid quantum-classical algorithm that can be used to solve NP-hard combinatorial optimization problems. Finally, it provides an example of how QAOA could potentially be applied to problems in genomics optimization.
Circuit complexity is a model of computation that uses Boolean circuits to compute functions. A Boolean circuit is a collection of gates such as AND, OR, and NOT connected by wires without cycles. A circuit family can decide a language by providing one circuit for each input length. The size and depth complexity of a circuit family are the minimum size and depth among equivalent circuits. The circuit satisfiability problem (CIRCUIT-SAT), which asks whether some input assignment makes a circuit output 1, is NP-complete, so deciding satisfiability is believed to be hard even though evaluating a circuit on a given input is easy.
RF Circuit Design - [Ch2-1] Resonator and Impedance Matching - Simen Li
1) The document discusses resonators and impedance matching using lumped elements. It describes series and parallel resonant circuits, quality factor, bandwidth, and loaded/unloaded Q.
2) It also covers two-element L-shaped impedance matching networks for matching a load impedance to a source impedance. Methods for determining the reactance and susceptance values are presented for cases where the source impedance is less than or greater than the load impedance.
3) The goal of impedance matching is to maximize power transfer by making the impedances seen looking into the matching network equal to the source or transmission line impedance.
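For the two-element L-network in item 2, the component reactances follow from the standard relation Q = sqrt(R_high / R_low - 1). The sketch below assumes purely resistive source and load, and the 50 ohm and 200 ohm values are illustrative, not taken from the summarized slides. It places the shunt element across the larger resistance and the series element toward the smaller one, then checks the result with complex arithmetic.

```python
import math
from typing import Tuple


def l_match(r_low: float, r_high: float) -> Tuple[float, float, float]:
    """Return (Q, series reactance, shunt reactance) magnitudes in ohms.

    Assumes purely resistive terminations with 0 < r_low < r_high.
    """
    if not 0 < r_low < r_high:
        raise ValueError("require 0 < r_low < r_high")
    q = math.sqrt(r_high / r_low - 1.0)
    x_series = q * r_low    # in series with the smaller resistance
    x_shunt = r_high / q    # in parallel with the larger resistance
    return q, x_series, x_shunt


def input_impedance(r_high: float, x_series: float, x_shunt: float) -> complex:
    """Impedance seen looking into the network with r_high as the load.

    Uses a shunt inductive reactance (+jX) and a series capacitive
    reactance (-jX); the opposite pairing works equally well.
    """
    z_parallel = (r_high * 1j * x_shunt) / (r_high + 1j * x_shunt)
    return z_parallel - 1j * x_series


if __name__ == "__main__":
    q, xs, xp = l_match(50.0, 200.0)
    print(f"Q = {q:.3f}, X_series = {xs:.2f} ohm, X_shunt = {xp:.2f} ohm")
    z_in = input_impedance(200.0, xs, xp)
    print(f"input impedance = {z_in.real:.2f} {z_in.imag:+.2f}j ohm")
```

At the design frequency the shunt element transforms the real part down to the smaller resistance and the series element cancels the leftover reactance, so the source sees a matched, purely resistive load.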
The document summarizes the Ford-Fulkerson algorithm for finding the maximum flow in a flow network. It defines key terms like flow network, source, sink, flow, residual graph and augmented path. It then outlines the steps of the Ford-Fulkerson algorithm to incrementally send flow along augmented paths from the source to the sink until no more such paths exist. An example applying the algorithm to find the maximum flow in a sample network is provided with illustrations of the residual capacities after each flow augmentation.
Kinematic Equations for Uniformly Accelerated Motion - Pavishma Suresh
This document derives three equations for calculating displacement (x) from velocity-time graphs using different methods:
1) x = Vo*t + (1/2)*a*t^2
2) x = ((V + Vo)/2) * t
3) x = (V^2 - Vo^2) / (2*a)
Where Vo is the initial velocity, V is the final velocity, t is time, and a is acceleration. The derivations show calculating the area under the velocity-time graph in different ways to relate it to displacement.
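A quick numerical check shows that the three expressions agree when V = Vo + a*t. The values used (Vo = 2 m/s, a = 3 m/s^2, t = 4 s) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def displacement_from_time(v0: float, a: float, t: float) -> float:
    """Equation 1: x = v0*t + (1/2)*a*t^2."""
    return v0 * t + 0.5 * a * t ** 2


def displacement_from_average_velocity(v0: float, v: float, t: float) -> float:
    """Equation 2: x = ((v + v0) / 2) * t."""
    return (v + v0) / 2.0 * t


def displacement_from_velocities(v0: float, v: float, a: float) -> float:
    """Equation 3: x = (v^2 - v0^2) / (2*a); requires a != 0."""
    return (v ** 2 - v0 ** 2) / (2.0 * a)


if __name__ == "__main__":
    v0, a, t = 2.0, 3.0, 4.0   # assumed example values
    v = v0 + a * t             # final velocity under constant acceleration
    print(displacement_from_time(v0, a, t))
    print(displacement_from_average_velocity(v0, v, t))
    print(displacement_from_velocities(v0, v, a))
```

All three calls return the same displacement because each is a different way of measuring the same area under the velocity-time line.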
Concurrent Triple Band Low Noise Amplifier Design - Halil Kayıhan
The document describes the design of a concurrent triple-band low noise amplifier (LNA) that operates at 1.8 GHz, 2.4 GHz, and 5.2 GHz. A cascode structure with a source degeneration inductor is used. The input matching network employs a multi-element LC filter to match the input to 50 ohms across all three bands. Separate output resonance circuits are used for each band. Simulation results show the LNA achieves good input matching and noise figure across bands while providing sufficient gain and linearity.
• Developed standard library cells using IBM 130nm technology in Cadence Virtuoso Layout editor for inverter, nand2, nor2, xnor2, mux2:1, oai2221, aoi22, oai121 and a master-slave negative edge triggered D-flip-flop with minimum area and diffusion breaks. Constructed the schematic, performed DRC-LVS closure of layout and generated a SPICE netlist with Calibre PEX extraction of all the standard cells.
• Simulated the netlists in HSPICE, verified their functional correctness, and performed timing analysis of the D-flip-flop setup and hold times. Generated a new Synopsys cell library using SiliconSmart ACE and a new Cadence cell library from all the standard cells.
The document discusses the transfer of a reservoir model for the Snorre WFB pilot to the STARS simulator to re-evaluate the impact of foam. It describes establishing a workflow for foam simulation, benchmarking simulation run times, and conducting sensitivity studies on foam parameters like mobility reduction factor, dry-out, oil tolerance, and shear thinning effects. The study aims to history match production data from the pilot and provide recommendations to improve foam simulation capabilities in STARS.
This document discusses concurrency in Go and provides examples of using goroutines and channels for common concurrency patterns like background jobs, streaming data processing, and building services. It explains how goroutines allow running functions concurrently, and how typed channels enable goroutines to communicate and synchronize work through message passing. The examples demonstrate spawning goroutines, piping data between processes, and implementing a service backend that handles requests concurrently using a select statement.
This document discusses the history and development of predictive functional control (PFC). It begins by discussing how Jean Piaget's work on cognitive psychology and learning influenced the concepts of internal models, reference trajectories, and error compensation, which are key principles of PFC. It then states that PFC is not an invention but a discovery based on these natural principles of control. The document provides several examples of implementing PFC to control different industrial processes like reactors, casting, and pumping. It emphasizes that PFC allows transparent control that transfers constraints to help control complex nonlinear systems.
The document discusses delay modeling in digital VLSI circuits. It notes that circuit delay depends on many factors like charge, discharge, parasitics, transistor width-to-length ratio, fan-in, fan-out and topology. Existing delay models do not clearly indicate the contribution of each factor. This wastes circuit designers' time in simulation and tweaking. The document then presents a delay model based on logical effort that estimates delay based on the topology of the gate and relative sizes of its transistors. It shows how to compute logical effort values and parasitic delays for different gates. Applying this model helps optimize circuit design parameters like transistor sizes, number of stages in a path and topology for minimum delay.
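In the usual formulation of the logical effort method, a stage's delay is d = g*h + p (logical effort times electrical effort, plus parasitic delay), and the minimum delay of an N-stage path is N * F^(1/N) + P, where F is the product of the path's logical, branching, and electrical efforts. The sketch below assumes the commonly quoted textbook values for g and p of an inverter, a 2-input NAND, and a 2-input NOR. It is not taken from the summarized document, and delays are in units of the technology's basic delay tau.

```python
import math
from typing import Sequence

# Assumed textbook values: name -> (logical effort g, parasitic delay p).
GATES = {
    "inv": (1.0, 1.0),
    "nand2": (4.0 / 3.0, 2.0),
    "nor2": (5.0 / 3.0, 2.0),
}


def stage_delay(g: float, h: float, p: float) -> float:
    """Delay of one stage: effort delay g*h plus parasitic delay p."""
    return g * h + p


def min_path_delay(gates: Sequence[str], electrical_effort: float,
                   branching_effort: float = 1.0) -> float:
    """Minimum path delay when every stage bears the same effort."""
    n = len(gates)
    if n == 0:
        raise ValueError("path must contain at least one gate")
    path_logical_effort = math.prod(GATES[name][0] for name in gates)
    path_parasitic = sum(GATES[name][1] for name in gates)
    path_effort = path_logical_effort * branching_effort * electrical_effort
    best_stage_effort = path_effort ** (1.0 / n)
    return n * best_stage_effort + path_parasitic


if __name__ == "__main__":
    # Three 2-input NANDs driving a load equal to the path's input capacitance.
    print(f"{min_path_delay(['nand2'] * 3, electrical_effort=1.0):.2f} tau")
```

The model makes each factor's contribution explicit: topology enters through g and p, sizing through h, and the number of stages through N.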
An Efficient Construction of Online Testable Circuits using Reversible Logic ... - ijsrd.com
Testable, fault-tolerant systems are vital for many safety-critical applications. Because of its low heat dissipation, reversible logic has been gaining interest in recent times. Any Boolean logic function can be implemented using reversible gates. The paper proposes a technique to convert any reversible logic gate into a testable gate that is also reversible. The resulting reversible testable gate can detect, online, any single-bit error, including Single Stuck Faults and Single Event Upsets (S. Karp et al.). The proposed technique is illustrated with an example that converts a reversible decoder circuit into an online testable reversible decoder circuit.
This document describes a conjugate heat transfer analysis of an electronics cooling system using OpenFOAM. It outlines the objectives to develop a CFD model for CHT analysis and validate it with experiments. The methodology section describes the governing equations solved for fluid and solid regions as well as the interface coupling. A simple circuit board cooling case is modeled and tested. Additionally, a server cooling case is proposed with details on geometry, meshing, boundary conditions and results showing temperature distributions.
This document discusses timing analysis of logic circuits. It defines propagation delay time (tp) as the time required for an output signal to change due to a change in the input signal. A timing diagram is used to graphically represent tp. The document discusses how real circuits have intrinsic resistance and capacitance that cause delay. It provides an example of calculating delay through a simple RC circuit. Combinational logic delay is represented using a cloud model. The document also discusses setup time, hold time, and register delay time for D flip-flops and how to calculate maximum switching frequency, including using pipelining to increase maximum frequency.
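The maximum-frequency calculation mentioned above can be sketched as follows, assuming the usual register-to-register constraint that the clock period must cover clock-to-Q delay, combinational logic delay, and setup time. The function name, the even split of logic across pipeline stages, and the numeric values are assumptions for illustration; hold-time checks are not modeled.

```python
def max_clock_mhz(t_clk_to_q_ns, t_logic_ns, t_setup_ns, stages=1):
    """Maximum clock frequency in MHz for a register-to-register path.

    Assumes the combinational logic divides evenly across `stages`
    pipeline stages and that every added register has the same
    clock-to-Q and setup times.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    min_period_ns = t_clk_to_q_ns + t_logic_ns / stages + t_setup_ns
    return 1000.0 / min_period_ns  # 1000 / ns gives MHz

# Hypothetical delays: 1 ns clock-to-Q, 8 ns of logic, 1 ns setup.
unpipelined = max_clock_mhz(1.0, 8.0, 1.0)
pipelined = max_clock_mhz(1.0, 8.0, 1.0, stages=2)
print(f"unpipelined: {unpipelined:.1f} MHz, two stages: {pipelined:.1f} MHz")
```

Splitting the logic in half does not double the frequency, because the register overhead (clock-to-Q plus setup) is paid in every stage.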
Electrical System Design transformer 4.pptxGulAhmad16
The document discusses the design of transformers, including their construction types (core type and shell type) and key differences. It also covers the output equations for single phase and three phase transformers, which relate the kVA output to factors like the core area, window space, current density, and number of turns. The equations show that kVA output is directly proportional to factors like frequency, flux, and the product of core area and window space. The document also mentions the ratio of specific magnetic to electric loading (r) used in transformer design and provides typical r values for different transformer types.
This document provides an overview of various analysis tools available in EWB software for circuit simulation and analysis. It describes the following analysis types: DC operating point analysis, AC frequency analysis, transient analysis, Fourier analysis, noise analysis, distortion analysis, DC sweep analysis, sensitivity analysis, parameter sweep analysis, temperature sweep analysis, transfer function analysis, worst case analysis, pole zero analysis, and Monte Carlo analysis. For each analysis type, it provides a brief description of the analysis and an example circuit to demonstrate how to set up and interpret the results of that analysis.
The document discusses the equivalent circuit model of a transformer.
1) The equivalent circuit accounts for copper losses in the primary and secondary windings, eddy current losses in the core, hysteresis losses in the core, and leakage fluxes between the primary and secondary coils.
2) Key components of the equivalent circuit model include resistances to represent copper losses, inductances to represent the effects of mutual and leakage fluxes, and a resistance and inductance in parallel to represent core losses and excitation.
3) Test procedures for determining the parameters of the equivalent circuit model are described, including open circuit and short circuit tests to calculate resistance, reactance, and impedance values.
This document discusses porting the Gyrokinetic Tokamak Solver (GTS) plasma turbulence simulation code to GPUs using CUDA. GTS uses a particle-in-cell approach to simulate plasma fusion in tokamaks. The most computationally intensive part of GTS, the "gather" step which interpolates field quantities to particle positions, was ported to CUDA kernels. This resulted in a 3-4x speedup for the full code and 5-10x speedup for the "gather" kernels. Testing on the Titan supercomputer showed good weak scaling up to 8000 nodes. Using an NVIDIA K40 GPU instead of a K20X provided an additional 20% performance improvement due to the
EXPERT SYSTEMS AND SOLUTIONS
Project Center For Research in Power Electronics and Power Systems
IEEE 2010 , IEEE 2011 BASED PROJECTS FOR FINAL YEAR STUDENTS OF B.E
Email: expertsyssol@gmail.com,
Cell: +919952749533, +918608603634
www.researchprojects.info
OMR, CHENNAI
IEEE based Projects For
Final year students of B.E in
EEE, ECE, EIE,CSE
M.E (Power Systems)
M.E (Applied Electronics)
M.E (Power Electronics)
Ph.D Electrical and Electronics.
Training
Students can assemble their hardware in our Research labs. Experts will be guiding the projects.
EXPERT GUIDANCE IN POWER SYSTEMS POWER ELECTRONICS
We provide guidance and code for the following power systems areas.
1. Deregulated Systems,
2. Wind power Generation and Grid connection
3. Unit commitment
4. Economic Dispatch using AI methods
5. Voltage stability
6. FLC Control
7. Transformer Fault Identifications
8. SCADA - Power system Automation
We provide guidance and code for the following power electronics areas.
1. Three phase inverter and converters
2. Buck Boost Converter
3. Matrix Converter
4. Inverter and converter topologies
5. Fuzzy based control of Electric Drives.
6. Optimal design of Electrical Machines
7. BLDC and SR motor Drives
The document discusses parallel graph algorithms. It describes Dijkstra's algorithm for finding single-source shortest paths and its parallel formulations. It also describes Floyd's algorithm for finding all-pairs shortest paths and its parallel formulation using a 2D block mapping. Additionally, it discusses Johnson's algorithm, a modification of Dijkstra's algorithm to efficiently handle sparse graphs, and its parallel formulation.
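For reference, a minimal sequential sketch of Floyd's all-pairs shortest-paths algorithm is shown below; the parallel formulation described above would distribute blocks of the distance matrix across processes, which this sketch does not attempt. The example graph and names are assumptions for illustration.

```python
INF = float("inf")

def floyd_all_pairs(weights):
    """Return the matrix of shortest-path lengths for a weighted digraph.

    weights[i][j] is the edge weight from i to j, INF if there is no
    edge, and 0 on the diagonal. The input matrix is not modified.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):              # allow vertex k as an intermediate hop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

# Hypothetical 4-vertex graph.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_all_pairs(graph)
for row in shortest:
    print(row)
```

In iteration k every entry depends only on row k and column k of the current matrix, which is the property a block-partitioned parallel version relies on.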
Reduction of Total Harmonic Distortion in Cascaded H-Bridge Inverter by Patte...IJECEIAES
The pattern search technique can be used to solve optimization problems. In this paper, a pattern search algorithm is used to calculate the switching angles of a cascaded H-bridge inverter so as to minimize total harmonic distortion. The mathematical equations for the optimization problem were formulated using Fourier analysis. Lower-order harmonics (third, fifth, seventh, ninth and eleventh) were taken into account to mitigate the total harmonic distortion of the inverter. Simulations were carried out for thirteen-level, fifteen-level and seventeen-level cascaded H-bridge inverters using MATLAB. Total harmonic distortion of voltage and current was analyzed for resistive, resistive-inductive and motor loads.
1. The document describes a final project to build an analog PID control circuit using op-amps. It includes objectives, a list of components, and detailed instructions on assembling the circuit and testing it.
2. Key steps include deriving the transfer functions for the proportional, derivative, and integral controllers. Tests are done to observe input-output waveforms for each section alone and for the combined PID controller.
3. Optional tests include modifying the derivative and integral sections, testing with different input signals, closed-loop simulations, and integrating the PID controller into a double integrator plant model.
MOS Inverters Switching Characterstics and interconnect Effects-converted.pptxBalraj Singh
This document discusses MOS inverters and their switching characteristics. It introduces various parasitic capacitances associated with MOSFETs that affect inverter delay times. Delay time is defined as the time required for the output voltage to transition between logic levels. Formulas are provided to calculate delay times based on the load and average charging/discharging currents. The document also discusses estimating interconnect parasitic capacitances and resistances, and how to model interconnects as transmission lines at small scales. Methods for calculating delay due to interconnects such as the Elmore delay are presented. Buffer design to minimize delay for large capacitive loads is also covered.
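The Elmore delay mentioned above can be sketched for the simplest case, an unbranched RC ladder, where the delay at the far end is the sum over nodes of the node capacitance times the total resistance between the driver and that node. The function name and the segment values are assumptions for illustration; branched interconnect trees need the more general shared-path-resistance form.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay (seconds) at the far end of an unbranched RC ladder.

    resistances[i]  -- series resistance of segment i, in ohms
    capacitances[i] -- capacitance to ground at the far node of segment i, in farads
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    upstream_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_resistance += r          # total resistance from the driver to this node
        delay += upstream_resistance * c  # each capacitor charges through everything upstream
    return delay

# Hypothetical wire split into three segments of 100 ohms and 0.1 pF each.
tau = elmore_delay([100.0, 100.0, 100.0], [1e-13, 1e-13, 1e-13])
print(f"Elmore delay ~ {tau * 1e12:.0f} ps")
```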
I am Britney P. I love exploring new topics. Academic writing seemed an interesting option for me. After working for many years with progamminghomeworkhelp.com, I have assisted many students with their Design and Analysis of Algorithms Assignments. I can proudly say, each student I have served is happy with the quality of the solution that I have provided. I have acquired my bachelor's from Sunway University, Malaysia.
The document discusses power estimation techniques at the circuit level. It describes SPICE simulation as the standard tool for circuit-level power analysis and mentions faster analytical models. It covers topics like power characterization of cell libraries through simulation and probability-based power estimation, which involves calculating signal probabilities and switching activity. Switching activity depends on factors like activity factor, abnormal switching due to glitches, and static and dynamic components of transition probability.
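A minimal sketch of probability-based power estimation follows, under simplifying assumptions that are not taken from the summarized document: inputs are spatially independent, consecutive cycles are temporally independent, and glitches are ignored. Dynamic power uses the common alpha * C * V^2 * f form; all names and numbers are illustrative.

```python
def and_output_probability(p_a, p_b):
    """Probability that a 2-input AND output is 1, for independent inputs."""
    return p_a * p_b

def switching_activity(p_one):
    """Probability of a 0 -> 1 transition between consecutive cycles,
    assuming the signal value in each cycle is independent of the last."""
    return (1.0 - p_one) * p_one

def dynamic_power(activity, c_load_farads, vdd_volts, freq_hz):
    """Average dynamic power in watts: alpha * C * Vdd^2 * f."""
    return activity * c_load_farads * vdd_volts ** 2 * freq_hz

# Hypothetical gate: both inputs are 1 half the time, 10 fF load, 1 V supply, 1 GHz clock.
p_out = and_output_probability(0.5, 0.5)
alpha = switching_activity(p_out)
power = dynamic_power(alpha, 10e-15, 1.0, 1e9)
print(f"P(out=1) = {p_out}, activity = {alpha}, power ~ {power * 1e6:.3f} uW")
```

Glitching would raise the real activity above this estimate, which is one reason simulation-based characterization is still used alongside analytical models.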
HDT Italia produces EDA tools for signal integrity, hardware modeling, and EMC/EMI analysis and validation. Their main product is PRESTO, which uses the SPRINT simulation engine to enable fast, exhaustive simulation of entire printed circuit boards. PRESTO can analyze signal integrity issues, EMC/EMI compliance, and validate design functionality. It produces detailed reports and can interface with measurement equipment for validation. HDT also provides consulting services and the EmiR tool for predicting radiated emissions from PCB designs.
The document discusses first order active RC filter sections based on inverting operational amplifier configurations. It describes how a bilinear transfer function can be realized using such a configuration, with the transfer function equal to the negative ratio of feedback and input impedances Z2 and Z1. Z1 and Z2 are formed using series RC networks, allowing the transfer function to take the standard bilinear form with poles and zeros defined by the RC component values.
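As a worked version of the relation summarized above, assume the input branch is a series combination of R_1 and C_1 and the feedback branch a series combination of R_2 and C_2 (the component labels are assumptions, not taken from the summarized document). The inverting configuration then gives:

```latex
\[
Z_1 = R_1 + \frac{1}{sC_1}, \qquad Z_2 = R_2 + \frac{1}{sC_2}
\]
\[
T(s) = -\frac{Z_2}{Z_1}
     = -\frac{\dfrac{1 + sR_2C_2}{sC_2}}{\dfrac{1 + sR_1C_1}{sC_1}}
     = -\frac{C_1}{C_2}\cdot\frac{1 + sR_2C_2}{1 + sR_1C_1}
\]
```

Under these assumptions the zero sits at s = -1/(R_2 C_2) and the pole at s = -1/(R_1 C_1), so each is set independently by one RC pair.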
This document provides information about current transformers (CTs) including their function, construction, standards, ratings, errors, and types. CTs are used to reduce high power system currents to lower values that can be measured by instrumentation. They provide insulation between the primary and secondary circuits and allow the use of standard current ratings for secondary equipment. The performance of protective relays depends on the CT that drives it. The document discusses various CT constructions, standards, magnetization characteristics, saturation effects, and ratings parameters like rated burden, continuous and short time rated currents. It also defines current and phase errors that can occur in CTs.
This document provides an overview of current transformers (CTs), including their function, construction, standards, ratings, and sources of errors. CTs are used to reduce high currents to lower, more easily measured values while providing insulation between the primary and secondary circuits. They allow the use of standard instrument ratings and help drive protective relays. The document discusses various CT types, designs, and materials as well as definitions for key ratings like rated burden, rated currents, and accuracy limit factor. Sources of errors like saturation, phase shift, and incorrect current magnitudes are also covered.
This document discusses CMOS digital integrated circuits and combinational logic circuits. It covers static CMOS circuits, NMOS and PMOS transistors, threshold calculations for logic gates like NOR and NAND, layout of logic gates, and device sizing in complex gates. The key points are:
- Static CMOS circuits have a continuous low-resistance path between outputs and power/ground.
- Threshold calculations allow NOR and NAND gates to switch at VDD/2.
- Layout and stick diagrams show transistor positions and connections for logic gates.
- Device sizing methods ensure all signal paths can support switching.
Similar to Thermodynamics of Computation: Far More Than Counting Bit Erasure (20)
The document discusses the top 5 technologies that all organizations must understand: digital transformation, quantum computing, IoT, 5G, and AI/HPC. It provides an overview of each technology including opportunities and threats to organizations. The document emphasizes that understanding these emerging technologies is mandatory as the information revolution changes many aspects of life and business.
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
In this deck from IWOCL / SYCLcon 2020, Hal Finkel from Argonne National Laboratory presents: Preparing to program Aurora at Exascale - Early experiences and future directions.
"Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. Aurora promises to take scientific computing to a whole new level, and scientists and engineers from many different fields will take advantage of Aurora’s unprecedented computational capabilities to push the boundaries of human knowledge. In addition, Aurora’s support for advanced machine-learning and big-data computations will enable scientific workflows incorporating these techniques along with traditional HPC algorithms. Programming the state-of-the-art hardware in Aurora will be accomplished using state-of-the-art programming models. Some of these models, such as OpenMP, are long-established in the HPC ecosystem. Other models, such as Intel’s oneAPI, based on SYCL, are relatively-new models constructed with the benefit of significant experience. Many applications will not use these models directly, but rather, will use C++ abstraction libraries such as Kokkos or RAJA. Python will also be a common entry point to high-performance capabilities. As we look toward the future, features in the C++ standard itself will become increasingly relevant for accessing the extreme parallelism of exascale platforms.
This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date. oneAPI/SYCL and OpenMP are both critical models in these efforts, and while the ecosystem for Aurora has yet to mature, we’ve already had a great deal of success. Importantly, we are not passive recipients of programming models developed by others. Our team works not only with vendor-provided compilers and tools, but also develops improved open-source LLVM-based technologies that feed both open-source and vendor-provided capabilities. In addition, we actively participate in the standardization of OpenMP, SYCL, and C++. To conclude, I’ll share our thoughts on how these models can best develop in the future to support exascale-class systems."
Watch the video: https://wp.me/p3RLHQ-lPT
Learn more: https://www.iwocl.org/iwocl-2020/conference-program/
and
https://www.anl.gov/topic/aurora
In this deck, Greg Wahl from Advantech presents: Transforming Private 5G Networks.
"Advantech Networks & Communications Group is driving innovation in next-generation network solutions with their High Performance Servers. We provide business critical hardware to the world's leading telecom and networking equipment manufacturers with both standard and customized products. Our High Performance Servers are highly configurable platforms designed to balance the best in x86 server-class processing performance with maximum I/O and offload density. The systems are cost effective, highly available and optimized to meet next generation networking and media processing needs.
“Advantech’s Networks and Communication Group has been both an innovator and trusted enabling partner in the telecommunications and network security markets for over a decade, designing and manufacturing products for OEMs that accelerate their network platform evolution and time to market,” said Advantech Vice President of Networks & Communications Group, Ween Niu. “In the new IP Infrastructure era, we will be expanding our expertise in Software Defined Networking (SDN) and Network Function Virtualization (NFV), two of the essential conduits to 5G infrastructure agility making networks easier to install, secure, automate and manage in a cloud-based infrastructure.”
In addition to innovation in air interface technologies and architecture extensions, 5G will also need a new generation of network computing platforms to run the emerging software defined infrastructure, one that provides greater topology flexibility, essential to deliver on the promises of high availability, high coverage, low latency and high bandwidth connections. This will open up new parallel industry opportunities through dedicated 5G network slices reserved for specific industries dedicated to video traffic, augmented reality, IoT, connected cars etc. 5G unlocks many new doors and one of the keys to its enablement lies in the elasticity and flexibility of the underlying infrastructure.
Advantech’s corporate vision is to enable an intelligent planet. The company is a global leader in the fields of IoT intelligent systems and embedded platforms. To embrace the trends of IoT, big data, and artificial intelligence, Advantech promotes IoT hardware and software solutions with the Edge Intelligence WISE-PaaS core to assist business partners and clients in connecting their industrial chains. Advantech is also working with business partners to co-create business ecosystems that accelerate the goal of industrial intelligence."
Watch the video: https://wp.me/p3RLHQ-lPQ
* Company website: https://www.advantech.com/
* Solution page: https://www2.advantech.com/nc/newsletter/NCG/SKY/benefits.html
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
In this deck from the Stanford HPC Conference, Katie Lewis from Lawrence Livermore National Laboratory presents: The Incorporation of Machine Learning into Scientific Simulations at Lawrence Livermore National Laboratory.
"Scientific simulations have driven computing at Lawrence Livermore National Laboratory (LLNL) for decades. During that time, we have seen significant changes in hardware, tools, and algorithms. Today, data science, including machine learning, is one of the fastest growing areas of computing, and LLNL is investing in hardware, applications, and algorithms in this space. While the use of simulations to focus and understand experiments is well accepted in our community, machine learning brings new challenges that need to be addressed. I will explore applications for machine learning in scientific simulations that are showing promising results and further investigation that is needed to better understand its usefulness."
Watch the video: https://youtu.be/NVwmvCWpZ6Y
Learn more: https://computing.llnl.gov/research-area/machine-learning
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
In this deck from the Stanford HPC Conference, DK Panda from Ohio State University presents: How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems?
"This talk will start with an overview of challenges being faced by the AI community to achieve high-performance, scalable and distributed DNN training on Modern HPC systems with both scale-up and scale-out strategies. After that, the talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of-core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented."
Watch the video: https://youtu.be/LeUNoKZVuwQ
Learn more: http://web.cse.ohio-state.edu/~panda.2/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
In this deck from the Stanford HPC Conference, Nick Nystrom and Paola Buitrago provide an update from the Pittsburgh Supercomputing Center.
Nick Nystrom is Chief Scientist at the Pittsburgh Supercomputing Center (PSC). Nick is architect and PI for Bridges, PSC's flagship system that successfully pioneered the convergence of HPC, AI, and Big Data. He is also PI for the NIH Human Biomolecular Atlas Program’s HIVE Infrastructure Component and co-PI for projects that bring emerging AI technologies to research (Open Compass), apply machine learning to biomedical data for breast and lung cancer (Big Data for Better Health), and identify causal relationships in biomedical big data (the Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence). His current research interests include hardware and software architecture, applications of machine learning to multimodal data (particularly for the life sciences) and to enhance simulation, and graph analytics.
Watch the video: https://youtu.be/LWEU1L1o7yY
Learn more: https://www.psc.edu/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
The document discusses using systems intelligence and artificial intelligence/neural networks to enhance semiconductor electronic design automation (EDA) workflows by collecting telemetry data from EDA jobs and infrastructure and analyzing it using complex event processing, machine learning models, and messaging substrates to provide insights that could optimize EDA pipelines and infrastructure. The approach aims to allow both internal and external augmentation of EDA processes and environments through unsupervised and incremental learning.
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
In this deck from the Stanford HPC Conference, Nicole Xu from Stanford University describes how she transformed a common jellyfish into a bionic creature that is part animal and part machine.
"Animal locomotion and bioinspiration have the potential to expand the performance capabilities of robots, but current implementations are limited. Mechanical soft robots leverage engineered materials and are highly controllable, but these biomimetic robots consume more power than corresponding animal counterparts. Biological soft robots from a bottom-up approach offer advantages such as speed and controllability but are limited to survival in cell media. Instead, biohybrid robots that comprise live animals and self-contained microelectronic systems leverage the animals’ own metabolism to reduce power constraints and body as a natural scaffold with damage tolerance. We demonstrate that by integrating onboard microelectronics into live jellyfish, we can enhance propulsion up to threefold, using only 10 mW of external power input to the microelectronics and at only a twofold increase in cost of transport to the animal. This robotic system uses 10 to 1000 times less external power per mass than existing swimming robots in literature and can be used in future applications for ocean monitoring to track environmental changes."
Watch the video: https://youtu.be/HrmJFyvInj8
Learn more: https://sanfrancisco.cbslocal.com/2020/02/05/stanford-research-project-common-jellyfish-bionic-sea-creatures/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
In this deck from the Stanford HPC Conference, Peter Dueben from the European Centre for Medium-Range Weather Forecasts (ECMWF) presents: Machine Learning for Weather Forecasts.
"I will present recent studies that use deep learning to learn the equations of motion of the atmosphere, to emulate model components of weather forecast models and to enhance usability of weather forecasts. I will then talk about the main challenges for the application of deep learning in cutting-edge weather forecasts and suggest approaches to improve usability in the future."
Peter is contributing to the development and optimization of weather and climate models for modern supercomputers. He is focusing on a better understanding of model error and model uncertainty, on the use of reduced numerical precision that is optimised for a given level of model error, on global cloud-resolving simulations with ECMWF's forecast model, and the use of machine learning, and in particular deep learning, to improve the workflow and predictions. Peter graduated in Physics and wrote his PhD thesis at the Max Planck Institute for Meteorology in Germany. He worked as Postdoc with Tim Palmer at the University of Oxford and has taken up a position as University Research Fellow of the Royal Society at the European Centre for Medium-Range Weather Forecasts (ECMWF) in 2017.
Watch the video: https://youtu.be/ks3fkRj8Iqc
Learn more: https://www.ecmwf.int/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
In this deck, Gilad Shainer from the HPC AI Advisory Council describes how this organization fosters innovation in the high performance computing community.
"The HPC-AI Advisory Council’s mission is to bridge the gap between high-performance computing (HPC) and Artificial Intelligence (AI) use and its potential, bring the beneficial capabilities of HPC and AI to new users for better research, education, innovation and product manufacturing, bring users the expertise needed to operate HPC and AI systems, provide application designers with the tools needed to enable parallel computing, and to strengthen the qualification and integration of HPC and AI system products."
Watch the video: https://wp.me/p3RLHQ-lNz
Learn more: http://hpcadvisorycouncil.com
Today RIKEN in Japan announced that the Fugaku supercomputer will be made available for research projects aimed to combat COVID-19.
"Fugaku is currently being installed and is scheduled to be available to the public in 2021. However, faced with the devastating disaster unfolding before our eyes, RIKEN and MEXT decided to make a portion of the computational resources of Fugaku available for COVID-19-related projects ahead of schedule while continuing the installation process.
Fugaku is being developed not only for the progress in science, but also to help build the society dubbed as the “Society 5.0” by the Japanese government, where all people will live safe and comfortable lives. The current initiative to fight against the novel coronavirus is driven by the philosophy behind the development of Fugaku."
Initial Projects
Exploring new drug candidates for COVID-19 by "Fugaku"
Yasushi Okuno, RIKEN / Kyoto University
Prediction of conformational dynamics of proteins on the surface of SARS-Cov-2 using Fugaku
Yuji Sugita, RIKEN
Simulation analysis of pandemic phenomena
Nobuyasu Ito, RIKEN
Fragment molecular orbital calculations for COVID-19 proteins
Yuji Mochizuki, Rikkyo University
In this deck from the Performance Optimisation and Productivity group, Lubomir Riha from IT4Innovations presents: Energy Efficient Computing using Dynamic Tuning.
"We now live in a world of power-constrained architectures and systems and power consumption represents a significant cost factor in the overall HPC system economy. For these reasons, in recent years researchers, supercomputing centers and major vendors have developed new tools and methodologies to measure and optimize the energy consumption of large-scale high performance system installations. Due to the link between energy consumption, power consumption and execution time of an application executed by the final user, it is important for these tools and the methodology used to consider all these aspects, empowering the final user and the system administrator with the capability of finding the best configuration given different high level objectives.
This webinar focused on tools designed to improve the energy-efficiency of HPC applications using a methodology of dynamic tuning of HPC applications, developed under the H2020 READEX project. The READEX methodology has been designed for exploiting the dynamic behaviour of software. At design time, different runtime situations (RTS) are detected and optimized system configurations are determined. RTSs with the same configuration are grouped into scenarios, forming the tuning model. At runtime, the tuning model is used to switch system configurations dynamically.
The MERIC tool, which implements the READEX methodology, is presented. It supports manual or binary instrumentation of the analysed applications to simplify the analysis. This instrumentation is used to identify and annotate the significant regions in the HPC application. Automatic binary instrumentation annotates regions with significant runtime. Manual instrumentation, which can be combined with automatic, allows the code developer to annotate regions of particular interest."
Watch the video: https://wp.me/p3RLHQ-lJP
Learn more: https://pop-coe.eu/blog/14th-pop-webinar-energy-efficient-computing-using-dynamic-tuning
and
https://code.it4i.cz/vys0053/meric
The document discusses how DDN A3I storage solutions and Nvidia's SuperPOD platform can enable HPC at scale. It provides details on DDN's A3I appliances that are optimized for AI and deep learning workloads and validated for Nvidia's DGX-2 SuperPOD reference architecture. The solutions are said to deliver the fastest performance, effortless scaling, reliability and flexibility for data-intensive workloads.
In this deck, Paul Isaacs from Linaro presents: State of ARM-based HPC. This talk provides an overview of applications and infrastructure services successfully ported to Aarch64 and benefiting from scale.
"With its debut on the TOP500, the 125,000-core Astra supercomputer at New Mexico’s Sandia Labs uses Cavium ThunderX2 chips to mark Arm’s entry into the petascale world. In Japan, the Fujitsu A64FX Arm-based CPU in the pending Fugaku supercomputer has been optimized to achieve high-level, real-world application performance, anticipating up to one hundred times the application execution performance of the K computer. K was the first computer to top 10 petaflops in 2011."
Watch the video: https://wp.me/p3RLHQ-lIT
Learn more: https://www.linaro.org/
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
Today Xilinx announced Versal Premium, the third series in the Versal ACAP portfolio. The Versal Premium series features highly integrated, networked and power-optimized cores and the industry’s highest bandwidth and compute density on an adaptable platform. Versal Premium is designed for the highest bandwidth networks operating in thermally and spatially constrained environments, as well as for cloud providers who need scalable, adaptable application acceleration.
Versal is the industry’s first adaptive compute acceleration platform (ACAP), a revolutionary new category of heterogeneous compute devices with capabilities that far exceed those of conventional silicon architectures. Developed on TSMC’s 7-nanometer process technology, Versal Premium combines software programmability with dynamically configurable hardware acceleration and pre-engineered connectivity and security features to enable a faster time-to-market. The Versal Premium series delivers up to 3X higher throughput compared to current generation FPGAs, with built-in Ethernet, Interlaken, and cryptographic engines that enable fast and secure networks. The series doubles the compute density of currently deployed mainstream FPGAs and provides the adaptability to keep pace with increasingly diverse and evolving cloud and networking workloads.
Learn more: https://insidehpc.com/2020/03/xilinx-announces-versal-premium-acap-for-network-and-cloud-acceleration/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
In this video from the Rice Oil & Gas Conference, Chin Fang from Zettar presents: Moving Massive Amounts of Data across Any Distance Efficiently.
The objective of this talk is to present two ongoing projects aimed at improving and ensuring highly efficient bulk transfer or streaming of massive amounts of data over digital connections across any distance. It examines the current state of the art, a few very common misconceptions, the differences among the three major types of data movement solutions, a current initiative attempting to improve data movement efficiency from the ground up, and another multi-stage project that shows how to conduct long-distance, large-scale data movement at speed and scale internationally. Both projects have real-world motivations, e.g. the ambitious data transfer requirements of Linac Coherent Light Source II (LCLS-II) [1], a premier preparation project of the U.S. DOE Exascale Computing Initiative (ECI) [2]. Their immediate goals are described and explained, together with the solution used for each. Findings and early results are reported. Possible future work is outlined.
Watch the video: https://wp.me/p3RLHQ-lBX
Learn more: https://www.zettar.com/
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Rice Oil & Gas Conference, Bradley McCredie from AMD presents: Scaling TCO in a Post Moore's Law Era.
"While foundries bravely drive forward to overcome the technical and economic challenges posed by scaling to 5nm and beyond, Moore’s law alone can provide only a fraction of the performance / watt and performance / dollar gains needed to satisfy the demands of today’s high performance computing and artificial intelligence applications. To close the gap, multiple strategies are required. First, new levels of innovation and design efficiency will supplement technology gains to continue to deliver meaningful improvements in SoC performance. Second, heterogenous compute architectures will create x-factor increases of performance efficiency for the most critical applications. Finally, open software frameworks, APIs, and toolsets will enable broad ecosystems of application level innovation."
Watch the video:
Learn more: http://amd.com
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
In this deck from the ECSS Symposium, Abe Stern from NVIDIA presents: CUDA-Python and RAPIDS for blazing fast scientific computing.
"We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started. Finally, we will briefly highlight several other relevant libraries for GPU programming."
Watch the video: https://wp.me/p3RLHQ-lvu
Learn more: https://developer.nvidia.com/rapids
and
https://www.xsede.org/for-users/ecss/ecss-symposium
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from FOSDEM 2020, Colin Sauze from Aberystwyth University describes the development of a RaspberryPi cluster for teaching an introduction to HPC.
"The motivation for this was to overcome four key problems faced by new HPC users:
* The availability of a real HPC system and the effect running training courses can have on the real system, conversely the availability of spare resources on the real system can cause problems for the training course.
* A fear of using a large and expensive HPC system for the first time and worries that doing something wrong might damage the system.
* That HPC systems are very abstract systems sitting in data centres that users never see, it is difficult for them to understand exactly what it is they are using.
* That new users fail to understand resource limitations, in part because of the vast resources in modern HPC systems a lot of mistakes can be made before running out of resources. A more resource constrained system makes it easier to understand this.
The talk will also discuss some of the technical challenges in deploying an HPC environment to a Raspberry Pi and attempts to keep that environment as close to a "real" HPC as possible. The issue of trying to automate the installation process will also be covered."
Learn more: https://github.com/colinsauze/pi_cluster
and
https://fosdem.org/2020/schedule/events/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from ATPESC 2019, Ken Raffenetti from Argonne presents an overview of HPC interconnects.
"The Argonne Training Program on Extreme-Scale Computing (ATPESC) provides intensive, two-week training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current high-end computing systems and the leadership-class computing systems of the future."
Watch the video: https://wp.me/p3RLHQ-luc
Learn more: https://extremecomputingtraining.anl.gov/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing is discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features available on those devices, but many of the features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
2. • ~5% energy use in developed countries goes to heating digital circuits
• Cells compute ~10^5 times more efficiently than CMOS computers
• In fact, biosphere as a whole is more efficient than supercomputers
• Deep connections among information theory, physics and (neuro)biology
3. Analyze relation between circuit’s design and the heat it produces
• Calculate heat produced by running each gate in a circuit
• Sum over all gates to get total heat produced by running the full circuit
• Must be able to calculate heat produced by running an arbitrary gate…
(See N. Gershenfeld, IBM Systems Journal 35, 577 (1996))
4. Consider a (perhaps time-varying) master equation that sends
$p_0(x_0)$ to $p_1(x_1) = \sum_{x_0} P(x_1 \mid x_0)\, p_0(x_0)$.
• Example: Stochastic dynamics in a genetic network
• Example: (Noise-free) dynamics of a digital gate in a circuit
5. Consider a (perhaps time-varying) master equation that sends
$p_0(x_0)$ to $p_1(x_1) = \sum_{x_0} P(x_1 \mid x_0)\, p_0(x_0)$.
• Example: Stochastic dynamics in a genetic network
• Example: (Noise-free) dynamics of a digital gate in a circuit
S(p1) − S(p0) = −EF(p0) + EP(p0)
(Van den Broeck and Esposito, Physica A, 2015)
where:
• S(p) is Shannon entropy of p
• EF(p0) is total entropy flow (out of system) between t = 0 and t = 1
• EP(p0) is total entropy production in system between t = 0 and t = 1
- cannot be negative
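As an illustration of the bookkeeping on this slide, here is a minimal sketch that pushes an initial distribution through a conditional distribution P(x1 | x0) and evaluates the entropy change S(p1) − S(p0). The two-state channel, its flip probability of 0.1, and the function names are assumptions chosen for the example; they do not come from the talk.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def evolve(P, p0):
    """Apply p1(x1) = sum over x0 of P[x1][x0] * p0[x0], with P[x1][x0] = P(x1 | x0)."""
    return [sum(P[x1][x0] * p0[x0] for x0 in range(len(p0)))
            for x1 in range(len(P))]

# Hypothetical two-state noisy channel: each state flips with probability 0.1.
# Each column of P sums to 1, so P is a valid conditional distribution.
P = [[0.9, 0.1],
     [0.1, 0.9]]
p0 = [1.0, 0.0]          # system starts in state 0 with certainty
p1 = evolve(P, p0)

# By the decomposition on the slide, this difference equals -EF(p0) + EP(p0).
entropy_change = shannon_entropy(p1) - shannon_entropy(p0)
print(p1, entropy_change)
```

The code only evaluates the left-hand side of the decomposition. Splitting it into EF and EP requires a physical model of the process, which is not specified here.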
6. EXAMPLE
System evolves while connected to multiple reservoirs, e.g., heat baths at
different temperatures.
Assume “local detailed balance” holds
Then: EF(p0) is (temperature-normalized) heat flow into reservoirs
EP is non-negative. So:
EF(p0) = S(p0) − S(p1) + EP(p0)
“Generalized Landauer’s bound”:
Heat flow from system ≥ S(p0) − S(p1)
7. EXAMPLE
• System evolves while connected to single heat bath at temperature T
• Two possible states
• p0 uniform
• Process implements bit erasure (so p1 a delta function)
So generalized Landauer’s bound says
Total heat flow from system ≥ kT ln[2]
(Landauer’s conclusion)
(Parrondo et al. 2015, Sagawa 2014,
Hasegawa et al. 2010, Wolpert 2015, etc.)
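To put a number on the bit-erasure example, the sketch below evaluates kT [S(p0) − S(p1)] for a uniform two-state p0 and a delta-function p1. The temperature of 300 K and the function names are assumptions made for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def landauer_heat_bound(p0, p1, temperature_kelvin):
    """Lower bound on heat (joules) released to a single bath: kT * [S(p0) - S(p1)]."""
    return K_B * temperature_kelvin * (shannon_entropy(p0) - shannon_entropy(p1))

# Bit erasure: uniform over two states, mapped to a delta function.
# The 300 K bath temperature is an assumed value for illustration.
bound = landauer_heat_bound([0.5, 0.5], [1.0, 0.0], 300.0)
print(f"{bound:.3e} J")  # kT ln 2 at 300 K, roughly 2.87e-21 J
```

A non-uniform p0 gives a smaller bound, because S(p0) is then below ln 2.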
8. BACK TO CIRCUITS
• For fixed P(x1 | x0), changing p0 changes S(p0) − S(p1)
• N.b., the same P(x1 | x0), e.g., the same AND gate, has a different p0 depending on
where it is in a circuit.
• So identical gates at different locations in a circuit have different
values of Landauer cost, S(p0) − S(p1)
A new circuit design optimization problem
EF(p0) = S(p0) − S(p1) + EP(p0)
Different circuits all implementing same Boolean
function have different sum-total Landauer cost
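The dependence of Landauer cost on p0 can be checked directly for a noise-free AND gate. This is a minimal sketch. The two input distributions are hypothetical, entropies are measured in bits, and the helper names are not from the talk.

```python
import math
from itertools import product

def shannon_entropy(dist):
    """Shannon entropy in bits of a distribution given as {state: probability}."""
    return -sum(v * math.log2(v) for v in dist.values() if v > 0)

def and_gate_output(p_in):
    """Push a distribution over (a, b) input pairs through a noise-free AND gate."""
    p_out = {0: 0.0, 1: 0.0}
    for (a, b), prob in p_in.items():
        p_out[a & b] += prob
    return p_out

def landauer_cost(p_in):
    """Landauer cost S(p0) - S(p1) of running the AND gate on input distribution p_in."""
    return shannon_entropy(p_in) - shannon_entropy(and_gate_output(p_in))

# Two hypothetical input distributions for the same physical gate.
uniform = {ab: 0.25 for ab in product((0, 1), repeat=2)}
skewed = {(0, 0): 0.7, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.1}

cost_uniform = landauer_cost(uniform)
cost_skewed = landauer_cost(skewed)
print(cost_uniform, cost_skewed)
```

The gate's map is the same in both calls. Only the input distribution changes, and the cost changes with it.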
9. NOTATION:
$I(P(X_1, X_2, \ldots)) = \sum_i S(P(X_i)) - S(P(X_1, X_2, \ldots))$
- “Multi-information”
- A generalization of mutual information
- Quantifies how much information is shared among the Xi
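A short sketch of the multi-information defined on this slide, computed for a joint distribution stored as a dictionary from state tuples to probabilities. The data layout, helper names, and the two example distributions are assumptions for illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, i):
    """Marginal distribution of variable i from a joint keyed by state tuples."""
    m = {}
    for state, p in joint.items():
        m[state[i]] = m.get(state[i], 0.0) + p
    return m

def multi_information(joint):
    """Sum of marginal entropies minus the joint entropy."""
    n = len(next(iter(joint)))
    marginal_sum = sum(entropy_bits(marginal(joint, i).values()) for i in range(n))
    return marginal_sum - entropy_bits(joint.values())

# Two independent fair bits share no information.
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
# Two perfectly correlated fair bits share one bit.
copied = {(0, 0): 0.5, (1, 1): 0.5}

mi_independent = multi_information(independent)
mi_copied = multi_information(copied)
print(mi_independent, mi_copied)
```

For two variables this quantity reduces to ordinary mutual information, which is why the correlated pair gives one bit.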
10. WHAT CIRCUIT TO COMPUTE A GIVEN FUNCTION?
(Partial) answer: Total Landauer cost if we use circuit C′ rather than C:
$$\sum_{g \in C'} I(p_{pa(g)}) - \sum_{g \in C} I(p_{pa(g)})$$
where g indexes gates.
I.e., choose circuit with
smallest sum total multi-information
of input distributions into its gates.
11. WHAT CIRCUIT TO COMPUTE A GIVEN FUNCTION?
(Partial) answer: Total Landauer cost if we use circuit C′ rather than C:
$$\sum_{g \in C'} I(p_{pa(g)}) - \sum_{g \in C} I(p_{pa(g)})$$
where g indexes gates.
I.e., choose circuit with
smallest sum total multi-information
of input distributions into its gates.
A global optimization problem.
12. • But total EF is the Landauer cost of the circuit plus its EP:
EF(p0) = S(p0) − S(p1) + EP(p0)
• For fixed P(x1 | x0), e.g., a fixed gate, changing p0 changes EP(p0)
So in order to know total EF from a gate in a circuit
(and therefore total EF in the circuit):
Need to know how EP(p0) depends on p0
13. Theorem: For any p0, and any master equation, entropy production is
where:
• KL(., .) is Kullback-Leibler divergence;
• q0(x) is a “prior” built into the gate (a physical parameter);
• The sum is over “islands” of the gate’s dynamics;
• $EP_c(\cdot)$ is non-negative for each c (a physical parameter);
(Kolchinsky and Wolpert, 2017)
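The formula for this theorem did not survive in the transcript. The form below is a reconstruction, consistent with the bullets above and with the later slides that name the drop in KL divergence "mismatch cost" and the average of minimal entropy productions "residual EP". The notation is assumed rather than copied from the slide: c ranges over islands, $p_0(c)$ is the probability of island c under $p_0$, and $p_1$, $q_1$ are the images of $p_0$, $q_0$ under the master equation.

```latex
EP(p_0) \;=\; \underbrace{KL(p_0, q_0) - KL(p_1, q_1)}_{\text{mismatch cost}}
\;+\; \underbrace{\sum_{c} p_0(c)\, EP_c(\cdot)}_{\text{residual EP}}
```

Under this reading, both terms are non-negative, and the mismatch cost vanishes when $p_0 = q_0$.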
14. • Total EF is the Landauer cost of the circuit plus its EP:
EF(p0) = S(p0) − S(p1) + EP(p0)
• Sum of entropy plus KL divergence is cross-entropy. So:
Total entropy flow out of a gate – the thermodynamic cost – is
where K(., .) is cross–entropy.
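The identity used here, entropy plus KL divergence equals cross-entropy, can be checked numerically. The sketch below does so for a hypothetical noisy two-state gate and shows that the Landauer cost plus the drop in KL divergence equals the drop in cross-entropy. The gate matrix, the actual input p0, and the prior q0 are made-up values, and any residual entropy production is left out.

```python
import math

def entropy(p):
    """Shannon entropy S(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl(p, q):
    """Kullback-Leibler divergence KL(p, q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy K(p, q) = S(p) + KL(p, q), in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def evolve(P, p):
    """Apply the conditional distribution P[x1][x0] to the distribution p."""
    return [sum(P[i][j] * p[j] for j in range(len(p))) for i in range(len(P))]

# Hypothetical noisy two-state gate (columns sum to 1), actual input p0, prior q0.
P = [[0.9, 0.2],
     [0.1, 0.8]]
p0, q0 = [0.3, 0.7], [0.5, 0.5]
p1, q1 = evolve(P, p0), evolve(P, q0)

landauer = entropy(p0) - entropy(p1)          # S(p0) - S(p1)
mismatch = kl(p0, q0) - kl(p1, q1)            # drop in KL divergence
cross_entropy_drop = cross_entropy(p0, q0) - cross_entropy(p1, q1)
print(landauer, mismatch, cross_entropy_drop)
```

The drop in KL divergence is non-negative for any such map, a standard consequence of the data-processing inequality.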
15. Total entropy flow out of a gate – the thermodynamic cost – is
where K(., .) is cross–entropy.
Example: An idealized two-ends “wire” maps
• (xin, xout) = (b, 0) ➙ (xin, xout) = (0, b)
• So regardless of p0, drop in cross-entropies equals 0.
• So in an idealized wire, entropy flow is linear in input distribution p0
16. Total entropy flow out of a gate – the thermodynamic cost – is
where K(., .) is cross–entropy.
Evaluate this cost for each wire and gate in a circuit, and then sum.
17. For simplicity, take EP(.) = 0.
Then total entropy flow out of circuit – the thermodynamic cost – is
where:
• g indexes the circuit’s gates (including “wire gates”)
• pa(g) is the set of parent gates of g in the circuit (so $p_{pa(g)}$ is the joint distribution into g)
Another new circuit design optimization problem
18. OTHER RESULTS
• Analysis where we do not set EP = 0.
• A family of circuits refers to any set of circuits that all “implement the same
function”, just for differing numbers of input bits
- Circuit complexity theory analyzes how costs (e.g., number of gates)
of each circuit in a family of circuits scale with number of input bits.
- Extension of circuit complexity theory to include thermodynamic costs.
• Analysis for “logically reversible circuits” - circuits built out of Fredkin
gates, with enough extra gates added to remove all “garbage bits”.
19. THERMODYNAMICS OF TURING MACHINES
Kolmogorov complexity of bit string v:
Minimum length of an input bit string that causes a given
(universal, prefix) Turing machine to compute v and halt.
20. Thermodynamic Kolmogorov complexity of bit string v:
Minimum (over all input bit strings) entropy flow for a given
Turing machine to compute the bit string v and then halt:
K(v) + log[G(Iv)] + log[Z]
where
K(v) is Kolmogorov complexity of v
Z – the normalization constant – is Chaitin’s constant
G(Iv) is expected length of all input strings that compute v (i.e.,
probability of v under “universal distribution”)
21. Thermodynamic Kolmogorov complexity of bit string v:
Minimum (over all input bit strings) entropy flow for a given
Turing machine to compute the bit string v and then halt:
K(v) + log[G(Iv)] + log[Z]
where
K(v) is Kolmogorov complexity of v
Z – the normalization constant – is Chaitin’s constant
G(Iv) is expected length of all input strings that compute v (i.e.,
probability of v under “universal distribution”)
A “correction” to Kolmogorov complexity,
reflecting cost of many-to-one maps
22. Minimum (over all initial bit strings) entropy flow for a given Turing
machine to compute the bit string v and then halt:
K(v) + log[G(Iv)] + log[Z]
[Figure: state of TM versus time, ending at v]
A “correction” to Kolmogorov complexity,
reflecting cost of many-to-one maps
23. Minimum (over all initial bit strings) entropy flow for a given Turing
machine to compute the bit string v and then halt:
K(v) + log[G(Iv)] + log[Z]
[Figure: state of TM versus time, ending at v, with a brace marking K(v)]
K(v) is unbounded – no constant exceeds
length of {the shortest string to compute v} for all v
24. Minimum (over all initial bit strings) entropy flow for a given Turing
machine to compute the bit string v and then halt:
K(v) + log[G(Iv)] + log[Z]
[Figure: state of TM versus time, ending at v, with a brace marking K(v); red lines = prior prob.’s of initial TM states]
Minimal EF is bounded – there is a constant that
exceeds {minimal work to compute v} for all v:
log[sum of lengths of red lines]
25. CONCLUSIONS
• Exact equations for entire entropy flow of a system:
EF(p0) = Landauer cost (p0) + EP(p0)
• Different circuits, all implementing the same function, all using
thermodynamically reversible gates, have different thermodynamic costs.
• Lots of future research!
26. • New Wiki:
https://centre.santafe.edu/thermocomp
Please visit and start to add material!
• New book:
“The energetics of computing in Life and
Machines”, D. Wolpert, C. Kempes, P.
Stadler, J. Grochow (Ed.), SFI Press, 2019
• New invited review:
“The stochastic thermodynamics of
computation”, D. Wolpert, J. Physics A (2019)
27.
28. So entropy flow out of a system – the thermodynamic cost – is
where K(., .) is cross–entropy.
Applies to:
• Each gate in a digital circuit
• Each wire in a digital circuit
• Each “gate” in a noisy circuit, e.g., in a genetic circuit
• Each reaction in a stochastic chemical reaction network
29. OTHER RESULTS
• Some sufficient conditions for C to have greater EP than AO(C)
• Some sufficient conditions for C to have less EP than AO(C)
• Analysis when outdegrees of some gates > 1
• Analysis when prior distributions qg at each gate g are arbitrary.
• Analysis accounting for thermodynamic costs of wires
30. Total entropy flow out of circuit – the thermodynamic cost – is
where:
• g indexes the circuit’s gates
• pa(g) is the set of parent gates of g in the circuit (so $p_{pa(g)}$ is the joint distribution into g)
• L(g) is the set of islands of (function implemented by) gate g
31. Total entropy flow out of circuit – the thermodynamic cost – is
where:
• g indexes the circuit’s gates
• pa(g) is the set of parent gates of g in the circuit (so $p_{pa(g)}$ is the joint distribution into g)
• L(g) is the set of islands of (function implemented by) gate g
To focus on information theory, assume every $EP(q_{pa(g)}; c) = 0$
32. Total entropy flow out of circuit – the thermodynamic cost – is
where:
• g indexes the circuit’s gates
• pa(g) is the set of parent gates of g in the circuit (so $p^0_{pa(g)}$ is the joint distribution into g)
• For any circuit C, the “All-at-once gate”, AO(C), is a single gate that
computes same function as C.
Rest of talk: Compare
Landauer costs and total EF for C and AO(C)
33. EF for running circuit C on input distribution p when optimal distribution is q:
$$EF_C(p, q) = \sum_g \left[ K_g(p_{pa(g)}, q_{pa(g)}) - K_g(p_g, q_g) \right]$$
EF for running AO gate that implements same input-output function as C:
$$EF_{AO(C)}(p, q) = K_{in}(p_{in}, q_{in}) - K_{out}(p_{out}, q_{out})$$
34. EF for running circuit C on input distribution p when optimal distribution is q:
$$EF_C(p, q) = \sum_g \left[ K_g(p_{pa(g)}, q_{pa(g)}) - K_g(p_g, q_g) \right]$$
EF for running AO gate that implements same input-output function as C:
$$EF_{AO(C)}(p, q) = K_{in}(p_{in}, q_{in}) - K_{out}(p_{out}, q_{out})$$
Thermodynamic penalty / gain by using C rather than AO(C):
$$\Delta EF_C(p, q) = EF_C(p, q) - EF_{AO(C)}(p, q) = I^K(p, q) - \sum_g I^K(p_{pa(g)}, q_{pa(g)})$$
35. Thermodynamic penalty / gain by using C rather than AO(C):
$$\Delta EF_C(p, q) = EF_C(p, q) - EF_{AO(C)}(p, q) = I^K(p, q) - \sum_g I^K(p_{pa(g)}, q_{pa(g)})$$
where $I^K(p, q)$ is cross multi-information
When p = q, $\Delta EF_C(p, q)$ cannot be negative
- Indeed, it can be +∞.
So never an advantage to using a circuit ... if p = q, i.e., if one “guessed
right” when designing every single gate.
36. Thermodynamic penalty / gain by using C rather than AO(C):
$$\Delta EF_C(p, q) = EF_C(p, q) - EF_{AO(C)}(p, q) = I^K(p, q) - \sum_g I^K(p_{pa(g)}, q_{pa(g)})$$
So never an advantage to using a circuit, if p = q.
However in real world, too expensive to build an AO gate.
Even if p = q, there are circuits where total (Landauer) cost is infinite!
So, even if p = q, if we can’t use an AO gate,
what circuit to use to implement a given function?
37. Even worse: if p ≠ q, extra total entropy flow if use C rather than AO(C) is
$$I^K(p, q) - \sum_g I^K(p_{pa(g)}, q_{pa(g)})$$
This can be positive or negative
In fact, extra EF can be either −∞ or +∞
So, if p ≠ q,
what circuit to use to implement a given function?
When – as in real world – we aren’t lucky enough
to have gates obey p = q, using a circuit C rather
than AO(C) may either increase or decrease
total entropy flow out of circuit.
38. However if p ≠ q, extra EP if use C rather than AO(C) is
$$-I^D(p, q) + \sum_g I^D(p_{pa(g)}, q_{pa(g)})$$
This can be positive or negative
When – as in real world – we aren’t lucky enough
to have gates obey p = q, using a circuit C rather
than AO(C) may either increase or decrease EP
39. Even worse: if p ≠ q, extra total entropy flow if use C rather than AO(C) is
$$I^K(p, q) - \sum_g I^K(p_{pa(g)}, q_{pa(g)})$$
This can be positive or negative
In fact, extra EF can be either −∞ or +∞
When – as in real world – we aren’t lucky enough
to have gates obey p = q, using a circuit C rather
than AO(C) may either increase or decrease
total entropy flow out of circuit.
40. Consider a (perhaps time-varying) master equation that sends
$p_0(x_0)$ to $p_1(x_1) = \sum_{x_0} P(x_1 \mid x_0)\, p_0(x_0)$.
• Example: Stochastic dynamics in a genetic network
• Example: (Noise-free) dynamics of a digital gate in a circuit
EP(p0) is non-negative (regardless of the master equation)
EF(p0) = S(p0) − S(p1) + EP(p0)
(See Van Den Broeck and Esposito, Physica A, 2015)
41. Consider a (perhaps time-varying) master equation that sends
$p_0(x_0)$ to $p_1(x_1) = \sum_{x_0} P(x_1 \mid x_0)\, p_0(x_0)$.
EP(p0) is non-negative (regardless of the master equation)
EF(p0) = S(p0) − S(p1) + EP(p0)
EF(p0) ≥ S(p0) − S(p1)
“Generalized Landauer’s bound”
42. Consider a (perhaps time-varying) master equation that sends
$p_0(x_0)$ to $p_1(x_1) = \sum_{x_0} P(x_1 \mid x_0)\, p_0(x_0)$.
EP(p0) is non-negative (regardless of the master equation).
• S(p0) − S(p1) called “Landauer cost” – minimal possible heat flow
So far, all math, no physics …
EF(p0) = S(p0) − S(p1) + EP(p0)
EF(p0) ≥ S(p0) − S(p1)
“Generalized Landauer’s bound”
43. EXAMPLE
• System evolves while connected to single heat bath at temperature T
• Two possible states
• p0 uniform
• Process implements bit erasure (so p1 a delta function)
So generalized Landauer’s bound says
Total heat flow from system ≥ kT ln[2]
(Landauer’s conclusion)
(Parrondo et al. 2015, Sagawa 2014,
Hasegawa et al. 2010, Wolpert 2015, etc.)
44. STATISTICAL PHYSICS APPLICATION
Suppose system evolves while connected to multiple reservoirs, e.g., heat
baths at different temperatures.
Assume “local detailed balance” holds for those reservoirs (usually true)
Then: EF(p0) is (temperature-normalized) heat flow into reservoirs
S(p0) − S(p1) called “Landauer cost” – minimal possible heat flow
EF(p0) = S(p0) − S(p1) + EP(p0)
Generalized Landauer’s bound:
Heat flow from system ≥ S(p0) − S(p1)
45. For any p0, and any master equation,
• The drop in KL divergence is called mismatch cost
• A purely information theoretic contribution to EF
• Non-negative; equals 0 iff p0 = q0
46. For any p0, and any master equation,
• The average of minimal entropy productions is called residual EP
• A non-information theoretic contribution to EF
• Non-negative; generally equals 0 only if the process is quasistatic
47. • For any p0, and any master equation,
• So total entropy flow out of a system – the thermodynamic cost – is
(where K(., .) is cross entropy)
EF(p0) = S(p0) − S(p1) + EP(p0)
48. EXAMPLE
System successively connected to two heat baths, at temp.’s T1 and T2 < T1
Recall: EF(p0) is sum over each reservoir of (temperature-normalized) total
heat flow from system into that reservoir
Suppose full cycle, so S(p0) − S(p1) = 0
So generalized Landauer’s bound says
$$\frac{Q_{out}(\text{cold})}{T_2} - \frac{Q_{in}(\text{hot})}{T_1} \le 0$$
Clausius’ inequality
49. Some simplifications arise if we restrict attention to particular
graphical model representations of the distributions.
Examples:
• A product distribution form of $q(x_{in})$, the optimal input distribution
• A Bayes net representation of each $p_{pa(g)}$ (one for each gate g)
• A factor graph representation of $p(x_{in})$, the actual input distribution
50. Aside: state dynamics (computation)
versus thermo-dynamics
[Figure: dynamics of states in bit erasure; state v (0 or 1) plotted against time t]
53. Aside: state dynamics (computation) versus thermodynamics
[Figure: dynamics of states in bit erasure; state v (0 or 1) versus time t, with a "?" annotation]
• Non-reversible map
54. Aside: state dynamics (computation) versus thermodynamics
[Figure: dynamics of distributions in bit erasure with uniform initial probabilities; v (0 or 1) versus time t]
59. Example of equation for work applied to a digital gate:
• If thermodynamically reversible, running in reverse sends ending
distribution back to starting one.
• In this case, mismatch cost = 0; total work is drop in entropies
62. Example: An all-at-once gate to implement some function f over large spaces
• Four bit input (16 states), 2 bit output
• If thermodynamically reversible, running in reverse sends ending
distribution back to starting one.
• Always true for (optimized) all-at-once gate
64. Example: A digital circuit that also implements f
• Four bit input (16 states), 2 bit output
• Suppose distribution over inputs is not a product of marginals
• If thermodynamically reversible, running in reverse sends ending
distribution back to starting one.
• Not true in general
67. Example: A digital circuit that also implements f
• Four bit input (16 states), 2 bit output
• Mismatch cost to implement f depends on what circuit used to
implement f, in general.
• Can be lower with a circuit than with an “all at once” (AO) gate.
New kind of circuit design problem
70. Total entropy flow out of a gate / wire – the thermodynamic cost – is
where K(., .) is cross–entropy.
Evaluate this cost for each wire and gate in a {…} circuit, and then sum.
- Thermodynamic cost for any circuit, not just digital, noise-free circuits
So analysis holds for:
- “Noisy circuits” with noise in gates and/or paths, e.g., genetic circuits
- “Reversible circuits”, e.g., made with Fredkin gates
71. Some simplifications arise if we restrict attention to particular
graphical model representations of the distributions.
Shorthand: ∆W_C(p, q) = extra EF using C rather than AO(C)
∆ℰ_C(p, q) = extra mismatch cost using C
72. For Bayes net representations of the distributions p_pa(g):
- have a separate DAG at each gate g, Γ(g, p) (to define the Bayes nets)
The total work cost when p = q reduces to a sum of mutual informations:
where
- “in” refers to the joint set of all inputs to the circuit,
- For any DAG Γ, “R(Γ)” means its root nodes.
73. • A factor graph representation of a distribution P(x):
- each factor ξ_i is a subset of X
- each φ_i(ξ_i) is a positive real-valued function
• Note that a given x can occur in more than one factor
$P(x) \propto \prod_i \phi_i(\xi_i)$
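A small numerical sketch of a factor graph may help. The three binary variables and the two overlapping factors below are hypothetical choices, made only to show a variable ("b") appearing in more than one factor and how the product is normalized.

```python
import itertools

# Hypothetical example: variables a, b, c; factor scopes {a, b} and {b, c}.
variables = ("a", "b", "c")
factors = [
    (("a", "b"), lambda a, b: 2.0 if a == b else 1.0),
    (("b", "c"), lambda b, c: 3.0 if b == c else 1.0),
]

def unnormalized(assignment):
    """Product of all factor values phi_i(xi_i) at one joint assignment."""
    value = 1.0
    for scope, phi in factors:
        value *= phi(*(assignment[v] for v in scope))
    return value

states = [dict(zip(variables, bits))
          for bits in itertools.product((0, 1), repeat=len(variables))]
Z = sum(unnormalized(s) for s in states)  # normalization constant
P = {tuple(s.values()): unnormalized(s) / Z for s in states}
```

Here Z works out to 24, so the fully aligned state (0, 0, 0) has probability 6/24 = 0.25.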
75. [Figure: circuit diagram annotated with colored curly brackets and yellow nodes]
Different color curly brackets:
- Different factors of factor graph representation of q.
- Each yellow node has its parents statistically independent.
- Theorem: Each yellow node contributes zero to total EF
76. • For fixed P(x1 | x0), changing p0 also changes EP(p0)
• So need to know how EP(p0) depends on p0 in order to fully specify the
problem of designing a circuit to minimize total EF
• To present this result, must first define “islands” ...
EF(p0) = S(p0) − S(p1) + EP(p0)
Different circuits all implementing same Boolean
function have different sum-total Landauer cost
First new result: EP(p0) is a sum of an information-theoretic
term and a non-information-theoretic term.
77. An island of a function f : x0 → x1 is any set f^{-1}(x1) for some x1.
Examples:
1) A noise-free wire maps (x_in = y, x_out = 0) → (x_in = 0, x_out = y).
For binary y, two islands
2) A noise-free AND gate maps
(x1_in = y1, x2_in = y2, x_out = 0) → (0, 0, y1 AND y2)
For binary y1, y2, two islands
• Given a fixed master equation implementing f, for each island c, define
- The initial distribution that results in best possible thermodynamic
efficiency, for the (implicit) master equation – the prior distribution
q_c(.) := argmin_r EP(r)
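The island counts in the two examples can be checked by grouping inputs by their output. This is a minimal sketch that assumes, as in the examples, that the output variable starts at 0; the function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import product

def islands(f, domain):
    """Group the domain of f into preimages f^{-1}(x1), one per output x1."""
    groups = defaultdict(list)
    for x0 in domain:
        groups[f(x0)].append(x0)
    return dict(groups)

# Noise-free wire: (x_in = y, x_out = 0) -> (x_in = 0, x_out = y)
wire_domain = [(y, 0) for y in (0, 1)]
wire = lambda s: (0, s[0])

# Noise-free AND gate: (y1, y2, 0) -> (0, 0, y1 AND y2)
and_domain = [(y1, y2, 0) for y1, y2 in product((0, 1), repeat=2)]
and_gate = lambda s: (0, 0, s[0] & s[1])

wire_islands = islands(wire, wire_domain)     # two islands, one state each
and_islands = islands(and_gate, and_domain)   # two islands, sizes 3 and 1
```

The wire's islands are singletons because the map is one-to-one, while the AND gate merges three input states into the island that outputs 0.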
78. Total entropy flow out of a gate / wire – the thermodynamic cost – is
where K(., .) is cross–entropy.
Examples:
Noise-free wire: (x_in = y, x_out = 0) → (x_in = 0, x_out = y)
- Landauer cost = mismatch cost = 0; drop in cross-entropy = 0
- Residual EP captures all the dependence of the heat generated by
using a wire on the distribution of signals through it
79. Total entropy flow out of a gate / wire – the thermodynamic cost – is
where K(., .) is cross–entropy.
Examples:
Noise-free AND gate: (x1_in = y1, x2_in = y2, x_out = 0) → (0, 0, y1 AND y2).
- Neither Landauer cost nor mismatch cost = 0
- Drop in cross-entropy ≠ 0; depends on p0
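The contrast between the wire and the AND gate can be sketched numerically. The code below assumes a uniform prior q0 and a hypothetical family of i.i.d. input bits with varying bias; it evaluates only the drop in cross entropy, not the residual EP.

```python
import math
from collections import defaultdict
from itertools import product

def cross_entropy(p, q):
    """K(p, q) = -sum_x p(x) ln q(x), in nats."""
    return -sum(px * math.log(q[x]) for x, px in p.items() if px > 0)

def push_forward(dist, f):
    """Distribution of f(x) when x ~ dist, for a deterministic map f."""
    out = defaultdict(float)
    for x, px in dist.items():
        out[f(x)] += px
    return dict(out)

def cross_entropy_drop(p0, q0, f):
    return cross_entropy(p0, q0) - cross_entropy(push_forward(p0, f),
                                                 push_forward(q0, f))

def product_inputs(bias, n):
    """Hypothetical inputs: n i.i.d. bits (P(1) = bias), with x_out = 0 appended."""
    return {bits + (0,): math.prod(bias if b else 1 - bias for b in bits)
            for bits in product((0, 1), repeat=n)}

wire = lambda s: (0, s[0])
and_gate = lambda s: (0, 0, s[0] & s[1])

q_wire, q_and = product_inputs(0.5, 1), product_inputs(0.5, 2)  # assumed priors
biases = (0.1, 0.5, 0.9)
wire_drops = [cross_entropy_drop(product_inputs(b, 1), q_wire, wire) for b in biases]
and_drops = [cross_entropy_drop(product_inputs(b, 2), q_and, and_gate) for b in biases]
```

Under these assumptions the wire's drop is zero for every bias, while the AND gate's drop equals (1 − b²) ln 3 and so changes with the input distribution.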
80. Total entropy flow out of a gate / wire – the thermodynamic cost – is
where K(., .) is cross–entropy.
Evaluate this cost for each wire and gate in a circuit, and then sum.
Lots of new information theory comes out, relating:
• The topology of the circuit
• The distribution over the circuit’s inputs
• The thermodynamic cost of that circuit when run with that distribution
81. Total entropy flow out of circuit – the thermodynamic cost – is
where:
• g indexes the circuit’s gates
• pa(g) is parent gates of g in circuit (so p_pa(g) is joint distribution into g)
• ∀g, the distributions p_g and p_pa(g) are given by propagating input distribution p_in through the circuit
• Nitty-gritty physical details of each gate g determine q_pa(g) and q_g
• To focus on information theory, assume ∀g, q_pa(g) and q_g are given by propagating a joint input prior distribution q_in through the circuit
82. EF for running circuit C on input distribution p when optimal distribution is q:
$EF_C(p, q) = \sum_g \left[ K(p_{\mathrm{pa}(g)}, q_{\mathrm{pa}(g)}) - K(p_g, q_g) \right]$
EF for running AO gate that implements same input-output function as C:
$EF_{AO(C)}(p, q) = K(p_{\mathrm{in}}, q_{\mathrm{in}}) - K(p_{\mathrm{out}}, q_{\mathrm{out}})$
For simplicity, assume outdegree of each gate is 1 (a “Boolean formula”)
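To compare the circuit and AO expressions concretely, here is a sketch on a hypothetical three-input Boolean formula, g1 = x1 AND x2 followed by g2 = g1 OR x3. It sums a cross-entropy drop per gate with q propagated from q_in, and ignores wires and residual EP; the formula and the two input distributions are assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from itertools import product

def cross_entropy(p, q):
    """K(p, q) = -sum_x p(x) ln q(x), in nats."""
    return -sum(px * math.log(q[x]) for x, px in p.items() if px > 0)

def marginal(joint, f):
    """Distribution of f(x) for x ~ joint (propagation through the circuit)."""
    out = defaultdict(float)
    for x, px in joint.items():
        out[f(x)] += px
    return dict(out)

g1 = lambda x: x[0] & x[1]        # first gate
g2 = lambda x: g1(x) | x[2]       # second gate, the circuit output
gates = [
    (lambda x: (x[0], x[1]), g1),   # (joint parents of g1, output of g1)
    (lambda x: (g1(x), x[2]), g2),  # (joint parents of g2, output of g2)
]

def ef_circuit(p_in, q_in):
    """Sum over gates of the drop in cross entropy from parents to output."""
    return sum(cross_entropy(marginal(p_in, pa), marginal(q_in, pa))
               - cross_entropy(marginal(p_in, out), marginal(q_in, out))
               for pa, out in gates)

def ef_ao(p_in, q_in):
    """Drop in cross entropy for a single all-at-once gate computing g2."""
    return cross_entropy(p_in, q_in) - cross_entropy(marginal(p_in, g2),
                                                     marginal(q_in, g2))

inputs = list(product((0, 1), repeat=3))
uniform = {x: 1 / 8 for x in inputs}
# Correlated inputs: x3 copies x1, with x1 and x2 uniform.
correlated = {x: (0.25 if x[2] == x[0] else 0.0) for x in inputs}

extra_uniform = ef_circuit(uniform, uniform) - ef_ao(uniform, uniform)
extra_correlated = ef_circuit(correlated, correlated) - ef_ao(correlated, correlated)
```

With p = q and independent inputs the two costs coincide in this example; with correlated inputs the circuit costs more, because each gate sees only its own parents and cannot exploit the correlation between x1 and x3.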