- Large optimization models are increasingly challenging to solve optimally due to super-linear growth in solving effort as model size increases. Parallel heuristic methods provide good quality solutions within practical time limits by solving smaller submodels simultaneously on multiple processor threads.
- Testing on scheduling, supply chain, and telecommunications models showed that parallel heuristics found high quality solutions for most models within hours, while optimal solutions could not be obtained within days for some of the larger models. However, using too many threads showed diminishing returns and even degradation in solution quality due to memory bus bandwidth limitations.
If you are new to Kanban, this presentation is for you. It briefly covers lean principles, the elements of the Kanban Method, and the difference between Kanban with a capital K and kanban with a small k.
Agenda
Kanban Self Assessment
Lean Principles
What is Kanban?
Motivation to use Kanban
Elements of the Kanban Method
Kanban Practices
Kanban and Scrum
Lean Kanban India 2018 | WIP decides Lead Time, Delivery Rate and Flow Efficiency by LeanKanbanIndia
Session Title: WIP decides Lead Time, Delivery Rate and Flow Efficiency
Session Overview:
We all are familiar with Kanban metrics. When we adopt Kanban, we want to achieve shorter Lead Time, higher delivery rate and better flow efficiency. How do we achieve all of this? The master key is WIP limit. Together, let’s look at different scenarios to understand impact of WIP limit on all three... Is there some magic formula? What does the data say... what are the findings so far?
Literally, Kanban is a Japanese word that means "visual card". At Toyota, Kanban is the term used for the visual and physical signaling system that ties together the whole Lean Production system. Kanban as used in Lean Production is over half a century old, and it is now being adopted by new disciplines such as software development.
From Specs To App 10X Faster Using Reactive Programming by Espresso Logic
Webinar on how to build mobile, web and integration backends 10X faster using reactive programming. For transactional database applications, generate an API in minutes, then add business rules using reactive programming. For the typical 5% of logic that remains, code the rest in server-side JavaScript.
Altus Alliance 2016 - Transitioning From On-Premise to the Cloud by Sparkrock
Presentation by Diana Budreau and Wilkin Shum on February 5th, 2016.
Are you considering moving your NAV or CRM solution to the cloud? This session will walk you through the steps you'll need to take to make sure it's the right decision, get your organization ready, and build a project plan to ensure that the move is a success.
Cpk: indispensable index or misleading measure? by PQ Systems (Blackberry&Cross)
Capability analysis is a set of calculations used to assess whether a system is able to meet a set of requirements. Customers, engineers, or managers usually set the requirements, which can be specifications, goals, aims, or standards.
The primary reason for doing a capability analysis is to answer the question: Can we meet customer requirements? To be more specific: Can our system produce consistently within tolerances required by the customer now and in the future?
Capability analysis involves two entities: 1) the producer and 2) the consumer. The consumer sets the requirements and the producer must be able to meet the requirements.
PQ Systems and Blackberry&Cross have been partners since 2006.
Visit: http://www.blackberrycross.com
Cross-department Kanban Systems - 3 dimensions of scaling #llkd15 by Andy Carmichael
Describes Clearvision's journey of adopting Kanban, not just in the software development team but in Marketing and other departments. Uses 3 dimensions of scaling - Width (before and after); Height (different sizes, timescales, decision-making); Depth (interdependent services at the same level)
Presented in this short document is a description of the classic "Pooling Optimization Problem", first described in Haverly (1978), who modeled a small distillate blending problem with three component materials (A, B, C), one pool for mixing or blending only two of the components, two products (P1, P2), one property (sulfur, S) and a single time-period. The GAMS file for this exact problem is found in Appendix A, which describes all of the sets, lists, parameters, variables and constraints required to represent it. Related types of NLP sub-models can also be found in Kelly and Zyngier (2015), where other sub-types of continuous processes such as blenders, splitters, separators, reactors, fractionators and black-boxes for ad hoc or custom sub-models are formulated.
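To make the structure of Haverly's instance concrete, below is a minimal, self-contained sketch of the bilinear pooling model in Python, using SciPy's SLSQP as a stand-in for a proper NLP/global solver. The data values are the commonly cited Haverly (1978) numbers, not taken from the GAMS file in Appendix A, and a local solver like this may stop at a local optimum.

```python
# Haverly pooling sketch: one pool blending A and B, component C blended directly,
# two products with sulfur specs. Solved locally with SciPy SLSQP (illustrative only).
import numpy as np
from scipy.optimize import minimize

# x = [a, b, px, py, cx, cy, p]
# a, b   : component A (3% S, $6) and B (1% S, $16) fed to the pool
# px, py : pool flow to products P1 and P2
# cx, cy : component C (2% S, $10) blended directly into P1 and P2
# p      : pool sulfur quality (%)
cost = np.array([6.0, 16.0, 10.0])
price = np.array([9.0, 15.0])            # P1, P2 selling prices
cap = np.array([100.0, 200.0])           # max demand for P1, P2
spec = np.array([2.5, 1.5])              # max sulfur % in P1, P2

def profit(x):
    a, b, px, py, cx, cy, p = x
    revenue = price[0] * (px + cx) + price[1] * (py + cy)
    expense = cost[0] * a + cost[1] * b + cost[2] * (cx + cy)
    return revenue - expense

cons = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - x[2] - x[3]},                  # pool balance
    {"type": "eq",   "fun": lambda x: 3*x[0] + 1*x[1] - x[6]*(x[2] + x[3])},       # pool quality (bilinear)
    {"type": "ineq", "fun": lambda x: cap[0] - (x[2] + x[4])},                     # P1 demand
    {"type": "ineq", "fun": lambda x: cap[1] - (x[3] + x[5])},                     # P2 demand
    {"type": "ineq", "fun": lambda x: spec[0]*(x[2]+x[4]) - (x[6]*x[2] + 2*x[4])}, # P1 sulfur spec
    {"type": "ineq", "fun": lambda x: spec[1]*(x[3]+x[5]) - (x[6]*x[3] + 2*x[5])}, # P2 sulfur spec
]
x0 = np.array([50.0, 50.0, 50.0, 50.0, 10.0, 10.0, 2.0])   # arbitrary starting point
bounds = [(0, None)] * 6 + [(1.0, 3.0)]                    # pool sulfur between A and B
res = minimize(lambda x: -profit(x), x0, method="SLSQP", bounds=bounds, constraints=cons)
print("local profit:", round(profit(res.x), 2), "(known global optimum is 400)")
```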
At present the range includes more than 3,000 consumer products from the world's leading brands.
Practically all of eCosway's products come from independent manufacturers, which allows the company to use advanced global production technologies and the latest achievements of the scientific community, and also strongly distinguishes it from typical MLM companies.
Chronological Decomposition Heuristic: A Temporal Divide-and-Conquer Strateg... by Alkis Vazacopoulos
The chronological decomposition heuristic (CDH) is a simple divide-and-conquer strategy intended to rapidly find integer-feasible solutions to production scheduling optimization problems of practical scale. It is not an exact algorithm in that it will not find the global optimum, although it does use either branch-and-bound or branch-and-cut. The CDH is specifically designed for production scheduling optimization problems formulated using a uniform discretization of time, where a time grid is pre-specified with fixed time-period spacing. The basic premise of the CDH is to slice the scheduling time horizon into aggregate time-intervals or "time-chunks" which are some multiple of the base time-period. Each time-chunk is solved using mixed-integer linear programming (MILP) techniques, starting from the first time-chunk and moving forward in time using the technique of chronological backtracking if required (Marriott and Stuckey, 1998; for more details see the extensive literature on constraint logic programming). The efficiency of the heuristic comes from decomposing the temporal dimension into smaller time-chunks which are solved in succession instead of solving one large problem over the entire scheduling horizon. The basic idea of such a decomposition strategy was partially presented in Bassett et al. (1996), who provided a hierarchical interaction or collaboration between a planning layer and a temporally decomposed scheduling layer. For the CDH, we focus on the time-based decomposition of the scheduling layer without the need for a higher-level coordinating or planning layer.
For many industrial-size problems, solving the MILP using commercial branch-and-bound or branch-and-cut optimization can be a somewhat futile exercise even for well-formulated problems of practical interest. Instead, many researchers such as Kudva et al. (1994), Wolsey (1998), Nott and Lee (1999), Blomer and Gunther (2000) and Kelly (2002) have devised elaborate primal heuristic techniques to enable the solution of problems of large scale and complexity; these techniques can also be augmented by other decomposition strategies such as Lagrangean and Benders relaxation. Unfortunately, with these heuristics global optimality or even global feasibility cannot be guaranteed; however, these methods and others not mentioned have proven useful for problems which are sometimes too large to be solved using conventional methods alone. Therefore, the CDH should be considered as a step in the direction of aiding the scheduling user in finding integer-feasible solutions of reasonable quality quickly.
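To illustrate the chunk-by-chunk idea (a sketch of the general approach, not the authors' implementation), here is a minimal, runnable CDH-style solve of a toy single-item lot-sizing MILP, using the open-source PuLP/CBC stack as a stand-in for a commercial MILP solver; all data are invented for illustration.

```python
# CDH-style sketch: slice the horizon into "time-chunks", solve each chunk as a MILP
# in chronological order, and freeze the decisions of solved chunks before moving on.
import pulp

T, CHUNK = 12, 4                                   # base time-periods and chunk size
demand = [20, 30, 0, 40, 10, 50, 30, 0, 20, 40, 10, 30]
setup_cost, hold_cost, big_m = 100.0, 1.0, 200.0

inv_in, total_cost, schedule = 0.0, 0.0, {}
for start in range(0, T, CHUNK):
    periods = list(range(start, min(start + CHUNK, T)))
    prob = pulp.LpProblem(f"chunk_{start}", pulp.LpMinimize)
    x = {t: pulp.LpVariable(f"x_{t}", lowBound=0) for t in periods}        # production
    y = {t: pulp.LpVariable(f"y_{t}", cat="Binary") for t in periods}      # setup decision
    s = {t: pulp.LpVariable(f"s_{t}", lowBound=0) for t in periods}        # end inventory

    prob += pulp.lpSum(setup_cost * y[t] + hold_cost * s[t] for t in periods)
    prev = inv_in
    for t in periods:
        prob += x[t] <= big_m * y[t]                  # production only after a setup
        prob += prev + x[t] - demand[t] == s[t]       # inventory balance
        prev = s[t]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    assert pulp.LpStatus[prob.status] == "Optimal"    # a real CDH would backtrack on failure

    total_cost += pulp.value(prob.objective)
    schedule.update({t: (x[t].value(), y[t].value()) for t in periods})    # freeze this chunk
    inv_in = s[periods[-1]].value()                   # carry inventory into the next chunk

print(f"CDH integer-feasible cost over {T} periods: {total_cost:.1f}")
```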
Presented in this short document is a description of what we call "Advanced" Property Tracking or Tracing (APT). APT is the term given to the technique of predicting, simulating, calculating or estimating the properties (e.g., densities, compositions, conditions, qualities, etc.) in a network or superstructure with significant inventory using statistical data reconciliation and regression (DRR).
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies by Optimization Direct
Recent advancements in Linear and Mixed-Integer Programming give us the capability to solve larger optimization problems. CPLEX Optimization Studio solves large-scale optimization problems and enables better business decisions and resulting financial benefits in areas such as supply chain management, operations, healthcare, retail, transportation, logistics and asset management. In this workshop using CPLEX Optimization Studio we will discuss modeling practices and case studies and demonstrate good practices for solving hard optimization problems. We will also discuss recent CPLEX performance improvements and recently added features.
Solving Large Scale Optimization Problems using CPLEX Optimization Studio by Optimization Direct
Recent advancements in Linear and Mixed-Integer Programming give us the capability to solve larger optimization problems. In this talk using CPLEX Optimization Studio we will discuss modeling practices and case studies and demonstrate good practices for solving hard optimization problems. We will also discuss recent CPLEX performance improvements and recently added features.
Scaling with sync_replication using Galera and EC2 by Marco Tusa
Challenging architecture design and proof of concept on a real case study using a synchronous replication solution.
A customer asked me to investigate and design a MySQL architecture to support his application serving shops around the globe, scaling out and scaling in according to sales seasons.
This is a summary of the sessions I attended at PASS Summit 2017. Out of the week-long conference, I put together these slides to summarize the conference and present it at my company. The slides cover my favorite sessions, the ones I found had the most value, and include screenshotted demos that I personally developed and tested, just as the speakers did at the conference.
Parallel Distributed Deep Learning on HPCC Systems by HPCC Systems
As part of the 2018 HPCC Systems Community Day event:
The training process for modern deep neural networks requires big data and large amounts of computational power. Combining HPCC Systems and Google’s TensorFlow, Robert created a parallel stochastic gradient descent algorithm to provide a basis for future deep neural network research, thereby helping to enhance the distributed neural network training capabilities of HPCC Systems.
Robert Kennedy is a first year Ph.D. student in CS at Florida Atlantic University with research interests in Deep Learning and parallel and distributed computing. His current research is in improving distributed deep learning by implementing and optimizing distributed algorithms.
This is the story of how we managed to scale and improve Tappsi’s RoR RESTful API to handle our ever-growing load - told from different perspectives: infrastructure, data storage tuning, web server tuning, RoR optimization, monitoring and architecture design.
Designing, Building, and Maintaining Large Cubes using Lessons Learned by Denny Lee
This is the 2007 SQL PASS Summit presentation by Nicholas Dritsas, Eric Jacobsen, and me on designing, building, and maintaining large Analysis Services cubes.
AWS Webcast - An Introduction to High Performance Computing on AWS by Amazon Web Services
High Performance Computing (HPC) allows scientists and engineers to solve complex science, engineering, and business problems using applications that require high bandwidth, low latency networking, and very high compute capabilities. Learn how the AWS cloud can cost- effectively provide the scalable computing resources, storage services, and analytic tools that enable running various kinds of HPC workloads. Who should attend? Engineers, architects, product managers, data scientists, high performance computing specialists, and researchers from industry and academia, along with technically-minded business stakeholders looking to put data to work for their organization.
Simulation of Heterogeneous Cloud Infrastructures by CloudLightning
In recent years, in addition to traditional CPU-based hardware servers, hardware accelerators have become widely used in various HPC application areas. More specifically, Graphics Processing Units (GPUs), Many Integrated Cores (MICs) and Field-Programmable Gate Arrays (FPGAs) have shown great potential in HPC and have been widely adopted in supercomputing and in HPC clouds. This presentation focuses on the development of a cloud simulation framework that supports hardware accelerators. The design and implementation of the framework are also discussed.
This presentation was given by Dr. Konstantinos Giannoutakis (CERTH) at the CloudLightning Conference on 11th April 2017.
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516] by Malin Weiss
By leveraging memory-mapped files, Speedment and the Chronicle Engine support large Java maps that can easily exceed the size of your server's RAM. Because the Java maps are mapped onto files, these maps can be shared instantly between several microservice JVMs, and new microservice instances can be added, removed, or restarted very quickly. Data can be retrieved with predictable ultra-low latency for a wide range of operations. The solution can be synchronized with an underlying database so that your in-memory maps will be consistently "alive". The mapped files can be tens of terabytes, which has been done in real-world deployment cases, and a large number of microservices can share these maps simultaneously. Learn more in this session.
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams by Diego Marrón Vida
Real-time mining of evolving data streams involves new challenges when targeting today's application domains such as the Internet of Things: increasing volume, velocity and volatility requires data to be processed on-the-fly with fast reaction and adaptation to changes. This paper presents a high-performance scalable design for decision trees and ensemble combinations that makes use of the vector SIMD and multicore capabilities available in modern processors to provide the required throughput and accuracy. The proposed design offers very low latency and good scalability with the number of cores on commodity hardware when compared to other state-of-the-art implementations. On an Intel i7-based system, processing a single decision tree is 6x faster than MOA (Java) and 7x faster than StreamDM (C++), two well-known reference implementations. On the same system, using the 6 cores (and 12 hardware threads) available allows an ensemble of 100 learners to be processed 85x faster than MOA while providing the same accuracy. Furthermore, our solution is highly scalable: on an Intel Xeon socket with large core counts, the proposed ensemble design achieves up to 16x speed-up when employing 24 cores with respect to a single-threaded execution.
We tested ODH|CPLEX 4.24 on the MIPLIB Open-v7 models, a public collection of 286 models for which an optimal solution has not been proven. 257 of these are known to have a feasible solution.
ODH|CPLEX proved optimality on 6 models and, within 2 hours, found better solutions than the best previously known for 40% of the models with 12 threads and 35% with 8 threads. ODH|CPLEX matched the best known solution on 21% of the models.
CPLEX Optimization Studio solves large-scale optimization problems and enables better business decisions and resulting financial benefits in areas such as supply chain management, operations, healthcare, retail, transportation, logistics and asset management. It has been applied in sectors as diverse as manufacturing, processing, distribution, retailing, transport, finance and investment. CPLEX Optimization Studio is an analytical decision support toolkit for rapid development and deployment of optimization models using mathematical and constraint programming. It combines an integrated development environment (IDE) with the powerful Optimization Programming Language (OPL) and high-performance ILOG CPLEX optimizer solvers. CPLEX Optimization Studio enables clients to: optimize business decisions with high-performance optimization engines; develop and deploy optimization models quickly by using flexible interfaces and prebuilt deployment scenarios; and create real-world applications that can significantly improve business outcomes. Optimization Direct has partnered with and entered into a technology licensing and distribution agreement with IBM. Combining the founders' industry and software experience with IBM's CPLEX Optimization Studio product and IBM's arsenal of optimization modeling and solving tools provides customers with the most powerful capabilities in the industry.
Missing-Value Handling in Dynamic Model Estimation using IMPL by Alkis Vazacopoulos
Presented in this short document is a description of how IMPL handles missing-values or missing-data when estimating dynamic models, which inherently involve time-lagged or time-shifted input and output variables. Missing-values in a data set imply that for some reason the data is not available, most likely due to a malfunctioning instrument or even a lack of proper accounting. Missing-data handling is relatively well-studied, especially for time-series or dynamic data, given that it is not as easy as removing, ignoring or deleting bad sections of data as can be done when static or steady-state models are calibrated (Honaker and King, 2010; Smits and Baggelaar, 2010; Fisher and Waclawski, 2015). Unfortunately, all of their methods involve what is known as "imputation", i.e., replacing or substituting missing-data with some reasonably assumed value, which is at the very least a biased estimate. When regression techniques such as PLS and PCR are used (Nelson et al., 2006), missing-data can be handled without imputation by computing the input-output covariance matrices excluding the contribution from the missing-values, given the temporal and structural redundancy in the system. However, it is shown in Dayal (1996) that using PLS and other types of regression techniques, such as Canonical Correlation Regression (CCR) and Reduced Rank Regression (RRR), to fit non-parsimonious and non-parametric finite impulse/step response models (FIR/FSR) is not as reliable as fitting lower-ordered transfer functions, especially considering the robust stability of the resulting model predictive controller if that is its intended use.
Finite Impulse Response Estimation of Gas Furnace Data in IMPL Industrial Mod... by Alkis Vazacopoulos
Presented in this short document is a description of how to estimate deterministic and stochastic non-parametric finite impulse response (FIR) models in IMPL, applied to industrial gas furnace data identical to that found in TSE-GFD-IMF using parametric transfer-functions. The methodology of time-series analysis or system identification involves essentially three (3) stages (Box and Jenkins, 1976): (1) model structure identification, (2) model parameter estimation and (3) model checking and diagnostics. We do not address (1), which requires stationarity and seasonality assessment/adjustment, auto-, cross- and partial-correlation, etc. to establish the parametric transfer function polynomial degrees, especially when we are using non-parametric FIR estimation. Instead we focus only on the parameter estimation and diagnostics. These types of parameter estimation problems involve dynamic and nonlinear relationships shown below, and we solve them using IMPL's Sequential Equality-Constrained QP Engine (SECQPE) and Supplemental Observability, Redundancy and Variability Estimator (SORVE). Another type of non-parametric identification, known as Subspace Identification (Qin, 2006), can be used to estimate state-space models.
Our Industrial Modeling Service (IMS) involves several important (but rarely implemented) methods to significantly improve and advance your existing models and data. Since it is well-known that good decision-making requires good models and data, IMS is ideally suited to support this continuous-improvement endeavour. IMS is specifically designed to either co-exist with your existing design, planning, scheduling, etc. applications, or these same models and data can be used seamlessly within our Industrial Modeling and Programming Language (IMPL) to create new value-added applications. The following techniques form the basis of our IMS offering.
This short note describes a relatively simple methodology, procedure or approach to increase the performance of already installed industrial models used for optimization, control, simulation and/or monitoring purposes. The method is called Excess or X-Model Regression (XMR) where the concept of “excess modeling” or an X-model is taken from the field of thermodynamics to describe the departure or residual behaviour of real (non-ideal) gases and liquids from their ideal state (Kyle, 1999; Poling et. al., 2001; Smith et. al., 2001). It has also been applied to model the non-ideal or nonlinear behaviour of blending motor gasoline octanes with its synergistic and antagonistic interactional effects (Muller, 1992).
The fundamental idea of XMR is to calibrate, train, fit or estimate, using actual data and multiple linear regression (MLR) or ordinary least squares (OLS), the deviations of the measured responses from the existing model responses. The existing model may be a glass-, grey- or black-box model (known or unknown, linear or nonlinear, implicit/open or explicit/closed) depending on the use of the model. That is, for optimization and control the model structure and parameters are available, given that derivative information is required, whereas for simulation and monitoring the model may only be observed through the dependent output variables given the necessary independent input variables.
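As a small illustration of the excess-model idea (a sketch on synthetic data, not the authors' implementation), the fragment below fits, by ordinary least squares, a linear-plus-interaction correction to the deviations between "measured" responses and a deliberately mis-calibrated existing model.

```python
# X-Model Regression (XMR) sketch: fit an "excess model" to the residuals of an
# installed model so that installed model + excess model tracks the measurements.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))             # independent inputs

def existing_model(X):
    # the installed model: linear and slightly mis-calibrated
    return 2.0 * X[:, 0] + 1.0 * X[:, 1]

def true_plant(X):
    # the real (mildly nonlinear) behaviour that generates the measurements
    return 2.3 * X[:, 0] + 0.8 * X[:, 1] + 0.05 * X[:, 0] * X[:, 1]

y_meas = true_plant(X) + rng.normal(0.0, 0.1, size=len(X))
excess = y_meas - existing_model(X)                    # departure from the existing model

# OLS fit of the excess on a simple basis (inputs + bilinear interaction + bias)
basis = np.column_stack([X, X[:, 0] * X[:, 1], np.ones(len(X))])
coef, *_ = np.linalg.lstsq(basis, excess, rcond=None)

y_corrected = existing_model(X) + basis @ coef         # existing model + excess model
print("RMSE before:", round(np.sqrt(np.mean((y_meas - existing_model(X)) ** 2)), 3))
print("RMSE after: ", round(np.sqrt(np.mean((y_meas - y_corrected) ** 2)), 3))
```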
Presented in this short document is a description of how to model and solve multi-utility scheduling optimization (MUSO) problems in IMPL. Multi-utility systems (co/tri-generation) are typically found in petroleum refineries and petrochemical plants (multi-commodity systems) especially when fuel-gas (i.e., off-gases of methane and ethane) is a co- or by-product of the production from which multi-pressure heating-, motive- and process-steam are generated on-site. Other utilities include hydrogen, electricity, water, cooling media, air, nitrogen, chemicals, etc. where a multi-utility system is shown in Figure 1 with an intermediate or integrated utility (both produced and consumed) such as fuel-gas, steam or electricity. Itemized benefit areas just for better management of an integrated steam network can be found in Pelham (2013) where his sample multi-pressure steam utility flowsheet is found in Figure 2.
Advanced Parameter Estimation (APE) for Motor Gasoline Blending (MGB) Indust... by Alkis Vazacopoulos
Presented in this short document is a description of how to model and solve advanced parameter estimation (APE) problems in IMPL. APE is the term given to the application of estimating, fitting or calibrating parameters in models involving a network, topology, superstructure or flowsheet. When estimating parameters with multiple linear regression (MLR), ordinary least squares (OLS), ridge regression (RR), principal component regression (PCR) and partial least squares (PLS) there is no explicit model but simply an X-block and Y-block of data. Hence, these methods are referred to as “non-parametric” or “data-based” methods as opposed to the “parametric” or “model-based” method used here. To solve these types of problems we use what is commonly referred to as “error-in-variables” (EIV) regression which is conveniently implemented as nonlinear data reconciliation and regression (NDRR) using the technology found in Kelly (1998a; 1998b; 1999) and Kelly and Zyngier (2008a). The primary benefit of using EIV (NDRR) over the other regression methods is that we can easily handle the inclusion of conservation laws and constitutive relations, explicitly, a must for any industrial estimation problem (IEP).
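To show the flavour of including a conservation law explicitly in the estimation (a minimal sketch with invented numbers, not IMPL's NDRR code), the fragment below reconciles three noisy flow measurements by weighted least squares subject to an exact mass balance, using the standard closed-form projection.

```python
# Data reconciliation sketch: adjust measurements as little as possible (weighted
# least squares) while enforcing the balance f_in = f_out1 + f_out2 exactly.
import numpy as np

y = np.array([101.3, 63.8, 39.5])        # measured flows: f_in, f_out1, f_out2
sigma = np.array([1.0, 0.8, 0.6])        # measurement standard deviations
A = np.array([[1.0, -1.0, -1.0]])        # balance: f_in - f_out1 - f_out2 = 0

# Closed form of  min (x - y)' W (x - y)  s.t.  A x = 0,  with W = diag(1/sigma^2)
V = np.diag(sigma ** 2)                  # = W^{-1}
lam = np.linalg.solve(A @ V @ A.T, A @ y)
x = y - V @ A.T @ lam
print("reconciled flows:", np.round(x, 2), " balance residual:", (A @ x).item())
```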
Presented in this short document is a description of modeling and solving partial differential equations (PDE’s) in both the temporal and spatial dimensions using IMPL. The sample PDE problem is taken from Cutlip and Shacham (1999 and 2014) and models the process of unsteady-state heat transfer or conduction in a one dimensional (1D) slab with one face insulated and constant thermal conductivity as discussed by Geankoplis (1993).
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance; final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
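As a small illustration of the first of these techniques (skipping vertices whose rank has stopped changing), here is a minimal power-iteration PageRank sketch on a toy graph; it is an illustrative toy under assumed data, not the STICD implementation.

```python
# Power-iteration PageRank that marks vertices as converged once their rank change
# drops below a tolerance and skips them in later iterations.
import numpy as np

graph = {0: [1, 2], 1: [2], 2: [0], 3: [2], 4: [3, 0]}   # node -> out-neighbours (no dead ends)
n = len(graph)
damping, tol = 0.85, 1e-10

in_edges = {v: [] for v in graph}                 # reverse adjacency for pull-style updates
for u, outs in graph.items():
    for v in outs:
        in_edges[v].append(u)
out_deg = {u: len(outs) for u, outs in graph.items()}

rank = np.full(n, 1.0 / n)
converged = np.zeros(n, dtype=bool)
for it in range(100):
    new_rank = rank.copy()
    for v in range(n):
        if converged[v]:
            continue                              # skip vertices that have stopped changing
        r = sum(rank[u] / out_deg[u] for u in in_edges[v])
        new_rank[v] = (1.0 - damping) / n + damping * r
        if abs(new_rank[v] - rank[v]) < tol:
            converged[v] = True
    rank = new_rank
    if converged.all():
        break
print("iterations:", it + 1, "ranks:", np.round(rank, 4))
```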
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... by pchutichetpong
M Capital Group ("MCG") expects demand to grow and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home ("WFH"), and by the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... by Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with the precondition that the input graph contains no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... by John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptx by Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
2. Summary
• Challenges of Large Scale Optimization
• Heuristic Software Approach
• Scheduling, supply chain and telecomms examples
• Benefits and problems of parallelism
3. Large Scale Optimization
• Many models now solved routinely which would have been impossible (‘unsolvable’) a few years ago
• BUT: have super-linear growth of solving effort as model size/complexity increases
• AND: customer models keep getting larger
• Globalized business has larger and more complex supply chains
• Optimization expanding into new areas, especially scheduling
• Detailed models easier to sell to management and end-users
4. The Curse of Dimensionality: Size Matters
• Super-linear solve time growth often supposed
• The reality is worse
• Few data sets available to support this
• Look at randomly selected sub-models of two scheduling models
• Simple basic model
• More complex model with additional entity types
• Two hour time limit on each solve
• 8 threads on 4-core hyperthreaded Intel i7-4790K
• See how solve time varies with integers after presolve
7. Why size matters
• Solver has to
• (Presolve, probe and) solve LP relaxation
• Find and apply cuts
• Branch on remaining infeasibilities (and find and apply cuts too)
• Look for feasible solutions with heuristics all the while
• Simplex relaxation solves are theoretically non-polynomial, but in practice effort increases between linearly and quadratically
• Barrier solver effort grows more slowly, but:
• cross-over still grows quickly
• usually get more integer infeasibilities
• can’t use last solution/basis to accelerate
• Cutting grows faster than quadratic: each cut requires more effort, more cuts/round, more rounds of cuts, each round harder to apply
• Branching is exponential: 2^n in the number of (say) binaries n (a worked example follows below)
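As an illustrative calculation (not from the slides): with just 60 unfixed binaries the worst-case branch-and-bound tree has 2^60 ≈ 1.2 x 10^18 nodes, so even at a million nodes per second full enumeration would take tens of thousands of years; presolve, cuts and pruning have to eliminate essentially all of that tree for the solve to finish.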
8. What can be done?
• Decomposition
• Solve smaller models
• Use ‘good’ formulations
• As tight as possible
• Avoid symmetry
• Settle for a good solution
• Use heuristics
• Use more powerful hardware
• Can usually make minor improvements by having a faster processor and memory
• Big potential is from parallel processing
• Use 4, 8, 12, 24,… or more threads on separate ‘processors’
9. Parallel Processing
• CPU clock speed is not improving
• Power consumed (and heat generated) grows according to the square of the speed
• Memory speed is not improving (much)
• Can fit more processors onto a single chip
• Can get wider registers
• Vectorization
• do several (up to 8) fp calculations at once on a 512-bit register (SIMD)
• of limited use in sparse optimization
• Cannot use full processor capability if using many cores
10. Heuristic Solution Software
• Proof of optimality may be impractical, but want good solutions to (say) 20%
• Uses CPLEX callable library
• First-feasible-solution heuristics
• Improvement heuristic
• Solves sequence of smaller sub-model(s)
• approach used e.g. by RINS and local branching
• Use model structure to create sub-models and combine solutions
• Assess solution quality by a very aggressive root solve of whole model
• Multiple instances can be run concurrently with different seeds (see the sketch after this slide)
• Can run on only one core
• Can interrupt at any point and take best solution so far (time limit / call-back / SIGINT)
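A minimal sketch of one such seeded heuristic instance, written with the CPLEX Python API rather than the C callable library used by the actual software; the file name, seed and limits are placeholders, and the real heuristic wraps many sub-model solves rather than a single call.

```python
# Sketch of a single-core, seeded CPLEX run of the kind described above.
import cplex

def heuristic_instance(model_file: str, seed: int, time_limit: float):
    c = cplex.Cplex(model_file)
    c.parameters.randomseed.set(seed)        # each concurrent instance gets its own seed
    c.parameters.threads.set(1)              # keep the instance on a single core
    c.parameters.timelimit.set(time_limit)   # could also stop via a callback or SIGINT
    c.parameters.emphasis.mip.set(1)         # 1 = emphasize finding feasible solutions
    c.solve()
    return (c.solution.get_objective_value(),
            c.solution.MIP.get_mip_relative_gap())

obj, gap = heuristic_instance("model.mps", seed=12345, time_limit=3600.0)
print(f"incumbent {obj:.6g}, within {100.0 * gap:.1f}% of the best bound")
```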
11. Parallel Heuristic Approach
• Run several heuristic threads with different seeds simultaneously (see the sketch after this slide)
• CPLEX callable library very flexible, so
• Exchange solution information between runs
• Kill sub-model solves when done better elsewhere
• Improves sub-model selection
• Opportunistic or deterministic
• 5 instances run on a 24-core Intel 2 x Xeon E5-2690v3 running at up to 3.5 GHz (DDR4-2133 memory)
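The arrangement above can be sketched with standard Python multiprocessing: several seeded heuristic instances run concurrently and the best incumbent is kept. The worker below is a stub standing in for a real seeded solver run (such as the one sketched after slide 10); a production implementation would additionally exchange incumbents between runs and kill sub-model solves that are beaten elsewhere.

```python
# Run several seeded heuristic instances in parallel and keep the best incumbent.
# heuristic_instance() is a stub so the sketch runs without any solver installed.
import random
from concurrent.futures import ProcessPoolExecutor

def heuristic_instance(seed: int) -> tuple:
    """Stand-in for one seeded heuristic run returning (seed, incumbent objective)."""
    rng = random.Random(seed)
    incumbent = min(1000.0 + rng.uniform(-50.0, 50.0) for _ in range(1000))
    return seed, incumbent

if __name__ == "__main__":
    seeds = [11, 22, 33, 44, 55]                        # e.g. 5 instances on a 24-core box
    with ProcessPoolExecutor(max_workers=len(seeds)) as pool:
        results = list(pool.map(heuristic_instance, seeds))
    best_seed, best_obj = min(results, key=lambda r: r[1])
    print(f"best incumbent {best_obj:.2f} from the instance seeded with {best_seed}")
```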
12. Example: Large Scale Scheduling, Supply Chain and Telecomms Models

Model       entities   rows       cols        integers
Easy        314        299288     57804       57804
Medium†     314        389560     94200       94200
Difficult†  406        371964     149132      149132
Large       302965     2836736    4892396     18271400
Huge        27000      2577916    12944400    12944400

† No usable (say within 30% gap) solution after 3 days run time on fastest hardware (Intel i7 4790K ‘Devil’s Canyon’)
13. Heuristic Results on Scheduling Models
8 cores on Intel 2 x Xeon E5-2690v3 3GHz

Model       Solution    Time       Gap
Easy        96          231.06     0%
Medium      113         28800      ≤ 13%
Difficult   773.8       28800      ≤ 56%
Large       1.1493E+7   28800      ≤ 1%
Huge        370         28800      ≤ 61%

Lower bounds established by separate CPLEX runs. Times are in seconds.
20. What’s Gone Wrong?
• Size (or rather density) matters again
• More threads can give worse results
• Used 24 core Intel Xeon E5-2690v3 server
• Hyperthreading not used
• Memory transfer speed is ~20GB/sec
• MIP solves become memory bus bound
22. Conclusions
• Hard size barriers to solve-to-optimality times
• May have to be satisfied with solutions of unproven quality/optimality
• Heuristic methods demand serious consideration
• Parallel solution methods are the best way of exploiting modern hardware
• Even these are limited by memory bus speeds