The document discusses the importance of cost functions in machine learning models for production forecasting. It begins by defining what a cost function is and how it quantifies the goodness of fit of a model to data. It then discusses alternatives to the typical least squares cost function, including different norm formulations. The document applies these concepts to nonlinear regression problems and forecasting production data from an oil well. It emphasizes that the choice of cost function can significantly impact the regression results.
February 2, 2019 - It's Not Hip to be Square - The Importance of Cost Functions in Production Forecasting
1. IT'S (NOT) HIP TO BE SQUARE — THE IMPORTANCE OF COST FUNCTIONS IN PRODUCTION FORECASTING
APPLIED DATA ANALYTICS: UPSTREAM
MARCH 18–21, 2019
HOUSTON, TEXAS, USA
DAVID S. FULFORD
DATA ENGINEERING & ANALYTICS — SUBSURFACE ANALYTICS
APACHE CORPORATION
2. INTRODUCTION
3. INTRODUCTION
[Figure: diagram relating "Valid Model", "Data Transform", "Robust Regression", and "Is this a Valid Model?"]
4. INTRODUCTION
5. OUTLINE
What's a cost function?
Alternative Norms
Regression of Non-linear Problems
Applications to Production Forecasting
Conclusions
6. WHAT'S A COST FUNCTION?
To fit a model to data, we require a quantification of "goodness of fit" – a cost function
Applies to any machine learning (ML) algorithm
In general, we can write the process of model fitting as regression
We desire to map predictor variables to response variables
7. WHAT'S A COST FUNCTION?
The simplest case of regression would be a linear model
Holding that the assumptions of a linear model hold true for other models, viz.:
$\mathrm{E}[e] = 0$
$e$ are homoscedastic and uncorrelated
If we have a linear model:
$\mathbf{Y} = \mathbf{X}\beta + \mathbf{e}$
We predict $\mathbf{Y}$ with:
$\hat{\mathbf{Y}} = \mathrm{map}(f(\hat{\beta}, x), \mathbf{X}) + \hat{\mathbf{e}}$
Then the cost function is:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \hat{\beta}\mathbf{X}\|^2 \equiv \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2$
And we regress the model by minimizing $\mathrm{J}(\hat{\beta})$:
$\hat{\beta} \leftarrow \operatorname{argmin}\, \mathrm{J}(\hat{\beta})$
($\hat{\mathbf{z}}$ denotes an estimator of $\mathbf{z}$)
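As an illustration of the definitions above, here is a minimal Python sketch (not from the deck; the data and function names are illustrative) that evaluates the least-squares cost $\mathrm{J}(\hat{\beta})$ and recovers $\hat{\beta}$ from the closed-form normal equations:

```python
import numpy as np

def least_squares_cost(beta_hat, X, y):
    """Least-squares cost: J(beta_hat) = ||y - X @ beta_hat||^2."""
    residuals = y - X @ beta_hat
    return residuals @ residuals

# Synthetic data: y = 1.0 + 2.0*x + noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, x.size)

# Closed-form minimizer: beta_hat = (X^T X)^-1 X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat, least_squares_cost(beta_hat, X, y))
```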
8. WHAT'S A COST FUNCTION?
Minimization of $\mathrm{J}(\hat{\beta})$ is equivalent to setting the gradient of $\mathrm{J}(\hat{\beta})$ to zero:
$\operatorname{argmin}\, \mathrm{J}(\hat{\beta}) \equiv \frac{\partial \mathrm{J}}{\partial \hat{\beta}} \to 0$
$\frac{\partial \mathrm{J}}{\partial \hat{\beta}} = -2\mathbf{X}\left(\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\right)$
If $\mathbf{X}$ is:
$\mathbf{X} = [1, 1, 1, \ldots, 1]$
Then the gradient becomes:
$\frac{\partial \mathrm{J}}{\partial \hat{\beta}} = -2\left(\mathbf{Y} - \hat{\beta}\right) := 0$
And it's obvious that a value of $\hat{\beta} = \bar{y}$ satisfies the equation
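A quick numeric check of this result (an added illustration, not from the deck): with an all-ones design, the $\hat{\beta}$ that zeroes the gradient is exactly the arithmetic mean.

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 100.0])   # note the large value; see the next slides

def grad_J(beta_hat, y):
    """Gradient of J = sum((y - beta_hat)^2) with respect to beta_hat."""
    return -2.0 * np.sum(y - beta_hat)

print(grad_J(y.mean(), y))   # ~0.0: the mean zeroes the gradient
```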
9. WHAT'S A COST FUNCTION?
Summarizing:
The arithmetic mean as a best estimator of a parameter value, and
Squared errors as the cost function of choice to regress a model
... are consequences of the estimator's linear properties
$\hat{\beta}$ is a fixed linear combination of $\mathbf{Y}$
e.g. $\hat{\beta} = \mathbf{a}^{\mathrm{T}}\mathbf{Y}$ for some $\mathbf{a}$ such as $\mathbf{a} = (\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}$
$\mathrm{E}[\hat{\beta}] = \beta$
Most problems in which we're interested are not linear!
e.g. production data
more on this later...
10. WHAT'S A COST FUNCTION?
Additionally, means are not robust:
$\bar{x} = \frac{1}{n}\sum \mathbf{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$
With some manipulation we can show that:
$\bar{x} = \frac{n-1}{n}\,\bar{x}_{n-1} + \frac{1}{n}\,x_n$
Indicating that any single value, if large enough, can dominate $\bar{x}$
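A small sketch of this non-robustness (illustrative values): one wild observation drags the mean arbitrarily far, while the median, the minimizer of the L1 cost introduced on the following slides, barely moves.

```python
import numpy as np

x = np.array([9.0, 10.0, 11.0, 10.0, 9.5])
x_outlier = np.append(x, 1.0e4)   # a single wild value

print(np.mean(x), np.median(x))                  # 9.9, 10.0
print(np.mean(x_outlier), np.median(x_outlier))  # ~1674.9, 10.0
```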
11. WHAT'S A COST FUNCTION?
12. ALTERNATIVE NORMS
We can write the least-squares cost function as the Euclidean distance between two points:
$\boldsymbol{\alpha} = \mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})$
$\|\boldsymbol{\alpha}\|_2 = \left(\sum \boldsymbol{\alpha}^2\right)^{1/2}$
Generally, we can have any level of distance, which we call a norm:
$\|\boldsymbol{\alpha}\|_n = \left(\sum \boldsymbol{\alpha}^n\right)^{1/n}$
13. ALTERNATIVE NORMS
L2, L1, and L0 norms:
$\|\boldsymbol{\alpha}\|_2 = \left(\sum \boldsymbol{\alpha}^2\right)^{1/2}$
$\|\boldsymbol{\alpha}\|_1 = \sum |\boldsymbol{\alpha}|$
$\|\boldsymbol{\alpha}\|_0 = \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{else} \end{cases}$
We can even have an $L_\infty$ norm:
$\|\boldsymbol{\alpha}\|_\infty = \max |\boldsymbol{\alpha}|$
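For reference, a short sketch computing these norms on a residual vector; NumPy's np.linalg.norm takes the order as its second argument:

```python
import numpy as np

alpha = np.array([3.0, -4.0, 0.0, 1.0])   # residual vector

print(np.linalg.norm(alpha, 2))        # L2: sqrt(9 + 16 + 0 + 1) ~ 5.10
print(np.linalg.norm(alpha, 1))        # L1: 3 + 4 + 0 + 1 = 8.0
print(np.linalg.norm(alpha, 0))        # L0: count of nonzero entries = 3.0
print(np.linalg.norm(alpha, np.inf))   # Linf: max |alpha_i| = 4.0
```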
14. ALTERNATIVE NORMS
However, not all norms provide a closed-form analytic solution for regression
If we draw the L2 norm between two points, we find a unique solution
15. ALTERNATIVE NORMS
Adding some L1 norms, we find multiple non-unique solutions
[Figure: L2 is the by-flight distance; L1 is the taxicab distance]
16. ALTERNATIVE NORMS
17. ALTERNATIVE NORMS
18. ALTERNATIVE NORMS
$\nabla \mathrm{J}_H = \begin{cases} \delta \operatorname{sign}(\alpha) & \text{if } |\alpha| \ge \delta \\ \alpha & \text{else} \end{cases}$
19. ALTERNATIVE NORMS
$\mathrm{J}_H = \begin{cases} \delta |\alpha| - \tfrac{1}{2}\delta^2 & \text{if } |\alpha| \ge \delta \\ \tfrac{1}{2}\alpha^2 & \text{else} \end{cases}$
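This piecewise cost $\mathrm{J}_H$ has the form of the Huber loss. A minimal sketch, assuming the reconstruction above: quadratic (L2-like) near zero and linear (L1-like) in the tails, so any single large residual has a bounded pull on the fit.

```python
import numpy as np

def huber_cost(alpha, delta=1.0):
    """J_H: 0.5*a^2 inside |a| < delta, linear growth outside."""
    a = np.abs(alpha)
    return np.where(a >= delta, delta * a - 0.5 * delta ** 2, 0.5 * alpha ** 2)

def huber_grad(alpha, delta=1.0):
    """Gradient clipped at +/- delta, so outliers cannot dominate."""
    return np.where(np.abs(alpha) >= delta, delta * np.sign(alpha), alpha)

residuals = np.array([-0.2, 0.5, 3.0, -10.0])
print(huber_cost(residuals))   # large residuals grow linearly, not quadratically
print(huber_grad(residuals))   # saturates at +/- 1.0
```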
20. ALTERNATIVE NORMS
21. ALTERNATIVE NORMS
22. ALTERNATIVE NORMS
Use stochastic gradient descent (SGD) to minimize cost functions:
1. while $\mathrm{J}(\hat{\beta})_{i-1} - \mathrm{J}(\hat{\beta})_i > \varepsilon$:
2. $\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2$
3. $\hat{\beta} \leftarrow \hat{\beta} - \eta\,\nabla \mathrm{J}(\hat{\beta})$
where:
$\varepsilon$ = precision threshold
$\eta$ = learn rate
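A runnable sketch of this loop (plain batch gradient descent rather than the mini-batch stochastic variant, to keep it short), fitting a one-parameter model $y \approx \beta x$ under the L2 cost:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 100)
y = 3.0 * x + rng.normal(0.0, 0.5, x.size)   # true beta = 3.0

def cost(beta):
    return np.sum((y - beta * x) ** 2)

def grad(beta):
    return -2.0 * np.sum(x * (y - beta * x))

beta, eta, eps = 0.0, 1.0e-4, 1.0e-10   # initial guess, learn rate, precision
prev = np.inf
while prev - cost(beta) > eps:          # stop when the cost no longer improves
    prev = cost(beta)
    beta -= eta * grad(beta)

print(beta)   # ~3.0
```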
23. ALTERNATIVE NORMS
24. ALTERNATIVE NORMS
We can also regularize based upon norms:
LASSO Regression → L2 norm of data, L1 norm of $\hat{\beta}$:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2 + \lambda |\hat{\beta}|$
Ridge Regression → L2 norm of data, L2 norm of $\hat{\beta}$:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2 + \lambda \hat{\beta}^2$
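A small sketch of these two penalized objectives (a library such as scikit-learn implements them directly, but the cost functions themselves are one-liners):

```python
import numpy as np

def lasso_cost(beta, X, y, lam):
    """L2 cost against data plus an L1 penalty on the coefficients."""
    r = y - X @ beta
    return r @ r + lam * np.sum(np.abs(beta))

def ridge_cost(beta, X, y, lam):
    """L2 cost against data plus an L2 penalty on the coefficients."""
    r = y - X @ beta
    return r @ r + lam * np.sum(beta ** 2)

beta = np.array([1.0, 0.0, 2.5])
X, y = np.eye(3), np.array([1.0, 0.5, 2.0])
print(lasso_cost(beta, X, y, 0.1), ridge_cost(beta, X, y, 0.1))
```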
25. REGRESSION OF NONLINEAR PROBLEMS
26. REGRESSION OF NONLINEAR PROBLEMS
27. REGRESSION OF NONLINEAR PROBLEMS
28. APPLICATIONS TO PRODUCTION FORECASTING
Example well from the Eagle Ford
≈3.5 years of production history
29. APPLICATIONS TO PRODUCTION FORECASTING
If we use a hyperbolic model, we believe that:
$q = q_i \left(1 + D_i b t\right)^{-1/b}$
$q \approx q_i \left(D_i b\right)^{-1/b} t^{-1/b}$
$q \approx \frac{\alpha}{t^{1/b}}$
Meaning, a power-law function is the base functional relationship, and we must log-transform the data
URTeC 2903036 (Fulford) – A Model-Based Diagnostic Workflow, 2018
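A sketch of the hyperbolic decline model and its late-time power-law approximation; the $q_i$, $D_i$, and $b$ values are illustrative, not fitted to the example well:

```python
import numpy as np

def hyperbolic_rate(t, qi, Di, b):
    """Hyperbolic decline: q = qi * (1 + Di*b*t)^(-1/b)."""
    return qi * (1.0 + Di * b * t) ** (-1.0 / b)

def power_law_approx(t, qi, Di, b):
    """Late-time approximation: q ~ qi * (Di*b)^(-1/b) * t^(-1/b)."""
    return qi * (Di * b) ** (-1.0 / b) * t ** (-1.0 / b)

t = np.array([1.0, 12.0, 42.0])   # months of production, say
print(hyperbolic_rate(t, 1000.0, 0.10, 1.4))
print(power_law_approx(t, 1000.0, 0.10, 1.4))   # approaches the hyperbolic at large t
```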
30. APPLICATIONS TO PRODUCTION FORECASTING
Regression on data... neither is good
31. APPLICATIONS TO PRODUCTION FORECASTING
Point-by-point cost; note the scale on the colorbars
32. APPLICATIONS TO PRODUCTION FORECASTING
33. APPLICATIONS TO PRODUCTION FORECASTING
34. APPLICATIONS TO PRODUCTION FORECASTING
Is this unique?
Minimize the cost function for $q_i = 200, 300, \ldots, 2000$ (an interval of 100)
35. APPLICATIONS TO PRODUCTION FORECASTING
Plot the cost function over a mesh with:
$x = q_i$
$y = D_i$
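A sketch of this diagnostic (the data, grid ranges, and the fixed $b$ are placeholders): evaluate the L2 cost of the hyperbolic model over a $(q_i, D_i)$ mesh and look for a trough of near-equal minima rather than one sharp minimum.

```python
import numpy as np

# Placeholder "observed" data generated from a known hyperbolic decline (b fixed)
b = 1.4
t = np.arange(1.0, 43.0)
q_obs = 1000.0 * (1.0 + 0.10 * b * t) ** (-1.0 / b)

qi_grid = np.linspace(200.0, 2000.0, 50)
Di_grid = np.linspace(0.01, 0.50, 50)

cost = np.empty((Di_grid.size, qi_grid.size))
for i, Di in enumerate(Di_grid):
    for j, qi in enumerate(qi_grid):
        q_hat = qi * (1.0 + Di * b * t) ** (-1.0 / b)
        cost[i, j] = np.sum((q_obs - q_hat) ** 2)

imin, jmin = np.unravel_index(np.argmin(cost), cost.shape)
print(qi_grid[jmin], Di_grid[imin])   # mesh point of minimum cost
```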
36. APPLICATIONS TO PRODUCTION FORECASTING
Plot the gradient...
$\varepsilon = 1 \times 10^{-40}$ (precision)
... yet the algorithm still did not find $\nabla \mathrm{J} = 0$ for each forecast in the list
37. APPLICATIONS TO PRODUCTION FORECASTING
Extract the minimum cost from the mesh at each row/column
Which is the "correct" forecast?
38. APPLICATIONS TO PRODUCTION FORECASTING
We're not limited to pre-defined cost functions... make your own!
SPE 174784-PA proposed the following:
$\mathrm{J}_F = \left(\frac{\frac{1}{n}\sum e - \epsilon_{\min}}{\sigma_\epsilon}\right)^2 + \left(\frac{\frac{1}{n}\sum\left(e - \frac{1}{n}\sum e\right)^2 - \varepsilon_{\min}}{\sigma_\varepsilon}\right)^2$
Which is more clearly written as:
$\mathrm{J}_F = \left(\frac{\mathrm{E}[e] - \epsilon_{\min}}{\sigma_\epsilon}\right)^2 + \lambda_\varepsilon \left(\mathrm{Var}[e] - \varepsilon_{\min}\right)^2$
The features of this cost function are:
L1 cost against data
L2 cost against best-fit model
L2 regularization for min. variance (generalizes to noisy data / outliers)
SPE 174784-PA (Fulford, Bowie, Berry, Bowen, Turk) – Machine Learning as a Reliable Technology, 2016
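A sketch of evaluating this composite cost on a residual vector, under the reconstruction above; the targets $\epsilon_{\min}$ and $\varepsilon_{\min}$, the scale $\sigma_\epsilon$, and $\lambda$ are placeholders that the paper defines:

```python
import numpy as np

def j_f(e, eps_min=0.0, var_min=0.0, sigma_eps=1.0, lam=1.0):
    """Squared deviation of the mean residual from a target, plus a
    penalty on how far the residual variance sits from its target."""
    mean_term = ((np.mean(e) - eps_min) / sigma_eps) ** 2
    var_term = lam * (np.var(e) - var_min) ** 2
    return mean_term + var_term

residuals = np.array([0.1, -0.3, 0.2, 0.05, -0.1])
print(j_f(residuals))
```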
39. APPLICATIONS TO PRODUCTION FORECASTING
Regularization on variance limits the range of $D_i$ and the $b$-parameter
L2 cost against the best fit penalizes spread through the data
40. CONCLUSIONS
The choice of cost function(s) may impact regression results as much as the choice of predictive model
Understanding the base expectation of data & model functional forms gives insight into the appropriate choice of cost function
Uncertainty is a fundamental characteristic of modeling
A best-fit is not the same as a best forecast
It does not mean only one unique set of model parameters exists!
41. APPENDIX
42. WHAT'S A COST FUNCTION?
$\mathbf{X}(\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}$ is the projection matrix of $\mathbf{Y}$ to $\hat{\mathbf{Y}}$:
$\hat{\mathbf{Y}} = \mathbf{X}(\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{Y} = \mathbf{X}\hat{\beta}$
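A quick numeric check (illustrative) that the projection matrix reproduces the fitted values given by the normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 5.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.1, x.size)

H = X @ np.linalg.inv(X.T @ X) @ X.T         # projection ("hat") matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(H @ y, X @ beta_hat))      # True: H @ Y equals X @ beta_hat
```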
43. DISTRIBUTION OF FORECASTS
Possible fits of data + uncertainty of future performance
[Figure: "Actual vs. MCMC Forecasts"; rate (y) vs. time (x)]
44. REVISIONS
How much should I expect to revise forecasts from month to month with this approach?
On average, zero!
[Figure: change in EUR from prior month]
Clifford and Torres (2017)