How Data Collection Shapes MI Performance – Presentation Transcript
How Data Collection Shapes Manufacturing Intelligence Performance
Manufacturing Intelligence for Intelligent Manufacturing
Enterprise Manufacturing Intelligence Working Definition
Enterprise Manufacturing Intelligence (EMI) is a term which applies to software used to bring a corporation's manufacturing-related data together from many sources for the purposes of reporting, analysis, visual summaries, and passing data between enterprise-level and plant-floor systems. As data is combined from multiple sources, it can be given a new structure or context that will help users find what they need regardless of where it came from. The primary goal is to turn large amounts of manufacturing data into real knowledge and drive business results based on that knowledge. (Wikipedia, others)
Core Functions of EMI*
• Aggregation: Making available data from many sources, most often databases.
• Contextualization: Providing a structure, or model, for the data that will help users find what they need.
• Analysis: Enabling users to analyze data across sources and especially across production sites.
• Visualization: Providing tools to create visual summaries of the data to alert decision makers and call attention to the most important information of the moment.
• Propagation: Automating the transfer of data from the plant floor up to enterprise-level systems, or vice versa.
*AMR/Gartner
“Intelligence” is based on Analytics
• EMI is based on the (statistical) analysis of data collected from the manufacturing process.
• The most important element of successful statistical analysis is the collection of data.
• If the data collection process is flawed, simple statistical techniques will fail and sophisticated techniques can’t fix it.
• Bad Data = Bad Analytics = Bad Intelligence.
The Importance of Analytics
• Data alone, or data compared to limits that were not determined statistically, can only provide some sense of what a process is doing.
• Analytics helps provide meaning by identifying key events and relationships with a known certainty.
• The following example of applied Statistical Process Control (SPC) analysis illustrates the value of analytics.
• SPC determines if variation in a process is unusual, detects events, and helps point to the source or cause.
This is a “Run Chart” – data is displayed in a line graph with no analysis of the data. Are any points unusually high or low?
This is an “SPC Chart” of the same data, where upper and lower limits have been calculated to determine if any of the points show unusual variation. This data shows normal variation – there are no unusually high or low points.
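The limit calculation behind a chart like this can be sketched in a few lines. This is an illustration of one common SPC technique – individuals (I-MR) control limits – not the presenter's software; the function names and data values are invented for the example.

```python
def control_limits(data):
    """Return (lcl, mean, ucl) for an individuals (I-MR) control chart.

    Limits are the mean plus/minus the average moving range scaled by
    the standard 2.66 constant (3 / d2, where d2 = 1.128 for n = 2).
    """
    mean = sum(data) / len(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return mean - 2.66 * mr_bar, mean, mean + 2.66 * mr_bar

def unusual_points(data):
    """Indices of points outside the computed control limits."""
    lcl, _, ucl = control_limits(data)
    return [i for i, x in enumerate(data) if x < lcl or x > ucl]

# Illustrative readings: one excursion that the run chart alone
# cannot confirm as unusual, but the SPC limits flag it.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.1, 9.9, 15.0, 10.0]
print(unusual_points(readings))   # [8] -- the 15.0 reading
```

The same data on a run chart gives no basis for calling any point unusual; the statistically derived limits do.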
This is another “Run Chart” – are any points on this chart unusually high or low?
This is the same data displayed on an SPC Chart. Note that one point has been found to be unusually high (and worth investigating).
Two key process variables – one showing normal variation and the other indicating that something unusual is happening. If this is a process that has been having problems, these charts will be invaluable in determining the cause.
Combining statistical limits and specification/process set-point limits can create the possibility of an “early warning” system – a simple predictive analytic.
[Chart: Upper Specification, Upper SPC Limit, Lower SPC Limit, and Lower Specification shown as nested limit lines around the data.]
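The early-warning idea can be sketched as a simple classification (the function name and all limit values here are made-up for illustration): a reading beyond an SPC limit but still inside specification is a warning worth acting on before product actually goes out of spec.

```python
def classify(value, lsl, usl, lcl, ucl):
    """Classify a reading against specification limits (lsl/usl) and
    tighter, statistically derived SPC limits (lcl/ucl).

    Sketch only: assumes the SPC limits sit inside the spec limits,
    as drawn on the slide.
    """
    if value < lsl or value > usl:
        return "out of spec"       # the failure has already happened
    if value < lcl or value > ucl:
        return "early warning"     # unusual variation, still in spec
    return "normal"

# Made-up limits: spec 9.0-11.0, SPC limits 9.5-10.5
print(classify(10.0, 9.0, 11.0, 9.5, 10.5))   # normal
print(classify(10.8, 9.0, 11.0, 9.5, 10.5))   # early warning
print(classify(11.2, 9.0, 11.0, 9.5, 10.5))   # out of spec
```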
Consequences of Poor Data Collection Practices
• Missed Signals – systems fail to detect problems
• False Alarms – analytics indicate problems that aren’t there
• Unreliable KPIs
• Loss of faith in analytics and intelligence systems
Primary Data Sources in Manufacturing
• Manual sampling and collection
• Automated data collection systems
• Existing data
Manual Sampling and Collecting
In many industries, the majority of data is collected manually (food, consumer products, most types of packaging, materials).
Influences:
• History – it was like this when I got here…
• Folk wisdom (not the result of study/analysis)
• Cost
• Convenience
Results:
• Overly complex methodology
• Non-random sampling
• Insufficient data
• Important data not collected
Manual Sampling and Collecting Issues
Incoming tank car containing raw material – multiple samples taken from the same car…
If the material in the car is homogeneous (well mixed), the extra samples are identical, offer no additional information, and will affect any statistical analysis performed. If the data is “sub-grouped”, SPC charts will not work.
If the material in the car is stratified, but is mixed/blended before use, the samples do not represent the material used in the process.
The sample(s) taken must represent the material as it is used in the process.
Manual Sampling and Collecting Issues
Sheet/roll process with samples taken of material before roll-up. Difficulty in reaching across the roll results in:
[Diagram: sample points (x) clustered along both edges of the sheet, with none taken from the middle.]
Easier to check the edges, misses 30% of the product…
Manual Sampling and Collecting Issues
Product packaged in boxes with multiple compartments:
Sample 5 items from the left side on every other box, sample 5 items from the right side on alternating boxes every 15 minutes, sample 5 on each side every hour, sample all items in one box each shift – unless an out-of-spec item is found, then double sampling on the same side and sample 5 on the other side on every box until 10 boxes have been sampled without an out-of-spec item…uh…except on Leap Year, when we do all of this backward…
Result (among many): Data collected is too inconsistent to be used to analyze the process – not to mention an annoyed workforce.
Automated Data Collection
Most data in the Chemicals/Petrochemicals industry is collected by automated systems, common in all “process” industries.
Sources:
• DCS
• SCADA
• Process Historians
• Can sample multiple times per second
Types of automatically collected data:
• Sensor data (process temperature, pressure, etc.)
• Analytical instrument results (chemical & physical parameters)
• Control indicators (valve state, machine instructions, etc.)
• Process status (start up, running, shut down, fault)
• Equipment parameters (current load, temperature, speed)
Automated Data Collection
Issues:
• Enormous quantities of data
• Temptation to use all of it – hard to convince otherwise
• Overwhelms analytics systems
• Oversampling can result in invalid statistical results
• Most of the data isn’t suitable for statistical analysis
Considerations:
• Is the data used for anything?
• How is the data used (control, alarms, analysis, reports)?
• Response time required
• Process cycle
• Autocorrelation
Data sampled too frequently – the process has not had a chance to change, so the sensor is measuring the same material. The variation is the sensor’s measurement error, and SPC won’t work.
Data sampled at a frequency that allows the process to change – the sensor is measuring different material, and the variation is due to changes in the process.
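One simple way to check for this kind of oversampling is the lag-1 autocorrelation of the readings: if consecutive samples are nearly identical, the sensor is re-measuring the same material. The sketch below uses simulated data – the signal, noise level, and sampling rates are invented for the example, not taken from any real process.

```python
import math
import random

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a series (1.0 = each sample nearly
    repeats the last; near 0 = successive samples are independent)."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

random.seed(1)
# A slowly drifting process sampled about 100x faster than it changes:
fast = [math.sin(i / 200.0) + random.gauss(0, 0.01) for i in range(2000)]
# Re-sampled at a rate closer to how fast the process actually moves:
slow = fast[::100]

print(lag1_autocorr(fast))   # close to 1.0: mostly measurement repetition
print(lag1_autocorr(slow))   # noticeably lower: real process change
```

A very high lag-1 autocorrelation is a hint that the sampling interval should be widened before feeding the data to SPC.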
Hazards of Existing Data
Examples:
• Laboratory Information Management Systems (LIMS)
• Process Historians
• Quality Systems
• MES, ERP
• That database nobody is sure about
Considerations:
• Why was the data collected in the first place?
• Who benefits from the data being right (or not-so-right)?
• Was the data used for anything important – was it vetted?
• Were there constraints on the values?
• Can it be sampled (if there is a lot)?
• Why analyze the past anyway?
Hazards of Existing Data
Things that make historical data problematic:
• Data reduction (averaging, …)
• Data filtering (removing “outliers”)
• Improper sampling (biased)
• Changes in process not identified
• Data isn’t “real”
The problem with historical data is that you often can’t tell.
Data that has been averaged loses potentially important information – in this case, data that exceeds a key limit:
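The effect is easy to demonstrate with toy numbers (the readings and the limit below are made up for illustration): one raw reading clearly exceeds the limit, but the stored average does not.

```python
# Illustrative raw readings with one excursion past an upper limit of 100:
raw = [95, 96, 94, 97, 108, 93, 95, 96]
limit = 100

stored_average = sum(raw) / len(raw)
print(stored_average)        # 96.75 -- looks fine
print(max(raw) > limit)      # True  -- the excursion really happened

# A report built on the stored average shows no problem at all;
# only the raw data reveals that the limit was exceeded.
```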
The Importance of Context
• Data without context has little or no meaning.
• Lack of context makes data “un-actionable”.
• The further the data gets from the process, the more important it is to preserve context.
A not unusual chart with no context – just the row number of the data file used to create the chart:
Knowing the row number of data that shows unusual behavior doesn’t do much good:
Adding Date/Time helps, but requires looking up other information from multiple sources to know what is really happening:
Full context – all pertinent information brought forward to the analytics presentation allows quick recognition of problems and fast response:
Finally, if the users can add information such as Cause and Corrective Action and have it “stick”, the information resource becomes a Knowledge Base:
Aggregating Data Across Systems
• Increasingly a major issue for NWA’s process customers
• Provides “total process” understanding
• Helps link product quality to process operations
• Reveals relationships between raw materials, storage, unit operations, blending, packaging/delivery
• Most “continuous process” operations actually combine process and batch
• Key is getting a “Batch” view of the overall process
  (Some Historians have functions that can help)
Three systems together know what is going on, but no single system has all the information:
• SCADA – precise date/time, process unit and parameters
• LIMS – product, approximate date/time, lab test results
• MES – product, production schedule, line, customer
Problems Aggregating Data Across Systems
• Different sampling methods – time, event, and sample-based
• Difficulty querying historized data (Historians use data compression)
• Data in different formats, databases, structures
• Lead/lag relationships
• Auto- & cross-correlation problems
• Different analysis techniques
• Data “owned” by different groups (production, engineering, lab)
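The time-alignment part of this problem – different sampling methods and lead/lag relationships – is commonly handled with an “as-of” join: for each lab sample, pick up the most recent process reading taken at or before it. A minimal sketch, assuming both streams are time-sorted lists of (timestamp, value) pairs; the function name and the data are invented for the example.

```python
def asof_join(samples, process, tolerance):
    """For each LIMS-style (time, result) sample, attach the most
    recent historian-style reading taken at or before that time, if
    it falls within `tolerance` seconds; otherwise attach None.

    Both inputs must be sorted by timestamp (seconds).
    """
    out = []
    j = 0
    for t, lab in samples:
        # Advance to the last process reading taken at or before t.
        while j + 1 < len(process) and process[j + 1][0] <= t:
            j += 1
        pt, pv = process[j]
        matched = pv if pt <= t and t - pt <= tolerance else None
        out.append((t, lab, matched))
    return out

# Historian readings every 60 s; lab results arrive at irregular times:
process = [(0, 70.1), (60, 70.4), (120, 71.0)]
samples = [(65, "pass"), (130, "fail")]
print(asof_join(samples, process, tolerance=60))
# [(65, 'pass', 70.4), (130, 'fail', 71.0)]
```

The `tolerance` guard matters: a lab result with no sufficiently recent process reading is better left unmatched than paired with stale data.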
Process, Event, & Batch Data
[Diagram: Historian and LIMS sources mapped to Process, Event, and Batch data streams.]
Aggregated Process, Event, & Batch Data
Database SQL Queries for the Historian only – now all we need is some SQL for the LIMS and MES and we are all set…

SELECT * FROM OpenQuery(INSQL,
    'SELECT [DateTime], [Batch%Conc], [BatchNumber], [ReactLevel], [ReactTemp], [SetPoint]
     FROM Runtime.dbo.WideHistory
     WHERE DateTime >= DATEADD(hour, -1, GETDATE())
       AND DateTime <= GETDATE()
       AND wwRetrievalMode = ''cyclic''
       AND wwResolution = 60000')

SELECT * FROM OpenQuery(INSQL,
    'SELECT [DateTime], [Batch%Conc], [BatchNumber], [ReactLevel], [ReactTemp], [SetPoint]
     FROM Runtime.dbo.WideHistory
     WHERE DateTime >= DATEADD(hour, -1, GETDATE())
       AND DateTime <= GETDATE()
       AND wwRetrievalMode = ''delta''
       AND wwValueDeadband = 50') wide
INNER JOIN EventHistory
    ON wide.DateTime = EventHistory.DateTime
WHERE TagName = 'SysStatusEvent'
Conclusions:
• Data collection techniques should focus on data that represents the process or material.
• The ultimate use of the data should guide how it is collected.
• Balance the cost of data collection with the value of the collected data.
• Be aware of the pitfalls of using historical data.
• Avoid the temptation to use “all” of the data that is available.
• Include as much context as possible, as early in the data collection process as possible.