• As an engineer or scientist, you will undoubtedly have to handle data in some form, anything from a simple table of collected results to a large database of experimental measurements. Since the beginning of my career, I have never found a good tool or home for that data. Because I most often preferred MathCAD for analyzing data and generating results, I often wished it had some type of data management tool built in. Since it does not, getting data in and out of MathCAD was always awkward and more tedious than I would have liked. In fact, I found it so tedious that I decided to build a solution. This presentation focuses mostly on the benefits and technique of using data management with MathCAD, but the actual implementation is done through the tool I developed, which I call SciData.
• To demonstrate this idea of data management, I think it’s best to use an applied example: an experimental measurement of the drag coefficient of different parachute designs. The drag coefficient is measured by tracking the change in velocity over time to see how quickly the parachute decelerates. Looking at the equation for the force of drag, we see it is a function of the drag coefficient, Cv, which is our unknown in this case; the density of air; the cross sectional area of the parachute; and the velocity, which we will need to measure. By balancing the force of gravity on the mass against the force of drag, we can solve for the coefficient of drag by fitting a curve to the data. Our experiment, then, is to take 4 different parachute designs, A through D, drop them from 100 m, and measure their velocity every half second. The data collected is shown in the plot for each of the parachutes. Take note that we have several different types of data, even in this very simple example: an array of velocity vs. time, which is the experimental data set, and characteristic values, in this case the cross sectional area and the parachute type, which can also be referred to as metadata. So where is the best place to put each of these pieces of information? This is what I will be showing you next.
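The force balance described above can be written as m·dv/dt = m·g − ½·ρ·Cv·A·v², and fitting Cv to a measured velocity trace can be sketched in a few lines. This is only a minimal illustration, not the MathCAD worksheet from the talk: the air density value, the use of the first sample as the initial velocity, the RK4 integrator, and the golden-section search are all assumptions made here, and the "measurements" in the demo are synthetic.

```python
import math

G = 9.81     # gravitational acceleration, m/s^2
RHO = 1.225  # air density, kg/m^3 (assumed sea-level value)


def simulate_velocity(cv, area, mass, v0, times, substeps=50):
    """Integrate dv/dt = g - rho*cv*area*v^2 / (2*mass) with RK4 (v positive downward)."""
    def dvdt(v):
        return G - 0.5 * RHO * cv * area * v * abs(v) / mass

    v = v0
    velocities = [v]
    for t_prev, t_next in zip(times, times[1:]):
        h = (t_next - t_prev) / substeps
        for _ in range(substeps):
            k1 = dvdt(v)
            k2 = dvdt(v + 0.5 * h * k1)
            k3 = dvdt(v + 0.5 * h * k2)
            k4 = dvdt(v + h * k3)
            v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        velocities.append(v)
    return velocities


def fit_cv(times, velocities, area, mass, lo=0.1, hi=10.0, tol=1e-6):
    """Golden-section search for the cv that minimizes the sum of squared residuals."""
    def sse(cv):
        model = simulate_velocity(cv, area, mass, velocities[0], times)
        return sum((m - v) ** 2 for m, v in zip(model, velocities))

    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Synthetic stand-in for one parachute: samples every 0.5 s for 5 s.
    times = [0.5 * i for i in range(11)]
    fake_data = simulate_velocity(1.48, 1.5, 100.0, 40.0, times)  # hypothetical values
    print(round(fit_cv(times, fake_data, 1.5, 100.0), 3))
```

The point of the sketch is that the array (t, v) and the characteristic values (area, mass) are separate inputs to the same fit, which is exactly the separation the rest of the talk builds on.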
• The way that we would typically solve this problem in MathCAD is to start with the data and characteristic value inputs at the top of the document and then crunch the numbers below that. In this case we type in the cross sectional area, the density, and the mass, and then read the velocity measurements in from a file. We might also have simply copied and pasted the data into a matrix and stored it in the MathCAD file. Either way, the data and characteristic values are hard coded, meaning that if we want to change them out for the next data set, we have to do it manually.
• The problem with having our analysis set up like this is that going from one data set to the next is an entirely manual process (the “file save as...” method). This might be fine for just a handful of tests, but when the number of tests grows and the time you have available to analyze the data shrinks, it becomes a problem. Furthermore, the more manual the process, the greater the tendency for errors. Did you remember to change the characteristic values to match the right data when changing from Parachute A to B? There is no way to know for sure that human error did not become a factor in the output of your analysis.
• Moving to the new way of data analysis, we replace the hard coded data and characteristic values with variables. These variables are then fed data through SciData, which works by way of the MathCAD API. SciData is written so that the only rule you need to follow as a user is to give each column of data a name and then use the same text-based name in MathCAD to retrieve the data. The nice thing about this is that the Analysis and Results remain exactly the same. At this point it might not be entirely clear why we want to do this, but later I will show how this opens the door to quickly changing out data sets while using a single MathCAD analysis.
• The benefit of moving to this new way of doing things is that a single MathCAD document can be used while data sets are quickly changed out. Also, characteristic values are linked with the data. So envision now that each row in the data table shown does not just hold a set of numbers; instead, it contains the whole bucket of information about each of your tests.
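The "bucket of information" idea can be pictured as a record whose columns are looked up by name. The sketch below is a hypothetical Python stand-in, not SciData's actual data model: `DataRow` and `send_row` are illustrative names, and the velocity numbers are placeholders rather than measurements.

```python
from dataclasses import dataclass, field


@dataclass
class DataRow:
    """One table row: arrays, characteristic values, and results kept together."""
    name: str
    columns: dict = field(default_factory=dict)  # named arrays, numbers, or text
    results: dict = field(default_factory=dict)  # filled in after the analysis runs


def send_row(row, wanted):
    """Return the named inputs an analysis asks for, failing loudly on a missing name."""
    missing = [n for n in wanted if n not in row.columns]
    if missing:
        raise KeyError(f"row {row.name!r} has no column(s): {missing}")
    return {n: row.columns[n] for n in wanted}


# Placeholder values, for illustration only.
row_a = DataRow(
    name="ParachuteA",
    columns={"ParachuteType": "A", "A_x": 1.5,
             "t": [0.0, 0.5, 1.0], "v": [40.0, 35.0, 32.0]},
)
inputs = send_row(row_a, ["A_x", "t", "v"])
```

Because the analysis only refers to names such as `A_x`, `t`, and `v`, swapping in another row changes the data and its characteristic values together, which removes the chance of pairing Parachute B's data with Parachute A's area.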
• The real benefit now comes from a Data Management System as the driver of the inputs and outputs. An overhead view of the Data Management System will now look like this. In my implementation, SciData acts both as the data storage center and the driver of the process. The final step in the process is to retrieve results from the MathCAD analysis back into the Data Management System. In this way we now have a system that can batch process several data sets automatically. This becomes key when the number of data sets grows to a sizable number. It is also greatly beneficial when your MathCAD analysis is updated or changed. Most often our analysis will evolve with the experiment: we learn things as we go, and the analysis may change. The past data should therefore be run through the analysis again. By batch processing all the data sets, old and new, you can be sure that your results always reflect the most recent version of your MathCAD analysis.
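The batch-processing loop, including the re-run after the analysis changes, can be sketched as follows. This is a hedged Python illustration of the pattern only; the two analysis functions are trivial stand-ins invented here (they are not the drag-coefficient fit), and the input numbers are placeholders.

```python
def batch_process(rows, analysis):
    """Run one analysis function over every row and store its outputs with that row."""
    for row in rows:
        row["results"] = analysis(row["inputs"])
    return rows


def analysis_v1(inputs):
    """First version of a stand-in analysis: mean of the velocity array."""
    v = inputs["v"]
    return {"v_mean": sum(v) / len(v)}


def analysis_v2(inputs):
    """Revised analysis: keeps the mean and adds the minimum velocity."""
    results = analysis_v1(inputs)
    results["v_min"] = min(inputs["v"])
    return results


# Placeholder data library with two rows.
library = [
    {"name": "ParachuteA", "inputs": {"A_x": 1.5, "v": [40.0, 35.0, 32.0]}},
    {"name": "ParachuteB", "inputs": {"A_x": 1.75, "v": [40.0, 33.0, 29.0]}},
]

batch_process(library, analysis_v1)  # initial pass
batch_process(library, analysis_v2)  # analysis evolved: re-run old and new rows alike
```

Re-running the whole library with the latest analysis is what keeps every stored result consistent with the current version of the calculation.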
• To implement the parachute data analysis example, as mentioned, we will be using the tool I developed, SciData. This tool is where your data, metadata, and results are all stored. Everything is organized in a table with one row for each data set. SciData is then used to send data to MathCAD and retrieve results from it. In the following slides I will go through a quick example of how this works.
• The first step in implementing SciData is to import the data of interest. SciData is to your data very much what iTunes is to your music. In our case here the data is simply comma delimited arrays of time and velocity. SciData uses the Scientific Dataset Library from Microsoft Research to implement a standard for storing data in self-descriptive data packages. In other words, by using SciData, your data is automatically filed away under a standard that makes each dataset fully documented. Each row in the table shadows the actual SDS file in the Data folder. This has the benefit of allowing you to more easily document your data and share it with others. In this step, if the original data is a simple comma separated file, as shown, then it is easily interpreted and imported into SciData. Later we will discuss more complex data structures, which are not so easily imported into SciData.
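To make the import step concrete, here is a small Python sketch of reading comma delimited time and velocity arrays into a record that carries its own column names and metadata. It does not use the SDS library or reproduce SciData's import; `import_csv`, the metadata keys, and the sample numbers are assumptions for illustration.

```python
import csv
import io


def import_csv(text, column_names, metadata=None):
    """Parse comma-delimited numeric rows into named columns plus free-form metadata."""
    columns = {name: [] for name in column_names}
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        if len(record) != len(column_names):
            raise ValueError(f"expected {len(column_names)} cells, got {record}")
        for name, cell in zip(column_names, record):
            columns[name].append(float(cell))
    return {"columns": columns, "metadata": dict(metadata or {})}


# Placeholder file contents: time (s), velocity (m/s).
sample = "0.0,40.0\n0.5,35.0\n1.0,32.0\n"
dataset = import_csv(sample, ["t", "v"], {"units": {"t": "s", "v": "m/s"}})
```

Keeping the names and units alongside the arrays is the self-descriptive part: anyone receiving the record can tell what each column is without the original worksheet.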
• Now we can add additional fields to the data to fill out the metadata. In this step we add textual characteristic values. ParachuteType is added as a category, for example, so that each type of parachute used is tracked. In this example, the types used were simply labeled A, B, C, and D.
• In this step we add a numerical characteristic value, A_x, to represent the size of the parachute. As can be seen, we now have columns for all the different value types (arrays, text, numbers). It is possible to add as many columns of the different data types as you wish, allowing you to fully document your experiment.
• By clicking the ‘Send Row’ button, data from the selected row is sent to the target MathCAD file. With a blank MathCAD file, simply type the variable name to see that it is populated with the data. Note that MathCAD evaluates top to bottom, left to right. The data and variables sent are defined in memory at the very top of the file. They can be redefined at any point, just like regular defined variables.
• We can now modify our original file to accept the input from SciData. Retrieving results back into SciData requires the WRITEPRN function with a special tag that specifies what is to be retrieved. The tag identifies the type of result being retrieved: a single value, an array, or a string. It is followed by the variable name that will be used to identify the result in SciData. The WRITEPRN function then writes the result to a file, which is read into SciData.
• Then click Scan file and the result column is added.
• Now everything is set up. With the click of a button we can process all the data automatically, batch processing the whole data library with our single MathCAD analysis.
• When complete, we can see everything in one table. It is now very simple to copy the table and put it in a report. It’s important to note here that since this was done automatically, we can be confident that the results are correctly linked to the data. In the past, this type of thing was done manually: a table was either kept by hand, or results were copied and pasted one by one from MathCAD. When done like that, there is always room for simple errors. This example shows just 4 tests, but in real life we could have tens or hundreds of tests, and the benefit of a tool that automates the analysis grows accordingly.
• This concludes the first half of this presentation, which summarizes the data management method with SciData in a nutshell. I hope that this example has given you a good feeling for the benefits of such a tool. Please take note that SciData is available as a free resource at sourceforge.net. Since I am the author of this application, please don’t hesitate to contact me if you have questions, suggestions, or problems. This is still very much a work in development, so I can easily make changes based on feedback. For the second half of the presentation I’ll be going over additional examples and covering additional challenges of data management.
• I will attempt to cover the listed examples here in the remaining time. If anyone has more specific problems that you would like covered, please let me know and I’ll see if I can address it. The first example will demonstrate how to use SciData to also drive a Design Table.
• By using SciData in a similar way, we can use it to drive a design table. Again, we can use a single MathCAD file to run some specific calculations for a particular design case. Then we can turn certain values into parameters and drive those parameters, row by row, from SciData. For this example, the driving parameters are the drag coefficient, C_v, the cross sectional area, A_x, and the load, m_g, as can be seen in the table. Four cases are considered, with the drag coefficient ranging from 2 to 5. In this case, we retrieve the velocity change over time as a result rather than supplying it as an input. We are also retrieving the velocity at the 1 second mark, not for any particular reason, just as an example of a single value result. We can then batch process our calculations and see our results. After doing that, we can compare the results and see a trend between the velocity and the coefficient of drag: not surprisingly, the more drag, the slower the speed. You may be thinking that this can be done in MathCAD alone, which is true, but it certainly becomes more challenging. Consider the use of the Odesolve block. If our inputs are arrays rather than single values, we need to somehow modify the solve block to handle this complexity. I’m not even quite sure how I would handle it; suffice it to say it would not be easy. Now that we have completed our design table, another function of SciData is available. As shown here, we identified a trend of velocity with coefficient of drag. SciData is broken up into Row and Table tabs, for row by row operations and for full table operations. Switching to the Table tab, you will see that an additional MathCAD analysis file can be added to accept the full table data set.
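A design table driven row by row can be sketched in Python as follows. Several things here are assumptions and not taken from the slides: the C_v values 2, 3, 4, 5 (the notes only say four cases from 2 to 5), the area and mass values, the air density, and the choice of a drop from rest. For that last case the balance m·dv/dt = m·g − ½·ρ·C_v·A_x·v² has the closed-form solution v(t) = v_t·tanh(g·t/v_t) with v_t = sqrt(2·m·g/(ρ·C_v·A_x)), which is used here in place of the Odesolve block.

```python
import math

G = 9.81     # gravitational acceleration, m/s^2
RHO = 1.225  # air density, kg/m^3 (assumed)


def velocity_from_rest(t, cv, area, mass):
    """Closed-form velocity for a drop from rest with quadratic drag."""
    v_terminal = math.sqrt(2.0 * mass * G / (RHO * cv * area))
    return v_terminal * math.tanh(G * t / v_terminal)


# Hypothetical design table: one row per case, parameters as named columns.
design_table = [
    {"C_v": cv, "A_x": 1.5, "mass": 100.0} for cv in (2.0, 3.0, 4.0, 5.0)
]

# Row-by-row "batch process": each row drives the same calculation.
for case in design_table:
    case["v_1s"] = velocity_from_rest(1.0, case["C_v"], case["A_x"], case["mass"])
    case["v_curve"] = [
        velocity_from_rest(0.5 * i, case["C_v"], case["A_x"], case["mass"])
        for i in range(11)
    ]

if __name__ == "__main__":
    for case in design_table:
        print(case["C_v"], round(case["v_1s"], 2))
```

Each row ends up holding both a single value result (`v_1s`) and an array result (`v_curve`), mirroring the two kinds of results retrieved in the talk.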
• When the full table is sent to MathCAD, the arrays are sent too, but they are put together in a matrix, where each column represents one row sent. Making a comparison plot in MathCAD therefore requires either breaking up the matrix into separate arrays or using the 3D plot, as shown here. One of the downsides of MathCAD is that plots cannot be scripted. All changes to the plot need to be done manually, which can be very time consuming. For comparison, I show here how the same plot can be made with a script in Scilab. As can be seen, this takes only 4 lines of code. Note how the legend is automatically generated from the data. To show how beneficial this is, we can filter the table to just a couple of rows, and the legend will correctly reflect the changes. The MathCAD file would reflect the changes too, but unfortunately there is no way to generate a legend.
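The matrix layout and the data-driven legend can be illustrated without any plotting library. The Python sketch below is an assumption-laden stand-in (it is not the Scilab script from the slide): it packs each row's array into one matrix column, splits the matrix back into separate arrays, and builds legend labels from a table column so that filtering the table changes the labels automatically.

```python
def table_to_matrix(rows, array_name):
    """Stack one array per table row as the columns of a matrix (list of sample rows)."""
    arrays = [row[array_name] for row in rows]
    if len({len(a) for a in arrays}) != 1:
        raise ValueError("all arrays must be non-empty in count and equal in length")
    return [list(samples) for samples in zip(*arrays)]  # matrix[i][j]: sample i, row j


def split_columns(matrix):
    """Break the matrix back into one array per original table row."""
    return [list(column) for column in zip(*matrix)]


def legend_labels(rows, label_name):
    """Build one legend entry per table row from a named column."""
    return [f"{label_name} = {row[label_name]}" for row in rows]


# Placeholder table with three rows.
table = [
    {"C_v": 2.0, "v": [0.0, 4.8, 9.3]},
    {"C_v": 3.0, "v": [0.0, 4.7, 9.0]},
    {"C_v": 4.0, "v": [0.0, 4.7, 8.8]},
]
matrix = table_to_matrix(table, "v")
curves = split_columns(matrix)

# Filtering the table first means the labels follow the filter with no manual edits.
filtered = [row for row in table if row["C_v"] >= 3.0]
labels = legend_labels(filtered, "C_v")
```

The `curves` and `labels` lists are in the same order as the table rows, so they can be handed to whatever plotting tool is in use.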
• Here is the filtered result
• Although MathCAD may not have a data management solution, it does have an excellent ability to import and interpret data from different file types. The next few slides cover several good strategies for importing differently formatted data using MathCAD. In this way MathCAD becomes a great tool for translating data from non-standard sources into SciData. Since every program and reference outputs data in a different way, this is a very useful thing to know. The first step in reading a data file is to provide a file path. The examples shown here use the file structure that SciData implements. When using SciData, a variable ‘data_path’ is always sent to MathCAD. It is then easy to build a path to a particular data file using the data_path variable and the concat function. As can be seen, the path to ParachuteA.dat is built as shown. Outside of SciData, if you are simply building a path to a particular data file, it is best to use a relative path, as shown in Option 2. The dot-dot-backslash notation allows you to move up the folder structure and then enter the Data folder. It’s always a good idea to use relative paths so that your MathCAD file does not fail when moved to another location, such as when emailing it to a colleague. Note that we now have a file path that is built from the Name variable. All we need to do is change the Name and we can target a different file.
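The two ways of building a path from a name can be mirrored in Python. This is a hedged analogue of the idea, not MathCAD's concat call: the function names are made up for this sketch, the folder in the usage line is hypothetical, and `PureWindowsPath` is used only so the backslash form is produced on any operating system.

```python
from pathlib import PureWindowsPath


def path_from_data_path(data_path, name, extension=".dat"):
    """Option 1: join a supplied data folder with a file name built from `name`."""
    return str(PureWindowsPath(data_path) / (name + extension))


def relative_data_path(name, extension=".dat"):
    """Option 2: go up one folder, then into Data, so the path survives being moved."""
    return str(PureWindowsPath("..") / "Data" / (name + extension))


# Changing only the name retargets the file.
for name in ("ParachuteA", "ParachuteB"):
    print(relative_data_path(name))
```

Whether one level up is the right depth depends on where the analysis file sits relative to the Data folder; the sketch assumes the two folders are siblings.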
• Now that we have a path to the data file of interest, we can use different functions in MathCAD to read in the data. The READFILE function is great at importing delimited data, whether the delimiter is a comma, tab, space, etc. One of the downfalls of this function, though, is that it tries to figure out what the delimiter is. The example shown here does not work well with READFILE. As can be seen, we want the file read as comma delimited, but the function assumes a space delimiter. The result is not right, as shown. A workaround is to include a comma in the first line. This tricks the function into using the comma delimiter. Another option is to write the highlighted area to a file and read it back in. Note that READFILE also works with Excel files, and additional inputs are available for specifying which columns and rows to import.
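Why a wrongly guessed delimiter spoils the import is easy to reproduce outside MathCAD. The Python sketch below does not imitate READFILE's detection logic, which is not described in these notes; it only contrasts splitting the same line on whitespace with splitting it on commas, using made-up sample values.

```python
def split_cells(line, delimiter=None):
    """Split one line into cells; delimiter=None means 'any run of whitespace'."""
    return [cell.strip() for cell in line.split(delimiter)]


def parse_rows(text, delimiter=","):
    """Parse numeric rows using an explicitly stated delimiter instead of a guess."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        rows.append([float(cell) for cell in split_cells(line, delimiter)])
    return rows


line = "0.0, 40.0"
as_whitespace = split_cells(line)    # cells keep a stray comma and are not numbers
as_comma = split_cells(line, ",")    # clean cells
rows = parse_rows("0.0, 40.0\n0.5, 35.0\n", delimiter=",")
```

Stating the delimiter explicitly plays the same role as the workaround in the talk: it removes the guess.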
• Another option is available using the File Input or Data Import Wizard. This component seems like a nice tool, but unfortunately it has some downsides and bugs that I have noticed in the past. File Input cannot use a variable path, and Data Import Wizard cannot use a relative path. I have also seen that when updating the file variable to point to a new path, the component has trouble properly updating. Therefore, I prefer the READFILE function.
• Another available function for importing data is READPRN. This function accepts only a file path. It scans the file, extracts an identifiable pattern of numbers, and ignores any text. This example shows how the header and footer text is automatically removed and only the numerical information is extracted. This works if the data is comma separated as well. Another example is shown here to further demonstrate how well READPRN filters out text, finds the numerical information, and returns it. In this case, it saves a lot of the work of extracting a single value from a text report.
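The header-and-footer filtering can be approximated in a few lines of Python. This is only a rough imitation written for illustration, and READPRN's real rules may differ: the sketch keeps a line only if every comma- or whitespace-separated token on it parses as a number, so it would not pull a number out of the middle of a sentence. The sample text is invented.

```python
import re

_SEPARATORS = re.compile(r"[,\s]+")


def extract_numeric_rows(text):
    """Keep only the lines made up entirely of numbers; drop headers, footers, labels."""
    rows = []
    for line in text.splitlines():
        tokens = [tok for tok in _SEPARATORS.split(line.strip()) if tok]
        if not tokens:
            continue
        try:
            # Note: float() also accepts words like "nan" and "inf".
            rows.append([float(tok) for tok in tokens])
        except ValueError:
            continue  # at least one token is text, so skip the whole line
    return rows


report = """Parachute test, run 1
t, v
0.0, 40.0
0.5  35.0
End of file
"""
numbers = extract_numeric_rows(report)
```

Both the comma separated and the space separated data lines survive, while the title, column header, and footer are dropped.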
• In cases where a data file contains mixed types of data, as with the example parachute data file, we can use the READFILE function and then set up some logic to extract the desired information, as shown in the example MathCAD document here. This is certainly more challenging, but it gives you the flexibility to work with any data format you are given. As can be seen here, we extract the ‘t’ and ‘v’ arrays and the cross sectional area, A_x. When using SciData, we can use the 4 step approach shown here. The first file extracts the data from the non-standard file, as shown here, and imports it into SciData, which then allows us to start with a Clean Slate file and focus purely on the analysis.
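Extraction logic for a mixed file might look like the following Python sketch. The file layout used here is hypothetical, since the actual parachute file format is not reproduced in these notes: it assumes "key = value" lines for characteristic values and two-column comma separated lines for the time and velocity samples.

```python
def parse_mixed_file(text):
    """Pull named scalars and the t/v arrays out of a file that mixes text and numbers."""
    values = {}
    t, v = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            try:
                values[key] = float(value)   # numeric characteristic value
            except ValueError:
                values[key] = value          # keep text values as text
            continue
        parts = [part.strip() for part in line.split(",")]
        if len(parts) == 2:
            try:
                t_i, v_i = float(parts[0]), float(parts[1])
            except ValueError:
                continue  # a two-column header such as "t, v"
            t.append(t_i)
            v.append(v_i)
    return {"t": t, "v": v, **values}


# Hypothetical file contents.
sample = """Parachute drop test
ParachuteType = A
A_x = 1.5
t, v
0.0, 40.0
0.5, 35.0
"""
extracted = parse_mixed_file(sample)
```

Once the arrays and characteristic values are separated like this, they can be stored as named columns and the analysis file never needs to know about the original layout.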
• I have one additional example here to further demonstrate the benefits of data management with SciData. In this example we will be analyzing high speed video data. The high speed video is of a droplet ejecting from an orifice, and our goal is to track the position of the droplet. Each of the 34 frames of the video file is written to a bmp file which MathCAD can analyze, as shown here. By making the file name an input variable, we can quickly change it and analyze each file. By using SciData, we can drive the file name input variable and batch process the analysis; furthermore, we can store the results for each file. In this example we are storing the droplet’s position relative to the orifice. After collecting this value for each frame, we can then plot the change in position over time by running a Full Table Analysis. As expected, we can see the drop decelerating. In order to set this analysis up, we need to set up SciData with rows containing the image file names. This can be done by moving the image files to the Data folder and scanning for them. SciData looks for SDS data files with the *.csv or *.nc extensions. SciData also scans for *.dat files. When files are found, rows are added to represent them. We can therefore add the .dat extension to all the image files so they are recognized when scanning. A quick rename of all the image files can be done with a ‘ren’ call in DOS, as shown. Then, with the click of a button, we can batch process all 34 images automatically.
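For readers not working at a DOS prompt, the bulk rename can be sketched in Python. This is an equivalent written for illustration rather than part of SciData; the function name is made up, and the folder passed in is whatever Data folder is being scanned.

```python
from pathlib import Path


def add_dat_extension(folder, pattern="*.bmp"):
    """Append '.dat' to every file matching `pattern`, e.g. frame01.bmp -> frame01.bmp.dat."""
    renamed = []
    # sorted() collects the matches first, so renaming does not disturb the iteration.
    for path in sorted(Path(folder).glob(pattern)):
        target = path.with_name(path.name + ".dat")
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `add_dat_extension("Data")` on a folder of frames leaves the image contents untouched; only the names change, so the analysis can still open each file as a bitmap through its file name input.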

1. Getting Organized: Data Management for MathCAD
   Bradley Carman, June 2012
2. Introduction
   • MathCAD is the ideal tool for data analysis
   • Not ideal for data storage and management
   • A tool is needed for Data Management: started with Excel Add-in, evolved to full application (SciData)
3. Applied Example: Drag Coefficient Measurement for Parachute
   • Parachutes A-D dropped from 100 m elevation
   • Velocity measured every 0.5 s for 5 s
   • 100 kg mass attached to each parachute
   • Goal is to determine the Drag Coefficient: Cv
   [Figure labels: A: Ac = 1.5 m^2, B: Ac = 1.75 m^2, C: Ac = 1.4 m^2, D: Ac = 2 m^2]
4. Data Analysis – The Current Way
   • MathCAD easily provides the Data Analysis for this problem: Plotting, Curve Fitting, etc.
   • But, current data import is “Hard Coded”
   [Figure labels: 1. Hard Coded Data Input and Characteristic Values; 2. Analysis & Results]
5. Data Analysis – The Current Way
   Problems:
   • Slow to change out data sets
   • No link between characteristic values and data sets
   • Changing data sets often leads to separate MathCAD files = tendency for errors
6. Data Analysis – The New Way
   • Replace hard coded data and characteristic values with variables
   • Feed the variables data from database using the MathCAD API
   • Analysis & Results remain exactly the same!
   [Figure labels: 1. Variable Data Input/Characteristic Values; 2. Analysis & Results]
7. Data Analysis – The New Way
   Result:
   • Single MathCAD document and analysis
   • Quickly change out data sets
   • Data and characteristic values can be linked
8. Separating Data from Analysis - Implemented
   • Real benefit now comes from a Data Management System as the driver of inputs and outputs of the analysis
   • Now it is possible to Batch Process all data sets!
   [Diagram: the Data Management System holds Data, Characteristic Values, and Results for each data set; Data and Characteristic Values are sent to MathCAD, and Results are retrieved]
9. Demonstration: Using SciData as the Data Management System
   • Data is organized in a table
   • Each row contains a data set, characteristic values, and results
   [Figure labels: Data; Characteristic Values; Results]
10. Step 1: Importing Data
   • Data files use SDS Standard
11. Step 2: Categorize the Data Set
12. Step 3: Add Characteristic Values to the Data Set
14. Step 5: Execute the Link
15. Step 6: Setup Inputs and Outputs
   • Note: Results are retrieved by file due to bugs with MathCAD API. WRITEPRN is much faster.
   • # = Result tag
16. Step 6: Setup Outputs Continued…
17. Step 7: Batch Processing
18. Step 8: Compile Results
   • All results are now compiled together and ready to be presented
   • Since everything came from the same MathCAD analysis, there are no discrepancies among results
   • Fast, organized, and error free!

   Parachute   Ax     Cv
   A           1.5    1.48
   B           1.75   2.34
   C           2      2.59
   D           1.4    1.84

19. Conclusion
   Data and Analysis are now separate:
   • Now have a good place to store and organize data and information
   • Enabling batch processing
   • Improving efficiency and accuracy!
   sourceforge.net/projects/scidata/
20. Additional Examples
   • Design Table
   • Full Table Analysis
   • Table Filter
   • Data Extraction
   • High Speed Video Analysis
21. Parachute Design Table
22. Full Table Analysis
23. Table Filter
24. Data Extraction
   • When SciData cannot interpret a data file, only the file name is added
   • Use MathCAD to interpret and filter more complex data files, then export to SciData
   • First challenge: Building a path to the file using the name
   • Building a relative path is best practice
25. Data Extraction
   • Handling different data formats
26. Data Extraction
   • Use READFILE or READPRN
   • Don’t Use Import>Data>…
   • Problem: Can’t use variable path
   • Problem: Can’t use relative path ‘....Data’; has bugs (problems updating)
27. Data Extraction
   • READPRN can filter out all non-numeric information automatically.
28. Data Extraction
   1. Data Extraction File
   2. Extract to SciData
   3. Add Clean Analysis File
   4. Start with a Clean Slate
29. High Speed Video Drop Analysis
   1. In DOS run: ren *.bmp *.bmp.dat
   2. Scan Data folder
   3. Batch Process, get dropPositionX
   4. Full Table Analysis
30. Review
   Data Management for MathCAD
   • SciData can be downloaded at: sourceforge.net/projects/scidata/
   • Data Management benefits:
     – Organization and error avoidance
     – All data types stored conveniently together in a standardized form (metadata and arrays)
     – Batch Processing
     – Row by Row and Full Table Analysis from the same root source
   • Contact Info: Brad Carman, bradcarman@users.sourceforge.net
   • Questions? Remember your evaluations