Presents…! Data Mining an Expansive Groundwater System
Advanced Data Mining (ADMi) has developed unique Data Mining technology for modeling natural systems. This video demonstrates its application to an expansive groundwater system. Data Mining extracts valuable knowledge from large amounts of data. It employs advanced methods from several scientific disciplines.
The groundwater system of interest is the Upper Floridan Aquifer in the Suwannee River Valley.
This system is approximately 100 x 120 miles, with a maximum surface elevation of 220 feet. The following illustration shows its topography. Land elevation is indicated by the key at left. The path of the Suwannee River can be readily seen near the center.
Histories for a few wells go back to the 1940s; however, the record prior to 1982 is sparse. The vertical blue streaks in the following 3D image show the historical range of individual wells. Together they show the dynamic range of the aquifer.
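[Figure: 3D view of well histories. Vertical axis: elevation above sea level. Compass directions N, E, W, S are marked, with the Gulf of Mexico labeled.]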
Collectively, these data comprise a vast but unwieldy source of potentially valuable knowledge. We researched how Data Mining could be used to extract knowledge about this complex system and others like it.
Computer models of groundwater systems are important tools for learning how these invaluable resources are affected by weather, pumping, and land development. Our goal was to use Data Mining to create an accurate model of the aquifer’s water level.
The following is a 25 x 30 mile detail from near the center of the system. It shows the positions of 22 wells and their histories since 1982. Note that the two groups of circled wells clearly behave differently from each other.
[Figure: 25 x 30 mile detail map of well positions and histories, with the Suwannee River labeled. Axes show map coordinates.]
Because the wells exhibited so many different behaviors, it was necessary to group them into “classes”. Wells assigned to a particular class behave similarly. Data Mining optimally determined the number of classes and how the wells would be assigned.
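The presentation does not name the method used to form the classes. As a rough illustration only, the sketch below groups wells by the shape of their histories using plain k-means on rescaled series. The well names, water levels, the z-score rescaling, and the fixed choice of two classes are all assumptions made for the example, not details from the study.

```python
import math

def zscore(series):
    """Rescale one well history to zero mean and unit variance.
    Assumes the series is not constant (non-zero spread)."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    return [(x - mean) / std for x in series]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length histories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(histories, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centers = [histories[0]]
    while len(centers) < k:
        # Next seed: the history farthest from every current center.
        centers.append(max(histories,
                           key=lambda h: min(sq_dist(h, c) for c in centers)))
    labels = []
    for _ in range(iterations):
        # Assign each history to its closest center.
        labels = [min(range(k), key=lambda j: sq_dist(h, centers[j]))
                  for h in histories]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [h for h, lab in zip(histories, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical water levels (feet) for four wells; not data from the study.
wells = {
    "A": [10.0, 12.0, 14.0, 12.0, 10.0, 8.0],
    "B": [30.0, 34.0, 38.0, 34.0, 30.0, 26.0],
    "C": [20.0, 18.0, 16.0, 18.0, 20.0, 22.0],
    "D": [5.0, 4.5, 4.0, 4.5, 5.0, 5.5],
}
names = sorted(wells)
labels = kmeans([zscore(wells[n]) for n in names], k=2)
classes = dict(zip(names, labels))
print(classes)
```

Wells A and B rise and fall together despite very different absolute levels, so they land in one class, while C and D land in the other. Choosing the number of classes would need an additional criterion, for example comparing within-class scatter across several values of k.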
The following shows that 12 classes were used and how the wells were assigned. The classes are numbered 1 to 12. It was surprising how some classes are distributed over a broad area and are intermingled with other classes.
Closer inspection showed that Data Mining did indeed optimally assign the wells. The following shows the “normalized” histories of wells for two of the classes. Note the seasonal variability.
The next Data Mining task was to assign aquifer locations to the 12 classes. Locations were optimally assigned based on their topological characteristics and proximity to wells whose classes were known. Results are shown in the following.
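One simple way to picture this step is a nearest-neighbor vote: a location takes the class held by most of the closest wells. This is a stand-in technique, not necessarily the one used in the study. The coordinates, elevations, class numbers, and the `elevation_weight` parameter below are hypothetical.

```python
import math
from collections import Counter

# Hypothetical wells with known classes:
# (x_miles, y_miles, land_elevation_ft, class_id)
known_wells = [
    (2.0, 3.0, 50.0, 1),
    (3.0, 4.0, 55.0, 1),
    (20.0, 22.0, 150.0, 3),
    (21.0, 20.0, 140.0, 3),
    (19.0, 23.0, 160.0, 3),
]

def assign_class(x, y, elevation, wells, k=3, elevation_weight=0.1):
    """Vote among the k wells nearest in a combined position/elevation distance.
    elevation_weight (assumed) scales feet of elevation against miles of distance."""
    def distance(well):
        wx, wy, welev, _ = well
        return math.sqrt((x - wx) ** 2 + (y - wy) ** 2
                         + (elevation_weight * (elevation - welev)) ** 2)
    nearest = sorted(wells, key=distance)[:k]
    votes = Counter(well[3] for well in nearest)
    return votes.most_common(1)[0][0]

print(assign_class(2.5, 3.5, 52.0, known_wells))
print(assign_class(20.0, 21.0, 150.0, known_wells))
```

Mixing position and elevation in one distance is a design choice made for brevity. A real assignment would need the characteristics scaled so that no single one dominates.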
The next Data Mining task was to create a water level model for each class. Every location was assigned to a class and, therefore, to a model. Inputs to each model were the characteristics of a location and the water levels of selected wells. The output was the predicted water level of the location.
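The input/output arrangement described above can be sketched as a lookup from class to model. The presentation does not say what kind of model was used, so a linear form is assumed here purely for illustration. The `ClassModel` type, its coefficients, and the well names W1 to W3 are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ClassModel:
    """Stand-in linear model (assumed form):
    level = bias + elevation_weight * elevation + sum(weight_i * well_level_i)"""
    bias: float
    elevation_weight: float
    well_weights: dict  # reference well name -> weight

    def predict(self, elevation, well_levels):
        total = self.bias + self.elevation_weight * elevation
        for name, weight in self.well_weights.items():
            total += weight * well_levels[name]
        return total

# Hypothetical coefficients; real ones would be fitted to each class's histories.
models = {
    1: ClassModel(bias=1.0, elevation_weight=0.1,
                  well_weights={"W1": 0.5, "W2": 0.5}),
    3: ClassModel(bias=-2.0, elevation_weight=0.2,
                  well_weights={"W3": 1.0}),
}

def predict_level(location, well_levels):
    """Route a location to the model of its assigned class."""
    model = models[location["class_id"]]
    return model.predict(location["elevation"], well_levels)

# Hypothetical water levels at the reference wells for one time step.
levels = {"W1": 20.0, "W2": 30.0, "W3": 40.0}
site = {"class_id": 1, "elevation": 50.0}
print(predict_level(site, levels))
```

Because each location carries only a class number, adding or refitting a class model changes predictions for every location in that class without touching the others.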
The models are very accurate. Accuracy can be checked at locations where there are well histories. The following compares predictions to actual histories for wells of four different classes. The water levels are normalized to land surface elevation.
[Figures: actual vs. predicted normalized water level above sea level for wells of Classes 1, 3, 6, and 10. History from April 1982 to October 1998.]
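A comparison of this kind can also be summarized numerically. The sketch below computes two common error measures, root-mean-square error and the coefficient of determination, for one well. The presentation does not report which measures were used, and the sample values are made up for the example.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error, in the same units as the water levels."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total variation."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted levels for one well.
actual = [10.0, 12.0, 14.0, 12.0]
predicted = [11.0, 12.0, 13.0, 12.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```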
The “model” of the aquifer is actually a collection of models, one for each class. A computer program was created that integrates the models, a history database, and a graphical user interface. The following shows a long-term simulation of the aquifer’s water level generated by the model. Note the color key at right, and that time is reversed.
Often multi-dimensional visualization reveals important information that would otherwise go unnoticed. ADMi has world-class capabilities in advanced visualization technology. The following shows the model’s prediction of the upper range (ceiling) of the aquifer. The vertical scale is exaggerated to show details.
The following shows the predicted aquifer level for the period from January 1995 to October 1998. Note the spatially asynchronous motions caused by variability in rainfall and the Suwannee River’s stage.
Conclusions
This Data Mining-based model required about 10 weeks to develop. A conventional finite-difference model of the same natural system was developed by a government agency. It took over 3 years to complete! It is much less accurate at predicting water level.
Data Mining is incredibly powerful for extracting knowledge about complex natural systems from databases. The models can be more accurate than traditional approaches, and require much less time to develop.