Georgia Geospatial Workshop: Proper Care and Feeding of Metadata
Ryan E. Bowe, GISP
Photo Science, Inc. – A Quantum Spatial Company
ASPRS YP Council Mentoring Coordinator
Vanguard Cabinet Steering Committee Member
Cumberland URISA President-Elect
Ryan Elizabeth Bowe, GISP: URISA Vanguard Cabinet member (January 2012 – January 2014), now a Steering Committee member; Secretary of Cumberland URISA, now President-Elect; ASPRS Young Professionals Council member. At Photo Science, a Quantum Spatial Company, I started out as a GIS Technician and have moved all around, including a stint as Alternate Sensor Operator, before settling in as Metadata and Report Manager. I have written hundreds of thousands of metadata files! I really and truly LOVE metadata and hope I can share my passion with you today. Here is the first look at XML as well. The tags are fake, with the exception of the first line, and the slide also shows the importance of comments. The first line points to a CSDGM Document Type Definition (DTD). It helps when writing in an XML editor, but it will cause Internet Explorer to freak out if the DTD file isn’t in the same folder as the XML file. I consider XML to be much more flexible than text, and if you are writing ISO metadata, you have to write it in XML format (unless you have a great tool). But we’ll get into all these acronyms momentarily.
Yep, I love metadata so much I consider it yummy. After a minimal overview of metadata meanings and options, we will talk about Metadata Madness, where I will go over the FGDC CSDGM sections in reverse order. Then: the LiDAR Base Specification; a brief look at ISO metadata; maintaining metadata; good, bad, and ugly examples; and sections to watch for updating.
I love the can label justification for metadata. You might have heard that story about going into the store and finding an unlabeled can and trying to guess if it is cat food or tuna (and how much it costs as well) and then being asked if you’d buy and eat the contents without all your senses present. But since this is Proper Care and Feeding of Metadata, let’s try a new example.
So you don’t think you need metadata? Well, then. I have some seeds to sell you. I don’t have a clue what will sprout from these seeds. I don’t know how you should plant them. Sun or shade? Potted or outdoors? With lots of space to grow or a minimal footprint? Water? Fertilizer? Germination time frame? Do you need a “male” and “female” plant like with kiwi? Is it legal or illegal (unless you’re in Colorado)? I don’t know how long the entity that grows will last. What if they are Piranha Plants?!? How do you kill them? Will you be picking beans until November? Will you instantly be killing giants like Jack the Giant Slayer if you “add water”? I don’t know what you will pay for them. And, once I do charge you my mystery fee, I have no clue how you will receive them. Worst of all, since I’m a mystery seed seller…you’ll have no way of knowing how to contact me if you do “get them wet” (or feed them [the mogwai] after midnight).
Don’t you think Seymour wished Audrey II came with some metadata on that ill-fated total eclipse of the sun? So, is that a better example of Cat Food v. Tuna? About the same? These Little Shop of Horrors images make a great point as well…don’t wait until the metadata beastie is a big enough problem that it can consume you whole!
So, what is metadata? A headache, right? Like organizing all these library card catalog entries after Ghostbusters. And, I’ve heard metadata likened to these old school card catalogs. But I heard rumors that they’re doing away with such things and going digital…so I have to wonder how long this comparison will be relevant. I used this example and found out that there are some active aerial photography card catalogs out there!!! Boy, did I feel bad.
And it’s been likened to the information on the back of photographs. I know those have gone digital. Here is an example from Lightroom. My cameraphone took this photo on the 7th of July at 8:37:02 PM. The light was fantastic (and I could get into some geeky camera terms that are well-labeled, but we’re not here to talk camera stuff, we are here to talk metadata).
Here is another example of what I consider “current” metadata: your music collection! (No more 8-track, Record, cassette, CDs!) So, imagine you had a ton of Unknown songs in your music cloud. Do you still consider metadata something to be avoided at all costs?
And, lastly, I also believe the new reason I probably could have skipped the last few slides and why no one really needs to define metadata anymore is the NSA “scandal”. It is scary when you put it in this context, but when you think about your highly valuable geospatial data, it’s perfect, right? You don’t have to look at the 2GB image, you can read the metadata and know ANYTHING. How, you ask?!? Let’s look at one more definition, this one relevant to geospatial data.
I’ve had enough fun defining metadata in general, so let’s talk about geospatial metadata. Back in 1994, Bill Clinton signed Executive Order 12906, establishing the National Spatial Data Infrastructure (NSDI) and directing the Federal Geographic Data Committee (FGDC) to coordinate it, including a clearinghouse of geospatial data. The clearinghouses have changed faces over the years, but their searches have been based on the Content Standard for Digital Geospatial Metadata (CSDGM).
Before we talk about what TO do, let’s make sure you know EXACTLY what NOT to do. Do not stare at a blank slate! Instead: look at the actual dataset; start to gather facts (talk to people who worked on the dataset if you didn’t work on it yourself); request information from the “Source” if you aren’t the source; search for relevant templates (by searching for similar datasets, if nothing else); and consider finding a quiet place to write metadata, or your coworkers may accuse you of turning into an Audrey II when you write metadata.
CSDGM is your best friend while writing FGDC Metadata. I have a well-worn copy printed out by my desk. I still have to double check things when I write sections I do not use all the time. In order to familiarize myself with the document, I went through and I highlighted all the optional fields in my copy. It helped reinforce the “symbols” they use there. It also helps for days when the curly brace looks an awful lot like a parenthesis. Ahem…days when you’re feeling old. When you realize the “next” generation isn’t going to have to learn how to use a card catalog.
The other big geospatial metadata standard is ISO 19115. NOAA’s National Coastal Data Development Center [NCDDC] has a great series on it. I’ve taken it several times and learn something new each time. Since they do so well, I’m going to focus on FGDC more. I know, that link is difficult to read. But if you search for NOAA Metadata Training…that’s the first link. I can’t recommend their webinars slash training enough. They have recordings on their website that are complete with “homework”. They currently have three courses running: Introduction to Geospatial Metadata, ISO 191** Metadata, and Transitioning from FGDC CSDGM Metadata to ISO 191** Metadata. Don’t worry about missing some of the courses: they put the recordings on their FTP site so you can catch up this weekend.
At one point in time I’d suggest going to ArcGIS as better than a blank slate because it had a readable interface with CSDGM specifications in it, but now, with 10.x, not so much. Here’s the old school 9.3.x editor, may it rest in peace.
Now, we have this…initially. Oh, but don’t forget the nifty trick of turning it to FGDC metadata in the options.
Just in case you haven’t found the options interface, here it is. It is under the Customize > ArcCatalog Options menu. There, you get to choose your Metadata Style. And, also notice that you can tell it whether or not to automatically update your metadata. I usually like to leave this unchecked, but it is a personal preference. And it will be very nice if you need to track exactly how you created a feature.
But, now that you have the “correct” options chosen, you get this. Sigh. It’s giving me ISO descriptions down there. How’s that going to help me write FGDC metadata? At least it has some of the required tags correct (Identification Information and Metadata are the only two required sections, right?)
See my cursor hovering over the Title element…
So, yep, we are working with ISO in an FGDC file. There are a few places where you can find FGDC descriptions, but it’s on ones that are FGDC only, really. For the title, though, it’s exactly the same in the Citation Information > Title (8.4): The name by which the data set is known. But man does that make my head hurt! So, let’s go back and look at the next couple of sections. (Skipping the browse graphic because that is still a very nice tool.)
Again…I hover over those boxes (sometimes it takes a bit of encouraging to get the text at the bottom to come up)…
So here are the next three sections. And the types…and I think I have an even bigger headache now! There’s no relation to tags. Abstract: a brief narrative summary of the data set. Purpose: a summary of the intentions with which the data set was developed.
Toolboxes. Before I just throw my hands up and walk away from ArcGIS, I will point out the all-important difference between the Models (two blue dots, a yellow dot, and a green dot) and the Tools in your toolbox. For whatever reason I have not been able to get the models to function properly. They always error out for me. Now, maybe it is better in 10.2, but I don’t know. I do know that the tools (hammers) work, so instead of wasting time seeing if the models work again…I stick with the hammer-time-tools!
Before we talk about tools other than ArcGIS, I have to remind you that you do not have to spend any money, because the only things you really need to write metadata are a text editor (such as TextPad), the standard (all free online), and a validator (MP is provided by the USGS and is free).
There are plenty of tools out there other than ArcGIS…play around with tools (they all have trial periods if they aren’t free) and find one with which you are comfortable. This is another thing NOAA NCDDC training does really well. They review several different tools (Mermaid, CatMDEdit, GeoNetwork, ISOMorph, Geoportal, Altova, oXygen). Again, the main thing is to find something you’re comfortable with and run with it. For the longest time I would only use UltraEdit. Now, I use oXygen. But I am always on the lookout for new ones. And, if you’re really REALLY good…like Watershed Sciences, a Quantum Spatial Company, you’ll make your own tools. That’s another topic altogether.
(This is a bit of a cart before the horse issue because you will probably want to decide if you are working in text before you commit to an XML editor. Then again, some of the platforms available will let you output the data in text or XML…so maybe it’s more of a chicken-before-the-egg debate?) Anyway, a bit more about “platforms”. Once you pick your editor, you also get to choose between text and XML. I personally love XML. It’s probably something to do with the fact that the spaces in text metadata make me feel vapid. If you’re off by one, you’re done. I don’t play games I cannot win, and that feels like the house always wins to me. Yes, text is more readable, but I’ve been working with XML and the CSDGM long enough to be comfortable with tags. These screen shots are UltraEdit (on a Mac) for the text on the left and oXygen XML for the XML on the right. Again on a Mac. And, again, I view XML as much more portable and flexible, so the rest of the webinar will be in XML tags. Also, if you are writing ISO metadata, you have to write it in XML.
Keeping with the USGS DEM example, here is an HTML view from http://ned.usgs.gov/ned.html. It’s a bit more readable because you can jump to sections with the hyperlinks, but it is not as easily computer readable. I guess that’s not too important, though, since this one doesn’t have the bounding coordinates!
So, since we’re learning XML because we will have to know how to use it in ISO, here’s a brief course on what you need to know about XML. It is very similar to HTML, if you’ve ever seen that.
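If you want to poke at the basics away from the slides, here is a minimal sketch using Python’s standard library (Python is my assumption for a scripting language; nothing in this workshop requires it). The element names are CSDGM short names, but the title text and the DTD file name in the first line are made up for illustration. It shows the three things you need to recognize: the DOCTYPE line, a comment, and tags that nest and close in order.

```python
import xml.etree.ElementTree as ET

# Made-up sample record. The DOCTYPE file name is an assumption for
# illustration; the parser does not go and fetch it.
SAMPLE = """<!DOCTYPE metadata SYSTEM "fgdc-std-001-1998.dtd">
<metadata>
  <!-- Comments look like this and are skipped by the parser. -->
  <idinfo>
    <citation>
      <citeinfo>
        <title>Example Elevation Tiles</title>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""


def is_well_formed(xml_text):
    """Return True when the text parses, i.e. every tag is closed in order."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(SAMPLE)
# Walk down the nesting: idinfo > citation > citeinfo > title.
title = root.findtext("idinfo/citation/citeinfo/title")
print(title)
print(is_well_formed(SAMPLE))
# idinfo is never closed here, so this one fails the check.
print(is_well_formed("<metadata><idinfo></metadata>"))
```

Well-formed only means the tags are balanced. It says nothing about whether the file follows the CSDGM; that is a job for a validator such as MP.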
When it boils down to it, all you really need is a text editor (and there are plenty that are free that work just fine) and the Metadata Parser (MP). By the way, I did NOT say Internet Explorer. Please don’t try to edit your XML files in IE. It just won’t work. It is, however, a good test to see if you have all your XML tags done properly. And, it is your link to MP! If you’re in for a challenge, install the MP software package. The problem with this is that updates come out so frequently that it is much easier to run the online translation. I will say using the command line MP input builds character.
When the government is not shut down, this is what MP looks like. It will always be the most current version, so you don’t have to worry about reinstalling or downloading the latest MP. But, every once in a while, you might want to because it may be the only link you have to MP.
Old school! When the Government shut down I was quite worried I wouldn’t be able to use MP. And when a metadata file needed to be written in Text and XML I was quite terrified. Translating to text is not something I will ever want to do by hand. So I looked through and I found I had just downloaded a version because I knew it had the LiDAR extension in it. Saved! This is what you see when you start command line MP. Yes, this is where you have to brush up on your DOS commands (hopefully you aren’t learning these for the first time ever, because that will make me feel like a grizzled old mapper)!
And here is how I used MP to translate a TIGER_unedited file to an XML file. Easy, right?
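The screenshot does not come through in these notes, so here is a rough sketch of the kind of call involved, wrapped in Python. The `-x` (XML output) and `-e` (error report) options are how I recall the USGS documentation for mp describing them, so check them against your installed version; the file names are hypothetical stand-ins, not the ones from my slide.

```python
import subprocess


def build_mp_command(input_file, xml_file, error_file):
    """Assemble an mp call that writes XML output and an error report."""
    return ["mp", input_file, "-x", xml_file, "-e", error_file]


def run_mp(command):
    """Run mp; this only works where the mp executable is on the PATH."""
    return subprocess.run(command, check=True)


# Hypothetical file names for illustration.
command = build_mp_command("tiger_unedited.txt", "tiger.xml", "tiger.err")
print(" ".join(command))
```

Keeping the command as a list makes it easy to loop over a folder of text metadata files and translate them all in one go.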
Now that I have shown you some Tools and Formats and you suffered through the old school command line example (which I had originally hoped to never have to show anyone), let me tell you a story about another tool that led to my discovery of the power of templates. A long, long time ago, maybe the fifth time I wrote metadata, a contract with metadata was brought to my attention. It had a link to something called “XMLInput”, but the link was broken. After some serious internet searching, I managed to find the proper link. I tried a few times to make the actual tool work, but I gave up because the templates provided with the tool were much better for me. But as I have repeatedly pointed out I love XML and I love the CSDGM. Although these templates make terrible bedtime reading (even for me), using them has made me a better metadata writer. In the background you see the 133UATemplate, where you can delete the comments with all the information you ever needed to know about the tags…so it is very easy to write what you need. Remember my introductory slide with comments? Here they are again! These are just a few templates that I rely on, there are often others for the different clients and “profiles”…one of the hardest has to be the National Flood Insurance Program (NFIP) because it has so many fields that do not change. It should be easy, but it is the square-peg-round hole issue. You have to describe your data with a set phrase that just…doesn’t describe the data! Another fun one is the USGS LiDAR base specification. It is one of the most recent revisions to MP! But, I haven’t seen the Document Type Definition [DTD] updated. If you really get into it, you can revise a version of the DTD so you can see the changes in your XML editor! Give yourself a few hours…we will look at the LiDAR Base specification later.
Before we look at some of the sections of FGDC metadata (and be thinking of which ones you want to talk about from 1-7…I view the last three as building blocks so the most important to discuss) I want to look at the training options available to us. Some of them are straightforward training sites and conferences, but others aren’t. You’d be surprised what social media can teach you! That’s where I found the NCDDC training. Also, lynda.com has an excellent XML class if you’re totally lost on my slides or need a refresher. While I like GeoSpatial Training Services, there’s nothing SPECIFICALLY metadata. Unless you’re needing coding lessons.
You can spend tons of money on various materials, but the best training method is to just hit the books on your own and write metadata. So let’s hit the books and look at some of the sections.
I know you have all seen this before at least once now, but I’ve modified it a bit because the three building blocks were left out. There are actually ten sections of the CSDGM, with the last three being repeated throughout. (just think about how many time period features there will be in metadata!) What we are going to do now is to go through the sections in reverse order, starting with 10, contact information. Here we go.
I can’t stress how important it is to get a handle on these sections because they are used throughout. So we will be looking at the sections in reverse order. I’ve tried to keep a color scheme going here where red has several options available, green are optional, and blue is a little quirk. So, onto the examples. For Contact Person, you can have a person or an organization. For my example, it is a person. You don’t have to list a contact position (cntpos). Mailing and Physical is an Address Type. Although the CSDGM allows free text, the suggestions are “mailing”, “physical”, or “mailing and physical”…This is absolutely NOT the first line of your address! This is supposed to tell people how they can use this address.
Here are the two options for Contact Person/Organization. You can actually have an organization with the contact person as well. I just chose not to insert that here.
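For reference outside the slides, here is a small sketch of a contact block and two checks on it, again in Python as my own choice of language. The tags are CSDGM short names; the person, company, address, and phone number are invented. The first function reports whether the primary contact is a person (`cntperp`) or an organization (`cntorgp`), and the second flags any address type that is not one of the three suggested values, which is a hint that someone typed a street address there.

```python
import xml.etree.ElementTree as ET

# Invented contact details for illustration only.
CONTACT = """<cntinfo>
  <cntperp>
    <cntper>Jane Mapper</cntper>
    <cntorg>Example Mapping Co.</cntorg>
  </cntperp>
  <cntaddr>
    <addrtype>mailing and physical</addrtype>
    <address>123 Example Street</address>
    <city>Lexington</city>
    <state>KY</state>
    <postal>40500</postal>
  </cntaddr>
  <cntvoice>555-555-0100</cntvoice>
</cntinfo>"""

SUGGESTED_ADDRESS_TYPES = {"mailing", "physical", "mailing and physical"}


def primary_contact(cntinfo):
    """Return ('person', name) or ('organization', name)."""
    if cntinfo.find("cntperp") is not None:
        return "person", cntinfo.findtext("cntperp/cntper")
    return "organization", cntinfo.findtext("cntorgp/cntorg")


def unusual_address_types(cntinfo):
    """List addrtype values outside the suggested three (free text is legal)."""
    return [
        a.text
        for a in cntinfo.iter("addrtype")
        if (a.text or "").strip().lower() not in SUGGESTED_ADDRESS_TYPES
    ]


contact = ET.fromstring(CONTACT)
print(primary_contact(contact))
print(unusual_address_types(contact))
```

Because the standard allows free text for the address type, the second check is a warning to review, not an error.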
There are three main types of dates: single, range (with beginning and ending), and multiple (which is made up of single dates…and by made up of I mean a multiple date and time must have at least two dates). You can also use time here, but I rarely use time entries so it is one of the tags I’d have to go back to the CSDGM to be able to write properly.
So here’s a range of dates.
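To make the three date types concrete, here is a hedged sketch that builds each one with Python’s standard library. The tag names (`sngdate`, `mdattim`, `rngdates`, `caldate`, `begdate`, `enddate`) are CSDGM short names; the function name and the dates are my own inventions, and I have left time-of-day entries out entirely.

```python
import xml.etree.ElementTree as ET


def time_info(kind, dates):
    """Build a timeinfo element from YYYYMMDD strings.

    kind is 'single' (one date), 'multiple' (two or more dates),
    or 'range' (a beginning date and an ending date).
    """
    timeinfo = ET.Element("timeinfo")
    if kind == "single":
        (caldate,) = dates
        single = ET.SubElement(timeinfo, "sngdate")
        ET.SubElement(single, "caldate").text = caldate
    elif kind == "multiple":
        if len(dates) < 2:
            raise ValueError("multiple dates need at least two entries")
        multiple = ET.SubElement(timeinfo, "mdattim")
        for caldate in dates:
            single = ET.SubElement(multiple, "sngdate")
            ET.SubElement(single, "caldate").text = caldate
    elif kind == "range":
        begin, end = dates
        date_range = ET.SubElement(timeinfo, "rngdates")
        ET.SubElement(date_range, "begdate").text = begin
        ET.SubElement(date_range, "enddate").text = end
    else:
        raise ValueError("kind must be single, multiple, or range")
    return timeinfo


# Made-up acquisition window.
range_xml = ET.tostring(time_info("range", ["20130301", "20130415"]), encoding="unicode")
print(range_xml)
```

Note how the multiple form is literally a stack of single dates, which is why it needs at least two of them.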
The only tricky part about Citation Information is that you can nest it incessantly within Larger Work Citation. I don’t see any reason to do this, though. I’d like to point out that green unknown up there. That’s a pet peeve of mine. It is optional, so since it is unknown just leave it out!
A bit more readable version of Citeinfo with just a bit less…
Here’s the graphical CSDGM, but it is missing the most important sections in my mind…which is what we will go over next. Granted, these are the “core” seven sections and the rest are just the building blocks that appear in most of these sections and sometimes multiple times in the same section. I’d like to see if we can cover the ones you mentioned, but we have to cover metadata because…
There is no good way to really explain the metadata information without falling down the rabbit hole. It is the biography of the data and where you give yourself credit for all the hard work you put into crafting this data documentation. Think of it as your byline.
…it is my favorite section! Take a look at the si section because it’ll look really familiar in a bit. I’ve also added in the LiDAR Base Spec…to prep you for that discussion. And, looking at the poll results, we are a pretty diverse group of producers and users. That’s the beauty of metadata – it’s flexible!
And here’s the brief version…which is much more readable!
Sorry, but if you thought my other slides were bad with the XML of sections, Distribution would have put you all into a coma. It’s back to the NSA and planning, which Pointy-Haired-Boss doesn’t do so well. And watch out for pointy haired bosses…they may say they’re editing your metadata but it’s in-one-ear-out-the-other. Sometimes it’s best to parse the metadata into an e-mail and say “here, read this and make sure you’re ok with it.”
Entity and Attribute Information is going to document the data columns. It has the potential to become very detailed, or extremely easy. First we will look at the overview (easy) and then the detailed, which has four options for the attributions.
Entity and attribute overview is simply a description and the “citation” of that description. Detailed is more complex, and built for “vector” data with true attributes. It doesn’t totally make sense for a true raster dataset because you’d have X, Y, and “value” but you can do that if you need to. So, the entity type has a label, a description and the source of the description…it’s like the overview of the details.
Then, you get into the actual attribution. This also has a label, a definition, and a source of the definition. Then the domain value has four options: unrepresentable (udom); codeset (codesetd), an established set of coded values; range (rdom); and enumerated (edom), an established set of values.
Here are the four types of entity and attribute information detailed sections. There’s also the overview section, but that is infinitely simpler than these detailed ones. Both overview and detailed sections of entity and attribute information are definitely sections to consider writing and having ready to pull into the larger file if you have commonly used fields.
Here are the next two…hopefully that explains the four options. There are more things you can do with the attributes (dating them, accuracy, and measurement frequency). Some of this feels very redundant to me because you talk about this overall, but if the attribute itself has values that are different than the overall dataset, I could see how it might be useful.
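Since the slides do not survive in these notes, here is a compact sketch with one attribute per domain type, plus a few lines of Python that report which type each attribute uses. The tags are CSDGM short names; the entity, the attribute names, and every value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented attributes, one for each of the four domain types.
DETAILED = """<detailed>
  <enttyp>
    <enttypl>Contours</enttypl>
    <enttypd>Elevation contour lines</enttypd>
    <enttypds>Producer defined</enttypds>
  </enttyp>
  <attr>
    <attrlabl>CONTOUR_TYPE</attrlabl>
    <attrdef>Kind of contour line</attrdef>
    <attrdefs>Producer defined</attrdefs>
    <attrdomv><edom>
      <edomv>Index</edomv>
      <edomvd>Index contour</edomvd>
      <edomvds>Producer defined</edomvds>
    </edom></attrdomv>
    <attrdomv><edom>
      <edomv>Intermediate</edomv>
      <edomvd>Intermediate contour</edomvd>
      <edomvds>Producer defined</edomvds>
    </edom></attrdomv>
  </attr>
  <attr>
    <attrlabl>ELEVATION</attrlabl>
    <attrdef>Elevation of the contour</attrdef>
    <attrdefs>Producer defined</attrdefs>
    <attrdomv><rdom>
      <rdommin>400</rdommin>
      <rdommax>1200</rdommax>
      <attrunit>feet</attrunit>
    </rdom></attrdomv>
  </attr>
  <attr>
    <attrlabl>STATE_FIPS</attrlabl>
    <attrdef>State code</attrdef>
    <attrdefs>Producer defined</attrdefs>
    <attrdomv><codesetd>
      <codesetn>FIPS state codes</codesetn>
      <codesets>Example code set source</codesets>
    </codesetd></attrdomv>
  </attr>
  <attr>
    <attrlabl>NOTES</attrlabl>
    <attrdef>Free-form editor notes</attrdef>
    <attrdefs>Producer defined</attrdefs>
    <attrdomv><udom>Free text entered by the editor</udom></attrdomv>
  </attr>
</detailed>"""

DOMAIN_NAMES = {
    "edom": "enumerated",
    "rdom": "range",
    "codesetd": "codeset",
    "udom": "unrepresentable",
}


def domain_types(detailed):
    """Map each attribute label to the domain types it uses."""
    result = {}
    for attr in detailed.findall("attr"):
        kinds = []
        for domain in attr.findall("attrdomv/*"):
            kind = DOMAIN_NAMES.get(domain.tag, "unknown")
            if kind not in kinds:
                kinds.append(kind)
        result[attr.findtext("attrlabl")] = kinds
    return result


domains = domain_types(ET.fromstring(DETAILED))
for label, kinds in domains.items():
    print(label, kinds)
```

An enumerated domain repeats once per allowed value, which is why CONTOUR_TYPE carries two `attrdomv` blocks while the others carry one.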
Spatial reference is the largest section, but it is one of the most important ones. Starting with the two easiest ones: geographic and local. I exported geograph from ArcGIS (see, it has its uses) but I really don’t care too much for the resolution! It does pass through MP though. And Local is just a description. Keep this in mind when you are trying to make sure that the HARN is noticed.
We are going to go back a bit because this is an easier section that stays the same no matter which one you choose. It’s the planar coordinate information, with the planar coordinate encoding method being able to have coordinate pair or distance and bearing in addition to row and column. Then, the rest is simple, right? Abscissa and ordinate resolution are the nominal minimum distances between the x (abscissa) values and between the y (ordinate) values of two adjacent points.
UTM 17…note the differences between LCC and TM
I only have a few horizontal references but they feel as if they are the most deeply nested and confusing section of metadata (almost giving Distribution a run for its money). Anyone tell me what state plane zone I have here? 4204 = Texas South Central (this example is in meters)
And, one more look at map projections: an albers. I left the spaces in there so we can scroll back through and see what changes (besides the values)
Sorry it is a bit early for Christmas unless you are a retail store, but the Horizontal Datum Name is optional and is restricted to NAD83 and NAD27. No free text option here. Ellipsoid Name has suggested GRS80 and Clarke 1866. So, what do you do with HARN? Ahh…the fun things to think of…
Both Altitude and Depths have similar formats as you can see here. And, while there are huge lists for the names, they both allow free text.
spdoinfo is a little complex as it captures the number of rows and columns or the number of points and vectors. So, in the example on the slide there are 3200 pieces of glass in the Chihuly glass tower in Indianapolis. I guess we are lucky we are not writing metadata for the tower because we could be documenting how many of those different figures there are as well! And, we could be forced to write metadata on a “tile” level instead of a “project” or deliverable level and document exactly which color rod was used, dates, temperatures, members involved in blowing one of the 3200 pieces, who cooled it down, which carton it was shipped in, last time it was cleaned…well, we’d escape that one because cleaning is probably a metadata fact best attributed to the entire structure.
These are all green because you have options – you can have indirect or direct. Every once in a while I’m really glad I work in pixel (or dot) land. Raster spatial data organization information is much easier to fill out than vector spatial data organization information. Yes, I have to count rows and columns but for the most part the size of an image is standardized quite nicely. Not always, but…
…vectors have two formats: Spatial Data Transfer Standard and Vector Product Format…just when you thought raster’s row and column counts looked bad…vectors have the option of a point and vector object count. And a somewhat difficult to figure out type. If you have questions about that, please visit the website on the slide.
Again, I find rows and columns much nicer than this. But, I don’t get out of it that easily. I have to write metadata for vectors as well. We have seamlines and contours and planimetric database features.
I hope you have all seen the Accuracy v. Precision graphs…thanks again to Overboard (one of my favorite comic strips) for another interpretation of accuracy and precision with a process involved as well. Which is exactly what the Data Quality section is! Three types of accuracies (attribute, horizontal and/or vertical positional accuracies) and Lineage (sources and, more importantly, processes) with a little cloud cover thrown in for good measure.
The accuracy sections are quite varied but have the same basic format: a report with the optional quantitative bit which gives a value and how the value was assessed. And then two text reports.
Here is the attribute accuracy report. Even though attributes are drastically different from positional accuracy values, they still have that report, value, and explanation format. I don’t have a very good example here, but an example might be if you had a classification attribute (whether it is contour type or feature type [road v. building]): you could report how accurately you had the attributes categorized. Also notice that the quantitative section is always optional. And, again, if you documented attribute accuracy on the actual attribute, you may not want to repeat the information here, or you may just do the report to explain that it is documented in the attribute information.
Again I am just quoting the standard here, but this would be where you could explain how the depression and intermediate contours were created and, if there are any void areas in the data, why they are present and, if you’re working in pixel land, what color they are.
And here is the horizontal positional accuracy section. Sometimes I find it difficult to write both the report and the explanation because they end up sounding exactly alike. What I usually do to keep this from happening is to report the target in the report and keep the explanation to exactly how that value was derived and which value it is (NMAS, NSSDA, etc.).
Notice that the vertical and horizontal tags are both “optional” … you have to have one or the other. You won’t just have positional accuracy without anything else. (Of course, I’m a “completist” in that sense. I hate seeing sections with N/A or None. If that’s the case, just leave it out!)
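Here is a minimal sketch of that “one or the other” rule, with the report, value, and explanation layout visible in the sample. The tags are CSDGM short names; the accuracy figures and wording are made up, and the function name is my own.

```python
import xml.etree.ElementTree as ET

# Made-up accuracy figures. The report states the target, while the
# explanation says how the value was derived.
POSACC = """<posacc>
  <horizpa>
    <horizpar>Target: 1 meter RMSE or better.</horizpar>
    <qhorizpa>
      <horizpav>0.85</horizpav>
      <horizpae>RMSE in meters from surveyed check points.</horizpae>
    </qhorizpa>
  </horizpa>
</posacc>"""


def accuracy_parts(posacc):
    """List which optional parts are present; insist on at least one."""
    parts = [tag for tag in ("horizpa", "vertacc") if posacc.find(tag) is not None]
    if not parts:
        raise ValueError("posacc needs horizpa, vertacc, or both")
    return parts


parts = accuracy_parts(ET.fromstring(POSACC))
print(parts)
```

If neither part applies, the cleaner move is to drop `posacc` altogether instead of filling it with N/A.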
This is an overly simplified, but optional, source info. Theoretically, if a source is used in the process step, it should be defined. It is repeatable, but I warn you to use some restraint here. It gets ugly quickly!
So, there are two tricky domains here. srcused and srcprod. Both have to be “Source Citation Abbreviations from the Source Information entries.” But there aren’t any checks built into MP for this. Somehow, they trust us. Plus, I suppose since it is optional it really doesn’t have to be thoroughly checked. Again, it is repeatable. And again I warn that you should use caution. I know it says “information about a single event” in the process step description but no one wants to read about every single mouse movement.
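Since MP will not do this cross-check for you, here is one way you might do it yourself: a short Python sketch that collects every Source Citation Abbreviation (`srccitea`) and reports any `srcused` or `srcprod` value that does not match one. The lineage below is invented and trimmed down to the tags the check needs, so it would not pass MP as written.

```python
import xml.etree.ElementTree as ET

# Invented, heavily trimmed lineage: only the tags needed for the check.
LINEAGE = """<lineage>
  <srcinfo><srccitea>LIDAR</srccitea></srcinfo>
  <srcinfo><srccitea>DEM</srccitea></srcinfo>
  <procstep>
    <procdesc>Ground points were gridded into a DEM.</procdesc>
    <srcused>LIDAR</srcused>
    <procdate>2013</procdate>
    <srcprod>DEM</srcprod>
  </procstep>
  <procstep>
    <procdesc>Contours were generated from the DEM and breaklines.</procdesc>
    <srcused>DEM</srcused>
    <srcused>BREAKLINES</srcused>
    <procdate>2013</procdate>
  </procstep>
</lineage>"""


def undefined_sources(lineage):
    """Return abbreviations used in process steps but never defined."""
    defined = {e.text.strip() for e in lineage.iter("srccitea") if e.text}
    used = {
        e.text.strip()
        for tag in ("srcused", "srcprod")
        for e in lineage.iter(tag)
        if e.text
    }
    return sorted(used - defined)


missing = undefined_sources(ET.fromstring(LINEAGE))
print(missing)  # ['BREAKLINES']
```

The match here is exact and case-sensitive, which is a design choice on my part: it also catches “Lidar” versus “LIDAR” slips.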
We’re almost there! Cloud Cover is the last tag in Data Quality! It feels a bit out of place, but the amount of data which is obscured does fit! I hope that you see the silver lining on the metadata cloud right now.
OK, It can be painful, but let’s look at how you will be giving people their data. I chose an antiquated method of delivery for this title slide for a reason…
Before we get to the last section, we will look at Distribution. If you do the minimum, it is very easy. But this is an old section that shows how much has changed since 1998. One of the sections that isn’t required is the Standard Order Process, which can have two forms. One of which is Non-Digital Form. But WAIT!!! What does the D in CSDGM stand for?!?
Just for argument sake, this is what the non-digital format looks like (along with the optional resource description at the top…just thought I’d throw that in there). Still nice and easy!
So, here’s the first half of the digital information. You might be able to tell my heart’s not in this. Typically I’m a data producer and that does not include data distribution. So, in order for me to write this I have to consult my crystal ball. AKA, I have to beg and plead to have someone contact the client to make sure it is OK to leave it out or to have them write it for me. I don’t know their URLs and I usually don’t know their fees.
So, while my crystal ball unfogs and tells me how to write this section does anyone see any problems with this screen capture of formats from the CSDGM? I do…SHP, JPG, SID, MDB, GDB…just to name a few. But this does allow Free Text.
And here’s the offline option. It’s nice and logical, I think.
So, this is what you need to have for the online option. There’s also Dialup Instructions, but I won’t waste our time on that. It has even been removed from ArcGIS with the following message “Dialup instructions can’t be provided in an item’s metadata using ArcGIS. If existing metadata includes this information it should be updated to include current information by which the item can be obtained.”
Well, we’ve made it this far. One more to go. It is probably my second favorite section. Metainfo being my first. I just hope you feel more like the bear in this photo than the fish by this point in the workshop.
There’s a very, very good reason to save this for last. You’ve probably had to research most of these things in one form or another to write the other sections…so for the most part this information will be easy. We’ll take it a few pieces at a time now.
The citation is simply a nested citeinfo (by now, with as many ellipses as there have been I hope you have seen just how important it was for us to start backwards…those are the building blocks of FGDC metadata so it is imperative to understand how to write those sections). Then comes what I think is probably one of the hardest sections to write because you have to have an excellent understanding of the data: the description section. It has an abstract (did your teachers always have you write the abstract last as well?), a purpose, and catch-all supplemental information. Again, the time information is present.
Spatial Domain is a really spatial one for me…I think about all the people using clearinghouses to search for data that is powered by my metadata’s bounding coordinates. And my heart goes pitter pat! By the way, the bounding coordinates are for the Continental US.
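Because searches lean on these four numbers, a quick sanity check is worth the minute it takes. Here is a hedged Python sketch; the tags are CSDGM short names, the coordinates are rough values for the Continental US that I am supplying for illustration (not the ones on my slide), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Approximate Continental US extent in decimal degrees (illustrative only).
BOUNDING = """<bounding>
  <westbc>-124.8</westbc>
  <eastbc>-66.9</eastbc>
  <northbc>49.4</northbc>
  <southbc>24.5</southbc>
</bounding>"""


def bounding_problems(bounding):
    """Return a list of plain-language problems; empty means it looks sane."""
    west, east, north, south = (
        float(bounding.findtext(tag))
        for tag in ("westbc", "eastbc", "northbc", "southbc")
    )
    problems = []
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        problems.append("longitude outside -180 to 180")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        problems.append("latitude outside -90 to 90")
    if south > north:
        problems.append("southbc is greater than northbc")
    if west > east:
        problems.append("westbc is greater than eastbc (check unless crossing 180)")
    return problems


problems = bounding_problems(ET.fromstring(BOUNDING))
print(problems)  # []
```

The most common slip this catches is a dropped minus sign on a western-hemisphere longitude, which shows up as west being greater than east.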
Keywords also make my heart go pitter pat because I think of people searching for specific datasets based on keywords. Here’s the basic structure. I use the first two frequently, but not the second two. Notice they’re empty. This doesn’t fly with MP. It will produce errors. But for this, it at least shows you the structure…
And here are three examples. Notice that I’ve repeated place. That’s important to notice. If you change the thesaurus, you need a new keyword section (theme, place, stratum, temporal). You can repeat the keywords, though. So, in my example, the data falls in KY and TN.
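To show the one-thesaurus-per-section rule outside the slides, here is a sketch with invented keywords: one theme section and two place sections, each with its own thesaurus. The tags are CSDGM short names (note that stratum and temporal shorten to `strat` and `temp` in their child tags); the helper names are mine. The second helper lists empty tags, since those are what trip MP.

```python
import xml.etree.ElementTree as ET

# Invented keywords; each section carries exactly one thesaurus.
KEYWORDS = """<keywords>
  <theme>
    <themekt>None</themekt>
    <themekey>LiDAR</themekey>
    <themekey>Elevation</themekey>
  </theme>
  <place>
    <placekt>Geographic Names Information System</placekt>
    <placekey>Kentucky</placekey>
    <placekey>Tennessee</placekey>
  </place>
  <place>
    <placekt>None</placekt>
    <placekey>KY</placekey>
    <placekey>TN</placekey>
  </place>
</keywords>"""

# Child tags use a shortened prefix for two of the four section types.
PREFIX = {"theme": "theme", "place": "place", "stratum": "strat", "temporal": "temp"}


def keywords_by_section(keywords):
    """Return (section type, thesaurus, keyword list) for every section."""
    groups = []
    for section in keywords:
        prefix = PREFIX[section.tag]
        thesaurus = section.findtext(prefix + "kt")
        keys = [k.text for k in section.findall(prefix + "key")]
        groups.append((section.tag, thesaurus, keys))
    return groups


def empty_tags(element):
    """List leaf tags with no text, which a validator will complain about."""
    return [e.tag for e in element.iter() if len(e) == 0 and not (e.text or "").strip()]


groups = keywords_by_section(ET.fromstring(KEYWORDS))
for group in groups:
    print(group)
print(empty_tags(ET.fromstring(KEYWORDS)))
```

Running `empty_tags` on a template before you send it through MP is a quick way to find the placeholders you forgot to fill in.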
We really are almost there. Once you get past the legal jargon (but, ever wonder what legal rights you’ll have if someone doesn’t read your metadata file or if the file doesn’t travel with the data?) we get into the optional sections of IDINFO. Finally. Point of contact, a nested cntinfo, is first.
More optional sections. The security information will look familiar from Metainfo. Browse graphic…which I highly recommend in ArcGIS. Data Credit is where Data Producers put all their hard work. Native Data Set Environment: you can go as detailed as you want in there and list the different versions. So, I should have listed 9.3.1 and 10.1 for ArcGIS, right? And, lastly, crossref, another section which can quickly get carried away. Then again, there are several valid ways to use it!
Just when you thought you’d made it through the toughest section, they made it tougher. If you deal with LiDAR, you get to add a new section. Here’s an image of what it looks like from the sample. It’s an interesting addition…and always a fun challenge to fill out as an acquisition firm. What happens if you use two sensor settings (or even two sensors) on the project? I actually had to call the USGS for help with that! It’s not easy!!!
We have officially made it through CSDGM! Now we can slowly start to switch to ISO. So, the first thing we will look at is crosswalks…because we’re totally FGDC CSDGM experts now, right?
Here is where you download the Crosswalk. There’s also an FGDC version, but I found this one more useful. http://www.fgdc.gov/metadata/documents/FGDC_Sections_v40.xls/view http://www.ncddc.noaa.gov/metadata-standards/metadata-xml/
Now that we’ve gone through FGDC CSDGM thoroughly, let’s look at how to get from CSDGM to ISO metadata. This shows you the location of the FGDC tags, and it’s from NCDDC. So, if you wanted to see where you’d put your currentness reference in ISO, you’d drill down through all that stuff… Again, you’ll see that everything I have about ISO is from NCDDC / NOAA, because otherwise you have to pay some pretty big money for a single copy of the standard which you can only print once. I have used the XSLT from their wiki to transform records from FGDC to ISO. They have great documentation. https://geo-ide.noaa.gov/wiki/index.php?title=Transform_in_Oxygen_XML_Editor https://geo-ide.noaa.gov/wiki/index.php?title=Web_Accessible_Folder_Tools
Yep, it looks pretty nasty, doesn’t it?!? Not nearly as “clean” as FGDC CSDGM. That is largely because ISO uses attributes on the XML tags. On occasion it will use the “single” tag (opened and closed in one)…so in this example the closing pointOfContact tag would go away and the opening tag would end with /> (forward slash, greater than) instead.
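To see that the “single” tag and the open/close pair really are the same element, here is a small sketch using Python's standard library. It is deliberately simplified: real ISO records use namespace prefixes on both the tag and the attribute, and the attribute name and value here are placeholders.

```python
import xml.etree.ElementTree as ET

# An element with an attribute and no content (simplified, no namespaces).
empty = ET.Element("pointOfContact", {"nilReason": "missing"})

# Default serialization collapses an empty element into the "single" tag form.
short_form = ET.tostring(empty, encoding="unicode")
# The same element written with separate opening and closing tags.
long_form = ET.tostring(empty, encoding="unicode", short_empty_elements=False)

print(short_form)
print(long_form)

# Both spellings parse back to the same tag, attributes, and (no) children.
a, b = ET.fromstring(short_form), ET.fromstring(long_form)
same = (a.tag, a.attrib, len(a)) == (b.tag, b.attrib, len(b))
print(same)
```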
Here is a clean version of the different sections of ISO metadata. We’d largely be working in the green, although I usually work with the MI metadata (powder blue) because it has all of MD and also the imagery and gridding… 19157 is the Data Quality in general.
Here is the MD Template.
And here is the MI template. You can see they are largely the same here, with the exception of the first tag.
There’s so much more to talk about in terms of metadata. We could go through each line of the files and explain it, compare it in XML and Text…go through the different ways to display the metadata once it is written…. I encourage you to go out and try some things out. Set up your snippets for easy use. Try some new program. Even try to break ArcGIS (hey, I have a warped sense of fun but discovering new “features” in the software is at least interesting).
http://s.ngm.com/2012/04/titanic/img/titanic-bow-615.jpg
Now that you have the FGDC basics, we can look at some of the things to do and not to do when it comes to metadata. Keep in mind that this is very subjective. What I consider ugly may be something you have to do. I know I’ve had to do “ugly” metadata before!
Although Russian Nesting Dolls can be really beautiful, nested metadata is practically illegible and ends up like these dolls…the last doll doesn’t have any information! This is the one I’ve had to do to comply with certain requests.
http://www.therussianshop.com/russhop/invitationalpage/Kalyazin_Matryoshka_Doll_5.jpg
http://www.bestpysanky.com/v/vspfiles/photos/nds05155-3.jpg
http://media-cache-ak0.pinimg.com/736x/d1/dd/1e/d1dd1ebc4896d7a033e2099aeb8cdb4d.jpg
Recursive:
http://arttechlaw.com/wp-content/uploads/2012/06/recursion.jpg
http://ts3.mm.bing.net/th?id=HN.608013339006340232&pid=1.7
Simpsons
http://www.threadbombing.com/data/media/67/tumblr_ld50v7q6tn1qabw68o1_400.gif
http://upload.wikimedia.org/wikipedia/commons/thumb/6/62/Droste.jpg/220px-Droste.jpg
What does it really mean to meet the minimum? And, can you really complete the maximum metadata? You can get away with only completing two of the sections we just went through: idinfo and metainfo (1 & 7).

Metainfo only has 4 tags you have to fill in:
  Date metadata were created (or last updated)
  Metadata contact
  Metadata Standard Name (formatted text)
  Metadata Standard Version (more or less formatted free text as well)

Idinfo is more complex:
  Citation (nested citeinfo)
  Descript
    Abstract
    Purpose
  Timeperiod (nested timeinfo)
    Current
  Status
    Update
  Spdom
  Keywords
    Theme
      Themekey
      Themekt
  Access Constraints
  Use Constraints

The hardest one of those, to me, is probably the bounding coordinates. Everything else is basically free text (or dates), which means they could all be none!!! So, if you just meet the minimum…your data isn’t projected, has no measurable quality, and you can’t document the process of the data creation!

One of the repeated phrases is “mandatory if applicable” (from a contract: “Profiles and extensions to the standard that have been endorsed by the FGDC shall be used if they are applicable to the data or data products. The metadata records shall contain any and all elements, including those that are considered optional, wherever applicable to the data or data product. The metadata record shall contain sufficient detail to ensure the data or data product can be fully understood for future use and for posterity. The metadata records shall be delivered free of errors in both content and format as determined by the metadata parser (mp) program developed by the United States Geological Survey or an equivalent.”).
Yes, this passes through MP!!!
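The mandatory elements listed for the minimum can be turned into a quick presence check. This is a sketch only and is no substitute for MP: missing_mandatory is a hypothetical helper, the sample record is made up, and the path list reflects my reading of the CSDGM short names (it also includes status/progress, which the standard requires alongside status/update).

```python
import xml.etree.ElementTree as ET

# Mandatory idinfo and metainfo elements, as CSDGM short-name paths
# relative to the <metadata> root (assumed reading of the standard).
MANDATORY_PATHS = [
    "idinfo/citation/citeinfo",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "idinfo/timeperd/timeinfo",
    "idinfo/timeperd/current",
    "idinfo/status/progress",
    "idinfo/status/update",
    "idinfo/spdom/bounding",
    "idinfo/keywords/theme/themekt",
    "idinfo/keywords/theme/themekey",
    "idinfo/accconst",
    "idinfo/useconst",
    "metainfo/metd",
    "metainfo/metc",
    "metainfo/metstdn",
    "metainfo/metstdv",
]


def missing_mandatory(root):
    """Return the mandatory paths that are absent or completely empty."""
    missing = []
    for path in MANDATORY_PATHS:
        node = root.find(path)
        has_text = node is not None and bool((node.text or "").strip())
        has_children = node is not None and len(node) > 0
        if not (has_text or has_children):
            missing.append(path)
    return missing


# Made-up partial record: "None" counts as filled in, an empty tag does not.
record = ET.fromstring(
    "<metadata>"
    "<idinfo><descript><abstract>None</abstract><purpose /></descript></idinfo>"
    "<metainfo><metd>20140101</metd></metainfo>"
    "</metadata>"
)
for path in missing_mandatory(record):
    print("missing:", path)
```

Note that the check only looks for presence, so a record full of “None” sails through, which is exactly the weakness of meeting just the minimum.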
So I have shown you two things I thought were ugly and bad, but what is good? It is very subjective. Lexington has been having a bumper sticker war for years now – one side said “Growth is Good” and the other said “Growth destroys the bluegrass forever.” Unlike the bumper sticker war, though, metadata does not truly have the same kind of implications. Being data about data it is nebulous. So, for me, good is making the clients happy. That is a wide spectrum of metadata. What do you think makes for good metadata?
When I read “Who Moved my Cheese?” I thought about data being “moved”…which made me realize that if it was moved, I’d have to update the metadata. Revisiting metadata can be painful, but it also lets you revamp the metadata quality. And, if you have a large group of people “messing” with your data, you will have to update the metadata frequently. I hope they all take good notes and can tell you what they did, though, or that you have a flawless backup system!
Data manipulation techniques change
Software is updated
Data itself is updated
MP is updated

The general rule is if the data changes, you should revisit the metadata. Maybe you have some static layers. Great. You don’t have to change your metadata. Then again, maybe you update that static layer (images, maybe) every so often. After you use the first dataset’s metadata you can always improve the future “generations” of metadata. What tags make more sense for your organization? Which don’t make any sense at all? I urge you to test those tools while writing a template. Then, go back to it a week or a month or a year later. Reread it. Try new tools. One of the best ways to learn metadata is to “just do it.”