Hopefully now you won’t run screaming from metadata. Course, it may take a bit of organization to eliminate the ghosts in the data library, but you will be glad you did it!
Ryan Elizabeth Bowe, GISP
URISA Vanguard Cabinet Member (January 2012 – January 2014)
Secretary of Cumberland URISA
ASPRS Young Professionals Council member
At Photo Science, I started out as a GIS Technician and have moved all around, including Alternate Sensor Operator, and settling in Metadata and Report Manager.
I have written hundreds of thousands of metadata files!
I really and truly LOVE metadata and hope I can share my passion with you today.
Yep, I love metadata so much I consider it yummy. After a minimal Overview of Metadata Meanings we will talk about
I love the can label justification for metadata. You have all probably heard that story about going into the store and finding an unlabeled can and trying to guess if it is cat food or tuna (and how much it costs as well) and then being asked if you’d buy and eat the contents without all your senses present. But since this is Proper Care and Feeding of Metadata, let’s try a new example.
So you don’t think you need metadata? Well, then. I have some seeds to sell you.
I don’t have a clue what they are!
I don’t know how you should plant them.
Sun or Shade
Potted or Outdoors
With lots of space to grow or a minimal footprint?
Germination time frame?
Do you need a “male” and “female” plant like with kiwi?
I don’t know how long the entity that grows will last.
What if they are Piranha Plants?!? How do you kill them?
Will you be picking beans until November?
Will you instantly be killing giants like Jack the Giant Slayer if you “add water”?
I don’t know what I’ll charge you for them.
And, once I do charge you my mystery fee, I have no clue how I’ll deliver them to you.
Worst of all, since I’m a mystery seed seller…you’ll have no way of knowing how to contact me if you do “get them wet” (or feed the mogwai after midnight)
Don’t you think Seymour wished Audrey II came with some metadata on that ill-fated total eclipse of the sun?
So, is that a better example of Cat Food v. Tuna? About the same?
These Little Shop of Horrors images make a great point as well…don’t wait until the metadata beastie is a big enough problem that it can consume you whole!
So, what is metadata? A headache, right? Like organizing all these library card catalog entries after Ghostbusters. And, I’ve heard metadata likened to these old-school card catalogs. But I heard rumors that they’re doing away with such things and going digital…so I have to wonder how long this comparison will be relevant.
And it’s been likened to the information on the back of photographs. I know those have gone digital. Here is an example from Lightroom. My camera phone took this photo on the 7th of July at 8:37:02 PM. The light was fantastic (and I could get into some geeky camera terms that are well-labeled, but we’re not here to talk camera stuff, we are here to talk metadata).
Here is another example of what I consider “current” metadata: your music collection! (8-track, Record, cassette, CD, waaa?) So, imagine you had a ton of Unknown songs in your music collection. Do you still consider metadata something to be avoided at all costs?
And, lastly, I also believe the new reason no one really needs to define metadata anymore is the NSA “scandal”. It is scary when you put it in this context, but when you think about your highly valuable geospatial data, it’s perfect, right? You don’t have to look at the 2GB image, you can read the metadata and know ANYTHING. How, you ask?!?
I’ve had enough fun defining metadata in general, so let’s talk about geospatial metadata. Back in 1994, Bill Clinton signed Executive Order 12906, establishing the National Spatial Data Infrastructure (NSDI) and tasking the Federal Geographic Data Committee (FGDC) with coordinating it, in order to have a clearinghouse of geospatial data. The clearinghouses have changed faces over the years, but their searches have been based on the Content Standard for Digital Geospatial Metadata (CSDGM).
Before we talk about what TO do, let’s make sure you know EXACTLY what NOT to do.
Do not stare at a blank slate!
Look at the actual dataset
Start to gather facts (talk to people who worked on the dataset if you didn’t work on it yourself)
Request information from the “Source”
Search for relevant templates (by searching for similar datasets, if nothing else)
One of the early scenes of Ghostbusters demonstrates the “doing” research on the ghost…I mean data.
I’d also like to point out that it helps if you have a quiet place to work while writing metadata. I know some of my coworkers worry about me turning into the ghost at the end when they interrupt me. Ok, maybe not quite that bad but you’ll have to ask them.
Also notice that they went to the source; when the source didn’t reply, they made an alternate plan. Granted, it was kind of a bad plan in this situation, but if the ghost is your data…looking at the data was the first idea of mine instead of staring at a blank slate, wasn’t it?
A full torso apparition and it’s real
So what do we do?
Could you come over here and talk to me for a second please…could you just come over here for a second please…right over here…c’mere Francine c’mere.
What do we do?
I dunno, what do you think?
We gotta make contact. One of us should actually try to speak to it.
Hello. I’m Peter. Where are you from, originally?
Alright. Ok. The usual stuff isn’t working.
Ok. I have a plan. I know exactly what to do. … Now stay close, stay close. I know. Do exactly as I say. Get ready. Ready. GET HER!
CSDGM is your best friend while writing FGDC Metadata. I have a well-worn copy printed out by my desk. I still have to double check things when I write sections I do not use all the time. In order to familiarize myself with the document, I went through and I highlighted all the optional fields in my copy. It helped reinforce the “symbols” they use there. It also helps for days when the curly brace looks an awful lot like a parenthesis. Ahem…days when you’re feeling old. When you realize the “next” generation isn’t going to have to learn how to use a card catalog.
The other big geospatial metadata standard is ISO 19115. NOAA’s NCDDC has a great series on it. I’ve taken it several times and learn something new each time. Since they do so well, I’m going to focus on FGDC more. I know, that link is difficult to read. But if you search for NOAA Metadata Training…that’s the first link. I can’t recommend their webinars slash training enough.
At one point in time I’d suggest going to ArcGIS as better than a blank slate because it had a readable interface with CSDGM specifications in it, but now, with 10.x, not so much. Here’s the old school 9.3.x editor, may it rest in peace.
Now, we have this…initially. Oh, but don’t forget the nifty trick of turning it to FGDC metadata in the options.
Just in case you haven’t found the options interface, here it is. It is under the Customize > ArcCatalog Options menu. There, you get to choose your Metadata Style. And, also notice that you can tell it whether or not to automatically update your metadata. I usually like to leave this unchecked, but it is a personal preference. And it will be very nice if you need to track exactly how you created a feature.
But, now that you have the “correct” options chosen, you get this. Sigh. It’s giving me ISO descriptions down there. How’s that going to help me write FGDC metadata? At least it has some of the required tags correct (Identification Information and Metadata Reference Information are the only two required sections, right?)
See my cursor hovering over the Title element…
Toolboxes. Before I just throw my hands up and walk away from ArcGIS, I will point out the all important difference between the Model (two blue dots, a yellow dot, and a green dot) and the Tools in your toolbox. For whatever reason I have not been able to get the models to function properly. They always error out for me. Now, maybe it is better in 10.2, but I don’t know. I do know that the tools (hammers) work, so instead of wasting time seeing if it works again…I stick with the hammer-time-tools!
Before we talk about tools, I have to remind you that you do not have to spend any money because the only thing you really need to write metadata is a text editor (such as TextPad), the standard (all free online), and a validator (MP is provided by USGS and is free).
There are plenty of tools out there other than ArcGIS…play around with tools (they all have trial periods if they aren’t free) and find one with which you are comfortable. This is another thing NOAA NCDDC training does really well. They review several different tools (Mermaid, CatMDEdit, GeoNetwork, ISOMorph, Geoportal, Altova, oXygen). Again, the main thing is to find something you’re comfortable with and run with it. For the longest time I would only use UltraEdit. Now, I use oXygen. And, if you’re really REALLY good…you’ll make your own tools. That’s another topic altogether.
(This is a bit of a cart-before-the-horse issue because you will probably want to decide if you are working in text before you commit to an XML editor. Then again, some of the platforms available will let you output the data in text or XML…so maybe it’s more of a chicken-or-the-egg debate?) Anyway, a bit more about “platforms”. Once you pick your editor, you also get to choose between text and XML. I personally love XML. It’s probably something to do with the fact that the spaces in the text format make me feel vapid. If you’re off by one, you’re done. I don’t play games I cannot win, and that feels like the house always wins to me. Yes, text is more readable, but I’ve been working with XML and the CSDGM long enough to be comfortable with tags. These screenshots are UltraEdit (on a Mac) for the text on the left and oXygen XML for the XML on the right.
When it boils down to it, all you really need is a text editor (and there are plenty that are free that work just fine) and metaparser. By the way, I did NOT say Internet Explorer. Please don’t try to edit your XML files in IE. It just won’t work. It is, however, a good test to see if you have all your XML tags done properly. And, it is your link to MP!
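If you would rather not lean on a browser for that tag test, a few lines of Python can do a similar job. This is only a minimal sketch, not a replacement for MP: it checks that the XML is well-formed and that the two required CSDGM sections (`idinfo` and `metainfo`) are present. The sample record and the `check_record` name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed record used only for illustration.
SAMPLE = """<metadata>
  <idinfo><citation><citeinfo><title>Sample Orthoimagery</title></citeinfo></citation></idinfo>
  <metainfo><metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn></metainfo>
</metadata>"""

# The two sections the CSDGM always requires.
REQUIRED_SECTIONS = ("idinfo", "metainfo")


def check_record(xml_text):
    """Return a list of problems; an empty list means these basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # A mismatched or unclosed tag lands here.
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "metadata":
        problems.append(f"root element is <{root.tag}>, expected <metadata>")
    for tag in REQUIRED_SECTIONS:
        if root.find(tag) is None:
            problems.append(f"missing required section <{tag}>")
    return problems


print(check_record(SAMPLE))
print(check_record("<metadata><idinfo></metadata>"))
```

A clean result here only means the tags are balanced and the two sections exist. You still need MP to tell you whether the contents follow the standard.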
If you’re in for a challenge, install the software. The problem with this is that updates come out so frequently that it is much easier to run the online translation. I will say using the command line MP input builds character.
Now that I have shown you some Tools and Formats, let me tell you a story about another tool that led to my discovery of the power of templates. A long, long time ago, maybe the fifth time I wrote metadata, a contract with metadata was brought to my attention. It had a link to something called “XMLInput”, but the link was broken. After some serious internet searching, I managed to find the proper link. I tried a few times to make the actual tool work, but I gave up because the templates provided with the tool were much better for me. But as I have repeatedly pointed out I love XML and I love the CSDGM. Although these templates make terrible bedtime reading (even for me), using them has made me a better metadata writer. In the background you see the 133UATemplate, where you can delete the comments with all the information you ever needed to know about the tags…so it is very easy to write what you need.
These are just a few templates that I rely on; there are often others for the different clients and “profiles”…one of the hardest has to be the National Flood Insurance Program (NFIP) because it has so many fields that do not change. It should be easy, but it is the square-peg-round-hole issue. You have to describe your data with a set phrase that just…doesn’t describe the data!
Another fun one is the USGS LiDAR base specification. It is one of the most recent revisions to MP! But, I haven’t seen the DTD updated. If you really get into it, you can revise a version of the DTD so you can see the changes in your XML editor! Give yourself a few hours…
Before we look at some of the sections of FGDC metadata (and be thinking of which ones you want to talk about from 4-10…I view the last three as building blocks so the most important to discuss) I want to look at the training options available to us. Some of them are straightforward training sites and conferences, but others aren’t.
You’d be surprised what social media can teach you! That’s where I found the NCDDC training.
Also, lynda.com has an excellent XML class if you’re totally lost on those.
While I like GeoSpatial Training Services, there’s nothing SPECIFICALLY metadata. Unless you’re needing coding lessons.
You can spend tons of money on various materials, but the best training method is to just hit the books on your own and write metadata. So let’s look at some of the sections.
Let’s go through the sections in reverse order. The last sections are the most important because they are used throughout the other sections.
I’ve tried to keep a color scheme going here where red has several options available, green are optional, and blue is a little quirk. So, onto the examples.
For Contact Person, you can have a person or an organization. For my example, it is a person. You don’t have to list a contact position (cntpos). Mailing and Physical is an Address Type. Although the CSDGM allows free text, the suggestions are “mailing”, “physical”, or “mailing and physical”…This is absolutely NOT the first line of your address! This is supposed to tell people how they can use this address.
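Here is a rough sketch of that contact block built with Python's standard library, using the CSDGM short tag names (`cntinfo`, `cntperp`, `cntpos`, `cntaddr`, `addrtype`). The `build_contact` function and every value passed to it are made up for illustration. One deliberate choice: the CSDGM allows free text in `addrtype`, but this sketch only accepts the three suggested values, precisely to catch the street-address-in-the-wrong-tag mistake.

```python
import xml.etree.ElementTree as ET

# Suggested values from the CSDGM; restricting to them is this sketch's choice.
ADDRESS_TYPES = ("mailing", "physical", "mailing and physical")


def build_contact(person, organization, address_type, address,
                  city, state, postal, phone, position=None):
    """Build a <cntinfo> element with a person as the primary contact."""
    if address_type not in ADDRESS_TYPES:
        raise ValueError(f"addrtype should be one of {ADDRESS_TYPES}, "
                         f"not {address_type!r}")
    cntinfo = ET.Element("cntinfo")
    primary = ET.SubElement(cntinfo, "cntperp")      # person is primary
    ET.SubElement(primary, "cntper").text = person
    ET.SubElement(primary, "cntorg").text = organization
    if position is not None:                         # cntpos is optional
        ET.SubElement(cntinfo, "cntpos").text = position
    addr = ET.SubElement(cntinfo, "cntaddr")
    ET.SubElement(addr, "addrtype").text = address_type
    ET.SubElement(addr, "address").text = address    # the street line goes HERE
    ET.SubElement(addr, "city").text = city
    ET.SubElement(addr, "state").text = state
    ET.SubElement(addr, "postal").text = postal
    ET.SubElement(cntinfo, "cntvoice").text = phone
    return cntinfo


# All values below are placeholders.
contact = build_contact("Jane Example", "Example Mapping Co.",
                        "mailing and physical", "123 Example Street",
                        "Lexington", "KY", "40500", "555-555-0100")
print(ET.tostring(contact, encoding="unicode"))
```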
There are three main types of dates: single, range (with beginning and ending), and multiple (which is made up of single dates…and by made up of I mean a multiple date and time must have at least two dates). You can also use time here, but I rarely use time entries so it is one of the tags I’d have to go back to the CSDGM to be able to write properly.
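The three date flavors can be sketched like this, again with the CSDGM short names (`sngdate`, `rngdates`, `mdattim`). The function names are hypothetical and the dates are placeholders in the standard's YYYYMMDD form. I left the optional time tags out entirely.

```python
import xml.etree.ElementTree as ET


def _add_single(parent, caldate):
    """Append one <sngdate><caldate>...</caldate></sngdate> to parent."""
    single = ET.SubElement(parent, "sngdate")
    ET.SubElement(single, "caldate").text = caldate
    return single


def single_date(caldate):
    info = ET.Element("timeinfo")
    _add_single(info, caldate)
    return info


def date_range(begdate, enddate):
    info = ET.Element("timeinfo")
    rng = ET.SubElement(info, "rngdates")
    ET.SubElement(rng, "begdate").text = begdate
    ET.SubElement(rng, "enddate").text = enddate
    return info


def multiple_dates(caldates):
    # A multiple date is made up of single dates, and needs at least two.
    if len(caldates) < 2:
        raise ValueError("mdattim needs at least two dates")
    info = ET.Element("timeinfo")
    multi = ET.SubElement(info, "mdattim")
    for caldate in caldates:
        _add_single(multi, caldate)
    return info


print(ET.tostring(single_date("20130707"), encoding="unicode"))
print(ET.tostring(date_range("20130701", "20130707"), encoding="unicode"))
print(ET.tostring(multiple_dates(["20130701", "20130707"]), encoding="unicode"))
```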
The only tricky part about Citation Information is that you can nest it incessantly within Larger Work Citation. I don’t see any reason to do this, though.
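To show what that nesting looks like, here is a small sketch where a citation carries a Larger Work Citation (`lworkcit`), which is itself just another `citeinfo`. The citation is trimmed to originator, publication date, and title to keep it short. The `build_citation` name and all the values are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_citation(origin, pubdate, title, larger_work=None):
    """Build a trimmed <citeinfo>; larger_work is an optional <citeinfo> element."""
    cite = ET.Element("citeinfo")
    ET.SubElement(cite, "origin").text = origin
    ET.SubElement(cite, "pubdate").text = pubdate
    ET.SubElement(cite, "title").text = title
    if larger_work is not None:
        # The larger work is a complete citation nested inside <lworkcit>,
        # and it could carry its own <lworkcit>, and so on.
        ET.SubElement(cite, "lworkcit").append(larger_work)
    return cite


# Placeholder values: one tile cited as part of a countywide collection.
collection = build_citation("Example County", "2013", "Countywide Orthoimagery")
tile = build_citation("Example County", "2013", "Orthoimagery Tile 42",
                      larger_work=collection)
print(ET.tostring(tile, encoding="unicode"))
```

One level of nesting like this is usually the most you would want.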
Sorry, but if you thought my other slides were bad with the XML of sections, Distribution would have put you all into a coma. It’s back to the NSA and planning, which Pointy-Haired-Boss doesn’t do so well. And watch out for pointy haired bosses…they may say they’re editing your metadata but it’s in-one-ear-out-the-other. Sometimes it’s best to parse the metadata into an e-mail and say “here, read this and make sure you’re ok with it.”
Here are the four types of entity and attribute information detailed sections. There’s also the overview section, but that is infinitely simpler than these detailed ones. Both overview and detailed sections of entity and attribute information are definitely sections to consider writing and having ready to pull into the larger file if you have commonly used fields.
UTM 17…note the differences between LCC and TM
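For the UTM case, the grid system part of the horizontal reference can be sketched as below. This assumes a northern-hemisphere zone and the usual UTM constants (scale factor 0.9996, false easting 500000 m, false northing 0). The `utm_gridsys` function is hypothetical, and the element would still need to be placed inside the rest of the spatial reference section. Notice that Transverse Mercator (`transmer`) carries a scale factor at the central meridian (`sfctrmer`), where a Lambert Conformal Conic block would carry standard parallels (`stdparll`) instead.

```python
import xml.etree.ElementTree as ET


def utm_gridsys(zone):
    """Build a <gridsys> element for a northern-hemisphere UTM zone (1-60)."""
    if not 1 <= zone <= 60:
        raise ValueError("this sketch only handles northern zones 1 through 60")
    # Each zone is 6 degrees wide; zone 1 is centered on 177 degrees west.
    central_meridian = -183 + 6 * zone
    grid = ET.Element("gridsys")
    ET.SubElement(grid, "gridsysn").text = "Universal Transverse Mercator"
    utm = ET.SubElement(grid, "utm")
    ET.SubElement(utm, "utmzone").text = str(zone)
    tm = ET.SubElement(utm, "transmer")
    ET.SubElement(tm, "sfctrmer").text = "0.9996"                   # scale factor
    ET.SubElement(tm, "longcm").text = f"{central_meridian:.1f}"    # central meridian
    ET.SubElement(tm, "latprjo").text = "0.0"                       # latitude of origin
    ET.SubElement(tm, "feast").text = "500000.0"                    # false easting
    ET.SubElement(tm, "fnorth").text = "0.0"                        # false northing
    return grid


print(ET.tostring(utm_gridsys(17), encoding="unicode"))
```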
I only have a few horizontal references but they feel as if they are the most deeply nested and confusing section of metadata (almost giving Distribution a run for its money). Anyone tell me what state plane zone I have here?
Data manipulation techniques change
Software is updated
Data itself is updated
MP is updated
The general rule is if the data changes, you should revisit the metadata. Maybe you have some static layers. Great. You don’t have to change your metadata. Then again, maybe you update that static layer (images, maybe) every so often. After you use the first dataset’s metadata you can always improve the future “generations” of metadata. What tags make more sense for your organization? Which don’t make any sense at all?
When I read “Who Moved my Cheese?” I thought about data being “moved”…which made me realize if it was moved I’d have to update the metadata. Revisiting metadata can be painful, but it also lets you revamp the metadata quality. And, if you have a large group of people “messing” with your data, you will have to update the metadata frequently. I hope they all take good notes and can tell you what they did, though, or you have a flawless backup system!
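One way to keep yourself honest about revisiting is a quick check of the metadata date (`metd` in the Metadata Reference Information section) against when the data last changed. This is a bare-bones sketch: it assumes `metd` is a full YYYYMMDD date, the function names are made up, and where you get the data's last-modified date from is up to you.

```python
from datetime import date, datetime
import xml.etree.ElementTree as ET


def metadata_date(xml_text):
    """Return the <metainfo><metd> value of a record as a date."""
    metd = ET.fromstring(xml_text).findtext("metainfo/metd")
    if metd is None:
        raise ValueError("record has no <metainfo><metd> element")
    # Assumes a full YYYYMMDD value; shorter forms would need extra handling.
    return datetime.strptime(metd.strip(), "%Y%m%d").date()


def needs_review(xml_text, data_last_modified):
    """True when the data changed after the metadata was last dated."""
    return data_last_modified > metadata_date(xml_text)


# Placeholder record and dates.
record = "<metadata><metainfo><metd>20130707</metd></metainfo></metadata>"
print(needs_review(record, date(2013, 8, 1)))
print(needs_review(record, date(2013, 7, 7)))
```

For file-based data, the last-modified date could come from something like `os.path.getmtime`, though a file timestamp only tells you that something touched the file, not what changed.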
There’s so much more to talk about in terms of metadata. We could go through each line of the files and explain it, compare it in XML and Text…go through the different ways to display the metadata once it is written…. I encourage you to go out and try some things out. Set up your snippets for easy use. Try some new program. Even try to break ArcGIS (hey, I have a warped sense of fun but discovering new “features” in the software is at least interesting).