Date Published May 15, 2018 - Last Updated December 13, 2018
“Data is content, and metadata is context. Metadata can be much more revealing than data, especially when collected in the aggregate.” —Bruce Schneier, Data and Goliath
Early in my IT career, I would sometimes have a user come in and ask of a recent incident, “How did this happen?” Often these incidents were malware related, and the user wanted to know how they got infected or what the damage was. My typical response was something along the lines of, “It could be one thing, it could be 1000 things; these things can be complex.” Technically true, but a dodge nonetheless. In my IT nascency, I probably just wanted to work on something more interesting than figuring out the intricate details of an isolated BSOD.
Later in my career, I had the opportunity to be an ITIL Problem Manager and everything became about root cause elimination and analysis. There, I did a complete 180: if you didn’t have an answer to exactly why something happened, I wasn’t going away. Fortunately now the pendulum has leveled off into an “It depends” state of mind. But on what does that “it” depend? The answer is context, and the only way that you can have context is to have something that is, at times, more important than the data itself: metadata.
Data is what happened: it’s the descriptions, the details about events, the notes people add to tickets, etc. It can be informative, but is not necessarily so. Metadata, on the other hand, is data about data: entry dates, form fields, file system data, logs, program extensions, etc. It is always informative. Consider a computer image: the data is the image itself, perhaps of a serene beach. But the metadata is the stuff about the picture: Is it a .png or a .jpeg? What size is it? When was it created? With what rights? Where is it located? Why, then, if all we want is to look at a nice picture, should we care just as much, or perhaps even more, about the metadata behind it?
Context: Understanding the Value
If data is the new oil, then metadata is the refinery; without it, you have no way of knowing or utilizing what you have. Consider the statement below:
“Help!”
Though brief, this is perfectly valid data. But is it informative? Does it contain value? Not really; there’s not enough to go on. Now let’s add some context:
Dispatch_number: 123u3h
Input_method: Call-in
Call_back: 555-867-5309
Tower_location: Springfield
Call_time: 6:07 CST
Call_length: :07
Call_priority: 1
Recording_transcription: “Help!”
With these simple fields added we can easily understand a lot more about what is going on. Now we have time-frames, priorities, locations, a call-back number we can possibly compare with other data to get a person/address, a cell-tower location for triangulation, etc. Assuming this is raw data (note the “_”), we may also have insight into the table structure as well, which could be helpful for reporting. We could also correlate this with other phone records that came in around this time to know if this is a singular incident, an escalation of an existing incident, or something altogether different, all thanks to the context provided by the metadata.
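The correlation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names mirror the example record, but the `CallRecord` type, the ten-minute window, the calendar date, and every record other than the “Help!” call are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallRecord:
    # Field names follow the example record; the types are assumptions.
    dispatch_number: str
    call_back: str
    tower_location: str
    call_time: datetime
    call_priority: int
    transcription: str


def related_calls(target, records, window=timedelta(minutes=10)):
    """Return other calls from the same tower within `window` of the target."""
    return [
        r for r in records
        if r.dispatch_number != target.dispatch_number
        and r.tower_location == target.tower_location
        and abs(r.call_time - target.call_time) <= window
    ]


# The source gives only a time of 6:07; the date here is a placeholder.
help_call = CallRecord("123u3h", "555-867-5309", "Springfield",
                       datetime(2018, 5, 15, 6, 7), 1, "Help!")

# Hypothetical neighboring records, made up purely for illustration.
records = [
    help_call,
    CallRecord("aaa001", "555-000-0001", "Springfield",
               datetime(2018, 5, 15, 6, 12), 2, "There is smoke next door."),
    CallRecord("aaa002", "555-000-0002", "Shelbyville",
               datetime(2018, 5, 15, 6, 8), 3, "Noise complaint."),
    CallRecord("aaa003", "555-000-0003", "Springfield",
               datetime(2018, 5, 15, 7, 30), 3, "Lost dog."),
]

nearby = related_calls(help_call, records)
```

Note that nothing here inspects the transcription at all. The match is made entirely on tower location and call time, which is the point: the metadata does the work.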
Part of the value of metadata is derived from filling in these gaps. The example above involved a process, but the same holds for strictly technical cases. Think back to the serene beach picture from the opening; all we want is to look at a pretty picture, so why do we care? Well, have you ever tried opening a file with the wrong application? Or one with a broken link or mount point? Or one that is too large? All of these gaps are filled by metadata, which provides context to the underlying systems and technical processes.
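To make the picture example concrete, here is a minimal Python sketch that collects a few pieces of file metadata and compares the extension with the format the file's leading bytes suggest. The function names and the returned dictionary layout are my own choices for illustration, and only the PNG and JPEG signatures are checked.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Leading "magic" bytes for two common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_format(header: bytes) -> Optional[str]:
    """Guess the image format from the first bytes of a file, if known."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def describe_file(path):
    """Collect basic metadata about a file without decoding its content."""
    p = Path(path)
    info = p.stat()
    with p.open("rb") as fh:
        header = fh.read(8)
    return {
        "name": p.name,
        "extension": p.suffix.lower().lstrip("."),
        "detected_format": sniff_format(header),
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }
```

A file named `beach.jpg` whose first bytes are the PNG signature would come back with an extension of `jpg` and a detected format of `png`, which is exactly the kind of mismatch that makes an application refuse to open a perfectly nice picture.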
OK, great. But how can we help integrate this knowledge into our lives in ITSM? For that, we need to understand a little more about the nature of good design.
Context: The Power of Good Design
“Good design is actually a lot harder to notice than poor design, in part because good designs fit our needs so well that the design is invisible, serving us without drawing attention to itself.” —Don Norman, The Design of Everyday Things
Good metadata, like good design, often goes unnoticed; when it is needed, it should not require much thought (by us or the computer). The reason you don’t pay attention to coffee-cup handles is that they’re instinctive: you know where to grab. But what about doors with levers? Are they push? Pull? Does the location of the door or the angle of the handle provide any hint? Bad design forces you to think. When designing tables, forms, process inputs and outputs, etc. (i.e., things that will become metadata), you too should make things as clear and unambiguous as possible.
Take defining a term, for example. I was once helping with a SIPOC process diagram, and there was concern about getting groups to agree on the definition of a term. Drawing on what I knew about reporting and table structures, I suggested that whatever definition was chosen should be able to exist independently of the process it was part of (i.e., it should not need any surrounding context to be understood). Since “Priority” needs to mean the same thing across teams, it helps to think about it as part of a database schema (after all, if it ever becomes metadata, it will be). Even though the values may differ among teams (though per ITIL/HDI best practices, they shouldn’t), the “Priority” attribute needs to mean the same thing. To avoid possible confusion, try asking the question, “Would this mean the same thing at a table level if I needed it somewhere else?” This will often lead to valuable insights when designing form fields and their selectable values. The fields themselves and the values they can contain should be easily understood, so that their use is apparent. Good metadata should provide clarity and consistency to data relationships and definitions.
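One way to picture the “same meaning at a table level” test is a single shared definition that each team's local labels map onto. The sketch below is purely hypothetical: the four priority levels, the team names, and the labels are invented for illustration and do not come from ITIL, HDI, or any particular tool.

```python
from enum import IntEnum


class Priority(IntEnum):
    """One shared definition: a lower number means more urgent (assumed)."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


# Hypothetical team-specific labels, all mapped onto the shared attribute.
TEAM_LABELS = {
    "service_desk": {"P1": Priority.CRITICAL, "P2": Priority.HIGH,
                     "P3": Priority.MEDIUM, "P4": Priority.LOW},
    "network": {"Sev-A": Priority.CRITICAL, "Sev-B": Priority.HIGH,
                "Sev-C": Priority.MEDIUM, "Sev-D": Priority.LOW},
}


def normalize_priority(team, label):
    """Translate a team's local label into the shared Priority value."""
    try:
        return TEAM_LABELS[team][label]
    except KeyError:
        raise ValueError(
            f"Unknown priority {label!r} for team {team!r}"
        ) from None
```

With a mapping like this, a report that joins tickets from both teams can sort or filter on one attribute, and an unrecognized label fails loudly instead of quietly becoming a new meaning of “Priority.”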
Metadata: Assessing and Utilizing Impact
All data has a cost, both real and perceived. Let me repeat that: all data, metadata and “regular” data alike, has a cost associated with it. So far, the examples in this article have outlined how you can think about metadata to add context and value, but you also need to think about it from a practical perspective. Unused data, unneeded fields, bad joins, ill-defined configurations: these are all examples of waste. This waste can be direct (you’re paying for SAN space you don’t need because you don’t have an archive policy) or indirect (your users take longer to get back to you because they hate filling out misleading or confusing fields; audits take longer because metadata is not being utilized in your data governance processes and reporting). A 2016 study found that 80% of organizations believe that metadata is as important as, if not more important than, it was 10 years ago, yet 60% of those same respondents said they either had no strategy for managing metadata (24.46%) or only ad-hoc strategies for specific use cases (35.87%). You need to make it a point to either minimize or utilize metadata as much as possible.
Examples of how to do this include the following:
- Set up rule-based filters to drop noisy logs (e.g., low-level TCP/UDP teardowns for SIEM sensors or firewall ACLs)
- Implement clear naming conventions and/or a business glossary to improve search capabilities, especially in less rigid ad-hoc environments
- Use metadata to enhance analytics and reporting for IT-specific processes and tools; if good metadata is available, there’s no reason you can’t run analytics off IT/ITSM processes and tools just like you would your traditional financial and NPS metrics
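As a rough illustration of the first item in the list above, a rule-based log filter can be as small as a list of patterns. The rules and sample lines here are invented for the example; they are loosely modeled on firewall connection messages and are not taken from any specific product's log format.

```python
import re

# Hypothetical drop rules: any line matching one of these is treated as noise.
DROP_RULES = [
    re.compile(r"\bTeardown (TCP|UDP) connection\b", re.IGNORECASE),
    re.compile(r"\bBuilt (inbound|outbound) (TCP|UDP) connection\b",
               re.IGNORECASE),
]


def is_noisy(line):
    """Return True if any drop rule matches the log line."""
    return any(rule.search(line) for rule in DROP_RULES)


def filter_logs(lines):
    """Yield only the lines that no drop rule matches."""
    for line in lines:
        if not is_noisy(line):
            yield line


# Made-up sample lines for illustration only.
sample = [
    "Teardown TCP connection 101 for outside:10.0.0.5/443 to inside:10.0.1.9/51000",
    "Deny tcp src outside:203.0.113.7/4444 dst inside:10.0.1.9/22",
    "Teardown UDP connection 102 for outside:10.0.0.8/53 to inside:10.0.1.9/53000",
]

kept = list(filter_logs(sample))
```

Keeping the rules in one named list also makes the filter itself a piece of documented metadata: anyone auditing what gets dropped, and why, has a single place to look.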
Metadata: Use It
We’ve all been there. A tech wrote “Resolved” in the resolution field, and we know this is not valuable information. But before we bring the wrath, consider: Are we running NLP (Natural Language Processing) algorithms off this field? Are poorly defined entries being tracked and showing up in some report of the tech’s performance? Are we more concerned about this lack of description than about the miscategorization of other, more reportable fields? Is the entry simply redundant if other fields (e.g., ‘Status: Resolved’) are correct? The answers to all of these questions depend on our understanding of the metadata produced.
Remember, all data has a cost. But for it to have value, you need to utilize it accordingly. If you want the ability to understand context, reduce waste, and enhance analytics, you need to take a moment and not only appreciate metadata, but dedicate time to think about its design and use.
Adam Rauh has been working in IT since 2005. Currently in the business intelligence and analytics space at Tableau, he spent over a decade working in IT operations focusing on ITSM, leadership, and infrastructure support. He is passionate about data analytics, security, and process frameworks and methodologies. He has spoken at, contributed to, or authored articles for a number of conferences, seminars, and user groups across the US on a variety of subjects related to IT, data analytics, and public policy. He currently lives in Georgia.