Wednesday, September 23, 2015

Very brief XML introduction - pursuant to Extended Events discussions

I haven't been blogging nearly as much as I hoped - and I hadn't hoped for much more than a single post a month, which means blogging has been pretty damn skimpy!

But I decided that the world needed a bit more information about Extended Events in SQL Server, and, since I was studying them for my own professional development, it was time to try to share them with others.

Problem: if you want to do much of anything useful with Extended Events, you need to know a bit about XML. Talk about a pain!

XML is one of those things, you know? On the face of it, it's stone-cold simple. Enclose text in tags, sometimes even nested tags. You can also have attributes within the tags. And it's extensible, because you can add new attributes, or new tags, and you won't (necessarily) break something that doesn't know to look for the new tags - just like with old web browsers, if you don't know what a tag means, ignore it!

Ah, but XML in SQL Server is a brand new beast. Sometimes you want to check a few things - are they there, or not? - and sometimes you want extract a few values, from text, or attributes. Sometimes you want to pull out some XML, sometimes you need to poke around in different places... you can do all of this, and because you can do all of this, you can process Extended Events results a bit easier than otherwise.

Step one: we need to discuss a bit about XML so you understand enough to know how it works.

There are many, many places to learn about XML out on the web. I'm not going to try to re-invent the wheel, so I'm going to spell out the most important parts here:

  1. XML is eXtensible Markup Language - the "markup" is done by tagging, putting opening and closing tags around a section of text.
  2. There must be a root tag to the document - all other tags will open and close within this.
  3. XML is case sensitive! Never forget this, it will save you a lot of grief if you remember. But you'll probably forget, as I do, far too often.
  4. Tagging is atomic - you can't cross your tags. You can have <Car><Make>Ford<Model>Fiesta</Model></Make></Car>- your tags are all nested - but you can't have <Car><Make>Ford<Model>Fiesta</Make></Model></Car>. If Model starts in Make, it must end there too.
  5. Tags must be closed - you can create a self-closing tag by adding a slash near the end. We could have, for example, </Model Name="Fiesta" />
  6. Tags can have text data - the text enclosed in the tags - or attributes, like the "Name=" in the previous point.
To expand upon this:
<?xml version="1.0"?>
<Memorobilia>
<B5>
<Weapon>
<PPG type="resin_sculpture">
<Quantity>4</Quantity>
</PPG>
</Weapon>
<Souvenir>
<Bear type="Teddy">
<Desc>JS embroidered shirt</Desc>
<Quantity>2</Quantity>
</Bear>
<Londo_Mollari_Figure>
<Quantity>1</Quantity>
<Desc>Posable 'action' figure </Desc>
</Londo_Mollari_Figure>
</Souvenir>
</B5>
</Memorobilia>
Notice that we start and end with "Memorobilia", and that everything we've collected is categorized as B5 (for Babylon 5, a darn good science fiction TV series). We could have eliminated either tag - but by enclosing the B5 tags in Memorobilia tags, we have room to add other forms of memorobilia. Sorry: Memorobilia. XML is case sensitive! And if we want to add this to other collectibles that aren't exactly Memorobilia, we can enclose this entire structure within a new set of tags. That's one of the neat things about XML.

This isn't the best, most well designed XML schema, but it's enough to do some basic understanding. Notice, interestingly, that there's little actual text data in here! But that's perfectly reasonable - as with the car example above, we don't necesssarily need text if we have the right categorizations.

You can also see that some of the data is included, not by tagging/categorization, but by attibutes - we can mention that it's a "Teddy" bear by throwing in an attribute on the Bear (in case there's a posable bear, or a statue of a bear), while noting that the Londo figure, of all the Souvenir "children", has no attributes.

I mentioned "children"; that will be the subject of my next post, when I'll discuss XPath (and maybe XQuery).

No comments:

Post a Comment