Spirit of SQL : XML and SQL Server - with some actual Extended Events content, I promise!!!

As promised, we're going to start in on Extended Events now - kind of. We're going to do a bit of looking at Extended Events around the edges, and I'm going to give you some examples of working with XML in SQL Server - because trust me, manipulating XML in SQL is a bit of an art. Now: Extended Events are intended to replace most of what Profiler does - not everything that Profiler does, but a lot of it. And when you created a data collector in Profiler it was called a Trace. In Extended Events, it's called a "session". I like to think that this terminology made if far more excusable for me to fail to put Extended Events together for a bit!

Now, a very nice thing about extended events is that you don't just have to collect them in a trace, either in memory or in a file. You can have multiple nifty kinds of targets. You can send them to a file; the approved extension is XEL. In SQL Server 2008, there was also an XEM file, a format file - but those aren't used in SQL Server 2012 and later. You can also send extended events to a ring buffer - that's a data structure that wraps around and overwrites older entries with newer entries when it fills up. And you can use some other, special purpose targets that can tally up data for you in a variety of ways, or even collect unpaired events (for example, Batch Starting with no Batch Completed). Those other targets are valuable, but we did all of this XML so we could start by looking at files and ring buffers. To do that, let's take a look at two views: sys.dm_xe_sessions and sys.dm_xe_session_targets:

I know... I cut off the output from dm_xe_sessions - there's an entire books online page for it you can review if you want! But I wanted to show you a few obvious bits. First, you can see that the join column is the address column of sys.dm_xe_sessions and the event_session_address of dm_xe_session_targets. For files, target_data contains file information - in the event_file row for event_session_address 0x00000003ED4D10C1, I copied the XML text in the event_data to the query window. You see that it shows you information about the file.

But for the ring_buffer target, the target_data column is the entire XML representation of the ring buffer. Yes, I said "XML representation". Time for a bit of pedanticism: the ring buffer data isn't actually stored in XML, which is a pretty inefficient method of storing data. But when we query it, we'll see it in XML. And, if we query it, and cast the result as XML, we can throw it into an XML variable, or column, and draw information from it as needed. So, here's a query you could use to get that XML data:
declare @QueryProcessingXML xml
select @QueryProcessingXML = cast(xst.target_data as xml) from sys.dm_xe_sessions xs inner join sys.dm_xe_session_targets xst on xs.address=xst.event_session_address where xs.name='system_health' and xst.target_name = 'ring_buffer'

We're not going to work with that - a 4meg ring buffer is a bit too much to display on a blog entry. So, let's take a look at XEL files instead! To draw information from the file, we can use fn_xe_file_target_read_file. This reads an Extended Event XEL file. I'm working from SQL Server 2012, so I'm not worrying about XEM files, and, I'm able to show off some slightly more interesting data from the server_health extended event session.

Just like with the sys.dm_xe_session_targets, the fn_xe_target_read_file has an "event_data" column, and you can also cast this as XML. You can pull in one row per event captured, with an XML representation of all data captured in the event_data column. I did this while creating this blog post, and pulled a column from one of the Query_Processing diagnostic results, so I could demonstrate how we're going to slice and dice our XML data from extended events.

I then did a DECLARE @QueryProcessingXML XML; SELECT @QueryProcessingXML= '
...and then my blob of XML data, and what I put in my half-quotes was this:

<event name="sp_server_diagnostics_component_result" package="sqlserver" timestamp="2015-10-04T21:11:43.193Z">
  <data name="component">
    <value>2</value>
    <text>QUERY_PROCESSING</text>
  </data>
  <data name="state">
    <value>1</value>
    <text>CLEAN</text>
  </data>
  <data name="data">
    <value>
      <queryProcessing maxWorkers="544" workersCreated="57" workersIdle="17" tasksCompletedWithinInterval="1537" pendingTasks="0" oldestPendingTaskWaitingTime="0" hasUnresolvableDeadlockOccurred="0" hasDeadlockedSchedulersOccurred="0" trackingNonYieldingScheduler="0x0">
        <topWaits>
          <nonPreemptive>
            <byCount>
              <wait waitType="HADR_FILESTREAM_IOMGR_IOCOMPLETION" waits="287139" averageWaitTime="825" maxWaitTime="89175239" />
              <wait waitType="IO_COMPLETION" waits="17158" averageWaitTime="3" maxWaitTime="1605" />
              <wait waitType="WRITE_COMPLETION" waits="16468" averageWaitTime="0" maxWaitTime="12" />
              <wait waitType="THREADPOOL" waits="5068" averageWaitTime="0" maxWaitTime="41" />
              <wait waitType="CLR_AUTO_EVENT" waits="3192" averageWaitTime="148586" maxWaitTime="89227233" />
              <wait waitType="QDS_PERSIST_TASK_MAIN_LOOP_SLEEP" waits="2454" averageWaitTime="96626" maxWaitTime="89232595" />
              <wait waitType="PAGEIOLATCH_SH" waits="946" averageWaitTime="99" maxWaitTime="5705" />
              <wait waitType="PAGEIOLATCH_EX" waits="464" averageWaitTime="6" maxWaitTime="21" />
              <wait waitType="CMEMTHREAD" waits="272" averageWaitTime="0" maxWaitTime="2" />
              <wait waitType="CLR_MANUAL_EVENT" waits="222" averageWaitTime="9" maxWaitTime="115" />
            </byCount>
            <byDuration>
              <wait waitType="CLR_AUTO_EVENT" waits="3192" averageWaitTime="148586" maxWaitTime="89227233" />
              <wait waitType="HADR_FILESTREAM_IOMGR_IOCOMPLETION" waits="287139" averageWaitTime="825" maxWaitTime="89175239" />
              <wait waitType="QDS_PERSIST_TASK_MAIN_LOOP_SLEEP" waits="2454" averageWaitTime="96626" maxWaitTime="89232595" />
              <wait waitType="LCK_M_S" waits="26" averageWaitTime="4808" maxWaitTime="11197" />
              <wait waitType="PAGEIOLATCH_SH" waits="946" averageWaitTime="99" maxWaitTime="5705" />
              <wait waitType="IO_COMPLETION" waits="17158" averageWaitTime="3" maxWaitTime="1605" />
              <wait waitType="FFT_RECOVERY" waits="30" averageWaitTime="929" maxWaitTime="2329" />
              <wait waitType="FT_IFTSHC_MUTEX" waits="9" averageWaitTime="2273" maxWaitTime="17228" />
              <wait waitType="SLEEP_DBSTARTUP" waits="153" averageWaitTime="100" maxWaitTime="112" />
              <wait waitType="PWAIT_ALL_COMPONENTS_INITIALIZED" waits="3" averageWaitTime="2316" maxWaitTime="2326" />
            </byDuration>
          </nonPreemptive>
          <preemptive>LHW:Preemptive waits deleted for space - but they normally list both count/duration as well</preemptive>
        </topWaits>
        <cpuIntensiveRequests />
        <pendingTasks />
        <blockingTasks />
      </queryProcessing>
    </value>
  </data>
</event>' -- that's my closing half-quote from above!

Pictures, and a scrollbox - I've never done either of these before, and I just know this post is going to look awful. Give me a mysterious performance issue any day of the week over trying to make a nice looking blog entry.

If I just wanted to know a single fact - e.g.: do I have a lot of threadpool waits by duration? Any signs of IO issues? - this would be fine. But the odds are, I want to do some more in-depth analysis. Do I see my top waits by count or duration changing? How many workers do I have, with how many idle, over time? How often do we have blocking tasks included in this report?

For this, we will use a few methods that SQL Server has built in.

Time for a quick vocabulary lesson: "Methods" is a term from object oriented programming. It's really just a name for a function, but a function that does something. The goal of OOP is to have your object do things (or have things done to them). Well, how do you do things to an object in a consistent way? You have a set of steps - a method, if you will - of doing the same thing to different objects. And voila! we have a name for this particular type of function. So, a method's just a function, but usually one attached to an object. In this case, we might have a method like @QueryProcessingXML.SomeMethod() - and that's the key difference between a method and a function. A method is written x.f(), whereas we normally think of a function as being f(x) - because in OOP, x, the object, is what matters.

The two methods I want to play with right now are the query method, and the nodes method. We'll run those off of the @QueryProcessingXML that I created earlier. Soon, we'll add in the value method ... but first, we're going to make sure we can find the data whose values we want. And that is why I had to write up a bit about XPath first.

Query is an amazing method - you'll see it does a lot of work for us. For right now, we're going to use it just to review our XPath infomation. For example:

Things to note here:

In each case, we have a something dot method-name: xcol.query(), and @QueryProcessingXML.nodes(). We'll explain the names in just a moment :-)
The "nodes" method returns raw data with no table structure, so SQL Server demands that it must be aliased. We pretend (because SQL Server makes us) that it creates a table, with a named XML column. Our table name in this example is tbl, and our XML column is xcol. See? That's why it's xcol.query() above.
This would point to all data nodes with a name attribute of "component" that are children of event, the root node - we know we only have one here, but you should be aware of this - for example, if /event had a thousand data children with a name of "component" we'd get a thousand XML blocks here.
The output of this is a set of XML blocks, generally smaller ones. While I'm not sure, I believe that complicated queries are faster against small blocks of XML - and this is why so many people use this to slice-n-dice their XML into smaller blocks
Another advantage of slicing off blocks of XML is that there's more freedom to use those XML blocks in subsequent queries, with no fear that they might contain multiple data[@name="component"] further down the line
You've just been handed a toy with which to futz around with the XPath skills that you picked up in the last post - go use it!

I'm only half-kidding on the last point. I'm going to show you a few more examples of the kinds of things we can do with the query and nodes tools, but if you're like me, you need to practice, practice, practice with these things, because otherwise every time you think about digging into XML, you'll freeze up because you don't want to have to re-review all that XPath stuff.

So:

We could pull in the waits by duration using this:
select xcol.query('.') from @QueryProcessingXML.nodes('/event/data[@name="data"]/value/queryProcessing/topWaits/nonPreemptive/byDuration/wait') as tbl(xcol). That's an absolute path description.

Or if we thought it was a terrible drag to type in all those values, and we only wanted the first 8 values, we could use this:
select xcol.query('.') from @QueryProcessingXML.nodes('//byDuration/wait[position()<=8]') as tbl(xcol)
IMPORTANT NOTE: As you'll see if you read through the XML above, there are both nonPreemptive and preemptive waits in these results - so on a normal QUERY_PROCESSING block, you'll get 16 results. That's the downside of zipping through XML with a too-broad XPath. I've grown to love the // XPath operator, but it can quickly cause more problems than it solves, if you're not careful.

Or for a more reasonable predicate, we could choose only those whose waits that were highest byCount, and also were averaging higher than 0:
select xcol.query('.') from @QueryProcessingXML.nodes('//byCount/wait[@averageWaitTime>0]') as tbl(xcol)

It might be really annoying that Extended Events dump data into XML, rather than nice rowsets, but doesn't that last example show off some of the flexibility of using XML?

As I noted earlier, with these last two examples, you're taking a bit of a risk. You might think that "byCount/wait" isn't going to be re-used for any other block of data, but you can't be sure. You'd probably be better off making sure you're in a QUERY_PROCESSING component block. How could you do that?

As you'll find out a bit further down, you don't generally need to. But let's create a more complicated XPath expression, just for fun. We want precision, here, and we know that we want an event. But that event has to have a special property, so that will be event[]. What property is that? Well - it has a data child, with a name of "component". So, that would be event[data[@name="component"]]. But that's not enough - the "component" must have text of "QUERY_PROCESSING".

Well, then: that's event[data[@name="component]/text="QUERY_PROCESSING"]

So far so good. We then want the data child of this event, that has a name attribute equal to "data". And under that, we know there's a byCount set of waits. Ready?
select xcol.query('.') from @QueryProcessingXML.nodes('event[data[@name="component"]/text="QUERY_PROCESSING"]/data[@name="data"]//byCount/wait') as tbl(xcol)

That's: Give me the event, having a data node with a name attribute="component". But that component name must include a child, text="QUERY_PROCESSING". Once we have that event, we look at its child node data (name attribute="data" as well), and then, under that, we look for any and all "byCount" nodes with "wait" children. And if we decide that we don't want the preemptive waits, we can just change this to:

select xcol.query('.') from @QueryProcessingXML.nodes('event[data[@name="component"]/text="QUERY_PROCESSING"]/data[@name="data"]//nonPreemptive/byCount/wait') as tbl(xcol)

This is still pretty darn wordy, but it might save us a bit of trouble if we're digging through a larger XML document, where there might be a lot of similarly named node paths. Conceivably.

"Conceivably?" you ask? I'm afraid so. Believe it or not, while that was a good exercise in developing an XPath descriptor from the data we have, and increasing our understanding of predicates... we didn't need to do that. In this particular case,
@QueryProcessingXML.nodes('event/data/value/queryProcessing/topWaits/nonPreemptive/byCount/wait')

will give you the same set of nodes. We don't need to specify the attributes in this case, because the path alone uniquely takes us to the values we need.

So, now that I led you down one path to a red herring, let's give you something you actually want. How do we pull values from this? That's done with the value() method. The value method has two arguments - the XPath(/XQuery) that isolates a single value, and then a data type. So:

select @QueryProcessingXML.value('(event/data/value/queryProcessing/topWaits/nonPreemptive/byDuration/wait/@waitType)[1]','nvarchar(50)')

... will pull out the first wait type. We need to use the brackets to pull just the first. As you can see we would need to do a lot of writing if we wanted to pull out all 20 waits here with all 4 attributes for each wait - and I've cut out the preemptive waits, so we'd normally have 40! That leads us to one of my primary annoyances with this particular capture.

As you can imagine, it's really easy to pull out all waits attributes:

select n.value('(wait/@waitType)[1]','varchar(50)') [Wait Type], n.value('(wait/@waits)[1]','bigint') [Wait Count], n.value('(wait/@averageWaitTime)[1]','bigint') [Avg Wait],n.value('(wait/@maxWaitTime)[1]','bigint') [Max Wait] from ( select xcol.query('.') as n from @QueryProcessingXML.nodes('//wait') as tbl(xcol) ) as tbl_outer

The problem we run into here is that, while we've pulled out the individual waits, we lose the upper attributes - like whether the wait is top ten byCount or byDuration, preemptive or nonPreemptive, or, more importantly, the timestamp of the capture. The time stamp is an attribute of the top level event in our example. In case you're curious:

I think that's the great, great, great, great, great great grandparent. So, sometimes, XML can be a pain. Is this a serious problem? Not really. Since I analyze SQL Server performance as part of my job, I yearn for the ability to review the top ten waits for each time frame, mostly byDuration (but byCount might matter too). Well, we can get all of that, but when we use this "slice and dice" method, we get one row per node - if we run through on the event level, it's hard to grab each set of ten waits out in one fell swoop. So how do we fix this?

First, obviously, you could just select the XML bits into XML columns in a table. Do you want one column with all waits, preemptive and non? Or two, or four columns, for preemptive and non, and maybe byCount and byDuration? For raw data capture and storage, and display, if you've been following along and playing with XML, it should be starting to lose any intimidation it may have had. This idea has a certain appeal.

However, in accordance with an old Internet Maxim[1], the obvious, best solution is to dump the timestamp and the XML into a temporary table, and then cursor your way through it, running 4x10 selects to pull out the separate rows, then pick your dates, and see how your waits changed... which we can do almost as easily if we've saved the waits as XML. And we'd lose a lot of the flexibility, and we'd have ten rows with the same timestamp, and.... yeah, let's go with the XML.

So: let's review. Over the past three blog posts, we've reviewed the very basics of XML, and then a bit about how XPath works - hopefully enough so you can see how these examples worked. But honestly? My real hope is that you can start to read through something like Jonathan Kehayias's excellent XEvent A Day series. Not because that's the be-all and end-all of Extended Events but because when I started to read through it, I found that my XML difficulties kept me from following along. I could kind-of understand what he was doing, but not being able to read the XPath information was driving me crazy.

Well, in life, it's good to be the change you want to see - so, if you want a world where there's a quick-start set of blog posts on XML, XPath, and SQL XML methods, and how they relate to Extended Events - sometimes the best option is to try to create one. So here's my effort. Don't worry (or do worry?) - I don't plan to stop blogging about Extended Events yet. But I feel that this set of posts gets over the big XML hump, and I hope they help someone out there. Or, I hope they're so terrible they convince someone else to do them better - it's good to be an inspiration, after all....

[1] That maxim states: "The best way to learn something is not to ask a question; it's to post something wrong."[2]
[2]Worked pretty quickly this time, didn't it?

Spirit of SQL

Tuesday, October 13, 2015

XML and SQL Server - with some actual Extended Events content, I promise!!!

No comments:

Post a Comment