Welcome, and thank you for coming to the Microsoft Dynamics Technical Conference. I know it's a little bit late in the day, but I hope I'll give you some very valuable information on how to troubleshoot MRP performance.
The key word in my demos and throughout the talk is going to be "how": how do we really troubleshoot this, speaking as a support engineer who has worked in supporting MRP for the past 16 years?
It's about using all the different tools. If you went to some of the performance tool sessions today with Thomas, he went over DynamicsPerf and X++ tracing. How do we leverage the various tools for an issue, and how do we get to the root cause?
So what I did is come up with several demos for later in the presentation. We'll walk through those examples and work backwards from the code level, to focus on which code is the worst or the slowest performing.
From there we work backwards to maybe a setup issue, or maybe look for hotfixes that are already released that you may not have applied. The final example goes through query tuning and what we can do there as well.
Okay, so the main objectives.
The first few points will be going through the MRP stages.
A lot of people talk about the stages as: okay, we're going to go and grab the supply and demand in the update stage, and then we're going to run coverage, futures, and actions. That's basically the gist of it. But there's a lot of stuff going on in between all those stages. As part of coverage, you know, there are 6 or 7 stages inside of it. So we'll go over that a little bit.
We'll go through an overview of the common performance tools, and really, I'm just going to use them as part of my demos and say: okay, this is the issue, what would I use?
And then techniques for isolating the problem. To me, it's all about divide and conquer with MRP. If you say it's running 8 hours for your full run, where do you even start? You need to come up with a plan for how you're going to tackle that.
So the first thing I would suggest is maybe take a look at our R&D blogs. The first two are pretty good at giving you information about different settings inside the application, such as your bundle size, and making sure you're using helpers. If you aren't using helpers, you really aren't benefiting from the stages that are multithreaded capable.
There are also features that some customers don't even use. Maybe you don't have a use for action messages in your organization. Well, if you only turn them off at the coverage groups that are tied to certain items, and every coverage group is turned off, guess what: we still call some code to get ready, in anticipation that you're going to have some. So I'll show you how to completely disable that.
And there are just various steps along the way. That's part one. Part two goes more into time fences and different things that are more application focused: how to set it up, and maybe best practices. The third blog post is logging and tracing in Dynamics AX. This goes into when you're running explosion, to be able to see why futures got called and so forth from the explosion window.
It also goes into how to set up the Windows events that are coming out. In Performance Monitor you can set up a data collection specifically around the MRP events that are emitted from inside the code.
The problem with that is, if you have an 8-hour MRP run, the amount of data you have there to weed through... good luck. So that's why I go back to: we need to focus on divide and conquer.
The last one is another common question, not so much about performance: when you get the "not enough capacity" error, why is that happening? There are some tips in there, including getting the XML that we feed to the scheduling engine DLL.
Yes, the scheduling engine DLL would be difficult, if not impossible, to troubleshoot once it's in there, but this at least shows you what we're sending to it, the results, and so forth.
It also goes through that same tracing that you can turn on from Performance Monitor as well.
One thing to point out, too.
I have quite a bit of speaker notes inside the PowerPoint, so I would recommend grabbing the PowerPoint. I requested that they leave all that content in. I also have packaged up a zip file with some SQL queries and so forth that I'm using. So if you have a chance, get that; otherwise email me and I'll certainly share all of that with you.
Alright, so some additional recommendations.
You know, as you're running MRP, you could have a lot of planned orders, planned transfers, planned kanbans, all these documents getting created.
So typically we would want to make sure you're preallocating the number sequences for those. Let's say you're creating 10,000 every time. Well, if we're going to the database every time we want one, it's probably not the best thing, and you'll more than likely start seeing the number sequence table in DynamicsPerf under total elapsed time, as it aggregates all the calls that it makes. You might start to see that there as a hint that you need to go take a look at that.
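As a rough illustration of why preallocation matters, here is a minimal sketch that only counts round trips. The `NumberSequenceStore` class and its `reserve` method are hypothetical stand-ins for a database-backed number sequence, not the Dynamics AX implementation, and the preallocation quantity of 100 is just an example value.

```python
class NumberSequenceStore:
    """Hypothetical stand-in for a database-backed number sequence."""

    def __init__(self):
        self.next_value = 1
        self.round_trips = 0  # how many times we "went to the database"

    def reserve(self, count):
        """Reserve `count` consecutive numbers in a single round trip."""
        self.round_trips += 1
        first = self.next_value
        self.next_value += count
        return range(first, first + count)


def draw_numbers(store, how_many, preallocation):
    """Draw `how_many` numbers, fetching up to `preallocation` per round trip."""
    numbers = []
    while len(numbers) < how_many:
        batch = min(preallocation, how_many - len(numbers))
        numbers.extend(store.reserve(batch))
    return numbers


# 10,000 planned orders, one number per round trip.
one_at_a_time = NumberSequenceStore()
draw_numbers(one_at_a_time, 10_000, preallocation=1)

# The same 10,000 numbers with 100 preallocated per round trip.
preallocated = NumberSequenceStore()
draw_numbers(preallocated, 10_000, preallocation=100)

print(one_at_a_time.round_trips, preallocated.round_trips)
```

The numbers handed out are identical in both cases; only the count of round trips changes, which is the part that would show up as aggregated elapsed time.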
The next point is: avoid using the delete plan process. So when we run MRP, we have an active plan out there that people are using, and when we initiate MRP we basically create an inactive plan. The new plan is inactive from the start of that process to the very end. At the very end, we basically flip those two, and then we spin off a cleanup job that gets rid of the old plan.
Why do I say to leave it, don't run the delete plan ahead of time?
I've seen cases, and I used to do this myself: I bring data in, and I just want to clean it up as much as possible, right? I'm going to set up my test and run this thing.
And then, gosh, I'm getting some bad SQL plans. I would start creating SQL plan guides to force it the way I had expected, and wonder, why am I getting this?
Well, it's because when I deleted the plan, it might have been the only company they had, the only plan they had, and I deleted it. There's nothing in ReqTrans, and in a lot of other tables too, of course. So when SQL starts running, we have very few records in that table. And with small tables, what does SQL like to do?
It likes to scan. So it's going to do either a table scan or a clustered index scan.
That's fine when the table is small. But when it's a large table, and that plan continues to get reused, you're going to suffer some performance issues. What I've seen is a kind of common scenario: every time I brought data in and cleaned everything up nice and neat before I started troubleshooting, I ran into these issues, whereas if I left the plan out there, it was fine. Which gets you to that next point. It's like, well, Chad, you just told me don't delete the plan, and then you're saying clean up failed MRP runs.
Well, what I've also seen is this. Let's say you have MRP performance issues. Maybe you're killing the process, you're doing something midstream, and it never gets to that cleanup process. Then you have a lot of ReqTrans records out there that are tied to an inactive plan.
If you're running with helpers, which you should be, you may also have a lot of task bundle records. You could have thousands and thousands, if not millions, of records that are just sitting out there for months on end that you've never cleaned up. So we'll talk about how to clean those up as well.
But, yeah, I would try to avoid it. I'm not sure why there would be a good reason to delete the plan ahead of time. I think back in AX 2009 we might have had some issues where support even recommended, hey, it's a good idea to clean it up ahead of time. That's not the case anymore.
So for the second point: just avoid running that process unless you have a real, valid reason.
Okay. So, the initial research. What I typically will do is glance at the statistical information. When you go to the plan (I'll show you this in the first demo), get a feel for some of these windows.
I'll take a look at the durations of where things are currently at. Again, that window is going to be grouped by major task, like coverage. If you looked at the code, coverage is 6 or 7 different smaller tasks in there, and we'll peek at that while we're troubleshooting as well.
Another thing you can do in a production environment is get a handle on it by monitoring the unfinished scheduling processes window. That's also where you can delete some of those failed processes.
You'll see them in that window. And if MRP is running, you'll see information there, especially if you're running in batch.
That window will tell you what level you're at: BOM level zero, level three. It's going to show you what major task ran: coverage, scheduling, initialize, and so forth. You'll be able to see where you're hanging up. So maybe it's hanging up on a certain level, or a certain task and subtask. Just making notes around that is helpful.
And then also, you know, DynamicsPerf is a great tool. 2.0 is out in beta right now. 1.2, you can still use that, that's fine.
I typically use that for looking at the worst queries out there and working backwards. Getting from the bad query to the X++ call stack can be a little challenging, but I'll give you some tips along the way for how to do that as well.
I briefly talked about this earlier, but what are the challenges? Why is this so hard?
Well, if you have something running for 8 hours, and you know that this process is made up of, you know, 60 steps, probably more than that if you're counting different BOM levels, it's difficult to capture a complete trace start to finish and aggregate all of that in an X++ trace. It's not going to happen.
Typically I try to limit those traces to about five gig to make them manageable.
DynamicsPerf is capturing everything in the production environment. You have other queries and other things out there, so when I look at that, I don't know if MRP is calling those queries or something else is. So the best I can do is maybe filter down to the typical tables that I know MRP uses, such as the inserts into ReqTrans, or the new planned orders that are out there in ReqPO, and take a look at those.
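The filtering idea can be sketched roughly as follows. This is only an illustration: the row layout (`sql_text`, `total_elapsed_ms`) is a made-up shape for exported query statistics and is not the DynamicsPerf schema, and the sample rows are invented. Matching on table names is crude and can catch other tables that share the prefix.

```python
# Tables named in the talk as typical for MRP.
MRP_TABLES = ("REQTRANS", "REQPO")


def mrp_related(query_stats, tables=MRP_TABLES):
    """Keep only queries whose text mentions an MRP table, worst first."""
    hits = [
        row for row in query_stats
        if any(table in row["sql_text"].upper() for table in tables)
    ]
    return sorted(hits, key=lambda row: row["total_elapsed_ms"], reverse=True)


# Invented sample rows, standing in for exported query statistics.
sample = [
    {"sql_text": "INSERT INTO ReqTrans (...) VALUES (...)", "total_elapsed_ms": 420_000},
    {"sql_text": "SELECT * FROM CustTable WHERE AccountNum = @p1", "total_elapsed_ms": 900_000},
    {"sql_text": "SELECT * FROM ReqPO WHERE PlanVersion = @p1", "total_elapsed_ms": 610_000},
]

worst = mrp_related(sample)
for row in worst:
    print(row["total_elapsed_ms"], row["sql_text"])
```

The unrelated customer query drops out even though it has the highest elapsed time, which is the point: in a shared production capture, the top of the list is not necessarily MRP.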
The other hard thing is every customer is different. Time fences are different, businesses are different. There are so many options in MRP.
You can solve something and take it down from days to two hours at one customer site, and it doesn't have any impact at another customer site. So sometimes they're code issues or enhancements we've made in hotfixes. Sometimes, for your particular situation, we need to maybe do a SQL plan guide or suggest some different options for you.
As I said, the statistics that you get by default just break it up into coverage and futures. Scheduling is part of coverage, and futures scheduling is part of futures. It could be the scheduling part that's taking the longest, not necessarily the stuff that coverage itself is really doing.
And then, you know, we do promote and suggest, of course, that you run in batch with helpers.
I've seen anywhere from 8, probably less than 8, up to probably 24 helpers. I haven't had anybody use more than 24 helpers for an MRP run, so that's probably the highest I've seen.
It really depends: are you more manufacturing, more distribution, using it for creating transfer orders, and so forth. It probably comes down to a lot of things. But the main point is this.
As this next slide shows, there are a lot of steps, for one, and I try to break them up.
Initialize: not much is going on there. Then recalculate item levels.
If you change a BOM, a lot of things happen. For one thing, if you add to a BOM or change a BOM, we flip a flag in the database. When MRP runs, it's going to see that something changed and that it needs to recalculate BOM levels. So when you're troubleshooting, keep in mind that if you run it once and then run it again right after, the results might not be the same. Maybe you have an issue with recalculating BOM levels, and it's not going to run the second time if you didn't make a change in between the two.
Then pre-update.
In those first three, and I don't know how well it stands out in the back, the bolded tasks are multithreaded. So let's say you have a problem with pre-update, and that's taking 20 minutes of the 30 minutes that the update task shows when you look at it in statistics.
Well, throwing more helpers and threads at that is not going to solve anything. If that's your biggest time-consuming process, let's get the X++ trace right around that process, versus the whole works.
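The point about single-threaded tasks can be put into rough arithmetic. This is an idealized sketch, assuming the multithreaded work divides evenly across threads with no overhead; the 20-minute figure comes from the example above, while the 80 thread-minutes of parallel work is an invented number chosen so that 8 threads give the 30-minute total.

```python
def estimated_minutes(single_threaded_minutes, parallel_work_minutes, threads):
    """Idealized elapsed time: the single-threaded part is fixed,
    the multithreaded work is split evenly across the threads."""
    return single_threaded_minutes + parallel_work_minutes / threads


# 20 minutes single-threaded plus 80 thread-minutes of parallel work.
print(estimated_minutes(20, 80, threads=8))   # 30.0
print(estimated_minutes(20, 80, threads=16))  # 25.0
print(estimated_minutes(20, 80, threads=24))  # still above 20
```

Doubling the threads only shaves 5 minutes in this model, and no number of threads gets below the 20 single-threaded minutes, which is why the trace should go around that one task.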
Then, as you go down: update. If you think about update, it's basically gathering your supply and demand. It's not really linking them together or doing anything like that yet. It's looking at your coverage time fence and only gathering the information that fits into that. So there's quite a bit in there.
On this slide, in the speaker notes, I basically wrote down what the major methods are and what they're doing, because when you start diving into this stuff, it gets pretty deep as to where you're at. So use it as a road map. If you see long-running queries, and you're capturing that through DynamicsPerf and getting some call stack, you can reference it: what does that method mean, where does it fall into place out here? It might get used in a couple of places, but it should help you.
So then there are a few more: insert, and more intercompany stuff.
And we've got the post-update. Generally we don't see too much of an issue there. Then we have pre-coverage; not too much is going on there. Then initialize the level.
Again, that's a single-threaded one.
It could be a source of issues, and again, throwing threads at it is not going to solve it. Then we go to coverage. That's doing the linking between your demands and trying to find supply: if you have some on hand, or if you have a purchase order coming in that'll meet it, it links those together.
If not, it also starts creating planned production orders as part of that process.
So let's say we get through those. Then there's coverage for process industries: co-products and so forth.
This is one of the points that the R&D team pointed out too: if you're not using process industries, maybe it's a good idea to disable those keys.
You know, that's an option for you.
Then coverage partition orders: a single-threaded process basically saying, hey, I've got all these planned production orders, and guess what, I need to schedule them. Let's put them out there in a list and break them apart, so that we can get to the next stage, schedule resources. All the threads grab from the list and work on it, so that's multithreaded again. Then finalize level.
Repeat for every level you have. So if you have finished good, subassembly, raw material (0, 1, 2), it's going to repeat those coverage steps three times.
And then post coverage pre futures kinda works the same way except for it's going to go in reverse so if you have three levels many of you probably have deeper bomb structures nap it'll start with level three because.
During the coverage process in the scheduling.
If we notice that am I got something that I can't make in time?
It's going to flag that is it's delayed but it's not going to figure it out when it needs to go yet.
When we get to futures exist pick up that flag say, okay, I'm going to go out there look for those initialization level and grab those.
And.
Work from level three raw material. I need to get that purchase ordered somewhere first that's going to take five days. Now I'm going to go on level two to some assembly level one and schedule those out.
That processes running in reverse. Let's say right.
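The two passes over the BOM levels can be sketched like this. It is a simplified illustration only, assuming levels are numbered from 0 (finished good) upward; the function name is made up, and the real process runs several coverage and futures sub-steps at each level.

```python
def level_passes(bom_levels):
    """Return the order in which coverage and futures visit the BOM levels."""
    coverage = list(range(bom_levels))   # top level first: 0, 1, 2, ...
    futures = list(reversed(coverage))   # deepest level first: ..., 2, 1, 0
    return coverage, futures


# Finished good (0), subassembly (1), raw material (2).
coverage_order, futures_order = level_passes(3)
print("coverage:", coverage_order)
print("futures: ", futures_order)
```

Coverage walks down from the finished good, while futures starts at the deepest level so that a delayed raw material can push out the dates of everything built from it.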
And then pre-actions, auto coverage, update dynamic plan, auto firm, and statistics. We're really working toward the tail end there. In statistics we're just saying how many planned purchase orders we generated, how many items we walked through, and so forth. It creates those time stamps, tallying how much time coverage took and futures took. It doesn't break them down.
So with that, I think it's time to get into the fun stuff, which is the demos.
Can't function without my mouse. Give me a second.
Hardware issue.
Battery issue.
Okay.
Alright, so.
I have a case: the full MRP run is taking, let's say, 8 hours. Where do I start? Typically, I like to get the data in and look at it. If I wasn't getting the data in, I'd definitely want DynamicsPerf. I would start hunting around in there and get some theories. I also like to have some information from the customer about how they have things set up. So under the planning parameters (sorry if it's a little small or blurry in the back), here's where a couple of things are.
Use of cache: set it to maximum if you don't have memory pressure on your AOSes.
I've seen that take one of the queries that a customer was having issues with, one that was getting called a lot, 60,000 times, and was bubbling up toward the top of the worst offenders, and bring it down from 60,000 to about 20,000 calls. We still have to go get the data, but at least we can reuse it. For example, it was going to be used in coverage, but it gets reused again in futures and action messages. So we basically leverage that.
The main thing is: if you have the memory, set it to max.
The number of tasks per bundle.
That really depends on how we break up the work.
The worst-case scenario is you have this set too high. You're in a multithreaded portion of the code, and maybe you get 500 in a bundle. The last thread goes and grabs the very last bundle at that level and says, okay, I need to work on these.
While it's processing, some of the other ones are finishing up. Maybe they finish up five minutes after this last one started, but it takes two hours for this one to finish.
Well, if you had set that bundle size smaller, these other guys wouldn't be waiting, because they cannot move on until everything at that level is done.
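The effect of bundle size on how long a level takes can be sketched with a small simulation. This is a toy model, not the MRP scheduler: it assumes every item costs the same amount of work, that a free thread simply grabs the next bundle, and that the level ends when the last thread finishes. The item counts and thread count are invented for illustration.

```python
import heapq


def level_makespan(item_minutes, bundle_size, threads):
    """Time until every thread has finished the level, when free threads
    pick up the next bundle of `bundle_size` items in order."""
    bundles = [
        sum(item_minutes[i:i + bundle_size])
        for i in range(0, len(item_minutes), bundle_size)
    ]
    free_at = [0] * threads          # time at which each thread becomes free
    heapq.heapify(free_at)
    for work in bundles:
        start = heapq.heappop(free_at)      # earliest free thread takes the bundle
        heapq.heappush(free_at, start + work)
    return max(free_at)              # the level is done when the last thread is done


items = [1] * 2500                   # 2,500 items, one minute of work each

print(level_makespan(items, bundle_size=500, threads=4))  # 1000
print(level_makespan(items, bundle_size=100, threads=4))  # 700
```

With bundles of 500, four threads finish their first bundle together and then three of them sit idle while one works through the fifth bundle. With bundles of 100, the tail that only one thread can work on is much shorter.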
Okay.
So.
And I'll show you this in one of the X++ traces too. A lot of times people get pretty excited about assign bundles taking a second every time it's called. That's basically saying one of those other threads went to get the next bundle, and it checks: is everybody done at the previous level? Okay, somebody's still running. There's code in there that says go to sleep for a second and then try again. So that's where the second comes from.
Alright. So you might need to play around with the bundle size, and hopefully I can give you some visibility into whether that's happening to you or not with one of the SQL scripts.
All right. What else? One of the things I mentioned earlier is: how do I totally disable, let's say, action messages? I don't use action messages.
A lot of people think: I go to the coverage groups, and if I untick this box on all my coverage groups, then the action messages logic won't fire.
Well, that's sort of true, because you don't have any items, but we still build this big map of stuff, because we anticipate you might have actions set up on one of your items. So if you want to completely get rid of the time spent in action messages, you need to go to the plan.
Okay. Go under time fences. Let's say action messages. Sorry, here: action messages.
It's a little reversed thinking: you have to check the box, and you have to set it to zero.
Then it will skip that whole task, and you will have no time there. Now, I'm not saying you have to turn this off. I'm just saying that if you're not using it, don't waste your CPU cycles running code that doesn't need to be run. Same with futures; that's another one some customers have no need for. You can mark it, and what I'm saying is you need to put a zero in there as well.
But I want to leave that alone for a demo.
These are in days. So basically these are your time fences: how far out do you want to run futures for, and so forth. Same with coverage: how far out do you want to run coverage for? You might have demand 7 months from now. Do you really want to run it every night and look at that problem?
You might not want to, but I guess it depends on your business. It probably depends on how long it takes you to produce an item.
All right. What else would I take a look at? The unfinished scheduling processes.
Typically, if no one's running MRP right now, this should be blank. So let's say I validate that, or maybe the date on these was months ago. I should really come in here, mark them, and delete. No, you cannot mark all, I'm sorry. That's kind of a wish, but usually you shouldn't have that much out there.
Now, the other thing is, in certain situations I've found that there are remnants of more data out there. Let me go to my desktop. This is one of the script sets support tends to send out.
I think a couple of partners even built a window and said, okay, we'll just run these.
To be honest, you might want to make these a little better than what I did. I use it for my cases, but as you can see, my joins don't have any DataAreaId or Partition in there. Basically, this first top part is getting counts to say whether you have anything in the Req process item and Req task tables. If no one's running MRP in any of your companies, there shouldn't be anything out there.
And really, it's the storage space that is taken up. I've seen millions of records out here getting cleaned up that were just from people who had serious issues with MRP and might have had a lot of information left out there. The script is also getting rid of any records where the plan is inactive.
The top ones are counting and the bottom ones are cleaning. It's also getting rid of records that don't even have a plan version in the plan version table.
Again, you won't see those from the front end because the headers, you know the plan version is truly missing.
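The two kinds of leftovers described above (records tied to an inactive plan, and records whose plan version no longer exists) can be sketched as a simple classification. This is not the support script: the record layout, field names, and sample values are hypothetical, and as noted above, a check like this only makes sense while no MRP run is in progress, since a running MRP legitimately writes to an inactive plan.

```python
def find_leftovers(records, plan_versions):
    """Split records into those whose plan version is missing entirely
    and those whose plan version exists but is inactive.

    records:       list of dicts with 'id' and 'plan_version' (hypothetical layout)
    plan_versions: dict mapping plan version id -> True if active
    """
    missing = [r for r in records if r["plan_version"] not in plan_versions]
    inactive = [r for r in records if plan_versions.get(r["plan_version"]) is False]
    return missing, inactive


# Invented sample data: version 1 is active, version 2 is inactive,
# and version 3 has no row in the plan version table at all.
plan_versions = {1: True, 2: False}
records = [
    {"id": "A", "plan_version": 1},
    {"id": "B", "plan_version": 2},
    {"id": "C", "plan_version": 3},
]

missing, inactive = find_leftovers(records, plan_versions)
print("no plan version:", [r["id"] for r in missing])
print("inactive plan:  ", [r["id"] for r in inactive])
```

Counting each group first, before deleting anything, mirrors the count-then-clean structure described for the script.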
So again, if you wanted to, you could spruce this up and put in some DataAreaIds and so forth. Sometimes I get a little lazy at work and I'll just do a TRUNCATE TABLE on some of these if they have a lot of data.
Okay. So there's that. I'm going to actually run this, because I think I do have some stuff out there.
It got rid of a few records, probably from me killing one of my MRP runs during one of the later tests I was doing.
Okay. So back to the client. What else?
I might also take a look at your scheduling setup.
Scheduling is going to be part of coverage. Remember, there's the coverage process with a subtask of scheduling, and there's also futures with a subtask of scheduling. I might take a look at this to see what you have set up. Hopefully you have the check boxes marked to limit how long the scheduling engine can run. On the top one, if you have it set too low, you basically get a warning that it wasn't able to come up with a schedule in that amount of time. So typically you have to set that to whatever you need to get your processes to schedule.
The second one: I haven't really seen a good reason why a customer really needs this even set. The main thing is to have it marked and set it to either 5 or 0. I prefer it set at zero. Normally, when I've asked, "Why do you have that set to a certain number?", no one can give me a straight answer. So I'm like, well, why don't you set it to zero, so that we're not spending any more time trying to optimize the first answer that it got, because that's probably good enough for you.
Unless the customer is saying, "Hey, my scheduling seems like it's not quite right, but when I set this to maybe the exact same value as the top one I get a better result," then okay, fine, I'm okay with that. But most customers are just setting it to something, and I guess my point is: if you don't have a good reason for it, leave it at zero.
Alright, so that's that.
Master plan. Let's see what's going on with one.
So I'll come in here. Static plan.
Track processing time: this is something you could do in your production system as well, mark this check box.
In a little bit, I'll show you a script that I wrote that was kind of born because I was getting some strange results from this. I'm like, well, that's strange.
Scheduling helpers.
I'm not sure what you guys are running, but a lot of times I'll see 8 or 16. It depends on how many threads you have available on the AOS or batch group that you're running against. Helpers and threads are synonymous; they mean the same thing, right?
So on my little standalone box, I think I have max threads on my AOS at 8. I'm going to run this with four.
One thing to point out is if you run it with 8 against an AOS that only has 8 threads.
Or, let me say it this way. By default, if I ran it with 7 helpers and the AOS was capable of 8, it will actually have 8 threads: one that's the main thread, and 7 helpers. So if you set it to 8 helpers and 8 on the AOS, it's smart enough not to freak out. It just won't create that extra helper that it wants to.
Okay. And then batch processing, if you had a batch group. I'm just going to run against my blank batch group. So this should get picked up in a little bit and start showing up in this window. Ignore that little status; I must have a label file issue or something.
So it should pop up in here.
Give it a minute to pick it up.
Well, I'm going to get ready to show you a little bit more of this script.
Again, it's just something I use in house, and I thought I'd share it.
So this ReqProcessList: once MRP starts, it should get one record out here.
Looks like we're waiting for the batch job to pick it up.
Probably by the time I get to the batch jobs, it will be running.
Executing. Okay, perfect. So just to prove it, there wasn't a line before; I refresh, and it's out here, and I can kind of watch it. It looks like it might already be at the point where I wanted it to be. I can tell from here that I'm under coverage planning.
Okay, there's not a lot more information than that, but I'm kind of hung up at that point.
If I come out here and use my script: this script only gives you results while MRP is running, because during the cleanup job, yes, we clean up these tables that I'm querying against.
So if you really wanted to keep the results, you might want to SELECT INTO another table.
You know, if you have a four-hour process, at three and a half hours go out and grab this thing, and you can research it a little bit later.
What I like about it is it gives me minutes since it started, and I can quickly scroll down and find where I get big jumps. Currently, it's stuck on processing "example one". Kind of a giveaway, I guess: it's my demo problem child. So it tells me that I'm having problems in the major status of coverage.
The next two columns over: this is the minor one, meaning the subtask of coverage. Some other major tasks have subtasks too.
Then which item I'm on, which bundle it's running against, and which thread. You may be interested in that because of the bundle size I mentioned earlier. From this type of data you can say, well gee, that last bundle that I worked on, thread XYZ was running for two hours.
Maybe I should spread that love across a few more threads and make the bundle size smaller. I've seen bundle sizes from 1 to 100, probably even higher than that at some customers.
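The "scroll down and look for big jumps" step can be sketched as follows. This is an illustration only: the row layout is a made-up stand-in for the script's output, the item names and minute values are invented, and with several helper threads interleaving rows, the gaps would more properly be computed per thread.

```python
def biggest_gaps(rows, top=3):
    """Return (gap_in_minutes, row_before_the_gap) pairs, largest gap first."""
    ordered = sorted(rows, key=lambda r: r["minutes_since_start"])
    gaps = [
        (following["minutes_since_start"] - current["minutes_since_start"], current)
        for current, following in zip(ordered, ordered[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]


# Invented sample rows in the spirit of the script output.
sample = [
    {"minutes_since_start": 0, "major": "Update", "item": "ItemA"},
    {"minutes_since_start": 2, "major": "Coverage", "item": "ItemA"},
    {"minutes_since_start": 3, "major": "Coverage", "item": "Example1"},
    {"minutes_since_start": 183, "major": "Coverage", "item": "ItemB"},
]

gap, row = biggest_gaps(sample, top=1)[0]
print(gap, row["major"], row["item"])
```

The row just before the largest jump is the candidate problem child: processing on it started, and nothing else was logged until long afterwards.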
If you were at the earlier new Dynamics AX session on what's new in MRP, Mohammed has done some enhancements to the bundles in the new version, and they're talking about backporting some of those, so I think there'll be some benefit that we'll see even in the AX 2012 versions.
So why is this script important to me?
For one, I get to know how long it took me, in a production environment, to get to that problem child item. Maybe you have multiple problem child items.
Maybe it took three hours to get to that. How would you have found that earlier?
Okay. Now, there are of course some tools. I believe I marked the check box to track the processing time.
That will give it to you as well.
Once, well, if I go back to the item.
A few details.
And I have seen this quite a few times: as you get it narrowed down, it's like, well, were these settings actually right? Now, you know, this is sample data. I just needed something for this presentation to trigger a problem, so that's what I did.
Okay, so.
You know, if I looked at... let's go back to the plans.
The static plan. That's what I was running. This last one: statistics.
Granted, coverage took three minutes. Again, sample data; just imagine this was three hours. Same kind of thing: you would say, okay, I have a lot of time in coverage compared to the rest of it. What in coverage was it? It could have been coverage scheduling, it could have been pre-coverage. We don't know. So under Inquiries, processed tasks. Again, this is very.
But what would cause quality is one's, well, if I go but draw back to the item.
Few details.
And I have seen this quite few times as you get it narrow down. It's like, well, where these settings, actually right now, you know, this is sample data. I just need something for this presentation to trick us a problem. So that's what I did.
Okay, so.
You know, if I looked at let's go back to the plans.
The static plan. That's what I was running this last one statistics.
Granted.
Coverage took three minutes, you know again sample data just imagine this was three hours, okay, same kind of thing you would you would say, okay, I have a lot of time in coverage compared to the rest of it what in. Coverage is was it coverage, it could have been covered scheduling could have been pre coverage. We don't know. So under in queries Pat processed ask again. This is very.
But what would cause quality is one's, well, if I go but draw back to the item.
Few details.
And I have seen this quite few times as you get it narrow down. It's like, well, where these settings, actually right now, you know, this is sample data. I just need something for this presentation to trick us a problem. So that's what I did.
Okay, so.
You know, if I looked at let's go back to the plans.
The static plan. That's what I was running this last one statistics.
Granted.
Coverage took three minutes, you know again sample data just imagine this was three hours, okay, same kind of thing you would you would say, okay, I have a lot of time in coverage compared to the rest of it what in. Coverage is was it coverage, it could have been covered scheduling could have given pre coverage. We don't know. So under in queries Pat processed ask again. This is very.
But what would cause that? Well, if I go back to the item details.
And I have seen this quite a few times once you get it narrowed down. It's like, well, were these settings actually right? Now, you know, this is sample data. I just needed something for this presentation to trigger a problem, so that's what I did.
Okay, so let's go back to the plans. The static plan, that's what I was running this last one on. Statistics.
Granted, coverage took three minutes. Again, this is sample data; just imagine this was three hours. Same kind of thing: you would say, okay, I have a lot of time in coverage compared to the rest of it. What in coverage was it? It could have been coverage scheduling, it could have been pre-coverage. We don't know. So under Inquiries, the process tasks. Again, this is very similar to what my query is doing, but I have to wait until afterwards.
The other thing: sometimes what I've seen, and this is why I started writing a script to begin with, is that all I saw was one thread. Well, I'm only seeing one thread. Are the other ones doing anything? Is it a capture issue, is it just a label problem? I don't know.
So that was the birth of that script, and I think it does a pretty good job of giving me a real live view. Granted, you can see this afterwards; I can't see my script's output after it's done unless I pushed it over to a table.
If you've ever worked with DynamicsPerf, Rod Hansen, one of the main authors of that, has been after me, with the new 2.0, to go query and get this stuff and merge it into DynamicsPerf. Then we can start looking at both the statistics at a high level and some of the aggregated information as well, and see some trends that way. So look for that down the road. It's on my short list of things to do, but for now I'll give you what I have, and I think it's very helpful.
All right, was that helpful, to even just get the AX trace? Again, that logic is in my upload, so if you need that.
So the key takeaways here: get that focus area. Which process is it? Is it one of the main ones, or is it a subtask? Is it multithreaded or single-threaded? If it's multithreaded, maybe we can throw some more threads at it, or maybe we can change the bundle size.
The big one is how to programmatically start and stop the AX trace at a certain spot in the code. That is huge. It's helped me immensely on some of these very long-running processes.
Okay.
Yes, yeah.
Yeah.
All right. So to keep moving on with the demos, we're going to move on to more of a blocking one, and I think it's a unique blocking one.
I've got a little setup to do, one that doesn't give it away. Then we'll maybe talk about something.
Of course, restarting an AOS during a presentation is maybe not the best thing to do, right? It is necessary for this one.
So, there's blocking. Right away, if everyone's familiar: SQL blocking is when one SPID has a transaction open, has touched the table, and has a certain lock on a record. Another one that needs to update that same record, so it has to take, you know, an exclusive lock or an update lock on it, can't do it. It's blocked right there until the first one commits or rolls back. Then it can move forward. So blocking is kind of normal; it keeps the data consistent. Long blocks, that's bad, but blocking is fairly normal.
Then there's deadlocks. SQL sees one SPID getting blocked by another, and that other SPID gets down to a line of code in its transaction where it's getting blocked by the first one. These two SPIDs are not going to move forward, right? SQL looks at that and says, well, who did the least amount of work? Boom, you're dead; you're the victim. So it rolls that one back, and then hopefully your application retries and does its stuff.
A third one is what we're going to look at: an application deadlock. Anybody deal with an application deadlock before?
It's a little interesting. Let's go through it, and we'll talk about it.
Close windows. Master planning.
So typically, even if I only suspect blocking, I'm going to leverage DynamicsPerf. There are multiple ways you can capture blocking information. Again, I do a lot of things in real time, so I'm going to start this optional poll-blocking job. Alternatively, you could start the default SQL trace and get the blocked process report that way, but to look at it while it's running, I like to do this.
Start that. Hopefully that started a job. Where did I click? Running. Job. Running.
Okay, don't panic. It's going to keep running; this is not going to go away until I tell it to stop. So I'm just going to close it.
While that's running, it's going to record information into this DynamicsPerf blocks table. And it looks like I have some stuff in there already: 19 records right now.
I might also turn on long-running statements. There's a script in DynamicsPerf that does it as well, but it's under Options. Again, I wouldn't be doing a lot of this stuff in production; I definitely want to isolate it over into a test environment.
Where is that? Options, SQL, SQL trace. It looks like I put the threshold at four seconds for logging it to a table. There's a script in DynamicsPerf that sets it for all users. I chose four seconds, and I think the script by default is five, mostly because in a deadlock situation, generally, if no deadlocks are happening, SQL waits five seconds before it goes and checks whether there are deadlocks.
So hopefully I might get some call stacks in there; that's what this is going to get me, some call stacks. I don't always get what I want there, but it's typically what I will set up.
Okay, so master planning. With any luck, it will start. Started.
And like I said, you know, getting a feel for this: if you have a four-hour process, I don't know if I would sit here the whole time, but I've been known to monitor this pretty closely in some of those cases, just to get a feel for it. You know, you usually don't have just one thing wrong, right? You have multiple.
But I get to this point, okay. Well, let's go back to my script again. Process list.
And it's kind of a manual process. If you had multiple MRPs running, or you didn't do the cleanup of those tables, you might have to just choose the latest one in there. Make sure it's for the company you are running on; maybe you have multiple companies.
Again, these are my troubleshooting tools, so use them with a grain of salt.
Well, I had it filtered by just the processing ones this time, at the bottom of the WHERE clause.
Guess what. Again, demo, right? Example two, of course. So we know something's going on. Again coverage; example two sits out here, and I could wait until next week and this thing is not going to move.
Nor will SQL pick up a deadlock.
Okay.
What's going on? Well, at this point, go to the blocks. Blocks table. And I'm assuming I'm going to have a new entry in here. Number 20.
So DynamicsPerf is going to give me a lot of good information: who the lead blocker is (maybe there are multiple things getting blocked, but the lead blocker is this SPID 76), and what type of lock the lead blocker has. An exclusive lock on a key.
It could be a table lock. Anybody know, if you're seeing a table lock, what you'd want to make sure you have enabled?
It was that trace flag that we talked about. I was going to repeat the old deadlock trace flags and not get mixed up, but 1224 is the one where, if we turn that on for the SQL Server, it's only going to escalate to table locks if we have memory pressure. If you don't have it turned on, and you have 5,000 locks, I think (that value probably changes with the SQL version, but it used to be 5,000 locks on the table on different rows), we'd start escalating. But with it on, it lets the locks increase until we have a certain memory pressure.
So typically we want that on. But this is a key-level lock. All right. Okay, well, which object? Unit of measure cache. I'm going to take a little note of that.
I'd probably also take a look at the statements, the last statements these two transactions called.
I might also take a look, depending on what type of locks they were, at the actual execution plans.
But if I look at this and get rid of the parameters: so one's doing an insert. The one that's getting blocked is trying to insert a record, a specific record, that the other one has locked.
So the first SPID, the one that's blocking, probably did what? An update inside its tran, and it hasn't released its tran yet. Or maybe it did a delete, maybe even an insert; I don't know what it did. All I know is the last thing that it did was a select statement. So what is it doing now? Typically, if it was still moving... let's go look at the blocker's SPID.
So the blocked one; that gives us blocked 80, blocker 73.
Go to a new line. I'm sure there are multiple ways to do this; this is mine. 73 is the blocker. Put an equal sign in there.
So it could be that the blocker is still busy doing stuff, right? What does it say here? It's sleeping. It's not doing anything.
Okay, so that's interesting.
And sometimes you get a "sleeping" in there, and you run it a few times, and finally it's doing something. If it is doing something, it just has to finish what it was doing. But in this case, it's sleeping forever.
Okay, that's interesting.
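The lookup just walked through, going from a blocked SPID to the session at the head of the chain and then checking whether that session is actually doing anything, can be sketched in a few lines. This is only an illustration: the `Session` type, the function names, and the status strings are assumptions made up for the sketch, not DynamicsPerf or SQL Server objects, and the data is the blocked 80 / blocker 73 pair from the demo.

```python
from typing import Dict, NamedTuple, Optional


class Session(NamedTuple):
    blocked_by: Optional[int]  # SPID blocking this session, or None
    status: str                # e.g. "running", "suspended", "sleeping"


def lead_blocker(sessions: Dict[int, Session], spid: int) -> int:
    """Follow blocked-by links until we reach a session nobody is blocking."""
    seen = set()
    current = spid
    while sessions[current].blocked_by is not None:
        if current in seen:
            # A cycle means the sessions block each other: a true SQL deadlock.
            raise ValueError("cycle in blocking chain")
        seen.add(current)
        current = sessions[current].blocked_by
    return current


def describe(sessions: Dict[int, Session], spid: int) -> str:
    """Summarize what the head of the chain is doing."""
    head = lead_blocker(sessions, spid)
    if sessions[head].status == "sleeping":
        return f"SPID {head} is sleeping with an open transaction"
    return f"SPID {head} is still working ({sessions[head].status})"


# Hypothetical snapshot matching the demo: 80 is blocked by 73, and 73 sleeps.
sessions = {
    73: Session(blocked_by=None, status="sleeping"),
    80: Session(blocked_by=73, status="suspended"),
}
print(describe(sessions, 80))
```

A lead blocker that stays in the sleeping state across repeated checks is the signal that SQL itself will never resolve this, because from SQL's point of view there is no cycle to detect.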
So again, now I would probably go get that AX trace. Open up example two. Okay.
In this one we have a few, probably on different threads. I'll spare you the hunting around for this a little bit; I'll just go to this top one.
And what would I typically do, since I know it's that table? That's not it. Unit of measure. There it is. Type it down in there: star, unit of measure, star.
This is the only thread that showed any activity on it. Luckily, right? I got lucky. So it did a select, it did a delete.
Okay, interesting. Did the other threads do anything? Where is that insert at all? It's not going to get logged here until it's completed, so that blocked insert I'm not going to see. But if I go to the bottom of this thing, let's see what the very last things are that this one is doing.
A little hint: you might want to filter that a little bit, so you're not seeing a bunch of noise. Now it looks like maybe I over-filtered it towards the bottom. But sometimes when you're trying to expand something, you've got all that noise in there.
I don't remember how it looks without that filter. Force of habit; I haven't even noticed it for four versions.
Yeah. Definitely helpful when you're trying to filter down some of the noise that you don't need to see.
Okay. Of course. In time, what we'll see is the select, the delete statement, the method call in the delete, and then after that, in that same block of code...
So think of it: you're running code, right? You run a delete. There is the unit of measure.
One thing you might want to do is, from the SQL tab, just jump to the call stack at that particular spot. That would've been easier.
Okay, so. Yeah, I should have that. The delete is somewhere up in this area. And then right down here, if I take a look at this, that code is spinning up a user connection and trying to do stuff. What does that code look like?
Okay. Here's the user connection, and here is the insert that never ran; it is getting blocked by the same block of code. I've got a transaction. I did, what did I say, a delete inside that transaction. It comes down here, still the same thread, and I say: create a user connection and go insert something in there. I can't insert the exact same record I deleted, because this transaction over here isn't completed. And this will never finish until this user connection comes back and finishes its logic so it can continue on.
So I call that an application deadlock, because it's one thread coming down, touching some data, and spinning up another user connection. That's typically where you see an application deadlock. Now I'm over here in this code, and I can't keep moving until I come back with whatever I'm doing and move on.
So, you know, this one's blocked by the first one. This one is never going to move on until that one is done, and that one is never going to get done.
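The shape of that application deadlock can be reproduced in miniature. The sketch below is an analogy under stated assumptions, not the AX code: it uses SQLite from Python's standard library, a made-up `uom_cache` table, and two connections opened by the same thread. SQLite locks the whole database file rather than a single key, and the second connection is given a short timeout so the demo ends instead of waiting forever as it did in the talk.

```python
import os
import sqlite3
import tempfile


def demo_application_deadlock() -> str:
    """One thread, two connections: the second waits on the first's open transaction."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # Connection A plays the main transaction.
        conn_a = sqlite3.connect(path, isolation_level=None)
        conn_a.execute("CREATE TABLE uom_cache (id INTEGER PRIMARY KEY, factor REAL)")
        conn_a.execute("INSERT INTO uom_cache VALUES (1, 2.5)")

        conn_a.execute("BEGIN IMMEDIATE")  # transaction opened and left open
        conn_a.execute("DELETE FROM uom_cache WHERE id = 1")

        # Same thread now spins up a second connection (the "user connection")
        # and tries to insert the record that A deleted but has not committed.
        conn_b = sqlite3.connect(path, timeout=0.2, isolation_level=None)
        try:
            conn_b.execute("INSERT INTO uom_cache VALUES (1, 2.5)")
            outcome = "insert succeeded"
        except sqlite3.OperationalError as exc:
            # Without the timeout this call would simply never return, because
            # the only code that can end A's transaction is waiting right here.
            outcome = f"insert blocked: {exc}"
        finally:
            conn_b.close()

        conn_a.execute("ROLLBACK")
        conn_a.close()
        return outcome


print(demo_application_deadlock())
```

The database engine sees only one session waiting on another, which is ordinary blocking, so no deadlock victim is ever chosen. The cycle exists only in the application, where the thread holding the first transaction is the same thread waiting on the second connection.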
What do I do next? Well, this is where I personally would say, okay, take a look at the methods. Unit of measure converter cache.
I'm going to go to, hopefully I can log into LCS real quick. Sign in.
I love this part of LCS. So I'm going to go into one of my projects; you could probably just do the search from right there. But if I go to Issue search.
A lot of people search by keywords, like "AX MRP hanging". You might not get a hit on that. But if I know a certain method... and it kind of tells you, hey, there's a hint here, if you put in a dollar sign, slash.
There it is. I thought I typed it in there once. Let me see if I can get it a little bigger for you.
"System stops..." There's a bunch, all right, but on R3: "System stops responding at the insert of a record on the unit of measure cache table." Boom, problem solved. Do I have that KB? Probably not.
So, you know, you can go in here and take a look at it, view the changes. Unfortunately, I kind of wish it would tell you exactly what changed in here. Unfortunately, it's listing all the methods; in reality it was one or two that were changed.
The main reason for that is we don't want you to just put in code without applying the whole thing, right? That would cause more support cases for us, because things are sometimes interwoven.
So the main points there, a couple of things. And I probably chose the hardest blocking case I could, just because it's an interesting one. But: DynamicsPerf capture stats, the blocking options, and the long-running AX statements, which is basically that option under Tools, Options, SQL. Again, in DynamicsPerf, if you look through the documentation, there's a script that turns it all on.
I wouldn't put that on in production, of course. Or if you do, do it for a short period of time and then disable it again. I guess I probably wouldn't put it in production, especially for something that you can repro in a test environment.
In Lifecycle Services, search for the keywords, but also, if you get it narrowed down to a focus area of a class or method, man, that is pretty powerful for looking through the KBs for which ones might apply to this thing.
You could go out there and search for ReqCalc. The ReqCalc class is the bulk of it, but there are all these classes that inherit from it. And the cache classes; I can't remember all of them, but also the certain tables with indexes.
You know, there are a lot of different ways you can go about blocking. We talked about the typical blocking and the SQL deadlock; there are different ways you can troubleshoot each one of those.
Last one. Okay, I've got 11 minutes here; I'm going to make it through this last demo.
For this one we don't have to go through a lot of the application part of it; it's more so to show you the examples. Example three.
Okay, so DynamicsPerf. I might take a look at that, and at one of the queries. If you look through the analysis scripts, we give you all the scripts so you can run and slice and dice that data so many different ways. I typically start with the long-running stuff, by total elapsed time. There might be other ways I could slice it, but generally I start with this. So I ran MRP.
I'm filtering by that specific run, the capture timeframe around this job, the DynamicsPerf capture stats. And I'll select the top ones, ordered by total elapsed time descending.
Scroll to the right. I'm looking at some of these. In a production system this list probably goes down a lot further than here, of course, but this was something that I ran, and I'm like, that's interesting, I'm going to use that example. I didn't do anything that was out of the ordinary to cause this.
And to be honest, I had a heck of a time getting it to happen the second time so I could capture more information to show you.
So this fetch, FETCH cursor, I can't really see what that's doing. But I'm going to take a look at this next one, the next longest-running one. And why am I looking at that versus maybe the one after that? Well, this one's taking on average 90 milliseconds a call, and it's getting called 245 times; it's the second one in the group by total elapsed time. Maybe I can tune that down to maybe five milliseconds or something.
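The reasoning behind picking that statement is simple arithmetic: total elapsed time is the average time per call multiplied by the number of calls, and the possible saving is the per-call improvement multiplied by the same call count. A minimal sketch, assuming a made-up `QueryStat` record; only the 90 ms, 245 calls, and 5 ms target come from the talk, and the other two rows are hypothetical entries added for contrast.

```python
from dataclasses import dataclass


@dataclass
class QueryStat:
    name: str
    avg_ms: float
    executions: int

    @property
    def total_ms(self) -> float:
        # Total elapsed time = average per call * number of calls.
        return self.avg_ms * self.executions


def potential_saving_ms(stat: QueryStat, target_avg_ms: float) -> float:
    """Time recovered per run if the average call drops to target_avg_ms."""
    return (stat.avg_ms - target_avg_ms) * stat.executions


stats = [
    QueryStat("join from the talk", avg_ms=90.0, executions=245),
    QueryStat("hypothetical cheap lookup", avg_ms=2.0, executions=3000),
    QueryStat("hypothetical rare report", avg_ms=400.0, executions=3),
]

# Rank by total elapsed time, descending, the way the demo query is ordered.
ranked = sorted(stats, key=lambda s: s.total_ms, reverse=True)
for s in ranked:
    print(f"{s.name}: {s.total_ms:.0f} ms total over {s.executions} calls")

print(potential_saving_ms(ranked[0], target_avg_ms=5.0), "ms saved at 5 ms per call")
```

Ranking by total rather than by average is what keeps a rarely executed slow statement from crowding out a moderately slow one that runs hundreds of times per bundle.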
And granted, this is sample data; the numbers are much smaller than what you'd see in production, but it's a good example. So I parsed it out, grabbed that, and threw it up here. Looking at the tables, I have ReqTransCov, ReqTrans, and ReqUnscheduledOrders. Of those tables, in a production environment, which ones are going to be big?
ReqTrans, right? And ReqTransCov, the one that joins the two together. Those can be pretty darn big tables.
ReqUnscheduledOrders, just from the name: we're probably done with coverage, and we're going to start scheduling stuff.
The WHERE clause is what's going to make it unique. RefType? Yeah, maybe. Maybe.
But if you look at it, the smallest table, logically speaking, is probably going to be ReqUnscheduledOrders.
And the filter, if I look at it: partition, data area ID, and process ID, which is a specific MRP run. Okay, that's the process ID; you probably saw that in my script, where we were looking for that. And then the bundle.
If you have that set to 50, you're probably going to have 50 orders in there. A pretty small record set. You'd think you would start with that and join on the rest of the tables. At least that's what I would do if I were SQL: start with the small thing, join on the rest later. Let's see what it was doing.
Click on the execution plan. And let's see if I can make that bigger.
Okay. So it's a little hard to read, but at first glance: hey, seeks. Good. I like seeks; scans, not necessarily. And usually in production you'd see thicker lines, because the data is going to be larger. But if I hover over this first seek, this is where we're starting: in ReqTransCov. Why are we not starting with ReqUnscheduledOrders?
Beats me. But the other thing is, if I hover over this, what is SQL doing? It's seeking. So it is doing a seek, but it's seeking by partition, data area, and plan version.
On ReqTransCov, that's a lot of records, right? That's, for one plan version, every single record that you've created out there. That's not going to be very unique. Luckily, my sample data didn't have that much data there, so it doesn't look horrible.
And this comes with, you know, tuning; it's not something you just pick up overnight. I don't know how much experience you have, but there's sort of a logic to it. You sort of get a feel for how it should work. And there are many different ways to approach it.
So if I saw this, I might go back and take a look at: well, did that always run that way? So I'm going to go by query hash. If it's the exact same statement, more than likely, on the same SQL Server, it's going to have the same hash. Or you could run either one of these. Sometimes I'll use this other statement quite a bit because I'm moving data around to different servers, so I'll try to get the SQL text narrowed down far enough with the LIKE clause to get it as well.
But you'll notice for some of these the average elapsed time is a few milliseconds, right? I'm going to highlight one of those execution plans. Guess what, I was right: it should start with ReqUnscheduledOrders. This one's running so much better. At least that's what the data is telling me; it's still just a theory. How do I make it do this?
Lots of different ways, right? I could go into X++ and find out where this code is in X++. I believe there's a hint you can put in X++, forceSelectOrder, I believe; I may have the terms mixed up or in a different order. IntelliSense helps me out quite a bit.
Or if I want to do a quick test, since I know that's going to take time, you know, I have to compile and so forth: what about a SQL plan guide?
Okay, let's do that. Once I have this statement, typically: right-click, Edit Query. This is the way I do it; there are probably tons of ways. But I basically add the parameters, get rid of the parentheses at the front, and now I have the parameters in the statement.
And then I'll go ahead and stop. Examples.
Using what I've done in the past (I've done this before, forcing an exact execution plan), I'll come to my little template and paste in the statement, paste in the parameters, and then figure out what type of hint is going to give me that.
And sometimes you can force the exact plan; there are ways to do that. In this case, based on what I know of the tables, I'm going to take a risk: I'm going to use FORCE ORDER. Don't get carried away with FORCE ORDER; it can cause problems too. You're not always right, and pieces of data change. But in this particular one, that's what I did.
FORCE ORDER uses the order that the tables were specified in that query statement. So if they're in the right order, you get lucky, right? If X++ had coded it in a different way...
Thank you. That's when I go to the engineering team on that, and I do work closely with them. So if you do have issues, I'd like to hear those.
You know, if it's a new case, we need to open up a support case and work it. But if you come across something where "this solved it for me", I don't mind getting those occasional emails. It's just so I can add to the list.
I gave you the tools; you might run with them and adjust for many of your systems and solve issues. But I'd rather make the product better, if that's what makes sense. It's not just because customer ABC has this specific data set, because that's really a tuning exercise. But if I can help 10 or 20 customers out, this is all helping.
Okay. We need to take a look at it.
Questions?
All right, thanks for attending, guys. I appreciate your attention. I hope it was worth your time.
Thanks.