Microsoft Azure AZ-801 — Section 19: Troubleshoot Windows Server virtual machines in Azure Part 2

114. Troubleshoot VM performance issues

I’d like to now talk about troubleshooting performance issues with your virtual machines that are being hosted in Azure.

So, the first thing we need, of course, is a virtual machine, and I already have one called Windows Server two. I could always have added one by going to the menu button, going to Virtual machines, and clicking to create a virtual machine. But I’m going to click on my win server two here. The first thing to think about with performance issues is what level of performance we have on that virtual machine. We can figure that out by going to the virtual machine and clicking on Size. From there we can see what size we’ve got. In this case, I’ve got two virtual CPUs and eight gigs of RAM. Those are the two big things you want to think about. The third big thing is the maximum input/output operations per second (IOPS), which is 3200 here.

So, you have to think about what level of performance you can even get out of that size. The good news is that you can always move to a larger size, or to a smaller one if that’s what you want.

So, you have a resize option here. I could go higher or lower, but if you’re dealing with slow performance, you might want to consider going to a larger size. You should definitely check the Azure pricing calculator before you do that to get a feel for what the cost is going to be.
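To make the sizing decision concrete, here is a minimal sketch of picking the smallest size that covers a workload. The size names and the catalog are made up for illustration (only the first entry mirrors the 2 vCPU, 8 GB, 3200 IOPS size shown on screen); real sizes, limits, and prices should come from the portal and the pricing calculator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VmSize:
    name: str        # hypothetical label, not a real Azure size name
    vcpus: int
    memory_gib: int
    max_iops: int


# Illustrative catalog only; the numbers are assumptions for the example.
SIZES = [
    VmSize("small", 2, 8, 3200),
    VmSize("medium", 4, 16, 6400),
    VmSize("large", 8, 32, 12800),
]


def smallest_fit(sizes: List[VmSize], vcpus: int, memory_gib: int, iops: int) -> Optional[VmSize]:
    """Return the smallest size that meets all three requirements, or None."""
    candidates = [
        s for s in sizes
        if s.vcpus >= vcpus and s.memory_gib >= memory_gib and s.max_iops >= iops
    ]
    # Rank by vCPUs first, then memory, then IOPS
    return min(candidates, key=lambda s: (s.vcpus, s.memory_gib, s.max_iops), default=None)


if __name__ == "__main__":
    # The CPU and RAM fit the current size, but the disk workload needs more IOPS
    print(smallest_fit(SIZES, vcpus=2, memory_gib=8, iops=5000))
```

The point of the sketch is that any one of the three limits can force a resize, so it is worth checking IOPS as well as CPU and memory.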

So, again, the first thing is looking at the size. Another thing is the disk type you’re using. In this case, I’m using a Standard HDD, which is going to be slower than a solid-state disk. So, using SSDs with your virtual machines is another thing to look at.

The next thing we can do is connect directly to our virtual machine and troubleshoot performance using the same types of tools that we would use on a physical server. To connect, all I have to do is go back up here to Connect, go to RDP, download the RDP file, and open it, which I’ve already done. I’ve put my credentials in, and here is the server.

So, just like on an on-premises server, I can go to Server Manager, go to Tools, and open Performance Monitor. I can use Performance Monitor exactly as I would on-premises. I can add counters and figure out why the server might be running slowly. Besides looking at the different counters, I can work with data collector sets, and I can generate a performance data collector set if I want to.
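As a rough illustration of what you do with counter data once you have it, the sketch below summarizes a series of CPU samples. It assumes the samples were exported to a simple CSV with hypothetical column names `timestamp` and `cpu_percent`; a real Performance Monitor export uses different headers, and the 80 percent threshold is only an example.

```python
import csv
import io
from statistics import mean

# Made-up sample data standing in for an exported counter log
SAMPLE_CSV = (
    "timestamp,cpu_percent\n"
    "10:00:00,20.0\n"
    "10:00:15,95.0\n"
    "10:00:30,90.0\n"
    "10:00:45,35.0\n"
)


def summarize(csv_text: str, column: str, threshold: float) -> dict:
    """Return the average, the peak, and the share of samples above the threshold."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    over = sum(1 for v in values if v > threshold)
    return {
        "average": mean(values),
        "peak": max(values),
        "share_over_threshold": over / len(values),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_CSV, "cpu_percent", threshold=80.0))
```

A sustained high share of samples above the threshold points to a sizing problem, while a short peak usually points to a single busy process.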

So, I have all that available to me. You also, of course, have Task Manager for a quick look at what’s going on. I can see all the apps that are running right now. There’s a Performance tab that gives me a quick visual of my CPU, memory, and network. I can see any users that are currently logged on. In this case there’s only one; you would only have multiple users if you had Remote Desktop sessions going. Under Details, you can view every executable that’s running right now and see how hard it’s hitting the CPU. The idle process is what you would normally see at the top unless your server was really busy. It simply represents the processor time that isn’t being used when there’s not a lot of activity. And then there’s your Services tab.

So, another thing to consider is the services. If a server is slow, one of the things you can do to help is stop and disable services that you don’t need. I can right-click my Start button and go to Computer Management. As soon as that pops up on the screen, I can go under Services and Applications, then Services, and look for services that perhaps aren’t necessary.

So, one thing you have to do here is be familiar with which services do what, and what’s important and what’s not. For example, I have the Print Spooler enabled, and that’s using memory.

Now, if this server never does anything with printing, you could just stop that service and that’s going to free up some memory. And you would also want to set it to disabled so that it doesn’t start upon the next boot.
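The same stop-and-disable step can be scripted. This is a minimal sketch that only builds the `sc.exe` command lines and prints them; it assumes the Print Spooler's service name is `Spooler` and that you would run the commands yourself from an elevated prompt on the server.

```python
from typing import List


def disable_service_commands(service_name: str) -> List[List[str]]:
    """Build the sc.exe commands that stop a service and keep it from starting at boot."""
    return [
        ["sc.exe", "stop", service_name],
        # sc.exe expects "start=" and its value as separate tokens
        ["sc.exe", "config", service_name, "start=", "disabled"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them
    for command in disable_service_commands("Spooler"):
        print(" ".join(command))
```

Keeping it as a dry run is deliberate: disabling the wrong service can break the server, so review the list before running anything.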

So, that’s something to think about: familiarize yourself with the services, and get rid of the ones you don’t need to free up some resources. Another thing we’ve got for troubleshooting is Event Viewer. If a system has performance problems, what you can do with Event Viewer is look for critical messages.

So, I can go here under Custom Views, right-click, and choose Create Custom View. We could create a custom view just for critical messages, on all Windows logs. Click OK. We’ll just call this one Critical. Click OK again, and it’s only going to show me critical messages.
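What that custom view does is filter by event level. The sketch below shows the same idea on a handful of made-up event records; the record layout is an assumption for the example, while the level numbers follow the Windows convention where 1 is Critical.

```python
from typing import Dict, List

# Windows event levels: 1 Critical, 2 Error, 3 Warning, 4 Information
CRITICAL = 1

# Made-up records; real events carry many more fields
SAMPLE_EVENTS = [
    {"log": "System", "level": 4, "message": "Service started"},
    {"log": "System", "level": 1, "message": "Unexpected shutdown"},
    {"log": "Application", "level": 2, "message": "Application error"},
]


def critical_events(events: List[Dict]) -> List[Dict]:
    """Keep only the events whose level is Critical."""
    return [event for event in events if event["level"] == CRITICAL]


if __name__ == "__main__":
    for event in critical_events(SAMPLE_EVENTS):
        print(event["log"], event["message"])
```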

Now, granted, I don’t have any critical messages because I haven’t really had any performance issues. But if I did, this is the view you would want to look at. That’s it as far as Event Viewer goes. With Windows Server, you also have what’s called reliability history.

So, if you open the Start menu and start typing “view reliability,” you’ll see View reliability history appear. This is a helpful tool because it tells you whether there have been any failures at all: stop errors, failures, something that crashed. It’s very much like Event Viewer, except that it presents the data in a different form. It doesn’t give you quite as much information as Event Viewer, but what’s great about it is that it gives you a different visualization of what’s going on with the server day by day.

So, you’ll see I haven’t had any errors or problems like that. But if I had, you would see a little red X here and yellow warning messages, and I could click on each day individually to see what happened on that day.

So, let’s say I went out of town on vacation, and when I came back a junior admin said, “Oh, one of our virtual machines crashed,” or that it was really slow. Since I was out of town last week, I can pull this up, ask what happened last week, look at those days, and see what was slowing it down.

So, this is called Reliability Monitor, and it’s just part of Windows Server. Again, if you search for the word “reliability,” you’ll find your reliability history. Another thing we’ve got is something called Resource Monitor. Just search for the word “resource” and you can pull it up. It gives you another view, very similar to Task Manager, but with more information. You can look at your CPU and find every executable that is running against your CPU right now, and which services are running. You can see what your memory is doing and which executables are using memory. But to me, the two most valuable options here are Disk and Network. I can click on Disk and see, for every executable, how much reading and writing it’s doing to the disk.

If your computer is slow because there’s a lot of reading or writing, you can figure out what’s doing it. And then finally, Network. You can use this for troubleshooting. I actually helped troubleshoot a company one time as a consultant. I wasn’t able to come to the company’s facility; I don’t remember if I was out of town or what was going on. But they called me up and said there was a problem with their server: every time they booted the server up, their Internet connection would go down. I knew right then and there that it was probably a denial-of-service attack, because when they shut the server down, the Internet connection would come back up. It turned out that the company had switched out their firewall without talking to me about it. They had just got a new firewall, and the new firewall had open ports. Namely port 389, which is the LDAP port, was open, and hackers had done a port scan against their Internet connection and discovered that open port.

What they were trying to do was a brute-force attack to break the admin credentials and get into the network remotely. Now, to troubleshoot that, I couldn’t see the server myself.

So, I had the person at the facility open up this tool, take a picture of the screen, and send it to me. Right then and there I could see exactly which port number these hackers were attempting to hit, which was port 389. That’s LDAP, the Lightweight Directory Access Protocol, which of course is what you would want to break into if you were trying to do a brute-force attack.

So, then they told me, “Oh, we switched out our firewall.” Then we knew, okay, we’ve got to get on this new firewall and block the port. Anyway, moral of the story is this is a great tool to keep in mind and it can help you troubleshoot lots of issues, performance and security and all that stuff as well.
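What I was doing by eye on that screenshot can be sketched as a simple tally of connection attempts per local port. The connection list below is invented for the example (the addresses come from documentation ranges), and in practice the data would come from Resource Monitor's Network tab or a similar tool.

```python
from collections import Counter
from typing import List, Tuple

# Made-up sample: (remote address, local port) pairs
SAMPLE_CONNECTIONS = [
    ("203.0.113.5", 389),
    ("203.0.113.5", 389),
    ("203.0.113.5", 389),
    ("198.51.100.7", 389),
    ("192.0.2.10", 443),
]


def busiest_ports(connections: List[Tuple[str, int]], top: int = 3) -> List[Tuple[int, int]]:
    """Count connections per local port and return the busiest ports first."""
    counts = Counter(port for _remote, port in connections)
    return counts.most_common(top)


if __name__ == "__main__":
    for port, hits in busiest_ports(SAMPLE_CONNECTIONS):
        print(f"port {port}: {hits} connections")
```

A single port receiving most of the inbound attempts, especially from a few remote addresses, is the pattern that gave the attack away.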

So, again, what you’ll ultimately find is that for troubleshooting performance issues on a server running as a virtual machine in Azure, you have the same tools available to you as you do on-premises. You can also use a Log Analytics workspace and related monitoring to watch VMs in the cloud, just as you can with on-premises machines.

115. Troubleshoot VM extension issues

Let’s talk now about dealing with issues involving VM extensions.

Let’s take a look at extensions. Here I am on portal.azure.com. I’m going to click the menu button and go down to Virtual machines. I have a virtual machine called Windows Server two that I set up previously, and I’m just going to click on that. From there, if I scroll down a little bit, you’ll see a blade called Extensions + applications.

So, I’m going to click on that, and I have a couple of extensions here that I’ve already installed, for example the Azure Windows Monitor. There are a couple of things to consider. First off, it tells you that the provisioning state is Succeeded. That means the installation went through successfully, so there have been no issues. But the upgrade status is Disabled. When you see something like that, you can generally click on the extension and turn on the feature that’s been disabled.
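The check I just did by reading the list can be sketched as a small filter over extension records. The record layout and extension names here are hypothetical stand-ins for what the portal shows; only the idea of comparing the provisioning state against "Succeeded" comes from the walkthrough.

```python
from typing import Dict, List

# Hypothetical records mirroring the columns shown in the portal
SAMPLE_EXTENSIONS = [
    {"name": "MonitoringExtension", "provisioning_state": "Succeeded"},
    {"name": "AntimalwareExtension", "provisioning_state": "Failed"},
]


def extensions_needing_attention(extensions: List[Dict]) -> List[str]:
    """Return the names of extensions whose provisioning state is not Succeeded."""
    return [
        ext["name"]
        for ext in extensions
        if ext["provisioning_state"] != "Succeeded"
    ]


if __name__ == "__main__":
    print(extensions_needing_attention(SAMPLE_EXTENSIONS))
```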

Now, if there’s an error, you’re going to get it up here in the notifications, and you need to read what that notification says. Yesterday, for example, I was doing something involving this resource group and I got an error. I can click on that error and read what it’s telling me: “This resource was not found. It may have been deleted.” That’s because I tried to do something after I had already deleted it, or at least triggered a deletion.

So, the important thing is to always read your notifications to see what the error might involve. That’s a general Azure consideration, not something specific to extensions. Let’s go back over here to extensions. The other thing is that you can use extensions to help you with problems.

For example, I can go here to Add and find the antimalware extension. Let’s say I’m concerned about viruses and there are some issues with that. I could install this Microsoft Antimalware extension. It asks whether you want to exclude anything, such as files or processes. I’m going to enable real-time protection and enable a scheduled scan. I can pick the scan type, quick or full. A quick scan only checks the important, high-priority areas of Windows, as opposed to a full scan of the whole hard drive. You also choose which day it runs. Then click Review + create, and click Create.

Now, what might cause that to fail would be if I did not have privileges to manage this virtual machine.

If I were a user or some type of admin doing this, and I tried to deploy an extension and got a failure message, one thing to consider is that it could be related to role-based access control.

So, let’s go back over to our virtual machine, win server two, and go to the Access control (IAM) blade. IAM stands for identity and access management, and this is where role-based access control lives.

So, I’ll go there, and from there I can check the role assignments. This tells you who has which privileges. John Christopher is an Owner, so if an install were failing for me, it wouldn’t be because of privileges. But what if you’re dealing with a user who’s not an Owner? Say you add a role assignment and the user is just a Reader. Well, they can’t install anything on that virtual machine.

So, if you’re dealing with somebody like Alex here, and I gave Alex the Reader role, reading is all Alex could do. Alex would get an error, so the failure could involve privileges. If you try to install an extension and it throws an error, the error message you get should show that. As you can see, I did not get an error when I installed this antimalware extension.
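To see why a Reader fails where an Owner succeeds, here is a simplified sketch of how a role's allowed actions are matched against the operation being attempted. The role table is a trimmed-down assumption for illustration (real role definitions contain more entries), and the wildcard matching is an approximation of how Azure evaluates action patterns.

```python
from fnmatch import fnmatchcase

# Trimmed-down role definitions for illustration; real ones are longer
ROLES = {
    "Owner": {"actions": ["*"], "not_actions": []},
    "Contributor": {
        "actions": ["*"],
        "not_actions": ["Microsoft.Authorization/*/Write", "Microsoft.Authorization/*/Delete"],
    },
    "Virtual Machine Contributor": {
        "actions": ["Microsoft.Compute/virtualMachines/*"],
        "not_actions": [],
    },
    "Reader": {"actions": ["*/read"], "not_actions": []},
}

# Operation needed to deploy an extension to a VM
EXTENSION_WRITE = "Microsoft.Compute/virtualMachines/extensions/write"


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role grants the action and does not exclude it."""
    definition = ROLES[role]
    wanted = action.lower()  # compare case-insensitively
    granted = any(fnmatchcase(wanted, p.lower()) for p in definition["actions"])
    excluded = any(fnmatchcase(wanted, p.lower()) for p in definition["not_actions"])
    return granted and not excluded


if __name__ == "__main__":
    for role in ROLES:
        print(role, is_allowed(role, EXTENSION_WRITE))
```

In this sketch the Reader role only matches operations ending in `/read`, so the extension write is refused, which is the kind of failure Alex would run into.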

There it is right there. No problems, and it is showing up. I can also troubleshoot extensions inside the virtual machine itself, so let me jump over to the virtual machine I had already connected to. If we go to Programs and Features, we can see anything that’s been installed. It sometimes takes a little while, but an extension will show up here, so this is a place to check if there’s any type of issue. You can also check Event Viewer. If there are any issues with extensions, Event Viewer will provide information in the Application log. If I go under Windows Logs, then Application, I can see whether there were any errors, which there aren’t. The other place is the Setup log, which covers things being installed. If there’s any kind of error there, you can look it up.

Those are some of the things you would want to think about. Lastly, there’s Reliability Monitor, through reliability history. If something crashed, it would show up there.

So, those are essentially the things to think about when an extension might be having problems. Most importantly, look at the notification and the error you’ve got. If it’s privilege related, then the Access control (IAM) blade is what you’re going to deal with. If something inside the virtual machine is failing even after the extension installed successfully, you need to jump into the virtual machine and use the troubleshooting tools it provides.