I was inspired to write this post by a question posted on Reddit a little while ago. While I was happy with my response (and that so many people agreed with me!), I figured it would also benefit those who read my blog or go searching for this information and aren’t Reddit readers. I’ve also added a few links to extra information on the specific commands, which I didn’t include in my original post. So… onwards we go.
We have a truck load of checks that we need to do of a morning, as part of a script, to check what’s going on with our AD. That’s not to say we just check our AD, but this post is going to be aimed at those things we do to check that our AD environment is running as smoothly as it can be and should be. I personally didn’t write the script, and I’m pretty sure it would be classified as an internal document (and intellectual property of someone/something other than me!) so I couldn’t give it out anyway, so sadly you’re just going to have to make do with what I give you and make your own script. Sorry.
That said, we’ll shuffle right along into some of the daily checks that you should be performing:
Replication
- Replication status: repadmin commands
- Replication summary: repadmin /replsummary
- Monitor AD Replication errors: repadmin /showrepl * /errorsonly
- Monitor AD Replication latency: repadmin /showutdvec * dc=domain,dc=com
- Monitor AD Replication queue length: repadmin /queue *
- Checking Fail Cache on ISTG DC’s: repadmin /failcache
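
This isn't the script we use, but if you wanted to gather the replication checks above in one go, a minimal Python sketch might look something like this. It assumes repadmin is on the PATH of the machine you run it from; the function names and the injectable `runner` argument are my own inventions for illustration, and it only collects the raw output rather than trying to interpret it.

```python
import subprocess

# Commands copied from the list above; dc=domain,dc=com is the post's
# example value and needs replacing with your own domain DN.
REPLICATION_CHECKS = {
    "summary": ["repadmin", "/replsummary"],
    "errors": ["repadmin", "/showrepl", "*", "/errorsonly"],
    "latency": ["repadmin", "/showutdvec", "*", "dc=domain,dc=com"],
    "queue": ["repadmin", "/queue", "*"],
    "failcache": ["repadmin", "/failcache"],
}


def run_command(args):
    """Run one command and return (exit_code, combined_output)."""
    done = subprocess.run(args, capture_output=True, text=True)
    return done.returncode, done.stdout + done.stderr


def collect_replication_report(runner=run_command):
    """Run every replication check and keep the raw output for review.

    `runner` is a hypothetical hook so the function can be exercised
    without a domain controller to hand.
    """
    report = {}
    for name, args in REPLICATION_CHECKS.items():
        code, output = runner(args)
        report[name] = {
            "command": " ".join(args),
            "exit_code": code,
            "output": output,
        }
    return report
```

Calling `collect_replication_report()` on a machine with the AD tools installed returns a dictionary keyed by check name, which you could then print, email, or ship to your logging server.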
FSMO Roles
- FSMO role holders: netdom query /domain:<your domain> fsmo
ISTG
- Identifying ISTG DC’s: repadmin /istg /verbose
Time Settings & Synchronisation
- Set DC time settings:
- Verify DC time sync:
- Verify Forest Time Config: w32tm /query /configuration (run on all DC’s)
Trusts
- Trust Relationship check: nltest /domain_trusts
DNS & Networking
- Check DNS records across the forest and report on errors: dnslint utility
- Monitor for missing subnets: type %systemroot%\debug\netlogon.log | findstr NO_CLIENT_SITE
- Monitoring DC TCP ports: manual or scripted, checking to make sure all ports required for DC communication are ‘LISTENING’. The main ports to look out for: 389, 636, 3268, 3269, 135, 53, 88, 445, 139, 123
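
For the scripted version of the port check, here is a rough Python sketch that tries a TCP connection to each of the ports listed above. The function names are hypothetical, and it tests reachability from wherever you run it rather than reading the 'LISTENING' state on the DC itself. One caveat: NTP on 123 normally runs over UDP, so a TCP probe may report it as closed even when time sync is healthy.

```python
import socket

# Ports listed in the post above.
DC_PORTS = [389, 636, 3268, 3269, 135, 53, 88, 445, 139, 123]


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False


def closed_ports(host, ports=DC_PORTS, probe=port_is_open):
    """Return the ports from `ports` that did not accept a TCP connection.

    `probe` is a hypothetical hook so the logic can be tested offline.
    """
    return [port for port in ports if not probe(host, port)]
```

Something like `closed_ports("dc01.domain.com")` (with your own DC name in place of that made-up one) gives you a list to alert on; an empty list means every probe connected.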
Event Logs
- Checking System Event Logs: This can be done however you want to do it (manual, scripted, half-and-half) – we do this via our ELK server, but any logging server that is logging your AD System logs will pick these up. The main event ID’s to look out for: 29, 1056, 16645, 16650, 55
- DNS Event Log checks: done via a script (or can be done via a logging server, but this can get noisy depending on how your DNS logs are configured) – main event ID’s: 5774, 5775, 5781
- Review of Directory Service Event Logs: This can be done however you want to do it (manual, scripted, half-and-half) – we do this via our ELK server, but any logging server that is logging the DC ‘Directory Service’ logs will pick these up.
- Analyse/archive of DC security logs: This can be done however you want to do it (manual, scripted, half-and-half) – we do this via our ELK server, but any logging server that is logging your AD Security logs will pick these up
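
If you go the scripted route for the event ID checks, the filtering itself is simple. The Python sketch below assumes your events have already been exported as dictionaries with "log" and "event_id" keys; that shape, the "system" and "dns" labels, and the function name are all made up for illustration, so adjust them to whatever your exporter or logging server actually gives you.

```python
# Event IDs named in the post, keyed by a hypothetical log label.
WATCHED_EVENT_IDS = {
    "system": {29, 1056, 16645, 16650, 55},
    "dns": {5774, 5775, 5781},
}


def flag_events(events, watched=WATCHED_EVENT_IDS):
    """Return the events whose (log, event_id) pair is on the watch list.

    Each event is assumed to be a dict with "log" and "event_id" keys.
    Events from logs that are not in `watched` are ignored.
    """
    flagged = []
    for event in events:
        ids = watched.get(event.get("log"), set())
        if event.get("event_id") in ids:
            flagged.append(event)
    return flagged
```

Keying the watch list by log means a DNS event that happens to share an ID with a System one is not flagged by mistake.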
Account Security
- Account lockouts: This can be done however you want to do it (manual, scripted, half-and-half) – we do this via our ELK server, but any logging server that is logging your AD Security logs will pick these up
- Check admin group memberships: this will vary depending on how your administrative groups are set up, just be sure to check them regularly. Keep your ‘Enterprise Admins’ & ‘Schema Admins’ empty, if possible.
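
One way to make the admin group check repeatable is to compare today's membership against a list you have already approved. The Python sketch below assumes you can export each group's members by whatever means you prefer; the two input dictionaries and the function name are hypothetical, and the "should be empty" rule simply reflects the advice above.

```python
# Groups the post suggests keeping empty where possible.
SHOULD_BE_EMPTY = ("Enterprise Admins", "Schema Admins")


def review_admin_groups(current, baseline):
    """Compare current admin group members against an approved baseline.

    `current` and `baseline` map group name -> iterable of member names.
    Returns a list of human-readable findings (empty when all is well).
    """
    findings = []
    for group in sorted(current):
        members = set(current[group])
        approved = set(baseline.get(group, ()))
        for name in sorted(members - approved):
            findings.append(f"{group}: unexpected member {name}")
        for name in sorted(approved - members):
            findings.append(f"{group}: missing member {name}")
        if group in SHOULD_BE_EMPTY and members:
            findings.append(f"{group}: not empty ({len(members)} member(s))")
    return findings
```

An empty list back means nothing has drifted from the baseline and the sensitive groups are empty; anything else is worth a look.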
EVERYTHING!
The magic command that lets you get a high level overview of everything that’s going on in your domain and where you need to focus your attention on further:
Hey Jess,
This is great. I’ve been looking for sysadmin type checklists for some of my customers. My startup helps teams with recurring processes. https://www.manifest.ly/
I’m interested in sharing this as a template on my system. Would that be OK if I included attribution back to you?
Thanks much!
// Philip
The next step would be to automate all these checks and let the system do them for you. Something like Adaxes (http://www.adaxes.com/tutorials_AutomatingDailyTasks_AutomaticallyDeprovisionInactiveActiveDirectoryUsers.htm) works perfectly for such purposes
As I said in the post – we have some of these automated, but the script was written as a collaboration between other members of my team, which makes it an internal document of where I work. Due to organisation rules regarding the publication of things, I can’t post it publicly.
Hi Jess,
Thank you, thank you. Your posts are very helpful and I am just like you. I was trained to be a documentation NUT… and it drives me up the wall when I am rushed to just work on a system without a road map.
Thank you for this! I’m building as much as I can into Zabbix and it’s working out great.