Advanced Active Directory Design and Troubleshooting

Advanced Active Directory Design and Troubleshooting. Ed Whittington, Principal Software Engineer, HP Business Critical Call Center. Oct. 06, 2002.

Slides: 181
Provided by: bestitdoc
Learn more at: http://www.bestitdocuments.com
Title: Advanced Active Directory Design and Troubleshooting


1
  • Advanced Active Directory Design and
    Troubleshooting
  • Ed Whittington
  • Principal Software Engineer
  • HP Business Critical Call Center
  • Oct. 06, 2002

2
Topics
  • Troubleshooting Basics
  • Troubleshooting Tools
  • DNS Troubleshooting
  • Troubleshooting Replication
  • Troubleshooting DCPromo
  • Troubleshooting FRS Replication and DFS
  • Troubleshooting Group Policy
  • Troubleshooting in .NET

3
Troubleshooting Basics
4
Basic Troubleshooting Steps
  • Define the problem (make sure there is one)
  • What's failing?
  • Client authentication and security
  • Group policy application.
  • Replication.
  • Name resolution.
  • Errors and warnings in event logs.
  • FRS/DFS
  • Application
  • How is the problem replicated?
  • One or multiple machines?
  • Narrow the variables

5
Basic Troubleshooting Steps
  • MPSReports_DS (from HP or Microsoft)
  • Get the Log files
  • Event logs
  • http://www.eventid.net
  • %windir%\debug\usermode\Userenv.log
  • %windir%\debug\DCPromo.log
  • Turn on Verbose Logging
  • Run NetDiag, DCDiag (verbose)
  • Get status report from Replication Monitor.

6
Basic Troubleshooting Steps
  • Check DNS.
  • Resolver on ALL computers.
  • Name Server Properties (forwarding, etc.).
  • Monitoring tab: test name resolution.
  • Nslookup, ping to test name resolution.
  • Ping SRV records.
  • Check Replication.
  • Force replication.
  • Identify who isn't replicating to whom.
  • Outbound vs. inbound.

7
Basic Troubleshooting Steps
  • If all else fails, try demoting.
  • Cleans up a lot of problems if the problem is
    isolated to one DC.
  • If replication isn't working, demotion won't
    work.
  • Reinstall to remove the AD, then clean up AD:
  • Ntdsutil to remove server object.
  • Delete server object from Sites and Services.
  • Delete FRS server object from System container.
  • Can manually demote a DC.

8
Manual Demotion of a DC
  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ProductOptions
  • Value: ProductType
  • ServerNT (when the computer is a Member Server)
  • LanManNT (when the computer is a Domain
    Controller)
  • Change from LanManNT to ServerNT
  • It's now a "dirty" member server
  • Clean server objects from the AD (Ntdsutil)
  • Clean up the disk and Registry
  • Create new Forward Lookup Zone: Bogus.com
  • Run DCPromo: create new forest for Bogus.com
  • Demote and eliminate Bogus.com
  • Wait for replication
  • Promote back into domain; use same name if
    desired
  • Tool in Windows .NET
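
The ProductType check at the start of a manual demotion can be sketched as a small lookup. This is only an illustration of how the two strings on the slide map to roles; the function names are hypothetical, and the "WinNT" workstation entry is an added assumption that is not on the slide.

```python
# Roles keyed by ProductType string, compared case-insensitively.
# "servernt" and "lanmannt" come from the slide; "winnt" is an assumption.
ROLE_BY_PRODUCT_TYPE = {
    "servernt": "member server",
    "lanmannt": "domain controller",
    "winnt": "workstation",
}


def describe_product_type(value: str) -> str:
    """Return the role a ProductType registry string denotes."""
    return ROLE_BY_PRODUCT_TYPE.get(value.strip().lower(), "unknown")


def still_marked_as_dc(value: str) -> bool:
    """True while the registry still marks the machine as a DC."""
    return describe_product_type(value) == "domain controller"


if __name__ == "__main__":
    for sample in ("LanManNT", "ServerNT"):
        print(sample, "->", describe_product_type(sample))
```

Reading the value from a live registry is deliberately left out; the sketch only interprets a string that has already been read.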

9
Troubleshooting Tools
Gathering Information
10
Netdiag.exe
  • NETDIAG.EXE
  • /v - verbose; always turn this on.
  • /l - log; writes netdiag.log to default
    directory.
  • /d:<domain> - finds a DC in the domain.
  • /test: - runs only specified tests.
  • /skip: - skips specified tests.
  • Can't execute remotely.
  • C:\>netdiag /v /l

11
Netdiag.exe
  • Domain Controller Discovery
  • Bindings, IP address, Default Gateway tests
  • DNS tests
  • NBTstat and WINS ping
  • Netstat
  • Route
  • Trust
  • Kerberos

12
Dcdiag.exe
  • DCdiag /v
  • Domain controller functions of netdiag
  • More domain-specific
  • FSMO roles
  • Connectivity
  • Replications
  • Domain controller locator
  • Intersite health
  • Topology integrity

13
Nltest.exe
  • /server:<servername> - Sets default server
  • /dsgetdc:<domainname> - DsGetDcName API
  • Options: /gc /timeserv /ldap
  • /dclist:<domainname> - Lists DCs in domain
  • /parentdomain - Lists parent domain
  • /dsgetsite - Lists site of server
  • /dsgetsitecov - Lists DC covering site
  • /dcname:<domainname> - Lists PDC for domain
  • /dcpromo - Tests potential success
    of DCPromo
  • /whowill:<domain> <user> - Returns name of DC that
    will authenticate user
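
When these checks are scripted, it helps to build the Nltest argument list in one place. The helper below is a minimal sketch with a hypothetical name (`nltest_args`); it only assembles arguments in the `/switch:value` form listed above and does not run anything.

```python
def nltest_args(switch, value=None, *extra, server=None):
    """Build an argument list such as ['nltest', '/dsgetdc:corp.net', '/gc'].

    switch: switch name without the slash, e.g. 'dsgetdc' or 'dsgetsite'.
    value:  optional value appended after a colon.
    extra:  further switches passed through unchanged, e.g. '/gc'.
    server: optional target for a leading /server: switch.
    """
    args = ["nltest"]
    if server:
        args.append(f"/server:{server}")
    args.append(f"/{switch}:{value}" if value else f"/{switch}")
    args.extend(extra)
    return args


if __name__ == "__main__":
    print(" ".join(nltest_args("dsgetdc", "corp.net", "/gc")))
    print(" ".join(nltest_args("dsgetsite", server="dc1")))
```

The resulting list can be handed to `subprocess.run` on a machine that has the Support Tools installed; the domain and server names here are placeholders.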

14
Netdom.exe
  • /join
  • /add
  • /reset
  • /resetpwd
  • /query FSMO
  • /trust

15
NTDSUtil
  • Built-in utility.
  • Directly accesses Active Directory.
  • Authoritative Restore.
  • Can restore an older version of the AD and force
    it on all DCs to correct variety of problems.
  • Entire AD or single tree.
  • Can't restore the schema.
  • FSMO Roles.
  • List, Transfer, Seize roles.
  • Better than UI: can manipulate all roles in
    forest and all domains from one utility.

16
NTDSUtil
  • Metadata Cleanup
  • Delete orphaned objects.
  • Servers
  • Domains
  • The UI can and will lie to you! Don't trust it.
  • Useful tool for listing contents of the AD
  • Sites, domains, servers, FSMO role holders.
  • Domains in site.
  • Servers in domain, servers in site.
  • Q216364, Q216498, Q230306

17
Gpresult.exe
  • Run on client
  • Returns
  • Security group membership
  • User and Computer policy info
  • GPOs applied to each
  • Registry settings set in the GPO
  • Client-side extensions set
  • Scripts applied
  • Remember
  • Policy is cached: reboot / log in to clear
  • Note which server is the authenticating server
  • Environment variable: LOGONSERVER
  • Much Improved in .NET!

18
GPOtool.exe
  • Run on domain controller.
  • Returns
  • Analysis of all GPOs in domain.
  • GUID and friendly name of all GPOs.
  • DS and Sysvol versions.
  • Errors encountered.
  • Good group policy troubleshooting tool.
  • May take a long time to process (GPOs)

19
ADSIedit.exe
  • GUI much like Users and Computers snap-in
    with Advanced Features.
  • Graphical view of AD.
  • Like LDP.exe, but:
  • Easier to browse.
  • Can modify attribute values.
  • Don't confuse with Users and Computers!

20
LDP.exe
  • Takes time to set up
  • Connect
  • Bind
  • View Tree
  • Enter DN to start (blank for default)
  • Exposes attributes quickly, easy to see.
  • Faster than ADSIedit: no GUI to traverse.
  • LDAP searches.
  • Can delete and modify, but not as easy as
    ADSIedit.
  • Can execute remotely.

21
DCPromo.log, DCPromoui.log
  • Located in %systemroot%\debug.
  • Logged every time dcpromo runs.
  • DCPromo.log
  • Shorter.
  • Appended (read bottom up).
  • DCPromoUI.log and DCPromoUI.xxxx.log
  • Results of what is seen in the UI; longer.
  • Find results of DsGetDcName, DNS query, time
    service sync, authentication, replication, site
    info.
  • Error (0x0) = success, no error.
  • Error reporting differs: read both logs.

22
Userenv.log
  • Located in %systemroot%\debug\usermode
  • User environment info
  • Group policy (registry)
  • Client side extensions
  • Scripts
  • Security
  • Increase verbose logging (Q221833)
  • Take time to read and study it; you may be
    surprised at what you can find!

23
Additional User Mode Logs
  • Client-side extensions
  • Registry: see Q216357
  • HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions
  • Error logs created in %windir%\debug\usermode
  • Named after the .dll
  • Scripts: gptext.dll, gptext.log
  • Folder Redirection: fdeploy.dll, fdeploy.log
  • Security: scecli.dll, winlogon.log
  • Q245422
  • Produced automatically on error (except
    winlogon.log)
  • Check User Mode directory for these files
  • Invaluable in debugging. Use them!

24
Client Side Extensions (registry)
25
Windows .NET Troubleshooting Tools
26
Remote Desktop Resource Redirection
  • Client resources available when using Terminal
    Services Remote Desktop:
  • File System: Local drives and network drives on
    the local machine are available on the remote machine.
  • Audio: Audio streams such as .wav and .mp3 files
    can be played through the client sound system.
  • Port: Applications have access to the serial and
    parallel ports.
  • Printer: The default local or network printer on
    the client becomes the default printing device
    for the Remote Desktop.
  • Clipboard: The Remote Desktop and client
    computer share a clipboard.
  • Terminal Services Virtual Channel: Application
    Programming Interfaces (APIs) are provided to
    extend client resource redirection for custom
    applications.

27
WMI
  • Computer management
  • Active Directory
  • Provider: MicrosoftActiveDirectory
  • Classes
  • Replication - see replprov.mof in %windir%\system32
  • Trust health
  • Provider: MicrosoftHealthMonitor
  • Classes: see system32\wbem\trusthm.mof
  • DNS
  • Provider: MicrosoftDNS
  • Classes: system32\wbem\dnsprov.mof
  • Cluster
  • MSCluster
  • Also look in CIM Studio in MSDN

28
WMIC Sample Commands
  • Look in the .mof files in %windir%\system32\wbem for
    names of providers, classes, etc.
  • Active Directory
  • Provider: MicrosoftActiveDirectory
  • wmic /namespace:\\root\microsoftactivedirectory
    PATH msad_replneighbor
  • (shows replication partners)
  • wmic /namespace:\\root\rsop\user PATH RSOP_GPO
  • (lists GPOs with User settings)

29
Admin Tool Improvements
  • Users and Computers snap-in
  • Drag and drop.
  • Multi-select and edit user objects.
  • Heavily revised object picker.
  • Users and Computers, Sites and Services, DNS
    Snap-ins
  • Saved queries.
  • Viewing Saved DS, DNS, FRS eventlogs on non-DCs!
  • .NET Adminpak (only on XP)

30
Command Line Tools
  • GPresult
  • Enhanced reporting
  • DCDiag
  • dcdiag /test:DCPromo
  • Repadmin: enhanced reporting
  • Netdom computername: for DC rename
  • Others
  • Shipped on
  • Service Pack 2 CD (install manually)
  • .NET Server, AdvSvr CD

31
Windows .NET Improvement to NTDSUtil
  • Change Offline, DS Repair Mode Password While
    Online!
  • NTDSUtil
  • Set DSRM Password (main menu)
  • Increases server up-time, which was limited by the
    password change interval in Win2K.
  • (Had to reboot to DS Repair mode to change.)
  • Q223301 (Win2K limit)
  • Cool error message!
  • Setting password failed.
  • WIN32 Error Code: 0x6ba
  • Error Message: The RPC server is
    unavailable.
  • See Microsoft Knowledge Base article Q271641 at
  • http://support.microsoft.com for more
    information.

32
Errors in Windows .NET: Kinder, Gentler, and
Report to Microsoft
33
Active Directory Load Balancing Tool
  • Does the job of branch office deployment.
  • KCC chooses the bridgehead server (BHS) for connection
    objects and keeps choosing the same one.
  • Tool allows you to spread the load to other DCs
    in the site (that have that NC).
  • ADLB tool modifies the hub DCs' replication
    schedules to spread it out over time.
  • Generates a log like Replmon's status log.
  • For deployments with hundreds of branch offices
    all replicating to a single hub.
  • No benefit to sites with only one DC per
    domain.
34
Future Graphical Replication Monitoring Tool
  • Very much like Age of Directories
  • Ability to make configuration changes
  • Not in .NET - maybe Longhorn or Blackcomb?

35
Troubleshooting DNS
36
DNS Resolver Configuration
  • Win2K clients, servers point to Win2K DNS Name
    Server that is SOA for their zone.
  • Don't point to ISP or other internal NS
  • (even as an additional server).
  • Keep it simple.
  • Win2K Name Servers forward to ISP or internal
    name server hosting registered domain.

37
DNS Name Server Configuration Basics
  • Dynamic updates: Yes.
  • Active Directory Integrated (ADI) zone
  • Select one Primary
  • All other ADI Primary NS point to it for DNS
  • Win2K name servers can:
  • Forward to ISP or Internal NS.
  • Use root hints (or modify root hints).
  • Reverse Lookup Zones NOT required
  • Needed only for tools - NSLookup

38
ADI Primary and Standard Secondary mixed zone
  • Only a DC can host an ADI primary zone
  • Member Servers can host Secondary zone
  • Synch off of an ADI Primary

[Diagram: three ADI Primary name servers and two Secondary name servers]
39
DNS Case Study
[Diagram labels: Forwarding; Zone xfers; Secondary zones; corp.net, na.corp.net, sa.corp.net, eu.corp.net]
40
DNS Case Study
[Diagram labels: corp.net, na.corp.net, sa.corp.net, eu.corp.net; "find na.corp.net"]
41
With Conditional Forwarding Feature in Windows
.NET Server
[Diagram labels: corp.net, na.corp.net, sa.corp.net, eu.corp.net; "find na.corp.net"]
42
Problem: SRV records only in root domain
Location of SRV, PDC, GC, CNAME
[Diagram labels: w2k.net; corp.com; Zone Xfer; Forwarder; EU.w2k.net; NA.w2k.net]
43
Solution: Delegate _msdcs zone
Location of SRV, PDC, GC, CNAME
[Diagram labels: corp.com (_msdcs, _tcp, _sites, _udp); w2k.net (_msdcs); Delegation; Forwarder; EU.w2k.net; NA.w2k.net]
44
DNS Hotfix
  • Symptom: Replication breaks
  • Configuration: Using secondary zones for root
    _msdcs at child domains.
  • Problem: Serial number of secondary zone is
    higher than the primary; zone transfers stop.
  • Hotfix: Q304653
  • "The Serial Number Is Decremented in DNS When You
    Reboot"
  • Solved in .NET

45
DNS Troubleshooting Basics
  • Check DNS event log (and others).
  • Check Location of DNS servers.
  • Usually want Name Server in remote sites.
  • Check population of SRV records.
  • _msdcs _tcp _udp _sites
  • Need Kerberos, LDAP records for each DC.
  • Correct address, etc.
  • Can delete, repopulate by restarting netlogon.
  • Check delegations: correct names, IP.

46
DNS Troubleshooting Basics
  • Use of Active Directory Integrated (ADI) zones.
  • Put standard secondary zones on member servers.
  • Can clear problems by switching to Standard Primary.
  • Ping DC by SRV record
  • ping <guid>.site._msdcs.compaq.com.
  • Clear the server cache.
  • Negative caching problems.
  • Test: server Properties, Monitoring tab.
  • Test: ping names, NSLookup.
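
To check SRV population systematically, the expected record names can be generated and then fed to NSLookup one by one. The sketch below lists a subset of the well-known locator names under _msdcs, _tcp, _udp and _sites, plus the GUID alias in the root _msdcs zone; the function names and the example domains are assumptions for illustration, and the list is not exhaustive.

```python
def locator_records(domain, forest_root=None, site=None):
    """Return common DC locator SRV names for a domain (not exhaustive)."""
    forest_root = forest_root or domain
    names = [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_kerberos._udp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.pdc._msdcs.{domain}",
        f"_ldap._tcp.gc._msdcs.{forest_root}",
    ]
    if site:
        # Site-specific record used by site-aware clients.
        names.append(f"_ldap._tcp.{site}._sites.{domain}")
    return names


def guid_alias(dsa_guid, forest_root):
    """Return the CNAME owner name that maps a DC's GUID to its FQDN."""
    return f"{dsa_guid.strip('{}').lower()}._msdcs.{forest_root}"


if __name__ == "__main__":
    for name in locator_records("na.w2k.net", "w2k.net", site="Seattle"):
        print(name)
```

Each printed name can be queried with `nslookup -type=SRV <name>`; a missing answer points at the DC that needs Netlogon restarted.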

47
Troubleshooting AD Replication
48
Replication Troubleshooting Tools
  • Event logs: Directory Services, System
  • Sites and Services snap-in
  • Age of Directories (AOD), HP
  • Replication Monitor
  • Aelita Event Admin
  • NetPro Directory Analyzer
  • Command line (Support Tools, Resource Kit)
  • DCdiag, Netdiag
  • Repadmin.exe

49
Event Logs for Replication Troubleshooting
  • Directory Services Log
  • 5778 - Subnets not mapped.
  • Will break clients' site awareness.
  • 1311 - serious - Not enough connectivity.
  • Connectivity, traffic issue.
  • Sites with DCs and no site links.
  • Site topology incorrectly defined.
  • DNS Lookup failure.
  • 1722 - RPC server is unavailable.
  • Physical connectivity.
  • DNS.

50
Event Logs for Replication Troubleshooting
  • System Log
  • Netlogon errors
  • Authentication
  • Trusts
  • Secure channel
  • w32Time errors
  • Kerberos authentication required for replication
  • DCs must be no more than five minutes out of
    sync.
  • Watch time zones!
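
The five-minute rule and the time zone warning can be illustrated together: two clocks that look hours apart may be in sync once their offsets are applied, and two that look identical may not be. This is a minimal sketch; the function name is hypothetical, and the five-minute tolerance is the figure from the slide.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # tolerance quoted on the slide


def within_kerberos_skew(clock_a, clock_b, max_skew=MAX_SKEW):
    """Compare two timezone-aware timestamps as absolute instants."""
    if clock_a.tzinfo is None or clock_b.tzinfo is None:
        raise ValueError("use timezone-aware datetimes so offsets are applied")
    return abs(clock_a - clock_b) <= max_skew


if __name__ == "__main__":
    utc_dc = datetime(2002, 10, 6, 12, 0, tzinfo=timezone.utc)
    eastern = timezone(timedelta(hours=-5))
    # 07:03 at UTC-5 is 12:03 UTC: only three minutes apart.
    print(within_kerberos_skew(utc_dc, datetime(2002, 10, 6, 7, 3, tzinfo=eastern)))
    # 12:03 at UTC-5 is 17:03 UTC: same wall clock, wrong zone.
    print(within_kerberos_skew(utc_dc, datetime(2002, 10, 6, 12, 3, tzinfo=eastern)))
```

The second case is the common failure: the wall clock was set to match by hand while the time zone stayed wrong.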

51
Sites and Services Snap-in
  • Check for duplicate connection objects.
  • KCC generating >1 connection between 2 DCs.
  • Delete all connections and select the Check
    Replication Topology option to regenerate them.
  • If they come back, find out why.
  • Usually a DNS problem.
  • Breaks FRS and AD replication.

52
Sites and Services Snap-in
  • Check for sites with no DCs
  • OK to have a site with no servers if you plan it
    that way.
  • If there should be a server in that site, find it
    and move it there.
  • Make sure all subnets are mapped to correct
    sites.
  • Keep up on IP addressing changes.

53
Sites and Services Snap-in
  • Make sure site links are correct.
  • Link correct sites per design (need a drawing).
  • Cost, schedule, replication frequency.
  • Force replication between DCs.
  • All connections are inbound.
  • Use check replication topology.
  • Create new site, user named for the DC.
  • Checks Configuration NC and Domain NC.
  • Force Replication Between Replication Partners.
  • On DC1 from DC2 and on DC2 from DC1.

54
Sites and Services Snap-in
  • Validate inbound, outbound replication on all
    DCs.
  • Create new site, user named for the DC.
  • Checks Configuration NC and Domain NC.
  • Wait for replication (don't force it).
  • Check each DC for copy of these users, sites.

[Diagram: DC1, DC2, and DC3, each with a User/Site table listing the test users and sites (named for DC1, DC2, DC3) present on that DC]
55
Check Cname DNS Records
  • In root _msdcs zone (only), alias record mapping
    DC's FQDN to its server GUID.
  • Only one record.
  • Delete duplicates.
  • Match GUID in alias record to GUID reported by
    Repadmin /showreps.
  • If in doubt, delete the DC's alias record(s) and
    restart Netlogon on the broken DC to re-register.

56
Age Of Directories Tool - Demo
  • If interested, contact me ed.whittingtonn_at_HP.com

57
Replication Monitor
  • Status report (replication health report)
  • List of all GCs, BHS, Trusts
  • List of all replication errors on all DCs in
    domain
  • Changes not replicated
  • Replication partners
  • Force push/pull replication
  • Meta-data
  • Group Policy Object status
  • FSMO validation
  • Inbound connections (including reason)

58
Replication Monitor
59
Command-Line Utilities
  • RepAdmin
  • In Support Tools.
  • Perhaps the most useful tool for troubleshooting
    replication.
  • /showreps - lists inbound, outbound connections.
  • Only one to list outbound connections.
  • Lists Server GUID (used for replication).
  • Lists successful replication messages.
  • Lists replication errors.
  • Lists Replication partner used to replicate every
    naming context inbound and outbound.

60
NTDS Diagnostic Logging
  • HKLM\System\CurrentControlSet\Services\NTDS\Diagnostics
  • Set value 0-5
  • 0 = off, 5 = very verbose
  • Start with 3
  • Reported in Event log
  • Important Values
  • 1 Knowledge Consistency Checker
  • 13 Name Resolution
  • 5 Replication Events
  • 8 Directory Access
  • 9 Internal Processing
  • 18 Global Catalog
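
One way to apply these settings consistently across DCs is to generate a .reg file rather than editing each value by hand. The sketch below assumes the standard "Windows Registry Editor Version 5.00" file header and restricts itself to the six value names listed above; the function name is hypothetical.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"

# Value names as listed on the slide.
ENTRIES = (
    "1 Knowledge Consistency Checker",
    "5 Replication Events",
    "8 Directory Access",
    "9 Internal Processing",
    "13 Name Resolution",
    "18 Global Catalog",
)


def diagnostics_reg(levels):
    """Render .reg text that sets the given entries to levels 0-5."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY}]"]
    for name, level in levels.items():
        if name not in ENTRIES:
            raise ValueError(f"unknown diagnostics entry: {name}")
        if not 0 <= level <= 5:
            raise ValueError("level must be between 0 and 5")
        lines.append(f'"{name}"=dword:{level:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(diagnostics_reg({"5 Replication Events": 3, "13 Name Resolution": 3}))
```

Remember to generate a second file that sets the same entries back to 0 once the problem is captured, since verbose logging fills the event log quickly.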

61
Things That Break Replication (or Indicate That
It's Broken)
  • Duplicate connection objects
  • Orphaned objects
  • Esp. DC objects, caused by a DC being removed
    from the domain without successful DCPromo.
  • Garbage Collection initiated manually before all
    DCs and GCs are fully replicated.
  • Reported in event logs.

62
Things That Break Replication (or Indicate That
It's Broken)
  • DC unavailable
  • Down
  • Name Resolution
  • Network problem
  • DNS misconfigured
  • TCP/IP addresses change
  • Delegation
  • Client resolver configuration (including name
    servers)
  • DHCP scope configuration for DNS registration
  • Failure to Contact a DNS server (for SRV records)

63
Things That Break Replication (or Indicate That
It's Broken)
  • KCC doesn't do its job
  • Routes around inaccessible DCs by creating
    duplicate connection objects.
  • When DCs come back on line, KCC should clean up
    the duplicate connection objects.
  • Usually doesn't.
  • Causes replication errors.
  • Events in the DS Log.
  • Need to clean them up manually.

64
Lingering Object Behavior
  • Basics
  • Scenerios

65
Object Deletions
  • Deleted objects turn into tombstones
  • Tombstones replicated to other DCs
  • This is how replication partners learn that an
    object was deleted
  • Tombstones purged from local database after
    tombstone lifetime has expired
  • AD: 60 days, adjustable (2 days minimum)
  • Sysvol: 60 days
  • If tombstone does not replicate to a DC, object
    deletion is not replicated
  • Object not deleted on this DC
  • Object is now a Lingering Object
  • Can be on DC or GC
  • Rule: tombstone lifetime =
  • Max time DC can be disconnected
  • Max lifetime of Backup tape

66
Lingering Objects Scenarios
  • Deleted object re-appears on all domain
    controllers in a domain and on all GCs
  • Deleted account does not disappear from Exchange
    GAL
  • Object was moved between domains and disconnected
    GC is brought online
  • Replication error on GC when new object is
    created
  • Lingering object still holds attribute where
    uniqueness is enforced (samAccountName)
  • Exchange cannot create mailbox because object
    already exists

67
Why Does This Happen?
  • DCs disconnected for more than tombstone lifetime
  • Left in storage room for long time
  • Replication failures
  • E.g., bridgehead servers overloaded, no
    monitoring in place
  • WAN connections down for a long time
  • Tombstone lifetime abuse
  • Somebody changed time on a DC to garbage
    collect an object
  • Tombstone lifetime was changed to garbage collect
    objects on single servers
  • Can this be avoided?
  • YES, monitor KCC topology and replication
  • Do not set tombstone lifetime to less than 60
    days
  • DCs offline > tombstone lifetime must be
    re-promoted

68
Lingering Objects: Strict vs. Loose Replication
Behavior
  • Replication Behavior
  • Defines how DC reacts if an update for an object
    is replicated in, and the object does not exist
    on DC
  • Loose Behavior
  • DC requests full copy from replication source
  • Logs event ID 1388
  • Strict Behavior
  • DC stops replication from offending replication
    source
  • Logs error code 8240 (ERROR_DS_NO_SUCH_OBJECT)
    embedded in event ID 1084
  • Requires logging level 1
  • Behavior can be set via registry key
  • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NTDS\Parameters\Strict Replication Consistency
  • Introduced in Q314282

69
Deleting Lingering Objects
  • If found on a DC
  • In loose behavior: delete the object via Users
    and Computers
  • In strict behavior: follow procedures outlined in
    Q314282
  • On GC (in read-only NC)
  • Object cannot be changed or deleted on GC
  • Solution 1: Delete object on writeable replica
    (if possible)
  • Solution 2: Use LDP to delete the object on the
    GC
  • Support to remove lingering objects from GC added
    in Q314282
  • Follow procedures outlined in Q314282
  • You might have to set loose behavior temporarily

70
Best Practice Recommendations
  • DC has not replicated for more than 60 days
  • Tombstone lifetime default (60 days)
  • Do not replicate, re-install OS
  • Tombstone lifetime adjusted to > 60 days
  • 60 days < time DC disconnected < tombstone
    lifetime
  • Re-connect DC, restore sysvol
  • Time DC disconnected > tombstone lifetime
  • Do not replicate, re-install OS
  • If you have to disconnect a DC
  • Make sure that it replicates successfully before
    you take it off-line
  • New deployments
  • Add registry key to enforce strict replication
    behavior at DC OS installation time
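
The reconnection rules above reduce to two comparisons, which can be written down as a small decision function. This is a sketch of the slide's guidance only; the function name is hypothetical, and the final branch (a DC offline for 60 days or less simply reconnects) is an assumption, since the slide does not cover that case.

```python
SYSVOL_TOMBSTONE_DAYS = 60  # Sysvol figure quoted earlier in the deck


def reconnect_advice(days_disconnected, tombstone_lifetime_days=60):
    """Return the recommended action for a DC that has been offline."""
    if days_disconnected > tombstone_lifetime_days:
        # Offline longer than the AD tombstone lifetime.
        return "do not replicate; re-install OS"
    if days_disconnected > SYSVOL_TOMBSTONE_DAYS:
        # Only possible when the AD tombstone lifetime was raised above 60.
        return "re-connect DC; restore sysvol"
    # Assumption: not covered by the slide.
    return "re-connect DC"


if __name__ == "__main__":
    print(reconnect_advice(61))         # default lifetime, too long offline
    print(reconnect_advice(90, 180))    # lifetime raised to 180 days
    print(reconnect_advice(200, 180))
```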

71
More Best Practice Recommendations
  • Existing deployments
  • Default setting: loose replication (even on SP3)
  • Goal: get to strict mode ASAP
  • Set registry key to strict mode on all DCs
  • Watch event logs on DCs
  • If you get many replication errors on single DCs,
    re-promote DC
  • For small number of replication errors, clean-up
    the DC
  • Delete lingering objects if necessary
  • Follow procedures outlined in Q314282
  • If you were monitoring
  • Then don't worry, you won't see any replication
    errors.
  • Don't lower tombstone lifetime to less than 60
    days
  • Monitor!

72
Lingering Object Fix
  • Q317097 (good instructions)
  • HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  • Add value name: Correct Missing Object
  • Data type: REG_DWORD
  • Value: 1 (tight)
  • 0 (loose)
  • Allows or restricts AD replication when lingering
    objects are discovered.
  • Tight: when you want to know.
  • Loose: to inventory and remove the objects.

73
Value Level Replication
  • Windows NT: object replication
  • Change to attribute or value replicates the object
  • W2K: attribute-level replication
  • Better than NT (more efficient)
  • Change to attribute replicates attribute
  • Change to value replicates attribute
  • Problem: multi-valued attributes
  • Group = attribute
  • Member = value
  • Change a member: replicates the attribute with all
    members
  • Impacts network traffic
  • Limit (per Microsoft) of 5,000 users/group
  • .NET: value-level replication
  • Replicates values, not attributes
  • Eliminates 5,000 user/group limit
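
A rough count of member values sent on the wire shows why the group size limit matters under attribute-level replication. The model below is a deliberate simplification (one replication per added member, no compression or batching), and the function name is hypothetical.

```python
def values_sent(initial_members, additions, value_level):
    """Count member values replicated while adding members one at a time.

    Attribute-level: every change resends the whole member list.
    Value-level: every change sends only the value that changed.
    """
    sent = 0
    size = initial_members
    for _ in range(additions):
        size += 1
        sent += 1 if value_level else size
    return sent


if __name__ == "__main__":
    # Adding the 5,000th member to a group of 4,999.
    print("attribute-level:", values_sent(4999, 1, value_level=False))
    print("value-level:    ", values_sent(4999, 1, value_level=True))
```

Under this model the cost of attribute-level replication grows with group size, while value-level replication stays at one value per change.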

74
Domain Limit
  • There is a limit of about 800 child domains to a
    single parent
  • Child domains are an unlinked, multi-valued
    attribute stored in the crossref attribute of
    the domain object
  • Jet database limits the data that can be stored.
    No way to patch; must change Jet
  • Might be improved in Longhorn (not Whistler)

75
Domain Limit
  • One customer got to 900 domains
  • Replication failed
  • Authentication failed
  • Mission critical application failed
  • Temporary Repair
  • Demote all domains in reverse order of creation
    to return to 800
  • Fixed Replication
  • Solution
  • Redesigned and redeployed to a single domain

76
DCPromo Troubleshooting
77
DCPromo Basics
  • First test of:
  • DNS registration and resolution.
  • LDAP query and response.
  • Kerberos authentication.
  • Active Directory replication.
  • FRS replication.
  • Application of group policy.
  • Validation and Flow
  • Chapter 2, Active Directory Data Storage in the
    Windows 2000 Resource Kit

78
DCPromo Logs
  • %windir%\debug
  • Dcpromo.log
  • Dcpromoui.log
  • Dcpromoui.xxx.log
  • Set verbosity on dcpromoui.log
  • HKLM\Software\Microsoft\Windows\CurrentVersion\AdminDebug
  • Values: DCPromo and DCPromoUI
  • Data:
  • 380001: default
  • 0xFF003: full file and debugger logging output
  • 0xFF001: maximum detail to DCPromoui.log

79
DCPromo Phases
  • Initialization
  • UI Input - DNS Name resolution
  • LDAP Query/resp - Kerberos Authentication
  • AD Replication
  • FRS Replication
  • Wrap Up
  • Apply policy - Upgrade Trusts
  • Publish new DC in the DS

80
Initialization Phase
  • Authorization error
  • Enterprise Admin required to create new domain
    (or to remove the last one).
  • Domain Admin required to add replica DC (or
    demote a replica).
  • Can't find DNS with Dynamic Updates.
  • Prompt to let DCPromo configure DNS.
  • Creating domain.
  • Answer NO!
  • Replicas and child domains must find a DNS server
    to locate a sourcing DC.

81
Errors Creating the Computer Account
  • Need privileges to create the account.
  • First creates the account, puts it in
    domain/computers container.
  • Then puts it in domain controllers OU.
  • Source DC identified in DCPromo logs.

82
DCPromo Initialization Checklist
  • Privileges required
  • Enterprise Admin if creating new domain.
  • Domain Admin if creating a replica.
  • System time configured properly
  • Kerberos requires sync within five minutes.
  • All parent, child domain DCs.
  • Sufficient free disk space.
  • 850 MB
  • Domain Naming Master FSMO required if creating
    new domain.

83
DCPromo Initialization Checklist
  • Everyone or Enterprise DC group has the "Access
    this computer from network" right
  • Enterprise DC group rights
  • Manage Replication Topology.
  • Replicating Directory Changes.
  • Replication Synchronization.
  • Sourcing DC
  • Security policy applied.
  • Enable Computer and user account to be trusted
    for delegation.

84
DCPromo Initialization Checklist
  • Target DC has valid Kerberos tickets.
  • Kerbtray.exe utility from Resource Kit.
  • GC must be contacted.
  • Nltest /dsgetdc:compaq.com /GC
  • Able to contact a functional existing DC.
  • Uses UDP (watch for firewall issues).
  • Can use TCP, but it's a Microsoft secret!
  • Use Ping, NLTest, Nslookup to find a DC.

85
If Source DC not Reachable...
  • See if one responds.
  • Ping FQDN of domain (Ping compaq.com).
  • NLTest /dsgetdc:compaq.com /ds
  • Other: /gc /pdc /timeserv
  • Check site mapping for this computer.
  • Nltest /server:<name> /dsgetsite
  • Check Dcpromoui.log to see source.
  • Force DCPromo to use a specific source
  • Q224390
  • Turn off Netlogon on other DCs.
  • Join the Server to the domain then DCPromo.

86
Info to Collect for Debug
  • Netdiag /v
  • Problem DC
  • Source DC (see dcpromo.log)
  • DCDiag /v
  • Source DC
  • Replication working? (other DC in site)

87
AD and FRS Replication Phases
  • Initially, inbound connection created to replicate
    from source DC.
  • Machine account (DC1) moved to DC OU.
  • UserAccountControl attribute set:
  • 4096 (1000 hex): Workstation/Server
  • 532480 (82000 hex): DC
  • Account is moved.
  • Error: DC1 not found, access denied, etc.
  • Credentials of account running Dcpromo
  • Source must have computer object.
  • Source must have security policy applied to
    itself.
  • Q250874
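
The two UserAccountControl numbers are bit masks, so decoding them shows exactly what changes during promotion. The sketch below covers only the three flags needed to explain 4096 and 532480; the helper names are hypothetical.

```python
# Subset of userAccountControl flags relevant to the two values above.
UAC_FLAGS = {
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}


def decode_uac(value):
    """Return the names of the known flags set in value, lowest bit first."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


def is_dc_account(value):
    """True when the account is flagged as a domain controller."""
    return bool(value & 0x2000)


if __name__ == "__main__":
    for value in (4096, 532480):
        print(value, hex(value), decode_uac(value))
```

A source DC that still reports 4096 for the new DC has not received the update, which matches the missing Sysvol and Netlogon share symptom discussed later.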

88
AD and FRS Replication Phases
  • After first reboot
  • Outbound connection created.
  • AD changes for new DC replicated to source.
  • Including UserAccountControl attribute.
  • Server (Replication) object.
  • Replicated to other DCs.
  • Sysvol is populated (policies copied to new DC).
  • Sysvol and Netlogon Shares created.

89
Troubleshooting Missing Sysvol, Netlogon Shares
  • Outbound connection failed
  • Look in Sites and Services or Repadmin
  • UserAccountControl still 4096 on source
  • Q257338: good, but:
  • Build manual outbound connection
  • Force KCC to Check Replication Topology
  • Check UDP traffic if in a remote site.

90
Missing Sysvol and Netlogon Shares
  • Create replication links manually then force
    replication
  • Repadmin /add (adds outbound link)
  • Repadmin /sync (forces replication)
  • Can't create the shares manually. When replication
    is fixed, they'll get created.

91
Tracking Down a GUID
  • Problem: GUID referenced in event log. What is
    it?
  • Solution: (Q216359)
  • LDP search for the GUID
  • Search.vbs in Support tools
  • Orphaned Object (will kill replication)
  • Turn up NTDS diagnostic logging
  • Internal processing
  • Replication
  • Find object (GUID) in event logs
  • Delete it via LDP
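
For the LDP search, the GUID string from the event log has to be turned into an escaped binary filter, because objectGUID is stored as bytes with the first three fields in little-endian order. The helper below is a minimal sketch with a hypothetical name; it relies on the standard library's `uuid.UUID.bytes_le` for the byte ordering.

```python
import uuid


def guid_to_ldap_filter(guid_text):
    """Convert a GUID string into an (objectGUID=...) search filter.

    Each byte is written as a backslash followed by two hex digits,
    in the mixed-endian order used for stored GUIDs.
    """
    raw = uuid.UUID(guid_text).bytes_le
    escaped = "".join(f"\\{byte:02x}" for byte in raw)
    return f"(objectGUID={escaped})"


if __name__ == "__main__":
    # Example GUID is made up for illustration.
    print(guid_to_ldap_filter("01234567-89ab-cdef-0123-456789abcdef"))
```

Paste the resulting filter into LDP's search dialog with a subtree scope from the domain root or the Configuration NC, depending on where the object is expected to live.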

92
DCPromo Improvements in Windows .NET
93
Install From Media (IFM)
  • Source Replica AD from Media in DCPromo
  • GCs or DCs (Replica only).
  • No initial replication from a DC.
  • Faster (no searching for a DC).
  • Less network impact (No full sync on the WAN).
  • Easy branch office installation.
  • After initial load, replicates changes.
  • Network connectivity still required.
  • Unattended Answer File Support
  • ReplicateFromMedia
  • ReplicationSourcePath

94
Install From Media (IFM)
  • Unattended Answer File Support
  • ReplicateFromMedia
  • ReplicationSourcePath
  • Media must be local drive.
  • Media useful life < 60 days.
  • How? Use backup files/media
  • Create first DC in domain.
  • Back up DC.
  • Restore to media (local disk, CD, ...).
  • C:\>dcpromo /adv
  • Wizard produces an additional screen

96
DCPromo Answer File
  • See Q223757
  • [Unattended]
  • Unattendmode=fullunattended
  • [DCINSTALL]
  • UserName=administrator
  • Password=Password3
  • UserDomain=corp.net
  • DatabasePath=c:\windows\ntds
  • LogPath=c:\windows\ntds
  • SYSVOLPath=c:\windows\sysvol
  • SafeModeAdminPassword=Password2
  • CriticalReplicationOnly
  • SiteName=Seattle
  • ReplicaOrNewDomain=Replica
  • ReplicaDomainDNSName=corp.net
  • ReplicationSourceDC=
    ! Leave this blank for IFM
  • ReplicateFromMedia=yes
  • ReplicationSourcePath=e:\DSrestore
  • RebootOnSuccess=yes
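
For branch office rollouts it can be easier to generate the answer file per site than to edit copies by hand. The sketch below assumes the usual INI layout ([Section] headers and key=value lines) and uses only a few of the keys from the slide; the function name and the sample values are illustrative, and credentials are deliberately left out.

```python
def render_answer_file(sections):
    """Render {section: {key: value}} as INI-style text, preserving order."""
    lines = []
    for section, settings in sections.items():
        lines.append(f"[{section}]")
        for key, value in settings.items():
            lines.append(f"{key}={value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


# Install-from-media replica; ReplicationSourceDC stays blank for IFM.
ifm_answers = {
    "Unattended": {"Unattendmode": "fullunattended"},
    "DCINSTALL": {
        "SiteName": "Seattle",
        "ReplicaOrNewDomain": "Replica",
        "ReplicaDomainDNSName": "corp.net",
        "ReplicationSourceDC": "",
        "ReplicateFromMedia": "yes",
        "ReplicationSourcePath": r"e:\DSrestore",
        "RebootOnSuccess": "yes",
    },
}

if __name__ == "__main__":
    print(render_answer_file(ifm_answers))
```

A plain dictionary is used instead of `configparser` so that key capitalization and order are written exactly as given.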

97
File Replication Service (FRS) Basics
98
FRS Background
  • File Replication Service
  • Replicates file system portion of policy
  • Optional replication engine for DFS
  • Concepts
  • Challenges
  • Journal wraps
  • Staging File backlog
  • Reconciliation / Morphed Directories

99
Concepts
  • Objects in DS
  • Members, Subscribers, Conn. objects, filters
  • Depends on AD replication
  • Determines partners and schedule
  • NTFS USN Journal
  • Used by FRS to track changes to NTFS volumes
  • Staging File and Directory
  • Rename safe
  • Compression support
  • Database
  • Record of incoming, outgoing, and existing files

100
File Replication Service (FRS)
  • Replaces NT 3.x/4.0 LMREPL service
  • Replicates System Policy, Group Policy, DFS
  • Group policy templates
  • Ntconfig.pol, logon scripts for down-level
    clients
  • NETLOGON Share
  • DFS share contents
  • Multi-threaded replication engine
  • Replicate different files to different computers
    simultaneously.

101
Terminology
  • Computers A and B replicate DFS/SYSVOL
  • B is computer A's outbound partner
  • A is B's inbound partner.
  • A is B's upstream partner
  • Changes flow downstream to B

[Diagram: replication flows downstream from Computer A (upstream; B's inbound partner) to Computer B (A's outbound partner).]
102
Basic Operation
103
File and Folder Filters
  • Excluded from FRS Replication
  • Computer specific EFS files/folders
  • File names beginning with ~
  • Files with .bak or .tmp extensions
  • NTFS Mount Points
  • Reparse points
  • Configurable for DFS shares
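
To check quickly whether a file name would be caught by the name-based filters above, a small matcher is enough. This is only a sketch: the function name is hypothetical, the tilde prefix is an assumption (the prefix character is missing from the slide text), and EFS, mount point, and reparse point exclusions are not modeled.

```python
from fnmatch import fnmatch

# Name-based exclusions from the slide; "~*" is an assumed reading of
# "file names beginning with".
DEFAULT_FILE_FILTERS = ("*.bak", "*.tmp", "~*")


def is_filtered(file_name, patterns=DEFAULT_FILE_FILTERS):
    """Return True if file_name matches any exclusion pattern."""
    name = file_name.lower()  # compare case-insensitively
    return any(fnmatch(name, pattern) for pattern in patterns)
```

A file that unexpectedly never replicates is worth testing against the filter list before suspecting FRS itself.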

104
The Replication Process
AD Object version updated
DC1
\winnt\sysvol\sysvol\compaq.com\policies
\winnt\sysvol\staging\domain
\winnt\sysvol\staging areas\compaq.com
Notify Partners
105
The Replication Process
DC2
Pull
Sysvol version of GPO updated
DC1
\winnt\sysvol\sysvol\DO_NOT_REMOVE_ntfrs_PreInstall_Domain
\winnt\sysvol\sysvol\compaq.com\policies
106
FRS Replication
  • Observe File Replication Process
  • Edit a group policy: modify and save it.
  • Copy of changed file goes to staging and staging
    areas directories.
  • Copied to staging/staging areas directories on
    other DCs.
  • Moved to sysvol\sysvol directory on the DC.
  • Group policy file is updated.

107
Distributed File System (DFS)
108
DFS Basics
  • Domain-based (Win2K) vs Standalone (NT)
  • Root
  • Must be on a DC.
  • Contains PKT.
  • DFS service.
  • Replica
  • PKT from DC, stored locally.
  • DC or Member Server.
  • FRS Replicates Data between DCs
  • Member servers: DFS replicates data to the share
    via the DFS service.
  • Site Aware (clients locate closest DFS Replica)

109
The DFS Replication Process
DC1 - Root
DFS service
FRS
SVR1 Replica
SVR2 Replica
DC2 Replica
110
DFS Troubleshooting
  • Symptom: Shared folders not in sync.
  • Make Sure DFS service is started on all servers
    and DCs.
  • Make sure AD Replication is working.
  • Make sure FRS is working.
  • DFSUtil.exe.
  • Watch for applications that keep files open.
  • Anti-virus.
  • Defragmenters.

111
FRS Troubleshooting Techniques
112
Basics
  • Remember:
  • You MUST install the latest service pack and hot fix.
  • Post SP2 (SP3) Hot fix Q307319
  • Don't go any further until this is installed.
  • Multi-master replication propagates changes
    (and problems) quickly. Stop the FRS service
    to get control.
  • FRS depends on AD Replication, which depends on
    DNS.

113
Diagnostic Tools
  • Event Viewer: FRS log, DS log
  • NTFRSutl.exe
  • /outlog: outbound logs
  • /inlog: inbound logs
  • /ds: directory service
  • NTFRSxxx.log in \winnt\debug
  • NTFRS Health Check utility
  • HP, Microsoft
  • Netdiag, DCDiag
  • AD replication tools

114
FRS Replication
  • What happens if it breaks?
  • Changes not replicated to all DCs, resulting in
    inconsistent AD
  • Group policy gets out of sync and may not get
    applied.
  • GPOTool: version mismatch
  • Logon scripts don't get applied.
  • DFS shares out of sync.

115
FRS Replication
  • How to tell if it's broken
  • Events in FRS log
  • Event 1000, 1001 in app log every five minutes.
  • Files backed up in staging areas
  • Get size of staging directories (MB).
  • Get date of oldest file (how long it has been
    broken).
  • Group Policy not applied (new changes)
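
The two staging-area measurements above (total size and age of the oldest file) are easy to script. A minimal sketch, assuming it is run locally on the DC against the staging path; the function name staging_backlog is hypothetical.

```python
import os
from datetime import datetime


def staging_backlog(path):
    """Return (total size in MB, modification time of the oldest file).

    The second value is None when the directory tree holds no files.
    """
    total_bytes = 0
    oldest = None
    for root, _dirs, files in os.walk(path):
        for name in files:
            info = os.stat(os.path.join(root, name))
            total_bytes += info.st_size
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
    oldest_dt = datetime.fromtimestamp(oldest) if oldest is not None else None
    return total_bytes / (1024 * 1024), oldest_dt
```

A non-empty result that keeps growing, with an old oldest-file date, gives a rough idea of how long replication has been stuck.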

116
Replication Problems
  • Ensure DNS is working.
  • DNS Lookup Failures in events (description).
  • Ping, Nslookup to resolve names.
  • Domain name
  • DC, Server names
  • Ensure AD Replication is working.
  • Create New Objects and see if they replicate.
  • Repadmin /showreps and /showconn
  • DS Event Log
  • DCDiag

117
Replication Problems
  • Staging Areas should have no files
  • Common FRS problem.
  • Check size of dir, date of files.
  • Ensure FRS is working.
  • Create text file on each DC, named for the DC.
  • Put it in \winnt\sysvol\sysvol\<domain name>.
  • All DCs should have a copy of all DCs' text files.
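
The text-file test can be evaluated mechanically once the directory listing from each DC has been collected. A minimal sketch, assuming each DC's marker file is named "<DC name>.txt" (the naming and the function name are illustrative, not from the slide):

```python
def missing_canaries(dc_names, files_seen_on):
    """Report which marker files each DC is still missing.

    dc_names      -- list of DC names that each created "<name>.txt"
    files_seen_on -- dict mapping DC name to the set of file names
                     found in that DC's SYSVOL domain folder
    Returns a dict of DC name -> sorted list of missing marker files.
    """
    expected = {f"{dc}.txt" for dc in dc_names}
    report = {}
    for dc in dc_names:
        missing = expected - set(files_seen_on.get(dc, ()))
        if missing:
            report[dc] = sorted(missing)
    return report
```

An empty report means every DC holds every marker file. A DC that is missing files from one particular partner points at that inbound connection.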

118
Replication Problems
  • FRS Event Log
  • 13508: Normal, but watch them
  • 13509: Success after having 13508s
  • 13514: Sysvol share not created; FRS is
    preventing the computer from becoming a DC
  • 13553, 13554: FRS successfully added computer to
    replica set (DCPromo successful)
  • 13557: Duplicate Connection Objects
  • 13522: Staging area full (Q264822)
  • Lots of KB Articles: Search for FRS and Event

119
Interpreting the Logs: NTFRS_000x.log
  • \WINNT\DEBUG
  • Identify errors, warning messages and milestone
    events in the log files
  • Very difficult to interpret

120
NTFRSutl.exe
  • Ntfrsutl inlog: Lists inbound log
  • Ntfrsutl outlog: Lists outbound log
  • Ntfrsutl sets: Lists replica sets
  • Ntfrsutl ds: FRS's view of the DS
  • Can execute remotely
  • Ntfrsutl sets DC1

121
Group Policy Troubleshooting
122
Group Policy Troubleshooting Basics
  • Policy isn't getting applied
  • Set something easy: Admin Templates
  • User Settings: log off/on
  • Computer Settings: reboot
  • Client-side extensions act as separate policies;
    debug separately from Admin Templates
  • Folder Redirection
  • Scripts
  • Disk Quotas
  • Security
  • IE Branding
  • EFS Recovery
  • IPSec
  • Application Management

123
Group Policy Troubleshooting Basics
  • Policy applied, but settings not effective.
  • Userenv.log (verbose): Q221833
  • Set Diagnostic logging: Q186454
  • HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics
  • Value: RunDiagnosticLoggingGroupPolicy
  • Value Type: REG_DWORD
  • Value Data: 3 (values 0-5; 0 = off)
  • Change one setting in GPO
  • Logoff/on or reboot
  • Verbose info in Application log
  • Lists all registry settings applied to user
  • Turn it off afterward: it fills the event log fast!
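
Because this setting has to be switched on and then off again, it helps to keep both states as importable .reg files. A minimal sketch that only builds the file text (it does not touch the registry); the function name is hypothetical, and the key, value name, and 0-5 range are taken from the slide.

```python
def diagnostics_reg_file(level):
    """Return .reg file text setting RunDiagnosticLoggingGroupPolicy.

    level -- 0 turns logging off; the slide gives a range of 0-5.
    """
    if not 0 <= level <= 5:
        raise ValueError("level must be between 0 and 5")
    key = (r"HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT"
           r"\CurrentVersion\Diagnostics")
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        f'"RunDiagnosticLoggingGroupPolicy"=dword:{level:08x}\n'
    )
```

Generating one file with level 3 and one with level 0 makes it harder to forget the "turn it off afterward" step.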

124
Gpresult.exe
  • Resource Kit command-line utility.
  • Reports applied policy for user, computer.
  • DN
  • Security groups
  • Verbose mode: gpresult /v
  • Registry settings
  • Computer Client-side extensions.
  • WATCH:
  • Logon server.
  • Cached policy on client may mask solution.
  • Refresh policy to make sure it's applied.

125
GPOtool
  • Resource Kit command-line utility.
  • Run on DC only.
  • Version Comparison AD vs. Sysvol.
  • AD version set immediately on change.
  • Sysvol version set after FRS Replication.
  • Friendly name / GUID association
  • Policy 08FAB736-9628-41D5-B5A8-37A0F98D7E43
  • Policy OK
  • Details:
  • ------------------------------------------------------------
  • DC: Qtest-DC2.qtest.cpqcorp.net
  • Friendly name: Folder Redirection Policy

126
Solving Version Mismatch
  • Small mismatch is normal.
  • After change until FRS Replication completes.
  • Be patient; see if it resolves.
  • Big mismatch is bad.
  • Prevents application of policy.
  • Unreplicated changes.
  • Manually set FRS (Sysvol) version = AD version.
  • %windir%\sysvol\sysvol\<domain>\policies\{guid}\gpt.ini
  • Will lose changes.
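
Before editing gpt.ini by hand, it is worth reading the Sysvol version and comparing it with the AD version programmatically. A minimal sketch with hypothetical function names; it relies on the GPO version number packing the user version in the upper 16 bits and the computer version in the lower 16 bits.

```python
import configparser


def sysvol_version(gpt_ini_text):
    """Return the integer Version value from gpt.ini text."""
    cp = configparser.ConfigParser(interpolation=None)
    cp.read_string(gpt_ini_text)
    return cp.getint("General", "Version")


def split_version(version):
    """Split a GPO version number into (user, computer) counters."""
    return version >> 16, version & 0xFFFF


def version_gap(ad_version, gpt_ini_text):
    """Return (user gap, computer gap) between AD and Sysvol versions."""
    ad_user, ad_computer = split_version(ad_version)
    sv_user, sv_computer = split_version(sysvol_version(gpt_ini_text))
    return ad_user - sv_user, ad_computer - sv_computer
```

A gap of (0, 0) means the two copies agree. A small positive gap shortly after an edit is the normal case described above.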

127
Resetting Default Domain Policy or Default DC
Policy
  • These policies always have the same GUID.
  • Default Domain: 31B2F340-016D-11D2-945F-00C04FB984F9
  • Default DC: 6AC1786C-016F-11D2-945F-00C04FB984F9
  • Changes are a mess; need to restore default.
  • To restore security defaults only, import the
    BasicDC.inf template (Q258595).
  • If settings are hosed, copy an original copy of
    the policy to
    winnt\sysvol\sysvol\<domain>\policies.
  • Copying policies only supported for these two
    cases.
  • Others will have different GUIDs.
  • Can't copy other policies from one forest to
    another for debug.

128
How to copy the Default Domain and Default DC
policy
  • Get a copy of a clean, default policy folder.
  • Restore the policy folder (GUID) from backup.
  • Create new domain and copy the GUID folder from
    that machine.
  • Don't zip it.
  • Delete existing policy.
  • Wait for replication.
  • Copy new policy folder to
    winnt\sysvol\sysvol\<domain>\policies.
  • Wait for replication.
  • Run GPOtool to make sure it shows up on all DCs.

129
Unable to Edit Group Policy
  • Group policy changed on PDC by default.
  • If PDC is not available.
  • Dialog: Change on any DC, current DC or not.
  • Error: Unable to contact Domain (no DC).
  • Solution: Transfer or seize the PDC role to
    another DC.
  • Can set policy to NOT use PDC. Don't!

130
Using Userenv.log to solve Group Policy problems
  • Turn on Verbose Logging (Q221833)
  • Interpreting group policy information in
    userenv.log

131
Debugging Logon Scripts (script doesn't apply)
  • Configure it via group policy snap-in.
  • Make sure policy is applied.
  • Set a desktop setting.
  • Use Gpresult /v.
  • Enable verbose logging for Userenv.log.
  • Turn on "Run logon scripts visible".
  • Create simple logon script as a .bat file to make
    sure it's not the script failing.
  • Example: Using Userenv.log to find script errors.

132
Can't find FSMO Role Holder
  • Problem: Operation trying to contact a FSMO role
    holder (PDC Emulator or ?)
  • Can ping by name; seems to be OK
  • Operation can't find it
  • Solution
  • Find out who has that role
  • netdom query fsmo
  • (returns a quick list)
  • Transfer the role to a local DC

133
Group Policy Refresh Anomaly
  • Users complain of a 5-25 second hang
    intermittently in any application: Outlook,
    Word, 3rd party apps. Keystrokes are buffered
    and they can continue to work
  • Noticed direct correlation between the 1704
    events (GP Refresh) and the hang.
  • Change refresh interval via group policy and the
    frequency of the hang changed.

134
Group Policy Refresh Anomaly
  • Cause: SceCli applies group policy every 16 hrs
    (default) if no GPO changes have occurred. (DCs
    are every 5 minutes)
  • Broadcasts WM_SETTINGCHANGE to all top-level
    windows
  • Wakes up sleeping processes, causing massive
    paging in/out of memory, causing hangs
  • More pronounced on slower computers
  • Solution: Configure Policy Refresh Interval in
    Group Policy so refresh occurs every 12 hrs at
    midnight/noon so users don't notice it.

135
Account Lockout
  • Background
  • Finding locked out user accounts
  • Client Bugs and Fixes
  • Server Bugs and Fixes
  • Resolution and Futures

136
Lockout Reasons and Options
  • Prevent spoofing or hijacking account
  • Optional event logging in Audit Policy
  • Account Lockout Options
  • Timed lockout
  • Account enabled after admin defined time
  • Hard lockout
  • Account disabled until reset by admin
  • Lockout policy defined in group policy
  • Single lockout and password policy per domain
  • Location: Default Domain Policy

137
Account Lockout on DCs
  • Each DC records the number of bad password attempts
  • BDC checks PDC for latest password
  • All Bad password attempts seen by PDC
  • PDC always 1st to lock out account
  • PDC urgently replicates lockout when threshold
    reached
  • Bad password attempts not replicated by DC
  • BadPasswordCount reset to 0 on 1st good password

138
PDC chaining operations
  • If BDC fails authentication with
  • STATUS_WRONG_PASSWORD
  • STATUS_PASSWORD_EXPIRED
  • STATUS_PASSWORD_MUST_CHANGE
  • STATUS_ACCOUNT_LOCKED_OUT
  • Referred to as BadPasswordStatus
  • BDC chains authentication to PDC
  • Return status from PDC if status success or
    listed above
  • Otherwise, ignore PDC status and use local status
  • Exception to PDC chaining
  • AvoidPDCOnWan enabled and PDC in remote site
    (Q225511)
  • 10 BadPasswordStatus events logged in 10 minutes
  • NegativeCache enhancement (Q263821)
  • Cache reset after good password entered
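
The chaining rule above is a small decision procedure, sketched below. This is an illustration of the logic as stated on the slide, not the actual Netlogon implementation: the function name, the ask_pdc callable, and the avoid_pdc flag (standing in for the AvoidPDCOnWan and negative-cache exceptions) are assumptions.

```python
# Statuses the slide groups together as BadPasswordStatus.
BAD_PASSWORD_STATUS = frozenset({
    "STATUS_WRONG_PASSWORD",
    "STATUS_PASSWORD_EXPIRED",
    "STATUS_PASSWORD_MUST_CHANGE",
    "STATUS_ACCOUNT_LOCKED_OUT",
})
SUCCESS = "STATUS_SUCCESS"


def authenticate(local_status, ask_pdc, avoid_pdc=False):
    """Return the status a BDC reports for a logon attempt.

    local_status -- result of the BDC's own password check
    ask_pdc      -- callable returning the PDC's status for the attempt
    avoid_pdc    -- True when an exception to chaining applies
    """
    if local_status not in BAD_PASSWORD_STATUS or avoid_pdc:
        return local_status  # no chaining needed or allowed
    pdc_status = ask_pdc()
    if pdc_status == SUCCESS or pdc_status in BAD_PASSWORD_STATUS:
        return pdc_status  # the PDC's answer wins
    return local_status  # ignore any other PDC status
```

This shows why a password change that has reached the PDC but not the local DC still succeeds: the local failure is overridden by the PDC's success.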

139
Troubleshooting account lockouts
  • Your goal: Answer the 4 W's
  • Who, Where, When and Why
  • Environment setup
  • Enable Auditing in domain policy
  • Account Logon Events: Failure
  • Account Management: Success
  • Logon Events: Failure
  • Security Event log on DCs 10K events
    over-write
  • Enable netlogon logging (NTLM clients)
  • NLTEST /DBFLAG:2080FFFF (no reboot)
  • Enable Kerberos Logging
  • Q262177: Kerberos logging (Kerberos clients)

140
Account Lockout: Where
  • DC Resources
  • NTLM Clients
  • Search DC and client NETLOGON.LOG for lockouts
  • 0xC000006A: bad password
  • 0xC0000234: account lockout
  • NTLM and Kerberos Clients
  • Search DS Event Logs
  • Q230254, Q299475, Q273499 and Q301677 for
    description
  • 644: NTLM and Kerberos lockout event
  • 675: Kerberos bad password
  • 681: NTLM bad password
  • 529: Failed logon
  • 531: Account disabled
  • Tools
  • EVENTCOMB
  • AL.EXE
  • NETMON.EXE
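
Searching NETLOGON.LOG for the two status codes can be done with a short script when EVENTCOMB is not at hand. A minimal sketch with a hypothetical function name; it assumes nothing about the log line layout and only looks for the codes as substrings.

```python
from collections import Counter

# Status codes from the slide and what they indicate.
LOCKOUT_CODES = {
    "0XC000006A": "bad password",
    "0XC0000234": "account lockout",
}


def count_lockout_codes(lines):
    """Count log lines mentioning each lockout-related status code."""
    counts = Counter()
    for line in lines:
        upper = line.upper()  # codes may be logged in either case
        for code, label in LOCKOUT_CODES.items():
            if code in upper:
                counts[label] += 1
    return counts
```

Passing an open file object for NETLOGON.LOG as lines works, since a file iterates line by line.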

141
EVENTCOMB
142
AL.EXE
143
Account Lockout: Why
  • Attack, Pilot Error or Bug
  • Wrong password entered, misconfigured service
    account
  • Scenario
  • Account type: user, computer or service account
  • Lockout trigger?
  • (logon, drive access, following p/w change)
  • Drill down: look at time of day, pattern, frequency
  • Process related lockouts
  • Structured pattern
  • Logged when users not present
  • Look for
  • common services, applications, client
    configuration
  • User related lockouts
  • Random pattern,
  • Fewer events logged
  • Look at
  • shortcuts, mapped drives, logon scripts,
    applications
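
One quick way to separate structured (process-related) patterns from random (user-related) ones is to bucket the bad-password events by hour of day. A minimal sketch, assuming the event times have already been extracted as datetime objects; the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime


def lockouts_by_hour(timestamps):
    """Return a Counter of event counts keyed by hour of day (0-23)."""
    return Counter(ts.hour for ts in timestamps)
```

Events clustered at the same hour every day, especially outside working hours, point at a service or scheduled task. A scattered daytime distribution points at the user.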

144
Account Lockout: Client
  • Win9X
  • Q278558 Access denied to a mapped drive after
    disconnect
  • Q272594 Client can't log on after log off w/o
    reboot
  • Q293793 VREDIR loses file tracking structures
  • Q271496 One unsuccessful logon attempt triggers
    lockout (13)
  • Net use dsgetdc logon attempt.
  • Q266772 Logon fails if Unicode string password
    to NTLM SSPI
  • DS Client on Win95, Windows 98, 98 Second Ed
  • DSCLIENT MUST be installed before any hotfixes!
  • Q301344, Q283261
  • DS Client lets WIN98 account lockout fixes work
    on Win95
  • Win2K
  • Q275508 User locked when accessing home dir
    after changing p/w
  • Hotfix or SP2
  • Windows XP
  • None

145
Account Lockout: Server Fixes
  • Read server side KB articles
  • Q287639 Win9x Clients Locked Out after unlock
  • MSV1 package does password check against BDC with