Active Directory: A network-wide NT Filesystem
If you’re keeping on top of developments from Microsoft, you will be hearing a lot about Active Directory. Microsoft is planning to introduce Active Directory as a filesystem format in Windows NT 5.0, expected to debut in early 1998. Active Directory is only one of the important enhancements of the NT 5.0 operating system, but in one way it is the most significant because it finally makes NT suitable for wide-scale network filesystem support. Before we get into Active Directory in more detail, it’s useful to understand why a new filesystem model was developed and what type of impact it will have on NT networks.
When NT 4.0 debuted, one of the primary problems for enterprise implementation was the lack of a flexible, scalable, secure, network-wide filesystem. NT 4.0 gives you the choice of the traditional DOS-compatible FAT (File Allocation Table) or the new NTFS (New Technology File System) layouts for your hard drives, and while NTFS is a definite improvement over FAT, there are still many problems with it. NTFS adds large-capacity filesystem capabilities to NT, allowing a maximum file and partition size of 16 exabytes, improves reliability and recoverability with RAID support and transaction rollback, and provides much better security than is possible with FAT. Yet NTFS was designed as a single-server filesystem.
With FAT or NTFS, in order for you to access a file on another system on the network you have to use the Universal Naming Convention (UNC) to specify the remote machine’s name and directory. For example, the UNC \\merlin\public\outgoing\articles\security.doc specifies the remote machine’s name (merlin) and the directory (\public\outgoing\articles) to the file (security.doc). You have to know the machine name and full path to the file in order to gain access. This is awkward and unfriendly, and leads to many problems when directories are shared frequently. Also, if you want to set up a set of directories or files for sharing among several groups or individuals, the procedure can be tedious and annoying, and sometimes almost impossible to set up properly. What FAT and NTFS lack is a filesystem model like that under UNIX and NetWare, which permits resources to be mounted across networks via simple names, fast location of remote directories and files over the network, and friendly naming conventions and lookups to simplify cross-machine access.
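To make the three parts of a UNC name concrete, here is a minimal Python sketch that splits the example above into machine, directory, and file. The function name split_unc and the UncParts type are invented for this illustration; they are not part of any Windows or NT API, and the sketch follows the article's simple machine/directory/file reading of a UNC name rather than any formal grammar.

```python
from typing import NamedTuple


class UncParts(NamedTuple):
    machine: str    # remote machine name, e.g. "merlin"
    directory: str  # directory on that machine, with a leading backslash
    filename: str   # final component of the path


def split_unc(unc: str) -> UncParts:
    """Split a UNC name into machine, directory, and file name (illustrative only)."""
    if not unc.startswith("\\\\"):
        raise ValueError("a UNC name must start with two backslashes")
    components = unc[2:].split("\\")
    # Require at least machine, one directory, and a file, with no empty parts.
    if len(components) < 3 or "" in components:
        raise ValueError("expected \\\\machine\\directory\\...\\file")
    machine, *dirs, filename = components
    return UncParts(machine, "\\" + "\\".join(dirs), filename)


parts = split_unc(r"\\merlin\public\outgoing\articles\security.doc")
print(parts.machine)    # merlin
print(parts.directory)  # \public\outgoing\articles
print(parts.filename)   # security.doc
```

The point of the sketch is that every piece must be supplied by the user: nothing in the name can be looked up or shortened.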
An improvement to the NT filesystem was added by Microsoft late last year with the introduction of the Distributed File System (which Microsoft insists on referring to by the mixed-case acronym Dfs). Dfs is not a radical new product, though. It is a derivative of the SMB-based file access protocol used with Windows 95, Windows NT, and LAN Manager, which Microsoft now calls the Common Internet File System (CIFS). SMB technology is old hat, but Microsoft has rechristened the system and is pushing vendors to embrace CIFS.
Dfs is a stepping stone towards an enterprise-wide filesystem model, but still lacks a lot of features. Dfs doesn’t change the UNC requirement, still requiring the remote machine and directory names. The major change with Dfs is the ability to group shared directories, files, and other resources together across several machines into a single logical volume. When a client accesses a Dfs-based volume, all the shares look like a single drive and the filesystem handles the mapping to each individual machine on the network. The UNC looks the same, except the machine name now refers to the Dfs volume and the path can be spread across several machines. For example, the UNC \\arthur\public\outgoing\articles\security.doc may refer to the Dfs volume arthur (called the Dfs root) which can be located on one server, the directory \public\outgoing may be on another, the directory and file \articles\security.doc can be on a third, and so on. To the client, it looks like a single logical volume. This makes accessing shared resources much easier and eliminates a lot of network browsing looking for files. Dfs can work across protocols, too, allowing access to NetWare machines for NT clients (but not Windows 95 clients).
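The mapping from one logical volume to several physical servers can be pictured as a longest-prefix lookup. The Python sketch below is only a model of the idea, not Microsoft's implementation: the table layout, the function name locate, and the server names server1 through server3 are all assumptions made for this example.

```python
# Hypothetical Dfs volume table: each path prefix inside the volume maps to
# the server that physically holds it. All server names are invented.
DFS_VOLUMES = {
    "arthur": {
        "\\": "server1",                          # the Dfs root itself
        "\\public\\outgoing": "server2",
        "\\public\\outgoing\\articles": "server3",
    },
}


def locate(unc: str) -> str:
    """Return the server holding the longest matching prefix of the path."""
    volume, _, rest = unc.lstrip("\\").partition("\\")
    path = "\\" + rest
    table = DFS_VOLUMES[volume]
    best = "\\"
    for prefix in table:
        # Match only on whole path components, never in the middle of a name.
        on_boundary = path == prefix or path.startswith(prefix.rstrip("\\") + "\\")
        if on_boundary and len(prefix) > len(best):
            best = prefix
    return table[best]


print(locate(r"\\arthur\public\outgoing\articles\security.doc"))  # server3
print(locate(r"\\arthur\public\outgoing\index.txt"))              # server2
print(locate(r"\\arthur\public\readme.txt"))                      # server1
```

The client only ever types the volume name arthur; which server answers depends on how deep into the tree the path reaches.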
To provide a filesystem comparable to the NetWare Directory Services (NDS) or UNIX’ NFS and DNS-based network-wide filesystem, Microsoft turned to the X.500 directory standard. X.500 has been available for years but is seldom completely implemented because it can be hideously complex and awkward to use (Novell’s NDS is one of the few X.500-based filesystems in general usage). The essentials of X.500 are useful for a network-wide filesystem, though.
Combining X.500-style directory services, DNS, and Kerberos, Microsoft has put together a new filesystem model called Active Directory. Accessing a resource through Active Directory can be through several methods, including the traditional UNC naming method, through a shorter name that can be resolved to a specific object, or through other naming conventions like URLs (so you could access a file with a URL like http://merlin.tpci.com/public/outgoing/articles/security.doc) and finally through X.500 names (although X.500 names are verbose and awkward).
The components of Active Directory all contribute some capabilities to the whole. DNS provides mapping of machine names to IP addresses on the network. X.500’s complexity was reduced by using a simpler version of the system called LDAP (Lightweight Directory Access Protocol). LDAP and DNS together allow the mapping of network resources (both physical and software) to simpler names. With this type of name resolution system, not only can a machine be identified by a simple name, but a printer, hard drive, CD-ROM, directory, or file can also be mapped.
Kerberos provides proper authentication between server and clients, as well as resources, providing much better security than is possible on most network-wide filesystems.
The choice of Kerberos for Active Directory means that the NT LAN Manager (NTLM) security protocol used in NT 4.0 will be phased out. The advantages of Kerberos are clear: better data protection (both from the integrity and privacy points of view) and authentication of devices, all of which are missing from NTLM. Microsoft has chosen to implement Kerberos unchanged from the version 5 release from MIT. Each Active Directory domain will have its own Kerberos server, usually on the Active Directory domain server, and replication to secondary Kerberos servers is provided.
Kerberos doesn’t affect the standard NT Access Control Lists (ACLs) used in the underlying operating system. For clients that do not implement Kerberos, access to Active Directory services can still be provided with other public key certificate methods. The system can also use a trusted authority method similar to those in Unix.
Implementing Active Directory on a Network
Active Directory is implemented by grouping machines into domains, each of which gets a unique DNS domain name. NT 4.0 allows domains, but they are not resolved in the same manner as DNS domain names. Active Directory will allow domains to be set up for the entire network (with a traditional domain name such as tpci.com) or into subnetworks based on a DNS domain name (such as accounts.tpci.com and research.tpci.com). In every domain there will be one Active Directory server which handles a complete map of the resources of that domain.
At present, it is expected that the Active Directory domain server will have to reside on an NT 5.0 server, although other platforms may be appearing if Active Directory becomes a significant market hit. While the server must be NT, any client can access the NT domain server for resource resolution. Non-Windows operating systems already widely support DNS, and LDAP clients have appeared for all the major operating systems like Unix, Macintosh, and OS/2. When a client requests access to a non-NT resource (such as a Unix filesystem), the domain server will handle the contact procedures and make the entire process transparent to the client.
For better reliability, more than one Active Directory domain server can be set up in a domain. Automatic replication between multiple domain servers allows for both fault tolerance (if the primary server goes down, a secondary can take over the resolution tasks) and load sharing between the domain servers. On most networks, replication is performed every five minutes (the default value), although you can configure replication times to any interval you wish.
The replication is handled by a Microsoft-developed protocol relying on RPC (Remote Procedure Calls), which also supports X.500’s Directory Information Shadowing Protocol (DISP) used for cross-platform replication. The use of RPC instead of just relying on DISP is important, as it permits much more flexibility for replication. It is entirely feasible to use slow techniques such as e-mail, for example, to replicate information with RPC. While this may seem silly at first thought, it is immensely useful when a reliable high-speed link cannot be established.
Active Directory is scalable. By treating a network as a domain, it allows for a single large directory to be constructed with subdivisions of a domain for organizational purposes. As a network grows, more machines or users can be added to the structure with a minimum of fuss, ending up with a heavily populated but still diagrammatically simple structure. Microsoft’s early version of Active Directory allows for 10 million objects (resources) to be contained in a single domain, with higher limits to come with newer versions of the software. If you start with a small network, it can be contained in a single domain. As the network grows, you can break the larger network into smaller organizational units, each with a full set of 10 million objects.
Microsoft has developed a number of APIs (Application Programming Interfaces) for Active Directory, too. The most useful for application developers is the Active Directory Service Interface (ADSI), which uses the Component Object Model (COM). Since many other Microsoft APIs, including OLE and MAPI, use COM, the transition to supporting Active Directory in a custom application is not likely to be difficult. If another directory service is used for a resource, a directory service provider interface must be written or used between the Active Directory interface and the other directory service. Many interfaces are expected to be provided with NT 5.0. ADSI allows a class store to be set up, containing a list of resources much the same way the Windows Registry does. The implication of this setup is that a developer can develop applications on one system to execute on any other on the network. COM technology was already extended by Microsoft to a distributed version called DCOM, first introduced in NT 4.0.
How Does Active Directory Work?
So much for the background. How will Active Directory get all the information it needs to clients, and which machines on the network are impacted? Let’s look at a typical client request for access to a resource elsewhere on the network (see Figure). The process starts when the client is given an Active Directory name by the machine’s user. The client software sends the remote machine’s name to a DNS server (step 1) and receives back the IP address of the Active Directory domain server for the remote machine (step 2).
The client then uses LDAP to send a request to the Active Directory domain server to resolve the name of the resource the client needs (step 3) and then gets back the IP address of the remote machine that holds that resource (step 4). With the IP address of the resource, the client can establish direct communications with the remote machine (step 5). The DNS and Active Directory servers can be on different physical machines, as shown, or on the same machine. Even when they reside on the same machine two separate requests are still issued.
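The five steps can be traced with a small Python model. This is a sketch of the lookup order only: plain dictionaries stand in for the DNS server and the Active Directory domain server, no real DNS or LDAP traffic is generated, and every table entry, address, and function name is invented for illustration.

```python
# Stand-in for the DNS server: remote machine name -> IP address of the
# Active Directory domain server for that machine (addresses are made up).
DNS_TABLE = {
    "merlin.tpci.com": "192.0.2.2",
}

# Stand-in for each Active Directory domain server: resource name -> IP
# address of the machine that actually holds the resource.
DIRECTORY_TABLE = {
    "192.0.2.2": {"security.doc": "192.0.2.17"},
}


def dns_lookup(machine_name):
    """Steps 1 and 2: send the name to DNS, get back the domain server's IP."""
    return DNS_TABLE[machine_name]


def directory_lookup(domain_server_ip, resource):
    """Steps 3 and 4: ask the domain server (via LDAP in the real system)
    which machine holds the resource."""
    return DIRECTORY_TABLE[domain_server_ip][resource]


def resolve(machine_name, resource):
    """Run both lookups in order; the result is the address the client
    would connect to directly in step 5."""
    domain_server_ip = dns_lookup(machine_name)
    return directory_lookup(domain_server_ip, resource)


print(resolve("merlin.tpci.com", "security.doc"))  # 192.0.2.17
```

Keeping the two lookups as separate functions mirrors the text: two separate requests are issued even when both services run on the same machine.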
One of the more powerful aspects of Active Directory is that the communications between the client and the remote resource’s machine can be through any protocol, independent of the protocols used to resolve the resource name. This would allow a client to use TCP/IP to resolve the resource location, then use another protocol like IPX/SPX to connect to that resource. Since the DNS and Active Directory domain controllers stay out of the direct connection, they don’t have to worry about cross-protocol translations.
Administering an Active Directory system is supposed to be simple, with a single administration software package on one client or server for the entire domain. From this single point, an administrator can manage not only files and directories, but peripherals, connections to external systems, services, users, and other objects. Using the administration interface, objects can be grouped or regrouped as necessary to provide the maximum flexibility for access. The interface provides full support for drag and drop, allowing fast reorganization of the network’s resources.
Programmability for Active Directory is available not only through the APIs, but through standard languages and tools like Visual Basic scripts, Java, Perl, and TCL. Using a script, any action that can be performed through the graphical administration interface can be performed repeatedly. This will be handy for occasional but oft-repeated changes to the network structure, such as when outside consultants need access or to enable external ports.
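The consultant scenario shows why scripting matters: the same set of changes is applied and later undone as a unit. The Python sketch below illustrates the pattern with an in-memory dictionary standing in for the directory. It does not use ADSI or any real Active Directory interface, and the resource names, group names, and the grant and revoke functions are all hypothetical.

```python
# In-memory stand-in for a directory: resource name -> groups allowed to use it.
directory = {
    "\\public\\outgoing": {"staff"},
    "laser-printer": {"staff"},
}


def grant(group, resources):
    """Give a group access to every resource in the list."""
    for name in resources:
        directory[name].add(group)


def revoke(group, resources):
    """Remove a group's access; harmless if the group was never added."""
    for name in resources:
        directory[name].discard(group)


# The list is written once and reused every time consultants arrive or leave.
consultant_resources = ["\\public\\outgoing", "laser-printer"]

grant("consultants", consultant_resources)
print(sorted(directory["laser-printer"]))  # ['consultants', 'staff']

revoke("consultants", consultant_resources)
print(sorted(directory["laser-printer"]))  # ['staff']
```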
Backwards compatibility with NT 3.5 and 4.0 is supposed to ensure an easy migration to Active Directory for network upgrades. All NT 3.5 and 4.0 directory services are emulated with Active Directory, and any tools written to conform with the Win32 API will work without change. As far as most services and applications are concerned, an Active Directory server behaves exactly like an NT 4.0 server.
When migrating to Active Directory, an administrator can choose to upgrade the entire network or allow a mix of new and old services, as might be necessary with a gradual deployment plan. The only limitations encountered will be with upgraded machines trying to access older systems that are not mapped to the Active Directory structure. While they can always be accessed using the older NT UNC methods, the resources on the older systems will not have an entry in the Active Directory server. Migration from Microsoft Exchange Directory is supposed to be just as simple, with a yet-to-be-released migration tool expected.
How Does Active Directory Compare?
Is Active Directory going to sweep the corporate environment? Not likely, as it is an evolutionary rather than revolutionary product. Corporate-wide filesystems are currently available through Novell’s NDS and UNIX’ NFS, and Active Directory doesn’t add anything especially important to these alternatives. Instead, the importance of Active Directory is likely to be in the ability to rely on NT servers as a corporate-wide network filesystem server, especially in primarily Windows-based networks.
The decisions that Microsoft made when developing Active Directory are important. Instead of deciding to develop a proprietary system and hope for adoption by third parties, Microsoft chose to base Active Directory on accepted and widely used industry standards, for which there are already healthy user bases. This will likely mean that a migration for many large network clients will be simpler than implementing a completely new scheme. The development of Active Directory means that NT 5.0 will finally become capable of supporting a true enterprise network, with better access to resources for all clients.